Analysis of Neutral and Adaptive Population Structure
For each population, we estimated several metrics of genetic diversity for the neutral and outlier datasets using the populations program in STACKS: the percentage of polymorphic loci, the frequency of the most common (major) allele, expected (HE) and observed (HO) heterozygosity, and the inbreeding coefficient (FIS). We also estimated pairwise population differentiation (FST) for each dataset and assessed the significance of these estimates with 999 permutations in GenoDive (v3.04; Meirmans 2020).
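As an illustration of how these per-locus quantities relate to one another, the following minimal sketch computes the major allele frequency, HO, HE and FIS for a single biallelic locus from genotype counts. It uses the textbook uncorrected formulas (HE = 2pq, FIS = 1 - HO/HE); it is not the STACKS implementation, which may apply sample-size corrections and averages over loci. The function name and the example counts are hypothetical.

```python
def locus_diversity(n_AA, n_Aa, n_aa):
    """Return (major allele frequency, HO, HE, FIS) for one biallelic locus.

    Hypothetical helper for illustration only; inputs are genotype counts
    within a single population.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele a
    h_obs = n_Aa / n                  # observed heterozygosity (HO)
    h_exp = 2 * p * q                 # expected heterozygosity (HE), uncorrected
    # FIS is undefined at a monomorphic locus (HE = 0)
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return max(p, q), h_obs, h_exp, f_is


if __name__ == "__main__":
    # Made-up counts showing a heterozygote deficit (positive FIS)
    major, h_obs, h_exp, f_is = locus_diversity(30, 20, 50)
    print(f"major={major:.3f} HO={h_obs:.3f} HE={h_exp:.3f} FIS={f_is:.3f}")
```

A positive FIS indicates fewer heterozygotes than expected under Hardy-Weinberg proportions, and a value near zero indicates agreement with those proportions.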
We assessed the population structure of the neutral and outlier datasets using two approaches in R. First, we conducted a discriminant analysis of principal components (DAPC) using the package adegenet (v2.1.3; Jombart et al. 2010), which creates synthetic axes that maximize variance between clusters and minimize variance within clusters. We chose the optimal number of clusters (K) using the Bayesian Information Criterion (BIC) and used the alpha-score to determine the number of principal components (PCs) to retain, thereby avoiding overfitting. Second, we used the spatially explicit method TESS3, implemented in the package tess3r (v1.1.0; Caye et al. 2016), which estimates global ancestry coefficients while accounting for spatial proximity among individuals. For TESS3, we chose the optimal K using the cross-validation score.
To test for isolation by distance (IBD), we plotted pairwise neutral FST against the shortest marine distance between sites and assessed the significance of their correlation using the cor.test function of the stats package in R. We also performed Mantel tests between pairwise neutral FST and shortest marine distance using the package vegan (v2.5.6; Oksanen et al. 2019) and evaluated significance with 999 permutations. Because our data were non-normally distributed (Shapiro-Wilk normality tests, p < 0.05; Shapiro & Wilk 1965), we used the non-parametric Spearman’s rank correlation statistic for all tests. Geographic distances were represented as the shortest marine distance between each pair of sites, estimated using least-cost path analysis in ArcGIS, in which land cells were treated as impermeable barriers.
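The logic of a rank-based Mantel test can be sketched as follows. This is a minimal, standard-library illustration of the general technique, not the vegan implementation: it correlates the ranked upper-triangle entries of two distance matrices, then permutes the site labels of one matrix to build a null distribution. The function names, the seed and the toy matrices are assumptions made for the example.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mantel_spearman(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test on two symmetric square distance matrices.

    Returns (observed statistic, permutation p-value).
    """
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))   # upper-triangle index pairs
    a = [dist_a[i][j] for i, j in pairs]
    observed = spearman(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = list(range(n))
        rng.shuffle(perm)                     # relabel sites in matrix B
        b_perm = [dist_b[perm[i]][perm[j]] for i, j in pairs]
        if spearman(a, b_perm) >= observed - 1e-12:
            hits += 1
    # The observed arrangement is counted as one of the permutations
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy data: five hypothetical sites along a coastline, with
    # differentiation set proportional to distance
    positions = [0, 1, 3, 6, 10]
    geo = [[abs(p - q) for q in positions] for p in positions]
    fst = [[0.01 * d for d in row] for row in geo]
    stat, p_value = mantel_spearman(geo, fst)
    print(stat, p_value)
```

Permuting site labels rather than individual matrix cells preserves the non-independence of pairwise distances that share a site, which is why a Mantel test is used for distance matrices instead of an ordinary correlation test alone. In vegan, the corresponding call takes the form mantel(xdis, ydis, method = "spearman", permutations = 999).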