Analysis of Neutral and Adaptive Population Structure
For each population, we estimated several metrics of genetic diversity
(percentage of polymorphic loci, frequency of the most common [major]
allele, and expected [HE] and observed [HO] heterozygosity) as well as
the inbreeding coefficient (FIS) for each neutral and outlier dataset
using the populations program in STACKS. We also estimated pairwise
population differentiation (FST) for each dataset and assessed the
significance of these estimates using 999 permutations in GenoDive
(v3.04; Meirmans 2020).
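The diversity metrics listed above can be illustrated with a minimal
sketch. It is not the STACKS implementation: it assumes biallelic loci,
genotypes coded as the number of copies of the alternate allele (0, 1,
or 2, with None for missing calls), at least one called genotype per
locus, and the uncorrected expectation HE = 2pq. The function names
locus_stats and population_summary are hypothetical.

```python
def locus_stats(genotypes):
    """Per-locus diversity statistics for one population.

    genotypes: list of 0/1/2 alternate-allele counts, None = missing.
    Assumes a biallelic locus with at least one called genotype.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    alt_freq = sum(called) / (2 * n)            # frequency of the alternate allele
    major_freq = max(alt_freq, 1 - alt_freq)    # frequency of the most common allele
    h_obs = sum(1 for g in called if g == 1) / n    # observed heterozygosity (HO)
    h_exp = 2 * alt_freq * (1 - alt_freq)           # expected heterozygosity (HE = 2pq)
    # FIS = 1 - HO/HE; undefined when the locus is monomorphic (HE = 0)
    f_is = 1 - h_obs / h_exp if h_exp > 0 else None
    return {"major_freq": major_freq, "h_obs": h_obs, "h_exp": h_exp, "f_is": f_is}


def population_summary(loci):
    """Average the per-locus statistics across all loci of one population."""
    stats = [locus_stats(g) for g in loci]
    n_loci = len(stats)
    n_polymorphic = sum(1 for s in stats if s["major_freq"] < 1.0)
    return {
        "pct_polymorphic": 100.0 * n_polymorphic / n_loci,
        "mean_major_freq": sum(s["major_freq"] for s in stats) / n_loci,
        "mean_h_obs": sum(s["h_obs"] for s in stats) / n_loci,
        "mean_h_exp": sum(s["h_exp"] for s in stats) / n_loci,
    }


# Toy data (made up for illustration): four loci scored in eight individuals.
toy_loci = [
    [0, 0, 0, 2, 2, 2, 1, 1],     # heterozygote deficit, FIS > 0
    [0, 0, 1, 1, 2, 2, 1, 1],     # HO equals HE, FIS = 0
    [0, 0, 0, 0, 0, 0, 0, None],  # monomorphic, one missing call
    [0, 1, 0, 0, 0, 0, 0, 0],     # rare alternate allele
]
summary = population_summary(toy_loci)
```

A positive FIS indicates fewer heterozygotes than expected under
Hardy-Weinberg proportions, and a negative value indicates an excess.
Production software may apply sample-size corrections that this sketch
omits.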
We assessed population structure of neutral and outlier datasets using
two approaches in R. First, we conducted a discriminant analysis of
principal components (DAPC) using the package adegenet (v2.1.3; Jombart
et al. 2010), which creates synthetic axes that maximize variance
between genetic clusters (K) while minimizing variance within them. The
optimal K was chosen using the Bayesian Information Criterion (BIC), and
the alpha-score was used to determine the number of principal components
(PCs) to retain without overfitting. We
also used the spatially explicit method TESS3, implemented in the
package tess3r (v1.1.0; Caye et al. 2016), which estimates global
ancestry coefficients while considering spatial proximity among
individuals. For TESS3, we chose the optimal K using the
cross-validation score.
To test for the potential influence of isolation by distance (IBD), we
plotted the relationship between the shortest marine distance and
pairwise neutral FST and assessed the significance of
their linear relationship using the cor.test function of the
stats package in R. We also performed Mantel tests between pairwise
neutral FST and shortest marine distance using the
package vegan (v2.5.6; Oksanen et al. 2019) and evaluated significance
with 999 permutations. As our data were non-normally distributed
(Shapiro-Wilk normality tests, p < 0.05; Shapiro & Wilk 1965),
we used the non-parametric Spearman’s rank correlation coefficient for
all tests.
Geographic distances were represented as the shortest marine distance
between each pair of sites, estimated using least cost path analysis in
ArcGIS, for which land cells were categorized as impermeable barriers.
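The permutation logic behind a Mantel test with Spearman's rank
correlation can be sketched as follows. This is an illustrative
standard-library version, not the vegan implementation: the function
names, the one-sided p-value convention (hits + 1) / (permutations + 1),
and the toy matrices are assumptions made for the example.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_values(matrix, order):
    """Values for every unordered pair of sites, after relabelling sites by `order`."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def mantel_spearman(fst, dist, permutations=999, seed=1):
    """One-sided Mantel test between two symmetric site-by-site matrices."""
    rng = random.Random(seed)
    sites = list(range(len(fst)))
    d = pairwise_values(dist, sites)
    observed = spearman(pairwise_values(fst, sites), d)
    hits = 0
    for _ in range(permutations):
        perm = sites[:]
        rng.shuffle(perm)  # permute site labels: rows and columns move together
        if spearman(pairwise_values(fst, perm), d) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data (made up): five sites along a coastline, FST proportional to distance.
positions = [0, 10, 25, 45, 70]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
toy_fst = [[0.001 * v for v in row] for row in toy_dist]
rho, p_value = mantel_spearman(toy_fst, toy_dist)
```

Site labels are permuted rather than individual matrix cells because
pairwise values that share a site are not independent, which is also
why a Mantel test is reported alongside the ordinary correlation test.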