Multivariate redundancy analysis (RDA): assessing spatial & environmental influences on genetic structure
We used RDAs to estimate the relative influence of spatial and environmental factors on neutral and putatively adaptive genetic structure in S. umbilicalis and N. lapillus. Our response variables were the minor allele frequencies (MAF) of each locus, estimated using PLINK (v1.9; Chang et al. 2015) and Hellinger-transformed using the decostand function of the R package vegan. To account for our large number of molecular markers, we conducted PCAs on each neutral and outlier dataset and retained only meaningful PCs (those with eigenvalues >1) as response variables in our models. Predictor variables were environmental (PCs) and spatial (dbMEM and AEM vectors). We tested for correlations among these predictors and removed one variable from each pair whose correlation exceeded 0.7, resulting in a final predictor dataset of four environmental PCs representing SST, AT and exposure, six dbMEMs, and three AEMs. As N. lapillus is a direct developer, we excluded the AEM vectors from our RDA models for this species.
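The two preprocessing steps described above, the Hellinger transformation and the correlation filter, can be sketched in a few lines. The analysis itself was run in R with vegan; the Python below is only an illustrative sketch using the standard library. The function names, the toy data, and the greedy rule for choosing which member of a correlated pair to drop are all assumptions, since the text does not state how that choice was made.

```python
import math

def hellinger(rows):
    """Hellinger transformation: square root of each value divided by its row total."""
    out = []
    for row in rows:
        total = sum(row)
        out.append([math.sqrt(v / total) if total > 0 else 0.0 for v in row])
    return out

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(columns, threshold=0.7):
    """Assumed greedy rule: keep a predictor only if its absolute correlation
    with every predictor already kept does not exceed the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: rows are sampling sites, columns are loci.
freqs = [[0.1, 0.4, 0.5], [0.2, 0.2, 0.6]]
transformed = hellinger(freqs)

# Hypothetical predictors, one value per site.
predictors = {
    "env_PC1": [1, 2, 3, 4, 5],
    "env_PC2": [2, 4, 6, 8, 11],   # nearly collinear with env_PC1
    "dbMEM1": [3, -1, 2, -2, 0],
}
kept = drop_correlated(predictors)
```

The greedy filter depends on the order in which predictors are listed, so in practice the ordering (or a different tie-breaking rule) should be chosen deliberately.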
We conducted RDA and partial RDA analyses on our four response variable datasets (neutral and outlier datasets for N. lapillus and S. umbilicalis) using vegan. First, we conducted a forward and backward selection procedure using the ordistep function to determine the combination of predictor variables that best explained each of our response variable datasets (i.e., the model producing the highest adjusted R²). From this “best” model, we conducted partial RDAs, in which we conditioned the model to control for the influence of either geographic structure (dbMEMs), larval connectivity (AEMs) or environmental variation (PCs) by first estimating and removing their effects and then performing an RDA on the residual matrix. This allowed us to partition the variance of our “best” models and determine the amount of explainable variation in each dataset attributed to each set of predictor variables while controlling for all other variables. The significance of our models and associated predictor variables was tested using permutation-based analyses of variance (ANOVAs), implemented using the anova function in R (the stats generic, which calls vegan's anova.cca method for RDA objects), with 1,000 permutations.
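The logic of a partial RDA (remove the effect of the conditioning variables, then measure what the remaining predictors explain) can be illustrated with a small sketch. It is not the vegan implementation used in the study: it is standard-library Python, it reports unadjusted fractions of total variance rather than adjusted R², and it performs no permutation tests. All names and the toy data are hypothetical.

```python
import math

def center(col):
    """Subtract the mean from a column."""
    m = sum(col) / len(col)
    return [v - m for v in col]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(columns, against=()):
    """Modified Gram-Schmidt: orthonormalise `columns` after removing
    any component lying in the span of the `against` vectors."""
    basis = []
    for col in columns:
        v = list(col)
        for q in list(against) + basis:
            c = dot(q, v)
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-10:  # skip columns that are (numerically) redundant
            basis.append([a / norm for a in v])
    return basis

def explained_fraction(Y, X, W=()):
    """Fraction of the total variance in the response columns Y explained by
    the predictor columns X after conditioning on the columns W."""
    Yc = [center(y) for y in Y]
    w_basis = orthonormal_basis([center(w) for w in W])
    x_basis = orthonormal_basis([center(x) for x in X], against=w_basis)
    total = sum(dot(y, y) for y in Yc)
    explained = sum(dot(q, y) ** 2 for q in x_basis for y in Yc)
    return explained / total

# Hypothetical toy data for six sites; each inner list is one variable.
env = [[1, 2, 3, 4, 5, 6]]           # stand-in for an environmental PC
space = [[1, -1, 1, -1, 1, -1]]      # stand-in for a dbMEM vector
Y = [[1.1, 1.9, 3.2, 3.8, 5.1, 5.9],  # stand-ins for response PCs
     [2.0, 0.0, 2.1, -0.1, 1.9, 0.2]]

full = explained_fraction(Y, env + space)          # environment + space
env_given_space = explained_fraction(Y, env, W=space)
space_alone = explained_fraction(Y, space)
```

In this sketch the fraction explained by environment after conditioning on space, plus the fraction explained by space alone, equals the fraction explained by the full model, which is the additivity that variance partitioning relies on.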