Data analysis
Our general modeling framework included site category as a fixed effect
(HP source, LP native, old introduction, and recent introduction) and
site as a random effect nested within drainage (Fig. 1). We assigned
drainages to reflect paired HP to LP translocations. For example,
although the El Cedro is a tributary of the Guanapo River, we assigned
the two El Cedro sites their own drainage because the translocation
occurred within that river. Turure was assigned to its own drainage (as
opposed to its HP source site Guanapo) because these two sites are not
physically connected. We used this model structure to test for
differences among all univariate response variables (stream
characteristics, fish traits, gut content) with linear mixed-effects
models (R package lmer, Bates et al. 2015), after transforming
response variables to meet assumptions of normality, comparing null and
fixed-effect models using a likelihood ratio test.
Before testing statistical differences among gut microbial communities,
we performed standard pre-processing steps (Table S1). MiSeq lane did
not significantly affect the composition of our control samples
(PerMANOVA p<0.1). We assessed statistical differences in
community structure across site categories using permutational analysis
of variance (PerMANOVA) on the unweighted (presence-absence) and
weighted (incorporates abundance) Unifrac distance matrices using
phyloseq (McMurdie & Holmes 2013) and vegan (Oksanen et al.2017) in R (R Core Team 2017). Permutations were robust to the order of
factors, which included gut position, stream treatment, and site, and
were constrained by drainage as a random effect. We also tested the
effect of site category on posterior gut microbiomes (all sites), in LP
sites alone (LP natives and introductions), and within each drainage,
adjusting p-values using Bonferroni correction.
We quantified the variance in community composition explained by
environmental variables (stream temperature, which correlated to stream
pH, r2=0.42, p<0.001), gut morphology, and
gut content using distance-based redundancy analysis (function dbRDA in
vegan, Legendre and Anderson 1999). DbRDA regresses predictor variables
on a sample dissimilarity matrix and tests their significance for
explaining community distances. Our model included drainage as a
conditional factor. We partitioned the variance each predictor variable
explained using multivariate variance partitioning (function varpart,
vegan). Compared to the commonly-used envfit function, dbRDA and varpart
partition variance from the raw distance matrix, rather than correlating
variables with reduced data, which predisposes predictors to explain a
high degree of variance. Gut content analysis (which also included gut
length measurements) was performed on a subset of fish distinct from the
subset for which gut microbiome was sequenced. Therefore, diet was
described at the population level rather than paired to microbiome
measurements. To account for this, we used population-level means for
all predictor variables. This did not affect conclusions. Since the
microbiome from HP source populations was so distinct from other site
categories, we repeated all variance partitioning and dbRDA without HP
source to examine factors controlling mirobiome in LP environments.
We used the distance matrices to calculate mean distance between a
source population and its respective introductions (and the LP native
site from the corresponding drainage). Betadispersion, or the
compositional variation of the microbiome among a group of individuals,
was calculated using betadisper in vegan (R Core Team 2017). In addition
to betadispersion, we quantified microbiome richness, Shannon-Weiner
diversity (Pielou 1975), and Pielou’s Evenness (Pielou 1975) on rarefied
data. We assessed differences using the univariate linear mixed effects
models described above.