Differential expression analysis
Due to differences in timing of fish rearing, sample processing, and sequencing, we did not combine Aripo and Quare datasets for statistical analysis. We instead performed analysis in an identical fashion for both drainages and subsequently conducted separate analyses to test hypotheses concerning mechanistic parallelism across drainages (see below). Standard differential expression analysis packages could not accommodate the random effects in our experimental design (i.e. family and week; see below) but including these effects improved model fit and reflected our split-brood experimental design. Upon fitting a generalized linear mixed model with negative binomial link to each data set (detailed in the Supplemental Methods), a portion of genes had statistically significant random effects due to either family or week or both. Model comparisons between the generalized linear model without random effects and the generalized linear mixed model including week and family effects supported incorporating random effects, as likelihood ratio tests indicated better fit of the mixed models for 391 out of 19,004 genes in the Aripo dataset and 1,194 out of 19,902 genes in the Quare dataset. For consistency, we performed differential expression analysis using generalized linear mixed models with random effects for all genes as further detailed in the Supplemental Materials.
We normalized read counts using DESeq2 in R (Love, Huber, & Anders, 2014) and performed differential expression analysis using the lme4 package in R (github.com/lme4). Normalized count data were modeled using a generalized linear mixed model with negative binomial distribution. We included population of origin, rearing environment, and their interaction as fixed effects. In addition, we included family (siblings split across rearing environments) and week (tissue was collected from animals in balanced blocks across multiple weeks) as random effects. A Wald’s test provided p-values for main effects and interaction effects for each gene (Lehmann & Romano, 2005). We adjusted p-values for multiple hypothesis testing using a direct approach for FDR correction (Storey, 2002) as implemented in the fdrtool package in R (Strimmer, 2008). We considered transcripts differentially expressed (DE) if the adjusted p-value for an effect was below 0.05. To examine whether differential expression calls were influenced by transcript abundance, we compared mean and median counts of DE and non-DE genes using two sample t-tests and Wilcoxon rank sum tests. We also compared overall patterns of gene-wise variance between DE and non-DE groups with respect to either population or rearing effect using Siegel-Tukey tests and Kolmogorov-Smirnov tests. We performed GO term enrichment analysis for all sets of DE transcripts using annotation information for ‘Biological Processes’ in the topGO package in R (Alexa & Rahnenfuhrer, 2016).