2.3 Population genetic analyses of the 96 SNP array
Allele frequencies and deviations from Hardy–Weinberg proportions,
measured as $F_{IS}$, together with their associated significance
levels, were obtained for the 96 SNP Fluidigm array using GENEPOP
(version 4.3; Raymond, 1995; Rousset, 2008). Holm’s (1979) sequential
Bonferroni approach was applied to adjust significance levels for
multiple testing. $F_{ST}$ (Weir & Cockerham, 1984) was estimated
using FSTAT v2.9.4 (Goudet, 2003). CHIFISH v5.0 (Ryman & Palm, 2006;
available at http://www.zoologi.su.se/~ryman/) was used
for $F_{ST}$ significance testing.
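As an illustration of the multiple-testing adjustment, the following minimal Python sketch implements Holm's step-down procedure on a list of p-values. It is not the code used in the analyses; the function names `holm_adjust` and `holm_reject` and the significance level of 0.05 are assumptions made for this example.

```python
def holm_adjust(p_values):
    """Return Holm (1979) step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def holm_reject(p_values, alpha=0.05):
    """Flag which tests remain significant after Holm's adjustment."""
    return [p_adj <= alpha for p_adj in holm_adjust(p_values)]
```

Calling `holm_reject` on the per-locus p-values of one population would mark the loci that still deviate from Hardy–Weinberg proportions after adjustment.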
We also computed Nei’s (1973) parametric $F_{ST}$, defined as
$F_{ST} = (H_T - H_S)/H_T$, using GenAlEx v6.5 (Peakall & Smouse,
2012) to allow direct comparison with $F_{ST}$ calculated from Pool-seq
data, for which only Nei’s $F_{ST}$ can be obtained. Confidence
intervals for Nei’s $F_{ST}$ were calculated as
$F_{ST} \pm t_{df}\sqrt{s^2/n}$, where $s^2$ is the variance of
$F_{ST}$ among loci, and those for Weir & Cockerham’s $F_{ST}$ were
obtained using FSTAT (Goudet, 2003). We note that Nei’s
pairwise $F_{ST}$ is typically around half that of
Weir & Cockerham’s. To avoid confusion, we try throughout to indicate
the type of $F_{ST}$ we refer to.
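The two formulas above can be sketched in Python as follows. This is an illustrative sketch only, not the GenAlEx implementation: it assumes biallelic loci, equal weighting of populations, that the interval is centred on the mean of per-locus values, and that n is the number of loci. The critical value `t_crit` is supplied by the caller, and all function names are hypothetical.

```python
import math
import statistics


def nei_fst(freqs_per_pop):
    """Nei's (1973) F_ST = (H_T - H_S) / H_T for one biallelic locus.

    freqs_per_pop holds the frequency of one allele in each population.
    """
    # H_S: mean expected heterozygosity within populations.
    h_s = statistics.mean(2 * p * (1 - p) for p in freqs_per_pop)
    # H_T: expected heterozygosity at the mean allele frequency.
    p_bar = statistics.mean(freqs_per_pop)
    h_t = 2 * p_bar * (1 - p_bar)
    # A locus that is monomorphic overall has no defined F_ST.
    return (h_t - h_s) / h_t if h_t > 0 else float("nan")


def fst_confidence_interval(per_locus_fst, t_crit):
    """Mean F_ST +/- t_df * sqrt(s^2 / n), s^2 being the among-locus variance."""
    n = len(per_locus_fst)
    mean_fst = statistics.mean(per_locus_fst)
    s2 = statistics.variance(per_locus_fst)  # sample variance, n - 1 denominator
    half_width = t_crit * math.sqrt(s2 / n)
    return mean_fst - half_width, mean_fst + half_width
```

For example, `nei_fst([0.2, 0.8])` gives 0.36 for two populations with allele frequencies 0.2 and 0.8, and `fst_confidence_interval` would be called with the list of per-locus values and the appropriate Student's t critical value.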
We assessed the most likely number of populations ($K$) with
STRUCTURE v2.3.4 (Pritchard, Stephens, &
Donnelly, 2000), using the default model allowing population admixture
and correlated allele frequencies. We used a burn-in of 250,000 steps
and 500,000 Markov chain Monte Carlo (MCMC) replicates to estimate $Q$
(the assignment probability of each individual to each cluster) and
likelihoods for $K$ = 1–15. Estimation of the most
likely $K$ was repeated over ten runs, and the output was analyzed
using STRUCTURE HARVESTER v0.6.94 (Earl &
vonHoldt, 2012). Mean individual $Q$ values for each deme over runs
were derived from CLUMPP (Jakobsson &
Rosenberg, 2007). We based our estimate of the most likely value of
$K$ on the mean likelihood value from STRUCTURE, on $\Delta K$
(Evanno, Regnaut, & Goudet, 2005) from
STRUCTURE HARVESTER, and on results from KFinder v1.0
(Wang, 2019).
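To make the ΔK criterion concrete, the sketch below computes it from replicate log-likelihoods. It is a simplified illustration rather than the STRUCTURE HARVESTER code: it assumes ΔK is the absolute second-order difference of the run means, divided by the standard deviation of ln P(D) across runs at that K. The input layout (a dictionary mapping K to a list of ln P(D) values) and the function name are hypothetical.

```python
import statistics


def evanno_delta_k(lnp_by_k):
    """Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using run means for L."""
    means = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    sds = {k: statistics.stdev(v) for k, v in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        # Delta K is undefined at the smallest and largest K tested.
        if k - 1 not in means or k + 1 not in means:
            continue
        # Skip K values whose replicate runs show no variation.
        if sds[k] == 0:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_diff / sds[k]
    return delta_k
```

The K with the largest value, `max(result, key=result.get)`, is the one favoured by this criterion. Because ΔK cannot be computed for the first and last K, it cannot by itself support K = 1, which is one reason to consult the mean likelihood and other estimators as well.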
We also explored population structure using BAPS v6.0
(Corander, Marttinen, & Mantyniemi,
2006) and the details from this analysis are provided in Appendix S2.
We constructed an individual-based neighbor-joining phylogenetic tree
based on Nei’s D A distance estimates
(Nei, Tajima, & Tateno, 1983) from the
96 SNP array using POPTREE2 (Takezaki,
Nei, & Tamura, 2009) and MEGA X v10.0.5
(Kumar, Stecher, Li, Knyaz, & Tamura,
2018). We used the default number of bootstrap replications (1,000);
the tree was condensed to only include branches with bootstrap support
of at least 70%.
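For reference, Nei's D_A between two sets of allele frequencies is D_A = 1 - (1/r) Σ_j Σ_i sqrt(x_ij y_ij), summed over alleles i and loci j, with r the number of loci. The following Python sketch applies it to a pair of individuals; it is not the POPTREE2 implementation. It assumes biallelic SNPs coded as the count of one allele (0, 1 or 2) and no missing genotypes, and the function names are hypothetical.

```python
import math


def genotype_to_freqs(alt_allele_count):
    """Convert a diploid SNP genotype (0, 1 or 2 copies) to two allele frequencies."""
    alt = alt_allele_count / 2
    return (1 - alt, alt)


def nei_da(freqs_x, freqs_y):
    """D_A = 1 - (1/r) * sum over loci and alleles of sqrt(x_ij * y_ij)."""
    r = len(freqs_x)
    total = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Per-locus similarity: sum over alleles of sqrt(x * y).
        total += sum(math.sqrt(x * y) for x, y in zip(locus_x, locus_y))
    return 1 - total / r


def individual_distance(genotypes_a, genotypes_b):
    """D_A between two individuals given their per-locus allele counts."""
    freqs_a = [genotype_to_freqs(g) for g in genotypes_a]
    freqs_b = [genotype_to_freqs(g) for g in genotypes_b]
    return nei_da(freqs_a, freqs_b)
```

A matrix of such pairwise distances among all individuals is the kind of input a neighbor-joining algorithm takes.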
2.4 Pool-seq data processing and variant calling
We assessed the quality of the raw sequence reads of each pool using
FastQC v0.11.5 (Leggett, Ramirez-Gonzalez,
Clavijo, Waite, & Davey, 2013), and the results from different pools
were jointly evaluated using MultiQC v1.5
(Ewels, Magnusson, Lundin, & Käller,
2016). Low-quality bases (Phred score < 20) and Illumina
adapters were trimmed from the reads using BBDuk as implemented in
BBTools v38.08 (http://sourceforge.net/projects/bbmap/). The
trimmed reads were mapped against the brown trout reference assembly
(comprising 2,371,863,509 bp;
https://vgp.github.io/genomeark/Salmo_trutta/)
using the Burrows–Wheeler Aligner v0.7.17
(BWA, using the bwa mem algorithm; Li &
Durbin, 2009). The resulting BAM files were sorted, merged per pool, and
filtered for paired reads using SAMtools v1.8
(Li et al., 2009). The quality of the
BAM files obtained for each pool was evaluated with Qualimap v2.2.1
(García-Alcalde et al., 2012) and
summarized in MultiQC v1.5. Read depth histograms obtained from Qualimap
were assessed to define minimum and maximum depth thresholds for
subsequent population genomic analyses. Variant calling was performed
with SAMtools using minimum mapping quality and base quality scores of
20 and the “base alignment quality” parameter (BAQ; “-B”) to reduce
false SNPs caused by misalignments, resulting in a single mpileup file
for the two pools. We used the ‘identify-genomic-indel-regions.pl’ script of
PoPoolation2 v1201 (Kofler, Pandey, &
Schlötterer, 2011) to remove any indels from the mpileup file. No SNPs
were kept from the error-prone 5 bp upstream and downstream of indels. A
synchronized file was created for downstream analyses using the
‘mpileup2sync.jar’ script of PoPoolation2.
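As an illustration of how a synchronized file can be used downstream, the Python sketch below parses one record and applies a read depth filter. It is not part of PoPoolation2. It assumes the tab-separated sync layout of contig, position, reference base, and one A:T:C:G:N:del count column per pool. The function names, the contig name and the depth thresholds in the usage note are hypothetical.

```python
def parse_sync_line(line):
    """Split one sync record into (contig, position, ref base, per-pool counts)."""
    fields = line.rstrip("\n").split("\t")
    contig, position, ref = fields[0], int(fields[1]), fields[2]
    pools = []
    for column in fields[3:]:
        # Each pool column holds six colon-separated counts: A:T:C:G:N:del.
        a, t, c, g, n, deletion = (int(x) for x in column.split(":"))
        pools.append({"A": a, "T": t, "C": c, "G": g, "N": n, "del": deletion})
    return contig, position, ref, pools


def read_depth(counts):
    """Depth counted over the four nucleotides only (N and deletions ignored)."""
    return counts["A"] + counts["T"] + counts["C"] + counts["G"]


def passes_depth_filter(pools, min_depth, max_depth):
    """True if every pool's depth lies within [min_depth, max_depth]."""
    return all(min_depth <= read_depth(c) <= max_depth for c in pools)


def allele_frequencies(counts):
    """Per-pool nucleotide frequencies; the pool must have non-zero depth."""
    depth = read_depth(counts)
    return {base: counts[base] / depth for base in "ATCG"}
```

For a record such as `"chr1\t1234\tA\t30:0:10:0:0:0\t20:0:20:0:0:0"`, both pools have a depth of 40, so the site would be kept with hypothetical thresholds of 20 and 100, and the frequency of A is 0.75 in the first pool and 0.5 in the second.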