3 Results
After aligning markers in common for all samples, 246 neutral markers
(Hess et al., 2016) and up to 13 candidate markers from chromosome 28
(Table 1) were included for further analyses. A total of 9,471
individuals from 113 populations met inclusion criteria
(>90% of loci successfully genotyped and an estimated
genotyping error of <0.5% based on replicate genotyping) and
were included in this study.
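For illustration only, the two inclusion criteria can be written as a simple filter. This is a minimal sketch, not the pipeline used in this study: the missing-call code (-1), the array layout, the function names, and the toy genotypes are all assumptions made for the example.

```python
import numpy as np

MISSING = -1  # assumed code for a failed genotype call


def call_rate(genotypes):
    """Fraction of loci successfully genotyped for each individual (row)."""
    genotypes = np.asarray(genotypes)
    return (genotypes != MISSING).mean(axis=1)


def replicate_error_rate(run_a, run_b):
    """Fraction of discordant calls among loci called in both replicate runs."""
    run_a, run_b = np.asarray(run_a), np.asarray(run_b)
    both_called = (run_a != MISSING) & (run_b != MISSING)
    return float((run_a[both_called] != run_b[both_called]).mean())


# Invented toy data: 3 individuals x 20 loci, genotypes coded 0/1/2.
toy = np.ones((3, 20), dtype=int)
toy[1, :1] = MISSING   # 19 of 20 loci called
toy[2, :5] = MISSING   # 15 of 20 loci called

# Criterion from the text: more than 90% of loci genotyped.
keep = call_rate(toy) > 0.90

# Invented replicate genotypes for one individual: 1 discordant call out of 5.
err = replicate_error_rate([0, 1, 2, 1, 0], [0, 1, 2, 2, 0])
```

In this toy case the third individual is excluded by the call-rate criterion, and the replicate pair would fail a 0.5% error threshold.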
Population structure as visualized by the PCA of allelic frequencies of
neutral markers indicated genetic divergence by geographic locations
(Figure 2). DAPC and ΔK revealed two genetic groupings that
coincided with coastal and inland localities (Appendix S1 Figure S1).
Most coastal collections, except for Mill Creek and Indian Creek,
exhibited non-overlapping allele frequencies relative to all inland
collections. The Klickitat River, which is located between coastal and
inland populations, formed a cluster intermediate between the two population
types. Inland collections from the Yakima and Clearwater rivers
clustered distinctly from others in the study (Figure 2).
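The kind of PCA described above can be sketched with a singular value decomposition of a centered population-by-marker allele frequency matrix. The function name and the toy frequencies below are hypothetical, and the software actually used for Figure 2 is not assumed here.

```python
import numpy as np


def pca_allele_freqs(freqs, n_components=2):
    """PCA of a (populations x markers) allele frequency matrix via SVD.

    Returns the population scores and the proportion of variance
    explained by each retained component.
    """
    X = np.asarray(freqs, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each marker
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


# Invented frequencies: two "coastal-like" and two "inland-like" populations.
toy_freqs = [
    [0.10, 0.20, 0.10],
    [0.15, 0.25, 0.10],
    [0.80, 0.90, 0.70],
    [0.85, 0.80, 0.75],
]
scores, explained = pca_allele_freqs(toy_freqs)
```

With these toy values the two groups fall on opposite sides of the first axis. The sign of each axis is arbitrary in an SVD, so only relative positions are meaningful.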
A second PCA produced using candidate markers separated individuals
according to proportion of premature and mature migration genotypes
(Figure 3). In contrast to results with neutral markers that separated
individuals by sample location and population structure, the PCA with
adaptive markers separated individuals by migration timing genotypes.
Cluster membership delineated via DAPC assuming K=2 grouped
individuals from 25 putatively coastal lineage collections together and
grouped individuals from 90 putatively inland lineage collections
together in a second cluster (Appendix S1 Figure S1).
Candidate markers were analyzed for all sampling locations in Haploview
with solid spine and this resulted in two haploblocks, one with markers
1-7 and another with markers 8-13 (Figure 4a). One haplotype block
contained all markers within greb1L and another included all or
the majority of markers located within the intergenic region upstream of greb1L and rock1. There was one marker located within rock1, but it did not demonstrate LD as strong as that of other
markers included in the second haplotype block. The intergenic haplotype
block, containing markers 8-12, maintained high LD in both inland and
coastal collections.
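As background for the LD comparisons, the standard pairwise measures D' and r² can be computed from two-locus haplotype frequencies. The sketch below assumes phased, biallelic haplotypes with alleles coded 0/1. That is a simplification: Haploview estimates haplotype frequencies from unphased genotypes, so this is not a reproduction of its calculation. The function name and data are hypothetical.

```python
def pairwise_ld(haplotypes):
    """Return (D_prime, r2) for a list of phased two-locus haplotypes.

    haplotypes: list of (allele_at_locus_A, allele_at_locus_B), coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # freq of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("LD is undefined when either locus is monomorphic")

    d = p_ab - p_a * p_b                               # coefficient of disequilibrium
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, d * d / denom


# Invented data: complete association versus no association.
complete = pairwise_ld([(0, 0)] * 5 + [(1, 1)] * 5)
independent = pairwise_ld([(0, 0), (0, 1), (1, 0), (1, 1)])
```

The monomorphic-locus check matters for data like these, because fixed alleles leave both measures undefined.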
When haplotype blocks were examined separately for coastal (Figure 4b)
and inland (Figure 4c) lineages, high LD was retained at markers 8-12
for both lineages. Additionally, minor allele frequencies (MAF) were
lower for all inland markers except for candidate markers 8-12 (Appendix
S1 Table S3; Figure S2). Variation in LD occurred among markers 1-7 and
was stronger in the coastal lineage (Figure 4b-c). Elevated LD in the
coastal lineage markers resulted in one haplotype block, spanning
markers 1-12 (Figure 4b). The solid spine analysis revealed three
haplotype blocks in the inland lineage, with splits between markers
2 and 3 and between markers 7 and 8 (Figure 4c). The haplotype
block split between markers 7 and 8 observed in the inland
lineage was at the same position as the split in all collections (Figure
4a), indicating the split for all collections was influenced by the
inland collections. Further, a greater divergence between average MAF
values was observed between markers 7 and 8 of the inland
collections than in the coastal collections (Figure S2). Confidence
intervals (0.95 upper, 0.7 lower; Gabriel et al. 2002) and the four
gamete rule, which assumes recombination when all four possible
haplotypes are detected at frequencies exceeding 0.01 (frequency
> 0.02-0.03; Wang et al. 2002), were applied in further LD
analyses. Variation in haplotype blocks was observed between analyses;
the differences were the inclusion or exclusion of markers 1 and 13 and
the split between markers 5 and 6 or between markers 7 and 8. The
difference in where haplotype blocks were split could be
influenced by fixed alleles at markers 4, 6, and 7 in some collections
(Appendix S1 Table S3). All Snake River collections were limited to
markers 2, 3, 6, and 9 because these markers were developed earlier than
the rest and were the only markers available for these collections. This
resulted in limited data availability (4 instead of 13 candidate
markers) for the farthest inland collections. Haploview analysis
comparing lineages was done both with and without the individuals that
were only genotyped at 4 of the 13 markers and both analyses yielded the
same results.
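The four gamete rule mentioned above can be illustrated with a short check on a pair of biallelic markers: recombination is inferred when all four possible two-locus haplotypes are observed above a frequency threshold. This sketch assumes phased haplotypes and uses the 0.01 threshold stated in the text; the function name and toy data are hypothetical, and it is not the Haploview implementation.

```python
from collections import Counter


def four_gamete_test(haplotypes, min_freq=0.01):
    """Return True if all four two-locus haplotypes exceed min_freq.

    haplotypes: list of (allele_at_locus_A, allele_at_locus_B) for two
    biallelic loci, so at most four distinct haplotypes are possible.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    common = [hap for hap, count in counts.items() if count / n > min_freq]
    return len(common) == 4


# Invented examples.
all_four = [(0, 0)] * 40 + [(0, 1)] * 20 + [(1, 0)] * 20 + [(1, 1)] * 20
only_three = [(0, 0)] * 50 + [(0, 1)] * 25 + [(1, 1)] * 25
rare_fourth = [(0, 0)] * 100 + [(0, 1)] * 50 + [(1, 1)] * 49 + [(1, 0)] * 1

recomb_all_four = four_gamete_test(all_four)
recomb_only_three = four_gamete_test(only_three)
recomb_rare_fourth = four_gamete_test(rare_fourth)
```

In the third example the fourth haplotype is present but at a frequency of 0.005, so it does not count as evidence of recombination under the 0.01 threshold.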
We examined six different combinations of markers to determine which
markers produce similar frequency results: a single marker (9), three
markers (2,3,6), four markers (2,3,6,9), five markers (8-12), six
markers (2-7), and 11 markers (2-12). This allowed for comparison across
marker groups to determine if frequencies across different marker
combinations were similar. In general, all six marker combinations
provided similar haplotype frequencies, with
associated haplotypes differing by only 1-7% (Figure S3). The groups
with the most similar haplotype frequencies were marker 9 alone and
markers 8-12, followed by markers 2, 3, and 6 and markers 2-7. Markers
2, 3, 6, and 9 and markers 2-12 also had similar average genotype
frequencies (Figure S3).
Average genotype proportions were mapped across all collection locations
with markers 2, 3, 6, and 9 because all collections were genotyped at
these markers (Figure 5). The most common genotype in individuals
sampled was the mature genotype. The mature genotype was predominant
throughout much of the range in the Columbia River; however, many
populations west of the Cascade Mountains and in the Salmon River had
greater proportions of the premature genotype than other collections
(Figure 5). Even so, only 9 of the 113 populations had a higher
frequency of the premature alleles associated with early migration.
To evaluate haplotype frequencies for a single haplotype block in as
many locations as possible, we further scrutinized haplotypes for
markers 2, 3, 6 across the landscape and found five unique haplotypes
(Figure S4a). Haplotype frequencies for collections (Figure S4a) showed
similar patterns of geographic distribution as the genotype frequencies
(Figure 5), but with improved resolution for heterozygous haplotypes
that were within a single haplotype block underlying greb1L.
According to results of overall haplotype frequency (Figure S4a), the
heterozygote haplotype 4 is present more frequently than the premature
haplotype 5. Additionally, there is a distinct separation of
heterozygote haplotypes between coastal (haplotypes 2 and 3) and inland
(haplotype 4) collections (Figure S4a).
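Per-collection haplotype frequencies of the kind summarized above can be tabulated by counting haplotypes within each collection. In this minimal sketch, the collection labels, the three-marker haplotype strings, and the function name are invented for illustration and do not correspond to the haplotypes in Figure S4a.

```python
from collections import Counter, defaultdict


def haplotype_frequencies(records):
    """Tabulate haplotype frequencies within each collection.

    records: iterable of (collection_name, haplotype_string) pairs,
    one entry per sampled chromosome.
    """
    counts = defaultdict(Counter)
    for collection, haplotype in records:
        counts[collection][haplotype] += 1
    return {
        collection: {hap: k / sum(ctr.values()) for hap, k in ctr.items()}
        for collection, ctr in counts.items()
    }


# Invented records for two hypothetical collections.
toy_records = [
    ("coastal_A", "AGT"), ("coastal_A", "AGT"),
    ("coastal_A", "AGT"), ("coastal_A", "CGT"),
    ("inland_B", "CAC"), ("inland_B", "CAC"),
    ("inland_B", "AGT"), ("inland_B", "CAT"),
]
freqs = haplotype_frequencies(toy_records)
```

Comparing the resulting dictionaries across collections gives the same kind of information as a mapped frequency plot, although real data would first require phasing.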
To model impacts of significant environmental variables on allelic
frequencies of migration-timing-associated markers, RDAs were performed for
all Columbia River basin collections and then separately for coastal and
inland lineage collections. Significant environmental variables retained
in the RDA for all collections were migration distance, minimum
temperature of the warmest month, 20-year average August water
temperature, annual mean temperature, isothermality, and annual
precipitation (Figure 6). Annual precipitation had the greatest effect
when all collections were analyzed together (Figure 6). Environmental
variables retained in the coastal lineage RDA were average temperature
of the coldest quarter and precipitation of the wettest month (Figure
S5a). Environmental variables retained in the inland lineage RDA were
20-year average August water temperature and minimum temperature of the
warmest month (Figure S5b).
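For readers unfamiliar with redundancy analysis, its core computation is a multivariate linear regression of allele frequencies on environmental variables, followed by a PCA of the fitted values. The sketch below shows that idea with NumPy on simulated data. It omits variable selection and permutation tests, and it is not the software or settings used for Figure 6; all names and values are assumptions.

```python
import numpy as np


def rda(Y, X):
    """Minimal redundancy analysis.

    Y: (collections x markers) allele frequencies.
    X: (collections x variables) environmental predictors.
    Returns the proportion of variance in Y constrained by X, the
    constrained site scores, and the marker loadings.
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    Yc = Y - Y.mean(axis=0)                                # center responses
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)      # standardize predictors
    B, *_ = np.linalg.lstsq(Xs, Yc, rcond=None)            # multivariate regression
    Y_fit = Xs @ B                                         # fitted (constrained) values
    U, S, Vt = np.linalg.svd(Y_fit, full_matrices=False)   # PCA of fitted values
    constrained = np.sum(Y_fit ** 2) / np.sum(Yc ** 2)
    return constrained, U * S, Vt


# Simulated example: 12 collections, 2 predictors, 3 markers.
rng = np.random.default_rng(0)
env = rng.normal(size=(12, 2))
effects = np.array([[0.10, 0.00, -0.05],
                    [0.00, 0.20, 0.05]])
freq_exact = 0.5 + env @ effects                           # fully explained by env
freq_noisy = freq_exact + rng.normal(scale=0.05, size=freq_exact.shape)

prop_exact, site_scores, loadings = rda(freq_exact, env)
prop_noisy, _, _ = rda(freq_noisy, env)
```

When the responses are an exact linear function of the predictors, the constrained proportion is essentially 1; adding noise lowers it, which is the quantity an RDA biplot summarizes.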