3.2 | Population genomics
The Bayesian assignment test identified an optimal genetic clustering value of K = 4 for the combined river dataset (Figure 3A). The most common cluster was observed exclusively in four Volga River localities (VO1, VO3, VO4, and VO5) and one Meramec River locality (MO2). This cluster was also admixed in all the remaining localities. The Volga River exhibited some population substructure, with individuals from locality VO2 contained in a unique cluster with some admixture. There was greater substructure in the Meramec River, with two distinct clusters observed in localities MO1, MO3, MO4, and MO5 with some admixture.
The Volga-only dataset had an optimal genetic cluster size of K = 4 (Figure 3B). A greater amount of population substructure was observed in this dataset as compared to the combined river dataset (Figure 3A). Localities VO1, VO4 and VO5 represented unique clusters with no admixture. Locality VO3 was represented by the same genetic cluster as locality VO4 with one individual exhibiting admixture. Locality VO2 was again represented by a unique genetic cluster with some admixture. The Meramec-only dataset had an optimal genetic cluster size of K = 3 (Figure 3C). Localities MO2, MO4 and MO5 contained individuals from the same genetic cluster with no admixture. Localities MO1 and MO3 were represented by unique clusters with some admixture.
The DAPC cross-validation functions identified the number of retained PCs with the lowest RMSE as 30 for the combined river, 35 for the Volga-only, and 25 for the Meramec-only datasets. The DAPC conducted for the combined river dataset supports a genetic distinction between the Volga and Meramec river populations (Figure 4A). The Meramec River localities displayed more distinct genetic clustering than was observed within the Volga River. Within the Meramec, locality MO3 was the most distinct, and localities MO2 and MO4 were the most similar, with some overlap. Within the Volga River, individuals clustered by locality, but the clusters were closely associated and overlapping. Locality VO1 was the most distinct. The results of the DAPC for the Volga-only dataset were somewhat similar to those for the combined river dataset (Figure 4B). Individuals clustered by locality, but there was a high degree of overlap among localities VO2, VO3, and VO5. Locality VO4 displayed a greater distinction from the other localities than in the combined dataset, and locality VO1 was again the most distinct. The results of the DAPC for the Meramec-only dataset again displayed individuals clustered by locality but with less genetic distinction (Figure 4C). Localities MO2, MO3, and MO4 displayed a high degree of overlap. Localities MO1 and MO5 formed distinct clusters with greater genetic diversity. Along the scatterplot, localities are arranged in an upstream (MO1) to downstream (MO5) order.
Measures of the pairwise FST (WC84) among all ten localities in the combined river dataset showed a substantial amount of genetic distance between the Meramec and Volga river localities, with values ranging from 0.2585 to 0.3560 and an average of 0.3276 (Table S2). However, differentiation among localities within each river system was small and similar in both systems. Values in the Volga River ranged from 0.0005 to 0.0078 with an average of 0.0038, and values in the Meramec River ranged from 0.0000 to 0.0062 with an average of 0.0019. In the Volga-only and Meramec-only datasets, the pairwise FST values were small and nearly identical to the observations in the combined river dataset (Table S3).
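The within-river and between-river ranges and averages reported above are simple summaries of a pairwise FST table. The following is a minimal sketch of that bookkeeping, not the analysis pipeline used here. The function names (`river_of`, `summarize_fst`) are hypothetical, the grouping assumes locality codes with a two-letter river prefix (VO, MO), and the numbers in `toy` are placeholders rather than values from Table S2.

```python
from statistics import mean


def river_of(locality):
    """Return the river prefix, assuming codes like 'VO1' or 'MO3'."""
    return locality[:2]


def summarize_fst(pairwise):
    """Group pairwise FST values into within-river and between-river sets.

    `pairwise` maps (locality_a, locality_b) tuples to an FST value.
    Returns {group: {"min": ..., "max": ..., "mean": ...}} where group is
    a river prefix (within-river pairs) or "between" (cross-river pairs).
    """
    groups = {}
    for (a, b), fst in pairwise.items():
        ra, rb = river_of(a), river_of(b)
        key = ra if ra == rb else "between"
        groups.setdefault(key, []).append(fst)
    return {
        key: {"min": min(vals), "max": max(vals), "mean": mean(vals)}
        for key, vals in groups.items()
    }


# Placeholder values for illustration only (NOT data from this study).
toy = {
    ("VO1", "VO2"): 0.004,
    ("MO1", "MO2"): 0.002,
    ("VO1", "MO1"): 0.30,
    ("VO1", "MO2"): 0.28,
    ("VO2", "MO1"): 0.32,
    ("VO2", "MO2"): 0.30,
}

summary = summarize_fst(toy)
for group, stats in summary.items():
    print(group, stats)
```

Applied to a full ten-locality table, the same grouping would yield one "between" summary and one summary per river, which is the layout of the comparison described in the text.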
For the Volga-only dataset there was no evidence of IBD with no significant correlation between in-river distances and linearized FST values (p = 0.259, R2 = 0.245; Figure 5A). However, for the Meramec-only dataset there was evidence of IBD. There was a significant correlation between in-river distances and linearized FST values (p = 0.015, R2 = 0.904; Figure 5B). The results of the AMOVA test on the combined river dataset revealed the genetic variation observed is explained by both differences among river systems and differences among individuals within the same locality (Table 2A). The AMOVA tests on the independent river system datasets also revealed that very little of the variation is explained by differences between localities. Most of the variation is explained by differences among individuals within localities (Table 2B, C).
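The IBD comparison above relates in-river distance to linearized FST, commonly computed as FST / (1 - FST). The sketch below illustrates one way such a relationship can be tested, assuming a Mantel-style permutation test; the text does not state which test produced the reported p and R2 values, so this is an illustration rather than the method used. All function names are hypothetical, and the river positions and FST values are placeholders, not data from this study. With five localities, all 120 relabelings can be enumerated exactly.

```python
from itertools import permutations


def linearize_fst(fst):
    """Linearized FST, i.e. FST / (1 - FST)."""
    return fst / (1.0 - fst)


def upper_triangle(matrix, order):
    """Flatten the upper triangle of `matrix` after relabeling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_exact(dist, fst):
    """One-sided exact Mantel test of distance against linearized FST.

    Enumerates every relabeling of localities, so it is only practical
    for small matrices (5 localities -> 120 permutations).
    Returns (r, r_squared, p_value).
    """
    n = len(dist)
    lin = [[linearize_fst(v) for v in row] for row in fst]
    x = upper_triangle(dist, tuple(range(n)))
    r_obs = pearson_r(x, upper_triangle(lin, tuple(range(n))))
    hits = total = 0
    for perm in permutations(range(n)):
        total += 1
        if pearson_r(x, upper_triangle(lin, perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, r_obs ** 2, hits / total


# Placeholder river positions in km (NOT data from this study).
positions = [0.0, 10.0, 25.0, 40.0, 60.0]
dist = [[abs(a - b) for b in positions] for a in positions]
# Placeholder FST values that increase with distance, for illustration only.
fst = [[0.0001 * d for d in row] for row in dist]

r_obs, r2, p = mantel_exact(dist, fst)
print(f"r = {r_obs:.3f}, R2 = {r2:.3f}, p = {p:.4f}")
```

Because the smallest attainable p-value with five localities is 1/120, an exact enumeration of this kind has limited resolution, which is worth keeping in mind when interpreting IBD tests on a small number of sites.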