3.2.2. FST outliers
High genomic divergence between the demes was found across the genome (Figure 4). The POWSIM results showed that the distributions of the observed and simulated (expected) FST values differed significantly (Kolmogorov-Smirnov test, p < 0.05). The observed distribution had a higher frequency of large FST values than the expected distribution (Figure 5). The highest value in the expected distribution was FST = 0.864, and 194 SNPs (out of 12,177,462) had a larger FST in the observed distribution, of which 19 had FST = 1. Of these 194 outliers, one was associated with a gene significantly enriched (p < 0.05) for two GO terms connected with the growth process (GO:0048590): ‘extracellular matrix and structure organization’ (Table S4).
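The comparison described above can be illustrated with a minimal sketch. It computes the two-sample Kolmogorov-Smirnov statistic D (the largest gap between the two empirical cumulative distributions) and counts the observed values that exceed the largest simulated value. The function names and the toy values are hypothetical and are not taken from the study; the sketch does not compute a p-value, for which a statistics package such as SciPy (`scipy.stats.ks_2samp`) would normally be used.

```python
from bisect import bisect_right


def ks_statistic(observed, expected):
    """Two-sample KS statistic D: largest gap between the two empirical CDFs."""
    obs = sorted(observed)
    exp = sorted(expected)
    d = 0.0
    # The gap between two step functions can only change at a data point,
    # so it is enough to evaluate both CDFs at every pooled value.
    for x in obs + exp:
        cdf_obs = bisect_right(obs, x) / len(obs)
        cdf_exp = bisect_right(exp, x) / len(exp)
        d = max(d, abs(cdf_obs - cdf_exp))
    return d


def count_above_expected_max(observed, expected):
    """Number of observed values strictly larger than the largest expected value."""
    ceiling = max(expected)
    return sum(1 for value in observed if value > ceiling)


# Toy FST values (hypothetical, for illustration only).
observed_fst = [0.02, 0.10, 0.35, 0.60, 0.90, 1.00]
simulated_fst = [0.01, 0.05, 0.12, 0.20, 0.40, 0.864]

d_stat = ks_statistic(observed_fst, simulated_fst)
n_outliers = count_above_expected_max(observed_fst, simulated_fst)
print(f"D = {d_stat:.3f}, outliers above simulated maximum = {n_outliers}")
```

In this toy case the simulated maximum plays the role of the 0.864 ceiling reported above, and every observed value beyond it is counted as an outlier.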
A total of 60,887 SNPs out of 12,177,462 fell above the 99.5th percentile of the FST distribution (FST ≥ 0.53). Of these outlier SNPs, 698 were within genes, and 432 genes were suitable for topGO analysis. These 432 genes were significantly (p ≤ 0.01) enriched for 69 GO terms: 21 genes were significantly linked with GO terms associated with ‘glycosaminoglycan biosynthesis’ (Table S4). Seven genes were associated with ‘gonad morphogenesis’, while two were involved in the binding of sperm to eggs. The top five GO term superclusters (those associated with the smallest p-values) were: ‘chondroitin sulfate metabolism’ (i.e. glycosaminoglycan biosynthesis), ‘response to ozone’ (linked with animals’ response to stimulus/stress, GO:0050896), ‘cell-cell adhesion involved in gastrulation’ (linked with embryonic morphogenesis, GO:0048598), ‘fatty acid derivative metabolism’ (energy storage), and ‘reactive oxygen species metabolism’ (linked with phagocytosis and signal transduction; Figure S5).
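A percentile-based outlier cutoff of this kind can be sketched as follows. The example uses `statistics.quantiles` from the Python standard library with 200 quantile groups, so that the last cut point is the 99.5th percentile. The function name and the evenly spaced toy values are hypothetical; the interpolation method used in the study is not stated, so the cutoff here is only an approximation of the general approach.

```python
from statistics import quantiles


def percentile_995_outliers(fst_values):
    """Return (cutoff, outliers) where cutoff is the 99.5th percentile."""
    # quantiles(..., n=200) returns 199 cut points; the last one is 199/200 = 99.5%.
    cutoff = quantiles(fst_values, n=200, method="inclusive")[-1]
    outliers = [value for value in fst_values if value >= cutoff]
    return cutoff, outliers


# Toy data (hypothetical): 1,025 values evenly spaced on [0, 1].
toy_fst = [i / 1024 for i in range(1025)]

cutoff, outliers = percentile_995_outliers(toy_fst)
print(f"cutoff = {cutoff:.3f}, outliers = {len(outliers)} of {len(toy_fst)}")

# Sanity check on the reported counts: 0.5% of 12,177,462 SNPs.
print(round(0.005 * 12_177_462))
```

The final line reproduces the arithmetic behind the reported count: 0.5% of 12,177,462 SNPs is approximately 60,887.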
For 5 kb windows, 8,508 out of 340,297 windows were considered outliers, with FST values above the 97.5th percentile (FST ≥ 0.35). Of these outlier windows, 2,148 were within genes. TopGO identified 1,494 suitable genes, associated with 113 significant GO terms (p ≤ 0.01). Seven genes were significantly enriched for the GO terms ‘glucose catabolic process to pyruvate’ and ‘canonical glycolysis’ (Table S4). Four genes were enriched for the GO term ‘NADH oxidation’. Furthermore, six genes were significantly enriched for the GO term ‘sperm capacitation’. The top five GO term superclusters identified by REVIGO were: ‘phagocytosis’, ‘thrombin-activated receptor signaling pathway’, ‘ciliary body morphogenesis’, ‘peptidyl-glutamic acid modification’, and ‘fatty acid derivative biosynthesis’. These GO terms are mainly linked with metabolic (GO:0008152) or immunological (GO:0006910) processes (Figure S6).
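The window-based scan can be illustrated with a short sketch that bins SNPs into non-overlapping 5 kb windows and flags windows at or above a cutoff. This is an assumption-laden illustration: the window FST is taken here as the plain mean of per-SNP values, whereas the estimator actually used in the study is not stated and many tools use a ratio of averages instead. The function names, the 0-based window starts, and the toy SNPs are all hypothetical; only the 5 kb window size and the 0.35 cutoff come from the text.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SIZE = 5_000  # 5 kb, non-overlapping windows


def window_means(snps, window_size=WINDOW_SIZE):
    """Map (chromosome, window_start) to the mean FST of the SNPs in that window.

    snps is an iterable of (chromosome, position, fst) tuples.
    """
    bins = defaultdict(list)
    for chrom, pos, fst in snps:
        start = (pos // window_size) * window_size  # left edge of the window
        bins[(chrom, start)].append(fst)
    return {key: mean(values) for key, values in bins.items()}


def outlier_windows(means, cutoff):
    """Sorted list of windows whose mean FST is at or above the cutoff."""
    return sorted(key for key, value in means.items() if value >= cutoff)


# Toy SNPs (hypothetical): chromosome, position, per-SNP FST.
toy_snps = [
    ("chr1", 120, 0.10),
    ("chr1", 4_900, 0.30),
    ("chr1", 5_000, 0.50),
    ("chr1", 9_999, 0.40),
    ("chr2", 12_345, 0.05),
]

means = window_means(toy_snps)
flagged = outlier_windows(means, cutoff=0.35)
print(flagged)
```

With these toy values only the second window on chr1 (mean FST 0.45) reaches the 0.35 cutoff.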