Comparison of linkage groups to SNP cohorts identified in previous
work
Using the draft female genome assembly, Trevoy et al. (2019) conducted
principal component analyses and found that plateaus of high-loading
SNPs in linkage disequilibrium (LD) from a number of scaffolds were
driving clustering patterns on the first four principal component (PC)
axes; plateaus in PCs 1 and 3 were primarily related to geography, PC 2
was sex-linked, and PC 4 was much smaller and not clearly attributed to
geography or sex. To determine physical linkage and the chromosomal
locations of these SNPs, we used BLAST+ v2.10.0 (Camacho et al., 2009)
to assess the correspondence between the final female assembly and the
draft female genome scaffolds containing the SNPs with the highest
loadings, up to and including the plateaus in each PC shown in Trevoy
et al. (2019). For PCs 1 and 2, we included draft scaffolds containing
SNPs that had loadings equal to or greater than 0.05; for PC 3 this
cut-off was 0.087, and for PC 4, it was 0.1. We created a custom BLAST
database from the final female assembly, and then used BLASTn to query
the draft scaffolds for each PC against the new assembly, specifying an
e-value threshold of 10^-5. For each PC, hits were sorted first by
e-value and then by bitscore, and the single best match to the final
assembly was retained for each draft assembly scaffold.
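
The best-hit selection described above can be sketched as follows. This is
a minimal illustration, not the scripts used in the study: it assumes the
BLASTn results were written in the standard 12-column tabular format
(-outfmt 6), and the function name best_hits, the scaffold and chromosome
identifiers, and the example rows are all hypothetical.

```python
import csv
import io

# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {draft scaffold: best hit}, ranked by lowest e-value and
    then by highest bitscore. Hits above max_evalue are discarded."""
    best = {}
    reader = csv.DictReader(handle, fieldnames=OUTFMT6_COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue
        # Smaller tuples rank first: low e-value, then high bitscore.
        key = (evalue, -float(row["bitscore"]))
        current = best.get(row["qseqid"])
        if current is None or key < current[0]:
            best[row["qseqid"]] = (key, row)
    return {query: row for query, (_, row) in best.items()}


# Hypothetical hits for two draft scaffolds (values are made up).
example = "\n".join([
    "scaf_12\tchr3\t99.10\t5000\t45\t0\t1\t5000\t100001\t105000\t0.0\t9000",
    "scaf_12\tchr7\t91.00\t800\t72\t0\t10\t809\t2001\t2800\t1e-50\t600",
    "scaf_12\tchr3\t98.00\t4000\t80\t0\t1\t4000\t200001\t204000\t0.0\t7200",
    "scaf_40\tchr1\t88.00\t60\t7\t0\t1\t60\t501\t560\t0.001\t40",
])

hits = best_hits(io.StringIO(example))
for scaffold, hit in hits.items():
    print(scaffold, hit["sseqid"], hit["evalue"], hit["bitscore"])
```

In this made-up example, scaf_12 has two hits tied on e-value, so the
bitscore breaks the tie, and the only hit for scaf_40 is dropped because
its e-value exceeds the threshold.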