Peter Euclide

and 7 more

Conservation and management professionals often work across jurisdictional boundaries to identify broad ecological patterns. These collaborations help to protect populations whose distributions span political borders. One common limitation to multijurisdictional collaboration is inconsistency in data recording and reporting. This limitation can affect genetic research, which relies on data about specific markers in an organism's genome. Incomplete overlap of markers between separate studies can prevent direct comparisons. Standardized marker panels can reduce the impact of this issue and provide a common starting place for new research. Genotyping-in-thousands (GTSeq) is one approach used to create standardized marker panels for non-model organisms. Here we describe the development, optimization, and early assessments of a new GTSeq panel for use with walleye (Sander vitreus) from the Great Lakes region of North America. High genome-coverage sequencing conducted using RAD-capture provided genotypes for thousands of single nucleotide polymorphisms (SNPs). From these markers, SNP and microhaplotype markers were chosen that were informative for genetic stock identification (GSI) and kinship analysis. The final GTSeq panel contained 500 markers, including 197 microhaplotypes and 303 SNPs. Leave-one-out GSI simulations indicated that GSI accuracy should be greater than 80% in most jurisdictions. The false-positive rates of parent-offspring and full-sibling kinship identification were found to be low. Finally, genotypes were scored consistently across separate sequencing runs more than 94% of the time. Results indicate that the GTSeq panel we developed should perform well for multijurisdictional research throughout the Great Lakes region.
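
The between-run consistency figure reported above can be illustrated with a minimal sketch. The abstract does not say how consistency was calculated, so everything here is an assumption for illustration: the function name `genotype_concordance`, the keying of genotypes by (sample, marker), and the rule that a call missing in either run is excluded from the comparison. Genotypes are represented as tuples of alleles so that the same function covers both SNPs and microhaplotypes.

```python
def genotype_concordance(run_a, run_b):
    """Fraction of genotype calls that agree between two sequencing runs.

    run_a, run_b: dicts mapping (sample_id, marker_id) to a tuple of
    alleles, or None for a missing call. Hypothetical data layout.
    """
    compared = 0
    matching = 0
    for key, geno_a in run_a.items():
        geno_b = run_b.get(key)
        # Assumption: calls missing in either run are not compared.
        if geno_a is None or geno_b is None:
            continue
        compared += 1
        # Allele order within a genotype carries no meaning, so sort first.
        if sorted(geno_a) == sorted(geno_b):
            matching += 1
    if compared == 0:
        raise ValueError("no genotype calls shared between the two runs")
    return matching / compared


# Toy data, invented for illustration only.
run_a = {
    ("fish1", "snp1"): ("A", "G"),
    ("fish1", "snp2"): ("C", "C"),
    ("fish2", "snp1"): ("A", "A"),
    ("fish2", "snp2"): None,
}
run_b = {
    ("fish1", "snp1"): ("G", "A"),
    ("fish1", "snp2"): ("C", "T"),
    ("fish2", "snp1"): ("A", "A"),
    ("fish2", "snp2"): ("C", "C"),
}

print(genotype_concordance(run_a, run_b))
```

In the toy data, three calls are compared and two agree, so the concordance is two thirds.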

Seth Smith

and 11 more

Here we present an annotated, chromosome-anchored genome assembly for Lake Trout (Salvelinus namaycush), a highly diverse salmonid species of notable conservation concern and an excellent model for research on adaptation and speciation. We leveraged Pacific Biosciences long-read sequencing, paired-end Illumina sequencing, proximity ligation (Hi-C), and a previously published linkage map to produce a highly contiguous assembly composed of 7,378 contigs (contig N50 = 1.8 Mb) assigned to 4,120 scaffolds (scaffold N50 = 44.975 Mb). Of the genome, 84.7% was assigned to 42 chromosome-sized scaffolds, and 93.2% of Benchmarking Universal Single-Copy Orthologs were recovered, putting this assembly on par with the best currently available salmonid genomes. Estimates of genome size based on k-mer frequency analysis were highly similar to the total size of the finished genome, suggesting that the entirety of the genome was recovered. A mitome assembly was also produced. Self-vs-self synteny analysis allowed us to identify homeologs resulting from the salmonid-specific autotetraploid event (Ss4R), and alignment with three other salmonid species allowed us to identify homologous chromosomes in those species. We also generated multiple resources useful for future genomic research on Lake Trout, including a repeat library and a sex-averaged recombination map. A novel RNA sequencing dataset was also used to produce a publicly available set of gene annotations using the National Center for Biotechnology Information Eukaryotic Genome Annotation Pipeline. Potential applications of these resources to population genetics and the conservation of native populations are discussed.
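
For readers unfamiliar with the contig and scaffold N50 values quoted above, the following minimal sketch shows the standard definition: the length of the sequence at which the cumulative total, counting from the longest sequence downward, first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions and do not come from the Lake Trout assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings the running sum
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length


# Toy contig lengths, invented for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

The toy lengths sum to 300, and the two longest contigs (80 and 70) together reach 150, which is half the total, so the N50 is 70.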