Vivaswat Shastry

and 9 more

Infections by maternally inherited bacterial endosymbionts, especially Wolbachia, are common in insects and other invertebrates, but infection dynamics across species ranges are largely understudied. Specifically, we lack a broad understanding of the origin of Wolbachia infections in novel hosts and the factors governing their spread. We used genotyping-by-sequencing (GBS) data from previous population genomics studies for range-wide surveys of Wolbachia presence and genetic diversity in over 2,700 North American butterflies of the genus Lycaeides. As few as one sequence read identified by assembly to a Wolbachia pan-reference genome provided high accuracy in detecting infections, as determined by confirmatory PCR tests. Using a conservative threshold of five reads, we detected Wolbachia in all but two of the 107 sampling localities spanning the continent, and most localities had high infection frequencies (mean = 91% infection rate). Three major lineages of Wolbachia were identified as separate strains that appear to represent three separate invasions of Lycaeides butterflies. Overall, we found extensive evidence for acquisition of Wolbachia through interspecific transfer between host lineages. Strain wLycC was confined to a single butterfly taxon, hybrid lineages derived from it, and closely adjacent populations in other taxa. While the other two strains were detected throughout the rest of the continent, strain wLycB almost always co-occurred with wLycA. Our demographic modeling suggests wLycB is a recent invasion. These results demonstrate the utility of using resequencing data from hosts to quantify Wolbachia genetic variation and provide evidence of multiple colonizations of novel hosts through hybridization between butterfly lineages and complex dynamics between Wolbachia strains.
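The read-count threshold described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input layout (one locality label and one Wolbachia read count per butterfly), and the example counts are all hypothetical, and the threshold is assumed to mean "at least five reads".

```python
from collections import defaultdict

def infection_frequencies(read_counts, min_reads=5):
    """Return the fraction of individuals called infected at each locality.

    read_counts: iterable of (locality, n_wolbachia_reads), one per individual.
    min_reads: assumed threshold; an individual is called infected when its
               Wolbachia read count is >= min_reads.
    """
    totals = defaultdict(int)
    infected = defaultdict(int)
    for locality, n_reads in read_counts:
        totals[locality] += 1
        if n_reads >= min_reads:
            infected[locality] += 1
    return {loc: infected[loc] / totals[loc] for loc in totals}

# Made-up read counts, for illustration only
example = [
    ("site_A", 0), ("site_A", 12), ("site_A", 7),
    ("site_B", 4), ("site_B", 5),
]
freqs = infection_frequencies(example)
print(freqs)
```

Lowering `min_reads` to 1 corresponds to the most permissive call mentioned in the abstract. The default of 5 mirrors the conservative threshold.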

Maya Gans

and 3 more

Community ecology includes linking variation in system functions to the distribution and abundance of taxa. In inferring processes, functions, and causal taxa, it is common practice to assume a core community can be defined and that attributes of the core are representative of the entire dataset. Assuming categorical thresholds in abundance exist has the potential to be misleading, especially if rare taxa are contributing to ecological processes. Additionally, there are no standard criteria for core membership, complicating comparisons across studies. Rather, the existence of a core set of taxa can be treated as a hypothesis that may or may not be supported. We considered four methods commonly used for defining a core in studies of microbiomes and applied them to two published microbial data sets and simulations covering a range of plausible communities. We evaluated the ability of each method to correctly categorize taxa. Assignment of core taxa varied substantially among methods and datasets. Additionally, the ability of evaluated methods to capture the simulated core was contingent on the distribution of taxon abundances. While able to correctly identify core taxa in select cases, the methods disagreed more often than not. Given the lack of agreement among core assignment methods, categorization of taxa into sets corresponding to core and non-core is questionable and requires testing and validation before use in any particular context. Our results do not support applying methods of dimension reduction for core taxa classification, but instead provide additional rationale to favor analyses that use abundance data in their entirety.
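To make the sensitivity to categorical thresholds concrete, here is a small sketch of one possible core definition, a prevalence cutoff. The abstract does not name the four methods that were evaluated, so this rule, the function name, and the toy abundance table are assumptions chosen only for illustration.

```python
def core_by_prevalence(abundance, min_prevalence):
    """Return taxa present in at least min_prevalence of the samples.

    abundance: dict mapping taxon name -> list of counts, one per sample.
    min_prevalence: fraction of samples (0 to 1) in which a taxon must occur.
    """
    core = set()
    for taxon, counts in abundance.items():
        prevalence = sum(1 for c in counts if c > 0) / len(counts)
        if prevalence >= min_prevalence:
            core.add(taxon)
    return core

# Hypothetical count table: three taxa across four samples
table = {
    "taxon_1": [10, 8, 12, 9],
    "taxon_2": [1, 0, 2, 1],
    "taxon_3": [0, 0, 5, 0],
}
core_strict = core_by_prevalence(table, 1.0)
core_loose = core_by_prevalence(table, 0.75)
print(core_strict, core_loose)
```

In this toy table, moving the cutoff from 100% to 75% of samples changes which taxa count as core, which is the kind of dependence on an arbitrary threshold the abstract cautions against.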

Vivaswat Shastry

and 6 more

Non-random mating among individuals can lead to spatial clustering of genetically similar individuals and population stratification. This deviation from panmixia is commonly observed in natural populations. Consequently, an individual's parentage can come from a single population or involve hybridization between differentiated populations. Accounting for this mixture and structure is important when mapping the genetics of traits and learning about the formative evolutionary processes that shape genetic variation among individuals and populations. Stratified genetic relatedness among individuals is commonly quantified using estimates of ancestry that are derived from a statistical model. Development of these models for polyploid and mixed-ploidy individuals and populations has lagged behind that for diploids. Here, we extend and test a hierarchical Bayesian model, called entropy, which can utilize low-depth sequence data to estimate genotype and ancestry parameters in autopolyploid and mixed-ploidy individuals (including sex chromosomes and autosomes within individuals). Our analysis of simulated data illustrated the trade-off between sequencing depth and genome coverage and found lower error associated with low-depth sequencing across a larger fraction of the genome than with high-depth sequencing across a smaller fraction of the genome. The model has high accuracy and sensitivity, as verified with simulated data and through analysis of admixture among populations of diploid and tetraploid Arabidopsis arenosa.
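The sketch below shows why low-depth reads leave genotypes uncertain at a single site, which is what motivates estimating genotypes within a probabilistic model. It is a generic binomial read-sampling calculation with a flat prior over allele dosage, not the entropy model itself. The function name, the fixed error rate, and the example read counts are assumptions.

```python
from math import comb

def genotype_posterior(alt_reads, total_reads, ploidy, error=0.01):
    """Posterior over alternate-allele dosage (0..ploidy) at one site.

    Assumes reads are drawn binomially, a symmetric sequencing error rate,
    and a flat prior over dosages (all three are simplifying assumptions).
    """
    likelihoods = []
    for dosage in range(ploidy + 1):
        frac = dosage / ploidy
        # Probability that a single read shows the alternate allele
        p_alt = frac * (1 - error) + (1 - frac) * error
        likelihoods.append(
            comb(total_reads, alt_reads)
            * p_alt ** alt_reads
            * (1 - p_alt) ** (total_reads - alt_reads)
        )
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]

# A tetraploid individual with 1 alternate read out of 3 reads (made-up counts)
post = genotype_posterior(alt_reads=1, total_reads=3, ploidy=4)
print([round(p, 3) for p in post])
```

With only three reads, the posterior is spread across several dosages rather than concentrated on one. Setting `ploidy=2` gives the diploid case.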