2.5 Genome-wide test of selection
To detect positive selection outside the coding regions of genes, we
used a maximum likelihood analysis of the haplotype frequency spectrum
across the genome to identify putative targets of positive
selection via signatures of both soft and hard sweeps. For this purpose,
we used LASSI Plus (Harris & DeGiorgio, 2020) and the saltiLASSI
statistic (DeGiorgio & Szpiech, 2022). This approach can use
unphased sequencing data to infer haplotypes and identify genomic
regions within population samples whose haplotype frequencies depart
more than expected from background genomic patterns, which are taken
to represent neutrality. The method estimates both the likelihood that
a given haplotype is sweeping and the inferred width of the sweep and
number of haplotypes sweeping within a given species.
To avoid reference bias, we aligned our short-read sequencing data from
15 individuals of each species to their respective reference genomes
(Yang et al., 2021) using bwa-mem2 v.2.0pre2 (Vasimuddin, Misra, Li,
& Aluru, 2019), and then called variants using bcftools
v.1.13-35-ge3ba077 to generate an all-sites VCF (Danecek et al., 2021).
The resulting VCFs were filtered to retain sites with a quality score
above 30 (QUAL > 30) and a read depth of 5-50, and to exclude indels and
sites with more than 2 alternative alleles. Inferences of selective sweeps were
made using the salti statistic in the LASSI Plus software package (k = 10,
window size = 52, step size = 12). To identify any outlier windows across
the genome, we extracted all windows with a salti statistic (L) more
than 4 standard deviations above the mean. L is a composite likelihood
ratio test statistic that measures how strongly the haplotype frequency
spectrum in a given window is distorted relative to the genomic background.
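
The site filters described above can be illustrated with a minimal Python sketch. This is not the bcftools command used in the study. The `Site` record and the `passes_filters` function are hypothetical names introduced for illustration, and the sketch assumes that the depth bounds are inclusive, that depth is evaluated per site, and that alleles are stored as plain base strings (so any allele longer than one base is treated as an indel).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Site:
    """Hypothetical, simplified view of one VCF record."""
    qual: float            # QUAL column
    depth: int             # read depth at the site (assumed per-site)
    ref: str               # reference allele
    alts: Tuple[str, ...]  # alternative alleles; empty for invariant sites


def passes_filters(site: Site,
                   min_qual: float = 30.0,
                   min_depth: int = 5,
                   max_depth: int = 50,
                   max_alts: int = 2) -> bool:
    """Return True if the site survives all four filters."""
    # Quality: keep only calls with QUAL strictly above the cutoff.
    if site.qual <= min_qual:
        return False
    # Depth: keep sites within the range (bounds assumed inclusive).
    if not (min_depth <= site.depth <= max_depth):
        return False
    # Allele count: no more than max_alts alternative alleles.
    if len(site.alts) > max_alts:
        return False
    # Indels: any allele that is not a single base is treated as an indel.
    if len(site.ref) != 1 or any(len(a) != 1 for a in site.alts):
        return False
    return True


if __name__ == "__main__":
    sites = [
        Site(qual=45.0, depth=20, ref="A", alts=("G",)),   # kept
        Site(qual=12.0, depth=20, ref="A", alts=("G",)),   # low quality
        Site(qual=45.0, depth=80, ref="A", alts=("G",)),   # depth too high
        Site(qual=45.0, depth=20, ref="AT", alts=("A",)),  # indel
    ]
    print([passes_filters(s) for s in sites])
```

Invariant sites (no alternative allele) pass the allele-count and indel checks in this sketch, in keeping with the use of an all-sites call set.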
Finally, to compare the distribution and location of sweeps
among species while avoiding the reference bias that would arise from
aligning all samples to a single reference, we scaffolded each species'
genome against a common chromosome-level assembly from a species in a
sister genus, the beetle Lochmaea crataegi (NCBI:
GCA_947563755.1). Scaffolding was performed using RagTag v.2.1.0 with
default settings and minimap2 as the aligner (Alonge et al., 2022), with alignments
filtered to remove any contigs shorter than 50 kb. To identify outlier
loci (i.e., those likely to have experienced a sweep), we extracted all
windows with a salti statistic (L) greater than 4 standard deviations
above the mean.
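
The outlier rule above amounts to a simple threshold, sketched here in Python. The function name `outlier_windows` is hypothetical, and two details are assumptions rather than statements from the analysis: the mean and standard deviation are computed over all windows passed in (for example, all windows in one species), and the population standard deviation is used. With many windows, the sample and population estimates are nearly identical.

```python
import statistics
from typing import List, Sequence, Tuple


def outlier_windows(scores: Sequence[float],
                    n_sd: float = 4.0) -> Tuple[float, List[int]]:
    """Return the cutoff and the indices of windows whose score exceeds it.

    The cutoff is mean + n_sd * standard deviation of the supplied scores.
    """
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # population SD (an assumption)
    threshold = mean + n_sd * sd
    # Strict inequality: a window must exceed the cutoff to be an outlier.
    outliers = [i for i, s in enumerate(scores) if s > threshold]
    return threshold, outliers


if __name__ == "__main__":
    # Toy data: 99 background windows and one strongly elevated window.
    toy_scores = [1.0] * 99 + [100.0]
    cutoff, hits = outlier_windows(toy_scores)
    print(f"cutoff = {cutoff:.2f}, outlier windows = {hits}")
```

Because a strong outlier inflates both the mean and the standard deviation, this rule is conservative when only a few windows carry very large scores.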