RESULTS
Genomic Datasets
Our final filtered datasets comprised 11,600 and 3,844 polymorphic SNPs
in S. umbilicalis and N. lapillus, respectively (Table 1).
The filtered S. umbilicalis dataset contained 196 individuals,
with an average of 16.33 samples per site (range of 6-19), and an
average of 11.4% missing data across all populations (range of
2.1-29.2%; Table S4). The filtered N. lapillus dataset contained
200 samples, with an average of 16.67 samples per population (range of
13-19), and an average of 11.7% missing data across all populations
(range of 1.8-29.9%; Table S4).
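
As a quick consistency check on the sample sizes reported above, the number of sampling sites implied by each dataset can be recovered from the total sample count and the per-site mean. The sketch below assumes the reported averages are simple arithmetic means across sites; the function name is illustrative and not part of the original analysis.

```python
def implied_site_count(n_individuals: int, mean_per_site: float) -> int:
    """Return the number of sites implied by a total sample size and a per-site mean.

    Assumes mean_per_site is a simple arithmetic mean (total / number of sites),
    rounded to two decimals as reported in the text.
    """
    return round(n_individuals / mean_per_site)


# Values taken directly from the text above.
s_umbilicalis_sites = implied_site_count(196, 16.33)
n_lapillus_sites = implied_site_count(200, 16.67)

print(s_umbilicalis_sites, n_lapillus_sites)
```

Under that assumption, both datasets are consistent with the same number of sampling sites.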
There were substantially fewer polymorphic loci in the N.
lapillus dataset, largely because the read-depth filter (mean
read depth <10) removed 93.35% of SNPs, compared with
57.66% in the S. umbilicalis dataset. This result
highlights an important limitation of RADseq datasets, in that they only
capture a portion of genome-wide variation (Lowry et al. 2017a).
Consequently, for a given sequencing effort, species with larger genomes
will have fewer genomic regions sequenced at sufficient coverage for
reliable variant detection.
This suggests that N. lapillus has a much larger genome size than S. umbilicalis. Further, as gene content does not scale
proportionally with genome size (Lowry et al. 2017a,b), the presumed
larger genome of N. lapillus should also result in fewer coding
sequences being sampled and fewer adaptive SNPs being detected, which
was also evident here.
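
The coverage argument above can be illustrated with a minimal sketch. It assumes that restriction sites occur at a roughly constant spacing along the genome and that reads are spread evenly across the resulting loci, so expected per-locus depth falls in inverse proportion to genome size at a fixed sequencing effort. All numbers below (genome sizes, read count, cut-site spacing) are hypothetical placeholders chosen for illustration; they are not measurements for S. umbilicalis or N. lapillus.

```python
def expected_mean_depth(genome_size_bp: float,
                        reads_per_individual: float,
                        cut_site_spacing_bp: float) -> float:
    """Expected mean read depth per RAD locus under a deliberately simple model.

    Assumptions (all simplifications):
      - one locus per restriction site, with sites evenly spaced;
      - reads distributed uniformly across loci.
    """
    n_loci = genome_size_bp / cut_site_spacing_bp
    return reads_per_individual / n_loci


# Hypothetical values for illustration only.
READS = 1_000_000      # reads per individual (assumed)
SPACING = 50_000       # mean distance between cut sites in bp (assumed)
MIN_DEPTH = 10         # depth threshold used in the filtering step described above

depth_small = expected_mean_depth(1e9, READS, SPACING)   # smaller hypothetical genome
depth_large = expected_mean_depth(6e9, READS, SPACING)   # larger hypothetical genome

print(f"smaller genome: {depth_small:.1f}x, passes filter: {depth_small >= MIN_DEPTH}")
print(f"larger genome:  {depth_large:.1f}x, passes filter: {depth_large >= MIN_DEPTH}")
```

In this toy model, a sixfold increase in genome size reduces the expected depth sixfold, which is enough to push the average locus below the depth threshold. Real RADseq data depart from the model because cut-site density, repeat content, and library preparation all vary among species.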