ddRAD genotyping and SNP filtering
DNA was extracted using a modified CTAB method (Milligan, 1992). The DNA
samples were quantified using a Qubit 2.0 Fluorometer (Invitrogen, MA,
USA) and adjusted to 12.6 ng/μl through dilution with TE buffer.
Sequencing libraries were prepared following a modified version of the
ddRAD-seq protocol of Peterson et al. (2012); detailed library
preparation methods are given in Appendix 1. The libraries were
sequenced using a HiSeq 2000 platform (Illumina, CA, USA) with 51 bp
single-end reads at BGI Japan (Kobe, Japan).
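As an illustration of the dilution step described above, the volumes of DNA extract and TE buffer needed to reach 12.6 ng/μl can be obtained from C1 × V1 = C2 × V2. The sketch below is not part of the original protocol: the function name, the 50.4 ng/μl stock concentration, and the 20 μl final volume are hypothetical values chosen only for the example.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock_volume, buffer_volume) from C1 * V1 = C2 * V2.

    Concentrations share one unit (e.g. ng/ul); volumes share another (e.g. ul).
    """
    if stock_conc < target_conc:
        # A sample already below the target cannot be reached by dilution.
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical sample: a 50.4 ng/ul extract brought to 12.6 ng/ul in 20 ul.
dna_ul, te_ul = dilution_volumes(50.4, 12.6, 20.0)
print(f"{dna_ul:.1f} ul DNA + {te_ul:.1f} ul TE buffer")
```

A sample whose Qubit reading is already below the target raises an error here, since it cannot be adjusted by dilution alone.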
SNPs were detected using dDocent (Puritz et al., 2014) and Stacks
(Catchen et al., 2011, 2013), resulting in three datasets: denovo,
referenced, and demography. The detection
conditions and number of SNPs used in each analysis are summarized in
Table S1. In all datasets, we excluded five individuals with low
individual-level genotyping rates from SNP detection. In the referenced
and denovo datasets, SNPs were detected using dDocent, following its
tutorial. For the denovo dataset, the reference genome of
C. subpubescens was not used as a reference sequence and the two
outgroup individuals were not included. In contrast, for the
referenced dataset, both the reference genome and the two outgroup
individuals were used. The raw SNPs generated by dDocent were filtered using
VCFtools version 0.1.14 to meet the conditions outlined in Table S1. For the
demography dataset, SNPs were re-extracted from the .bam files that
dDocent had created for the referenced dataset. First, gstacks from Stacks
version 2.60 was used to generate catalogs of variable sites (Rochette
et al., 2019). Subsequently, the populations program from Stacks was used to
extract SNPs with the following options: -r 0.8 -p X --min-mac 1
--max-obs-het 0.5 --vcf (where X represents the number of
species/ecotypes in each dataset). The pairwise two-dimensional minor
allele site frequency spectrum (2D-mSFS) was calculated from the .vcf
file using the R script 2D-msfs-R
(https://github.com/garageit46/2D-msfs-R). Missing data were addressed
through bootstrapping within the same ecotype.
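To make the 2D-mSFS step concrete, the following minimal sketch shows one way such a spectrum could be tabulated for a pair of groups. It is not the 2D-msfs-R implementation. It assumes that genotypes are coded as alternate-allele counts (0, 1, 2) with None for missing calls, that "bootstrapping within the same ecotype" means resampling observed genotypes at the same site from the same group, and that the minor allele is defined across both groups pooled. The function names and the toy data are hypothetical.

```python
import random


def fill_missing(genotypes, rng):
    """Replace None by resampling observed genotypes at the same site in the
    same group (an assumed reading of 'bootstrapping within the same ecotype')."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        return None  # no observed genotype to resample from
    return [g if g is not None else rng.choice(observed) for g in genotypes]


def two_d_msfs(pop1_sites, pop2_sites, seed=0):
    """Tabulate a 2D minor-allele SFS from per-site genotype lists.

    Each site is a list of alternate-allele counts (0, 1, 2) or None, one
    entry per diploid individual. Entry [i][j] counts the sites where the
    minor allele occurs i times in group 1 and j times in group 2.
    """
    rng = random.Random(seed)
    n1 = 2 * len(pop1_sites[0])  # chromosomes sampled in group 1
    n2 = 2 * len(pop2_sites[0])  # chromosomes sampled in group 2
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for g1, g2 in zip(pop1_sites, pop2_sites):
        f1, f2 = fill_missing(g1, rng), fill_missing(g2, rng)
        if f1 is None or f2 is None:
            continue  # skip sites with an entirely missing group
        a1, a2 = sum(f1), sum(f2)
        # Fold: if the alternate allele is the pooled major allele, count the
        # other allele instead. Exact ties are left unfolded in this sketch.
        if a1 + a2 > (n1 + n2) / 2:
            a1, a2 = n1 - a1, n2 - a2
        sfs[a1][a2] += 1
    return sfs


# Toy data: three sites, two diploid individuals per group.
pop1 = [[0, 1], [2, 2], [0, None]]
pop2 = [[0, 0], [2, 1], [1, 1]]
example_sfs = two_d_msfs(pop1, pop2)
```

In this toy input the only missing genotype has a single observed value to draw from, so the result does not depend on the random seed. With real data, repeating the resampling over several seeds would show how sensitive the spectrum is to the missing calls.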