2.5 Genetic structure of A. buergeriana var.buergeriana populations
Leaf samples were obtained from 6–21 individuals in each study population, and DNA was extracted by the CTAB method (Doyle & Doyle, 1990). We performed MIG-seq (Suyama & Matsuki, 2015) to detect genome-wide SNPs. A MIG-seq library was prepared following to the protocol of Suyama et al. (in press). A first PCR was performed to amplify inter-simple sequence repeat regions using MIG-seq primer set 1 (Suyama & Matsuki, 2015), and then a second PCR was performed on the purified/equalized first PCR product to add the sequences necessary for sequencing on the MiSeq system and for sample identification. The second PCR products were pooled and fragments of 350 bp or more were isolated. We used MiSeq Reagent Kit v3 (150 cycle, Illumina) and performed sequencing with an Illumina MiSeq Sequencer (Illumina, San Diego, CA, USA) following the manufacturer’s protocol. We used the ”DarkCycle” option to skip sequencing of the first 17 bases of reads 1 and 2 (simple sequence repeat primer regions and anchors).
Low-quality reads and extremely short reads containing adaptor sequences were removed by using trimmomatic 0.39 (Bolger et al., 2014). De novo SNP discovery was performed by using the Stacks 2.41 software pipeline (Catchen et al., 2013; Rochette et al., 2019). For de novo SNP discovery, we used the following parameters: minimum depth of coverage required to create a stack (m) = 3, maximum distance between stacks (M) = 2, and maximum mismatches between loci when building the catalog (n) = 2. Three different filtering criteria were applied for quality control of the SNP data. First, SNPs that were retained by 80% or more samples were included in the SNP dataset. Second, SNPs with a minor allele frequency of less than 0.05 were removed. Third, loci containing SNPs with extremely high observed heterozygosity (H o ≥ 0.6) were removed. Fourth, after performing a Hardy-Weinberg equilibrium test on each population, we excluded loci where allele frequencies deviated from the Hardy-Weinberg equilibrium at P < 0.01 in three or more populations.
The following population genetic statistics of SNP sites for each population were calculated with the Stacks populations module: expected heterozygosity H e, observed heterozygosity H o, nucleotide diversity π, and inbreeding coefficient F IS(Hartl & Clark, 1997). The population genetic structure was examined by PCA using PLINK 1.9 (Purcell et al., 2007). In addition, a Bayesian clustering analysis was performed with STRUCTURE software version 2.3.4 (Pritchard et al., 2000; Falush et al., 2003). For the STRUCTURE analysis, simulations were performed with 100k burn-in iterations and 100k Markov chain Monte Carlo iterations. The number of genetic clusters (K ) was calculated 10 times for each possible K value from 1 to 10, and the appropriate number of clusters was estimated based on the ΔK value (Evanno et al., 2005). Then, to examine the genetic structure within each mountain region in more detail, we performed additional STRUCTURE analyses. First, SNP re-detection was performed in each of three mountain regions, the Utsukushigahara, Norikura+Ontake, and Iizuna regions based on results of the initial analysis. The population structure obtained based on all samples, with the above filtering criteria used in SNP detection. Second, 10 independent STRUCTURE analysis runs were performed for each mountain region with 100,000 burn-in steps and an additional 100,000 steps with the admixture model; log-likelihood values were estimated for each possible Kvalue (K = 1–10), and the appropriate number of clusters was estimated based on the ΔK .