2.5 Genetic structure of A. buergeriana var.buergeriana populations
Leaf samples were obtained from 6–21 individuals in each study
population, and DNA was extracted by the CTAB method (Doyle & Doyle,
1990). We performed MIG-seq (Suyama & Matsuki, 2015) to detect
genome-wide SNPs. A MIG-seq library was prepared following to the
protocol of Suyama et al. (in press). A first PCR was performed to
amplify inter-simple sequence repeat regions using MIG-seq primer set 1
(Suyama & Matsuki, 2015), and then a second PCR was performed on the
purified/equalized first PCR product to add the sequences necessary for
sequencing on the MiSeq system and for sample identification. The second
PCR products were pooled and fragments of 350 bp or more were isolated.
We used MiSeq Reagent Kit v3 (150 cycle, Illumina) and performed
sequencing with an Illumina MiSeq Sequencer (Illumina, San Diego, CA,
USA) following the manufacturer’s protocol. We used the ”DarkCycle”
option to skip sequencing of the first 17 bases of reads 1 and 2 (simple
sequence repeat primer regions and anchors).
Low-quality reads and extremely short reads containing adaptor sequences
were removed by using trimmomatic 0.39 (Bolger et al., 2014). De
novo SNP discovery was performed by using the Stacks 2.41 software
pipeline (Catchen et al., 2013; Rochette et al., 2019). For de
novo SNP discovery, we used the following parameters: minimum depth of
coverage required to create a stack (m) = 3, maximum distance between
stacks (M) = 2, and maximum mismatches between loci when building the
catalog (n) = 2. Three different filtering criteria were applied for
quality control of the SNP data. First, SNPs that were retained by 80%
or more samples were included in the SNP dataset. Second, SNPs with a
minor allele frequency of less than 0.05 were removed. Third, loci
containing SNPs with extremely high observed heterozygosity (H o ≥
0.6) were removed. Fourth, after performing a Hardy-Weinberg equilibrium
test on each population, we excluded loci where allele frequencies
deviated from the Hardy-Weinberg equilibrium at P <
0.01 in three or more populations.
The following population genetic statistics of SNP sites for each
population were calculated with the Stacks populations module: expected
heterozygosity H e, observed heterozygosity H o, nucleotide
diversity π, and inbreeding coefficient F IS(Hartl & Clark, 1997). The population genetic structure was examined by
PCA using PLINK 1.9 (Purcell et al., 2007). In addition, a Bayesian
clustering analysis was performed with STRUCTURE software version 2.3.4
(Pritchard et al., 2000; Falush et al., 2003). For the STRUCTURE
analysis, simulations were performed with 100k burn-in iterations and
100k Markov chain Monte Carlo iterations. The number of genetic clusters
(K ) was calculated 10 times for each possible K value from
1 to 10, and the appropriate number of clusters was estimated based on
the ΔK value (Evanno et al., 2005). Then, to examine the genetic
structure within each mountain region in more detail, we performed
additional STRUCTURE analyses. First, SNP re-detection was performed in
each of three mountain regions, the Utsukushigahara, Norikura+Ontake,
and Iizuna regions based on results of the initial analysis. The
population structure obtained based on all samples, with the above
filtering criteria used in SNP detection. Second, 10 independent
STRUCTURE analysis runs were performed for each mountain region with
100,000 burn-in steps and an additional 100,000 steps with the admixture
model; log-likelihood values were estimated for each possible Kvalue (K = 1–10), and the appropriate number of clusters was
estimated based on the ΔK .