Fig. 2. PCA of landscape factors of Taihang Mountains.
PCA was performed on 160 individual landscape factors with VIF < 5.
Eight landscape factors were finally retained: average precipitation in
August, average precipitation in October, average precipitation in
November, built-up land (residential and infrastructure), rain-fed
cultivated land, workability (restricted site management), solar
radiation in August, and soil pH.
2.6 FST outliers filtered and selected SNPs identified
FST outliers generally indicate genes or loci under selection among
populations. To identify the selected SNPs in the two Opisthopappus
species, BayeScan 2.1 was used to filter the FST outliers (Fischer et
al., 2011; Foll et al., 2010; Ruan et al., 2021). Prior odds of the
selection model were set at 10,000 to reduce false-positive results
under a variety of demographic events. A logarithmic scale for model
choice of selection over neutrality was defined as: substantial
(log10PO > 0.5, FST 0-0.05), strong (log10PO > 1.0, FST 0.05-0.15),
very strong (log10PO > 1.5, FST 0.15-0.25) and decisive (log10PO > 2,
FST > 0.25). A gene or locus with log10PO > 0.5 was considered a
potential outlier under natural selection (Feng et al., 2015). Finally,
the 29 genes/loci identified by BayeScan were considered putative SNPs
under selection. These filtered SNPs were extracted from the VCF file
and retained for the subsequent landscape association analysis.
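The evidence scale above can be illustrated with a minimal Python sketch. It only restates the log10(PO) cut-offs given in the text; the function names and the example SNP identifiers and values are hypothetical and are not part of BayeScan or of this study's data.

```python
def classify_log10_po(log10_po):
    """Map a log10(PO) value to the evidence category described in the text."""
    if log10_po > 2.0:
        return "decisive"
    if log10_po > 1.5:
        return "very strong"
    if log10_po > 1.0:
        return "strong"
    if log10_po > 0.5:
        return "substantial"
    return "not an outlier"


def select_outliers(records, threshold=0.5):
    """Keep (snp_id, log10_po) pairs whose log10(PO) exceeds the threshold."""
    return [(snp, po) for snp, po in records if po > threshold]


# Hypothetical values for illustration only (not results from the study).
example = [("snp_1", 0.2), ("snp_2", 0.7), ("snp_3", 1.2), ("snp_4", 2.4)]
outliers = select_outliers(example)
labels = {snp: classify_log10_po(po) for snp, po in outliers}
```

In practice the log10(PO) values would be read from the BayeScan output table, and the retained SNP identifiers would then be used to subset the VCF file.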
2.7 Association of SNPs with landscape factors
Associations between SNPs and landscape factors were assessed using
Samβada v.0.9.0 and the latent factor mixed model (LFMM) software
(Chien et al., 2020; Feng et al., 2015; Ruan
et al., 2021). Samβada builds logistic regressions to estimate an
individual’s probability of presenting a particular molecular marker
depending on the landscape factors that characterize its sampling site
(Li et al., 2019; Vargas-Mendoza et al., 2016).
To describe the landscape factors of each population accurately, the
first four principal components of the principal component analysis
(PC1-4) were chosen, which together explained 77.04% of the variation
in landscape features. In Samβada, the effect of each landscape factor
was tested by adding one factor at a time to the population landscape
factors (dimension P+1), and the more likely model (with or without the
landscape factor) was assessed. For each tested model, Samβada created
an output file containing the model parameters, log-likelihood, G
score, Wald score, AIC, and BIC. To ensure model accuracy, all models
were screened according to their AIC values (Mahtani-Williams et al.,
2020; Stucki et al., 2017). The 29 valid models with the smallest AIC
values were then selected, and the proportion of each factor among
these 29 models was counted. These 29 models involved a total of three
genes among the selected SNPs. These genes were subsequently subjected
to KEGG annotation (https://www.genome.jp/kegg/).
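As a rough illustration of the kind of model comparison described above, the following Python sketch fits a univariate logistic regression of marker presence on one landscape factor, derives the log-likelihood, G score, AIC and BIC, and ranks candidate models by AIC. It is not Samβada's code: the Newton-Raphson fitting routine, the function names, and the toy data are all assumptions made for illustration.

```python
import math


def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1*x by Newton-Raphson; return (b0, b1, loglik)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        # Newton step: solve the 2x2 system H * delta = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    loglik = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        loglik += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return b0, b1, loglik


def model_summary(x, y):
    """Compare the one-factor model with the constant model.

    Assumes the marker is polymorphic (both 0 and 1 occur in y).
    """
    n, n1 = len(y), sum(y)
    p_null = n1 / n
    ll_null = n1 * math.log(p_null) + (n - n1) * math.log(1.0 - p_null)
    _, slope, ll = fit_logistic(x, y)
    k = 2  # intercept + one landscape factor
    return {
        "slope": slope,
        "loglik": ll,
        "G": 2.0 * (ll - ll_null),        # likelihood-ratio (G) score
        "AIC": 2 * k - 2.0 * ll,
        "BIC": k * math.log(n) - 2.0 * ll,
    }


# Toy data (hypothetical): marker presence in 8 individuals, two factors.
presence = [0, 0, 1, 0, 1, 1, 0, 1]
factors = {
    "factor_A": [1, 2, 3, 4, 5, 6, 7, 8],
    "factor_B": [5, 1, 4, 8, 2, 7, 3, 6],
}
summaries = {name: model_summary(v, presence) for name, v in factors.items()}
ranking = sorted(summaries, key=lambda name: summaries[name]["AIC"])
```

Sorting by AIC and keeping the top entries mirrors the selection of the models with the smallest AIC values; counting how often each factor appears among the retained models would give the proportions mentioned in the text.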
LFMM is a hierarchical Bayesian mixed model that treats background
population structure, arising from population history and isolation by
distance, as a random effect modeled through K latent factors (Frichot
et al., 2013; Wang et al., 2017). In LFMM, the association between the
genetic data matrix and each landscape factor, treated as a fixed
effect, was tested based on a z-score. The number of latent factors K
was set to 2 (according to the Structure results). LFMM was run five
times with 10,000 iterations of the Gibbs sampling algorithm and a
burn-in period of 5,000 cycles for each K value. Z-scores from the five
independent replicate runs were combined using the Fisher–Stouffer
method, and the P values were adjusted using the genomic inflation
factor (λ). P values were further adjusted with an FDR correction of 1%
using the R ‘qvalue’ package to obtain Q values (Li et al., 2019).
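The post-processing of LFMM z-scores can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the pipeline used in the study: Stouffer's rule is taken as sum(z)/sqrt(k), λ is estimated as the median of the squared combined z-scores divided by the median of a chi-square distribution with one degree of freedom, and the Benjamini-Hochberg procedure is swapped in as a simple stand-in for the R ‘qvalue’ package (which uses Storey's method and can give different values). All function names and example numbers are hypothetical.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def stouffer(z_runs):
    """Combine z-scores across runs; z_runs[r][j] is SNP j in run r."""
    k = len(z_runs)
    return [sum(col) / math.sqrt(k) for col in zip(*z_runs)]


def calibrated_p_values(z):
    """Rescale z^2 by the genomic inflation factor and return (lambda, p).

    Assumes the median squared z-score is non-zero.
    """
    lam = median([zi * zi for zi in z]) / CHI2_1DF_MEDIAN
    # Survival function of chi-square (1 df) at x equals erfc(sqrt(x / 2)).
    p = [math.erfc(math.sqrt(zi * zi / (2.0 * lam))) for zi in z]
    return lam, p


def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted P values (stand-in for qvalue)."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical z-scores for 5 SNPs from 4 replicate runs.
runs = [[0.25, -0.5, 1.0, 0.05, 1.5]] * 4
z_combined = stouffer(runs)
lam, p_values = calibrated_p_values(z_combined)
q_values = benjamini_hochberg(p_values)
significant = [q < 0.01 for q in q_values]  # 1% FDR threshold
```

After calibration, the SNP with the median squared z-score receives a P value of about 0.5, which is the intended effect of dividing by λ.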
3. Results
3.1 Genetic characteristics of the two species
Among all 17 populations, the HLT population had the highest genetic
diversity values, with Ar, HO and HS of 1.262, 0.138 and 0.104,
respectively (Table 2). The SLD population had the lowest values of Ar
and HS, at 1.198 and 0.082, respectively. However, the WML population
showed the lowest HO value, 0.116.