Haplotype network reconstruction
We assigned cpDNA SNPs to different haplotypes using the pegas R
package v.1.1 . We constructed the haplotype network using the maximum
parsimonious tree method available in PopART v.1.7 , with a 95%
statistical parsimony criterion. To test for phylogeographic signal, we
calculated the N ST, and theG ST , using Permut v.1.2.1 applying 1,000
permutation tests .
Inference of population size change in M. fragrans
Past demographic events in M. fragrans populations from the
Moluccas were inferred using Approximate Bayesian Computation (ABC)
analyses. ABC approaches estimate posterior probability distributions
substituting the calculation of likelihood by comparing observed data
with simulated data . Using the nSSR dataset, we characterized the
demographic evolution through time for each population using the program
DIYABCskylineplot . Coalescent simulations and the computation of prior
parameters and summary statistics were done using command line in DIYABC
v.2.0 . We set the log-uniform priors for scaled effective population
size (θ ) ranging from 1 to 1,000,000 for each population. Time
parameters were randomly drawn at the interval [1;2000], and uniform
prior in the interval [0;1] for the generalized stepwise mutation
model. We performed 100,000 simulations for each population.
To estimate the robustness of past demographic event inferences, we
further explored the likelihood of demographic changes through time
using an importance sampling algorithm under a maximum likelihood
framework as implemented in MIGRAINE . First, we set the parameter of
multistep mutation proportion (p GSM) from the range [0;1] on
each population. We used the OnePopVarSize model since this model
consider a single past change in population size. We determined high
values of parameter points and trees to ensure smooth demographic
projection with 20,000 trees, 500 points, and a total of 16 iterations.
All demographic settings inferred by MIGRAINE were using the GSM model
with a finite number of allelic states (K = 50).
Reconstructing population history of M. fragrans
We inferred the ancestry and divergence time of the genetic clusters
defined by STRUCTURE (see Results section), using DIYABC Random Forest
v.1.1.1 . Populations assigned to a given genetic cluster according to
STRUCTURE results were pooled as a single genetic group to reduce
demographic model complexity. For these analyses, we only kept
individuals with a cluster assignment probability higher than 0.85.
We first performed the analysis on each dataset (nSSRs and cpDNA)
separately to understand the behavior of confidence interval on the
posterior distribution of demographic parameters. We then combined the
two datasets and analyzed them together to assess the robustness of the
analyses.
Two scenarios of species’ population divergence were tested with
historical demographic parameters drawn from the prior distribution
(Supplementary Table S1), with no admixture event included in the
analysis (Supplementary Figure S1). We also linked this demographic
prior distribution according to historical data of M. fragransdistribution in the Moluccas as stated by Hanna and Lape to infer the
ancestry and divergence time of the genetic groups.
For each scenario, 100,000 coalescent simulations were performed. We
checked for the congruence of simulation results under the different
scenario with the observed data using (i) Principal Component Analysis
(PCA) on the summary statistics from simulated data sets, and (ii)
projecting the observed data on the first two PC axes. We used ABC with
random forest (RF) to perform scenario choice and parameter estimation,
in DIYABCRF v.1.1.1 module. For scenario choice, we used the maximum
number of simulated data sets (100,000 each scenario) and the best
scenario were chosen according to a classification vote. For parameter
estimation, RFs of 1000 trees were used to obtain median point estimates
for each demographic parameter in the model, as well as a 95% posterior
credible interval.