Population demography inference
To identify divergence patterns among species/ecotypes and their timing, a coalescent-based maximum likelihood method was employed, estimating parameters of the population demographic model with fastsimcoal2 version 2705 (Excoffier et al., 2021). Divergence models were sequentially applied for three to six ecotypes in the Chichijima and Hahajima Islands (Fig. S2a–e) and from three to five ecotypes in the Chichijima, Hahajima, and Mukojima Islands (Fig. S2f and g). Comparative divergence patterns for a model involving more than four-ecotype divergences were designed based on the results from lower ecotype divergence modeling, admixture, phylogenetic tree, and neighbor-net network analyses. As the ecotype SH on Hahajima Island originated from hybridization between ecotypes SG and ST (Setsuko et al. 2023; also refer to the Results section), we did not include it in the present analysis to reduce complexity. Migration between species/ecotypes was only considered within the same island groups, as each island group was isolated by the sea, even during the Last Glacial Maximum (Setsuko et al., 2017). Only recent migration was assumed based on preliminary analyses; although ancient migration was also considered, such models exhibited much lower log-likelihood values (data not shown). Details of each model are presented in Figs. S3–9.
The likelihood of each model was maximized from 50 random starting values, 40 expectation-conditional-maximization (ECM) optimization cycles, and 100,000 coalescent simulations. The mutation rate was set to 1.74 × 10−8, estimated in a woody species,Populus tremula (Gossmann et al., 2012). We considered the model with the lowest Akaike’s information criterion value as the best model. The goodness of fitness of the best model was checked by visually comparing observed and simulated 2D-mSFSs. The confidence interval of the best model was calculated via parametric bootstrapping. We simulated the model using fastsimcal2 with maximum likelihood estimate parameter values 100 times and obtained its 2D-mSFSs. Using the simulated 2D-mSFSs as input data, parameters of the best model were recalculated with the observed parameter values as a stating value, 15 ECM cycles, and 100,000 coalescent simulations. Finally, the 95% confidence interval (CI) was calculated from the obtained parameter values. Considering the ecological traits of our study species, five years per generation were used to convert an event time from generations ago to years ago.