Genetic divergence and diversity estimation
To estimate absolute genetic divergence between populations, we computed pairwise DXY following the formula of Nei and Li (1979). When calculating DXY, the two alleles at each SNP were treated as two haplotypes and the corresponding allele frequencies as haplotype frequencies. Pairwise DXY values were summed over all SNPs and the sum was normalized by the effective sequence length. For each pair of populations, the effective sequence length was defined as the number of sites with no missing data in either population. The resulting DXY matrix was used for multidimensional scaling with the “cmdscale” function in R (Figure 2) and for constructing a neighbor-joining tree in MEGA7 (Kumar, Stecher, & Tamura, 2016). We also performed Principal Component Analysis (PCA) on the SNP frequency matrix (summarizing the frequency of each SNP in each population) using the “prcomp” function in R (Venables & Ripley, 2002) to examine whether the SNP frequencies differed among populations. Finally, to assess the extent to which genetic polymorphisms were fixed, FST statistics were computed following a method for many SNPs (Nei & Miller, 1990; Willing, Dreyer, & van Oosterhout, 2012).
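The DXY calculation described above can be illustrated with a minimal sketch. It assumes biallelic SNPs, so that the per-SNP contribution reduces to p(1 - q) + q(1 - p) for allele frequencies p and q in the two populations. The function name, the use of None for missing data, and the n_invariant_sites argument are illustrative assumptions, not part of the original pipeline.

```python
def pairwise_dxy(freqs_x, freqs_y, n_invariant_sites=0):
    """Per-site D_XY between two populations from biallelic SNP frequencies.

    freqs_x, freqs_y: frequency of one allele at each SNP in populations X
        and Y; None marks a SNP with missing data in that population.
    n_invariant_sites: hypothetical count of non-variable sites covered in
        both populations (they add to the length but not to the sum).
    """
    total = 0.0
    n_snps = 0
    for p, q in zip(freqs_x, freqs_y):
        if p is None or q is None:
            continue  # site excluded from the effective sequence length
        # Probability that a haplotype drawn from X differs from one drawn from Y
        total += p * (1.0 - q) + q * (1.0 - p)
        n_snps += 1
    effective_length = n_snps + n_invariant_sites
    if effective_length == 0:
        raise ValueError("no sites with data in both populations")
    return total / effective_length


# Two usable SNPs plus eight invariant sites; the third SNP is skipped
example = pairwise_dxy([1.0, 0.5, None], [0.0, 0.5, 0.2], n_invariant_sites=8)
```

In this sketch a fixed difference contributes 1 to the sum and a SNP at frequency 0.5 in both populations contributes 0.5, so the example evaluates to 1.5 / 10.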
The levels of genetic diversity within populations were measured by π and Watterson’s θ statistics. π summarizes the average number of nucleotide differences between two sequences randomly sampled from a population (Nei, 1987), while Watterson’s θ estimates nucleotide polymorphism based on the number of observed segregating sites (Watterson, 1977). To correct for systematic errors in high-throughput sequencing, we computed θ values following a published algorithm (He et al., 2013).
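For orientation, the textbook forms of the two estimators can be sketched as follows. This is only an illustration of the classical definitions for biallelic sites and a known number of sampled chromosomes; it does not implement the sequencing-error correction of He et al. (2013), and all function and parameter names are hypothetical.

```python
def nucleotide_diversity(allele_freqs, n_chromosomes, seq_length):
    """Per-site pi from biallelic allele frequencies.

    Each SNP contributes its expected heterozygosity 2p(1 - p), scaled by
    n / (n - 1) to correct for the finite sample of chromosomes.
    """
    correction = n_chromosomes / (n_chromosomes - 1)
    heterozygosity = sum(2.0 * p * (1.0 - p) for p in allele_freqs)
    return correction * heterozygosity / seq_length


def watterson_theta(n_segregating, n_chromosomes, seq_length):
    """Per-site Watterson's theta: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return n_segregating / a_n / seq_length


# Four sampled chromosomes, two segregating sites in 100 bp (toy numbers)
pi_hat = nucleotide_diversity([0.5, 0.25], n_chromosomes=4, seq_length=100)
theta_hat = watterson_theta(2, n_chromosomes=4, seq_length=100)
```

Under neutrality and constant population size the two quantities estimate the same parameter, which is why they are commonly reported together.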
Analyses of molecular variance (AMOVA) based on DXY and FST were used to test whether genetic variation was partitioned by subspecies or geographical region. In the test for geographical region, the populations were assigned to three groups, with the Malay Peninsula and Wallacea as the boundaries; these are two major discontinuities revealed in mangrove species (Guo et al., 2018b, 2016; J. Li et al., 2016; Yang et al., 2017). The first group included MC, PN and LS; the second included BB, CA, DW, BS and AK; and the third included all the other populations.
Mantel tests of DXY and FST against geographic distance were performed to test the isolation-by-distance model. Geographic distances between sampling sites were approximated either by spherical (great-circle) distance or by the dispersal pathway along coasts (hereafter, coastline distance). The coastline distance was estimated according to simulations of one-month oceanic dispersal ability, using the methods described by Van der Stocken, Carroll, Menemenlis, Simard, and Koedam (2019), with a ruler length of approximately 350 km.
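The logic of a Mantel test can be sketched as a simple permutation procedure. The version below is a generic one-sided test using Pearson correlation on the upper triangles of two symmetric distance matrices; the number of permutations, the seed, and all names are assumptions for illustration, and the software actually used for the tests is not specified here.

```python
import random
from math import sqrt


def _upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling populations by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = _upper_triangle(geographic, identity)
    r_obs = _pearson(_upper_triangle(genetic, identity), geo)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of one matrix together
        if _pearson(_upper_triangle(genetic, perm), geo) >= r_obs:
            hits += 1
    # Count the observed arrangement itself as one of the permutations
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy example: four sites on a line at positions 0, 1, 3 and 7
geo_dist = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
gen_dist = [[2 * d for d in row] for row in geo_dist]
r, p = mantel_test(gen_dist, geo_dist)
```

Permuting population labels, rather than individual matrix entries, preserves the dependence among the pairwise distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.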
Geographic barriers delineating the largest genetic discontinuities between pairs of populations were identified using BARRIER 2.2 (Manni, Guérard, & Heyer, 2004). We randomly selected half of the 94 genes and calculated one FST matrix from those 47 genes, then repeated this process 100 times to obtain 100 FST matrices. The robustness of each inferred barrier was assessed using these 100 matrices.
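The gene-resampling step above can be sketched as follows. The sketch assumes that genes were drawn without replacement within each replicate and independently across replicates; the gene identifiers, the seed, and the function name are hypothetical, and the FST computation itself is not shown.

```python
import random


def resample_gene_subsets(gene_ids, n_replicates=100, seed=0):
    """Draw `n_replicates` random halves of the gene set.

    Each replicate is sampled without replacement from the full list, so a
    subset never contains the same gene twice.
    """
    rng = random.Random(seed)
    half = len(gene_ids) // 2
    return [sorted(rng.sample(gene_ids, half)) for _ in range(n_replicates)]


# 94 placeholder gene identifiers give 100 subsets of 47 genes each
subsets = resample_gene_subsets(list(range(94)))
```

Each subset would then be used to compute one pairwise FST matrix, and the resulting 100 matrices supplied to BARRIER to count how often each barrier is recovered.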