Haplotype network variation across the genome
We inferred haplotype networks across the 94 loci we sequenced. Using an
expectation-maximization method to infer among-SNP linkage
disequilibrium, we split these regions into 454 linked segments (Table
S2). Segments with missing data and those shorter than 100 bp were
discarded, leaving 231 segments for haplotype network reconstruction,
with A. alba as the outgroup (Figure 4).
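The filtering rule described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the `Segment` record, its field names, and the toy values are hypothetical, and only the two criteria (no missing data, length of at least 100 bp) come from the text.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 100  # threshold stated in the text


@dataclass
class Segment:
    """Hypothetical record for one linked segment (field names are assumptions)."""
    locus: str
    start: int               # 0-based, inclusive
    end: int                 # exclusive
    has_missing_data: bool

    @property
    def length(self) -> int:
        return self.end - self.start


def keep_segment(seg: Segment) -> bool:
    """Retain a segment only if it is complete and at least 100 bp long."""
    return (not seg.has_missing_data) and seg.length >= MIN_LENGTH_BP


def filter_segments(segments: list[Segment]) -> list[Segment]:
    return [seg for seg in segments if keep_segment(seg)]


# Toy input (invented values, for illustration only)
toy = [
    Segment("locus_A", 0, 250, False),    # kept
    Segment("locus_A", 250, 320, False),  # dropped: shorter than 100 bp
    Segment("locus_B", 0, 400, True),     # dropped: missing data
]
retained = filter_segments(toy)
print(len(retained), "of", len(toy), "toy segments retained")
```

Applied to the real data, a filter of this kind would reduce the 454 linked segments to the 231 used for network reconstruction.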
Among these segments, 134 did not distinguish the subspecies: only one
or a few haplotypes were identified, and all haplotypes were closely
related to each other and shared among the three subspecies. Another 66
segments reliably distinguish australasica from the other two
subspecies. Among these 66 segments, the BB population shares haplotypes
with australasica instead of marina at seven loci. The third type of
segment, 14 in total, delimits marina from the other two subspecies. Five
segments distinguish eucalyptifolia, but BB shares haplotypes
with eucalyptifolia in all cases. Most importantly, in three
segments, haplotypes split into three clusters and each subspecies
contains haplotypes from a single cluster. These three segments provide
the best subspecies delineation. At eight other segments, each
subspecies also contains a cluster of haplotypes, except that BB shares
haplotypes with eucalyptifolia. Finally, one segment separates marina
and australasica, but eucalyptifolia contains haplotypes from both
clusters.
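As a consistency check, the seven segment categories described in this paragraph can be tallied against the 231 retained segments. The counts below are taken directly from the text; the category labels are shorthand introduced here for illustration.

```python
# Segment counts per haplotype-network pattern, as reported in the text.
# The labels are shorthand chosen for this sketch.
segment_counts = {
    "haplotypes shared among all three subspecies": 134,
    "australasica distinct from the other two": 66,
    "marina distinct from the other two": 14,
    "eucalyptifolia distinct (BB groups with eucalyptifolia)": 5,
    "three subspecies-specific clusters": 3,
    "three clusters, except BB groups with eucalyptifolia": 8,
    "marina vs. australasica, eucalyptifolia in both clusters": 1,
}

total = sum(segment_counts.values())
assert total == 231, "categories should account for every retained segment"

# Print each category with its share of the retained segments.
for pattern, n in segment_counts.items():
    print(f"{n:4d}  {100 * n / total:5.1f}%  {pattern}")
print(f"{total:4d}  total")
```

The categories are mutually exclusive and together account for all 231 segments.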
The three segments clearly delineating subspecies are from three genomic
loci: Am0259, Amc232, and Amc302. We estimate that roughly 3% of
the A. marina genome is highly differentiated among subspecies
(three out of the 94 genomic loci surveyed). Am0259 partially covers a
protein-coding gene, the ortholog of which in Arabidopsis thaliana
is annotated as “shaggy-related protein kinase.” Amc232 and Amc302 are
noncoding. The eight segments that follow subspecies delineation with
the exception of the BB population are from seven genomic loci.
Similarly, we estimate that about 7% (7 out of 94) of the A.
marina genome is highly diverged among subspecies, but that this
divergence is eliminated in populations where subspecies coexist.
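The two genome-wide percentages are simple proportions of the surveyed loci. The sketch below reproduces the arithmetic; it assumes, as the estimates themselves do, that the 94 sequenced loci can be treated as a sample of the genome. The function name is introduced here for illustration.

```python
N_LOCI_SURVEYED = 94  # genomic loci sequenced in this study


def percent_of_loci(n_loci: int, n_surveyed: int = N_LOCI_SURVEYED) -> float:
    """Percentage of surveyed loci showing a given pattern."""
    return 100 * n_loci / n_surveyed


# Loci whose segments cleanly separate all three subspecies
fully_delineating = percent_of_loci(3)
# Loci that follow subspecies delineation except in the BB population
delineating_except_bb = percent_of_loci(7)

print(f"{fully_delineating:.1f}%")      # 3.2%, reported as about 3%
print(f"{delineating_except_bb:.1f}%")  # 7.4%, reported as about 7%
```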