Read mapping and variant calling
The quality of the short reads produced on the Illumina sequencing platform was first assessed with FastQC (Andrews, 2010). Reads were then mapped to reference sequences using MAQ 0.7.1 (Li, Ruan, & Durbin, 2008). The reference sequences were obtained by Sanger sequencing of DNA amplicons of all 94 loci from one A. marina individual; the same was done for one A. alba individual, which served as the outgroup. For mapping and pileup, the mutation rate between reference and read was set to 0.002, the threshold for the sum of mismatched base qualities to 200, and the minimum read mapping quality to 30. To exclude false-positive mismatches, we calculated the mismatch rate at each position along the read and for each base quality score. We then trimmed the first and last 10 bases of each read and discarded bases with a quality score below 30.
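The per-position mismatch profile and the read trimming described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, reads are assumed to be already aligned to the reference without gaps, base qualities are assumed to be decoded Phred integers, and masking a low-quality base as "N" is only one possible way of filtering it.

```python
from collections import Counter


def mismatch_rate_by_position(alignments):
    """Return the mismatch rate at each read position.

    `alignments` is an iterable of (read, reference) string pairs of equal
    length, i.e. ungapped alignments (an assumption of this sketch).
    """
    mismatches = Counter()
    totals = Counter()
    for read, ref in alignments:
        for pos, (read_base, ref_base) in enumerate(zip(read, ref)):
            totals[pos] += 1
            if read_base != ref_base:
                mismatches[pos] += 1
    return [mismatches[pos] / totals[pos] for pos in range(len(totals))]


def trim_and_mask(seq, quals, trim=10, min_q=30):
    """Drop `trim` bases from each end, then mask bases with quality < min_q.

    Masked bases are written as 'N' (an assumption; the text only says that
    such bases were filtered). Returns the masked sequence and its qualities.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) <= 2 * trim:
        return "", []  # nothing is left after trimming both ends
    core_seq = seq[trim:len(seq) - trim]
    core_quals = list(quals[trim:len(quals) - trim])
    masked = "".join(
        base if q >= min_q else "N" for base, q in zip(core_seq, core_quals)
    )
    return masked, core_quals
```

An elevated mismatch rate at the first and last read positions in such a profile is the kind of signal that would motivate trimming the read ends before variant calling.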
We identified variant sites using MAQ 0.7.1 to obtain nucleotide polymorphism data for each population. To avoid bias introduced by sequencing errors, we discarded sites with insufficient coverage (<100 reads) and those with a minor allele frequency below 0.01 in each population (He et al., 2013). This yielded a list of single nucleotide polymorphisms (SNPs), with allele frequencies, for each population. To reduce false SNPs caused by homopolymers or insertions/deletions, putative variants in those regions were masked. The resulting 16 sets of SNPs were used in the analyses below.
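The coverage and minor allele frequency filters can be sketched as follows. This is an illustrative example only, assuming that per-site base counts have already been extracted from the pileup for one population; the function name and the input layout are hypothetical, and the minor allele is taken to be the second most frequent base at the site.

```python
def call_site(counts, min_cov=100, min_maf=0.01):
    """Apply the coverage and minor-allele-frequency filters to one site.

    `counts` maps each observed base to its read count in one population.
    Returns a dict of allele frequencies if the site passes both filters,
    otherwise None.
    """
    depth = sum(counts.values())
    if depth < min_cov:
        return None  # insufficient coverage
    # Rank alleles by count; the second entry is treated as the minor allele.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    minor_count = ranked[1][1] if len(ranked) > 1 else 0
    if minor_count / depth < min_maf:
        return None  # monomorphic, or minor allele too rare to trust
    return {base: n / depth for base, n in ranked if n > 0}
```

For example, a site covered by 90 reads carrying A and 10 carrying G passes both filters with frequencies of 0.9 and 0.1, whereas the same proportions at a depth of 55 reads are rejected for low coverage.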