Read mapping and variant calling
Short reads produced by the HiSeq2000 platform were first quality-checked with FastQC and then mapped to the previously obtained reference sequences using MAQ 0.7.1 (H. Li, Ruan, & Durbin, 2008). For mapping and pileup, the mutation rate between reference and reads was set to 0.002, the threshold on the sum of mismatched base qualities to 200, and the minimum read mapping quality to 30. To exclude false-positive mismatches, we calculated the mismatch rate at each position along the read and for each base quality score. We trimmed the first and last 10 bases of each read and filtered out bases with a quality score below 30, using in-house Perl scripts (available on GitHub: https://github.com/GgamerL/AvicenniaSolexa/tree/SolexaAvicennia).
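The trimming and base-quality filter described above can be illustrated with a minimal Python sketch. It is not the in-house Perl code. The function name trim_and_mask, the Phred+33 quality encoding, and the choice to replace low-quality bases with N (rather than drop them) are all assumptions made for illustration.

```python
def trim_and_mask(seq, qual, trim=10, min_q=30, offset=33):
    """Trim `trim` bases from both read ends and mask low-quality bases.

    Assumptions (not taken from the original pipeline): qualities are
    Phred+33 encoded, and a filtered base is replaced with 'N'.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= 2 * trim:
        # Nothing is left once both ends are removed.
        return "", ""
    seq = seq[trim:len(seq) - trim]
    qual = qual[trim:len(qual) - trim]
    # Keep a base only if its Phred score reaches the threshold.
    masked = "".join(
        base if ord(q) - offset >= min_q else "N"
        for base, q in zip(seq, qual)
    )
    return masked, qual


if __name__ == "__main__":
    read = "A" * 10 + "ACGT" + "T" * 10
    quals = "I" * 10 + "I5I?" + "I" * 10  # 'I' = Q40, '5' = Q20, '?' = Q30
    print(trim_and_mask(read, quals))
```

Older Illumina pipelines encoded qualities as Phred+64; in that case offset=64 would be passed instead.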
Variant sites were also identified using MAQ 0.7.1. To avoid bias introduced by sequencing errors, we discarded sites covered by fewer than 100 reads and those with a minor allele frequency below 0.01 (He et al., 2013). Single nucleotide polymorphisms (SNPs) were used in the subsequent analyses. To reduce false SNPs introduced by homopolymers or insertions/deletions, putative variants in those regions were masked.
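The coverage and allele-frequency thresholds above can be expressed as a short Python sketch. The function name passes_site_filter and the per-site input format (a dictionary of allele read counts) are hypothetical. The minor allele frequency is taken here as the read fraction of the second most common allele, which is an assumption about how the frequency was computed.

```python
def passes_site_filter(counts, min_cov=100, min_maf=0.01):
    """Return True if a site meets the coverage and MAF thresholds.

    `counts` maps each observed allele to its read count, for example
    {"A": 950, "G": 50}. This input format is assumed for illustration.
    """
    depth = sum(counts.values())
    if depth < min_cov:
        # Insufficient site coverage (<100 reads by default).
        return False
    ordered = sorted(counts.values(), reverse=True)
    # Read count of the second most common allele; 0 if the site is monomorphic.
    minor = ordered[1] if len(ordered) > 1 else 0
    return minor / depth >= min_maf


if __name__ == "__main__":
    sites = {
        "site_1": {"A": 950, "G": 50},   # depth 1000, MAF 0.05
        "site_2": {"A": 90, "G": 9},     # depth 99
        "site_3": {"A": 1000, "G": 5},   # depth 1005, MAF below 0.01
    }
    kept = [name for name, c in sites.items() if passes_site_filter(c)]
    print(kept)
```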