Read mapping and variant calling
Short reads produced by the HiSeq2000 platform were first checked for
quality with FastQC and then mapped to the previously obtained reference
sequences using MAQ 0.7.1 (H. Li, Ruan, & Durbin, 2008). In mapping and
pileup, the mutation rate between reference and read was set to 0.002,
the threshold of mismatch base quality sum was 200, and the minimum
mapping quality of reads was 30. To exclude false-positive mismatches,
we calculated the mismatch rate at each position along the read and the
mismatch rate for each base quality score. We trimmed the first and last 10 bases of
each read and filtered out bases with a quality score below 30, using
in-house Perl scripts (available on GitHub:
https://github.com/GgamerL/AvicenniaSolexa/tree/SolexaAvicennia).
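The per-position mismatch count and the read trimming described above can be illustrated with a minimal Python sketch. This is not the in-house Perl code; the function names, the gap-free (read, reference) input format, and the choice to mask low-quality bases as "N" are assumptions made for illustration. Only the thresholds (10 bases, quality 30) come from the text.

```python
from collections import defaultdict

TRIM = 10       # bases removed from each end of a read (value from the text)
MIN_QUAL = 30   # minimum base quality retained (value from the text)


def mismatch_rate_by_position(alignments):
    """Return {read position: mismatch rate} for gap-free (read, reference) pairs.

    Hypothetical helper: assumes each read is already aligned to a reference
    segment of the same length, with no insertions or deletions.
    """
    mismatches = defaultdict(int)
    totals = defaultdict(int)
    for read, ref in alignments:
        for pos, (read_base, ref_base) in enumerate(zip(read, ref)):
            totals[pos] += 1
            if read_base != ref_base:
                mismatches[pos] += 1
    return {pos: mismatches[pos] / totals[pos] for pos in sorted(totals)}


def trim_and_mask(seq, quals, trim=TRIM, min_qual=MIN_QUAL):
    """Trim `trim` bases from both ends, then mask low-quality bases as 'N'.

    Masking with 'N' is an assumption; the original scripts may instead drop
    those bases from downstream counts.
    """
    if len(seq) <= 2 * trim:
        return "", []  # nothing left after trimming
    seq = seq[trim:len(seq) - trim]
    quals = quals[trim:len(quals) - trim]
    masked = "".join(b if q >= min_qual else "N" for b, q in zip(seq, quals))
    return masked, quals
```

Read positions whose mismatch rate stands out in the first function are the ones that motivate end trimming; the second function then applies the trimming and quality thresholds to a single read.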
Variant sites were also identified using MAQ 0.7.1. To avoid bias
introduced by sequencing errors, we discarded sites with insufficient
site coverage (<100 reads) and those with minor allele
frequency less than 0.01 (He et al., 2013). Single nucleotide
polymorphisms (SNPs) were used in the subsequent analyses. To reduce
false SNPs introduced by homopolymers or insertions/deletions, putative
variants in those regions were masked.
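The coverage and minor allele frequency filters can be sketched as follows. This is a hedged illustration, not the procedure implemented in MAQ or in the authors' scripts: the function names and the input (a string of base calls observed at one site) are hypothetical, and the minor allele is assumed to be the second most common base at the site. The thresholds (100 reads, 0.01) are taken from the text.

```python
from collections import Counter

MIN_COVERAGE = 100  # minimum number of reads covering a site (from the text)
MIN_MAF = 0.01      # minimum minor allele frequency (from the text)


def minor_allele_frequency(bases):
    """Frequency of the second most common base at a site (assumed definition)."""
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return 0.0  # monomorphic site
    total = sum(count for _, count in counts)
    return counts[1][1] / total


def keep_site(bases, min_cov=MIN_COVERAGE, min_maf=MIN_MAF):
    """Keep a site only if it passes both the coverage and the MAF filter."""
    if len(bases) < min_cov:
        return False
    return minor_allele_frequency(bases) >= min_maf
```

For example, a site covered by 95 reads carrying A and 5 carrying G passes both filters, whereas a site covered by 199 A reads and a single G read is discarded because its minor allele frequency (0.005) falls below 0.01.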