Sequence analysis followed the pipeline detailed by Meier et al. (2016). Paired-end reads were merged with PEAR 0.9.6 (Zhang et al., 2014). The reads of each PCR product were then matched to their specific template specimen, which was possible because the primer pair combinations were uniquely labelled. A Python script (Srivathsan, unpublished) was used to 1) demultiplex the data, 2) tally the reads for each sample, 3) identify and cluster identical reads into groups, 4) identify the dominant groups of reads and combine them with variants that were otherwise of identical length, and 5) tally the reads in the group showing the highest identity and compare this count with that of the group showing the next highest identity (Meier et al., 2016). Quality control applied three criteria: a read count of more than 50x, a barcode count of more than 10x, and a number of dominant reads at least five times that of the second most dominant reads (Meier et al., 2016). These criteria ensured that the coverage attributed to each barcode was sufficient and did not come from confounding sequences such as contaminant DNA fragments. In addition, quality control rejected dominant sequences that may have arisen from amplification error in the PCR step. Next, the sequences that passed quality control were queried with the Basic Local Alignment Search Tool (BLAST) to find sequences that matched non-Onthophagus taxa at >97%; these were treated as contaminant sequences and eliminated from the analysis. After quality control, the software MEGA7 was used to align the sequences and to confirm that they contained no stop codons. Then, a new Python script (Srivathsan, unpublished) was used to construct clusters at a threshold of 3%, which is widely used to distinguish between species in the literature on insects (Hebert et al., 2003; Srivathsan & Meier, 2012).
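The read-grouping and quality-control steps can be illustrated with a minimal sketch. This is not the unpublished script used in this study; the function name `call_barcode` and the constant names are hypothetical. The way each threshold is counted is also an assumption: "read count" is taken as the total number of reads for a specimen, and "barcode count" as the number of reads in the dominant group. The merging of same-length variants (step 4) is omitted for brevity.

```python
from collections import Counter

# Thresholds follow the criteria stated in the text; how each quantity is
# counted is an assumption made for this illustration.
MIN_TOTAL_READS = 50      # total reads per specimen must exceed this
MIN_BARCODE_READS = 10    # reads in the dominant group must exceed this
MIN_DOMINANCE_RATIO = 5   # dominant group must be >= 5x the runner-up


def call_barcode(reads):
    """Return the dominant read for one specimen if it passes QC, else None."""
    counts = Counter(reads)          # group identical reads and tally them
    ranked = counts.most_common(2)   # dominant group and runner-up
    if not ranked:
        return None                  # no reads were assigned to this specimen

    dominant_seq, dominant_n = ranked[0]
    second_n = ranked[1][1] if len(ranked) > 1 else 0
    total = sum(counts.values())

    if total <= MIN_TOTAL_READS:
        return None                  # insufficient coverage overall
    if dominant_n <= MIN_BARCODE_READS:
        return None                  # dominant group is too small
    if dominant_n < MIN_DOMINANCE_RATIO * second_n:
        return None                  # runner-up too close: possible contaminant
    return dominant_seq
```

Under these assumptions, a specimen with 60 identical reads and 5 reads of a single variant would pass, whereas one with 40 dominant reads and 20 runner-up reads would be rejected on the dominance ratio.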
Figure S4. Haplotype map based on 434 specimens across all sampling sites: CCNR (n=64), Ubin (n=122), Perak (n=5), Kenyir (n=8), Gombak (n=40) and Langkawi (n=195).