PCR amplification specificity
Sequences identified as the most likely partial barcodes by taxonomic filtering were the maximally abundant sequence per specimen in 345 cases for the FC amplicon and in 365 cases for the BR amplicon. The selected FC and BR sequences were both the maximally abundant sequence for the same specimen in only 286 cases. A further 29 selected FC sequences and 50 selected BR sequences each had abundance ranks between two and ten. Two selected FC sequences had abundance ranks of 18 and 30, respectively, and 10 selected BR sequences each had abundance ranks between 11 and 101. In 37 cases where the maximally abundant FC sequence was not selected, the maximally abundant sequences were identified as deriving from insects (25), annelids (8), arachnids (2), algae (1), and amoebae (1). Similarly, in 74 cases where the maximally abundant BR sequence was not selected, the maximally abundant sequences were identified as deriving from insects (44), other hexapods (2), annelids (2), gastropods (1), Homo sapiens (5), eukaryotic micro-organisms (2), and prokaryotic micro-organisms (18, including 10 cases of Wolbachia).
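To make the notion of abundance rank concrete, the following is a minimal Python sketch, not the pipeline used in this study. It assumes that per-specimen read counts are available as a mapping from sequence to count; the function name `abundance_rank` and the example counts are hypothetical. Ties are resolved by competition ranking, which is also an assumption, as the handling of ties is not specified here.

```python
from typing import Dict, Optional


def abundance_rank(read_counts: Dict[str, int], selected: str) -> Optional[int]:
    """Return the 1-based abundance rank of `selected` within one specimen.

    Returns None if the selected sequence is absent from the specimen.
    """
    if selected not in read_counts:
        return None
    selected_count = read_counts[selected]
    # Rank 1 means no other sequence in the specimen has more reads.
    return 1 + sum(1 for count in read_counts.values() if count > selected_count)


# Hypothetical read counts for a single specimen (illustrative only).
specimen_counts = {"ACGA": 950, "ACGT": 120, "TTGA": 30}
print(abundance_rank(specimen_counts, "ACGT"))  # 2
```

A selected sequence with rank 1 corresponds to the maximally abundant case; any higher rank indicates that at least one other sequence out-amplified the presumed target in that specimen.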
To investigate the origins of maximally abundant but non-target sequences (i.e. those with unexpected taxonomic identifications), the allpairs_global function in VSEARCH was used to identify any identical sequences among the maximally abundant sequences per specimen, plus the selected (presumed correct) sequences (if not maximally abundant) from each specimen. Out of 37 specimens with maximally abundant but non-target FC sequences, only three had a sequence identical to another maximally abundant but non-target FC sequence; in all three cases the matching sequences came from the same PCR plate, and in two cases from adjacent PCR wells. Similarly, out of 74 specimens with maximally abundant but non-target BR sequences, ten had a sequence identical to one or more other maximally abundant but non-target BR sequences; in all ten cases the matching sequences came from the same PCR plate, but in only four cases from adjacent PCR wells.
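The comparison described above can be illustrated with a short Python sketch. It is a simplified stand-in rather than the analysis actually performed: it uses exact string matching instead of VSEARCH global pairwise alignment, so sequences that differ only in length or terminal gaps would be treated differently. The data layout (plate ID plus well ID such as "B7"), the function names, and the definition of adjacency (neighbouring wells, including diagonals) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Well = Tuple[str, str]  # (plate ID, well ID such as "B7"); hypothetical layout


def well_coords(well: str) -> Tuple[int, int]:
    """Convert a well ID such as 'B7' to zero-based (row, column)."""
    return ord(well[0].upper()) - ord("A"), int(well[1:]) - 1


def are_adjacent(a: Well, b: Well) -> bool:
    """True if both wells are on the same plate and are neighbours.

    Diagonal neighbours count as adjacent (an assumed definition).
    """
    if a[0] != b[0]:
        return False
    (r1, c1), (r2, c2) = well_coords(a[1]), well_coords(b[1])
    return (r1, c1) != (r2, c2) and abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def shared_sequences(top_seqs: Dict[Well, str]) -> Dict[str, List[Well]]:
    """Group wells by exact sequence; keep sequences seen in more than one well."""
    groups: Dict[str, List[Well]] = defaultdict(list)
    for well, seq in top_seqs.items():
        groups[seq.upper()].append(well)
    return {seq: wells for seq, wells in groups.items() if len(wells) > 1}


# Hypothetical non-target sequences keyed by (plate, well); illustrative only.
example = {
    ("P1", "A1"): "ACGT",
    ("P1", "A2"): "ACGT",
    ("P1", "H12"): "TTTT",
    ("P2", "A1"): "GGGG",
}
for seq, wells in shared_sequences(example).items():
    same_plate = len({plate for plate, _ in wells}) == 1
    any_adjacent = any(
        are_adjacent(w1, w2) for i, w1 in enumerate(wells) for w2 in wells[i + 1:]
    )
    print(seq, wells, same_plate, any_adjacent)
```

Summarising each shared sequence by whether its wells lie on one plate and whether any pair is adjacent mirrors the two questions asked of the FC and BR non-target sequences above.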