Andrew Dopheide and 6 more

Despite recent advances in high-throughput DNA sequencing technologies, a lack of locally relevant DNA reference databases may limit the potential for DNA-based monitoring of biodiversity for conservation and biosecurity applications. Museums and national collections represent a compelling source of authoritatively identified genetic material for DNA database development, yet obtaining DNA barcodes from long-stored specimens may be difficult due to sample degradation. We demonstrate a sensitive and efficient laboratory and bioinformatic process for generating DNA barcodes from hundreds of invertebrate specimens simultaneously via the Illumina MiSeq system. Using this process, we recovered full-length (334) or partial (105) COI barcodes from 439 of 450 (98%) invertebrate specimens held in national collections. These included full-length barcodes from 146 specimens that yielded little DNA, showed no visible PCR bands, and produced as few as one sequence per specimen, demonstrating the high sensitivity of the process. In many cases, the most abundant sequence recovered from a specimen was not the correct barcode, necessitating the development of a taxonomy-informed process for identifying correct sequences among the sequencing output. The recovery of only partial barcodes for some taxa indicates a need to refine certain PCR primers. Nonetheless, our approach represents a highly sensitive, accurate, and efficient method for targeted reference database generation, providing a foundation for DNA-based assessments and monitoring of biodiversity.
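The idea of a taxonomy-informed selection step can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Candidate` record, the `pick_barcode` function, the order-level match, and the toy sequences are all hypothetical, and the sketch assumes each candidate sequence has already been given a taxonomic assignment by some separate reference search. It only shows why choosing by read abundance alone can fail, and how the specimen's expected identity can be used as a filter first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One sequence recovered for a specimen (hypothetical record layout)."""
    sequence: str
    read_count: int
    assigned_order: str  # assumed to come from a prior reference search


def pick_barcode(candidates: List[Candidate],
                 expected_order: str) -> Optional[Candidate]:
    """Return the most abundant candidate whose assigned order matches the
    specimen's expected order, or None if no candidate matches."""
    matching = [
        c for c in candidates
        if c.assigned_order.lower() == expected_order.lower()
    ]
    if not matching:
        return None
    return max(matching, key=lambda c: c.read_count)


# Toy data: the most abundant sequence is a contaminant, while the
# sequence matching the specimen's identification has a single read.
candidates = [
    Candidate("ACGTAC", 950, "Primates"),
    Candidate("TTGACA", 1, "Coleoptera"),
]
chosen = pick_barcode(candidates, "Coleoptera")
```

In this toy case the single-read Coleoptera sequence is selected over the far more abundant non-matching one. A specimen with no matching candidate returns `None` rather than a wrong barcode.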