Experimental evaluation of species and genetic variability based on DNA metabarcoding from the aquatic environment: extra OTUs formed by NUMTs may reduce the diversity of ASVs
Data on intraspecific genetic variation are an important link in assessing the resistance of organisms to changing environmental conditions and anthropogenic pressures, and are therefore needed for monitoring and conserving wild populations. Metabarcoding of DNA from the aquatic environment enables a gradual transition to non-invasive methods of biodiversity research, including at the within-species level. However, DNA degradation under UV light in the aquatic environment restricts the choice of markers to short standardized regions, and the consequences of the information loss incurred when shifting from barcode to metabarcode are not entirely clear. Efforts to validate and calibrate the approach at the intraspecific level under experimental conditions are limited to molecular genetic markers designed for target species. In this study, we aimed to address these challenges: to assess intraspecific variation in different taxa based on COI barcodes available from GenBank and reduced to the Leray region (~313 bp), and to evaluate experimentally whether Operational Taxonomic Units (OTUs) and Amplicon Sequence Variants (ASVs) can be identified in marine eDNA from abundant species of the Zostera sp. community in the northern Sea of Japan: Hexagrammos octogrammus, Pholidapus dybowskii (Teleostei: Perciformes), and Pandalus latirostris (Arthropoda: Decapoda). Specimens of these three species were collected at two distant locations in the Great Bay of the Sea of Japan and placed in separate 150-liter aquaria to produce both individual and mock-community eDNA samples. All individuals were then euthanized and genotyped individually for the 650 bp and 313 bp COI gene regions. The COI Leray region was amplified from the eDNA of the mock communities and of the individual specimens. The resulting amplicons were sequenced on an Illumina platform with 250 bp paired-end reads and processed with the Begum pipeline.
In addition to OTUs assigned against both global and local references, we tried to retrieve individual haplotypes from the obtained reads. We found that the experimental eDNA samples, when searched with BLAST against the local reference, yielded additional OTUs, which we consider to be NUMTs. Surprisingly, the presence of NUMTs in the eDNA samples reduced the detection of ASVs, which may be related both to the low sequencing coverage in the experiment and, probably, to competition from pseudogenes for primer binding sites during amplification. A PCR-free metagenomic approach, although less accessible, might resolve these difficulties. In addition, we collected and analyzed natural water samples from one of the sampling locations of the Zostera sp. community at low sequencing coverage, but failed to retrieve any reliable information on the OTUs and ASVs of the taxa used in the mock communities, which may indicate a much higher biomass of non-target organisms in the studied community.
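The idea of retrieving individual haplotypes from reads can be illustrated with a minimal sketch. It is not the Begum pipeline and not the procedure used in this study. It only assumes that reads are already merged, trimmed to the same region and oriented, that exact sequence variants seen at least `min_count` times are treated as candidate ASVs, and that these are compared with haplotypes known from individual genotyping. The function names, the threshold and the toy sequences are all hypothetical.

```python
from collections import Counter

def candidate_asvs(reads, min_count=2):
    """Dereplicate reads and keep exact variants seen at least min_count times.

    min_count is an illustrative threshold, not a value from the study.
    """
    counts = Counter(reads)
    return {seq: c for seq, c in counts.items() if c >= min_count}

def compare_with_haplotypes(asvs, haplotypes):
    """Split candidate ASVs into known haplotypes and unmatched variants.

    haplotypes maps a haplotype name to its sequence (e.g. from Sanger
    genotyping of the individuals placed in the aquarium).
    """
    reference_seqs = set(haplotypes.values())
    # Read count for every known haplotype (0 means it was not recovered).
    recovered = {name: asvs.get(seq, 0) for name, seq in haplotypes.items()}
    # Variants that match no known haplotype: possible NUMTs, errors,
    # or non-target sequences; they need separate inspection.
    unmatched = {seq: c for seq, c in asvs.items() if seq not in reference_seqs}
    return recovered, unmatched

# Toy data (hypothetical, far shorter than a real 313 bp amplicon).
reads = ["ACGT"] * 5 + ["ACGA"] * 3 + ["TCGT"] * 4 + ["ACGG"]
haplotypes = {"hap1": "ACGT", "hap2": "ACGA", "hap3": "ACCC"}

asvs = candidate_asvs(reads)
recovered, unmatched = compare_with_haplotypes(asvs, haplotypes)
print(recovered)
print(unmatched)
```

In this toy case one known haplotype is not recovered and one abundant variant matches no genotyped individual, which mirrors in simplified form the situation in which extra sequences coexist with missing ASVs.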
A total of 90 sequence data sets for several common groups of multicellular organisms (Mollusca, Echinodermata, Crustacea, Polychaeta and Actinopterygii) were collected by searching the NCBI PopSet database for the mitochondrial COI gene. For each data set, a separate set of sequences restricted to the Leray region was generated. Haplotype variability and the number of population clusters were then calculated for each data set, both for the original-length region and for the Leray region. The results show that population diversity decreases by one cluster on average when switching from barcode to metabarcode. In addition, we found that the length of the Leray fragment can vary in echinoderms.
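The comparison between the full-length barcode and the Leray region can be sketched as follows. The text does not state which variability statistic was used, so this example assumes Nei's haplotype diversity, h = n/(n-1) * (1 - sum(p_i^2)), computed on aligned sequences before and after trimming. The coordinates and the toy sequences are hypothetical and stand in for a real alignment and the real position of the Leray fragment.

```python
from collections import Counter

def haplotype_diversity(seqs):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

def trim(seqs, start, end):
    """Cut every aligned sequence to the half-open interval [start, end)."""
    return [s[start:end] for s in seqs]

# Toy aligned "barcodes" (hypothetical); the last 8 columns play the
# role of the shorter metabarcode region.
barcode = [
    "AAAACCCCGGGG",
    "AAAACCCCGGGT",
    "AAATCCCCGGGG",
    "AAAACCCCGGGG",
]
metabarcode = trim(barcode, 4, 12)

h_full = haplotype_diversity(barcode)       # three haplotypes remain distinct
h_short = haplotype_diversity(metabarcode)  # two haplotypes collapse into one
print(round(h_full, 3), round(h_short, 3))
```

Here the substitution that distinguishes the third sequence lies outside the trimmed window, so two haplotypes merge and the diversity estimate drops. This is the kind of information loss the barcode-to-metabarcode comparison is meant to quantify.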