3.2 On the basis of the local reference
Taxonomic assignment against the local reference, which consisted of COI haplotypes of genotyped individuals from the mock communities, revealed 17 OTUs comprising 209499 reads (13966.6±5802.6) (Table 3). Of these, 10 OTUs corresponded to H. octogrammus, 6 to P. latirostris, and one to P. dybowskii. Most reads (145653, or 70%) belonged to OTUs assigned to H. octogrammus. Among the original reference haplotypes, most reads (66.7%) likewise came from the 3 haplotypes of H. octogrammus. These form a common cluster with OTUs 4, 5, and 6 (Figure 4), with intraspecific variability of no more than 0.013. The other haplotypes of this species form 2 separate phylogroups with divergence of 0.071 to 0.106. In the most divergent phylogroup, 3 OTUs had a deletion of 1 nucleotide, as well as 1 to 3 amino acid substitutions relative to the reference haplotypes. Reads of P. dybowskii accounted for 26% of all reads and showed similarity only to haplotype PD20-1 hap1, which completely matched OTU 22. P. latirostris had the fewest reads (939, or 0.4%), but its homologous haplotypes were the most diverse, forming four phylogroups and carrying 1 to 3 amino acid substitutions relative to the reference sequences. Their divergence from the cluster containing the original haplotypes ranged from 0.023 to 0.126. The samples from the mock communities had 1308 (Vostok Bay) and 12703 (Vityaz Bay) reads.
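The read shares quoted above can be rechecked with simple arithmetic. The following is a minimal sketch that uses only the read counts given in the text; the function name `read_share` and the dictionary layout are illustrative assumptions, not part of the original analysis pipeline.

```python
# Read counts as reported in the text (Table 3 is the authoritative source).
TOTAL_READS = 209499
reads_by_species = {
    "H. octogrammus": 145653,
    "P. latirostris": 939,
}


def read_share(count: int, total: int) -> float:
    """Return the percentage of `total` represented by `count`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Percentage of all assigned reads per species.
shares = {sp: read_share(n, TOTAL_READS) for sp, n in reads_by_species.items()}

for species, pct in shares.items():
    print(f"{species}: {pct:.1f}%")
```

Under these counts the shares come out at roughly 70% for H. octogrammus and 0.4% for P. latirostris, in line with the values reported in the paragraph.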
The presence of additional, deeply divergent OTUs in H. octogrammus and P. latirostris requires proof of their homology to these taxa. This cannot be done for the shrimp, because no complete reference database exists for this genus. For H. octogrammus, by contrast, a complete nucleotide sequence reference database is available, which allows us to determine whether the additional OTUs belong to any of the known greenling species (family Hexagrammidae). Comparison of the identified OTUs with this database showed that the OTU nearest to the original cluster is identified as H. octogrammus but occupies a basal position relative to it (Figure S2). The remaining 4 OTUs form a separate cluster occupying an intermediate position between the genera Hexagrammos and Pleurogrammus.
Condensation of the OTUs with the lulu program left 6 OTUs (Table S3). P. dybowskii and P. latirostris each retained one haplotype as the central OTU (Figure 4, Table S3). H. octogrammus, which has rather high intraspecific variability, retained 3 OTUs in the cluster with the reference haplotype, as well as one haplotype of the divergent cluster carrying a deletion of 2 nucleotides.
With the ASV detection approach implemented here, for two of the three species the success in detecting the exact genetic variants was inversely related to the success in detecting additional OTUs. Only for P. dybowskii were all (two) genotyped haplotypes detected with high coverage, both in individual samples and in artificial communities (Table 4). Minor numbers of their reads were scattered across the samples, but showed absolutely no reciprocal cross-contamination. The original haplotypes of H. octogrammus were almost never found, even in artificial communities. For P. latirostris, we found 106 reads of haplotype PL_hap1, linked to the Vityaz Bay locality, and 528 reads of haplotype PL_hap2, which originates from the Vostok Bay community. Nevertheless, haplotype 2 proved to be much better represented in the mock community of Vityaz Bay. In the sample from the natural environment, only minor numbers of reads were present for the ASVs of P. dybowskii and P. latirostris.
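Detecting "exact genetic variants" amounts to asking which ASVs are identical to a genotyped reference haplotype and how many reads they carry in each sample. The sketch below illustrates that idea under stated assumptions: the function `count_exact_matches`, the data layout, and the toy sequences and sample names are all hypothetical, and it assumes ASVs and reference haplotypes were trimmed to the same COI fragment. It is not the pipeline used in this study.

```python
from collections import defaultdict


def normalize(seq: str) -> str:
    """Uppercase a sequence and drop whitespace and alignment gaps."""
    return seq.strip().upper().replace("-", "")


def count_exact_matches(asv_table, asv_seqs, ref_haplotypes):
    """Sum reads of ASVs that are identical to a reference haplotype.

    asv_table:      {sample: {asv_id: read_count}}
    asv_seqs:       {asv_id: sequence}
    ref_haplotypes: {haplotype_name: sequence}
    Returns         {haplotype_name: {sample: read_count}}
    """
    # Index reference haplotypes by their normalized sequence.
    seq_to_hap = {normalize(s): name for name, s in ref_haplotypes.items()}
    counts = {name: defaultdict(int) for name in ref_haplotypes}
    for sample, asvs in asv_table.items():
        for asv_id, reads in asvs.items():
            hap = seq_to_hap.get(normalize(asv_seqs[asv_id]))
            if hap is not None:  # only exact, full-length identity counts
                counts[hap][sample] += reads
    return {hap: dict(per_sample) for hap, per_sample in counts.items()}


# Toy data (hypothetical sequences and sample names, for illustration only).
ref = {"hapA": "ACGTAC", "hapB": "ACGTTC"}
asv_seqs = {"asv1": "acgtac", "asv2": "ACGTTC", "asv3": "ACGTAA"}
table = {
    "mock_1": {"asv1": 10, "asv3": 5},
    "mock_2": {"asv1": 2, "asv2": 7},
}
result = count_exact_matches(table, asv_seqs, ref)
print(result)
```

With a per-sample table of this kind, a haplotype that turns up in a sample other than the one it was genotyped from is immediately visible, which is the pattern described above for the second P. latirostris haplotype.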