3.5 Gene prediction and gene annotation
The results of all three methods for protein-coding gene prediction were integrated into a consensus gene set, yielding 38,841 non-redundant protein-coding genes with a total length of 0.54 Gb (Table 2, Supporting Information Table S5). Of these, 32,591 (83.91%) were annotated in at least one functional database (Table 3); the annotations for each database are listed in Supporting Information Table S6. The Asian Clam gene set comprised 260,971 exons, and the average gene length was ~13.97 kb. In addition, the Asian Clam genome contained 3,048 pseudogenes, 45 microRNAs, 420 rRNAs, and 3,707 tRNAs (Table 2, Supporting Information Table S7).
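The summary statistics reported in this section can be cross-checked against one another with simple arithmetic. The following is a minimal sketch, not part of the annotation pipeline. The variable names are illustrative, the conversion assumes 1 Gb = 10^6 kb, and the exons-per-gene figure is a derived quantity that is not reported in the text.

```python
# Values taken from the text of this section.
n_genes = 38_841          # non-redundant protein-coding genes
n_annotated = 32_591      # genes annotated in at least one functional database
n_exons = 260_971         # exons in the gene set
mean_gene_len_kb = 13.97  # approximate average gene length (kb)

# Share of genes with at least one functional annotation (reported: 83.91%).
annotated_pct = 100 * n_annotated / n_genes

# Total gene length implied by count x mean length (reported: 0.54 Gb).
# Assumption: 1 Gb = 1e6 kb.
total_len_gb = n_genes * mean_gene_len_kb / 1e6

# Derived, not reported in the text: average number of exons per gene.
exons_per_gene = n_exons / n_genes

print(f"annotated: {annotated_pct:.2f}%")
print(f"total gene length: {total_len_gb:.2f} Gb")
print(f"exons per gene: {exons_per_gene:.2f}")
```

Both reported values (83.91% and 0.54 Gb) are reproduced to two decimal places, so the gene count, the annotated count, and the average gene length are mutually consistent.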