3.1 Statistics of sequencing data
More than 252.77 Gb of clean Illumina data were generated for the genome survey, corresponding to a sequencing depth of 154.13× for the Asian clam genome (Table 1). Two single-molecule real-time (SMRT) cells were sequenced, yielding approximately 15.03 million PacBio reads (~293.72 Gb) (Table 1). The longest PacBio subread was 286.39 kb; the N50 and mean subread lengths were 31.18 kb and 19.54 kb, respectively. Most valid subreads were between 500 bp and 40,000 bp in length (Supporting Information Figure S1). After filtering out low-quality reads from the Hi-C data, we obtained approximately 780.87 million clean reads (~233.26 Gb) from two libraries, which were used for chromosome construction (Table 1). Additionally, approximately 8 Gb of clean transcriptome sequencing data were generated for subsequent gene prediction (Table 1).
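
The figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes that depth equals total bases divided by genome size and that 1 Gb is 10^9 bases. The genome size it derives (about 1.64 Gb) is only implied by the Illumina yield and depth and is not a value reported in this section; the PacBio depth computed from it is likewise an illustration. The function names are hypothetical.

```python
# Consistency check on the sequencing statistics quoted in this section.
# Assumptions (not from the source): depth = total bases / genome size,
# and 1 Gb = 1e9 bases.

GB = 1e9  # bases per Gb (decimal convention, assumed)


def implied_genome_size_gb(total_gb: float, depth: float) -> float:
    """Genome size (Gb) implied by a data yield and its reported depth."""
    return total_gb / depth


def mean_read_length_bp(total_gb: float, n_reads: float) -> float:
    """Mean read length (bp) implied by total yield and read count."""
    return total_gb * GB / n_reads


# Values reported in the text (Table 1)
illumina_gb, illumina_depth = 252.77, 154.13
pacbio_gb, pacbio_reads = 293.72, 15.03e6
hic_gb, hic_reads = 233.26, 780.87e6

# Derived quantities (illustrative only)
genome_gb = implied_genome_size_gb(illumina_gb, illumina_depth)
pacbio_mean_kb = mean_read_length_bp(pacbio_gb, pacbio_reads) / 1e3
pacbio_depth = pacbio_gb / genome_gb
hic_mean_bp = mean_read_length_bp(hic_gb, hic_reads)

print(f"Implied genome size: {genome_gb:.2f} Gb")
print(f"PacBio mean read length: {pacbio_mean_kb:.2f} kb")
print(f"PacBio depth at implied genome size: {pacbio_depth:.1f}x")
print(f"Hi-C mean bases per read: {hic_mean_bp:.1f} bp")
```

Under these assumptions, the PacBio mean read length recovered from the total yield and read count agrees with the reported mean subread length of 19.54 kb.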