Genome sequencing, size estimation and assembly
The genome size of O. kokonorica was estimated to be
~520 Mb by K-mer analysis based on 35.27 Gb of cleaned
Illumina data (Figure S1; Table S1). A combination of Illumina, Nanopore
and Hi-C technologies was adopted for sequencing to accurately assemble
its genome. The raw assembly, based on 61.87 Gb of Nanopore long reads
(110× coverage of the estimated ~520 Mb genome; Table
S1), was polished using NextPolish and purged of redundant haplotigs with
purge_haplotigs, resulting in a final genome assembly
with a length of 556 Mb and a contig N50 of 9.08 Mb. The Benchmarking
Universal Single-Copy Orthologs (BUSCO) evaluation score was 97.6%,
indicating a highly complete genome assembly (Table 1;
Table S2). Using ~137 Gb of Hi-C data, we further
anchored 127 contigs onto 20 pseudochromosomes. In total, 99.80%
(554.86 Mb) of the assembly was anchored and oriented on the 20
pseudochromosomes (Figure S2; Table S3). In addition, 98.93% of the Illumina
short reads were properly mapped to the final genome assembly (Table S4). These
assessments indicate that the genome of O. kokonorica was
assembled with high quality, completeness and accuracy.
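
For readers unfamiliar with the contiguity and anchoring statistics quoted above, the following minimal Python sketch shows how such values are conventionally computed. It is an illustration only, not the pipeline used in this study: the function names `n50` and `percent_anchored` and the toy contig lengths are hypothetical, and because the reported figures (556 Mb, 554.86 Mb) are rounded, the recomputed anchoring rate matches the reported 99.80% only approximately.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches half")


def percent_anchored(anchored: float, total: float) -> float:
    """Percentage of the assembly placed on pseudochromosomes."""
    if total <= 0:
        raise ValueError("total assembly length must be positive")
    return 100.0 * anchored / total


# Toy contig lengths (hypothetical, not data from this study).
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)  # 6 + 5 = 11 >= 10, so N50 is 5

# Rounded values quoted in the text, in Mb.
anchoring_rate = percent_anchored(554.86, 556.0)
print(f"toy N50 = {toy_n50}; anchored = {anchoring_rate:.1f}%")
```

Sorting the lengths in descending order and stopping at the first contig that pushes the cumulative sum past half of the total is the standard definition of N50; the same function applies unchanged to scaffold or pseudochromosome lengths.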