2.2 DNA preparation and sequencing
To estimate genome size and heterozygosity, 50 male adults were
collected from the same colony, and high-quality genomic DNA was
extracted with the Sangon Biotech Ezup column animal genomic DNA purification
kit. The extracted DNA was used to construct a paired-end 150 bp (PE150)
library, which was sequenced on an Illumina sequencer. After quality
control, 20.62 Gb of clean data were obtained; the total sequencing depth
was about 120.75×, the GC content was about 29.07%, the proportion of Q20
bases was more than 96.96%, and the proportion of Q30 bases was more than 91.91%.
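Sequencing depth is the total number of sequenced bases divided by the genome size. The Python sketch below inverts that relation to show which genome size the reported data volume and depth imply. The function names are hypothetical, and the result is only the arithmetic consequence of the two figures above; it may differ from the genome size estimated elsewhere in the study.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Average depth = sequenced bases / genome size (both in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def implied_genome_size(total_bases: float, depth: float) -> float:
    """Genome size (bp) implied by a data volume and a reported depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return total_bases / depth


# Figures reported above: 20.62 Gb of clean data at about 120.75x depth.
clean_bases = 20.62e9
reported_depth = 120.75

size_bp = implied_genome_size(clean_bases, reported_depth)
print(f"Implied genome size: {size_bp / 1e6:.1f} Mb")
```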
Another 50 male adults from the same colony were selected and delivered
to Biomarker Biotechnology Co., Ltd. to construct long-read DNA
libraries, which were sequenced on the Oxford Nanopore Technologies (ONT)
platform. A total of 20.67 Gb of raw data with a read N50 of 24.7 kb were
generated on the third-generation nanopore sequencer. After filtering
out adapters, short fragments, and low-quality data, 20.32 Gb of clean
data were retained.
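The read N50 is the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal Python sketch of that definition follows; the function name `read_n50` and the toy read lengths are illustrative assumptions and are not taken from the study's data or pipeline.

```python
from typing import Iterable


def read_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of read lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one read length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first read where the cumulative sum reaches half the total.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example with made-up read lengths (bp).
toy_lengths = [2, 3, 4, 5, 6]
print(read_n50(toy_lengths))
```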
Furthermore, another 500 male adults from the same colony were selected
to construct Hi-C DNA libraries, which were sequenced on the Illumina
platform; a total of 22.92 Gb of clean data were obtained, with
the proportion of Q30 bases reaching more than 91.46%. After library
quality evaluation, the proportion of reads containing enzyme digestion
sites in the Hi-C library reached more than 31.09% (Supplementary Table
1). The transcriptome of 50 male adults of C. striatipennis Kieffer from the same colony was sequenced, and the transcriptome of C. riparius Meigen was downloaded from NCBI; both were
used to assist genome annotation.
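The Q20 and Q30 proportions reported for the Illumina libraries are the shares of bases whose Phred quality score is at least 20 or 30, corresponding to base-call error probabilities of at most 1% and 0.1%. The Python sketch below computes such a proportion from FASTQ quality strings. It assumes Phred+33 encoding, and the function name and example strings are hypothetical; it does not represent the quality-control software used in the study.

```python
from typing import Iterable


def fraction_at_or_above(quality_strings: Iterable[str],
                         threshold: int,
                         offset: int = 33) -> float:
    """Fraction of bases with Phred score >= threshold.

    Assumes ASCII-encoded qualities with the given offset (Phred+33 here).
    """
    total = 0
    passing = 0
    for quals in quality_strings:
        for ch in quals:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    if total == 0:
        raise ValueError("no bases supplied")
    return passing / total


# Made-up quality strings: 'I' = Q40, '?' = Q30, '5' = Q20, '+' = Q10.
example_quals = ["II5+", "??5+"]
q20 = fraction_at_or_above(example_quals, 20)
q30 = fraction_at_or_above(example_quals, 30)
print(f"Q20: {q20:.2%}, Q30: {q30:.2%}")
```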