3.1 Species identification
C. striatipennis Kieffer has previously been misidentified as
C. kiiensis Tokunaga or C. strenzkei Fittkau (Lacerda et al., 2014).
It is widely distributed in China, where it has usually been reported
under the name C. kiiensis Tokunaga. In the present study, the
divergence of the cytochrome oxidase subunit 1 (COI) gene between our
sequence and the sequence released by Martin was 0.46%, as calculated
from the genetic distance matrix (Supplementary Table 2). This
divergence falls within the range of genetic distances typically
observed within a single species (Hebert et al., 2003).
Therefore, the laboratory colony of chironomids used in the present
study cannot be distinguished, by either morphology or DNA barcode,
from C. striatipennis as identified by Martin, so we follow Martin’s
identification and name it C. striatipennis Kieffer (Amora et al., 2015).
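As an illustration of how a pairwise COI divergence such as the 0.46% above can be obtained, the following is a minimal sketch that computes an uncorrected p-distance between two aligned sequences. The distance model actually used for Supplementary Table 2 is not stated here, so the choice of p-distance is an assumption, and the two short sequences are hypothetical toy data, not the barcodes from this study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance: mismatching sites / compared sites.

    Both sequences must already be aligned to the same length.
    Sites with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical toy alignment (not real COI data)
ours = "ATGGCATTAGGAACT-GATT"
reference = "ATGGCGTTAGGAACTTGATT"
print(f"divergence: {100 * p_distance(ours, reference):.2f}%")
```

For real barcodes, a corrected model (for example Kimura 2-parameter) may give slightly different values at larger divergences; at divergences below 1% the difference is usually negligible.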
3.2 Genome sequencing and characteristics
The genome of C. striatipennis, which has a karyotype of 2N=2X=8, was
estimated by K-mer analysis to be about 170.79 Mb in size, with a
heterozygosity rate of about 1.13% and a repeat content of about
23.63% (Supplementary Figure 1). These characteristics indicate that
C. striatipennis has a highly heterozygous, complex genome.
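The principle behind a K-mer based genome size estimate can be sketched as follows: the total number of K-mers divided by the depth of the main peak in the K-mer frequency histogram. This is a simplified illustration only. The tool and parameters used for Supplementary Figure 1 are not given here, the function name and the `min_depth` cutoff are hypothetical, and the histogram is toy data. A naive peak search like this one does not model the separate heterozygous peak expected in a genome with 1.13% heterozygosity, which dedicated tools handle explicitly.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 5) -> float:
    """Estimate genome size as (total K-mers) / (depth of the main peak).

    histogram maps K-mer depth -> number of distinct K-mers at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (the cutoff value is an assumption chosen for illustration).
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no K-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)  # depth with the most distinct K-mers
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Hypothetical toy histogram: an error tail at depth 1-2 and one peak at 20
toy_histogram = {1: 1000, 2: 300, 18: 40, 19: 70, 20: 100, 21: 70, 22: 40}
print(estimate_genome_size(toy_histogram))
```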
To obtain a better assembly, Oxford Nanopore Technologies sequencing
data were used to build the preliminary draft genome assembly, which
was then polished with the Illumina data.
BUSCO assessment indicated that the completeness of the gene set of
the assembled draft genome was 95.0%, showing that the genome assembly
of C. striatipennis was sufficiently complete and suitable for the
subsequent anchoring of sequences to chromosomes (Supplementary
Figure 2, Supplementary Table 3). Based on 22.92 Gb of clean reads
from the Hi-C library, 78 scaffolds were assembled, comprising 4
pseudomolecules representing the 4 chromosomes and 74 unanchored
fragments. The lengths of the 4 pseudochromosomes ranged from
20.40 Mb to 60.13 Mb, with a scaffold N50 value of 64.51 Mb (Figure 3).
In addition, about 179.77 Mb of contigs were anchored to the 4
pseudochromosomes, giving an anchoring rate of 98.86% (Supplementary
Table 5). The chromatin interaction data suggest that our Hi-C assembly
is of high quality (Figure 2). Using BUSCO, we identified 95.0%
(3119/3285), 98.7% (1349/1367) and 98.1% (936/954) of the conserved
genes in C. striatipennis by alignment to the Diptera, Insecta and
Metazoa databases, respectively (Supplementary Figure 3). Compared
with the other assembled chironomid genomes, the genome assembly of
C. striatipennis is of higher quality and more complete (Tab. 1).
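For readers less familiar with the assembly metrics quoted above, the sketch below shows how a scaffold N50 and an anchoring rate are conventionally computed. It assumes the anchoring rate is defined as anchored length divided by total assembled length, which is not stated explicitly here. The function names are hypothetical and the scaffold lengths are toy values, not the lengths of this assembly.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def anchoring_rate(anchored_bp: float, total_bp: float) -> float:
    """Fraction of the assembly placed on pseudochromosomes."""
    if total_bp <= 0:
        raise ValueError("total length must be positive")
    return anchored_bp / total_bp


# Hypothetical toy scaffold lengths in Mb (4 long scaffolds, 3 small fragments)
toy_lengths = [60, 50, 40, 20, 5, 3, 2]
print("N50:", n50(toy_lengths))
print("anchoring rate:", round(anchoring_rate(170, sum(toy_lengths)), 4))
```

Under the same assumed definition, 179.77 Mb anchored at a rate of 98.86% corresponds to a total assembled length of roughly 181.8 Mb.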
C. striatipennis is the first species in the genus Chironomus with a
chromosome-level genome assembly. The genome size of C. striatipennis
is similar to that of other species in the genus Chironomus
(C. riparius, C. tentans and C. tepperi (236 Mb)), but much larger
than that of species in the subfamily Orthocladiinae (C. marinus,
Belgica antarctica (99 Mb) and P. akamusi) and in the genus
Polypedilum (P. vanderplanki and P. pembai (122 Mb)) (Tab. 1)
(Kaiser et al., 2016; Kelley et al., 2014; Sun et al., 2021). These
results are also consistent with those of Cronette et al. (2015). In
conclusion, it can be inferred that genome size in the genus
Chironomus is larger than in the subfamily Orthocladiinae and the
genus Polypedilum.
Table 1. Genome statistics and comparisons among chironomid species
whose genomes have been sequenced