2.6 Pseudomolecule construction by Hi-C
To construct a chromosome-level assembly of the genome, raw Hi-C reads
were first trimmed with fastp v. 0.12.6 (Chen et al. 2018). During
quality control, low-quality reads, adapter contamination and
ambiguous bases (N) were removed, and duplicates were filtered out.
The retained high-quality clean paired-end reads were aligned to the
draft genome assembly using the
Juicer pipeline v. 2.3.2 (Durand et al. 2016). Afterwards,
based on the locations of DpnII restriction sites, the proportions of Self
Circle, Dangling End and Dumped Pairs were determined to evaluate
the validity of the paired-end reads. The contigs were then clustered,
ordered and oriented using the 3D de novo assembly (3D-DNA, v. 170123)
pipeline (Dudchenko et al. 2017). The Hi-C contact matrix was
visualized using Juicebox v. 1.9.8 (Dudchenko et al. 2018;
Robinson et al. 2018), and misassemblies and misjoins were
manually corrected based on neighboring interactions. Pseudomolecules
were then constructed from the validated assembly using the
finalize-output.sh script from 3D-DNA. The completeness of
the assembly was evaluated using Benchmarking Universal Single-Copy
Orthologs (BUSCO) v. 3.1 (Simão et al. 2015).
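
The restriction-site-based evaluation of read pairs described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the strand-based rules and the use of 5' mapping coordinates are assumptions made for illustration, and the "dumped" category is simplified here to mean any same-fragment pair that is neither a dangling end nor a self circle. The sketch locates DpnII recognition sites (GATC) in a contig, assigns each read to a restriction fragment, and reports the fraction of pairs in each category.

```python
import bisect
import re
from collections import Counter

DPNII_SITE = "GATC"  # DpnII recognition sequence


def restriction_sites(seq, site=DPNII_SITE):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    pattern = f"(?={re.escape(site)})"  # lookahead so overlapping hits are found
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def fragment_index(sites, pos):
    """Index of the restriction fragment that contains position `pos`."""
    return bisect.bisect_right(sites, pos)


def classify_pair(sites, pos1, strand1, pos2, strand2):
    """Classify one read pair mapped to a single contig (simplified rules)."""
    if fragment_index(sites, pos1) != fragment_index(sites, pos2):
        return "valid"  # the two reads fall in different fragments
    # Order the two reads by mapped position to inspect their orientation.
    (_, left), (_, right) = sorted([(pos1, strand1), (pos2, strand2)])
    if left == "+" and right == "-":
        return "dangling_end"  # reads face each other within one fragment
    if left == "-" and right == "+":
        return "self_circle"  # reads face outward within one fragment
    return "dumped"  # assumption: all other same-fragment pairs


def category_fractions(sites, pairs):
    """Fraction of pairs per category; each pair is (pos1, strand1, pos2, strand2)."""
    counts = Counter(classify_pair(sites, *pair) for pair in pairs)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    contig = "AAGATCTTTTTTTTGATCAA"  # toy sequence, not real data
    sites = restriction_sites(contig)
    toy_pairs = [(5, "+", 10, "-"), (5, "-", 10, "+"), (0, "+", 16, "-")]
    print(sites)
    print(category_fractions(sites, toy_pairs))
```

In practice, dedicated Hi-C processing tools apply additional criteria such as mapping quality and insert size, so this sketch conveys only the logic of fragment assignment and orientation.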