Optimization of the “in-silico” mate-pair method improved contiguity and accuracy of genome assembly
  • Tao Zhou,
  • Liang Lu,
  • Chenhong Li
Tao Zhou
Shanghai Ocean University

Corresponding Author: [email protected]

Liang Lu
Shanghai Ocean University
Chenhong Li
Shanghai Ocean University

Abstract

A combination of next-generation sequencing technologies and mate-pair libraries of large insert sizes is used as a standard method to generate genome assemblies with high contiguity. Third-generation sequencing techniques are also used to improve the quality of assembled genomes. However, both mate-pair libraries and third-generation libraries require high-molecular-weight DNA, which makes them unsuitable for samples that contain only degraded DNA. An in silico method that generates mate-pair libraries from a reference genome was devised for assembling target genomes. Although this method significantly improved the contiguity and completeness of assembled genomes, the resulting assemblies contained a high level of errors, and the strategies for using reference genomes were not optimized. Here, we tested different strategies for using reference genomes to generate in silico mate-pairs. The results showed that using a closely related reference genome from the same genus was more effective than using divergent references. Retaining the in silico mate-pairs that were conserved between two references and using them to guide genome assembly reduced the number of misassemblies (18.6% – 46.1%) and increased the contiguity of assembled genomes (9.7% – 70.7%), while maintaining gene completeness at a level similar to or marginally lower than that obtained with the current method. Finally, we compared the optimized method with another reference-guided assembler, RaGOO. RaGOO produced longer scaffolds (17.8 Mbp vs 3.0 Mbp) but resulted in a much higher misassembly rate (85.68%) than our optimized in silico mate-pair method.
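The two ideas in the abstract, sampling read pairs from a reference at a fixed insert size and keeping only the pairs that two references agree on, can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, the forward/reverse-complement read orientation, and the use of exact sequence identity as the test for "conserved" are all assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def in_silico_mate_pairs(reference: str, insert_size: int,
                         read_len: int, step: int):
    """Sample (start, read1, read2) tuples from a reference sequence.

    Hypothetical helper: each pair spans `insert_size` bases. read1 is
    the first `read_len` bases of the span; read2 is the reverse
    complement of the last `read_len` bases (orientation is an assumption).
    """
    pairs = []
    for start in range(0, len(reference) - insert_size + 1, step):
        fragment = reference[start:start + insert_size]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment[-read_len:])
        pairs.append((start, read1, read2))
    return pairs


def conserved_pairs(pairs_a, pairs_b):
    """Keep pairs from reference A whose two reads also form a pair in B.

    Assumption: "conserved" is approximated here by exact sequence
    identity of both reads; a real analysis would compare alignments.
    """
    seen = {(r1, r2) for _, r1, r2 in pairs_b}
    return [p for p in pairs_a if (p[1], p[2]) in seen]


if __name__ == "__main__":
    ref_a = "AAAACCCCGGGG"
    ref_b = "AAAATTTTGGGG"  # differs from ref_a only inside the insert
    pairs_a = in_silico_mate_pairs(ref_a, insert_size=12, read_len=4, step=1)
    pairs_b = in_silico_mate_pairs(ref_b, insert_size=12, read_len=4, step=1)
    print(conserved_pairs(pairs_a, pairs_b))
```

In this toy case the two references differ only in the middle of the insert, so the read pair is kept; a difference inside either read would cause the pair to be discarded.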
12 Aug 2021 Submitted to Molecular Ecology Resources
21 Sep 2021 Assigned to Editor
21 Sep 2021 Submission Checks Completed
21 Sep 2021 Reviewer(s) Assigned
26 Nov 2021 Review(s) Completed, Editorial Evaluation Pending