David Coil edited Genome Assembly and Annotation.md
over 9 years ago
Commit id: fe6617a8adbf93a45aafc108886d92617529c31d
diff --git a/Genome Assembly and Annotation.md b/Genome Assembly and Annotation.md
index c656565..75c7081 100644
--- a/Genome Assembly and Annotation.md
+++ b/Genome Assembly and Annotation.md
...
4. scaffolding
5. verification of scaffolds/contigs
The first step simply removes poor-quality sequences, as well as adaptor sequences left over from sequencing. Some assemblers follow this with error correction, where reads are compared to each other to eliminate sequencing errors. Next is contig assembly, where overlapping reads are assembled into long continuous stretches of sequence. Scaffolding refers to the alignment and orientation of these contigs relative to each other (where possible). The last step is verification, where reads are mapped back to the contigs/scaffolds to eliminate misassemblies.
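To make the trimming and contig assembly steps concrete, here is a minimal toy sketch in Python. It is not how any of the assemblers discussed here work internally: production tools typically use de Bruijn or overlap graphs, tolerate sequencing errors, and handle reverse complements, none of which is attempted below. The function names (`trim_low_quality`, `overlap`, `greedy_assemble`) and the thresholds (`min_q=20`, `min_len=3`) are illustrative assumptions only.

```python
def trim_low_quality(seq, quals, min_q=20):
    """Trim the 3' end of a read back to the last base with quality >= min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no exact overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy greedy assembly: repeatedly merge the best-overlapping pair."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining pair overlaps by min_len or more; stop merging.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical error-free reads drawn from the sequence ATGGCGTACCTGA.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
contigs = greedy_assemble(reads)
```

With these three made-up reads, each adjacent pair shares a four-base overlap, so the greedy merge collapses them into a single contig. With real data, repeats and sequencing errors make this greedy strategy unreliable, which is one reason the verification step matters.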
There is a plethora of programs that can perform some or most of these steps. These programs include commercial and open-source options; some are very user-friendly, while others are extremely difficult to install or use. Common assemblers for bacterial genomes include SPAdes \cite{Bankevich_2012}, MIRA \cite{Chevreux_2004}, SGA \cite{Simpson_2010}, Velvet \cite{Zerbino_2008}, CLC (CLC Bio), and A5 \cite{Tritt_2012}. Good sources for overviews of genome assemblers and the assembly process include the GAGE project \cite{Salzberg_2012}, the GAGE-B project \cite{Magoc_2013}, and the Assemblathon Project \cite{Earl_2011}.