Jenna M. Lang edited Genome Assembly and Annotation.md  about 10 years ago

Commit id: 6c10aabcf2dd6e00a85fa99d7cd6826d9dc5386f


The DNGNGWU00001 to DNGNGWU00040 markers represent 40 highly conserved bacterial genes. A missing marker does not show up as a zero; it is simply absent from the list, so you must verify the list manually. Most of the genes should appear only once. An occasional 2 is fine, but if all or most of the genes appear two or even three times, you have most likely sequenced multiple bacteria together. Also check that there is no 18S rRNA (at the top of the list) to make sure your sample has not been contaminated with a eukaryote (e.g. yeast).

Important Note: Markers 4, 8 and 38 are no longer included in the PhyloSift analysis, so do not be concerned if they are not listed.

##Annotation

###Options

There are a number of different pipelines available for annotation of bacterial genomes. These include Prokka, IMG, RAST, PGAAP and others.

####Prokka

Command line based

http://www.vicbioinformatics.com/software.prokka.shtml

####Integrated Microbial Genomes (IMG)

Requires a login and they do the annotation

http://img.jgi.doe.gov/

####Rapid Annotation using Subsystem Technology (RAST)

Web-based, using closed (unknown) algorithms

http://rast.nmpdr.org/

####Prokaryotic Genome Automatic Annotation Pipeline (PGAAP)

Built into NCBI and only accessible upon request

http://www.ncbi.nlm.nih.gov/genome/annotation_prok/

Each of these pipelines has advantages and disadvantages, and each will give slightly different results. Here we recommend RAST because it is web-based, easy to use, returns results within hours, and provides a framework for analyzing the results. However, RAST annotations are very difficult to submit to NCBI, so we recommend letting NCBI annotate the genome with PGAAP upon submission.

###RAST Annotation

Annotating the genome with RAST is also an easy way to locate the full-length 16S gene, which is required for Section IX, "Building A Phylogenetic Tree", of the workflow.
1. Navigate to http://rast.nmpdr.org/
2. Register a new account
3. Once you have created an account, log in
4. Hover over the Your Jobs tab at the top of the page and click on Upload New Job
5. In order to proceed you must specify a domain, a genus, a species, and the genetic code (usually "11")
6. Click Finish the Upload

The annotation will take some time, ranging from 2 hours to a few days, depending on server load. RAST will email you when it is complete. Once the annotation is complete, use their SEED Viewer to explore the annotation and metabolic pathways of the organism.
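
The manual marker check described at the start of this section (missing markers, repeated markers, and an unexpected 18S hit) could be scripted along the following lines. This is only a minimal sketch: it assumes you have already copied the marker counts into a Python dictionary, that marker IDs are written as `DNGNGWU00001` through `DNGNGWU00040`, and that the 18S entry is keyed by a label you supply. The function name `check_markers` and the `label_18s` parameter are hypothetical, and the sketch does not read PhyloSift output files directly.

```python
from typing import Dict

# Markers 4, 8 and 38 are no longer included in the analysis, so skip them.
RETIRED = {4, 8, 38}

# Assumption: marker IDs follow the pattern DNGNGWU00001 ... DNGNGWU00040.
EXPECTED = [f"DNGNGWU{n:05d}" for n in range(1, 41) if n not in RETIRED]


def check_markers(counts: Dict[str, int], label_18s: str = "18S") -> Dict[str, object]:
    """Summarize a marker-count table (hypothetical helper).

    counts maps a marker ID to the number of times it was found.
    A marker that was not found is expected to be absent from the
    table rather than listed with a zero.
    """
    missing = [m for m in EXPECTED if counts.get(m, 0) == 0]
    repeated = [m for m in EXPECTED if counts.get(m, 0) >= 2]
    return {
        "missing": missing,
        "repeated": repeated,
        # A majority of repeated markers suggests several bacteria were sequenced together.
        "majority_repeated": len(repeated) > len(EXPECTED) / 2,
        # Any 18S hit suggests eukaryotic contamination (e.g. yeast).
        "has_18s": counts.get(label_18s, 0) > 0,
    }


if __name__ == "__main__":
    # Made-up example: every marker seen once, except one seen twice.
    example = {m: 1 for m in EXPECTED}
    example["DNGNGWU00010"] = 2
    print(check_markers(example))
```

An occasional repeated marker is reported but does not set `majority_repeated`, which matches the guidance that an occasional 2 is fine.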