Authorea

Jenna M. Lang edited Organism Identification using 16s rRNA gene sequence.md over 9 years ago

Commit id: 89bfe6d9303eb2c97f46937632fc0972e4961378

deletions | additions

#Organism identification using 16S rRNA gene sequence In a classroom or undergraduate research setting the project may not have a particular bacterial species in mind. In this case it It is necessary to screen the 16S rDNA Sanger sequencing results for possible genome sequencing candidates. We recommend starting with BLAST results, then continuing onto the Genomes Online Database (GOLD), and simply Google searching. performing an internet search to obtain information about the organism you have isolated. In many cases cases, it will be handy useful toalso build a phylogenetic tree to aid in identification since identification, as the BLAST search results may not be sufficiently informative. ##BLAST 16S rDNA sequence Begin by navigating to the Standard Nucleotide BLAST at NCBI: http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome Paste in your Sanger consensus sequence. We recommend checking the box to exclude Uncultured/environmental sample sequences sequences, since these will not be informative for identification. Be sure the nucleotide collection (nr/nt) is selected under database and click the BLAST button. ##Interpreting the results Depending on the quality of the Sanger sequencing and the particular bacteria sequenced, the BLAST search results can range from definitive to relatively uninformative. Examples of both are discussed below. 1. In some cases it is not necessary to build a phylogenetic tree for further identification. If all of the top hits are the same species (or end in sp.), have e-values _e_-values of 0.0, good query coverage, and 99% to 100% identity identity, you can proceed to "Using GOLD". 2. In other cases cases, the results aremuch more ambiguous. The results may show more than 99% identity to multiple species within multiple genera. In this case, proceed to section 11 "Building a 16S rDNA Phylogenetic Tree", before using GOLD. 3. Another possibility is that you will get significantly less than 99% identity to any sequences at NCBI. in the NCBI database. One explanation for this is that your sequence is of poor quality sequence, this quality. This might require more stringent trimming using SeqTrace or even resequencing if the quality is bad poor enough to make assigning taxonomy difficult. Another possibility is having that you have isolated something that is not that very closely related to anything in the NCBI database. In the latter case case, we would recommend first re-doing the BLAST search, but unchecking the "Uncultured/environmental sample" to see if the sequence matches others that have been found, but are not associated with a cultured organism. In either case, we would recommend re-sequencing for confirmation and then proceeding to section 11 "Building a 16S rDNA Phylogenetic Tree" to examine the phylogenetic context of the novel sequence. ##Using GOLD (the Genomes Online Database) Go to: http://genomesonline.org/cgi-bin/GOLD/index.cgi Under the Search tab tab, click the "Quick Search" option and you should be taken to a page that looks like the screen shot displayed in Figure \ref{fig:GOLD}. Fill out the blue Organism Information (Organism Name) section, with information about your microbe from BLAST and click submit search. We usually search for only the genus to get a sense for how well that genus is represented in the database and which species are present. Figure \ref{fig:GOLD\_results} shows an example screen shot of the results for "_Brachybacterium_." Clicking on a project ID will take you to a more detailed description of the project including its project status (complete, permanent draft, incomplete, targeted). While some "incomplete" and "targeted" projects will be completed, many will not not, so we mostly tend to ignore these categories. If you have relatively ambiguous identification results (e.g. (_e.g_. you think you have some sort of _Brachybacterium_ but aren't sure which species) species,) it could be worthwhile to perform an alignment of your 16S sequence with those from genomes already in Genbank. ##Align 16S Sequences using Align Sequences Nucleotide BLAST First locate the 16S sequences of the genome you'd like to compare to, by searching the NCBI Nucleotide database for "Species 16s gene".

http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&BLAST_SPEC=blast2seq&LINK_LOC=align2seq Paste in the two 16S rDNA sequences and click on the "BLAST" button. Unless both your sequence and the sequence to which you are comparingto were amplified with the same primers, the query coverage will not be 100%. A low identity can be the result of poor sequence quality or taxonomic distance. A choice of whether to sequence an organism based on these results depends on the project goal. For example example, an identity of 100% suggests that at least at the 16S level, the candidate organism is very similar to what is already in the database. However, many organisms vary greatly in gene content between strains and an additional genome may still be informative. There is also significant debate over what level The use ofrelatedness at the 16S level should be used to determine the difference between species, or if this rRNA gene sequence percent identity as a proxy for species delimitation in bacteria iseven a relevant question subject of some debate in the field. \cite{Chan_2012}\cite{Drancourt_2005}\cite{Hanage_2006}\cite{Stackebrandt_2002}.