David Coil edited 16S rDNA Sequencing and Analysis (Organism Identification).md  over 9 years ago

Commit id: 2517ed8eeaf3cdf657e73ed40ca20e0a0f6c3058


##Sanger Sequence Processing

Upon receiving Sanger reads from a sequencing facility, typically via e-mail, it is necessary to do some pre-processing before they can be analyzed. These steps include quality trimming the reads, reverse complementing the reverse sequence, aligning the reads, generating a consensus sequence, and converting to FASTA format. There are very few free software options that allow the user to perform these steps.

In this workflow we recommend using an automated pipeline available at the Ribosomal Database Project\cite{Cole_2013} if working with a large number of sequences. This pipeline only provides a rough view: it doesn't reverse complement or align the reads, it simply quality trims them and outputs the data in a format that can be fed directly to the BLAST program at NCBI\cite{2231712}. This will at least give an idea of which genus, and sometimes which species, each sample belongs to. We then recommend processing samples of interest using SeqTrace\cite{stucky2012seqtrace}, which allows the user to see the trace, process the sequences manually, and get a longer, more accurate sequence for analysis. We have also created a script that performs the same steps as SeqTrace automatically, but does not allow you to adjust any of the parameters. The choice between our script (easy, little control) and SeqTrace (more complex, more control) will depend on the user and the project.
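
To make the pre-processing steps listed above concrete, here is a minimal Python sketch of quality trimming, reverse complementing, joining the two reads, and writing FASTA. It is an illustration under stated assumptions, not the method used by the RDP pipeline, SeqTrace, or our script. All function names, the quality cutoff of 20, and the toy reads are hypothetical, and the exact suffix/prefix overlap join is a naive stand-in for a real pairwise alignment and consensus call.

```python
# Illustrative sketch only: helper names, thresholds, and reads are hypothetical.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def quality_trim(seq, quals, min_q=20):
    """Strip bases from both ends until a base with quality >= min_q is reached."""
    start = 0
    end = len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


def merge_by_overlap(forward, reverse_rc, min_overlap=4):
    """Join two reads on their longest exact suffix/prefix overlap.

    Returns None when no overlap of at least min_overlap bases exists.
    Real Sanger reads contain mismatches, so a proper aligner is needed in practice.
    """
    longest = min(len(forward), len(reverse_rc))
    for size in range(longest, min_overlap - 1, -1):
        if forward[-size:] == reverse_rc[:size]:
            return forward + reverse_rc[size:]
    return None


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with wrapped lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy forward and reverse reads with per-base quality scores (made-up data).
fwd, _ = quality_trim("NNACGTTGCA", [5, 8, 30, 35, 40, 40, 38, 36, 30, 25])
rev, _ = quality_trim("CCGGTGCAAN", [30, 30, 30, 30, 30, 30, 30, 30, 30, 5])

# The reverse read is reverse complemented so both reads are on the same strand.
consensus = merge_by_overlap(fwd, reverse_complement(rev))
record = to_fasta("sample1", consensus)
```

The resulting `record` string can be written to a `.fasta` file and submitted to BLAST. Reading the actual trace files from the sequencing facility would require a parser for their format, which this sketch leaves out.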