Chuck Pepe-Ranney edited Sequence Quality Control and Analysis.tex  almost 10 years ago

Commit id: 5bbcd45842295d7074cbb6da767176f0d4b508aa


\subsection{Sequence Quality Control and Analysis}

\subsubsection{Quality Control}

The 16S sequence collection was demultiplexed, and sequences with sample barcodes not matching expected barcodes were discarded. We used the maximum expected error metric \cite{23955772}, calculated from sequence quality scores, to cull poor-quality sequences from the dataset. Specifically, we discarded any sequence whose maximum expected error count exceeded 1 after truncation to 175 nt. The forward primer and barcode were trimmed from the remaining reads. We checked that all primer-trimmed, error-screened and truncated sequences were derived from the same region of the LSU or SSU rRNA gene (23S and 16S sequences, respectively) by aligning the reads to the Silva LSU or SSU rRNA gene alignment (``Ref'' collection, release 115) using the NAST-algorithm \cite{16845035} aligner implemented in Mothur \cite{19801464} and inspecting the alignment coordinates. Reads falling outside the expected alignment coordinates were culled from the dataset. The remaining reads were trimmed to consistent alignment coordinates, such that all reads began and ended at the same position in the SSU rRNA gene, and were screened for chimeras with UChime in ``denovo'' mode \cite{21700674} via the Mothur UChime wrapper.

\subsubsection{Taxonomic annotations}