Chuck methods edits  almost 9 years ago

Commit id: 5111a82e54b91c8daee4bab261a1f2bc7519e17c


Roche 454 FLX system using titanium chemistry at Selah Genomics (Columbia, SC). SSU rRNA gene sequences were initially screened by maximum expected errors at a specific read length threshold \citep{edgar2013}. Reads that had less than~0.5 expected errors at a length of 250 nt were then aligned to the Silva Reference Alignment as provided in the Mothur software package using the Mothur NAST aligner \citep{DeSantis2005,schloss2009}. Reads that did not align to the expected region of the SSU rRNA gene were discarded. After expected error and alignment-based quality control, 87\% of original reads remained. The remaining reads were annotated using the “UClust” taxonomic annotation framework in QIIME \citep{caparaso2010,edgar2010}. We used 97\% cluster seeds from the Silva SSU rRNA database \citep{quast2013} as reference for taxonomic annotation (provided at QIIME website). Sequences were distributed into OTUs with a centroid-based clustering algorithm (i.e. UPARSE \citep{edgar2013}). The centroid selection also included robust chimera screening \citep{edgar2013}. OTU centroids were established at a threshold of 97\% sequence identity and non-centroid sequences were mapped back to centroids. Reads that could not be mapped to an OTU centroid at greater than or equal to 97\% sequence identity were discarded. For phylogenetic reconstruction, alignment was performed with SSU-Align \citep{nawrocki2009,nawrocki2013}. Columns in the alignment that were aligned with poor confidence ($<$ 95\% of characters had posterior probability alignment scores $>$ 95\%) were not considered when building the phylogenetic tree. FastTree \citep{price2010} was used with default parameters to build the phylogeny.
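The expected-error screen described above can be illustrated with a minimal sketch. It assumes the standard definition of expected errors as the sum of per-base error probabilities $10^{-Q/10}$ derived from Phred scores, and it assumes that reads are truncated to 250 nt and that shorter reads are dropped. The function names and the truncation behavior are illustrative assumptions, not the pipeline that was actually run.

```python
from typing import Sequence


def expected_errors(phred_scores: Sequence[int]) -> float:
    """Sum of per-base error probabilities implied by Phred quality scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_filter(phred_scores: Sequence[int],
                  trunc_len: int = 250,
                  max_ee: float = 0.5) -> bool:
    """Keep a read only if its first `trunc_len` bases have fewer than
    `max_ee` expected errors. Reads shorter than `trunc_len` are rejected
    (an assumption of this sketch)."""
    if len(phred_scores) < trunc_len:
        return False
    return expected_errors(phred_scores[:trunc_len]) < max_ee
```

For example, a 250 nt read with a uniform quality of Q30 has 0.25 expected errors and is kept, whereas the same read at Q20 has 2.5 expected errors and is discarded.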
NMDS ordination was performed on weighted Unifrac \citep{lozupone2005} distances. The Phyloseq \citep{mcmurdie2013} wrapper for Vegan \citep{oksanen2015} (both R packages) was used to compute sample values along NMDS axes. The 'adonis' function in Vegan was used to perform Adonis tests (default parameters) \citep{Anderson2001a}. We used DESeq2 (R package), an RNA-Seq differential expression statistical framework \citep{love2014}, to identify OTUs that were enriched in high density gradient fractions from $^{13}$C-treatments relative to corresponding density fractions from control treatments (for review of RNA-Seq differential expression statistics applied to microbiome OTU count data see (30)). We define "high density gradient fractions" as gradient fractions whose density falls between 1.7125 and 1.755 g ml$^{-1}$. Briefly, DESeq2 includes several features that enable robust estimates of standard error in addition to reliable ranking of logarithmic fold change (LFC) in abundance (i.e. gamma-Poisson regression coefficients) even with low count groups where LFC can often be noisy CITE. Further, statistical evaluation of LFC can be performed with selected

one standard deviation above the mean of all LFC values. P-values were corrected for multiple comparisons by using the Benjamini and Hochberg (BH) method (31). Independent filtering was performed on the basis of sparsity prior to correcting P-values for multiple comparisons. The sparsity threshold value that yielded the most P-values less than 0.10 was used for independent filtering by sparsity. Briefly, OTUs were eliminated if they failed to appear in at least 45\% of high density gradient fractions for a given $^{13}$C and control treatment pair; these OTUs are unlikely to have sufficient data to allow for the determination of statistical significance.
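The interplay between sparsity filtering and the BH correction described above can be sketched as follows. This is an illustrative outline only: it assumes that per-OTU P-values are computed once and then filtered, that an OTU "appears" in a fraction when its count is greater than zero, and that adjusted P-values are the ones compared against 0.10. The function names (`bh_adjust`, `prevalence`, `choose_sparsity_threshold`) are hypothetical and are not part of DESeq2 or any other package.

```python
from typing import List, Optional, Sequence, Tuple


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def prevalence(counts: Sequence[int]) -> float:
    """Proportion of gradient fractions in which the OTU was observed."""
    return sum(1 for c in counts if c > 0) / len(counts)


def choose_sparsity_threshold(
    otu_counts: Sequence[Sequence[int]],
    pvalues: Sequence[float],
    thresholds: Sequence[float],
    alpha: float = 0.10,
) -> Tuple[Optional[float], int]:
    """Return the threshold that yields the most adjusted P-values below
    `alpha`, together with that number. Ties keep the first threshold."""
    best_threshold, best_hits = None, -1
    for t in thresholds:
        # Drop OTUs observed in too few fractions before adjusting.
        kept = [p for counts, p in zip(otu_counts, pvalues)
                if prevalence(counts) >= t]
        hits = sum(1 for q in bh_adjust(kept) if q < alpha)
        if hits > best_hits:
            best_threshold, best_hits = t, hits
    return best_threshold, best_hits
```

Removing sparse OTUs before the correction reduces the number of tests, so OTUs with adequate data are penalized less by the multiple-comparison adjustment.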