Chuck Pepe-Ranney edited Sequence Quality Control and Analysis.tex  almost 10 years ago

Commit id: f78c1209bf0acc39776e72c655b42ce77e6f8b99


\subsubsection{Alpha and Beta diversity analyses}  Alpha diversity calculations were made using the PyCogent Python bioinformatics modules \cite{17708774}. Beta diversity analyses were made using phyloseq \cite{24699258} and its dependencies \cite{vegan}. Log$_{2}$ fold changes of group mean ratios and the corresponding null-hypothesis-based significance values were calculated using DESeq2 \cite{Love_2014}. All dispersion estimates from DESeq2 were calculated using a local fit for the mean-dispersion relationship. Native DESeq2 independent filtering was disabled in favor of explicit sparsity filtering, in which OTUs found in fewer than a threshold fraction of samples were discarded from the analysis. For biofilm versus planktonic comparisons of the bacterial 16S and plastid 23S libraries, we selected the sparsity threshold that produced the maximum number of OTUs with adjusted p-values for differential abundance below a false discovery rate of 10\%; this threshold was 10\% for both libraries. Cook's distance filtering was also disabled when calculating p-values with DESeq2. We used the Benjamini-Hochberg method to adjust p-values for multiple testing \cite{citeulike:1042553}. Identical DESeq2 methods were used to assess enriched OTUs from relative abundances grouped into high (C:P = 500) or low (C:P $<$ 500 and control) categories. A sparsity threshold of 25\% was used for ordination of both plastid 23S and bacterial 16S libraries. Additionally, we discarded any OTUs from the 23S data that could not be annotated as belonging to the Eukaryota. All DNA sequence based results were visualized using ggplot2 \cite{Wickham_2009}. Adonis tests for pairwise library comparisons were performed using the Bray-Curtis dissimilarity measure with the default number of permutations (999) (``adonis'' function in the vegan R package, \citet{vegan}).
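
As an illustration of the filtering and multiple-testing steps described above, the following is a sketch in notation introduced here, not a statement of the exact DESeq2 or phyloseq internals. Let $x_{ij}$ be the count of OTU $i$ in sample $j$ for $j = 1, \dots, n$, and assume that an OTU is ``found'' in a sample when $x_{ij} > 0$. The prevalence of OTU $i$ and the retention rule for a sparsity threshold $t$ (for example $t = 0.10$ for the differential abundance tests) can then be written as
\[
  f_i = \frac{1}{n} \sum_{j=1}^{n} \mathbf{1}[x_{ij} > 0], \qquad \mathrm{retain\ OTU}\ i \iff f_i \geq t .
\]
If $m$ OTUs pass this filter, all of them receive a p-value, and the p-values are ordered as $p_{(1)} \leq \dots \leq p_{(m)}$, the Benjamini-Hochberg adjusted p-value of the $k$-th ordered test is
\[
  \tilde{p}_{(k)} = \min_{l \geq k} \, \min\left(1, \frac{m}{l}\, p_{(l)}\right),
\]
and an OTU is counted as differentially abundant at a false discovery rate of 10\% when $\tilde{p} < 0.10$. Under this reading, a stricter sparsity threshold lowers $m$ and so softens the adjustment, but it also removes OTUs that might otherwise have been significant, which is one way to view the trade-off behind selecting the threshold empirically.
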
Principal coordinates of OTUs were found by averaging the site principal coordinate values for each OTU, with the OTU relative abundance values (within sites) as weights. The OTU weighted averages were then expanded to match the site-wise variances \cite{vegan}.
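
The weighted averaging described above can be sketched as follows; the symbols are introduced here for illustration only. Let $u_{jk}$ be the principal coordinate of site $j$ on axis $k$ and $r_{ij}$ the relative abundance of OTU $i$ within site $j$. The weighted-average coordinate of OTU $i$ on axis $k$ is
\[
  w_{ik} = \frac{\sum_{j} r_{ij}\, u_{jk}}{\sum_{j} r_{ij}} .
\]
Because averaging pulls the OTU coordinates toward the center of the ordination, they are then expanded. Assuming the expansion is a per-axis rescaling about the weighted mean $\bar{w}_{k}$ of the OTU coordinates (as we understand the \texttt{expand} option of the \texttt{wascores} function in vegan to do), it takes the form
\[
  w^{\ast}_{ik} = \bar{w}_{k} + \sqrt{\frac{\mathrm{Var}(u_{\cdot k})}{\mathrm{Var}(w_{\cdot k})}}\, \left(w_{ik} - \bar{w}_{k}\right),
\]
where both variances are abundance-weighted, so that the expanded OTU coordinates have the same weighted variance as the site coordinates on each axis.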