Authorea

Chuck Pepe-Ranney edited Results.tex almost 10 years ago

Commit id: d134015f3f330d791414474d994369af70773e00


In the bacterial libraries, sequences were distributed into 636 OTUs; 58\% of quality-controlled sequences fell into the top 25 OTUs ranked by decreasing sum of relative abundance across all samples. 23S plastid rRNA gene sequences were distributed into 359 OTUs; 71\% of sequences fell into the top 25 OTUs ranked by mean relative abundance across all samples. Rank abundance curves for each mesocosm-specific pair of planktonic and biofilm samples showed planktonic communities to be more sharply skewed in both the algal and bacterial datasets (Figure 9). We used an RNA-Seq differential expression statistical framework to find OTUs enriched in the given sample classes (R package DESeq2, developed by \citet{deseq}; for a review of RNA-Seq differential expression statistics applied to microbiome OTU count data see \citet{24699258}). We use the term ``differential abundance'' (coined by \citet{24699258}) to denote OTUs that have different proportion means across sample classes. We are particularly interested in two sample classes: 1) environment type (biofilm or planktonic) and 2) high carbon (C:P = 500) versus not high carbon (C:P = 10, C:P = 100 and C:P = control). A differentially abundant OTU, for instance, would have a proportion mean in one class that is statistically different from its proportion mean in another. This differential abundance could mark an enrichment of the OTU in either sample class, and the direction of the enrichment is apparent in the sign (positive or negative) of the metric used to summarize the proportion mean difference. Here we use log$_{2}$ of the proportion mean ratio (means are derived from OTU proportions for all samples in each given class) as our differential abundance metric. It is also important to note that the DESeq2 R package we use to calculate the differential abundance metric ``shrinks'' the metric in inverse proportion to the information content for each OTU.
In this way the magnitude of the differential abundance metric will be high only for OTUs for which we have strong confidence of true differential abundance, and the metric can be used to effectively rank OTUs by the magnitude of the sample class effect (i.e., OTUs with high proportion mean differences but also high within-class proportion variance will not produce misleadingly large differential abundance metric values). The DESeq2 RNA-Seq statistical framework has been shown to improve power and specificity when identifying differentially abundant OTUs across sample classes in microbiome experiments \cite{24699258}. To investigate differences between the biofilm and overlying planktonic communities, we identified the OTUs that were most dramatically enriched in biofilm versus planktonic communities and vice versa. When proportion means were calculated by environment type (planktonic versus biofilm), the most differentially abundant OTUs were enriched in planktonic samples relative to biofilm (Figure 6). This is consistent with the higher alpha diversity in biofilm communities compared to planktonic communities. That is, sequence counts were spread across a greater diversity of taxa in the biofilm libraries than in the planktonic libraries. Of the top five differentially abundant environment type OTUs, one is annotated as \textit{Bacteroidetes}, two as \textit{Gammaproteobacteria}, one as \textit{Betaproteobacteria} and one as \textit{Alphaproteobacteria}; all five are enriched in the planktonic libraries relative to biofilm. Table 1 lists the top 25 OTUs ordered by the magnitude of our differential abundance metric. Only five bacterial OTU centroid sequences for the top 25 environment type enriched OTUs share high sequence identity ($\geq$ 97\%) with cultured isolates (Table 1). The taxonomic composition of environment type differentially abundant OTUs is qualitatively consistent with the positions of OTUs in the sample ordination space (see Figures 5 and 6).
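
For reference, the unshrunken form of the differential abundance metric described above can be sketched as follows. The notation is introduced here for illustration only and is an assumption, not taken from the analysis code: $p_{ij}$ is the proportion of OTU $i$ in sample $j$, $A$ and $B$ are the sets of samples in the two classes being compared, and $|A|$ and $|B|$ are the numbers of samples in each. Which class is placed in the numerator is likewise an assumed convention.
\begin{equation}
\bar{p}_{i,A} = \frac{1}{|A|} \sum_{j \in A} p_{ij}, \qquad
\bar{p}_{i,B} = \frac{1}{|B|} \sum_{j \in B} p_{ij}, \qquad
d_{i} = \log_{2} \frac{\bar{p}_{i,A}}{\bar{p}_{i,B}}
\end{equation}
Under this convention, $d_{i} > 0$ indicates enrichment of OTU $i$ in class $A$ and $d_{i} < 0$ indicates enrichment in class $B$. The values reported by DESeq2 are not computed directly from this ratio; they are model-based estimates that are shrunken toward zero for OTUs with low information content, so the expression above conveys only the interpretation of the metric.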