Jenna M. Lang edited Results PICRUSt.md  over 9 years ago

Commit id: f05bab559232c9b03191e33a388f4f8311b6124e


In a case like this, for which metagenomic sequencing is infeasible, another approach suggests itself. There is evidence that the evolutionary relatedness of two organisms correlates with the similarity of their genomic content \cite{Martiny_2012}. This allows us to use the information obtained by sequencing the genome of one organism to predict the functional potential of another, even if the other organism is represented only by a 16S rRNA sequence. The power of this approach increases when many closely related genome sequences are available. This predictive approach has recently been implemented in the software package PICRUSt, which uses the phylogenetic placement of a 16S rRNA sequence within a phylogeny of sequenced genomes to infer the gene content of the organism represented by that sequence. With PICRUSt one can calculate a metric, the Nearest Sequenced Taxon Index (NSTI), that measures how closely related the average 16S rRNA sequence in an environmental sample is to an available sequenced genome. When this number is low, PICRUSt is likely to perform well in predicting the genomes of the organisms in an environmental sample (_i.e._, a metagenome). The average NSTI for our 15 meals was 0.038, which is on par with the NSTI for the Human Microbiome Project samples (mean NSTI = 0.03 ± 0.02 s.d.), for which a massive effort has been made to obtain reference genome sequences \cite{22018227}. This low NSTI suggests that PICRUSt may perform well when predicting the metabolic potential of the microbial communities found in the meals prepared for this study. Here, we show the most significant KEGG functional category, "Other N-glycan degradation" (KO 00511, _p_ = 8.21e-3), which was highest in the VEGAN dietary pattern (Figure 7). 
Again, this result is not significant after p-value correction, but it is nevertheless highlighted as a potential source of information when using a pilot study like this to inform future research questions. As a simple sanity check for the PICRUSt predictions, we compared the relative abundance of genes in the KEGG functional category "Sporulation" between meals that were cooked and those that were raw (Figure 8). As expected, because organisms that can form spores are more likely to survive the cooking process, sporulation-associated genes are more abundant in cooked than in raw foods. All KEGG (Level 3) pathways that vary significantly between dietary patterns are presented in Table 8. These findings suggest that there are functional differences in the bacterial populations associated with different foods and meals, and that these may be related not only to bacterial substrate preferences but also to the techniques used in meal preparation.
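
To make the NSTI described above concrete, the following is a minimal sketch of an abundance-weighted mean distance to the nearest sequenced genome. It is an illustration only and not PICRUSt's own code: the function name `weighted_nsti`, the OTU labels, the read counts, and the distances are all hypothetical, and it assumes the per-OTU distance to the nearest reference genome (in substitutions per site) has already been obtained from the reference phylogeny.

```python
def weighted_nsti(abundances, nearest_distances):
    """Abundance-weighted mean distance to the nearest sequenced genome.

    abundances:        dict mapping OTU id -> read count in one sample
    nearest_distances: dict mapping OTU id -> phylogenetic distance
                       (substitutions per site) to the closest reference
                       genome; assumed to be precomputed elsewhere
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("sample contains no reads")
    # Each OTU contributes its distance in proportion to its abundance.
    weighted_sum = sum(
        count * nearest_distances[otu] for otu, count in abundances.items()
    )
    return weighted_sum / total


# Hypothetical toy sample: three OTUs with made-up counts and distances.
sample = {"otu_a": 60, "otu_b": 30, "otu_c": 10}
distances = {"otu_a": 0.01, "otu_b": 0.05, "otu_c": 0.20}

nsti = weighted_nsti(sample, distances)
print(f"sample NSTI = {nsti:.3f}")
```

In this toy sample the abundant OTUs lie close to a reference genome, so the score stays low even though one rare OTU is distant. This is the sense in which a low NSTI indicates that most of the community is well represented by sequenced genomes.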