Jenna M. Lang edited Methods.md
over 8 years ago
Commit id: 19518ad38abebac227148c4f404004a38715311e
diff --git a/Methods.md b/Methods.md
index fe1ea76..290e547 100644
--- a/Methods.md
+++ b/Methods.md
...
\cite{Caporaso_2012}.
##Bioinformatic Analysis
Unless otherwise noted, all microbial community analyses were conducted using the QIIME workflow version 1.8 or R \cite{R}. All Python scripts referred to are components of QIIME \cite{Caporaso_2010}.
###Demultiplexing and QC
An in-house script was used to assign sequences to samples using dual-index barcoding. The script is available on GitHub (https://github.com/gjospin/Demul_trim_prep).
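To illustrate the idea of dual-index assignment, the following is a minimal sketch and not the in-house script itself. It assumes each read has already been paired with its two index sequences and that only exact matches are accepted; the function names, the tuple layout, and the sample and barcode values are hypothetical.

```python
# Hypothetical sketch of dual-index demultiplexing (not the Demul_trim_prep code).
# Assumption: a read is assigned only when BOTH index sequences match exactly.

def build_barcode_map(rows):
    """rows: iterable of (sample_id, forward_index, reverse_index)."""
    return {(fwd.upper(), rev.upper()): sample for sample, fwd, rev in rows}


def demultiplex(reads, barcode_map):
    """reads: iterable of (read_id, forward_index, reverse_index, sequence).

    Returns (assigned, unassigned), where assigned maps each sample ID to a
    list of (read_id, sequence) and unassigned lists the IDs of reads whose
    index pair is not in the barcode map.
    """
    assigned = {}
    unassigned = []
    for read_id, fwd, rev, seq in reads:
        sample = barcode_map.get((fwd.upper(), rev.upper()))
        if sample is None:
            unassigned.append(read_id)
        else:
            assigned.setdefault(sample, []).append((read_id, seq))
    return assigned, unassigned
```

Keying the lookup on the index pair rather than on either index alone is what lets the same forward index be reused across samples.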
###OTU assignment and QC
Chimeric sequences were identified using usearch61 as implemented in the identify\_chimeric\_seqs.py script, resulting in the removal of 8760 sequences. The pick\_open\_reference\_otus.py script was used to cluster sequences at 97% similarity to generate OTUs (operational taxonomic units, a proxy for species). Taxonomy was assigned to each OTU by comparing a representative sequence from each cluster to the gg\_13\_8\_otus reference taxonomy provided by the Greengenes Database Consortium (http://greengenes.secondgenome.com). OTUs that were classified as chloroplasts or mitochondria were removed from further analysis. The number of high-quality sequences remaining per sample ranged from 26831 to 77843 (see Table 1). All subsequent beta diversity analyses (comparisons across samples) were performed with all samples rarefied to 26830 sequences.
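The organelle filter and the rarefaction step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the QIIME implementation: a sample is represented as a plain dictionary of OTU counts, organelle OTUs are detected by a case-insensitive substring match on the lineage string, and the helper names and the seed are hypothetical.

```python
import random

# Assumption: organelle lineages can be recognised by these substrings.
ORGANELLE_MARKERS = ("chloroplast", "mitochondria")


def is_organelle(lineage):
    """True if a taxonomy string mentions chloroplast or mitochondria."""
    lowered = lineage.lower()
    return any(marker in lowered for marker in ORGANELLE_MARKERS)


def drop_organelle_otus(counts, taxonomy):
    """Remove OTUs whose lineage (taxonomy: OTU ID -> string) is organellar."""
    return {otu: n for otu, n in counts.items()
            if not is_organelle(taxonomy.get(otu, ""))}


def rarefy(counts, depth, seed=0):
    """Subsample one sample without replacement to exactly `depth` sequences.

    counts: dict mapping OTU ID -> sequence count.
    Returns None when the sample holds fewer than `depth` sequences.
    """
    if sum(counts.values()) < depth:
        return None
    rng = random.Random(seed)
    # One pool entry per sequence, labelled with its OTU.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied
```

With the depth set to 26830, every sample in the range reported above (26831 to 77843 sequences) would be retained, since each holds at least that many sequences.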
###Comparison of ISS surfaces to analogous surfaces in homes on Earth and to the Human Microbiome Project
The sequences and associated metadata from a 40-home pilot study for the Wildlife of Our Homes Project are available for download from Figshare \cite{885e3742-e0c3-4719-a6a8-dba9930a33ca}.
We also obtained 100 randomly selected samples from each of 13 body sites from the HMP Data Portal (http://hmpdacc.org/HM16STR/) \cite{Huttenhower_2012}\cite{Gevers_2012}. These two additional datasets were used in a combined analysis with the ISS sequences presented here. Because the sequences from the three projects are not all the same length, each dataset was analyzed independently using a closed-reference OTU-picking approach with a 97% similarity cutoff, and the resulting BIOM tables were merged with the merge\_otu\_tables.py script.
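Closed-reference picking makes the merge possible because every dataset labels its OTUs with identifiers from the same reference collection, so tables can be aligned on OTU ID even when read lengths differ. The sketch below illustrates that alignment with plain dictionaries; it is not the merge\_otu\_tables.py implementation, the function name is hypothetical, and rejecting duplicate sample IDs is a design choice made here for clarity.

```python
# Hypothetical sketch: merge per-dataset OTU count tables keyed on shared
# reference OTU IDs. Each table maps sample ID -> {OTU ID: count}.

def merge_count_tables(*tables):
    merged = {}
    for table in tables:
        for sample, counts in table.items():
            if sample in merged:
                # Assumption: sample IDs are unique across datasets.
                raise ValueError(f"duplicate sample ID: {sample}")
            merged[sample] = dict(counts)
    # Union of all OTU IDs; OTUs absent from a sample are filled with 0.
    otu_ids = sorted({otu for counts in merged.values() for otu in counts})
    return {sample: {otu: counts.get(otu, 0) for otu in otu_ids}
            for sample, counts in merged.items()}
```

An OTU observed in only one dataset simply appears with a count of zero in the samples from the others.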
Shannon diversity and non-metric multidimensional scaling (NMDS) ordinations based on Bray-Curtis and unweighted UniFrac distances were computed and plotted using the phyloseq \cite{McMurdie_2013} and ggplot2 \cite{Wilkinson_2011} packages in R \cite{R}.
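For readers unfamiliar with these measures, the two count-based quantities can be written out directly. The Python sketch below is illustrative only and is not the phyloseq code used in the analysis; it assumes the natural logarithm for the Shannon index, $H = -\sum_i p_i \ln p_i$, and the abundance-based Bray-Curtis dissimilarity, $\sum_i |u_i - v_i| / \sum_i (u_i + v_i)$, computed on two count vectors listed in the same OTU order. UniFrac is omitted because it additionally requires a phylogenetic tree.

```python
import math


def shannon(counts):
    """Shannon diversity (natural log) of one sample's OTU counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(u) + sum(v)
    return numerator / denominator
```

Two samples with identical counts have a dissimilarity of 0, and two samples that share no OTUs have a dissimilarity of 1.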