Authorea

Jonathan A. Eisen edited Methods.md over 8 years ago

Commit id: d1e014706cdb22bf7adbcfe7b71a57ad4ac76644

Upon successful completion of the swabbing on May 9, 2014 (http://blogs.nasa.gov/stationreport/2014/05/09/iss-daily-summary-report-050914/), all swabs were stored at -80 °C in the Minus Eighty-degree Laboratory Freezer for ISS (MELFI) onboard the ISS until transfer to the SpaceX Dragon spacecraft. In the Dragon, the swabs were stored at -80 °C in the General Laboratory Active Cryogenic ISS Experiment Refrigerator (GLACIER), which runs off Dragon's batteries until it is plugged in (either to the ISS or on the ground). The Dragon re-entered the Earth's atmosphere and splashed down in the Pacific Ocean at 12:05 pm PT on May 18, 2014. Samples were transferred to a cooler with dry ice and shipped to the Earth Microbiome Project (EMP) lab (http://earthmicrobiome.org).

##DNA Extraction and Library Preparation

All samples were prepared using a modified version of the MO BIO UltraClean®-htp 96 Well Swab DNA Kit (MO BIO). Samples were purified using the Zymo ZR-96 DNA Clean & Concentrator™-5 kit according to the manufacturer's protocol (Zymo). DNA was then amplified using the EMP barcoded primer set, adapted for the Illumina HiSeq2000 and MiSeq by adding nine extra bases in the adapter region of the forward amplification primer to support paired-end sequencing. The V4 region of the 16S rRNA gene (515F-806R) was amplified with region-specific primers that included the Illumina flowcell adapter sequences and a twelve-base barcode sequence. Each 25 µl PCR reaction contained 12 µl of certified DNA-free PCR water (MO BIO), 10 µl of 1x 5 Prime HotMasterMix (5 Prime), 1 µl of forward primer (5 µM stock, 200 nM final), 1 µl of Golay barcode-tagged reverse primer (5 µM stock, 200 nM final), and 1 µl of template DNA. The PCR conditions were as follows: 94 °C for 3 min to denature the DNA, followed by 35 cycles of 94 °C for 45 s, 50 °C for 60 s, and 72 °C for 90 s, and a final extension of 10 min at 72 °C to ensure complete amplification.
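As a quick check on the reaction recipe, the dilution arithmetic (C1·V1 = C2·V2) and the programmed cycling time can be sketched in a few lines of Python. This is an illustrative calculation only, not part of the laboratory protocol; the function names are hypothetical, and the cycling time counts programmed hold times only, ignoring thermocycler ramping. With the listed volumes, 1 µl of a 5 µM primer stock in a 25 µl reaction works out to 0.2 µM (200 nM).

```python
def final_concentration_um(stock_um, added_ul, total_ul):
    """Solve C1*V1 = C2*V2 for the final concentration C2, in µM."""
    return stock_um * added_ul / total_ul


def program_seconds(initial_s, cycles, step_s, final_s):
    """Total programmed hold time; ramping between temperatures is ignored."""
    return initial_s + cycles * sum(step_s) + final_s


# Component volumes (µl) as listed in the reaction recipe.
reaction_ul = {
    "PCR-grade water": 12.0,
    "HotMasterMix": 10.0,
    "forward primer": 1.0,
    "barcoded reverse primer": 1.0,
    "template DNA": 1.0,
}
total_ul = sum(reaction_ul.values())

# 1 µl of 5 µM primer stock diluted into the full reaction volume.
primer_final_um = final_concentration_um(5.0, 1.0, total_ul)

# 3 min denaturation, 35 cycles of 45 s + 60 s + 90 s, 10 min final extension.
run_s = program_seconds(3 * 60, 35, (45, 60, 90), 10 * 60)

print(f"reaction volume: {total_ul:.0f} µl")
print(f"final primer concentration: {primer_final_um * 1000:.0f} nM")
print(f"programmed cycling time: {run_s / 60:.2f} min")
```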
Amplicons were quantified using PicoGreen (Invitrogen) and a plate reader. Once quantified, different volumes of each product were pooled into a single tube so that each amplicon was represented equally. This pool was then cleaned up using the UltraClean® PCR Clean-Up Kit (MO BIO) and quantified using Qubit (Invitrogen). The prepared library was sequenced on the Illumina MiSeq platform, using the sequencing primers and procedures described in the supplementary methods of \cite{Caporaso_2012}.

##Bioinformatic Analysis

Chimeric sequences were identified using usearch61 as implemented in the identify\_chimeric\_seqs.py script, resulting in the removal of 8760 sequences. The pick\_open\_reference\_otus.py script was used to cluster sequences at 97% similarity to generate OTUs (Operational Taxonomic Units, a proxy for species). Taxonomy was assigned to each OTU by comparing a representative sequence from each cluster to the gg\_13\_8\_otus reference taxonomy provided by the Greengenes Database Consortium (http://greengenes.secondgenome.com). OTUs classified as chloroplasts or mitochondria were removed from further analysis. The number of high-quality sequences remaining per sample ranged from 26831 to 77843 (see Table 1). All subsequent beta diversity analyses (comparisons across samples) were performed with all samples rarefied to 26830 sequences.

###Comparison of ISS surfaces to analogous surfaces in homes on Earth and to the Human Microbiome Project

The sequences and associated metadata from a 40-home pilot study for the Wildlife of Our Homes Project are available for download from Figshare \cite{885e3742-e0c3-4719-a6a8-dba9930a33ca}. We also obtained 100 samples from each of 13 body sites from the HMP Data Portal (http://hmpdacc.org/HM16STR/)\cite{Huttenhower_2012}\cite{Gevers_2012}. These two additional datasets were used in a combined analysis with the ISS sequences presented here. Because the sequences from the three projects are not all the same length, each dataset was analyzed independently using a closed-reference OTU-picking approach with a 97% similarity cutoff, and the resulting biom tables were merged with the merge\_otu\_tables.py script. Shannon diversity, as well as non-metric multidimensional scaling (NMDS) based on Bray-Curtis and unweighted UniFrac \cite{Lozupone_2005} distances, were computed and plotted using the phyloseq \cite{McMurdie_2013} and ggplot2 \cite{Wilkinson_2011} packages in R \cite{R}.
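To make the quantities named above concrete, the following is a minimal plain-Python sketch of rarefaction to a fixed depth, the Shannon index, and the Bray-Curtis dissimilarity. It is not the code used in this study, which relied on the scripts and R packages cited above. The function names, the toy OTU counts, and the use of the natural logarithm for the Shannon index are assumptions made for illustration.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from an OTU -> count dict."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the rarefaction depth")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for otu in picked:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum a_i + sum b_i)."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    return numerator / (sum(a.values()) + sum(b.values()))


# Toy counts (hypothetical, not data from this study).
sample_a = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
sample_b = {"OTU_1": 10, "OTU_3": 40, "OTU_4": 70}

# Rarefy both samples to the depth of the shallowest sample.
depth = min(sum(sample_a.values()), sum(sample_b.values()))
rare_a = rarefy(sample_a, depth)
rare_b = rarefy(sample_b, depth)

print("Shannon (a):", round(shannon(rare_a), 3))
print("Shannon (b):", round(shannon(rare_b), 3))
print("Bray-Curtis (a, b):", round(bray_curtis(rare_a, rare_b), 3))
```

Rarefying every sample to a common depth before computing between-sample distances keeps differences in sequencing effort from being read as differences in community composition.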
##Comparison to rooms with mechanical ventilation or open windows

We obtained a list of human pathogens, compiled by Kembel et al. (2012), from the author. We then used BLAST \cite{2231712} to search a representative sequence from each of the ISS OTUs against the NCBI Reference Sequence (RefSeq) database \cite{Pruitt_2004}. OTUs with at least 97% similarity to an organism on the list of known pathogens were flagged as "related to a known human pathogen". Phylogenetic diversity (Faith's PD) was calculated using the alpha\_diversity.py script, with samples rarefied to 700 sequences.
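The flagging rule can be sketched as a simple filter over BLAST hits. This is a hypothetical illustration rather than the script used in the study. It assumes each hit has already been reduced to an (OTU ID, subject organism name, percent identity) tuple, that organism names match the pathogen list exactly, and that the 97% threshold is inclusive. The organism names and identities below are toy values, not results.

```python
def flag_pathogen_related(hits, pathogens, min_identity=97.0):
    """Return {otu_id: (organism, percent_identity)} for OTUs whose best
    qualifying hit is to an organism on the pathogen list."""
    flagged = {}
    for otu_id, organism, pident in hits:
        if organism in pathogens and pident >= min_identity:
            best = flagged.get(otu_id)
            # Keep the highest-identity pathogen hit for each OTU.
            if best is None or pident > best[1]:
                flagged[otu_id] = (organism, pident)
    return flagged


# Toy inputs (hypothetical; not the Kembel et al. list or study results).
toy_pathogens = {"Staphylococcus aureus", "Klebsiella pneumoniae"}
toy_hits = [
    ("OTU_1", "Staphylococcus aureus", 99.2),
    ("OTU_1", "Staphylococcus epidermidis", 98.8),
    ("OTU_2", "Klebsiella pneumoniae", 95.1),   # below the threshold
    ("OTU_3", "Bacillus subtilis", 100.0),      # not on the list
]

flagged = flag_pathogen_related(toy_hits, toy_pathogens)
for otu_id, (organism, pident) in sorted(flagged.items()):
    print(f"{otu_id}: related to a known human pathogen "
          f"({organism}, {pident:.1f}% identity)")
```

A flag of this kind indicates sequence similarity over a short 16S region only; it does not establish that the organism is present or pathogenic.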