Bioinformatics
Raw sequence data were processed using “Quantitative Insights Into Microbial Ecology 2” (QIIME2 version 2018.4; https://qiime2.org/) (Bolyen et al., 2018).
Sequence reads were first demultiplexed using the q2-demux plugin (https://github.com/qiime2/q2-demux). Only forward reads were used for the 18S region, because the overlap between forward and reverse reads is too short to merge the two without significant sequence loss. These 18S forward reads were trimmed to 210 bp, which covers the informative region of our 18S target (Lee et al., 2008). For ITS2 and 16S, forward and reverse reads were trimmed at the position where the median quality score fell below 30, and any sequence whose quality score fell below 3 at any position within the trimmed region was removed from further analysis.
All target regions were quality filtered and de-replicated using the q2-dada2 plugin (Callahan et al., 2016), which was also used to pair the forward and reverse 16S and ITS2 reads. The q2-dada2 plugin uses nucleotide quality scores to produce sequence variants (SVs), that is, sequence clusters with 100% similarity that represent the estimated true biological variation within each sample. Although sequences are clustered at 100% similarity rather than the traditional 97%, DADA2 produces fewer spurious sequences and fewer clusters, and gives a more accurate representation of the true biological variation present (Callahan et al., 2016). After DADA2 processing, microbial groups contained an average of 9,000-17,000 sequences per sample (Supporting information, Table S1).
All extraction and PCR controls were clean except those for bacteria. Bacterial contaminants were subsequently removed from all samples to reduce potential background contamination. The following databases were used to assign taxonomy and remove non-target DNA: MaarjAM (Öpik et al., 2010) for AMF; UNITE (http://unite.ut.ee; Abarenkov et al., 2010; Kõljalg et al., 2013) for general fungi; and Greengenes (http://greengenes.lbl.gov) for bacteria.
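The quality-based trimming rule described above for ITS2 and 16S reads can be illustrated with a minimal Python sketch. This is not the q2-dada2 implementation; the function names, the toy Phred scores, and the choice to compute one shared cut-off across all reads are assumptions made for illustration only.

```python
from statistics import median

def truncation_length(quality_rows, min_median=30):
    """Count the leading positions whose median quality is >= min_median."""
    n_positions = min(len(row) for row in quality_rows)
    for pos in range(n_positions):
        if median(row[pos] for row in quality_rows) < min_median:
            return pos
    return n_positions

def trim_and_filter(quality_rows, min_median=30, min_base_quality=3):
    """Truncate all reads at the shared cut-off, then drop any read that
    still contains a base below min_base_quality."""
    cut = truncation_length(quality_rows, min_median)
    trimmed = [row[:cut] for row in quality_rows]
    kept = [row for row in trimmed if all(q >= min_base_quality for q in row)]
    return cut, kept

# Hypothetical Phred scores for three reads of five bases each.
reads = [
    [38, 37, 36, 35, 20],
    [38, 2, 36, 31, 12],
    [37, 36, 35, 30, 29],
]
cut, kept = trim_and_filter(reads)
```

In this toy input the median first drops below 30 at the fifth base, so reads are cut to four bases, and the second read is discarded because one of its retained bases has a quality score below 3.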
All SVs that did not match a sequence in one of the above databases with at least 70% coverage and at least 70% identity (for bacteria) or 90% identity (for fungi) were removed. To help remove non-target DNA, we added many sequences representing non-target organisms (including many Asclepias spp.) to both our AMF and ITS databases, which allowed us to better identify contaminants and reduced misclassification when assigning taxonomy. Taxonomy for each microbial group was then assigned using the QIIME2 q2-feature-classifier plugin (https://github.com/qiime2/q2-feature-classifier), a naive Bayes machine-learning classifier that has been shown to meet or exceed the classification accuracy of existing methods (Bokulich et al., 2017), with a confidence threshold of 0.94 for fungi and 0.7 for bacteria. For 16S data, sequences identified as chloroplast or mitochondrial DNA were also removed, which resulted in the removal of >90% of bacterial sequences. For non-AM root fungi, all Glomeromycota were removed so that AMF and non-AM root fungi could be analyzed separately.
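The database-match and organelle filters described above can be summarized in a short Python sketch. The thresholds come from the text, but the `Hit` record, its field names, the example values, and the substring test for organelle labels are hypothetical and do not reflect the actual QIIME2 workflow or file formats.

```python
from dataclasses import dataclass

# Thresholds from the text (percent identity and percent coverage).
MIN_IDENTITY = {"bacteria": 70.0, "fungi": 90.0}
MIN_COVERAGE = 70.0
# Assumed labels for organelle-derived 16S sequences.
ORGANELLE_TERMS = ("chloroplast", "mitochondria")

@dataclass
class Hit:
    """Best database match for one sequence variant (hypothetical layout)."""
    sv_id: str
    group: str        # "bacteria" or "fungi"
    identity: float   # percent identity to the best match
    coverage: float   # percent of the SV covered by the match
    taxonomy: str     # assigned taxonomy string

def keep_hit(hit):
    """Return True if the SV passes the identity, coverage, and organelle filters."""
    passes = hit.identity >= MIN_IDENTITY[hit.group] and hit.coverage >= MIN_COVERAGE
    if hit.group == "bacteria":
        tax = hit.taxonomy.lower()
        passes = passes and not any(term in tax for term in ORGANELLE_TERMS)
    return passes

# Hypothetical example records.
hits = [
    Hit("sv1", "bacteria", 85.0, 95.0, "k__Bacteria; p__Proteobacteria"),
    Hit("sv2", "bacteria", 99.0, 100.0, "k__Bacteria; c__Chloroplast"),
    Hit("sv3", "fungi", 88.0, 99.0, "k__Fungi; p__Ascomycota"),
    Hit("sv4", "fungi", 97.0, 65.0, "k__Fungi; p__Basidiomycota"),
    Hit("sv5", "fungi", 93.0, 80.0, "k__Fungi; p__Glomeromycota"),
]
kept_ids = [h.sv_id for h in hits if keep_hit(h)]
```

Here `sv2` is dropped as chloroplast, `sv3` fails the 90% identity requirement for fungi, and `sv4` fails the 70% coverage requirement, leaving `sv1` and `sv5`.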