Bioinformatics
Processing of raw sequence data was performed using “Quantitative
Insights Into Microbial Ecology 2” (QIIME2 version 2018.4;
https://qiime2.org/) (Bolyen et al., 2018).
Sequence reads were first demultiplexed using the q2-demux plugin
(https://github.com/qiime2/q2-demux). Only forward reads were used for
the 18S region, as the overlap between forward and reverse reads is too
short to merge the two without significant sequence loss. For 18S only,
forward reads were trimmed to 210 bp, which covers the informative
region of our 18S target (Lee et al., 2008). For ITS2 and 16S,
forward and reverse reads were trimmed where the median quality score fell
below 30, and any sequence whose quality score fell below 3 at any point
within the trimmed region was removed from further analysis. All
target regions were quality filtered and de-replicated using the
q2-dada2 plugin (Callahan et al., 2016), which was also used for
pairing 16S and ITS2 sequences. The q2-dada2 plugin uses nucleotide
quality scores to produce sequence variants (SVs), or sequence clusters
with 100% similarity representing the estimated true biological
variation within each sample. Although sequences are clustered at 100%
similarity as opposed to the traditional 97% similarity, DADA2 produces
fewer spurious sequences and fewer clusters, and results in a more accurate
representation of the true biological variation present (Callahan et al.,
2016). After DADA2 processing, microbial groups contained
an average of 9000-17000 sequences per sample (Supporting information,
Table S1). All extraction and PCR controls were clean except the bacterial
controls. Bacterial contaminants were subsequently removed from all
samples to reduce potential background contamination. The following
databases were used to assign taxonomy and remove non-target DNA:
MaarjAM (Öpik et al., 2010) for AMF; UNITE for general
fungi (http://unite.ut.ee; Abarenkov et al., 2010; Kõljalg et al.,
2013); and Greengenes for bacteria (http://greengenes.lbl.gov).
All SVs that did not match a sequence in one of the above databases with
at least 70% identity (for bacteria) or 90% identity (for fungi) and at
least 70% coverage were removed. To better identify contaminants and
reduce misclassification during taxonomy assignment, we added many
sequences representing non-target organisms (including many Asclepias
spp.) to both our AMF and ITS databases. Taxonomy for each microbial
group was then assigned using the QIIME2 q2-feature-classifier plugin
(https://github.com/qiime2/q2-feature-classifier), a naive Bayes
machine-learning classifier that has been shown to meet or exceed the
classification accuracy of existing methods (Bokulich et al.,
2017), with a confidence threshold of 0.94 for fungi and 0.7 for
bacteria. For 16S data, sequences identified as chloroplast or
mitochondrial DNA were also removed, which resulted in the removal of
>90% of bacterial sequences. For non-AM root fungi, all
Glomeromycota were removed in order to analyze AMF and non-AM root fungi
separately.
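
The filtering rules described above can be illustrated with a minimal Python sketch. This is not the QIIME2 or DADA2 implementation and is not the code used in this study; it only restates three of the rules in plain Python. All function names, the input layout (lists of per-position quality scores, percent identity and coverage values, taxonomy strings), and the inclusive treatment of the thresholds are assumptions made for illustration.

```python
from statistics import median

# Thresholds taken from the text; the dictionary layout is an assumption.
MIN_IDENTITY = {"bacteria": 70.0, "fungi": 90.0}  # percent identity
MIN_COVERAGE = 70.0                               # percent coverage


def truncation_position(quality_by_read, min_median=30):
    """Return how many leading positions to keep.

    quality_by_read is a list of per-read quality score lists. The result
    is the first position at which the median quality across reads falls
    below min_median, or the longest read length if it never does.
    """
    max_len = max(len(scores) for scores in quality_by_read)
    for pos in range(max_len):
        column = [scores[pos] for scores in quality_by_read if len(scores) > pos]
        if median(column) < min_median:
            return pos
    return max_len


def passes_database_filter(identity, coverage, group):
    """Keep a sequence variant only if its best database match meets both
    the group-specific identity threshold and the coverage threshold."""
    return identity >= MIN_IDENTITY[group] and coverage >= MIN_COVERAGE


def is_organelle(taxonomy):
    """Flag 16S taxonomy strings assigned to chloroplast or mitochondria."""
    label = taxonomy.lower()
    return "chloroplast" in label or "mitochondria" in label


if __name__ == "__main__":
    # Hypothetical toy data, not values from this study.
    reads = [[35, 34, 31, 28], [36, 33, 29, 20], [34, 32, 30, 25]]
    print(truncation_position(reads))
    print(passes_database_filter(85.0, 90.0, "bacteria"))
    print(is_organelle("k__Bacteria; p__Cyanobacteria; c__Chloroplast"))
```

In practice these steps would more likely be carried out through the corresponding QIIME2 plugins rather than hand-written code; the sketch is only meant to make the thresholds and the order of the checks concrete.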