Introduction
The human microbiome has been thoroughly studied as an integral
component of human health and performance (Turnbaugh et al.,
2009; Jiang et al., 2015; Gilbert et al., 2018).
Similarly, the plant microbiome is critical to understanding the health
and ecology of living plants (Berg et al. 2017; Hirakue and
Sugiyama 2018), and their rate of decay and nutrient turnover after
senescence (Cline et al., 2018). The plant microbiome is
incredibly complex and diverse; just 0.1 g of foliar plant tissue can
host over 100 putative fungal species (Bullington et al., 2015;
Siddique and Unterseher 2016; Unterseher et al., 2016), whereas
the richness of arbuscular mycorrhizal fungi (AMF) in roots is
considerably lower (Lekberg and Waller 2016). Knowledge of the spatial
heterogeneity and overall richness of plant-associated microorganisms
should inform how we sample, as this may influence our understanding of
the processes that shape microbial communities and their relationships
with other organisms. Currently, there is no consensus on an optimal
sampling strategy needed to characterize plant-associated microbial
communities. In fact, there is little technical guidance on how common
sampling strategies influence our interpretation and analyses of the
microbial communities we observe.
Through the use of next-generation sequencing technology, we now know
that biases associated with culture-based microbiome surveys leave a
large proportion of microbial diversity completely undetected (Pei
et al., 2016; Dissanayake et al., 2018). Early departures
from culture-based methods, such as cloning and direct extraction of
environmental DNA, greatly improved our understanding of plant-associated
microbial communities, but they revealed new biases suggesting that our
understanding of microbial communities was still incomplete (Arnold et
al. 2007). Even 454 pyrosequencing often did not provide enough
high-quality sequences to adequately characterize microbial communities,
resulting in sequencing effort curves that failed to reach an asymptote
(Jumpponen and Jones 2009; Unterseher 2011). Deeper, higher quality
sequencing technology now allows researchers to see a more comprehensive
picture of plant-associated microbial communities than previous methods.
Thus, while sequencing effort for individual samples is no longer
a bottleneck in characterizing plant-associated microbial communities,
sampling effort could very well be.
Many advances have been made to reduce biases when processing microbial
community data (McMurdie and Holmes 2014; Allali et al., 2017;
Bolyen et al., 2018), but descriptions of sampling effort or
strategy are often omitted or vague, with little or no justification of
methods used. In a recent review, Dickie et al. (2018) found that
95% of metabarcoding studies examined reported inappropriate or
incomplete field or sampling methods that rendered them
non-reproducible. This growing problem was again emphasized in a recent
editorial in Molecular Ecology (Zinger et al.,
2019). In addition to unclear methods, many studies fail to
report the robustness of their sampling efforts (e.g. through species
accumulation curves that display the number of species recovered for
each additional sample), despite the inherent consequences of
undersampling. Indeed, undersampling of microbial communities appears to
have become the rule, rather than the exception. One reason for this is
that time and budget constraints limit the amount of lab work that can
be performed, and often it is inappropriate or impossible to
destructively sample whole plants. Most of the time, researchers sample
only small quantities (mg) of plant tissue representing a tiny
fraction of the plant’s total biomass (often < 1%). Microbial
communities observed in these samples are then used to make inferences
on the larger population of colonizing microbes. In uncontrolled field
studies where there are countless environmental drivers, this incomplete
sampling may obscure subtle underlying patterns in microbial
distributions (Schloss et al., 2018), which in turn could
compromise our estimates of their diversity and the degree to which
sites, treatments and individual hosts truly differ.
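To make the idea of a species accumulation curve concrete, the following is a minimal sketch and not the analysis used in this study. It assumes each sample is represented as a set of taxon labels; the function name `accumulation_curve`, its parameters, and the toy data are hypothetical. The curve is the mean number of distinct taxa recovered after 1, 2, ..., n samples, averaged over random sample orders.

```python
import random


def accumulation_curve(samples, permutations=100, seed=0):
    """Mean cumulative taxon count after each added sample.

    samples: list of sets, one set of taxon labels per sample (assumed format).
    permutations: number of random sample orderings to average over.
    """
    rng = random.Random(seed)
    totals = [0] * len(samples)
    for _ in range(permutations):
        order = list(samples)
        rng.shuffle(order)  # random order in which samples are added
        seen = set()
        for i, taxa in enumerate(order):
            seen.update(taxa)        # taxa recovered so far
            totals[i] += len(seen)   # cumulative richness at this step
    return [total / permutations for total in totals]


# Toy data (hypothetical): four samples sharing some taxa.
samples = [{"A", "B"}, {"B", "C"}, {"A", "D"}, {"E"}]
curve = accumulation_curve(samples)
```

A curve that is still rising steeply at the final sample suggests that additional samples would recover more taxa, whereas a curve that flattens suggests that sampling effort was closer to sufficient.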
Not only do we sample small proportions of total plant biomass, but
researchers often differ in how these samples are collected and
processed. To provide guidance for sampling plant-associated microbial
groups, we compared two of the more common sampling strategies seen in
the literature to see if they differ in richness and compositional
estimates and if this depends on the microorganism targeted. The first
sampling strategy involves collecting a specific surface area or volume
of plant tissue (e.g. leaf discs or lengths of root segments) followed
by tissue homogenization (e.g., Jumpponen and Jones 2009; Daleo et
al., 2018; Toju et al. 2019). In this strategy, the spatial
extent of available plant tissue is somewhat maintained, but many taxa
may be missed if their distributions are patchy, which in turn may lead
to increased variance in richness and composition among samples. From
here on, we will refer to this strategy as “homogenizing tissue
after subsampling” (Figure 1A). The second common sampling strategy
involves either collecting a pre-specified amount of tissue from each
plant (e.g., six leaves or root segments per plant) or collecting plant
tissue somewhat haphazardly without much standardization among plant
samples. Then samples are ground prior to collection of a standardized,
homogenized subsample (e.g., Zimmerman and Vitousek 2011; Unterseher
et al. 2016). We will refer to this method as
“homogenizing tissue before subsampling” (Figure 1B). One
criticism of this strategy is that plants undoubtedly differ in size,
and by collecting a set number of leaves or roots, or just any plant
material available, researchers are sampling from initial microbial
species pools that vary in size. Differences in the homogenized pool
sizes can then bias richness measurements toward larger plants. In this
approach, it is thought that the homogenized pool may yield a more
representative subsample of the microbial community present within the
plant. It is also assumed that any subsample from the powdered
tissue, regardless of sampling approach used, will yield the same or a
similar microbial community. To our knowledge, however, these
assumptions have not been tested.
Because the optimal strategy may depend on the microorganism targeted,
we tested these two strategies by extracting and amplifying bacterial
DNA (16S), general fungal DNA (ITS2) and arbuscular mycorrhizal fungal
DNA (18S), from roots (all) and/or leaves (ITS2 only) of naturally
occurring, mature, showy milkweed (Asclepias speciosa). We chose
showy milkweed as it is a model species for plant defensive chemistry,
and previous studies have shown high microbial colonization (Hahn
et al., 2018). Due to the highly diverse nature of foliar fungal
endophytes (FFE) (Jumpponen and Jones 2009), we extracted DNA from many
additional subsamples per plant to assess the extent of undersampling in
this group. Finally, we sampled leaves twice across the season to assess
if broad-scale seasonal differences in FFE were still detectable despite
potential undersampling, in order to better understand when—and for
which type of questions—insufficient sampling is a problem. We
predicted that 1) homogenizing before subsampling (strategy 2)
would yield richer, more even microbial communities among plants,
because more of the plant tissue would be initially homogenized, 2)
different sampling strategies would result in different microbial
community structures due to the spatial heterogeneity of microbes within
plants, and 3) differences between sampling strategies would depend on
the organism targeted and tissue sampled (roots or leaves), potentially
due to inherent differences in global and likely local richness, as well
as microbial modes of dispersal (above vs. belowground).