Introduction
The human microbiome has been thoroughly studied as an integral component of human health and performance (Turnbaugh et al., 2009; Jiang et al., 2015; Gilbert et al., 2018). Similarly, the plant microbiome is critical to understanding the health and ecology of living plants (Berg et al. 2017; Hirakue and Sugiyama 2018), as well as their rate of decay and nutrient turnover after senescence (Cline et al., 2018). The plant microbiome is highly complex and diverse; just 0.1 g of foliar plant tissue can host over 100 putative fungal species (Bullington et al., 2015; Siddique and Unterseher 2016; Unterseher et al., 2016), whereas the richness of arbuscular mycorrhizal fungi (AMF) in roots is considerably lower (Lekberg and Waller 2016). Knowledge of the spatial heterogeneity and overall richness of plant-associated microorganisms should inform how we sample, as sampling may influence our understanding of the processes that shape microbial communities and their relationships with other organisms. Currently, there is no consensus on the optimal sampling strategy for characterizing plant-associated microbial communities. In fact, there is little technical guidance on how common sampling strategies influence our interpretation and analyses of the microbial communities we observe.
Through the use of next-generation sequencing technology, we now know that biases associated with culture-based microbiome surveys leave a large proportion of microbial diversity completely undetected (Pei et al., 2016; Dissanayake et al., 2018). Early departures from culture-based methods, such as cloning and direct extraction of environmental DNA, greatly improved our understanding of plant-associated microbial communities, but they also revealed new biases, suggesting that this understanding was still incomplete (Arnold et al. 2007). Even 454 pyrosequencing often did not provide enough high-quality sequences to adequately characterize microbial communities, resulting in sequencing effort curves that failed to reach an asymptote (Jumpponen and Jones 2009; Unterseher 2011). Deeper, higher-quality sequencing technology now allows researchers to see a more comprehensive picture of plant-associated microbial communities than previous methods did. Thus, while sequencing effort for individual samples is no longer a bottleneck in characterizing plant-associated microbial communities, sampling effort could very well be.
Many advances have been made to reduce biases when processing microbial community data (McMurdie and Holmes 2014; Allali et al., 2017; Bolyen et al., 2018), but descriptions of sampling effort or strategy are often omitted or vague, with little or no justification of the methods used. In a recent review, Dickie et al. (2018) found that 95% of the metabarcoding studies examined reported inappropriate or incomplete field or sampling methods that rendered them non-reproducible. This growing problem was again emphasized in a recent editorial in Molecular Ecology (Zinger et al., 2019). In addition to unclear methods, many studies fail to report the robustness of their sampling efforts (e.g. through species accumulation curves that display the number of species recovered for each additional sample), despite the inherent consequences of undersampling. Indeed, undersampling of microbial communities appears to have become the rule rather than the exception. One reason for this is that time and budget constraints limit the amount of lab work that can be performed, and it is often inappropriate or impossible to destructively sample whole plants. Most of the time, researchers sample only small quantities (mg) of plant tissue, representing a tiny fraction of the plant’s total biomass (often < 1%). Microbial communities observed in these samples are then used to make inferences about the larger population of colonizing microbes. In uncontrolled field studies where there are countless environmental drivers, this incomplete sampling may obscure subtle underlying patterns in microbial distributions (Schloss et al., 2018), which in turn could compromise our estimates of their diversity and of the degree to which sites, treatments and individual hosts truly differ.
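To illustrate what a species accumulation curve reports, the following is a minimal Python sketch, not a method used in this study. It averages the cumulative number of taxa recovered over many random orderings of the samples. The function name `accumulation_curve`, the number of permutations, and the toy presence data (with OTU labels) are hypothetical and chosen only for illustration.

```python
import random

def accumulation_curve(samples, n_perm=200, seed=0):
    """Mean cumulative richness after 1..n samples.

    `samples` is a list of sets, each holding the taxa detected in one
    sample. Richness is averaged over `n_perm` random sample orderings.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0.0] * n
    for _ in range(n_perm):
        order = samples[:]          # copy so the input list is not reordered
        rng.shuffle(order)
        seen = set()
        for i, taxa in enumerate(order):
            seen |= taxa            # add any taxa not yet recovered
            totals[i] += len(seen)
    return [t / n_perm for t in totals]

# Hypothetical presence data for five subsamples (illustration only)
samples = [
    {"OTU1", "OTU2", "OTU3"},
    {"OTU2", "OTU4"},
    {"OTU1", "OTU5", "OTU6"},
    {"OTU3", "OTU7"},
    {"OTU2", "OTU8", "OTU9"},
]

curve = accumulation_curve(samples)
for k, richness in enumerate(curve, start=1):
    print(k, round(richness, 2))
```

A curve that is still rising steeply at the final sample, as in this toy example where every subsample adds new taxa, suggests that additional sampling would recover more taxa; a curve that flattens suggests that sampling effort was closer to sufficient.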
Not only do we sample small proportions of total plant biomass, but researchers often differ in how these samples are collected and processed. To provide guidance for sampling plant-associated microbial groups, we compared two of the more common sampling strategies seen in the literature to determine whether they differ in richness and compositional estimates, and whether this depends on the microorganism targeted. The first sampling strategy involves collecting a specific surface area or volume of plant tissue (e.g. leaf discs or lengths of root segments) followed by tissue homogenization (e.g., Jumpponen and Jones 2009; Daleo et al., 2018; Toju et al. 2019). In this strategy, the spatial extent of available plant tissue is somewhat maintained, but many taxa may be missed if their distributions are patchy, which in turn may lead to increased variance in richness and composition among samples. Hereafter we refer to this strategy as “homogenizing tissue after subsampling” (Figure 1A). The second common sampling strategy involves either collecting a pre-specified amount of tissue from each plant (e.g., six leaves or root segments per plant) or collecting plant tissue somewhat haphazardly without much standardization among plant samples. The samples are then ground prior to collection of a standardized, homogenized subsample (e.g., Zimmerman and Vitousek 2011; Unterseher et al. 2016). We refer to this method as “homogenizing tissue before subsampling” (Figure 1B). One criticism of this strategy is that plants undoubtedly differ in size, and by collecting a set number of leaves or roots, or just any plant material available, researchers are sampling from initial microbial species pools that vary in size. Differences in the homogenized pool sizes can then bias richness measurements toward larger plants. In this approach, it is thought that the homogenized pool may yield a more representative subsample of the microbial community present within the plant. It is also assumed that any subsample from the powdered tissue, regardless of the sampling approach used, will yield the same or a similar microbial community. To our knowledge, however, these assumptions have not been tested.
Because the optimal strategy may depend on the microorganism targeted, we tested these two strategies by extracting and amplifying bacterial DNA (16S), general fungal DNA (ITS2) and arbuscular mycorrhizal fungal DNA (18S) from roots (all) and/or leaves (ITS2 only) of naturally occurring, mature showy milkweed (Asclepias speciosa). We chose showy milkweed because it is a model species for plant defensive chemistry, and previous studies have shown high microbial colonization (Hahn et al., 2018). Due to the highly diverse nature of foliar fungal endophytes (FFE) (Jumpponen and Jones 2009), we extracted DNA from many additional subsamples per plant to assess the extent of undersampling in this group. Finally, we sampled leaves twice across the season to assess whether broad-scale seasonal differences in FFE were still detectable despite potential undersampling, in order to better understand when, and for which types of questions, insufficient sampling is a problem. We predicted that 1) homogenizing before subsampling (strategy 2) would yield richer, more even microbial communities among plants, because more of the plant tissue would be initially homogenized, 2) different sampling strategies would result in different microbial community structures due to the spatial heterogeneity of microbes within plants, and 3) differences between sampling strategies would depend on the organism targeted and tissue sampled (roots or leaves), potentially due to inherent differences in global and likely local richness, as well as microbial modes of dispersal (above vs. belowground).