Short read lengths recover ecological patterns in 16S rRNA gene amplicon data
  • Stephanie Jurburg
Helmholtz-Centre for Environmental Research - UFZ

Corresponding Author: [email protected]


Metabarcoding is an increasingly popular and accessible method for assessing bacterial communities across a wide range of environments, and as sequence data archives grow, sequence data reuse will likely become an important source of novel insights into the ecology of microbes. While literature exists on the benefits of longer read lengths for the study of microbial communities, little is known about the (re)usability of shorter (<200 bp) reads. This information is essential for improving the reuse and comparability of metabarcoding data across studies. This study reanalyzed three 16S rRNA datasets targeting aquatic, animal-associated, and soil microbiomes, and evaluated how processing the sequence data across a range of read lengths affected the resulting taxonomic assignments, biodiversity metrics, and differential (i.e., before-after treatment) analyses. Short read lengths successfully recovered ecological patterns, and only limited increases in resolution were observed beyond 100 bp reads across environments. Furthermore, abundance-weighted diversity metrics (e.g., Inverse Simpson index or Bray-Curtis dissimilarities) were more robust to variation in read length. Importantly, the total number of ASVs detected increased with read length, highlighting the need to consider metabarcoding-derived diversity estimates within the context of the bioinformatics parameters selected. This study provides evidence-based guidelines for the processing of short reads.
27 Aug 2023: Submitted to Molecular Ecology Resources
29 Aug 2023: Assigned to Editor
29 Aug 2023: Submission Checks Completed
29 Aug 2023: Review(s) Completed, Editorial Evaluation Pending
31 Aug 2023: Reviewer(s) Assigned