Alex Veglia

and 4 more

Amplicon sequencing is an effective and increasingly applied method for studying viral communities in the environment. Here, we present vAMPirus, a user-friendly, comprehensive, and versatile DNA and RNA virus amplicon sequence analysis program designed to support investigators in exploring virus amplicon sequencing data and running informed, reproducible analyses. vAMPirus takes raw virus amplicon libraries as input and, by default, performs nucleotide- and protein-based analyses to produce results such as sequence abundance information, taxonomic classifications, phylogenies, and community diversity metrics. The vAMPirus pipelines also include optional approaches that can increase the biological signal-to-noise ratio in results by leveraging tools not yet commonly applied to virus amplicon data analyses. In this paper, we validate the vAMPirus analytical framework and illustrate its integration into the general virus amplicon sequencing workflow by recapitulating findings from two previously published double-stranded DNA virus datasets. As a case study, we also apply the program to explore the diversity and distribution of a coral reef-associated RNA virus. vAMPirus is integrated with the Nextflow workflow manager, offering straightforward scalability, standardization, and communication of virus lineage-specific analyses. The vAMPirus framework itself is also designed to be adaptable; community-driven analytical standards will continue to be incorporated as the field advances. vAMPirus supports researchers in revealing patterns of virus diversity and population dynamics in nature, while promoting study reproducibility and comparability.