2.1.1 Overview of vAMPirus execution
vAMPirus is composed of three main components that we recommend running in sequence: (1) a startup script that installs dependencies and the databases used for taxonomy assignment; (2) a ‘DataCheck’ pipeline that reports detailed information on data quality and diversity to inform subsequent analysis; and (3) an ‘Analyze’ pipeline that runs a comprehensive, biology-focused analysis of the data using user-specified parameters and program options. vAMPirus is implemented with Nextflow, a scientific workflow manager that allows the program to be configured and deployed using Conda, Docker, Singularity, or cloud platforms such as Amazon Web Services (Di Tommaso et al., 2017). Nextflow communicates natively with job schedulers such as SLURM, PBS, and Torque, so vAMPirus can be run on a high-performance computing cluster as readily as on a local laptop or workstation. Analyses are configured through the Nextflow configuration file, where the computing resources requested by each process can be set to match the available hardware and reduce run times. vAMPirus analyses can also be monitored in real time and launched remotely using Nextflow Tower, with no alterations to the vAMPirus script.
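As an illustration of the kind of settings a Nextflow configuration file can hold, the fragment below is a minimal sketch using generic Nextflow options only. It is not the configuration file distributed with vAMPirus: the profile names, the resource values, and the queue name `general` are hypothetical placeholders chosen for this example.

```groovy
// nextflow.config (illustrative sketch; NOT the file shipped with vAMPirus)

// Default resources requested for every process; values are placeholders.
process {
    cpus   = 4
    memory = '16 GB'
    time   = '12h'
}

// Profiles are selected at launch with `-profile <name>`.
// All profile names below are hypothetical.
profiles {
    // Run every process on the local laptop or workstation.
    standard {
        process.executor = 'local'
    }
    // Submit every process as a SLURM job; 'general' is an assumed queue name.
    cluster {
        process.executor = 'slurm'
        process.queue    = 'general'
    }
    // Run processes inside containers rather than a Conda environment.
    docker {
        docker.enabled = true
    }
    singularity {
        singularity.enabled = true
    }
}
```

Under these assumptions, a run on a cluster would be started with `nextflow run <pipeline script> -profile cluster`, and appending Nextflow's `-with-tower` option (with a Tower access token set in the environment) would report the run to Nextflow Tower.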