2.6 Core analysis, differentially abundant taxa and functional prediction
Shared ASVs among sources at each site were visualized with the “MicEco” package. Core communities were defined to facilitate the interpretation of host and environmental microbiota. ASVs present in at least 70 % of samples were considered core, and ASVs present in fewer than 30 % of samples were considered rare (Björk et al., 2018). All other ASVs, i.e. those present in at least 30 % but fewer than 70 % of samples, were considered transient. Indicator ASVs were identified with the multipatt function in the “indicspecies” package (Cáceres et al., 2023). Differentially abundant ASVs between sources were also identified by ANOVA-Like Differential Expression analysis with the “ALDEx2” package (Fernandes et al., 2014).
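The prevalence thresholds can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R script used in this study. The function name `classify_asvs`, the example ASV labels and the read counts are hypothetical, and the sketch assumes that prevalence is computed over whichever set of samples is passed in, with an ASV counted as present when its read count is above zero.

```python
from typing import Dict, List


def classify_asvs(counts: Dict[str, List[int]],
                  core_pct: int = 70,
                  rare_pct: int = 30) -> Dict[str, str]:
    """Label each ASV as 'core', 'rare' or 'transient' by prevalence."""
    labels = {}
    for asv, row in counts.items():
        n_samples = len(row)
        if n_samples == 0:
            raise ValueError(f"no samples supplied for {asv}")
        # An ASV is treated as present in a sample if it has at least one read.
        n_present = sum(1 for c in row if c > 0)
        # Integer comparison avoids floating-point edge cases at the cut-offs.
        if n_present * 100 >= core_pct * n_samples:
            labels[asv] = "core"        # present in at least 70 % of samples
        elif n_present * 100 < rare_pct * n_samples:
            labels[asv] = "rare"        # present in fewer than 30 % of samples
        else:
            labels[asv] = "transient"   # everything in between
    return labels


# Hypothetical read counts for four ASVs across ten samples.
example_counts = {
    "ASV_a": [5, 3, 0, 8, 1, 2, 0, 4, 9, 0],  # present in 7 of 10 samples
    "ASV_b": [0, 0, 2, 0, 0, 0, 0, 1, 0, 0],  # present in 2 of 10 samples
    "ASV_c": [1, 0, 0, 4, 0, 0, 6, 0, 0, 0],  # present in 3 of 10 samples
    "ASV_d": [2, 0, 3, 0, 1, 0, 7, 0, 5, 0],  # present in 5 of 10 samples
}
example_labels = classify_asvs(example_counts)
```

In this example an ASV found in exactly 30 % of samples is labelled transient, because the rare category is restricted to ASVs found in fewer than 30 % of samples.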
Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt2) was used to predict physiological and metabolic functions of the host and environmental microbiota based on ASVs generated from the QIIME2 DADA2 pipeline (Douglas et al., 2020; Langille et al., 2013). This procedure predicts the relative abundance of functional genes (expressed as KEGG Orthologs, KOs) in a 16S ASV community from the phylogenetic conservation of these genes across sequenced and assembled prokaryotic reference genomes. Quality control was implemented by computing weighted nearest sequenced taxon index (NSTI) values for each ASV. NSTI indicates the expected accuracy of PICRUSt2 predictions because it reflects the average genetic distance (measured as the number of substitutions per site) between each ASV and its nearest sequenced reference genome (Douglas et al., 2020; Langille et al., 2013). ASVs with NSTI values higher than 2 were eliminated following the developers’ guidelines (Douglas et al., 2020). PERMANOVA with 999 permutations was used to compare functional pathways between sources and sites. Potentially differentially abundant MetaCyc pathways between sources were analysed with ALDEx2. Pathways that were significantly differentially abundant (p < 0.01) were then visualized with the “ComplexHeatmap” package (Gu et al., 2016). All R packages mentioned were run in RStudio ver. 1.2.5019. To support scientific reproducibility, all analyses are included in the script provided with the supplementary materials.
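The NSTI quality-control step can be sketched as follows. This is an illustrative Python example under stated assumptions, not PICRUSt2's internal code and not the script used in this study. The function names `filter_by_nsti` and `weighted_nsti`, the ASV labels and all values are hypothetical. The sketch assumes that the weighted NSTI of a sample is the abundance-weighted mean of the per-ASV NSTI values, and that it is computed after ASVs above the cut-off have been removed.

```python
from typing import Dict


def filter_by_nsti(nsti: Dict[str, float], max_nsti: float = 2.0) -> Dict[str, float]:
    """Keep only ASVs whose NSTI does not exceed the cut-off."""
    return {asv: value for asv, value in nsti.items() if value <= max_nsti}


def weighted_nsti(nsti: Dict[str, float], abundance: Dict[str, float]) -> float:
    """Abundance-weighted mean NSTI of one sample (assumed definition)."""
    # Only ASVs that have both an NSTI value and an abundance contribute.
    shared = [asv for asv in nsti if asv in abundance]
    total = sum(abundance[asv] for asv in shared)
    if total == 0:
        raise ValueError("sample has no reads for the retained ASVs")
    return sum(nsti[asv] * abundance[asv] for asv in shared) / total


# Hypothetical per-ASV NSTI values (substitutions per site).
example_nsti = {"ASV_a": 0.05, "ASV_b": 0.40, "ASV_c": 2.5}
# Hypothetical read counts of the same ASVs in a single sample.
example_abundance = {"ASV_a": 60, "ASV_b": 40, "ASV_c": 10}

kept_nsti = filter_by_nsti(example_nsti)                      # ASV_c exceeds 2
sample_nsti = weighted_nsti(kept_nsti, example_abundance)
```

Here the ASV with an NSTI of 2.5 is discarded before the sample-level value is computed, so a distantly related ASV does not contribute to the weighted mean.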