Combined Discriminant Analysis of Metabarcoding and Metabolomics Datasets
The metabolomics and root metabarcoding datasets were integrated using Data Integration Analysis for Biomarker discovery using Latent variable approaches for Omics studies (DIABLO) from the package “mixOmics” (Rohart, Gautier, Singh & Lê Cao 2017). This supervised approach analyses multiple datasets jointly and was used to identify the discriminant features in each dataset that drive differences between treatment groups. The values of the design matrix linking the two datasets were set to 0.1 to prioritise the discriminant ability of the model over the correlation between datasets. Root metabarcoding data were first aggregated at the genus level, and a centred log-ratio (clr) transformation was then applied to both datasets. Three components were identified as optimal for the “centroid.dist” distance using the function perf() with 6-fold cross-validation and 10 repeats. The number of features to select for sparse PLS-DA was tuned with the function tune.block.splsda() using 4-fold cross-validation with 10 repeats. The numbers of features selected on components 1, 2 and 3 were 18, 40 and 6 for metabarcoding and 6, 14 and 90 for metabolomics, respectively. The correlation between the components of the datasets was checked with plotArrow().
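As an illustration only, the clr transformation and the two-block design matrix described above can be sketched in plain Python. This is not the code used in the study, which was run with the R package mixOmics; the function names, the example counts and the offset of 1 added before taking logarithms (to avoid log(0)) are all assumptions made for this sketch.

```python
import math


def clr(counts, offset=1.0):
    """Centred log-ratio transform of one sample.

    Each value is log-transformed and the mean of the logs (the log of
    the geometric mean) is subtracted. The offset is an assumption made
    here to handle zero counts; the source does not state one.
    """
    logs = [math.log(value + offset) for value in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def design_matrix(block_names, weight=0.1):
    """Square design matrix with 0 on the diagonal and `weight` elsewhere."""
    n = len(block_names)
    return [[0.0 if i == j else weight for j in range(n)] for i in range(n)]


# Hypothetical genus-level counts for a single sample.
sample_counts = [10, 0, 5, 85]
transformed = clr(sample_counts)

# Two blocks linked with a weight of 0.1, as stated in the text.
blocks = ["metabarcoding", "metabolomics"]
design = design_matrix(blocks, weight=0.1)

# Features kept per component, as reported in the text.
keep_x = {
    "metabarcoding": [18, 40, 6],
    "metabolomics": [6, 14, 90],
}
```

By construction, the clr-transformed values of a sample sum to zero, and a low off-diagonal design weight such as 0.1 favours separation of the treatment groups over correlation between the two datasets.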