Combined Discriminant Analysis of Metabarcoding and Metabolomics Datasets
Data Integration Analysis for Biomarker discovery using Latent variable
approaches for Omics studies (DIABLO) from the package “mixOmics” was
used for the integration of metabolomics and root metabarcoding datasets
(Rohart, Gautier, Singh & Lê Cao 2017). This supervised approach
allowed the integrated analysis of multiple datasets and was used to
identify discriminant features in both datasets that drive differences
between treatment groups. Off-diagonal values of the design matrix were set to 0.1 to
prioritise the discriminant ability of the model over the correlation between
datasets. Root metabarcoding data were aggregated at the genus level, and a centred
log-ratio (clr) transformation was then applied to both datasets. An optimal number of 3
components was determined for the “centroids.dist” distance using the
function perf() with 6-fold cross-validation and 10 repeats. The number
of features selected for sparse PLS-DA was tuned with the function
tune.block.splsda() using 4-fold cross-validation with 10 repeats. The
numbers of features selected for components 1 to 3 were 18, 40 and 6 for
metabarcoding and 6, 14 and 90 for metabolomics. The correlation among components of each
dataset was checked with plotArrow().
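
The two preprocessing choices described above, the clr transformation and the 0.1 design matrix, can be illustrated with a minimal sketch. It is written in plain Python for self-containment and is not the code used in the study, which relied on mixOmics in R. The function names, the example counts and the offset of 1 added before taking logarithms are assumptions made for illustration only.

```python
import math

def clr(values, offset=0.0):
    """Centred log-ratio transform of one sample.

    Each value is log-transformed and the mean of the logs
    (the log of the geometric mean) is subtracted.
    `offset` is added first so that zero counts can be handled.
    """
    logs = [math.log(v + offset) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def design_matrix(block_names, weight=0.1):
    """Block-by-block design: `weight` between different blocks, 0 on the diagonal."""
    return {
        a: {b: (0.0 if a == b else weight) for b in block_names}
        for a in block_names
    }

# Hypothetical genus-level counts for a single root sample (not real data)
counts = [12, 0, 30, 7]
transformed = clr(counts, offset=1)  # offset of 1 is an assumption to avoid log(0)

# Two blocks as in the text, linked with a weight of 0.1
design = design_matrix(["metabarcoding", "metabolomics"])
```

After the transformation the values of each sample sum to zero, which is the defining property of clr. A low off-diagonal weight such as 0.1 asks the model to favour separation of the treatment groups over maximising the correlation between the two datasets.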