Better together: A combined theory-driven and data-driven
approach
To realise the full potential of parameters derived from computational
models as biomarkers, theory-driven models will have to be combined with
data-driven (machine learning) approaches. Generative embedding
exemplifies this combined approach. Here, models of brain connectivity
or cognitive tasks are fitted to each participant's data, yielding
parameter estimates that represent physiological or cognitive processes.
Machine learning techniques then use these estimates to classify
participants into subgroups, which are validated against clinical
variables and assessed for prognostic value. One
example uses dynamic causal modelling, which infers synaptic coupling
between large neuronal populations based on connection strengths between
brain regions, to predict the disease trajectory of participants with depression
[5].
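The generative embedding steps described above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the cited work: a first-order autoregressive process stands in for the theory-driven model (it is not dynamic causal modelling), the data are simulated, a basic two-cluster k-means stands in for the machine learning step, and all function and variable names are hypothetical. The final validation against clinical variables is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_ar1(a, n_steps, rng):
    """Simulate x_t = a * x_{t-1} + noise (illustrative stand-in model)."""
    x = np.zeros(n_steps)
    for t in range(1, n_steps):
        x[t] = a * x[t - 1] + rng.normal()
    return x


def fit_ar1(x):
    """Least-squares estimates of the coupling parameter and residual SD."""
    prev, curr = x[:-1], x[1:]
    a_hat = prev @ curr / (prev @ prev)
    sigma_hat = np.std(curr - a_hat * prev)
    return np.array([a_hat, sigma_hat])


def two_means(features, n_iter=20):
    """Minimal 2-cluster k-means, seeded at the extremes of the first feature."""
    seeds = [features[:, 0].argmin(), features[:, 0].argmax()]
    centroids = features[seeds]
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centroids = np.array([features[labels == k].mean(axis=0) for k in (0, 1)])
    return labels


# Step 1: simulated "recordings" from two latent subgroups (assumed values).
n_per_group, n_steps = 30, 300
true_a = np.repeat([0.2, 0.8], n_per_group)
data = np.array([simulate_ar1(a, n_steps, rng) for a in true_a])

# Step 2: theory-driven step, fitting the model to each participant.
# Each 300-sample time series is reduced to 2 parameter estimates.
features = np.array([fit_ar1(x) for x in data])

# Step 3: data-driven step, grouping participants in parameter space.
labels = two_means(features)

# Agreement with the simulated subgroups, allowing for label swapping.
true_labels = (true_a > 0.5).astype(int)
agreement = max(np.mean(labels == true_labels), np.mean(labels != true_labels))
print(data.shape, features.shape, agreement)
```

In this sketch the clustering operates on two fitted parameters per participant rather than on 300 raw samples, which is the dimensionality reduction that the combined approach relies on. In a real application the stand-in model would be replaced by a model appropriate to the data, and the clustering by whichever supervised or unsupervised method suits the clinical question.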
There are several benefits to a combined theoretical and data-driven
approach. Data-driven models depend on the quality of the input data:
how clean the signals are (i.e. how little noise they contain) and how
sensitive those signals are to the biological, cognitive, and
environmental processes that shape behaviour. Theory-driven models
provide a specific mechanistic description of the input data, based on
prior research and expertise.
The parameters of a theory-driven model can therefore serve as a compact
representation that reduces the dimensionality, i.e. the total number of
variables, of the input data, leading to more stable and accurate output
from data-driven machine learning techniques.