Better together: A combined theory-driven and data-driven approach
To realise the full potential of parameters derived from computational models as biomarkers, theory-driven models will have to be combined with data-driven (machine learning) approaches. Generative embedding exemplifies this combined approach. Here, models of brain connectivity or cognitive tasks are fitted to each participant's data to estimate parameters that represent physiological or cognitive processes. Machine learning techniques then use these parameters to classify participants into subgroups, which are validated against clinical variables and assessed for prognostic value. One example uses dynamic causal modelling, which infers the strength of synaptic coupling between large neuronal populations in different brain regions, to predict participants' disease trajectory in depression [5].
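The two stages of generative embedding can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration and is not taken from the work cited above: the theory-driven model is a simple Rescorla-Wagner learning rule rather than a dynamic causal model, the participants and their two subgroups are synthetic, the fit is a plain grid search, and the data-driven step is a bare-bones two-cluster k-means. The function names (`simulate_ratings`, `fit_learning_rate`, `two_means`) are hypothetical.

```python
import numpy as np


def simulate_ratings(alpha, rewards, noise_sd, rng):
    """Simulate noisy expectation ratings from a Rescorla-Wagner learner."""
    v = 0.5  # initial expectation (assumed)
    ratings = np.empty(len(rewards))
    for t, r in enumerate(rewards):
        ratings[t] = v + rng.normal(0.0, noise_sd)
        v += alpha * (r - v)  # prediction-error update
    return ratings


def fit_learning_rate(ratings, rewards, grid=None):
    """Theory-driven step: grid-search the learning rate that minimises squared error."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    best_alpha, best_sse = grid[0], np.inf
    for alpha in grid:
        v, sse = 0.5, 0.0
        for r, y in zip(rewards, ratings):
            sse += (y - v) ** 2
            v += alpha * (r - v)
        if sse < best_sse:
            best_alpha, best_sse = alpha, sse
    return best_alpha


def two_means(x, n_iter=50):
    """Data-driven step: 1-D k-means with k=2, initialised at the min and max of x."""
    centres = np.array([x.min(), x.max()], dtype=float)
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centres[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centres[k] = x[labels == k].mean()
    return labels, centres


rng = np.random.default_rng(0)
n_trials, n_per_group = 100, 20

# Two synthetic subgroups that differ only in their true learning rate.
true_alpha = np.r_[np.full(n_per_group, 0.1), np.full(n_per_group, 0.6)]
true_group = np.r_[np.zeros(n_per_group, dtype=int), np.ones(n_per_group, dtype=int)]

# Stage 1: embed each participant's 100 ratings into one fitted model parameter.
fitted = np.empty(len(true_alpha))
for i, alpha in enumerate(true_alpha):
    rewards = rng.binomial(1, 0.7, n_trials).astype(float)
    ratings = simulate_ratings(alpha, rewards, noise_sd=0.1, rng=rng)
    fitted[i] = fit_learning_rate(ratings, rewards)

# Stage 2: cluster participants in the low-dimensional parameter space.
labels, centres = two_means(fitted)
agreement = (labels == true_group).mean()
print(f"cluster centres: {centres}, agreement with simulated groups: {agreement:.2f}")
```

In this sketch each participant's 100 trial-wise ratings are summarised by a single interpretable parameter before clustering. In a real application the recovered subgroups would then be compared against clinical variables and outcomes, a step that simulated data cannot stand in for.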
There are several benefits to a combined theoretical and data-driven approach. Data-driven models depend on the quality of the input data, i.e. clean signals with little noise, and on how sensitive those data are to the relevant biological, cognitive, and environmental processes that shape behaviour. Theory-driven models provide a specific mechanistic account of the input data, based on prior research and expertise. The parameters of a theory-driven model can therefore provide a framework that reduces the dimensionality of the input data, i.e. the total number of variables, leading to more stable and accurate output from data-driven machine learning techniques.