Principles of mechanistic modelling

The construction, calibration and validation of mechanistic mathematical models require a constant dialogue between the mathematical/computational implementation and analysis of the model, with the experimental and clinical data. As a first step, all available relevant empirical data describing the biological phenomenon are gathered from the literature together with in-house measurements from clinical, animal and in vitro studies. These data can be complemented using publicly available databases, such as String-DB (41), KEGG (8) and the Human Cell Atlas (42). In principle, all types of data, from low- to high-throughput, single-timepoint to dynamic and mean-field to single-patient-single-cell resolution, can be used (Fig. 4A).
After cleaning, filtering and pre-processing the data using standard statistical and bioinformatic methods (43, 44), regulatory networks at the cellular- and tissue-level scales are assembled from the data and visually represented using the Systems Biology Markup Language (SBML) (45). The networks are translated into mechanistic mathematical models using a chosen modeling formalism that depends on the scale and resolution of the data. Multi-scale mathematical models can be modularly constructed and assembled using a variety of formalisms from nonlinear dynamical systems (46), including Boolean networks for genetic regulatory networks, systems of nonlinear ordinary differential equations (ODEs) for signaling networks and interactions between cell types in the tissue, compartmental differential equations or agent-based models to explicitly model spatial coupling between different cell types, stochastic differential equations to simulate population-level distributions of cell markers and delay-differential equations for multi-level regulatory loops that control tissue homeostasis. In general, when constructing a model for a specific disease affecting a particular tissue (e.g., FA-related SCC in the oral mucosa), a good starting point is to identify mathematical models previously proposed and experimentally validated, which can then be extended and adapted to reproduce specific experimental and clinical observations. For example, mathematical models of loss of epithelial homeostasis, originally developed for atopic diseases (47), could be extended to include SCC in FA-specific hallmarks. The quantitative models are then calibrated with a training set of experimental data, using global parametric optimization algorithms (48-50). This allows one to adapt the mathematical models to specific experimental conditions, as well as to integrate scattered experimental data into a formal and coherent framework that articulates all the individual regulatory players that are typically described in isolation, to understand how they give rise to different clinical manifestations. Finally, model validation is achieved by ensuring that the model can reproduce an additional set of empirical data (the validation set) that was not used for the calibration (Fig. 4B).
Next, the calibrated and validated models are analyzed extensively. For this, the nominal conditions, i.e., the calibrated model, are perturbed, e.g., by systematically altering the magnitude of the individual regulatory interactions (changing parameter values), or by structurally altering the different regulatory interactions (changing the equations). The output of the model, e.g., the “phenotype”, is then collected. Examples of such perturbations-to-response mappings are robustness analysis, sensitivity analysis and bifurcation analysis. A robustness analysis tells us which fraction of variations results in a given phenotype (51). Sensitivity analysis weights each individual interaction (parameter) in terms of its contribution to the change in the model output (52). In bifurcation analysis, the model output is assessed as one (or more) parameters are gradually changed (53). Abrupt health-to-disease transitions mathematically correspond to qualitative changes in the model, and they occur typically at bifurcationsThe value(s) of the parameter(s) at which such a bifurcation occurs, known as a bifurcation (set), is particularly interesting because it represents the critical threshold of a perturbation that can be tolerated by the regulatory structure. Furthermore, the appearance of these bifurcation sets can often be anticipated by early warning signals, such as an increased variance in recovery times of the system (54). Thus, early warning signals of bifurcating systems can be used to improve early detection strategies for abrupt disease transitions (Fig. 4C).
Finally, once the mathematical model has been exhaustively calibrated, validated and analyzed, it can be used for designing and optimizing personalized treatment strategies that consider the specific disease stage and patient characteristics (55). For this, one starts by identifying the potential targets of the intervention strategy, e.g., growth hormone receptor inhibitors that reduce excessive proliferation of malignant cells, and modifying the model accordingly. Next, analysis of the extended model, i.e., the system without treatments plus the dynamics of the treatments, can be performed (Fig. 4C). For example, one can use bifurcation analysis to systematically explore how a given treatment affects the overall virtual population of patients (e.g., considering the natural variations that occur in a population due to polymorphisms) by looking at how the bifurcation sets shift under a specific treatment, such as cetuximab (an epidermal growth factor receptor inhibitor). With a similar line of reasoning, it is possible to use bifurcation analysis to explore how different treatment combinations affect the phenotype of a given patient. Here, again, it is particularly relevant to look for bifurcation sets that separate the qualitatively different clinical phenotypes because these curves correspond to all the minimal treatment combinations that can effectively trigger a disease-to-health transition (56). Besides bifurcation analysis, other techniques, such as model predictive control (57), can also be applied to maximize treatment efficiency while minimizing duration, dosing, and negative side effects (Fig. 4D ). In all stages of this modelling pipeline, model predictions must be verified by comparing them to empirical data.