Population studied

A total of 365 pediatric patients (355 children aged 2-17 and ten adolescents aged 18-22) with atopic and non-atopic, mild to severe persistent asthma12 were recruited into a prospective, non-interventional clinical study at the Srebrnjak Children’s Hospital outpatient clinic. Informed consent was obtained from the children’s parents/legal guardians. The study protocol was approved by the local Ethics Committee. All patients underwent physical examination, anthropometric measurements and standard diagnostic procedures to establish a diagnosis of asthma and guide its management (Table 1). The patients started treatment with ICS (alone or in combination with LABA) and/or LTRA, according to disease severity and the previously assessed level of disease control. A follow-up visit with lung function and airway inflammation testing took place after 6 months of treatment. Treatment outcomes and the level of asthma control (according to the Global Initiative for Asthma, GINA12) were also assessed at the follow-up visit. In total, 280 features (variables) were collected. The observational study is described in detail in the supplementary file.

Response variables

The patients were classified as “responders” or “non-responders” according to their response to treatment after 6 months of medication use. The classification was based on the Minimal Clinically Important Difference (MCID) for lung function adjusted for children (% of predicted lung function) and on data from other studies, taking into account changes in the level of asthma control (LOAC) and changes in FENO13–17. The response variables are described in detail in Table 2.

Data preparation and balancing

We used Python scripts and previously described methods for data processing and modelling18. Variables with more than 10% missing values were removed. Those with fewer missing values were imputed with their respective median for continuous variables or mode for discrete variables. To avoid the “curse of dimensionality”19, we aggregated individual variables describing allergic sensitization (skin prick test (SPT) and allergen-specific immunoglobulin E (sIgE) test results). These variables were binarized and summed into 4 categories: seasonal inhaled, perennial inhaled, insect venom and food allergens. Strong sensitizations to house dust mite, cat dander and ragweed were treated separately due to their association with disease severity and more severe outcomes20,21. The resulting dataset consisted of 365 patients and 73 variables. We dealt with an imbalanced classification problem (see Table 3), i.e. either responders (1) or non-responders (0) could be underrepresented. In imbalanced classification, predictive models tend to recognize the major class better while struggling with the often scarce minor class, meaning that predictions may be biased towards the major class18,22. To avoid this, we employed synthetic data generation techniques, namely oversampling and undersampling (on the training set exclusively).
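The filtering and imputation rule described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the scripts used in the study; the function name `prepare`, the use of `None` as the missing-value marker and the toy variables are assumptions made for the example.

```python
from statistics import median, mode

def prepare(columns, discrete, max_missing=0.10):
    """Drop variables with more than max_missing missing values, impute the rest.

    columns:  dict mapping a variable name to a list of values (None = missing)
    discrete: set of names of discrete variables (imputed with the mode);
              all other variables are imputed with the median
    """
    cleaned = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        missing_fraction = (len(values) - len(observed)) / len(values)
        if missing_fraction > max_missing:
            continue  # too many missing values: the variable is removed
        fill = mode(observed) if name in discrete else median(observed)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned

# Hypothetical toy data: 10 patients, 3 variables.
toy = {
    "fev1": [80.0, None, 95.0, 90.0, 85.0, 100.0, 70.0, 75.0, 88.0, 92.0],
    "sex": ["m", "f", "f", None, "f", "m", "f", "m", "f", "f"],
    "ige": [None, None, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
}
cleaned = prepare(toy, discrete={"sex"})
```

Here `ige` is dropped (20% missing), while the single gaps in `fev1` and `sex` are filled with the median and the mode of the observed values, respectively.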
A powerful method for oversampling is the Synthetic Minority Over-sampling Technique (SMOTE)23, which has previously been used for predicting lung disease outcomes10. Since our dataset was heterogeneous, we used the adapted algorithm, Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTENC)23. We used Cluster Centroids (CC)24,25 as the most promising undersampling approach.
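To illustrate the idea behind SMOTE, the sketch below generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library version for continuous features only; it is not SMOTENC (which additionally handles nominal features) and not the implementation used in the study. The function name `smote_sketch`, its parameters and the toy points are assumptions.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical minority-class samples with two continuous features.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority points, it stays within the region already occupied by the minority class. As in the study, such resampling would be applied to the training set only.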

Machine learning

Our aim was to estimate, based on predictive variables, to which class (0/1) a patient belongs after treatment, and thus to predict the patients’ future responses. The ensemble classification algorithms we employed follow a paradigm in which multiple “weak classifiers” are trained and averaged to improve predictive ability and lower the prediction error.
The basis of ensemble classifiers is the decision tree (Figure 1, left)26. With the appearance of boosting27 and bootstrapping strategies26, combined predictors started to emerge. Boosting algorithms28 are often utilized in industry18 and personalized medicine29. Random forest (RF; Figure 1, right) has shown good results in predicting pediatric asthma outcomes10. Besides their excellent performance, decision-tree-based classifiers do not require tedious data preparation and are convenient for working with heterogeneous data. We used two types of classifiers in our research: AdaBoost and RF. The data were split22 into training (75%) and test (validation) sets (25%). The experimental matrix is described in Table 4.
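A 75%/25% split that preserves the class proportions in both sets could look like the sketch below. It uses only the standard library; the function name `stratified_split`, the seed and the class counts in the example are hypothetical and do not correspond to the values in Table 3.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices), split separately within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])   # 25% of this class goes to the test set
        train.extend(indices[n_test:])  # the remaining 75% goes to training
    return sorted(train), sorted(test)

# Hypothetical imbalanced target: 80 responders (1), 20 non-responders (0).
labels = [1] * 80 + [0] * 20
train_idx, test_idx = stratified_split(labels)
```

With these counts the test set contains 20 responders and 5 non-responders, so the minor class keeps the same share in both sets.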
For model explanation we used permutation importance (PI), as in our prior work30. Its rationale is that the values of a predictor variable are randomly permuted, and the difference in the classification metric before and after the permutation is used as an importance measure31. This procedure is even more relevant considering that bootstrapping (resampling with replacement) is used in ensemble classifiers, so not all variables will appear in each tree. It helps reveal the true predictors in the models and can also be used for feature selection in machine learning models32. Due to the imbalance in the targets, we stratified the split so that the minor class was represented in both the training and test sets. The model quality metrics used in this work were Accuracy, Sensitivity, Specificity and the Matthews correlation coefficient (MCC)33,34. A detailed description of these metrics is given in the supplementary data.
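The four metrics and the permutation importance procedure can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the study's code: the function names, the choice of MCC as the scoring metric for PI, the number of repeats and the toy model (a fixed threshold on the first feature) are all hypothetical.

```python
import math
import random

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in MCC after shuffling each feature column in turn."""
    rng = random.Random(seed)

    def score(rows):
        return classification_metrics(y, [predict(r) for r in rows])["mcc"]

    baseline = score(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances

# Worked example of the metrics: TP = 3, TN = 3, FP = 1, FN = 1.
example = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                                 [1, 1, 0, 0, 0, 1, 1, 0])

# Hypothetical toy model that uses only the first feature.
X = [[0.9, 3], [0.8, 1], [0.7, 2], [0.6, 5], [0.9, 4],
     [0.1, 3], [0.2, 1], [0.3, 2], [0.4, 5], [0.2, 4]]
y = [1] * 5 + [0] * 5
toy_model = lambda row: int(row[0] > 0.5)
importances = permutation_importance(toy_model, X, y)
```

In the worked example all of accuracy, sensitivity and specificity equal 0.75 and the MCC equals 0.5. For the toy model, shuffling the second feature leaves the predictions unchanged, so its importance is zero, while shuffling the first feature lowers the MCC.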