Machine learning algorithm
For the training data, the C-index for the ML-based logistic regression algorithm was 0.729 (95% CI 0.718-0.740), incrementally higher than that obtained for the main effect model (C-index 0.718). Similar results were obtained for the external validation cohort (0.704, 95% CI 0.687-0.721).
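As a reminder of what the C-index measures, the following is a minimal sketch and not the authors' code. It assumes a binary incident AF outcome (the section does not state whether a time-to-event version was used), in which case the C-index is the fraction of case/non-case pairs where the case received the higher predicted risk, with ties counted as one half. The function name and the toy data are hypothetical.

```python
def c_index_binary(y_true, y_prob):
    """Concordance index for a binary outcome (equivalent to the ROC AUC).

    y_true: list of 0/1 outcome labels (1 = event occurred).
    y_prob: list of predicted risks, in the same order as y_true.
    """
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_cases = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for p_case in cases:
        for p_non in non_cases:
            if p_case > p_non:
                concordant += 1.0   # case ranked above non-case
            elif p_case == p_non:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Toy data (hypothetical, for illustration only).
toy_labels = [1, 0, 1, 0]
toy_risks = [0.9, 0.1, 0.4, 0.6]
toy_c = c_index_binary(toy_labels, toy_risks)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for large cohorts.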
Table 4 depicts the complex relationships between the incident AF outcome and model features in terms of main effects, interactions, and polynomial effects. The top three independent effects of co-morbid conditions in the main effect model (Table 3) were COVID-19 status, congestive heart failure, and coronary artery disease, which were also the only independent effects found in the ML-based logistic regression formulation (Table 4). These three conditions also had interaction effects with other co-morbid conditions or demographic variables. Age was significant both as a categorical variable in interaction terms and as a continuous variable in quadratic terms.
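To illustrate the three kinds of terms described here (main effects, interactions, and polynomial effects), the sketch below expands one patient record into model features. It is illustrative only: the field names, the age cutoff of 65, and the specific interaction pairs are assumptions and are not taken from Table 4.

```python
def expand_features(patient):
    """Build main-effect, interaction, and quadratic terms for one patient.

    patient: dict with hypothetical keys "age" (years) and 0/1 flags
    "covid19", "chf" (congestive heart failure), "cad" (coronary artery disease).
    """
    age = patient["age"]
    covid = patient["covid19"]
    chf = patient["chf"]
    cad = patient["cad"]

    # Age as a categorical variable; the cutoff of 65 is an assumption.
    older = 1 if age >= 65 else 0

    return {
        # Main (independent) effects.
        "covid19": covid,
        "chf": chf,
        "cad": cad,
        # Polynomial effect: age as a continuous variable, squared.
        "age_squared": age ** 2,
        # Interaction effects (pairs chosen for illustration only).
        "covid19_x_chf": covid * chf,
        "covid19_x_older": covid * older,
    }


example = expand_features({"age": 70, "covid19": 1, "chf": 0, "cad": 1})
```

An interaction term is nonzero only when both of its components are present, which is how a logistic model can assign additional risk to a combination beyond the sum of its parts.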
In Figure 1, both the main effect model and the ML-based logistic regression algorithm showed better clinical utility, in terms of net benefit, than the two default treatment strategies (i.e., treat all or treat none). Above a probability threshold of 1.0%, the ML formulation provided better clinical utility than the main effect model. At a probability threshold of 1.5%, the ML-based logistic regression yielded 85.5 net true positive AF events, higher than the main effect model (58.9 net events). In addition, the sensitivity and specificity of the ML algorithm were 29.8% and 91.2%, respectively.
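For readers unfamiliar with decision curve analysis, the sketch below shows the standard net benefit calculation at a given probability threshold, together with the treat-all reference strategy (treat-none has a net benefit of zero by definition). It is a minimal illustration under the usual definition, not the authors' implementation. The function names and toy data are hypothetical, and the population denominator used to express net benefit as net true positive events is not stated in this section.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    Net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: the harm of one false positive
    # relative to the benefit of one true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    n = len(y_true)
    prevalence = sum(y_true) / n
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy cohort (hypothetical): 2 events among 10 patients.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_risks = [0.6, 0.1, 0.7, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

nb_model = net_benefit(toy_labels, toy_risks, threshold=0.2)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.2)
```

Multiplying a net benefit by a chosen population size expresses it as net true positive events per that many patients. A model is clinically useful at a threshold when its net benefit exceeds both the treat-all value and zero.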