Machine learning algorithm
For the training data, the C-index for the ML-based logistic
regression algorithm was 0.729 (95% CI 0.718-0.740), incrementally
higher than that obtained for the main effect model (C-index 0.718).
Similar results were obtained for the external validation cohort
(C-index 0.704, 95% CI 0.687-0.721).
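
For a binary outcome such as incident AF, the C-index is the probability that a randomly chosen patient who developed the outcome received a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The sketch below illustrates that definition only; the function name `c_index` and the toy data are assumptions for illustration, not the study's code or data.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equivalent to the ROC AUC).

    y_true:  iterable of 0/1 outcome labels (1 = event occurred)
    y_score: iterable of predicted risks, same length as y_true
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0      # case ranked above control
            elif case_score == control_score:
                concordant += 0.5      # tie counts as half
    return concordant / (len(cases) * len(controls))


# Toy example (hypothetical values): 3 of the 4 case/control pairs are concordant.
toy_labels = [0, 0, 1, 1]
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_c = c_index(toy_labels, toy_risks)
```

The pairwise loop is quadratic in cohort size and is shown for clarity; a rank-based computation would be preferable for a large cohort.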
Table 4 depicts the complex relationships between the incident AF
outcome and model features in terms of main effect, interactions and
polynomial effects. The top three independent effects of co-morbid
conditions in the main effect model (Table 3) were COVID-19 status,
congestive heart failure and coronary artery disease; these were also
the only independent effects retained in the ML-based logistic
regression formulation (Table 4). COVID-19 status, congestive heart failure and
coronary artery disease also had interaction effects with other
co-morbid conditions or demographic variables. Age was significant both
as a categorical variable in interaction terms and as a continuous
variable in quadratic terms.
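
To make the three kinds of terms concrete, the sketch below expands one patient record into main effects, pairwise interaction terms and a quadratic term for a continuous variable. It is only an illustration of the design-matrix idea: the variable names (`covid19`, `chf`, `cad`, `age`) and the helper `expand_features` are hypothetical, and how the ML algorithm actually generated and selected its terms is not described here.

```python
from itertools import combinations


def expand_features(row, binary_vars, continuous_vars):
    """Return main effects plus pairwise interactions and quadratic terms.

    row:             dict mapping variable name -> numeric value
    binary_vars:     names of 0/1 variables to interact pairwise
    continuous_vars: names of continuous variables to square
    """
    features = dict(row)  # main effects are kept unchanged

    # Interaction term: product of two indicators (1 only if both are present).
    for a, b in combinations(binary_vars, 2):
        features[f"{a}:{b}"] = row[a] * row[b]

    # Polynomial term: square of each continuous variable.
    for v in continuous_vars:
        features[f"{v}^2"] = row[v] ** 2

    return features


# Hypothetical patient record, for illustration only.
patient = {"covid19": 1, "chf": 1, "cad": 0, "age": 70.0}
expanded = expand_features(patient, ["covid19", "chf", "cad"], ["age"])
```

A logistic regression fitted on the expanded columns can then assign separate coefficients to, for example, `covid19`, `chf` and `covid19:chf`.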
Figure 1 shows that both the main effect model and the ML-based
logistic regression algorithm had better clinical utility, in terms of
net benefit, than the two default strategies (i.e., treat all or treat none).
Above the probability threshold of 1.0%, the ML formulation provided
better clinical utility than the main effect model. At a probability
threshold of 1.5%, the ML-based logistic regression yielded 85.5 net
true positive AF events, more than the main effect model (58.9 net
events). In addition, the sensitivity and specificity of the ML
algorithm were 29.8% and 91.2%, respectively.
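
Net benefit at a probability threshold pt is conventionally defined as TP/n - (FP/n) x pt/(1 - pt), where TP and FP are the true and false positives among n patients when everyone with predicted risk at or above pt is treated. The sketch below implements that standard formula together with the treat-all reference strategy (treat none has a net benefit of zero). The function names and the toy data are assumptions for illustration and do not reproduce the study's analysis.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    n = len(y_true)
    events = sum(y_true)
    weight = threshold / (1.0 - threshold)
    return events / n - ((n - events) / n) * weight


# Toy example (hypothetical values), evaluated at a 20% threshold.
toy_outcomes = [1, 0, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1]
nb_model = net_benefit(toy_outcomes, toy_probs, 0.2)
nb_all = net_benefit_treat_all(toy_outcomes, 0.2)
```

Multiplying a net benefit by a reference population size expresses it as net true positives per that many patients. Evaluating both functions over a grid of thresholds gives the curves plotted in a decision-curve analysis.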