Supporting Information to "Development of an open-access and explainable machine learning prediction system to assess the mortality and recurrence risk factors of Clostridioides difficile infection patients: Model Training and Hyperparameter Optimization with Cross-Validation"
Identifying Clostridioides difficile infection (CDI) patients at risk of mortality or recurrence will facilitate prevention and timely treatment and improve clinical outcomes. We aim to establish an open-access web-based prediction system that estimates CDI patients’ mortality and recurrence outcomes and explains each machine learning prediction in terms of the patient’s characteristics. Prognostic models were developed with four types of machine learning algorithms and a statistical logistic regression model, using data from over 15,000 CDI patients from 41 hospitals in Hong Kong. The boosting-based Gradient Boosting Machine algorithm (Mortality AUC: 0.7878; Recurrence AUC: 0.7076) outperformed the statistical models (Mortality AUC: 0.7573; Recurrence AUC: 0.6927) and the other machine learning algorithms. The open-access prediction system, with which clinicians can assess and interpret the risk factors of CDI patients, is now available at https://www.cdiml.care/. In this article, we explain the development of the machine learning models and illustrate how to apply hyperparameter tuning with cross-validation to optimize model accuracy.
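
As a rough illustration of what hyperparameter tuning with cross-validation involves, the following minimal sketch runs a grid search over a gradient boosting classifier with 5-fold stratified cross-validation, scored by AUC. It rests on several assumptions that do not come from the study: it uses scikit-learn (which may differ from the software used for the published models), a synthetic dataset in place of the patient cohort, an illustrative hyperparameter grid, and a 5-fold split chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the patient cohort (NOT the study data):
# 600 rows, 10 features, roughly 20% positive outcomes.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)

# Hold out a test set that the tuning procedure never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search grid; the values are illustrative only.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Stratified 5-fold CV keeps the outcome rate similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each grid point is scored by its mean AUC across the validation folds.
search = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=cv,
)
# After the search, the best setting is refit on all training data.
search.fit(X_train, y_train)

# Evaluate the refit model once on the untouched test set.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print("Best hyperparameters:", search.best_params_)
print(f"Mean cross-validated AUC: {search.best_score_:.4f}")
print(f"Held-out test AUC: {test_auc:.4f}")
```

The test set is kept outside the search so that the reported test AUC is not biased by the tuning itself; the cross-validated AUC is used only to choose among the candidate settings.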