Discussion
This study was carried out to evaluate personalized risk factors of revision ESS in CRS patients. Using machine learning algorithms, we discovered novel, previously unpublished variables that were important predictors of revision ESS, such as a high number of visits before and after the baseline ESS and a short time between the baseline visit and the baseline ESS. Our data also demonstrated that age, Type 2 high diseases (CRSwNP, asthma, NERD), and immunodeficiency or its suspicion were important predictors of revision ESS at the individual level, which is in line with previous observations at the population level19.
None of the previous studies have presented models designed to predict revision ESS at the individual level and to account for non-linear predictors. Success rates for initial ESS range from 76% to 98%20,21. Revision ESS risk has previously been studied at the population level using models such as Cox’s proportional hazards7,9,10 or logistic regression8,9,12,14, which usually assume that associations are linear and that an alpha error < 5% indicates the importance of a predictor.
An increased number of visits, an increased visit frequency, and a short time between the baseline visit and the baseline ESS were associated with revision ESS. Our findings suggest that more visits before ESS might signal a more severe disease, which seems to affect not only the physician’s and patient’s decision to proceed to ESS at baseline but also the decision on revision ESS during follow-up. The results reflect that patients who achieved disease control after the baseline ESS did not need further follow-up visits in tertiary care and were discharged from hospital follow-up, whereas those with continuing problems visited more frequently and had a higher probability of ending up with revision ESS. There is little evidence in the literature on the predictive potential of visit variables at the individual level. A retrospective cohort study from the US (n = 6985) showed that the number of post-operative outpatient visits was associated with revision surgery after anterior cruciate ligament reconstruction22. These findings are thus similar to ours, although in a different type of surgery and at the population level. Our finding that patients with a high visit frequency at baseline are at a higher risk of being only partially controlled by surgery might be helpful in patient counseling.
The current study showed that CRSwNP, asthma, and NERD are important predictors of revision ESS also at the individual level. In accordance with this, previous studies have demonstrated at the hospital population level that several factors are associated with CRS recurrence and/or revision ESS, such as CRSwNP, asthma, AR, NERD, eosinophilia, and smoking1,7,23,24. CRSwNP patients with co-morbid asthma and/or NERD have an increased risk of recurrence and revision ESS, although these patients seem to benefit from initial ESS13,19,25–27. This may reflect a more severe disease, usually with co-morbid NERD, anosmia, Type 2 high eosinophilic inflammation, and a greater tendency for polyp re-growth23,28–37. When performing SFS, immunodeficiency or its suspicion was also among the top ten predictors for all three classifiers. This is in line with a previous study showing at the hospital population level that immunodeficiency and granulomatosis with polyangiitis increase the revision ESS risk38.
We showed that a longer EHR data collection time increased the predictive accuracy of the models. Data collected from the baseline visit until 12 months after the baseline ESS gave the highest predictive accuracy in our models. Choosing the time span of data collection for the model is thus a trade-off between the time required after the baseline ESS and model accuracy.
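The effect of the collection window on a visit-count predictor can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `count_visits_in_window`, the invented visit dates, and the approximation of one month as 30 days are all assumptions made for illustration.

```python
from datetime import date, timedelta


def count_visits_in_window(visit_dates, baseline_visit, baseline_ess, months_after):
    """Count visits from the baseline visit up to `months_after` months after the baseline ESS.

    A month is approximated as 30 days (an assumption of this sketch).
    """
    window_end = baseline_ess + timedelta(days=30 * months_after)
    return sum(baseline_visit <= d <= window_end for d in visit_dates)


# Hypothetical patient; all dates are invented for illustration only.
baseline_visit = date(2010, 1, 10)
baseline_ess = date(2010, 6, 1)
visits = [
    date(2010, 1, 10),
    date(2010, 3, 2),
    date(2010, 6, 20),
    date(2010, 12, 15),
    date(2011, 5, 5),
    date(2012, 2, 1),
]

# Visit counts for windows ending 0, 6 and 12 months after the baseline ESS.
counts = {
    m: count_visits_in_window(visits, baseline_visit, baseline_ess, m)
    for m in (0, 6, 12)
}
print(counts)
```

A longer window captures more post-operative visits and so carries more information, but the prediction then becomes available only that much later after the baseline ESS.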
We validated the predictive accuracy by using three classifiers. We chose logistic regression, gradient boosting, and random forest classifiers because they have different properties and have been widely used in prediction of, for example, surgery outcomes39, 40 or persistent asthma41. The logistic regression classifier is linear and thus unable to model possible non-monotonic and non-linear relations between predictors and outcome42. Random forest and gradient boosting classifiers can model complex relations, but they are so-called black box models, i.e., non-interpretable classifiers in which the relations between inputs and output are difficult to understand directly from the parameters or structure of the trained model42. As the predictive accuracy of the variables was similar across the three classifiers in our study, logistic regression was mainly used in validating the variable collection time. Altogether, our findings point out the importance of validating outcome prediction by using different classifiers and evaluating the effect of data collection time, as has also been suggested in previous literature43,44.
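As an illustration of comparing the three classifier families under the same cross-validation scheme, a minimal scikit-learn sketch is given here. It runs on synthetic data rather than the study data, and the hyperparameters, the 5-fold stratified scheme, and the use of AUROC as the metric are assumptions of the sketch, not a description of the study's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for EHR-derived predictors (not study data).
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=4,
    n_clusters_per_class=1,
    weights=[0.8, 0.2],  # imbalanced outcome, as revision surgery is the rarer class
    random_state=0,
)

# Assumed, illustrative settings for each classifier family.
classifiers = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# The same folds are used for every classifier so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auroc = {
    name: cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()
    for name, clf in classifiers.items()
}

for name, score in auroc.items():
    print(f"{name}: mean cross-validated AUROC = {score:.2f}")
```

Similar scores across the linear and tree-based models would suggest that the simpler, more interpretable classifier is adequate for a given task; a clear gap would point to non-linear structure that the linear model misses.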
Our group and others have previously demonstrated that younger age is associated with revision ESS in hospital populations of CRSwNP32 or CRS7 patients. In the present study we found that age affects revision ESS risk in a non-monotonic way. Hence, logistic regression models alone do not seem ideal for studying the effect of the individual patient’s age on revision ESS risk. By performing partial dependence plot analysis, we showed that the revision ESS risk was highest for patients aged 60-70 years, intermediate for those aged 30-60 years or over 70 years, and lowest for those aged 10-30 years. Younger patients have CRSwNP less often, or their CRSwNP often comprises antrochoanal polyps, which have been shown to carry a smaller revision surgery risk1. An increased risk of revision ESS between 60 and 70 years may be related to worsening of CRS and/or comorbidities, such as asthma. Studies have shown that CRS is more frequent in the severe asthma phenotype in the oldest subjects45. In addition, the number of visits before baseline ESS had a non-linear effect on the predictions in our study. Patients with 10-20 visits between the baseline visit and baseline ESS had a smaller risk of revision ESS than patients with fewer than 10 or more than 20 visits. Patients visiting 10-20 times before baseline ESS might have CRSsNP with acute recurrent exacerbations, yet this subgroup warrants confirmation in further studies, as the number of subjects in this study was small. Previous studies have shown that CRSsNP patients with recurrent acute rhinosinusitis episodes benefit from initial ESS1. Previous studies of other conditions and other predictors have shown U-shaped associations between a predictor variable and an outcome, such as intraoperative net fluid balance and early atrial tachyarrhythmia recurrence46, and body mass index and asthma in Japanese children47.
These findings point out the importance of evaluating the linearity of the association to improve personalized prediction.
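To make the idea of a partial dependence curve concrete, a minimal sketch is given here. It computes the curve directly from its definition: the average predicted probability when one predictor is forced to each value on a grid. The data are simulated, and the assumed risk curve peaking around 65 years, the uninformative second predictor, and the helper name `partial_dependence_curve` are all hypothetical and do not reproduce the study's models or results.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(10, 90, n)
noise = rng.normal(size=n)  # uninformative second predictor

# Assumed, purely illustrative non-monotonic risk peaking around 65 years.
risk = 0.1 + 0.6 * np.exp(-(((age - 65.0) / 10.0) ** 2))
y = rng.binomial(1, risk)
X = np.column_stack([age, noise])

model = GradientBoostingClassifier(random_state=0).fit(X, y)


def partial_dependence_curve(model, X, feature, grid):
    """Average predicted probability with column `feature` forced to each grid value."""
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every subject
        curve.append(model.predict_proba(X_mod)[:, 1].mean())
    return np.array(curve)


age_grid = np.arange(10, 91, 5)
pd_age = partial_dependence_curve(model, X, 0, age_grid)
peak_age = age_grid[int(np.argmax(pd_age))]

for a, p in zip(age_grid, pd_age):
    print(f"age {a:2d}: mean predicted risk = {p:.2f}")
```

A logistic regression fitted to the same simulated data with age entered as a single linear term could only produce a monotonic curve, which is why a tree-based model is used in this sketch.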
There is a great need to detect risk factors of severity and to organize personalized patient care. Artificial intelligence has been shown to be effective in EHR-based research on allergy, asthma, and immunology48, for example in predicting eosinophilic esophagitis49 and early childhood asthma persistence41. As far as we know, machine learning models have been used in only a few previous CRS studies, to classify osteomeatal complex inflammation on computed tomography50 and olfactory recovery after ESS51. In surgery research, machine learning models have been used to predict surgical site infections52, postoperative outcome of degenerative cervical myelopathy39, revision surgery after knee replacement53, prolonged opioid prescription after surgery for lumbar disc herniation54, and blood transfusion after adult spinal deformity surgery55.
The strengths of this study include a random sample of hospital patients, a long follow-up time, and the discovery of non-linear associations between certain variables and the outcome. In addition, a novelty is that the models were validated with several classifiers and tested at the individual level.
Limitations include the small number of patients, although this was partly compensated for by the cross-validation methods. In addition, the patients came from only one unit, so the generalizability of the results should be confirmed in a further study with an expanded data set. We acknowledge that we lacked data on some important factors, such as validated symptoms, endoscopic nasal polyp score, medication, Lund-Mackay score of sinus computed tomography scans, eosinophils, and extent of baseline ESS. The inclusion of these variables would most probably have improved the estimates. Our analysis of revision surgery may have been influenced by several factors unrelated to recurrence of CRS, including wait times, operative technique, and the personal preferences of surgeons and patients. Public medical care covers over 90% of our operations56, thus minimizing the possibility of bias due to loss to follow-up, yet we acknowledge that some individual patients with recurrence may have sought treatment elsewhere. Despite these limitations, we found that intelligent data analysis is feasible for obtaining the individual probability of revision ESS and could thus help inform discussions and decision making on advanced therapy, such as biologicals57.