Feature selection and radiomics signature building
The radiomics signature index (RSI) included a total of 13 categories, such as Laa, Va and Trans (see section 2.2 for details). For feature selection and signature building, we employed six machine learning algorithms (Gradient Boosting, Support Vector Machine, AdaBoost, Random Forest, K-Nearest Neighbor, and Neural Network) to build an RSI that could independently predict disease-free survival (DFS) from the phenotypic characteristics of CT and PET images. The nonlinear survival model generated a new feature by predicting survival outcomes with these machine learning algorithms.
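As an illustration only, the six candidate algorithms could be set up as follows with scikit-learn. This is a minimal sketch under stated assumptions, not the implementation used in this study: the feature matrix is synthetic (13 columns, chosen only to mirror the 13 categories), the outcome is treated as a hypothetical binary DFS event label, all hyperparameters are library defaults or arbitrary choices, and `build_candidate_models` is a hypothetical helper name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_candidate_models(seed=0):
    """Return one classifier per algorithm family named in the text.

    Hypothetical helper: settings are illustrative, not those of the study.
    Scale-sensitive models (SVM, KNN, neural network) are wrapped in a
    pipeline that standardizes the features first.
    """
    return {
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        "Support Vector Machine": make_pipeline(
            StandardScaler(), SVC(probability=True, random_state=seed)),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=200,
                                                random_state=seed),
        "K-Nearest Neighbor": make_pipeline(
            StandardScaler(), KNeighborsClassifier()),
        "Neural Network": make_pipeline(
            StandardScaler(), MLPClassifier(max_iter=2000, random_state=seed)),
    }


# Synthetic stand-in for the radiomics features (assumption: 13 columns,
# binary event label); real CT/PET features would replace X and y.
X, y = make_classification(n_samples=200, n_features=13, n_informative=6,
                           random_state=0)

models = build_candidate_models()
candidate_rsi = {}
for name, model in models.items():
    model.fit(X, y)
    # The predicted probability of the event class is used here as a
    # candidate signature index (an assumption about how a "new feature"
    # could be derived from a model's predictions).
    candidate_rsi[name] = model.predict_proba(X)[:, 1]
    print(f"{name}: index range "
          f"{candidate_rsi[name].min():.2f} to {candidate_rsi[name].max():.2f}")
```

Each model yields one score per patient, so the six candidate indices can be compared on equal footing before one algorithm is selected.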
Subsequently, repeated 10-fold cross-validation was used to evaluate the performance of the trained models, and the Random Forest algorithm achieved a higher AUC than the other algorithms (Table 2). The Random Forest model obtained an AUC of 0.8587 (95% CI: 0.8421-0.8753) and a prediction accuracy of 0.8529 (95% CI: 0.7968-0.8985). The importance scores and ranking of the features in the Random Forest model are shown in Fig. 1. The Random Forest algorithm was then used to extract the radiomics signature index (RSI) from the imaging data of each patient.
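The evaluation and extraction steps described above can be sketched as follows. Again this is a hedged example rather than the study's pipeline: the data are synthetic, the folds are stratified, the number of repeats (3) is arbitrary, and the confidence interval is a simple normal approximation over the fold scores. None of these choices is stated in the text, and the printed values will not match Table 2.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for the radiomics features (assumption: 13 columns,
# binary DFS event label).
X, y = make_classification(n_samples=200, n_features=13, n_informative=6,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)

# Repeated 10-fold cross-validation; stratification and 3 repeats are
# assumptions made for this sketch.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
auc_scores = cross_val_score(rf, X, y, cv=cv, scoring="roc_auc")

mean_auc = auc_scores.mean()
# Normal-approximation 95% interval across the fold-level AUC values.
half_width = 1.96 * auc_scores.std(ddof=1) / np.sqrt(len(auc_scores))
print(f"AUC: {mean_auc:.4f} "
      f"(95% CI: {mean_auc - half_width:.4f}-{mean_auc + half_width:.4f})")

# Refit on all data to obtain feature importances and per-patient scores.
rf.fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]  # most important first
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank:2d}. feature_{idx}: {rf.feature_importances_[idx]:.4f}")

# Per-patient index: predicted probability of the event class (assumption).
rsi = rf.predict_proba(X)[:, 1]
```

Scores predicted on the same patients used for fitting are optimistic; out-of-fold predictions (for example via `cross_val_predict`) would give a less biased per-patient index.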
Table 2 Machine learning outcomes of RSI.