Figure 3. Development of the serum metabolomics-based machine learning model for OSCC diagnosis. (A) Several machine learning models were first compared by their diagnostic performance on the test set; SVM was chosen as the optimal one. (B) The number of features was determined with a sequential feature selection strategy; fifteen features were sufficient for the SVM model to achieve its optimal prediction accuracy on the test set. (C) Relative fold changes of these 15 metabolite ions on the test set (OSCC vs. HC). (D) Distribution of the two cohorts (HC and OSCC cases) and the decision boundary given by the SVM, displayed in the feature space constructed from the first two principal components. (E) Classification results on the test set, displayed as a confusion matrix. TPR, true positive rate; PPV, positive predictive value.
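
For readers less familiar with the two metrics reported in panel (E), the following minimal sketch shows how TPR and PPV are derived from the cells of a binary confusion matrix. The function name `tpr_ppv` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def tpr_ppv(tp, fn, fp):
    """Return (TPR, PPV) from binary confusion-matrix counts.

    tp: positive cases predicted as positive
    fn: positive cases predicted as negative
    fp: negative cases predicted as positive
    """
    tpr = tp / (tp + fn)  # true positive rate (sensitivity, recall)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    return tpr, ppv


# Hypothetical counts for illustration only; not the study's results.
tpr, ppv = tpr_ppv(tp=45, fn=5, fp=3)
print(f"TPR = {tpr:.3f}, PPV = {ppv:.3f}")
```

TPR is computed over the actual positive cases (one row or column of the matrix, depending on its orientation), whereas PPV is computed over the predicted positive cases, which is why the two values can differ for the same classifier.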