Discussion
This study has demonstrated several advantages of CPSI-MS/ML for OSCC diagnosis from serum samples. In terms of data collection efficiency, CPSI-MS collects high-dimensional metabolomic data directly from each case on a timescale of seconds. The total analysis of the two cohorts, comprising 819 serum samples, took only 12 hours, which satisfies the practical requirements of clinical screening. CPSI-MS is well suited to rapid, direct metabolomic profiling from a dried spot of a biological fluid such as saliva, serum, or even whole blood. A basic methodological investigation was conducted in this study. A series of serum samples was evenly distributed across the whole test sequence, and the variations of the first two principal components (PC1 and PC2) were then analyzed. The relative standard deviations (RSD values) of PC1 and PC2 fell within acceptable levels, at 18.7% and 31.2%, respectively (Figure S2), meeting the basic requirement of qualitative analysis.34 This result is largely because data acquisition for the whole cohort can be completed in one working day. The short analysis time per case with CPSI-MS allows large-cohort assays to be conducted more efficiently. The number of QC samples introduced for monitoring and normalizing the MS system variation was also reduced. This variation is a critical factor that cannot be ignored, especially in comparison with data acquired on traditional LC-MS or GC-MS systems. With the aid of a pre-trained machine learning model, the high-dimensional metabolome data can be converted into accurate diagnostic information almost instantly, without biased interpretation by practitioners, which enhances its practical value in precision medicine.
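The stability check described above (RSD of the PC1 and PC2 scores of repeatedly measured samples) can be illustrated with a minimal sketch. It is not the authors' processing pipeline: the function names `pca_scores` and `rsd_percent`, the synthetic intensity matrix, and the choice to fit the PCA on study and repeated samples together are all assumptions made for illustration.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Return PCA scores of the rows of X (mean-centered, computed via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def rsd_percent(values):
    """Relative standard deviation in percent: 100 * sample SD / |mean|.

    Note: the RSD becomes unstable when the mean is close to zero, which can
    happen for PCA scores because the data are mean-centered.
    """
    values = np.asarray(values, dtype=float)
    return 100.0 * np.std(values, ddof=1) / abs(np.mean(values))


def demo():
    # Synthetic data only (assumption): 40 study samples and 10 repeated
    # measurements of one reference sample, each with 50 features.
    rng = np.random.default_rng(0)
    study = rng.normal(loc=100.0, scale=20.0, size=(40, 50))
    reference = rng.normal(loc=100.0, scale=20.0, size=50)
    repeats = reference + rng.normal(scale=2.0, size=(10, 50))

    scores = pca_scores(np.vstack([study, repeats]), n_components=2)
    repeat_scores = scores[-10:]  # rows belonging to the repeated sample
    for i in range(2):
        print(f"PC{i + 1} RSD of repeats: {rsd_percent(repeat_scores[:, i]):.1f}%")


if __name__ == "__main__":
    demo()
```

A lower RSD for the repeated measurements indicates less drift across the acquisition sequence; the printed values depend entirely on the synthetic data and carry no relation to the 18.7% and 31.2% reported here.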
From the serum metabolomics reported here and the previous saliva metabolomics study, the OSCC-associated discriminating metabolites were identified for each biofluid. Pathway enrichment analysis revealed which metabolic pathways are affected in serum and in saliva (Table S9). The four representative metabolic pathways discovered in saliva (histidine metabolism, arginine biosynthesis, arginine and proline metabolism, and aminoacyl-tRNA biosynthesis) remained highlighted in serum, although their impact and significance no longer ranked at the top. Instead, lipid-related metabolism became dominant, with the major pathways involving glycerolipid (GL), glycerophospholipid (GPL), and sphingomyelin (SM) (Fig. 5). According to the fold changes of these metabolites (Tables S3 and S4), the changes of many metabolites are less pronounced in serum, although the 57 discriminating metabolites discovered in the saliva study still showed abnormal abundance in serum. This was observed mostly among the metabolites in the histidine, arginine, and proline metabolism pathways, which were the most strongly altered pathways in the saliva of the OSCC group. In contrast, the GL, GPL, and SM molecules became the major discriminating markers in serum (Figure S3).
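The fold-change comparison between biofluids can be made concrete with a small sketch. It assumes the fold change is the ratio of group mean intensities (OSCC over control), expressed on a log2 scale; the source does not state the exact definition used for Tables S3 and S4, and the function name and toy numbers below are hypothetical.

```python
import numpy as np


def log2_fold_change(case, control):
    """Per-metabolite log2 ratio of group mean intensities (case / control).

    Both inputs are arrays of shape (n_samples, n_metabolites) holding
    strictly positive intensities.
    """
    case = np.asarray(case, dtype=float)
    control = np.asarray(control, dtype=float)
    return np.log2(case.mean(axis=0) / control.mean(axis=0))


if __name__ == "__main__":
    # Toy intensities for two metabolites (illustrative values only).
    saliva_fc = log2_fold_change([[8.0, 1.0], [8.0, 1.0]], [[2.0, 4.0], [2.0, 4.0]])
    serum_fc = log2_fold_change([[3.0, 3.0], [3.0, 3.0]], [[2.0, 4.0], [2.0, 4.0]])
    # A smaller |log2 FC| in serum corresponds to a "less pronounced" change.
    print("saliva |log2 FC|:", np.abs(saliva_fc))
    print("serum  |log2 FC|:", np.abs(serum_fc))
```

Comparing the absolute log2 values for the same metabolite across the two biofluids gives a direct, symmetric measure of how much weaker a change is in serum than in saliva.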