Raman spectroscopy is an increasingly popular method for diagnosing cancer. [15] Recently, many studies have shown that Raman spectroscopy can diagnose lung cancer with high accuracy. [16–21] However, most of these studies performed Raman detection on tissue to screen for lung cancer. [18–21] Tissue-based Raman detection is not as convenient as blood testing in a general physical examination. Notably, serum detection could be a more favorable, noninvasive alternative to tissue testing. If lung cancer screening can be carried out through blood testing, early screening could become part of the general physical examination, which is not feasible with tissue testing. [22] In this study, we therefore tested serum samples from lung cancer patients, which are easy to obtain and require no additional modification.
Because Raman spectral data are complex and heterogeneous, machine learning methods are necessary for deep data mining. The support vector machine (SVM) is a supervised learning algorithm for classification that is particularly suitable for small-sample problems and high-dimensional pattern recognition. [23, 24] SVM is an effective classifier because it can be used for both linearly separable and linearly inseparable data sets. [25] Additionally, SVM is among the most frequently applied classification and prediction algorithms and achieves high accuracy in disease risk prediction. [26] Notably, the combination of SVM and Raman spectroscopy has previously been used to distinguish patients with hysteromyoma and cervical cancer from healthy controls. [27] We therefore used an SVM with a nonlinear radial basis function (RBF) kernel.
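As an illustration of the RBF kernel named above, the following is a minimal sketch in Python, not the implementation used in this study. The function names `rbf_kernel` and `gram_matrix`, the value `gamma=0.5`, and the three-point toy "spectra" are all assumptions made for the example; the source does not report kernel parameters.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2).

    gamma is a hypothetical value chosen for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of points")
    # Squared Euclidean distance between the two intensity vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(spectra, gamma=0.5):
    """Pairwise kernel matrix that a kernel SVM would operate on."""
    return [[rbf_kernel(a, b, gamma) for b in spectra] for a in spectra]

# Toy intensity vectors (invented numbers, not measured spectra).
spectra = [
    [0.10, 0.80, 0.30],
    [0.12, 0.79, 0.31],  # close to the first vector
    [0.90, 0.20, 0.60],  # far from the first vector
]
K = gram_matrix(spectra)
```

The kernel value is 1 for identical spectra and decays toward 0 as spectra diverge, which is what lets an SVM draw a nonlinear boundary between classes that are not linearly separable in the raw intensities.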
Our study observed significant differences between the average Raman spectra of lung cancer patients and healthy controls. The classification model for lung cancer patients versus healthy controls showed excellent discrimination, with an AUC of 0.973 and a sensitivity and specificity of 0.917 and 0.922, respectively. Similar conclusions for serum samples analyzed by Raman spectroscopy were reported by Shin et al. [28] and Moisoiu et al. [29], in which the diagnostic sensitivity and specificity for lung cancer were 0.84 (95% CI 0.69–0.93) and 0.85 (95% CI 0.62–0.97), and 0.85 (95% CI 0.68–0.95) and 0.87 (95% CI 0.73–0.96), respectively. Moreover, Lei et al. [30] used surface-enhanced Raman spectroscopy (SERS) combined with principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) to distinguish lung cancer serum from normal serum. In that study, the model's sensitivity reached 100%, whereas its specificity was 83.33%. [30] Compared with these similar studies, our classification model appears to perform favorably. The studies of Ke et al. [31] and Chen et al. [32] suggest that results from tissue samples analyzed by Raman spectroscopy were more reliable than those from serum samples. However, obtaining pathological tissue is challenging in patients with early lung cancer whose tumors are too small and in patients with advanced lung cancer whose disease has already metastasized. Therefore, when a pathology sample is unavailable, serum detection could be a more favorable, noninvasive option.
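For readers less familiar with the metrics reported above, the following minimal Python sketch shows how sensitivity, specificity, and AUC are computed from labels and classifier scores. The labels, scores, and the 0.5 decision threshold are invented for illustration and are not data from this study; the function names are likewise hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = patient)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Invented example: four patients (1) and four controls (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why both kinds of metric are reported when comparing studies.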
Beyond cancer diagnosis, Raman spectroscopy has also been applied in many studies of inflammatory diseases such as dengue fever [33], malaria infection [34], virus infection [35], cryptococcal infection [36], ulcerative colitis [37], and cervicitis [38]. In our study, the majority of patients in the benign lung lesion group were diagnosed with infectious inflammation. Notably, our model showed strong diagnostic ability in distinguishing patients with benign lung lesions from healthy controls. This suggests that spectral characteristics may differ among types of pulmonary infection, which could be a direction for future research.
From a clinical perspective, it is more important to distinguish lung cancer from benign lung lesions, because physicians must decide whether the patient should accept the harm that comes with antitumor treatment such as surgery, chemotherapy, or radiotherapy. In previous studies, the levels of purine metabolites and proteins in the serum of lung cancer patients were significantly increased. [39, 40] The level of purine catabolism is associated with any condition in which cells and DNA are continuously produced and degraded, [41] while the serum protein level is a hallmark of inflammation. [41] Moreover, there are parallels between the inflammatory responses to infection or injury and “inflammation in cancer.” [42] Therefore, serum metabolites in cancers and in inflammatory diseases share a certain similarity. Furthermore, most patients in our lung cancer group did not have a single disease and often had concurrent chronic inflammatory lung disease. These two factors make it harder for our model to separate lung cancer patients from patients with benign lung lesions. This may explain why the model's accuracy, sensitivity, and specificity for this comparison were only 0.85, 0.800, and 0.833, respectively.
The difference between patients with benign lung lesions and lung cancer patients was less pronounced than the difference between healthy controls and either the benign lung lesion group or the lung cancer group. Our results are nonetheless meaningful, given the importance of discriminating between benign and malignant lung diseases. Takamori et al. analyzed salivary metabolites and built a multiple logistic regression (MLR) model to discriminate patients with lung cancer from those with benign lung lesions (AUC = 0.729, 95% CI = 0.598–0.861, p = 0.003) [43]. Compared with this result, our model shows stronger diagnostic ability. Therefore, serum Raman spectroscopy is a strong candidate for screening tests, requiring neither invasive procedures nor complex operations.
This study has several limitations. First, the sample size was small, and well-powered, large-scale multicenter studies are needed to verify our conclusions. Second, we had only 12 early-stage lung cancer samples, so statistical analysis could not be performed to verify the model's value for diagnosing early-stage lung cancer. In short, serum Raman spectroscopy combined with the SVM classification algorithm can be a powerful adjunct to the clinical diagnosis of lung cancer.