Background: The traditional diagnosis of skin lesions relies mainly on dermoscopy and pathological biopsy; the former is subjective, and the latter is invasive and time-consuming. An objective, non-invasive inspection method is needed for the diagnosis of skin cancer, which is the most common malignant tumor. Here, we aimed to rapidly identify skin cancers on ultrathin fresh frozen tissue sections by combining Raman spectroscopy with machine learning.
Methods and material: Twenty-two fresh frozen tissue sections, comprising 3 squamous cell carcinomas, 11 basal cell carcinomas, 2 malignant melanomas, 3 seborrheic keratoses, and 3 melanocytic nevi, were included and underwent Raman detection. To prevent the scattered distribution of the Raman data from impairing the generalization ability of the learning model, a series of adaptive preprocessing algorithms was first applied to standardize the raw Raman data of the five skin lesion types. The processed Raman data were then subjected to visual cluster analysis using principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). Finally, two diagnostic prediction models were built with K-nearest neighbor (KNN) and support vector machine (SVM) classifiers and evaluated on the training set and test set using confusion matrices and receiver operating characteristic (ROC) curves.
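The workflow described above (standardization, PCA and t-SNE embedding, KNN and SVM classification, confusion matrices and ROC analysis) can be sketched as follows. This is only an illustrative sketch, not the study's implementation: the spectra are synthetic, standard normal variate (SNV) scaling stands in for the unspecified adaptive preprocessing algorithms, and the 70/30 split, `n_neighbors=5`, and the RBF kernel are assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, n_points = 5, 60, 200

# Synthetic stand-in for Raman spectra (NOT real data): each class has one
# Gaussian peak at a class-specific position, with random intensity scaling,
# baseline offset, and additive noise.
axis = np.linspace(0.0, 1.0, n_points)
X_parts, y_parts = [], []
for c in range(n_classes):
    centre = 0.15 + 0.175 * c
    peak = np.exp(-((axis - centre) ** 2) / (2 * 0.03 ** 2))
    scale = rng.uniform(0.5, 2.0, size=(n_per_class, 1))
    offset = rng.uniform(0.0, 1.0, size=(n_per_class, 1))
    noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
    X_parts.append(scale * peak + offset + noise)
    y_parts.append(np.full(n_per_class, c))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually.

    Assumed here as a simple substitute for the paper's preprocessing.
    """
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


X_std = snv(X)

# Two-dimensional embeddings for visual cluster inspection.
pca_2d = PCA(n_components=2).fit_transform(X_std)
tsne_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_std)

# Stratified train/test split (the 70/30 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X_std, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", probability=True, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    cm = confusion_matrix(y_test, model.predict(X_test))
    # One-vs-rest ROC AUC, macro-averaged over the five classes.
    auc = roc_auc_score(y_test, model.predict_proba(X_test), multi_class="ovr")
    results[name] = {
        "confusion_matrix": cm,
        "accuracy": cm.trace() / cm.sum(),
        "auc": auc,
    }
    print(name, "accuracy:", results[name]["accuracy"], "AUC:", auc)
```

The arrays `pca_2d` and `tsne_2d` can be passed to any scatter-plot routine, colored by `y`, to inspect how well the classes cluster.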
Results: The mean-variance Raman spectra of the 5 skin lesion types were obtained after standardization, and 4 peak positions with large differences were identified. After dimensionality reduction by PCA and t-SNE, the visual clustering of the Raman data showed varying degrees of intra-cluster homogeneity and inter-cluster dispersion. The test accuracies reached 94.56% and 98.94% for the KNN and SVM classifiers, respectively. The areas under the ROC curves of both classifiers, in both the category dimension and the sample dimension, all exceeded 0.99, which is close to perfect classification.
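One plausible reading of the "category dimension" and "sample dimension" is the macro-averaged and micro-averaged one-vs-rest ROC AUC, respectively; this mapping is an assumption, not something the abstract states. The sketch below shows how the two averages are computed on synthetic class scores, which are invented purely for illustration and do not reproduce the reported values.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(1)
n_classes, n_samples = 5, 500

# Synthetic ground-truth labels (NOT study data).
y_true = rng.integers(0, n_classes, size=n_samples)

# Synthetic classifier scores: random logits with a boost on the true class,
# converted to probabilities with a softmax.
logits = rng.normal(0.0, 1.0, size=(n_samples, n_classes))
logits[np.arange(n_samples), y_true] += 3.0
scores = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# One-hot encode the labels so each class becomes a one-vs-rest problem.
Y_onehot = label_binarize(y_true, classes=np.arange(n_classes))

# Macro: compute one AUC per class, then take the unweighted mean.
macro_auc = roc_auc_score(Y_onehot, scores, average="macro")

# Micro: pool every (sample, class) decision into a single ROC curve.
micro_auc = roc_auc_score(Y_onehot, scores, average="micro")

print("macro AUC:", macro_auc)
print("micro AUC:", micro_auc)
```

With strongly imbalanced classes, as in a cohort dominated by one lesion type, the two averages can diverge, which is one reason to report both.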
Conclusions: Raman spectroscopy is a competitive candidate for the fast and accurate diagnosis of skin lesions, and the molecular information it provides may be used for pathological classification, prediction of immunotherapy responsiveness, and prognostic risk stratification. Furthermore, the combination of Raman spectroscopy and machine learning showed strong diagnostic capability with high accuracy and is a promising tool for the diagnosis of skin lesions.