SVM Classification of Brain Gray Matter Volume Predicts Classic Trigeminal Neuralgia

Background: Previous studies have shown gray matter (GM) abnormalities in the central nervous system at the group level, but this method is limited because it is based on single or cluster voxels. In contrast, machine learning makes full use of all available empirical information, including differences in brain images or behavioral data, to classify or predict data and ensure good generalization ability. This approach has potential use as a prediction tool at the individual level. We thus hypothesized that a multivariate pattern classification method may distinguish classic trigeminal neuralgia (CTN) patients from healthy controls (HC) based on gray matter volume (GMV). Methods: Resting-state fMRI scans in 24 CTN patients and 22 HC were processed to extract whole-brain GMV. Based on this feature dataset, Spearman correlation, T-test, F-score, and principal component analysis (PCA) were applied separately to reduce the feature dimension. A linear support vector machine (SVM) algorithm was then applied to differentiate CTN from HC, and the features that survived each iteration were extracted. Pearson correlation analysis was then used to assess the correlations of the deci_value (the distance from the sample to the optimal hyperplane) with VAS scores and with pain duration in CTN patients. Results: Compared with the other methods of feature dimension reduction, PCA classified individuals more accurately, with an accuracy of 85% (AUC 0.9223; 91.67% sensitivity; 86.36% specificity; p < 0.001). The features that survived each iteration were concentrated in the left anterior cingulate cortex (ACC_L), right superior frontal gyrus (SFG_R), and bilateral cerebellum inferior (CI). In the CTN group, deci_value was positively correlated with the VAS scores under PCA (r = 0.42, P = 0.041, two-tailed), while no significant correlation between deci_value and VAS scores was found for F-score, T-test, or Spearman (r = 0.39, p = 0.06; r = 0.36, p = 0.09; r = 0.37, p = 0.08, respectively). We also did not find significant correlations between deci_value and pain duration.

Group-level analyses are limited because they are based on single or cluster voxels. In contrast, machine learning [5][6][7] makes full use of all available empirical information, including differences in brain images or behavioral data, to classify or predict data and ensure good generalization ability. A support vector machine (SVM)-based algorithm has been successfully used to classify chronic pain using voxel-based gray matter volume (GMV) [10]. We thus hypothesized that a multivariate pattern classification method may distinguish CTN patients from healthy controls (HC) based on GMV. 24 CTN patients (9 males, 15 females; average age 55 ± 13 years; average duration 3.4 ± 4.2 years; average VAS 6.1 ± 1.4) and 22 HC (13 males, 9 females; average age 55 ± 11 years) were included in this study. Inclusion criteria followed the ICHD-3 [13]. Exclusion criteria: I) other types of chronic pain conditions; II) history of other central nervous system diseases or mental illness; III) other somatic or psychiatric conditions; IV) unsuitability for magnetic resonance scanning. Demographics and behavioral results of the CTN and HC groups are listed in Table 1; there was no significant difference between the CTN and HC groups in age, gender, or handedness. The present study was approved by the Medical Research Ethics Committee of The First Affiliated Hospital of Nanchang University. All individuals, including healthy controls, provided signed informed consent to participate in the study. The signal of the resulting normalized and smoothed GM images was extracted and used for training.

Statistical analysis
Demographic and clinical variables of the CTN and HC groups were statistically analyzed using SPSS 26.0 software. Independent t-tests were used to compare the ages of the two groups; gender and handedness were analyzed using the chi-squared test. P < 0.05 was considered to indicate a statistically significant difference.
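As a rough illustration, the demographic comparisons above can be sketched in Python with SciPy (the paper used SPSS 26.0; the gender counts below are taken from the text, while the ages are hypothetical values simulated from the reported group means and SDs):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical ages simulated from the reported means/SDs (55 +/- 13 and 55 +/- 11)
age_ctn = rng.normal(55, 13, 24)
age_hc = rng.normal(55, 11, 22)

# Independent-samples t-test on age
t_stat, p_age = stats.ttest_ind(age_ctn, age_hc)

# Chi-squared test on the reported gender counts (rows: group, columns: male/female)
gender_table = np.array([[9, 15],    # CTN: 9 male, 15 female
                         [13, 9]])   # HC: 13 male, 9 female
chi2, p_gender, dof, _ = stats.chi2_contingency(gender_table)

print(f"age: t={t_stat:.2f}, p={p_age:.3f}")
print(f"gender: chi2={chi2:.2f}, p={p_gender:.3f}")
```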

SVM Classification
Machine learning in neuroimaging data research can be divided into five steps [14]: I) feature extraction; II) feature selection or feature dimension reduction; III) model training and model testing; IV) evaluation of the prediction ability of the model; V) localization of the features that contribute to the prediction.
To discriminate CTN from HC based on GMV, we applied an SVM-based algorithm from the Libsvm software library (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) on the Matlab platform. The flow chart for SVM classification is shown in Fig. 1, which followed the same classification procedure published previously [15]. First, the GMV signal of all 46 participants was extracted as a feature dataset and normalized to (0, 1). Due to the small sample size, the performance of the classifier was tested by leave-one-out cross-validation (LOOCV). The advantage of LOOCV over other cross-validation schemes is that it trains on as much of the data as possible, yielding a more accurate classifier. Thus, within each iteration, we considered one participant as the testing dataset and the remaining participants as the training dataset.
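A minimal sketch of this LOOCV loop, with hypothetical random data standing in for the real GMV features (the study used Libsvm on Matlab; scikit-learn's linear SVC is used here as an analogue):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Hypothetical stand-ins for the real data: 46 subjects x n_voxels GMV features
rng = np.random.default_rng(0)
X = rng.random((46, 500))
y = np.array([1] * 24 + [0] * 22)  # 1 = CTN, 0 = HC

correct = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    # Scale features to [0, 1] using the training fold only, to avoid leakage
    scaler = MinMaxScaler().fit(X[train_idx])
    clf = SVC(kernel="linear").fit(scaler.transform(X[train_idx]), y[train_idx])
    correct += int(clf.predict(scaler.transform(X[test_idx]))[0] == y[test_idx][0])

print(f"LOOCV accuracy: {correct / len(y):.2f}")
```

With random features the accuracy hovers near chance; on real GMV data each held-out subject is classified by a model that never saw them.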
Next, Spearman correlation, T-test, F-score, and PCA were each used for feature selection or feature dimension reduction, to avoid overfitting and discard non-informative features. A) Spearman: the Spearman correlation coefficient measures the strength of the relationship between two variables; for each feature, it was calculated between the GMV values and the group labels. B) T-test: for each feature, a test statistic was calculated between the two groups, and features were compared by the size of this statistic. C) F-score: generally speaking, the larger the F-score of a feature, the greater its value for classification; in practice, the F-score was calculated for all features. For the above three methods, the statistical values were arranged in descending order, and the top 5% were selected as features for training. D) PCA: all features were decomposed into a series of principal components (PCs), and only PCs with a cumulative contribution of more than 90% were retained.
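The four dimension-reduction schemes can be sketched as follows, again on hypothetical random data; the F-score here follows the common between-/within-group variance ratio used in the Libsvm literature, which is an assumption about the exact formula used:

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((46, 500))            # hypothetical GMV features
y = np.array([1] * 24 + [0] * 22)    # 1 = CTN, 0 = HC

# A) Spearman: |correlation| of each feature with the group label
spearman = np.array([abs(stats.spearmanr(X[:, j], y)[0]) for j in range(X.shape[1])])

# B) T-test: |t| statistic between groups for each feature
t_vals = np.abs(stats.ttest_ind(X[y == 1], X[y == 0], axis=0)[0])

# C) F-score: between-group variation over within-group variation
mu, mu1, mu0 = X.mean(0), X[y == 1].mean(0), X[y == 0].mean(0)
f_score = ((mu1 - mu) ** 2 + (mu0 - mu) ** 2) / (
    X[y == 1].var(0, ddof=1) + X[y == 0].var(0, ddof=1))

# For A-C: rank scores in descending order and keep the top 5% of features
k = int(0.05 * X.shape[1])
top_by_t = np.argsort(t_vals)[::-1][:k]

# D) PCA: keep the PCs whose cumulative explained variance exceeds 90%
pca = PCA(n_components=0.90).fit(X)
X_pc = pca.transform(X)
print(top_by_t.shape, X_pc.shape)
```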
The misclassification parameter C of the SVM was optimized using nested cross-validation on the current training dataset; each iteration thus includes finding the optimal model used to classify the test dataset. During model training, the linear SVM assigns a specific weight to each feature to reflect its importance in the classification [16]. The features that survived each iteration were retained, which makes it possible to derive the spatial pattern underlying the classification from the mean weight across iterations for the surviving features. A positive weight indicates that CTN patients have higher GMV than HC in that particular region, while a negative weight indicates the opposite. After 46 iterations, the posterior balanced accuracy was calculated to evaluate the classification performance. Note that the PCA method cannot return a weight map.
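A sketch of the inner search over C and the extraction of linear-SVM feature weights, on hypothetical data; in the full nested procedure the grid search would run inside each LOOCV training fold rather than once on the whole set:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((46, 50))             # hypothetical reduced feature set
y = np.array([1] * 24 + [0] * 22)    # 1 = CTN, 0 = HC

# Inner cross-validated grid search over the misclassification parameter C
grid = GridSearchCV(SVC(kernel="linear"), {"C": [0.01, 0.1, 1, 10, 100]}, cv=5)
grid.fit(X, y)
best = grid.best_estimator_

# One weight per feature; the sign indicates which group has higher GMV there
weights = best.coef_.ravel()
print("best C:", grid.best_params_["C"], "n_weights:", weights.size)
```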
In addition, to examine the degree to which the classification was driven by CTN symptoms rather than by other confounds unrelated to CTN, we correlated the deci_value for each subject with the VAS scores and with pain duration, respectively. The method was similar to previous studies [17,18].
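A sketch of this check on hypothetical data: deci_value is taken from the SVM decision function for the CTN subjects and correlated with simulated VAS scores (both the features and the VAS values below are stand-ins, not the study data):

```python
import numpy as np
from scipy import stats
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((46, 50))                 # hypothetical GMV features
y = np.array([1] * 24 + [0] * 22)        # 1 = CTN, 0 = HC
vas = rng.uniform(3, 9, 24)              # hypothetical VAS scores for CTN patients

clf = SVC(kernel="linear").fit(X, y)
# deci_value: signed distance from each CTN sample to the separating hyperplane
deci = clf.decision_function(X[y == 1])

r, p = stats.pearsonr(deci, vas)
print(f"r={r:.2f}, p={p:.3f}")
```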
Finally, the reliability of the model was evaluated by a permutation test, in which the sample labels were randomly shuffled and the above steps repeated 1000 times. If the classifier has not acquired the relationship between the sample data and the labels, the frequency distribution of the average classification accuracy in the permutation test should follow a normal distribution with a mean of 50%. If the average classification accuracy based on the real labels falls outside the 95% confidence interval based on the random labels, the SVM classifier is considered to have learned reliably from the training data.
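The permutation test can be sketched with scikit-learn's permutation_test_score, which refits the classifier on shuffled labels and compares the true LOOCV accuracy against the resulting null distribution (hypothetical data; 100 permutations here for speed, versus 1000 in the study):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, permutation_test_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((46, 50))             # hypothetical reduced feature set
y = np.array([1] * 24 + [0] * 22)    # 1 = CTN, 0 = HC

# score: LOOCV accuracy with the real labels
# perm_scores: accuracies under randomly shuffled labels (the null distribution)
score, perm_scores, p_value = permutation_test_score(
    SVC(kernel="linear"), X, y, cv=LeaveOneOut(),
    n_permutations=100, random_state=0)

print(f"accuracy={score:.2f}, permutation p={p_value:.3f}")
```

With random features the true accuracy sits inside the null distribution; a real effect pushes it outside the 95% interval, as described above.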

Results
Overall classifier performance of different feature selection methods

Table 2 shows the results of the different feature selection methods for the SVM classification between 24 CTN and 22 HC based on GMV, including the overall accuracy, sensitivity, specificity, AUC, and the P_value after permutation, with Receiver Operating Characteristic (ROC) curves shown in Fig. 2A. PCA had the highest ability to correctly classify an individual as a CTN patient or HC, with 91.67% sensitivity, 86.36% specificity, and 0.9223 area under the curve (AUC), P < 0.001. Thus, 39/46 subjects were correctly classified using PCA; specifically, 19/24 patients and 20/22 HC were correctly classified (Fig. 2B). The model used the top 18 PCs to explain more than 90% of the original features.

Relationship between deci_value and VAS scores and pain duration
The correlation of the deci_value (the distance from the sample to the classification plane) with VAS scores and with pain duration was calculated for the CTN group using SPSS 26.0 software. In the CTN group, deci_value was positively correlated with the VAS scores under PCA (r = 0.42, P = 0.041, two-tailed) (Fig. 3), while no significant correlation between deci_value and VAS scores was found for F-score, T-test, or Spearman (r = 0.39, p = 0.06; r = 0.36, p = 0.09; r = 0.37, p = 0.08, respectively). We also did not find significant correlations between the pain duration of CTN patients and deci_value for any of the four methods of feature dimension reduction (P > 0.05) (Table 3).

Figure 4 shows that the features that survived each iteration were concentrated in the anterior cingulate cortex (ACC), superior frontal gyrus (SFG), and cerebellum inferior (CI). For F-score, ACC_L, SFG_R, and bilateral CI survived, while ACC_L, SFG_R, and CI_L survived for Spearman and T-test, which also returned a mean weight for these features to reflect their importance in the classification (Table 4). These regions are consistent with previous group-level findings in TN [20][21][22]. Moreover, the contribution of these features to prediction is known from the feature weights derived from the linear model; the absolute value of a feature weight quantifies its contribution to classification or prediction [26].
The anterior cingulate cortex (ACC) is an important part of the limbic system, and also of the pain matrix and the salience network. It has a wide range of fiber connections with many other regions of the brain and is an important hub of the pain pathway. Analyses of pain-related neuroimaging studies have found that brain areas such as the ACC and frontal cortex are continuously activated by sustained pain stimulation, and are considered to play an important role in the sensory discrimination, cognition, and emotion of pain [27]. Therefore, the ACC is considered to be the most important brain area for the development of chronic pain [28]. This indicates that GMV in the ACC can objectively reflect the degree of chronic pain to some extent [29].
The frontal lobe is a higher-order information-processing area: it does not directly receive projections from the spinothalamic tract but can integrate primary pain information into pain perception. Most chronic pain can affect the activity of these brain regions and even the structure of their gray matter, and can affect some higher brain functions [30,31].
The effects of pain on the cerebellum mainly include pain regulation, emotional processing, and sensorimotor processing [30,31]. In addition, the cerebellum receives fiber projections from the spinal cord and trigeminal nerve, and long-term intense pain stimulation can lead to structural changes in its local gray matter [33]. To sum up, in patients with CTN there are significant abnormalities in GMV in the ACC and CI, which are involved in pain transmission, as well as in the SFG, which is involved in pain perception and integration. The interaction between these brain structures and chronic pain in TN may be one of the important mechanisms for the occurrence and development of TN.

Discussion
Application of the SVM algorithm

SVM is a kind of generalized linear classifier that classifies data according to a supervised learning method; its decision value (deci_value) is derived by solving for the maximum-margin hyperplane of the learning samples [14].
Compared with other algorithms, the advantages of the SVM algorithm [14,19] are threefold: it can be used for both classification and regression analysis; because it seeks the maximum-margin separating hyperplane, it has a certain tolerance and does not overfit easily; and it is very suitable for data with a small sample size but a high feature dimension per sample, which is typical of fMRI data.

SVM classification
This study focused on the analysis of gray matter imaging data in CTN patients to determine whether the neuroimaging-derived structural patterns are sufficiently robust to distinguish CTN from HC at an individual level using an SVM-based algorithm. Our model applying the PCA method successfully identified CTN versus HC with an accuracy of 85%, 91.67% sensitivity, and 86.36% specificity, and used the top 18 PCs to explain more than 90% of the original features in this differentiation. This is an important finding since, to date, the neuroimaging of TN pain has focused primarily on group distinctions [20][21][22] and not on individual-level characterization.

Limitations
A limitation of this study is its relatively small sample size. Nonetheless, our successful application of GMV to the classification of CTN and HC demonstrated its feasibility as a feature set for the prediction of CTN. Our study focused on GMV, yet other indicators are known to be abnormal at the group level, such as regional homogeneity (ReHo) [34] and fractional anisotropy (FA) [35,36]. Future work could combine this information with additional types of neuroimaging data to examine whether this leads to higher diagnostic accuracy and a more detailed and comprehensive pattern.

Conclusions
In summary, the present study revealed spatially distributed subtle differential patterns of GM abnormalities in CTN patients, and indicated that these abnormalities allow accurate discrimination between CTN patients and HC at an individual level. This study highlights not only the high accuracy of the PCA method but also the role of the ACC, SFG, and CI in classification.

Figure 2B
The distance from the sample to the optimal hyperplane (deci_value = 0) for individual CTN participants and individual HC. SVM prediction: a positive distance is classified as CTN, while a negative distance is classified as HC.

Figure 3
Correlation between VAS scores and the deci_value in CTN patients with the PCA method.

Figure 4
Features that survived every LOOCV iteration. The blue region indicates that CTN patients have higher GMV than HC in that particular region, while the red region indicates the opposite.