This study showed that AI-assisted quantitative muscle ultrasound analysis distinguished hands with CTS from hands without CTS more accurately than conventional quantitative analysis. Although conventional quantitative analysis also showed a higher EI ratio in the affected hand than in the unaffected hand, its AUC, sensitivity, and specificity were not high enough. In contrast, the AUC improved significantly when ML was applied.
Among the methods for evaluating muscle EI, quantitative analysis is the most sensitive: its detection rate for pediatric skeletal muscle disease is > 90% (12, 13), and its sensitivity is approximately 75% even for diseases that cause less structural muscle abnormality (1). However, in previous studies that applied this method to CTS diagnosis, Ozsoy-Unubol et al. reported a sensitivity of 71.4% and a specificity of 59.4% when using the EI ratio (3). In another study, by Kim et al., the EI ratio had a relatively low AUC (0.66), indicating that its discriminative power was lower for CTS than for muscle disease (2). This is because the change in muscle EI is more pronounced in myogenic disease than in neurogenic disease; the study by Sogawa et al. likewise showed a higher AUC in the myogenic group than in the neurogenic group when each was compared with normal subjects (4).
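For reference, the three metrics quoted above can be computed from a single score such as the EI ratio, as in the minimal sketch below. The EI ratios, the threshold of 1.05, and the function names are hypothetical and chosen only for illustration; they are not data or cut-offs from any of the cited studies.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical EI ratios (not study data): 1 = hand with CTS, 0 = control hand.
ei_ratio = [1.30, 1.10, 0.95, 1.20, 1.00, 0.90, 1.15, 0.85]
has_cts = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(ei_ratio, has_cts, threshold=1.05)
print(sens, spec, auc(ei_ratio, has_cts))  # 0.75 0.75 0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes discrimination over all thresholds, which is why it is the main figure of comparison in this discussion.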
Recently, several studies have examined the use of AI to enhance diagnostic performance. First, Sogawa et al. performed texture analysis on muscle ultrasound images from 67 patients: 25 in the neurogenic group, 21 in the myogenic group, and 21 in the healthy group. They performed binary classification between each pair of groups using five ML-based classifiers (linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbors, support vector machine, and random forest) and reported a classification accuracy of > 70% between each pair of groups and > 90% between the myogenic and neurogenic groups; however, the neurogenic group in that study included patients with generalized neuropathy, not focal neuropathy such as CTS. Among studies applying ML to CTS diagnosis, Sayin et al. applied ML algorithms to 109 patients with CTS and 42 healthy individuals, achieving 91% CTS detection accuracy (14). That study used electrophysiological findings as a variable, which has the disadvantage of causing patient discomfort owing to the invasiveness of the procedure. Park et al. applied ML classifiers to 1037 hands with CTS for categorization according to severity, i.e., mild, moderate, and severe grades (15). Because demographic factors and ultrasound parameters, such as cross-sectional area and palmar bowing, were analyzed as variables, collecting the information for analysis required a considerable amount of time. In contrast, in our study, several features could be extracted and analyzed from muscle ultrasound images that could be obtained within 1 min. Therefore, our method is much simpler and is not significantly affected by the operator’s skill.
In our study, we used four ML classifiers: random forest classifier, AdaBoost, linear SVC, and XGB. The classifiers achieved AUC scores of 0.83, 0.80, 0.86, and 0.81, respectively, which are 0.04 to 0.10 higher than the scores obtained with conventional quantitative analysis. Furthermore, by using RFE, our study not only identified important features for CTS diagnosis among the radiological features of the thenar and hypothenar images but also increased the AUC scores of the classifiers by reducing overfitting. Of the 176 input features, 15 pairs were used, each in the thenar and hypothenar muscles, and the AUC score rose to as high as 0.89 with linear SVC.
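The elimination loop that RFE performs can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: a small gradient-descent logistic regression stands in for linear SVC, the data are synthetic (two informative columns and four noise columns), and `fit_logistic`, `rfe`, and all parameter values are hypothetical.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent.
    Returns one weight per feature (the bias is fitted but not returned)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit on the surviving features and
    drop the one with the smallest absolute weight until n_keep remain."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        w = fit_logistic([[row[j] for j in kept] for row in X], y)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        kept.pop(weakest)
    return kept


# Synthetic data (assumption): columns 0 and 1 carry the class signal,
# columns 2 to 5 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = []
for label in y:
    shift = 1.0 if label == 1 else -1.0
    informative = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
    X.append(informative + noise)

selected = rfe(X, y, n_keep=2)
print(selected)
```

In this toy setting the loop should retain the two informative columns. Discarding uninformative inputs in this way reduces the number of parameters the classifier can fit to noise, which is the mechanism by which feature elimination can reduce overfitting.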
The features selected through RFE were rMAD, IQR, and small-area emphasis. rMAD and IQR suppress outliers and are related to robust statistics. Because these features were commonly selected as important discriminative features, the impact of outliers on classification performance was deemed significant. Indeed, a substantial share of the outliers in a muscle ultrasound image consists of the hyperechoic fibroadipose septa corresponding to the perimysium. Because the perimysium is not muscle fiber, its signal intensity does not change with denervation. Therefore, if this region is included in the analysis, the increase in muscle signal intensity due to neurogenic disease may be diluted; this is why rMAD and IQR, which exclude outliers, help improve discrimination performance.
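The robustness argument can be illustrated with a small numerical sketch. It assumes the definitions commonly used in radiomics toolkits (IQR as the 75th minus the 25th percentile, and rMAD as the mean absolute deviation of the intensities lying between the 10th and 90th percentiles); the pixel values are hypothetical and the function names are illustrative only.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return percentile(values, 75) - percentile(values, 25)


def mad(values):
    """Ordinary mean absolute deviation over all values."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def rmad(values):
    """Robust MAD (assumed definition): mean absolute deviation restricted
    to values between the 10th and 90th percentiles."""
    p10, p90 = percentile(values, 10), percentile(values, 90)
    core = [v for v in values if p10 <= v <= p90]
    return mad(core)


# Hypothetical gray levels: hypoechoic muscle fibers, then the same ROI
# with two bright pixels standing in for hyperechoic septa.
muscle = [40, 42, 44, 46, 48, 50, 52, 54, 56, 58]
with_septa = muscle + [200, 210]

for name, roi in (("muscle only", muscle), ("with septa", with_septa)):
    print(name, round(mad(roi), 2), rmad(roi), iqr(roi))
# muscle only 5.0 4.0 9.0
# with septa 43.33 4.0 11.0
```

The two bright pixels inflate the ordinary mean absolute deviation roughly ninefold, whereas rMAD is unchanged and IQR shifts only slightly, so both continue to describe the muscle fibers rather than the septa.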
The thenar region of the hands with CTS appeared to have more grayish substances with varying levels of intensity and more speckle-like structures in fine patterns than that of the control hands. This implies that images of the thenar region of hands with CTS tend to have a greater small-area emphasis. In normal muscle, all muscle fibers, unlike the perimysium, are homogeneously hypoechoic. Because partial denervation occurs as neurogenic disease progresses, normal hypoechoic regions and denervated hyperechoic regions become intermixed, and the size of the regions with similar signal intensity tends to decrease. By analogy, this resembles the difference between gravel and sand grains. Accordingly, the small-area emphasis was deemed to be relatively higher in the hands with CTS.
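The gravel-versus-sand analogy can be made concrete with a minimal sketch of small-area emphasis computed from gray-level zones. It assumes the usual size-zone definition, SAE = (1/N_z) Σ 1/s², where the sum runs over zones, s is the zone size in pixels, N_z is the number of zones, and a zone is a connected region of equal gray level (8-connectivity is assumed here). The two 4 × 4 images are hypothetical.

```python
def zone_sizes(image):
    """Sizes of connected regions that share one gray level (8-connectivity)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level, stack, size = image[r][c], [(r, c)], 0
            seen[r][c] = True
            while stack:  # flood fill of one zone
                y, x = stack.pop()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and image[ny][nx] == level):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            sizes.append(size)
    return sizes


def small_area_emphasis(image):
    """Mean of 1 / size**2 over all zones; larger when zones are small."""
    sizes = zone_sizes(image)
    return sum(1.0 / (s * s) for s in sizes) / len(sizes)


# "Gravel": two large homogeneous zones of 8 pixels each.
coarse = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]

# "Sand": no pixel shares its gray level with any neighbor,
# so every pixel is its own zone.
fine = [[0, 1, 0, 1],
        [2, 3, 2, 3],
        [0, 1, 0, 1],
        [2, 3, 2, 3]]

print(small_area_emphasis(coarse), small_area_emphasis(fine))  # 0.015625 1.0
```

As regions of similar intensity fragment, the zone sizes shrink and the feature rises toward 1, which matches the direction of change described for the hands with CTS.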
Our study had some limitations. First, muscle ultrasound texture varies with age and sex, but a subgroup analysis was not performed because of the insufficient number of subjects. Second, because the data were insufficient for a finer classification into four severity grades, i.e., normal, mild, moderate, and severe, this study was limited to binary classification of control hands and hands with CTS. To overcome these limitations, in a follow-up study, we plan to construct a dataset with a larger number of patients and an even distribution of demographic factors and severity. Finally, because our method used texture features of the thenar and hypothenar regions, the ROIs in the images had to be manually annotated by clinicians. In a future study, we plan to use a deep neural network (DNN), which extracts features automatically and does not need ROI annotation. The DNN-based model is also expected to achieve higher accuracy.
In conclusion, this is the first study to use AI-assisted quantitative analysis of muscle ultrasonographic findings in CTS. We propose an ML-based classification using muscle texture features on ultrasound images. We applied RFE to our models to improve CTS classification accuracy and confirmed that the commonly selected features were clinically significant. Among the ML models, linear SVC performed best; with RFE applied, it achieved an AUC of 0.89, an improvement of 0.13 over conventional quantitative analysis. Therefore, the proposed method could serve physicians as a useful tool to assist in CTS diagnosis and to understand the echo patterns observed in the ultrasonography of patients with CTS.