We investigated whether a DL system could evaluate EEM in patients with GO. The system classified EEM and NEM with high AUC, sensitivity, and specificity, indicating that it distinguished orbital CT images of participants with EEM from those of participants with NEM with nearly the same level of accuracy as that of doctors.
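For readers less familiar with these metrics, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for a binary EEM/NEM classifier. It is not the evaluation code used in this study. The labels, scores, the 0.5 decision threshold, and the function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy labels and scores, not data from this study.
# Label 1 = EEM (positive class), label 0 = NEM (negative class).

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs for six images.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
THRESHOLD = 0.5  # assumed decision threshold
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc_value:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking performance across all thresholds, which is why the three values are reported together.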
Our study used an extraocular muscle thickness of 4 mm as the cutoff for abnormality. This cutoff value was based on Dutton’s previous report of NEM thickness. However, Ozgen et al. reported that the mean maximum diameters of the extraocular muscles measured using conventional CT were MR 4.2 (range 3.3–5.0) mm, LR 3.3 (range 1.7–4.8) mm, SR 4.6 (range 3.2–6.1) mm, and IR 4.8 (range 3.2–6.5) mm. With conventional CT, the chin-up posture of participants during coronal section imaging varies between individuals, which may increase the variability of extraocular muscle thickness measurements. In contrast, we used spiral CT, in which images are reconstructed from horizontal cross-sectional images that are captured at the same angle because participants maintain a constant posture during imaging. Consequently, our control group showed less variation in extraocular muscle thickness than that reported by Ozgen et al., and we considered our results consistent with Dutton’s, with an average thickness of less than 4 mm for each extraocular muscle.
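The cutoff described above can be expressed as a simple labeling rule. The sketch below is illustrative and rests on two assumptions that the text does not state: that a thickness equal to the cutoff counts as abnormal, and that an orbit is labeled EEM when at least one rectus muscle reaches the cutoff. The function names and the measurement values are hypothetical.

```python
# Illustrative sketch only: hypothetical measurements, not data from this study.

CUTOFF_MM = 4.0  # cutoff from the text; treating exactly 4.0 mm as abnormal is an assumption

def enlarged_muscles(thickness_mm, cutoff_mm=CUTOFF_MM):
    """Return the names of muscles whose thickness is at or above the cutoff."""
    return sorted(name for name, t in thickness_mm.items() if t >= cutoff_mm)

def label_orbit(thickness_mm, cutoff_mm=CUTOFF_MM):
    """Label an orbit 'EEM' if any muscle is enlarged, else 'NEM' (assumed rule)."""
    return "EEM" if enlarged_muscles(thickness_mm, cutoff_mm) else "NEM"

# Hypothetical thickness values (mm) for the four rectus muscles of one orbit.
example = {"MR": 3.6, "LR": 3.1, "SR": 3.8, "IR": 4.7}

print(label_orbit(example), enlarged_muscles(example))
```

Applying the same rule to the mean values reported by Ozgen et al. (MR 4.2, SR 4.6, and IR 4.8 mm) would flag average healthy muscles as enlarged, which illustrates why the measurement protocol matters when a fixed cutoff is adopted.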
A nationwide survey of patients with GO in the United Kingdom revealed delays in diagnosis, wide variability in access to specialist centers and appropriate treatment, and low overall patient satisfaction with treatment. The same study revealed that only 25% of patients had been referred to a specialist GO clinic and that referrals were typically late. In several studies using general health-related quality-of-life questionnaires, patients with GO scored lower than the healthy reference population.[16, 17] Gerding et al. reported that quality-of-life scores among patients with GO were worse than those among patients with diabetes, emphysema, or heart failure. In approximately 70% of adults with Graves’ hyperthyroidism, magnetic resonance imaging or CT scanning reveals EEM. Physicians thus need to monitor patients with Graves’ hyperthyroidism for ocular signs on visual inspection, including lid edema, lid retraction, and proptosis, and for EEM on orbital imaging. We consider that early detection and treatment of thyroid myopathy may become possible if a DL software system that evaluates EEM in GO plays a supporting role in actual clinical practice.
The modified clinical activity score (CAS) is currently the most widely used index to determine the active phase of inflammation in GO. However, a recent study of GO indicated that the CAS may not reflect the inflammatory activity of myopathy, especially in mild to moderate GO with low NOSPECS scores (no sign of thyroid disease, only eyelid signs, soft tissue involvement, proptosis, extraocular motility restriction, corneal involvement, and sight loss). This system classifies the clinical severity of GO with low exophthalmos values.[20, 21] Nagy et al. reported that EEM does not imply the presence of edematous swelling and that the severity of diplopia is unrelated to the degree of ocular congestion and edema. Kim et al. reported that 44.4% of patients with GO and progressive diplopia had low CAS values and no typical symptoms of inflammation. These findings may have arisen because the CAS primarily reflects ocular surface involvement and acute orbital congestion, which represent inflammatory changes within orbital connective and adipose tissues rather than within the muscles themselves. Ophthalmologists thus must detect EEM early in the course of GO.
In our heat maps showing where the DL system focused, color intensity was higher around the rectus muscles on the orbital CT images. The areas of the orbital CT images on which the DL system focused were consistent with those on which ophthalmologists focus when they use CT images to confirm EEM. In other words, the generated heat maps suggest that DL systems can accurately detect EEM associated with GO on orbital CT images. Our DL software system may be helpful in the ophthalmological assessment of patients with GO.
Our study had several limitations. First, it was conducted within a single facility, and the model’s robustness must be evaluated prospectively with data from multiple facilities. Second, to limit radiation exposure to the participants, we used CT images with a slice thickness of 2 mm; images with a finer slice thickness may improve accuracy. Third, the judgment of EEM was based on measurements of muscle thickness on two-dimensional CT images. Volumetric measurement of the muscles must be evaluated on three-dimensional CT or magnetic resonance images. Finally, the performance and versatility of DL should be evaluated extensively with larger samples and more images.
In conclusion, our results indicate that our DL system, applied to orbital coronal CT images, detected EEM in GO with high accuracy. Screening orbital coronal CT images with DL systems may yield useful information for the early treatment of patients with GO and EEM.