The accurate diagnosis of small melanocytic choroidal tumors is challenging because benign choroidal nevi and small malignant UM share many clinical characteristics. These patients benefit from careful evaluation by an ocular oncologist experienced in managing intraocular tumors. Current practice uses clinical examination and multimodal imaging to predict malignant transformation and thereby guide the diagnosis and management of these tumors. Our study provides proof of concept that ML can identify risk factors for malignant transformation at the time of initial presentation.
Clinical features associated with the risk of malignancy have been well established, including the presence of orange pigment and subretinal fluid.34, 35 Shields et al. conducted a study to identify risk factors for malignant transformation of choroidal nevi, comprising the largest retrospective case series at the time.11 These risk factors included tumor thickness greater than 2 mm on ultrasonography, subretinal fluid, patient symptoms, orange pigment, and tumor margin within 3 mm of the optic disc.11 In 2009, Shields et al. expanded their case series and identified additional risk factors, namely ultrasound hollowness and the absence of a halo or drusen overlying the lesion.13 These factors were combined to form the well-known mnemonic “To Find Small Ocular Melanoma Using Helpful Hints Daily” (TFSOM-UHHD).
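As an illustration of how such a checklist could be encoded, the following minimal sketch lists which of the risk factors named above are present for a single lesion. It is not the published scoring procedure: the `NevusFindings` record, its field names, and the treatment of "within 3 mm" as an inclusive threshold are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NevusFindings:
    """Hypothetical record of the clinical findings named in the text."""
    thickness_mm: float        # tumor thickness on ultrasonography
    subretinal_fluid: bool
    symptoms: bool
    orange_pigment: bool
    margin_to_disc_mm: float   # distance from tumor margin to optic disc
    ultrasound_hollow: bool
    halo_present: bool
    drusen_present: bool


def risk_factors_present(f: NevusFindings) -> List[str]:
    """Return the names of the listed risk factors that are present."""
    checks = {
        "thickness > 2 mm": f.thickness_mm > 2.0,
        "subretinal fluid": f.subretinal_fluid,
        "symptoms": f.symptoms,
        "orange pigment": f.orange_pigment,
        # Assumption: "within 3 mm" is treated as <= 3.0 mm.
        "margin within 3 mm of disc": f.margin_to_disc_mm <= 3.0,
        "ultrasound hollowness": f.ultrasound_hollow,
        "halo absent": not f.halo_present,
        "drusen absent": not f.drusen_present,
    }
    return [name for name, present in checks.items() if present]
```

A count of the returned factors gives a simple tally per lesion. How that tally maps to a risk of transformation is reported in the cited studies and is not reproduced here.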
While the original TFSOM system provided an evidence-based method for predicting malignant transformation of melanocytic choroidal tumors, Shields et al. further extended the system in 2019 with the development of the “To Find Small Ocular Melanoma Doing IMaging” (TFSOM-DIM) criteria.14 TFSOM-DIM incorporates multimodal imaging techniques in the identification of risk factors, including subretinal fluid on OCT, orange pigment on autofluorescence, and a basal diameter greater than 5 mm on fundus photography.14 These additional imaging techniques provided a more nuanced approach to identifying UM36 and have been evaluated in subsequent studies. Geiger et al. used the TFSOM-DIM criteria to grade multimodal imaging by retrospective chart review, revealing significant differences in the range of risk scores between UM and choroidal nevi.37
Other groups have independently identified risk factors for malignant transformation in melanocytic choroidal tumors. The Collaborative Ocular Melanoma Study (COMS) analyzed small choroidal lesions and found that thickness greater than 2 mm, basal diameter greater than 12 mm, presence of orange pigment, and absence of drusen and RPE changes were predictive of tumor growth.38 Roelofs et al. developed the MOLES tumor categorization system, which scores choroidal lesions on five features: Mushroom shape, Orange pigment, Large size, Enlarging tumor, and Subretinal fluid.39 Their study found these criteria to have a sensitivity of 99.8% in identifying melanocytic choroidal tumors at risk for malignant transformation.39 These scoring systems show that the distinction between choroidal nevi and small UM can be made more objectively. Although these systems are promising for predicting malignancy, their risk factors have traditionally been identified through careful ophthalmic examination and image interpretation, which is subject to inter-observer variability.40 The application of ML algorithms, such as the one used in our study, has the potential to provide a more accurate and efficient system to improve patient prognosis.
The use of ML as a tool for evaluating retinal lesions has gained interest in recent years. ML involves training algorithms on existing data sets so that they can make predictions on future data.41 Although ML has been shown to be useful in the early detection of diabetic retinopathy (DR),42, 43 its potential for predicting malignant transformation of melanocytic choroidal tumors has not yet been extensively explored.
In 2014, Roychowdhury et al. developed a novel, fully automated DR detection and grading system for automated screening and treatment prioritization, achieving a sensitivity of 100%, specificity of 53.26%, and an AUC of 0.904.42 In a study by Lam et al. (2018), the use of ML in DR was augmented by developing a CNN to recognize and distinguish between mild and multi-class DR on color fundus images with enhanced recognition of subtle characteristics.43
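For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity follow from the counts of a binary confusion matrix, and why a screening system can reach 100% sensitivity while retaining only moderate specificity. The function name and the counts in the example are illustrative assumptions and are not data from the cited studies.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Compute sensitivity and specificity from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of diseased cases detected.
    specificity = TN / (TN + FP): fraction of healthy cases correctly cleared.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up counts: every diseased case is flagged (no false negatives),
# but 40 of 100 healthy cases are also flagged (false positives).
sens, spec = sensitivity_specificity(tp=40, fn=0, tn=60, fp=40)
# sens is 1.0 and spec is 0.6
```

A classifier tuned this way misses no disease at the cost of more false referrals, which is a common trade-off for screening tools.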
Supervised ML techniques have shown promise in classifying retinal disease type and stage.44 In the context of UM, wide-field digital true color fundus cameras can capture a choroidal nevus and its associated features in a single photo, potentially making data labeling and ML training faster and more efficient.30 This understanding suggests that ML may function as a valuable tool to assess small tumors and facilitate the prediction of malignant transformation.
Early detection and treatment of UM are crucial because metastasis may occur early and effective treatment can prevent spread.45, 46 Despite the availability of effective treatments for the primary tumor, more than 50% of UM patients develop metastatic disease, suggesting that UM may metastasize prior to the time of treatment.45, 47 Consequently, there is a need to identify and treat small UM to minimize the number of melanocytic choroidal tumors that are observed and subsequently grow during the observation period. The importance of treating small UM was additionally emphasized by Eskelin et al., who measured doubling time in both untreated and treated metastatic UM and proposed that most metastases begin up to 5 years prior to primary tumor treatment.48 Murray et al. retrospectively evaluated a case series of small UM undergoing early fine-needle aspiration biopsy combined with pars plana vitrectomy and endolaser ablation.47 The study found that no patients developed metastasis in the follow-up period, suggesting that early treatment may lower the risk of mortality compared to observation alone.
Several studies have evaluated non-imaging biomarkers to better diagnose UM and predict prognosis, including some that employ ML techniques.49–51 Serum biomarkers, including several differentially expressed proteins identified in UM gene signatures, have been associated with a worse prognosis in patients diagnosed with UM.49, 52, 53 Furthermore, circulating tumor cells have been detected in patients without clinically detectable metastasis, indicating early spread and highlighting the need to identify prognostic biomarkers.54, 55 The search for these biomarkers underscores the importance of early detection of UM, in which ML may aid both diagnosis and management. Small UM are a particularly important research subject because of the diagnostic challenge they pose and the potential for early local treatment to preserve vision and save lives.56
Our study has important limitations, including a small sample size of patients from a single institution, which restricts the generalizability of our findings. Furthermore, the small sample size likely limited the performance of the ML models. The development of high-performing ML models to characterize choroidal lesions would benefit from multi-institutional collaborations and potentially from data augmentation techniques that artificially increase the sample size from existing data. In addition, technical limitations such as poor image quality or suboptimal feature extraction can limit the accuracy of ML models. In our study, the Grad-CAM images sometimes focused on small regions of the lesions themselves rather than on the adjacent subretinal fluid or the whole image during the classification task (Fig. 3). To address this, segmentation of task-specific regions, such as a lesion mask (Fig. 4) for orange pigment and drusen or the lesion plus the surrounding retina for subretinal fluid, could improve model performance. Our current classification models for margin, orange pigment, and drusen have relatively low average AUC and high deviation, suggesting the need for further refinement (Fig. 1). For the margin prediction model, segmentation of the optic nerve and lesion prior to training could be useful. For the orange pigment and drusen prediction models, the small size of these features may require alternative approaches such as cropping images into smaller tiles for classification, which could retain a higher resolution. Given these limitations, expert knowledge is crucial to guide the development and use of these models to ensure that they are based on clinically relevant features and accurately reflect the underlying biology of the disease.
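The tiling approach mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in our study: the function name `tile_boxes`, the tile size, and the stride are hypothetical, and tiles that would extend past the image border are simply dropped.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right) in pixels


def tile_boxes(height: int, width: int, tile: int, stride: int) -> List[Box]:
    """Enumerate square crop boxes that fit entirely inside the image.

    A stride equal to the tile size gives non-overlapping tiles; a smaller
    stride gives overlapping tiles, so a small feature near a tile border
    is more likely to appear whole in at least one crop.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    boxes = []
    for top in range(0, height - tile + 1, stride):
        for left in range(0, width - tile + 1, stride):
            boxes.append((top, left, top + tile, left + tile))
    return boxes


# Example: a 512 x 768 image cut into non-overlapping 256 x 256 tiles
# yields 2 rows x 3 columns of crop boxes.
boxes = tile_boxes(height=512, width=768, tile=256, stride=256)
```

Each box can then be used to crop the full-resolution image (for an array `img`, `img[top:bottom, left:right]`), so that each tile is classified at native resolution rather than after downsampling the whole image. Tile-level predictions would still need to be aggregated into an image-level label, which this sketch does not address.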
Our analysis provides proof of concept that ML can accurately identify risk factors for malignant transformation in melanocytic choroidal tumors based on a single UWF image or B-scan US image at the time of initial presentation. Further studies can build on these findings to improve the accuracy and applicability of these models in the clinical setting. ML has the potential to be developed into a clinically useful tool to inform and guide management decisions for melanocytic choroidal tumors and potentially save patient lives.