Predicting Malignant Transformation of Choroidal Nevi Using Machine Learning

Objective This study aims to assess a machine learning (ML) algorithm using multimodal imaging to accurately identify risk factors for uveal melanoma (UM) and aid in the diagnosis of melanocytic choroidal tumors. Subjects and Methods This study included 223 eyes from 221 patients with melanocytic choroidal lesions seen at the eye clinic of the University of Illinois at Chicago between 01/2010 and 07/2022. An ML algorithm was developed and trained on ultra-widefield fundus imaging and B-scan ultrasonography to detect risk factors of malignant transformation of choroidal lesions into UM. The risk factors were verified using all multimodal imaging available from the time of diagnosis. We also explore classification of lesions into UM and choroidal nevi using the ML algorithm. Results The ML algorithm assessed features of ultra-widefield fundus imaging and B-scan ultrasonography to determine the presence of the following risk factors for malignant transformation: lesion thickness, subretinal fluid, orange pigment, proximity to optic nerve, ultrasound hollowness, and drusen. The algorithm also provided classification of lesions into UM and choroidal nevi. A total of 115 patients with choroidal nevi and 108 patients with UM were included. The mean lesion thickness for choroidal nevi was 1.6 mm and for UM was 5.9 mm. Eleven ML models were implemented and achieved high accuracy, with an area under the curve of 0.982 for thickness prediction and 0.964 for subretinal fluid prediction. Sensitivity/specificity values ranged from 0.900/0.818 to 1.000/0.727 for different features. The ML algorithm demonstrated high accuracy in identifying risk factors and differentiating lesions based on the analyzed imaging data. Conclusions This study provides proof of concept that ML can accurately identify risk factors for malignant transformation in melanocytic choroidal tumors based on a single ultra-widefield fundus image or B-scan ultrasound at the time of initial presentation. 
By leveraging the efficiency and availability of ML, this study has the potential to provide a non-invasive tool that helps to prevent unnecessary treatment, improve our ability to predict malignant transformation, reduce the risk of metastasis, and potentially save patient lives.


Introduction
Uveal melanoma (UM) is the most common intraocular malignancy in adults, with a high rate of metastasis and a poor prognosis.1 The accurate diagnosis of small UM is challenging due to similar clinical characteristics to benign choroidal nevi. Tumors diagnosed as choroidal nevi that subsequently grow during an observation period are at increased risk for metastasis.2,3 Therefore, improving the diagnosis of UM and choroidal nevi at the time of initial presentation has the potential to improve clinical outcomes.
Most types of cancer require a tissue diagnosis via biopsy prior to making a treatment decision.5][6] The risks associated with biopsy in small choroidal tumors are especially high.7,8 Therefore, clinicians base their diagnosis on careful clinical examination and multimodal imaging, including fundus photography, autofluorescence, optical coherence tomography (OCT), and ultrasonography, to evaluate patients with melanocytic choroidal tumors. However, indeterminate lesions where a definite diagnosis cannot be made are often observed to monitor for tumor growth with serial examination and imaging.
Tumor growth is used as a surrogate for malignant transformation and, therefore, an indication for treatment in patients with indeterminate melanocytic choroidal tumors. Large retrospective studies have been performed to identify clinical risk factors that predict malignant transformation in order to identify patients at high risk for tumor growth. Patients at high risk for malignant transformation are most often treated with ionizing radiation or enucleation based on the clinical situation, including tumor size, extent of extraocular extension, vision, and the patient's preference. Conversely, patients at low risk for malignant transformation are often observed to avoid the unnecessary ocular morbidity associated with treatment of benign choroidal nevi.
0][11][12] The presence of three or more of these features suggests a greater than 50% risk of malignant transformation.13 The use of multimodal imaging has been shown to be capable of identifying these risk factors and therefore serves as an important tool for clinicians evaluating these lesions.14 However, improving our ability to predict malignant transformation and accurately diagnose small UM can reduce the risk of metastasis and save patient lives.
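The "three or more features" rule above can be illustrated with a short sketch. This is not the authors' scoring code. It assumes the five clinical factors named later in the Discussion (thickness greater than 2 mm, subretinal fluid, symptoms, orange pigment, and margin within 3 mm of the optic disc), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LesionFindings:
    """Hypothetical container for clinical findings on one lesion."""
    thickness_mm: float
    subretinal_fluid: bool
    symptoms: bool
    orange_pigment: bool
    margin_to_disc_mm: float


def count_risk_factors(f: LesionFindings) -> int:
    """Count how many of the five assumed risk factors are present."""
    flags = [
        f.thickness_mm > 2.0,        # thickness greater than 2 mm
        f.subretinal_fluid,
        f.symptoms,
        f.orange_pigment,
        f.margin_to_disc_mm <= 3.0,  # margin within 3 mm of the optic disc
    ]
    return sum(flags)


def is_high_risk(f: LesionFindings, min_factors: int = 3) -> bool:
    """Flag lesions with at least `min_factors` risk factors present."""
    return count_risk_factors(f) >= min_factors


lesion = LesionFindings(2.4, True, False, True, 5.0)
print(count_risk_factors(lesion), is_high_risk(lesion))
```

A count like this only summarizes findings that a clinician or model has already identified; it does not replace the underlying image interpretation.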
Machine learning (ML) offers a promising approach to enhance the identification and evaluation of intraocular lesions, thereby providing a versatile tool for clinicians.5][26][27][28][29] Convolutional neural networks (CNNs) assist in disease diagnostics and advance our understanding of the information that can be extracted from various imaging techniques. Despite its significant potential, few studies have looked at the role of ML in the diagnosis of melanocytic choroidal tumors.30 In the present study, we analyze the utility of ML in the evaluation of choroidal nevi and UM. Our objective was to train an ML algorithm to identify risk factors for UM using ultra-widefield fundus images and B-scan ultrasonography. In addition to providing useful information for the diagnosis of the disease itself, we also attempt to maximize the information we can extract from each imaging modality. As such, this ML algorithm may be a useful tool for evaluating melanocytic choroidal tumors for early detection of malignancy.

Methods
This retrospective study included analysis of 223 eyes from 221 patients with melanocytic choroidal lesions seen at the eye clinic at the University of Illinois at Chicago between 01/2010 and 07/2022 (Table 1). The study was approved by the institutional review board (IRB) and patient records were collected from the electronic medical record system. The inclusion criteria for this study were patients with a clinical diagnosis of choroidal nevi or UM. Exclusion criteria were patients who had been treated prior to presentation and patients without both ultra-widefield imaging (Optos PLC, Dunfermline, Fife, Scotland, UK) and B-scan ultrasound (Eye Cubed and ABSolu, Lumibird Medical, Rennes, France) taken at the time of initial presentation. The patients were divided into two groups: (1) patients diagnosed with a choroidal nevus and (2) patients diagnosed with UM. The clinical examination and diagnosis at the time of presentation were taken as the ground truth for the diagnosis and for the presence of risk factors for malignant transformation, which included lesion thickness, subretinal fluid, orange pigment, proximity to optic nerve, ultrasound hollowness, and drusen. The risk factors were verified by a single investigator (MJH) using all multimodal imaging available from the time of diagnosis, including ultra-widefield (UWF) images, autofluorescence images, A-scan and B-scan ultrasonography (US), and OCT. We also explored prediction of the categorization into choroidal nevus or UM for each image. The UWF images and B-scan US from all patients were collected and analyzed (Table 2).
The AI-based models were developed using the ResNet 18 architecture.18 The ResNet architecture consisted of two parts: (1) a feature extractor, which processed UWF or US images to extract features as an output, and (2) a task-specific header that used features from the previous layers to generate task-specific outputs (i.e., classification output 0 for absent or apical overlying subretinal fluid, or output 1 for presence of subretinal fluid). Text information within the US images was cropped, and images were then scaled to 512 x 512 pixels before being fed into the models. Cross-entropy was used as the loss function, and the Adam algorithm was used as the optimizer. The learning rate was set to 0.00005 and the models were trained for 50 epochs. The best model was selected based on the lowest loss observed in the testing set.
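To make the training objective concrete, the following is a minimal pure-Python sketch of a cross-entropy loss over class logits and of selecting the epoch with the lowest loss. It is an illustration only, not the study's training code, which is not given in the text. The constants repeat the values stated above; the function names and the example loss values are hypothetical.

```python
import math

# Values stated in the Methods; everything else here is illustrative.
LEARNING_RATE = 0.00005
NUM_EPOCHS = 50
IMAGE_SIZE = (512, 512)


def cross_entropy(logits, label):
    """Cross-entropy of one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_loss(batch_logits, labels):
    """Average cross-entropy over a batch of samples."""
    total = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels))
    return total / len(labels)


def select_best_epoch(epoch_losses):
    """Return the index of the epoch with the lowest recorded loss."""
    return min(range(len(epoch_losses)), key=lambda i: epoch_losses[i])


# Two-class example: class 1 = subretinal fluid present (made-up logits).
print(mean_loss([[2.0, -1.0], [0.0, 0.0]], [0, 1]))
print(select_best_epoch([0.90, 0.40, 0.60]))
```

In a deep learning framework the same quantities would normally come from built-in loss and optimizer classes; the sketch only shows what is being minimized and how a checkpoint could be chosen.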
The performance of the ML model was measured by the area under the curve (AUC). The bootstrap confidence interval (CI) for the AUC was obtained using the percentiles of the bootstrap distribution. For instance, the 95% CI was obtained using the 2.5th and 97.5th percentiles of the bootstrap distribution. The 95% CI was computed based on 1000 bootstrap replicates.
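The percentile bootstrap described above can be sketched in a few lines of standard-library Python. This is an assumed implementation for illustration, not the authors' code: the AUC is computed as the fraction of correctly ranked positive/negative pairs, resamples containing only one class are redrawn, and the function names, seed, and example scores are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (default: 95% CI, 1000 replicates)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # redraw resamples that contain a single class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    # Nearest-rank percentiles of the sorted bootstrap distribution.
    lo = stats[round((alpha / 2) * (n_boot - 1))]
    hi = stats[round((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]
print(auc(labels, scores), bootstrap_auc_ci(labels, scores))
```

With a small test set the resulting interval can be wide, which is one reason to report it alongside the point estimate.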
We investigated which regions or tissues contributed to each prediction by generating saliency maps for visual explanations of each model using Gradient-Weighted Class Activation Mapping (Grad-CAM).31 Grad-CAM uses the gradients of the target concept, such as 'UM' in our classification network, flowing into the final convolutional layer. This produces a coarse localization map highlighting the important regions in the image for predicting the concept. The primary goal of Grad-CAM is to reflect the degree of importance of pixels (regions of interest) to the human visual system, allowing us to make decisions on the classification task.
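The core Grad-CAM computation can be illustrated on toy data. The sketch below assumes the standard formulation (channel weights from spatially averaged gradients, a weighted sum of feature maps, then a ReLU) and operates on nested lists rather than on a real network; the function name and example values are hypothetical and not taken from the study.

```python
def grad_cam(activations, gradients):
    """Toy Grad-CAM on nested lists.

    activations: K feature maps (each H x W) from the final convolutional layer.
    gradients:   gradients of the target class score with respect to those maps.
    Returns an H x W map scaled to [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only regions with a positive influence on the target class.
    cam = [[max(0.0, v) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

In practice the coarse map is upsampled to the input image size and overlaid on the image, which is how the highlighted regions in a figure would be produced.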

Results

Grad-CAM images
Localization maps highlighting the important pixels (regions of interest) produced patterns that provided insight into the classification tasks. In the category prediction model from US images, the highest probability regions in the overlying Grad-CAM images tended to include both the lesion of interest and its surrounding tissues. For instance, the highlighted region of a UM included the orbit posterior to the tumor as well as ocular regions adjacent to the lesion (Fig. 2). Additionally, a subset of images highlighted the anterior segment on the US image in the location of the iris and lens, which have been implicated in patients with uveal melanoma.32,33 From the UWF images, the Grad-CAM images most often correctly located the tumor region for UM. However, the localization maps for nevi tended to be broader and surrounded the lesions rather than focusing on the nevi themselves (Fig. 2). In one false negative case in the category prediction from US images (Supplemental Fig. 1, score of 0.289), the lesion was 1.94 mm in height but the largest basal diameter was 7.54 mm. Compared to other images of UM in the testing dataset, this image had the smallest thickness.
The subretinal fluid prediction model from US images often highlighted a two-centric region in Grad-CAM, corresponding to the subretinal fluid on two sides of the lesion, with a confidence score of 0.973 (Fig. 3). The model was also capable of locating subretinal fluid from the UWF image with a confidence score of 0.759. Our model also consistently evaluated the predicted hollowness through visualization focusing more often on the lesion itself (Supplemental Fig. 2).

Discussion
The accurate diagnosis of small melanocytic choroidal tumors is challenging due to similar clinical characteristics between benign choroidal nevi and small malignant UM. These patients benefit from the careful evaluation by an ocular oncologist experienced in managing intraocular tumors. Current practice uses clinical examination and multimodal imaging to predict malignant transformation and thereby guide the diagnosis and management of these tumors. Our study provides proof of concept for ML to identify risk factors for malignant transformation at the time of initial presentation.
Clinical features associated with the risk of malignancy have been well established, including the presence of orange pigment and subretinal fluid.34,35 Shields et al. conducted a study to identify risk factors for malignant transformation of choroidal nevi, comprising the largest retrospective case series at the time.11 These risk factors included tumor thickness greater than 2 mm on ultrasonography, subretinal fluid, patient symptoms, orange pigment, and tumor margin within 3 mm of the optic disc.11 In 2009, Shields et al. expanded their case series and identified additional risk factors to include ultrasound hollowness and the absence of a halo or drusen overlying the lesion.13 They were combined to form the well-known mnemonic "To Find Small Ocular Melanoma Using Helpful Hints Daily" (TFSOM-UHHD).
While the original TFSOM system provided an evidence-based method for predicting malignant transformation of melanocytic choroidal tumors, Shields et al. further extended the system in 2019 with the development of the "To Find Small Ocular Melanoma Doing IMaging" (TFSOM-DIM) criteria.14 TFSOM-DIM incorporates multimodal imaging techniques in the identification of risk factors, including subretinal fluid on OCT, orange pigment on autofluorescence, and a basal diameter greater than 5 mm on fundus photography.14 These additional imaging techniques provided a more nuanced approach to identifying UM,36 and have been evaluated in subsequent studies. Geiger et al. used the TFSOM-DIM criteria to grade multimodal imaging by retrospective chart review, revealing significant differences in the range of risk scores between UM and choroidal nevi.37
Other groups have independently identified risk factors for malignant transformation in melanocytic choroidal tumors. The Collaborative Ocular Melanoma Study (COMS) analyzed small choroidal lesions and found that thickness greater than 2 mm, basal diameter greater than 12 mm, presence of orange pigment, and absence of drusen and RPE changes were predictive of tumor growth.38 Roelofs et al. developed a tumor categorization system, which provided a score for choroidal lesions based on five features: Mushroom shape, Orange pigment, Large size, Enlarging tumor, and Subretinal fluid.39 Their study found these criteria to have a sensitivity of 99.8% in identifying melanocytic choroidal tumors at risk for malignant transformation.39 These scoring systems emphasize the opportunity for objectivity in determining the distinction between the two lesions. Despite their potential in the prediction of malignancy, identifying these risk factors has traditionally been done through careful ophthalmic examination and image interpretation, which is subject to inter-observer variability.40 The application of ML algorithms, such as the one used in our study, has the potential to provide a more accurate and efficient system to improve patient prognosis.
The use of ML as a tool for evaluating retinal lesions has gained interest in recent years. ML involves training algorithms to learn from data sets to act on future data.41 While it has been shown to be useful in the early detection of diabetic retinopathy (DR),42,43 its potential for predicting malignant transformation in UM has not yet been extensively explored.
In 2014, Roychowdhury et al. developed a novel, fully automated DR detection and grading system for automated screening and treatment prioritization, achieving a sensitivity of 100%, specificity of 53.26%, and an AUC of 0.904.42 In a study by Lam et al. (2018), the use of ML in DR was augmented by developing a CNN to recognize and distinguish between mild and multi-class DR on color fundus images with enhanced recognition of subtle characteristics.43 Supervised ML techniques have shown promise in classifying retinal disease type and stage.44 In the context of UM, wide-field digital true color fundus cameras can capture a choroidal nevus and its associated features in a single photo, potentially making data labeling and ML training faster and more efficient.30 This understanding suggests that ML may function as a valuable tool to assess small tumors and facilitate the prediction of malignant transformation.
Early detection and treatment of UM is crucial as metastasis events may occur early, and effective treatment can prevent its spread.45,46 Despite the availability of effective treatments for the primary tumor, more than 50% of UM patients develop metastatic disease, suggesting that UM may metastasize prior to the time of treatment.45,47 Consequently, there is a need for identifying and treating small UM to minimize the number of melanocytic choroidal tumors that are observed and subsequently grow during the observation period. The importance of treating small UM was additionally emphasized by Eskelin et al., who measured doubling time in both untreated and treated metastatic UM and proposed that most metastases begin up to 5 years prior to primary tumor treatment.48 Murray et al. retrospectively evaluated a case series of small UM undergoing early fine-needle aspiration biopsy combined with pars plana vitrectomy and endolaser ablation.47 The study found no patients developed metastasis in the follow-up period, suggesting that early treatment may lower the risk of mortality compared to observation alone.
0][51] Serum biomarkers, including several differentially expressed proteins identified in UM gene signatures, have been associated with a worse prognosis in patients diagnosed with UM.49,52,53 Furthermore, circulating tumor cells have been detected in patients without clinically detectable metastasis, indicating early spread and highlighting the need to identify prognostic biomarkers.54,55 The search for these biomarkers underscores the importance of early detection of UM, potentially with the aid of ML for the diagnosis and management of UM. Small UM are a particularly important research subject due to the diagnostic challenge and potential for early local treatment to preserve vision and save lives.56
Our study has important limitations, including a small sample size of patients from a single institution, which restricts the generalizability of our findings. Furthermore, the small sample size likely limited the performance of the ML models. The development of high-performing ML models to characterize choroidal lesions would benefit from multi-institutional collaborations and potentially techniques to artificially increase the sample size based on existing data. In addition, technical limitations such as poor image quality or suboptimal feature extraction can also limit the accuracy of ML models. In our study, the Grad-CAM images sometimes focused on small regions of the lesions themselves rather than adjacent subretinal fluid or the whole image during the classification task (Fig. 3). To address this, segmentation of the specific regions based on the task, such as a lesion mask (Fig. 4) for orange pigment and drusen or the lesion plus the surrounding retina for subretinal fluid, could improve model performance. Our current classification models for margin, orange pigment, and drusen have relatively low average AUC and high deviation, suggesting the need for further refinement (Fig. 1). For the margin prediction model, segmentation of the optic nerve and lesion prior to training the model could be useful. In the case of orange pigment and drusen prediction models, the small size of these features may require alternative approaches such as cropping images into smaller tiles for classification, which could retain a higher resolution. Given these limitations, expert knowledge remains crucial to guide the development and use of these models to ensure that they are based on clinically relevant features and accurately reflect the underlying biology of the disease.
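The tiling idea mentioned above could look roughly like the following sketch, which only computes crop boxes and leaves image loading to whatever library is in use. It was not part of the study; the function name, tile size, and stride are hypothetical choices for illustration.

```python
def tile_boxes(width, height, tile, stride):
    """Return (left, top, right, bottom) crop boxes that cover the whole image."""
    def starts(size):
        if size <= tile:
            return [0]  # image is smaller than one tile along this axis
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Example: split a hypothetical 2048 x 2048 image into 512-pixel tiles.
boxes = tile_boxes(2048, 2048, tile=512, stride=512)
print(len(boxes), boxes[0], boxes[-1])
```

Each tile would then need its own label (for example, whether orange pigment is visible in that crop), which adds annotation effort but preserves the native resolution of small features.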
Our analysis provides proof of concept that ML can accurately identify risk factors for malignant transformation in melanocytic choroidal tumors based on a single UWF image or B-scan US image at the time of initial presentation. Further studies can build on these findings to improve the accuracy and applicability of these models in the clinical setting. ML has the potential to be developed into a clinically useful tool to inform and guide management decisions for melanocytic choroidal tumors and potentially save patient lives.

Table 2
Class descriptions and imaging modalities used for each parameter. US = ultrasound, UWF = ultra-widefield fundus.
Patient demographics
A total of 115 patients with choroidal nevi and 108 patients with UM were included in this study. The mean age of patients with choroidal nevi was 64.9 years (range: 27-95), while patients with UM had a mean age of 66.1 years (range: 30-97) (Table 1). The majority of patients were female in the choroidal nevus group (75, 65.2%), and the UM group had a more balanced gender distribution with 53 (49.1%) males and 55 (50.9%) females. The racial distribution of patients in both groups was predominantly White, with 76 (73.8%) patients in the choroidal nevus group and 82 (83.7%) patients in the UM group among patients with available race data. Similarly, the ethnicity of patients in both groups was predominantly non-Hispanic or Latino, with 91 (85.8%) patients in the choroidal nevus group and 98 (96.1%) patients in the UM group.

Clinical features
The mean lesion thickness was 1.6 mm for choroidal nevi and 5.9 mm for UM. The presence of subretinal fluid was observed in 5 (4.3%) patients with choroidal nevi and 75 (69.4%) patients with UM. Orange pigment was present in 3 (2.6%) patients with choroidal nevi and 34 (31.5%) patients with UM. The mean margin to the optic nerve head was 5.0 mm for choroidal nevi and 3.1 mm for UM. Drusen were present in 42 (36.5%) patients with choroidal nevi and 35 (32.4%) patients with UM. Ultrasonographic hollowness was observed in 18 (15.7%) patients with choroidal nevi and 86 (79.6%) patients with UM. Finally, a mushroom shape was not observed in any patients with choroidal nevi, while it was present in 16 (14.8%) patients with UM.