Aim of the study
The aim of this study was to develop and validate a deep learning algorithm capable of distinguishing small choroidal melanomas from nevi in both wide- and standard-field fundus photographs.
Data sets
Fundus photographs were collected from the Ocular Oncology Service at St. Erik Eye Hospital, Stockholm, Sweden, a center receiving images from multiple institutions across Sweden. The photographs were taken with either an ultra-widefield camera covering 200° of the fundus (Optos, Inc, Dunfermline, UK) or a standard-field retinal camera covering 45° of the fundus (Canon Medical Systems Europe B.V., Amstelveen, the Netherlands); examples of the collected images are provided in Fig. 1A. The collection prioritized small pigmented choroidal lesions, excluding large melanomas, as these are relatively easy to distinguish from nevi. Inclusion criteria were:
- Photo taken after January 1st, 2010, marking the period in which medical records were digitized, which facilitated control over follow-up.
- Diagnosis of either choroidal melanoma (International Classification of Diseases, 10th revision (ICD-10) C69.3) or choroidal nevus (ICD-10 D31.3).
- Diagnosis established by a subspecialized ocular oncologist.
- For lesions diagnosed as nevi at the time of photography, at least 5 years of follow-up without re-diagnosis as melanoma. Lesions that were diagnosed as melanoma at a later point in time (e.g. due to growth) were considered melanomas in this study. This criterion was introduced to facilitate the algorithm's detection of early signs of malignancy at a time when a small melanoma is hard to distinguish from a nevus.
Exclusion criteria were:
- Photos of low quality (issues with focus, movement artifacts, over- or underexposure, reflections, etc.).
- Photos in which our assessment determined that less than half of the lesion was visible, acknowledging the limitation in precisely estimating the size of the portion not visible in the photograph.
- Lesion obscured by retinal detachment, vitreous hemorrhage, or similar.
Out of 866 images evaluated, 112 were excluded based on the above criteria. An additional 2 images were excluded due to containing sensitive personal information (patient names and personal identification numbers), leaving 752 images for the study. These images were randomly assigned to a training cohort (n = 495), a validation cohort (n = 168), and a test cohort (n = 89). For each image in the training and validation sets, a mask of the lesion was created, and each image was labeled with the diagnosis (melanoma or nevus). The study was approved by the Swedish Ethical Review Authority (reference 2022-06210-02) and adhered to the tenets of the Declaration of Helsinki. The requirement for informed consent was waived due to the study's retrospective nature, relying solely on previously collected data, including clinical records and images. This research did not involve any new treatments, interventions, tests, analysis of biological samples, or collection of additional sensitive information. Additionally, we followed the Consolidated Reporting Guidelines for Prognostic and Diagnostic Machine Learning Modeling Studies, details of which are provided in a supplementary file.10
Clinical diagnosis of nevi and melanomas
St. Erik Eye Hospital in Stockholm holds the national responsibility for diagnosing uveal melanoma. Although ophthalmologists from other Swedish institutions may detect potential choroidal tumors and refer patients to our center, a definitive diagnosis of uveal melanoma is made only after a comprehensive examination at our facility, which is equipped with specialized diagnostic tools and expertise. We advise healthcare professionals, including optometrists and nurses, to initially refer patients to general ophthalmologists for a preliminary evaluation before considering a referral to our institution.
At St. Erik Eye Hospital, we conduct a comprehensive review of each patient’s medical history, including previous diagnoses, current medication regimens, and records of past ocular examinations. Our diagnostic protocol encompasses a range of procedures: assessment of best corrected visual acuity (BCVA) and intraocular pressures (IOP); wide or standard field fundus photographs with autofluorescence; OCT; slit-lamp biomicroscopy; and A- and B-scan ultrasonography. Following this evaluation, we are able to confirm a diagnosis of uveal melanoma in the vast majority of cases. On the rare occasion where clinical examinations are inconclusive, we perform either transvitreal or transscleral biopsies (Fig. 1B).11, 12 Patients with small choroidal nevi with absence of risk factors do not need to come to our institution, but may be monitored in their home clinics with periodic examinations and photo documentation. If growth is observed, or other features develop, the patient is typically sent to us for evaluation.
For this study, lesions were also assessed using the MOLES and TFSOM-UHHD criteria.13, 14 MOLES assigns a score of 0, 1, or 2 to each of the well-established predictors Mushroom shape, Orange pigment, Large size, Enlarging tumor, and Subretinal fluid, based on their absence, borderline presence, or presence. Lesions are classified as common nevi, low-risk nevi, high-risk nevi, or probable melanomas, based on their total score being 0, 1, 2, or more than 2, respectively. TFSOM-UHHD stands for "To Find Small Ocular Melanoma Using Helpful Hints Daily": Thickness greater than 2 mm, presence of subretinal Fluid, Symptoms, Orange pigment, tumor Margin within 3 mm of the optic disc, Ultrasonographic Hollowness, and the absence of Halo and Drusen. Lesions exhibiting none of these factors have a 3% likelihood of growth over 5 years, suggesting they are most likely choroidal nevi. Those displaying one factor have a 38% chance of growth, while lesions with two or more factors have a growth probability exceeding 50% at 5 years.15
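The MOLES classification rule described above reduces to a simple mapping from the five sub-scores to a risk category. A minimal sketch (function and argument names are our own, not from any published implementation):

```python
def moles_category(mushroom, orange_pigment, large_size, enlarging, subretinal_fluid):
    """Map the five MOLES predictors (each scored 0 = absent,
    1 = borderline, 2 = present) to a risk category per the published rule."""
    total = mushroom + orange_pigment + large_size + enlarging + subretinal_fluid
    if total == 0:
        return "common nevus"
    if total == 1:
        return "low-risk nevus"
    if total == 2:
        return "high-risk nevus"
    return "probable melanoma"

# A lesion with borderline orange pigment and definite subretinal fluid (total = 3):
print(moles_category(0, 1, 0, 0, 2))  # probable melanoma
```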
Data preprocessing and model architecture
In the preprocessing stage, each fundus photograph was resized to a resolution of 1024×1536 pixels and adjusted to include three channels (RGB) to maintain color information. To standardize brightness across the dataset, we normalized the images based on the average brightness of the training dataset, scaling the pixel values to a range of [0, 1].
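The brightness normalization admits more than one reading; the sketch below shows one plausible interpretation, in which 8-bit pixel values are scaled to [0, 1] and then rescaled relative to the training-set mean brightness, with clipping at 1. The constants and names are illustrative, not taken from the study code:

```python
TRAIN_MEAN = 0.42   # hypothetical mean brightness of the training set (0-1 scale)
TARGET_MEAN = 0.5   # brightness level images are mapped toward (assumption)

def normalize(image):
    """Scale 8-bit RGB values to [0, 1], then rescale brightness relative
    to the training-set mean, clipping to keep values in range.
    `image` is a rows x cols x 3 nested list of ints in [0, 255]."""
    scale = TARGET_MEAN / TRAIN_MEAN
    return [[[min(1.0, (channel / 255.0) * scale) for channel in px]
             for px in row] for row in image]

img = [[[64, 128, 255]]]   # a 1x1 RGB "image"
print(normalize(img))      # values now in [0, 1]
```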
We implemented a U-net architecture for our model, characterized by three downsampling layers and eight base filters, employing the Rectified Linear Unit (ReLU) as the activation function.16 This model was specifically designed to perform as a segmentation tool, with its output subsequently applied to the task of classification. The rationale behind opting for a segmentation approach, rather than a direct classification framework, lies in the enhanced interpretability it offers: it allows for clearer visualization of which pixels activate the network. Additionally, by segmenting nevi and melanomas, we provide the network with more detailed information during the training phase, potentially improving the model's learning efficiency and accuracy.
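The stated architecture fixes the spatial bookkeeping: with three 2x downsampling stages, a 1024x1536 input reaches a 128x192 bottleneck. A small sketch of that arithmetic, assuming the common U-net convention that filter counts double at each level (the text specifies only eight base filters and three downsampling layers, so the doubling is our assumption):

```python
def unet_shapes(height, width, depth, base_filters):
    """Feature-map size and filter count at each encoder level of a U-Net
    with `depth` 2x-downsampling stages, assuming filters double per level."""
    return [(height >> level, width >> level, base_filters << level)
            for level in range(depth + 1)]

for h, w, f in unet_shapes(1024, 1536, 3, 8):
    print(f"{h} x {w}, {f} filters")
# 1024 x 1536, 8 filters
# 512 x 768, 16 filters
# 256 x 384, 32 filters
# 128 x 192, 64 filters
```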
Model training
Our model training process utilized two distinct U-net models within the Expligences Explipipe training framework: one aimed at identifying the area of the lesion, and another tasked with classifying whether the lesion is a melanoma. Both models were trained with brightness and rotation augmentation to enhance their robustness.
For the first model, which focuses on detecting the nevus area, categorical cross-entropy served both as the loss function and the evaluation metric. The process involved calculating the weighted central point of the model's output. Subsequently, a bounding box of dimensions 488×488 pixels was centered around this point, which then served as the input for the second network.
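The centering-and-cropping step above can be sketched as follows. The text does not say how the 488x488 box is handled near image borders, so the clamping below (shifting the box to stay inside the image) is an assumption, and all names are our own:

```python
def weighted_center(prob_map):
    """Probability-weighted centroid (row, col) of a 2-D segmentation
    output, i.e. the 'weighted central point' of the first model."""
    total = sum(p for row in prob_map for p in row)
    r = sum(i * p for i, row in enumerate(prob_map) for p in row) / total
    c = sum(j * p for row in prob_map for j, p in enumerate(row)) / total
    return round(r), round(c)

def crop_box(center, img_h, img_w, size=488):
    """size x size box centered on `center`, shifted (assumption) so it
    stays fully inside an img_h x img_w image. Returns (top, left, bottom, right)."""
    half = size // 2
    top = min(max(center[0] - half, 0), img_h - size)
    left = min(max(center[1] - half, 0), img_w - size)
    return top, left, top + size, left + size

# Toy 3x3 map with all probability mass at (row 2, col 1):
print(weighted_center([[0, 0, 0], [0, 0, 0], [0, 1, 0]]))  # (2, 1)
print(crop_box((100, 100), 1024, 1536))  # (0, 0, 488, 488)
```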
The second model, designed for melanoma classification, also utilized categorical cross-entropy as its loss function. However, the Area Under the Curve (AUC) was the chosen evaluation metric for identifying the most effective network iteration. For AUC calculation, the pixel exhibiting the highest melanoma probability within the segmentation was considered the output. During the training of this second network, only melanoma segmentation masks were used, whereas nevus images were paired with empty segmentation masks. The best-performing network was obtained at epoch 1189, with an AUC of 83.4% (Fig. 1C).
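The max-pixel scoring and AUC evaluation described above can be sketched in a few lines (here with a rank-based AUC that counts ties as half; function names are our own):

```python
def max_pixel_score(prob_map):
    """Image-level melanoma score: the highest per-pixel melanoma
    probability in the segmentation output, as described in the text."""
    return max(p for row in prob_map for p in row)

def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen melanoma
    (label 1) scores higher than a randomly chosen nevus (label 0),
    counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

maps = ([[0.1, 0.9]], [[0.2, 0.3]], [[0.8, 0.4]], [[0.1, 0.2]])
scores = [max_pixel_score(m) for m in maps]   # [0.9, 0.3, 0.8, 0.2]
print(auc(scores, [1, 0, 1, 0]))  # 1.0
```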
In the final step, a shallow random forest classifier was trained on the sorted probability outputs, applying a weighting factor of 10 to melanoma images. This strategy aimed to increase specificity, minimize false negatives, and leverage the entirety of the output data, not just the pixel with the highest probability. Incorporating this method raised the AUC to 88.5% on the validation set (Fig. 1D).
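The data preparation for this final classifier can be sketched as follows: each image is represented by its pixel probabilities sorted in descending order (so the forest sees the whole output distribution rather than only the maximum), and melanoma images receive a sample weight of 10. Only this preparation step is shown; the forest itself could be fit with, for example, scikit-learn's RandomForestClassifier, passing the weights via its `sample_weight` argument. Names are our own:

```python
def rf_features_and_weights(prob_maps, labels, melanoma_weight=10):
    """Build random-forest inputs from per-image segmentation outputs:
    sorted (descending) pixel probabilities as features, and a sample
    weight of `melanoma_weight` for melanoma images (label 1), 1 otherwise."""
    features = [sorted((p for row in m for p in row), reverse=True)
                for m in prob_maps]
    weights = [melanoma_weight if y == 1 else 1 for y in labels]
    return features, weights

feats, w = rf_features_and_weights([[[0.2, 0.9]], [[0.4, 0.1]]], [1, 0])
print(feats)  # [[0.9, 0.2], [0.4, 0.1]]
print(w)      # [10, 1]
```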
Validation of the algorithm
Statistical analysis and performance comparison
Statistical analysis was performed to compare the sensitivities and specificities of human observers (resident ophthalmologists, n = 6; consultant ophthalmologists, n = 3; and ocular oncologists, n = 3) against the gold standard diagnoses of choroidal melanoma or nevus. During the testing phase of fundus photograph assessment, both human evaluators and the algorithm were blinded to any additional patient and lesion information, including clinical diagnoses and follow-up histories. The Kruskal-Wallis test was used to assess overall differences among the groups for both sensitivity and specificity. Post-hoc pairwise comparisons were conducted using Dunn's test, with Bonferroni correction of P values. Mann-Whitney U tests were employed to compare the algorithm's performance with the aggregated sensitivities and specificities of human observers. The AUC of the algorithm was compared with the AUCs of the MOLES and TFSOM-UHHD scores using pairwise DeLong's tests. Bonferroni correction was applied to multiple comparisons. P values of less than 0.05 were considered to indicate statistical significance, and all P values were two-sided. Statistical significance and confidence intervals were calculated using SciPy (version 0.15.1) and R (version 4.2.2) with the stats, PMCMRplus, pROC, dunn.test, and dplyr packages.
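Of the adjustments above, the Bonferroni correction is simple enough to state exactly: each raw P value is multiplied by the number of comparisons and capped at 1. A minimal sketch:

```python
def bonferroni(p_values):
    """Bonferroni correction: multiply each P value by the number of
    comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

print(bonferroni([0.125, 0.25, 0.5, 0.3]))  # [0.5, 1.0, 1.0, 1.0]
```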