Prediction of the Reactivation of Retinopathy of Prematurity After Anti-VEGF Treatment Using Machine Learning in Small Numbers

doi:10.21203/rs.3.rs-2257458/v1


Article


https://doi.org/10.21203/rs.3.rs-2257458/v1

This work is licensed under a CC BY 4.0 License

Version 1


Aim

To create and validate a prediction model for retinopathy of prematurity (ROP) reactivation after anti-VEGF therapy with clinical risk factors and retinal images.

Methods

Infants with treatment-requiring ROP (TR-ROP) undergoing anti-VEGF treatment were recruited from two hospitals, and three models were constructed using machine learning and deep learning algorithms. The area under the curve (AUC), sensitivity (SEN), and specificity (SPC) were used to assess the performance of the prediction models.

Results

In total, 87 cases were included: 21 with reactivation and 66 without. The AUC for the clinical risk factor model was 0.80 and 0.77 in the internal and external validation groups, respectively. The average AUC, sensitivity, and specificity in the internal validation for the retinal image model were 0.82, 0.93, and 0.63, respectively. The AUC, SEN, and SPC for the combined model were 0.84, 0.93, and 0.73, respectively.

Conclusion

We constructed a prediction model for the reactivation of ROP. Using this prediction model, we can optimize treatment strategies for TR-ROP infants and develop screening plans after treatment.

Health sciences/Diseases

Health sciences/Health care

Health sciences/Risk factors

retinopathy of prematurity

reactivation

prediction

machine learning

deep learning

Retinopathy of prematurity (ROP) is a retinal ischemic-hypoxic disease that usually occurs in preterm infants and is a leading cause of childhood vision loss and blindness.^[1] According to a report by the World Health Organization, more than 20,000 infants are blinded by ROP and 12,300 infants develop some degree of visual impairment each year.^[2] Visual disability seriously affects children’s long-term quality of life and places a large economic burden on families and society.

With improvements in health care and the introduction of the three-child policy in China, the birth and survival rates of premature babies have increased dramatically. The incidence rate of ROP is 15–26%, and approximately 1.1–6.8% of ROP babies need treatment,^[3–6] which is called treatment-requiring ROP (TR-ROP). TR-ROP includes type 1 pre-threshold ROP, threshold ROP, and aggressive ROP (A-ROP).^[7] Most blindness caused by TR-ROP can be avoided through timely treatment after screening.

Treatments for TR-ROP include intravitreal injection (IVI) of anti-vascular endothelial growth factor (anti-VEGF) agents, laser photocoagulation, cryotherapy, scleral buckling, vitrectomy, and gene therapy.^[8–10] The mechanism, treatment, regression patterns, disease reactivation risk, and long-term management needs differ between laser therapy and anti-VEGF treatments. Laser photocoagulation is a traditional ROP treatment modality that is effective in 90% of TR-ROP cases. It destroys peripheral areas of the retina that produce growth factors and converts them into nonfunctional scar tissue; therefore, it is destructive, especially in zone I TR-ROP. ^[11] VEGF inhibitors do not rely on the destruction of peripheral retinal areas; they directly bind and pharmacologically neutralize VEGF. Moreover, they are easier to use and can be mastered quickly by ophthalmologists in clinical applications. Hence, anti-VEGF treatment has become the mainstay treatment since the BEAT-ROP trial in 2011;^[12] it addresses some shortcomings of laser photocoagulation therapy (e.g., damage to the peripheral retina and harm to premature infants) and provides an alternative treatment for countries where laser photocoagulation is in short supply.^[9] Anti-VEGF drugs include pegaptanib, bevacizumab, ranibizumab, conbercept, and aflibercept, among others. ^{[13, 14]}

As anti-VEGF treatment has gradually become the major modality for ROP treatment, it has shown a high treatment success rate. In the BEAT-ROP trial, bevacizumab showed advantages in the treatment of zone I ROP when compared with laser therapy.^[15] Cheng et al. ^[16] reported that, of 248 eyes with zone II ROP treated with conbercept, 245 eyes (98.79%) responded to the initial treatment. However, various degrees of reactivation have also been reported with current anti-VEGF therapy, which affects the clinical application of anti-VEGF drugs. Reactivation is defined as the appearance of plus disease, an elevated ridge, or pathological new vessels after an initial regression of ROP following treatment. Ling ^[17] reported on 340 eyes that received bevacizumab, ranibizumab, or laser therapy; the overall reactivation rate was 12.9%, with 10.0% in the bevacizumab group and 20.8% in the ranibizumab group. Bai et al. ^[18] reported that 48 eyes received conbercept treatment, and eight eyes (16.7%) relapsed. In the above reports, the reactivation rate ranged from 6.8% to 64%, and relapse generally occurred seven to twenty weeks after treatment; the details are in Table 1. The reasons for reactivation include the limited duration of anti-VEGF efficacy and the involvement of other related factors.

Table 1

Summary of recent studies on the use of anti-VEGF drugs for ROP
Authors	Anti-VEGF drugs	Sample size	Initial efficiency	Risk factors	Recurrence time (weeks)	Recurrence rate	Complications
Ling et al ^[17]	Bevacizumab (B), Ranibizumab (R)	All: 279 eyes of 143 infants; B: 231 eyes of 118 infants; R: 48 eyes of 25 infants	-	IVR monotherapy, early PMA at initial treatment, zone I ROP, low Apgar score, and multiple births	PMA: 42.2 ± 3.40	B: 23/231 (10%); R: 10/48 (20.8%)	No
Cheng et al ^[16]	Conbercept (C), Ranibizumab (R)	All: 1199 eyes of 625 infants; C: 283 eyes of 145 infants; R: 916 eyes of 480 infants	C: 98.23% (278/283); R: 97.05% (889/916)	NA	R: 7.87 ± 0.65 weeks after treatment; C: 10.6 ± 1.53 weeks after treatment	C: 15.19% (43/283); R: 29.69% (272/916)	No
Lyn et al ^[19]	Ranibizumab (R)	50 eyes of 50 infants	92% (46/50)	Larger extent of preexisting retinal neovascularization, post-IVR oxygen requirement	PMA: 43.1 ± 3.3	64% (32/50)	No

Clinical risk factors for relapse after anti-VEGF treatment include low gestational age, a low Apgar score, multiple births, oxygen requirement after therapy, and early postmenstrual age at initial treatment, among others. ^{[17, 19, 20]} Many forms of reactivation require re-treatment; hence, timely detection and management of reactivation have become major concerns in anti-VEGF therapy for ROP.

Artificial intelligence (AI), including machine learning (ML) and deep learning (DL), has been a hot topic in recent years; it is considered the most advanced technology in the field of image recognition. ^[21–23] DL achieved remarkable results in the 1,000-category classification task of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), lowering the error rate from 26.1% to 15.3%. DL has achieved robust performance in medical fields such as radiology, pathology, and dermatology, which are similar to ophthalmology. ^[24–26] In ophthalmology, DL systems have been shown to accurately detect diabetic retinopathy, age-related macular degeneration, cataract, and glaucoma. ^[27–30] Brown ^[31] created a fully automated DL system for classifying normal, pre-plus, and plus disease using a convolutional neural network (CNN) based on 5,511 retinal images. The AUC reached 0.98 for plus disease, and the DL system also showed strong advantages when compared with senior ophthalmologists.

Recently, AI has been used to predict prognosis in the clinical field. ^[32] To our knowledge, there is no research on the prediction of reactivation after anti-VEGF treatment for TR-ROP. Thus, this study applied ML and DL to construct a program that predicts ROP reactivation from clinical risk factors and retinal images, with the aim of identifying cases at high risk of reactivation before treatment.

1. Patient Enrollment and Labeling

This was a retrospective study. Data for training and internal validation were collected between April 2016 and November 2021 from Hospital I (Zhujiang Hospital, Southern Medical University, Guangzhou), and data for external validation were collected from Hospital II (Tung Wah Hospital, Sun Yat-sen University, Dongguan). The study was conducted according to the Declaration of Helsinki and approved by the Research Ethics Committees of Zhujiang Hospital and Tung Wah Hospital. The inclusion criteria were (1) a definite diagnosis of ROP that met the conditions for ROP treatment, (2) anti-VEGF therapy as the initial treatment, (3) a clear clinical treatment outcome, (4) complete clinical data, (5) completed ROP screenings that lasted at least six months after treatment, and (6) the absence of ocular or systemic diseases that could influence the outcomes of this study.

We divided the cases into two groups: reactivation and non-reactivation. All cases were annotated independently by three trained, experienced ophthalmologists. ROP screening followed the 2014 Chinese ROP Screening Criteria, and a wide-field digital retinal imaging system (RetCam III) was used. The most severe ROP condition of each child before treatment was recorded as the ROP diagnosis.

2. Treatment And Follow-up Screening

The indications for ROP treatment are type 1 pre-threshold ROP, threshold ROP, or aggressive ROP (A-ROP).^[7] Type 1 ROP refers to zone I ROP of any stage with plus disease, zone I stage 3 ROP without plus disease, or zone II stage 2 or 3 ROP with plus disease.^[13] A-ROP refers to rapidly progressing, severe ROP, including plus disease without progression being observed through the stages of ROP, rapid pathologic neovascularization, etc. ^[33] Once type 1 ROP or A-ROP is diagnosed, treatment is initiated within 72 hours.
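The type 1 criteria described above can be written as a small rule. This is only an illustrative sketch: the function names are hypothetical, it assumes the zone/stage/plus-disease reading given in the paragraph, and it does not model threshold ROP or replace clinical judgment.

```python
def is_type1_rop(zone: int, stage: int, plus_disease: bool) -> bool:
    """Return True when findings match the type 1 ROP criteria as read here."""
    if zone == 1:
        # Zone I: any stage with plus disease, or stage 3 even without plus disease.
        return (plus_disease and stage >= 1) or stage == 3
    if zone == 2:
        # Zone II: stage 2 or 3 with plus disease.
        return plus_disease and stage in (2, 3)
    return False


def requires_treatment(zone: int, stage: int, plus_disease: bool, aggressive: bool) -> bool:
    # A-ROP is treated as an indication regardless of zone and stage.
    return aggressive or is_type1_rop(zone, stage, plus_disease)
```

For example, `is_type1_rop(2, 1, True)` is false under this reading, because zone II stage 1 disease is not part of the type 1 definition even when plus disease is present.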

All guardians of the babies were fully informed of the drug principles, treatment details, and potential risks of anti-VEGF drugs; they then chose the type of injection drug and signed the informed consent form. Injections were performed by a trained ophthalmologist in a sterile operating room under topical anesthesia using a microscope. A 0.25 mg/0.025 mL dose of conbercept or a 1 mg/0.025 mL dose of aflibercept was drawn up and injected with a 30-gauge needle, which was inserted perpendicularly and redirected slightly toward the center of the eyeball after it passed the equator of the lens.^[34] Topical tobramycin/dexamethasone eye drops were prescribed for treated eyes for three days after an injection. Post-injection follow-up eye examinations were performed on day one and on days three to five (if not improved by day one). Subsequently, each infant was examined weekly to biweekly for at least six months after treatment.

Effective treatment was defined as regression of plus disease or disappearance of the ridge, a decrease in retinal vessel tortuosity, the presence of normal vessels, etc. ^[35] Vascular changes indicating reactivation include recurrence of plus disease and neovascularization after a partial or complete regression. ^[33] Once reactivation requiring treatment was diagnosed, treatment was arranged within 72 hours. We recorded the time of reactivation, the status of ROP, and follow-up treatment.

3. Collecting Data On Potential Predictive Factors

According to research and clinical experience, the following four classes of information regarding the potential predictive factors of recurrent ROP were obtained and confirmed by another researcher: (1) maternal factors: gestational hypertension, gestational diabetes mellitus, premature rupture of membranes, placental abruption, placenta previa, cesarean delivery, and IVF-ET; (2) infant variables: gestational age (GA), birth weight (BW), sex, 1-minute and 5-minute Apgar scores, multiple births, neonatal asphyxia, sepsis, respiratory distress syndrome, bronchopulmonary dysplasia, pneumonia, intraventricular hemorrhage, necrotizing enterocolitis, anemia, neonatal hypoxic-ischemic encephalopathy (HIE), congenital heart disorder (atrial septal defect, ventricular septal defect, patent ductus arteriosus), pulmonary hypertension, and neonatal jaundice; (3) treatment: mechanical ventilation, oxygen therapy (post-treatment), and blood transfusion; (4) TR-ROP conditions: postmenstrual age (PMA) at TR-ROP initial treatment, zone I ROP, preretinal hemorrhage, A-ROP, and the anti-VEGF drug used.

The infant variables were collected before anti-VEGF treatment. All of these variables were recorded in electronic health records (EHRs). Missing data were processed in two ways: (1) if fewer than 20% of a variable's values were missing, they were imputed with the mean (continuous variables) or the mode (categorical variables), and (2) if more than 20% were missing, the variable was deleted.
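The missing-data rule can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the column layout are hypothetical, missing values are represented by `None`, and a variable with exactly 20% missing values is imputed here (the text does not specify that boundary case).

```python
from statistics import mean, mode

MISSING_LIMIT = 0.20  # threshold stated in the text


def impute_or_drop(columns: dict, categorical: set) -> dict:
    """Impute sparse gaps (None) and delete variables with too many gaps."""
    cleaned = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        missing_fraction = 1 - len(observed) / len(values)
        if missing_fraction > MISSING_LIMIT or not observed:
            continue  # too sparse: the variable is deleted
        # Mode for categorical variables, mean for continuous ones.
        fill = mode(observed) if name in categorical else mean(observed)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned
```

The function takes one list per variable, so it can be applied to a table exported from an EHR after the columns have been split into continuous and categorical sets.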

Another possible predictive factor was the ROP screening retinal images taken before initial treatment. We collected fundus images from the RetCam III. Posterior fundus images from the same eye were stored as one file and labeled as reactivation or non-reactivation. Exclusion criteria comprised: (1) retinal photographs taken by devices other than the RetCam III; (2) infants with other ocular diseases (e.g., persistent hyperplastic primary vitreous, congenital cataract, Coats disease, or other congenital retinal diseases); and (3) any images without an accurate label.

4. Training and Validation of Prediction Models

4.1. The Clinical Risk Factor Model (CRM)

In recent years, with the development of attention mechanisms, many ML models have become able to build prediction models from tabular data. A prediction model based on clinical factors was established using three algorithms: random forest (RF), support vector machine (SVM), and categorical boosting (CatBoost). CatBoost is an enhancement of the gradient boosted decision tree (GBDT) algorithm that natively supports categorical features. We used L2 regularization and cross-entropy as the loss function, so the final output was a numerical value in the range of 0.0–1.0. To increase the robustness of the algorithm, we first calculated the effectiveness of each feature using the GBDT algorithm, resulting in an importance score between 0.0 and 1.0. The importance score was then passed in as one of the model’s features, and some low-importance features were dropped. For a fair comparison across algorithms, we used a grid search approach for hyperparameter tuning.

For internal validation, we used five-fold cross-validation. The training group was randomly divided into five equal parts: four were used for training, and the remaining one was used for validation. This procedure was repeated until each part had been used for validation. It was used to tune the parameters and evaluate the performance of the prediction model. For external validation, we used data from Hospital II. Finally, the importance of the variables among all the investigated factors was scored by CatBoost in reactivation cases.
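The five-fold procedure can be sketched with the standard library alone. This is an illustration under assumptions, not the authors' pipeline: the function name and the fixed seed are hypothetical, and the split is a plain random one as described in the text (no stratification by reactivation status).

```python
import random


def five_fold_splits(n_cases: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled indices gives k parts of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, validation
```

With the 64 Hospital I cases, `five_fold_splits(64)` produces four validation folds of 13 cases and one of 12, and every case is held out exactly once.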

4.2. Retinal Photograph Model (RM)

Retinal photographs were used in the RM prediction model. The images were resized to a resolution of 400 × 400 pixels, and images in the training sets were augmented eight-fold with 90-degree rotations and horizontal and vertical flips.
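One way to obtain an eight-fold augmentation from 90-degree rotations and flips is to take the four rotations of the image and of its mirror image (a vertical flip equals a horizontal flip followed by a 180-degree rotation). The sketch below assumes this reading; the text does not list the exact eight variants, and the helper names are hypothetical. Images are represented as plain lists of rows to keep the example dependency-free.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in image]


def augment_eightfold(image):
    """Return the four rotations of the image and of its mirror image."""
    variants = []
    for base in (image, flip_horizontal(image)):
        current = [list(row) for row in base]
        for _ in range(4):
            variants.append(current)
            current = rotate90(current)
    return variants
```

Only the training folds should be augmented in this way, so that validation images are never transformed copies of training images.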

ResNeSt is an improvement on the ResNet network; it integrates an attention mechanism, which lets the network automatically learn a better receptive field, makes it more stable, and reduces its error. ResNeSt was developed as an open-source project on GitHub (https://github.com/zhanghang1989/ResNeSt). We used the ResNeSt network, which is highly specialized for image data and widely used in medical imaging, to build the RM model. Considering the small size of the dataset, we used transfer learning, typically done by loading ImageNet-pretrained weights into the model. Transfer learning has been shown to improve a model’s performance and to partially solve the problems of insufficient learning ability, low classification accuracy in small-sample learning, and poor generalization caused by poor feature extraction.

To gain better performance, the network was trained on ImageNet and a glaucoma dataset (https://grand-challenge.org/), achieving 95% accuracy on the glaucoma dataset, and then transferred to the TR-ROP dataset; the network layers were subsequently fine-tuned on the TR-ROP reactivation dataset. Because the TR-ROP reactivation dataset was small, we augmented the training data by adding random blocks, random contrast adjustments, random gamma values, and mixup.

After training the DL prediction model, we obtained the predicted probability for each retinal image and binarized each element of the output vector to 1.0 or 0.0, based on a threshold of 0.5, to obtain the two classes (reactivation and non-reactivation). For internal validation, we used 5-fold cross-validation on the training dataset, which was randomly divided into five independent parts. For several reasons, we did not obtain an external validation dataset for this model. After every fold had been validated, the results were averaged and recorded to evaluate the performance of the DL prediction model. Details are shown in Fig. 1.
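The thresholding and fold-averaging steps can be sketched as below. The function names are hypothetical, and treating a probability of exactly 0.5 as positive is an assumption, since the text only states the threshold value.

```python
def binarize(probabilities, threshold=0.5):
    """Map predicted probabilities to 1 (reactivation) or 0 (non-reactivation)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def sensitivity_specificity(labels, predictions):
    """Compute (sensitivity, specificity) from binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def average_over_folds(fold_metrics):
    """Average each metric position across the per-fold metric tuples."""
    n = len(fold_metrics)
    width = len(fold_metrics[0])
    return tuple(sum(m[i] for m in fold_metrics) / n for i in range(width))
```

Each validation fold must contain at least one reactivation and one non-reactivation case, otherwise a denominator in `sensitivity_specificity` is zero; with few positive cases this is worth checking per fold.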

4.3. Combined Model (CM)

We combined clinical factors with retinal images to build a CM prediction model, which was intended to improve on the RM and CRM prediction models. The training involved two steps. First, we trained a stable prediction model based on the CatBoost algorithm. Second, the output of the CatBoost algorithm was used as part of the input to the next model, and we used cross-entropy as the basic loss function in the final model.

To improve the final performance of the model, we drew on the idea of focal loss and modified the loss function as follows:

$$\mathrm{Loss} = a \left( -y \log y' - (1 - y) \log (1 - y') \right)$$

$$a = \frac{\exp(-b \, y \, G)}{Z}$$

Here, y is the label and y' is the predicted probability, Z is the normalization factor, b is the table network weight, and G is the table network output score.
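A minimal sketch of the modified loss above is given here for a batch of samples. Two points are assumptions rather than statements from the text: Z is taken to be the sum of the unnormalized weights over the batch, so that the weights sum to 1, and predictions are clipped away from 0 and 1 for numerical safety. The function names and the default value of `b` are hypothetical.

```python
import math


def sample_weights(labels, table_scores, b=1.0):
    """Weights a = exp(-b * y * G) / Z, with Z assumed to be the batch sum."""
    raw = [math.exp(-b * y * g) for y, g in zip(labels, table_scores)]
    z = sum(raw)
    return [w / z for w in raw]


def weighted_cross_entropy(labels, predictions, table_scores, b=1.0, eps=1e-12):
    """Weighted binary cross-entropy summed over the batch."""
    weights = sample_weights(labels, table_scores, b)
    total = 0.0
    for y, p, a in zip(labels, predictions, weights):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += a * (-y * math.log(p) - (1 - y) * math.log(1 - p))
    return total
```

Under this reading, negative samples (y = 0) all receive the same weight, while a positive sample that the table network already scores highly (large G) is down-weighted, so training concentrates on positives that the clinical model finds hard.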

In terms of training parameters, we used batch learning and the stochastic gradient descent method to minimize the loss function. The final result was obtained by training for 400 epochs with cosine annealing of the learning rate.
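Cosine annealing lowers the learning rate along a half cosine wave over the training run. The sketch below shows the usual single-cycle schedule over 400 epochs; the maximum and minimum learning rates are hypothetical values, since the text does not report them, and whether warm restarts were used is also not stated.

```python
import math


def cosine_annealed_lr(epoch, total_epochs=400, lr_max=0.01, lr_min=0.0):
    """Learning rate at a given epoch under single-cycle cosine annealing."""
    progress = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


# Schedule for the whole run, one value per epoch.
schedule = [cosine_annealed_lr(epoch) for epoch in range(400)]
```

The rate starts at `lr_max`, passes the midpoint between `lr_max` and `lr_min` halfway through training, and approaches `lr_min` at the end.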

4.4. Statistical Analysis

Statistical analysis was performed using SPSS v23 software (IBM, USA). We present variables as mean ± standard deviation for continuous factors and as numbers and percentages for categorical factors. A chi-squared test or Fisher’s exact test was used for categorical factors, and a Student’s t-test was used for continuous factors, to compare the non-reactivation and reactivation groups. P < 0.05 was considered statistically significant. The AUC, SEN, and SPC were used to show the prediction models’ performance.
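The AUC can be read as the probability that a randomly chosen reactivation case receives a higher predicted score than a randomly chosen non-reactivation case. The sketch below computes it directly from that definition (ties count as one half); the function name is hypothetical, and the pairwise loop is chosen for clarity, which is adequate for a cohort of this size.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a correctly ranked pair
    return wins / (len(positives) * len(negatives))
```

Unlike sensitivity and specificity, this value does not depend on the 0.5 decision threshold, which is why the three metrics are reported together.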

Ten TR-ROP reactivated cases and eight non-reactivated controls were excluded due to a lack of necessary information. Finally, we included 87 cases: 21 with reactivation and 66 without; the reactivation rate was 24.14% (a 75.86% success rate). Sixty-four cases (12 reactivation cases and 52 non-reactivation cases) from Hospital I were used for training and internal validation, while 23 cases (9 reactivation cases and 14 non-reactivation cases) from Hospital II were used for external independent validation. Sixty-one infants received conbercept, while 26 infants received aflibercept. The clinical characteristics are presented in Table 2: BW, PMA at initial treatment, retinal hemorrhage, oxygen therapy (after treatment), and sex were significantly different between the reactivation and non-reactivation groups. The only significant independent risk factor for reactivated ROP was retinal hemorrhage.

Table 2

Comparisons of the variables between the children with recurrence and the controls.
Risk factors	All infants	Non-Recurrent ROP(n = 52)	Recurrent ROP(n = 12)	t	P
BW, g (Mean ± SD)	1270 ± 340	1310 ± 340	1070 ± 230	2.33	0.023
PMA at TR-ROP initial treatment, week (Mean ± SD)	36.84 ± 2.27	37.11 ± 2.15	35.67 ± 2.59	2.01	0.048
retinal hemorrhage (n,%)	26	32.7% (17/52)	75% (9/12)	NA	0.01
oxygen therapy (post-treatment) (n,%)	18	21.2% (11/52)	58.3% (7/12)	NA	0.028
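The categorical comparisons in Table 2 can be checked with a two-sided Fisher's exact test. The sketch below is a hand-rolled version for a 2×2 table (the study itself used SPSS); the function name is hypothetical, and the two-sided P value is taken as the sum of the probabilities of all tables no more likely than the observed one. The example uses the retinal hemorrhage counts from Table 2: 9 of 12 reactivation cases and 17 of 52 non-reactivation cases.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)  # tolerance for float comparison
    )


# Rows: reactivation, non-reactivation; columns: hemorrhage, no hemorrhage.
p_hemorrhage = fisher_exact_two_sided(9, 3, 17, 35)
```

The resulting value should be close to the P = 0.01 reported for retinal hemorrhage in Table 2.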

CRM Model Performance

We collected 40 potential predictive factors; the details of these factors are presented in the Methods section. Among the included factors, BW, GA, and PMA at TR-ROP initial treatment were the most relevant factors for the reactivation of TR-ROP, as shown in Fig. 2. Finally, we used three predictors (PMA at TR-ROP initial treatment, A-ROP, and respiratory distress syndrome) in the CM model. The AUC, sensitivity, and specificity in the internal validation and external validation are shown in Table 3. The receiver operating characteristic (ROC) curves of models with different algorithms are compared in Fig. 3. Three algorithms (i.e., SVM, RF, and CatBoost) were tested, and the algorithm that produced the highest AUC was chosen. The results show that CatBoost outperformed SVM and RF. The AUCs for CatBoost were 0.80 and 0.77 in the internal and external validation groups, respectively.

Table 3

Performance of the clinical risk factor models
	Method	AUC	Sensitivity (%)	Specificity (%)
Internal validation	SVM	0.75	70.00	69.27
	RF	0.76	73.33	61.82
	CatBoost	0.80	90.00	75.09
External validation	SVM	0.60	60.00	53.85
	RF	0.68	75.00	53.85
	CatBoost	0.77	80.00	69.23

RM Model Performance

We used retinal photographs before the initial treatment to predict the reactivation of ROP; there were 136 eyes (18 recurrent cases and 118 nonrecurrent cases). The average AUC, sensitivity, and specificity in the internal validation are shown in Table 4. The RM model performed better than the CRM model in internal validation. However, for several reasons, we did not obtain an external validation dataset for this model.

Table 4

Performance of the three models in internal validation
Models	AUC	Sensitivity (%)	Specificity (%)
CRM	0.80	90.00	75.09
RM	0.82	93.33	63.27
CM	0.84	93.33	72.91

CM Performance

We used a model that combined clinical factors with retinal photographs as an optimal model to predict the reactivation of ROP. PMA at TR-ROP initial treatment, A-ROP and respiratory distress syndrome were added to the model. After adding these clinical parameters, all model indicators increased except for SPC (which decreased to 0.73), with AUC reaching 0.84 and SEN reaching 0.93. As explained in the RM results, we also lacked external validation data for the CM model. The ROC curves of three models are compared in Fig. 4.

In this study, we developed and validated a DL reactivation prediction model for TR-ROP anti-VEGF treatment. Before our study, we searched the Web of Science, PubMed, the China National Knowledge Infrastructure (CNKI) database, and Google Scholar for articles with the keywords “retinopathy of prematurity,” “ROP,” “paediatric vitreoretinal disease,” “prediction model,” “prediction algorithm,” “reactivation,” “recurrence,” and “relapse” (published between January 1, 2001, and September 30, 2021). We found no studies that established a practical prediction model for infants at high risk of ROP reactivation after anti-VEGF treatment based on retinal images or nonimage data;^[36] we identified only univariate analyses based on clinical information. ^{[17, 37]} Hence, to our knowledge, we established the first model for predicting reactivation after TR-ROP treatment. This model uses preoperative parameters, including clinical risk factors and retinal images, to construct prediction models that achieve acceptable performance in both internal and external validation. Moreover, all of these factors can be collected from an EHR.

The most promising aspect of our study is that reactivation can be predicted early and noninvasively, using only data available before treatment. Reactivation prediction in TR-ROP treatment is a critical factor for successful treatment and follow-up planning. Using these ML and DL prediction models, we can optimize treatment strategies for TR-ROP infants before treatment and develop screening plans after treatment. If an infant is predicted to have a high risk of reactivation when receiving anti-VEGF drug therapy, the treatment method can be changed: a fluorescein angiography examination can be performed to find nonperfusion areas and, if necessary, laser therapy can be applied after anti-VEGF treatment, ^{[12, 38]} or more frequent screenings can be performed to detect reactivation promptly. Otherwise, post-treatment screenings can be appropriately reduced. This DL prediction model may minimize reactivation probability and reduce the adverse impacts of reactivation, thus improving a patient’s prognosis.^[32]

To better identify the most important clinical factors, we calculated and ranked the weight ratios of all the factors collected. The top three relevant factors were PMA at initial treatment, GA, and BW. PMA at TR-ROP initial treatment, A-ROP, and respiratory distress syndrome were used in the CRM and added to the CM. Previous studies have also shown that these factors are relevant to TR-ROP reactivation. ^{[17, 20, 37]} PMA at initial treatment contributed the most to the model, and earlier PMA at treatment was associated with a higher risk of reactivation, with 36.90 ± 2.27 weeks and 35.52 ± 2.41 weeks in the non-reactivation and reactivation groups, respectively (P = 0.046), in our study. According to Lyn’s study, ^[19] PMA at initial treatment was significantly different between non-reactivation and reactivation groups. Ling ^[17] reported that early PMA at initial treatment may be associated with less VEGF block and with the infant’s lower capacity to produce antioxidant enzymes. Low GA and BW are important risk factors for ROP, TR-ROP occurrence, and treatment failure. ^{[17, 20]} In Chiharu’s study, ^[37] A-ROP was an independent risk factor for ROP reactivation. Additionally, all of these factors can be noninvasively obtained from EHRs.

A picture is worth a thousand words, especially regarding retinal photographs. Many hidden features in retinal photographs can be analyzed and extracted by computers for diagnosis or therapy-response prediction. Computer-based retinal photograph analysis has been used in prediction models. In clinical practice, the occurrence and severity of ROP are detected from retinal photographs of multiple angles captured by RetCam or an indirect ophthalmoscope. Some experts have proposed that a screening system based solely on vessel characteristics in the posterior pole is sufficient for ROP prediction, diagnosis, progression, and therapy response.^{[31, 39–41]} Gupta^[39] showed that patients with ROP reactivation had a higher vascular severity score before treatment, and the ROP vascular severity score at the time of initial treatment was associated with reactivation, which suggests that retinal photographs before initial treatment may be predictive of reactivation after treatment. In our study, we used posterior fundus images before treatment to build a prediction model, achieving 83.35% accuracy, 93.33% sensitivity, and 72.91% specificity. However, because of the small sample size in our study, more studies are needed to confirm these findings and to locate hidden predictive areas in retinal photographs.

To get the best performance, we tried three algorithms in the CRM model. The CatBoost model achieved the best overall performance in internal and external validation; the CatBoost algorithm is a modification of the GBDT algorithm that is compatible with categorical features, ^[42] and it has achieved acceptable performance in malaria prediction, Parkinson’s disease treatment response, and hospital mortality prediction. ^[43–45] We tried three training schemes to get the best prediction model, and the CM model achieved the best performance using the ResNeSt network in internal validation. Biometrical information was added to the CM model as an optimal strategy, which achieved better performance than the CRM and RM models.

The CM model resembles the process that ophthalmologists follow when making a diagnosis in ROP screenings. With the development of DL applications, more studies are trying to combine biometrical information with clinical information. Coyner^[41] successfully improved the specificity of the TR-ROP prediction model by adding biometrical information; the GA + vascular severity score (VSS) model performed best. In the pancreatic neuroendocrine neoplasm reactivation model after radical surgery constructed by Song, ^[32] the DLR-A model, which added clinical information to deep learning radiomics (DLR), achieved the best overall performance. The HER2-positive breast cancer recurrence model using H&E images and clinical information accurately assesses the risk of recurrence.^[46] Clinical information and imaging biomarkers both play important roles in modeling the reactivation of ROP.

AI achieves good performance in ROP plus detection, ROP diagnosis, and grading. Brown ^[31] created a fully automated DL system for classifying normal, pre-plus, and plus disease using a CNN based on 5,511 RetCam retinal photographs. A pretrained U-Net was used for vessel segmentation, and Inception-V1 was used for the classification task; Inception-V1 was pretrained and then adapted to this classification task using transfer learning. Finally, t-SNE was used to reduce the dimensionality of the image features learned in the classification task so that they could be visualized. Using five-fold cross-validation, the AUC reached 0.98 for plus disease. Compared with senior ophthalmologists, the DL system also showed strong advantages. Visualizing the features with t-SNE revealed continuity of disease features, which makes DL methods more interpretable. The improved automatic DL system passed FDA certification in 2020. Due to the demographic and clinical characteristics of different countries, many countries have created their own ROP AI systems. Zachary from the University of Sydney, Australia partnered with the Amazon Network to develop ROP AI using the Inception-V3 network and RMSProp optimizer. The original data comprised 3,487 fundus photographs, and the number of images was doubled using data augmentation; the final internal validation yielded an AUC of 0.993, a sensitivity of 96.6%, and a specificity of 98.0%, and the external validation yielded an AUC of 0.977, a sensitivity of 93.9%, and a specificity of 80.7%. ^[47] Recently, a more detailed classification system has been developed that can identify the stages of ROP, plus disease, and pre-plus disease. The networks used included Inception-V3, Xception, and InceptionResNet-V2. AUC ranged from 0.98 to 0.99, sensitivity from 0.92 to 0.98, specificity from 0.95 to 0.99, and F1 score from 0.72 to 0.92. The platform is now open for external validation.^[48]

DL models need vast training datasets annotated to a criterion standard; otherwise, they may perform worse or generalize poorly to other domains.^[49] However, in practical use, we cannot always acquire the data we need in certain conditions (e.g., rare diseases). In our study, we used only 87 samples for training and validation, which was far less than the data requirements. The reasons for this were as follows: First, there is no publicly available ROP dataset, which limits the available data. Second, most premature infants meeting screening criteria will not develop ROP, and TR-ROP infants account for less than 10% of them; therefore, collecting a large volume of hospital-based medical data from patients with TR-ROP is difficult. There have been some studies on the application of DL to small samples. Song et al. ^[32] constructed a pancreatic neuroendocrine neoplasm reactivation (DLR) model after radical surgery with preoperative computed tomography (CT) images using only 74 patients, which achieved 0.80 AUC, 0.90 sensitivity, and 0.67 specificity. DL combined with boosting was used to compensate for the small sample size in our study. Meanwhile, the prediction model achieved stable and acceptable performance in external validation, which suggests that a relatively small number of patients may be acceptable in ML and DL analysis.

Limitations And Expectations

First, our study was retrospective, and the number of included cases was small. Positive and negative cases were imbalanced, which limits learning from positive samples.^[50] More prospective studies with larger sample sizes should be conducted to further evaluate the model’s accuracy, and we will also collect cases treated with retinal laser photocoagulation and other anti-VEGF agents. Low-shot and few-shot learning were developed for small datasets. ^[51, 52] Low-shot learning may mitigate the scarcity of images and the AI bias caused by scarce or imbalanced data, ^[51] and we will try low-shot or few-shot learning in the next step of our study.
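To make the imbalance concrete, the sketch below computes inverse-frequency ("balanced") class weights for the 21 recurrent and 66 nonrecurrent cases reported in this study. This is a common, simpler mitigation than low-shot learning and is shown only as an illustration; it is an assumption for this example, not a step of the method described here. The function name `balanced_class_weights` is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), so that
    every class contributes the same total weight to a loss function."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Case counts from this study: 1 = reactivation, 0 = no reactivation.
labels = [1] * 21 + [0] * 66
weights = balanced_class_weights(labels)
print(weights)
```

With these counts, each recurrent case is weighted roughly three times as heavily as a nonrecurrent case, so that the minority class is not drowned out during training.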

Second, we did not perform external validation of the RM and CM models, which limits what can be concluded about their generalization and robustness; we will collect more samples for external validation. In addition, we enrolled only infants treated with aflibercept or conbercept. However, reactivation rates differ between anti-VEGF drugs, so more varied samples are needed to develop a more precise prediction model.

Third, the follow-up time was short. The maximum interval until reactivation after anti-VEGF treatment is unknown, ^[33] and reactivation has been reported up to several years after treatment. Additionally, because of the small sample size, we used repeated data from the same infant in the CM model: when retinal photographs of both the left and right eyes were collected from one infant, the PMA at initial TR-ROP treatment, A-ROP, and respiratory distress syndrome were used twice. When we obtain enough data, the retinal photographs of both eyes from one infant will be treated as one case.
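When both eyes of one infant enter a model as separate samples, a standard precaution is to keep all samples from the same infant on the same side of every train/validation split, so that shared clinical factors cannot leak between them. The sketch below shows one way to do this in plain Python. It is an illustration under stated assumptions, not a description of how the splits in this study were made; the infant identifiers, the fold count, and the function name `grouped_k_fold` are hypothetical.

```python
import random
from collections import defaultdict


def grouped_k_fold(group_ids, k, seed=0):
    """Yield (train_idx, test_idx) pairs in which all samples that share a
    group id (here, an infant) fall into the same fold."""
    groups = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        groups[gid].append(idx)
    unique = sorted(groups)
    if k < 2 or k > len(unique):
        raise ValueError("k must be between 2 and the number of groups")
    random.Random(seed).shuffle(unique)
    folds = [unique[i::k] for i in range(k)]  # distribute groups round-robin
    for held_out in folds:
        held = set(held_out)
        test_idx = sorted(i for g in held_out for i in groups[g])
        train_idx = sorted(i for g in unique if g not in held for i in groups[g])
        yield train_idx, test_idx


# Hypothetical example: one id per photograph; most infants contribute two eyes.
infant_ids = ["A", "A", "B", "B", "C", "D", "D", "E", "E", "F"]
splits = list(grouped_k_fold(infant_ids, k=3, seed=0))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Splitting by infant rather than by photograph usually gives a more conservative, and more realistic, estimate of performance on unseen patients.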

Fourth, we did not visualize the model's decision process. To improve explainability and transparency, we will try to highlight the potentially predictive areas in retinal photographs using saliency maps, which may give a clearer understanding of the pathophysiological mechanisms of ROP reactivation.

In summary, we constructed a prediction model for the reactivation of TR-ROP, which may be useful in clinical practice. To our knowledge, this is the first prediction model for ROP reactivation. We built the model in three ways; the CM model achieved the best AUC and SEN in the internal validation. A more precise model covering different treatments should be explored in the future. In the next step, we will try to use a recurrent neural network to predict the time to reactivation.

Author Contributions

Concept, design, and administrative, technical, or material support: Xiaohe Lu, Songfu Feng, Wenzhao Liang. Acquisition, analysis, or interpretation of data, statistical analysis, and drafting of the manuscript: Yichen Bai, Jiali Li, Rong Wu. Critical revision of the manuscript for important intellectual content: Tao Chen, Rong Wu, Wenzhao Liang. Study supervision: Xiaohe Lu. Final approval of the manuscript: Xiaohe Lu, Songfu Feng.

Declaration of Competing Interest

The authors declare that they have no competing interests.

Funding

None.

Data availability statements

The datasets generated and analysed during the current study are not publicly available due to patient privacy but are available from the corresponding author on reasonable request.

References

Hartnett M E. Retinopathy of Prematurity: Evolving Treatment With Anti-Vascular Endothelial Growth Factor[J]. Am J Ophthalmol, 2020,218:208–213.
Blencowe H, Lawn J E, Vazquez T, et al. Preterm-associated visual impairment and estimates of retinopathy of prematurity at regional and global levels for 2010[J]. Pediatr Res, 2013,74 Suppl 1:35–49.
Yang Q, Zhou X, Ni Y, et al. Optimised retinopathy of prematurity screening guideline in China based on a 5-year cohort study[J]. Br J Ophthalmol, 2021,105(6):819–823.
Xu Y, Zhou X, Zhang Q, et al. Screening for retinopathy of prematurity in China: a neonatal units-based prospective study[J]. Invest Ophthalmol Vis Sci, 2013,54(13):8229–8236.
Wu T, Zhang L, Tong Y, et al. Retinopathy of Prematurity Among Very Low-Birth-Weight Infants in China: Incidence and Perinatal Risk Factors[J]. Invest Ophthalmol Vis Sci, 2018,59(2):757–763.
Cao Y, Jiang S, Sun J, et al. Assessment of Neonatal Intensive Care Unit Practices, Morbidity, and Mortality Among Very Preterm Infants in China[J]. JAMA Netw Open, 2021,4(8): e2118904.
Jin E, Yin H, Li X, et al. SHORT-TERM OUTCOMES AFTER INTRAVITREAL INJECTIONS OF CONBERCEPT VERSUS RANIBIZUMAB FOR THE TREATMENT OF RETINOPATHY OF PREMATURITY[J]. Retina, 2018,38(8):1595–1604.
Ludwig C A, Chen T A, Hernandez-Boussard T, et al. The Epidemiology of Retinopathy of Prematurity in the United States[J]. Ophthalmic Surg Lasers Imaging Retina, 2017,48(7):553–562.
Kang H G, Choi E Y, Byeon S H, et al. Intravitreal ranibizumab versus laser photocoagulation for retinopathy of prematurity: efficacy, anatomical outcomes and safety[J]. Br J Ophthalmol, 2019,103(9):1332–1336.
Yoon J M, Shin D H, Kim S J, et al. OUTCOMES AFTER LASER VERSUS COMBINED LASER AND BEVACIZUMAB TREATMENT FOR TYPE 1 RETINOPATHY OF PREMATURITY IN ZONE I[J]. Retina, 2017,37(1):88–96.
Stahl A, Gopel W. Screening and Treatment in Retinopathy of Prematurity[J]. Dtsch Arztebl Int, 2015,112(43):730–735.
Moshfeghi D M. Systemic Solutions in Retinopathy of Prematurity[J]. Am J Ophthalmol, 2018,193:xiv-xviii.
Chen Y T, Liu L, Lai C C, et al. ANATOMICAL AND FUNCTIONAL RESULTS OF INTRAVITREAL AFLIBERCEPT MONOTHERAPY FOR TYPE 1 RETINOPATHY OF PREMATURITY: One-Year Outcomes[J]. Retina, 2020,40(12):2366–2372.
VanderVeen D K, Melia M, Yang M B, et al. Anti-Vascular Endothelial Growth Factor Therapy for Primary Treatment of Type 1 Retinopathy of Prematurity: A Report by the American Academy of Ophthalmology[J]. Ophthalmology, 2017,124(5):619–633.
Darlow B A, Ells A L, Gilbert C E, et al. Are we there yet? Bevacizumab therapy for retinopathy of prematurity[J]. Arch Dis Child Fetal Neonatal Ed, 2013,98(2):F170-F174.
Cheng Y, Zhu X, Linghu D, et al. Comparison of the effectiveness of conbercept and ranibizumab treatment for retinopathy of prematurity[J]. Acta Ophthalmol, 2020,98(8):e1004-e1008.
Ling K P, Liao P J, Wang N K, et al. RATES AND RISK FACTORS FOR RECURRENCE OF RETINOPATHY OF PREMATURITY AFTER LASER OR INTRAVITREAL ANTI-VASCULAR ENDOTHELIAL GROWTH FACTOR MONOTHERAPY[J]. Retina, 2020,40(9):1793–1803.
Bai Y, Nie H, Wei S, et al. Efficacy of intravitreal conbercept injection in the treatment of retinopathy of prematurity[J]. Br J Ophthalmol, 2019,103(4):494–498.
Lyu J, Zhang Q, Chen C L, et al. Recurrence of Retinopathy of Prematurity After Intravitreal Ranibizumab Monotherapy: Timing and Risk Factors[J]. Invest Ophthalmol Vis Sci, 2017,58(3):1719–1725.
Huang Q, Zhang Q, Fei P, et al. Ranibizumab Injection as Primary Treatment in Patients with Retinopathy of Prematurity: Anatomic Outcomes and Influencing Factors[J]. Ophthalmology, 2017,124(8):1156–1164.
LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015,521(7553):436–444.
Ham Y G, Kim J H, Luo J J. Deep learning for multi-year ENSO forecasts[J]. Nature, 2019,573(7775):568–572.
Li J O, Liu H, Ting D, et al. Digital technology, tele-medicine and artificial intelligence in ophthalmology: A global perspective[J]. Prog Retin Eye Res, 2021,82:100900.
Decuyper M, Maebe J, Van Holen R, et al. Artificial intelligence with deep learning in nuclear medicine and radiology[J]. EJNMMI Phys, 2021,8(1):81.
Wang X, Wang R, Yang S, et al. Combining Radiology and Pathology for Automatic Glioma Classification[J]. Front Bioeng Biotechnol, 2022,10:841958.
Du-Harpur X, Watt F M, Luscombe N M, et al. What is AI? Applications of artificial intelligence to dermatology[J]. Br J Dermatol, 2020,183(3):423–430.
Keenan T, Chen Q, Agron E, et al. DeepLensNet: Deep Learning Automated Diagnosis and Quantitative Classification of Cataract Type and Severity[J]. Ophthalmology, 2022.
Ting D, Cheung C Y, Lim G, et al. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes[J]. JAMA, 2017,318(22):2211–2223.
Lee T, Jammal A A, Mariottoni E B, et al. Predicting Glaucoma Development With Longitudinal Deep Learning Predictions From Fundus Photographs[J]. Am J Ophthalmol, 2021,225:86–94.
Liefers B, Taylor P, Alsaedi A, et al. Quantification of Key Retinal Features in Early and Late Age-Related Macular Degeneration Using Deep Learning[J]. Am J Ophthalmol, 2021,226:1–12.
Brown J M, Campbell J P, Beers A, et al. Automated Diagnosis of Plus Disease in Retinopathy of Prematurity Using Deep Convolutional Neural Networks[J]. JAMA Ophthalmol, 2018,136(7):803–810.
Song C, Wang M, Luo Y, et al. Predicting the recurrence risk of pancreatic neuroendocrine neoplasms after radical resection using deep learning radiomics with preoperative computed tomography images[J]. Ann Transl Med, 2021,9(10):833.
Chiang M F, Quinn G E, Fielder A R, et al. International Classification of Retinopathy of Prematurity, Third Edition[J]. Ophthalmology, 2021,128(10): e51-e68.
Wu W C, Kuo H K, Yeh P T, et al. An updated study of the use of bevacizumab in the treatment of patients with prethreshold retinopathy of prematurity in taiwan[J]. Am J Ophthalmol, 2013,155(1):150–158.
Zhang G, Yang M, Zeng J, et al. COMPARISON OF INTRAVITREAL INJECTION OF RANIBIZUMAB VERSUS LASER THERAPY FOR ZONE II TREATMENT-REQUIRING RETINOPATHY OF PREMATURITY[J]. Retina, 2017,37(4):710–717.
Lin D, Chen J, Lin Z, et al. A practical model for the identification of congenital cataracts using machine learning[J]. EBioMedicine, 2020,51:102621.
Iwahashi C, Utamura S, Kuniyoshi K, et al. FACTORS ASSOCIATED WITH REACTIVATION AFTER INTRAVITREAL BEVACIZUMAB OR RANIBIZUMAB THERAPY IN INFANTS WITH RETINOPATHY OF PREMATURITY[J]. Retina, 2021,41(11):2261–2268.
Garcia G J, Snyder L, Blair M, et al. PROPHYLACTIC PERIPHERAL LASER AND FLUORESCEIN ANGIOGRAPHY AFTER BEVACIZUMAB FOR RETINOPATHY OF PREMATURITY[J]. Retina, 2018,38(4):764–772.
Gupta K, Campbell J P, Taylor S, et al. A Quantitative Severity Scale for Retinopathy of Prematurity Using Deep Learning to Monitor Disease Regression After Treatment[J]. JAMA Ophthalmol, 2019.
Taylor S, Brown J M, Gupta K, et al. Monitoring Disease Progression With a Quantitative Severity Scale for Retinopathy of Prematurity Using Deep Learning[J]. JAMA Ophthalmol, 2019.
Coyner A S, Chen J S, Singh P, et al. Single-Examination Risk Prediction of Severe Retinopathy of Prematurity[J]. Pediatrics, 2021,148(6).
Hancock J T, Khoshgoftaar T M. CatBoost for big data: an interdisciplinary review[J]. J Big Data, 2020,7(1):94.
Lee Y W, Choi J W, Shin E H. Machine learning model for predicting malaria using clinical information[J]. Comput Biol Med, 2021,129:104151.
Yang B, Wang X, Mo J, et al. The amplitude of low-frequency fluctuation predicts levodopa treatment response in patients with Parkinson's disease[J]. Parkinsonism Relat Disord, 2021,92:26–32.
Li L, Zhang Z, Xiong Y, et al. Prediction of hospital mortality in mechanically ventilated patients with congestive heart failure using machine learning approaches[J]. Int J Cardiol, 2022.
Yang J, Ju J, Guo L, et al. Prediction of HER2-positive breast cancer recurrence and metastasis risk from histopathological images and clinical information via multimodal deep learning[J]. Comput Struct Biotechnol J, 2022,20:333–342.
Tan Z, Simkin S, Lai C, et al. Deep Learning Algorithm for Automated Diagnosis of Retinopathy of Prematurity Plus Disease[J]. Transl Vis Sci Technol, 2019,8(6):23.
Wang J, Ji J, Zhang M, et al. Automated Explainable Multidimensional Deep Learning Platform of Retinal Images for Retinopathy of Prematurity Screening[J]. JAMA Netw Open, 2021,4(5):e218758.
Ting D, Peng L, Varadarajan A V, et al. Deep learning in ophthalmology: The technical and clinical considerations[J]. Prog Retin Eye Res, 2019,72:100759.
Akazawa M, Hashimoto K, Katsuhiko N, et al. Machine learning approach for the prediction of postpartum hemorrhage in vaginal birth[J]. Sci Rep, 2021,11(1):22620.
Burlina P, Paul W, Mathew P, et al. Low-Shot Deep Learning of Diabetic Retinopathy With Potential Applications to Address Artificial Intelligence Bias in Retinal Diagnostics and Rare Ophthalmic Diseases[J]. JAMA Ophthalmol, 2020,138(10):1070–1077.
Feng R, Zheng X, Gao T, et al. Interactive Few-Shot Learning: Limited Supervision, Better Medical Image Segmentation[J]. IEEE Trans Med Imaging, 2021,40(10):2575–2588.
