Diabetic retinopathy (DR) is a common complication of diabetes that predominantly affects working-age individuals and can lead to loss of vision. By 2040, an estimated 600 million people will have diabetes, and approximately one third of them are expected to develop DR [1]. DR is usually identified by an ophthalmologist through visual examination of digital fundus images for the presence of one or more retinal lesions such as microaneurysms, soft exudates, haemorrhages, and hard exudates [2]. DR is broadly classified into non-proliferative DR (NPDR) and proliferative DR (PDR). NPDR is the preliminary stage, in which microaneurysms are visible in the fundus image, whereas PDR is the advanced stage and can lead to severe vision loss. NPDR is further subdivided into mild, moderate, and severe NPDR. The international clinical DR severity scale contains five grades for classifying DR from fundus images: grade 0 is no apparent retinopathy, grades 1, 2, and 3 correspond to mild, moderate, and severe NPDR, respectively, and grade 4 is PDR.
PDR can lead to blindness, and the manual evaluation of fundus images places a severe burden on ophthalmologists. Moreover, inaccurate grading of DR can occur when healthcare professionals are inadequately trained. Hence, automated methods for DR screening are warranted to assist ophthalmologists and trained healthcare practitioners. However, poor-quality digital fundus images can lead to false positives, so it is vital to first estimate the quality of the acquired fundus images before proceeding with DR grading [3]. Therefore, fully automated methods for accurate quality estimation (QE) of fundus images are in demand.
In the past decade, several state-of-the-art deep learning (DL) architectures, including AlexNet [4], VGGs [5], GoogLeNet [6], ResNet [7], DenseNet [8], EfficientNets [9, 10], and, more recently, vision transformer (ViT) [11] based models, were developed for various computer vision tasks such as object localization, detection, and classification. Although training DL models from scratch requires massive amounts of data, transfer learning (TL) allows already trained models to be adapted to new classification tasks, thereby eliminating the need for huge training datasets. Both TL and DL have played a major role in healthcare, driving the development of DL-based automated systems that operate on medical images such as radiographs, computed tomography, digital fundus images, positron emission tomography, and magnetic resonance imaging for diagnostic and prognostic tasks, as well as assisting medical practitioners in scenarios such as faster data acquisition and quality control [12–14]. EfficientNetV2 is one of the recently developed DL architectures, based on progressive learning combined with neural architecture search and scaling to improve both training speed and parameter efficiency [15]. EfficientNetV2 outperformed several previous state-of-the-art models, including ViTs, in image classification on the ImageNet challenge. The contributions of this work are as follows:
i) A fully automated method for overall QE of digital fundus images is proposed using an ensemble of EfficientNetV2 small (S), medium (M), and large (L) models, since model ensembling has proved effective in previous studies [16] (see the sketch after this list).
ii) The proposed ensemble model is cross-validated and tested on the publicly available DeepDRiD dataset, as QE on images from this dataset is known to be challenging [3].
iii) The performance of the proposed ensemble model for overall QE is further analysed after stratification by DR disease severity.
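As a rough illustration of contribution i), the following is a minimal sketch of the approach: pretrained EfficientNetV2-S/M/L backbones are fine-tuned via transfer learning for 2-class quality estimation, and their softmax outputs are averaged at inference (soft voting). The timm model names, optimiser settings, and data-loading details are illustrative assumptions, not the exact training configuration used in this work.

```python
# Sketch (assumed details): fine-tune pretrained EfficientNetV2-S/M/L for
# 2-class fundus quality estimation and average their softmax outputs.
import torch
import torch.nn.functional as F
import timm

MODEL_NAMES = ["tf_efficientnetv2_s", "tf_efficientnetv2_m", "tf_efficientnetv2_l"]
NUM_CLASSES = 2  # good vs. bad quality


def build_models(device="cuda"):
    models = []
    for name in MODEL_NAMES:
        # Transfer learning: start from ImageNet weights, replace the classifier head.
        m = timm.create_model(name, pretrained=True, num_classes=NUM_CLASSES)
        models.append(m.to(device))
    return models


def fine_tune(model, loader, epochs=5, lr=1e-4, device="cuda"):
    # Plain fine-tuning of all layers with a small learning rate (assumed schedule).
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:  # loader yields (B, 3, H, W) image tensors
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            optimizer.step()


@torch.no_grad()
def ensemble_predict(models, images, device="cuda"):
    # Soft voting: average the softmax probabilities of the three models.
    for m in models:
        m.eval()
    probs = torch.stack([F.softmax(m(images.to(device)), dim=1) for m in models])
    return probs.mean(dim=0).argmax(dim=1)  # predicted quality label per image
```

Averaging probabilities rather than hard labels lets the three differently sized backbones contribute proportionally to their confidence, which is one common way to realise the ensembling referred to above.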
Related work
Several works exist in the literature on the quality estimation of digital fundus images using both machine learning and deep learning techniques. These works can broadly be divided into 2-class and 3-class classification problems, summarised in Table 1. In 2-class classification, images are labelled as either good or bad quality, whereas in the 3-class problem, images are labelled as good, moderate, or bad quality. In [17], a PLS regressor was developed based on handcrafted features and achieved an area under the receiver operating characteristic curve (AUC) of 95.8% on a private dataset. Further, using a support vector machine (SVM) classifier on a mixture of private and public datasets containing fundus images of varying resolutions, [18] demonstrated an accuracy of 91.4%, [19] obtained an AUC of 94.5%, and [20] achieved a sensitivity of 95.3% in fundus image QE. In other studies based on EyePACS Kaggle datasets [21, 22], pretrained deep learning models were fine-tuned for feature extraction, and the extracted features were fed to an SVM classifier to detect bad-quality fundus images; the highest classification accuracy in these studies was 95.4%. Using the DRIMDB dataset, several ML classifiers were developed, including gcforest and a random forest regressor [23–25], achieving accuracies above 98%.
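For illustration only, the following is a minimal sketch of the CNN-feature-plus-SVM pipeline described in [21, 22], assuming a pretrained AlexNet backbone from torchvision as a fixed feature extractor and a scikit-learn SVM; the preprocessing (e.g. any saliency-based steps used in the original studies) and hyperparameters are omitted or assumed.

```python
# Illustrative sketch (assumed details): pretrained CNN features + SVM
# for flagging bad-quality fundus images.
import torch
from torchvision import models
from sklearn.svm import SVC


@torch.no_grad()
def extract_features(images):
    # Pretrained AlexNet up to its penultimate layer as a fixed feature extractor.
    backbone = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    backbone.classifier = backbone.classifier[:-1]  # drop the 1000-way ImageNet head
    backbone.eval()
    return backbone(images).numpy()  # (N, 4096) feature matrix

# Hypothetical usage: X_train/X_test are preprocessed 224x224 fundus image tensors,
# y_train holds quality labels (0 = good, 1 = bad).
# svm = SVC(kernel="rbf").fit(extract_features(X_train), y_train)
# quality_labels = svm.predict(extract_features(X_test))
```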
Some recent studies addressed 3-class classification of fundus image quality using a light-weight CNN [26] and an ensemble of CNNs [27] on Kaggle datasets, obtaining accuracies above 85% for 3-class classification. In the most recent study [28], a pretrained ResNet50 fine-tuned on a Kaggle dataset demonstrated an accuracy of 98.6%. Overall, on the private and public datasets mentioned so far, the classification task is relatively easy because the quality classes are readily differentiable to the naked eye. However, in a recent digital fundus image QE challenge [3], the good- and bad-quality images in the DeepDRiD dataset are hard to differentiate, and the highest accuracy obtained in the challenge was 69.81%. Therefore, there is scope to improve QE on DeepDRiD images, and in the present study we explore this using EfficientNetV2 models and their ensemble [15].
Table 1
Previous works on assessment of fundus image quality using different machine learning and deep learning methods on various private and public datasets. DeepDRiD: diabetic retinopathy grading and image quality estimation challenge dataset; CNN: convolutional neural network; ML: machine learning; DL: deep learning; PLS: partial least squares; SVM: support vector machine.
Study | Method | Dataset | Image resolution | Performance (%)
Yu H et al. [17] | PLS regressor | Private – 1884 | 4752 × 3168 | AUC: 95.8
Yu F et al. [21] | SM + AlexNet + SVM | Kaggle – 5200 (subset) | Original: 2592 × 1944; resized: 256 × 256 | Accuracy: 95.4; AUC: 98.2
Yao Z et al. [18] | SVM | Private – 3224 | – | Accuracy: 91.4; AUC: 96.2
Welikala R.A. et al. [20] | SVM | UK Biobank – 800 (subset) | 2048 × 1536 | Sensitivity: 95.3; Specificity: 91.1
Wang S et al. [19] | SVM | Private and public – 536 | Private: 2560 × 1960; public: 570 × 760 and 565 × 584 | AUC: 94.5; Sensitivity: 87.4; Specificity: 91.7
S Feng et al. [22] | DT, SVM and DL | EyePACS at Kaggle – 4372 | Multiple resolutions | Accuracy: 93.6; Sensitivity: 94.7; Specificity: 92.3
S Ugur et al. [23] | Several ML classifiers | DRIMDB – 216 | 570 × 760 | Accuracy: 98.1
Raj A et al. [27] | Ensemble of CNNs | FIQuA (EyePACS at Kaggle) – 1500 | Multiple resolutions | Accuracy: 95.7 (3-class classification)
Perez AD et al. [26] | Light-weight CNN | Kaggle – 4768 (2-class); Kaggle – 28792 (3-class) | 896 × 896 | Accuracy: 91.1 (2-class); 85.6 (3-class)
Liu H et al. [25] | gcforest | DRIMDB – 216 (3-class); ACRIMA – 705 (2-class) | Multiple resolutions | Accuracy: 88.6 (DRIMDB); 85.1 (ACRIMA)
Karlsson RA et al. [24] | Random forest regressor | Private – 787 oximetry and 253 RGB; DRIMDB – 216 (194 were used) | 1600 × 1200 (oximetry); 3192 × 2656 (RGB); 760 × 570 (DRIMDB) | Accuracy: 98.1 (DRIMDB); ICC: 0.85 (oximetry); ICC: 0.91 (RGB)
Shi C et al. [28] | Pretrained ResNet50 | Kaggle – 2434 (2-class) | Multiple resolutions | Accuracy: 98.6; Sensitivity: 98.0; Specificity: 99.1
Liu R [3] | ISBI 2020 grand challenge | DeepDRiD – 2000 (2-class) | Multiple resolutions | Accuracy: 69.81