Convolutional Neural Networks for automated detection of Diabetic Macular Edema

doi:10.21203/rs.3.rs-1989082/v2

Abstract

Diabetic Macular Edema (DME) is a diabetes-induced pathology responsible for the degradation of visual health among diabetic patients. It initially causes blurred vision and may lead to complete loss of eyesight. Hence, it is crucial to detect the symptoms of DME at an early stage. Convolutional Neural Networks (CNNs) are widely preferred systems for medical image classification. This paper proposes an efficient CNN model for automated classification of DME. Additionally, transfer learning is employed for two pre-trained CNNs, VGG16 and DenseNet. The performance of the proposed CNN is compared with that of VGG16 and DenseNet in terms of classifier accuracy, loss function and area under the Receiver Operating Characteristic (ROC) curve. The proposed CNN exceeds DenseNet in classifier accuracy by 0.48 percentage points and has a loss function lower by 1.47 percentage points. VGG16 performs best on all three measures, with a classifier accuracy of 87.41%, a loss function of 29.64% and an area under the ROC curve of 0.96.

Keywords: convolutional neural networks, diabetic macular edema, classification, classifier accuracy, optical coherence tomography

1. Introduction

The occurrence of Diabetes Mellitus (DM) is increasing sharply worldwide. A study projects 360 million more cases of DM by 2030 [1]. DM causes severe damage to human physiology, including degradation of bones, nerves and other vital organs. Due to the increased level of glucose in the blood, the blood capillaries get perforated and their content leaks into the surrounding space, triggering various kinds of complications.

Diabetic Macular Edema (DME) is a retinal pathology brought on by a prolonged and uncontrolled diabetic condition. It shows up as fluid cysts within the retina and as thickening of the retina, both resulting from fluid leaking from damaged macular blood vessels. The macula is the central portion of the retina; it is rich in cones and therefore responsible for clear, sharp and detailed vision. Any abnormality in the blood vessels of this region causes an alarming loss of visual acuity which, if ignored for too long, may be followed by complete blindness. Therefore, it is of utmost importance to identify this impairment as early as possible so that the required treatment can proceed.

The imaging methods used in diagnosing Diabetic Retinopathy (DR) are biomicroscopy, fluorescein angiography, fundus imaging and Optical Coherence Tomography (OCT) scans [2]. Of these, fundus images and OCT scans are the modalities most often used for detection of DME [3].

This paper compares the effectiveness of three Convolutional Neural Networks (CNNs) using a publicly available OCT dataset. The paper is organized as follows. Section 2 reviews related previous work. Section 3 explains the proposed methodology. Section 4 gives the experimental setup and dataset details. Section 5 covers the results. Section 6 presents the conclusion and future scope.

2. Related Work

Numerous studies have been carried out in order to design an effective, fully automated DME classification system. Several soft computing and machine learning algorithms have been proposed to raise the classification accuracy to an acceptable level, and some of them have yielded encouraging results.

CNNs have evolved into a prominent and efficient tool for classification of medical images. Kermany et al. [3] proposed the use of transfer learning to make a CNN-based classifier fast and comparably accurate. It was tested on both small and larger datasets, and an accuracy of up to 96.6% was attained. With some modification of the transfer learning algorithm, Karri et al. [4] achieved a higher accuracy of 99%. Their approach used back propagation to rectify the filters. Although the accuracy in detecting the normal eye improved, the sensitivity and specificity decreased significantly. Kamble et al. [9] suggested fine-tuning a pre-trained CNN to classify normal and abnormal eyes and obtained 100% accuracy on a small dataset.

Pratt et al. [5] proposed a deep neural network for image classification on a large dataset. The algorithm provided an accuracy of 75% for 5-grade classification. Gulshan et al. [6] proposed a deep CNN with an Inception-v3 classifier for two different datasets and reported that the accuracy varied between the two datasets. Grassmann et al. [7] further incorporated a random forest classifier with a deep CNN for 13 classes. Perdomo et al. [8] suggested a deep-CNN-based diagnosis using 5 features and obtained an accuracy of 93.75%.

Giancardo et al. [10] proposed probabilistic, geometric and tree-based classification with wavelet-based feature extraction and obtained an area under the curve between 0.88 and 0.94. Acharya et al. [11] and Deepak et al. [12] suggested supervised learning for neural networks, while Srinivasan et al. [13] implemented an SVM classifier with histogram-based feature extraction to classify normal, DME and Age-related Macular Degeneration (AMD) eyes, obtaining 86.7% accuracy in detecting the normal eye and 100% accuracy in detecting DME and AMD. Chan et al. [14] proposed an SVM classifier with CNN-based feature extraction and achieved a classification accuracy of 96%.

3. Proposed Methodology

This work compares the performance of two pre-trained networks, VGG-16 and DenseNet, with that of our proposed CNN for automated classification of OCT scans as normal or DME. Fig. I shows the proposed methodology.

3.1 Data collection

Secondary data is utilized for this study. A total of 3000 OCT scans, comprising both normal and DME cases, were taken from a publicly available dataset. Of these 3000 images, 124 were discarded due to severe distortions and 2876 were retained. Fig. II and Fig. III show examples of a normal and a DME eye scan.

3.2 Pre-processing

The collected dataset consisted of raw images of varying size, and some of the images contained a preponderance of unwanted pixels. Hence, for better feature extraction and training of the CNNs, the images are resized for uniformity and cropped to achieve a better organized dataset.

3.3 Image segmentation

The pre-processed images are segmented using k-means clustering. The value k = 4 is chosen to get better insight into how the image pixels group together. The segmented images are then filtered again to obtain the final segmented image with only 2 clusters. The resulting segmented dataset is used for training and testing of the CNNs.
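The two-stage clustering described above can be sketched on a toy example. This is only an illustration under stated assumptions: the pixels are treated as grayscale intensities, the centroids are initialized evenly over the intensity range, and the reduction from 4 clusters to 2 is done by comparing each centroid with the mean centroid. The text does not specify the filtering rule, so the rule used in `segment` is hypothetical, as are the names `kmeans_1d`, `segment` and `toy_scan`.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar intensities; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly over the intensity range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def segment(image, k=4):
    """Cluster pixels into k groups, then merge the groups into a binary mask."""
    flat = [p for row in image for p in row]
    centroids, labels = kmeans_1d(flat, k)
    # Assumed merge rule: clusters brighter than the mean centroid form class 1.
    threshold = sum(centroids) / k
    binary = [1 if centroids[lab] > threshold else 0 for lab in labels]
    width = len(image[0])
    return [binary[i:i + width] for i in range(0, len(binary), width)]


# Made-up 3x4 intensity patch, not taken from the dataset.
toy_scan = [
    [10, 12, 11, 200],
    [90, 95, 210, 205],
    [150, 160, 155, 12],
]
mask = segment(toy_scan)
```

On a real scan the same idea would normally be applied with a library implementation of k-means; the pure-Python version is used here only to keep the sketch self-contained.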

3.4 Image Classification using CNN

The segmented dataset is partitioned into training and testing sets for the CNNs. The same partition is used to train the proposed CNN. Transfer learning is employed for the pre-trained CNNs. The architectures of the CNNs used are explained as follows:

VGG-16

VGG-16 is a widely used CNN for classification of medical images. It is pre-trained on a subset of the ImageNet database. It consists of 21 layers in total, starting with convolutional layers combined with max-pooling layers and ending with dense layers. The numbers of convolutional, max-pooling and dense layers are 13, 5 and 3 respectively. It has approximately 138 million parameters contained in its 16 weight layers.
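The layer and parameter figures quoted for VGG-16 can be checked with a short calculation. The sketch below assumes the standard published configuration (3x3 kernels, a 224x224x3 input and a 1000-class ImageNet output layer); none of these values are stated in this paper, and the classification head actually used for transfer learning here is not described. The names `VGG16_CFG` and `vgg16_summary` are illustrative.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter for a square convolution kernel."""
    return (kernel * kernel * in_ch + 1) * out_ch


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_units + 1) * out_units


# Output channels of the 13 convolutional layers; "M" marks a 2x2 max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_summary(input_size=224, in_ch=3, num_classes=1000):
    """Return (conv layers, pooling layers, total parameters)."""
    total, size, convs, pools = 0, input_size, 0, 0
    for item in VGG16_CFG:
        if item == "M":
            size //= 2          # pooling halves the spatial resolution
            pools += 1
        else:
            total += conv_params(3, in_ch, item)
            in_ch = item
            convs += 1
    units_in = size * size * in_ch   # flattened feature map feeding the dense layers
    for units in (4096, 4096, num_classes):
        total += dense_params(units_in, units)
        units_in = units
    return convs, pools, total


convs, pools, total = vgg16_summary()
```

Under these assumptions the count gives 13 convolutional layers, 5 pooling layers and 138,357,544 parameters, which agrees with the "approximately 138 million" quoted above.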

DenseNet

DenseNet is a deep convolutional network designed to make very deep CNNs easier to train. It consists of dense blocks and transition layers. Each layer in a dense block applies batch normalization (BN), a rectified linear unit (ReLU) and convolution (conv) in sequence. The transition layers perform downsampling via convolution and pooling operations. To summarize, DenseNet has 120 convolution layers and 4 average pooling layers.
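The layer totals quoted for DenseNet can be reproduced with a small count. The sketch below assumes the DenseNet-121 variant (dense blocks of 6, 12, 24 and 16 layers, growth rate 32, 64 stem channels, compression 0.5). The text does not name the variant; DenseNet-121 is assumed because it matches the 120 convolution layers stated above. The function names are illustrative.

```python
BLOCK_SIZES = (6, 12, 24, 16)  # dense-block depths of DenseNet-121 (assumed variant)


def densenet_layer_counts(block_sizes=BLOCK_SIZES):
    """Return (convolution layers, average pooling layers)."""
    stem_convs = 1                        # initial 7x7 convolution
    dense_convs = 2 * sum(block_sizes)    # each dense layer: 1x1 conv, then 3x3 conv
    transitions = len(block_sizes) - 1    # one transition between consecutive blocks
    conv_layers = stem_convs + dense_convs + transitions  # one 1x1 conv per transition
    avg_pools = transitions + 1           # one per transition plus the global pooling
    return conv_layers, avg_pools


def final_channels(block_sizes=BLOCK_SIZES, growth_rate=32, stem_channels=64):
    """Track how feature-map concatenation grows the channel count."""
    channels = stem_channels
    for i, depth in enumerate(block_sizes):
        channels += depth * growth_rate   # every dense layer appends growth_rate maps
        if i < len(block_sizes) - 1:
            channels //= 2                # transition layer halves the channels
    return channels


conv_layers, avg_pools = densenet_layer_counts()
features = final_channels()
```

With these assumptions the count gives 120 convolution layers and 4 average pooling layers, and the final feature vector has 1024 channels.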

Proposed CNN

A 20-layer CNN is proposed to classify OCT scans of normal and DME eyes. The proposed model consists of an input layer, 5 convolution layers, 5 max-pooling layers, 5 dropout layers and 1 flatten layer, followed by 3 dense layers. The architecture of the proposed CNN is given in Table I. All convolutional layers use ReLU activation. The optimizer used is Adamax, with a learning rate of 0.001. The model has 29,202 parameters in total, all of which are trainable. The dropout rate is fixed at 0.20.

Table I: Proposed CNN architecture

Sr. No.	Layer (type)	Output shape	Parameters
1	Input layer	(128, 128, 3)	0
2	conv2d (Conv2D)	(128, 128, 32)	416
3	max_pooling2d (MaxPooling2D)	(64, 64, 32)	0
4	dropout (Dropout)	(64, 64, 32)	0
5	conv2d_1 (Conv2D)	(64, 64, 16)	2064
6	max_pooling2d_1 (MaxPooling2D)	(32, 32, 16)	0
7	dropout_1 (Dropout)	(32, 32, 16)	0
8	conv2d_2 (Conv2D)	(32, 32, 16)	1040
9	max_pooling2d_2 (MaxPooling2D)	(16, 16, 16)	0
10	dropout_2 (Dropout)	(16, 16, 16)	0
11	conv2d_3 (Conv2D)	(16, 16, 8)	520
12	max_pooling2d_3 (MaxPooling2D)	(8, 8, 8)	0
13	dropout_3 (Dropout)	(8, 8, 8)	0
14	conv2d_4 (Conv2D)	(8, 8, 8)	264
15	max_pooling2d_4 (MaxPooling2D)	(4, 4, 8)	0
16	dropout_4 (Dropout)	(4, 4, 8)	0
17	flatten (Flatten)	(128)	0
18	dense (Dense)	(128)	16512
19	dense_1 (Dense)	(64)	8256
20	dense_2 (Dense)	(2)	130
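The parameter column of Table I can be reproduced from the layer widths alone. The sketch below assumes 2x2 convolution kernels with 'same' padding and 2x2 max pooling. The kernel size is not stated in the text; it is inferred here because it is the only square kernel size that yields the 416 parameters of the first convolution, (2*2*3 + 1) * 32. The names `CONV_FILTERS`, `DENSE_UNITS`, `KERNEL` and `proposed_cnn_params` are illustrative.

```python
CONV_FILTERS = [32, 16, 16, 8, 8]   # filters of the five Conv2D layers (Table I)
DENSE_UNITS = [128, 64, 2]          # units of the three Dense layers (Table I)
KERNEL = 2                          # assumption: inferred from the parameter counts


def proposed_cnn_params(input_shape=(128, 128, 3)):
    """Return a list of (layer kind, parameter count) for the weighted layers."""
    height, width, channels = input_shape
    rows = []
    for filters in CONV_FILTERS:
        # Weights plus one bias per filter.
        rows.append(("conv", (KERNEL * KERNEL * channels + 1) * filters))
        channels = filters
        # 'same' convolution keeps the size; 2x2 max pooling halves it.
        height, width = height // 2, width // 2
    units_in = height * width * channels   # flatten: 4 * 4 * 8 = 128
    for units in DENSE_UNITS:
        rows.append(("dense", (units_in + 1) * units))
        units_in = units
    return rows


counts = [params for _, params in proposed_cnn_params()]
total_params = sum(counts)
```

Under these assumptions the per-layer counts match the table row by row and sum to the 29,202 parameters reported for the model.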

4. Experimental Setup And Dataset

The experimental code is implemented in Python 3.9.12 using scikit-image, scikit-learn, Keras and TensorFlow. An NVIDIA RTX 3060 GPU with 8 GB of memory is used for training. The library versions are TensorFlow 2.8.0, scikit-image 0.19.2, scikit-learn 1.1.1, Keras 2.8.0 and keras-preprocessing 1.1.2.

Publicly available data is collected from kaggle.com. The dataset consists of 2876 OCT images in total (958 images of DME cases and 1918 images of normal cases). Of these 2876 images, 2455 are used to train the CNNs and 421 are used for testing and validation.

5. Result

VGG-16 and DenseNet are used to classify the segmented images into 2 classes, normal and DME. The proposed CNN model is also implemented and trained on our segmented dataset. The classifier performance of all three CNNs is compared in terms of accuracy and loss curves as well as ROC curves. The results are shown in Fig. IV-IX. The respective classifier accuracies are shown in Table II. We achieved a classifier accuracy of up to 87.41% with the VGG16 model on our dataset.

The ROC curves obtained are shown in Fig. X-XII. Our proposed CNN model attained an area under the ROC curve of 0.92 for both classes. VGG16 achieved 0.96, while DenseNet also achieved 0.92.
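For reference, the area under the ROC curve can be computed directly from classifier scores as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below is a generic illustration of that definition; the labels and scores are made up and are not outputs of any of the three CNNs, and the name `roc_auc` is illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = DME, 0 = normal; scores are predicted DME probabilities.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

In this toy case 11 of the 12 positive-negative pairs are ordered correctly, so the area is 11/12. A value of 1.0 would mean perfect separation of the two classes and 0.5 would correspond to chance.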

The classifier performance of the three CNNs is compared in terms of classifier accuracy. Our proposed CNN achieved a classifier accuracy of 83.14%, VGG16 achieved 87.41% and DenseNet achieved 82.66%. Table II shows the comparison.

Table II: Performance comparison of different CNNs

Sr. No.	CNN Model	Classifier Accuracy (%)	Loss function (%)	Area under Curve
1	Proposed CNN	83.14	36.97	0.92
2	VGG16	87.41	29.64	0.96
3	DenseNet	82.66	38.44	0.92
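The margins between the proposed CNN and DenseNet follow directly from Table II. The short check below copies the table values into a dictionary (the dictionary layout and variable names are illustrative) and computes the differences, which are differences in percentage points.

```python
# Values copied from Table II.
results = {
    "Proposed CNN": {"accuracy": 83.14, "loss": 36.97, "auc": 0.92},
    "VGG16": {"accuracy": 87.41, "loss": 29.64, "auc": 0.96},
    "DenseNet": {"accuracy": 82.66, "loss": 38.44, "auc": 0.92},
}

# Accuracy margin of the proposed CNN over DenseNet, in percentage points.
acc_gain = round(results["Proposed CNN"]["accuracy"] - results["DenseNet"]["accuracy"], 2)

# Loss margin of the proposed CNN below DenseNet, in percentage points.
loss_drop = round(results["DenseNet"]["loss"] - results["Proposed CNN"]["loss"], 2)

# Model with the highest classifier accuracy.
best = max(results, key=lambda model: results[model]["accuracy"])
```

The computed margins are 0.48 for accuracy and 1.47 for loss, and VGG16 is the model with the highest accuracy.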

6. Conclusion And Future Scope

VGG16, DenseNet and the proposed CNN are used to classify OCT scans into DME and normal classes, and the results are compared for classifier performance. After testing the trained CNNs, it is found that VGG16 provides the best classifier performance of the three in terms of classifier accuracy, loss function and area under the ROC curve. VGG16 outperforms both DenseNet and our proposed CNN, while our proposed CNN shows better results than DenseNet: it provides higher classifier accuracy and a lower loss function, and the area under the ROC curve is equal for the two. The performance of the proposed CNN is found to be consistent during the testing and validation phases. Although the proposed CNN lags behind VGG16 in performance, it may still be improved by the addition of more layers. The future scope of this study lies in enhancing the performance of the proposed CNN by adding more layers and in comparing it with other available CNNs. Its performance can also be studied on other, larger datasets.

Declarations

Ethical approval and consent to participate

- NA

Human and Animal Ethics

- NA

Consent for publication

- NA

Availability of supporting data

- The dataset used and analyzed during this study is available at https://www.kaggle.com/paultimothymooney/kermany2018

Funding

- No funding.

Authors' contributions

- Manisha Bangar drafted the manuscript and performed the data analysis, and Dr. Prachi Chaudhary compiled the results in figures and tables. All authors have reviewed the manuscript. The authors have no competing interests to declare that are relevant to the content of this article.

Acknowledgements

- NA

References

[1] S. Wild et al. (2004) Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care 27:1047-1053. PMID: 15111519
[2] M. R. K. Mookiah et al. (2013) Computer-Aided Diagnosis of Diabetic Retinopathy: A Review. Accepted manuscript, Computers in Biology and Medicine.
[3] Daniel S. Kermany et al. (2018) Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell 172, 1122-1131. Elsevier.
[4] S. P. K. Karri et al. (2017) Transfer Learning based classification of Optical Coherence Tomography images with Diabetic Macular Edema and dry Age-Related Macular Degeneration. Biomedical Optics Express, Vol. 8, No. 2, 579.
[5] Harry Pratt et al. (2016) Convolutional Neural Networks for Diabetic Retinopathy. International Conference on Medical Imaging Understanding and Analysis 2016, Procedia Computer Science 90, 200-205. Elsevier.
[6] Varun Gulshan et al. (2016) Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. Original Investigation, American Medical Association.
[7] Felix Grassmann et al. (2018) A Deep Learning Algorithm for Prediction of Age-Related Eye Disease Study Severity Scale for Age-Related Macular Degeneration from Color Fundus Photography. American Academy of Ophthalmology.
[8] Perdomo et al. (2018) OCT-NET: A Convolutional Network for Automatic Classification of Normal and Diabetic Macular Edema using SD-OCT Volumes. IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018).
[9] Kamble et al. (2018) Automated Diabetic Macular Edema (DME) Analysis using Fine Tuning with Inception-Resnet-v2 on OCT Images. IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBE 2018).
[10] Giancardo et al. (2012) Exudate-based diabetic macular edema detection in fundus images using publicly available datasets. Medical Image Analysis 16, 216-226.
[11] U. Rajendra Acharya et al. (2017) Automated Diabetic Macular Edema grading system using DWT, DCT features and Maculopathy Index. Computers in Biology and Medicine 84:59-68. doi:10.1016/j.compbiomed.2017.03.016
[12] Deepak et al. (2012) Automatic Assessment of Macular Edema from Color Retinal Images. IEEE Transactions on Medical Imaging, Vol. 31, No. 3, March 2012.
[13] Pratul P. Srinivasan et al. (2014) Fully Automated Detection of Diabetic Macular Edema and dry age-related macular degeneration from optical coherence tomography images. Biomedical Optics Express, Vol. 5, No. 10, 3568. OSA.
[14] Genevieve et al. (2017) Transfer Learning for Diabetic Macular Edema (DME) Detection on Optical Coherence Tomography (OCT) Images. Proceedings of the 2017 IEEE International Conference on Signal and Image Processing Applications (IEEE ICSIPA 2017), Malaysia.

No competing interests reported.
