
Materials Today: Proceedings 45 (2021) 5660–5664

Contents lists available at ScienceDirect

Materials Today: Proceedings


journal homepage: www.elsevier.com/locate/matpr

Early-stage prediction of glaucoma disease to reduce surgical requirements using deep-learning
Niharika Thakur, Mamta Juneja ⇑
Computer Science and Engineering, University Institute of Engineering and Technology, Panjab University, Chandigarh, India

ARTICLE INFO

Article history:
Available online 11 March 2021

Keywords:
Glaucoma
Retinal images
Deep-learning
Classification

ABSTRACT

Glaucoma is an emerging retinal disease which may become the most common cause of blindness if not detected at an early stage. Retinal examination used by ophthalmologists for diagnosis is tedious and time consuming; thus, computer vision is being increasingly used for earlier diagnosis employing retinal images. Researchers these days are using different machine learning and deep learning approaches for diagnosis. As machine learning techniques require extraction of handcrafted features for classification, the use of deep learning is found to be promising for automated diagnosis. This paper presents a comparative analysis of different state-of-the-art deep learning techniques such as Xception, Inception, DenseNet, ResNet and VGG. Further, the approaches are compared using parameters such as precision, recall and accuracy. The outcome of this study could be employed for designing handheld diagnostic tools for glaucoma that can be used by medical practitioners and researchers for analysis of retinal images and prediction of glaucoma. As a result, diagnosis performed using computer-aided diagnosis (CAD) systems with imaging modalities would perform better in the presence of illumination disturbances as well, and would reduce the diagnostic time and cost incurred with conventional devices such as the tonometer and pachymeter for retinal examination. Additionally, the life of handheld diagnostic devices would increase due to ease of use, and recurrent use of retinal cameras for acquisition of fundus images of the same eye would reduce due to improved prediction at a single instance. Further, these systems would help to predict glaucoma at an earlier stage, plan treatment using medications and reduce the number of surgeries. If the disease is predicted at an earlier stage, this would in turn save the patient from surgery performed at an advanced stage, thereby optimizing the use of materials such as stainless steel and titanium used in designing surgical equipment.
© 2021 Elsevier Ltd. All rights reserved.
Second International Conference on Aspects of Materials Science and Engineering (ICAMSE 2021).

⇑ Corresponding author.
E-mail address: mamtajuneja@pu.ac.in (M. Juneja).

N. Thakur and M. Juneja, Materials Today: Proceedings 45 (2021) 5660–5664
https://doi.org/10.1016/j.matpr.2021.02.458
2214-7853/© 2021 Elsevier Ltd. All rights reserved.

1. Introduction

Glaucoma is a disease of the eye caused by optic nerve damage, resulting in irreversible loss of vision. The damage to the optic nerve is due to increased pressure in the eye caused by a buildup of fluid [1]. It is the second leading cause of vision loss after cataracts. As per the study performed by Tham et al., the number of people affected is expected to rise to 111 million by 2040 if left untreated, and glaucoma may become the most common cause of blindness [2]. Patients with suspected glaucoma show no initial symptoms until the disease has progressed to an advanced stage. Thus, earlier detection of this disease may play a significant role in its prevention. Ophthalmologists use different approaches such as fluid measurement of the eye, thickness measurement of the cornea and testing of side vision. But all these approaches are manual and time consuming, so the retinal fundus imaging modality, which captures the backside of the retina, is being used for diagnosis [3]. Fig. 1 shows a retinal fundus image acquired using a fundus camera.

The retinal image comprises the optic disc and the optic cup, which are key indicators for disease prediction. The optic disc is located at the back of the eye, where the retinal vessels merge together, and is also known as the blind spot. The optic cup is the central depression of variable size on the optic disc, which grows with increased pressure in the eye. The size of the optic cup increases with the advancement of the disease until it covers the optic disc completely, resulting in blindness. Vision loss caused by glaucoma cannot be reversed once it has occurred, but its further progression can be halted if it is diagnosed at an earlier stage. Researchers across the world are using different machine learning and deep learning

approaches for earlier prediction of glaucoma, which still require improvement for accurate diagnosis [4]. Hence, this study presents an analysis of different deep learning approaches to identify the one best suited for prediction of glaucoma and to make further improvements in future.

Fig. 1. Retinal fundus image.

2. Related work

As per the recent study performed by Thakur et al. in 2020, the use of deep learning-based approaches has been suggested for classification in the near future for improved performance [5]. Thus, this section presents some of the recently used deep learning-based approaches for classification of glaucoma in retinal images.

Benzebouchi et al. in 2017 performed classification of glaucoma in retinal images using a convolutional neural network (CNN) with 25 epochs at 10 and 5 iterations respectively [6]. Also, Bander et al. in 2017 used CNN-based deep learning for extraction of features to classify the retinal image as 'abnormal' or 'normal' using an SVM classifier. The pre-trained AlexNet CNN model with 23 layers of convolution, max-pooling, fully connected, softmax and output was used for training of feature extraction [7]. Further in 2018, Chai et al. used a combination of a fully convolutional network (FCN), a recurrent convolutional network (RCNN) and a CNN. The use of a hybrid approach increased the computational complexity of the network, but improved the performance [8]. Also, in 2018 Naoto et al. performed glaucoma classification using deep residual networks and found them to be better than random forest (RF) and support vector machine (SVM) [9]. Also, Raghavendra et al. in 2018 applied an eighteen-layer CNN to extract deep features from retinal fundus images for diagnosing glaucoma. The CNN used comprises two parts: the first consists of convolutional and max-pooling layers, with the input of each layer being the output of the previous layer, while the second includes a fully connected layer with a softmax activation function for classification of the extracted features [10]. Similarly, Norouzifard et al. in 2018 also suggested deep learning-based classification employing ResNet and VGG comprising pooling, concatenate, dropout and residual layers [11]. Further in 2018, Kim et al. used a modified VGG with batch size 32 and fine-tuned weights having a learning rate of 0.0001 [12]. Recently, Diaz-Pinto et al. in 2019 performed classification of glaucoma using deep learning-based networks such as Xception, Inception, VGG and ResNet. Based on the performance of the different networks, Xception was found to outperform the other networks [13]. Further in 2019, Gomez-Valverde et al. [14] found VGG to be the best performing classifier. Singh et al. in 2019 performed classification of glaucoma using a fine-tuned Inception network after augmentation to increase the sample size for effective training. The approach was found to overcome the issue of overfitting for smaller sample sizes [15]. Recently, in 2020 Juneja et al. [16] suggested a modified Xception as the best performing classifier for smaller sample sizes. Also, Li et al. in 2020 suggested ResNet comprising convolutional, pooling and fully connected layers for performing glaucoma classification. But the random initialisation here required a larger dataset, leading to high computation time [17]. Further, Judy et al. in 2020 used a pre-trained AlexNet with a support vector machine (SVM) to perform classification of glaucoma on fundus images [18].

Thus, on the basis of the above studies, it can be seen that different researchers suggested different classifiers from time to time, which may be due to varying datasets and experimental setups. A comparison on common datasets is therefore needed so that the networks can be compared fairly and without bias. Hence, this study presents a comparative analysis of state-of-the-art networks on publicly available retinal datasets.

3. Methodology

This section discusses the methodology used for performing classification with deep learning-based approaches. Five types of convolutional networks, namely Xception, Inception, ResNet, DenseNet and VGG, were employed. Fig. 2 shows the basic deep learning architecture comprising input, convolution, pooling, fully connected and output layers.

Fig. 2. Basic Deep-learning architecture.

The classification using each network began with preparation of the input to be fed into the convolutional networks. Input images of size 2896 × 1944 pixels were initially cropped to reduce the image size to 512 × 512 pixels and obtain the desired region of interest (ROI). The cropped images were then augmented using 30-degree rotation, shifting and flipping to increase the number of images from 267 (combined from two datasets) to 1322 for improved performance of the convolutional networks. A convolutional network includes multiple layers followed by fully connected layers and takes a two-dimensional image as input. The layers here are the convolutional layer, pooling layer, activation layer and fully connected layer. The pooling layer is used to downsample and reduce the image size for faster training and to retain only the significant features. The convolutional layer has a filter passing over the image and performing the dot product of the original pixels with the weights defined in the filter. Further, the fully connected layer takes a one-dimensional vector as input and outputs a list of possible labels. Finally, the activation layer introduces non-linearity to allow self-training. The different networks used here vary in the ordering and size of their layers. The networks used for classification are as follows:

3.1. Xception [19]

Xception is a linearly stacked architecture with depthwise separable convolutional layers and residual connections. It has 36 layers structured into 14 modules forming the base of feature extraction. The depthwise separable convolution performs channel-wise spatial convolutions followed by a 1x1 (pointwise) convolution. The convolutional layers are followed by batch normalization. The ReLU activation function is used throughout the architecture to introduce non-linearity in the output of each layer. The output of the architecture has a global average pooling layer, which is followed by a dense network and logistic regression.

3.2. Inception [20]

Inception is a 48-layer architecture comprising 3x3 convolution and 3x3 pooling layers with batch normalization and stride 2. An Inception layer is a combination of 1x1, 3x3 and 5x5 convolutional layers with their outputs concatenated to act as the input of the next layer. It uses two 3x3 convolutional layers instead of a 5x5 convolutional layer to reduce the number of parameters without decreasing the model efficiency. The tail of the architecture has a dropout layer and a dense layer with softmax as the activation function, which gives the final output.

3.3. ResNet [21]

ResNet is a 50-layer architecture comprising convolutional, activation, pooling, normalization and dense layers with 3x3 filter sizes. Additionally, it has skip connections that add the input of a block to its output. It consists of residual blocks, each comprising a function F(x) and an identity mapping y = x, i.e. the input x to the block is added to the output F(x) of the block.

3.4. DenseNet [22]

DenseNet, as the name implies, is a dense network having 121 layers, where each subsequent layer takes its input from the preceding layers. The image is passed as input to the convolution layer, which is followed by dense blocks separated by a combination of convolutional layers with filters of size 1x1 and average pooling layers with filters of size 2x2. The individual dense blocks here are groupings of zero padding, normalization, 2x2 activation and 1x1 convolutional layers. Each block ends with a concatenation layer which concatenates the outputs of the preceding layers. The network tail here includes a dropout layer and sigmoid activation.

3.5. VGG [23]

VGG is a 16-layer network with convolutional stacks, 2x2 pooling and fully connected layers. The stride of convolution is fixed at 1 pixel, along with spatial padding to preserve the resolution. The width of the convolution layers starts with 64 filters and increases by a factor of two after every max-pooling layer, the last layer having a width of 512. Pooling is carried out by five max-pooling layers over a pool size of 2x2, with a stride of 2. Non-linearity is introduced by the ReLU activation function. The end of the network has a dropout layer and a fully connected layer, and uses sigmoid activation.

4. Results and discussions

4.1. Experimental setup and dataset used

The approaches used for classification were implemented in Python version 3.7.0 with TensorFlow version 2.1. Also, the testing was performed on an Nvidia GeForce GTX TITAN Xp GPU with two Intel® Xeon(R) E5-2620 2.4 GHz CPUs. Further, the datasets used for testing are DRISHTI-GS [24] and RIM-One [25], comprising 101 and 166 cases respectively, including both normal and glaucomatous cases. The DRISHTI dataset comprises 30 normal images and 71 glaucomatous images. This dataset was generated by researchers from IIIT Allahabad in association with Aravind Eye Hospital, Madurai, India. RIM-One comprises 92 normal images and 74 abnormal images (glaucoma/suspect). This dataset has been developed by the University of La Laguna, Spain in collaboration with Hospital Universitario de Canarias, Spain.

4.2. Performance parameters

The implemented approaches were compared using parameters such as precision, recall and accuracy, calculated from true positives and negatives and false positives and negatives (TP/TN/FP/FN). Here, TP represents predictions as glaucomatous which are true, TN represents predictions as normal which are true, FP represents predictions as glaucomatous which are false and FN represents predictions as normal which are false. The values of TP, TN, FP and FN are derived from the confusion matrix shown in Fig. 3. The parameters precision, recall and accuracy are defined as:

• Precision: It is defined as the percentage of cases predicted as glaucomatous that are truly glaucomatous, and is given by Eq. (1):

$$\text{Precision} = \frac{TP}{TP + FP} \times 100 \qquad (1)$$

The higher the value of precision, the better the performance.

• Recall: It is defined as the percentage of truly glaucomatous cases that are correctly predicted, and is given by Eq. (2):

$$\text{Recall} = \frac{TP}{TP + FN} \times 100 \qquad (2)$$

                            Condition (e.g. Glaucoma/Normal) by ground truth
Output of classification    Positive                Negative                Total of row
Positive                    TP                      FP                      TP + FP
Negative                    FN                      TN                      FN + TN
Total of column             TP + FN                 FP + TN                 N = TP + TN + FP + FN
                            (total cases with       (total cases without    (total cases)
                            the given condition)    the given condition)

Fig. 3. Confusion matrix.
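
As a minimal sketch of how the cells of the confusion matrix in Fig. 3 feed Eqs. (1)–(3), the Python snippet below tallies TP, FP, FN and TN from ground-truth and predicted labels and derives precision, recall and accuracy as percentages. The class name `ConfusionMatrix`, the 1/0 label encoding (1 = glaucomatous, 0 = normal), the zero-denominator fallback and the toy label vectors are assumptions made for illustration; they are not taken from the study's implementation or data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cells of the 2x2 confusion matrix, laid out as in Fig. 3 (hypothetical helper)."""

    tp: int  # predicted glaucomatous, truly glaucomatous
    fp: int  # predicted glaucomatous, truly normal
    fn: int  # predicted normal, truly glaucomatous
    tn: int  # predicted normal, truly normal

    @classmethod
    def from_labels(cls, truth: Sequence[int], predicted: Sequence[int]) -> "ConfusionMatrix":
        """Tally the four cells. Assumed encoding: 1 = glaucomatous, 0 = normal."""
        if len(truth) != len(predicted):
            raise ValueError("truth and predicted must have the same length")
        tp = fp = fn = tn = 0
        for t, p in zip(truth, predicted):
            if (t, p) == (1, 1):
                tp += 1
            elif (t, p) == (0, 1):
                fp += 1
            elif (t, p) == (1, 0):
                fn += 1
            elif (t, p) == (0, 0):
                tn += 1
            else:
                raise ValueError(f"labels must be 0 or 1, got truth={t}, predicted={p}")
        return cls(tp, fp, fn, tn)

    @staticmethod
    def _percent(numerator: int, denominator: int) -> float:
        # Assumption: report 0.0 instead of dividing by zero when a denominator is empty.
        return 100.0 * numerator / denominator if denominator else 0.0

    def precision(self) -> float:
        """Eq. (1): TP / (TP + FP) x 100."""
        return self._percent(self.tp, self.tp + self.fp)

    def recall(self) -> float:
        """Eq. (2): TP / (TP + FN) x 100."""
        return self._percent(self.tp, self.tp + self.fn)

    def accuracy(self) -> float:
        """Eq. (3): (TP + TN) / (TP + TN + FP + FN) x 100."""
        total = self.tp + self.tn + self.fp + self.fn
        return self._percent(self.tp + self.tn, total)


if __name__ == "__main__":
    # Toy labels for illustration only (not data from the paper).
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    cm = ConfusionMatrix.from_labels(truth, predicted)
    print(cm)
    print(f"precision={cm.precision():.1f}%  recall={cm.recall():.1f}%  accuracy={cm.accuracy():.1f}%")
```

Keeping the four counts in one small immutable object mirrors the layout of Fig. 3 and makes it easy to check that each metric uses the intended row or column total.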

As with precision, the higher the value of recall, the better the performance.

• Accuracy: It is defined as the percentage of correct predictions and is given by Eq. (3):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \qquad (3)$$

The higher the value of accuracy, the better the performance.

4.3. Performance analysis

All the networks used 30 epochs, ReLU activation for hidden layers and sigmoid activation for output layers, with 135 layers in Xception, 314 in Inception, 178 in ResNet, 430 in DenseNet and 23 in VGG. The performance of the different state-of-the-art networks analysed on public datasets is shown in Table 1, with corresponding graph plots in Fig. 4.

Table 1
Performance analysis of deep learning methods.

Classification network    Precision (%)    Recall (%)    Accuracy (%)
Xception                  90.1             91.8          93.1
Inception                 87.3             85.7          90.6
ResNet                    82.5             81.4          83.2
DenseNet                  89.3             87.3          92.1
VGG                       80.9             79.1          81.6

As per the experimental analysis, the values of precision, recall and accuracy for Xception are found to be 90.1%, 91.8% and 93.1%. Further, Inception gave 87.3% precision, 85.7% recall and 90.6% accuracy. Also, the values of precision, recall and accuracy for ResNet are found to be 82.5%, 81.4% and 83.2%. Similarly, the values of precision, recall and accuracy for DenseNet are found to be 89.3%, 87.3% and 92.1%, whereas VGG offers a precision of 80.9%, recall of 79.1% and accuracy of 81.6%. Thus, Xception is found to outperform Inception, ResNet, DenseNet and VGG. Further, the filters used in Xception simultaneously consider a spatial dimension and a cross-channel or "depth" dimension. Instead of partitioning input data into several compressed chunks, it maps the spatial correlations for each output channel separately, and then performs a 1x1 pointwise convolution to capture cross-channel correlation. Hence, it shows improved accuracy as compared to other state-of-the-art architectures.

Fig. 4. Comparative analysis of classification networks. (Bar chart; x-axis: classification networks; y-axis: performance metrics in %, from 70 to 100; series: precision, recall and accuracy, with the values listed in Table 1.)

5. Conclusion

Earlier detection of glaucoma using image processing techniques can assist doctors in making a precise diagnosis in less time. Traditional methods employing retinal images used to date are either less automated or less accurate. Thus, an analysis of automated deep learning approaches such as Xception, Inception, ResNet, DenseNet and VGG has been performed in this paper



to identify the best one. Further, based on the experimental analysis, Xception is found to give 90.1% precision, 91.8% recall and 93.1% accuracy. Thus, the Xception network could be extended and improved for better predictions on a larger number of datasets, so as to be used in real-time clinical scenarios for earlier diagnosis. The outperforming Xception could also be embedded in handheld diagnostic tools for glaucoma, to be utilized in clinical scenarios for analysis of retinal images. Further, these CAD systems would replace the conventional diagnostic procedures performed using devices such as the tonometer and pachymeter. Also, the outcome of this study would reduce the consumption and cost of materials such as stainless steel and titanium used for manufacturing surgical products, as surgeries would be replaced by medications due to diagnosis at an earlier stage.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] P. Ramulu, Glaucoma and disability: which tasks are affected, and at what stage of disease?, Curr. Opin. Ophthalmol. 20 (2) (2009) 92.
[2] Y.-C. Tham et al., Global prevalence of glaucoma and projections of glaucoma burden through 2040: a systematic review and meta-analysis, Ophthalmology 121 (11) (2014) 2081–2090.
[3] Kierstan Boyd, "Glaucoma Diagnosis." American Academy of Ophthalmology (2019). Available at: https://www.aao.org/eye-health/diseases/glaucoma-diagnosis. Accessed on: 16th June 2020.
[4] V.S. Mary, E.B. Rajsingh, G.R. Naik, Retinal fundus image analysis for diagnosis of glaucoma: a comprehensive survey, IEEE Access (2016).
[5] N. Thakur, M. Juneja, Classification of glaucoma using hybrid features with machine learning approaches, Biomed. Signal Process. Control 62 (2020) 102137.
[6] N.E. Benzebouchi, N. Azizi, S.E. Bouziane, Glaucoma diagnosis using cooperative convolutional neural networks, Int. J. Adv. Electron. Comput. Sci. 5 (1) (2018) 31–36.
[7] N.E. Benzebouchi, N. Azizi, S.E. Bouziane, Glaucoma diagnosis using cooperative convolutional neural networks. In: Proceedings of ISER 88th International Conference 2017, pp. 1–6.
[8] Y. Chai, H. Liu, X.u. Jie, Glaucoma diagnosis based on both hidden features and domain knowledge through deep learning models, Knowl.-Based Syst. 161 (2018) 147–156.
[9] N. Shibata et al., Development of a deep residual learning algorithm to screen for glaucoma from fundus photography, Sci. Rep. 8 (1) (2018) 1–9.
[10] U. Raghavendra, H. Fujita, S.V. Bhandary, A. Gudigar, J.H. Tan, U.R. Acharya, Deep convolution neural network for accurate diagnosis of glaucoma using digital fundus images, Inf. Sci. 1 (441) (2018) 41–49.
[11] M. Norouzifard et al., Automated glaucoma diagnosis using deep and transfer learning: Proposal of a system for clinical testing, 2018 International Conference on Image and Vision Computing New Zealand (IVCNZ), 2018.
[12] Kim, Mijung, et al. "Web applicable computer-aided diagnosis of glaucoma using deep learning." arXiv preprint arXiv:1812.02405 (2018).
[13] A. Díaz-Pinto et al., CNNs for automatic glaucoma assessment using fundus images: an extensive validation, Biomed. Eng. Online 18 (2019).
[14] Juan J. Gómez-Valverde et al., Automatic glaucoma classification using color fundus images based on convolutional neural networks and transfer learning, Biomed. Opt. Express 10 (2) (2019) 892–913.
[15] A. Singh, S. Sengupta, V. Lakshminarayanan, Glaucoma diagnosis using transfer learning methods. In: Applications of Machine Learning 2019, Sep 6 (Vol. 11139, p. 111390U). International Society for Optics and Photonics.
[16] Mamta Juneja et al., GC-NET for classification of glaucoma in the retinal fundus image, Mach. Vis. Appl. 31 (5) (2020) 1–18.
[17] F. Li, L. Yan, Y. Wang, J. Shi, H. Chen, X. Zhang, M. Jiang, Z. Wu, K. Zhou, Deep learning-based automated detection of glaucomatous optic neuropathy on color fundus photographs, Graefe's Archive for Clinical and Experimental Ophthalmology 258 (4) (2020) 851–867.
[18] D. Judy, Automated Identification of Glaucoma from Fundus Images using Deep learning Techniques, Eur. J. Mol. Clin. Med. 7 (2) (2020) 5449–5458.
[19] François Chollet, Xception: Deep learning with depthwise separable convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition, 2017.
[20] Christian Szegedy et al., Rethinking the inception architecture for computer vision, Proceedings of the IEEE conference on computer vision and pattern recognition, 2016.
[21] Targ, Sasha, Diogo Almeida, and Kevin Lyman. "Resnet in resnet: Generalizing residual architectures." arXiv preprint arXiv:1603.08029 (2016).
[22] Yi Zhu, Shawn Newsam, Densenet for dense flow, 2017 IEEE international conference on image processing (ICIP), 2017.
[23] Shuying Liu, Weihong Deng, Very deep convolutional neural network based image classification using small training sample size, 2015 3rd IAPR Asian conference on pattern recognition (ACPR), 2015.
[24] Jayanthi Sivaswamy et al., Drishti-gs: Retinal image dataset for optic nerve head (onh) segmentation, 2014 IEEE 11th international symposium on biomedical imaging (ISBI), 2014.
[25] Fumero, Francisco, et al. "RIM-ONE: An open retinal image database for optic nerve evaluation." 2011 24th international symposium on computer-based medical systems (CBMS). IEEE, 2011.
