
Intelligence-Based Medicine 5 (2021) 100038

Contents lists available at ScienceDirect

Intelligence-Based Medicine
journal homepage: www.sciencedirect.com/journal/intelligence-based-medicine

Glaucoma detection in retinal fundus images using U-Net and supervised machine learning algorithms☆

Rutuja Shinde
Department of Computer Engineering, Pune Institute of Computer Technology, Savitribai Phule Pune University, Pune, India

A R T I C L E  I N F O

Keywords: Cup to disc ratio; Inferior superior nasal temporal; Blood vessel extraction; Support vector machine; Neural network; Adaboost

A B S T R A C T

Background and objective: Glaucoma is a neuro-degenerative eye disease that develops due to an increase in the Intra-ocular Pressure inside the eye. As the second largest cause of blindness worldwide, it can lead a person to complete blindness if an early diagnosis does not take place. With respect to this underlying issue, there is an immense need to develop a system that can work effectively in the absence of expensive equipment and skilled medical practitioners, and that is also less time consuming.

Methods: This work proposes an offline Computer-Aided Diagnosis (CAD) system for glaucoma diagnosis using retinal fundus images. The application is developed using image processing, deep learning and machine learning approaches. The LeNet architecture is used for input image validation, and Region of Interest (ROI) detection is done using the brightest spot algorithm. Further, optic disc and optic cup segmentation is performed with the help of the U-Net architecture, and classification is done using SVM, Neural Network and Adaboost classifiers.

Results: An accuracy of 99% is achieved using LeNet for input image validation. With the brightest spot algorithm, an accuracy of 98.67% is achieved for ROI extraction. Further, dice coefficients of 0.93 and 0.87 were attained for the segmentation of the optic disc and optic cup respectively using the U-Net architecture. Using SVM, Neural Network and Adaboost classifiers, the proposed methodology achieved a classification accuracy, recall, specificity and sensitivity of 100%, proving the system to be reliable and promising.

Conclusions: The proposed desktop application is easy to use and can play a major role in early glaucoma detection. The modular design of the CAD system is made up of a set of standalone components that can be used for a variety of different tasks for detecting and classifying glaucoma. Because the model is trained on a variety of different datasets, the system proves to be robust and more accurate.

1. Introduction

Glaucoma is an eye disease that damages the optic nerve and the Retinal Nerve Fiber Layer (RNFL), ultimately leading to blindness if left untreated. In this condition, fluid builds up in the anterior part of the eye, thereby increasing the pressure inside the eye and causing destruction of the optic nerve. It is estimated that the total number of cases worldwide will rise to 79.6 million by 2020 and 111.8 million by 2040, and that Asians will comprise 47% of those with glaucoma and 87% of those with Angle Closure Glaucoma (ACG) [28]. People older than 60 are at higher risk [34]. Furthermore, glaucoma is the second leading cause of blindness in the world. In order to prevent irreversible vision loss and structural damage, early detection is crucial [28].

The word "fundus", derived from Latin, refers to the part of the eyeball that lies opposite to the pupil. Thus, a photograph of the interior surface of the eye is regarded as a fundus image [12]. Apart from funduscopic images there are other imaging techniques, namely Scanning Laser Polarimetry, Heidelberg Retina Tomography (HRT) and Optical Coherence Tomography (OCT) [2], but the cost of this equipment is a barrier to affordability for many hospitals worldwide. We therefore work on glaucoma detection using fundus images, which are an affordable alternative for health practitioners.

Increased Intra-Ocular Pressure (IOP) gives rise to structural changes in the anatomy of the eye. In a retinal fundus image, the Optic Disc (OD) is the region where the blood vessels and nerve fibers enter the retina.


☆ This research project has been sponsored by the National Institute of Ophthalmology (NIO), Pune.
E-mail address: rutujadshinde1994@gmail.com.

https://doi.org/10.1016/j.ibmed.2021.100038
Received 3 October 2020; Received in revised form 7 May 2021; Accepted 1 July 2021
Available online 14 July 2021
2666-5212/© 2021 The Author. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).

The Optic Cup (OC) is the central bright portion of the retina and lies within the optic disc. Fig. 1 describes these terminologies diagrammatically. In the case of glaucoma, the structural changes cause the optic cup to become enlarged, a phenomenon termed "cupping". Thus, the Cup-to-Disc Ratio (CDR) proves to be a meaningful indicator for glaucoma detection. Another considerable indicator is the Inferior Superior Nasal Temporal (ISNT) rule. Blood vessel extraction also has its importance in glaucoma detection, as blood vessels indicate the health condition of an organ and hence of the eyesight.

Fig. 1. Fundus images showing landmarks namely Optic Disc, Optic Cup, blood vessels and Neuro-retinal Rim [20].

From the literature surveyed in section 2, it is found that most researchers have worked with relatively small numbers of images, with fundus images from private datasets, and with datasets that lack real-time variations in image quality. This hampers the robustness of the resulting systems. There is a need to develop a model which works for images acquired under different environmental conditions [16]. Also, there is scope for enhancing the classification accuracy as much as possible.

The main contribution of this research work lies in the diagnosis of glaucoma through the interpretation of retinal fundus images. This is accomplished by developing a Computer-Aided Diagnosis (CAD) system in the form of a desktop application for the assistance of health practitioners. The proposed method performs automatic glaucoma screening and classifies an input image as glaucomatous or healthy. Deep learning approaches such as LeNet, U-Net and a Neural Network (NN) have been used in the detection process, which is completely offline. As a result, clinicians in rural areas can also use this application efficiently for their diagnosis. Although glaucoma diagnosis is normally a time-consuming procedure requiring skilled professionals, no special skills are required to work with the developed application. The proposed algorithm uses three features, namely the CDR, the ISNT ratio of the Neuro-retinal Rim (NRR) and the ISNT ratio of the blood vessels, as the decision criteria for glaucoma detection. A majority voting system developed using SVM and NN classifiers and the adaboosting technique has been used to classify the input as either normal or glaucomatous. Our work also contributes to the advancement of the field of glaucoma detection in the following manner:

• Developing a robust system by training the model with different kinds of datasets containing varied characteristics of fundus images and increasing the number of training and testing images.
• Increasing the number of classifiers in order to enhance the generalization power of the system.
• Ensuring that the features used for classification are significant in the context of the results achieved.

2. Related work

2.1. ROI extraction

Agarwal et al. [3] extracted the ROI by spatially calculating the optic cup centre and locating a square region with a radius 'r' around it. Initially, the coordinates of the optic disc are approximated, and the actual coordinates are then obtained. [27] used four categories of image processing methods, namely pixel brightness transformations, geometric transformations, and a confined region of the processed image returned to its normal state, in order to locate the ROI. A non-automated method by manual cropping of the ROI is used in [10]. Narasimhan and Vijayarekha [22] used an approximate area around the brightest point obtained from the G-plane of an RGB fundus image. Further, [7] applied the intensity weighted centroid method to calculate the OD centre. Unlike finding the brightest spot, in [4] the ROI was detected by screening intensity values of the OD in the RGB colour planes.

2.2. OD and OC segmentation

Soltani et al. [35] attempted to segment the optic disc with the help of Laplace, Sobel and Canny contour detection methods. The Canny detector provided good results in disc detection and localization, while the Laplacian operator provided the lowest accuracy and is more susceptible to noise in the image. In the work of Aquino et al. [6], OD localization was done by selecting the best result from three methods: the maximum difference method, the maximum variance method and the low pass filter method. Further, the circular Hough transform was applied for OD boundary segmentation. Finally, an elliptical template-based approach and a circular template-based approach were used for OD segmentation, where the latter approach proved to yield comparatively better results. Deformable models face local energy minima and curve initialization problems. In order to overcome this drawback, the curvature and texture constrained composite weighted (CTCRW) algorithm was proposed. This algorithm uses Gabor texture energy and mean curvature features in order to calculate the edge weights [25]. Pal and Chatterjee [24] performed morphological operations and histogram equalization on an RGB image for OD edge enhancement. The OD boundary was detected using a Canny edge detector and segmented using the flood fill algorithm.

Hatanaka et al. [15] attempted to extract the optic disc on the basis of the shape and colour of the fundus image. The P-tile thresholding method was applied to all three channels of an RGB image. An approximate optic disc was obtained by combining the three images, and a Canny edge detector was applied. The optic cup was obtained by profile analysis using the blue component. Finally, the vertical CDR was used for glaucoma diagnosis. In Ref. [17], the localization of the optic disc was done using a contour method based on morphological operations, and that of the optic cup was done by ROI-based segmentation using a component labelling algorithm; the optic disc and cup segmentation used the red and green channels respectively of an RGB fundus image. Das et al. [10] used the watershed transform for OD and OC segmentation: the red channel of the RGB color space and the 'a' plane of the Lab color space respectively were used for segmentation. In the work of [22], OD and OC segmentation was performed using the K-means clustering algorithm. Ahmad et al. [4] considered the V-plane of the HSV color space and the green channel of the RGB color space for OD and OC segmentation respectively. Kim et al. [18] used U-Net in combination with Fully Convolutional Networks (FCN) for OD and OC segmentation; both binary and multiclass FCNs were used.

2.3. Feature extraction

Glaucoma diagnosis was performed using the ISNT rule in [29]. In [37], the CDR was calculated after performing morphological operations. In order to increase accuracy, a mask generation approach using the OD centroid for extracting the Neuroretinal Rim width was put forward [9]. In [22], local entropy thresholding was used for blood vessel extraction, and diagnosis was based on the obtained CDR and ISNT values. RNFLD features have been used for glaucoma progression monitoring [26]. A deep CNN was used to extract distinguishable features from glaucomatous and healthy images. Al Ghamdi et al. [5] proposed a novel self-learning technique by labelling unlabelled data using a CNN architecture and training on the result.


The CNN architecture was then fine-tuned and trained in a supervised fashion over the labelled dataset. As stated earlier, the condition of the blood vessels plays a key role in describing the health status of an eye. Considering this, Balasubramanian et al. [8] performed extraction of blood vessels using Gabor filters, morphological operations and thresholding. A Histogram of Oriented Gradients was used for feature extraction. Gabor filters provide good results in image texture analysis.

2.4. Classification

In [22], classification was done using k-Nearest Neighbour (k-NN), Support Vector Machine and Bayes classifiers, based on the obtained CDR and ISNT values. Vijapur and Kunte [36] propose Pearson-R coefficients as the feature for the segmentation of OD and OC; this feature describes the variations in mean intensity of an image. A Recurrent Neural Network (RNN) was used for classification in [26]. A VGG-16 model was used for glaucoma classification. Li et al. [19] explored seven ConvNets, including AlexNet, VGG-16, VGG-19, GoogLeNet, ResNet-50 and ResNet-152; AlexNet yielded the best results when an SVM classifier was used for classification.

3. Data

A collection of 6 datasets has been used. RIM-ONE, DRISHTI-GS, DRIONS-DB, JSIEC and DRIVE are publicly available datasets. NIO is a private dataset obtained from the NIO hospital, Pune. Table 1 lists the datasets used for the proposed methodology and the number of images in each of them. The model is not trained on individual datasets; instead, it is trained collectively on the images of all 6 datasets together. All images are resized to a fixed dimension of 512 × 512.

Table 1
Datasets used for training and testing.

Dataset       Number of images
RIM-ONE       169
DRISHTI-GS    101
DRIONS-DB     110
NIO           118
JSIEC         124
DRIVE         44
Total         666

3.1. Segmentation

A total of 666 labelled fundus images have been used for the purpose of segmentation. In order to train the CNN model for optic cup segmentation, 146 of the 666 images were found to have ill-defined optic cup boundaries and hence were discarded. The dataset is split in the ratio 70%-15%-15% into training, validation and test sets respectively. The revised test set comprises 100 images (15%) of the original 666 images for OD segmentation; the distribution comprises 16 images from each of the RIM-ONE, DRISHTI-GS, DRIONS-DB and JSIEC datasets and 18 images each from the DRIVE and NIO datasets. In the case of OC segmentation using U-Net and classification using SVM, adaboost and NN, the revised test set constituted 78 (15%) of the original 520 images, with 13 images from each of the datasets. In an attempt to obtain a well-trained model, the number of images needed to be increased. Thus, image augmentation is performed to increase the dataset size by rotating all the images by 45°, 90° and 170°. The images were rotated by 170° rather than 180° because little difference was seen between the OD and OC of original images and their 180°-rotated counterparts, the two being symmetrical to each other. Three augmented copies per image were made for the training and validation sets. Also, images and their augmented counterparts were not split between the training and validation sets.

As a result, a dataset comprising 2264 images is used as the training and validation set for OD segmentation, while 1768 images are used for OC segmentation. These images are annotated in such a way that the pixel intensity value of the OD and OC regions is equal to 1 while the remaining area is set to a pixel intensity value of 0. The boundaries of the optic cup and disc were marked under the supervision of experienced health professionals.
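A minimal sketch of this augmentation step (assuming OpenCV; the rotation angles are the ones stated above, and the same transform would be applied to each image and its annotation mask):

```python
import cv2
import numpy as np

def rotate(image: np.ndarray, angle: float) -> np.ndarray:
    """Rotate an image about its centre by the given angle in degrees."""
    h, w = image.shape[:2]
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(image, matrix, (w, h))

def augment(image: np.ndarray) -> list:
    """Produce the three rotated copies (45, 90 and 170 degrees) used
    alongside the original in the training and validation sets."""
    return [rotate(image, angle) for angle in (45, 90, 170)]
```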

3.2. Feature extraction and classification

The feature extraction and classification steps require both the segmented OD and OC for a particular input image. Hence, the aforementioned 520 fundus images used to train the model for OC segmentation have been used for the feature extraction and classification purposes.

4. Proposed methodology

The proposed CAD pipeline comprises four steps, described in sections 4.1, 4.2, 4.3 and 4.4 respectively. An input image is processed further only if it is a fundus image. After the validation step, the Region of Interest (ROI) is extracted from the input image. Next, optic disc and cup segmentation is performed using the U-Net architecture. Following segmentation, feature extraction is performed. Finally, classification is done based on these features and the input image is predicted to be either glaucomatous or healthy. Instead of training a single neural network end-to-end on the dataset, this methodology was chosen for two reasons. Firstly, the typical machine learning steps resulted in commendably high accuracy. Secondly, the features selected for classification were chosen after studying the structural changes occurring in the eye that are reflected in the fundus images. The generalized CAD pipeline is shown in Fig. 2.

Fig. 2. Generalized CAD pipeline.


4.1. Input image validation

Before proceeding to glaucoma detection, this step ensures that no input other than a fundus image is processed. Being the first step, as described in Fig. 2, this binary classification is performed using the LeNet CNN architecture. LeNet is chosen for this purpose for the following reasons: (a) it can be trained in very little time; (b) it does not need any GPUs for training; (c) it has a compact architecture that suffices for the needs of image classification.

For developing the fundus image classifier, the non-fundus part of the dataset consists of 10,672 images, while the fundus part constitutes 1845 images. The UKBench dataset [1], which consists of random images, is used as the non-fundus dataset. In order to increase the dataset, apart from the pre-existing 666 fundus images, 1179 images of good quality were added from a Kaggle dataset, making a total of 1845 images. The added 1179 images are not used in further stages because they are unlabelled; their ground truth, i.e. whether they are glaucomatous or normal, is unknown. However, the added images substantially assist the dataset required for this purpose: to develop a fundus image classifier, it suffices that the dataset used for training the LeNet model is categorized into fundus/non-fundus images. The internal classification (glaucoma or normal) of the fundus images therein is unneeded and out of scope for constructing a fundus image classifier. Images of dimension 256 × 256 are used for training the LeNet model.

As shown in Fig. 3, the LeNet architecture consists of two sets of convolutional, activation, and pooling layers, followed by a fully-connected layer, activation, another fully-connected layer, and finally a softmax classifier. The model was trained for 25 epochs with a learning rate of 0.0001, a decay rate of 4e-6 and a batch size of 32. Using ReLU as the activation function and binary cross entropy as the loss function along with the Adam optimizer, 99% accuracy is achieved using LeNet. The softmax classifier returns a list of probabilities of the input image belonging to each class. The class label having the largest probability is chosen as the final prediction for the input image.

Fig. 3. LeNet architecture.
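A sketch of this classifier (assuming Keras with the TensorFlow backend; the filter counts are illustrative since the text fixes only the layer pattern, and the single sigmoid unit with binary cross-entropy is the two-class equivalent of the softmax output described above):

```python
from tensorflow.keras import layers, models, optimizers

def build_lenet(input_shape=(256, 256, 3)):
    """LeNet-style fundus/non-fundus classifier: two conv-ReLU-pool
    blocks, a fully-connected layer, and a binary output."""
    model = models.Sequential([
        layers.Conv2D(20, (5, 5), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(50, (5, 5), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(500, activation='relu'),
        layers.Dense(1, activation='sigmoid'),
    ])
    # Learning rate 1e-4 as stated; the paper additionally applies a 4e-6 decay.
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Usage: build_lenet().fit(images, labels, batch_size=32, epochs=25)
```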
4.2. ROI extraction

The Region of Interest (ROI) is the portion of an image on which operations are performed as part of a particular objective. Along with increasing accuracy, ROI extraction, described in Fig. 2, is needed to reduce the computation cost as well as the computation time. It can be achieved manually or automatically. In the case of fundus images, the portion including and around the disc is considered to be the ROI. As a characteristic of retinal fundus images, the pixel with the highest intensity value generally lies in the optic cup. With the help of this fact, we propose a more robust, automatic method for ROI extraction known as the brightest spot algorithm.

In the traditional method, the input image is first converted into a grayscale image. The coordinates of the pixel having the highest intensity value are then taken as the brightest spot, and a ROI is cropped along a defined radius around this spot. This method is easy but has certain drawbacks: (a) it is prone to noise, for example unwanted higher intensity values in the background; (b) fundus images suffer from poor illumination during the image acquisition process, so an unwanted variation in brightness is retained in them.

In order to overcome these challenges, fundus image denoising plays a key role in adding robustness to the algorithm. The core idea is that noise occurs at higher spatial frequencies, whereas the features to be extracted are contained in lower frequencies. In order to remove high-frequency noise from the grayscale image, we apply a gaussian blur; as a result, pixels with very high intensity values are averaged out by the influence of their neighbouring pixels. A gaussian kernel size of (65, 65) is used and the sigma value is set to 0. A larger kernel helps in better denoising, and the mentioned kernel size was set by trial and error. Using this kernel size, we were able to achieve very high accuracy in ROI detection on the datasets used. After this pre-processing step, the coordinates of the pixel with the highest intensity value are found in the gaussian-blurred image, and a circle is drawn around that pixel with a radius equivalent to the gaussian kernel size mentioned above. The x and y coordinates needed for cropping the image in a rectangular format are obtained by subtracting the radius value from the x and y coordinates of this brightest pixel.

The dimension of an ROI-extracted image is 512 × 512. This image is further used as input to the segmentation, feature extraction and classification steps. The data were split into training, validation and testing sets with ratios of 70%, 15% and 15% respectively. The trial-and-error process for finding the correct gaussian window size was performed on the training and validation images, and accuracy was tested on the testing images, which were not used in that step. Fig. 4 illustrates the procedure of ROI extraction along with comparative results for the traditional and the proposed robust method. Fig. 4(G) shows a gaussian-blurred image obtained from the grayscale image shown in Fig. 4(B). A heatmap shows pixel values with similar intensity by assigning a unique colour to each group. The heatmap obtained using the traditional method, shown in Fig. 4(C), marks the highest intensity values in orange. The heatmap in Fig. 4(H) shows that the noise in Fig. 4(C) is overcome after filtering the image, and hence the correct brightest spot is located. Comparing numerically, the highest pixel intensity value before applying the gaussian blur is 199 (incorrect), while after applying it the value is 169 (correct). With the help of this algorithm, 98.67% accuracy was obtained for ROI detection and extraction.
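A minimal sketch of the brightest spot algorithm (assuming OpenCV; the kernel size follows the description above, while the exact crop extent and names are illustrative):

```python
import cv2
import numpy as np

KERNEL = 65  # gaussian kernel size, chosen by trial and error

def extract_roi(image: np.ndarray) -> np.ndarray:
    """Crop a square ROI around the brightest spot of a fundus image."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Smooth out high-frequency noise so isolated bright pixels do not
    # masquerade as the optic cup; sigma=0 derives sigma from the kernel.
    blurred = cv2.GaussianBlur(gray, (KERNEL, KERNEL), 0)
    # Locate the brightest pixel of the denoised image.
    _, _, _, (x, y) = cv2.minMaxLoc(blurred)
    # Anchor the crop radius-pixels up and left of the brightest spot.
    top, left = max(y - KERNEL, 0), max(x - KERNEL, 0)
    roi = image[top:top + 512, left:left + 512]
    return cv2.resize(roi, (512, 512))
```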
4.3. Optic disc and optic cup segmentation

For the segmentation step illustrated in Fig. 2, we use the U-Net architecture, a fully convolutional neural network designed for biomedical image segmentation. The core reasons behind adopting this model for OD and OC segmentation are as follows:


• It requires far fewer images for training.
• It achieves results that are competitive with existing state-of-the-art methodologies.
• It overcomes the challenges that occur during OD and OC segmentation, as described in section 5.

The architecture, shown in Fig. 5, consists of a contracting path and an expansive path. The contracting path is responsible for feature extraction, while the expansive path combines spatial information with high-resolution features to obtain the required segmentation map. The contracting path comprises a series of convolutional layers, each followed by a Rectified Linear Unit (ReLU) and max pooling layers. Initially, the input image is passed through a convolutional layer with a filter of spatial resolution 3 × 3 pixels. Then, ReLU activation, described by the function below, is applied. The function is given by the positive part of its argument:
ReLU(x) = max(0, x) (1)

where x is the input to a neuron.


Then, the 2 × 2 max pooling layer reduces the dimensions of the input layer by selecting the maximum value from each cluster of neurons and converting it into a single neuron in the next layer. As part of model optimization, proper tuning of parameters and weight initialization is necessary. Using batch normalization, the inputs to the layers within the network are normalized with the help of statistical features, namely the mean and covariance of the current mini-batch. In order to address the problem of overfitting, a dropout layer is applied as a form of regularization, in which each node is retained with a probability p; a dropout rate of 0.5 has been applied. Stochastic Gradient Descent (SGD) is used as the optimization method. In SGD, one randomly shuffled sample is chosen per iteration and the gradient of the cost function is computed on it. This method is computationally less expensive than the batch gradient descent algorithm. Given a set of training examples (x_1, y_1), …, (x_n, y_n), where x_i ∈ R^m and y_i ∈ R (y_i ∈ {−1, 1} for classification), the aim is to learn a linear scoring function f(x) = w^T x + b with model parameters w ∈ R^m and intercept b ∈ R.

The sign of the function f(x) is used as the decision criterion for binary classification. The regularized training error is given by,

E(w, b) = (1/n) ∑_{i=1}^{n} L(y_i, f(x_i)) + α R(w)    (2)

where,

L = loss function that measures model (mis)fit.
R = regularization term (aka penalty) that penalizes model complexity.
α = non-negative hyperparameter that controls the regularization strength.

The update rule for SGD is given below,


w ← w − η [ α ∂R(w)/∂w + ∂L(w^T x_i + b, y_i)/∂w ]    (3)

where,

η = learning rate which controls the step-size in the parameter space.
b = the intercept, updated similarly but without regularization.
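A toy numpy sketch of this update rule (Eq. 3), assuming an L2 penalty R(w) = ½‖w‖² and a squared-error loss purely for illustration; the actual training uses the framework's built-in SGD:

```python
import numpy as np

def sgd_step(w, b, x_i, y_i, eta=0.01, alpha=1e-4):
    """One SGD update (Eq. 3) for f(x) = w.x + b with squared-error loss
    L = 0.5*(f(x_i) - y_i)**2 and L2 penalty R(w) = 0.5*||w||**2."""
    residual = (np.dot(w, x_i) + b) - y_i
    w = w - eta * (alpha * w + residual * x_i)  # dR/dw = w, dL/dw = residual*x_i
    b = b - eta * residual                      # intercept: no regularization term
    return w, b
```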

Finally, the Sigmoid function (Si), described below, is used to limit the output values of the segmentation map to the interval (0, 1).

Si(x) = 1 / (1 + e^(−x))    (4)

Fig. 4. ROI extraction using the Brightest Spot algorithm. Figures B, C, D, E and F are the results obtained by the traditional method. Figures G, H, I, J and K are the results obtained by the application of the proposed method.


Fig. 5. U-Net architecture.

The Dice coefficient is used as the metric on which model training

is done. It quantifies the spatial overlap between the ground truth and the results produced by the CNN for segmentation of the OD and OC. This similarity coefficient, where 0 indicates no spatial overlap and 1 indicates complete spatial overlap, has been adopted in order to validate the segmentation results obtained using U-Net. As the generated output is a binary segmentation map, the dice coefficient loss DL is used as the loss function for optic disc and cup segmentation. The dice coefficient DC is described as follows,

DC = 2|a ∩ b| / (|a| ∪ |b|)    (5)

DL = 1 − DC    (6)

where,

a = binary vector of the intensity values of the ground truth image.
b = binary vector of the intensity values of the predicted image.
∩ = intersection between the two input binary vectors |a| and |b|.
∪ = union between the two input binary vectors |a| and |b|.
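As a sketch (assuming the Keras backend API used elsewhere in this work; `smooth` is an illustrative stabilising constant not mentioned in the text), the dice coefficient and its loss can be written as:

```python
from tensorflow.keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Spatial overlap between ground-truth and predicted binary maps."""
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    """DL = 1 - DC, minimised during U-Net training (Eq. 6)."""
    return 1.0 - dice_coefficient(y_true, y_pred)
```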
CDR = 2 × (7)
disc area
Using the dataset described in section 3.1, the dimensions of the images used for training are set to 256 × 256 after completion of the ROI extraction process. The learning rate applied is 0.0004 and the momentum is set to 0.95. A batch size of 2 is used in order to decrease the amount of GPU memory needed. The U-Net model for segmentation is efficiently trained in 150 epochs. Further, 80% of the images are used for training while the remaining 20% are used for testing the efficiency of the trained U-Net model. The proposed algorithm is implemented using Python 3.7 and trained on an NVIDIA Tesla 4 GPU with CUDA version 10.1. The Keras framework with a tensorflow backend is used for training both of the neural networks, namely LeNet and U-Net. Fig. 6 illustrates the predicted segmented optic disc and cup images in comparison with their respective ground truth images.
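A sketch of this training setup (assuming Keras with the TensorFlow backend; `build_unet` is a hypothetical constructor for the architecture of Fig. 5, and `dice_loss`/`dice_coefficient` are the functions sketched earlier):

```python
from tensorflow.keras.optimizers import SGD

# Hypothetical U-Net constructor for 256x256 single-channel ROI images.
model = build_unet(input_shape=(256, 256, 1))
model.compile(optimizer=SGD(learning_rate=0.0004, momentum=0.95),
              loss=dice_loss, metrics=[dice_coefficient])
# The small batch size keeps GPU memory usage low, as noted above.
model.fit(train_images, train_masks,
          validation_data=(val_images, val_masks),
          batch_size=2, epochs=150)
```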


Fig. 6. Optic cup and optic disc segmentation using the U-Net architecture. Curves in blue and red indicate contours of ground truth and predicted boundaries respectively. (A) Input fundus image; (B) predicted and ground truth boundaries of the optic cup; (C) predicted and ground truth boundaries of the optic disc. The Jaccard Coefficient is denoted by JC.

4.4. Feature extraction

The conversion of images into a set of features is termed feature extraction. Two of the three features used in this work are extracted from the optic disc and optic cup, whereas the third feature is based on blood vessel extraction. The areas of the OD and OC regions extracted from the segmented optic disc and cup are used for calculating the CDR value. Along with the CDR, the ISNT ratio for both the NRR and the blood vessels is calculated as a decision criterion for glaucoma detection, as outlined in Fig. 2.

4.4.1. Cup-to-disc ratio assessment

The optic cup-to-disc ratio is the most widely used feature for glaucoma detection. The reason is that when an individual is affected with glaucoma, the substantial strain produced in the retina causes the phenomenon of cupping. We calculate the CDR using the areas of the OC and OD. The binary images obtained from optic cup and disc segmentation are used for the area calculation. The following formula is used for CDR evaluation.

CDR = 2 × (cup area / disc area)    (7)

A CDR value ≥ 0.5 is considered glaucomatous, and a lesser value is considered healthy [4].
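A minimal sketch of this computation on the binary segmentation maps (assuming foreground pixels are non-zero; names are illustrative):

```python
import numpy as np

def cup_to_disc_ratio(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    """Area-based CDR from binary OC and OD segmentation maps (Eq. 7)."""
    cup_area = np.count_nonzero(cup_mask)
    disc_area = np.count_nonzero(disc_mask)
    return 2.0 * cup_area / disc_area

# Usage: a ratio of 0.5 or more flags the image as glaucomatous, e.g.
#   is_glaucomatous = cup_to_disc_ratio(cup_mask, disc_mask) >= 0.5
```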


4.4.2. Extraction of ISNT quadrants and ratio evaluation

In some cases, due to a large cup and disc area, the CDR value of the person being tested can be high even though the person is healthy. In order to overcome this drawback, another decision criterion that evaluates the Neuroretinal Rim in the ISNT quadrants is applied. The Neuroretinal Rim is the area of the optic disc that remains after exclusion of the optic cup. In the case of glaucoma, as a consequence of cupping, the NRR width in the ISNT quadrants becomes disproportionate. For a healthy eye, the rim area is in the order I > S > N > T. The NRR area is therefore calculated in the ISNT quadrants. Fig. 7 shows the four regions into which the NRR is divided, as well as the violation of the ISNT rule for a glaucomatous image.

Fig. 7. (A) Four regions of NRR division. (B) Rim width in the ISNT regions in both cases [29].

With the help of the optic disc and optic cup obtained from the segmentation step, an XOR operation is performed on the two images in order to obtain the NRR [29]; the result is shown in Fig. 8. The next step is the extraction of the NRR in the ISNT quadrants. For this purpose, a binary mask is generated taking into consideration the quadrant-wise degrees described in Fig. 7(A). Each generated mask, shown in Fig. 9, has a dimension of 512 × 512, equivalent to the size of the input image. The generated mask is rotated by 90° to extract the rim width in all four quadrants.

Fig. 8. Extracted NRR using XOR operation.

Fig. 9. Generated masks in the four ISNT quadrants. (A) Inferior region; (B) Superior region; (C) Nasal region; (D) Temporal region.

The rim width in the respective quadrants is obtained by performing an AND operation between the NRR image and the corresponding mask [4]. The results of these operations are shown in Fig. 10. Further, the rim area belonging to every quadrant is calculated by counting the number of white pixels in the image. Finally, the NRR ratio is calculated using the following formula,

Rim area ratio = (1 + ∑(R_I, R_S)) / (1 + ∑(R_N, R_T))    (8)

where,

R_I = rim area in the inferior quadrant
R_S = rim area in the superior quadrant
R_N = rim area in the nasal quadrant
R_T = rim area in the temporal quadrant

The rim width ratio is lower in healthy cases and high in case the patient has glaucoma.
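A sketch of the NRR extraction and rim-area ratio (assuming OpenCV/NumPy; the quadrant masks are assumed to be the 512 × 512 binary masks of Fig. 9, generated elsewhere):

```python
import cv2
import numpy as np

def rim_area_ratio(disc: np.ndarray, cup: np.ndarray, masks: dict) -> float:
    """Eq. 8: (1 + R_I + R_S) / (1 + R_N + R_T) from binary maps.

    `masks` maps the quadrant names 'I', 'S', 'N', 'T' to binary masks.
    """
    nrr = cv2.bitwise_xor(disc, cup)  # Neuroretinal Rim = disc minus cup
    # White-pixel count of the rim inside each quadrant mask.
    area = {q: np.count_nonzero(cv2.bitwise_and(nrr, m))
            for q, m in masks.items()}
    return (1 + area['I'] + area['S']) / (1 + area['N'] + area['T'])
```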

4.4.3. Extraction of blood vessels in ISNT quadrants and ratio evaluation

The consequences of the structural changes occurring in the fundus of an eye are used as decision-making criteria for glaucoma detection. Along the same lines, it is found that in abnormal cases the blood vessels tend to shift more towards the nasal side. Using this as a factor for abnormality detection, the blood vessel ratio is used as the third feature for glaucoma diagnosis [23].


Initially, the green channel of the input RGB fundus image is extracted, as the visibility of the blood vessels is clearer in the green channel than in the red and blue channels. The green channel of the input image is then converted into grayscale, and Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied. CLAHE is used to enhance the edges and improve the local contrast, such that the intensity values in the regions of blood vessels are lower than in the remaining part of the grayscale image. Applying CLAHE also helps with denoising, due to the histogram clipping and the redistribution of intensity values across the entire image. The blood vessels in the grayscale image were optimally visible against the background when a grid size of (6, 6) was applied and the clipping limit was set to 4. Further, we use a bottom-hat transform in order to enhance the darker blood vessels in a bright fundus image. The bottom-hat filter is selected because it brightens the dark vessel pixels relative to the rest of the image, in the following manner: first, the morphological closing operation is performed; next, the input image is subtracted from the closed image. This results in the final extraction of the blood vessels. The equation of the bottom-hat transform is as follows.

T_b(f) = (f * b) − f    (9)

where,

T_b(f) = bottom-hat transform of the input image
f = input image
'*' = closing operation
b = structuring element

Fig. 10. Rim width in the four ISNT quadrants.

Fig. 11. Threshold selection for blood vessel segmentation.


A structuring element (SE) of size (15, 15) is used for the closing operation. In order to obtain an appropriate threshold for mapping the blood vessels into a binary image, a histogram analysis of the grayscale-converted green channel of the input image and of the bottom-hat filtered image is performed. In the example shown in Fig. 11, we can observe that the pixels with the lowest intensity values of the histogram plot in Fig. 11(B) indicate blood vessels. On the contrary, in the bottom-hat filtered image, the brightest pixels, shown in Fig. 11(D), comprise the blood vessels. It was noticed that the lowest intensity value in a green channel image was approximately the highest intensity value of its corresponding bottom-hat filtered image. From these facts, it can be concluded that the threshold value should approximately lie within this range. The threshold value (T) is chosen using the following formula:
T = 3.15 × σ    (10)

σ = √( ∑_{i=1}^{N} (y_i − ȳ)² / N )    (11)

where,

σ = standard deviation
N = number of pixel intensity values
ȳ = mean of all the pixel intensity values
y_i = individual pixel intensity values

Lastly, a binary image containing only the blood vessels is obtained by the thresholding operation, shown in Fig. 12(D). Further, the extracted blood vessel image is divided into ISNT quadrants in a manner similar to that described in section 4.4.2. The blood vessel ratio is calculated using the following formula,

Blood vessel area ratio = (1 + ∑(B_I, B_S)) / (1 + ∑(B_N, B_T))    (12)

where,

B_I = area covered by blood vessels in the inferior quadrant
B_S = area covered by blood vessels in the superior quadrant
B_N = area covered by blood vessels in the nasal quadrant
B_T = area covered by blood vessels in the temporal quadrant

The result of the blood vessel extraction in the ISNT quadrants is shown in Fig. 12.

Fig. 12. Blood Vessels extraction and their distribution in ISNT quadrants.
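A sketch of the vessel extraction chain (assuming OpenCV; the parameter values are those stated above, while the elliptical SE shape is an assumption, and the text does not fully specify which image σ is computed over — the sketch uses the bottom-hat output):

```python
import cv2
import numpy as np

def extract_vessels(image: np.ndarray) -> np.ndarray:
    """Binary blood-vessel map from an RGB fundus image (BGR order)."""
    green = image[:, :, 1]  # vessels are clearest in the green channel
    clahe = cv2.createCLAHE(clipLimit=4.0, tileGridSize=(6, 6))
    enhanced = clahe.apply(green)
    # Bottom-hat: closing minus input, brightens the dark vessels (Eq. 9).
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    bottomhat = cv2.morphologyEx(enhanced, cv2.MORPH_BLACKHAT, se)
    # Threshold T = 3.15 * sigma of the pixel intensities (Eqs. 10-11).
    threshold = 3.15 * float(np.std(bottomhat))
    _, vessels = cv2.threshold(bottomhat, threshold, 255, cv2.THRESH_BINARY)
    return vessels
```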

4.5. Classification

In order to develop a robust application, the classification system has been generalized by gathering predictions from three supervised algorithms. The features used are the CDR, the ISNT NRR ratio and the ISNT blood vessel ratio, described in section 4.4. The final decision is made using hard majority voting. A total of 520 images are used for classification, of which 380 are glaucomatous while 140 are healthy cases. In order to train the SVM, adaboost and NN classifiers, 70% of the images were used for training and 15% each for validation and testing respectively.

4.5.1. Classification using SVM

SVM is chosen as one of the classifiers because it avoids overfitting through the provision of various regularization parameters. SVM has also been widely used by researchers to perform binary classification for glaucoma detection, because it can turn linearly inseparable data into linearly separable data with the help of the kernel trick. The selection of correct parameters and kernels is challenging, but choosing them accurately helps in attaining noteworthy results.

For the purpose of classification, an SVM classifier with an RBF kernel is used. The parameter gamma defines the range of influence of a training sample. Very large values of gamma create a risk of overfitting the model and make it insensitive to the regularization parameter. The regularization parameter C tries to maintain a balance between margin maximization and the corresponding misclassifications. Optimal results were achieved for the dataset used when gamma was set to 1 and C was set to 10.

4.5.2. Classification using NN

A NN is applied for glaucoma classification because (a) the neurons acquire knowledge by themselves, quickly and with high accuracy, and (b) it works efficiently under real-time conditions.

The Multilayer Perceptron model is a fully connected deep neural network comprising one input layer, two hidden layers and one output layer. The first hidden layer is composed of three neurons, whereas the second constitutes two neurons. ReLU is used as the activation function for the hidden layers, whereas the Sigmoid function is applied to the output layer.


Binary cross entropy, described in the equations below, is used as the loss function, as it tends to decrease the dissimilarity between the actual and predicted values.

BC = − ∑_{i=1}^{C=2} g_i log(Sigm(p_i))    (13)

BC = − g_i log(Sigm(p_i)) − (1 − g_i) log(1 − Sigm(p_i))    (14)

where,

BC = binary cross-entropy,
C = number of classes,
g = ground truth values,
p = predicted values,
Sigm( ) = Sigmoid function

Adam, derived from adaptive moment estimation, is used as the optimizer because the learning rates are computed per parameter, which helps overcome saddle points during training. The model was trained for 120 epochs, and a batch size of 2 helped in acquiring optimal results. From Fig. 13, it can be observed that the loss per epoch decreases gradually and eventually becomes 0.

Fig. 13. Decrease in loss during training phase of Neural Network.

4.5.3. Classification using adaboost

The meta-learning approach with the adaboost technique is selected in order to increase the efficacy of binary classification using a decision tree. It is an iterative algorithm that strengthens the classifier by reducing the bias error, taking into consideration the difference between the predicted and the actual values. The following equation describes the parameters of a boosted classifier.

F_T(x) = ∑_{t=1}^{T} f_t(x)    (15)

where,

F_T(x) = boosted classifier,
f_t = weak learner,
x = input,
T = total number of classifiers

The training error E_t is described as follows:

E_t = ∑_i E[F_{t−1}(x_i) + α_t h(x_i)]    (16)

where,

E_t = training error,
α = weight of the classifier,
h(x_i) = output hypothesis,
E(F) = error function

The chosen base estimator is a decision tree classifier, for which the max depth as well as the learning rate are equal to 1. The maximum number of estimators at which boosting is terminated is set to 50, while the random state is equal to 20. Compared to the recall value achieved using the plain decision tree algorithm, the adaboost ensemble with a decision tree as the base classifier helped in achieving a recall value of 100%, making adaboost ensemble learning the better choice for glaucoma classification.

4.5.4. Final prediction

The working principle of majority voting is that after every model makes a prediction for the input instance in terms of [0/1], i.e. [healthy/glaucoma], the final prediction is the one made by at least two of the three classifiers described in section 4.5. The results show that the glaucoma cases are classified correctly.
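A sketch of the three classifiers and the hard majority vote (assuming scikit-learn; the Keras MLP of section 4.5.2 is approximated here by MLPClassifier with the same 3-2 hidden-layer layout, so the exact training behaviour differs from the Keras model described above):

```python
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier

# Feature vectors: [CDR, ISNT NRR ratio, ISNT blood vessel ratio] per image.
svm = SVC(kernel='rbf', gamma=1, C=10)
mlp = MLPClassifier(hidden_layer_sizes=(3, 2), activation='relu')
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # `base_estimator` in scikit-learn < 1.2
    n_estimators=50, learning_rate=1.0, random_state=20)

# Hard voting: the label predicted by at least two of the three wins.
ensemble = VotingClassifier(
    estimators=[('svm', svm), ('nn', mlp), ('ada', ada)], voting='hard')
# Usage: ensemble.fit(X_train, y_train); ensemble.predict(X_test)
```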
5. Results and discussion

This section discusses the results obtained by the algorithms used for input image validation, segmentation, feature extraction and classification. We also compare the results of the proposed methodology with other work and present a statistical analysis of the features used for classification.

In the proposed CAD pipeline, the foremost step is to detect whether the input image is a retinal fundus image. The detection is done using the LeNet architecture and the results are shown in Fig. 14. As seen in the figure, the training loss is almost equal to the validation loss, indicating that the model has been trained very well.

Fig. 14. Training loss and accuracy using LeNet architecture for fundus/non-fundus image classification.

The optic disc and optic cup play the most crucial role in the classification process, because the evaluation of two features, namely the CDR and the ISNT ratio, is solely based upon the accuracy of segmentation. From the literature surveyed in section 2, although optic cup and disc segmentation using image processing techniques can achieve noteworthy results, these techniques face major challenges. They are highly impacted by fundus image quality: fundus images are prone to illumination inhomogeneities and variations in colour, contrast, saturation, image resolution, sensor quality, etc.


On the other hand, most image processing algorithms consider a similar set of features for medical image analysis and thus for glaucoma detection. As a result, these factors give rise to a few limitations in OD and OC segmentation, described as follows: (a) the methods are sensitive to image distortions and noise — for example, the presence of gaussian noise in a fundus image can influence segmentation results; (b) structural characteristics of a fundus image are not reflected in histogram analysis, which may impact glaucoma assessment; (c) the probability of performing efficiently in cross-dataset evaluation is low; (d) threshold selection is not always straightforward. Thus, the deep learning U-Net model proves to be the best choice for semantic segmentation of the OD and OC with high accuracy. Fig. 15 illustrates the stages of the glaucoma detection process, starting from ROI extraction through to glaucoma classification. Five random images were selected from the testing sets of five of the datasets used for training the model. The accuracy in class prediction confirms that the model has been well trained and is highly reliable.

Fig. 15. Results achieved using proposed methodology. Column (A) represents ROI extracted images from 5 different datasets. Images in columns (B) and (C) display the segmented images of OD and OC respectively using the U-Net architecture. Column (D) illustrates the extracted Neuro-retinal Rim. Images in column (E) show the distribution of NRR in the four ISNT quadrants. Column (F) shows detected blood vessels on an optic disc image masked with the predicted optic disc of column (B). Columns (G), (H) and (I) display the values obtained for CDR, ISNT Ratio and Blood vessel ratio respectively.

In the work of Sevastopolsky [33], U-Net was used for segmentation with the advantage of reduced glaucoma prediction time; the input images were pre-processed using CLAHE. Strategies like dropout layers and data augmentation were applied for better performance in [21]. Ferreira et al. [31] used the U-Net convolutional neural network architecture only for optic disc segmentation and achieved commendable results. Unlike Sevastopolsky [33], in our work there is no need for image pre-processing, and images can be segmented as they are. Table 2 shows the dice scores that our proposed methodology has achieved in comparison with these works.

Table 2
Result comparison for Optic Disc and Optic Cup segmentation.

Author                  Dataset                                              Dice Score (OD)   Dice Score (OC)
Sevastopolsky [33]      DRIONS-DB                                            0.94              –
                        RIM-ONE v.3                                          0.95              0.82
                        DRISHTI-GS                                           –                 0.85
Maninis et al. [21]     DRIONS-DB                                            0.97              –
                        RIM-ONE v.3                                          0.96              –
Ferreira et al. [31]    RIM-ONE v.3, DRISHTI-GS                              0.84              –
Yin et al. [38]         Origa-light                                          –                 0.83
Proposed methodology    DRISHTI-GS, RIM-ONE, DRIONS-DB, NIO, DRIVE, JSIEC    0.93              0.87

Table 3 describes the ranges of feature values within which healthy and glaucomatous cases lie. In non-glaucomatous cases, the feature values are smaller than in glaucomatous cases. As the number of training samples increases, the accuracy of the system improves.

Table 3
Feature-wise mean and standard deviation values of healthy and glaucoma cases for training examples.

Features             Glaucoma (mean ± SD)   Healthy (mean ± SD)
CDR                  0.8 ± 0.320            0.2 ± 0.240
Blood Vessel Ratio   2.9 ± 0.160            1.5 ± 0.250
ISNT Ratio           1.6 ± 2.890            1.1 ± 0.216

In order to assess the significance of the extracted features with respect to the classifiers applied for glaucoma classification, the Friedman statistical test has been applied to the experimental results. Statistical analysis helps determine whether the experimental results are probably due to chance or whether they carry some factor of interest to the concerned topic. The Friedman test is a nonparametric statistical test that is used to evaluate differences in results across samples taken from dissimilar populations. For an input matrix [m_ij]_{N×k}, the mean rank r_.j is calculated for every column j of the rank matrix. "N" is the number of rows (classification models) and "k" is the number of columns (features) in the input matrix.


The Friedman statistic Fr is given by:

Fr = [12N / (k(k + 1))] ∑_{j=1}^{k} ( r_.j − (k + 1)/2 )²    (17)

If N > 15 or k > 4, the p-value is given by:

P( χ²_{k−1} ≥ Fr )    (18)

where,

χ²_{k−1} = critical value obtained for the chi-square test at (k − 1) degrees of freedom.

The p-value is obtained from tables prepared for the Friedman test. If the p-value is less than the significance level α, the null hypothesis H0 is rejected; otherwise we fail to reject H0. Rejection of H0 indicates that the result of the experiment is statistically significant.

In our experimentation, the Friedman test has been applied to the data described in Table 4. The null hypothesis H0 can be stated as:

H0 = The impact of the ensemble of features on classification accuracy remains unchanged.

The test results indicated that the ensemble of features significantly altered the classification accuracy (Fr = 8.3, degrees of freedom = (4, 4), p-value < 0.05). Using the Friedman test, the p-value obtained is 0.021 (p-value < α), whereas the critical value CV described in the Friedman table at (4, 4) degrees of freedom is 7.8 (Fr > CV). Thus, we reject the null hypothesis H0, supporting the conclusion that all three features prove to be significant for the purpose of accurate classification.

Table 4
Testing results displayed in the form of macro-averaged F-1 scores, after training individual/multiple classifiers with individual/multiple features. Columns include the features abbreviated as Cup-to-Disc Ratio (CDR), ISNT Ratio (ISNT) and Blood Vessel Ratio (BVR).

Model/Features        CDR    ISNT   BVR    CDR + ISNT + BVR
SVM                   100    100    100    100
NN                    41     40     75     49
Adaboost              42     50     72     55
SVM + NN + Adaboost   100    100    100    100
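A sketch of this test on the per-feature F1 scores of Table 4 (assuming SciPy; the arrays below are the columns of Table 4 across the four classifier rows, and SciPy computes the statistic and p-value directly):

```python
from scipy.stats import friedmanchisquare

# Macro-averaged F1 scores per feature set (columns of Table 4)
# for SVM, NN, Adaboost and the voting ensemble (rows).
cdr      = [100, 41, 42, 100]
isnt     = [100, 40, 50, 100]
bvr      = [100, 75, 72, 100]
combined = [100, 49, 55, 100]

statistic, p_value = friedmanchisquare(cdr, isnt, bvr, combined)
if p_value < 0.05:
    print(f"Reject H0: features matter (Fr={statistic:.2f}, p={p_value:.3f})")
```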
In the work of Ruengkitpinyo et al. [30], 9 features were extracted using the ISNT rule and an SVM classifier was used for glaucoma image classification. Deepika and Maheswari [11] performed pre-processing by extracting the green component from the RGB image, CLAHE for blood vessel extraction, and Active Contour Models for feature extraction. Hatanaka et al. [14] proposed a comparison of optic disc profiles: the profile for glaucoma cases appeared as a broad mountain with a short skirt, while that for healthy cases tended to appear as a narrow mountain with long skirts. Morphological closing and opening operations using a structuring element were applied for segmentation in [32]. In the proposed methodology, the ensemble of classifiers and the selection of significant features provided an advantage in achieving excellent results. Considering the results summarized in Table 5, this work has achieved remarkable results, and we believe that the developed desktop app would prove to be highly reliable for health practitioners across the world.

• Addressing the first issue — whether training the model with different kinds of datasets and increasing the number of training and testing images produced a robust model — Table 2 provides sufficient evidence that the proposed methodology shows competent dice scores compared to other works.
• We also addressed concerns regarding the generalization power of the system by obtaining predictions from more than one classifier. For this purpose, each of the six datasets is used in turn as a test set, with the other five datasets used for training and validation, using all the datasets described in section 3.1. Further, to verify whether the proposed methodology works up to the mark in facing real-world challenges, the trained model was tested on two completely different and publicly available datasets, HRF and APTOS, in order to ensure that generalization is achieved.

5.1. Performance on HRF dataset

From the accuracy results indicated in Table 6, it can be seen that our method could eliminate noise by accurately segmenting lesion images interfered with by the dark spots contained in the HRF dataset, and could successfully predict negative and positive cases. Table 6 also lists the precision, recall and F1 scores achieved on the HRF dataset.

5.2. Performance on APTOS dataset

Asia Pacific Tele-Ophthalmology Society (APTOS) is a real-world dataset comprising noisy images containing artifacts, underexposed or overexposed. The images are gathered from multiple clinics using a variety of cameras over an extended period of time, which introduces further variation. As a result, working with such a dataset becomes a challenging task. Of the 250 fundus images taken from the APTOS dataset and labelled as glaucomatous or healthy, 130 images are glaucomatous while 120 images are normal. After testing these images on the proposed model, the system achieved precision, recall and F1 scores of 99.1%, 98.3% and 98.7% respectively, as indicated in Table 6.

• The images in the DRISHTI-GS dataset are of patients between 40 and 80 years of age, and those of the DRIVE dataset of patients aged 25–90 years. The RIM-ONE dataset consists of high quality images with a special focus on OD segmentation. DRIONS-DB comprises images of patients of Caucasian ethnicity with a mean age of 53 years. Images from the JSIEC and NIO datasets have varied illumination conditions. From the results presented in Table 6, it is visible that the proposed system was well trained and attained high accuracy when tested with images from an unknown dataset. Also, Tables 7 and 8 elucidate that images were well segmented despite the heterogeneity in image conditions.

As anticipated, the results show that the model is generalized and can work efficiently with unknown images. A high sensitivity is necessary for a model developed in the healthcare domain, since it implies that cases having the disease are rarely misdiagnosed. As evident from Table 6, the proposed methodology has achieved noteworthy sensitivity scores greater than 96% on all the described datasets. The results described therein thus demonstrate that the proposed model is robust to different datasets that come from unique sources.

• We also focused on accuracy enhancement by ensuring that the provided features carry significance in glaucoma classification. In this context, the Friedman statistical test ensures that the features carry significance in classification, and high F1 scores were observed when the ensemble of classifiers was used for classification. Our method is superior to other methods, as we consider three features for glaucoma diagnosis with a higher percentage of correct classification.

As a result, the proposed methodology is clinically significant, as the classification accuracy outperforms the accuracy obtained by existing methods. In order to check whether the model is prone to overfitting, 10-fold cross validation was performed. Table 9 describes the cross validation results obtained while training the models for OD and OC segmentation.


Table 5
Performance comparison of supervised machine learning algorithms for Glaucoma detection.
Author Classifier Dataset Accuracy Specificity Sensitivity

Ruengkitpinyo et al. [30] SVM Mettapracharak 90% 91.5% 92.6%


Hospital, Thailand
Deepika and Maheswari [11] SVM, ANFIS HRF dataset 97.7% 82% 95.7%
Hatanaka et al. [14] ANN, RBF, IMAGEnet 78% 80% 75%
k-NN, SVM
Septiarini and Harjoko [32] SVM 60 images 90% 100% 80%
Guo et al. [13] Deep CNN DRISHTI-GS1, 76.42% 76.2% 76.61%
ORIGA
DRISHTI-GS
Proposed Methodology SVM, NN, Adaboost RIM-ONE 100% 100% 100%
DRIONS-DB
NIO
DRIVE
JSIEC

Table 6 Table 9
Quantitative result comparison by calculating accuracy, specificity, sensitivity, Accuracy, specificity, sensitivity, precision and F1 score metrics obtained after
precision and F1 score values on respective test datasets while the ensemble of performing cross validation while training U-Net model for OD and OC
classifiers is trained on remaining datasets for proposed model. Images in the segmentation.
datasets marked with (*) symbol have been tested without training images Dataset Accuracy Specificity Recall Precision F1
therein on the proposed model. Score
Test dataset Accuracy Specificity Recall Precision F1 Score OD 97.9 95.1 99.1 97.9 98.5
RIM-ONE 96.9 95.9 97.9 96 96.9 OC 89.4 80.5 96.6 86 91
DRISHTI 98 97.5 98.3 98.3 98.3
DRIONS 99 98.2 100 98.1 99
97.5
NIO 98.3 100 95.2 100
well while predicting labels for images from completely unseen datasets,
JSIEC 99.1 98.7 100 97.7 98.8
DRIVE 97.6 95.8 100 95 97.4 namely, HRF and APTOS. In order to check the robustness of the
*HRF 98.8 99.2 98.3 99.1 98.7 developed model, HRF and APTOS datasets were not used as a part of
*APTOS 98.2 97.5 99.3 98.7 98.2 training, in an attempt to test the model against unknown set of images.
Average 98.8 99 98.7 98.9 98.8 The results have been described in Table 6 of the manuscript. Following
are a few limitations of the proposed work:

• The system cannot detect the presence of Peri-Papillary Atrophy (PPA), which could affect boundary definition during OD segmentation and CDR calculation. However, the ISNT and blood vessel ratios lay within range in such cases, so misclassification could be avoided.
• The model cannot be trained on an unlabelled dataset. Furthermore, the U-Net model for OD and OC segmentation is based on semantic segmentation, which requires binary ground-truth images representing the OD and OC in the fundus images.

Table 7
Comparison of optic disc segmentation results from U-Net for each dataset. The following metrics (in average percentage) have been calculated on the respective test datasets while the proposed model is trained on the remaining datasets.

U-Net | RIM-ONE | DRISHTI-GS | DRIONS-DB | NIO | JSIEC | DRIVE
Input | 169 | 101 | 110 | 118 | 124 | 44
Jaccard | 0.988 | 0.976 | 0.978 | 0.974 | 0.980 | 0.957
Precision | 0.959 | 0.969 | 0.971 | 0.974 | 0.954 | 0.990
Recall | 0.979 | 0.940 | 0.957 | 0.95 | 0.976 | 0.904
F1 Score | 0.969 | 0.954 | 0.964 | 0.962 | 0.965 | 0.926
Specificity | 0.944 | 0.941 | 0.948 | 0.987 | 0.975 | 0.956
Accuracy | 0.964 | 0.94 | 0.954 | 0.974 | 0.975 | 0.931
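Assuming the predicted and ground-truth optic disc/cup masks are binary arrays, the per-dataset metrics of Tables 7 and 8 can be computed as in the sketch below; note that the F1 score of two binary masks is exactly the Dice coefficient.

```python
# Segmentation metrics from a predicted mask and its ground truth.
import numpy as np

def seg_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = (pred & gt).sum()               # true positive pixels
    tn = (~pred & ~gt).sum()             # true negative pixels
    fp = (pred & ~gt).sum()
    fn = (~pred & gt).sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # a.k.a. sensitivity
    return {"jaccard": tp / (tp + fp + fn),
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / (precision + recall),  # Dice
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / pred.size}

pred = np.zeros((64, 64)); pred[20:40, 20:40] = 1   # toy masks
gt = np.zeros((64, 64)); gt[22:42, 22:42] = 1
print(seg_metrics(pred, gt))
```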
6. Conclusions

The novelty of this research work lies in developing a methodology that is robust and generalizes well to real-time fundus images. This paper proposes the development of a desktop application as part of an automated CAD system for glaucoma detection using retinal fundus images. Images input to the system are validated using a deep learning approach, the LeNet architecture; despite being a compact CNN, the accuracy it obtains is commendable. ROI extraction using the brightest spot algorithm is presented: the image is converted to grayscale and preprocessed with a Gaussian blur in order to improve the accuracy of ROI detection. Further, the segmentation of the optic disc and optic cup is performed using the U-Net architecture, which proves to be a very efficient algorithm for medical image segmentation. Finally, three features, namely the CDR, the NRR ISNT ratio and the blood vessel ISNT ratio, have been considered as the decision criteria for glaucoma detection.

Table 8
Comparison of optic cup segmentation results from U-Net for each dataset. The following metrics (in average percentage) have been calculated on the respective test datasets while the proposed model is trained on the remaining datasets.

U-Net | RIM-ONE | DRISHTI-GS | DRIONS-DB | NIO | JSIEC | DRIVE
Input | 169 | 101 | 110 | 118 | 124 | 44
Jaccard | 0.868 | 0.853 | 0.909 | 0.896 | 0.9 | 0.92
Precision | 0.979 | 0.984 | 0.985 | 0.92 | 0.947 | 0.88
Recall | 0.970 | 0.969 | 0.971 | 0.902 | 0.873 | 0.923
F1 Score | 0.974 | 0.977 | 0.978 | 0.910 | 0.848 | 0.931
Specificity | 0.971 | 0.973 | 0.974 | 0.966 | 0.882 | 0.878
Accuracy | 0.97 | 0.977 | 0.972 | 0.91 | 0.89 | 0.970
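One possible OpenCV realization of the brightest-spot ROI step summarized above (grayscale conversion, Gaussian blur, brightest-pixel localization) is sketched below; the blur kernel, crop half-size and file names are illustrative assumptions rather than the exact values used in this work.

```python
# Brightest-spot ROI extraction sketch: the optic disc is typically the
# brightest region of a fundus photograph.
import cv2

image = cv2.imread("fundus.jpg")                  # hypothetical input file
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Blurring suppresses isolated bright pixels (noise, reflections) so that
# minMaxLoc returns a stable bright region rather than an outlier.
blurred = cv2.GaussianBlur(gray, (41, 41), 0)
_, _, _, max_loc = cv2.minMaxLoc(blurred)         # (x, y) of brightest pixel
x, y = max_loc
half = 128                                        # assumed ROI half-size
roi = image[max(0, y - half):y + half, max(0, x - half):x + half]
cv2.imwrite("roi.png", roi)
```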


A binary mask of the same dimensions as the input image is used to divide the image into four ISNT quadrants; the area of the image in each quadrant is then used to calculate the ISNT ratio.
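One way to realize this quadrant split is sketched below, using 45° diagonals through the optic disc centre. The nasal/temporal assignment depends on whether the image shows a left or right eye, so the mapping here is an assumption for one orientation, and (I + S)/(N + T) is used as one common form of the ISNT ratio.

```python
# Splitting a binary mask into ISNT quadrants about the disc centre.
import numpy as np

def isnt_areas(mask, centre):
    rows, cols = np.indices(mask.shape)
    dy, dx = rows - centre[0], cols - centre[1]
    inf = mask[(dy > 0) & (np.abs(dy) >= np.abs(dx))].sum()
    sup = mask[(dy < 0) & (np.abs(dy) >= np.abs(dx))].sum()
    nas = mask[(dx > 0) & (np.abs(dx) > np.abs(dy))].sum()  # assumed nasal side
    tem = mask[(dx < 0) & (np.abs(dx) > np.abs(dy))].sum()  # assumed temporal side
    return inf, sup, nas, tem

mask = np.ones((100, 100), dtype=np.uint8)   # toy binary mask
i, s, n, t = isnt_areas(mask, centre=(50, 50))
print((i + s) / (n + t))                     # ISNT-style ratio
```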
The p-value < 0.05 indicates that the features used and the classifiers applied are significant in terms of glaucoma detection. Finally, using these three features as input parameters, classification is performed by majority voting over the SVM, NN and Adaboost predictions. It is observed that the choice of classification method influences both accuracy and efficiency. In summary, the objectives of the research work have been stated, and the results demonstrate that the identified challenges have been tackled using the methodologies discussed above.
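For reference, the Friedman test behind this p-value can be run with SciPy as sketched below; the three arrays are placeholder per-image feature measurements, not the study's data.

```python
# Friedman test over the three feature samples (CDR, ISNT ratio, BVR).
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(1)
cdr = rng.normal(0.6, 0.10, 50)    # placeholder measurements
isnt = rng.normal(1.1, 0.20, 50)
bvr = rng.normal(0.9, 0.15, 50)

stat, p = friedmanchisquare(cdr, isnt, bvr)
print(f"statistic = {stat:.3f}, p-value = {p:.4f}")
# p < 0.05 indicates a statistically significant difference.
```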
To further conclude, the proposed pipeline can be used by any system and is completely offline. The desktop application can support medical practitioners in decision making and in the early detection of glaucoma, thus helping patients avoid an irreversible loss of eyesight. Furthermore, as part of future work, we intend to extend the proposed methodology to other types of medical images, for example OCTA images. More features can also be considered beyond those described in the proposed methodology.

Declaration of competing interest

The author declares that there are no conflicts of interest.

Acknowledgements

Funding: No external funding sources supported this research article. We especially thank Dr. Aditya Kelkar from NIO, Pune for his valuable contribution in the provision and classification of the images.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ibmed.2021.100038.

References

[1] URL: https://archive.org/details/ukbench.
[2] Acharya R, Ng YE, Suri JS. Image modeling of the human eye. Artech House; 2008.
[3] Agarwal A, Gulia S, Chaudhary S, Dutta MK, Travieso CM, Alonso-Hernández JB. A novel approach to detect glaucoma in retinal fundus images using cup-disk and rim-disk ratio. In: 2015 4th international work conference on bioinspired intelligence (IWOBI). IEEE; 2015. p. 139–44.
[4] Ahmad H, Yamin A, Shakeel A, Gillani SO, Ansari U. Detection of glaucoma using retinal fundus images. In: 2014 international conference on robotics and emerging allied technologies in engineering (iCREATE). IEEE; 2014. p. 321–4.
[5] Al Ghamdi M, Li M, Abdel-Mottaleb M, Shousha MA. Semi-supervised transfer learning for convolutional neural networks for glaucoma detection. In: ICASSP 2019 - 2019 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE; 2019. p. 3812–6.
[6] Aquino A, Gegúndez-Arias ME, Marín D. Detecting the optic disc boundary in digital fundus images using morphological, edge detection, and feature extraction techniques. IEEE Trans Med Imag 2010;29:1860–9.
[7] Ayub J, Ahmad J, Muhammad J, Aziz L, Ayub S, Akram U, Basit I. Glaucoma detection through optic disc and cup segmentation using k-mean clustering. In: 2016 international conference on computing, electronic and electrical engineering (ICE Cube). IEEE; 2016. p. 143–7.
[8] Balasubramanian T, Krishnan S, Mohanakrishnan M, Rao KR, Kumar CV, Nirmala K. HOG feature based SVM classification of glaucomatous fundus image with extraction of blood vessels. In: 2016 IEEE annual India conference (INDICON). IEEE; 2016. p. 1–4.
[9] Darsana S, Nair RM. Mask image generation for segmenting retinal fundus image features into ISNT quadrants using array centroid method. International Journal of Research in Engineering and Technology 2014;3:263–7.
[10] Das P, Nirmala S, Medhi JP. Detection of glaucoma using neuroretinal rim information. In: 2016 international conference on accessibility to digital world (ICADW). IEEE; 2016. p. 181–6.
[11] Deepika E, Maheswari S. Earlier glaucoma detection using blood vessel segmentation and classification. In: 2018 2nd international conference on inventive systems and control (ICISC). IEEE; 2018. p. 484–90.
[12] Dutta MK, Mourya AK, Singh A, Parthasarathi M, Burget R, Riha K. Glaucoma detection by segmenting the super pixels from fundus colour retinal images. In: 2014 international conference on medical imaging, m-health and emerging communication systems (MedCom). IEEE; 2014. p. 86–90.
[13] Guo F, Mai Y, Zhao X, Duan X, Fan Z, Zou B, Xie B. Yanbao: a mobile app using the measurement of clinical parameters for glaucoma screening. IEEE Access 2018;6:77414–28.
[14] Hatanaka Y, Muramatsu C, Sawada A, Hara T, Yamamoto T, Fujita H. Glaucoma risk assessment based on clinical data and automated nerve fiber layer defects detection. In: 2012 annual international conference of the IEEE engineering in medicine and biology society. IEEE; 2012. p. 5963–6.
[15] Hatanaka Y, Noudo A, Muramatsu C, Sawada A, Hara T, Yamamoto T, Fujita H. Vertical cup-to-disc ratio measurement for diagnosis of glaucoma on fundus images. In: Medical imaging 2010: computer-aided diagnosis. International Society for Optics; 2010.
[16] Kanse SS, Yadav DM. Retinal fundus image for glaucoma detection: a review and study. J Intell Syst 2019;28:43–56.
[17] Kavitha S, Karthikeyan S, Duraiswamy K. Early detection of glaucoma in retinal images using cup to disc ratio. In: 2010 second international conference on computing, communication and networking technologies. IEEE; 2010. p. 1–5.
[18] Kim J, Tran L, Chew EY, Antani S. Optic disc and cup segmentation for glaucoma characterization using deep learning. In: 2019 IEEE 32nd international symposium on computer-based medical systems (CBMS). IEEE; 2019. p. 489–94.
[19] Li A, Wang Y, Cheng J, Liu J. Combining multiple deep features for glaucoma classification. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE; 2018. p. 985–9.
[20] Lotankar M, Noronha K, Koti J. Detection of optic disc and cup from color retinal images for automated diagnosis of glaucoma. In: 2015 IEEE UP section conference on electrical computer and electronics (UPCON). IEEE; 2015. p. 1–6.
[21] Maninis KK, Pont-Tuset J, Arbeláez P, Van Gool L. Deep retinal image understanding. In: International conference on medical image computing and computer-assisted intervention. Springer; 2016. p. 140–8.
[22] Narasimhan K, Vijayarekha K. An efficient automated system for glaucoma detection using fundus image. J Theor Appl Inf Technol 2011;33:104–10.
[23] Nayak J, Acharya R, Bhat PS, Shetty N, Lim TC. Automated diagnosis of glaucoma using digital fundus images. J Med Syst 2009;33:337.
[24] Pal S, Chatterjee S. Mathematical morphology aided optic disk segmentation from retinal images. In: 2017 3rd international conference on condition assessment techniques in electrical systems (CATCON). IEEE; 2017. p. 380–5.
[25] Panda R, Puhan N, Panda G. Mean curvature and texture constrained composite weighted random walk algorithm for optic disc segmentation towards glaucoma screening. Healthcare Technology Letters 2018;5:31–7.
[26] Panda R, Puhan NB, Rao A, Padhy D, Panda G. Recurrent neural network based retinal nerve fiber layer defect detection in early glaucoma. In: 2017 IEEE 14th international symposium on biomedical imaging (ISBI 2017). IEEE; 2017. p. 692–5.
[27] Pavithra G, Anushree G, Manjunath T, Lamani D. Glaucoma detection using IP techniques. In: 2017 international conference on energy, communication, data analytics and soft computing (ICECDS). IEEE; 2017. p. 3840–3.
[28] Quigley HA, Broman AT. The number of people with glaucoma worldwide in 2010 and 2020. Br J Ophthalmol 2006;90:262–7.
[29] Ruengkitpinyo W, Vejjanugraha P, Kongprawechnon W, Kondo T, Bunnun P, Kaneko H. An automatic glaucoma screening algorithm using cup-to-disc ratio and ISNT rule with support vector machine. In: IECON 2015 - 41st annual conference of the IEEE industrial electronics society. IEEE; 2015. p. 000517–000521.
[30] Ruengkitpinyo W, Vejjanugraha P, Kongprawechnon W, Kondo T, Bunnun P, Kaneko H. An automatic glaucoma screening algorithm using cup-to-disc ratio and ISNT rule with support vector machine. In: IECON 2015 - 41st annual conference of the IEEE industrial electronics society. IEEE; 2015. p. 000517–000521.
[31] dos Santos Ferreira MV, de Carvalho Filho AO, de Sousa AD, Silva AC, Gattass M. Convolutional neural network and texture descriptor-based automatic detection and diagnosis of glaucoma. Expert Syst Appl 2018;110:250–63.
[32] Septiarini A, Harjoko A. Automatic glaucoma detection based on the type of features used: a review. J Theor Appl Inf Technol 2015;72.
[33] Sevastopolsky A. Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network. Pattern Recogn Image Anal 2017;27:618–24.
[34] Sharma P, Sample PA, Zangwill LM, Schuman JS. Diagnostic tools for glaucoma detection and management. Surv Ophthalmol 2008;53:S17–32.
[35] Soltani A, Battikh T, Jabri I, Mlouhi Y, Lakhoua MN. Study of contour detection methods as applied on optic nerve's images for glaucoma diagnosis. In: 2016 international conference on control, decision and information technologies (CoDIT). IEEE; 2016. p. 083–087.
[36] Vijapur NA, Kunte RSR. Glaucoma detection by using Pearson-R correlation filter. In: 2015 international conference on communications and signal processing (ICCSP). IEEE; 2015. p. 1194–8.
[37] Virk JK, Singh M, Singh M. Cup-to-disk ratio (CDR) determination for glaucoma screening. In: 2015 1st international conference on next generation computing technologies (NGCT). IEEE; 2015. p. 504–7.
[38] Yin F, Liu J, Wong DW, Tan NM, Cheng J, Cheng CY, Tham YC, Wong TY. Sector-based optic cup segmentation with intensity and blood vessel priors. In: 2012 annual international conference of the IEEE engineering in medicine and biology society. IEEE; 2012. p. 1454–7.
