Report 4
Submitted by
2020-2024
Shekharesh Barik
Associate Professor, Dept. of CSE
Certificate
Certificate
This is to certify that this is a bona fide project report, titled “Diabetic Retinopathy
Detection using Deep Learning”, done satisfactorily by Rishub Kumar (2001229037),
Aditi Pradhan (2001229062), and Bishes Sinha (2001229076) in partial fulfillment of the
requirements for the degree of B.Tech. in Computer Science & Engineering under Biju
Patnaik University of Technology (BPUT).
This project report on the above-mentioned topic has not been submitted for any other
examination earlier in this institution and does not form part of any other course
undergone by the candidates.
We express our indebtedness to our guide, Prof. Shekharesh Barik of the Computer
Science & Engineering department, who spared his valuable time to go through the
manuscript and offer his scholarly advice on the writing. His guidance, encouragement,
and all-out help have been invaluable to us; words fall short of expressing our gratitude
and thankfulness to him.
We are grateful to all the teachers of the Computer Science & Engineering department,
DRIEMS, for their encouragement, advice, and help.
At the outset, we would like to express our sincere gratitude to Prof. Surajit Mohanty,
H.O.D. of the Computer Science & Engineering department, for the moral support he
extended to us throughout the duration of this project.
We are also thankful to our friends who have helped us, directly or indirectly, toward the
success of this project.
Diabetic retinopathy (DR) is a severe complication of diabetes mellitus that affects the retina
and can lead to vision impairment or blindness if left untreated. Early and accurate diagnosis
is paramount for effective intervention and management. In recent years, deep learning, a
subset of machine learning, has demonstrated remarkable success in various healthcare
applications, including medical image analysis. CNNs, a type of deep learning model, excel
in image classification tasks and have shown promise in automating the diagnosis of various
medical conditions. This paper leverages CNNs to predict diabetic retinopathy by analyzing
retinal images. The model is trained on a dataset of labeled retinal images, learning to
identify the characteristic signs of diabetic retinopathy, such as microaneurysms,
hemorrhages, and exudates. The methodology involves preprocessing of retinal images, data
augmentation techniques, and the design of a CNN architecture optimized for diabetic
retinopathy prediction. The model's performance is evaluated using various metrics, including
sensitivity, specificity, and accuracy, achieving an impressive accuracy rate of 94-96%.
Preliminary results indicate the potential of CNNs in accurately predicting diabetic
retinopathy from retinal images. The model's ability to detect early signs of the condition
holds promise for early intervention and improved patient outcomes. Beyond the impressive
accuracy achieved, this research offers a transformative shift in the healthcare landscape. The
automated diagnosis of diabetic retinopathy using deep learning models can significantly
reduce the burden on healthcare professionals, expedite the diagnosis process, and enhance
patient access to timely care. Moreover, it opens doors to remote screening and monitoring of
diabetic patients, particularly in underserved regions where access to specialized eye care is
limited. The implications of this work extend beyond diabetic retinopathy, serving as a
compelling case study for the integration of artificial intelligence in healthcare,
revolutionizing disease detection, and ultimately improving patient outcomes.
CHAPTER 1
INTRODUCTION
The research presented in this study addresses a pressing healthcare challenge: the early and
accurate diagnosis of diabetic retinopathy (DR), a sight-threatening complication of diabetes
mellitus. DR not only imposes a significant burden on healthcare systems but also poses a
substantial risk to the quality of life for affected individuals. To tackle this issue, we leverage
the power of Convolutional Neural Networks (CNNs) in medical image analysis. The goal is
to develop a robust and scalable deep learning model that can autonomously identify and
classify signs of DR in retinal images. This innovative approach holds the potential to
revolutionize early DR detection, reduce the reliance on limited ophthalmological expertise,
and enhance accessibility to critical eye care services, ultimately preventing vision loss and
improving the well-being of patients with diabetes.
image preprocessing methodologies; the great effort put into fundus image preprocessing
enhanced the deep feature extraction methodologies used. Gargeya et al. relied on
sensitivity and specificity measures to evaluate their system, in addition to the average
Area Under the Receiver Operating Characteristic curve (AUROC). The system was tested
on classifying DR in two cases: the presence of the disease, regardless of its stage, against
the healthy retinal state, and the presence of the disease at very early stages, with only a
few mild symptoms, against its absence. The results were visualized using a heatmap.
[3] Zhou et al. suggested using the concept of Multiple Instance Learning (MIL),
dividing the training phase into two stages, single-scale and multi-scale training: the first
determines DR lesions for each image at one particular scale independently, and the latter
searches for lesions in an image represented at multiple scales. The authors preprocessed
images by normalizing, resizing, and cropping them into a rectangular shape. A Gaussian
smoothing kernel was applied to adjust the images’ brightness, contrast, and color
intensities. Images were resized to fit a field of view (FOV) with a 384-pixel radius, and
cropping was performed to eliminate the bright borders of the image. In single-scale
learning, all images were scaled to r = 384 pixels.
[4] Training GoogLeNet and AlexNet CNNs for accurate 2-ary, 3-ary, and 4-ary DR
grading was proposed by Lam et al. These models addressed previous limitations in
identifying early stages of the disease, as reported for Gargeya and Leng’s system. Various
image preprocessing and data augmentation methods were employed, and multi-class
models were trained to improve the algorithm’s sensitivity to early-stage DR in fundus
images. The deep layered CNNs performed efficiently using a combination of
heterogeneously sized filters and low-dimensional embeddings, which helps the model
learn deeper features from the training datasets. Furthermore, the data augmentation
methodologies used, such as image padding, rolling, rotation, and zooming, were most
effective in detecting R1-stage images, which had previously been determined to be the
most difficult to classify among the four DR stages.
[5] Sengupta et al. introduced an architecture aimed mainly at achieving robust
classification on unprecedented and varying datasets. They trained their model on the
Kaggle/EyePACS dataset, which contains five DR stages: healthy, mild, moderate, severe,
and PDR. The dataset was divided into two groups, low and high disease grades,
containing grades 1, 2 and grades 3, 4, 5 respectively. The model includes image
preprocessing and augmentation: the authors rescaled image pixels, set the image mean to
0, and then normalized them. Moreover, the Hough transform, a feature extraction
technique used in image processing and computer vision, was applied whenever an image
notch was identified after conversion to grayscale, and histogram equalization was applied
to all images. This was followed by a modified Inception v3, a 256-neuron dense layer,
and a softmax layer consecutively, with two output probabilities for the two classes. The
proposed model was tested on the MESSIDOR dataset and outperformed several previous
models in grading different DR stages, achieving 90.4% accuracy, 89.26% sensitivity,
91.94% specificity, and an AUC of 0.90.
[6] Motivated by the lack of large ground-truth-labeled datasets available for retinal
funduscopy, the high complexity of traditional image analysis techniques when applied to
large amounts of data, and the limitations of previously introduced models, a unique,
state-of-the-art methodology for DR detection was constructed from patch-based DCNNs.
Its purpose is to distinguish the most challenging patches as lesion or lesion-free patches,
referred to as red lesions, which are classified after training the model to localize critical
patches, each defined by a 65-pixel subsample. Patch localization was shown to speed up
image processing compared with classical image segmentation. The suggested approach
chooses only specific critical patches from each image using strides (image subsampling),
instead of segmenting images and including all patches present; on 512 × 512-pixel
images, this reduces the number of patches obtained, and in turn the number of
corresponding prediction outputs, from 1,300,720 to 52,428 when using the DR stages
(normal, S-1, S-2, S-3). The introduced model achieved excellent, unprecedented results
on the MESSIDOR dataset, with 99.3% accuracy, 98% sensitivity, and 99% specificity.
[7] Unlike previous combinations of image preprocessing techniques, Pour et al.
implemented an augmentation-free method for analyzing funduscopy data. The authors
employed only CLAHE to enhance image quality by increasing color contrast; this
method is used to distinguish vessel from non-vessel lesions. To increase the model’s
accuracy, the pre-trained CNN family EfficientNet was chosen as the most effective for
this diagnostic model; pre-trained CNNs were also found to save time and improve
accuracy. The authors suggested scaling up CNNs to achieve greater accuracy, which can
be performed along three dimensions, the CNN’s width, depth, and resolution, where
width is the number of channels in each convolutional layer, depth is the number of layers
in the network, and resolution is determined by the resolution of the image fed to the
CNN.
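For reference, the compound scaling rule from the EfficientNet paper (Tan and Le, cited as reference [7]) balances these three dimensions with a single compound coefficient φ; the formulation below comes from that paper, not from this report:

depth: d = α^φ, width: w = β^φ, resolution: r = γ^φ,
subject to α · β² · γ² ≈ 2, with α ≥ 1, β ≥ 1, γ ≥ 1.

Increasing φ then scales all three dimensions together under a roughly fixed FLOPs budget per unit of φ.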
human error, and limited by the availability of specialized ophthalmologists. This glaring gap
in healthcare accessibility and efficiency is what ignites our motivation to delve into the world
of deep learning and CNNs. In recent years, we have witnessed the remarkable evolution of
deep learning techniques, especially CNNs, which have demonstrated unparalleled prowess in
image analysis tasks. Their ability to discern intricate patterns and anomalies within images
has not only revolutionized various industries but also holds the potential to redefine the
landscape of medical diagnosis. The opportunity to apply these cutting-edge technologies to
the field of ophthalmology, particularly in the context of DR, is not just scientifically
intriguing but profoundly humanitarian. It embodies the fusion of innovation and compassion,
offering hope to millions of individuals grappling with diabetes and the looming specter of
vision loss. Our motivation springs from the belief that through the integration of CNNs, we
can facilitate early DR detection, reduce the burdens on healthcare providers, and democratize
access to high-quality eye care, ensuring that no one is left behind in the fight against diabetic
retinopathy.
Performance Evaluation: Assess the model's performance using established metrics such
as sensitivity, specificity, and accuracy, with a target accuracy rate of 94-96%.
Real-world Applicability: Investigate the practical feasibility of integrating the developed
CNN model into clinical settings, emphasizing scalability, reliability, and usability.
Accessibility: Explore the potential for remote screening and accessibility in underserved
regions, aiming to democratize early DR detection and eye care services.
CHAPTER 2
PROPOSED SYSTEM
In the modern era, human beings encounter various health issues, many of which are due
to individual food habits. In this project work, a predictive approach is proposed for the
early detection and treatment of diabetic retinopathy. The proposed approach has three
phases, namely data collection, data storage, and analytics, and it plays an important role
in predicting diabetes and pre-treating diabetic patients.
While developing the new system, all requirements of the end user were taken into
consideration, and maximum effort was made toward overcoming the drawbacks of the
existing system while the new system was designed and developed [18].
2.1 PROCESS
Agile is a process by which a team can manage a Project by breaking it up into several stages
and involving constant collaboration with stakeholders and continuous improvement and
iteration at every stage. It promotes continuous iteration of development and testing
throughout the software development life cycle of the project. Both development and testing
activities are concurrent.
Fig. 2.1: Process
2.2 DATASET
The quality and quantity of the dataset have a direct effect on the decision-making process
of a deep learning model; these two factors influence the robustness, precision, and
performance of deep learning algorithms.
The dataset is in image format and is further prepared for preprocessing so that it can be
fed to the deep learning model.
3. Zooming
The process of enlarging an existing image is known as image zooming. Zoomed-in crops
can be used to augment the training set, since they show the model what an image looks
like when taken up close.
4. Contrast Enhancement
Contrast enhancement gives an image more contrast and a wider range of intensities than
it would otherwise have. It was used here to augment the training set with a wider variety
of possible appearances; the same shot can, for example, have higher contrast on a sunny
or bright day.
5. Salt & Pepper Noise
Salt-and-pepper noise is created by randomly shifting the intensity of some pixels to 1 and
others to 0. Such noise may occur in images taken on dusty days or with a dusty camera.
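As an illustration, the three augmentations above can be sketched in NumPy. This is a minimal sketch on a synthetic grayscale image; the image size, zoom factor, contrast gain, and noise amount are assumptions, not the exact settings used in this project.

```python
import numpy as np

rng = np.random.default_rng(0)

def zoom(img, factor=1.5):
    """Center-crop by 1/factor, then nearest-neighbour resize back to the original size."""
    h, w = img.shape
    ch, cw = int(h / factor), int(w / factor)
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h      # nearest-neighbour row indices
    cols = np.arange(w) * cw // w      # nearest-neighbour column indices
    return crop[rows][:, cols]

def enhance_contrast(img, gain=1.5):
    """Stretch intensities about the mean, clipping to [0, 1]."""
    mean = img.mean()
    return np.clip((img - mean) * gain + mean, 0.0, 1.0)

def salt_pepper(img, amount=0.05):
    """Randomly force a fraction of pixels to 0 (pepper) or 1 (salt)."""
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0
    out[mask > 1 - amount / 2] = 1.0
    return out

fundus = rng.random((64, 64))  # stand-in for a preprocessed grayscale fundus image
augmented = [zoom(fundus), enhance_contrast(fundus), salt_pepper(fundus)]
```

In practice these transforms are usually applied on the fly during training so that each epoch sees slightly different versions of every image.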
CHAPTER 3
METHODOLOGY
The main purpose of designing this system is to predict diabetes. We have used logistic
regression as a machine-learning algorithm to train our system, along with various
libraries: NumPy for large-scale mathematical functions, the pandas package for
large-scale data manipulation, Matplotlib for command-style plotting functions, and
Seaborn (sns) for data visualization, as well as various algorithms such as the decision
tree classifier, random forest classifier, and SVC. These algorithms are discussed below in
detail.
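To make the toolchain concrete, here is a minimal, hypothetical sketch of comparing these classifiers with scikit-learn. The synthetic dataset, split ratio, and default hyperparameters are assumptions for illustration, since the report does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# A small synthetic binary-classification problem as a stand-in dataset.
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(random_state=42),
    "svc": SVC(),
}
# Fit each model and record its held-out accuracy.
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
```

The same pattern extends to the image-based CNN model, with the tabular features replaced by preprocessed fundus images.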
3.2 CONVOLUTION NEURAL NETWORK (CNN)
Convolutional Neural Networks (CNNs) are a class of deep learning models specifically
designed for processing and analyzing grid-like data, such as images and videos. They are
characterized by their unique architecture, which includes convolutional layers that apply
filters to input data, enabling automatic feature extraction. CNNs have revolutionized
computer vision tasks by excelling in tasks like image classification, object detection, and
facial recognition. Their hierarchical structure allows them to recognize complex patterns and
hierarchical features, making them adept at handling visual data. CNNs have also found
applications beyond computer vision, such as in natural language processing tasks where they
can be used for text classification and sequence modeling, further solidifying their importance
in modern machine learning and AI systems.
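To make the convolution, ReLU, and max-pooling building blocks concrete, here is a minimal NumPy sketch of a single convolutional stage. The image size and the edge-detection kernel are illustrative assumptions, not the architecture trained in this project.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most DL frameworks)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = x.shape[0] // size * size, x.shape[1] // size * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

img = np.random.default_rng(0).random((28, 28))
edge_kernel = np.array([[1., 0., -1.], [1., 0., -1.], [1., 0., -1.]])
# 28x28 image -> 26x26 after a 3x3 valid convolution -> 13x13 after 2x2 pooling.
feature_map = max_pool(relu(conv2d(img, edge_kernel)))
```

A real CNN stacks many such stages with learned kernels; frameworks like TensorFlow or PyTorch implement them far more efficiently than this loop-based sketch.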
[Figure: model pipeline, reconstructed from the original diagram: image preprocessing,
then the CNN with ReLU and max pooling, then a sigmoid function producing the output
as a 1-D vector, and a final output classification as diabetic or non-diabetic retinopathy.]
3.3.2 Loading and Splitting Dataset
In the model-building part, you can use the IRIS dataset, which is a very famous
multi-class classification problem. This dataset comprises 4 features (sepal length, sepal
width, petal length, petal width) and a target (the type of flower). The data has three
flower classes: Setosa, Versicolor, and Virginica. The dataset is available in the
scikit-learn library, or you can download it from the UCI Machine Learning Repository.
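The loading and splitting step can be sketched with scikit-learn as follows; the 70/30 split ratio and the stratified split are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target  # 150 samples, 4 features, 3 classes

# Hold out 30% for testing; stratify keeps the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)
```

The split arrays can then be fed directly to any scikit-learn classifier's `fit` and `score` methods.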
Fig. 3.7 Training & Saving the Model
CHAPTER 4
RESULTS AND ANALYSIS
If you look closely at the dataset, you will notice that out of the 10,000 customers that
entered the store, only 500 were shoplifters in that month. Some quick math will tell you
that 95% of customers are not shoplifters and the other 5% are:
Fig. 4.1 Representation of shoplifter example
If you basically said that everyone was not a shoplifter, you’d be correct in 95 out of 100
cases, which is what the model did.
You would be wrong in the other 5 cases. But who cares? We’re still 95% accurate, but this is
clearly not what we are looking for.
This problem happens with imbalanced datasets. You’re either a shoplifter or you’re not,
and the latter is significantly more represented in the dataset, so our model finds a
shortcut to increase accuracy.
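The "quick math" above is just:

```python
# 10,000 customers, 500 shoplifters; a model that answers "not a shoplifter"
# for everyone still gets every non-shoplifter right.
total, shoplifters = 10_000, 500
correct = total - shoplifters
accuracy = correct / total
print(accuracy)  # 0.95
```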
Moral of the story: accuracy is not a suitable metric in situations where the data is
skewed. So how do we solve this problem? We could stop using machine learning and
just investigate every customer, or we could evaluate the model using a different metric.
Let us take a look at recall and precision, but first it is important to understand some
terminology.
A confusion matrix shows all the possible scenarios of a model’s predictions versus the
ground truth.
The first of these metrics, recall, basically tells us how good our model is at identifying
relevant samples. Or: how good is our model at catching actual shoplifters?
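Recall is defined as:

Recall = True Positive / (True Positive + False Negative)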
Where TP: Shoplifters correctly identified, FN: Shoplifters missed.
Now, if the model classifies no one as a shoplifter, it will have a recall of 0, which is not
good at all if we are optimizing for as high a recall as possible. On the flip side, what if
we label everyone as a shoplifter? We will have a recall of 100%; after all, recall only
cares about not missing shoplifters, not about false accusations against customers (False
Positives).
But there is a metric that does care about false accusations against customers: precision.
We basically replace False Negatives with False Positives in the denominator. Out of all
the samples we predicted as positive, how many are actually positive? Or: how good is
our model at not making false accusations?
Precision = True Positive / (True Positive + False Positive)
By this point, you might have guessed what problem we will encounter if we only
optimize for high precision: the model can flag only its single most obvious shoplifter as
positive and achieve high precision while missing almost everyone else.
Recall and precision trade off against each other: pushing precision up tends to pull
recall down, and vice versa. We obviously want both as high as possible, so we need the
F1 score.
The F1 score is the harmonic mean of precision and recall. The harmonic mean is a
special type of mean (average), given by:
Harmonic mean of a, b = 2ab / (a + b)
F1 Score = Harmonic mean of precision and recall = (2 × precision × recall) / (precision + recall)
Now, if we optimize our model for the F1 score, we can have both high precision and
high recall. This translates to our model being able to catch shoplifters and, at the same
time, not falsely accuse innocent customers: when it flags a shoplifter, it is likely to
actually be one. That brings this story to an end.
4.2 Results
We use particular metrics, namely the confusion matrix, accuracy, and the F1 score, to
compare the results of the various models we have developed. The confusion matrix
provides a more realistic picture of a model’s performance: it is a two-dimensional array
whose cells are the counts of True Negatives, True Positives, False Negatives, and False
Positives.
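All of these metrics can be computed with scikit-learn. The labels below are a toy example (1 = shoplifter/positive), not the project's data:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# For binary labels, ravel() yields the four cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)                   # 5 1 2 2
print(accuracy_score(y_true, y_pred))   # 7 correct out of 10 -> 0.7
print(precision_score(y_true, y_pred))  # 2 / (2 + 1)
print(recall_score(y_true, y_pred))     # 2 / (2 + 2) = 0.5
print(f1_score(y_true, y_pred))         # harmonic mean of the two above
```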
Fig. 4.4 Four outcomes of Confusion Matrix
Below is the representation of the training & validation accuracy and loss of the CNN.
Fig. 4.6 Training & validation accuracy and loss
CONCLUSION AND FUTURE SCOPE
The deployment of this advanced deep learning model holds immense potential for benefiting
underserved rural populations lacking access to regular clinical check-ups and medical
equipment. By leveraging its capabilities, early-stage disease detection and prevention can be
significantly enhanced, revolutionizing healthcare in these areas. This model's ability to
produce reliable results swiftly and accurately is pivotal, as it empowers individuals and
healthcare providers to take proactive precautions and administer timely treatment, ultimately
improving health outcomes and quality of life in these underserved regions. The deployment
of this cutting-edge deep learning model on edge devices signifies a groundbreaking step
towards making healthcare easily accessible to all, particularly in underserved regions. In the
future, this project is poised to revolutionize diabetic retinopathy diagnosis by enabling the
detection of its various stages, including Mild, Moderate, Severe, and Proliferative, using
advanced machine learning techniques. By running this model on portable, readily available
edge devices, such as smartphones or low-cost computing platforms, it ensures that even
remote and resource-constrained areas can benefit from early-stage diabetic retinopathy
detection, facilitating timely interventions and improving the overall health and well-being of
countless individuals worldwide.
REFERENCES
1. B. Antal and A. Hajdu, “An ensemble-based system for automatic screening of diabetic
retinopathy,” Knowledge-Based Systems, vol. 60. pp. 20–27, 2014, doi:
10.1016/j.knosys.2013.12.023.
2. J. Cuadros and G. Bresnick, “EyePACS: an adaptable telemedicine system for diabetic
retinopathy screening,” J. Diabetes Sci. Technol., vol. 3, no. 3, pp. 509–516, May 2009.
3. L. Zhou, Y. Zhao, J. Yang, Q. Yu, and X. Xu, “Deep multiple instance learning for
automatic detection of diabetic retinopathy in retinal images,” IET Image Processing, vol.
12, no. 4. pp. 563–571, 2018, doi: 10.1049/iet-ipr.2017.0636.
4. C. Lam, D. Yi, M. Guo, and T. Lindsey, “Automated Detection of Diabetic Retinopathy
using Deep Learning,” AMIA Jt Summits Transl Sci Proc, vol. 2017, pp. 147–155, May
2018.
5. S. Sengupta, A. Singh, J. Zelek, and V. Lakshminarayanan, “Cross-domain diabetic
retinopathy detection using deep learning,” Applications of Machine Learning. 2019, doi:
10.1117/12.2529450.
6. G. T. Zago, R. V. Andreão, B. Dorizzi, and E. O. Teatini Salles, “Diabetic retinopathy
detection using red lesion localization and convolutional neural networks,” Comput. Biol.
Med., vol. 116, p. 103537, Jan. 2020.
7. M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural
Networks,” arXiv [cs.LG], May 28, 2019.
8. G. Kalantzis, M. Angelou, and E. Poulakou-Rebelakou, “Diabetic retinopathy: an
historical assessment,” Hormones, vol. 5, no. 1, pp. 72–75, Jan. 2006.
9. “Diabetic retinopathy,” May 30, 2018. https://www.mayoclinic.org/diseases-
conditions/diabetic-retinopathy/symptoms-causes/syc-20371611 (accessed Jan. 03, 2021).
10. Stanford University. Computer Science Dept. Heuristic Programming Project and E. H.
Shortliffe, MYCIN: a Knowledge-based Computer Program Applied to Infectious
Diseases. 1977.
11. C. Ross and E. Brodwin, “At Mayo Clinic, AI engineers face an ‘acid test’ - STAT,” Dec.
18, 2019. https://www.statnews.com/2019/12/18/mayo-clinic-artificial-intelligence-acid-test/
(accessed Jan. 03, 2021).
12. “Artificial Intelligence: How to get it right.” https://www.nhsx.nhs.uk/ai-lab/explore-all-
resources/understand-ai/artificial-intelligence-how-get-it-right/ (accessed Jan. 10, 2021).
13. CB Insights, “How Google Plans To Use AI To Reinvent The $3 Trillion US Healthcare
Industry,” Apr. 19, 2018. https://www.cbinsights.com/research/report/google-strategy-
healthcare/ (accessed Jan. 03, 2021).
14. “Four ways in which Watson is transforming the healthcare sector.”
https://www.healthcareglobal.com/technology-and-ai-3/four-ways-which-watson-
transforming-healthcare-sector (accessed Jan. 03, 2021).
15. H. J. Jelinek, M. J. Cree, D. Worsley, A. Luckie, and P. Nixon, “An automated
microaneurysm detector as a tool for identification of diabetic retinopathy in rural
optometric practice,” Clin. Exp. Optom., vol. 89, no. 5, pp. 299–305, Sep. 2006.
16. M. D. Abràmoff et al., “Automated Early Detection of Diabetic Retinopathy,”
Ophthalmology, vol. 117, no. 6. pp. 1147–1154, 2010, doi: 10.1016/j.ophtha.2010.03.046
17. “UCI machine learning repository: Diabetic retinopathy Debrecen data set.”
https://archive.ics.uci.edu/ml/datasets/Diabetic+Retinopathy+Debrecen+Data+Set
(accessed Jan. 09, 2021).
18. E. Decencière et al., “FEEDBACK ON A PUBLICLY DISTRIBUTED IMAGE
DATABASE: THE MESSIDOR DATABASE,” Image Analysis & Stereology, vol. 33, no.
3. p. 231, 2014, doi: 10.5566/ias.1155.
19. R. Polikar, “Ensemble learning,” Scholarpedia, vol. 4, no. 1. p. 2776, 2009, doi:
10.4249/scholarpedia.2776.
20. M. Almseidin, A. Abu Zuraiq, M. Al-kasassbeh, and N. Alnidami, “Phishing detection
based on machine learning and feature selection methods,” Int. J. Interact. Mob. Technol.,
vol. 13, no. 12, p. 171, Dec. 2019, Accessed: May 01, 2021. [Online].
21. L. M. Abualigah, A. T. Khader, and E. S. Hanandeh, “A new feature selection method to
improve the document clustering using particle swarm optimization algorithm,” J.
Comput. Sci., vol. 25, pp. 456–466, Mar. 2018.