Disease Identification and Retinal Scan Correction Using Deep Learning Techniques Project Report
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
by
DECEMBER 2022
CERTIFICATE
The thesis is satisfactory / unsatisfactory
Approved by
ACKNOWLEDGEMENT
We would like to first express our profound gratitude and deep regard to our
guide Dr. Anupama Namburu and sincerely wish to acknowledge her vision,
guidance, valuable feedback, and constant support throughout the duration of this
project.
ABSTRACT
In this paper, a deep learning application is presented that can determine whether a retinal fundus scan shows a risk of disease and, if so, the nature of the disease with a risk probability, along with a scan clarifier. Image processing and deep learning methods are used to achieve this result. The model is hosted on a website where the user can upload a scan of their retinal fundus in order to get a diagnosis and a corrected scan (if required). When the user uploads their scan onto the site, the deep learning model processes the image and outputs the disease risk and the kind of disease at risk in their case. This is achieved using deep learning techniques, namely a Convolutional Neural Network (CNN) for disease type/risk identification and a Variational Autoencoder (VAE) for scan correction.
TABLE OF CONTENTS
1.1 Objectives 9
List of Figures
4 Proposed Workflow 14
14 Retinal Fundus Image having no disease 23
CHAPTER 1
INTRODUCTION
According to the World Health Organization, over 2 billion people suffer from some form of vision impairment, and at least 50% of these cases are preventable.
With over 1 billion preventable cases, the ability to pre-emptively address the issue is the best strategy. But this is not possible in places that lack proper eye care, whether in equipment or in trained technicians. To tackle this issue, we need a disease risk predictor that can tell us whether the retina of the patient shows any signs or lead indicators of a predefined set of symptoms.
This disease risk predictor and classifier can be implemented using the Retinal Fundus
Multi-Disease Dataset (RFMiD). This dataset consists of 3200 retinal fundus images and the
corresponding set of ground truth values in CSV format, and this enables the implementation of
the solution through deep learning.
1.1 Objectives
Pigment Epithelial Detachment (HPED) and Collateral (CL). The dataset consists of one-hot encoded values for disease risk and the 45 different eye conditions, given in .csv format. This dataset was divided into a 60% training set (1920 images), a 20% evaluation set (640 images), and a 20% testing set (640 images).
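The 60/20/20 split above can be sketched as follows; this is an illustrative NumPy snippet, not the report's own loading code, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical sketch of the 60/20/20 split of the 3200 RFMiD images.
rng = np.random.default_rng(0)
indices = rng.permutation(3200)            # shuffle all 3200 image indices

n_train = int(3200 * 0.60)                 # 1920 training images
n_eval = int(3200 * 0.20)                  # 640 evaluation images

train_idx = indices[:n_train]
eval_idx = indices[n_train:n_train + n_eval]
test_idx = indices[n_train + n_eval:]      # remaining 640 testing images

print(len(train_idx), len(eval_idx), len(test_idx))  # 1920 640 640
```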
A similar deep learning project was carried out by Dominik Müller, Iñaki Soto-Rey, and Frank Kramer of IT-Infrastructure for Translational Medical Research, University of Augsburg, Germany, and the Medical Data Integration Center, University Hospital Augsburg, Germany. This is our base paper; its title is “Multi-Disease Detection in Retinal Imaging Based on Ensembling Heterogeneous Deep Learning Models.” The paper discusses an approach to detecting various retinal diseases using ensemble learning and heterogeneous deep convolutional neural network models. Ensemble learning techniques such as heterogeneous deep learning models, bagging via 5-fold cross-validation, and stacked logistic regression models were integrated, and modern techniques such as class weighting, transfer learning, real-time image augmentation, and focal loss were used to build the classification pipeline. The dataset used in the paper was the Retinal Fundus Multi-Disease Image Dataset (RFMiD), which consists of 3200 retinal fundus images covering 46 different retinal conditions. The number of classes was reduced to a disease risk class, 27 columns of various retinal conditions, and one ‘OTHER’ class consisting of extremely rare conditions. The dataset was divided into training, testing, and validation sets; the training set consists of 1920 images, and the testing and validation sets together consist of 1280 images. To mitigate dataset bias, pre-processing and data augmentation were performed. Disease risk was detected using multiple models based on the EfficientNetB4 and DenseNet201 architectures. Disease label classification was then performed using ResNet152, InceptionV3, DenseNet201, and EfficientNetB4, with the ensemble learning strategies Bagging and Stacking applied on top. This model gave an AUROC score of 0.95.
We also referred to the “Squeeze-and-Excitation Networks” paper (by Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu) for the squeeze-and-excitation block in the model architecture, which enhances learning and consequently gives better accuracy and results.
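The squeeze-and-excitation operation from that paper can be sketched in NumPy as below; this is an illustrative re-implementation with random placeholder weights (`w1`, `w2`), not the trained Keras block used in the model, and it uses global average pooling for the squeeze step as in the original paper (the project's own code uses global max pooling).

```python
import numpy as np

def se_block(feature_maps, w1, w2):
    """Squeeze-and-excitation on feature maps of shape (H, W, C)."""
    # Squeeze: one summary value per channel (global average pooling).
    z = feature_maps.mean(axis=(0, 1))            # shape (C,)
    # Excitation: bottleneck (C -> C//r) with ReLU, then (C//r -> C) with sigmoid.
    s = np.maximum(z @ w1, 0.0)                   # ReLU
    gates = 1.0 / (1.0 + np.exp(-(s @ w2)))       # sigmoid, shape (C,)
    # Scale: re-weight each channel by its gate, emphasizing important maps.
    return feature_maps * gates

# Toy usage with C=8 channels and reduction ratio r=4 (random weights).
rng = np.random.default_rng(0)
fmap = rng.standard_normal((16, 16, 8))
w1 = rng.standard_normal((8, 2))
w2 = rng.standard_normal((2, 8))
out = se_block(fmap, w1, w2)
print(out.shape)  # (16, 16, 8)
```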
A site was also developed to deploy the model for common and easy usage. To deploy the model as a web application, a Medium article, “Tensorflow and Keras Model Deployment using Flask in Google Colab” by Rishi Mishra, was taken as reference. This article mainly discusses saving a deep learning model in Keras .h5/.json format, creating a web server using Flask, and running a Flask application using flask-ngrok in Google Colab.
CHAPTER 2
DISEASE IDENTIFICATION AND SCAN CORRECTION USING
DEEP LEARNING
This chapter describes the proposed system, working methodology, and software and hardware details.
The model is an extension of the binary classifier with a multi-class classifier head. The head consists of a single convolution block of 3 parallel convolutions (1x1, 3x3, and 5x5), dropout, and a max-pool operation of size 2x2 with strides of 2. Its output is then fed into a squeeze-and-excitation block to emphasize important feature maps, and then flattened.
The initial step for all the models is to load the dataset. The RFMiD dataset is loaded using the tensorflow.data pipeline (a data pipeline is a series of data processing steps). The tf.data pipeline makes it possible to handle massive volumes of data, read from many data sources, combine randomly picked pictures into a batch for training, and execute complicated transformations; it also helps reduce processing time by effectively utilizing CPU and GPU resources. The RFMiD dataset is highly imbalanced, so to reduce bias we performed data augmentation to increase the number of images in under-represented classes.
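The pipeline stages described above (shuffle, augment, batch) can be illustrated with a plain-Python stand-in; the real implementation uses tf.data.Dataset with Keras augmentation layers, so everything here (`make_pipeline`, the flip augmentation) is a simplified assumption.

```python
import numpy as np

def make_pipeline(images, labels, batch_size, seed=0):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))          # shuffle step
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch_x = images[idx].copy()
        for i in range(len(batch_x)):             # augmentation step
            if rng.random() < 0.5:
                batch_x[i] = batch_x[i][:, ::-1]  # random horizontal flip
        yield batch_x, labels[idx]                # batching step

# Toy usage: 10 tiny "images", batches of 4 -> batch sizes 4, 4, 2.
X = np.arange(10 * 4 * 4 * 3).reshape(10, 4, 4, 3).astype(np.float32)
Y = np.arange(10)
batches = list(make_pipeline(X, Y, batch_size=4))
print([b[0].shape[0] for b in batches])  # [4, 4, 2]
```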
The dataset is now ready to be passed into the model. First, the binary classifier is created. The model architecture takes inspiration from Inception V3, with the addition of a squeeze-and-excitation network.
A loss function is used to benchmark the model and see how well it has learnt. Here, the loss function is a simple binary cross-entropy loss, owing to the binary nature of the health status of a retinal scan (either healthy or unhealthy). This loss exploits the log-probability concept, which helps the model learn better.
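For illustration, binary cross-entropy can be written out directly in NumPy; Keras' BinaryCrossentropy computes the same mean of per-example negative log-probabilities.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])   # confident and mostly correct
bad = np.array([0.4, 0.6, 0.4, 0.6])    # hesitant and mostly wrong
# Lower loss rewards assigning high probability to the true label.
print(binary_cross_entropy(y_true, good) < binary_cross_entropy(y_true, bad))  # True
```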
After the loss is calculated, the edge weights are updated so as to minimize the loss. To do this, an optimiser is used; in this case we have made use of the Adam optimiser with an initial learning rate of 0.01 (1e-2), a β1 decay value of 0.9, and a β2 decay value of 0.999. Combined with this, we also added a learning-rate schedule callback that monitors the validation loss with a patience of 3 epochs. This scheduler decays the learning rate by a factor of 2 (that is, it halves the learning rate if the validation loss is greater than or equal to the previous epoch's validation loss for 3 epochs in a row).
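The plateau-halving behaviour described above can be sketched in plain Python; this mimics the logic of Keras' ReduceLROnPlateau (patience 3, factor 0.5) rather than reproducing the callback itself.

```python
def schedule_lr(val_losses, initial_lr=0.01, patience=3, factor=0.5):
    lr = initial_lr
    best = float('inf')
    wait = 0
    for loss in val_losses:
        if loss < best:       # validation loss improved: reset the counter
            best = loss
            wait = 0
        else:                 # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr *= factor  # halve the learning rate after 3 stale epochs
                wait = 0
    return lr

# Loss improves twice, then plateaus for 3 epochs -> one halving of 0.01.
print(schedule_lr([1.0, 0.8, 0.8, 0.8, 0.8]))  # 0.005
```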
For multi-class classification, the binary classification model was extended with a multi-class head consisting of 3 parallel convolution operations leading into a squeeze-and-excitation block, subsequently flattened and fed into a dense network with 28 output nodes representing the 28 different classes. Each of these 28 nodes has a sigmoid activation so as to give an independent probability of risk for each disease.
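The behaviour of the 28 independent sigmoid outputs can be illustrated as follows; the logits here are toy values, and the 0.5 decision threshold is an assumption for this sketch.

```python
import numpy as np

# Each node gives an independent per-disease probability, so several
# diseases can be flagged at once (unlike softmax, which forces one winner).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, -3.0, 0.5] + [-5.0] * 25)  # toy logits for 28 classes
probs = sigmoid(logits)                            # 28 independent probabilities
flagged = np.where(probs >= 0.5)[0]                # diseases predicted present
print(flagged)  # [0 2]
```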
This multi-class classifier is trained with a sigmoid focal cross-entropy loss, owing to the high complexity of the training labels and to prevent the model from overfitting on the easier examples. This loss also penalizes the model if it settles on a constant output to minimize the loss. To give an understandable benchmark, categorical accuracy and Hamming loss were used as the accuracy and inaccuracy metrics respectively.
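Both quantities can be written out in NumPy for illustration; these are simplified stand-ins for the sigmoid focal cross-entropy loss and the Hamming loss metric, with gamma = 2 assumed for the focal term. The factor (1 - p_t)^gamma down-weights well-classified ("easy") examples, and Hamming loss is the fraction of label positions predicted incorrectly.

```python
import numpy as np

def sigmoid_focal_ce(y_true, p, gamma=2.0, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y_true == 1, p, 1 - p)   # prob. assigned to the true label
    # Easy examples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
    return float(np.mean(-((1 - p_t) ** gamma) * np.log(p_t)))

def hamming_loss(y_true, y_pred):
    return float(np.mean(y_true != y_pred))  # fraction of wrong label positions

y_true = np.array([1, 0, 0, 1])
easy = np.array([0.95, 0.05, 0.05, 0.95])   # confident, easy examples
hard = np.array([0.55, 0.45, 0.45, 0.55])   # harder, uncertain examples
print(sigmoid_focal_ce(y_true, easy) < sigmoid_focal_ce(y_true, hard))  # True
print(hamming_loss(np.array([1, 0, 1, 0]), np.array([1, 0, 0, 0])))     # 0.25
```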
The Variational Autoencoder was used to denoise unclean scans and help fix any inconsistencies in the generated scan. It comprises a simple architecture: an encoder convolutional neural network encodes the image into a latent space of mean and log-variance. These latent-space parameters are then re-parameterized with reference to the normal distribution, and any outlier is deemed noise and substituted with an appropriate value. The updated parameters are then decoded into input-like outputs (in this case, retinal fundus scan images). Benchmarking of this model is done using the ELBO (Evidence Lower BOund) loss, which provides a benchmark for how the model performs with respect to the original and corrected images.
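The re-parameterization step and the KL term inside the ELBO can be sketched in NumPy for a diagonal-Gaussian latent space; this is an illustrative derivation aid, not the model's TensorFlow code.

```python
import numpy as np

def reparameterize(mean, logvar, rng):
    eps = rng.standard_normal(mean.shape)       # noise drawn from N(0, I)
    return mean + np.exp(0.5 * logvar) * eps    # z = mu + sigma * eps

def kl_to_standard_normal(mean, logvar):
    # KL( N(mean, exp(logvar)) || N(0, I) ), summed over latent dimensions;
    # this is the regularization term inside the ELBO.
    return float(np.sum(0.5 * (np.exp(logvar) + mean**2 - 1.0 - logvar)))

rng = np.random.default_rng(0)
z = reparameterize(np.zeros(2), np.zeros(2), rng)
print(z.shape)                                          # (2,)
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))  # 0.0 for N(0, I)
```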
This section describes the software and hardware details of the system:
Google Colaboratory, the Python programming language, and Python modules such as NumPy, Pandas, Tensorflow, Keras, Sklearn, Matplotlib, PIL, etc., are used.
1. Google Colaboratory
Python is a widely used programming language for research and development. Below are a few of the reasons for the extensive use of Python in research and development:
1. Python is a high-level, general-purpose programming language that is easy to learn and
use, making it accessible to a wide range of people, including those with no prior
programming experience.
2. Python's many libraries and frameworks enable us to complete bigger tasks in fewer lines of code.
3. Python has a large and active community of users and developers.
Python 3.8.16 was used in this project; this is the version provided by Google Colaboratory.
Versions of all the packages were in accordance with Python 3.8. The packages used were as
follows:
1. tensorflow-keras : Deep learning framework for neural network and data augmentation model architecture
2. pandas : CSV file read and write
3. numpy : Array operations, data-type operations, miscellaneous
4. tensorflow-dataset : Dataset generator and augmentation
5. PIL : Python image library for image-related operations and image manipulations
6. lime : Locally interpretable explanation for the decision taken by the model
7. sklearn : Scikit-learn (abbreviated as “sklearn”) contains many useful tools for data
manipulation (scaling, imputation of missing values, generating synthetic samples), and
evaluating the performance of a deep learning model (accuracy score, F1 score, AUC,
confusion matrix, etc metrics are useful to assess the performance of a model).
8. matplotlib : Python library for data visualization
9. flask : Micro web framework in Python, well-suited for building small to medium-sized web applications
10. werkzeug : Provides a collection of utilities for building web applications using Flask.
11. pyngrok : Creates a secure tunnel from a public URL to a local web server from Python code.
12. flask_ngrok : An extension of flask that allows a user to make their flask application
available globally by using ngrok.
The specifications of the CPUs, GPUs, and TPUs offered by the free version of Google Colab are:
● GPU: A single GPU with 1 vCPU and 13 GB of memory. The GPU is an NVIDIA Tesla T4 or K80.
● TPU: A single TPU with 2 vCPUs and 13 GB of memory. The TPU is a v2 TPU.
CHAPTER 3
c. Confusion matrix for Binary classifier
e. Binary classification Cross-entropy loss across 25 epochs
g. Hamming loss for 25 epochs
i. ELBO loss for 10 epochs
j. Web Application
CHAPTER 4
Abnormalities in a retinal fundus scan, if examined in their early stages, can lead to the pre-emptive cure of many diseases such as diabetes, but the availability of a doctor who can do this is not guaranteed in many places. Therefore, a need arises for an automated and accurate classification and prediction mechanism for this kind of job.
This project also highlights the dire state of some of the more remote places in our country and a few other countries, where an eye specialist is not very accessible and a proper diagnosis is not always guaranteed under these conditions. It also highlights how many cases could be prevented with proper diagnosis and symptom checking, as the signs of illness are very well reflected in the patients' retinas. Some such diseases include, but are not limited to, diabetic retinopathy and macular degeneration. If these conditions are not taken into consideration, they could prove fatal to the patient in some cases.
The model's performance was limited by time constraints arising from its computationally heavy nature; training for 25 epochs at a time and then going back to optimize the model was the only viable strategy for this reason. The depth of the convolutional layers and the parallelisation of multiple convolution operations with many kernels widened the variance of the model, making it capable of understanding, learning, and identifying the disease with a far broader understanding. This in turn caused the model to learn very slowly and to overfit on several occasions, which was fixed by using dropout, data augmentation, and other regularization methods such as L2.
The Variational Autoencoder was also a very computationally intensive model, owing to the size of the images and the RGB reconstruction of the images. The CVAE faced the same computational-complexity challenges as the CNN model. However, the reconstruction over the first 10 epochs has proven very promising, and the model can be further trained and improved with better training times and computational methods.
A future prospect of this project involves micro-computers/processors capable of running the models on their local chipset: the model weights could be loaded onto them, and these tiny computers/processors could be embedded in retinal scan stations at optometrists, providing a near-instantaneous result on the scanner itself.
Another future prospect is training on newer data in order to fine-tune the model or help identify newer diseases. This can prove very useful in an age where, in the case of a global pandemic, remote areas get the least attention, which in turn causes massive damage to life in those areas.
A possible third improvement could come in the form of more technically advanced machines in which the finer details are highlighted, enhancing the model's ability to identify diseases. This can also help in places where the machines are not up-to-date, providing a far cleaner reconstruction in those cases.
CHAPTER 5
APPENDIX
Review - 1:
https://docs.google.com/presentation/d/16ESI7IAYYQOlWlyAcjEiGuITxCWI02p8
ae17cg12NaE/edit?usp=sharing
Review - 2:
https://docs.google.com/presentation/d/1yO0CVWbe0ghLlmMJw4tShf6Kqo1GzjM
SosGxModbENw/edit?usp=sharing
Review - 3:
https://docs.google.com/presentation/d/1DY_k15ylI9MNulTmg5gmr-WX35_xWsv
ILs1aI1kaZCw/edit?usp=sharing
POSTER :
Link for Poster :
https://drive.google.com/file/d/1YuZdMJEjLW5hGZIUdyqoi5YW7luhtyBw/view?u
sp=sharing
Python script and code
Binary and Multi-Class Classification
# Mounting google drive
from google.colab import drive
drive.mount('/content/drive')

# importing required libraries
import tensorflow as tf
import numpy as np
import pandas as pd
import pickle
%load_ext tensorboard

# Loading data
labels = pd.read_csv('/content/drive/MyDrive/Capstone/dataset/RFMiD_Training_Labels.csv')
labels.drop(labels = ['ID'], axis = 1, inplace = True)
labels = labels.astype('uint8')
labels
# block 1
X = np.load('/content/drive/MyDrive/Capstone/NPdatasets/Xdata256.npy', allow_pickle = True)
Y = np.array(labels['Disease_Risk'])
Y = Y.astype('uint8')

# block 2
evalX = np.load('/content/drive/MyDrive/Capstone/NPdatasets/ValXdata256.npy')
%cd /content

# block 3
for i in range(len(X)):
    X[i] = X[i][...,::-1]
for i in range(len(evalX)):
    evalX[i] = evalX[i][...,::-1]

# block 4
evalLabels = pd.read_csv('/content/drive/MyDrive/Capstone/dataset/RFMiD_Validation_Labels.csv')
evalY = np.array(evalLabels['Disease_Risk'])
# block 5
posX = []
negX = []
posY = []
negY = []
for i in range(len(Y)):
    if Y[i] == 1:
        posX.append(X[i])
        posY.append(Y[i])
    elif Y[i] == 0:
        negX.append(X[i])
        negY.append(Y[i])
posX = np.array(posX)
posY = np.array(posY)
negX = np.array(negX)
negY = np.array(negY)

def make_ds(x, y):
    ds = Dataset.from_tensor_slices((x, y))
    ds = ds.shuffle(320).repeat()
    return ds

def prepDataBinTr(X, Y, bsize = 32):
    # ds = Dataset.from_tensor_slices((X, Y))
    ds.shuffle(320)
    ds = ds.batch(bsize)
    augModel = tf.keras.Sequential([
        tf.keras.layers.RandomBrightness(factor = 0.2),
        layers.RandomContrast([0.3, 0.7]),
        layers.RandomFlip('horizontal_and_vertical'),
    ])
    return ds

def prepDataBinVal(X, Y, bsize = 32):
    # xtes = np.reshape(xtes, [xtes.shape[0], 256, 256, 3])
    val_ds = Dataset.from_tensor_slices((X, Y))
    val_ds = val_ds.batch(bsize)
    return val_ds.prefetch(tf.data.AUTOTUNE)
# block 6
ds = prepDataBinTr(X, Y, bsize = 32)
it = iter(ds)
valds = prepDataBinVal(evalX, evalY, bsize = 32)

# block 7
temp = it.next()
temp0 = np.asarray(temp[0])
# imsh(temp0[2])

# block 8
temp = it.next()
temp0 = np.asarray(temp[1])
print(temp[1])
imsh(temp0[2])
# block 9
def customBlock(inp, f, a = True):
    u = inp.shape
    ar = []
    i = np.random.randint(u[3])
    if i not in ar:
        ar.append(i)
    feat1 = []
    feat2 = []
    for i in range(u[3]):
        if i in ar:
            feat1.append(inp[:,:,:,i])
        feat2.append(inp[:,:,:,i])
    feat1 = tf.convert_to_tensor(feat1)
    feat2 = tf.convert_to_tensor(feat2)
    act = None
    if a:
        act = 'relu'
    # conv1 = L.Conv2D(f, (3,3), activation = act, padding = 'same')(conv1)
    conv1 = L.Dropout(0.6)(conv1)
    conv2 = L.Conv2D(f, (3,3), activation = 'relu', padding = 'same')(feat1)
    conv2 = L.Dropout(0.6)(conv2)
    conv3 = L.Dropout(0.6)(conv3)
    conv1 = L.MaxPool2D((2,2),(2,2))(conv1)
    conv2 = L.MaxPool2D((2,2),(2,2))(conv2)
    conv3 = L.MaxPool2D((2,2),(2,2))(conv3)
    # feat2 = L.MaxPool2D((2,2),(2,2))(feat2)
    return output

def SENblock(inp, ch, ratio = 16):
    x = L.GlobalMaxPooling2D()(inp)
    x = L.Dense(ch//ratio, activation='relu')(x)
    x = L.Dropout(0.5)(x)
    x = L.Dense(ch, activation='sigmoid')(x)
    return output

def createModel(id):
    K.reset_uids()
    inp_shape = L.Rescaling(1./255)(inp_shape)
    t1 = customBlock(inp_shape, 32)
    t1 = L.MaxPooling2D((2,2),(2,2))(t1)
    t2 = customBlock(inp_shape, 32)
    t2 = L.MaxPooling2D((2,2),(2,2))(t2)
    t3 = customBlock(inp_shape, 32)
    t3 = L.MaxPooling2D((2,2),(2,2))(t3)
    out = L.concatenate([t1, t2, t3], axis = -1)
    out = L.Flatten()(out)
    # out = L.Dropout(0.7)(out)
    # img_path = f'/content/Net_image_{id}.png'
    return model
# block 10
learning_rate_reduction = ReduceLROnPlateau(monitor = 'val_binary_accuracy',
                                            patience = 3, verbose = 1,
                                            factor = 0.5, min_lr = 1e-7)
binMod = createModel(1)

# block 11
# block 12
!rm -rf ./logs

# block 13
%tensorboard --logdir logs
class LRLogger(TensorBoard):
    def on_epoch_end(self, epoch, logs = None):
        logs.update({'lr': K.eval(self.model.optimizer.lr)})
        super().on_epoch_end(epoch, logs)
# lrlogcback = LRLogger('/content/logs')
# block 14
import keras
# class GraphCallback(keras.callbacks.Callback):
#     accs = []
#     self.accs = accs
#     try:
#         plt.plot(self.accs)
#         # plt.plot(history.history[atts[2]])
#         plt.title('model accuracy')
#         plt.xlabel('epoch')
#         plt.show()
#     except Exception as e:
#         print(logs.keys())
#         return
#     clear_output()
# # def on_train_end(self, logs = None): self.accs.append(logs['binary_accuracy'])
filepath = '/content/checkpoints_globmaxpool.h5'
chckpntcback = ModelCheckpoint(filepath, monitor = 'val_binary_accuracy')

# block 15
mod_path = '/content/drive/MyDrive/Capstone/Model_Weights/CustomModel.h5'
binMod = load_model(mod_path)
# block 16
try:
    from lime import lime_image
except Exception as e:
    !pip install lime
    from lime import lime_image

# block 17
# block 18
choice = np.random.randint(640)
im = evalX[choice]
im = im[...,::-1]
im = np.reshape(im, [1,256,256,3])
pred = binMod.predict(im, verbose = 0)
# im = im[...,::-1]
# im = np.reshape(im, [1,256,256,3])

# block 19
# block 20
# block 21
evalY[choice]
from skimage.segmentation import mark_boundaries
temp_1, mask_1 = explanation.get_image_and_mask(explanation.top_labels[0],
    positive_only=True, num_features=5, hide_rest=True)
temp_2, mask_2 = explanation.get_image_and_mask(explanation.top_labels[0],
    positive_only=False, num_features=10, hide_rest=False)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15,15))
ax1.imshow(mark_boundaries(temp_1, mask_1))
ax2.imshow(mark_boundaries(temp_2, mask_2))
ax1.axis('off')
ax2.axis('off')

choices = np.random.randint(low = 0, high = 639, size = 32)
ypred = []
yact = []
# for i in choices:
#     im = evalX[i]
#     im = im[...,::-1]
#     im = np.reshape(im, [1,256,256,3])
#     ypred.append(1)
#     else:
#         ypred.append(0)
#     yact.append(yeval[i])
for i in choices:
    # if Y[i] == 0:
    im = evalX[i]
    im = im[...,::-1]
    im = np.reshape(im, [1,256,256,3])
    pred = binMod.predict(im, verbose = 0)
    pred = pred[0][0]
    if pred >= 0.5:
        ypred.append(1)
    else:
        ypred.append(0)
    yact.append(Y[i])
# block 22
import matplotlib.pyplot as plt
plt.figure()
plt.plot(yact, 'rx')
plt.legend(['ypred', 'yact'])

# block 23
# block 24
from sklearn.metrics import confusion_matrix
from keras.models import load_model
binModi = load_model('/content/drive/MyDrive/Capstone/Model_Weights/CustomModel.h5')
# binModi = binMod2
conf_matrix = confusion_matrix(y_true=yact, y_pred=ypred)
fig, ax = plt.subplots(figsize=(7.5, 7.5))
ax.matshow(conf_matrix, cmap=plt.cm.Blues, alpha=0.3)
for i in range(conf_matrix.shape[0]):
    for j in range(conf_matrix.shape[1]):
        ax.text(x=j, y=i, s=conf_matrix[i, j], va='center', ha='center')
# plt.xlabel('Predictions', fontsize=18)
plt.ylabel('Actuals', fontsize=18)
plt.title('Predictions', fontsize=18)
plt.show()

binModi.layers[-5].name
# block 25
# block 26
# block 27
img = evalX[0]
img = np.reshape(img, [1,256,256,3])
binMod = tf.keras.models.Model(inputs = binModi.input,
                               outputs = binModi.get_layer('concatenate_3').output)
# print(binMod2(img))
binMod.output

def PrepDataMulTr(X, Y, bsize = 64):
    # def makeDS(x, y):
    #     posX = []
    #     posY = []
    #     posX = np.array(posX)
    #     posY = np.array(posY)
    #     def make_ds(x1, y1):
    #         ds = Dataset.from_tensor_slices((x1, y1))
    #         ds = ds.repeat()
    #         return ds
    #     return ds_pos
    # dss = []
    # weight = []
    # for y in range(Y.shape[-1]):
    #     dss.append(makeDS(X, Y[:,y]))
    # weight = [1./28]*28
    # ds = dss[0]
    # for i in range(1, 28):
    #     ds = ds.concatenate(dss[i])
    ds = Dataset.from_tensor_slices((X, Y))
    ds.shuffle(240)
    ds = ds.batch(bsize)
    augModel = tf.keras.Sequential([
        tf.keras.layers.RandomBrightness(factor = 0.3),
        layers.RandomContrast([0.0, 0.7]),
        layers.RandomFlip('horizontal_and_vertical'),
        layers.RandomRotation([-0.3, 0.7], fill_mode = 'constant', fill_value = 0.)
    ])
    return ds

def PrepDataMulVal(X, Y, bsize = 64):
    val_ds = Dataset.from_tensor_slices((X, Y))
    val_ds = val_ds.batch(bsize)
    return val_ds.prefetch(tf.data.AUTOTUNE)

# block 28
# block 29
Y = np.array(labels.drop(labels = ['Disease_Risk'], axis = 1))
Y.shape

# block 30
trainDS = PrepDataMulTr(X, Y, 32)
it = iter(trainDS)
temp = it.next()
temp0 = np.asarray(temp[1])
# imsh(temp0[2])
# print(np.unique(temp0, return_counts = True))
print(temp[1].shape)
# block 31
# block 32
binMod.output.shape
import keras.layers as L

def SENblock(inp, ch, ratio = 16):
    x = L.GlobalMaxPooling2D(name = 'glob')(inp)
    return output

def createModelMul(binMod, id = 1):
    K.reset_uids()
    binMod.trainable = False
    input_custom = binMod.output
    output = L.concatenate([conv1, conv2, conv3], axis = -1, name = "conc")
    out = L.Flatten()(out)
    return model

# block 33
import HammingLoss
learning_rate_reduction = ReduceLROnPlateau(monitor = 'categorical_accuracy',
                                            patience = 3, verbose = 1,
                                            factor = 0.5, min_lr = 1e-7)
mulMod = createModelMul(binMod, 1)

# block 35
# block 36
# block 37
mulMod.summary()
try:
    mulMod.save('/content/mulMod_wts.h5')
    import netron
except Exception as e:
    !pip install netron
    import netron
import portpicker
port = portpicker.pick_unused_port()
with output.temporary():
    pass

# block 38
class LRLogger(TensorBoard):
    def on_epoch_end(self, epoch, logs = None):
        logs.update({'lr': K.eval(self.model.optimizer.lr)})
        super().on_epoch_end(epoch, logs)
# block 39
# block 40
# class GraphCallback(keras.callbacks.Callback):
#     def __init__(self):
#         # accs = []
#         self.accs = accs
#     def on_epoch_begin(self, epoch, logs = None):
#         try:
#             plt.plot(self.accs)
#             plt.plot(history.history[atts[2]])
#             plt.title('model accuracy')
#             plt.xlabel('epoch')
#         except Exception as e:
#             print(logs.keys())
#             return
#         clear_output()
#     self.accs.append(logs['binary_accuracy'])
log_dir = './logsM'
tbcback = keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=0, profile_batch = 5)
filepath = '/content/checkpoints_Mul.h5'
model.save(f'/content/{model.name}.h5')
# block 41
choices = np.random.randint(low = 0, high = 639, size = 32)
ypred = []
yact = []
# for i in choices:
#     im = evalX[i]
#     im = im[...,::-1]
#     im = np.reshape(im, [1,256,256,3])
#     ypred.append(1)
#     else:
#         ypred.append(0)
#     yact.append(yeval[i])
for i in choices:
    # if Y[i] == 0:
    im = evalX[i]
    im = im[...,::-1]
    im = np.reshape(im, [1,256,256,3])
    pred = pred

# block 42
binMod.layers.pop()

# block 43
binMod.layers

# block 44
for i in range(Y.shape[-1]):
    print(Y[:,i])
Convolutional Variational Autoencoder (CVAE)
# importing required libraries
!nvidia-smi
import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.data import Dataset
import tensorflow_probability as tfp
import glob
import imageio
import PIL
import time

# Mounting google drive
from google.colab import drive
drive.mount('/content/drive')

# Loading data
xtr = np.load('/content/drive/MyDrive/Capstone/NPdatasets/Xdata256.npy')
xtes = np.load('/content/drive/MyDrive/Capstone/NPdatasets/ValXdata256.npy')
xtes.dtype

# block 1
train_shape = xtr.shape[0]
test_shape = xtes.shape[0]
bsize = 64
print(f'Train shape : {train_shape} \nBatch Size : {bsize}')
# block 2
# block 3
from tensorflow.keras import layers as L

class VAE(tf.keras.Model):
    def __init__(self, lat_dim):
        self.lat_dim = lat_dim
        super(VAE, self).__init__()
        self.encoder = tf.keras.Sequential([
            L.InputLayer(input_shape=(256, 256, 3)),
            L.Conv2D(filters=32, kernel_size=3, strides=(2, 2), activation='relu'),
            L.Conv2D(filters=64, kernel_size=3, strides=(2, 2), activation='relu'),
            L.Flatten(),
            L.Dense(lat_dim + lat_dim)
        ])
        self.decoder = tf.keras.Sequential([
            L.InputLayer(input_shape=(lat_dim,)),
            L.Dense(units=64*64*64, activation='relu'),
            L.Reshape(target_shape=(64, 64, 64)),
            L.Conv2DTranspose(filters=64, kernel_size=3, strides=2, padding='same', activation='relu'),
            L.Conv2DTranspose(filters=32, kernel_size=3, strides=2, padding='same', activation='relu'),
            # L.Conv2DTranspose(filters=32, kernel_size=3, strides=1, padding='same'),
            L.Conv2DTranspose(filters=3, kernel_size=4, strides=1, padding='same')
        ])

    @tf.function
    def sample(self, eps = None):
        if eps is None:
            eps = tf.random.normal(shape = (100, self.lat_dim))
        return self.decode(eps, sig = True)
# block 4
optimizer = tf.keras.optimizers.Adam(1e-4)

# @tf.function
def train_step(model, x, optimizer):
    with tf.GradientTape() as tape:
        loss = compute_loss(model, x)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
# block 5
import matplotlib.pyplot as plt
epochs = 10
latent_dim = 2
num_examples_to_generate = 16
random_vector_for_generation = tf.random.normal(
    shape=[num_examples_to_generate, latent_dim])
model = VAE(latent_dim)

for i in range(predictions.shape[0]):
    plt.subplot(4, 4, i + 1)
    plt.imshow(predictions[i, :, :, 0], cmap='gray')
    plt.axis('off')
plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
plt.show()

loss = tf.keras.metrics.Mean()
for test_x in test_ds:
    loss(compute_loss(model, test_x))
elbo = -loss.result()
ELBO.append(elbo)
display.clear_output(wait=False)
print('Epoch: {}, Test set ELBO: {}, time elapse for current epoch: {}'
      .format(epoch, elbo, end_time - start_time))
generate_and_save_images(model, ...)
# block 6
# block 7
pred = predictions
pred = pred*255
pred = np.asarray(pred)
pred = pred.reshape((256,256,3))
pred = pred[...,::-1]

# Block 8
# block 9
plt.plot(elbLoss)
Multi-Class Classification Model Deployment using Flask
def center_crop(img, nw = None):
    wid, hei = img.size
    if nw is None:
        nw = min(wid, hei)
    l = (wid - nw)/2
    r = (wid + nw)/2
    b = (hei + nw)/2

def model_predict(x):
    preds = model.predict(x)
    return preds

def model_predict_1(x_1):
    preds_1 = model_1.predict(x_1)
    return preds_1

@app.route('/')
def index():
    # Main page
    return render_template('index.html')

@app.route('/i')
def i():
    return render_template('index.html')

@app.route('/p')
def p():
    return render_template('predict.html')

@app.route('/c')
def c():
    return render_template('contactus.html')

binary_result = "Diseases detected are : "
listToStr = ','.join([str(elem) for elem in diseases])
result = binary_result + listToStr
return render_template('predict.html', prediction_text=' {} '.format(result))
binary_result = "No Disease Detected"
return render_template('predict.html', prediction_text='Prediction : {} '.format(binary_result))
CHAPTER 6
REFERENCES
[2] S. Pachade et al., “Retinal Fundus Multi-Disease Image Dataset (RFMiD): A Dataset for Multi-Disease Detection Research,” Data, vol. 6, no. 2, p. 14, Feb. 2021, doi: 10.3390/data6020014.
[3] E. Sudheer Kumar and C. Shoba Bindu, “MDCF: Multi-Disease Classification Framework on Fundus Image Using Ensemble CNN Models,” September 2021. Available: https://jilindaxuexuebao.com/details.php?id=DOI:10.17605/OSF.IO/ZHA9C
[4] Dominik Müller, Iñaki Soto-Rey, and Frank Kramer, “Multi-Disease Detection in Retinal Imaging Based on Ensembling Heterogeneous Deep Learning Models,” March 2021. Available: https://arxiv.org/abs/2103.14660
[5] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu, “Squeeze-and-Excitation Networks,” May 2019. Available: https://arxiv.org/abs/1709.01507
[7] Andrés G. Marrugo and María S. Millán, “Retinal Image Analysis: Preprocessing and Feature Extraction,” February 2011. Available: https://iopscience.iop.org/article/10.1088/1742-6596/274/1/012039
[8] Ekberjan Derman, “Dataset Bias Mitigation Through Analysis of CNN Training Scores,” June
2021. Available: https://arxiv.org/abs/2106.14829
[9] Diederik P. Kingma and Max Welling, “An Introduction to Variational Autoencoders,” December 2019.
[10] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir
Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich, “Going Deeper With
Convolutions,” Sept 2014. Available: https://arxiv.org/abs/1409.4842
[11] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew
Wojna, “Rethinking the Inception Architecture for Computer Vision,” December 2015.
Available: https://arxiv.org/abs/1512.00567v3
[12] Alex Labach, Hojjat Salehinejad, and Shahrokh Valaee, “Survey of Dropout Methods
for Deep Neural Networks,” October 2019. Available: https://arxiv.org/abs/1904.13310
[13] Shaofeng Cai, Yao Shu, Gang Chen, Beng Chin Ooi, Wei Wang,and Meihui Zhang,
“Effective and Efficient Dropout for Deep Convolutional Neural Networks,” July 2020.
Available: https://arxiv.org/abs/1904.03392
[14] Huimin Ren, Yun Yue, Chong Zhou, Randy C. Paffenroth, Yanhua Li, and Matthew L.
Weiss, “Robust Variational Autoencoders: Generating Noise-Free Images from Corrupted
Images.”
Available: https://users.wpi.edu/~yli15/Includes/AdvML20_Huimin.pdf
[15] “Tensorflow and Keras Model Deployment using Flask in Google Colab.” Available:
https://medium.datadriveninvestor.com/tensorflow-and-keras-model-deployment-using-flask-in-google-colab-b8e49d1d4af0
BIODATA
Disease Identification and Retinal Fundus Scan Correction Using Deep Learning

Kartik Sai¹, Sriharshitha Deepala¹, Hemant Kumar Rathore¹
¹Vellore Institute of Technology

Abstract

In this paper a deep learning application is presented which can show if the scan of a retinal fundus risks having a disease and, if so, the nature of the disease with a risk probability, along with a scan clarifier. Image processing and deep learning methods are used to achieve this result. The model is hosted on a website where the user can upload a scan of their retinal fundus in order to get a diagnosis and a corrected scan (if required). When the user uploads their scan onto the site, the deep learning model processes the image and outputs the disease risk and the kind of disease being risked in their case. This is achieved using deep learning techniques, including a Convolutional Neural Network (CNN) and a Variational Autoencoder (VAE) for disease type/risk identification and scan correction respectively.

Introduction

In today’s world, we face many difficulties in terms of eye care, including treatment, quality of prevention, vision rehabilitation services, and a scarcity of trained eye care experts. Early detection and diagnosis of ocular abnormalities can help with beforehand preparation for disease prevention and early treatment procedures. This project aims at the classification and prediction of diseases. With over 1 billion preventable cases, the ability to pre-emptively solve the issue is the best strategy. But this is not possible in certain places that lack proper eye care in either the technological or the technician department. To tackle this issue, we need a disease risk predictor that can tell us if the retina of the patient shows any signs or lead indicators of a predefined set of symptoms. This disease risk predictor and classifier can be implemented using the Retinal Fundus Multi-Disease Dataset (RFMiD). This dataset consists of 3200 retinal fundus images and the corresponding set of ground truth values in CSV format, which enables the implementation of the models.

Methodology

• The initial step for all the models is to load the dataset. The RFMiD dataset is loaded using a tensorflow.data pipeline (a data pipeline is a series of data processing steps).
• The RFMiD dataset is highly biased; to reduce the bias, data augmentation is performed to increase the density of low-count images.
• After loading the data and applying different types of pre-processing techniques, create a deep learning model to predict disease risk (unhealthy or healthy fundus scan), as shown in Figure I, by utilizing Inception V3, a Squeeze-and-Excitation block, and many other parameters.
• After successful creation of the disease prediction model, save the model weights in .h5 format and add multi-disease classification to it, as shown in Figure II.
• To handle noisy scans, create an image de-noiser using a Variational Autoencoder to denoise the images.

Figure I: Binary Classifier
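The classifier described in the bullets above can be sketched in Keras as follows. This is a minimal sketch, not the report’s exact configuration: the augmentation choices, SE reduction ratio, dropout rate, input resolution, and the file name `disease_risk.h5` are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def se_block(x, ratio=16):
    # Squeeze-and-Excitation: squeeze spatial dims, then learn per-channel weights
    ch = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)
    s = layers.Dense(ch // ratio, activation="relu")(s)
    s = layers.Dense(ch, activation="sigmoid")(s)
    s = layers.Reshape((1, 1, ch))(s)
    return layers.Multiply()([x, s])

def build_binary_classifier(input_shape=(224, 224, 3)):
    inputs = keras.Input(shape=input_shape)
    # on-the-fly augmentation to counter the dataset bias noted above
    x = layers.RandomFlip("horizontal")(inputs)
    x = layers.RandomRotation(0.1)(x)
    # Inception V3 backbone (weights=None keeps the sketch self-contained)
    backbone = keras.applications.InceptionV3(
        include_top=False, weights=None, input_shape=input_shape)
    x = backbone(x)
    x = se_block(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)
    # single sigmoid unit: healthy vs. unhealthy fundus scan
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_binary_classifier()
model.save("disease_risk.h5")  # saved in .h5 format, as in the report
```

The multi-disease classifier from the next bullet would replace the final sigmoid unit with one sigmoid output per disease label from the RFMiD ground-truth CSV.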
Conclusion

This project was a consequence of a problem survey conducted with local ophthalmologists. Complaints ranged from incorrect or slow diagnosis to a lack of the expertise or help required to properly arrive at an accurate decision about a disease or the risk of a disease. This creates a large problem in proper planning and patient safety, and could result in improper precautionary measures or, in some cases, no precautionary measures at all. This project aims to help solve that issue and also hopes to provide a sort of pre-diagnosis to any individual so that they can take the utmost care; it can be particularly useful in very remote places with a very steep patient-per-doctor ratio.
Software and Hardware Details: Free tier of Google Colab with 12 GB RAM, a 108 GB disk, and an Nvidia Tesla T4 GPU.
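The VAE de-noiser from the methodology can be sketched as a small convolutional VAE. The 64×64 resolution, latent size, and layer widths below are illustrative assumptions rather than the report’s architecture; training pairs noisy scans with their clean counterparts, as in the robust-VAE reference [14].

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 64  # assumed latent size

class Sampling(layers.Layer):
    # reparameterisation trick: z = mu + sigma * eps; KL term added as a layer loss
    def call(self, inputs):
        mu, log_var = inputs
        kl = -0.5 * tf.reduce_mean(1.0 + log_var - tf.square(mu) - tf.exp(log_var))
        self.add_loss(kl)
        eps = tf.random.normal(tf.shape(mu))
        return mu + tf.exp(0.5 * log_var) * eps

# encoder: (64, 64, 3) scan -> latent distribution parameters -> sampled z
enc_in = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(enc_in)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Flatten()(x)
mu = layers.Dense(latent_dim)(x)
log_var = layers.Dense(latent_dim)(x)
z = Sampling()([mu, log_var])

# decoder: z -> reconstructed (denoised) scan
x = layers.Dense(16 * 16 * 64, activation="relu")(z)
x = layers.Reshape((16, 16, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
recon = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)

vae = keras.Model(enc_in, recon)
# reconstruction loss; the KL term added inside Sampling is included automatically
vae.compile(optimizer="adam", loss="binary_crossentropy")
# training would pair noisy inputs with clean targets, e.g.:
# vae.fit(noisy_scans, clean_scans, epochs=30, batch_size=32)
```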