Disease Identification and Retinal Scan Correction Using Deep Learning Techniques Project Report
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
by
DECEMBER 2022
CERTIFICATE
The thesis is satisfactory / unsatisfactory
Approved by
ACKNOWLEDGEMENT
We would like to first express our profound gratitude and deep regard to our
guide Dr. Anupama Namburu and sincerely wish to acknowledge her vision,
guidance, valuable feedback, and constant support throughout the duration of this
project.
ABSTRACT
In this paper, a deep learning application is presented that can determine whether a retinal fundus scan shows a risk of disease and, if so, the nature of the disease with a risk probability, along with a scan clarifier. Image processing and deep learning methods are used to achieve this result. The model is hosted on a website where the user can upload a scan of their retinal fundus in order to get a diagnosis and a corrected scan (if required). When the user uploads their scan onto the site, the deep learning model processes the image and outputs the disease risk and the kind of disease at risk in their case. This is achieved using deep learning techniques, namely a Convolutional Neural Network (CNN) for disease type/risk identification and a Variational Autoencoder (VAE) for scan correction.
TABLE OF CONTENTS
1.1 Objectives 9
List of Figures
4 Proposed Workflow 14
14 Retinal Fundus Image having no disease 23
CHAPTER 1
INTRODUCTION
According to the World Health Organization, over 2 billion people suffer from some form of vision impairment, and at least 50% of these cases are preventable.
With over 1 billion preventable cases, the ability to pre-emptively address the issue is the best strategy. But this is not possible in places that lack proper eye care, whether in equipment or in trained technicians. To tackle this issue, we need a disease risk predictor that can tell us whether the retina of the patient shows any signs or lead indicators of a predefined set of symptoms.
This disease risk predictor and classifier can be implemented using the Retinal Fundus
Multi-Disease Dataset (RFMiD). This dataset consists of 3200 retinal fundus images and the
corresponding set of ground truth values in CSV format, and this enables the implementation of
the solution through deep learning.
1.1 Objectives
Pigment Epithelial Detachment (HPED) and Collateral (CL). The dataset consists of one-hot encoded values for disease risk and the 45 different eye conditions, given in .csv format. This dataset was divided into a 60% training set (1920 images), a 20% evaluation set (640 images), and a 20% testing set (640 images).
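The 60/20/20 split above can be sketched as follows; this is an illustrative NumPy snippet, not the report's own loading code, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical sketch of the 60/20/20 split of the 3200 RFMiD images.
rng = np.random.default_rng(0)
indices = rng.permutation(3200)            # shuffle all 3200 image indices

n_train = int(3200 * 0.60)                 # 1920 training images
n_eval = int(3200 * 0.20)                  # 640 evaluation images

train_idx = indices[:n_train]
eval_idx = indices[n_train:n_train + n_eval]
test_idx = indices[n_train + n_eval:]      # remaining 640 testing images

print(len(train_idx), len(eval_idx), len(test_idx))  # 1920 640 640
```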
A similar deep learning project was carried out by Dominik Müller, Iñaki Soto-Rey, and Frank Kramer of IT-Infrastructure for Translational Medical Research, University of Augsburg, Germany, and the Medical Data Integration Center, University Hospital Augsburg, Germany. This is our base paper; its title is “Multi-Disease Detection in Retinal Imaging Based on Ensembling Heterogeneous Deep Learning Models.” The paper discusses an approach to detecting various retinal diseases using ensemble learning and heterogeneous deep convolutional neural network models. Ensemble learning techniques such as heterogeneous deep learning models, bagging via 5-fold cross-validation, and stacked logistic regression models were integrated, and modern techniques such as class weighting, transfer learning, real-time image augmentation, and focal loss were used to build the classification pipeline. The dataset used in the paper was the Retinal Fundus Multi-Disease Image Dataset (RFMiD), which consists of 3200 retinal fundus images covering 46 different retinal conditions. The number of classes was reduced to a disease risk class, 27 columns of various retinal conditions, and one ‘OTHER’ class consisting of extremely rare conditions. The dataset was divided into training, testing, and validation sets; the training set consists of 1920 images, and the testing and validation sets together consist of 1280 images. To mitigate dataset bias, pre-processing and data augmentation were performed. Disease risk was detected using multiple models based on the EfficientNetB4 and DenseNet201 architectures. Disease label classification was then performed using ResNet152, InceptionV3, DenseNet201, and EfficientNetB4, with the ensemble learning strategies Bagging and Stacking applied on top. This model gave an AUROC score of 0.95.
We also referred to the “Squeeze-and-Excitation Networks” paper (by Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu) for the squeeze-and-excitation block in the model architecture, which enhances learning and consequently gives better accuracy and results.
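The squeeze-and-excitation operation from that paper can be sketched in NumPy as below; this is an illustrative re-implementation with random placeholder weights (`w1`, `w2`), not the trained Keras block used in the model, and it uses global average pooling for the squeeze step as in the original paper (the project's own code uses global max pooling).

```python
import numpy as np

def se_block(feature_maps, w1, w2):
    """Squeeze-and-excitation on feature maps of shape (H, W, C)."""
    # Squeeze: one summary value per channel (global average pooling).
    z = feature_maps.mean(axis=(0, 1))            # shape (C,)
    # Excitation: bottleneck (C -> C//r) with ReLU, then (C//r -> C) with sigmoid.
    s = np.maximum(z @ w1, 0.0)                   # ReLU
    gates = 1.0 / (1.0 + np.exp(-(s @ w2)))       # sigmoid, shape (C,)
    # Scale: re-weight each channel by its gate, emphasizing important maps.
    return feature_maps * gates

# Toy usage with C=8 channels and reduction ratio r=4 (random weights).
rng = np.random.default_rng(0)
fmap = rng.standard_normal((16, 16, 8))
w1 = rng.standard_normal((8, 2))
w2 = rng.standard_normal((2, 8))
out = se_block(fmap, w1, w2)
print(out.shape)  # (16, 16, 8)
```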
A site was also developed to deploy the model for common and easy usage. To deploy the model as a web application, a Medium article, “Tensorflow and Keras Model Deployment using Flask in Google Colab” by Rishi Mishra, was taken as reference. This article mainly discusses saving a deep learning model in Keras .h5/.json format, creating a web server using Flask, and running a Flask application using flask-ngrok in Google Colab.
CHAPTER 2
DISEASE IDENTIFICATION AND SCAN CORRECTION USING
DEEP LEARNING
This chapter describes the proposed system, working methodology, and software and hardware details.
The model is an extension of the binary classifier with a multi-class classifier head. The head consists of a single convolution block of 3 parallel convolutions (1x1, 3x3, and 5x5), dropout, and a max-pool operation of size 2x2 with strides of 2. Its output is then fed into a squeeze-and-excitation block to emphasize important feature maps, and then flattened.
The initial step for all the models is to load the dataset. The RFMiD dataset is loaded using the tensorflow.data pipeline (a data pipeline is a series of data processing steps). The tf.data pipeline makes it possible to handle massive volumes of data, read from many data sources, combine randomly picked pictures into a batch for training, and execute complicated transformations; it also helps reduce processing time by effectively utilizing CPU and GPU resources. The RFMiD dataset is highly imbalanced, so to reduce bias we performed data augmentation to increase the number of images in under-represented classes.
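The pipeline stages described above (shuffle, augment, batch) can be illustrated with a plain-Python stand-in; the real implementation uses tf.data.Dataset with Keras augmentation layers, so everything here (`make_pipeline`, the flip augmentation) is a simplified assumption.

```python
import numpy as np

def make_pipeline(images, labels, batch_size, seed=0):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))          # shuffle step
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch_x = images[idx].copy()
        for i in range(len(batch_x)):             # augmentation step
            if rng.random() < 0.5:
                batch_x[i] = batch_x[i][:, ::-1]  # random horizontal flip
        yield batch_x, labels[idx]                # batching step

# Toy usage: 10 tiny "images", batches of 4 -> batch sizes 4, 4, 2.
X = np.arange(10 * 4 * 4 * 3).reshape(10, 4, 4, 3).astype(np.float32)
Y = np.arange(10)
batches = list(make_pipeline(X, Y, batch_size=4))
print([b[0].shape[0] for b in batches])  # [4, 4, 2]
```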
The dataset is now ready to be passed into the model. First, the binary classifier is created. The model architecture takes inspiration from Inception V3, with the addition of a squeeze-and-excitation network.
A loss function is used to benchmark the model and see how well it has learnt. Here, the loss function is a simple binary cross-entropy loss, owing to the binary nature of the health status of a retinal scan (either healthy or unhealthy). This loss exploits the log-probability concept, which helps the model learn better.
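For illustration, binary cross-entropy can be written out directly in NumPy; Keras' BinaryCrossentropy computes the same mean of per-example negative log-probabilities.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])   # confident and mostly correct
bad = np.array([0.4, 0.6, 0.4, 0.6])    # hesitant and mostly wrong
# Lower loss rewards assigning high probability to the true label.
print(binary_cross_entropy(y_true, good) < binary_cross_entropy(y_true, bad))  # True
```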
After the loss is calculated, the edge weights are updated so as to minimize the loss. To do this, an optimiser is used; in this case we have made use of the Adam optimiser with an initial learning rate of 0.01 (1e-2), a β1 decay value of 0.9, and a β2 decay value of 0.999. Combined with this, we also added a learning-rate schedule callback that monitors the validation loss with a patience of 3 epochs. This scheduler decays the learning rate by a factor of 2 (that is, it halves the learning rate if the validation loss is greater than or equal to the previous epoch's validation loss for 3 epochs in a row).
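The plateau-halving behaviour described above can be sketched in plain Python; this mimics the logic of Keras' ReduceLROnPlateau (patience 3, factor 0.5) rather than reproducing the callback itself.

```python
def schedule_lr(val_losses, initial_lr=0.01, patience=3, factor=0.5):
    lr = initial_lr
    best = float('inf')
    wait = 0
    for loss in val_losses:
        if loss < best:       # validation loss improved: reset the counter
            best = loss
            wait = 0
        else:                 # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr *= factor  # halve the learning rate after 3 stale epochs
                wait = 0
    return lr

# Loss improves twice, then plateaus for 3 epochs -> one halving of 0.01.
print(schedule_lr([1.0, 0.8, 0.8, 0.8, 0.8]))  # 0.005
```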
For multi-class classification, the binary classification model was extended with a multi-class head consisting of 3 parallel convolution operations leading into a squeeze-and-excitation block, subsequently flattened and fed into a dense network with 28 output nodes representing the 28 different classes. Each of these 28 nodes has a sigmoid activation so as to give an independent probability of risk for each disease.
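The behaviour of the 28 independent sigmoid outputs can be illustrated as follows; the logits here are toy values, and the 0.5 decision threshold is an assumption for this sketch.

```python
import numpy as np

# Each node gives an independent per-disease probability, so several
# diseases can be flagged at once (unlike softmax, which forces one winner).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([2.0, -3.0, 0.5] + [-5.0] * 25)  # toy logits for 28 classes
probs = sigmoid(logits)                            # 28 independent probabilities
flagged = np.where(probs >= 0.5)[0]                # diseases predicted present
print(flagged)  # [0 2]
```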
This multi-class classifier is trained with a sigmoid focal cross-entropy loss, owing to the high complexity of the training labels and to prevent the model from overfitting on the easier examples. This loss also penalizes the model if it settles on a constant output to minimize the loss. To give an understandable benchmark, categorical accuracy and Hamming loss were used as the accuracy and inaccuracy metrics respectively.
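Both quantities can be written out in NumPy for illustration; these are simplified stand-ins for the sigmoid focal cross-entropy loss and the Hamming loss metric, with gamma = 2 assumed for the focal term. The factor (1 - p_t)^gamma down-weights well-classified ("easy") examples, and Hamming loss is the fraction of label positions predicted incorrectly.

```python
import numpy as np

def sigmoid_focal_ce(y_true, p, gamma=2.0, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y_true == 1, p, 1 - p)   # prob. assigned to the true label
    # Easy examples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
    return float(np.mean(-((1 - p_t) ** gamma) * np.log(p_t)))

def hamming_loss(y_true, y_pred):
    return float(np.mean(y_true != y_pred))  # fraction of wrong label positions

y_true = np.array([1, 0, 0, 1])
easy = np.array([0.95, 0.05, 0.05, 0.95])   # confident, easy examples
hard = np.array([0.55, 0.45, 0.45, 0.55])   # harder, uncertain examples
print(sigmoid_focal_ce(y_true, easy) < sigmoid_focal_ce(y_true, hard))  # True
print(hamming_loss(np.array([1, 0, 1, 0]), np.array([1, 0, 0, 0])))     # 0.25
```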
The Variational Autoencoder was used to denoise unclean scans and help fix any inconsistencies in the generated scan. It comprises a simple architecture: an encoder convolutional neural network encodes the image into a latent space of mean and log-variance. These latent-space parameters are then re-parameterized with reference to the normal distribution, and any outlier is deemed noise and substituted with an appropriate value. The updated parameters are then decoded into input-like outputs (in this case, retinal fundus scan images). Benchmarking of this model is done using the ELBO (Evidence Lower BOund) loss, which provides a benchmark for how the model performs with respect to the original and corrected images.
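The re-parameterization step and the KL term inside the ELBO can be sketched in NumPy for a diagonal-Gaussian latent space; this is an illustrative derivation aid, not the model's TensorFlow code.

```python
import numpy as np

def reparameterize(mean, logvar, rng):
    eps = rng.standard_normal(mean.shape)       # noise drawn from N(0, I)
    return mean + np.exp(0.5 * logvar) * eps    # z = mu + sigma * eps

def kl_to_standard_normal(mean, logvar):
    # KL( N(mean, exp(logvar)) || N(0, I) ), summed over latent dimensions;
    # this is the regularization term inside the ELBO.
    return float(np.sum(0.5 * (np.exp(logvar) + mean**2 - 1.0 - logvar)))

rng = np.random.default_rng(0)
z = reparameterize(np.zeros(2), np.zeros(2), rng)
print(z.shape)                                          # (2,)
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))  # 0.0 for N(0, I)
```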
This section describes the software and hardware details of the system:
Google Colaboratory, the Python programming language, and Python modules such as NumPy, Pandas, Tensorflow, Keras, Sklearn, Matplotlib, PIL, etc., are used.
1. Google Colaboratory
Python is a widely used programming language for research and development. Below are a few of the reasons for the extensive use of Python in research and development:
1. Python is a high-level, general-purpose programming language that is easy to learn and
use, making it accessible to a wide range of people, including those with no prior
programming experience.
2. Python's many libraries and frameworks enable us to complete bigger tasks in fewer lines of code.
3. Python has a large and active community of users and developers.
Python 3.8.16 was used in this project; this is the version provided by Google Colaboratory.
Versions of all the packages were in accordance with Python 3.8. The packages used were as
follows:
1. tensorflow-keras : Deep learning framework for neural network and data augmentation model architecture
2. pandas : CSV file read and write
3. numpy : Array operations, data-type operations, miscellaneous
4. tensorflow-dataset : Dataset generator and augmentation
5. PIL : Python image library for image-related operations and image manipulations
6. lime : Locally interpretable explanation for the decision taken by the model
7. sklearn : Scikit-learn (abbreviated as “sklearn”) contains many useful tools for data
manipulation (scaling, imputation of missing values, generating synthetic samples), and
evaluating the performance of a deep learning model (accuracy score, F1 score, AUC,
confusion matrix, etc metrics are useful to assess the performance of a model).
8. matplotlib : Python library for data visualization
9. flask : Micro web framework in Python, well-suited for building small to medium-sized web applications
10. werkzeug : Provides a collection of utilities for building web applications using Flask.
11. pyngrok : Creates a secure tunnel from a public URL to a local web server from Python code.
12. flask_ngrok : An extension of flask that allows a user to make their flask application
available globally by using ngrok.
The specifications of the CPUs, GPUs, and TPUs offered by the free version of Google Colab are:
● GPU: A single GPU with 1 vCPU and 13 GB of memory. The GPU is an NVIDIA Tesla T4 or K80.
● TPU: A single TPU with 2 vCPUs and 13 GB of memory. The TPU is a v2 TPU.
CHAPTER 3
c. Confusion matrix for Binary classifier
e. Binary classification Cross-entropy loss across 25 epochs
g. Hamming loss for 25 epochs
i. ELBO loss for 10 epochs
j. Web Application
CHAPTER 4
Abnormalities in a retinal fundus scan, if examined in their early stages, can lead to the pre-emptive cure of many diseases such as diabetes, but the availability of a doctor who can do this is not guaranteed in many places. Therefore, a need arises for an automated and accurate classification and prediction mechanism for this kind of job.
This project also highlights the dire state of some of the more remote places in our country and a few other countries, where an eye specialist is not very accessible and a proper diagnosis is not always guaranteed under these conditions. It also highlights how many cases could be prevented with proper diagnosis and symptom checking, as the signs of illness are very well reflected in the patients' retinas. Some such diseases include, but are not limited to, diabetic retinopathy and macular degeneration. If these conditions are not taken into consideration, they could prove fatal to the patient in some cases.
The model's performance was limited by time constraints arising from its computationally heavy nature; training for 25 epochs at a time and then going back to optimize the model was the only viable strategy for this reason. The depth of the convolutional layers and the parallelisation of multiple convolution operations with many kernels widened the variance of the model, making it capable of understanding, learning, and identifying the disease with a far broader understanding. This in turn caused the model to learn very slowly and to overfit on several occasions, which was fixed by using dropout, data augmentation, and other regularization methods such as L2.
The Variational Autoencoder was also a very computationally intensive model, owing to the size of the images and the RGB reconstruction of the images. The CVAE faced the same computational-complexity challenges as the CNN model. However, the reconstruction over the first 10 epochs has proven very promising, and the model can be further trained and improved with better training times and computational methods.
A future prospect of this project involves micro-computers/processors capable of running the models on their local chipset: the model weights could be loaded onto them, and these tiny computers/processors could be embedded in retinal scan stations at optometrists, providing a near-instantaneous result on the scanner itself.
Another future prospect is training on newer data in order to fine-tune the model or help identify newer diseases. This can prove very useful in an age where, in the case of a global pandemic, remote areas get the least attention, which in turn causes massive damage to life in those areas.
A possible third improvement could come in the form of more technically advanced machines in which the finer details are highlighted, enhancing the model's ability to identify diseases. This can also help in places where the machines are not up-to-date, providing a far cleaner reconstruction in those cases.
CHAPTER 5
APPENDIX
Review - 1:
https://docs.google.com/presentation/d/16ESI7IAYYQOlWlyAcjEiGuITxCWI02p8
ae17cg12NaE/edit?usp=sharing
Review - 2:
https://docs.google.com/presentation/d/1yO0CVWbe0ghLlmMJw4tShf6Kqo1GzjM
SosGxModbENw/edit?usp=sharing
Review - 3:
https://docs.google.com/presentation/d/1DY_k15ylI9MNulTmg5gmr-WX35_xWsv
ILs1aI1kaZCw/edit?usp=sharing
POSTER :
Link for Poster :
https://drive.google.com/file/d/1YuZdMJEjLW5hGZIUdyqoi5YW7luhtyBw/view?u
sp=sharing
Python script and code
Binary and Multi-Class Classification
# Mounting google drive
from google.colab import drive
drive.mount('/content/drive')

# importing required libraries
import tensorflow as tf
import numpy as np
import pandas as pd
import pickle
%load_ext tensorboard

# Loading data
labels = pd.read_csv('/content/drive/MyDrive/Capstone/dataset/RFMiD_Training_Labels.csv')
labels.drop(labels = ['ID'], axis = 1, inplace = True)
labels = labels.astype('uint8')
labels
# block 1
X = np.load('/content/drive/MyDrive/Capstone/NPdatasets/Xdata256.npy', allow_pickle = True)
Y = np.array(labels['Disease_Risk'])
Y = Y.astype('uint8')

# block 2
evalX = np.load('/content/drive/MyDrive/Capstone/NPdatasets/ValXdata256.npy')
%cd /content

# block 3
for i in range(len(X)):
    X[i] = X[i][...,::-1]
for i in range(len(evalX)):
    evalX[i] = evalX[i][...,::-1]

# block 4
evalLabels = pd.read_csv('/content/drive/MyDrive/Capstone/dataset/RFMiD_Validation_Labels.csv')
evalY = np.array(evalLabels['Disease_Risk'])
# block 5
posX = []
negX = []
posY = []
negY = []
for i in range(len(Y)):
    if Y[i] == 1:
        posX.append(X[i])
        posY.append(Y[i])
    elif Y[i] == 0:
        negX.append(X[i])
        negY.append(Y[i])
posX = np.array(posX)
posY = np.array(posY)
negX = np.array(negX)
negY = np.array(negY)

def make_ds(x, y):
    ds = Dataset.from_tensor_slices((x, y))
    ds = ds.shuffle(320).repeat()
    return ds

def prepDataBinTr(X, Y, bsize = 32):
    # ds = Dataset.from_tensor_slices((X, Y))
    ds.shuffle(320)
    ds = ds.batch(bsize)
    augModel = tf.keras.Sequential([
        tf.keras.layers.RandomBrightness(factor = 0.2),
        layers.RandomContrast([0.3, 0.7]),
        layers.RandomFlip('horizontal_and_vertical'),
    ])
    return ds

def prepDataBinVal(X, Y, bsize = 32):
    # xtes = np.reshape(xtes, [xtes.shape[0], 256, 256, 3])
    val_ds = Dataset.from_tensor_slices((X, Y))
    val_ds = val_ds.batch(bsize)
    return val_ds.prefetch(tf.data.AUTOTUNE)
# block 6
ds = prepDataBinTr(X, Y, bsize = 32)
it = iter(ds)
valds = prepDataBinVal(evalX, evalY, bsize = 32)

# block 7
temp = it.next()
temp0 = np.asarray(temp[0])
# imsh(temp0[2])

# block 8
temp = it.next()
temp0 = np.asarray(temp[1])
print(temp[1])
imsh(temp0[2])
# block 9
def customBlock(inp, f, a = True):
    u = inp.shape
    ar = []
    i = np.random.randint(u[3])
    if i not in ar:
        ar.append(i)
    feat1 = []
    feat2 = []
    for i in range(u[3]):
        if i in ar:
            feat1.append(inp[:,:,:,i])
        feat2.append(inp[:,:,:,i])
    feat1 = tf.convert_to_tensor(feat1)
    feat2 = tf.convert_to_tensor(feat2)
    act = None
    if a:
        act = 'relu'
    # conv1 = L.Conv2D(f, (3,3), activation = act, padding = 'same')(conv1)
    conv1 = L.Dropout(0.6)(conv1)
    conv2 = L.Conv2D(f, (3,3), activation = 'relu', padding = 'same')(feat1)
    conv2 = L.Dropout(0.6)(conv2)
    conv3 = L.Dropout(0.6)(conv3)
    conv1 = L.MaxPool2D((2,2),(2,2))(conv1)
    conv2 = L.MaxPool2D((2,2),(2,2))(conv2)
    conv3 = L.MaxPool2D((2,2),(2,2))(conv3)
    # feat2 = L.MaxPool2D((2,2),(2,2))(feat2)
    return output

def SENblock(inp, ch, ratio = 16):
    x = L.GlobalMaxPooling2D()(inp)
    x = L.Dense(ch//ratio, activation='relu')(x)
    x = L.Dropout(0.5)(x)
    x = L.Dense(ch, activation='sigmoid')(x)
    return output

def createModel(id):
    K.reset_uids()
    inp_shape = L.Rescaling(1./255)(inp_shape)
    t1 = customBlock(inp_shape, 32)
    t1 = L.MaxPooling2D((2,2),(2,2))(t1)
    t2 = customBlock(inp_shape, 32)
    t2 = L.MaxPooling2D((2,2),(2,2))(t2)
    t3 = customBlock(inp_shape, 32)
    t3 = L.MaxPooling2D((2,2),(2,2))(t3)
    out = L.concatenate([t1, t2, t3], axis = -1)
    out = L.Flatten()(out)
    # out = L.Dropout(0.7)(out)
    # img_path = f'/content/Net_image_{id}.png'
    return model
# block 10
learning_rate_reduction = ReduceLROnPlateau(monitor = 'val_binary_accuracy',
                                            patience = 3, verbose = 1,
                                            factor = 0.5, min_lr = 1e-7)
binMod = createModel(1)

# block 11
# block 12
!rm -rf ./logs

# block 13
%tensorboard --logdir logs
class LRLogger(TensorBoard):
    def on_epoch_end(self, epoch, logs = None):
        logs.update({'lr': K.eval(self.model.optimizer.lr)})
        super().on_epoch_end(epoch, logs)
# lrlogcback = LRLogger('/content/logs')
# block 14
import keras
# class GraphCallback(keras.callbacks.Callback):
#     accs = []
#     self.accs = accs
#     try:
#         plt.plot(self.accs)
#         # plt.plot(history.history[atts[2]])
#         plt.title('model accuracy')
#         plt.xlabel('epoch')
#         plt.show()
#     except Exception as e:
#         print(logs.keys())
#         return
#     clear_output()
# # def on_train_end(self, logs = None): self.accs.append(logs['binary_accuracy'])
filepath = '/content/checkpoints_globmaxpool.h5'
chckpntcback = ModelCheckpoint(filepath, monitor = 'val_binary_accuracy')

# block 15
mod_path = '/content/drive/MyDrive/Capstone/Model_Weights/CustomModel.h5'
binMod = load_model(mod_path)
# block 16
try:
    from lime import lime_image
except Exception as e:
    !pip install lime
    from lime import lime_image

# block 17
# block 18
choice = np.random.randint(640)
im = evalX[choice]
im = im[...,::-1]
im = np.reshape(im, [1,256,256,3])
pred = binMod.predict(im, verbose = 0)
# im = im[...,::-1]
# im = np.reshape(im, [1,256,256,3])

# block 19
# block 20
# block 21
evalY[choice]
from skimage.segmentation import mark_boundaries
temp_1, mask_1 = explanation.get_image_and_mask(explanation.top_labels[0],
    positive_only=True, num_features=5, hide_rest=True)
temp_2, mask_2 = explanation.get_image_and_mask(explanation.top_labels[0],
    positive_only=False, num_features=10, hide_rest=False)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15,15))
ax1.imshow(mark_boundaries(temp_1, mask_1))
ax2.imshow(mark_boundaries(temp_2, mask_2))
ax1.axis('off')
ax2.axis('off')

choices = np.random.randint(low = 0, high = 639, size = 32)
ypred = []
yact = []
# for i in choices:
#     im = evalX[i]
#     im = im[...,::-1]
#     im = np.reshape(im, [1,256,256,3])
#     ypred.append(1)
#     else:
#         ypred.append(0)
#     yact.append(yeval[i])
for i in choices:
    # if Y[i] == 0:
    im = evalX[i]
    im = im[...,::-1]
    im = np.reshape(im, [1,256,256,3])
    pred = binMod.predict(im, verbose = 0)
    pred = pred[0][0]
    if pred >= 0.5:
        ypred.append(1)
    else:
        ypred.append(0)
    yact.append(Y[i])
# block 22
import matplotlib.pyplot as plt
plt.figure()
plt.plot(yact, 'rx')
plt.legend(['ypred', 'yact'])

# block 23
# block 24
from sklearn.metrics import confusion_matrix
from keras.models import load_model
binModi = load_model('/content/drive/MyDrive/Capstone/Model_Weights/CustomModel.h5')
# binModi = binMod2
conf_matrix = confusion_matrix(y_true=yact, y_pred=ypred)
fig, ax = plt.subplots(figsize=(7.5, 7.5))
ax.matshow(conf_matrix, cmap=plt.cm.Blues, alpha=0.3)
for i in range(conf_matrix.shape[0]):
    for j in range(conf_matrix.shape[1]):
        ax.text(x=j, y=i, s=conf_matrix[i, j], va='center', ha='center')
# plt.xlabel('Predictions', fontsize=18)
plt.ylabel('Actuals', fontsize=18)
plt.title('Predictions', fontsize=18)
plt.show()

binModi.layers[-5].name
# block 25
# block 26
# block 27
img = evalX[0]
img = np.reshape(img, [1,256,256,3])
binMod = tf.keras.models.Model(inputs = binModi.input,
                               outputs = binModi.get_layer('concatenate_3').output)
# print(binMod2(img))
binMod.output

def PrepDataMulTr(X, Y, bsize = 64):
    # def makeDS(x, y):
    #     posX = []
    #     posY = []
    #     posX = np.array(posX)
    #     posY = np.array(posY)
    #     def make_ds(x1, y1):
    #         ds = Dataset.from_tensor_slices((x1, y1))
    #         ds = ds.repeat()
    #         return ds
    #     return ds_pos
    # dss = []
    # weight = []
    # for y in range(Y.shape[-1]):
    #     dss.append(makeDS(X, Y[:,y]))
    # weight = [1./28]*28
    # ds = dss[0]
    # for i in range(1, 28):
    #     ds = ds.concatenate(dss[i])
    ds = Dataset.from_tensor_slices((X, Y))
    ds.shuffle(240)
    ds = ds.batch(bsize)
    augModel = tf.keras.Sequential([
        tf.keras.layers.RandomBrightness(factor = 0.3),
        layers.RandomContrast([0.0, 0.7]),
        layers.RandomFlip('horizontal_and_vertical'),
        layers.RandomRotation([-0.3, 0.7], fill_mode = 'constant', fill_value = 0.)
    ])
    return ds

def PrepDataMulVal(X, Y, bsize = 64):
    val_ds = Dataset.from_tensor_slices((X, Y))
    val_ds = val_ds.batch(bsize)
    return val_ds.prefetch(tf.data.AUTOTUNE)

# block 28
# block 29
Y = np.array(labels.drop(labels = ['Disease_Risk'], axis = 1))
Y.shape

# block 30
trainDS = PrepDataMulTr(X, Y, 32)
it = iter(trainDS)
temp = it.next()
temp0 = np.asarray(temp[1])
# imsh(temp0[2])
# print(np.unique(temp0, return_counts = True))
print(temp[1].shape)
# block 31
# block 32
binMod.output.shape
import keras.layers as L

def SENblock(inp, ch, ratio = 16):
    x = L.GlobalMaxPooling2D(name = 'glob')(inp)
    return output

def createModelMul(binMod, id = 1):
    K.reset_uids()
    binMod.trainable = False
    input_custom = binMod.output
    output = L.concatenate([conv1, conv2, conv3], axis = -1, name = "conc")
    out = L.Flatten()(out)
    return model

# block 33
import HammingLoss
learning_rate_reduction = ReduceLROnPlateau(monitor = 'categorical_accuracy',
                                            patience = 3, verbose = 1,
                                            factor = 0.5, min_lr = 1e-7)
mulMod = createModelMul(binMod, 1)

# block 35
# block 36
# block 37
mulMod.summary()
try:
    mulMod.save('/content/mulMod_wts.h5')
    import netron
except Exception as e:
    !pip install netron
    import netron
import portpicker
port = portpicker.pick_unused_port()
with output.temporary():
    pass

# block 38
class LRLogger(TensorBoard):
    def on_epoch_end(self, epoch, logs = None):
        logs.update({'lr': K.eval(self.model.optimizer.lr)})
        super().on_epoch_end(epoch, logs)
# block 39
# block 40
# class GraphCallback(keras.callbacks.Callback):
#     def __init__(self):
#         # accs = []
#         self.accs = accs
#     def on_epoch_begin(self, epoch, logs = None):
#         try:
#             plt.plot(self.accs)
#             plt.plot(history.history[atts[2]])
#             plt.title('model accuracy')
#             plt.xlabel('epoch')
#         except Exception as e:
#             print(logs.keys())
#             return
#         clear_output()
#     self.accs.append(logs['binary_accuracy'])
log_dir = './logsM'
tbcback = keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=0, profile_batch = 5)
filepath = '/content/checkpoints_Mul.h5'
model.save(f'/content/{model.name}.h5')
# block 41
choices = np.random.randint(low = 0, high = 639, size = 32)
ypred = []
yact = []
# for i in choices:
#     im = evalX[i]
#     im = im[...,::-1]
#     im = np.reshape(im, [1,256,256,3])
#     ypred.append(1)
#     else:
#         ypred.append(0)
#     yact.append(yeval[i])
for i in choices:
    # if Y[i] == 0:
    im = evalX[i]
    im = im[...,::-1]
    im = np.reshape(im, [1,256,256,3])
    pred = pred

# block 42
binMod.layers.pop()

# block 43
binMod.layers

# block 44
for i in range(Y.shape[-1]):
    print(Y[:,i])
Convolutional Variational Autoencoder (CVAE)
# importing required libraries
!nvidia-smi
import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.data import Dataset
import tensorflow_probability as tfp
import glob
import imageio
import PIL
import time

# Mounting google drive
from google.colab import drive
drive.mount('/content/drive')

# Loading data
xtr = np.load('/content/drive/MyDrive/Capstone/NPdatasets/Xdata256.npy')
xtes = np.load('/content/drive/MyDrive/Capstone/NPdatasets/ValXdata256.npy')
xtes.dtype

# block 1
train_shape = xtr.shape[0]
test_shape = xtes.shape[0]
bsize = 64
print(f'Train shape : {train_shape} \nBatch Size : {bsize}')
# block 2
# block 3
from tensorflow.keras import layers as L

class VAE(tf.keras.Model):
    def __init__(self, lat_dim):
        self.lat_dim = lat_dim
        super(VAE, self).__init__()
        self.encoder = tf.keras.Sequential([
            L.InputLayer(input_shape=(256, 256, 3)),
            L.Conv2D(filters=32, kernel_size=3, strides=(2, 2), activation='relu'),
            L.Conv2D(filters=64, kernel_size=3, strides=(2, 2), activation='relu'),
            L.Flatten(),
            L.Dense(lat_dim + lat_dim)
        ])
        self.decoder = tf.keras.Sequential([
            L.InputLayer(input_shape=(lat_dim,)),
            L.Dense(units=64*64*64, activation='relu'),
            L.Reshape(target_shape=(64, 64, 64)),
            L.Conv2DTranspose(filters=64, kernel_size=3, strides=2, padding='same', activation='relu'),
            L.Conv2DTranspose(filters=32, kernel_size=3, strides=2, padding='same', activation='relu'),
            # L.Conv2DTranspose(filters=32, kernel_size=3, strides=1, padding='same'),
            L.Conv2DTranspose(filters=3, kernel_size=4, strides=1, padding='same')
        ])

    @tf.function
    def sample(self, eps = None):
        if eps is None:
            eps = tf.random.normal(shape = (100, self.lat_dim))
        return self.decode(eps, sig = True)
# block 4
optimizer = tf.keras.optimizers.Adam(1e-4)

# @tf.function
def train_step(model, x, optimizer):
    with tf.GradientTape() as tape:
        loss = compute_loss(model, x)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
# block 5
import matplotlib.pyplot as plt
epochs = 10
latent_dim = 2
num_examples_to_generate = 16
random_vector_for_generation = tf.random.normal(
    shape=[num_examples_to_generate, latent_dim])
model = VAE(latent_dim)

for i in range(predictions.shape[0]):
    plt.subplot(4, 4, i + 1)
    plt.imshow(predictions[i, :, :, 0], cmap='gray')
    plt.axis('off')
plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
plt.show()

loss = tf.keras.metrics.Mean()
for test_x in test_ds:
    loss(compute_loss(model, test_x))
elbo = -loss.result()
ELBO.append(elbo)
display.clear_output(wait=False)
print('Epoch: {}, Test set ELBO: {}, time elapse for current epoch: {}'
      .format(epoch, elbo, end_time - start_time))
generate_and_save_images(model, ...)
# block 6
# block 7
pred = predictions
pred = pred*255
pred = np.asarray(pred)
pred = pred.reshape((256,256,3))
pred = pred[...,::-1]

# Block 8
# block 9
plt.plot(elbLoss)
Multi-Class Classification Model Deployment using Flask
def center_crop(img, nw = None):
    wid, hei = img.size
    if nw is None:
        nw = min(wid, hei)
    l = (wid - nw)/2
    r = (wid + nw)/2
    b = (hei + nw)/2

def model_predict(x):
    preds = model.predict(x)
    return preds

def model_predict_1(x_1):
    preds_1 = model_1.predict(x_1)
    return preds_1

@app.route('/')
def index():
    # Main page
    return render_template('index.html')

@app.route('/i')
def i():
    return render_template('index.html')

@app.route('/p')
def p():
    return render_template('predict.html')

@app.route('/c')
def c():
    return render_template('contactus.html')

binary_result = "Diseases detected are : "
listToStr = ','.join([str(elem) for elem in diseases])
result = binary_result + listToStr
return render_template('predict.html', prediction_text=' {} '.format(result))
binary_result = "No Disease Detected"
return render_template('predict.html', prediction_text='Prediction : {} '.format(binary_result))
CHAPTER 6
REFERENCES
[2] S. Pachade et al., “Retinal Fundus Multi-Disease Image Dataset (RFMiD): A Dataset for Multi-Disease Detection Research,” Data, vol. 6, no. 2, p. 14, Feb. 2021, doi: 10.3390/data6020014.
[3] E. Sudheer Kumar and C. Shoba Bindu, “MDCF: Multi-Disease Classification Framework on Fundus Image Using Ensemble CNN Models,” September 2021. Available: https://jilindaxuexuebao.com/details.php?id=DOI:10.17605/OSF.IO/ZHA9C
[4] Dominik Müller, Iñaki Soto-Rey, and Frank Kramer, “Multi-Disease Detection in Retinal Imaging Based on Ensembling Heterogeneous Deep Learning Models,” March 2021. Available: https://arxiv.org/abs/2103.14660
[5] Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu, “Squeeze-and-Excitation Networks,” May 2019. Available: https://arxiv.org/abs/1709.01507
[7] Andrés G. Marrugo and María S. Millán, “Retinal Image Analysis: Preprocessing and Feature Extraction,” February 2011. Available: https://iopscience.iop.org/article/10.1088/1742-6596/274/1/012039
[8] Ekberjan Derman, “Dataset Bias Mitigation Through Analysis of CNN Training Scores,” June
2021. Available: https://arxiv.org/abs/2106.14829
[9] Diederik P. Kingma and Max Welling, “An Introduction to Variational Autoencoders,” December 2019.
[10] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir
Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich, “Going Deeper With
Convolutions,” Sept 2014. Available: https://arxiv.org/abs/1409.4842
[11] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew
Wojna, “Rethinking the Inception Architecture for Computer Vision,” December 2015.
Available: https://arxiv.org/abs/1512.00567v3
[12] Alex Labach, Hojjat Salehinejad, and Shahrokh Valaee, “Survey of Dropout Methods
for Deep Neural Networks,” October 2019. Available: https://arxiv.org/abs/1904.13310
[13] Shaofeng Cai, Yao Shu, Gang Chen, Beng Chin Ooi, Wei Wang,and Meihui Zhang,
“Effective and Efficient Dropout for Deep Convolutional Neural Networks,” July 2020.
Available: https://arxiv.org/abs/1904.03392
[14] Huimin Ren, Yun Yue, Chong Zhou, Randy C. Paffenroth, Yanhua Li, and Matthew L.
Weiss, “Robust Variational Autoencoders: Generating Noise-Free Images from Corrupted
Images.”
Available: https://users.wpi.edu/~yli15/Includes/AdvML20_Huimin.pdf
[15] “Tensorflow and Keras Model Deployment using Flask in Google Colab.” Available:
https://medium.datadriveninvestor.com/tensorflow-and-keras-model-deployment-using-flask-in-google-colab-b8e49d1d4af0
BIODATA
Disease Identification and Retinal Fundus Scan Correction Using Deep Learning

Kartik Sai¹, Sriharshitha Deepala¹, Hemant Kumar Rathore¹
¹Vellore Institute of Technology

Abstract

In this paper a deep learning application is presented which can show if the scan of a retinal fundus risks having a disease and, if so, the nature of the disease with a risk probability, along with a scan clarifier. Image processing and deep learning methods are used to achieve this result. The model is hosted on a website where the user can upload a scan of their retinal fundus in order to get a diagnosis and a corrected scan (if required). When the user uploads their scan onto the site, the deep learning model processes the image and outputs the disease risk and the kind of disease being risked in their case. This is achieved using deep learning techniques, including a Convolutional Neural Network (CNN) and a Variational Autoencoder (VAE) for disease type/risk identification and scan correction respectively.

Introduction

In today’s world, we face many difficulties in terms of eye care, including treatment, quality of prevention, vision rehabilitation services, and a scarcity of trained eye care experts. Early detection and diagnosis of ocular abnormalities can help with beforehand preparation for disease prevention and early treatment procedures. This project aims at the classification and prediction of diseases. With over 1 billion preventable cases, the ability to pre-emptively solve the issue is the best strategy. But this is not possible in certain places that lack proper eye care in either the technological or the technician department. To tackle this issue, we need a disease risk predictor that can tell us if the retina of the patient shows any signs or lead indicators of a predefined set of symptoms. This disease risk predictor and classifier can be implemented using the Retinal Fundus Multi-Disease Dataset (RFMiD). This dataset consists of 3200 retinal fundus images and the corresponding set of ground truth values in CSV format, which enables the implementation of the models.

Methodology

• The initial step for all the models is to load the dataset. The RFMiD dataset is loaded using a tensorflow.data pipeline (a data pipeline is a series of data processing steps).
• The RFMiD dataset is highly biased; to reduce the bias, data augmentation is performed to increase the density of low-count images.
• After loading the data and applying different types of pre-processing techniques, create a deep learning model to predict disease risk (unhealthy or healthy fundus scan), as shown in Figure I, by utilizing Inception V3, a Squeeze-and-Excitation block, and many other parameters.
• After successful creation of the disease prediction model, save the model weights in .h5 format and add multi-disease classification to it, as shown in Figure II.
• To handle noisy scans, create an image de-noiser using a Variational Autoencoder to denoise the images.

Figure I: Binary Classifier
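The classifier described in the bullets above can be sketched in Keras as follows. This is a minimal sketch, not the report’s exact configuration: the augmentation choices, SE reduction ratio, dropout rate, input resolution, and the file name `disease_risk.h5` are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def se_block(x, ratio=16):
    # Squeeze-and-Excitation: squeeze spatial dims, then learn per-channel weights
    ch = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)
    s = layers.Dense(ch // ratio, activation="relu")(s)
    s = layers.Dense(ch, activation="sigmoid")(s)
    s = layers.Reshape((1, 1, ch))(s)
    return layers.Multiply()([x, s])

def build_binary_classifier(input_shape=(224, 224, 3)):
    inputs = keras.Input(shape=input_shape)
    # on-the-fly augmentation to counter the dataset bias noted above
    x = layers.RandomFlip("horizontal")(inputs)
    x = layers.RandomRotation(0.1)(x)
    # Inception V3 backbone (weights=None keeps the sketch self-contained)
    backbone = keras.applications.InceptionV3(
        include_top=False, weights=None, input_shape=input_shape)
    x = backbone(x)
    x = se_block(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)
    # single sigmoid unit: healthy vs. unhealthy fundus scan
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_binary_classifier()
model.save("disease_risk.h5")  # saved in .h5 format, as in the report
```

The multi-disease classifier from the next bullet would replace the final sigmoid unit with one sigmoid output per disease label from the RFMiD ground-truth CSV.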
Conclusion

This project was a consequence of a problem survey conducted with local ophthalmologists. Complaints ranged from incorrect or slow diagnosis to a lack of the expertise or help required to properly arrive at an accurate decision about a disease or the risk of a disease. This creates a large problem in proper planning and patient safety, and could result in improper precautionary measures or, in some cases, no precautionary measures at all. This project aims to help solve that issue and also hopes to provide a sort of pre-diagnosis to any individual so that they can take the utmost care; it can be particularly useful in very remote places with a very steep patient-per-doctor ratio.
Software and Hardware Details: Free tier of Google Colab with 12 GB RAM, a 108 GB disk, and an Nvidia Tesla T4 GPU.
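The VAE de-noiser from the methodology can be sketched as a small convolutional VAE. The 64×64 resolution, latent size, and layer widths below are illustrative assumptions rather than the report’s architecture; training pairs noisy scans with their clean counterparts, as in the robust-VAE reference [14].

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 64  # assumed latent size

class Sampling(layers.Layer):
    # reparameterisation trick: z = mu + sigma * eps; KL term added as a layer loss
    def call(self, inputs):
        mu, log_var = inputs
        kl = -0.5 * tf.reduce_mean(1.0 + log_var - tf.square(mu) - tf.exp(log_var))
        self.add_loss(kl)
        eps = tf.random.normal(tf.shape(mu))
        return mu + tf.exp(0.5 * log_var) * eps

# encoder: (64, 64, 3) scan -> latent distribution parameters -> sampled z
enc_in = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(enc_in)
x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Flatten()(x)
mu = layers.Dense(latent_dim)(x)
log_var = layers.Dense(latent_dim)(x)
z = Sampling()([mu, log_var])

# decoder: z -> reconstructed (denoised) scan
x = layers.Dense(16 * 16 * 64, activation="relu")(z)
x = layers.Reshape((16, 16, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
recon = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)

vae = keras.Model(enc_in, recon)
# reconstruction loss; the KL term added inside Sampling is included automatically
vae.compile(optimizer="adam", loss="binary_crossentropy")
# training would pair noisy inputs with clean targets, e.g.:
# vae.fit(noisy_scans, clean_scans, epochs=30, batch_size=32)
```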