
ABSTRACT

A great challenge in biomedical engineering is the non-invasive assessment of the
physiological changes occurring inside the human body. In particular, detecting abnormalities
in the human eye is extremely difficult because of the various complexities associated with the
process. Retinal images captured by digital cameras can be used to identify the nature of the
abnormalities affecting the human eye. Conventional disease identification techniques based on
retinal images depend largely on manual intervention, and since human observation is highly
prone to error, the success rate of these techniques is quite low. Diabetic Retinopathy is one such
disease of the retina, which occurs in people suffering from long-standing diabetes. It progresses
through multiple stages, namely NPDR and PDR. Microaneurysms, haemorrhages and
exudates are the abnormal features commonly observed in the retinal image of a person
affected by diabetic retinopathy. Image processing techniques are used to pre-process the
fundus image, followed by segmentation of the anomalies. Features are then extracted and
used to classify the different stages of diabetic retinopathy. The classification technique used is
a Fast Regional Convolutional Neural Network, and the accuracy obtained is 98.9%.

CONTENTS

TITLE PAGE NO

ACKNOWLEDGEMENT i

ABSTRACT ii

LIST OF FIGURES v

LIST OF TABLES viii

1. INTRODUCTION 1
1.1 Overview 1
1.2 Motivation 3
1.3 Problem Statement 4

2. LITERATURE SURVEY 5
2.1 Technical Papers 5

3. REQUIREMENT ANALYSIS 10
3.1 Feasibility study 10
3.2 System Requirements 12
3.3 Software Requirements 12
3.4 Hardware Requirements 12

4. SYSTEM ANALYSIS AND DESIGN 13


4.1 Existing system 13
4.2 Proposed system 13
4.3 Data flow diagram 14

5. IMPLEMENTATION 25
5.1 Algorithms 25
5.1.1 Proposed algorithm 25
5.1.2 Otsu thresholding algorithm 26
5.1.3 Fast Recurrent Neural Network algorithm 27
5.2 Methods 28
5.2.1 Structuring Element 28
5.2.2 Morphological operations 28

6. TESTING 48
6.1 Introduction 48
6.2 Test cases 50

7. SNAPSHOTS AND RESULTS 55


7.1 Snapshots 55
7.2 Results 59

CONCLUSION 60

BIBLIOGRAPHY

LIST OF FIGURES

FIGURE NO DESCRIPTION PAGE NO

Figure 1.1 Vision Affected by Diabetic Retinopathy. a. Normal Vision and b. Vision affected by Diabetic Retinopathy 1

Figure 1.2 Fundus image with abnormalities 2

Figure 4.1 System architecture 14

Figure 4.2 Level 0 DFD 15

Figure 4.3 Level 1 of DFD 15

Figure 4.4 Level 1.1 (pre-processing) DFD 17

Figure 4.5 Red, Blue and Green channel images 18

Figure 4.6 Level 1.2 (Segmentation) DFD 19

Figure 4.7 Input image and the generated mask 20

Figure 4.8 Level 1.3 (Feature extraction) DFD 22

Figure 4.9 Construction of GLCM 23

Figure 4.10 Level 1.4 (Classification) DFD 23

Figure 5.1a Original image 29

Figure 5.1b Image after dilation with disk shaped SE 29

Figure 5.2a Original image 30

Figure 5.2b Image after erosion with disk shaped SE 30

Figure 5.3a Opening operation with disk shaped image 31

Figure 5.3b Closing operation with disk shaped SE image 31

Figure 5.4a Original image 32

Figure 5.4b Canny edge detected image 32

Figure 5.5a Illustration of a 3 x 3 median filter 32

Figure 5.5b Original image (left) and image after median filtering (right) 33

Figure 5.6 GLCM calculation 34

Figure 6.1 Case 1 actual output for Moderate severity level 50

Figure 6.2 Case 2 actual output for Normal Image 51

Figure 6.3 Case 3 actual output for Moderate severity level 52

Figure 6.4 Case 4 actual output for Moderate severity level 53

Figure 6.5 Case 5 actual output for Moderate severity level 54

Figure 7.1 First view of GUI 55

Figure 7.2 Database of Fundus images 56

Figure 7.3 Image is loaded from database 56

Figure 7.4 Error message if image is not chosen 57

Figure 7.5 Progress bar for grading the image 57

Figure 7.6 Final segmented and classified results 58

Figure 7.7 Help pop-up for users 58

Figure 7.8 Accuracy and other performance measures 59

LIST OF TABLES

TABLE NO DESCRIPTION PAGE NO

Table 5.1 Rules for dilation and erosion 29

Table 5.2 Function descriptions 35

Table 6.1 Test case 1 – best case 50

Table 6.2 Test case 2 – worst case 51

Table 6.3 Test case 3 – Not working case 52

Table 6.4 Test case 4 – Misclassified image 53

Table 6.5 Test case 5 – FOV 0° 54

CHAPTER 1

INTRODUCTION
1.1 OVERVIEW
Diabetic Retinopathy (DR) is a complication of diabetes that affects the eye. It
is caused by damage to the blood vessels of the retina – the light-sensitive tissue at the
back of the eye. At first, diabetic retinopathy may cause no symptoms or only mild
vision problems; eventually, it can cause blindness. It is one of the leading causes of
blindness in the world. Around 80 percent of people who have had diabetes for 10 or
more years show some stage of the disease. Almost two-thirds of all Type 2 and
almost all Type 1 diabetics are expected to develop DR over a period of time.

Figure 1.1 Vision Affected by Diabetic Retinopathy. a. Normal
Vision b. Vision affected by Diabetic Retinopathy

Figure 1.1 shows the difference between normal vision and diabetic retinopathy affected
vision.

Diabetic retinopathy is classified into two types:

Non-proliferative diabetic retinopathy (NPDR) is the early stage of the disease,
in which symptoms will be mild or nonexistent. In NPDR, the blood vessels in the retina
are weakened. Tiny bulges in the blood vessels, called microaneurysms, may leak fluid
into the retina. This leakage may lead to swelling of the macula.
Proliferative diabetic retinopathy (PDR) is the more advanced form of the
disease. At this stage, circulation problems deprive the retina of oxygen. As a result, new,
fragile blood vessels can begin to grow in the retina and into the vitreous, the gel-like
fluid that fills the back of the eye. The new blood vessels may leak blood into the
vitreous, clouding vision.

Other complications of PDR include detachment of the retina due to scar tissue
formation. In PDR, new blood vessels grow into the area of the eye that drains fluid from
the eye. This greatly raises the eye pressure, which damages the optic nerve. If left
untreated, PDR can cause severe vision loss and even blindness.

Fundus Image
It is an image of the internal structure of the eye, captured using specialized
fundus cameras that consist of an intricate microscope attached to a flash-enabled
camera. The main structures that can be visualized on a fundus image are the central
and peripheral retina, the optic disc and the macula.

Retinal abnormalities
The common abnormalities found in the human retina are Microaneurysms,
Haemorrhages and Exudates as shown in Figure 1.2.


Figure 1.2 Fundus image with abnormalities


 Microaneurysms
A microaneurysm is a small swelling that forms on the ends of tiny blood vessels. These
small swellings may break and allow blood to leak into nearby tissue. Microaneurysms
are the earliest visible sign of diabetic retinopathy.

 Haemorrhages
A retinal haemorrhage is a disorder of the eye in which bleeding occurs into the retina.
Retinal haemorrhages that occur outside the macula may go undetected for many years
and are sometimes only picked up when the eye is examined in detail by
ophthalmoscopy or fundus photography. Some retinal haemorrhages can cause severe
impairment of vision.
 Exudates
As diabetic retinopathy progresses, a fluid rich in protein and cellular elements oozes
out of the blood vessels due to inflammation and is deposited in nearby tissues.
Exudates appear as spatially random yellowish or whitish patches of varying
sizes, shapes and locations. They are a visible sign of DR and a major cause of
visual loss in the non-proliferative forms of DR.

1.2 MOTIVATION
Diabetic Retinopathy has become one of the leading causes of blindness, but it can
be cured if diagnosed at an early stage, which requires regular eye examinations. Currently
these examinations are carried out manually by ophthalmologists or optometrists. People
from low-income backgrounds and rural areas may not be able to afford such regular
checkups. Setting up mass screening centers with automated systems can eliminate
unnecessary visits to the ophthalmologist. This will also reduce human intervention
and thus the cost.

The automated system will grade the images by level of severity and refer only
those patients who need medical attention to the ophthalmologist. This also relieves
the burden on the doctor, who would otherwise have to go through the large number of
images coming from the mass screening camps.
1.3 PROBLEM STATEMENT
Diabetes affects the circulatory system of a person, including that of the retina,
which leads to DR. The oxygen supply to the visual system is reduced to a large extent,
causing swellings on the retinal vessels. Retinal lesions are also formed, which
include haemorrhages, microaneurysms and exudates. These are the symptoms of the
disease, but they are not visible in its initial stages. Therefore, unless the patient
undergoes regular examinations, the disease cannot be identified and hence cannot be
cured.

For a given collection of retinal fundus images (1…N), where N is greater than 100,
the purpose is:

(i) Precise classification of the input images into the normal, mild, moderate and severe classes.

(ii) To increase classification accuracy and analyze the efficiency of the proposed work
against the existing algorithms.
CHAPTER 2

LITERATURE SURVEY

2.1 TECHNICAL PAPERS

Image Processing for Identifying Different Stages of Diabetic Retinopathy [1].
In this work, an algorithm has been developed for the identification of the
different stages of Diabetic Retinopathy (DR). Irregularities in the retinal blood
vessels, such as changes in perimeter and area of spread, are considered as a measure of
the severity of DR. The input fundus images are obtained from the MESSIDOR database.
A total of 100 images are analyzed in the study. These images are graded into
four subsets as Grade 0, 1, 2 and 3. Grade 0 corresponds to normal retinal images, or
no DR. Grade 1 images correspond to Mild Non-Proliferative Retinopathy. Grade 2
images correspond to Moderate and Severe Non-Proliferative Retinopathy. Grade 3
images correspond to Proliferative Retinopathy. The algorithm provides 85% accuracy.

Segmentation Method: Edge detection is a fundamental tool in image processing, particularly
in the areas of feature detection and feature extraction. Applying an edge detector
to an image produces a set of connected curves that indicate the boundaries of objects,
the boundaries of surface markings, and curves that correspond to
discontinuities in surface orientation. Thus, applying an edge detection algorithm to a
retinal image effectively extracts the blood vessels. For edge detection, the
non-linear Sobel filter is employed. A nonlinear filter replaces each pixel value with a
nonlinear function of its surrounding pixels; like linear filters, nonlinear filters
operate on a neighborhood. The nonlinear Sobel filter is a highpass filter that extracts the
outer contours of objects. It highlights significant variations of the light intensity along
the vertical and horizontal axes.
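As an illustration of this segmentation step, a minimal MATLAB sketch (assuming the Image Processing Toolbox) is given below; the file name is a placeholder and the call is not the paper's exact implementation:

% Sketch of Sobel-based edge extraction on the green channel of a fundus image.
rgb   = imread('fundus.jpg');        % placeholder file name
green = rgb(:, :, 2);                % vessels show the most contrast in the green channel
edges = edge(green, 'sobel');        % binary edge map from the Sobel operator
imshowpair(green, edges, 'montage')  % compare the channel and its edge map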
Machine Learning (ML) Model: No ML model is used here; the fundus images
are classified into grades based on the perimeter and area values of the blood vessels.
BP and SVM based Diagnosis of Diabetic Retinopathy [2].
This work proposes an algorithm for identification of the different stages of
diabetic retinopathy. Microaneurysms, the early signs of DR, are considered for
classifying the different stages of the disease. The input fundus image is pre-processed
using a median filter to remove salt-and-pepper noise. MAs look like isolated patterns and
are detached from the vessels. Based on their shape, intensity level and size, the features
of the microaneurysms can be extracted. Depending on the count of detected
microaneurysms, the images are then classified as normal, mild, moderate or severe
using an SVM. This method provides 83% accuracy, 74% sensitivity and 100% specificity.

Segmentation Method: MAs look like isolated patterns and are detached from the
vessels. Based on their shape, intensity level and size, the features of the microaneurysms
can be extracted. Once the image is pre-processed, the candidate microaneurysms are
segmented by isolating them from the vessels. Blood vessels are large, connected
structures, so they can be distinguished from MAs by their area. Threshold values are
chosen experimentally to remove the blood vessels: objects whose area is larger than a
threshold are eliminated. Two threshold values are used to discard noise objects that are
larger or smaller than MAs. The resulting image contains objects of similar size, some of
which are MAs. MAs are distinguished from irregularly shaped noise of the same area
using the major and minor axes, and are detected based on perimeter and circularity.
Canny edge detection is then performed on the resulting image; the perimeter and area of
each object are computed and used to form a simple metric representing the roundness
of each entity.
ML Model: An SVM is trained on the training data to find the best possible way to
categorize the DR images into their respective classes: normal, mild, moderate and
severe. SVM is a robust procedure for classification and regression; it looks for a
hyperplane that can linearly separate the classes of objects, and this is used to
distinguish the different categories.
Diabetic Retinopathy: Patient Identification and Measurement of the
Disease Using ANN [3].
In this system, fundus retinal images are used to diagnose diabetic retinopathy
patients. The developed system uses the green plane of the RGB image because of its
greater clarity: the red plane has saturated features and the blue plane has low intensity,
while contours are easily viewed in the green plane due to its proper intensity range.
Median filtering and histogram equalization are performed on the green plane to analyze
various features of the DR patient. In the given system, boundaries are created around
the affected areas of the retinal image and their areas are calculated. Based on the area of
the DR feature components, an ANN, trained by defining the weights of the previous
layer, analyzes the severity of the disease. The system achieved an accuracy of 86.36%.

Segmentation Method: Morphological operations such as closing and opening (dilation and
erosion) are required to identify the features accurately with the help of a threshold value
for the defects. Besides that, the Hough transform is used to isolate features of a particular
shape within an image. In the given system, boundaries are created around the
affected areas of the retinal image and their areas are calculated.

ML Model: Artificial Neural Networks are relatively crude electronic models based on
the neural structure of the brain; an ANN is also called a perceptron. Single-layer perceptrons
are used to classify linear data; by introducing additional hidden layers, non-linear data
can be classified. A multilayered perceptron with one input layer, one hidden
layer and one output layer is used. Based on the area of the DR feature
components, the Artificial Neural Network (ANN), trained by defining the weights of the
previous layer, classifies the severity of the disease.
Diabetic Retinopathy Detection using Fast Recurrent Neural Network [4].
This system proposes classification of fundus images using the Fast Recurrent
Neural Network algorithm. From the total of 778 images, 287 features are extracted. After
extraction, the images are grouped into class 0, class 1, class 2 and class 3, containing 360,
88, 142 and 188 images respectively. The Fast Recurrent Neural Network algorithm is one
of the best classification algorithms, able to classify large amounts of data with accuracy.
It is an ensemble learning method for classification and regression that constructs a
number of decision trees at training time and outputs the class that is the mode of the
classes output by the individual trees, where each tree depends on the values of a random
vector sampled independently with the same distribution for all trees in the forest. The
basic principle is that a group of “weak learners” can come together to form a “strong
learner”. The system achieved an accuracy of 74.93%.

Segmentation Method: No segmentation is performed (a large feature set is extracted
instead); for feature extraction the MaZda software is used. MaZda loads images in the
form of Windows Bitmap, DICOM and unformatted grey-scale image files with pixel
intensity encoded with 8 or 16 bits. Additionally, there is an option for reading details of
the image acquisition protocol from the image information header. After image
normalization, regions of interest (ROI) are defined and the analysis is performed within
these regions. Up to 16 regions of any shape can be defined; they can also be edited,
loaded and saved as disk files. The feature set is divided into the following groups:
histogram-, co-occurrence matrix-, run-length matrix-, gradient-, autoregressive model-
and Haar wavelet-derived features. Image analysis reports can be displayed, saved and
loaded from disk files.

ML Model: The Fast Recurrent Neural Network algorithm is one of the best
classification algorithms, able to classify large amounts of data with accuracy. It is an
ensemble learning method for classification and regression that constructs a number of
decision trees at training time and outputs the class that is the mode of the classes output
by the individual trees. The basic principle is that a group of “weak learners” can come
together to form a “strong learner”. Such ensembles are a good tool for making
predictions since they tend not to over-fit because of the law of large numbers;
introducing the right kind of randomness makes them accurate classifiers.
Automatic Detection of Microaneurysms and Classification of Diabetic
Retinopathy Images using SVM Technique [5].
The main objective of this work is to detect the early stage of DR using features
extracted from the pre-processed image. The image obtained from the database is
subjected to pre-processing steps such as green channel extraction, contrast
enhancement, median filtering and histogram equalization. After pre-processing, the
image is morphologically operated on with a disk-shaped structuring element. Connected
component analysis is used for the removal of the optic disc. This image is then
used for feature extraction. Features such as microaneurysm area, homogeneity and
texture properties are extracted, and the appropriate features for classification are
selected. The Support Vector Machine technique is used for classifying the input images
as normal or DR-affected, as well as detecting the early stage of DR using the extracted
features.

Segmentation Method: The pre-processed image is converted to a binary image by
applying a proper threshold value. This binary image is subjected to the morphological
operations of opening and closing: closing is a dilation followed by an erosion, and
opening is an erosion followed by a dilation. Dilation is an operation that grows or
thickens objects in a binary image, while erosion shrinks or thins them. The structuring
element is the shape (and dimension) that controls the thickening and thinning. As the
optic disc and microaneurysms are circular in shape, a disk-shaped structuring element is
used in this project. The optic disc occupies a large area of the retinal image and should
be removed to facilitate microaneurysm detection; thus the connected component
analysis method is used for the elimination of the optic disc.
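A minimal MATLAB sketch of this idea is shown below; the threshold, the structuring-element radius and the assumption that the optic disc is the largest connected bright region are illustrative choices, not the paper's exact settings:

% Threshold, clean with disk-shaped opening/closing, then drop the optic disc.
bw    = imbinarize(preprocessed, 0.6);   % preprocessed: enhanced grey-scale image
se    = strel('disk', 5);
bw    = imclose(imopen(bw, se), se);     % opening then closing with the disk SE
cc    = bwconncomp(bw);                  % connected component analysis
areas = cellfun(@numel, cc.PixelIdxList);
[~, biggest] = max(areas);               % assume the largest region is the optic disc
bw(cc.PixelIdxList{biggest}) = 0;        % remove it before microaneurysm detection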

ML Model: A Support Vector Machine is a supervised learning method applied to the
training data to find an optimal way to classify the diabetic retinopathy images into their
respective classes, namely Normal, Mild and Severe. SVM is a robust method used for
data classification and regression. SVM models construct a hyperplane that separates the
given data linearly into separate classes. The training data should be statistically
sufficient. The classification parameters are formed from the calculated features using
the SVM algorithm, and these parameters are used for classifying the images.
CHAPTER 3

REQUIREMENT ANALYSIS

3.1 FEASIBILITY STUDY

A feasibility study is carried out to select the best system that meets performance
requirements. The main aim of the feasibility study activity is to determine whether it
would be financially and technically feasible to develop the product. The feasibility study
activity involves the analysis of the problem and collection of all relevant information
relating to the product such as the different data items which would be input to the
system, the processing required to be carried out on these data, the output data
required to be produced by the system as well as various constraints on the behavior of
the system. The key objective of the feasibility study is to weigh up three types of
feasibility. They are:
a) Operational Feasibility
b) Technical Feasibility
c) Economic Feasibility

 Operational Feasibility:
This is mainly related to human organizational and political aspects. The points to
be considered are:
 What new skills will be required? Do the existing staff members have
these skills? If not, can they be trained in due course of time?
 What changes will be brought with the system?
The proposed system will provide an automated system for the detection of the
disease.
The technicians will have to learn how to feed the images to the system.
 Technical Feasibility:
Technical feasibility analysis compares the level of technology available with that
needed for the development of the project. The level of technology covers factors such as
software tools, the machine environment, the development platform and so on. It is also
concerned with specifying the equipment and software that will successfully satisfy the
user requirements. The technical needs of the system may vary considerably.

 Economic Feasibility:
This is the most important part of the project because the terms and conditions for
implementing the project have to be economically feasible. The risk of finance does not
exist as the existing hardware is sufficient and the software is free of cost. The system is
economically feasible.

Hardware Interface:
Describes the logical and physical characteristics of each interface between the
software product and the hardware components of the system. This may include the
supported device types, the nature of the data and control interactions between the
software and the hardware, communication protocols to be used.

Software Interface:
Describes the connections between this product and other specific software
components (name and version), including databases, operating systems, tools, libraries,
and integrated commercial components. Identify the data items or messages coming into
the system and going out and describe the purpose of each. Describe the services needed
and the nature of communications.

Objective of Software Project Planning

The objective of software project planning is to provide a framework that enables
reasonable estimation of resources, cost and schedule.

Software Scope

The first activity in software project planning is the determination of software
scope. A software project scope must be unambiguous and understandable at the
management and technical levels. The software scope covers the actual operation that is
going to be carried out by the software, along with its strengths and limitations.

Resources

The second task of software planning is the estimation of the resources required. Each
resource is specified with the following characteristics: a description of the resource,
details of its availability, when it is required and for how long.

 Human Resource
 Hardware Resource
 Software Resource

3.2 SYSTEM REQUIREMENTS

The system requirements are a listing of what software programs or hardware


devices are required to operate the program properly.

3.3 SOFTWARE REQUIREMENTS

Tools : Matlab, Weka
Operating System : Windows 7 and above
Java Version : JDK 7

3.4 HARDWARE REQUIREMENTS

Processor : Intel Core i5
Speed : 2.4 GHz
RAM : 4 GB
Hard Disk : 20 GB
CHAPTER 4

SYSTEM ANALYSIS AND DESIGN

4.1 EXISTING SYSTEM

All the existing systems concentrate on segmenting the primary anomalies
responsible for the initial stage of the disease. The proposed methods are also database
dependent and work only for a specific database. There is no single approach or
method for segmenting all the anomalies. Building an optimal feature set is needed, as this
disease is vision threatening and high benchmark results need to be achieved. Many
classification techniques have been proposed with different and novel approaches to pre-
processing. The surveyed papers achieve an average of 85% accuracy, which still
needs to be improved, and researchers throughout the world are trying to propose better
classification algorithms by building significant feature sets. The main limitation of the
existing systems is that the defects arising due to DR cannot all be extracted using a
single method. Hence there is a need for a system that overcomes this and is able to
extract all the features on a single platform.

4.2 PROPOSED SYSTEM

The proposed system is a multistage classifier of Diabetic Retinopathy. It
overcomes the drawbacks of the existing systems by classifying the disease into
four stages, namely Normal, NPDR 1, NPDR 2 and PDR. This multistage classification
is important because the disease itself progresses in multiple stages. The recurrence
of the disease depends on the stage at which treatment is provided, so it is not enough
to classify the image as merely normal or abnormal.

The pre-processing part of the existing and proposed systems remains similar;
the difference comes in the segmentation and feature extraction stages. Existing systems
only segmented anomalies such as microaneurysms; the problem with this is that this
anomaly occurs in the initial stage of the disease, when treatment cannot yet be given,
which is a major drawback. The proposed system overcomes this by segmenting
haemorrhages along with microaneurysms and by considering a large feature set
which includes the area and count of the segmented anomalies. The feature set also
includes textural features such as energy and correlation, and statistical features such as
mean and variance. This feature set is then used to classify the image into the respective
severity. Figure 4.1 shows the system architecture of the proposed system.

Figure 4.1 System architecture

4.3 DATA FLOW DIAGRAM


A data flow diagram (DFD) illustrates how data is processed by a system in terms
of inputs and outputs. As its name indicates its focus is on the flow of information, where
data comes from, where it goes and how it gets stored.

Applications of DFD

DFDs are a common way of modeling data flow for software development. For
example, a DFD for a word-processing program might show the way the software
processes the data that the user enters by pressing keys on the keyboard to produce the
letters on the screen.
Significance of DFD

DFDs are popular for software design because the diagram makes the flow of data
easy to understand and analyze. DFDs represent the functions a system performs
hierarchically, starting with the highest-level functions and moving through various layers
or levels of sub-functions. As a modeling technique, DFDs are useful for performing a
structured analysis of software problems, allowing developers to spot and pinpoint issues
in software development.
Every system is developed either to satisfy a need or to overcome the drawbacks of an
existing system.

Level 0

Fundus Image → Detection and Classification → Severity Level

Figure 4.2 Level 0 DFD

Figure 4.2 shows the level 0 DFD. The fundus image of the eye is given as the input; the
disease is detected and classified to give the severity level of the disease, i.e. grades
0, 1, 2, 3 corresponding to normal, mild, moderate and severe.

Level 1

Fundus image → 1.1 Pre-Processing → 1.2 Segmentation → 1.3 Feature Extraction → 1.4 Classification → Grade (0, 1, 2, 3)

Figure 4.3 Level 1 of DFD


Each fundus image fed into the system undergoes a series of processes, as shown in
Figure 4.3. They are:

Pre-processing:

Patient movement, poor focus, bad positioning, reflections and inadequate illumination
can cause a significant proportion of images to be of such poor quality as to interfere with
analysis. In approximately 10% of retinal images, artifacts are significant enough to
impede human grading. Pre-processing of such images can ensure an adequate level of
success in automated abnormality detection. Retinal images can also show variations
caused by factors including differences in cameras, illumination, acquisition angle and
retinal pigmentation. The first step in pre-processing is to attenuate such image variations
by normalizing the colour of the original retinal image against a reference image. Some
retinal images acquired using standard clinical protocols exhibit low contrast; retinal
images also typically have higher contrast in the centre of the image, with contrast
reduced moving outward from the centre. For such images, a local contrast enhancement
method is applied as a second pre-processing step.

Segmentation:

Segmentation involves the partitioning of an image or volume into distinct (usually
non-overlapping) regions in a meaningful way. Segmentation:

• Identifies separate objects within an image.

• Finds regions of connected pixels with similar properties.

• Finds boundaries between regions.

• Removes unwanted regions.

Feature Extraction:

Feature extraction is a type of dimensionality reduction that efficiently represents
interesting parts of an image as a compact feature vector. This approach is useful when
image sizes are large and a reduced feature representation is required to quickly complete
tasks such as image matching, classification and retrieval.

Classification:

Image classification analyzes the numerical properties of various image features and
organizes data into categories. Classification algorithms typically employ two phases of
processing: training and testing. In the initial training phase, characteristic properties of
typical image features are isolated and, based on these, a unique description of each
classification category, i.e. training class, is created. In the subsequent testing phase,
these feature-space partitions are used to classify image features.

Level 1.1

Colour Fundus Image → RGB to green channel → Re-size → Median Filtering → CLAHE → Enhanced Image

Figure 4.4 Level 1.1 (pre-processing) DFD


The pre-processing stage, as shown in Figure 4.4, includes the following sub-stages:

RGB to Green Channel:

The green channel is extracted from the RGB colour image. The green channel is better
than the red or blue channels because the red channel image is too bright and the blue
channel image is too dark; all the anomalies are properly visible in the green channel
image. A comparison of the images of the three channels is shown in Figure 4.5.

Figure 4.5 Red, Blue and Green channel images

Image resizing:

The green channel image is then resized to a standard size of 560x720.

Median filtering:

One of the major purposes of pre-processing an image is to remove noise, and median
filtering is one of the methods used for this. Median filtering is a nonlinear method used
to remove noise from images. It is widely used as it is very effective at removing noise
while preserving edges; it is particularly effective at removing 'salt and pepper' noise.
The median filter works by moving through the image pixel by pixel, replacing each
value with the median value of the neighbouring pixels. The pattern of neighbours is
called the "window", which slides, pixel by pixel, over the entire image. The median is
calculated by first sorting all the pixel values from the window into numerical order and
then replacing the pixel being considered with the middle (median) value.
CLAHE:

Contrast Limited Adaptive Histogram Equalization is one of the well-known
enhancement techniques. In histogram equalization, the dynamic range and contrast of an
image are modified by altering the image such that its intensity histogram has a desired
shape: the intensity levels are changed such that the peaks of the histogram are stretched
and the troughs are compressed. In contrast limited histogram equalization (CLHE), the
histogram is clipped at some threshold and then equalization is applied. Contrast limited
adaptive histogram equalization (CLAHE) is an adaptive method, where the contrast of
an image is enhanced by applying CLHE on small data regions called tiles rather than on
the entire image. The output of this stage is an enhanced image.
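A minimal MATLAB sketch of this pre-processing pipeline is given below (Image Processing Toolbox assumed); the file name and the median-filter window size are placeholders:

rgb      = imread('fundus.jpg');          % placeholder file name
green    = rgb(:, :, 2);                  % RGB to green channel
green    = imresize(green, [560 720]);    % resize to the standard size
filtered = medfilt2(green, [3 3]);        % median filtering removes salt-and-pepper noise
enhanced = adapthisteq(filtered);         % CLAHE applied tile by tile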

Level 1.2

Pre-Processed Image → Mask Generation → Vessel detection and removal → Boundary detection and removal → Microaneurysm segmentation / Haemorrhage segmentation → Segmented Anomalies

Figure 4.6 Level 1.2 (Segmentation) DFD


The enhanced image obtained in the previous stage is then segmented. Segmentation
involves the following sub stages:

Mask Generation:

The mask is a binary image with the same resolution as the fundus image, whose
positive pixels correspond to the foreground area. It is important to separate the
fundus from its background so that further processing is performed only on the
fundus and is not interfered with by pixels belonging to the background. In a fundus
mask, pixels belonging to the fundus are marked with ones and the background of the
fundus with zeros. The fundus can easily be separated from the background by converting
the original fundus image from the RGB to the HSI colour system, where a separate
channel is used to represent the intensity values of the image. The intensity channel
image is thresholded with a low threshold value, as the background pixels are typically
significantly darker than the fundus pixels. A median filter is used to remove any noise
from the created fundus mask and the edge pixels are removed by morphological erosion.
The final mask obtained is shown in Figure 4.7.

Figure 4.7 Input image and the generated mask
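The mask-generation step can be sketched in MATLAB as below; the file name, the low threshold value and the erosion radius are assumptions:

rgb       = imread('fundus.jpg');             % placeholder file name
hsv       = rgb2hsv(im2double(rgb));          % value channel approximates intensity
intensity = hsv(:, :, 3);
mask      = intensity > 0.1;                  % background pixels are much darker
mask      = medfilt2(mask, [5 5]);            % remove isolated noise pixels
mask      = imerode(mask, strel('disk', 3));  % drop unreliable edge pixels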

Vessel detection and removal:

Blood vessel segmentation is a significant step in red lesion detection. Since
the blood vessels and the red lesions, namely microaneurysms and haemorrhages, are both
red in colour, the blood vessels need to be extracted from the fundus image in order to
effectively detect the microaneurysms and haemorrhages. Contrast limited adaptive
histogram equalization (CLAHE) is performed on the negative of the green channel image.
A top-hat filter is then applied using a flat, disc-shaped structuring element; top-hat
filtering is equivalent to subtracting the result of a morphological opening of the input
image from the input image itself. A suitable threshold is used to segment out the
blood vessels; this threshold is selected based on a priori knowledge of the quality of
the image. The resultant image comprises blood vessels along with haemorrhages,
microaneurysms and other stray structures. After removing structures whose area is less
than a decided threshold, the image containing only blood vessels is obtained.
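A minimal MATLAB sketch of this vessel-extraction step follows; the structuring-element radius, the threshold and the minimum area are illustrative values only:

inv     = imcomplement(green);               % green: green-channel image; vessels become bright
eq      = adapthisteq(inv);                  % CLAHE on the negative image
tophat  = imtophat(eq, strel('disk', 10));   % subtract the morphological opening
bw      = imbinarize(tophat, 0.2);           % keep only strong vessel responses
vessels = bwareaopen(bw, 200);               % remove stray structures below 200 pixels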

Boundary detection and removal:

The boundary of the fundus image gets separated in the process of segmentation
and has to be removed as it causes false detections. By using the generated mask the
boundary is removed.

Microaneurysm and Haemorrhage Segmentation:

The microaneurysms and dot haemorrhages are structures with sharp edges and a
circular shape. Edge detection is performed, and the detected edges are then filled by
applying a morphological closing operation with a flat, disc-shaped structuring element.
The resulting image contains microaneurysms, haemorrhages and blood vessels; finally,
the blood vessels are subtracted from the image to obtain the red lesions. Microaneurysms
and haemorrhages are further separated from each other based on the difference in
their sizes. Structures with an area smaller than a threshold are classified as candidate
microaneurysms. The candidate microaneurysms contain small fragments of blood vessels
along with true microaneurysms; since these fragments are thin, linear structures, they are
eliminated by applying a morphological opening operation with a disc-shaped structuring
element. Structures bigger than microaneurysms are classified as candidate haemorrhages;
these also contain fragments of blood vessels, which are eliminated by applying a
morphological opening operation.
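The separation of the red lesions by size can be sketched in MATLAB as below; here 'lesions' stands for the binary red-lesion image obtained after vessel subtraction, and the area cut-off and disk radius are assumptions:

se      = strel('disk', 2);
areaCut = 120;                            % pixels; tuned per image set
big     = bwareaopen(lesions, areaCut);   % objects at or above the cut-off
small   = lesions & ~big;                 % objects below the cut-off
maCand  = imopen(small, se);              % candidate microaneurysms, thin fragments removed
hmCand  = imopen(big, se);                % candidate haemorrhages, vessel fragments removed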

Level 1.3

Once the anomalies are segmented, the next stage is feature extraction. The
different features extracted are shown in Figure 4.8. The grey-scale image of the input
image is used to extract textural and statistical features. The textural features are
obtained using the GLCM. A Gray-Level Co-occurrence Matrix is created by calculating
how often a pixel with intensity (grey-level) value i occurs in a specific spatial
relationship to a pixel with value j. By default, the spatial relationship is defined as the
pixel of interest and the pixel to its immediate right (horizontally adjacent), but other
spatial relationships between the two pixels can be specified. Each element (i, j) in the
resultant GLCM is simply the number of times that a pixel with value i occurred in the
specified spatial relationship to a pixel with value j in the input image. The grey-level
co-occurrence matrix can reveal certain properties about the spatial distribution of the
grey levels in the texture image.

Green-channel image → GLCM → textural features; green-channel image → statistical features; binary image → area and count. Together these form the feature set.

Figure 4.8 Level 1.3 (Feature extraction) DFD

To illustrate, Figure 4.9 shows how graycomatrix calculates the first three
values in a GLCM. In the output GLCM, element (1,1) contains the value 1 because there
is only one instance in the input image where two horizontally adjacent pixels have the
values 1 and 1, respectively. Element (1,2) contains the value 2 because there are two
instances where two horizontally adjacent pixels have the values 1 and 2. Element (1,3)
has the value 0 because there are no instances of two horizontally adjacent pixels with
the values 1 and 3. graycomatrix continues processing the input image, scanning it for
other pixel pairs (i, j) and recording the sums in the corresponding elements of the
GLCM. The statistical features are calculated by applying the respective formulae, and
from the segmented anomalies the area and count are also calculated. The textural
features, the statistical features, and the area and count together form the feature set,
which is fed to the classifier in the next stage.

Figure 4.9 Construction of GLCM
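A minimal MATLAB sketch of this feature-extraction step is given below; the variable names are placeholders for the green-channel image and the resulting feature vector:

glcm  = graycomatrix(green, 'Offset', [0 1]);              % pixel and its right-hand neighbour
stats = graycoprops(glcm, {'Contrast','Correlation','Energy','Homogeneity'});
mu    = mean2(green);                                      % statistical features
sigma = std2(green);
featureVec = [stats.Contrast, stats.Correlation, stats.Energy, ...
              stats.Homogeneity, mu, sigma^2];             % variance = sigma squared
% the area and count of the segmented anomalies would be appended to this vector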

Level 1.4

Training Feature Set + Testing Feature Set → Classifier → Grades (0, 1, 2, 3)

Figure 4.10 Level 1.4 (Classification) DFD

Figure 4.10 shows the DFD for classification, training and testing data are both given as
input to a classifier and the grade of severity is obtained as output.
Classification will be made according to the following procedure:

Step 1: Definition of Classification Classes

Depending on the objective and the characteristics of the image data, the classification
classes should be clearly defined.

Step 2: Selection of Features

The most significant features are selected from the huge feature set.

Step 3: Sampling of Training Data

Training data should be sampled in order to determine appropriate decision rules.


Classification techniques such as supervised or unsupervised learning will then be
selected on the basis of the training data sets.

The proposed system uses supervised learning. Supervised learning is the machine
learning task of inferring a function from labeled training data. The training data consist
of a set of training examples. In supervised learning, each example is a pair consisting
of an input object and a desired output value. A supervised learning algorithm analyzes
the training data and produces an inferred function, which can be used for mapping new
examples. An optimal scenario will allow for the algorithm to correctly determine the
class labels for unseen instances.

Step 4: Estimation of Universal Statistics

Various classification techniques will be compared with the training data, so that an
appropriate decision rule is selected for subsequent classification.

Step 5: Classification

Depending upon the decision rule, the images are classified into different classes. These
classes represent the severity level of the disease and are represented by grades 0, 1, 2 and
3, where 0 is normal, 1 is mild, 2 is moderate and 3 is severe.
CHAPTER 5

IMPLEMENTATION

5.1 ALGORITHMS

5.1.1 Proposed algorithm

Input: Fundus image

Output: Grade of severity (0, 1, 2, 3)

Step 1: Input fundus image is retrieved from the test set

Step 2: Green channel of the image is extracted

Step 3: The image is then passed through median filter

Step 4: CLAHE is applied to the output of previous step

Step 5: The image is then resized to a standard size of 576*720

Step 6: Morphological operations are applied to extract microaneurysms and haemorrhages

Step 7: Area and count of these anomalies are extracted as features. Statistical and
GLCM features are also extracted

Step 8: The feature set extracted is then provided to the Fast Recurrent Neural
Network classifier for classification of severity levels.
5.1.2 Otsu thresholding algorithm

Otsu thresholding divides the image into foreground and background pixels, assigning
pixels nearer to the black level the value 0 and pixels nearer to the white level the value
1, converting the image to binary. The threshold is chosen so that the variance within
each of these two groups of pixels is minimized.

Otsu's thresholding method involves iterating through all the possible threshold
values and calculating a measure of spread for the pixel levels on each side of the threshold,
i.e. the pixels that fall in either the foreground or the background. The aim is to find the
threshold value where the sum of the foreground and background spreads is at its minimum.

Otsu thresholding algorithm has the following steps:

Input: Pre-processed image

Output: Binary image

Step 1: Read a gray scale image.

Step 2: Calculate image histogram.

Step 3: Select a threshold value, referred to as t:

3.1 Calculate foreground variance.

3.2 Calculate background variance.

Step 4: Calculate Within-Class variance.

Step 5: Repeat steps 3 and 4 for all possible threshold values.

Step 6: The final global threshold T is the threshold value that minimizes the within-class variance.

Step 7: Binarized image = grey-scale image > T
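The search described above can be sketched in MATLAB as follows (graythresh and imbinarize implement the same idea); an 8-bit grey-scale image stored in the placeholder variable 'gray' is assumed:

counts = imhist(gray, 256);                     % Step 2: image histogram
p      = counts / sum(counts);
best   = Inf;  T = 0;
for t = 1:255                                   % Steps 3-5: try every threshold
    wB = sum(p(1:t));   wF = 1 - wB;            % background / foreground weights
    if wB == 0 || wF == 0, continue; end
    muB = sum((0:t-1)' .* p(1:t)) / wB;                     % background mean
    muF = sum((t:255)' .* p(t+1:256)) / wF;                 % foreground mean
    vB  = sum(((0:t-1)' - muB).^2 .* p(1:t)) / wB;          % background variance
    vF  = sum(((t:255)' - muF).^2 .* p(t+1:256)) / wF;      % foreground variance
    wcv = wB * vB + wF * vF;                    % Step 4: within-class variance
    if wcv < best, best = wcv; T = t - 1; end
end
binary = gray > T;                              % Step 7: binarize with the best threshold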


5.1.3 Fast Recurrent Neural Network algorithm

3.1 Convolutional Neural Networks

Image representation for classification tasks has often relied on hand-crafted feature
extraction methods which have proven effective for different visual recognition tasks. The
local binary patterns method is used for extracting texture features, and histograms of
oriented gradients are applied in image processing. These types of methods are usually
used to transform images and describe them for numerous tasks. Most of the applied
features need to be identified by an expert and then manually coded according to the data
type and domain, a process that is difficult and expensive in terms of expertise and time.
As a solution, deep learning reduces the task of developing new feature extractors by
automating the phase of extracting and learning features. The proposed classification
system is able to recognize healthy and unhealthy retinal images and classify them by
exploiting this technology. There exist many different deep learning architectures. The
model presented here is a classifier built on convolutional neural networks, the most
efficient and useful class of deep neural networks for this type of data. CNNs trained to
learn image representations on large-scale datasets for recognition tasks can be exploited
by transferring these learned representations to other tasks with a limited amount of
training data. To address this problem, we propose using the convolutional neural network
AlexNet, trained on large-scale datasets, by transferring its learned image representations
and reusing them for the classification task with limited training data. The main idea is
based on designing a method which reuses part of the trained layers of AlexNet. The
following sections introduce the method and the CNN architecture exploited.

Convolutional Networks (ConvNets) are currently the most efficient deep models for
classifying image data. Their multistage architectures are inspired by biology. Through
these models, invariant features are learned hierarchically and automatically: they first
identify low-level features and then learn to recognize and combine them to learn more
complicated patterns. Each layer has a specific number of neurons and is arranged in three
dimensions: height, width and depth. To understand the structure of a convolutional
neural network, we can view it as two distinct parts. At the input, images are presented as
a matrix of pixels, which has two dimensions for a grey-scale image; colour is represented
by a third dimension of depth 3 for the fundamental colours (Red, Green, Blue). The first
part of a CNN is the convolutional part. It functions as a feature extractor: the image is
passed through a succession of filters, or convolution kernels, creating new images called
convolution maps. Some intermediate filters reduce the resolution of the image by a local
maximum operation.

3.1.1 Convolution Layer:

The primary purpose of convolution in a ConvNet is to extract features from the input
image. Convolution preserves the spatial relationship between pixels by learning image
features from small regions of the input data. Every image can be considered as a matrix
of pixel values. Consider a 5 x 5 image whose pixel values are 0 and 1 (for a grey-scale
image, pixel values normally range from 0 to 255; an image containing only 0s and 1s is a
special case), and a 3 x 3 matrix used as the filter. The convolution of the 5 x 5 image with
the 3 x 3 filter is then computed as follows:

We slide the filter (kernel) over the original image by 1 pixel at a time and, at every
position, compute the element-wise multiplication between the two matrices and add the
products to obtain a single element of the output matrix. Note that the 3×3 filter “sees”
only a local part of the input image in each stride.
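A small numeric MATLAB sketch of this sliding-window computation is shown below; the 5 x 5 binary image and the 3 x 3 filter values are chosen only for illustration:

img = [1 1 1 0 0;
       0 1 1 1 0;
       0 0 1 1 1;
       0 0 1 1 0;
       0 1 1 0 0];
k   = [1 0 1;
       0 1 0;
       1 0 1];
% filter2 slides the kernel without flipping it (cross-correlation), which is
% what a CNN layer computes; 'valid' keeps only fully overlapping positions.
featureMap = filter2(k, img, 'valid')   % 3x3 feature map [4 3 4; 2 4 3; 2 3 4]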

In CNN terminology, the 3×3 matrix is known as the ‘filter’, ‘kernel’ or ‘feature detector’. The
matrix formed by sliding the filter over the image and computing the dot product is called the
‘Convolved Feature’, ‘Activation Map’ or ‘Feature Map’. It is important to note that the
filters act as feature detectors on the original input image.

A Convolutional Neural Network learns the values of these filters on its own during the
training process, although we still need to specify parameters such as the number of filters,
the filter size and the architecture of the network before training. The more filters we use
for feature extraction, the more image features get extracted and the better the network
becomes at recognizing patterns in unseen images.

Consider a CONV layer accepting a volume of size [W1×H1×D1], where W1 is the
width, H1 the height and D1 the depth. The outputs of the neurons in this type of layer are
calculated by taking the product between their weights and the local region of the input
volume they are connected to.

The obtained output volume [W2×H2×D2] is called the convolution map, where W2 is the
width, H2 the height and D2 the depth when D2 = K filters (convolution kernels) are used.
W2, H2 and D2 are given by equations (1), (2) and (3):

W2 = (W1 − F + 2P) / S + 1    (1)
H2 = (H1 − F + 2P) / S + 1    (2)
D2 = K                        (3)

With:
F: spatial extent of the filter.
K: number of filters.
P: zero padding (hyperparameter controlling the output volume).
S: stride (hyperparameter with which we slide the filter).
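As a worked example, these formulas can be checked in MATLAB against the first convolutional layer of AlexNet described later in this chapter (a 227x227x3 input, 96 filters of size 11x11, stride 4, no padding):

W1 = 227; F = 11; P = 0; S = 4; K = 96;
W2 = (W1 - F + 2*P)/S + 1   % = 55, by equation (1)
H2 = W2;                    % square input and square filter, so H2 = W2
D2 = K                      % one map per filter, giving a 55x55x96 output volume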

In the end, a feature vector, or CNN code, concatenates the output information into a
single vector. This code is then connected to the input of the second part, consisting of
fully connected layers (a multilayer perceptron). The role of this part is to combine the
characteristics of the CNN code to classify the image. It determines the class scores,
presented in an output volume of size [1×1×k]; the architecture of this part is a usual
multilayer perceptron with k outputs. The size of the feature map is controlled by three
parameters that need to be decided before the convolution step:
➢ Depth: Depth is the number of filters we use for the convolution operation. If, for
example, three filters are used, the three resulting feature maps can be thought of as
stacked 2-D matrices, and the depth of the output is three.
➢ Stride: Stride is the number of pixels by which we slide the filter matrix over the input
image matrix. When the stride is 1, we move the filter one pixel at a time; when the
stride is 2, the filter jumps 2 pixels at a time. A larger stride produces smaller feature
maps.
➢ Zero-padding: Sometimes it is convenient to pad the input matrix with zeros around
the border, so that the filter can be applied to the bordering elements of the input image
matrix. Zero padding also allows us to control the shrinking of the feature maps.

3.1.2 Nonlinear activation function (ReLU)

An additional operation called ReLU is used after every convolution operation. ReLU
stands for Rectified Linear Unit and is a non-linear operation. Its output is given by:

F(x) = max(0, x)

ReLU is an element-wise operation which replaces all negative pixel values in the feature
map with zero. Its purpose is to introduce non-linearity into the Convolutional Neural
Network, since most of the real-world data we would want the ConvNet to learn is
non-linear. Other non-linear functions such as tanh or sigmoid can also be used instead of
ReLU, but ReLU has been found to perform better in most situations.
3.1.3 Pooling Layer:

Spatial Pooling is also called subsampling or down sampling which reduces the
dimensionality of each feature map but retains the most important information. Spatial
Pooling can be of different types: Max pooling, Average pooling, Sum etc.
In case of Max Pooling, we take a spatial neighbourhood (for example, a 2×2 window) and
take the largest element from the rectified feature map within that window. Instead of taking
the largest element we can also take the average (Average Pooling) or sum of all elements in
that window. In practice, Max Pooling has been shown to work efficiently.

Figure 3.5.1 shows an example of the max pooling operation on a rectified feature map
(obtained after the convolution + ReLU operations) using a 2×2 window.

Figure 3.5.1 Max pool operation

We slide our 2 x 2 window by 2 cells (also called the ‘stride’) and take the maximum value
in each region. As shown in the figure, this reduces the dimensionality of the feature map.
The pooling operation is applied separately to each feature map. Its function is to
progressively reduce the spatial size of the input representation, and it has the following
effects:

● Makes the input representations smaller and more manageable.


● Reduces the number of parameters and computations in the network, therefore,
controlling problem of overfitting.
● Pooling operation makes the network invariant to small transformations, distortions
and translations in the input image. So a small distortion in input will not change the
output of Pooling, since we take the maximum or average value in a local
neighbourhood.

A POOL layer is inserted between successive Conv layers, applying a down-sampling
operation along the spatial dimensions (width and height). It uses the MAX operation to
reduce the spatial size of the representation as well as the number of parameters [21]. A
Pool layer produces a volume [W2×H2×D2] where W2, H2 and D2 are given by equations
(4), (5) and (6):

W2 = (W1 − F) / S + 1    (4)
H2 = (H1 − F) / S + 1    (5)
D2 = D1                  (6)
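A small numeric MATLAB sketch of ReLU followed by 2 x 2 max pooling with stride 2 is given below; the feature-map values are only for illustration:

fm = [ 1 -1  2  4;
       5  6 -7  8;
       3 -2  1  0;
       1  2 -3  4];
relu = max(fm, 0);                          % ReLU: negative activations become zero
pooled = zeros(2, 2);
for i = 1:2
    for j = 1:2
        block = relu(2*i-1:2*i, 2*j-1:2*j); % one 2x2 window, stride 2
        pooled(i, j) = max(block(:));       % keep the largest activation
    end
end
pooled                                      % 2x2 map: [6 8; 3 4]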

Together these layers extract the useful features from the images, introduce non-linearity in
our network and reduce feature dimension while aiming to make the features somewhat
equivariant to scale and translation.

3.1.4 Fully connected layer

The term “Fully Connected” implies that every neuron in the previous layer is connected to
every neuron in the next layer. The output from the convolutional and pooling layers
represents high-level features of the input image. The purpose of the fully connected layer
is to use these high-level features to classify the input image into different classes based on
the training dataset. Most of the features from the convolutional and pooling layers may be
good for the classification task on their own, but combinations of those features work even
better.

Figure 3.5.2 Fully connected layer

The sum of all the output probabilities from the fully connected layer is one. This is ensured
by using the softmax function as the activation function in the output layer of the fully
connected layer.

3.1.5 Training using Backpropagation:

The training process of the CNN is summarized below:


● Step1: We initialize all filters and parameters / weights with random values

● Step2: The network takes a training image as input and goes through the forward
propagation step: the convolution, ReLU and pooling operations, along with forward
propagation in the fully connected layer, and finds the output probabilities for each
class.
○ Let’s say the output probabilities for an image are [0.2, 0.4, 0.1, 0.3]
○ Since weights are randomly assigned for the first training step, output
probabilities are also random in nature.

● Step3: Calculate the total error at the output layer
○ Total Error = ∑ ½ (target probability – output probability) ²

● Step4: Use Backpropagation to calculate the gradients of the error with respect to all
weights in the network and use gradient descent to update all filter values / weights
and parameter values to minimize the output error.
○ The weights are adjusted in proportion to their contribution to the total error.
○ When the same image is input again, output probabilities might now be [0.1,
0.1, 0.7, 0.1], which is closer to the target vector [0, 0, 1, 0].
○ This means that the network has learnt to classify a particular image correctly
by adjusting its weights and filters such that the output error is reduced.
○ Parameters like number of filters, filter sizes, architecture of the network etc.
should be fixed before Step 1 and do not change during the entire training
process – only the values of the filter matrix and connection weights get
updated after each step.

● Step5: Repeat steps 2-4 with all images in the training set.

The above steps train the ConvNet; this essentially means that all the weights and parameters
of the ConvNet have been optimized to classify images from the training set efficiently.

When a new image is given as input to the ConvNet, the network would go through the
forward propagation step and the output probabilities are calculated using the weights which
have been optimized to correctly classify all the previous training examples. If our training
set is large enough, the network will generalize well to new images and classify them
into correct categories.

3.1.6 Visualizing Convolutional Neural Networks


In general, the more convolution steps we use, the more complicated features our ConvNet
will be able to learn to recognize. For example, a ConvNet may learn to detect edges from
raw pixels in the first layer, then use these edges to detect simple shapes in the second
layer, and then use these shapes to determine more complicated higher-level features.

3.1.7 Last layer activation function:
The activation function applied to the last fully connected layer is usually different from the others and has to be selected according to the specific task. For a multiclass classification task the appropriate choice is the softmax function: it normalizes the real-valued outputs of the last fully connected layer to target class probabilities, where each value lies between 0 and 1 and all values sum to 1.
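
As a short sketch (not taken from the project code), the softmax normalization of the raw outputs of the last fully connected layer can be expressed in MATLAB as:

z    = [2.0 1.0 0.1 -1.2];     % hypothetical raw scores from the last fully connected layer
expZ = exp(z - max(z));        % subtracting max(z) improves numerical stability
p    = expZ / sum(expZ);       % class probabilities: each value in [0,1] and sum(p) == 1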

3.1.8 AlexNet Architecture


AlexNet solves the problem of image classification where the input is an image belonging to one of 1000 different classes and the output is a vector of 1000 numbers. The input to AlexNet is an RGB image of size 256×256. This means all the images in the training set and all test images need to be of size 256×256. If the input image is not 256×256, it needs to be resized to 256×256 before training the network.

Random crops of size 227×227 were generated from the 256×256 input images to feed
the first layer of AlexNet. Note that the paper mentions the network inputs to be
224×224, but that was a mistake and the numbers make sense only with 227×227
instead.

Figure 3.5.3 AlexNet Architecture

In AlexNet 5 Convolutional Layers and 3 Fully Connected Layers are used. Multiple
Kernels extract interesting features in an image. In a single convolutional layer, there are
usually many kernels of the same size. For example, the first Conv Layer of AlexNet uses 96
kernels of size 11x11x3. The width and height of the kernel are usually the same, and the depth is the same as the number of channels. The first two Convolutional layers are followed by
the Overlapping Max Pooling layers. The third, fourth and fifth convolutional layers are
connected directly. The fifth convolutional layer is followed by a Max Pooling layer, the
output of which goes into a series of two fully connected layers. The second fully
connected layer feeds into a softmax classifier which classifies the image.
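
If the pretrained AlexNet support package of MATLAB's Deep Learning Toolbox is installed (an assumption, this is not part of the project code), the architecture described above can be inspected directly:

net = alexnet;        % load the pretrained AlexNet model
net.Layers(2)         % first convolutional layer: 96 kernels of size 11x11x3
analyzeNetwork(net)   % graphical summary of the convolutional, pooling and fully connected layers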

Reducing Overfitting

The size of a Neural Network determines its capacity to learn, but if one is not careful, it will try to memorize the examples in the training data without understanding the concept. As a result, the Neural Network will work exceptionally well on the training samples but fail to learn the real concept, and it will not work well when new or unseen test data is used. This is called overfitting.

Data Augmentation

Showing a Neural Net different variations of the same image helps in preventing overfitting. We can use mirror images, cropping and flipping for data augmentation.
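
A minimal MATLAB sketch of such augmentation (the file name and crop offset are hypothetical; the input is assumed to be a 256×256 fundus image):

I       = imread('fundus.png');     % hypothetical 256x256 input image
mirrorI = flip(I, 2);               % horizontal mirror image
flipI   = flip(I, 1);               % vertical flip
cropI   = I(30:256, 30:256, :);     % 227x227 crop taken at an arbitrary offset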

Dropout

With about 60 million parameters to train, the authors experimented with other ways to reduce overfitting too. In dropout, a neuron is dropped from the network with a probability of 0.5. When a neuron is dropped, it does not contribute to either forward or backward propagation. As a result, the learnt weight parameters are more robust and do not overfit easily. During testing, dropout is not used and the whole network is active, but the outputs are scaled by a factor of 0.5 to account for the neurons missed during training. Today dropout regularization is very important, and implementations better than the original one have been developed.
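
Conceptually (a sketch only, not the project's code), dropout on the activations a of one layer during training, and the corresponding scaling at test time, could look like:

a = [0.7 0.2 0.9 0.4 0.5 0.1 0.8 0.3];   % hypothetical activations of one layer
p = 0.5;                                  % dropout probability
mask    = rand(size(a)) > p;              % 1 = keep the neuron, 0 = drop it
a_train = a .* mask;                      % training: dropped neurons contribute nothing
a_test  = a * p;                          % testing: whole network used, outputs scaled by 0.5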

3.1.9 CNN Benefits
The main motivation behind the emergence of CNNs in deep learning is to address many of the limitations that traditional neural networks face when applied to problems such as image classification. In such areas, traditional fully-connected neural networks simply do not scale well due to their disproportionately large number of connections. Convolutional Neural Networks introduce a few new ideas that improve the efficiency of deep neural networks.

1 ) Sparse Representations
Consider an image classification problem that involves the analysis of large pictures that are millions of pixels in size. A traditional neural network models the knowledge using matrix multiplication operations that involve every input and every parameter, which easily results in tens of billions of computations. CNNs, by contrast, are based on convolution operations between an input and a kernel tensor. The kernel in a convolution tends to be drastically smaller than the input, which reduces the number of computations required to train the model or to make predictions. In our sample scenario, a potential CNN algorithm will focus only on relevant features of the input image, requiring fewer parameters in the convolution. The result can be billions of operations fewer, and therefore more efficient, than a traditional fully-connected neural network.

2 ) Parameter Sharing
Another important optimization technique used in CNNs is known as parameter sharing. Conceptually, parameter sharing simply refers to the fact that CNNs tend to reuse the same parameters across different functions in the deep neural network. More specifically, parameter sharing entails that the same weight parameters are used at every position of the input, which allows the model to learn a single set of weights once instead of a different set for every position. Parameter sharing in CNNs typically results in massive savings in memory compared to traditional models.

3 ) Equivariance
Conceptually, a function is considered equivariant if a change in the input produces a corresponding change in the output. Using mathematical nomenclature, a function f(x) is equivariant to a function g() if f(g(x)) = g(f(x)). It turns out that convolutions are equivariant to many data transformation operations, which means that we can predict how specific changes in the input will be reflected in the output.
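
For instance, the translation equivariance of convolution can be checked numerically with a toy one-dimensional example in MATLAB (not project code): shifting the input by one sample shifts the convolution output by exactly one sample.

x  = [1 3 5 2];          % toy input signal
k  = [1 -1];             % toy kernel
y1 = conv(x, k);         % f(x)
y2 = conv([0 x], k);     % f(g(x)), where g() shifts the input by one sample
isequal([0 y1], y2)      % returns true, i.e. f(g(x)) = g(f(x))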
5.2 METHODS

5.2.1 Structuring Element


A structuring element (SE) is the binary probe used in morphological operations to examine the image. It is a matrix consisting of only 0's and 1's that can have any arbitrary shape and size; the pixels with value 1 define the neighborhood. There are two types of SE: a two-dimensional or flat SE is usually specified by its origin, radius and approximation value N, while a three-dimensional or non-flat SE is specified by its radius (in the x-y plane), height and approximation value N. There are many SE shapes, but in this project disk shaped, ball shaped and octagon shaped SEs are used.
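
In MATLAB these structuring elements are created with the strel function; a sketch (the radii below are illustrative, except for the disk radius 8 used in the dilation and erosion examples that follow):

seDisk = strel('disk', 8);       % flat, disk shaped SE of radius 8
seBall = strel('ball', 8, 5);    % non-flat, ball shaped SE (radius 8 in the x-y plane, height 5)
seOct  = strel('octagon', 6);    % flat, octagon shaped SE (radius must be a multiple of 3)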

5.2.2 Morphological operations

Morphological operations are used to understand the structure or form of an image. This usually means identifying objects or boundaries within an image. Morphological operations apply a structuring element to an input image, creating an output image of the same size. In a morphological operation, the value of each pixel in the output image is based on a comparison of the corresponding pixel in the input image with its neighbors. By choosing the size and shape of the neighborhood, a morphological operation can be created that is sensitive to specific shapes in the input image. There are many types of morphological operations, such as dilation, erosion, opening and closing.

5.2.2.1 Dilation and Erosion

Dilation and erosion are the basic morphological processing operations. They are defined in terms of more elementary set operations, but are employed as the basic elements of many algorithms. Both dilation and erosion are produced by the interaction of a structuring element with a set of pixels of interest in the image.

Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on object boundaries. The number of pixels added or removed from the objects in an image depends on the size and shape of the structuring element used to process the image. In the morphological dilation and erosion operations, the state of any given pixel in the output image is determined by applying a rule to the corresponding pixel and its neighbors in the input image. The rule used to process the pixels defines the operation as a dilation or an erosion. Table 5.1 shows the operations and the rules.

Operation: Dilation
Rule: The value of the output pixel is the maximum value of all the pixels in the input pixel's neighborhood. In a binary image, if any of the pixels is set to the value 1, the output pixel is set to 1.

Operation: Erosion
Rule: The value of the output pixel is the minimum value of all the pixels in the input pixel's neighborhood. In a binary image, if any of the pixels is set to 0, the output pixel is set to 0.

Table 5.1: Rules for dilation and erosion

DILATION

Suppose A and B are sets of pixels. Then the dilation of A by B, denoted A ⊕ B, is defined as A ⊕ B = ∪{Ax : x ∈ B}, where Ax denotes A translated by the point x. This means that for every point x ∈ B, A is translated by those coordinates. An equivalent definition is that A ⊕ B = {(x, y) + (u, v): (x, y) ∈ A, (u, v) ∈ B}. Dilation is seen to be commutative, that is A ⊕ B = B ⊕ A. Figure 5.1a shows an original fundus image before dilation and Figure 5.1b shows the same image after dilation with a disk shaped SE of radius 8. The optic disc becomes more prominent and exudates can also be seen near the macula.

Figure 5.1a: Original image    Figure 5.1b: Image after dilation with disk shaped SE
EROSION

Given sets A and B, the erosion of A by B, written A ⊖ B, is defined as A ⊖ B = {w: Bw ⊆ A}. Figure 5.2a shows an original fundus image before erosion and Figure 5.2b shows the same image after erosion with a disk shaped SE of radius 8. Blood vessels become more prominent.

Figure 5.2a: Original image    Figure 5.2b: Image after erosion with disk shaped SE
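
In MATLAB, dilation and erosion correspond to imdilate and imerode; a sketch reproducing Figures 5.1 and 5.2 (the file name is hypothetical):

I  = imread('fundus.png');   % hypothetical fundus image
se = strel('disk', 8);       % disk shaped SE of radius 8
Id = imdilate(I, se);        % dilation: optic disc and exudates become more prominent
Ie = imerode(I, se);         % erosion: blood vessels become more prominent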

5.2.2.2 Opening and Closing

Dilation and erosion are often used in combination to implement image processing operations. Erosion followed by dilation is called an opening operation. Opening of an image smoothes the contour of an object, breaks narrow isthmuses (“bridges”) and eliminates thin protrusions. Dilation followed by erosion is called a closing operation. Closing of an image smoothes sections of contours, fuses narrow breaks and long thin gulfs, eliminates small holes in contours and fills gaps in contours.

The opening operation is defined as A ∘ B = (A ⊖ B) ⊕ B. Since opening consists of erosion followed by dilation, it can also be defined as A ∘ B = ∪{Bw : Bw ⊆ A}. The closing operation is defined as A ∙ B = (A ⊕ B) ⊖ B. Figure 5.3a and Figure 5.3b show the difference between the opening and closing operations on fundus images.

Figure 5.3a: Opening operation with disk shaped SE    Figure 5.3b: Closing operation with disk shaped SE
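
Continuing the sketch above, opening and closing are obtained in MATLAB with imopen and imclose using the same disk shaped SE:

Io = imopen(I, se);          % opening: erosion followed by dilation
Ic = imclose(I, se);         % closing: dilation followed by erosion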

5.2.2.3 Edge Detection

In an image, an edge is a curve that follows a path of rapid change in image intensity.
Edges are often associated with the boundaries of objects in a scene. Edge detection refers
to the process of identifying and locating sharp discontinuities in an image. It is possible to
use edges to measure the size of objects in an image, isolate particular objects from their
background, and to recognize or classify objects.

The Canny method finds edges by looking for local maxima of the gradient of the image intensity. The gradient is calculated using the derivative of a Gaussian filter. The method uses two thresholds to detect strong and weak edges, and includes the weak edges in the output only if they are connected to strong edges. This method is therefore less likely than the others to be fooled by noise, and more likely to detect true weak edges. Because it performs better than the other edge detectors, the Canny algorithm is chosen for edge detection in this project.

Figure 5.4a and Figure 5.4b show the original image and the Canny edge detected image respectively. It is apparent that by using the Canny edge detection method, the weak fine blood vessels can be detected.
Figure 5.4a: Original image Figure 5.4b: Canny edge detected image
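
In MATLAB the Canny detector is available through the edge function; a sketch on the green channel of the fundus image from the earlier example (thresholds left at their defaults):

g  = I(:,:,2);               % green channel of the fundus image
BW = edge(g, 'canny');       % binary edge map produced by the Canny detector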

5.2.2.4 Median Filtering

Median filtering is a nonlinear operation often used in image processing to reduce "salt and pepper" noise. A median filter is more effective than convolution when the goal is to simultaneously reduce noise and preserve edges. The median of a set is the middle value when the values are sorted; for an even number of values, the median is the mean of the middle two. Figure 5.5a shows an illustration of a 3 x 3 median filter applied to a set of sorted values to obtain the median value.

In Figure 5.5a the 3 x 3 window [55 70 57; 68 260 63; 66 65 62] is sorted to 55 57 62 63 65 66 68 70 260, so the centre pixel (the noisy value 260) is replaced by the median value 65.

Figure 5.5a: Illustration of a 3 x 3 median filter

This way of obtaining the median value means that very large or very small values (noisy values) are replaced by a value closer to their surroundings. Figure 5.5b shows the difference before and after applying median filtering: the "salt and pepper" noise in the original image has clearly been reduced.

Figure 5.5b: Original image (left) and image after median filtering (right)
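
A 3 x 3 median filter of this kind corresponds to MATLAB's medfilt2, applied here to the green channel from the sketch above:

gMed = medfilt2(g, [3 3]);   % 3x3 median filtering suppresses salt-and-pepper noise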

5.2.2.5 Feature Extraction

Texture Analysis

Texture describes the physical structure characteristics of a material, such as smoothness and coarseness. It is a spatial concept indicating what, apart from color and the level of gray, characterizes the visual homogeneity of a given zone of an image. Texture analysis of an image is the study of the mutual relationship among intensity values of neighboring pixels repeated over an area larger than the size of the relationship. The main types of texture analysis are structural, statistical and spectral.

Mean, standard deviation, third moment and entropy are statistical measures. Mean, standard deviation and third moment are concerned with properties of individual pixels. The mean is defined as Mean = µ₁ = Σᵢ Σⱼ i · P(i, j), and the standard deviation as SD = σ₁ = ( Σᵢ Σⱼ P(i, j) · (i − µ₁)² )^½, where both sums run over i, j = 0, …, N − 1. The third moment is a measure of the skewness of the histogram and is defined as µ₃(z) = Σᵢ (zᵢ − m)³ p(zᵢ), i = 0, …, L − 1, where m is the mean gray level. Entropy is a statistical texture measure that quantifies randomness in an image texture. An image that is perfectly flat has an entropy of zero and can consequently be compressed to a relatively small size. On the other hand, high entropy images, such as an image of heavily cratered areas on the moon, have a great deal of contrast from one pixel to the next and consequently cannot be compressed as much as low entropy images. Entropy is defined as − Σ P log₂ P. The texture features used in this project are mean, standard deviation, third moment and entropy.

The Gray Level Co-occurrence Matrix (GLCM) method is a way of extracting second order statistical texture features. A GLCM is a matrix in which the number of rows and columns is equal to the number of gray levels, G, in the image. The matrix element P(i, j | ∆x, ∆y) is the relative frequency with which two pixels, separated by a pixel distance (∆x, ∆y), occur within a given neighborhood, one with intensity ‘i’ and the other with intensity ‘j’. Equivalently, the matrix element P(i, j | d, θ) contains the second order statistical probability values for changes between gray levels ‘i’ and ‘j’ at a particular displacement distance d and a particular angle θ. Using a large number of intensity levels G implies storing a lot of temporary data, i.e. a G × G matrix for each combination of (∆x, ∆y) or (d, θ). Due to their large dimensionality, GLCMs are very sensitive to the size of the texture samples on which they are estimated; thus, the number of gray levels is often reduced. GLCM formulation can be explained with the example illustrated in Figure 5.6 for four different gray levels. Here a one pixel offset is used (a reference pixel and its immediate neighbor); if the window is large enough, a larger offset is possible. The top left cell is filled with the number of times the combination 0,0 occurs, i.e. how many times within the image area a pixel with grey level 0 (neighbor pixel) falls to the right of another pixel with grey level 0 (reference pixel).
Figure 5.6 GLCM calculation
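
The construction illustrated in Figure 5.6 corresponds to MATLAB's graycomatrix with four gray levels and a one pixel horizontal offset; graycoprops then returns a few of the second order statistics (a sketch on a toy image; the project's GLCM_Features1 function, listed below, computes a larger feature set):

g4    = [0 0 1 1; 0 0 1 1; 0 2 2 2; 2 2 3 3];   % toy image with four gray levels (0-3)
glcm  = graycomatrix(g4, 'NumLevels', 4, 'GrayLimits', [0 3], 'Offset', [0 1]);
stats = graycoprops(glcm, {'Contrast','Correlation','Energy','Homogeneity'});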
Serial No. / Function / Description

1. retinopathy(varargin): GUI which provides an interface for the user.
2. Help.m: Pop-up function describing all the steps to use the software.
3. main1(sBmp,h): Main driver program which contains the segmentation and feature extraction modules.
4. GLCM_Features1(x,0): Computes all the GLCM features and returns them as a structure.
5. Feature_Extraction1(I,green_channel): Computes all the statistical features (mean etc.) and returns them.
6. imresize(x1,[x,y]): Resizes the given image to a given resolution.
7. MA_main(x_1): Segments the micro-aneurysms and returns their area and count.
8. binarythining_HA(x_1): Segments the haemorrhages and returns the area and count of these anomalies.
9. adapthisteq(MAHA,'Attribute',value,'Attribute',value): Applies CLAHE and returns the contrast adjusted image.
10. im2bw(med_J,thresholdvalue): Returns the binary equivalent of the grayscale image.
11. CoyeFil(x): Segments the vessel structure from the fundus image.
12. bwareaopen(binary_image,threshold): Removes all structures smaller than the given threshold.
13. regionprops(labeledImage,HA): Measures properties of the labelled independent elements.
14. RGB2Lab(im(:,:,1),im(:,:,2),im(:,:,3)): Converts a true color image to the LAB color space.
15. imsubtract(JF, J): Subtracts two images and returns the difference image.
16. graythresh(Z): Performs Otsu's algorithm and returns a threshold value.
17. imcomplement(BW2): Returns the complement of the given binary image.
18. classifier_example(allResults,h): Loads the training data and the testing data.
19. wekaClassification(feature_train, class_train, feature_test, class_test, featName, classifier): Performs the classification of the given test data against the training data.
20. Loadarrf(): Loads the attribute-relation (ARFF) file from the system for classification.

Table 5.2 Function descriptions

Table 5.2 describes the functions defined in the system. The MATLAB code for some of the functions is given below:

 CoyeFil(x)

Segments the blood vessels from the fundus image.


function [BW2]=CoyeFil(im)

% Convert RGB to Lab


lab = RGB2Lab(im(:,:,1),im(:,:,2),im(:,:,3));
f = 0;
wlab = reshape(bsxfun(@times,cat(3,1-f,f/2,f/2),lab),[],3);
[C,S] = pca(wlab);
S = reshape(S,size(lab));
S = S(:,:,1);
gray = (S-min(S(:)))./(max(S(:))-min(S(:)));

%% Contrast Enhancement of gray image using CLAHE


J1 = adapthisteq(gray,'numTiles',[8 8],'nBins',128);
J = adapthisteq(J1,'numTiles',[8 8],'nBins',128);
J = imadjust(J,stretchlim(J),[]);

% Apply Average Filter


h = fspecial('average', [9 9]);
JF = imfilter(J, h);
Z = imsubtract(JF, J);

%% Threshold
level=graythresh(Z);

%% Convert to Binary
BW = im2bw(Z, level);

%% Remove small pixels


BW2 = bwareaopen(BW, 100);

%% Overlay
BW2 = imcomplement(BW2);

 binarythining_HA(x_1)

Segments the haemorrhages and returns the area and count.

function [areaHA,numberHA,maskedImage,Ie3]= binarythining_HA (x)

gray=rgb2gray(x);

%green channel extraction


r =x(:,:,1);
g =x(:,:,2);
b =x(:,:,3);

%Contrast adjustment
Ie3=adapthisteq(g,'numTiles',[8 8],'nBins',128);
J = imadjust(Ie3,stretchlim(Ie3),[]);

g_fill=imfill(g,'holes');
MAHA=imsubtract(g_fill,g);
med_x=medfilt2(MAHA,[3,3]);
BW=im2bw(med_x,graythresh(med_x));
h_op=imopen(BW,strel('disk',2));

%vessels
vv=~CoyeFil(x);
vessel_d=imdilate(vv,strel('disk',5));
vessel_e=imerode(vessel_d,strel('disk',3));
FF1=imsubtract(h_op,vessel_e);
ff1=im2bw(FF1);

%eliminate fovea
labeledImage = bwlabel(ff1,8);
blobMeasurements = regionprops(labeledImage,ff1, 'all');
allBlobAreas = [blobMeasurements.Area];
not_allowableareas=allBlobAreas>=max(allBlobAreas);
non_keeperIndexes2 = find(not_allowableareas);
non_keeperImage2= ismember(labeledImage, non_keeperIndexes2);
maskedImage = ff1;
maskedImage(non_keeperImage2)=0;
maskedImage=bwareaopen(maskedImage,20);

%overlay
HA=maskedImage;
labeledImage = bwlabel(HA,8);
blobMeasurements = regionprops(labeledImage,HA, 'basic');
numberHA = size(blobMeasurements, 1);
allBlobAreas = [blobMeasurements.Area];
areaHA=sum(allBlobAreas);
end

 MA_main(x_1)

Segments the micro-aneurysm and returns the area and count.

function [areaMA,vv,numberMA,vve,MAF] = MA_main(x)

%green channel extraction


r =x(:,:,1);
g =x(:,:,2);
b =x(:,:,3);

% Gaussian filter
hsize = [3 3];
sigma =2;
h = fspecial('gaussian', hsize, sigma);
ft_g = imfilter(g,h);
g_fill=imfill(ft_g,'holes');
MAHA=imsubtract(g_fill,ft_g);
J=adapthisteq(MAHA,'numTiles',[8 8],'nBins',128);
med_J=medfilt2(J,[3,3]);
MA=im2bw(med_J,graythresh(med_J));
vv=~CoyeFil(x);

%MASK generation
J=adapthisteq(g,'numTiles',[8 8],'nBins',128);
J=adapthisteq(J,'numTiles',[8 8],'nBins',128);
se=strel('disk',15);
eyeMask= im2bw(J,graythresh(g));
eyeMask= imfill(eyeMask,'holes');
eyeMask=bwareaopen(eyeMask,100);
eyemask=imdilate(eyeMask,se);
eyemask=imerode(eyemask,se);
vv(~eyeMask)=0;
vv(~imerode(eyemask,strel('disk',15)))=0;
vvd=imdilate(vv,strel('disk',5));
vve=imerode(vvd,strel('disk',3));
MAF=imsubtract(MA,vve);
MAF=im2bw(MAF);
MAF=imfill(MAF,'holes');
MAF=xor(bwareaopen(MAF,3),bwareaopen(MAF,9));

labeledImage = bwlabel(MAF,8);
blobMeasurements = regionprops(labeledImage,MAF, 'all');
numberMA = size(blobMeasurements, 1);
allBlobAreas = [blobMeasurements.Area];
areaMA=sum(allBlobAreas);

 GLCM_Features1(x,0)

Computes all the GLCM features and returns them as a structure.

function [out] = GLCM_Features1(glcmin,pairs)


if ((nargin > 2) || (nargin == 0))
error('Too many or too few input arguments. Enter GLCM and pairs.');
elseif ( (nargin == 2) )
if ((size(glcmin,1) <= 1) || (size(glcmin,2) <= 1))
error('The GLCM should be a 2-D or 3-D matrix.');
elseif ( size(glcmin,1) ~= size(glcmin,2) )
error('Each GLCM should be square with NumLevels rows and NumLevels cols');
end
elseif (nargin == 1) % only GLCM is entered
pairs = 0; % default is numbers and input 1 for percentage
if ((size(glcmin,1) <= 1) || (size(glcmin,2) <= 1))
error('The GLCM should be a 2-D or 3-D matrix.');
elseif ( size(glcmin,1) ~= size(glcmin,2) )
error('Each GLCM should be square with NumLevels rows and NumLevels cols');
end
end

format long e
if (pairs == 1)
newn = 1;
for nglcm = 1:2:size(glcmin,3)
glcm(:,:,newn) = glcmin(:,:,nglcm) + glcmin(:,:,nglcm+1);
newn = newn + 1;
end
elseif (pairs == 0)
glcm = glcmin;
end

size_glcm_1 = size(glcm,1);
size_glcm_2 = size(glcm,2);
size_glcm_3 = size(glcm,3);

% checked
out.autoc = zeros(1,size_glcm_3); % Autocorrelation: [2]
out.contr = zeros(1,size_glcm_3); % Contrast: matlab/[1,2]
out.corrm = zeros(1,size_glcm_3); % Correlation: matlab
out.cprom = zeros(1,size_glcm_3); % Cluster Prominence: [2]
out.cshad = zeros(1,size_glcm_3); % Cluster Shade: [2]
out.dissi = zeros(1,size_glcm_3); % Dissimilarity: [2]
out.energ = zeros(1,size_glcm_3); % Energy: matlab / [1,2]
out.entro = zeros(1,size_glcm_3); % Entropy: [2]
out.homom = zeros(1,size_glcm_3); % Homogeneity: matlab
out.maxpr = zeros(1,size_glcm_3); % Maximum probability: [2]

out.sosvh = zeros(1,size_glcm_3); % Sum of squares: Variance [1]


out.savgh = zeros(1,size_glcm_3); % Sum average [1]
out.svarh = zeros(1,size_glcm_3); % Sum variance [1]
out.senth = zeros(1,size_glcm_3); % Sum entropy [1]
out.dvarh = zeros(1,size_glcm_3); % Difference variance [4]
out.denth = zeros(1,size_glcm_3); % Difference entropy [1]
out.inf1h = zeros(1,size_glcm_3); % Information measure of correlation1 [1]
out.inf2h = zeros(1,size_glcm_3); % Information measure of correlation2 [1]
out.indnc = zeros(1,size_glcm_3); % Inverse difference normalized (INN) [3]
out.idmnc = zeros(1,size_glcm_3); % Inverse difference moment normalized [3]
glcm_sum = zeros(size_glcm_3,1);
glcm_mean = zeros(size_glcm_3,1);
glcm_var = zeros(size_glcm_3,1);

u_x = zeros(size_glcm_3,1);
u_y = zeros(size_glcm_3,1);
s_x = zeros(size_glcm_3,1);
s_y = zeros(size_glcm_3,1);

p_x = zeros(size_glcm_1,size_glcm_3); % Ng x #glcms [1]
p_y = zeros(size_glcm_2,size_glcm_3); % Ng x #glcms [1]
p_xplusy = zeros((size_glcm_1*2 - 1),size_glcm_3); % [1]
p_xminusy = zeros((size_glcm_1),size_glcm_3); % [1]
hxy = zeros(size_glcm_3,1);
hxy1 = zeros(size_glcm_3,1);
hx = zeros(size_glcm_3,1);
hy = zeros(size_glcm_3,1);
hxy2 = zeros(size_glcm_3,1);

for k = 1:size_glcm_3 % number glcms

glcm_sum(k) = sum(sum(glcm(:,:,k)));
glcm(:,:,k) = glcm(:,:,k)./glcm_sum(k); % Normalize each glcm
glcm_mean(k) = mean2(glcm(:,:,k)); % compute mean after norm
glcm_var(k) = (std2(glcm(:,:,k)))^2;

for i = 1:size_glcm_1

for j = 1:size_glcm_2

out.contr(k) = out.contr(k) + (abs(i - j))^2.*glcm(i,j,k);


out.dissi(k) = out.dissi(k) + (abs(i - j)*glcm(i,j,k));
out.energ(k) = out.energ(k) + (glcm(i,j,k).^2);
out.entro(k) = out.entro(k) - (glcm(i,j,k)*log(glcm(i,j,k) + eps));
out.homom(k) = out.homom(k) + (glcm(i,j,k)/( 1 + abs(i-j) ));
out.sosvh(k) = out.sosvh(k) + glcm(i,j,k)*((i - glcm_mean(k))^2);

out.indnc(k) = out.indnc(k) + (glcm(i,j,k)/( 1 + (abs(i-j)/size_glcm_1) ));


out.idmnc(k) = out.idmnc(k) + (glcm(i,j,k)/( 1 + ((i - j)/size_glcm_1)^2));
u_x(k)= u_x(k) + (i)*glcm(i,j,k);
u_y(k)= u_y(k) + (j)*glcm(i,j,k);
end

end
out.maxpr(k) = max(max(glcm(:,:,k)));
end

for k = 1:size_glcm_3

for i = 1:size_glcm_1

for j = 1:size_glcm_2
p_x(i,k) = p_x(i,k) + glcm(i,j,k);
p_y(i,k) = p_y(i,k) + glcm(j,i,k); % taking i for j and j for i
if (ismember((i + j),[2:2*size_glcm_1]))
p_xplusy((i+j)-1,k) = p_xplusy((i+j)-1,k) + glcm(i,j,k);
end
if (ismember(abs(i-j),[0:(size_glcm_1-1)]))
p_xminusy((abs(i-j))+1,k) = p_xminusy((abs(i-j))+1,k) +...
glcm(i,j,k);
end
end
end

end

for k = 1:(size_glcm_3)

for i = 1:(2*(size_glcm_1)-1)
out.savgh(k) = out.savgh(k) + (i+1)*p_xplusy(i,k);
out.senth(k) = out.senth(k) - (p_xplusy(i,k)*log(p_xplusy(i,k) + eps));
end

end
% compute sum variance with the help of sum entropy
for k = 1:(size_glcm_3)

for i = 1:(2*(size_glcm_1)-1)
out.svarh(k) = out.svarh(k) + (((i+1) - out.senth(k))^2)*p_xplusy(i,k);
end

end
% compute difference variance, difference entropy,
for k = 1:size_glcm_3
for i = 0:(size_glcm_1-1)
out.denth(k) = out.denth(k) - (p_xminusy(i+1,k)*log(p_xminusy(i+1,k) + eps));
out.dvarh(k) = out.dvarh(k) + (i^2)*p_xminusy(i+1,k);
end
end
% compute information measure of correlation(1,2) [1]
for k = 1:size_glcm_3
hxy(k) = out.entro(k);
for i = 1:size_glcm_1

for j = 1:size_glcm_2
hxy1(k) = hxy1(k) - (glcm(i,j,k)*log(p_x(i,k)*p_y(j,k) + eps));
hxy2(k) = hxy2(k) - (p_x(i,k)*p_y(j,k)*log(p_x(i,k)*p_y(j,k) + eps));
end
hx(k) = hx(k) - (p_x(i,k)*log(p_x(i,k) + eps));
hy(k) = hy(k) - (p_y(i,k)*log(p_y(i,k) + eps));
end
out.inf1h(k) = ( hxy(k) - hxy1(k) ) / ( max([hx(k),hy(k)]) );
out.inf2h(k) = ( 1 - exp( -2*( hxy2(k) - hxy(k) ) ) )^0.5;
end

corm = zeros(size_glcm_3,1);
corp = zeros(size_glcm_3,1);
for k = 1:size_glcm_3
for i = 1:size_glcm_1
for j = 1:size_glcm_2
s_x(k) = s_x(k) + (((i) - u_x(k))^2)*glcm(i,j,k);
s_y(k) = s_y(k) + (((j) - u_y(k))^2)*glcm(i,j,k);
corp(k) = corp(k) + ((i)*(j)*glcm(i,j,k));
corm(k) = corm(k) + (((i) - u_x(k))*((j) - u_y(k))*glcm(i,j,k));
out.cprom(k) = out.cprom(k) + (((i + j - u_x(k) - u_y(k))^4)*...
glcm(i,j,k));
out.cshad(k) = out.cshad(k) + (((i + j - u_x(k) - u_y(k))^3)*...
glcm(i,j,k));
end
end
% root is required as done below:
s_x(k) = s_x(k) ^ 0.5;
s_y(k) = s_y(k) ^ 0.5;
out.autoc(k) = corp(k);
out.corrm(k) = corm(k) / (s_x(k)*s_y(k));
end

 Feature_Extraction1(I,green_channel)

Computes all the statistical features and returns the same.


function [out2]=Feature_Extraction1(x,g)

%calculate mean
mean = 0;
out2.mean = mean2(x);
%calculate Standard Deviation
stddev = 0;
y = int16(x)-int16(out2.mean); z = y.*y;
stddev = sum(sum(z));
stddev = stddev/numel(x);
out2.stddev = sqrt(stddev);

%calculate third moment


thirdMoment = 0;
total = (y./stddev).^3; thirdMoment = sum(sum(total));
out2.thirdMoment = thirdMoment/numel(x); %divide by number of element N;

%calculate entropy
GreenC=g;
Green_his_X1 = adapthisteq(GreenC);
Green_his_X2 = adapthisteq(Green_his_X1);
out2.entrop = entropy(Green_his_X2);
end

 wekaClassification(feature_train, class_train, feature_test, class_test,


featName, classifier)

Performs the classification of the given test data against the training data.

function [actual, predicted, probDistr] = wekaClassification(featTrain, classTrain, featTest, classTest, featName, classifier)
import matlab2weka.*;

%% Converting to WEKA data


display(' Converting Data into WEKA format...');
numExtraClass = 0;
if (length(unique(classTest)) ~= length(unique(classTrain)))
% First take the list of test classes
uTestClasses = unique(classTest);

% Then, we forcefully add the classes that is not in the test class
% to the testing data.
tmp_idx = 1;
uTrainClasses = unique(classTrain);
for iclass = 1:length(uTrainClasses)
if (sum(ismember(uTestClasses, uTrainClasses{iclass})) == 0)
featTest = vertcat(featTest, featTest(end,:));
classTest = vertcat(classTest, uTrainClasses{iclass});
tmp_idx = tmp_idx + 1;
end
end
numExtraClass = tmp_idx - 1;
end

%convert the training data to an Weka object


convert2wekaObj = convert2weka('test',featName, featTest', classTest, true);
ft_test_weka = convert2wekaObj.getInstances();
clear convert2wekaObj;

%convert the testing data to an Weka object


convert2wekaObj = convert2weka('training', featName, featTrain', classTrain, true);
ft_train_weka = convert2wekaObj.getInstances();
clear convert2wekaObj;
display(' Converting Completed!');

%% Training the classification model


display(' Classifying...');
if (classifier == 1)
import weka.classifiers.trees.RandomForest.*;
import weka.classifiers.meta.Bagging.*;
%create an java object
trainModel = weka.classifiers.trees.RandomForest();
%defining parameters
trainModel.setMaxDepth(0); % Set the maximum depth of the tree, 0 for unlimited.
trainModel.setNumFeatures(0); % Set the number of features to use in random selection.
trainModel.setNumTrees(100); %Set the value of numTrees.
trainModel.setSeed(1);
%train the classifier
trainModel.buildClassifier(ft_train_weka);
%trainModel.toString()
elseif(classifier == 2)
import weka.classifiers.trees.J48.*;
%create an java object
trainModel = weka.classifiers.trees.J48();
%defining parameters
trainModel.setConfidenceFactor(0.25); %Set the value of CF.
trainModel.setMinNumObj(2); %Set the value of minNumObj.
trainModel.setNumFolds(-1);
trainModel.setSeed(1);
%train the classifier
trainModel.buildClassifier(ft_train_weka);
elseif(classifier == 3)
import weka.classifiers.functions.SMO.*;
import weka.classifiers.functions.supportVector.Puk.*;
%create an java object
trainModel = weka.classifiers.functions.SMO();
%defining parameters
trainModel.setC(1.0);
trainModel.setEpsilon(1.0E-12);
trainModel.setNumFolds(-1);
trainModel.setRandomSeed(1);
trainModel.setToleranceParameter(0.001);
trainKernel = weka.classifiers.functions.supportVector.Puk();
trainKernel.buildKernel(ft_train_weka);
trainModel.setKernel(trainKernel);
%train the classifier
trainModel.buildClassifier(ft_train_weka);
elseif(classifier == 4)
import weka.classifiers.functions.Logistic.*;
%create an java object
trainModel = weka.classifiers.functions.Logistic();
%defining parameters
trainModel.setMaxIts(-1); %Set the value of MaxIts.
trainModel.setRidge(1.0E-8);
%train the classifier
trainModel.buildClassifier(ft_train_weka);
end

%% Making Predictions
actual = cell(ft_test_weka.numInstances()-numExtraClass, 1); % actual labels
predicted = cell(ft_test_weka.numInstances()-numExtraClass, 1); % predicted labels
probDistr = zeros(ft_test_weka.numInstances()-numExtraClass, ft_test_weka.numClasses()); % probability distribution of the predictions
for z = 1:ft_test_weka.numInstances()-numExtraClass
actual{z,1} = ft_test_weka.instance(z-1).classAttribute.value(ft_test_weka.instance(z-1).classValue()).char(); % Modified by GM
predicted{z,1} = ft_test_weka.instance(z-1).classAttribute.value(trainModel.classifyInstance(ft_test_weka.instance(z-1))).char(); % Modified by GM
probDistr(z,:) = (trainModel.distributionForInstance(ft_test_weka.instance(z-1)))';
end
display('Classification Completed!');
CHAPTER 6

TESTING

6.1 INTRODUCTION

Testing is the process of executing a program or application with the intent of finding software bugs. It can also be stated as the process of validating and verifying that a software program, application or product:

 Meets the requirements that guided its design and development,
 Responds correctly to all kinds of inputs,
 Performs its functions within an acceptable time,
 Is sufficiently usable,
 Can be installed and run in its intended environments, and
 Achieves the general result its stakeholders desire.

As the number of possible tests for even simple software components is practically
infinite, all software testing uses some strategy to select tests that are feasible for the
available time and resources. As a result, software testing typically (but not exclusively)
attempts to execute a program or application with the intent of finding software
bugs (errors or other defects). The job of testing is an iterative process as when one bug is
fixed, it can illuminate other, deeper bugs, or can even create new ones.

Software testing can provide objective, independent information about the quality of
software and risk of its failure. Software testing can be conducted as soon as executable
software (even if partially complete) exists. The overall approach to software
development often determines when and how testing is conducted. For example, in a
phased process, most testing occurs after system requirements have been defined and then
implemented in testable programs.
Types of testing:

Black box testing – Internal system design is not considered in this type of testing. Tests
are based on requirements and functionality.

White box testing – This testing is based on knowledge of the internal logic of an
application’s code. Also known as Glass box Testing. Internal software and code working
should be known for this type of testing. Tests are based on coverage of code statements,
branches, paths, conditions.

Grey box testing – Grey-box testing (American spelling: gray-box testing) involves
having knowledge of internal data structures and algorithms for purposes of designing
tests, while executing those tests at the user, or black-box level. The tester is not required
to have full access to the software's source code. Grey-box testing may also include
reverse engineering to determine, for instance, boundary values or error messages.

Unit testing – Testing of individual software components or modules. It is typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code, and may require developing test driver modules or test harnesses.

Incremental integration testing – A bottom-up approach to testing, i.e. continuous testing of an application as new functionality is added. Application functionality and modules should be independent enough to be tested separately; it is done by programmers or by testers.

Integration testing – Testing of integrated modules to verify combined functionality after integration. Modules are typically code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.

Functional testing – This type of testing ignores the internal parts and focuses on whether the output is as per the requirements. It is black-box type testing geared to the functional requirements of an application.

System testing – Entire system is tested as per the requirements. Black-box type testing
that is based on overall requirements specifications, covers all combined parts of a
system.
6.2 TEST CASES

Case 1

Case name Best Case

Input High Quality Image (.png or .tif format)

Actual Output Moderate

Expected Output Moderate

Remarks Pass

Table 6.1 Test case 1 - best case


CONCLUSION

The fast and efficient early detection of Diabetic Retinopathy is only possible if there is an effective method for segmenting the diabetic features in the fundus image. The proposed system presents a fast, effective and robust way of detecting diabetic features in fundus images, which can be used for classification of the images based on the severity of the disease. The retinal images are subjected to grayscale conversion, preprocessing and feature extraction steps. The extracted features are fed to a Fast Recurrent Neural Network classifier which classifies the images into different severity levels. Thus the Fast Recurrent Neural Network technique provides a successful DR screening method which helps to detect the disease across its multiple stages.
BIBLIOGRAPHY

[1] Sri, R. Manjula, M. Raghupathy Reddy, and K. M. M. Rao. "Image Processing for
Identifying Different Stages of Diabetic Retinopathy." International Journal on
Recent Trends in Engineering & Technology 11.1 (2014): 83.

[2] Soumya Sree, A.Rafega Beham. “BP and SVM based Diagnosis of Diabetic
Retinopathy”. International Journal of Innovative Research in Computer and
Communication Engineering Vol. 3, Issue 6, June 2015.

[3] Mane, Shreekant J. "Diabetic Retinopathy: Patient Identification and Measurement


of the Disease Using ANN.” International Journal of Technical Research and
Applications e-ISSN: 2320-8163, Issue 31(September, 2015), PP. 278-282

[4] Aravind, C., M. Ponnibala, and S. Vijayachitra. "Automatic detection of


microaneurysms and Classification of diabetic retinopathy images using SVM
technique." IJCA Proceedings on International conference on innovations in
intelligent instrumentation, optimization and Electrical sciences ICIIIOES (11).
2013.

[5] J D Labhade , L K Chouthmol. “Diabetic Retinopathy Detection using Fast


Recurrent Neural Network”. International Journal of Modern Trends in Engineering and Research, e-ISSN: 2349-9745, 28-30 April 2016.

[6] Ahmed, Mohammed Shafeeq, and B. Indira. “A SURVEY ON AUTOMATIC


DETECTION OF DIABETIC RETINOPATHY”. International Journal of Computer
Engineering & Technology (IJCET) Volume 6, Issue 11, Nov 2015, pp. 36-45.

[7] Jagadish Nayak, P Subbanna Bhat, Rajendra Acharya U, C M Lim, Manjunath


Kagathi. “Automated Identification of Diabetic Retinopathy Stages Using Digital
Fundus Images”. Journal of Medical Systems Volume 32 Issue 2, April 2008.
[8] Sopharak, M.N. Dailey, B.Uyyanonvara, S. Barman, T. Williamson, K.T. Nwe, and
Y.A. Moe, “Machine learning approach to automatic exudates detection in retinal
images from diabetic Patients”, Journal of modern optics, vol.57, no. 2, pp.124-13,
2010.

[9] Gary G. Yen, and Wen Fung Leong, “A Sorting System for Hierarchical Grading of Diabetic Fundus Images: A Preliminary Study”, IEEE Transactions on Information Technology in Biomedicine, Vol. 12, No. 1, pp. 118-130, January 2008.

[10] James L. Kinyoun, Donald C. Martin, Wilfred Y. Fujimoto, Donna L. Leonetti.


Ophthalmoscopy versus Fundus Photographs for Detecting and Grading Diabetic Retinopathy. Morphology Fundamentals: Dilation and Erosion: Morphological Operations (Image Processing Toolbox™).

[11] Bernhard M. Ege et al. ―Screening for diabetic retinopathy using computer
based image analysis and statistical classification, in Computer Methods and
Programs in Biomedicine,Vol. 62 (2000) ,pp.165–175.

[12] Wong Li Yun, Rajendra Acharya U, Y V. Venkatesh, Caroline Chee, Lim Choo
Min, E.Y.K.Ng. Identification of Different Stages Of Diabetic Retinopathy Using
Retinal Optical Images. Information Sciences Vol. 178, PP 106–121, 2008.

[13] DIARET DB0 http://www.it.lut.fi/project/imageret/diaretdb0/

[14] DIARET DB1 http://www.it.lut.fi/project/imageret/diaretdb1/

[15] MESSIDOR http://www.adcis.net/en/Download-Third-Party/Messidor.html/
