Main Project Report 2017-18
BACHELOR OF TECHNOLOGY
in
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
Supervisor
Dr. T. Prabhakar
Associate Professor, Dept. of ECE.
(Accredited by NBA, NAAC with ‘A’ Grade & ISO 9001:2008 Certified Institution)
CERTIFICATE
This is to certify that the thesis entitled TEXTURE FEATURE EXTRACTION AND CLASSIFICATION OF ALZHEIMER'S DISEASE AND NORMAL USING MR IMAGES, submitted by KUMAR, bearing Reg. No: 14341A04H3, Reg. No: 15345A0425 and Reg. No: 15345A0426 respectively, has been carried out in partial fulfilment of the requirements for the award of the degree at GMRIT, Rajam, affiliated to JNTUK, KAKINADA, and is a record of bona fide work carried out by them under my guidance and supervision. The results embodied in this report have not been submitted to any other University or Institute for the award of any degree.
ii
ACKNOWLEDGEMENT
It gives us immense pleasure to express our deep sense of gratitude to our guide Dr. T. Prabhakar, Associate Professor, Department of Electronics and Communication Engineering, for his wholehearted and valuable guidance throughout the report. Without his sustained and sincere effort, this report would not have taken this shape. He encouraged and helped us to overcome the various difficulties that we faced at different stages of our project.
We would like to sincerely thank Dr. M. V. Nageswara Rao, HOD, Electronics and
Communication Engineering, for providing all the necessary facilities that led to the
successful completion of our project.
We would like to thank our respected Dr. M. Sekar, Dean-R&D Coordinator, for providing support and a stimulating environment in which the project has been developed.
We would like to take this opportunity to thank our beloved Vice-Principal Dr. J. Raja Murugadoss, for providing great support to us in completing our project and for giving us the opportunity of doing the project.
We would like to thank all the faculty members and the non-teaching staff of the
Department of Electronics and Communication Engineering for their direct or indirect support in the completion of this project.
Finally, we would like to thank all of our friends and family members for their
continuous help and encouragement.
iv
TABLE OF CONTENTS
ACKNOWLEDGEMENTS iii
ABSTRACT iv
LIST OF TABLES vii
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
1. INTRODUCTION 1
1.1. ALZHEIMER'S DISEASE (AD) 1
1.2. STATISTICS OF ALZHEIMER’S 3
1.3. HOW ALZHEIMER’S AFFECTS THE BRAIN 4
1.4. STAGES OF THE DISEASE 6
1.4.1. Early-stage Alzheimer’s 7
1.4.2. Middle-stage Alzheimer’s 7
1.4.3. Late-stage Alzheimer’s 7
1.5. DETECTION OF ALZHEIMER’S 7
1.6. BRAIN IMAGING OR NEUROIMAGING 8
1.7. MAGNETIC RESONANCE IMAGING 8
1.7.1. Working of MRI 10
1.8. WORK FLOW 11
1.9. BASIC MRI SCANS 12
1.9.1. Role of structural MRI in Alzheimer's disease 13
1.9.2. Reason for Choosing MRI 14
1.10. DIGITAL IMAGE PROCESSING 15
1.11. INTRODUCTION TO MATLAB 15
1.11.1. Images as Matrices 16
2. LITERATURE REVIEW 18
3. GLCM FEATURE EXTRACTION OF MR IMAGES 25
3.1. TEXTURE FEATURE 25
3.1.1. Texture Feature Extraction 25
v
3.2. CO-OCCURRENCE MATRICES 26
3.3. GRAY LEVEL CO-OCCURRENCE MATRIX 26
3.3.1. Algorithm 27
3.4. HARALICK TEXTURE FEATURES 28
4. AD AND NORMAL MR IMAGE CLASSIFICATION 33
4.1. DIGITAL IMAGE CLASSIFICATION 33
4.1.1. Spectral differentiation 33
4.1.2. Radiometric differentiation 34
4.1.3. Spatial differentiation 34
4.1.4. Supervised classification 35
4.1.5. Unsupervised classification 35
4.2. K-NN CLASSIFICATION 37
4.2.1. How to choose the value of K 38
4.3. KNN PROS. AND CONS 38
5. RESULT AND DISCUSSION 40
5.1. GLCM FEATURE’S MEAN AND STANDARD DEVIATION 40
5.2. GLCM FEATURE’S OF NORMAL AND AD 41
5.3. PERFORMANCE OF K-NN CLASSIFIER 48
5.4. MATLAB OUTPUT 48
6. CONCLUSION 49
REFERENCE 50
vi
LIST OF TABLES
vii
LIST OF FIGURES
1.1 Plaques and tangles tend to spread through the cortex as Alzheimer's progresses. 5
viii
LIST OF ABBREVIATIONS
AD : Alzheimer’s Disease.
TP : True Positive.
TN : True Negative.
FP : False Positive.
FN : False Negative.
N : Normal.
MMSE : Mini-Mental State Examination.
GRE : Gradient Echo.
GLCM : Gray Level Co-occurrence Matrix.
MRI : Magnetic Resonance Imaging.
K-NN : K-Nearest Neighbor.
CT : Computed Tomography.
MPRAGE : Magnetization Prepared Rapid Acquisition Gradient Echo.
PCA : Principal Component Analysis.
ix
Texture Feature Extraction and Classification of Alzheimer’s Disease and Normal using MR
images
CHAPTER – 1
INTRODUCTION
general idea of whether the person is aware of his problem or feels that nothing is wrong; whether he knows the date, the time and where he is; and whether he can remember a short list of words, follow instructions and do simple calculations.
The Mini-Mental State Examination (MMSE) is one of the tests most commonly used to
assess mental function. In the MMSE, a health professional asks the patient a series of
questions designed to test a range of everyday mental skills. The maximum MMSE score is
30 points. A score of 20 - 24 suggests mild dementia, 13 - 20 suggests moderate dementia,
and less than 12 indicates severe dementia. On average, the MMSE score of a person with
Alzheimer’s declines by 2 - 4 points each year. All these tests require the patient’s
cooperation and the doctor’s skill. New imaging technologies have revolutionized the
understanding of the structure and function of the live brain. Currently, a standard medical
workup for Alzheimer’s disease often includes structural imaging with Magnetic Resonance
Image (MRI)[3] [9] or, less frequently, Computed Tomography (CT). These images are used
primarily to detect tumours, evidence of small or large strokes, and damage from severe head
trauma or a build-up of fluid. Researchers are intensively investigating whether the use of MRI
and other imaging methods may be expanded to play a more direct role in diagnosing
Alzheimer’s. Many research findings have shown that the brains of people with Alzheimer’s
shrink significantly as the disease worsens. Research has also shown that shrinkage in specific
brain regions, particularly the hippocampus, may be an early sign of Alzheimer’s. The
hippocampus is a major component of the brains of humans and other mammals. It belongs to
the limbic system and plays important roles in long-term memory and spatial navigation.
However, scientists have not yet agreed upon standardized values that would establish the
significance of a specific amount of shrinkage in any person at a single point in time. Due to
the deposition of amyloid-beta and tau on the hippocampus of the AD patients, the texture
features can be utilized to identify the AD.
The research work undertaken is focused on developing a biomarker to detect the
Alzheimer’s disease from Hippocampus MRI texture features, which is automatic, accurate
and fast. [1]
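The MMSE scoring bands quoted earlier (maximum 30; 20 - 24 mild, 13 - 20 moderate, below 12 severe) can be sketched as a simple lookup. The quoted ranges overlap at their edges, so the boundary handling below is an assumption of this sketch:

```python
# Sketch of the MMSE severity bands quoted in the text (max score 30).
# The quoted ranges overlap (a score of 20 appears in two bands), so
# the exact band edges chosen here are an assumption.

def mmse_severity(score):
    """Map an MMSE score (0-30) to the dementia band quoted in the text."""
    if score > 24:
        return "no dementia indicated"
    if score >= 20:
        return "mild dementia"
    if score >= 13:
        return "moderate dementia"
    return "severe dementia"

print(mmse_severity(22))  # mild dementia
```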
Dementia: Dementia is a general term for the loss of memory and other cognitive abilities
serious enough to interfere with daily life.
Alzheimer's has no current cure, but treatments for symptoms are available and research
continues. Although current Alzheimer's treatments cannot stop Alzheimer's from progressing,
they can temporarily slow the worsening of dementia symptoms and improve quality of life for
those with Alzheimer's and their caregivers. Today, there is a worldwide effort under way to find
better ways to treat the disease, delay its onset, and prevent it from developing. [1]
Alzheimer’s disease. Two abnormal structures called plaques and tangles are prime suspects
in damaging and killing nerve cells. Plaques are deposits of a protein fragment called beta-
amyloid (BAY-tuh AM-uh-loyd) that build up in the spaces between nerve cells. Tangles are
twisted fibers of another protein called tau (rhymes with “wow”) that build up inside cells.
Though autopsy studies show that most people develop some plaques and tangles as they age,
those with Alzheimer’s tend to develop far more and in a predictable pattern, beginning in the
areas important for memory before spreading to other regions.
Fig 1.1 Plaques and tangles tend to spread through the cortex as Alzheimer's progresses.
Scientists do not know exactly what role plaques and tangles play in Alzheimer's disease.
As shown in fig 1.1 plaques and tangles spread throughout the brain. Most experts believe
they somehow play a critical role in blocking communication among nerve cells and
disrupting processes that cells need to survive. It's the destruction and death of nerve cells that
causes memory failure, personality changes, problems carrying out daily activities and other
symptoms of Alzheimer's disease. [1]
Different types of dementia are associated with particular types of brain cell damage in
particular regions of the brain. For example, in Alzheimer's disease, high levels of certain
proteins inside and outside brain cells make it hard for brain cells to stay healthy; as shown in fig 1.2, the neurons get damaged. The brain region called the hippocampus is the center of
learning and memory in the brain, and the brain cells in this region are often the first to be
damaged. That's why memory loss is often one of the earliest symptoms of Alzheimer's.
While most changes in the brain that cause dementia are permanent and worsen over time,
thinking and memory problems caused by the following conditions may improve when the
condition is treated or addressed:
1) Depression
2) Medication side effects
3) Excess use of alcohol
4) Thyroid problems
5) Vitamin deficiencies
Alzheimer’s disease typically progresses slowly in three general stages: early, middle and
late (sometimes referred to as mild, moderate and severe in a medical context). Since
Alzheimer’s affects people in different ways, each person may experience symptoms or
progress through the stages differently.
Biomarkers such as amyloid-beta and tau levels in cerebrospinal fluid, and brain changes detectable by imaging, are also being studied. Recent research suggests that these indicators may change at different stages of the disease process.
lower-energy spin-down state. A hydrogen dipole thus has two possible spins, one high and one low: in the low-spin state the dipole and the field are parallel, and in the high-spin state they are anti-parallel. The nuclei release the difference in energy as a photon, and the released photons are
detected by the scanner as an electromagnetic signal, similar to radio waves. As a result of
conservation of energy, the resonant frequency also dictates the frequency of the released
photons. The photons released when the field is removed have energy and therefore a
frequency which depends on the energy absorbed while the field was active. It is this
relationship between field-strength and frequency that allows the use of nuclear magnetic
resonance for imaging. An image can be constructed because the protons in different tissues
return to their equilibrium state at different rates, which is a difference that can be detected.
Five different tissue variables — spin density, T1 and T2 relaxation times and flow and
spectral shifts can be used to construct images. By changing the parameters on the scanner,
this effect is used to create contrast between different types of body tissue or between other
properties, as in fMRI and diffusion MRI. The 3D position from which photons were released
is learned by applying additional fields during the scan. This is done by passing electric
currents through specially-wound solenoids, known as gradient coils. These fields make the
magnetic field strength vary depending on the position within the patient, which in turn makes
the frequency of released photons dependent on their original position in a predictable
manner, and the original locations can be mathematically recovered from the resulting signal
by the use of inverse Fourier transform. Contrast agents may be injected intravenously to
enhance the appearance of blood vessels, tumours or inflammation. Contrast agents may also
be directly injected into a joint in the case of arthrograms, MRI images of joints. Unlike CT,
MRI uses no ionizing radiation and is generally a very safe procedure. Nonetheless the strong
magnetic fields and radio pulses can affect metal implants, including cochlear implants and
cardiac pacemakers. In the case of cochlear implants, the US FDA has approved some
implants for MRI compatibility. In the case of cardiac pacemakers, the results can sometimes
be lethal, so patients with such implants are generally not eligible for MRI. MRI is used to
image every part of the body, and is particularly useful for tissues with many hydrogen nuclei
and little density contrast, such as the brain, muscle, connective tissue and most tumours. An
advantage of MRI is its ability to produce images in axial, coronal, sagittal and multiple
oblique planes with equal ease. MRI scans give the best soft tissue contrast of all the imaging
modalities. With advances in scanning speed and spatial resolution, and improvements in
computer 3D algorithms and hardware, MRI has become an important tool in neuroradiology.
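The field-strength-to-frequency relationship above is what makes Fourier inversion usable for imaging. As a toy numerical illustration only (not an MRI reconstruction pipeline), an inverse discrete Fourier transform recovers a signal exactly from its frequency-domain representation:

```python
# Toy illustration: a signal transformed to the frequency domain and
# back by a naive 1-D DFT pair is recovered exactly. In MRI the same
# principle, in 2-D/3-D, recovers spatial positions from frequencies.

import cmath

def dft1(xs, inverse=False):
    """Naive 1-D (inverse) discrete Fourier transform for small lists."""
    n = len(xs)
    sign = 1 if inverse else -1
    out = [sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t, x in enumerate(xs)) for k in range(n)]
    if inverse:
        out = [v / n for v in out]  # standard 1/N scaling on the inverse
    return out

signal = [1.0, 2.0, 3.0, 4.0]
spectrum = dft1(signal)
recovered = dft1(spectrum, inverse=True)
print([round(v.real, 6) for v in recovered])  # [1.0, 2.0, 3.0, 4.0]
```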
MRI scanners are particularly well suited to image the non-bony parts or soft tissues of
the body. They differ from computed tomography (CT), in that they do not use the damaging
ionizing radiation of x-rays. The brain, spinal cord and nerves, as well as muscles, ligaments,
and tendons are seen much more clearly with MRI than with regular x-rays and CT; for this
reason MRI is often used to image knee and shoulder injuries. In the brain, MRI can
differentiate between white matter and grey matter and can also be used to diagnose
aneurysms and tumors. Because MRI does not use x-rays or other radiation, it is the imaging
modality of choice when frequent imaging is required for diagnosis or therapy, especially in
the brain. However, MRI is more expensive than x-ray imaging or CT scanning. One kind of
specialized MRI is functional Magnetic Resonance Imaging (fMRI.) This is used to observe
brain structures and determine which areas of the brain “activate” (consume more oxygen)
during various cognitive tasks. It is used to advance the understanding of brain organization
and offers a potential new standard for assessing neurological status and neurosurgical risk.
As shown in fig 1.4, the input images are MR images and the image properties are:
We used 30 AD and 30 Normal MR images. We extracted 14 texture features from each image using the GLCM texture extraction algorithm, and then used a K-NN classifier to characterize the AD and NORMAL MRI images. The 14 features of the 30 AD images and the 30 NORMAL images were fed to the classifier as training and test sets, each taken by the classifier as a 14 x 30 matrix. The K-NN classifier calculates a distance function on these features and classifies AD and NORMAL on the basis of this distance value.
GLCM Feature Extraction → Extracted features fed to k-NN → Performance evaluation
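The classification step of this workflow can be sketched in pure Python. This is an illustrative sketch only: the tiny two-element feature vectors below stand in for the 14 GLCM features per image, and the numbers are made up.

```python
# Minimal k-NN sketch of the workflow above: each training vector would,
# in the report's setting, hold the 14 Haralick features of one MR image,
# labeled AD or NORMAL. A test vector is labeled by majority vote among
# its k nearest training vectors. Feature values here are placeholders.

import math
from collections import Counter

def knn_classify(train, labels, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    dists = sorted((math.dist(t, x), lbl) for t, lbl in zip(train, labels))
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-feature vectors, e.g. [contrast, entropy] per image.
train = [[0.2, 1.1], [0.3, 1.0], [0.9, 2.0], [1.0, 2.2]]
labels = ["NORMAL", "NORMAL", "AD", "AD"]
print(knn_classify(train, labels, [0.95, 2.1], k=3))  # AD
```

With k = 3 the query is labeled by its three nearest neighbors; performance evaluation would then compare predicted and true labels over the held-out test set.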
commonly run clinical scan. The T1 weighting can be increased (improving contrast) with the use of an inversion pulse, as in an MPRAGE sequence. Due to the short repetition time (TR), this scan can be run very fast, allowing the collection of high-resolution 3D datasets. A T1-reducing gadolinium contrast agent is also commonly used, with a T1 scan being collected before and after administration of the contrast agent to compare the difference. In the brain, T1-weighted scans provide good gray matter/white matter contrast; in other words, T1-weighted images highlight fat deposition.
T2-weighted MRI - T2-weighted scans are another basic type. Like the T1-weighted scan, fat is differentiated from water, but in this case fat shows darker and water lighter. For example, in the case of cerebral and spinal study, the CSF (cerebrospinal fluid) will be lighter in T2-weighted images. These scans are therefore particularly well suited to imaging edema, with long TE and long TR. Because the spin echo sequence is less susceptible to inhomogeneities in the magnetic field, these images have long been a clinical workhorse.
Proton Density weighted MRI - Spin density, also called proton density, weighted scans try to have no contrast from either T2 or T1 decay, the only signal change coming from differences in the amount of available spins (hydrogen nuclei in water). This type uses a spin echo or sometimes a gradient echo sequence, with short TE and long TR.
Vision is the most advanced of our senses, so it is not surprising that images play the
single most important role in human perception. However, unlike humans, who are limited to
the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the
entire EM spectrum, ranging from gamma to radio waves. They can operate on images
generated by sources that humans are not accustomed to associating with images. These
include ultrasound, electron microscopy, and computer-generated images. Thus, digital image
processing encompasses a wide and varied field of applications.[14]
The name MATLAB stands for MATrix LABoratory. MATLAB was written
originally to provide easy access to matrix software developed by the LINPACK (linear system package) and EISPACK (Eigensystem package) projects. MATLAB is a high-
performance language for technical computing. It integrates computation, visualization, and
programming environment. Furthermore, MATLAB is a modern programming language
environment: it has sophisticated data structures, contains built-in editing and debugging
tools, and supports object-oriented programming. These factors make MATLAB an excellent
tool for teaching and research.
MATLAB is being used as a platform for laboratory exercises and the problems
classes in the Image Processing half of the Computer Graphics and Image Processing course
unit. This handout describes the MATLAB development environment you will be using; you are expected to have read it and be familiar with it before attempting the Laboratory and Coursework Assignments.
MATLAB is a data analysis and visualization tool designed to make matrix
manipulation as simple as possible. In addition, it has powerful graphics capabilities and its
own programming language. The basic MATLAB distribution can be expanded by adding a
range of toolboxes, the one relevant to this course is the image-processing toolbox (IPT). The
basic distribution and all of the currently available toolboxes are available in the labs. The
basic distribution plus any installed toolboxes will provide a large selection of functions,
invoked via a command line interface.
MATLAB’s basic data structure is the matrix. In MATLAB a single variable is a 1 x 1
matrix, a string is a 1 x n matrix of chars. An image is an n x m matrix of pixels. A raw image
will take up a lot of storage space. Methods have been defined to compress the image by
coding redundant data in a more efficient fashion, or by discarding the perceptually less
significant information. MATLAB supports reading all of the common image formats. Image
coding is not addressed in this course unit. [23]
A digital image can be written as an M x N matrix of samples:

        f(0,0)      f(0,1)      …   f(0,N-1)
f  =    f(1,0)      f(1,1)      …   f(1,N-1)
          ⋮           ⋮                ⋮
        f(M-1,0)    f(M-1,1)    …   f(M-1,N-1)

The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel, or pel. The terms image and pixel are used throughout the rest of our discussions to denote a digital image and its elements. MATLAB stores this array as

        f(1,1)      f(1,2)      …   f(1,N)
f  =    f(2,1)      f(2,2)      …   f(2,N)
          ⋮           ⋮               ⋮
        f(M,1)      f(M,2)      …   f(M,N)

where f(1, 1) = f(0, 0) (note the use of a monospace font to denote MATLAB quantities). Clearly, the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6, 2) is the element in
the sixth row and second column of matrix f. Typically, we use the letters M and N,
respectively, to denote the number of rows and columns in a matrix. A 1×N matrix is called a
row vector, whereas an M×1 matrix is called a column vector. A 1×1 matrix is a scalar.
Matrices in MATLAB are stored in variables with names such as A, a, RGB,
real_array, and so on. Variables must begin with a letter and contain only letters, numerals,
and underscores. As noted in the previous paragraph, all MATLAB quantities in this book are
written using monospace characters. We use conventional Roman, italic notation, such as
f(x,y), for mathematical expressions.
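The shift in origin between the mathematical f(x, y) notation and MATLAB's 1-based f(p, q) can be illustrated in Python, which is 0-based like the mathematical convention (the small synthetic matrix here is only an example):

```python
# Build a small 8 x 4 "image" whose value encodes its 0-based position:
# element at row r, column c holds the value r*10 + c.
f = [[row * 10 + col for col in range(4)] for row in range(8)]

# The element MATLAB would call f(6, 2) (sixth row, second column,
# 1-based) lives at the 0-based Python index [5][1].
print(f[5][1])  # 51
```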
CHAPTER – 2
LITERATURE REVIEW
Clinical criteria for the diagnosis of Alzheimer’s disease include insidious onset and
progressive impairment of memory and other cognitive functions. There are no motor,
sensory, or coordination deficits early in the disease. The diagnosis cannot be determined by
laboratory tests. These tests are important primarily in identifying other possible causes of
dementia that must be excluded before the diagnosis of Alzheimer’s disease may be made
with confidence. Neuropsychological tests provide confirmatory evidence of the diagnosis of
dementia and help to assess the course and response to therapy. The criteria proposed are
intended to serve as a guide for the diagnosis of probable, possible, and definite Alzheimer’s
disease; these criteria will be revised as more definitive information becomes available. [25]
The Open Access Series of Imaging Studies is a series of magnetic resonance imaging
data sets that is publicly available for study and analysis. The initial data set consists of a
cross-sectional collection of 416 subjects aged 18 to 96 years. One hundred of the included
subjects older than 60 years have been clinically diagnosed with very mild to moderate
Alzheimer’s disease. The subjects are all right-handed and include both men and women. For
each subject, three or four individual T1-weighted magnetic resonance imaging scans
obtained in single imaging sessions are included. Multiple within-session acquisitions provide
extremely high contrast-to-noise ratio, making the data amenable to a wide range of analytic
approaches including automated computational analysis. Additionally, a reliability data set is
included containing 20 subjects without dementia imaged on a subsequent visit within 90 days
of their initial session. Automated calculation of whole-brain volume and estimated total
intracranial volume are presented to demonstrate use of the data for measuring differences
associated with normal aging and Alzheimer’s disease. [8].
Alzheimer’s disease (AD) is a degenerative disease which leads to memory loss and problems with thinking and behaviour. AD is a type of dementia which accounts for an estimated 60% to 80% of cases. Accurate diagnosis depends on the identification of discriminative features of AD. Recently, different feature extraction methods have been used for the classification of AD. In this paper, we propose a classification framework to select features, extracted using the Gray-Level Co-occurrence Matrix (GLCM) method, to distinguish between AD and Normal Control (NC). In order to evaluate the proposed method, we have performed evaluations on MR images acquired from the OASIS database. The proposed
method yields an average testing accuracy of 75.71% which indicates that the proposed
method can differentiate AD and NC satisfactorily. [3].
Grey Level Co-occurrence Matrices (GLCM) are one of the earliest techniques used for
image texture analysis. In this paper a new feature called trace, extracted from the GLCM, is defined, and its implications for texture analysis are discussed in the context of Content Based Image Retrieval (CBIR). The theoretical extension of GLCM to n-dimensional gray scale images is also discussed. The results indicate that trace features outperform Haralick features
when applied to CBIR. [5].
The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points. This rule is independent of the underlying joint distribution on the sample points and their classifications, and hence the probability of error R of such a rule must be at least as great as the Bayes probability of error R*, the minimum probability of error over all decision rules taking the underlying probability structure into account. However, in a large sample analysis, we will show in the M-category case that R* ≤ R ≤ R*(2 - MR*/(M-1)), where these bounds are the tightest possible, for all suitably smooth underlying distributions. Thus for any number of categories, the probability of error of the nearest neighbor rule is bounded above by twice the Bayes probability of error. In this sense, it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor. [22]
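As a quick numeric sanity check of the bound R* ≤ R ≤ R*(2 - MR*/(M-1)) quoted above (with illustrative values of R* and M only):

```python
# Evaluate the Cover-Hart upper bound on nearest-neighbor error for a
# few Bayes risks R* and class counts M, and confirm it never exceeds
# twice the Bayes risk, as the text states.

def nn_upper_bound(bayes_risk, m):
    """Cover-Hart upper bound R*(2 - M*R*/(M-1)) on NN error, M classes."""
    return bayes_risk * (2 - m * bayes_risk / (m - 1))

for m in (2, 3, 10):
    for r_star in (0.05, 0.1, 0.25):
        bound = nn_upper_bound(r_star, m)
        assert r_star <= bound <= 2 * r_star  # R* <= bound <= 2R*

print(round(nn_upper_bound(0.1, 2), 4))  # 0.18
```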
A preliminary study for mapping sea ice patterns (texture) with 100-m ERS-1 synthetic
aperture radar (SAR) imagery is presented in the paper. We used gray-level co-occurrence
matrices (GLCM) to quantitatively evaluate textural parameters and representations and to
determine which parameter values and representations are best for mapping sea ice texture.
We conducted experiments on the quantization levels of the image and the displacement and
orientation values of the GLCM by examining the effects textural descriptors such as entropy
have in the representation of different sea ice textures. We showed that a complete gray-level
representation of the image is not necessary for texture mapping, an eight-level quantization
representation is undesirable for textural representation, and the displacement factor in texture
measurements is more important than orientation. In addition, we developed three GLCM
implementations and evaluated them by a supervised Bayesian classifier on sea ice textural
contexts. This experiment concludes that the best GLCM implementation in representing sea
ice texture is one that utilizes a range of displacement values such that both microtextures and
macrotextures of sea ice can be adequately captured. These findings define the quantization,
displacement, and orientation values that are the best for SAR sea ice texture analysis using
GLCM. [6].
A neural network is first trained on MRIs obtained from the OASIS MRI database. We then test our trained neural network on the entire set of 457 MRIs provided by the OASIS dataset to confirm the accuracy of diagnosis by our system. Our results produce nearly 90% accuracy in AD diagnosis and classification. [10].
The different feature extraction techniques for lesion identification in Dynamic Contrast
Enhancement - Magnetic Resonance Imaging (DCE - MRI) of Breast is discussed in the
paper. In DCE- MRI, kinetic feature extraction is a popular radiological approach used by
Radiologists. However, extracting more features like entropy, homogeneity, heterogeneity,
and statistical features of the region of interests would enhance the accuracy of lesion
diagnosis, especially in ‘not sure' (plateau) cases. This paper discusses a survey of different feature extraction techniques such as structural, statistical, model-based and transform-based methods. The paper also advocates a comparative study of feature extraction
using statistical and intensity time kinetic curve methods. These two features are employed in
understanding the prominence of malignancy in the lesion.
The ventricles in the AD subjects are noticeably enlarged when compared to normal subjects. Features
obtained from the segmented ventricles are also clearly distinct and demonstrate the
differences in the AD subjects. As ventricle volume and its morphometry are significant
biomarkers, this study seems to be clinically relevant. [21].
The classification of brain MRI is an important task. In this paper, an automatic approach to the classification of brain tumors into malignant vs. benign and low-grade vs. high-grade glioma is presented. This method employs the GLCM technique to extract texture features from images, stored as a feature vector. The extracted features were classified using supervised SVM and KNN algorithms. The proposed system is applied to the 251 images (85 malignant and 166 benign) of a clinical database and 80 images (50 low-grade glioma and 30 high-grade glioma) of the BraTS 2012 training database. The accuracy of the proposed system is 96% and 86% for SVM and KNN respectively for the clinical database, and 85% and 72.50% for SVM and KNN respectively for the BraTS database. [14].
A novel method for detecting the onset of Alzheimer’s disease (AD) from Magnetic
Resonance Imaging (MRI) scans is presented in the paper. It uses a combination of three
different machine learning algorithms in order to get improved results and is based on a three-
class classification problem. The three classes for classification considered in this study are
normal, very mild AD and mild and moderate AD subjects. The machine learning algorithms
used are: the Extreme Learning Machine (ELM) for classification, with its performance
optimized by a Particle Swarm Optimization (PSO) and a Genetic algorithm (GA) used for
feature selection. A Voxel-Based Morphometry (VBM) approach is used for feature
extraction from the MRI images and GA is used to reduce the high dimensional features
needed for classification. The GA-ELM-PSO classifier yields an average training accuracy of
94.57% and a testing accuracy of 87.23%, averaged across the three classes, over ten random
trials. The results clearly indicate that the proposed approach can differentiate between very
mild AD and normal cases more accurately, indicating its usefulness in detecting the onset of
AD. [15].
CHAPTER - 3
GLCM FEATURE EXTRACTION OF MR IMAGES
3.3.1 Algorithm
The virtual variable is created in the following way (using the settings on the GLCM
Texture page of the Variable properties dialog box identified in bold):
1. Quantize the image data. Each sample on the echogram is treated as a single image pixel
and the value of the sample is the intensity of that pixel. These intensities are then further
quantized into a specified number of discrete gray levels as specified under Quantization.
2. Create the GLCM. It will be a square matrix N x N in size where N is the Number of
levels specified under Quantization. The matrix is created as follows:
a. Let s be the sample under consideration for the calculation.
b. Let W be the set of samples surrounding sample s which fall within a window centered
upon sample s of the size specified under Window Size.
c. Considering only the samples in the set W, define each element i,j of the GLCM as the
number of times two samples of intensities i and j occur in specified Spatial relationship
(where i and j are intensities between 0 and Number of levels-1).
The sum of all the elements i, j of the GLCM will be the total number of times the
specified spatial relationship occurs in W.
d. Make the GLCM symmetric:
i. Make a transposed copy of the GLCM
ii. Add this copy to the GLCM itself.
This produces a symmetric matrix in which the relationship i to j is indistinguishable from
the relationship j to i (for any two intensities i and j). As a consequence, the sum of all the
elements i, j of the GLCM will now be twice the total number of times the specified
spatial relationship occurs in W (once where the sample with intensity i is the reference
sample and once where the sample with intensity j is the reference sample), and for any
given i, the sum of all the elements i, j with the given i will be the total number of times a
sample of intensity i appears in the specified spatial relationship with another sample.
e. Normalize the GLCM:
i. Divide each element by the sum of all elements.
The elements of the GLCM may now be considered probabilities of finding the
relationship i, j (or j, i) in W.
3. Calculate the selected Feature. This calculation uses only the values in the GLCM. See:
1) Angular Second Moment
2) Contrast
3) Correlation
4) Variance
5) Inverse Difference Moment
6) Sum Average
7) Sum Variance
8) Sum Entropy
9) Entropy
10) Difference Variance
11) Difference Entropy
12) Information Measure of Correlation I
13) Information Measure of Correlation II
14) Maximal Correlation Coefficient
4. The samples in the resulting virtual variable are replaced by the value of this calculated
feature. These 14 features are called Haralick Texture Features.
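As an illustration of steps 1 and 2 above, the following minimal Python/NumPy sketch builds a symmetric, normalized GLCM for a single window. The function name `glcm` and the quantization scheme are assumptions made for illustration, not the project's actual implementation:

```python
import numpy as np

def glcm(image, levels=8, offset=(0, 1)):
    """Build a symmetric, normalized gray-level co-occurrence matrix.

    levels : Number of levels (the Quantization setting)
    offset : the Spatial relationship as (row, col), e.g. (0, 1) means
             'one sample to the right'
    """
    img = np.asarray(image, dtype=float)
    # Step 1: quantize intensities into `levels` discrete gray levels
    span = float(img.max() - img.min())
    if span == 0.0:
        span = 1.0
    q = np.minimum((img - img.min()) / span * levels, levels - 1).astype(int)

    # Step 2: count how often the intensity pair (i, j) occurs in the
    # specified spatial relationship
    G = np.zeros((levels, levels))
    dr, dc = offset
    rows, cols = q.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                G[q[r, c], q[r2, c2]] += 1

    # Step 2d: make the GLCM symmetric; step 2e: normalize to probabilities
    G = G + G.T
    return G / G.sum()
```

After normalization, every cell of the returned matrix is the probability of finding the corresponding intensity pair in either order, exactly as described in step 2e.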
G = [ p(1,1)  p(1,2)  …  p(1,N) ;  p(2,1)  p(2,2)  …  p(2,N) ;  … ;  p(N,1)  p(N,2)  …  p(N,N) ] ………… Eq (3.1)
where p(i, j) is the normalized co-occurrence probability of gray levels i and j, and N is the number of gray levels.
Haralick then described 14 statistics that can be calculated from the co-occurrence matrix
with the intent of describing the texture of the image:
1. Angular Second Moment:
Angular Second Moment (also called Energy) measures the textural uniformity of the image;
its value is high when the GLCM contains only a few dominant gray-level pairs.
Angular Second Moment = Σi Σj p(i, j)² ………… Eq (3.2)
2. Contrast:
Contrast measures the amount of local intensity variation in an image and reflects the
sensitivity of the texture to changes in intensity. It returns the measure of intensity contrast
between a pixel and its neighbourhood, and is 0 for a constant image. If the amount of local
variation is large, the contrast feature is correspondingly high: when large gray-scale
differences occur frequently, the texture is coarse and the contrast is large, while a small
contrast value indicates a fine, smooth texture.
Contrast = Σi Σj (i − j)² p(i, j) ………… Eq (3.3)
3. Correlation:
This feature measures how correlated a pixel is to its neighbourhood. It shows the linear
dependency of gray level values in the co-occurrence matrix. Feature values range from −1 to
1, these extremes indicating perfect negative and positive correlation respectively.
Correlation = [Σi Σj (i · j) p(i, j) − μx μy] / (σx σy) ………… Eq (3.4)
where μx, μy, σx and σy are the means and standard deviations of px and py, the partial
probability density functions (the row and column marginals of the GLCM).
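The contrast and correlation features above can be computed directly from a normalized GLCM. The sketch below is a minimal illustration, assuming a NumPy array P whose entries already sum to 1; the helper names are hypothetical:

```python
import numpy as np

def contrast(P):
    """Eq (3.3): sum over all GLCM cells of (i - j)^2 * p(i, j)."""
    i, j = np.indices(P.shape)
    return ((i - j) ** 2 * P).sum()

def correlation(P):
    """Eq (3.4): linear dependency of gray levels; ranges from -1 to 1."""
    n = np.arange(P.shape[0])
    px, py = P.sum(axis=1), P.sum(axis=0)      # marginal distributions
    mx, my = (n * px).sum(), (n * py).sum()    # means ux, uy
    sx = np.sqrt(((n - mx) ** 2 * px).sum())   # standard deviation of px
    sy = np.sqrt(((n - my) ** 2 * py).sum())   # standard deviation of py
    i, j = np.indices(P.shape)
    # Equivalent centered form of Eq (3.4)
    return ((i - mx) * (j - my) * P).sum() / (sx * sy)
```

A purely diagonal GLCM (all co-occurring pairs have equal intensities) gives a contrast of 0 and a correlation of 1, matching the extremes described above.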
4. Entropy:
Entropy measures the randomness or the degree of disorder present in the image. The value
of entropy is largest when all elements of the co-occurrence matrix are equal and small when
the elements are highly unequal.
Entropy = −Σi Σj p(i, j) log p(i, j) ………… Eq (3.5)
5. Inverse Difference Moment:
Inverse Difference Moment, also called Homogeneity, measures the similarity of pixels and
how close the distribution of elements in the GLCM is to the diagonal of the GLCM. A purely
diagonal gray level co-occurrence matrix gives a homogeneity of 1. It becomes large when
local textures have only minimal changes; as homogeneity increases, the contrast typically
decreases.
Inverse difference moment = Σi Σj p(i, j) / (1 + (i − j)²) ………… Eq (3.6)
6. Variance:
Variance measures how much each pixel varies from the neighbouring pixels (or the centre
pixel) and is used to classify an image into different regions.
Variance = Σi Σj (i − μ)² p(i, j) ………… Eq (3.7)
where μ is the mean of the GLCM elements.
7. Sum average:
Sum average = Σk k · Px+y(k) ………… Eq (3.8)
where Px+y(k) = Σ(i+j=k) p(i, j), for k = 2, 3, 4, …, 2Ng.
8. Sum variance:
Sum variance = Σk (k − f8)² Px+y(k), k = 2, 3, …, 2Ng ………… Eq (3.9)
where f8 is the sum entropy.
9. Sum entropy:
Sum entropy = f8 = −Σk Px+y(k) log Px+y(k), k = 2, 3, …, 2Ng ………… Eq (3.10)
10. Difference variance:
Difference variance = variance of Px−y ………… Eq (3.11)
where Px−y(k) = Σ(|i−j|=k) p(i, j), k = 0, 1, 2, …, Ng − 1.
11. Difference entropy:
Difference entropy = −Σk Px−y(k) log Px−y(k), k = 0, 1, …, Ng − 1 ………… Eq (3.12)
12. Information Measure of Correlation I:
Information Measure of Correlation I = (HXY − HXY1) / max(HX, HY) ………… Eq (3.13)
13. Information Measure of Correlation II:
Information Measure of Correlation II = √(1 − exp(−2 (HXY2 − HXY))) ………… Eq (3.14)
where HXY is the entropy of Eq (3.5), HX and HY are the entropies of px and py, and
HXY1 = −Σi Σj p(i, j) log(px(i) py(j))
HXY2 = −Σi Σj px(i) py(j) log(px(i) py(j))
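The sum-based features rest on the marginal distribution Px+y. A minimal sketch, assuming 0-based gray levels (so k runs from 0 to 2Ng − 2 rather than from 2 to 2Ng) and a normalized GLCM P; the function name is hypothetical:

```python
import numpy as np

def haralick_sums(P):
    """Sum average, sum variance and sum entropy (Eqs 3.8-3.10) of a
    normalized GLCM P, using 0-based gray levels 0..Ng-1."""
    Ng = P.shape[0]
    i, j = np.indices(P.shape)
    # Marginal Px+y(k): probability that the two gray levels sum to k
    pxy = np.array([P[(i + j) == k].sum() for k in range(2 * Ng - 1)])
    k = np.arange(2 * Ng - 1)
    sum_avg = (k * pxy).sum()                          # Eq (3.8)
    nz = pxy > 0                                       # avoid log(0)
    sum_ent = -(pxy[nz] * np.log(pxy[nz])).sum()       # Eq (3.10), f8
    sum_var = ((k - sum_ent) ** 2 * pxy).sum()         # Eq (3.9), uses f8
    return sum_avg, sum_var, sum_ent
```

The difference-based features (Eqs 3.11 and 3.12) follow the same pattern with the condition |i − j| = k in place of i + j = k.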
CHAPTER - 4
AD & NORMAL MR IMAGE CLASSIFICATION
While certain aspects of digital image classification are completely automated, a human
image analyst must provide significant input. There are two basic approaches to classification,
supervised and unsupervised, and the type and amount of human interaction differs depending
on the approach chosen.
The image analyst must select a sufficient number of training sites in each class to
represent the variation present within each class in the image. The classification algorithm
then uses spectral characteristics of the training sites to classify the remainder of the image.
Training sites developed in one scene may or may not be transferable to an entire study area.
If ground conditions, lighting conditions, or atmospheric effects change from one scene to
another, then training sites must be developed independently for each scene. Furthermore,
training sites may not be transferable across time: in addition to the conditions noted above,
which change over time as well as space, real changes in the land cover occurring at a training
site location over time will cause incorrect classification results in the second image. Accurate
supervised classification results depend entirely on the analyst’s ability to collect a sufficient
number of training sites and to recognize when training sites can or cannot be transferred from
one image to another.
Unsupervised classification requires less input from the analyst before processing. The
classification algorithm searches and analyses the image, grouping pixels into clusters which
it deems to be uniquely representative of the image content. After classification, the image
analyst must determine if these arbitrary classes have meaning in the context of the end-user
application. A significant amount of time may be spent trying to determine the physical
meaning of a class identified by the unsupervised algorithm. In addition, experimentation is
required to determine the optimal number of unique classes used for initialization of the
algorithm. Furthermore, there is no basis to believe that the classes discovered in one image
will be the same classes discovered in a second image. Time spent trying to optimize and
interpret the unsupervised results may far exceed the time an analyst would have spent
selecting training sites for supervised classification. Finally, because it is impossible to ensure
consistency in class identification from one image to the next, unsupervised classification is
not useful for change detection.
K-Nearest Neighbor (KNN) is the first algorithm we shall investigate; it is most often used
for classification, although it can also be used for estimation and prediction. KNN is an
example of instance-based learning, in which the training data set is stored, so that a
classification for a new unclassified record may be found simply by comparing it to the most
similar records in the training set. This method has been used in many applications in areas
such as data mining, statistical pattern recognition, and image processing. Successful
applications include recognition of handwriting, satellite images, and EKG patterns [19] [14].
K-nearest neighbor algorithm (KNN) is a supervised learning method that has been used in
many applications in the field of data mining, statistical pattern recognition and many
others.
KNN is a method for classifying objects based on closest training examples in the feature
space.
An object is classified by a majority vote of its neighbors. K is always a positive integer.
The neighbors are taken from a set of objects for which the correct classification is
known.
It is usual to use the Euclidean distance, though other distance measures such as the
Manhattan distance could in principle be used instead.
1. Determine the parameter K, the number of nearest neighbors, beforehand. The choice of K
is up to the user.
2. Calculate the distance between the query instance and all the training samples. Any
distance measure can be used.
3. Sort the distances for all the training samples and determine the K nearest neighbors based
on the K-th minimum distance.
4. Since this is supervised learning, collect the categories (class labels) of the training
samples that fall among the K nearest neighbors.
5. Use the majority class among those neighbors as the predicted value.
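The five steps above can be sketched as a short Python function. This is a minimal illustration (the function name and the feature vectors are hypothetical), using Euclidean distance and a simple majority vote:

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote of its k nearest training samples."""
    # Step 2: Euclidean distance from the query to every training sample
    d = np.linalg.norm(np.asarray(train_X, dtype=float) - np.asarray(query, dtype=float), axis=1)
    # Steps 3-4: sort the distances and take the labels of the k closest samples
    nearest = [train_y[i] for i in np.argsort(d)[:k]]
    # Step 5: majority vote among the k neighbors
    return Counter(nearest).most_common(1)[0][0]

# Hypothetical 2-D feature vectors with NORMAL/AD labels
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = ["NORMAL", "NORMAL", "AD", "AD"]
knn_predict(X, y, [0.85, 0.85], k=3)   # -> "AD"
```

Because the training data is stored verbatim and all computation happens at query time, this directly exhibits the "lazy learner" behavior discussed below.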
Cons: The K-NN algorithm has drawn a lot of flak for being extremely simple. On closer
inspection, it does not build a model, since no abstraction process is involved. The training
process is very fast because the data is stored verbatim (hence the name lazy learner), but the
prediction time is high, and useful insights are sometimes missing. Therefore, applying this
algorithm requires time to be invested in data preparation (especially treating missing data
and categorical features) to obtain robust results.
CHAPTER 5
RESULTS and DISCUSSION
This project presents the analysis of a GLCM texture feature extraction and classification
algorithm, tested on 60 MR images, of which 30 samples are observed to be Normal and
30 are AD. The GLCM texture features are extracted from the Normal and AD images.
In this project, we utilized brain MR images of 133 Normal people and 30 AD patients.
Whole-brain T1-weighted 3D MPRAGE (Magnetization-Prepared Rapid Acquisition Gradient
Echo) data sets were acquired using a Siemens 1.5 T Vision scanner in a single imaging session.
The axial view is utilized for testing the proposed method. The training set consists of
60 samples, and each test sample is classified based on the trained data.
The plots shown in Fig 5.1 to Fig 5.14 are the 14 Haralick texture features calculated using
the GLC matrix extracted from the MR images. The plots compare the AD and NORMAL
feature values.
[Fig 5.1 to Fig 5.14: per-sample plots (sample index 0–35 on the x-axis) comparing NORMAL
and AD feature values, with one panel per Haralick feature, including Contrast, Correlation,
Variance, Sum Average, Sum Variance, Sum Entropy, Entropy, Difference Variance and
Difference Entropy.]
CHAPTER - 6
CONCLUSION
REFERENCES
[2] D. Marcus, A. Fotenos, J. Csernansky, J. Morris, and R. Buckner, "Open access series of
imaging studies (OASIS): Longitudinal MRI data in nondemented and demented older adults,"
Journal of Cognitive Neuroscience, vol. 22, no. 12, pp. 2677–2684, 2010.
[3] ER, Amulya, Soumya Varma, and Vince Paul. "Classification of brain MR images using
texture feature extraction." International Journal of Computer Science and Engineering 5.5
(2017): 1722-1729.
[6] L. Soh and C. Tsatsoulis, "Texture Analysis of SAR Sea Ice Imagery Using Gray Level
Co-Occurrence Matrices", IEEE Transactions on Geoscience and Remote Sensing, Vol. 37, No.
2, pp. 780–795, 1999.
[9] Nayaki, K. Sankara, and Abraham Varghese. "Alzheimer's detection at an early stage using
local measures on MRI: A comparative study on local measures." Data Science &
Engineering (ICDSE), 2014 International Conference on. IEEE, 2014.
[11] Anandh, K. R., C. M. Sujatha, and S. Ramakrishnan. "A method to differentiate mild
cognitive impairment and Alzheimer in MR images using eigenvalue descriptors." Journal of
medical systems 40.1 (2016): 25.
[13] Daza, Julian Camilo, and Andrea Rueda. "Classification of Alzheimer's disease in MRI
using visual saliency information." Computing Conference (CCC), 2016 IEEE 11th
Colombian. IEEE, 2016.
[14] Wasule, Vijay, and Poonam Sonar. "Classification of brain MRI using SVM and KNN
classifier." Sensing, Signal Processing and Security (ICSSS), 2017 Third International
Conference on. IEEE, 2017.
[15] Saraswathi, Saras, et al. "Detection of onset of Alzheimer's disease from MRI images
using a GA-ELM-PSO classifier." Computational Intelligence in Medical Imaging (CIMI),
2013 IEEE Fourth International Workshop on. IEEE, 2013.
[17] Long, Xiaojing, and Chris Wyatt. "An automatic unsupervised classification of MR
images in Alzheimer's disease." Computer Vision and Pattern Recognition (CVPR), 2010
IEEE Conference on. IEEE, 2010.
[18] Preethi, G., and V. Sornagopal. "MRI image classification using GLCM texture features."
Green Computing Communication and Electrical Engineering (ICGCCEE), 2014
International Conference on. IEEE, 2014.
[22] Cover, Thomas, and Peter Hart. "Nearest neighbor pattern classification." IEEE
transactions on information theory 13.1 (1967): 21-27.
[23] Rafael C. Gonzalez, Richard E. Woods, and Steven L. Eddins. Digital Image
Processing Using MATLAB®. Gatesmark Publishing, 2009.
[24] Malu, G., and Elizabeth Sherly. "A Study on Different Feature Extraction Techniques for
Lesion Identification in MRI Breast Images."
[25] McKhann, Guy, et al. "Clinical diagnosis of Alzheimer's disease Report of the
NINCDS‐ADRDA Work Group* under the auspices of Department of Health and Human
Services Task Force on Alzheimer's Disease." Neurology 34.7 (1984): 939-939.