
Early Detection of Alzheimer’s Disease

Jayant Harwalkar
Department of Computer Science, PES University
Bangalore, India
jayanthharwalkar@gmail.com

Hemankith Reddy M
Department of Computer Science, PES University
Bangalore, India
hemankith@gmail.com

Shravani
Department of Computer Science, PES University
Bangalore, India
hemankith@gmail.com

Neelesh S
Department of Computer Science, PES University
Bangalore, India
hemankith@gmail.com

Abstract—This document is a model and instructions for LaTeX. This and the IEEEtran.cls file define the components of your paper [title, text, heads, etc.]. *CRITICAL: Do Not Use Symbols, Special Characters, Footnotes, or Math in Paper Title or Abstract.
Index Terms—component, formatting, style, styling, insert

I. INTRODUCTION

Alzheimer's disease (AD), a brain disorder, is considered to be a form of dementia that slowly destroys memory, thinking, and eventually the ability to perform daily tasks. It is caused by the loss and degeneration of neurons in the brain, mostly in the cortical region. AD is caused by the formation of plaques, in which clumps of abnormal protein form outside the neurons, blocking the connections between neurons and disrupting signals, which leads to impairment of the brain. AD can also be caused by tangles, in which a protein build-up occurs inside the neuron, affecting signal transmission. In AD the brain starts to shrink: the gyri become narrow while the sulci widen. The risk of getting this disease increases with age, and it is mostly seen in older people[].

AD can be diagnosed by brain autopsy and biopsy, and there is no complete cure for the disease[]. Early detection improves the chances of effective treatment and the ability of the individual to participate in a wide variety of clinical trials. Treatment is most effective when given in the early stages. Currently there are no treatments that reverse the damage already caused, but proper medication can halt the further progression of AD and prolong life[].

AD can be detected by performing scans such as magnetic resonance imaging (MRI), computed tomography (CT), or positron emission tomography (PET)[3]. Researchers use raw MRI brain scans, demographic information, and clinical data, comparing them with normal cognition using machine learning and deep learning models developed to predict AD risk. Despite the currently available biomarkers, electronic healthcare data, health records, and the increasing digitalization of data, there is not enough knowledge on how to use this large-scale health data for the prediction of AD risk, although a few studies demonstrate that potential AD risk can be predicted when these resources are combined with data-driven machine learning models.[5]
II. LITERATURE REVIEW

As a person ages, the risk of getting Alzheimer's disease also increases, making it one of the major health issues. Early diagnosis and detection are therefore necessary to allow patients to receive effective treatment[]. Considering the amount of research that has been done, machine learning techniques and other branches of artificial intelligence are widely used for this purpose. Even with all these methods, there are no dedicated instruments for detection. However, certain physical, neuropsychological, physiological, and neurological tests can be used for the identification of this disease.[12]

One such method uses SVM for feature selection. There are many types of classification that use SVM for feature extraction; for example, it can be done using SPECT perfusion imaging to classify images of healthy patients against those of patients with AD. The approach is based on a linear programming formulation with a linear hyperplane that performs simultaneous feature selection and classification. It also incorporates proximity information about the features and then generates a classifier that selects not only the most relevant voxels but also the most relevant areas for classification, which gives more robust classifiers that are easier to interpret. This method achieved a sensitivity of 90% and a specificity of 84%, and it was shown to perform better than the Fisher Linear Discriminant (FLD), statistical parametric mapping (SPM), and human experts. The location of the voxels is incorporated into the problem, so feature selection depends on brain regions instead of separate, non-connected voxels. The voxel values taken as features outperform the local approach of SPM.[6] Even though this approach provided good results on Cohort 1, the inter-cohort results were weaker, as the accuracy dropped to 74%. This showed that the selected regions of the considered refined atlas did not have good generalisation ability.[7]

Another method uses a CNN for prediction and classification. The classification is done using two methods: the first uses CNN architectures built from scratch on MRI scans, based on 2D and 3D convolutions; the second uses transfer learning techniques such as the VGG19 pre-trained model. The two methods are evaluated using nine performance metrics. An end-to-end framework is applied to this medical image classification task, with simple 2D- and 3D-convolution-based CNN architectures applied to the MRI images. These can be used along with pre-trained deep learning models such as VGG19. A standard CNN performs feature extraction, feature reduction, and classification, so there is no need to do extraction manually; the weights in the initial layers act as feature extractors, and their values can be further improved by iterative learning.[9]
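The transfer-learning variant mentioned above can be sketched roughly as follows. This is an illustrative sketch only, not the configuration used in the reviewed work: the input size, the handling of MRI slices as 3-channel 2D images, and the classifier head are assumptions.

# Illustrative sketch (assumed settings): 2D transfer learning with a pre-trained VGG19.
import tensorflow as tf

def build_vgg19_classifier(input_shape=(192, 192, 3), num_classes=3):
    # Pre-trained VGG19 convolutional base; its early layers act as fixed feature extractors.
    base = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                       input_shape=input_shape)
    base.trainable = False  # freeze the pre-trained feature extractor
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # e.g. CN / MCI / AD
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model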
III. DATASET

The data were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI is a long-term study that uses indicators such as imaging, genetic, clinical, and biochemical markers to follow and detect Alzheimer's disease early. The ADNI data repository holds imaging, clinical, and genetic data from around 2220 patients across four investigations (ADNI3, ADNI2, ADNI1 and ADNI GO). For this work, the image data (MRI scans) were used. ADNI provides researchers with data as they work on the progression of Alzheimer's disease: PET images, MRI images, genetics, cognitive tests, CSF, and blood data are collected and validated, and these can be used by researchers as predictors of the disease. The first goal is to detect AD at the earliest stage (pre-dementia) and to identify biomarkers that can be used to track the disease's progression, thereby supporting breakthroughs in Alzheimer's disease intervention, prevention, and therapy through innovative diagnostic tools at the earliest possible stage (when intervention may be most successful). ADNI has a ground-breaking data-access policy, which makes data available to all scientists worldwide without restriction.

We have acquired 2294 ADNI1 1.5T MRI scans in the NIfTI format. The images are pre-classified into CN, MCI, or AD. Each image has a shape of 192 x 192 x 160.
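A minimal sketch of how one such scan can be loaded and checked is shown below, assuming the nibabel library; the file name is a hypothetical placeholder.

# Minimal sketch: load one ADNI NIfTI scan and verify its shape (file name is hypothetical).
import nibabel as nib
import numpy as np

img = nib.load("ADNI_subject_0001.nii")    # hypothetical file name
volume = img.get_fdata().astype(np.float32)
print(volume.shape)                        # expected (192, 192, 160) for these scans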
IV. IMPLEMENTATION DETAILS

A. Architecture

The images from the database are fed to a pipeline which consists of a series of pre-processing techniques. PSO performs feature selection on the pre-processed images. The resultant images are stored in a database and are then used by PSO to obtain optimal parameters for the Convolutional Neural Network, which produces an optimized CNN architecture. The CNN model is then trained, validated and tested.
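The flow described above can be summarised in a short driver sketch. This is purely illustrative: every function name below is a hypothetical placeholder for a stage named in this section, not actual project code.

# Illustrative end-to-end flow; every function name here is a hypothetical placeholder.
def run_pipeline(raw_scans):
    processed = [preprocess(scan) for scan in raw_scans]     # ADNI pipeline, registration, ...
    selected = pso_feature_selection(processed)              # PSO-based feature selection
    best_params = pso_search_cnn_hyperparameters(selected)   # PSO over CNN hyper-parameters
    model = build_cnn(best_params)
    train_validate_test(model, selected)
    return model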
B. Pre-processing

The pre-processing steps are:
(I) ADNI pipeline
(II) Registration
(III) Segmentation and Normalization
(IV) Skull Stripping
(V) Smoothing

(I) ADNI pipeline: The images in the dataset are obtained from MRI machines, which use magnetic fields and radio waves to produce the scans. Parameters such as the radio frequency, the magnetic field, and the uniformity of the coil can cause variations in the MRI scans. To correct such variations in the images, the ADNI pipeline is used. The following corrections are applied to the MRI image as part of this pipeline:
(i) Post-Acquisition Correction: Scanners with different acquisition parameters pose considerable hurdles. Small changes in acquisition parameters for quantitative sequences have a significant impact on machine learning models, so rectifying these inconsistencies is critical.
(ii) B1 Intensity Variation: B1 errors are one of the problems in measuring the MTR, which expands to the magnetization transfer ratio, since the MTR value changes with a change in the magnetization transfer (MT) pulse amplitude. These errors can also be caused by non-uniformity in the radiofrequency field and by incorrect transmitter output settings when accounting for changing levels of RF coil loading. These errors need to be corrected to obtain images with no variations and no loss of crucial data.
(iii) Intensity Non-Uniformity: The quality of acquired data can be affected by intensity non-uniformity. The term "intensity non-uniformity" refers to anatomically unrelated intensity variance in the data. It can be caused by the radio-frequency coil used, the acquisition pulse sequence used, and the sample's composition and geometry. As a result, it is critical to correct this variation, and a variety of approaches have been proposed to do so.
(II) Registration: The act of aligning the images to be analyzed is called image registration, and it is a critical phase in which data from several images must be integrated. The images can be taken at various times, from various perspectives, and with various sensors. Registration in medical imaging allows data from multiple modalities, such as CT, MR, SPECT, or PET, to be merged to obtain a full picture of the patient. In our case, since the MRI scans are taken from different angles, registration is the process of geometrically aligning all the images for further analysis. It can be used to create a correspondence between features in a set of images and then to infer correspondence away from those features using a transformation model.
(III) Segmentation and Normalization: The major focus of brain MRI segmentation is the division of brain tissue into tissue sections such as cerebrospinal fluid (CSF), which cushions the brain, grey matter (GM), where the actual processing is done, and white matter (WM), which provides communication between different GM areas. Various significant brain regions that could be useful in identifying Alzheimer's disease are found and kept during image segmentation. Normalization is the process of shifting and scaling an image so that the voxels have zero mean and unit variance. By removing differences in scale, the model can converge faster.
(IV) Skull Stripping: Skull stripping is a process wherein the skull and the non-brain region of the image are removed and only the brain portion of the image is retained, as we deal with only this region for the analysis of Alzheimer's disease. Skull stripping is one of the first steps in the process of diagnosing brain disorders. In a brain MRI scan, it is the method for differentiating brain tissue from non-brain tissue. Even for experienced radiologists, separating the brain from the skull is a time-consuming process, with results that vary widely from person to person. This is a pipeline that only needs a raw MRI picture as input and should produce a segmented image of the brain after the necessary pre-processing.
(V) Smoothing: Smoothing involves removing redundant information and noise from the images. It helps in the easy identification of trends and patterns in the images. When an image is produced by an MRI machine, it contains different kinds of noise which need to be removed to obtain a clean image without loss of any crucial information. An illustrative sketch of some of these pre-processing operations is given after this list.
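As referenced above, the following sketch illustrates one possible implementation of three of these operations: intensity non-uniformity correction, intensity normalization, and smoothing. The use of N4 bias-field correction via SimpleITK, Gaussian smoothing via SciPy, and the parameter values are assumptions for illustration, not the exact ADNI pipeline.

# Illustrative pre-processing sketch. Tools and parameter values here are assumptions for
# demonstration, not the exact ADNI pipeline implementation.
import numpy as np
import SimpleITK as sitk
from scipy.ndimage import gaussian_filter

def correct_bias_field(volume):
    # Intensity non-uniformity: N4 bias-field correction is one commonly used approach.
    image = sitk.GetImageFromArray(volume.astype(np.float32))
    mask = sitk.OtsuThreshold(image, 0, 1, 200)            # rough foreground mask
    corrector = sitk.N4BiasFieldCorrectionImageFilter()
    corrected = corrector.Execute(image, mask)
    return sitk.GetArrayFromImage(corrected)

def normalize(volume):
    # Shift and scale so the voxels have zero mean and unit variance (item III).
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def smooth(volume, sigma=1.0):
    # Gaussian smoothing to suppress acquisition noise (sigma chosen for illustration, item V).
    return gaussian_filter(volume, sigma=sigma)

def preprocess(volume):
    return smooth(normalize(correct_bias_field(volume)))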
C. CNN parameter optimisation using PSO

The training process is repetitive and continues until the stop criteria are met. The steps to optimize the CNN with PSO are:

1) Feed the pre-processed images as input to the CNN network. The images should have the same size and characteristics; for example, they should have the same dimensions, scale, colour gamma, etc.
2) Design the PSO parameters. The algorithm's particle population is generated. This involves setting values for the number of particles, the number of iterations, the inertial weight, the social constant, and the cognitive constant. The values can be set randomly or according to some heuristic.
3) With the parameters obtained from the PSO, the parameters of the CNN are initialised (the parameters to be set are given in the table below). The CNN is now ready to be trained.
4) Training and validation of the CNN. The CNN reads, processes, validates and tests the input images. This step produces values for the objective functions, which are AIC and recognition rate. These values are returned to the PSO.
5) Calculate the objective function. The objective function is evaluated by the PSO to search for the optimal values in the search space.
6) The PSO parameters are updated. Both the position and the velocity of the particles that characterize them are updated based on each particle's own optimal position (Pbest) and the optimal position of the entire swarm in the search space (Gbest).
7) This process continues until the end criteria are met. The end criteria can be a number of iterations or a threshold value.
8) It is then determined which architecture is optimal. Here the Gbest particle represents the optimal architecture.
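The particle update in step 6 follows the standard PSO equations, v = w*v + c1*r1*(Pbest - x) + c2*r2*(Gbest - x) and x = x + v. The sketch below is a generic illustration of that loop with assumed coefficient values; the actual objective function (combining AIC and recognition rate from a trained CNN, as described above) is left abstract.

# Generic PSO update sketch; w, c1, c2 and the bounds are assumed illustrative values.
import numpy as np

def pso(objective, bounds, n_particles=10, n_iters=20, w=0.7, c1=1.5, c2=1.5):
    dim = len(bounds)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    x = lo + np.random.rand(n_particles, dim) * (hi - lo)    # particle positions
    v = np.zeros_like(x)                                      # particle velocities
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()                  # assume minimisation
    for _ in range(n_iters):
        r1 = np.random.rand(n_particles, dim)
        r2 = np.random.rand(n_particles, dim)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest   # best hyper-parameter vector found

# Usage: objective(particle) would decode the particle into a CNN, train and evaluate it,
# and return a score to minimise (e.g. a combination of AIC and classification error).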
To elaborate further on how the algorithm works, an example is presented. The particle structure consists of 10 positions, as shown below. Each particle has these 10 positions, and each position is responsible for tuning one hyper-parameter. The hyper-parameters to be optimized are given in the table below.

TABLE I: Structure of Particle


X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
TABLE II: Example Particle Structure

Particle Coordinate   Hyper-Parameter              Search Space

X1    Convolution layer number     [1, 4]
X2    Filter number (1st layer)    [32, 128]
X3    Filter size (1st layer)      [1, 3]
X4    Filter number (2nd layer)    [32, 128]
X5    Filter size (2nd layer)      [1, 3]
X6    Filter number (3rd layer)    [32, 128]
X7    Filter size (3rd layer)      [1, 3]
X8    Filter number (4th layer)    [32, 128]
X9    Filter size (4th layer)      [1, 3]
X10   Batch size                   [32, 128]

From Table II, it is observed that the X1 coordinate controls the number of convolution layers. If X1 = 4, there will be 4 convolution layers. X2 and X3 control the filter number and filter size of the first layer, respectively. If X2 = 32 and X3 = 2, there will be 32 filters of size 5x5 (1 is mapped to 3x3, 2 to 5x5, 3 to 7x7, and 4 to 9x9). Similarly, X4 and X5 control the filter number and size for layer 2, and the same goes for the remaining coordinates. X10 represents the batch size for training.

TABLE III: Example particle generated by the algorithm


4 100 2 64 2 64 3 96 1 32
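To make the decoding concrete, the sketch below builds a CNN from a particle such as the one in Table III. It is illustrative only: the pooling layers, activations, input shape, and classifier head are assumptions that the text does not specify.

# Decode a particle into a CNN; pooling, activations and the head are assumptions.
import tensorflow as tf

FILTER_SIZE_MAP = {1: 3, 2: 5, 3: 7, 4: 9}   # 1 -> 3x3, 2 -> 5x5, 3 -> 7x7, 4 -> 9x9

def build_cnn_from_particle(particle, input_shape=(192, 192, 1), num_classes=3):
    n_layers = int(particle[0])                          # X1: number of convolution layers
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    for i in range(n_layers):
        filters = int(particle[1 + 2 * i])               # X2, X4, X6, X8: filter numbers
        ksize = FILTER_SIZE_MAP[int(particle[2 + 2 * i])]  # X3, X5, X7, X9: filter sizes
        model.add(tf.keras.layers.Conv2D(filters, ksize, padding="same", activation="relu"))
        model.add(tf.keras.layers.MaxPooling2D())
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(num_classes, activation="softmax"))
    batch_size = int(particle[9])                        # X10: training batch size
    return model, batch_size

# The Table III particle [4, 100, 2, 64, 2, 64, 3, 96, 1, 32] gives four convolution layers
# with 100/64/64/96 filters of sizes 5x5, 5x5, 7x7 and 3x3, and a training batch size of 32.
model, batch = build_cnn_from_particle([4, 100, 2, 64, 2, 64, 3, 96, 1, 32])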
