MRI Brain Cancer Classification Using Support Vector Machine
Abstract- This research paper proposes an intelligent classification technique to recognize normal and abnormal MRI brain images. Medical images such as ECG, MRI and CT-scan images are an important means of diagnosing human disease efficiently. Manual analysis of a tumor based on visual inspection by a radiologist/physician is the conventional method, which may lead to wrong classification when a large number of MRIs are to be analyzed. To avoid human error, an automated intelligent classification system is proposed which caters to the need for classification of images. Brain tumor is one of the major causes of death among people. The chances of survival can be increased if the tumor is detected correctly at an early stage. The magnetic resonance imaging (MRI) technique is used for the study of the human brain. In this research work, classification techniques based on Support Vector Machines (SVM) are proposed and applied to brain image classification. In this paper, features are extracted from MRI images as gray scale, symmetrical and texture features. The main objective of this paper is to achieve a higher accuracy rate and a lower error rate in MRI brain cancer classification using SVM.
Index Terms- Classification, MRI, SVM, PCA, Skull masking.
I. INTRODUCTION
Automated and efficient diagnosis of medical images is very important. Computer and information technology are very useful in medical image processing, medical analysis and classification. Medical images are usually obtained by X-rays and MRI. MRI is an essential tool in the clinical and surgical environment due to its superior soft tissue differentiation, high spatial resolution and contrast, and because it does not use any harmful ionizing radiation which may affect patients. Cancer develops in a part of the body when cells begin to grow abnormally. Radiologists examine MRI images based on visual interpretation to identify the presence of a tumor. When a large volume of MRI is to be analyzed, there is a possibility of wrong diagnosis by radiologists, because the sensitivity of the human eye decreases with an increasing number of cases, particularly when only a small number of slices are affected. Hence there is a need for efficient automated systems for analysis and classification of medical images. The MRI images may include both normal and abnormal images.
The methodology (Fig. 1) includes the following modules: image preprocessing, feature extraction, feature reduction, training and classification/testing.
Image preprocessing is used to improve the quality of images. Medical images are corrupted by different types of noise, such as Rician noise. It is very important to have good quality images for accurate observations in the given application. The median filter is simple to understand. It preserves brightness differences, resulting in minimal blurring of regional boundaries. It also preserves the positions of boundaries in an image, making this method useful for visual examination and measurement.
Feature extraction refers to various quantitative measurements of medical images typically used for decision making regarding the pathology of a structure or tissue. In image processing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be highly redundant, the input data is transformed into a compact representation, a set of features. Transforming the input data into the set of features is called feature extraction. If the extracted features are carefully selected, the feature set is expected to capture the important information from the input data, so that the desired task can be performed using this reduced representation instead of the full size input.
Principal Component Analysis (PCA) [4] is used to reduce the dimensionality of the data, i.e. to reduce the features. Martinez and Kak showed that if training sets are small compared to the feature dimension, PCA can outperform LDA [13]. The reduced features are submitted to a support vector machine for training and testing. Therefore this method will decrease the computation time and complexity.
Classifiers such as SVM, K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), Probabilistic Neural Network (PNN), Hidden Markov Model (HMM), etc. are used for various applications such as handwritten digit identification, object identification, speaker identification, face identification, text classification and medical applications. Each of these classification schemes has its own unique properties and associated strengths and problems. In KNN, the major limitation is that it uses all features in the distance computation, which is computationally intensive, mainly when
SCEECS 2014
B. Filtering
Filtering is the process of removing noise from MRI images. Medical images are corrupted with different kinds of noise during image acquisition. In this paper, a median filter is used to remove noise from the MRI images.
C. Skull Masking
Skull masking means the removal of non-brain tissue such as scalp, skull, fat, eyes, neck, etc. from the MRI brain image. It helps to improve the speed and accuracy of diagnostic and predictive procedures in medical applications. This procedure is also referred to as Brain-Extraction/Skull-Stripping [2].
Dilation and erosion are two fundamental morphological operations. An opening is an erosion followed by a dilation with the same structuring element.
Region filling (Fig. 8) is used to fill in holes inside the brain region.
Fig. 8 Region Filling
Image enhancement is a basic image processing task that allows a better subjective judgement of the images. The power law transformation [2] (Fig. 9) is used for image enhancement. Image enhancement simply means the transformation of an image f into an image g using a transformation T. The pixel values in images f and g are denoted by r and s, respectively, and are related by the expression
s = T(r) (2)
Fig. 9 Power law Transform
Fig. 10 Skull Masked Image
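The preprocessing operations described in this section can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the 3x3 window, the square structuring element, the border handling (windows are clipped at the image edge), and the common power law form s = c * r^gamma with hypothetical parameters `c` and `gamma` are all assumptions made for illustration, since the text only states s = T(r). Region filling is omitted for brevity.

```python
from statistics import median


def window(img, r, c):
    """Pixel values in the 3x3 window centred on (r, c), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, rows))
            for j in range(max(c - 1, 0), min(c + 2, cols))]


def apply_3x3(img, reducer):
    """Replace every pixel by reducer(values in its 3x3 window)."""
    return [[reducer(window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def median_filter(img):
    """Noise removal: each pixel becomes the median of its neighbourhood."""
    return apply_3x3(img, median)


def erode(mask):
    """Binary erosion: a pixel stays 1 only if its whole window is 1."""
    return apply_3x3(mask, min)


def dilate(mask):
    """Binary dilation: a pixel becomes 1 if any pixel in its window is 1."""
    return apply_3x3(mask, max)


def opening(mask):
    """Opening: erosion followed by dilation with the same structuring element."""
    return dilate(erode(mask))


def power_law(img, gamma, c=1.0):
    """Assumed power law form s = c * r**gamma for intensities r in [0, 1]."""
    return [[c * (r ** gamma) for r in row] for row in img]


# A flat 5x5 image with one impulse-noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
denoised = median_filter(noisy)

# A 3x3 block (kept by the opening) plus an isolated pixel (removed by it).
block = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(7)]
         for r in range(7)]
speckled = [row[:] for row in block]
speckled[5][5] = 1
opened = opening(speckled)

brightened = power_law([[0.25, 1.0]], gamma=0.5)
```

In this toy case the median filter removes the single bright pixel, and the opening removes the isolated pixel while restoring the 3x3 block. A gamma below 1 brightens dark pixels, which is one way a power law transformation can enhance contrast.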
• Gray Scale
• Texture
• Symmetrical
Fig. 6 Eroded Image
Fig. 7 Dilated Image
Accordingly, 3 kinds of features are extracted, which describe the structure information of gray scale, symmetry and texture [1]. These features certainly have some redundancy, but the idea is to find the potentially useful features.
1) Gray Scale features: The gray scale features that are extracted are mean, variance, standard deviation, skewness and kurtosis.
a) Standard Deviation = $\sqrt{\text{variance}}$ (3)
b) Skewness = $\frac{1}{(\text{variance})^{3}} \sum_{x=1}^{m} \sum_{y=1}^{n} (f(x,y) - \text{mean})^{3}$ (4)
c) Kurtosis = $\frac{1}{(\text{variance})^{4}} \sum_{x=1}^{m} \sum_{y=1}^{n} (f(x,y) - \text{mean})^{4}$ (5)
2) Texture features:
c) Inverse = $\sum_{i,j=1}^{n} \frac{p(i,j)}{(i-j)^{2}}$ (8)
d) Energy = $\sum_{i=1}^{n} \sum_{j=1}^{n} (p(i,j))^{2}$ (9)
e) Contrast = $\sum_{i=1}^{n} \sum_{j=1}^{n} (i-j)^{2} \, p(i,j)$ (10)
f) IDM = $\sum_{i=1}^{n} \sum_{j=1}^{n} \frac{p(i,j)}{1 + (i-j)^{2}}$ (11)
3) Symmetrical feature:
a) Exterior symmetry = [expression not recoverable from the source; it involves $\sum_{i=1}^{n} (m_i - M)^{2}$] (12)
E. Principal Component Analysis (PCA)
Excessive features used for classification not only increase computation time but also increase storage memory. They sometimes make classification more complicated. It is required to reduce the number of features to overcome the above mentioned problem.
PCA [4] is an efficient tool to reduce the dimension of a data set consisting of a large number of interrelated variables while retaining most of the variation. The reduced dimension means a reduced feature set, which acts as the input to the SVM during both the training part and the testing part.
Steps to be followed in PCA:
• Compute the mean of the data matrix.
• Subtract the mean from each image.
• Compute the covariance matrix.
• Compute the eigenvectors and eigenvalues of the covariance matrix.
• Arrange the eigenvectors according to the eigenvalues and as per the threshold value.
• Compute the feature matrix (the space onto which the testing image will be projected).
... given below:
$f(x) = w^{T} x + b$ (13)
such that for each training sample $x_i$ the function yields $f(x_i) \geq 0$ for $y_i = +1$, and $f(x_i) < 0$ for $y_i = -1$. In other words, training samples of the two different classes are separated by the hyperplane $f(x) = w^{T} x + b = 0$, where w is the weight vector, normal to the hyperplane, b is the bias or threshold, and $x_i$ is a data point.
Fig. 11 Linear SVM Classification (labels: maximum margin, support vectors, optimal hyperplane, Class 1, Class 2)
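To make the linear decision rule in (13) concrete, the following minimal sketch evaluates f(x) = w^T x + b and checks the sign condition for a handful of samples. The weight vector, bias, samples and labels are hypothetical values chosen for illustration; they are not taken from the paper's data, and no training is performed here.

```python
def decision_value(w, x, b):
    """f(x) = w^T x + b for one feature vector x."""
    return sum(w_k * x_k for w_k, x_k in zip(w, x)) + b


def predict(w, x, b):
    """Class +1 when f(x) >= 0, class -1 when f(x) < 0."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def separates(w, b, samples, labels):
    """True if the hyperplane f(x) = 0 puts every sample on its labelled side."""
    return all(predict(w, x, b) == y for x, y in zip(samples, labels))


# Hypothetical hyperplane and two-dimensional training samples.
w = [1.0, -1.0]
b = 0.5
samples = [[2.0, 0.0], [1.0, 0.5], [0.0, 2.0], [-1.0, 1.0]]
labels = [1, 1, -1, -1]

print(separates(w, b, samples, labels))  # True
```

Here w is normal to the separating line x1 - x2 + 0.5 = 0, so the sign of the decision value tells on which side of the line a sample lies.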
For a given training set, while there may exist many hyperplanes that separate the two classes, the SVM classifier is based on the hyperplane that maximizes the separating margin between the two classes (Fig. 11). In Fig. 11, the hyperplane maximizes the separating margin between the two classes, whose data points are indicated by black squares and black circles. Support vectors are the elements of the training set that lie on the boundary hyperplanes of the two classes.
2) Non-Linear SVM: In a linear SVM, a straight line or hyperplane is used to distinguish between two classes. However, for some data sets it is not possible to separate the two classes by drawing a straight line. In a nonlinear SVM classifier, a nonlinear operator $\Phi$ is used to map the input pattern x into a higher dimensional space H. The nonlinear SVM classifier is defined as
$f(x) = w^{T} \Phi(x) + b$ (14)
Linearly separable data may be analyzed with a hyperplane, and linearly non-separable data are analyzed with different kernel functions such as higher order polynomial and quadratic kernels.
The output of an SVM is a linear combination of the training examples, which are mapped onto a high-dimensional feature space through the use of kernel functions as in Fig. 12.
[Fig. 12: mapping $\Phi: x \to \Phi(x)$]
... are used to check the performance of the classifiers.
III. EXPERIMENTAL DISCUSSION
This methodology (Fig. 1) includes the following modules: image preprocessing, feature extraction, feature reduction, training and testing. Image preprocessing is used to improve the quality of images. Medical images are corrupted by different types of noise, such as Rician noise. It is very important to have good quality images for accurate observations in the given application. A median filter is used to remove noise while retaining as much as possible of the important signal features.
Skull masking is used to remove non-brain tissue such as scalp, skull, fat, eyes, neck, etc. from the MRI brain image. For skull masking, morphological operations such as erosion and dilation are used. It helps to improve the speed and accuracy of diagnostic and predictive procedures in medical applications. The morphological operations are followed by region filling and a power law transformation for image enhancement.
Feature extraction refers to various quantitative measurements of medical images typically used for decision making. In this work, 28 features are calculated for each image. In training, features have been extracted for 46 images, as shown in Fig. 13. The extracted feature set is applied to PCA.
PCA is used to reduce the feature set extracted from the images. The reduced features are submitted to a support vector machine for training and testing. Using PCA, the feature set is reduced from 28 to 24 features per image. Therefore this method will decrease the computation time and complexity.
The classification process is divided into two parts, i.e. the training part and the testing part. Firstly, in the training part, known data (i.e. 24 features * 46 images) are given to the classifier for training. Secondly, in the testing part, unknown data are given to the classifier and the classification is performed after the training part. The accuracy rate and error rate of the classification depend on the efficiency of the training.
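The feature reduction reported in this section (28 features reduced to 24 for 46 training images) can be sketched with NumPy by following the PCA steps listed in this paper. This is an illustrative sketch only: random numbers stand in for the real feature matrix, a fixed number of components replaces the eigenvalue threshold (whose value the text does not give), and the function names `pca_reduce` and `project` are hypothetical.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Reduce a (n_features, n_images) matrix, one column per image."""
    # Steps 1 and 2: compute the mean feature vector and subtract it.
    mean = features.mean(axis=1, keepdims=True)
    centered = features - mean
    # Step 3: covariance matrix of the features (rows are the variables).
    covariance = np.cov(centered)
    # Step 4: eigen-decomposition; eigh suits symmetric matrices and
    # returns the eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # Step 5: keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    basis = eigenvectors[:, order]
    # Step 6: project the centered data onto the retained eigenvectors.
    reduced = basis.T @ centered
    return reduced, basis, mean


def project(image_features, basis, mean):
    """Project the feature vector of one new (testing) image."""
    column = np.asarray(image_features, dtype=float).reshape(-1, 1)
    return basis.T @ (column - mean)


# Synthetic stand-in for the 28 x 46 training feature matrix.
rng = np.random.default_rng(0)
features = rng.normal(size=(28, 46))

reduced, basis, mean = pca_reduce(features, n_components=24)
test_vector = project(rng.normal(size=28), basis, mean)

print(reduced.shape, basis.shape, test_vector.shape)
```

The 24 x 46 matrix `reduced` plays the role of the known training data handed to the classifier, and `project` shows how an unknown image would be mapped into the same reduced space before testing.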
[Figure: MATLAB variable editor view of the extracted feature matrix (28 x 46 double), alongside a workspace listing that includes a 46 x 28 double matrix FV, a 24 x 46 double matrix and a 28 x 24 double matrix]
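Each column of such a feature matrix holds the features of one image. As a hedged illustration, the gray scale part of a column (mean, variance, standard deviation, skewness and kurtosis) could be computed as below. The sketch assumes the usual standardized-moment definitions of skewness and kurtosis, which may differ in normalization from the expressions used in this paper, and the function name `gray_scale_features` is hypothetical.

```python
import math


def gray_scale_features(image):
    """Mean, variance, standard deviation, skewness and kurtosis of a 2-D image."""
    pixels = [float(p) for row in image for p in row]
    count = len(pixels)
    mean = sum(pixels) / count
    variance = sum((p - mean) ** 2 for p in pixels) / count
    std = math.sqrt(variance)
    if variance == 0:
        # A constant image has no spread; report 0.0 instead of dividing by zero.
        skewness = kurtosis = 0.0
    else:
        # Assumption: standardized third and fourth central moments.
        skewness = sum((p - mean) ** 3 for p in pixels) / (count * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (count * variance ** 2)
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Tiny 2x2 example image with intensities 1..4.
features = gray_scale_features([[1, 2], [3, 4]])
print(features)
```

For the 2x2 example the intensities are symmetric about their mean of 2.5, so the skewness is zero.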