An Advanced Breast Tumor Classification Algorithm: Dinesh Kumar, Vijay Kumar, Jyoti, Sumer Poonia, Felix Deepak Minj

Research & Reviews: Journal of Computational Biology

Volume 1, Issue 1, April 2012, Pages 1-9


__________________________________________________________________________________________

STM Journals 2012. All Rights Reserved Page 1
An Advanced Breast Tumor Classification Algorithm

Dinesh Kumar^1, Vijay Kumar^2, Jyoti^3*, Sumer Poonia^4, Felix Deepak Minj^5

^1 Head & Asst. Prof., ^4 Asst. Prof., Department of Computer Science & Engineering, SIET, Shekhawati Group of Institutions, Sikar, Rajasthan, India
^2 Head & Professor, Department of Computer Science & Engineering, Marudhar Engineering College, Bikaner, Rajasthan, India
^3 Asst. Prof., Department of Electronics & Comm. Engineering, Sobhasaria Engineering College, Sikar, Rajasthan, India
^5 Dean & Professor, Department of Engineering, SIET, Shekhawati Group of Institutions, Sikar, Rajasthan, India

* Author for correspondence. E-mail: jyoti_matwa@yahoo.in, dinesh_matwa@yahoo.co.in, vijay_matwa@yahoo.com, f_d_07@yahoo.com

ABSTRACT

Classifying breast tumors by their shape is a difficult and challenging task. There are two
types of breast tumor: benign and malignant. Benign tumors are well defined and round or oval
in structure, while malignant tumors are ill defined and irregular in structure. In this paper,
a novel best fitting (BSF) algorithm is proposed for classifying breast malignancy. In this
algorithm, the centroid of a contour is calculated and used as the center of a best fitting
circle whose radius is the arithmetic mean of the minimum and maximum radial distances measured
from the centroid of the tumor. Entropy (E), the normalized mean radial distance (Nr) of a given
tumor, and the similarity between the best fitting circle and the tumor, measured in terms of
variance (σ²), are the features (F1, F2, and F3) used for classification of breast tumors. The
performance of each feature and of their combination is measured for all 150 contours and
evaluated through receiver operating characteristics (ROC). The datasets are taken from the
Rangayyan database as well as our local dataset for validation and verification.

Keywords: Breast tumor, Texture feature, Entropy, Normalized average radial distance, Variance,
Benign, Malignant

1. INTRODUCTION

Breast cancer is a common disease among women aged 50 and above. It affects one in 22 people
and has pushed cervical cancer into second place, according to the survey report published by
the ICMR [1]. This survey covers only metropolitan areas and does not account for rural areas.
Early detection and characterization of breast cancer can reduce the number of biopsies.
Currently, the digital mammogram is the best tool for breast cancer screening. A definite
diagnosis can be achieved only after biopsy, but biopsy is expensive and painful for the
patient, so image processing based techniques have been developed to avoid unnecessary
biopsies. However, interpretation of poorly illuminated mammograms is very difficult.
Physicians with different levels of experience can reach different results for the same
mammography images. To minimize this operator related deficiency, many image processing based
algorithms have been developed to assist the radiologist in analyzing breast mammograms. Some
researchers have used texture features for detecting and classifying breast masses. Baeg et al.
classified breast abnormalities based on textural features named denseness and architectural
distortion. A database consisting of 404 biopsy cases was classified with an area under the ROC
curve of 0.9. A sensitivity analysis was also performed to examine the robustness of the
introduced texture features to variation in the size of the abnormality masses [2]. Rangayyan
et al. used acutance as a feature for classification. This feature measures the averaged gray
level variation between the inside and outside of the tumor contour. The ROI was traced
manually for 54 tumors, with a classification rate of 92.6% [3]. El-Faramawy used various shape
features such as compactness, Fourier descriptors, moments, and chord-length statistics for
classification. By combining all these shape factors, he achieved a classification rate of 76%
[4]. Menut et al. classified breast masses with a parabolic modeling method with an accuracy of
76% [5]. Thitaikumar et al. reduced biopsies by 56.3% by using axial shear strain elastography
as a classifier for breast tumors [6]. Yang et al. detected and classified breast masses with
five texture and four shape features. They combined all these features as the input of a
probabilistic neural network classifier and achieved an 84.15% classification rate with an
overall area under the ROC curve of 0.93 [7]. Lee et al. analyzed the infiltrative nature of
breast masses using octave energy derived from the reversible round-off nonrecursive 1-D
discrete periodic wavelet transform. A test dataset of breast sonograms with the lesion contour
delineated by an experienced physician and three datasets of breast sonograms with the lesion
contour delineated by a Java-based image processing program, ImageJ, were built for feature
efficacy evaluation. They achieved accuracies of 95.1% and 84.4% for the manually delineated
and ImageJ-generated datasets, respectively, by combining the octave energy feature with a
morphometric feature [8]. Liu et al. developed a local texture based, fully automated breast
tumor classification system with an accuracy of 93.75% [9]. Nguyen et al. achieved a
classification accuracy of 94.6% with a fractal dimension based classification method [10].

The rest of the paper is organized as follows. Section 2 briefly describes our proposed
algorithm. Section 3 explains how the decision level for classification is obtained. Section 4
presents the results and discussion. Section 5 concludes the paper.

2. DESCRIPTION OF OUR PROPOSED ALGORITHM

The main objective of our algorithm is to classify the tumor as benign or malignant. The
randomness of the gray scale is one measurable feature for classification; the randomness of
the gray levels of the tumor contour is measured in terms of entropy (F1). To derive the second
and third features, a best fitting circle is drawn and fitted to the corresponding input tumor.
The radius of the best fitting circle is calculated as the arithmetic mean of the minimum and
maximum radial distances, measured from the centroid of the tumor to the tumor contour. The
normalized average radial distance (F2) is calculated from these measured radial distances. The
centroid of the tumor is used as the center of the circle, and the similarity between the best
fitting circle and the given tumor is measured in terms of variance as the third feature (F3).
A decision level is then calculated for each feature. To calculate the radial distances, the
centroid of the given tumor is first computed as follows.

2.1. Centroid Calculation
The centroid of the given input image is calculated from its vertices. The coordinates
(x_i, y_i) of each pixel of the contour (C) act as the vertices of the image [10]. If N
vertices are known, the coordinates of the centroid are calculated as follows:

$$x_c = \frac{\sum_{i=1}^{N} m_i x_i}{\sum_{i=1}^{N} m_i}; \qquad y_c = \frac{\sum_{i=1}^{N} m_i y_i}{\sum_{i=1}^{N} m_i} \tag{1}$$

where N is the total number of pixels in a given image, m_i is the weighting factor or pixel
value (white or black), and x_c and y_c are the coordinates of the centroid.
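The weighted centroid of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `weighted_centroid` and the sample square contour are hypothetical, and unit weights are assumed when the contour is binary.

```python
import numpy as np

def weighted_centroid(xs, ys, weights=None):
    """Return (x_c, y_c) as sum(m_i * x_i) / sum(m_i) and sum(m_i * y_i) / sum(m_i)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Assumption: on a binary contour every contour pixel carries the same weight.
    m = np.ones_like(xs) if weights is None else np.asarray(weights, dtype=float)
    total = m.sum()
    if total == 0:
        raise ValueError("weights sum to zero; the centroid is undefined")
    return float((m * xs).sum() / total), float((m * ys).sum() / total)

# Hypothetical contour: the four corners of a square.
x_c, y_c = weighted_centroid([0, 4, 4, 0], [0, 0, 4, 4])
print(x_c, y_c)
```

With equal weights the result is simply the mean of the contour coordinates; unequal weights pull the centroid toward the heavier pixels.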

2.2. Converting Radial Distance to the Radius of the Circle
After finding the centroid of the tumor, the same point is used as the center from which the
radial distance to the tumor contour (C) is measured. The radial distance is measured as
follows:

$$r(i,j) = \sqrt{(x_c - i)^2 + (y_c - j)^2}, \qquad (i,j) \in C \tag{2}$$

where r is the radial distance from the centroid to the tumor contour, x_c and y_c are the
coordinates of the centroid, and i and j are the coordinates of the input contour points.
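As an illustration of Eq. (2), the sketch below computes the radial distance of every contour point from a given centroid. The function name `radial_distances` and the four sample points are assumptions made for the example only.

```python
import numpy as np

def radial_distances(contour, centroid):
    """Euclidean distance from the centroid to each contour point (i, j)."""
    pts = np.asarray(contour, dtype=float)   # shape (N, 2): columns are i and j
    x_c, y_c = centroid
    return np.hypot(pts[:, 0] - x_c, pts[:, 1] - y_c)

# Hypothetical contour points around a centroid at the origin.
contour = [(3, 0), (0, 4), (-3, 0), (0, -4)]
r = radial_distances(contour, (0.0, 0.0))
print(r)
```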

The normalized mean radial distance can be calculated from the radial distance as follows:

$$N_r = \frac{M_1}{r_{\max}} \tag{3}$$

where M_1 is the mean radial distance,

$$M_1 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} r(i,j)$$



Due to the abrupt variation and ill-defined nature of malignant tumors, the normalized mean
radial distance is smaller for a malignant tumor than for a benign tumor. The arithmetic mean
of the minimum and maximum radial distances is used as the radius of the best fitting circle:

$$R = \frac{r_{\max} + r_{\min}}{2} \tag{4}$$

where

$$r_{\max} = \max r(i,j), \qquad r_{\min} = \min r(i,j), \qquad (i,j) \in C$$

The circle is fitted onto the corresponding input tumor with the centroid of the tumor as the
center of the circle, as shown in Figure 1.
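To make Eqs. (3) and (4) concrete, the following sketch computes the normalized mean radial distance and the best fitting radius from a list of radial distances. The function names and the two sample distance lists are hypothetical; they only contrast a round contour with an irregular one.

```python
import numpy as np

def normalized_mean_radial_distance(r):
    """F2: mean radial distance M_1 divided by the maximum radial distance."""
    r = np.asarray(r, dtype=float)
    return float(r.mean() / r.max())

def best_fit_radius(r):
    """R: arithmetic mean of the maximum and minimum radial distances."""
    r = np.asarray(r, dtype=float)
    return float((r.max() + r.min()) / 2.0)

round_like = [10.0, 10.0, 10.0, 10.0]   # hypothetical near-circular contour
irregular = [2.0, 10.0, 2.0, 10.0]      # hypothetical irregular contour

print(normalized_mean_radial_distance(round_like), best_fit_radius(round_like))
print(normalized_mean_radial_distance(irregular), best_fit_radius(irregular))
```

In this toy case the round contour gives a ratio of 1, while the irregular contour gives a smaller ratio, which matches the direction of the F2 rule described in the text.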
Fig. 1. The Proposed Circle Matched with the Tumor Contour for (a) a Benign Breast Tumor
Contour and (b) a Malignant Breast Tumor Contour.

After fitting the circle to the tumor, the difference between the tumor contour and the best
fitting circle is calculated as follows:

$$d(i,j) = \sqrt{(x_c - i)^2 + (y_c - j)^2} - R \tag{5}$$

The fitted circle may cross and merge with the tumor contour at different points, as shown in
Figure 1. Three cases are possible when matching the circle with the input tumor contour.

In the first case, the proposed circle extends beyond the exterior of the tumor at some points,
i.e., d(i,j) is negative:

$$\sqrt{(x_c - i)^2 + (y_c - j)^2} < R \tag{6}$$

In the second case, the circle lies in the interior of the tumor at some points, i.e., d(i,j)
is positive:

$$\sqrt{(x_c - i)^2 + (y_c - j)^2} > R \tag{7}$$

In the third case, the proposed circle merges with the tumor contour at some points, i.e.,
d(i,j) is equal to zero:

$$\sqrt{(x_c - i)^2 + (y_c - j)^2} = R \tag{8}$$
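The three cases can be checked numerically with a short sketch. It assumes the radial distances have already been measured; the function names, the tolerance `tol` used to decide when a point lies on the circle, and the sample distances are all hypothetical choices for illustration.

```python
import numpy as np

def circle_differences(r, R):
    """Signed difference d = r - R between each radial distance and the circle radius."""
    return np.asarray(r, dtype=float) - R

def count_cases(d, tol=1e-9):
    """Count contour points in each of the three cases described in the text."""
    d = np.asarray(d, dtype=float)
    touching = np.abs(d) <= tol              # third case: d equal to zero
    return {
        "circle_outside_tumor": int(np.sum((d < 0) & ~touching)),  # first case
        "circle_inside_tumor": int(np.sum((d > 0) & ~touching)),   # second case
        "circle_on_contour": int(np.sum(touching)),
    }

r = [2.0, 10.0, 6.0, 6.0]                    # hypothetical radial distances
R = (max(r) + min(r)) / 2.0                  # best fitting radius
d = circle_differences(r, R)
cases = count_cases(d)
print(d, cases)
```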
The variance of the measured differences is calculated as follows:

$$\text{Variance } (\sigma^2) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \tag{9}$$

where N is the total number of elements and \bar{x} is the mean,

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
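A minimal sketch of the variance feature follows. It assumes the N - 1 (sample) normalization, which corresponds to `ddof=1` in NumPy; the function name `fit_variance` and the sample differences are hypothetical.

```python
import numpy as np

def fit_variance(d):
    """F3: sample variance of the differences between contour and fitted circle."""
    d = np.asarray(d, dtype=float)
    mean = d.mean()
    explicit = ((d - mean) ** 2).sum() / (d.size - 1)
    # np.var with ddof=1 uses the same N - 1 normalization.
    assert np.isclose(explicit, np.var(d, ddof=1))
    return float(explicit)

print(fit_variance([0.0, 0.0, 0.0, 0.0]))    # contour that coincides with the circle
print(fit_variance([-4.0, 4.0, 0.0, 0.0]))   # contour that deviates from the circle
```

A contour that coincides with the circle gives zero variance, and larger deviations give larger values, which is why a high F3 points toward an irregular outline.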



Similarly, the randomness of a gray scale image is measured in terms of entropy. In this paper,
entropy is used as the first feature (F1) for measuring the randomness of the gray scale
variation. Entropy is a scalar value: a statistical measure of randomness that can be used to
characterize the texture of the input image.
Entropy is defined as

$$E = -\sum_{i=1}^{M} \sum_{j=1}^{N} I(i,j) \log\bigl(I(i,j)\bigr) \tag{10}$$
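The entropy feature can be sketched as below, using the conventional negative sign of Shannon entropy. The paper does not state how the image values are normalized, so this example assumes the gray levels are non-negative and scaled to sum to one, and it treats 0 * log(0) as 0. The function name `image_entropy` is hypothetical.

```python
import numpy as np

def image_entropy(image):
    """Entropy -sum(I * log(I)) of an image whose values are scaled to sum to one."""
    img = np.asarray(image, dtype=float)
    total = img.sum()
    if total <= 0:
        return 0.0
    p = img / total          # assumption: normalize so that the values sum to one
    p = p[p > 0]             # skip zeros, since 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum())

flat = np.ones((2, 2))                       # uniform gray levels
single = np.array([[0.0, 0.0], [0.0, 5.0]])  # all intensity in one pixel
print(image_entropy(flat), image_entropy(single))
```

Under this normalization a uniform patch gives the largest value for its size and a patch with a single non-zero pixel gives zero.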

Table I shows the features used in this study and their corresponding descriptions.


Table I: Features Used in This Study with Their Description.

Notation/feature   Description
F1                 $E = -\sum_{i=1}^{M} \sum_{j=1}^{N} I(i,j) \log\bigl(I(i,j)\bigr)$
F2                 $N_r = \frac{M_1}{r_{\max}}$, where $M_1 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} r(i,j)$
F3                 $\text{Variance } (\sigma^2) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$

After extracting these three features, tumors are classified as summarized in Table II.

Table II: Feature Based Classification.

Tumor type   Decision
Malignant    F1 > decision level
Benign       F1 <= decision level

Malignant    F2 < decision level
Benign       F2 >= decision level

Malignant    F3 > decision level
Benign       F3 <= decision level
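The rules in Table II amount to a one-sided threshold per feature, which can be sketched as follows. The names `RULES` and `classify` are hypothetical, and the decision levels in the usage lines are the values reported for this dataset in Section 4.

```python
# Direction of the malignant side of each threshold, taken from Table II.
RULES = {"F1": "greater", "F2": "less", "F3": "greater"}

def classify(feature, value, decision_level):
    """Return 'malignant' or 'benign' for one feature value."""
    if RULES[feature] == "greater":
        malignant = value > decision_level
    else:
        malignant = value < decision_level
    return "malignant" if malignant else "benign"

# Decision levels reported in Section 4: 0.0096 (F1), 0.8 (F2), 75 (F3).
print(classify("F1", 0.02, 0.0096))
print(classify("F2", 0.85, 0.8))
print(classify("F3", 120.0, 75))
```

A value exactly at the decision level falls on the benign side for every feature, following the "<=" and ">=" entries of the table.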

3. FINDING DECISION LEVEL

Initially, the minimum value of each feature (F1, F2, and F3) is chosen as the threshold value
for classification, and the classification accuracy is calculated for this threshold. The
procedure is then repeated with a constant increment of the threshold until either the accuracy
reaches its maximum or the error reaches its minimum. The threshold at which the accuracy is
maximal, or equivalently the error is minimal, is taken as the decision level for
classification.

The performance of the classifier is evaluated through receiver operating characteristics
(ROC). In ROC analysis, if the actual class and the class predicted by our proposed method are
both malignant, the case is called a true positive. If a malignant tumor is predicted as
benign, the case is called a false negative; if a benign tumor is predicted as malignant, it is
called a false positive. If a benign tumor is predicted as benign, the case is called a true
negative. Hit rate, false alarm rate (expense), specificity, and accuracy are calculated to
measure the efficiency of the classifiers.
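The threshold search described above can be sketched as a simple sweep. This is an approximation under stated assumptions rather than the authors' procedure: it scans the whole range from the minimum to the maximum feature value in `steps` equal increments and keeps the first threshold with the highest accuracy. The function name, the `steps` parameter, and the sample data are hypothetical.

```python
import numpy as np

def find_decision_level(values, is_malignant, malignant_if_greater=True, steps=100):
    """Sweep thresholds from min to max and return (best_level, best_accuracy)."""
    values = np.asarray(values, dtype=float)
    truth = np.asarray(is_malignant, dtype=bool)
    best_level, best_acc = None, -1.0
    for level in np.linspace(values.min(), values.max(), steps + 1):
        pred = values > level if malignant_if_greater else values < level
        acc = float(np.mean(pred == truth))
        if acc > best_acc:               # keep the first threshold reaching the maximum
            best_level, best_acc = float(level), acc
    return best_level, best_acc

# Hypothetical feature values: the three low ones are benign, the three high ones malignant.
level, acc = find_decision_level([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1], steps=8)
print(level, acc)
```

For a feature such as F2, where small values indicate malignancy, the same sweep applies with `malignant_if_greater=False`.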

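$$\text{True positive rate} = \frac{\text{True positive}}{\text{True positive} + \text{False negative}} \tag{11}$$

$$\text{False positive rate} = \frac{\text{False positive}}{\text{False positive} + \text{True negative}} \tag{12}$$

$$\text{Accuracy} = \frac{\text{True positive} + \text{True negative}}{\text{True positive} + \text{False positive} + \text{True negative} + \text{False negative}} \tag{13}$$

$$\text{Specificity} = 1 - \text{False positive rate} \tag{14}$$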
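As a worked check of Eqs. (11) to (14), the sketch below evaluates them on the F3 counts reported in Section 4 (78 of 79 malignant and 67 of 71 benign tumors classified correctly), treating malignant as the positive class. The function name `roc_metrics` is hypothetical.

```python
def roc_metrics(tp, fn, fp, tn):
    """True positive rate, false positive rate, accuracy and specificity from counts."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy, "specificity": 1.0 - fpr}

# F3 counts: 78 malignant correct, 1 malignant missed, 4 benign wrong, 67 benign correct.
m = roc_metrics(tp=78, fn=1, fp=4, tn=67)
print({name: round(value, 4) for name, value in m.items()})
```

The rounded values agree with the F3 sensitivity, specificity, and accuracy listed in Table IV.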


4. RESULTS AND DISCUSSION

In this paper, we tested 150 breast tumor contours obtained from Prof. R. M. Rangayyan's
database as well as our local hand drawn database. Our proposed algorithm was applied to all
150 contours and all three features were extracted for classification. The decision levels of
the individual features are calculated as 0.0096, 0.8, and 75 for F1, F2, and F3, respectively.
Based on these decision levels, tumors are classified with accuracies of 88.667%, 92.667%, and
96.67% for F1, F2, and F3, respectively. Based on F1, 13 out of 71 benign tumors and 5 out of
79 malignant tumors are classified wrongly. Similarly, 5 out of 71 benign and 6 out of 79
malignant tumors are misclassified based on F2, and 4 out of 71 benign and 1 out of 79
malignant tumors are misclassified based on F3, as displayed in Table III. The performance of
classification is measured in terms of the ROC parameters Ac, Az, Se, Sp, PPV, and NPV for all
three features (F1, F2, and F3) and compared with other existing systems in Table IV. The
values are 86.667%, 0.901, 0.9620, 0.7246, 0.8261, and 0.9483 for F1; 92.6667%, 0.9442, 0.9241,
0.9296, 0.9359, and 0.9167 for F2; and 96.667%, 0.972, 0.9873, 0.9437, 0.9512, and 0.9853 for
F3.





Table IIIa: Breast Tumor Classification Based on F1.
(Contours from R. M. Rangayyan and our local hand drawn database.)

Class       Cases   Correctly classified   Misclassified   % Correct
Benign      71      58                     13              81.69%
Malignant   79      74                     5               93.67%
Total       150     132                    18              88.67%

Table IIIb: Breast Tumor Classification Based on F2.
(Contours from R. M. Rangayyan and our local hand drawn database.)

Class       Cases   Correctly classified   Misclassified   % Correct
Benign      71      66                     5               92.95%
Malignant   79      73                     6               92.4%
Total       150     139                    11              92.67%

Table IIIc: Breast Tumor Classification Based on F3.
(Contours from R. M. Rangayyan and our local hand drawn database.)

Class       Cases   Correctly classified   Misclassified   % Correct
Benign      71      67                     4               94.36%
Malignant   79      78                     1               98.73%
Total       150     145                    5               96.67%

To exploit the mutual benefits of the features, all three are combined. The decision is made in
favor of the class indicated by any two of the three features. With this combination, three out
of 71 benign and one out of 79 malignant tumors are misclassified; that is, the rate of
misclassification is reduced. The overall accuracy of the feature combination method improves
to 97.33%, as displayed in Table VI. To evaluate and validate our proposed system, the overall
accuracy based on F1, F2, and F3 and on their combination is compared with other existing
systems in Table V as well as in Figure 2.
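The two-out-of-three decision described above can be sketched as a majority vote over the three per-feature rules. The function name `combined_decision` and the sample feature values are hypothetical; the decision levels are the ones reported earlier in this section.

```python
# Decision levels reported for F1, F2 and F3 in this section.
LEVELS = {"F1": 0.0096, "F2": 0.8, "F3": 75}

def combined_decision(f1, f2, f3, levels=LEVELS):
    """Label a tumor malignant when at least two of the three features say so."""
    votes = [
        f1 > levels["F1"],   # high entropy votes malignant
        f2 < levels["F2"],   # low normalized mean radial distance votes malignant
        f3 > levels["F3"],   # high variance from the fitted circle votes malignant
    ]
    return "malignant" if sum(votes) >= 2 else "benign"

print(combined_decision(0.02, 0.9, 120.0))   # two malignant votes
print(combined_decision(0.02, 0.9, 10.0))    # one malignant vote
```

Because the vote needs agreement from two features, a single feature that errs on a given contour does not by itself change the final label.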

Table IV: Comparison of ROC Parameters.

Measured parameter                Wavelet [8]   Local texture [9]   BSF (F1)   BSF (F2)   BSF (F3)
Area under ROC (Az)               0.934         0.968               0.901      0.9442     0.972
Accuracy (Ac)                     0.84          0.9375              0.8667     0.9267     0.9667
Sensitivity (Se)                  0.933         0.95                0.9620     0.9241     0.9873
Specificity (Sp)                  0.795         0.9231              0.7246     0.9296     0.9437
Positive predictive value (PPV)   0.714         0.9344              0.8261     0.9359     0.9512
Negative predictive value (NPV)   0.956         0.9412              0.9483     0.9167     0.9853

Wavelet: wavelet based classification method [8]. Local texture: local texture based
classification method [9]. BSF (F1), BSF (F2), BSF (F3): our proposed BSF algorithm based on
F1, F2, and F3, respectively.


Table V: Comparison of Accuracy with Other Existing Systems.

S. No.   Authors' names and their proposed method                            Percentage of accuracy
1        Menut et al./parabolic modeling method [5]                          76%
2        Mudigonda et al./iterative boundary segmentation algorithms [12]    81%
3        Lee et al./wavelet based breast tumor classification method [8]     84%
4        Our proposed method based on
         (a) Entropy (F1)                                                    88.667%
         (b) Normalized mean radial length (F2)                              92.667%
         (c) Variance (F3)                                                   96.667%
         (d) Combination of all three features                               97.33%




Table VI: Tumor Classification (Based on Combination of All Three Features).
(Contours from R. M. Rangayyan and our local hand drawn database.)

Class       Cases   Correctly classified   Misclassified   % Correct
Benign      71      68                     3               95.77%
Malignant   79      78                     1               98.73%
Total       150     146                    4               97.33%

5. CONCLUSION

In this paper, a novel best fitting algorithm is proposed for classifying breast masses as
benign or malignant. The proposed algorithm has two stages. In the first stage, features are
extracted for classification. In the second stage, the classifier performance is measured in
terms of ROC parameters for all 150 contours from the global and local hand drawn databases.
The results show that the classifier performance was not statistically sensitive to the size of
the ROI.

REFERENCES

1. www.icmr.nci.in
2. Baeg S. and Kehtarnavaz N. Electronic Letters on Computer Vision and Image Analysis 2002.
   1(1). 1-20p.
3. Rangayyan R. M., El-Faramawy N. M., Desautels J. E. L., et al. IEEE Transactions on Medical
   Imaging 1997. 16(6). 799-810p.
4. El-Faramawy N., Rangayyan R. M., Desautels J., et al. Shape factors for analysis of breast
   tumors in mammograms. In Proceedings Canadian Conference on Electrical and Computer
   Engineering, Calgary, AB, Canada, 1996, 355-358p.
5. Menut O., Rangayyan R., and Desautels J. Classification of breast tumors via parabolic
   modeling of their contours. In Proceedings 1997 IEEE Pacific Rim Conference on
   Communications, Computers and Signal Processing, Victoria, BC, Canada, August 1997.
   1002-1005p.
6. Thitaikumar A., Mobbs L. M., Kraemer-Chant C. M., et al. Physics in Medicine and Biology
   2008. 53(17). 4809-4823p.
7. Yang S.-C., Wang C.-M., Chung Y.-N., et al. Biomedical Engineering: Applications, Basis and
   Communications 2005. 17(5). 215-228p.
8. Lee H. W., Liu B. D., Hung K. C., et al. IEEE Journal of Selected Topics in Signal
   Processing 2009. 3(1). 81-93p.
9. Liu B., Cheng H. D., Huang J., et al. Fully automatic and segmentation-robust classification
   of breast tumors based on local texture analysis. In Proceedings of 11th Joint Conference on
   Information Sciences, China, December 2008. 1-7p.
10. Nguyen T. M. and Rangayyan R. M. Shape analysis of breast masses in mammograms via the
    fractal dimension. Proceedings of 22nd Annual IEEE International Conference on Engineering
    in Medicine and Biology, Shanghai, China, 2005. 3210-3213p.
11. Lee N. J. Computer-aided diagnostic systems for digital mammograms. Master of Science
    Dissertation, Louisiana State University, Louisiana, 2006.
12. Mudigonda N. R., Rangayyan R. M., and Desautels J. E. L. Concavity and convexity analysis
    of mammographic masses via an iterative boundary segmentation algorithm. In Proceedings of
    IEEE Canadian Conference on Electrical and Computer Engineering, Alberta, Canada, 1999.
    1489-1494p.
