Professional Documents
Culture Documents
Breast Cancer Diagnosis Using Artificial Neural Networks With Extreme Learning Techniques
Breast Cancer Diagnosis Using Artificial Neural Networks With Extreme Learning Techniques
Abstract—Breast cancer is the second cause of dead among ANN is one of the best artificial intelligence techniques for
women. Early detection followed by appropriate cancer common data mining tasks, such classification and regression
treatment can reduce the deadly risk. Medical professionals can problems. A lot of research showed that ANN delivered good
make mistakes while identifying a disease. The help of technology accuracy in breast cancer diagnosis. However, this method has
such as data mining and machine learning can substantially several limitations. First, ANN has some parameters to be
improve the diagnosis accuracy. Artificial Neural Networks tuned in the beginning of training process such as number of
(ANN) has been widely used in intelligent breast cancer hidden layer and hidden nodes, learning rates, and activation
diagnosis. However, the standard Gradient-Based Back function. Second, it takes long time for training process due to
Propagation Artificial Neural Networks (BP ANN) has some
complex architecture and parameters update process in each
limitations. There are parameters to be set in the beginning, long
iteration that need expensive computational cost. Third, it can
time for training process, and possibility to be trapped in local
minima. In this research, we implemented ANN with extreme
be trapped to local minima so that the optimal performance
learning techniques for diagnosing breast cancer based on Breast cannot be guaranteed. Numerous efforts had been attempted to
Cancer Wisconsin Dataset. Results showed that Extreme get the solutions of neural networks limitations. Huang and
Learning Machine Neural Networks (ELM ANN) has better Babri [6] proved that Single Hidden Layer Neural Networks
generalization classifier model than BP ANN. The development of (SFLN) with tree steps extreme learning process that called
this technique is promising as intelligent component in medical ELM can solve that problems.
decision support systems.
In this paper, we revealed the implementation of artificial
Keywords—breast cancer; artificial neural networks; extreme
neural networks with extreme learning techniques in breast
learning machine; medical decision support systems cancer diagnosis. The dataset used for experiments was Breast
Cancer Wisconsin Dataset that was obtained from the
I. INTRODUCTION University of Wisconsin Hospital, Madison from Dr. William
The out of control development of cells in an organ is H. Wolberg [7]. We compared the perfomance of ELM with
conventional BP ANN with gradient descent based learning
called tumors that can be cancerous. There are two kinds of
algorithms. Sensitivity, specificity, and accuracy were used as
tumors, benign and malignant. Benign or non-cancerous tumors
are not spreading and are not life intimidating. In the other performance measurements. Results showed that ELM ANN
generally produced better result than BP ANN.
hand, malignant or cancerous tumor are expanding and life
threatening [1]. Malignant breast cancer is defined when the The rest of this paper is organized as the following. Section
growing cells are in the breast tissue. Breast cancer is the 2 is dedicated as literature review. In this section, brief review
second overall cause of mortality among women and the first of previous works in breast cancer diagnosis are presented. In
cause of dead among them between 40 and 55 ages [2]. Section 3, the concept, mathematical model, and training
process of extreme learning machine are explained. In Section
Regular breast cancer diagnosis followed by appropriate
4, experiments, results, and analysis are provided. Finally,
cancer treatment can reduce the unwilling risk. It is suggested
to do tumor evaluation test every 4-6 weeks. Based on that conclusions and future works are given in Section 5.
reason, benign and malignant detection based on classification II. LITREATURE REVIEW
features become very important [3].
The uses of classification systems in medical diagnosis,
Careful diagnosis in early detection has been proven to including breast cancer diagnosis, are growing rapidly.
lessen the dead rate because of breast cancer [4]. Depend on the Evaluation and decision making process from expert medical
expertise, mistakes can be made by medical professionals while diagnosis is key important factor. However, intelligent
identifying a disease. With the help of technology such as data classification algorithm may help doctor especially in
mining and machine learning, diagnosis can be more accurate minimizing error from unexperienced practitioners [3].
(91.1%) when related to a diagnosis made by an experienced
doctor (79.9%) [5]. Several techniques have been deployed to predict and
recognize meaningful pattern for breast cancer diagnosis. Ryua
10 | P a g e
www.ijarai.thesai.org
(IJARAI) International Journal of Advanced Research in Artificial Intelligence,
Vol. 3, No. 7, 2014
[8] developed data classification method, called isotonic βi = [βi1, βi2, . . . , βip]T,
separation. The performances were compared against support (3)
vector machines, learning vector quantization, decision tree
induction, and other methods based on two-breast cancer data wi ∙ x(k) is the inner product of wi and x(k),
set, sufficient and insufficient data. The experiment results bi is the bias of the i-th hidden node,
demonstrated that isotonic separation was a practical tool for o(k) ϵ Rp is the output of neural network for k-th vector.
classification in the medical domain.
Hybrid machine learning method was applied by Sahan [9] SLFN can approximate m vectors means that there are exist
in diagnosing breast cancer. The method hybridized a fuzzy- wi, βi, and bi, such that:
artificial immune system with k-nearest neighbour algorithm.
The hybrid method delivered good accuray in Wisconsin
Breast Cancer Dataset (WBCD). They believe it can also be
(4)
tested in other breast cancer diagnosis problems.
Comprehensive view of automated diagnostic systems
implementation for breast cancer detection was provided by
Ubeyli [10]. It compared the performances of multilayer
perceptron neural network (MLPNN), combined neural
network (CNN), probabilistic neural network (PNN), recurrent (5)
neural network (RNN) and support vector machine (SVM). The
aim of that works was to be a guide for a reader who wants to
develop this kind of systems. Equation (5) can be written as:
Numerous combinations and hybrid systems used neural
networks as a component. However, since almost all of the , (6)
employed neural networks are conventional gradient descent
BP ANN, the novel or hybrid method still suffered the neural where
networks drawbacks that were mentioned in the previous
section. H є Rm x M is the hidden layer output matrix of the neural
III. EXTREME LEARNING MACHINE networks.
Huge efforts had been attempted to solve the weaknesses of
g ( w1 x (1) b1 ) g ( wM x (1) bM )
BP ANN. Huang and Babri [6] demonstrated that single hidden H= (7)
layer feedforward neural networks (SLFN) with at most m
hidden nodes was capable to estimate function for m different g ( w1 x ( m ) b1 ) g ( wM x ( m ) bM )
vectors in training dataset.
Given m instances in D = {(x(k), t(k)) | x(k) ϵ Rn, t(k) ϵ Rp, k = β є RM xp
is the weights between hidden and output layers
1,..,m} as training dataset where x(k) = [x1(k), x2(k), …., xn(k)]T as
features and t(k) = [t1(k), t2(k), …., tp(k)]T as target. A SLFN with M
1T
number of hidden nodes, activation function g(x) in hidden
β= , (8)
nodes, and linear activation function in output nodes is
T
mathematically wrote as:
M
t (1) T
(1)
T= , (9)
where t ( m ) T
wi ϵ Rn is the weights between the input nodes and the i-th
hidden node In the traditional gradient descent based learning algorithm,
weights wi which was connecting the input layer and hidden
wi = [wi1, wi1, . . . , win ]T, layer and biases bi in the hidden nodes were needed to be
(2) initialized and tuned in every iteration. This was the main
factor which often made training process of neural became time
βi ϵ Rp is the weights between the i-th hidden node and the consuming and the trained model may not reach global
minima.
output nodes
11 | P a g e
www.ijarai.thesai.org
(IJARAI) International Journal of Advanced Research in Artificial Intelligence,
Vol. 3, No. 7, 2014
Huang [11] proposed minimum norm least-squares solution Cancer Wisconsin Dataset obtained from the University of
of SLFN which didn’t need to tune those parameters. Training Wisconsin Hospital, Madison from Dr. William H. Wolberg
SLFN with fixed input weights wi and the hidden layer biases bi [7]. The data has 699 instances with 10 attributes plus the class
was similar to find a least square solution of the linear attributes. The class distribution are 65.5% (458 instances) for
benign and 34.5% (241 instances) for malignant. The attribute
system :
information can be seen in TABLE I.
– TABLE I. ATTRIBUTE INFORMATION
(10) # Attributes Domain
1 Sample Code Number id number
The smallest norm least squares solution of that linear 2 Clump Thickness 1 – 10
system was 3 Uniformity of Cell Size 1 – 10
4 Uniformity of Cell Shape 1 – 10
(11) 5 Marginal Adhesion 1 – 10
6 Single Epithelial Cell Size 1 – 10
7 Bare Nuclei 1 – 10
where was the Moore-Penrose generalized inverse of 8 Bland Chromatin 1 – 10
matrix H. This solution had three important properties which 9 Normal Nucleoli 1 – 10
were minimum training error, smallest norm of weights, and 10 Mitoses 1 – 10
unique solution which is . 11 Class 2: benign; 4: malignant
12 | P a g e
www.ijarai.thesai.org
(IJARAI) International Journal of Advanced Research in Artificial Intelligence,
Vol. 3, No. 7, 2014
13 | P a g e
www.ijarai.thesai.org
(IJARAI) International Journal of Advanced Research in Artificial Intelligence,
Vol. 3, No. 7, 2014
However, in term of specificity, BP ANN has better The development of interfaces in mobile, desktop, or web
performance in three experiments which were experiment #1, application may be useful. Third, there are new cases added
#2, and #5. regularly in the hospital. Developing intelligent diagnosis
systems that can not only learn from available data in
To get general conclusion, overall comparison need to be repositories but also from newly available data will be required.
computed. In each performance measurement, commulative
average rates were matched. Fig 4 displays the whole REFERENCES
sensitivity, specificity, and accuracy average rates between BP [1] T. S. Subashini, V. Ramalingam, and S. Palanivel, “Breast mass
ANN and ELM ANN. Result showed that, generally ELM classification based on cytological patterns using RBFNN and SVM,”
Expert Syst. Appl., vol. 36, no. 3, pp. 5284–5290, Apr. 2009.
ANN were better than BP ANN.
[2] TheAmericanCancerSociety, “What is Breast Cancer.” .
0.980 0.974 [3] M. F. Akay, “Support vector machines combined with feature selection
1.000 0.948 0.964 for breast cancer diagnosis,” Expert Syst. Appl., vol. 36, no. 2, pp. 3240–
0.921
3247, Mar. 2009.
0.900 0.843 [4] D. West, P. Mangiameli, R. Rampal, and V. West, “Ensemble strategies
0.800 for a medical diagnostic decision support system: A breast cancer
diagnosis application,” Eur. J. Oper. Res., vol. 162, no. 2, pp. 532–551,
0.700 Apr. 2005.
[5] R. W. Brause, “Medical Analysis and Diagnosis by Neural Networks,”
0.600 in Proceeding ISMDA ’01 Proceedings of the Second International
Symposium on Medical Data Analysis, 2001, pp. 1–13.
0.500
[6] G. B. Huang and H. A. Babri, “Upper bounds on the number of hidden
Sensitivity Specificity Accuracy neurons in feedforward networks with arbitrary bounded nonlinear
activation functions.,” IEEE Trans. Neural Netw., vol. 9, no. 1, pp. 224–
BP ANN ELM ANN 9, Jan. 1998.
[7] W. H. Wolberg and O. L. Mangasarian, “Multisurface method of pattern
Fig. 4. Overall average rates between BP ANN dan ELM separation for medical diagnosis applied to breast cytology,” in
Proceedings of the National Academy of Sciences, U.S.A., Volume 87,
December 1990, pp. 9193–9196.
V. CONLUSION AND FUTURE WORKS
[8] Y. U. Ryu, R. Chandrasekaran, and V. S. Jacob, “Breast cancer
The performances of ELM ANN were generally better than prediction using the isotonic separation technique,” Eur. J. Oper. Res.,
BP ANN in breast cancer diagnosis. Although the specificity vol. 181, no. 2, pp. 842–854, Sep. 2007.
rate was slightly lower than BP ANN, it can be clearly seen [9] S. Sahan, K. Polat, H. Kodaz, and S. Güneş, “A new hybrid method
that ELM ANN remarkably improved the sensitivity and based on fuzzy-artificial immune system and k-nn algorithm for breast
cancer diagnosis.,” Comput. Biol. Med., vol. 37, no. 3, pp. 415–23, Mar.
accuracy rates. Based on these results, we can conclude that 2007.
ELM ANN has better generalization model than BP ANN in [10] E. D. Übeyli, “Implementing automated diagnostic systems for breast
diagnosing breast cancer based on Breast Cancer Wisconsin cancer detection,” Expert Syst. Appl., vol. 33, no. 4, pp. 1054–1062,
Dataset. Nov. 2007.
[11] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning machine:
There are some necessary works to be done in near future. Theory and applications,” Neurocomputing, vol. 70, no. 1–3, pp. 489–
First, it is important to communicate the model with domain 501, Dec. 2006.
expert. The hybrid of ELM ANN with Decision Tree or any [12] C. P. Utomo, “The Hybrid of Classification Tree and Extreme Learning
other technique that can produce meaningful knowledge Machine for Permeability Prediction in Oil Reservoir,” Int. J. Comput.
representation will be promising. Second, to make intelligent Sci. Issues, vol. 10, no. 1, pp. 52–60, 2013.
diagnosis tool that can be used by end user, it is necessary to
develop interactive user interface.
14 | P a g e
www.ijarai.thesai.org