
Arabian Journal for Science and Engineering (2020) 45:2459–2471

https://doi.org/10.1007/s13369-019-04026-y

RESEARCH ARTICLE - SPECIAL ISSUE - INTELLIGENT COMPUTING AND INTERDISCIPLINARY APPLICATIONS

New Approaches in Metaheuristic to Classify Medical Data Using Artificial Neural Network
Soumya Das1 · Sarojananda Mishra2 · Manas Ranjan Senapati3

Received: 14 January 2019 / Accepted: 2 July 2019 / Published online: 10 July 2019
© King Fahd University of Petroleum & Minerals 2019

Abstract
Artificial Neural Networks (ANNs) are a well-established technique for classification. Building on this, we propose a method called the Velocity-Enhanced Whale Optimization Algorithm (VEWOA), hybridized with an Artificial Neural Network, for classifying and diagnosing cancer data covering breast cancer, cervical cancer and lung cancer. The proposed algorithm is compared with four benchmark algorithms from the literature: C4.5, Learning Vector Quantization, Linear Discriminant Analysis and the Factorized Distribution Algorithm. The robustness of the proposed approach is validated by computing the Correct Classification Rate and the Averaged Square Classification Error, which demonstrate that its performance is better than that of the other techniques.

Keywords  Velocity-Enhanced Whale Optimization Algorithm (VEWOA) · Artificial Neural Network · Linear Discriminant Analysis (LDA) · Learning Vector Quantization (LVQ) · Factorized Distribution Algorithm (FDA)

Soumya Das is a Ph.D. Scholar at Biju Patnaik University of Technology, Rourkela.

* Manas Ranjan Senapati
manassena@gmail.com

Soumya Das
aug10.soumya@gmail.com

Sarojananda Mishra
sarose.mishra@gmail.com

1 Biju Patnaik University of Technology, Rourkela, India
2 Department of Computer Science and Engineering, Indira Gandhi Institute of Technology, Sarang, India
3 Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India

1 Introduction

Cervical cancer is the fourth most common cause of cancer death worldwide. In 2012, nearly 528,000 cases of cervical cancer were reported, contributing 8% of total cancer cases and cancer deaths. About 70% of cervical cancer cases were reported in developing countries. Cancer occurs due to the abnormal growth of a specific type of body cell that has the ability to invade other parts of the body. No signs can be detected at an early stage; in the later stages, bleeding and pelvic pain are among the symptoms. In 90% of cases, human papillomavirus (HPV) infection causes cervical cancer. Other risk factors include smoking, a weak immune system and birth control pills. The HPV vaccine protects against 90% of cervical cancers, but the risk of cancer relapse remains the same. The PAP test or acetic acid is used to screen for cervical cancer at precancerous stages; essentially, it monitors the precancerous changes occurring inside the body. Treatment of cervical cancer consists of surgery, chemotherapy and radiation therapy. While the 5-year survival rate in the USA is up to 68% [1], the success rate still depends largely on early detection.

Lung cancer is defined by uncontrolled cell growth in lung tissue. The cancerous tissue spreads beyond the lung into other parts of the body by a process called metastasis. Lung cancer, also known as lung carcinoma, can be categorized into two types: small-cell lung carcinoma (SCLC) and non-small-cell lung carcinoma (NSCLC) [2]. The most common symptoms are coughing, weight loss, shortness of breath and chest pain. The most influential cancer-causing factor is smoking. Nearly 10-15% of lung cancer cases occur in non-smokers; in those cases, the cancer is caused by a combination of genetic factors and exposure to radon gas, asbestos, second-hand smoke or other forms


of air pollution. It can be seen on chest radiographs and computerized tomography (CT) scans, and the diagnosis is confirmed by biopsy. In 2012, nearly 1.8 million people were diagnosed with lung cancer, and nearly 1.6 million of them lost their lives to it. This makes it the second most common cancer in women and one of the primary causes of death among men.

Breast cancer is considered the main cause of cancer death in women. This type of cancer cannot be identified at an early stage [3]; thus, it is very difficult to recognize the accurate symptoms of breast cancer [4]. Women who have suffered from breast cancer before carry the risk of having the disease once again. Early detection of breast cancer is the only way to overcome this disease. Many tests are performed to determine whether a tumor is malignant or not [5]. A difficulty with breast cancer is that most women hesitate to report all of their symptoms and signs at diagnosis time. One of the best ways to determine the nature of a cancer in the breast is biopsy, but the success rate of biopsy is only within 10-31%. The 'malignant group' has lately been detected using artificial intelligence. The breast cancer diagnosis (BCD) issue is framed as a classification problem [6]. In this regard, machine learning is considered highly effective because it can capture complex nonlinear relationships among variables. Around the globe, researchers are highly interested in studying this type of problem.

Researchers in the field of cancer detection provide important information about the course and outcome of this disease. Machine learning algorithms have become significant tools for learning patterns in medical data repositories [7]. Different classification strategies are widely used in the medical domain for classifying data into various classes according to a few constraints. Researchers conduct experiments to diagnose diseases using different machine learning classification methods such as Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) [8].

In this paper, breast cancer, cervical cancer and lung cancer data are evaluated using an Artificial Neural Network (ANN) trained with the Velocity-Enhanced Whale Optimization Algorithm (VEWOA). The ANN is also trained with several other techniques, such as K-nearest neighbor (KNN), logistic regression and SVM, and a comparative study is presented. A survey of literature related to the detection of different types of cancer is discussed in Sect. 2. A summary of different classification methods is given in Sect. 3. VEWOA and VEWOA-NN are described in Sect. 4. In Sect. 5, results are depicted, and finally, Sect. 6 describes the conclusion.

2 Literature Review

Soft computing principles have led to important growth in medical science [9-11]. Soft computing techniques are increasingly popular due to their distinctive capacity to extract hidden patterns [12]. Neural networks are less error-prone, and medical systems increasingly depend on them for finding the different patterns present in datasets. Models such as the Radial Basis Functional Neural Network (RBFNN) [13], the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Functional Link Artificial Neural Network (FLANN) [14] are broadly used in medical science, and learning algorithms such as recursive least squares (RLS) and least mean squares (LMS) [15] are used to update their weights. Combining neural networks with fuzzy logic and wavelet transforms yields hybrid models that can detect hidden patterns in data [16]. However, these techniques tend to get trapped in local minima. To overcome this, bio-inspired strategies were developed. These bio-inspired strategies are mostly optimization techniques inspired by the communal behavior of creatures. They are also known as evolutionary optimization techniques because they arrive at a near-optimal solution by following the process of natural evolution [17]. One of the most effective evolutionary algorithms is the genetic algorithm [18]. In the field of biologically inspired algorithms, swarm intelligence is another promising approach, inspired by the social behavior of fish, ants and birds searching for their optimal food source. One such optimization method is particle swarm optimization (PSO), which originates under swarm intelligence; its inspiration is the coordination and collective behavior of birds [17]. PSO is also applied with statistical models and can be used in the identification of tumors [19]. Differential evolution is an evolutionary optimization algorithm similar to the genetic algorithm [20]. Harmony search, one of the most effective bio-inspired algorithms, is inspired by the way musicians search for the ultimate tune, and it can be used to solve complex problems [21, 22]. Nonlinear complex forecasting and numerical pattern identification problems are solved most effectively by functional networks [23]. A stacked autoencoder model applied to cervical cancer, proposed in [24], attained a classification accuracy of 97.25%; softmax classification with a stacked autoencoder is a deep learning method, and its results were also compared with KNN, SVM and feedforward NN. A survey on machine learning methods for the detection and classification of cervical cancer over the last 15 years was presented in [25]. Thirty journals from different databases, i.e., Google Scholar, Scopus, IEEE and ScienceDirect, were reviewed, and the drawbacks of the papers


were also discussed. In [26], the detection and characteristics of cervical cancer using different neural network models are briefly outlined, and a comparative study of the performance of the different models is given. In [27], pap smears performed 2 years prior to the likelihood of cervical cancer are discussed; false-negative pap cytology causes fatal errors, leading patients to be diagnosed at advanced stages, and the authors discuss the importance of such tests. A detailed survey on positron emission tomography/magnetic resonance imaging for the detection of cervical cancer tissues is given in [28]. In [29], a technique to classify lung cancer while highlighting the importance of each gene group from minority classes is presented; the proposed method can adaptively identify the more significant genes from the pool of all genes. The anatomic extent of a cancer is provided through stage classification; the eighth edition of the lung cancer stage classification has been the worldwide standard since January 2017 [30]. In [31], lung cancers are classified as malignant or benign and the degree of their malignancy is calculated with Fourier Transform Infrared (FTIR) spectroscopy; FTIR is also combined with principal component analysis-linear discriminant analysis. In [32], researchers provided a systematic literature review and analyzed lung cancer data from the International Association for the Study of Lung Cancer database; they also developed proposals for revision involving multispecialty international review and discovered four patterns of the disease. In [33], a correct and consistent regional lymph node classification for early detection of lung cancer is provided, and the effect of the International Association for the Study of Lung Cancer (IASLC) lymph node map on lung cancer classification is discussed in detail.

3 Classification Algorithms

3.1 C4.5

C4.5 is a statistical classifier algorithm, an extension of the Iterative Dichotomiser (ID3) algorithm, that generates a decision tree [34]. It builds the decision tree by computing gain values; the attribute with the biggest gain acts as the root node. The steps of the C4.5 algorithm for building a decision tree are:

1. The attribute with the largest gain value is selected as the root node.
2. For each value, a branch is created.
3. The process is repeated for each branch until all cases in a branch have the same class.

The gain formula is:

Gain(N) = Entropy(N) − Σ_{i=1}^{p} (|N_i| / |N|) · Entropy(N_i)   (1)

where N = set of cases; p = number of partitions; |N_i| = number of cases in partition i; |N| = number of cases in N.

The entropy formula is:

Entropy(N) = − Σ_{i=1}^{p} n_i log2(n_i)   (2)

where N = set of cases; p = number of partitions; n_i = proportion of N_i to N.

3.2 Learning Vector Quantization (LVQ)

LVQ is a special case of ANN: a prototype-based supervised classification algorithm. It is similar to KNN and was developed by Kohonen [35]. Learning Vector Quantization can be used when labeled input data are available; it is a supervised version of the vector quantization method. Classifier decision regions are improved when the learning method slightly repositions the Voronoi vectors using class information. Two stages are involved in this technique: a self-organizing map (SOM) followed by LVQ. The method is used in pattern classification problems. The first step involves feature selection, i.e., a small set of features is selected using an unsupervised learning method over the concentrated input data. In the second step, classification is done, in which individual classes are assigned to feature domains. The LVQ algorithm starts from a trained SOM with inputs {p} and Voronoi weights {V_j}. Each Voronoi cell V_j gets the best classification label of its inputs. The Voronoi cell boundaries do not necessarily match the classification boundaries; LVQ solves this problem by shifting the boundaries:

1. As in the SOM algorithm, if the weight of the winning output node V_{X(p)} has the same class label as input p, they are moved together by:

ΔV_{X(p)}(t) = α(t) (p − V_{X(p)}(t))   (3)

2. If the Voronoi weight V_{X(p)} and input p have different class labels, they are moved apart by:

ΔV_{X(p)}(t) = −α(t) (p − V_{X(p)}(t))   (4)

3. Voronoi weights V_j corresponding to other input regions are left unchanged:

ΔV_j(t) = 0   (5)

where α(t) is a learning rate that decreases with the number of training iterations.
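The boundary-shifting steps of Eqs. (3)-(5) amount to a single LVQ1-style update. The sketch below is illustrative only: the winner selection by Euclidean distance and the array layout are assumptions, not taken from the paper.

```python
import numpy as np

def lvq_update(voronoi, labels, p, p_label, lr):
    """One LVQ1-style step for Eqs. (3)-(5): move the winning Voronoi
    vector toward input p if the class labels match, away if they
    differ; all other Voronoi vectors stay unchanged."""
    # winning (closest) Voronoi vector X(p)
    winner = int(np.argmin(np.linalg.norm(voronoi - p, axis=1)))
    if labels[winner] == p_label:
        voronoi[winner] += lr * (p - voronoi[winner])   # Eq. (3): attract
    else:
        voronoi[winner] -= lr * (p - voronoi[winner])   # Eq. (4): repel
    return winner                                       # Eq. (5): others untouched
```

For example, with two prototypes at (0, 0) and (1, 1) and an input (0.2, 0.2) of the same class as the first prototype, only the first prototype moves (toward the input); the second stays where it is.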


In this manner, better classification is carried out than with the SOM algorithm alone.

3.3 Linear Discriminant Analysis (LDA)

LDA can be used both for dimensionality reduction and as a linear classifier. It is quite similar to the principal component analysis (PCA) classifier [36].

In LDA, each class probability density function is assumed to be modeled as a normal density, and all classes share the same covariance. Let there be P classes, and let N_p be an m × T_p matrix of T_p samples, as m-dimensional columns of data from class p. The prior probabilities π_p, the common covariance matrix L and the means μ_p for each class are defined as follows:

π_p = T_p / Σ_{b=1}^{P} T_b   (6)

μ_p = N_p 1_{T_p} / T_p   (7)

L = Σ_{p=1}^{P} (N_p − μ_p 1_{T_p}^T)(N_p − μ_p 1_{T_p}^T)^T / (T − P)   (8)

where 1_t is a t × 1 matrix of 1's and T is the total number of samples. After that, a new data point n is classified as

argmax_p  n^T L^{-1} μ_p − (1/2) μ_p^T L^{-1} μ_p   (9)

LDA is used in pattern classification and machine learning applications as a preprocessing step. The goal of LDA is the projection of a feature space/dataset into a lower subspace without affecting the class-discriminative information [37]. It is also similar to analysis of variance (ANOVA) and regression analysis.

3.4 Factorized Distribution Algorithm (FDA)

FDA is a type of evolutionary algorithm which uses a distribution in order to combine mutation and recombination [38]. The distribution is built from a set of selected points, and these selected points are used to generate new points for the next generation. A distribution with n binary variables has 2^n parameters. FDA works for exact as well as approximate factorizations. The factorization, which can also be used at initialization via the local approximation method, is applied as follows:

Step-1: Set p ← 0. (1 − n) × N points are generated randomly, and n × N points using the local approximation method.
Step-2: Promising points are selected.
Step-3: The conditional probabilities K^t(π_{x_i} b | π_{y_i} b) are computed.
Step-4: A new population is generated according to:

K(b, p + 1) = Π_{i=1}^{l} K^t(π_{x_i} b | π_{y_i} b)   (10)

Step-5: When the termination criteria are met, FINISH.
Step-6: The best points of the previous generation are added to the generated points (elitism).
Step-7: Set p ← p + 1 and go to Step-3.

FDA needs a finite sample of points, and its convergence to the optimum depends on the sample size. FDA can run with any selection method; mostly, truncation selection is used.

4 Proposed Algorithm

The Whale Optimization Algorithm (WOA) is simple in the way global solutions are explored and the positions of search agents are updated. A new optimization algorithm, VEWOA, is proposed to improve the solution accuracy, search reliability and convergence speed of WOA. Before presenting the proposed algorithm, some features of WOA are discussed for an easier understanding of the VEWOA method.

4.1 Overview of WOA

WOA is a stochastic optimization algorithm inspired by the special bubble-net hunting behavior of humpback whales [39]. A population (swarm) is used to find the global optimum (maximum or minimum) of an optimization problem: the search process starts by creating a random set of solutions and upgrading the set until the end condition is met.

In order to hunt their prey, whales first create a helix-structured path and then follow the bubbles to locate the position of their prey. In WOA, this behavior is simulated by alternating between a spiral model and an encircling mechanism, each chosen with probability 0.5, to update the positions. The mechanism can be defined as follows:


4.1.1 Shrinking Encircling of Prey

In WOA, search agents update their current positions toward the target prey, which is the current best candidate solution. This can be written as [40]:

P_{t+1} = P*_t − B · D   (11)

D = |C · P*_t − P_t|   (12)

B = 2 · a · r − a   (13)

C = 2 · r   (14)

where P* = best position; P_{t+1} = updated whale position; t = current iteration; a is decreased linearly from 2 to 0 over the course of the iterations; and r = uniformly distributed random number within the range [0, 1]. D corresponds to the distance between the tth whale and the best-positioned whale.

Fig. 1  Shrinking encircling mechanism

4.1.2 Spiral Bubble-Net Feeding Maneuver

Equations (15) and (16) are used between the positions of the prey and the whale to imitate the helix-structured movement of whales [41]:

P_{t+1} = e^{bl} · cos(2πl) · D′ + P*_t   (15)

D′ = |P*_t − P_t|   (16)

where b = constant defining the shape of the logarithmic spiral; D′ = the updated distance; l = uniformly distributed random number within the range [−1, 1].

Fig. 2  Spiral update of position

4.1.3 Global Exploration

Here, the current position of a search agent is updated based not on the best search agent but on a random search agent, which gives a factor of global exploration. This is done when |B| > 1:

P_{t+1} = P_rand − B · D″   (17)

D″ = |C · P_rand − P_t|   (18)

where D″ corresponds to the updated distance, and P_rand is randomly selected from the whales in the current iteration [42].

4.2 Velocity-Enhanced Whale Optimization Algorithm (VEWOA)

When it comes to exploring global solutions and updating the positions of search agents, WOA is relatively simple. In order to improve the search reliability, solution accuracy and convergence speed of WOA, a new algorithm named VEWOA is proposed. Here, we consider the whales as a school of fish, each with its own velocity. This concept of velocity has been taken from particle swarm optimization (PSO): each velocity directs the particle toward a potentially better global optimum, revealing a new score for each particle in each iteration as the particles update their positions based on the global best position:

V_{t+1} = w · V_t + c1 · r1 · (P_best − P_t) + c2 · r2 · (G_best − P_t)   (19)

P_{t+1} = P_t + V_{t+1}   (20)

where V_{t+1} = velocity of the particle at iteration t + 1; P_best = personal best of particle P; G_best = global best; c1, c2 = constants such that c1 + c2 ≤ 4; r1, r2 = random numbers; w = inertia coefficient.

The particles can also update their positions according to the spiral bubble-net feeding maneuver (Eqs. 15 and 16). In this algorithm, we have assumed that the whales update their positions 50% of the time using Eq. (11), 25% using Eq. (15), and the remaining 25% with the global random search (Figs. 1, 2, 3).


Pseudo code of VEWOA

Initialize the population of search agents P_i (i = 1, 2, 3, ..., n)
Calculate the fitness of every search agent
P* = best search agent
while (t < maximum number of iterations)
    update c1, c2, w
    for each search agent
        update r1, r2
        update the velocity of the current search agent using Eq. (19)
    end for
    for each search agent
        update parameters a, B
        if (p < 0.5)
            if (|B| < 1)
                update the position of the current search agent by Eq. (15)
            else if (|B| >= 1)
                select a random search agent
                update the position of the current search agent by Eq. (17)
            end if
        else if (p >= 0.5)
            update the position of the current search agent by Eq. (11)
        end if
    end for
    if any search agent goes beyond the search space, retract its position
    calculate the fitness of each search agent
    update the personal best of each agent
    update P* if there is a better solution
    t = t + 1
end while
return P*
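The pseudocode above can be turned into a short runnable sketch. This is not the authors' implementation: the search bounds, the inertia and coefficient schedules, the spiral constant b = 1 and the way the velocity term of Eq. (20) is added after the WOA move are all assumptions made here for demonstration, with a toy sphere function as the fitness.

```python
import numpy as np

def vewoa(fitness, dim, n_agents=25, max_iter=100, lo=-1.0, hi=1.0, seed=0):
    """Sketch of the VEWOA pseudocode: WOA position updates (Eqs. 11, 15,
    17) combined with a PSO-style velocity term (Eqs. 19-20)."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(lo, hi, (n_agents, dim))        # positions of search agents
    V = np.zeros((n_agents, dim))                   # velocities
    pbest = P.copy()
    pbest_fit = np.array([fitness(x) for x in P])
    best = int(np.argmin(pbest_fit))
    g, g_fit = pbest[best].copy(), pbest_fit[best]  # P*, the best search agent
    b_spiral = 1.0                                  # spiral shape constant (assumed)
    for t in range(max_iter):
        a = 2.0 * (1.0 - t / max_iter)              # decreases linearly from 2 to 0
        w = 0.9 - 0.5 * t / max_iter                # inertia schedule (assumed)
        c1 = c2 = 1.8                               # constants with c1 + c2 <= 4
        for i in range(n_agents):
            r1, r2 = rng.random(dim), rng.random(dim)
            # Eq. (19): velocity update borrowed from PSO
            V[i] = w * V[i] + c1 * r1 * (pbest[i] - P[i]) + c2 * r2 * (g - P[i])
            B = 2.0 * a * rng.random(dim) - a       # Eq. (13)
            C = 2.0 * rng.random(dim)               # Eq. (14)
            if rng.random() < 0.5:
                if np.all(np.abs(B) < 1.0):         # spiral update, Eqs. (15)-(16)
                    l = rng.uniform(-1.0, 1.0)
                    P[i] = (np.abs(g - P[i]) * np.exp(b_spiral * l)
                            * np.cos(2.0 * np.pi * l) + g)
                else:                               # global exploration, Eqs. (17)-(18)
                    Pr = P[rng.integers(n_agents)]
                    P[i] = Pr - B * np.abs(C * Pr - P[i])
            else:                                   # shrinking encircling, Eqs. (11)-(12)
                P[i] = g - B * np.abs(C * g - P[i])
            P[i] = np.clip(P[i] + V[i], lo, hi)     # Eq. (20); retract into search space
            f = fitness(P[i])
            if f < pbest_fit[i]:                    # update personal best
                pbest_fit[i], pbest[i] = f, P[i].copy()
                if f < g_fit:                       # update P*
                    g_fit, g = f, P[i].copy()
    return g, g_fit
```

Calling `vewoa(lambda x: float(np.sum(x ** 2)), dim=3)` returns the best point found and its fitness, which never exceeds the fitness of the best initial random agent.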

4.3 VEWOA‑Trained ANN

Our proposed approach is discussed in this section. Our neural network contains one hidden layer, together with the connection weights and biases between the input layer and the hidden layer. The number of search agents is taken as 25. Each search agent is treated as a one-dimensional vector having three components.

(i) Weights assigned between the input layer and the hidden layer.
(ii) Weights assigned between the hidden layer and the output layer.
(iii) Biases

Fig. 3  Velocity updation

After each search agent has been assigned its weights and biases, it proceeds to find the optimal solution.
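A minimal sketch of how one flat search-agent vector can be split into the three components listed above; the ordering and the one-bias-per-neuron layout are assumptions made here for illustration, not specified in the paper.

```python
import numpy as np

def agent_length(n_in, n_hidden, n_out):
    """Total length of one search-agent vector: two weight matrices
    plus one bias per hidden and per output neuron (assumed layout)."""
    return n_in * n_hidden + n_hidden * n_out + n_hidden + n_out

def unpack_agent(agent, n_in, n_hidden, n_out):
    """Split one flat search-agent vector into the three parts listed
    in the paper: (i) input-to-hidden weights, (ii) hidden-to-output
    weights, (iii) biases."""
    s1 = n_in * n_hidden
    s2 = n_hidden * n_out
    W1 = agent[:s1].reshape(n_in, n_hidden)          # (i) input -> hidden
    W2 = agent[s1:s1 + s2].reshape(n_hidden, n_out)  # (ii) hidden -> output
    b = agent[s1 + s2:]                              # (iii) biases
    return W1, W2, b
```

For the nine-attribute WBC data with, say, five hidden neurons and one output, each agent would be a vector of length 9·5 + 5·1 + 5 + 1 = 56 under this layout.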
The VEWOA-trained NN can be summarized as follows:


(i) Initialize: in this step, each agent is assigned a random set of weights and biases.
(ii) Evaluate fitness: in this step, the optimal set of weights and biases is generated according to the minimized mean-squared error (MSE), which is treated as the fitness function.
(iii) Update position: the positions of the whales are updated accordingly.

Steps (ii) and (iii) are repeated until the stopping criteria are achieved.

The workflow model is depicted in Fig. 4 in the form of a flowchart. A summary of the overall work in flowchart format is given in Fig. 5.

5 Results and Discussion

5.1 Implementation Details

The optimization technique aims to reduce the error in the objective function; here, VEWOA-NN is applied as the optimization algorithm for the neural network. To begin with, the data are preprocessed: the missing values are replaced with the instance mean, and all inputs are normalized through min-max normalization. The formula for min-max normalization is:

X_norm = (X_orig − X_min) / (X_max − X_min)   (21)

Our proposed technique, VEWOA-NN, has been compared with many other classifiers, and the superiority of the former can be shown based on the following performance criteria:

(i) Correct Classification Rate (CCR)

CCR = n / N   (22)

where n is the number of accurately classified patterns and N is the total number of patterns.

(ii) Average Squared Classification Error (ASCE)

ASCE = Σ_k (c_k − n_k)² / N   (23)
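The preprocessing described above, mean imputation followed by the min-max scaling of Eq. (21), can be sketched in a few lines; this per-column version assumes a 2-D array of samples-by-attributes.

```python
import numpy as np

def preprocess(X):
    """Replace missing values with the column (attribute) mean, then
    apply the min-max normalization of Eq. (21) so every column of
    X lies in [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_mean = np.nanmean(X, axis=0)        # instance mean per attribute
    X = np.where(np.isnan(X), col_mean, X)  # mean imputation of missing values
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return (X - xmin) / (xmax - xmin)       # Eq. (21)
```

For example, a column [1, 3, 5] maps to [0, 0.5, 1], and a missing value in a column [10, NaN, 30] is first replaced by the mean 20 before scaling.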
the optimization algorithm to the neural network. To begin

Fig. 4  Workflow model of
Initialize
VEWOA populations of
Training Dataset search agents

Calculate fitness
Assign search of each search
agents to agent
network

Evaluate Update position of


Network using search agents
Error

NO

Termination
Update POS* if
condition
their is a best
met?
solution
YES

Return the network


with minimum error


where c_k = number of observations in class k and n_k = number of appropriately classified observations in class k.

(iii) Computational cost: the time taken to evaluate the classifier.

We have conducted the experiment over 30 runs, each containing 100 iterations. The computational cost is obtained by simulating each algorithm on the same machine for the same number of iterations. The output is one of two values: 0 (benign tumor) or 1 (malignant tumor). The results are summarized by computing the mean and the respective standard deviation. In addition, graphs are shown to analyze the results more clearly; they help in understanding the results and presenting a comparison. We have provided graphs of the error convergence as well as the weight convergence in each iteration for better understanding. The techniques which converge fastest are considered the most efficient.

Fig. 5  Workflow model of the overall network (flowchart: the dataset is loaded; attributes and features are identified; data processing is performed; the processed data are fed to VEWOA-NN; the classification results of C4.5, LVQ, FDA and VEWOA-NN are compared and the classification accuracy is calculated; the confusion matrix is developed)

5.2 Details of Dataset

In this paper, we use three different datasets to support the superiority of VEWOA-NN.

(a) The Wisconsin Breast Cancer dataset is available at [43]. The database contains nine parameters. For classification, we consider all nine parameters and divide the dataset into a 70:30 ratio for training and testing. The attributes of the dataset are:

1. Clump thickness: 1-10
2. Uniformity of cell size: 1-10
3. Uniformity of cell shape: 1-10
4. Marginal adhesion: 1-10
5. Single epithelial cell size: 1-10
6. Bare nuclei: 1-10
7. Bland chromatin: 1-10
8. Normal nucleoli: 1-10
9. Mitoses: 1-10

(b) The cervical cancer dataset was collected from 'Hospital Universitario de Caracas' in Caracas, Venezuela, and is available at [44]. It has 858 instances and 36 attributes.

(c) The lung cancer data were published in [45]; the donor of this dataset is Aeberhard [46]. It has 32 instances and 57 attributes.

5.3 Result Analysis

Tables 1, 2 and 3 show the validation results for the WBC, cervical cancer and lung cancer data. The classification algorithms are compared on the basis of the number of training parameters, the mean and standard deviation of the time taken, the mean and standard deviation of the CCR, and the mean and standard deviation of the ASCE. Algorithms such as C4.5, FDA, LDA and LVQ consume much more time than the other methods. The proposed method, VEWOA-NN, provides the optimum CCR value and the minimum ASCE value and is the least time-consuming, making it the best of all.

Tables 4, 5 and 6 depict the confusion matrices for the above-mentioned data. In the WBC data (Table 4), 448 benign cases were correctly predicted by the classifier while 6 malignant cases were misclassified as benign, resulting in a classification accuracy of 97.8% for benign and 97.5% for malignant cases. In cervical cancer (Table 5), the benign data


found out to be 790 correctly classified cases as opposed to 5 malignant cases misclassified as benign, raising the classification accuracy to 98.3% for benign and 90.9% for malignant tumors, respectively. In lung cancer (Table 6), the type-1 and type-2 classification accuracies are 88.89% and 84.6%, while the type-3 classification accuracy is 80%. The classification accuracy of VEWOA-NN is compared with that of other classification techniques in Tables 7, 8 and 9.

Table 1  WBC data: quantitative assessment

Classification model | No. of parameters | Time in seconds (Mean) | Time in seconds (SD) | CCR (Mean) | CCR (SD) | ASCE (Mean) | ASCE (SD)
C4.5       | 9.000  | 7.542 | 0.561 | 0.894 | 0.210 | 0.561 | 0.312
FDA        | 9.000  | 5.231 | 0.342 | 0.923 | 0.101 | 0.412 | 0.210
LDA        | 9.000  | 4.426 | 0.241 | 0.784 | 0.211 | 0.312 | 0.330
LVQ        | 9.000  | 2.315 | 0.432 | 0.793 | 0.031 | 0.204 | 0.120
WOA-NN     | 9.000  | 1.416 | 1.321 | 0.929 | 0.210 | 0.106 | 0.131
'VEWOA-NN' | 9.000  | 0.667 | 0.312 | 0.976 | 0.005 | 0.169 | 0.016

The best result of each column is indicated in bold

Table 2  Cervical cancer data: quantitative assessment

Classification model | No. of parameters | Time in seconds (Mean) | Time in seconds (SD) | CCR (Mean) | CCR (SD) | ASCE (Mean) | ASCE (SD)
C4.5       | 4.000  | 4.412 | 0.426 | 0.542 | 0.021 | 0.623 | 0.262
LDA        | 4.000  | 3.320 | 0.531 | 0.618 | 0.034 | 0.516 | 0.243
FDA        | 4.000  | 2.211 | 0.125 | 0.723 | 0.014 | 0.612 | 0.510
LVQ        | 4.000  | 1.121 | 0.231 | 0.816 | 0.102 | 0.348 | 0.427
WOA-NN     | 4.000  | 0.76  | 0.213 | 0.891 | 0.097 | 0.145 | 0.049
'VEWOA-NN' | 4.000  | 0.27  | 0.331 | 0.946 | 0.004 | 0.121 | 0.024

The best result of each column is indicated in bold

Table 3  Lung cancer data: quantitative assessment

Classification model | No. of parameters | Time in seconds (Mean) | Time in seconds (SD) | CCR (Mean) | CCR (SD) | ASCE (Mean) | ASCE (SD)
C4.5       | 31.000 | 5.623 | 0.351 | 0.822 | 0.051 | 0.201 | 0.142
LDA        | 31.000 | 4.241 | 0.420 | 0.768 | 0.036 | 0.302 | 0.173
FDA        | 31.000 | 1.623 | 0.361 | 0.818 | 0.060 | 0.601 | 0.428
LVQ        | 31.000 | 1.421 | 0.142 | 0.806 | 0.701 | 0.207 | 0.650
WOA-NN     | 31.000 | 0.762 | 0.198 | 0.812 | 0.841 | 0.198 | 0.110
'VEWOA-NN' | 31.000 | 0.281 | 0.413 | 0.845 | 0.902 | 0.115 | 0.031

The best result of each column is indicated in bold

Table 4  The confusion matrix obtained from VEWOA-trained NN in WBC

CLASS       | 'BENIGN' | 'MALIGNANT' | % Classification
'BENIGN'    | 448      | 10          | 97.8
'MALIGNANT' | 6        | 235         | 97.5

Table 5  The confusion matrix obtained from VEWOA-trained NN in cervical cancer

CLASS       | 'BENIGN' | 'MALIGNANT' | % Classification
'BENIGN'    | 790      | 13          | 98.3
'MALIGNANT' | 5        | 50          | 90.9

Table 6  The confusion matrix obtained from VEWOA-trained NN in lung cancer

CLASS  | Type-1 | Type-2 | Type-3 | % Classification
Type-1 | 8      | 1      | 0      | 88.89
Type-2 | 1      | 11     | 1      | 84.6
Type-3 | 1      | 1      | 8      | 80
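The per-class percentages in Tables 4-6 follow directly from each confusion matrix, and the overall CCR of Eq. (22) is the trace over the total. A small sketch, taking rows as true classes (an assumption consistent with the tables):

```python
import numpy as np

def per_class_accuracy(cm):
    """Row-wise % classification from a confusion matrix
    (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    return 100.0 * np.diag(cm) / cm.sum(axis=1)

def ccr(cm):
    """Correct Classification Rate, Eq. (22): correctly classified
    patterns (the diagonal) over the total number of patterns."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()
```

Applied to the WBC matrix of Table 4 ([[448, 10], [6, 235]]), this reproduces the 97.8% benign and 97.5% malignant figures reported there.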


Table 7  Classification accuracy Algorithm Results in %


of WBC dataset
‘C4.5’ 89.4
‘LVQ’ 92.3
‘FDA’ 78.4
‘LDA’ 79.3
WOA-NN 92.9
‘VEWOA-NN’ 97.65

Table 8  Classification accuracy Algorithm Results in %


of cervical cancer dataset
‘C4.5’ 54.2
Fig. 6  Convergence of weight using VEWOA-NN in WBC breast
‘LVQ’ 61.8 cancer data
‘FDA’ 72.3
‘LDA’ 81.6
WOA-NN 89.1
‘VEWOA-NN’ 94.6

Table 9  Classification accuracy Algorithm Results in %


of lung cancer dataset
‘C4.5’ 82.2
‘LVQ’ 76.8
‘FDA’ 81.8
‘LDA’ 80.6
WOA-NN 81.2
‘VEWOA-NN’ 84.5

The accuracy of C4.5 was found to be 89.4, 54.2 and 82.2, respectively, on the three datasets. LVQ reached 92.3, 61.8 and 76.8 on breast cancer, cervical cancer and lung cancer, improving on C4.5 for the first two. The accuracy of FDA was 78.4, 72.3 and 81.8, and that of LDA was 79.3, 81.6 and 80.6, respectively, on the three datasets. The accuracy of VEWOA-NN is 97.65, 94.6 and 84.5, respectively, the best result on all three datasets. In addition to these tables, we summarize the results in graphs. Figures 6, 7 and 8 depict the convergence of the weights on the breast cancer, cervical cancer and lung cancer data, respectively. Figures 9, 10 and 11 show bar plots of mean CCR versus standard deviation of CCR for breast cancer, cervical cancer and lung cancer, with the mean CCR on the x-axis and the CCR standard deviation on the y-axis. Figures 12, 13 and 14 show the convergence of the mean-squared error for breast cancer, cervical cancer and lung cancer; the MSE reported is the best value obtained over all iterations. From the tables and graphs, it is evident that VEWOA-NN outperforms the other techniques on the same datasets.

Fig. 7  Convergence of weight using VEWOA-NN in cervical cancer data
Fig. 8  Convergence of weight using VEWOA-NN in lung cancer data
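The two evaluation measures reported throughout, Correct Classification Rate (CCR) and averaged square classification error, can be sketched as follows. The function names, the binary 0/1 target encoding, and the sample values are illustrative assumptions, not the paper's code or data:

```python
def ccr(y_true, y_pred):
    """Correct Classification Rate: fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def asce(y_true, y_out):
    """Averaged square classification error: mean squared difference between
    the 0/1 target and the network's continuous output."""
    return sum((t - o) ** 2 for t, o in zip(y_true, y_out)) / len(y_true)

# Illustrative values only (not taken from the paper's experiments).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0]             # hard class decisions
y_out  = [0.9, 0.2, 0.8, 0.4, 0.1]   # raw network outputs

print(ccr(y_true, y_pred))            # -> 0.8
print(round(asce(y_true, y_out), 3))  # -> 0.092
```

Repeating such a run over multiple random train/test splits yields the mean and standard deviation of CCR plotted in Figs. 9, 10 and 11.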


Fig. 9   Mean CCR versus standard deviation CCR in breast cancer
Fig. 10  Mean CCR versus standard deviation CCR in cervical cancer
Fig. 11  Mean CCR versus standard deviation CCR in lung cancer
Fig. 12  Convergence of MSE curve in breast cancer
Fig. 13  Convergence of MSE curve in cervical cancer
Fig. 14  Convergence of MSE curve in lung cancer

6 Conclusion

Cancer is one of the leading causes of death. Even with highly advanced techniques and the help of many experienced radiologists, early detection remains elusive. Cancer diagnosis and treatment programs strive to prolong a patient's life in the best possible way; the proposed method contributes to early detection and thereby assists medical institutes. An undetected tumor can prove lethal, so early detection is given the highest priority. In this article, we have proposed a classification technique to detect malignant tumors. The classification accuracy of the proposed algorithm is 97.65 on breast cancer, 94.6 on cervical cancer and 84.5 on lung cancer. Further research is required to increase the accuracy and decrease the computation time, which may be achieved by varying the number of hidden-layer nodes, applying hybrid techniques, and so on.

