Article history:
Received 28 December 2015
Received in revised form 7 April 2016
Accepted 20 April 2016
Available online 10 May 2016

Keywords:
Global optimization
High convergence
Global exploration
Metaheuristic
Data clustering

Abstract

The krill herd (KH) algorithm is a stochastic nature-inspired algorithm for solving optimization problems. Its performance is degraded by poor exploitation capability. In this study, we propose an improved krill herd algorithm (IKH) that strengthens the global search capability of the krill. The enhancement consists of adding a global search operator for exploration around the defined search region, so that the krill individuals move towards the global best solution. An elitism strategy is also applied to retain the best krill during the krill update steps. The proposed method is tested on a set of twenty-six well-known benchmark functions and is compared with thirteen popular optimization algorithms, including the original KH algorithm. The experimental results show that the proposed method produces more accurate results than KH and the other compared algorithms and is more robust; in addition, it has a high convergence rate. The proposed algorithm is then applied to data clustering problems and tested on six real datasets from the UCI machine learning repository. The experimental results show that the proposed algorithm is also well suited for solving data clustering problems.

© 2016 Elsevier B.V. All rights reserved.
1. Introduction

Over the last few decades, many nature-inspired algorithms have been proposed for solving numerical optimization problems. Nature-inspired algorithms [3–5] play a vital role in solving many engineering optimization problems [6–10,23–25,32] owing to their global exploration and exploitation ability. These algorithms imitate the behavior of living things in nature such as animals, birds and fishes. Several heuristic algorithms have been developed in the literature. The genetic algorithm (GA) [17] was proposed by Goldberg in 1998, simulating the survival of the fittest among individuals in a population over many generations. Evolutionary strategies (ES) [16] and differential evolution (DE) [13,14] belong to the subclass of evolutionary algorithms that use selection, mutation and recombination operators. Population-Based Incremental Learning (PBIL) [18], developed by Shumeet Baluja in 1994, is an optimization algorithm which combines the genetic algorithm with simple competitive learning. Particle Swarm Optimization (PSO) [19], proposed by Eberhart and Kennedy in 1995, imitates the food foraging behavior of bird flocks and fish schools. Ant Colony Optimization (ACO) [12], initially developed by Marco Dorigo, simulates the behavior of ants searching for a path between a food source and their colony. The artificial bee colony (ABC) [11] is another optimization algorithm, developed by Karaboga in 2005, based on the foraging behavior of honey bee swarms. Biogeography-based optimization (BBO) [15], developed by Simon in 2008, imitates the migration of species between islands. StudGA (SGA) [20], a variant of GA, selects the fittest individual rather than using stochastic selection and shares its information among the others through genetic algorithm operators. Cuckoo search (CS) [21], proposed by Yang and Deb in 2009, is inspired by the obligate brood parasitism of cuckoo species. The firefly algorithm (FFA) [22] was developed by Yang in 2010 based on the flashing behaviour of fireflies. In Refs. [2,35] the authors proposed a new PSO algorithm combined with Levy flight for solving optimization problems. Krill herd (KH) [1] was recently introduced by Gandomi and Alavi, based on imitating the behavior of krill individuals. Even though several optimization algorithms exist, research continues on the development of optimization algorithms that provide a high convergence rate and the global optimum solution.

∗ Corresponding author. E-mail address: r jensi@yahoo.co.in (R. Jensi).
The remainder of the paper is organized as follows. Section 2 reviews variants of the krill herd algorithm for function optimization problems. Section 3 briefly explains the original krill herd algorithm, and Section 4 presents the proposed IKH approach. Section 5 provides the experimental results for function optimization, Section 6 provides the experimental results for data clustering, and Section 7 concludes with a discussion of future work.

In the krill herd model, the time-dependent position of a krill individual is governed by the Lagrangian model $dX_i/dt = N_i + F_i + D_i$, where $N_i$ is the motion induced by other krill individuals, $F_i$ is the foraging action and $D_i$ is the random physical diffusion of the $i$th krill individual. The direction of the induced motion, $\alpha_i$, depends on three components, namely a local swarm density, a target swarm density and a repulsive swarm density. The motion induced by other krill individuals, $N_i$, is defined as follows.
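Restated from the original KH algorithm [1], with $N^{\max}$ the maximum induced speed and $\omega_n$ the inertia weight of the induced motion:

$$N_i^{new} = N^{\max}\,\alpha_i + \omega_n\,N_i^{old}, \qquad \alpha_i = \alpha_i^{local} + \alpha_i^{target}$$

where $\alpha_i^{local}$ is the direction contributed by neighbouring krill and $\alpha_i^{target}$ is the direction provided by the best krill individual found so far.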
Table 1
Benchmark functions (Type: U = unimodal, M = multimodal; Dim = dimension n; the last column is the global minimum).

f03 Schwefel 1.2 | $f_3(x)=\sum_{i=1}^{n}\left(\sum_{j=1}^{i}x_j\right)^2$ | U | 20 | $-100\le x_i\le 100$ | 0
f04 Schwefel 2.21 | $f_4(x)=\max_i |x_i|,\ 1\le i\le n$ | U | 20 | $-100\le x_i\le 100$ | 0
f06 Step | $f_6(x)=\sum_{i=1}^{n}\left(\lfloor x_i+0.5\rfloor\right)^2$ | U | 20 | $-100\le x_i\le 100$ | 0
f07 Quartic with noise | $f_7(x)=\sum_{i=1}^{n} i\,x_i^{4}+\mathrm{random}(0,1)$ | U | 20 | $-1.28\le x_i\le 1.28$ | 0
f08 Schwefel 2.26 | $f_8(x)=418.9828\,n-\sum_{i=1}^{n} x_i \sin\left(\sqrt{|x_i|}\right)$ | M | 20 | $-512\le x_i\le 512$ | 0
f09 Rastrigin | $f_9(x)=10n+\sum_{i=1}^{n}\left(x_i^{2}-10\cos(2\pi x_i)\right)$ | M | 20 | $-5.12\le x_i\le 5.12$ | 0
f10 Ackley | $f_{10}(x)=-20\exp\left(-0.2\sqrt{\tfrac{1}{n}\sum_{i=1}^{n}x_i^{2}}\right)-\exp\left(\tfrac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right)+20+e$ | M | 20 | $-32\le x_i\le 32$ | 0
f11 Griewank | $f_{11}(x)=\tfrac{1}{4000}\sum_{i=1}^{n}x_i^{2}-\prod_{i=1}^{n}\cos\left(\tfrac{x_i}{\sqrt{i}}\right)+1$ | M | 20 | $-600\le x_i\le 600$ | 0
f12 Penalized 1 | $f_{12}(x)=\tfrac{\pi}{n}\left\{10\sin^{2}(\pi y_1)+\sum_{i=1}^{n-1}(y_i-1)^{2}\left[1+10\sin^{2}(\pi y_{i+1})\right]+(y_n-1)^{2}\right\}+\sum_{i=1}^{n}u(x_i,10,100,4)$, with $y_i=1+\tfrac{x_i+1}{4}$ | M | 20 | $-50\le x_i\le 50$ | 0
f13 Penalized 2 | $f_{13}(x)=\tfrac{1}{10}\left\{\sin^{2}(3\pi x_1)+\sum_{i=1}^{n-1}(x_i-1)^{2}\left[1+\sin^{2}(3\pi x_{i+1})\right]+(x_n-1)^{2}\left[1+\sin^{2}(2\pi x_n)\right]\right\}+\sum_{i=1}^{n}u(x_i,5,100,4)$ | M | 20 | $-50\le x_i\le 50$ | 0
f16 Dixon & Price | $f_{16}(x)=(x_1-1)^{2}+\sum_{i=2}^{n} i\left(2x_i^{2}-x_{i-1}\right)^{2}$ | U | 20 | $-10\le x_i\le 10$ | 0
f17 Trid10 | $f_{17}(x)=\sum_{i=1}^{n}(x_i-1)^{2}-\sum_{i=2}^{n}x_i x_{i-1}$ | M | 10 | $-100\le x_i\le 100$ | $-200$
f19 Levy | $f_{19}(x)=\sum_{i=1}^{n-1}(x_i-1)^{2}\left[1+\sin^{2}(3\pi x_{i+1})\right]+\sin^{2}(3\pi x_1)+|x_n-1|\left[1+\sin^{2}(3\pi x_n)\right]$ | M | 20 | $-10\le x_i\le 10$ | 0
f20 Pathological | $f_{20}(x)=\sum_{i=1}^{n-1}\left(0.5+\frac{\sin^{2}\sqrt{100x_i^{2}+x_{i+1}^{2}}-0.5}{1+0.001\left(x_i^{2}-2x_i x_{i+1}+x_{i+1}^{2}\right)^{2}}\right)$ | M | 20 | $-100\le x_i\le 100$ | 0
f21 Perm | $f_{21}(x)=\sum_{k=1}^{n}\left[\sum_{i=1}^{n}\left(i^{k}+\beta\right)\left(\left(\tfrac{x_i}{i}\right)^{k}-1\right)\right]^{2}$ | M | 4 | $-4\le x_i\le 4$ | 0
f22 Powell | $f_{22}(x)=\sum_{i=1}^{n/4}\left[(x_{4i-3}+10x_{4i-2})^{2}+5(x_{4i-1}-x_{4i})^{2}+(x_{4i-2}-2x_{4i-1})^{4}+10(x_{4i-3}-x_{4i})^{4}\right]$ | U | 20 | $-4\le x_i\le 5$ | 0
f24 Power sum | $f_{24}(x)=\sum_{k=1}^{n}\left[\left(\sum_{i=1}^{n}x_i^{k}\right)-b_k\right]^{2}$, with $b=(8,18,44,114)$ | M | 4 | $0\le x_i\le 4$ | 0
f25 Zakharov | $f_{25}(x)=\sum_{i=1}^{n}x_i^{2}+\left(\tfrac{1}{2}\sum_{i=1}^{n}i\,x_i\right)^{2}+\left(\tfrac{1}{2}\sum_{i=1}^{n}i\,x_i\right)^{4}$ | M | 20 | $-5\le x_i\le 10$ | 0
f26 Wavy1 | $f_{26}(x)=1-\tfrac{1}{n}\sum_{i=1}^{n}\cos(k x_i)\,e^{-x_i^{2}/2}$ | M | 20 | $-\pi\le x_i\le \pi$ | 0
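For concreteness, here is a sketch of three of the Table 1 entries transcribed into NumPy code; each attains its global minimum of 0 at x = 0:

```python
import numpy as np

def rastrigin(x):                       # f09 in Table 1
    x = np.asarray(x, dtype=float)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def ackley(x):                          # f10 in Table 1
    x = np.asarray(x, dtype=float)
    n = x.size
    return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / n))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / n) + 20 + np.e)

def griewank(x):                        # f11 in Table 1
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return np.sum(x**2) / 4000 - np.prod(np.cos(x / np.sqrt(i))) + 1

# All three evaluate to 0 at the optimum x = 0 (n = 20):
print(rastrigin(np.zeros(20)), ackley(np.zeros(20)), griewank(np.zeros(20)))
```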
Table 2
Mean real optimization results in twenty six benchmark functions.
f01 1.24E + 03 8.00E + 03 1.75E + 01 2.04E + 03 1.01E + 03 2.43E + 03 1.60E + 04 4.76E + 03 2.51E + 04 8.86E + 03 7.63E + 02 1.64E + 01 1.86E + 02 2.61E − 35
f02 5.49E − 01 4.64E + 01 3.27E + 00 5.06E + 01 7.13E + 00 2.10E + 01 7.92E + 01 3.66E + 01 6.10E + 01 4.93E + 01 1.06E + 01 1.13E + 00 1.23E + 06 2.70E − 25
*The best value obtained by the algorithms for each benchmark function is shown in boldface.
Table 3
Best real optimization results in twenty six benchmark functions.
f01 1.88E + 00 4.69E + 03 5.71E − 01 8.91E + 02 3.05E + 02 1.12E + 03 1.04E + 04 8.13E + 02 1.46E + 04 5.56E + 03 1.35E + 02 2.17E + 00 8.65E + 00 3.82E − 48
f02 2.92E − 01 3.15E + 01 1.25E + 00 3.26E + 01 3.10E + 00 1.45E + 01 4.97E + 01 1.68E + 01 4.57E + 01 2.72E + 01 4.30E + 00 5.35E − 01 5.19E + 01 2.42E − 27
*The best value obtained by the algorithms for each benchmark function is shown in boldface.
Table 4
Statistical results of IKH for the benchmark functions f01–f11 using Wilcoxon rank sum test (number of runs = 100).
f01 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f02 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2158 −12.2158 −12.2157 −12.2157 −12.2158 −12.2158 −12.2157 −12.2158 −12.2157 −12.2157
f03 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
f04 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2158 −12.2158 −12.2157 −12.2157 −12.2158 −12.2158 −12.2157 −12.2158 −12.2157 −12.2157
f05 p-value 2.56E − 34 2.56E − 34 2.56E − 34 9.26E − 32 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −11.727 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f06 p-value 5.64E − 39 5.64E − 39 2.1E − 38 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.5E − 39 5.63E − 39
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −13.0591 −13.0591 −12.9585 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.061 −13.0592
f07 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f08 p-value 2.56E − 34 2.56E − 34 2.56E − 34 9.33E − 24 0.004937 2.56E − 34 2.56E − 34 6.49E − 10 2.56E − 34 2.56E − 34 0.208708 2.72E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 0 1 1
Zval −12.2157 −12.2157 −12.2157 −10.0484 2.811119 −12.2157 −12.2157 −6.17811 −12.2157 −12.2157 1.257127 −12.2108 −12.2157
f09 p-value 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591
f10 p-value 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591
f11 p-value 2.89E − 34 2.56E − 34 1.54E − 33 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 7.76E − 34 2.64E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.206 −12.2157 −12.0691 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2158 −12.1253 −12.2133
Total 11 11 11 11 11 11 11 11 11 11 10 11 11
Table 5
Statistical results of IKH for the benchmark functions f12–f22 using Wilcoxon rank sum test (number of runs = 100).
f12 p-value 1.21E − 29 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −11.3068 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f13 p-value 5.41E − 29 2.56E − 34 0.000132 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 0.001549 4.15E − 26
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −11.1748 −12.2157 −3.82268 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −3.16541 −10.5689
f14 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
f15 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f16 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f17 p-value 0.000763 2.56E − 34 1.32E − 30 5.42E − 34 2.91E − 32 2.64E − 34 2.56E − 34 1.95E − 33 2.56E − 34 2.56E − 34 9E − 34 1.71E − 05 5.72E − 30
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −3.36577 −12.2157 −11.4998 −12.1546 −11.8248 −12.2133 −12.2157 −12.0496 −12.2157 −12.2157 −12.1131 4.299143 −11.3728
f18 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f19 p-value 1.03E − 22 2.56E − 34 1.87E − 08 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 1.08E − 09 2.64E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −9.80898 −12.2157 −5.62346 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 6.097477 −12.2133
f20 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.22E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 0.000131
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2272 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −3.82513
f21 p-value 0.944483 0.005206 9.53E − 26 1.5E − 12 7.34E − 15 3.32E − 05 1.41E − 28 1.91E − 15 4.78E − 19 9.53E − 12 1.03E − 18 5.75E − 05 6.97E − 27
h-value 0 1 1 1 1 1 1 1 1 1 1 1 1
Zval 0.069637 −2.79402 −10.4907 −7.07484 −7.77855 −4.1501 −11.0893 −7.94717 −8.91715 −6.81339 −8.83167 4.02304 −10.735
f22 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
Total 10 11 11 11 11 11 11 11 11 11 11 11 11
Table 6
Statistical results of IKH for the benchmark functions f23–f26 using Wilcoxon rank sum test (number of runs = 100).
f23 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f24 p-value 0.022124 0.746128 1.54E − 07 7.36E − 12 1.79E − 20 1.19E − 16 4.21E − 33 1.85E − 23 1.25E − 33 2.51E − 32 9.29E − 27 6.17E − 33 1.36E − 15
h-value 1 0 1 1 1 1 1 1 1 1 1 1 1
Zval −2.28823 0.323749 −5.24718 −6.85056 −9.27405 −8.28431 −11.986 −9.98075 −12.0863 −11.837 −10.7085 −11.9543 −7.98866
f25 p-value 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34 2.56E − 34
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157 −12.2157
f26 p-value 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.63E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39 5.64E − 39
h-value 1 1 1 1 1 1 1 1 1 1 1 1 1
Zval −13.0591 −13.0591 −13.0591 −13.0591 −13.0592 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591 −13.0591
Total 4 3 4 4 4 4 4 4 4 4 4 4 4
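The p-value, h-value and Zval entries in Tables 4–6 follow the usual Wilcoxon rank sum convention: h = 1 means the null hypothesis of equal medians is rejected at the 5% significance level, and Zval is the normal-approximation test statistic. A sketch of how such entries can be computed with SciPy, using synthetic stand-in run data since the per-run results are not reproduced in this excerpt:

```python
import numpy as np
from scipy.stats import ranksums

# Hypothetical final objective values over 100 independent runs per algorithm;
# stand-ins only, not the paper's actual run data.
rng = np.random.default_rng(1)
ikh_runs = np.abs(rng.normal(0.0, 1e-30, 100))    # stand-in for IKH results
other_runs = np.abs(rng.normal(1.2e3, 2e2, 100))  # stand-in for a compared algorithm

stat, p = ranksums(ikh_runs, other_runs)  # two-sided Wilcoxon rank sum test
h = int(p < 0.05)                         # h-value: 1 = reject H0 at the 5% level
print(f"Zval = {stat:.4f}, p-value = {p:.3g}, h = {h}")
```

With two fully separated samples of 100 runs each, the statistic saturates near −12.2, which is consistent with the repeated Zval of −12.2157 in the tables.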
239
240 R. Jensi, G.W. Jiji / Applied Soft Computing 46 (2016) 230–245
1: Initialize algorithm parameters Vf, Dmax, Nmax, NP, MI (maximum iterations), and the minimum and maximum bounds
2: Randomly generate krill individuals (solutions)
3: Evaluate the objective function value f and find the worst and best fitness values
4: iteration=1
5: while (stopping condition is not met) do
6: Store the pre-specified number of best krill
7: for each krill
8: Calculate Movement influenced by other krill individual
9: Calculate Foraging action
10: Calculate Physical diffusion
11: Implement crossover operator.
12: Update krill position using Eq. (7).
13: if rand < 0.5 then
14: Fine-tune the krill position using Eq. (9)
15: else
16: Fine-tune the krill position using Eq. (10)
17: end if
18: Evaluate the objective function value f and update the krill individual if necessary
19: end for
20: Replace the worst krill with the best krill stored before
21: end while
22: Output the global best solution.
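Following the pseudocode above, a compact Python/NumPy sketch of the IKH loop is given below. Since Eqs. (7), (9) and (10) are not reproduced in this excerpt, the motion components and the two fine-tuning rules are simplified stand-ins (attraction towards the best krill plus random diffusion); parameter names follow the pseudocode, and the objective is a sample benchmark.

```python
import numpy as np

def sphere(x):
    """Sample objective (the sphere benchmark)."""
    return float(np.sum(x ** 2))

def ikh(f, dim=20, NP=25, max_iter=400, lb=-100.0, ub=100.0, seed=0,
        Nmax=0.01, Vf=0.02, Dmax=0.005, inertia=0.9):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (NP, dim))        # step 2: random krill positions
    fit = np.array([f(x) for x in X])         # step 3: evaluate
    N = np.zeros_like(X)                      # induced-motion memory
    F = np.zeros_like(X)                      # foraging-motion memory
    for _ in range(max_iter):                 # step 5
        elite = int(np.argmin(fit))           # step 6: store the best krill
        elite_x, elite_f = X[elite].copy(), fit[elite]
        best = elite_x                        # best krill of this sweep
        for i in range(NP):                   # step 7
            # steps 8-10 (stand-ins): attraction to the best krill with
            # inertia, plus a small random physical diffusion
            N[i] = inertia * N[i] + Nmax * (best - X[i])
            F[i] = inertia * F[i] + Vf * (best - X[i])
            D = Dmax * rng.uniform(-1.0, 1.0, dim)
            cand = X[i] + N[i] + F[i] + D     # stand-in for Eq. (7)
            # steps 13-17 (stand-ins for Eqs. (9)/(10)): global fine-tuning
            if rng.random() < 0.5:
                cand = best + 0.01 * (ub - lb) * rng.standard_normal(dim)
            else:
                cand = cand + rng.random(dim) * (best - cand)
            cand = np.clip(cand, lb, ub)
            fc = f(cand)
            if fc < fit[i]:                   # step 18: greedy replacement
                X[i], fit[i] = cand, fc
        worst = int(np.argmax(fit))           # step 20: elitism
        X[worst], fit[worst] = elite_x, elite_f
    g = int(np.argmin(fit))
    return X[g], fit[g]

if __name__ == "__main__":
    x_best, f_best = ikh(sphere)
    print("best objective:", f_best)
```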
Clustering is widely used in many fields of science and engineering, and it must often be solved as part of more complicated tasks in pattern recognition, data mining, information retrieval and image analysis. Clustering algorithms are mainly classified into two categories [47]: hierarchical and partitional. The best-known partitional clustering algorithm is K-means, a center-based clustering algorithm. The advantage of the K-means algorithm is that it is simple and efficient, but it suffers from its initial cluster seed selection, since it is easily trapped in local minima. In this section, we apply the proposed IKH algorithm to data clustering problems. Several clustering approaches have been introduced in the literature [26,27,48–57].

6.1. The problem statement

Clustering is the process of partitioning a set of N data objects into K clusters or groups based on some distance (or similarity) metric. Let D = {d_1, d_2, ..., d_N} be a set of N data objects to be partitioned, where each data object d_i, i = 1, 2, ..., N, is represented as d_i = {d_{i1}, d_{i2}, ..., d_{im}}, with d_{im} the mth dimension value of data object i. The aim of a clustering algorithm is to find a set of K partitions C = {C_1, C_2, ..., C_K | ∀k: C_k ≠ ∅ and ∀l ≠ k: C_k ∩ C_l = ∅} such that objects within a cluster are similar to each other and dissimilar to objects in different clusters. The similarity is measured by some optimization criterion, in particular the squared error function, calculated as follows:

$$f = \min \sum_{j=1}^{K} \sum_{d_i \in C_j} E(d_i, c_j) \tag{12}$$

where $c_j$ represents the jth cluster center and $E$ is a distance measure between a data object $d_i$ and a cluster center $c_j$. This criterion (the total intra-cluster distance) is used as the objective function value in our study. Many distance metrics are used in the literature; in our study the Euclidean distance is used, defined as follows:

$$E(d_i, c_j) = \sqrt{\sum_{m=1}^{M} (d_{im} - c_{jm})^2} \tag{13}$$

where $c_j$ is the center of cluster j, calculated as follows:

$$c_j = \frac{1}{n_j} \sum_{d_i \in C_j} d_i \tag{14}$$

where $n_j$ is the total number of objects in cluster j.

6.2. K-means algorithm

K-means [47] is the simplest partitional clustering algorithm and is widely used due to its simplicity and efficiency. Given a set of N data objects and the number of clusters k, the k-means algorithm proceeds as follows (a code sketch is given after the steps):
Step 1: Randomly select 'k' cluster centers.
Step 2: Calculate the Euclidean distance between each data point and the cluster centers.
Step 3: Assign each data point to the cluster center with the minimum distance.
Step 4: Update the cluster centers using Eq. (14).
Step 5: If no data point was reassigned, stop; otherwise repeat from Step 2.
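A minimal NumPy sketch of these five steps (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def kmeans(D, k, max_iter=100, seed=0):
    """Steps 1-5 above: Eq. (13) distances, Eq. (14) center update."""
    D = np.asarray(D, dtype=float)
    rng = np.random.default_rng(seed)
    centers = D[rng.choice(len(D), size=k, replace=False)]   # Step 1
    assign = np.full(len(D), -1)
    for _ in range(max_iter):
        # Step 2: Euclidean distance of every object to every center
        dist = np.linalg.norm(D[:, None, :] - centers[None, :, :], axis=2)
        new_assign = dist.argmin(axis=1)                     # Step 3
        if np.array_equal(new_assign, assign):               # Step 5: stop
            break
        assign = new_assign
        for j in range(k):                                   # Step 4: Eq. (14)
            members = D[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, assign
```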
6.3. IKH for data clustering

To apply IKH to data clustering problems, two modifications have to be made to the IKH algorithm. The first concerns the solution representation and the second the fitness calculation. Given the data set D and the number of clusters k as input to the algorithm, a solution is represented as a row vector of size k × m, where k is the number of clusters and m is the number of features of the clustering problem, as shown in Fig. 4. The population of solutions is then represented as:

$$P = \begin{bmatrix} S_1 \\ S_2 \\ S_3 \\ \vdots \\ S_N \end{bmatrix} \tag{15}$$

$$S_i = [C_1, C_2, \ldots, C_k] \tag{16}$$

$$C_j = [c_{j1}, c_{j2}, \ldots, c_{jm}], \quad \forall j \in \{1, 2, \ldots, k\} \tag{17}$$

where k is the number of clusters, m is the number of features and N is the number of krill individuals. The population solutions are randomly initialized with data objects as cluster centroids. The second modification is the fitness calculation: fitness is computed using Eq. (12), as sketched in the code below. The remaining steps of the KH and IKH algorithms are executed as described in Sections 3 and 4, respectively.

(Figure: flowchart of the IKH algorithm — each krill j = 1, ..., NP is updated and fine-tuned along the rand < 0.5 branch; after each sweep the worst krill is replaced with the stored best krill and the current best in the swarm is updated; the loop repeats while iteration < maxiter, after which the best solution is output.)
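A minimal sketch of the decoding and fitness evaluation just described, assuming NumPy and nearest-center assignment of each object (the names are illustrative):

```python
import numpy as np

def clustering_fitness(krill, D, k):
    """Decode a krill (row vector of length k*m, Eqs. (16)-(17)) into k
    cluster centers and score it by the total intra-cluster distance
    (Eq. (12)), assigning each object to its nearest center."""
    D = np.asarray(D, dtype=float)
    centers = np.asarray(krill, dtype=float).reshape(k, -1)
    # Euclidean distance of each object to each center (Eq. (13))
    dist = np.linalg.norm(D[:, None, :] - centers[None, :, :], axis=2)
    return float(dist.min(axis=1).sum())

# Initialization as described above: seed each krill's centers with data objects.
rng = np.random.default_rng(2)
D = rng.random((150, 4))                  # hypothetical 4-feature dataset
krill = D[rng.choice(len(D), size=3, replace=False)].ravel()
print(clustering_fitness(krill, D, k=3))
```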
Fig. 3. Performance comparison for (a) Rosenbrock, (b) Schwefel 2.26, (c) Step, (d) Penalized 1, (e) Alpine and (f) Wavy functions.
(Fig. 4: solution representation — a krill is the row vector [c11 ... c1m, c21 ... c2m, ..., ck1 ... ckm], i.e., the concatenated cluster centers C1, C2, ..., Ck.)
Table 7
Test dataset descriptions.
6.4. Experimental results of IKH for data clustering

To evaluate the performance of the proposed algorithm for data clustering, six datasets have been used. The datasets, namely Iris, Wine, Glass, Wisconsin Breast Cancer (WBC), Contraceptive Method Choice (CMC), Vowel and Liver Disorder (LD), are collected from the UCI Machine Learning Repository [33] and are described in Table 7.

In order to evaluate the performance and accuracy of the clustering result, the total intra-cluster distance defined in Eq. (12) is used as the criterion: the lower this sum, the higher the quality of the clustering. To compare the performance of our proposed algorithm, several heuristic methods from the literature are used, namely K-means, K-means++ [55], GA [49], SA [48], TS [52], ACO [51], HBMO [54] and PSO [53], whose results are taken directly from Refs. [56,57] and given in Table 8. Table 8 lists the best, worst and average solutions and their standard deviations, and ranks the algorithms based on the mean value for all datasets in Table 7. As the compared results are taken directly from Ref. [56], the KH and IKH algorithms are executed 100 times independently with the same parameters as described in Section 5, except that the number of krill individuals is set to 25 and the maximum number of generations to 400 (Max. NFE = 10,000).

The experimental results given in Table 8 show that the proposed algorithm obtains near-optimal solutions compared with the other methods, achieving much better results for almost all datasets with a small standard deviation. For the Iris dataset, the KH and IKH algorithms converge to 96.6555 in every run. For the Wine dataset, IKH obtains better worst, mean and standard deviation values than K-means, K-means++, GA, SA, TS, ACO, HBMO, PSO and KH; KH obtains the best value of 16292.19 but fails to reach it in all runs, and thus ranks sixth among all the algorithms. For the Glass dataset, KH and IKH achieve best values of 210.242 and 210.252, respectively, within 10,000 fitness function evaluations. The worst value of IKH is 215.9355, while the worst values over 100 runs for K-means, K-means++, GA, SA, TS, ACO, HBMO, PSO and KH are 227.35, 223.71, 286.77, 287.18, 286.47, 280.08, 249.54, 283.52 and 251.2749, respectively; IKH thus reaches a near-best value in every run.

The best solution for the Cancer dataset obtained by both IKH and KH is 2964.387. In some cases KH gets stuck in a local minimum and therefore does not reach an optimal mean value compared with IKH. For the CMC dataset, IKH achieves best, worst and mean solutions of 5693.72, 5693.779 and 5693.735 with a standard deviation of 0.007975, while K-means, K-means++, GA, SA, TS, ACO, HBMO and PSO fail to obtain the best solution. Similar to the Cancer dataset, KH achieves the best solution of 5693.72 for the CMC dataset but fails to reach it in all runs. For the Vowel dataset, PSO performs better than all the other algorithms, including IKH; nevertheless, KH and IKH achieve the best solution of 148967.24. In conclusion, our proposed scheme reaches better optimal solutions with a small standard deviation within a limited number of iterations, and based on the ranking analysis IKH obtains the first rank among all the methods. The time consumption of the algorithms is nearly the same for each data clustering problem.
Table 8
Comparison of objective function values for different datasets with other methods. Columns, in order: K-means, K-means++, GA, SA, TS, ACO, HBMO, PSO, KH, IKH.

Iris Best 97.3259 97.3259 113.98650 97.4573 97.36597 97.10077 96.752047 96.8942 96.6555 96.6555
Iris Worst 123.9695 122.2789 139.778272 102.0100 98.569485 97.808466 97.757625 97.8973 96.6555 96.6555
Iris Mean 106.5766 98.5817 125.197025 99.9570 97.868008 97.171546 96.95316 97.2328 96.6555 96.6555
Iris Std 12.938 5.578 14.563 2.01 0.53 0.367 0.531 0.347168 1.9E − 06 9.8E − 06
Iris Rank 9 7 10 8 6 4 3 5 1 1
Glass Best 215.73 15.36 282.32 275.16 283.79 273.46 247.71 270.57 210.2421 210.252
Glass Worst 227.35 223.71 286.77 287.18 286.47 280.08 249.54 283.52 251.2749 222.8008
Glass Mean 218.70 217.56 278.37 282.19 279.87 269.72 245.73 275.71 215.7225 215.9355
Glass Std 2.456 2.455 4.138712 4.238 4.192734 3.584829 2.438120 4.557134 5.44876 2.737919
Glass Rank 4 3 8 10 9 6 5 7 1 2
Cancer Best 2988.43 2986.96 3249.46 2993.45 3251.37 3046.06 3112.42 2973.50 2964.387 2964.387
Cancer Worst 2999.19 2988.43 3427.43 3421.95 3434.16 3242.01 3210.78 3318.88 3580.312 2964.393
Cancer Mean 2988.99 2987.99 2999.32 3239.17 2982.84 2970.49 2989.94 3050.04 2971.977 2964.389
Cancer Std 2.469 0.689 229.734 230.192 232.217 90.50028 103.471 110.8013 62.26148 0.001258
Cancer Rank 6 5 8 10 4 2 7 9 3 1
CMC Best 5703.20 5703.20 5756.5984 5849.03 5993.594 5819.1347 5713.9800 5700.985 5693.72 5693.72
CMC Worst 5704.57 5705.37 5812.6480 5966.94 5999.805 5912.4300 5725.3500 5923.249 6755.956 5693.779
CMC Mean 5705.37 5704.19 5705.6301 5893.48 5885.062 5701.9230 5699.2670 5820.965 5737.234 5693.735
CMC Std 1.033 0.955 50.3694 50.867 40.84568 45.63470 12.690000 46.95969 178.0245 0.007975
CMC Rank 5 4 10 9 8 3 2 7 6 1
Vowel Best 149,398.66 149,394.56 159,153.498 149,370.47 162,108.5381 159,458.1438 161,431.0431 148,976.0152 148,967.246 148,967.247
Vowel Worst 162,455.69 161,845.54 165,991.6520 165,986.42 165,996.4280 165,939.8260 165,804.671 149,121.1834 158,503.045 158,600.525
Vowel Mean 151,987.98 151,445.29 149,513.735 161,566.28 149,468.268 149,395.602 149,201.632 148,999.8251 150,035.986 150,172.425
Vowel Std 3425.250 3119.751 3105.5445 2847.085 2846.23516 3485.3816 2746.0416 28.8134692 1707.84248 1732.45161
Vowel Rank 9 8 5 10 4 3 2 1 6 7
Mean rank 7 5.83 7.67 9.5 6.33 3.67 3.5 5.33 3.83 2.167
Final rank 8 6 9 10 7 3 2 5 4 1
7. Conclusion

Krill herd (KH) is a recent optimization method for solving complex global optimization problems. In this paper, we presented an improved krill herd algorithm (IKH) for function optimization problems. The original krill herd saturates quickly and hence becomes trapped in local minima. To alleviate this shortcoming, the improved krill herd algorithm introduces a global exploration operator. With these modifications, the IKH algorithm converges quickly to optimal solutions. The simulation results show that our proposed method is fast and efficient for solving function optimization problems. The proposed IKH algorithm was then applied to the data clustering problem; the experimental results, compared with other methods in the literature, indicate that the proposed algorithm is also suitable for solving clustering problems. In future work we aim to:

(1) apply the proposed method to other practical engineering problems, such as scheduling, path planning, text document clustering and constrained optimization;
(2) improve the performance of KH by hybridizing it with other optimization strategies.

References

[1] A.H. Gandomi, A.H. Alavi, Krill herd: a new bio-inspired optimization algorithm, Commun. Nonlinear Sci. Numer. Simul. 17 (12) (2012) 4831–4845, http://dx.doi.org/10.1016/j.cnsns.2012.05.010.
[2] H. Haklı, H. Uguz, A novel particle swarm optimization algorithm with Levy flight, Appl. Soft Comput. 23 (2014) 333–345.
[3] R.C. Eberhart, Y. Shi, J. Kennedy, Swarm Intelligence, Morgan Kaufmann, 2001.
[4] X.-S. Yang, Engineering Optimization: An Introduction with Metaheuristic Applications, first ed., John Wiley & Sons, New Jersey, 2010.
[5] X.-S. Yang, Nature-Inspired Metaheuristic Algorithms, second ed., Luniver Press, 2010.
[6] D.M. Van, A.P. Engelbrecht, Data clustering using particle swarm optimization, in: Proceedings of the IEEE Congress on Evolutionary Computation, Canberra, Australia, 2003, pp. 215–220.
[7] X. Cui, T. Potok, P. Palathingal, Document clustering using particle swarm optimization, in: Proc. of the IEEE Swarm Intelligence Symposium, IEEE Press, 2005.
[8] A. Sarangi, R.K. Mahapatra, S.P. Panigrahi, DEPSO and PSO-QI in digital filter design, Expert Syst. Appl. 38 (2011) 10966–10973.
[9] Y.-P. Chang, C.-N. Ko, A PSO method with nonlinear time-varying evolution based on neural network for design of optimal harmonic filters, Expert Syst. Appl. 36 (2009) 6809–6816.
[10] H.-C. Yang, S.-B. Zhang, K.-Z. Deng, P.-J. Du, Research into a feature selection method for hyperspectral imagery using PSO and SVM, J. China Univ. Min. Technol. 17 (2007) 473–478.
[11] D. Karaboga, B. Basturk, A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm, J. Glob. Optim. 39 (3) (2007) 459–471, http://dx.doi.org/10.1007/s10898-007-9149-x.
[12] M. Dorigo, T. Stutzle, Ant Colony Optimization, MIT Press, Cambridge, 2004.
[13] R. Storn, K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Glob. Optim. 11 (4) (1997) 341–359.
[14] X. Li, M. Yin, Application of differential evolution algorithm on self-potential data, PLoS One 7 (12) (2012), http://dx.doi.org/10.1371/journal.pone.0051199.
[15] D. Simon, Biogeography-based optimization, IEEE Trans. Evol. Comput. 12 (6) (2008) 702–713.
[16] H. Beyer, The Theory of Evolution Strategies, Springer, New York, 2001.
[17] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, New York, 1998.
[18] B. Shumeet, Population-Based Incremental Learning: A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning, Carnegie Mellon University, Pittsburgh, 1994.
[19] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, 1995, pp. 1942–1948.
[20] W. Khatib, P. Fleming, The stud GA: a mini revolution?, in: A. Eiben, T. Back, M. Schoenauer, H. Schwefel (Eds.), Proceedings of the 5th International Conference on Parallel Problem Solving from Nature, Springer-Verlag, London, 1998, pp. 683–691.
[21] X.-S. Yang, S. Deb, Cuckoo search via Levy flights, in: World Congress on Nature & Biologically Inspired Computing, IEEE, USA, 2009, pp. 210–214.
[22] X.-S. Yang, Firefly algorithm, Levy flights and global optimization, in: M. Bramer, R. Ellis, M. Petridis (Eds.), Research and Development in Intelligent Systems XXVI, Springer, London, 2010, pp. 209–218.
[23] L.-Y. Chuang, H.-W. Chang, C.-J. Tu, C.-H. Yang, Improved binary PSO for feature selection using gene expression data, Comput. Biol. Chem. 32 (2008) 29–38.
[24] Y. Zhang, D. Huang, M. Ji, F. Xie, Image segmentation using PSO and PCM with Mahalanobis distance, Expert Syst. Appl. 38 (2011) 9036–9040.
[25] J.-P. Yang, C.-K. Kung, F.-T. Liu, Y.-J. Chen, C.-Y. Chang, R.-C. Hwang, Logic circuit design by neural network and PSO algorithm, in: 2010 First International Conference on Pervasive Computing, Signal Processing and Applications (PCSPA), Harbin, China, 2010, pp. 456–459.
[26] R. Jensi, G. Wiselin Jiji, Hybrid data clustering approach using k-means and flower pollination algorithm, Adv. Comput. Intell. 2 (2) (2015).
[27] R. Jensi, G. Wiselin Jiji, MBA-LF: a new data clustering method using modified bat algorithm and Levy flight, ICTACT J. Soft Comput. 6 (1) (2015) 1093–1101.
[28] G.-G. Wang, A.H. Gandomi, A.H. Alavi, Stud krill herd algorithm, Neurocomputing 128 (2014) 363–370.
[29] G.-G. Wang, A.H. Gandomi, A.H. Alavi, An effective krill herd algorithm with migration operator in biogeography-based optimization, Appl. Math. Model. 38 (2014) 2454–2462.
[30] G.-G. Wang, A.H. Gandomi, A.H. Alavi, G.-S. Hao, Hybrid krill herd algorithm with differential evolution for global numerical optimization, Neural Comput. Appl. 25 (2014) 297–308.
[31] G.-G. Wang, L. Guo, A.H. Gandomi, L. Cao, A.H. Alavi, H. Duan, J. Li, Lévy-flight krill herd algorithm, Math. Probl. Eng. 2013 (2013), Article ID 682073, http://dx.doi.org/10.1155/2013/682073.
[32] R. Jensi, G. Wiselin Jiji, A survey on optimization approaches to text document clustering, Int. J. Comput. Sci. Appl. 3 (6) (2013) 31–44.
[33] C.L. Blake, C.J. Merz, University of California at Irvine Repository of Machine Learning Databases, 1998, http://www.ics.uci.edu/mlearn/MLRepository.html.
[34] Krill herd algorithm, MATLAB Central File Exchange, http://www.mathworks.com/matlabcentral/fileexchange/55486-krill-herd-algorithm.
[35] R. Jensi, G. Wiselin Jiji, An enhanced particle swarm optimization with Levy flight for global optimization, Appl. Soft Comput. 43 (2016), http://dx.doi.org/10.1016/j.asoc.2016.02.018.
[36] X. Yao, Y. Liu, G. Lin, Evolutionary programming made faster, IEEE Trans. Evolut. Comput. 3 (1999) 82–102.
[37] G. Wang, L. Guo, H. Wang, H. Duan, L. Liu, J. Li, Incorporating mutation scheme into krill herd algorithm for global numerical optimization, Neural Comput. Appl. 24 (3) (2014) 853–871, http://dx.doi.org/10.1007/s00521-012-1304-8.
[38] L. Guo, G.-G. Wang, A.H. Gandomi, A.H. Alavi, H. Duan, A new improved krill herd algorithm for global numerical optimization, Neurocomputing 138 (2014) 392–402.
[39] G.-G. Wang, A.H. Gandomi, A.H. Alavi, Y.-Q. Dong, A hybrid meta-heuristic method based on firefly algorithm and krill herd, IGI Global, 2015, pp. 521–540, http://dx.doi.org/10.4018/978-1-4666-9479-8.ch019.
[40] D. Karaboga, B. Akay, A comparative study of artificial bee colony algorithm, Appl. Math. Comput. 214 (2009) 108–132.
[41] M. Jamil, X.-S. Yang, A literature survey of benchmark functions for global optimization problems, Int. J. Math. Model. Numer. Optim. 4 (2) (2013) 150–194.
[42] A.R. Alroomi, The Farm of Unconstrained Benchmark Functions, University of Bahrain, Electrical and Electronics Department, Bahrain, October 2013, http://www.al-roomi.org/cv/publications.
[43] J. Li, Y. Tang, C. Hua, X. Guan, An improved krill herd algorithm: krill herd with linear decreasing step, Appl. Math. Comput. 234 (2014) 356–367.
[44] L. Li, Y. Zhou, J. Xie, A free search krill herd algorithm for functions optimization, Math. Probl. Eng. 2014 (2014), http://dx.doi.org/10.1155/2014/936374.
[45] G.-G. Wang, A.H. Gandomi, X.-S. Yang, A.H. Alavi, A new hybrid method based on krill herd and cuckoo search for global optimisation tasks, Int. J. Bio-Inspired Comput. (in press).
[46] G.-G. Wang, L. Guo, A.H. Gandomi, A.H. Alavi, H. Duan, Simulated annealing-based krill herd algorithm for global optimization, Abstr. Appl. Anal. 2013 (2013), http://dx.doi.org/10.1155/2013/213853.
[47] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Elsevier, 2010.
[48] S.Z. Selim, K.S. Al-Sultan, A simulated annealing algorithm for the clustering problem, Pattern Recognit. 24 (10) (1991) 1003–1008.
[49] U. Maulik, S. Bandyopadhyay, Genetic algorithm-based clustering technique, Pattern Recognit. 33 (2000) 1455–1465.
[50] C. Sung, H. Jin, A tabu-search-based heuristic for clustering, Pattern Recognit. 33 (2000) 849–858.
[51] P.S. Shelokar, V.K. Jayaraman, B.D. Kulkarni, An ant colony approach for clustering, Anal. Chim. Acta 509 (2) (2004) 187–195.
[52] Y. Liu, Z. Yi, H. Wu, M. Ye, K. Chen, A tabu search approach for the minimum sum-of-squares clustering problem, Inf. Sci. 178 (2008) 2680–2704.
[53] Y.-T. Kao, E. Zahara, I.-W. Kao, A hybridized approach to data clustering, Expert Syst. Appl. 34 (3) (2008) 1754–1762.
[54] M. Fathian, B. Amiri, A honey-bee mating approach on clustering, Int. J. Adv. Manuf. Technol. 38 (2008) 809–821.
[55] D. Arthur, S. Vassilvitskii, K-means++: the advantages of careful seeding, in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '07), Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2007, pp. 1027–1035.
[56] T. Niknam, B. Amiri, An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis, Appl. Soft Comput. 10 (2010) 183–197.
[57] G. Krishnasamy, A.J. Kulkarni, R. Paramesran, A hybrid approach for data clustering based on modified cohort intelligence and K-means, Expert Syst. Appl. 41 (2014) 6009–6016.