Applied Soft Computing 10 (2010) 806–812


Honey Bees Mating Optimization algorithm for financial classification problems


Magdalene Marinaki a, Yannis Marinakis b, Constantin Zopounidis c,*
a Technical University of Crete, Department of Production Engineering and Management, Industrial Systems Control Laboratory, 73100 Chania, Greece
b Technical University of Crete, Department of Production Engineering and Management, Decision Support Systems Laboratory, 73100 Chania, Greece
c Technical University of Crete, Department of Production Engineering and Management, Financial Engineering Laboratory, University Campus, 73100 Chania, Greece

A R T I C L E  I N F O

Article history:
Received 2 December 2008
Received in revised form 8 September 2009
Accepted 10 September 2009
Available online 17 September 2009

Keywords:
Honey Bees Mating Optimization
Metaheuristics
Feature selection problem
Financial classification problem

A B S T R A C T

Nature inspired methods are approaches used in various fields for the solution of a number of problems. This study uses a nature inspired method, namely Honey Bees Mating Optimization, which is based on the mating behaviour of honey bees, for a financial classification problem. Financial decisions are often based on classification models which are used to assign a set of observations into predefined groups. One important step towards the development of accurate financial classification models is the selection of the appropriate independent variables (features) which are relevant for the problem at hand. The proposed method uses the Honey Bees Mating Optimization algorithm for the feature selection step, while Nearest Neighbor based classifiers are used for the classification step. The performance of the method is tested on a financial classification task involving credit risk assessment. The results of the proposed method are compared with the results of a particle swarm optimization algorithm, an ant colony optimization algorithm, a genetic algorithm and a tabu search algorithm.
© 2009 Elsevier B.V. All rights reserved.

1. Introduction
Several biological and natural processes have been influencing methodologies in science and technology in an increasing manner in recent years. Feedback control processes, artificial neurons, the description of the DNA molecule and related genomics matters, studies of the behaviour of natural immunological systems, and more, represent some of the very successful domains of this kind in a variety of real world applications. During the last decade, nature inspired intelligence has become increasingly popular through the development and utilization of intelligent paradigms in advanced information systems design. Cross-disciplinary team-based thinking attempts to cross-fertilize engineering and life science understanding into advanced inter-operable systems. These methods contribute to technological advances driven by concepts from nature/biology, including advances in structural genomics (intelligent drug design through imprecise databases), mapping of genes to proteins and proteins to genes (one-to-many and many-to-one characteristics of naturally occurring organisms), modelling of complete cell structures (showing modularity and hierarchy), functional genomics (handling of hybrid sources and heterogeneous and inconsistent origins of disparate databases), self-organization of natural systems, etc. Among the most popular nature inspired approaches, when the task is optimization within complex domains of data or information, are those methods representing successful animal and micro-organism team behaviour, such as swarm or flocking intelligence (bird flocks or fish schools inspired particle swarm optimization [1]), artificial immune systems (that mimic the biological one [2,3]), or ant colonies (ants' foraging behaviours gave rise to ant colony optimization [4,5]), etc.

* Corresponding author. Tel.: +30 28210 37236; fax: +30 28210 69410.
E-mail addresses: magda@dssl.tuc.gr (M. Marinaki), marinakis@ergasya.tuc.gr (Y. Marinakis), kostas@dpem.tuc.gr (C. Zopounidis).
1568-4946/$ – see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.asoc.2009.09.010
In recent years, a number of swarm intelligence algorithms based on the behaviour of bees have been presented [6]. These algorithms are divided, mainly, into two categories according to the bees' behaviour in nature: the foraging behaviour and the mating behaviour. The most important approaches that simulate the foraging behaviour of the bees are the Artificial Bee Colony (ABC) algorithm proposed by Karaboga and Basturk [7,8], the Virtual Bee algorithm proposed by Yang [9], the Bee Colony Optimization algorithm proposed by Teodorovic and Dell'Orco [10], the BeeHive algorithm proposed by Wedde et al. [11], the Bee Swarm Optimization algorithm proposed by Drias et al. [12] and the Bees algorithm proposed by Pham et al. [13]. The Artificial Bee Colony algorithm [7,8] is mainly applied to continuous optimization problems and simulates the waggle dance behaviour that a swarm of bees performs during the foraging process. In this algorithm there are three groups of bees: the employed bees (bees that determine the food source (possible solutions) from a prespecified set of food sources and share this information


(waggle dance) with the other bees in the hive), the onlooker bees (bees that, based on the information they take from the employed bees, search for a better food source in the neighborhood of the memorized food sources) and the scout bees (employed bees whose food source has been abandoned and who search for a new food source randomly). The Virtual Bee algorithm [9] is also applied to continuous optimization problems. In this algorithm, each bee of the population is associated with a memory, a food source, and then all the memories communicate between them with a waggle dance procedure. The whole procedure is similar to a genetic algorithm, and it has been applied to two function optimization problems with two parameters. In the BeeHive algorithm [11], a protocol inspired by the dance language and foraging behaviour of honey bees is used. In the Bee Swarm Optimization algorithm [12], initially a bee finds an initial solution (food source) and from this solution the other solutions are produced with certain strategies. Then, every bee is assigned to a solution and, when they have accomplished their search, the bees communicate between them with a waggle dance strategy and the best solution becomes the new reference solution. To avoid cycling, the authors use a tabu list. In the Bees algorithm [13], a population of initial solutions (food sources) is randomly generated. Then, the bees are assigned to the solutions based on their fitness function. The bees return to the hive and, based on their food sources, a number of bees are assigned to the same food source in order to find a better neighborhood solution. In the Bee Colony Optimization algorithm [10], a step by step solution is produced by each forager bee, and when the foragers return to the hive a waggle dance is performed by each forager. Then the other bees, based on a probability, follow the foragers. This algorithm resembles the Ant Colony Optimization algorithm [5] but does not use the concept of pheromone trails at all.
In contrast to the many algorithms that are based on the foraging behaviour of the bees, the main algorithm proposed on the basis of the mating behaviour is the Honey Bees Mating Optimization (HBMO) algorithm, presented in [14,15]. Since then, it has been used in a number of different applications [16–18]. The Honey Bees Mating Optimization algorithm simulates the mating process of the queen of the hive. The mating process begins when the queen flies away from the nest, performing the mating flight during which the drones follow the queen and mate with her in the air. The algorithm is a swarm intelligence algorithm since it uses a swarm of bees in which there are three kinds of bees: the queen, the drones and the workers. There are a number of procedures that can be applied inside the swarm. In the Honey Bees Mating Optimization algorithm, the procedure of the mating of the queen with the drones is described. From this point of view, one could classify the HBMO algorithm as a memetic algorithm, since we have an elitist genetic algorithm where the queen plays the role of a super-parent. But of course this method is not a simple memetic algorithm, because a number of details differentiate the HBMO algorithm from one. First, the queen flies randomly in the air and, based on her speed and her energy, if she meets a drone then there is a probability of mating with him. Even if the queen mates with the drone, she does not directly create a brood but stores the genotype (by the term genotype we mean some of the basic characteristics of the drone, i.e. part of the solution) of the drone in her spermatheca, and the brood is created only when the mating flight has been completed.
Another difference of the proposed algorithm from a memetic algorithm is that the broods are not created by using one queen and one drone, but each brood uses parts of the solutions (genotypes) of the one queen and more than one drone. In our proposed algorithm, in addition to this classic procedure that has been used by researchers in the previously published algorithms based on Honey Bees Mating Optimization [14–18], we also use an adaptive memory procedure so that the queen has the possibility of storing parts of the solutions of good drones selected in previous mating flights, in order to use them in a new mating flight and to produce fitter drones. Another difference from a classic memetic algorithm is the role of the workers. One could say that, since the role of the workers is simply brood care and they are only a local search phase in the algorithm, we have one of the basic characteristics of memetic algorithms (a genetic algorithm with a local search phase [19]). But here we have a strict parallelism of the local search phase with what happens in real life, i.e. with the foraging behaviour of the honey bees. We mean that each one of the workers, which are different honey bees, takes care of one brood in order to find food for it and feed it with royal jelly so as to make it fitter, and, if this brood is better than the queen, it takes her place. Thus, this algorithm combines both the mating process of the queen and one part of the foraging behaviour of the honey bees inside the hive.
This paper presents a novel approach to solving the Feature Subset Selection Problem using the Honey Bees Mating Optimization algorithm. In the classification phase of the proposed algorithm, a number of variants of the Nearest Neighbor classification method are used [20]. The algorithm is applied to the credit risk assessment classification task, which is a very challenging and important management science problem in the domain of financial analysis [21]. Modern finance is a broad field often involved with hard decision-making problems related to risk management. In several cases, financial decision-making problems require the assignment of the available options into predefined groups/classes. Credit risk analysis, bankruptcy prediction, and country risk assessment, among others, are some typical examples [22]. In this context, the development of reliable classification models is clearly of major importance to researchers and practitioners. The development of financial classification models is a complicated process, involving careful data collection and pre-processing, model development, validation and implementation. Focusing on model development, several methods have been used, including statistical methods, artificial intelligence techniques and operations research methodologies. In all cases, the quality of the data is a fundamental point. This is mainly related to the adequacy of the sample data in terms of the number of observations and the relevance of the decision attributes (i.e., independent variables) used in the analysis. During the last years, applications of nature inspired methods to financial problems have been developed [23–28]. More precisely, in [24] a procedure that utilizes a genetic algorithm to solve the Feature Subset Selection Problem is presented and is combined with a number of Nearest Neighbor based classifiers. The genetic based classification algorithm is applied to the solution of the credit risk assessment classification problem. In [25], a memetic algorithm based on the concepts of genetic algorithms and particle swarm optimization is presented. Contrary to genetic algorithms, in this algorithm the evolution of each individual of the population is performed using a particle swarm optimization algorithm. The memetic based classification algorithm is combined with a number of nearest neighbor based classifiers and is tested on a very significant financial classification task, involving the identification of qualified audit reports. In [26] a tabu search metaheuristic combined with a number of Nearest Neighbor classifiers is applied to the solution of the credit risk assessment problem


while in [27] an ant colony optimization algorithm combined with a number of Nearest Neighbor classifiers for the solution of the same financial classification problem is presented. All the methods presented in [24–27] gave very satisfactory results; in all cases their results were better than those of the classic metaheuristic algorithms they were compared against.
The rest of the paper is organized as follows: the next section provides a short description of the feature selection problem. In Section 3, a detailed analysis of the proposed algorithm is presented. Section 4 describes the application context, using the aforementioned financial data set, and the experimental settings, whereas Section 5 presents the obtained computational results. The last section concludes the paper and discusses some future research directions.
2. Feature selection problem
Recently, there has been an increasing need for novel data-mining methodologies that can analyze and interpret large volumes of data. The proper selection of the right set of features for classification is one of the most important problems in designing a good classifier. Feature selection is widely used as the first stage of a classification task in order to reduce the dimension of the problem, decrease noise, and improve speed through the elimination of irrelevant or redundant features. The basic feature selection problem is an optimization problem, with a performance measure for each subset of features quantifying its ability to classify the samples. The problem is to search through the space of feature subsets to identify the optimal or near-optimal subset with respect to the performance measure.
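This formulation can be sketched in code (the paper's implementation is in Fortran 90; the Python below is purely illustrative). The exhaustive search shown here is only feasible for small d, which is exactly why metaheuristic search is used instead; the performance measure is a hypothetical stand-in for a classifier's accuracy:

```python
from itertools import combinations

def best_subset(d, score):
    """Exhaustively search all non-empty subsets of features {0, ..., d-1}
    and return the subset maximizing the performance measure `score`.
    Feasible only for small d (2^d - 1 candidates)."""
    best, best_val = None, float("-inf")
    for r in range(1, d + 1):
        for subset in combinations(range(d), r):
            val = score(subset)
            if val > best_val:
                best, best_val = subset, val
    return best, best_val

# Hypothetical measure: reward subsets containing features 0 and 2,
# penalize subset size to mimic a preference for small feature sets.
score = lambda s: 10 * ((0 in s) + (2 in s)) - len(s)
print(best_subset(4, score))  # -> ((0, 2), 18)
```

A metaheuristic such as HBMO replaces the exhaustive loop with a guided stochastic search over the same subset space.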
In the literature, many successful feature selection algorithms have been proposed. These algorithms can be classified into two categories based on whether features are selected independently of the learning algorithm used to construct the classifier. If feature selection depends on the learning algorithm, the approach is referred to as a wrapper model. Otherwise, it is said to be a filter model. Filters, such as mutual information (MI), are based on statistical tools. Wrappers assess subsets of features according to their usefulness to a given classifier.
Unfortunately, finding the optimum feature subset has been proved to be NP-hard [29]. Many algorithms have, thus, been proposed to find suboptimal solutions in a comparably smaller amount of time [30]. The Branch and Bound (BB) approaches [31], the Sequential Forward/Backward Search (SFS/SBE) [32–34] and the filter approaches [33,35] search for suboptimal solutions. One of the most important filter approaches is Kira and Rendell's Relief algorithm [36]. Stochastic algorithms, including Simulated Annealing (SA) [37], Scatter Search algorithms [38], Ant Colony Optimization [39–43] and Genetic Algorithms (GA) [33,44], are of great interest because they often yield high accuracy and are much faster.

3. The Honey Bees Mating Optimization algorithm

In this paper, as has already been mentioned, an algorithm for the solution of the feature selection problem based on the Honey Bees Mating Optimization is presented. This algorithm is combined with three Nearest Neighbor based classifiers: the 1-Nearest Neighbor, the k-Nearest Neighbor and the Weighted k (wk)-Nearest Neighbor classifier. A pseudocode of the proposed Honey Bees Mating Optimization based classification algorithm is presented in the following, and then an analytical description of the steps of the algorithm is given.

3.1. Nearest Neighbor classifiers

Initially, the classic 1-Nearest Neighbor (1-nn) method is used. The 1-nn works as follows: in each iteration of the feature selection algorithm, a number of features are activated. For each sample of the test set, the Euclidean distance from each sample in the training set is calculated as follows:

$$D_{ij} = \sqrt{\sum_{l=1}^{d} |x_{il} - x_{jl}|^{2}} \qquad (1)$$

where $D_{ij}$ is the distance between the test sample $i = 1, \ldots, M_{test}$ ($M_{test}$ is the number of test samples) and the training sample $j = 1, \ldots, M_{train}$ ($M_{train}$ is the number of training samples), and $d$ is the number of activated features in each iteration. With this procedure the nearest sample from the training set is found. Thus, each test sample is classified into the class to which its nearest sample from the training set belongs.

The previous approach may be extended to the k-Nearest Neighbor (k-nn) method, where we examine the k nearest samples from the training set and, then, classify the test sample by using a

voting scheme. The most common way is to choose the class that is most represented among the k nearest samples from the training set.

Thus, the k-nn method makes a decision based on the majority class membership among the k nearest neighbors of an unknown sample. In other words, every member among the k nearest neighbors has an equal weight in the vote.
However, it is natural to attach more weight to those members that are closer to the test sample. This method is called the Weighted k-Nearest Neighbor (wk-nn) method. Here, the most distant of the k neighbors from the test sample is denoted by $i = 1$ while the nearest neighbor is denoted by $i = k$. The $i$th neighbor receives the weight

$$w_i = \frac{i}{\sum_{j=1}^{k} j} \qquad (2)$$

Thus, the following hold:

$$w_k \geq w_{k-1} \geq \cdots \geq w_1 > 0 \qquad (3)$$

$$w_k + w_{k-1} + \cdots + w_1 = 1. \qquad (4)$$

3.2. Fitness function


The fitness function measures the quality of the produced members of the population. In this problem, the quality is measured by the overall classification accuracy (see below for a complete description). Thus, for each bee the classifier (1-Nearest Neighbor, k-Nearest Neighbor or wk-Nearest Neighbor) is called and the produced overall classification accuracy (OCA) gives the fitness function, which we would like to maximize. The accuracy of a C-class problem can be described using a $C \times C$ confusion matrix. The element $c_{ij}$ in row $i$ and column $j$ gives the number of samples of true class $j$ classified as class $i$; i.e., all correctly classified samples are placed on the diagonal and the remaining misclassified cases in the upper and lower triangular parts. Thus, the overall classification accuracy (OCA) can be easily obtained as:
$$\mathrm{OCA} = 100 \cdot \frac{\sum_{i=1}^{C} c_{ii}}{\sum_{i=1}^{C} \sum_{j=1}^{C} c_{ij}} \qquad (5)$$

3.3. HBMO algorithm

Initially, we have to choose the population of the honey bees that will configure the initial hive. Each bee is randomly placed in the d-dimensional space as a candidate solution (in the feature selection problem d corresponds to the number of activated features). One of the key issues in designing a successful algorithm for the feature selection problem is to find a suitable mapping between feature selection problem solutions and bees in the Honey Bees Mating Optimization algorithm. Every candidate solution in HBMO is mapped into a binary bee, where the bit 1 denotes that the corresponding feature is selected and the bit 0 denotes that the feature is not selected. Afterwards, the fitness of each individual is calculated as described previously, and the best member of the initial population of bees is selected as the queen of the hive while all the other members of the population are the drones.

Before the process of mating begins, the user has to define a number that corresponds to the queen's spermatheca size. This number corresponds to the maximum number of matings of the queen in a single mating flight. Each time the queen successfully mates with a drone, the genotype of the drone is stored and a counter is increased by one until the size of the spermatheca is reached. Another two parameters have to be defined: the number of queens and the number of broods that will be born by all queens. In this implementation of the Honey Bees Mating Optimization (HBMO) algorithm, the number of queens is set equal to one, because in real life only one queen survives in a hive, and the number of broods is set equal to the queen's spermatheca size. Then, we are ready to begin the mating flight of the queen. At the start of the flight, the queen is initialized with some energy content (initially, the speed and the energy of the queen are generated at random) and returns to her nest when her energy falls within some threshold of zero or her spermatheca is full [16]. A drone mates with the queen probabilistically using the following annealing function [14,15]:

$$\mathrm{Prob}(D) = e^{-\Delta(f)/\mathrm{Speed}(t)} \qquad (6)$$

where $\mathrm{Prob}(D)$ is the probability of adding the sperm of drone D to the spermatheca of the queen (that is, the probability of a successful mating), $\Delta(f)$ is the absolute difference between the fitness of D and the fitness of the queen, and $\mathrm{Speed}(t)$ is the speed of the queen at time t. The probability of mating is high when the queen is still at the beginning of her mating flight, and therefore her speed is high, or when the fitness of the drone is as good as the queen's. After each transition in space, the queen's speed and energy decay according to the following equations:

$$\mathrm{Speed}(t+1) = \alpha \cdot \mathrm{Speed}(t) \qquad (7)$$

$$\mathrm{Energy}(t+1) = \alpha \cdot \mathrm{Energy}(t) \qquad (8)$$

where $\alpha \in [0, 1]$ is the factor by which the speed and the energy are reduced after each transition and each step. A number of mating flights are realized. If the mating is successful (i.e., the drone passes
the probabilistic decision rule), the drone's sperm is stored in the queen's spermatheca. By crossing over the drones' genotypes with the queen's, a new brood (trial solution) is formed, which can later be improved by employing workers to conduct local search. One of the major differences of the Honey Bees Mating Optimization algorithm from classic evolutionary algorithms is that, since the queen stores a number of different drones' sperm in her spermatheca, she can use parts of the genotypes of different drones to create a new solution, which gives the possibility of producing fitter broods. In the crossover phase, we use a crossover procedure which initially identifies the common characteristics of the queen and the drone and, then, passes them on to the broods. This crossover operator is a kind of adaptive memory procedure. The adaptive memory was initially proposed by Rochat and Taillard [45] as part of a tabu search metaheuristic for the solution of the Vehicle Routing Problem. This procedure stores characteristics of good solutions, and each time a new good solution is found the adaptive memory is updated. In our case, in the first generation the adaptive memory is empty. In order to add a solution or a part of a solution to the adaptive memory there are a number of possibilities:

(1) The candidate solution for the adaptive memory is a previous best solution (queen) and its fitness function value is at most 10% worse than the value of the current best solution.
(2) The candidate solution for the adaptive memory is a member of the population (drone) and its fitness function value is at most 10% worse than the value of the current best solution.
(3) A sequence of activated features is common for the queen and for a number of drones.

More analytically, in this crossover operator, the points are selected randomly from the adaptive memory, from the selected drones and from the queen. Thus, initially two crossover operator

numbers are selected ($Cr_1$ and $Cr_2$) that control the fraction of the parameters that are selected from the adaptive memory, the selected drones and the queen. If there are common parts in the solutions (queen, drones and adaptive memory), then these common parts are inherited by the brood; otherwise the $Cr_1$ and $Cr_2$ values are compared with the output of a random number generator, $rand_i(0, 1)$. If the random number is less than or equal to $Cr_1$, the corresponding value is inherited from the queen; if the random number is between $Cr_1$ and $Cr_2$, then the corresponding value is inherited, randomly, from one of the elite solutions that are in the adaptive memory; otherwise it is selected, randomly, from the solutions of the drones that are stored in the spermatheca. Thus, if the solution of the brood is denoted by $b_i(t)$ ($t$ is the iteration number), the solution of the queen by $q_i(t)$, the solution in the adaptive memory by $ad_i(t)$ and the solution of the drone by $d_i(t)$:

$$b_i(t) = \begin{cases} q_i(t), & \text{if } rand_i(0,1) \leq Cr_1 \\ ad_i(t), & \text{if } Cr_1 < rand_i(0,1) \leq Cr_2 \\ d_i(t), & \text{otherwise.} \end{cases} \qquad (9)$$
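The per-bit decision rule of Eq. (9) can be sketched as follows (illustrative Python, not the paper's Fortran implementation). The Cr1/Cr2 default values are assumptions, and for simplicity a single drone is passed in, whereas in the full algorithm the drone bit comes from the drones stored in the spermatheca:

```python
import random

def brood_crossover(queen, drone, elite_memory, cr1=0.8, cr2=0.9):
    """Crossover sketch following Eq. (9). Bits common to queen and drone
    are inherited directly; for the rest, a uniform draw picks the queen's
    bit (draw <= cr1), a bit from a random elite solution in the adaptive
    memory (cr1 < draw <= cr2), or the drone's bit (otherwise)."""
    brood = []
    for i, (q, d) in enumerate(zip(queen, drone)):
        if q == d:                          # common characteristic: keep it
            brood.append(q)
            continue
        r = random.random()
        if r <= cr1:
            brood.append(q)
        elif r <= cr2 and elite_memory:
            brood.append(random.choice(elite_memory)[i])
        else:
            brood.append(d)
    return brood
```

Whatever the random draws, bits where the queen and the drone agree always survive into the brood, which is the "common characteristics" property the crossover operator is built around.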
In each iteration the adaptive memory is updated based on the best solution. In real life, the role of the workers is restricted to brood care, and for this reason the workers are not separate members of the population but are used as local search procedures in order to improve the broods produced by the mating flight of the queen. Each worker has different capabilities, and the choice of two different workers may produce different solutions. Each worker has the possibility to activate or deactivate a number of different features. Each brood chooses, randomly, one worker to feed it with royal jelly (local search phase), with the result that the brood may replace the queen if its solution is better than the solution of the current queen. If the brood fails to replace the queen, then in the next mating flight of the queen this brood will be one of the drones. The number of drones remains constant in each iteration, and the population of drones consists of the fittest drones of all generations.
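A minimal sketch of one worker's local search step, assuming (hypothetically, since the paper does not specify the workers' exact moves) that a worker is characterized by how many feature bits it may toggle:

```python
import random

def worker_local_search(brood, fitness, flips=2):
    """Worker sketch: a worker may activate or deactivate a few feature
    bits of the brood ("feeding it royal jelly"); the change is kept only
    if it improves the fitness. `flips` is an assumed worker capability."""
    candidate = list(brood)
    for i in random.sample(range(len(candidate)), flips):
        candidate[i] = 1 - candidate[i]     # toggle a feature bit
    if fitness(candidate) > fitness(brood):
        return candidate
    return list(brood)
```

Each brood would pick one of the w workers at random; workers differing in their toggle patterns can yield different improved solutions, matching the text's remark that two different workers may produce different results.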
4. Application
The nature inspired algorithm is applied to a financial classification problem related to credit risk assessment. The data, taken from Doumpos and Pasiouras [46], involve 1330 firm-year observations for UK non-financial firms over the period 1999–2001. The sample observations are classified into five risk groups according to their likelihood of default, measured on the basis of their QuiScore, a credit rating assigned by Qui Credit Assessment Ltd. In particular, on the basis of their QuiScore, the firms are classified into five risk groups as follows:

• Secure group, consisting of firms for which failure is very unusual and normally occurs only as a result of exceptional market changes.
• Stable group, consisting of firms for which failure is rare and will only come about if there are major company or market changes.
• Normal group, consisting mainly of firms that do not fail, as well as some that do.
• Unstable (caution) group, consisting of firms that have a significant risk of failure.
• High-risk group, consisting of firms which are unlikely to be able to continue their operation unless significant remedial action is undertaken.
Such multi-group credit risk rating systems are widely used in
practice, mainly by banking institutions. According to Treacy and
Carey [47] the internal reporting process in banking institutions is

Table 1
List of financial ratios.

Feature number    Financial ratio
1                 Current ratio
2                 Quick ratio
3                 Shareholders liquidity ratio
4                 Solvency ratio
5                 Asset cover
6                 Annual change in fixed assets
7                 Annual change in current assets
8                 Annual change in stock
9                 Annual change in debtors
10                Annual change in total assets
11                Annual change in current liabilities
12                Annual change in creditors
13                Annual change in loans/overdraft
14                Annual change in long-term liabilities
15                Net profit margin
16                Return on total assets
17                Interest cover
18                Stock turnover
19                Debtors turnover
20                Debtors collection (days)
21                Creditors payment (days)
22                Fixed assets turnover
23                Salaries/turnover
24                Gross profit margin
25                Earnings Before Interest and Taxes (EBIT) margin
26                Earnings Before Interest and Taxes, Depreciation and Amortization (EBITDA) margin

facilitated by credit risk management systems that are based on a multi-group scheme for rating risk. Internal reports prepared in this context are more informative to senior management and provide a better representation of the actual risk exposure of the banking institution.
The evaluation criteria used in the analysis involve ratios based on the financial statements of the firms. It should be noted that credit scoring has non-financial aspects as well, but obtaining objective, comprehensive, and reliable non-financial information is a cumbersome process, as such data are rarely available [46]. Therefore, within the scope of the current study the analysis was based only on financial information. A total of 26 financial ratios and annual changes (features) (Table 1) were initially considered, based on data availability and previous studies on credit assessment and bankruptcy prediction.
5. Computational results
The algorithms were implemented in Fortran 90 (Lahey f95 compiler) on a Centrino Mobile Intel Pentium M 750/1.86 GHz, running SuSE Linux 9.1. To test the efficiency of the proposed method, the 10-fold cross-validation procedure is utilized. Initially, the data set is divided into 10 disjoint groups containing approximately M/10 samples each, where M is the number of samples in the data set. Next, each of these groups is systematically removed from the data set, a model is built from the remaining parts (the training set) and, then, the accuracy of the model is calculated using the excluded part (the test set).
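The validation scheme can be sketched as follows (illustrative Python, not the paper's Fortran implementation); the round-robin assignment of sample indices to folds is an assumption, since the paper does not state how samples are grouped, and the `oca` helper computes the accuracy measure of Eq. (5):

```python
def ten_fold_splits(M, folds=10):
    """Split sample indices 0..M-1 into `folds` disjoint groups of roughly
    M/folds samples each; each group is removed once to serve as the test
    set while the remaining groups form the training set."""
    groups = [list(range(f, M, folds)) for f in range(folds)]
    for f in range(folds):
        test = groups[f]
        train = [i for g, grp in enumerate(groups) if g != f for i in grp]
        yield train, test

def oca(cm):
    """Overall classification accuracy of Eq. (5) from a C x C confusion
    matrix cm, where cm[i][j] counts true-class-j samples labeled as i."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return 100.0 * correct / sum(map(sum, cm))
```

With the paper's M = 1330 and 10 folds, each test set holds exactly 133 samples and each training set 1197.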
As has already been mentioned, three approaches that use different classifiers, the 1-nn, the k-nn and the wk-nn, are used. In the HBMO based classifier the value of k is changed dynamically depending on the number of iterations; each generation uses a different k. The reason why k does not have a constant value is that we would like to ensure the diversity of the bees in each iteration of HBMO. The determination of k is done by using a random number generator with a uniform distribution in (0, 1) in each iteration. Then, the produced number is converted to an integer k (e.g., if the produced number is in the interval 0.2–0.3, then k = 3).
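The mapping from the uniform draw to k can be sketched as follows; the upper bound `max_k = 10` is an assumption, as the paper does not report the range of k used:

```python
def k_from_draw(r, max_k=10):
    """Map a uniform draw r in [0, 1) to an integer k: a draw in the
    interval [0.2, 0.3) gives k = 3, matching the paper's example."""
    return int(r * max_k) + 1

print(k_from_draw(0.25))  # -> 3
```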
The parameters of the proposed algorithm were selected after
thorough testing. A number of different alternative values were
tested and the ones selected are those that gave the best
computational results concerning both the quality of the solution
and the computational time needed to achieve this solution. Thus,
the selected parameters are given in the following:
(1) Number of queens equal to 1.
(2) Number of drones equal to 200.
(3) Number of mating flights (M) equal to 50.
(4) Size of the queen's spermatheca equal to 50.
(5) Number of broods equal to 50.
(6) a = 0.9.
(7) Number of different workers (w = 7).
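For reference, the tuned HBMO settings listed above can be collected into a single configuration record (a sketch; the field names are ours, not from the paper's code, and the role of a as a reduction factor is a common HBMO convention rather than something stated here):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HBMOConfig:
    """The HBMO parameter values selected after thorough testing."""
    n_queens: int = 1
    n_drones: int = 200
    n_mating_flights: int = 50   # M
    spermatheca_size: int = 50
    n_broods: int = 50
    alpha: float = 0.9           # a, typically a speed/energy reduction factor
    n_workers: int = 7           # number of different workers (w)

cfg = HBMOConfig()
```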

For comparison purposes, four other metaheuristic algorithms
were used: a tabu search based metaheuristic, a genetic algorithm
based metaheuristic, an ant colony optimization (ACO) based
algorithm and a particle swarm optimization (PSO) based algorithm.
For details of how these methods are applied to this problem,
see [24-27]. These methods use in the classification phase the
1-nn, the k-nn and the wk-nn classifiers. In the genetic, ACO and
PSO based metaheuristics with the k-nn and wk-nn classifiers, the
value of k is changed dynamically depending on the number of
iterations; each generation uses a different k. The maximum number
of iterations for the tabu based metaheuristic is equal to 1000
and the size of the tabu list is equal to 10. The parameter
settings for the genetic based metaheuristic are:
(1) Population size equal to 200.
(2) Number of generations equal to 50.
(3) Probability of crossover equal to 0.8.
(4) Probability of mutation equal to 0.25.

The parameter settings for the ant colony based metaheuristic
are:
(1) The number of ants used is set equal to the number of features
(26 for the credit risk assessment).
(2) The number of iterations in which each ant constructs a
different solution, based on the pheromone trails, is set equal to 50.
(3) q = 0.5.
The parameter settings for the PSO based metaheuristic are:
(1) The number of swarms is set equal to 1.
(2) The number of particles is set equal to 50.
(3) The number of generations is set equal to 50.
(4) The coefficients are c1 = 2, c2 = 2.
(5) wmax = 0.9 and wmin = 0.01.
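One common way to use the wmax/wmin pair listed above is a linearly decreasing inertia weight over the generations; whether the paper uses linear decay is our assumption, so the sketch below is illustrative only:

```python
def inertia_weight(gen, n_gens=50, w_max=0.9, w_min=0.01):
    """Linear decay of the PSO inertia weight from w_max at generation 0
    down to w_min at the last generation (gen = n_gens - 1)."""
    return w_max - (w_max - w_min) * gen / (n_gens - 1)
```

A large early weight favors global exploration of feature subsets, while the small final weight concentrates the particles around the best subsets found.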

The selection of a set of appropriate input feature variables is
an important issue in building a good classifier. The purpose of
feature variable selection is to find the smallest set of features
that yields satisfactory predictive performance. Because of the
curse of dimensionality, it is often necessary and beneficial to
limit the number of input features in a classifier in order to
obtain a model that predicts well and is less computationally
intensive. In the credit risk assessment problem analyzed in this
paper, there are 2^26 - 1 possible feature combinations. The
objective of the computational experiments is to show the
performance of the proposed algorithm in searching for a reduced
set of features with high accuracy.
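The combinatorial formulation implied above can be sketched as follows: each candidate solution is a 26-bit mask selecting a feature subset, giving 2^26 - 1 non-empty subsets to search (the function names are illustrative, not from the paper):

```python
N_FEATURES = 26  # number of features in the credit risk assessment problem

def n_subsets(n=N_FEATURES):
    """Number of non-empty feature combinations: 2^n - 1."""
    return 2 ** n - 1

def select_features(sample, mask):
    """Project a sample (list of feature values) onto the masked features."""
    return [v for v, keep in zip(sample, mask) if keep]
```

In a wrapper approach such as this one, each mask is scored by training and testing a nearest neighbor classifier on the selected features only, so the search space size is what makes an exhaustive scan impractical and a metaheuristic attractive.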


Table 2
Classification results.

Methods       Overall classification accuracy (%)   Average no. of features
HBMO 1-nn     75.21                                  10.2
HBMO k-nn     74.18                                  10.3
HBMO wk-nn    73.52                                  10.6
ACO 1-nn      72.18                                  11.2
ACO k-nn      72.85                                  12
ACO wk-nn     71.72                                  11.8
PSO 1-nn      72.10                                  11.3
PSO k-nn      73.30                                  10.4
PSO wk-nn     71.05                                  11.7
Gen 1-nn      68.12                                  12.4
Gen k-nn      69.84                                  13
Gen wk-nn     69.02                                  11.2
Tabu 1-nn     67.74                                  12.2
Tabu 5-nn     66.09                                  12
Tabu w5-nn    67.36                                  12.4
1-nn          55.78                                  26

Table 3
Selection frequencies of the features in the optimal feature sets.

Features  Frequency  Relative frequency (%)    Features  Frequency  Relative frequency (%)
1         17         (56.66)                   14        8          (26.67)
2         19         (63.33)                   15        18         (60)
3         14         (46.66)                   16        13         (43.33)
4         27         (90)                      17        10         (33.33)
5         15         (50)                      18        14         (46.67)
6         11         (36.67)                   19        10         (33.33)
7         11         (36.67)                   20        5          (16.67)
8         14         (46.67)                   21        8          (26.67)
9         5          (16.67)                   22        13         (43.33)
10        11         (36.67)                   23        7          (23.33)
11        6          (20)                      24        8          (26.67)
12        2          (6.67)                    25        18         (60)
13        8          (26.67)                   26        19         (63.33)

Table 2 presents the classification results for the optimal
solution of the proposed algorithm, the HBMO based metaheuristic,
on the financial classification problem. The results of the
algorithms used for the comparisons are also shown. In the credit
risk assessment problem, which involves 5 classes, the nature
inspired metaheuristic provides considerably better results
compared to the other methods in terms of OCA. HBMO 1-nn
provides the best results on the evaluation criterion, followed by
HBMO k-nn. In terms of the number of features used in the final
model, HBMO 1-nn also provides the best results with an average
(over the 10 folds of the cross-validation analysis) of 10.2 features.
The significance of the proposed algorithm is demonstrated by the
fact that when the classification problem is solved using the 1-nn
classifier without solving the feature selection problem, the
overall classification accuracy is only 55.78%. The selection
frequencies of the features (Table 3) indicate that the solvency
ratio, EBITDA margin, EBIT margin, net profit margin and the quick
ratio are the most important variables. It should be noted that the
combination of all the features of the proposed Honey Bees Mating
Optimization algorithm enabled it to perform better than the other
algorithms used in the comparisons.
6. Conclusions and future research
An important issue in building a good classifier is the selection of
a set of appropriate input feature variables. The Honey Bees Mating
Optimization algorithm has been proposed in this study for solving
this Feature Subset Selection Problem. Three different classifiers
were used for the classification problem, based on the nearest
neighbor classification rule. The performance of the proposed
algorithm was tested using financial data involving credit risk
assessment. The obtained results indicate the high performance of
the proposed algorithm in searching for a reduced set of features
with high accuracy. HBMO 1-nn was found to provide the best
results in terms of accuracy rates using less than half of the
available features. Future research will focus on the use of different
machine learning classifiers (SVM, neural networks, etc.).
References
[1] J. Kennedy, R. Eberhart, Particle swarm optimization, Proceedings of 1995 IEEE
International Conference on Neural Networks 4 (1995) 1942-1948.
[2] D. Dasgupta (Ed.), Artificial Immune Systems and their Application, Springer,
Heidelberg, 1998.
[3] L.N. De Castro, J. Timmis, Artificial Immune Systems: A New Computational
Intelligence Approach, Springer, Heidelberg, 2002.
[4] M. Dorigo, L.M. Gambardella, Ant colony system: a cooperative learning approach
to the traveling salesman problem, IEEE Transactions on Evolutionary Computation 1 (1) (1997) 53-66.
[5] M. Dorigo, T. Stutzle, Ant Colony Optimization, A Bradford Book, The MIT Press,
Cambridge, Massachusetts, London, England, 2004.
[6] A. Baykasoglu, L. Ozbakir, P. Tapkan, Artificial bee colony algorithm and its
application to generalized assignment problem, in: F.T.S. Chan, M.K. Tiwari
(Eds.), Swarm Intelligence, Focus on Ant and Particle Swarm Optimization, I-Tech
Education and Publishing, 2007, pp. 113-144.
[7] D. Karaboga, B. Basturk, A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm, Journal of Global Optimization 39 (2007) 459-471.
[8] D. Karaboga, B. Basturk, On the performance of artificial bee colony (ABC)
algorithm, Applied Soft Computing 8 (2008) 687-697.
[9] X.S. Yang, Engineering optimizations via nature-inspired virtual bee algorithms,
in: J.M. Yang, J.R. Alvarez (Eds.), IWINAC 2005, LNCS 3562, Springer-Verlag, Berlin/
Heidelberg, 2005, pp. 317-323.
[10] D. Teodorovic, M. Dell'Orco, Bee colony optimization: a cooperative learning
approach to complex transportation problems, Advanced OR and AI Methods
in Transportation (2005) 51-60.
[11] H.F. Wedde, M. Farooq, Y. Zhang, BeeHive: an efficient fault-tolerant routing
algorithm inspired by honey bee behavior, in: M. Dorigo (Ed.), Ant Colony
Optimization and Swarm Intelligence, LNCS 3172, Springer, Berlin, 2004, pp.
83-94.
[12] H. Drias, S. Sadeg, S. Yahi, Cooperative bees swarm for solving the maximum
weighted satisfiability problem, IWANN International Work Conference on Artificial and Natural Neural Networks, LNCS 3512 (2005) 318-325.
[13] D.T. Pham, A. Ghanbarzadeh, E. Koc, S. Otri, S. Rahim, M. Zaidi, The bees algorithm: a novel tool for complex optimisation problems, in: IPROMS 2006, Proceedings of the 2nd International Virtual Conference on Intelligent Production Machines
and Systems, Elsevier, Oxford, 2006.
[14] H.A. Abbass, A monogenous MBO approach to satisfiability, in: Proceedings of the
International Conference on Computational Intelligence for Modelling, Control
and Automation, CIMCA'2001, Las Vegas, NV, USA, 2001.
[15] H.A. Abbass, Marriage in honey-bee optimization (MBO): a haplometrosis polygynous swarming approach, in: The Congress on Evolutionary Computation
(CEC2001), Seoul, Korea, May 2001, pp. 207-214.
[16] A. Afshar, O.B. Haddad, M.A. Marino, B.J. Adams, Honey-bee mating optimization
(HBMO) algorithm for optimal reservoir operation, Journal of the Franklin Institute 344 (2007) 452-462.
[17] M. Fathian, B. Amiri, A. Maroosi, Application of honey bee mating optimization
algorithm on clustering, Applied Mathematics and Computation 190 (2007)
1502-1513.
[18] O.B. Haddad, A. Afshar, M.A. Marino, Honey-bees mating optimization (HBMO)
algorithm: a new heuristic approach for water resources optimization, Water
Resources Management 20 (2006) 661-680.
[19] P. Moscato, C. Cotta, A gentle introduction to memetic algorithms, in: F. Glover,
G.A. Kochenberger (Eds.), Handbooks of Metaheuristics, Kluwer Academic Publishers, Dordrecht, 2003, pp. 105-144.
[20] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification and Scene Analysis, second
edition, John Wiley and Sons, New York, 2001.
[21] M. Doumpos, K. Kosmidou, G. Baourakis, C. Zopounidis, Credit risk assessment
using a multicriteria hierarchical discrimination approach: a comparative analysis, European Journal of Operational Research 138 (2002) 392-412.
[22] M. Doumpos, C. Zopounidis, P.M. Pardalos, Multicriteria sorting methodology:
application to financial decision problems, Parallel Algorithms and Applications
15 (1-2) (2000) 113-129.
[23] A. Brabazon, M. O'Neill, Biologically Inspired Algorithms for Financial Modelling,
Springer-Verlag, Berlin, 2006.
[24] M. Marinaki, Y. Marinakis, C. Zopounidis, Application of a genetic algorithm for
the credit risk assessment problem, Foundations of Computing and Decision
Sciences 32 (2) (2007) 139-152.
[25] M. Marinaki, Y. Marinakis, C. Zopounidis, A memetic algorithm for the identification of firms receiving qualified audit reports, Journal of Computational Optimization in Economics and Finance 1 (1) (2008) 59-71.
[26] Y. Marinakis, M. Marinaki, M. Doumpos, N. Matsatsinis, C. Zopounidis, Optimization of nearest neighbor classifiers via metaheuristic algorithms for credit risk
assessment, Journal of Global Optimization 42 (2008) 279-293.
[27] Y. Marinakis, M. Marinaki, C. Zopounidis, Application of ant colony optimization
to credit risk assessment, New Mathematics and Natural Computation 4 (1)
(2008) 107-122.
[28] L. Yu, S. Wang, K.K. Lai, L. Zhou, Bio-Inspired Credit Risk Analysis: Computational
Intelligence with Support Vector Machines, Springer-Verlag, Berlin, 2008.
[29] R. Kohavi, G. John, Wrappers for feature subset selection, Artificial Intelligence 97
(1997) 273-324.
[30] A. Jain, D. Zongker, Feature selection: evaluation, application, and small sample
performance, IEEE Transactions on Pattern Analysis and Machine Intelligence 19
(1997) 153-158.
[31] P.M. Narendra, K. Fukunaga, A branch and bound algorithm for feature subset
selection, IEEE Transactions on Computers 26 (9) (1977) 917-922.
[32] D.W. Aha, R.L. Bankert, A comparative evaluation of sequential feature selection
algorithms, in: D. Fisher, J.-H. Lenx (Eds.), Artificial Intelligence and Statistics,
Springer-Verlag, New York, 1996.
[33] E. Cantu-Paz, S. Newsam, C. Kamath, Feature selection in scientific applications, in:
Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, 2004, pp. 788-793.
[34] P. Pudil, J. Novovicova, J. Kittler, Floating search methods in feature selection,
Pattern Recognition Letters 15 (1994) 1119-1125.
[35] E. Cantu-Paz, Feature subset selection, class separability, and genetic algorithms,
in: Genetic and Evolutionary Computation Conference, 2004, pp. 959-970.
[36] K. Kira, L. Rendell, A practical approach to feature selection, in: Proceedings of the
Ninth International Conference on Machine Learning, Aberdeen, Scotland, 1992,
pp. 249-256.
[37] W. Siedlecki, J. Sklansky, On automatic feature selection, International Journal of
Pattern Recognition and Artificial Intelligence 2 (2) (1988) 197-220.
[38] F.G. Lopez, M.G. Torres, B.M. Batista, J.A.M. Perez, J.M. Moreno-Vega, Solving
feature subset selection problem by a parallel scatter search, European Journal of
Operational Research 169 (2006) 477-489.
[39] A. Al-Ani, Feature subset selection using ant colony optimization, International
Journal of Computational Intelligence 2 (1) (2005) 53-58.
[40] A. Al-Ani, Ant colony optimization for feature subset selection, Transactions on
Engineering Computing and Technology 4 (2005) 35-38.
[41] R.S. Parpinelli, H.S. Lopes, A.A. Freitas, An ant colony algorithm for classification
rule discovery, in: H. Abbas, R. Sarker, C. Newton (Eds.), Data Mining: A Heuristic
Approach, Idea Group Publishing, London, UK, 2002, pp. 191-208.
[42] P.S. Shelokar, V.K. Jayaraman, B.D. Kulkarni, An ant colony classifier system:
application to some process engineering problems, Computers and Chemical
Engineering 28 (2004) 1577-1584.
[43] C. Zhang, H. Hu, Ant colony optimization combining with mutual information for
feature selection in support vector machines, in: S. Zhang, R. Jarvis (Eds.), AI 2005,
LNAI 3809, 2005, pp. 918-921.
[44] W. Siedlecki, J. Sklansky, A note on genetic algorithms for large-scale feature
selection, Pattern Recognition Letters 10 (1989) 335-347.
[45] Y. Rochat, E.D. Taillard, Probabilistic diversification and intensification in local
search for vehicle routing, Journal of Heuristics 1 (1995) 147-167.
[46] M. Doumpos, F. Pasiouras, Developing and testing models for replicating credit
ratings: a multicriteria approach, Computational Economics 25 (2005) 327-341.
[47] W.F. Treacy, M. Carey, Credit risk rating systems at large US banks, Journal of
Banking and Finance 24 (2000) 167-201.