
Credit Rating Prediction Using Ant Colony Optimization
David Martens a,b, Tony Van Gestel c,d, Manu De Backer a, Raf Haesen a, Jan Vanthienen a, Bart Baesens e,a

a Department of Decision Sciences & Information Management, K.U.Leuven, Naamsestraat 69, B-3000 Leuven, Belgium
{David.Martens;Manu.DeBacker;Raf.Haesen;Jan.Vanthienen}@econ.kuleuven.be

b Department of Business Administration and Public Management, Hogeschool Gent, Voskenslaan 270, Ghent 9000, Belgium
David.Martens@hogent.be

c Credit Risk Modelling, Group Risk Management, Dexia Group, Square Meeus 1, 1000 Brussel, Belgium
Tony.Vangestel@dexia.com

d Department of Electrical Engineering, ESAT-SCD-SISTA, K.U.Leuven, Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium

e School of Management, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
Bart@soton.ac.uk

Abstract

The introduction of the Basel II Capital Accord has encouraged financial institutions to build internal rating systems assessing the credit risk of their various credit portfolios. One of the key outputs of an internal rating system is the probability of default (PD), which reflects the likelihood that a counterparty will default on his/her financial obligation. Since the PD modeling problem basically boils down to a discrimination problem (defaulter or not), one may rely on the myriad of classification techniques that have been suggested in the literature. However, since the credit risk models will be subject to supervisory review and evaluation, they must be easy to understand and transparent. Hence, techniques such as neural networks or support vector machines are less suitable due to their black box nature. Building upon previous research, we will use AntMiner+ to build internal rating systems for credit risk. AntMiner+ infers a propositional rule set from a given data set using the principles of Ant Colony Optimization. Experiments will be conducted using various types of credit data sets (retail, small- and medium-sized enterprises (SMEs), and banks). It will be shown that the extracted rule sets are powerful in terms of both discriminatory power and comprehensibility. Furthermore, a framework will be presented describing how AntMiner+ fits into a global Basel II credit risk management system.

Key words: Ant Colony Optimization, Classification, Credit Scoring, Bankruptcy Prediction, Basel II

1 Introduction

Over the past decades, financial institutions have seen an ever growing need for quantitative analysis techniques to optimize and monitor decisions related to risk and investment management. The gradual adoption of data warehousing and knowledge discovery in data (KDD) technology is allowing these institutions to analyze ever larger amounts of data, using a range of powerful techniques from various disciplines such as conventional statistics, machine learning, neurocomputing, and operations research. This process is only being further accelerated by the recent implementation of several international financial and accounting standards (such as Basel II, Solvency II, Sarbanes-Oxley and IFRS). For example, by allowing banks to use their internal credit risk assessment models as input for the minimum regulatory capital calculations, the Basel II framework is providing financial institutions with additional incentives to refine existing credit scoring models, since more accurate predictions require less conservative capital requirements. Hence, there has been a growing interest throughout the financial world in research on novel data mining techniques and information technologies to support the implementation of such compliance frameworks.
As a result of a longstanding interest from the research community, a myriad of techniques have been proposed for many of the aforementioned problems, in particular for classification problems such as credit scoring and bankruptcy prediction. However, not all of these approaches have proven readily transferable from the academic domain to financial practice. Many of the representations applied by the suggested algorithms cannot be easily interpreted and validated by humans. For example, neural networks are considered a black box technique, since the reasoning behind how these non-linear prediction models reach their conclusions cannot easily be obtained from their structure. This has not only hindered their acceptance by practitioners, but also fails to address the increasing need for transparency under various regulatory frameworks. Credit risk analysts are unlikely to accept black box techniques such as neural networks to make credit decisions, since under the Basel II Accord they are now required to demonstrate and periodically validate their models, and present reports to the national regulator for approval. Therefore, recent research has proposed the use of rule-based classification techniques to generate powerful, as well as intuitive and transparent, decision models.


One such recently proposed rule-based classification technique is AntMiner+, which uses Ant Colony Optimization (ACO) to infer accurate rules from data. This paper describes how this technique can be used to generate comprehensible credit scoring models, which can then be fitted into a Basel II-compliant decision support system.
The paper is structured as follows. Section 2 discusses the issues related to building credit scoring models within the Basel II regulatory framework. Section 3 provides an overview of the AntMiner+ classification technique, as well as an introduction to the ACO metaheuristic on which it is based. The experimental Section 4 provides AntMiner+ credit scoring models for retail banking, small and medium-sized enterprises (SMEs) and banks. Section 5 describes the further steps needed to obtain a Basel II compliant decision support system, and finally, Section 6 concludes the paper.

2 Credit Scoring and Bankruptcy Prediction within Basel II

The recent introduction of the Basel II Capital Accord encourages financial institutions to calculate the minimum regulatory safety capital needed to ensure that they are able to return depositor funds at all times [? ]. The minimum safety capital is determined as 8% of risk weighted assets, which are in turn quantified taking into account three types of risk: credit risk, operational risk and market risk. In calculating credit risk, banks must use three key risk parameters: probability of default (PD), loss given default (LGD) and exposure at default (EAD). These three parameters are then used as input to a Merton/Vasicek model that calculates the regulatory safety capital [? ].
The PD, LGD and EAD parameters can be obtained in three different ways. The standardized approach for credit risk allows banks to buy risk ratings from external rating agencies, called External Credit Assessment Institutions (ECAIs) in the spirit of the Accord. Examples of well-known ECAIs are Moody's, Standard & Poor's and Fitch. The risk ratings are translated into the risk weights provided in the Accord, which in turn allow one to calculate the risk weighted assets (RWA) and hence the regulatory capital. The foundation internal ratings based (IRB) approach allows banks to build their own PD models and obtain LGD and EAD estimates from the supervisors, whereas the advanced internal ratings based approach allows financial institutions to estimate all three risk parameters themselves. Many financial institutions in Western Europe, Asia and the US are currently taking steps to implement the advanced IRB approach. More than ever, this has triggered the interest in and need to develop credit scoring and bankruptcy prediction models for estimating the PD of a set of obligors.


For retail portfolios, application scoring models will be developed that try to quantify the credit risk of a set of recently acquired customers, given their application characteristics (e.g. age, marital status, credit history, savings amount). Behavioural scoring models will be used to monitor the credit risk of the existing customer base, given their most recent behaviour (e.g. average checking account balance during the previous month, number of credit cards). For small and medium-sized enterprises (SMEs), financial institutions will develop bankruptcy prediction models that quantify the risk of financial failure given a set of accounting ratios and measurements. For both retail and SME types of obligors, one can usually assume that a sufficient number of defaults is present to make statistical discrimination and classification meaningful. However, for certain types of counterparties, such as banks, insurance companies and sovereign entities, the lack of default observations necessitates the use of alternative methods. In this context, financial institutions will often build rating models that mimic a set of externally provided ratings (e.g. by an ECAI) given a set of candidate explanatory variables collected by the institution.
Ideally, the credit scoring, bankruptcy prediction and rating models should be very powerful in terms of discriminatory power, so as to minimize the cost of granting credit to bad customers or the profit lost when good customers are rejected. Since these models now play a pivotal role in the risk management strategy of a bank, they are also subject to supervisory review and validation by financial regulators. Furthermore, in most countries, financial institutions are obliged to explain why credit has been denied to an applicant. Both these trends basically prohibit the use of black box, mathematically complex scoring models, and instead stimulate the use of comprehensible, easy-to-understand models.
Numerous classification techniques have been adopted for credit risk measurement and for financial forecasting in general. These techniques include traditional statistical methods (e.g., discriminant analysis and logistic regression [? ? ]), nonparametric statistical models (e.g., k-nearest neighbor [? ? ], decision trees [? ? ] and rule learners [? ]) and neural networks [? ? ? ]. Often, conflicts may be found when the conclusions of some of these studies are compared. In [? ], a large-scale benchmarking study compares the classification performance of various state-of-the-art classification techniques on eight real-life credit scoring data sets. It concludes that neural networks perform very well in terms of classification accuracy. However, their opacity and black box nature prevent them from being used in a Basel II context. That is why, in this paper, we will use the rule-based classification technique AntMiner+, which provides comprehensible, accurate models that are in line with existing domain knowledge.

3 AntMiner+: Classification based on Ant Colony Optimization

3.1 Ant Colony Optimization

Ant Colony Optimization (ACO) is a metaheuristic inspired by the foraging behavior of real ant colonies [? ]. A biological ant by itself is a simple insect with limited capabilities, guided by straightforward decision rules. However, these simple rules are sufficient for the overall ant colony to find short paths from the nest to the food source. By dropping a chemical substance called pheromone that attracts other ants, an ant indirectly communicates with its fellow ants from the colony. How this indirect communication leads to shortest path finding capabilities is shown in Fig. 1. Suppose two ants start from their nest (left) and look for the shortest path to a food source (right). Initially no pheromone is present on either trail, so there is a 50-50 chance of choosing either of the two possible paths (see Fig. 1(a)). Suppose one ant chooses the lower trail, and the other one the upper trail. The ant that has chosen the lower (shorter) trail will return to the nest faster, resulting in twice as much pheromone on the lower trail as on the upper one, as illustrated in Fig. 1(b). As a result, the probability that the next ant will choose the lower, shorter trail will be twice as high, resulting in more pheromone and thus more ants choosing this trail, until eventually (almost) all ants follow the shorter path. Note that the pheromone on the longer trail will finally disappear through evaporation.
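This reinforcement mechanism can be illustrated in a few lines of code. The following is a minimal sketch (not part of the original paper) that simulates two trails of different lengths: ants pick a trail with probability proportional to its pheromone level, the shorter trail receives its deposit twice as fast, and evaporation decays both levels. All names and constants are illustrative assumptions.

```java
import java.util.Random;

/** Minimal two-trail pheromone simulation (illustrative sketch). */
public class TwoTrailDemo {
    public static void main(String[] args) {
        double rho = 0.05;                 // evaporation rate (assumed value)
        double[] pheromone = {1.0, 1.0};   // trail 0 is short, trail 1 is long
        double[] deposit = {1.0, 0.5};     // shorter trail: deposit arrives twice as fast
        Random rng = new Random(42);

        for (int t = 0; t < 1000; t++) {
            // choose a trail with probability proportional to its pheromone level
            double p0 = pheromone[0] / (pheromone[0] + pheromone[1]);
            int chosen = rng.nextDouble() < p0 ? 0 : 1;
            // evaporation on both trails, reinforcement only on the chosen one
            for (int i = 0; i < 2; i++) pheromone[i] *= (1 - rho);
            pheromone[chosen] += deposit[chosen];
        }
        // after enough iterations almost all pheromone sits on the short trail
        System.out.printf("short=%.2f long=%.2f%n", pheromone[0], pheromone[1]);
    }
}
```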
Ant Colony Optimization employs artificial ants that cooperate in a manner similar to their biological counterparts, in order to find good solutions for discrete optimization problems [? ]. The first ACO algorithm is Ant System [? ? ], where ants iteratively construct solutions and add pheromone to the paths corresponding to these solutions. Path selection is a stochastic procedure based not only on a history-dependent pheromone value, but also on a problem-dependent heuristic value. The pheromone value gives an indication of the number of ants that chose the trail recently, while the heuristic value is a problem-dependent quality measure. When an ant reaches a decision point, it is more likely to choose the trail with the higher pheromone and heuristic values. Once the ant arrives at its destination, the solution corresponding to the ant's followed path is evaluated and the pheromone value of the path is increased accordingly. Additionally, evaporation causes the pheromone level of all trails to diminish gradually. Hence, trails that are not reinforced gradually lose pheromone and will in turn have a lower probability of being chosen by subsequent ants.
The performance of traditional ACO algorithms, however, is rather poor on large problem instances [? ]. To overcome this issue, other ACO algorithms have been proposed, such as Ant Colony System [? ], rank-based Ant System [? ], Elitist Ant System [? ] and MAX-MIN Ant System [? ]. As the latter is the one employed in the AntMiner+ classification technique, the main features of MAX-MIN Ant System are discussed next.

Stützle et al. [? ] advocate that a better exploitation of the best solutions can be obtained by only adding pheromone to the path of the best ant. To avoid early search stagnation, which is the situation where all ants take the same path and thus describe the same solution, possible pheromone values are limited to the interval $[\tau_{min}, \tau_{max}]$. Finally, initializing the pheromone values to $\tau_{max}$ entails a higher exploration at the beginning of the algorithm.
ACO has been applied to a wide variety of problems [? ], such as the vehicle routing problem [? ? ? ], scheduling [? ? ], timetabling [? ], the traveling salesman problem [? ? ? ] and routing in packet-switched networks [? ]. Recently, ACO has also entered the data mining domain, addressing both the clustering [? ? ] and the classification task [? ? ? ], the latter being the topic of interest in this paper. The first application of ACO to the classification task is reported by Parpinelli et al. in [? ] and was named AntMiner. Extensions were put forward by Liu et al. in AntMiner2 [? ] and AntMiner3 [? ]. Our approach, AntMiner+, differs from these previous AntMiner versions in several ways, resulting in improved performance, as described in [? ]. Next follows a brief discussion of the principles and workings of AntMiner+.

3.2 AntMiner+ Algorithm

ACO can be used to induce comprehensible and accurate rule-based classification models from data, as done in the AntMiner+ classification technique [? ].
First of all, an environment needs to be defined in which the ants operate. When an ant moves through the environment from the Start to the Stop vertex, it should incrementally construct a solution to the problem at hand, in this case the classification problem. In order to build a set of classification rules, we define the construction graph in such a way that each ant's path will implicitly describe a classification rule. For each variable $V_i$, a vertex $v_{i,j}$ is created for each of its values $Value_{i,j}$. The set of vertices for one variable is defined as a vertex group. To allow for rules in which not all variables are involved, and hence shorter rules, an extra dummy vertex whose value is undetermined (meaning it can take any of the available values) is added to each variable. Although only categorical variables are allowed, we make a distinction between nominal variables (no apparent ordering in the values, e.g. sex and purpose of loan) and ordinal variables (a clear ordering of the values, e.g. amount on savings or checking account, and income). Each nominal variable has one vertex group (with the inclusion of the mentioned dummy vertex), but for the ordinal variables we build two vertex groups to allow for intervals to be chosen by the ants. The first vertex group corresponds to the lower bound of the interval and should thus be interpreted as $\langle V_i \geq Value_{i,k} \rangle$, while the second vertex group determines the upper bound, giving $\langle V_i \leq Value_{i,l} \rangle$ (of course, the choice of the upper bound is constrained by the lower bound). This allows for fewer, shorter and actually better rules. To extract a rule set that is exhaustive, such that all future data points can be classified, the majority class is not included in the vertex group of the class variable, and will be the predicted class of the final else clause.
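A possible in-memory representation of this construction graph is sketched below. This is an illustrative data structure, not the authors' actual implementation; all class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of an AntMiner+ construction graph. */
class VertexGroup {
    final String variable;                          // e.g. "Purpose" or "Savings Account >="
    final List<String> values = new ArrayList<>();  // one vertex per value

    VertexGroup(String variable, List<String> values, boolean nominal) {
        this.variable = variable;
        this.values.addAll(values);
        if (nominal) this.values.add("any");        // dummy vertex: variable not used in the rule
    }
}

class ConstructionGraph {
    final List<VertexGroup> groups = new ArrayList<>();

    /** Nominal variable: one vertex group plus a dummy 'any' vertex. */
    void addNominal(String var, List<String> values) {
        groups.add(new VertexGroup(var, values, true));
    }

    /** Ordinal variable: two vertex groups, for the lower and upper bound of an interval. */
    void addOrdinal(String var, List<String> orderedValues) {
        groups.add(new VertexGroup(var + " >=", orderedValues, false));
        groups.add(new VertexGroup(var + " <=", orderedValues, false));
    }

    public static void main(String[] args) {
        ConstructionGraph g = new ConstructionGraph();
        g.addNominal("Purpose", List.of("car", "education", "business"));
        g.addOrdinal("Savings Account", List.of("0e", "250e", "500e"));
        g.groups.forEach(vg -> System.out.println(vg.variable + " -> " + vg.values));
    }
}
```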
An example AntMiner+ construction graph for a credit scoring data set with only three variables (purpose of the loan, amount on savings account, and credit history of the applicant) is shown in Fig. 2. The path denoted in bold describes the rule if Purpose = car and Savings Account ≥ 0e and Savings Account ≤ 500e and Credit History = any then class = bad. A formal illustration of the construction graph is provided in Fig. 3, for a data set with d classes and n variables, of which the first and last variables are nominal and V2 is ordinal (hence the two vertex groups). The weight parameters α and β determine the relative importance of the pheromone and heuristic values, as described by (1).
Now that the environment is defined, we can explain the workings of the technique. All ants begin in the Start vertex and walk through their environment to the Stop vertex, gradually constructing a rule. Only the ant that describes the best rule will update the pheromone of its path, as imposed by the MAX-MIN Ant System approach. Evaporation decreases the pheromone of all edges, while the pheromone levels are constrained to lie within the given interval $[\tau_{min}, \tau_{max}]$. Then another iteration occurs, with ants walking from Start to Stop. Convergence occurs when all the edges of one path have pheromone level $\tau_{max}$ and all other edges have pheromone level $\tau_{min}$. Next, the rule corresponding to the path with $\tau_{max}$ is extracted and added to the rule set, and the training data covered by this rule are removed from the training set. This iterative process is repeated until the stop criterion is met, which is early stopping: the accuracy on a separate validation set is monitored, and rule induction stops when the validation accuracy starts to decrease.
Next we will have a closer look at the algorithm specifics, such as the edge
probabilities and rule quality measure.
$$P_{ij}(t) = \frac{[\tau_{(v_{i-1,k},v_{i,j})}(t)]^{\alpha} \cdot [\eta_{i,j}]^{\beta}}{\sum_{l=1}^{p_i} [\tau_{(v_{i-1,k},v_{i,l})}(t)]^{\alpha} \cdot [\eta_{i,l}]^{\beta}} \qquad (1)$$

$$\eta_{i,j} = \frac{|T_{i,j}\ \&\ \mathrm{CLASS} = \mathrm{class}_{ant}|}{|T_{i,j}|} \qquad (2)$$

$$\tau_{(v_{i-1,k},v_{i,j})}(0) = \tau_{max} \qquad (3)$$

$$\tau_{(v_{i-1,k},v_{i,j})}(t+1) = \rho \cdot \tau_{(v_{i-1,k},v_{i,j})}(t) + \frac{Q^{+}_{best}}{10} \qquad (4)$$

The edge to choose when an ant arrives at a vertex $v_{i-1,k}$, and thus the term to add next, depends on the pheromone value $\tau_{(v_{i-1,k},v_{i,j})}$ of the edge between vertices $v_{i-1,k}$ and $v_{i,j}$, and on the heuristic value $\eta_{i,j}$ of the vertex $v_{i,j}$, normalized over all possible vertices, providing a probability $P_{ij}$ for each of the possible vertices, according to (1). As the heuristic function is problem-dependent, we have defined the heuristic value $\eta_{i,j}$ of vertex $v_{i,j}$, corresponding to the term $V_i = Value_{i,j}$, as the fraction of training cases that are correctly covered (described) by this term, as defined by (2). Let us illustrate this definition with a simplified credit scoring data set of five data instances $i_1, i_2, \ldots, i_5$ and three variables: Sex, Term of the loan, and the nominal variable Real Estate, stating what kind of real estate the applicant owns. Consider the vertex corresponding to Sex = Male. As this is a binary classification problem, the only class in the construction graph is the bad class, giving a heuristic value for this vertex of:

$$\eta = \frac{|Sex = male\ \&\ \mathrm{CLASS} = bad|}{|Sex = male|} = 3/4 \qquad (5)$$
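The computation behind (1), (2) and (5) can be made concrete with a short sketch. The Sex column and per-instance classes below are an assumed reconstruction consistent with the 3/4 value above (four males, three of them bad), not the paper's exact data; pheromone and exponent values are illustrative.

```java
/** Heuristic value (Eq. 2) and edge probability (Eq. 1) on an assumed toy data set. */
public class HeuristicDemo {
    public static void main(String[] args) {
        String[] sex   = {"M", "M", "M", "M", "F"};
        String[] clazz = {"Bad", "Bad", "Good", "Bad", "Good"};

        // Eq. (2) for the three vertices of the Sex vertex group: M, F and the dummy 'any'
        double etaMale = eta(sex, clazz, "M");   // 3/4 = 0.75, as in Eq. (5)
        double etaFemale = eta(sex, clazz, "F"); // 0/1 = 0.0
        double etaAny = eta(sex, clazz, null);   // 3/5 = 0.6 (dummy vertex covers everything)

        // Eq. (1) with equal pheromone on all edges and alpha = beta = 1:
        // probabilities are then proportional to the heuristic values alone
        double tau = 1.0, alpha = 1.0, beta = 1.0;
        double num = Math.pow(tau, alpha) * Math.pow(etaMale, beta);
        double den = num
                   + Math.pow(tau, alpha) * Math.pow(etaFemale, beta)
                   + Math.pow(tau, alpha) * Math.pow(etaAny, beta);
        System.out.printf("eta(male)=%.2f  P(male vertex)=%.2f%n", etaMale, num / den);
    }

    /** Fraction of instances covered by the term (value == null means 'any') that are Bad. */
    static double eta(String[] sex, String[] clazz, String value) {
        int covered = 0, bad = 0;
        for (int i = 0; i < sex.length; i++) {
            if (value == null || sex[i].equals(value)) {
                covered++;
                if (clazz[i].equals("Bad")) bad++;
            }
        }
        return (double) bad / covered;
    }
}
```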

The initial pheromone value is by definition $\tau_{max}$, as imposed by MAX-MIN Ant System. The pheromone to add to the path of the best ant should be proportional to the quality of the path, which we define as the sum of the confidence and the coverage of the corresponding rule. Confidence measures the fraction of the remaining (not yet covered by any of the extracted rules) data points covered by a rule that are correctly classified by it. Coverage gives an indication of the overall importance of the specific rule by measuring the number of correctly classified remaining data points over the total number of remaining data points. More formally, the pheromone amount to add to the path of the iteration-best ant is given by the benefit $Q^{+}$ of that path, as indicated by (6), with $rule_{ant}$ the rule antecedent (if part) comprising a conjunction of terms corresponding to the path chosen by the ant, $rule^{c}_{ant}$ the conjunction of $rule_{ant}$ with the class chosen by the ant, and $Cov$ a binary variable expressing whether a data point is already covered by one of the extracted rules ($Cov = 1$) or not ($Cov = 0$). The number of remaining data points can therefore be expressed as $|Cov = 0|$. Taking into account the evaporation factor $\rho$ as well, the update rule for the best ant's path is described by (4), where the division by ten is a scaling factor needed so that both the pheromone and heuristic values lie within the range [0, 1].


$$Q^{+} = \underbrace{\frac{|rule^{c}_{ant}|}{|rule_{ant}|}}_{\text{confidence}} + \underbrace{\frac{|rule^{c}_{ant}|}{|Cov = 0|}}_{\text{coverage}} \qquad (6)$$

For example, returning to our simple data set (see Table 1), suppose we have the following two rules:

R1: if Sex = M and Term ≥ 1 y and Term ≤ 15 y then customer = Bad
R2: if Sex = M and Term ≥ 1 y and Term ≤ 1 y and Real Estate = A then customer = Bad

As shown in Table 1, rule R1 correctly classifies 3 of the 4 data instances described by the rule antecedent, yielding a confidence of 0.75. The coverage of R1 is 0.6, as it correctly describes 3 of the 5 instances in the data set. Similarly, for rule R2 a confidence of 1 and a coverage of 0.2 are obtained. This example shows that although rule R2 is completely accurate, as shown by its confidence of 1, it is not the best rule, since we also take the coverage into account. The coverage term ensures that we avoid overfitting and obtain fewer rules.
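The sketch below reproduces these quality numbers. The instance values are an assumed reconstruction of Table 1 (only the class column and the summary statistics are fully recoverable from the source), so treat the data, not the formula, as hypothetical.

```java
import java.util.function.Predicate;

/** Confidence, coverage and Q+ (Eq. 6) for the two example rules. */
public class QualityDemo {
    record Instance(String sex, int termYears, String realEstate, String customer) {}

    public static void main(String[] args) {
        Instance[] data = {                       // assumed reconstruction of Table 1
            new Instance("M", 1,  "A", "Bad"),
            new Instance("M", 12, "B", "Bad"),
            new Instance("M", 15, "B", "Good"),
            new Instance("M", 10, "A", "Bad"),
            new Instance("F", 15, "A", "Good")
        };
        Predicate<Instance> r1 = x -> x.sex().equals("M") && x.termYears() >= 1 && x.termYears() <= 15;
        Predicate<Instance> r2 = x -> x.sex().equals("M") && x.termYears() == 1 && x.realEstate().equals("A");

        System.out.printf("Q+(R1) = %.2f%n", qPlus(data, r1)); // 0.75 + 0.60 = 1.35
        System.out.printf("Q+(R2) = %.2f%n", qPlus(data, r2)); // 1.00 + 0.20 = 1.20
    }

    static double qPlus(Instance[] data, Predicate<Instance> antecedent) {
        int coveredByAntecedent = 0, correctlyClassified = 0;
        for (Instance x : data) {
            if (antecedent.test(x)) {
                coveredByAntecedent++;
                if (x.customer().equals("Bad")) correctlyClassified++;
            }
        }
        double confidence = (double) correctlyClassified / coveredByAntecedent;
        double coverage = (double) correctlyClassified / data.length; // no points covered yet
        return confidence + coverage;
    }
}
```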
In previous research, a benchmarking study of AntMiner+ against state-of-the-art classification techniques, such as C4.5, RIPPER and support vector machines, showed that AntMiner+ ranks at the absolute top when considering both accuracy and comprehensibility [? ]. However, a reluctance to accept the classification models may still exist, as possibly unexpected inequality signs may arise in the AntMiner+ rule terms. These may be due to spurious correlations in the data, but do not represent the actual risk relationship (simply put, wrong inequality signs, e.g. rules such as: if Income ≥ 10,000e and Savings Account ≥ 100,000e then customer = bad). To counter such inconsistencies with existing domain knowledge, we have extended the AntMiner+ classification technique to incorporate domain knowledge [? ]. The basic principle is as follows: considering our credit scoring example, we can make sure that increasing the amount on the applicant's savings account cannot lead to a customer changing from good to bad, by removing the vertex group corresponding to the lower bound of Savings Account (see Fig. 2): since the ants look only for rules to classify bad customers (only the final else clause will classify a customer as good), the term with Savings Account can then only be of the form Savings Account ≤ X. This allows the domain expert to enforce hard constraints on the inequality signs. Furthermore, a bias may also exist towards certain values, in which case the constraint is preferred rather than mandatory. To deal with such soft constraints, the heuristic values can be adapted; for more details we refer to [? ]. The ability to incorporate domain knowledge is of crucial importance within a credit scoring context, and dramatically reduces the Verification & Validation effort for the model (see Section 5.1).
AntMiner+ is implemented in the platform-independent, object-oriented Java programming environment, using the MySQL open source database server. Example screenshots of the Graphical User Interface (GUI) of AntMiner+ are included in the Appendix.

4 Building Credit Risk Models with AntMiner+

In this section, we will illustrate how AntMiner+ can be used to build credit risk systems in three different contexts: retail banking, small and medium-sized enterprises (SMEs), and bank ratings.

As AntMiner+ can only deal with categorical variables, a discretization preprocessing step takes place in which the continuous variables are turned into discrete variables. This is done automatically with the Weka workbench [? ] according to the criterion of Fayyad and Irani [? ]. All experiments were run with 1000 ants and the evaporation factor $\rho$ set to 0.85, as suggested in [? ].
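Before turning to the individual case studies, the snippet below sketches how such a discretization step can be carried out with the Weka Java API; the file name and setup are illustrative assumptions, not the authors' actual preprocessing code.

```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.attribute.Discretize;

/** Supervised discretization with Weka's MDL-based (Fayyad & Irani) filter. */
public class DiscretizeDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("german-credit.arff"); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);           // class attribute is last

        Discretize filter = new Discretize(); // applies the Fayyad & Irani criterion by default
        filter.setInputFormat(data);
        Instances discretized = Filter.useFilter(data, filter);

        System.out.println(discretized.numAttributes() + " attributes after discretization");
    }
}
```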
4.1 Retail Banking
In this section, we will illustrate how AntMiner+ can be used to develop application scoring models in a retail banking context. The purpose of application scoring is to provide a score or classification for a credit applicant, given his or her application characteristics. The data set that we will use is the German credit data set, a publicly available application scoring data set (see www.ics.uci.edu/mlearn/MLRepository.html) with 1000 observations and 20 application characteristics. Table 2 presents the rules that were extracted using AntMiner+.

The extracted rule set is concise and easy to understand. Only 5 of the original 20 application characteristics are used for making the discrimination. This clearly has a beneficial impact not only on interpretability, but also on operational cost and efficiency.
4.2 SME Bankruptcy Prediction
Under the IRB approach for corporate credits, the Basel II Capital Accord allows banks to separately distinguish exposures to SME borrowers (defined as corporate exposures where the reported sales for the consolidated group of which the firm is a part are less than 50 million euro) from those to large firms.

The SME data set consists of 422 observations: 74 bankrupt and 348 solvent companies. The default data were collected from 1989-1997, while the other data were extracted from the period 1996-1997 only. A total of 40 candidate input variables was selected from financial statement data, including liquidity, profitability and solvency measures (see [? ] for an extensive description of this data set).

Table 3 presents the rules that were extracted by AntMiner+. Again, only 5 of the 40 original inputs are used in making the discrimination decision. Note that the numbers were rounded and one variable was scaled randomly for confidentiality reasons.
4.3 Rating Prediction
For retail and SME portfolios, one typically has a sufficient number of default observations to make statistical discrimination meaningful. However, when modeling credit risk for entities such as banks, sovereigns, or insurance companies, the lack of default observations necessitates the use of an alternative modeling approach. That is why many financial institutions opt for a mapping to external ratings in this context. In this section, we will study how AntMiner+ can be used to model credit risk for bank entities. The data was retrieved from the Bankscope database, which contains financial statements of more than 15,000 banks. For each of these banks, the Moody's rating is used as the basis of the target variable (low/speculative-grade or good/investment-grade rating). These ratings were retrieved for the period 1998-2003. The rating at the end of May of year T+1 is predicted based on a 3-year history of inputs observed during years T, T-1 and T-2. A variety of inputs was selected covering, amongst others, asset quality, capital, operational result and liquidity. The size variable Total Assets was included, as well as a geographical indicator Region (Euro-zone, dollar-zone, EU accession countries, Japan and others). After data preprocessing, the data set consisted of a cleaned database of 2996 observations with 37 inputs (see [? ] for a more extensive description). The rules extracted by AntMiner+ for this data set are shown in Table 4.
4.4 Classification Model Performance
Table 5 shows the results of the classification models induced by AntMiner+, C4.5, a support vector machine (SVM) and a majority-vote baseline. The experimental setup is the same for all included data sets. Each data set is split into training, validation and test sets according to the fractions 4/9, 2/9 and 3/9, as is common practice in data mining [? ? ]. To eliminate any chance of having unusually good or bad training and test sets, 10 runs are conducted in which the order of the observations is first randomized before the training, validation and test sets are chosen. For each randomization, AntMiner+ is run with hard monotonicity constraints, as imposed by the financial expert.

The best average test set performance over the 10 randomizations is underlined and denoted in bold face for each data set. We then use a paired t-test to test the performance differences. Performances that are not significantly different at the 5% level from the top performance with respect to a one-tailed paired t-test are tabulated in bold face. Statistically significant underperformances at the 1% level are emphasized in italics. Performances significantly different at the 5% level but not at the 1% level are reported in normal script. Since the observations of the randomizations are not independent, we remark that this standard t-test is used as a common heuristic to test the performance differences [? ].

As Table 5 shows, the non-linear SVM classifier performs best in terms of accuracy, as can be expected [? ]. However, as mentioned before, the black box nature of such non-linear classifiers makes them less suited for credit scoring, where validation is required. When comparing the rule- and tree-based classifiers AntMiner+ and C4.5, we observe very competitive accuracies, but when the number of rules is considered as well, AntMiner+ comes out as the best performing technique. On top of that, the AntMiner+ rule sets comply with the stated domain constraints, which, as pointed out in [? ], can result in a decrease in accuracy. Yet while a small decrease in accuracy may be acceptable, an inconsistency with domain knowledge is not.

5 Towards a Basel II Credit Risk Management System

Up to now, we have largely focused on extracting a comprehensible set of rules for risk management in a Basel II context. These rules now need to be further analyzed and used in various activities so as to arrive at a full-fledged, integrated Basel II risk decision and management application. In what follows, we will discuss the most important activities, which are summarized in Fig. 4.

5.1 Verification and Validation

A first set of tools can be used to verify and validate (V&V) the extracted rule set. Verification looks for syntax-based anomalies in the rule set: whether the rule set is exhaustive (all cases being covered) and exclusive (each case covered by only one rule) is investigated in this step. Because of the if-then-else nature of the AntMiner+ rule sets, they are by definition exhaustive and exclusive, making the verification step obsolete. In the validation step, it is investigated whether the rules adequately model the risk involved from a human interpretation viewpoint. The financial credit expert will also be consulted and asked to interpret the rule set in this step.
In order to facilitate the verification and validation step, decision tables may be adopted [? ]. Decision tables provide an alternative way of representing the AntMiner+ rule sets in a user-friendly way. A decision table (DT) consists of four quadrants, separated by double lines, both horizontally and vertically (cf. Fig. 5). The vertical line divides the table into a condition part (left), specifying the inputs to be checked, and an action part (right), specifying the classes assigned.

Each condition entry describes a relevant subset of values (called a state) for a given input, or contains a dash symbol ('-') if its value is irrelevant within the context of that column. Subsequently, every action entry holds a value assigned to the outcome class; true, false and unknown action values are abbreviated by dedicated symbols. Every row in the entry part of the DT thus comprises a classification rule, indicating what class results from a certain combination of inputs. If each row only contains simple states (no contracted or irrelevant entries), the table is called an expanded DT; otherwise the table is called a contracted DT. Table contraction can be achieved by combining rows that lead to the same outcome class. The number of rows in the contracted table can then be further minimised by changing the order of the conditions. A DT with a minimal number of rows is to be preferred, since it provides a more parsimonious and comprehensible representation of the extracted rule set than an expanded DT. This is illustrated in Fig. 6.
In the literature, several kinds of DTs have been proposed. We will require that the condition entry part of a DT satisfies the following two criteria:

- completeness: all possible combinations of input values are included;
- exclusivity: no combination is covered by more than one column.

As such, we deliberately restrict ourselves to single-hit tables, wherein columns have to be mutually exclusive, because of their advantages with respect to verification and validation [? ]. It is this type of DT that can easily be checked for potential anomalies, such as inconsistencies (a particular counterparty being assigned to more than one class) or incompleteness (no class assigned). The decision table formalism thus allows for easy verification of the extracted AntMiner+ rules. Additionally, for ease of legibility, the rows are arranged in lexicographical order, in which entries at lower rows alternate first. As a result, a tree structure emerges in the condition entry part of the DT, which lends itself very well to a top-down evaluation procedure: starting at the first column, and then working one's way to the right of the table by choosing from the relevant condition states, one safely arrives at the outcome class for a given case. This condition-oriented inspection approach often proves to be more intuitive, faster, and less prone to human error than evaluating a set of rules one by one.
Decision tables can also be usefully adopted for validation purposes, as they can easily be checked for potential anomalies such as inconsistency with monotonicity constraints: by placing the supposedly monotone variable in the last column, adjacent rows are found with entries that are equal in all variables except the last one. It can then easily be seen whether or not the class variable changes in the expected manner. As AntMiner+ has the supplementary benefit of incorporating such monotonicity constraints, as demonstrated in Section 3.2, the decision table will reveal no counter-intuitive patterns any more. For example, Table 6 depicts the decision table corresponding to the rule set extracted for the German credit scoring data set (see Table 2). Based on this table, we can easily check that credit history can only have a positive effect on the applicant's assessment, if any.
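Such a monotonicity scan over adjacent value combinations can also be automated along the following lines. This is a minimal sketch under strong assumptions: the toy model and ordered bucket encoding below stand in for the real extracted rule set, which the paper represents as a decision table rather than code.

```java
import java.util.function.BiFunction;

/** Illustrative monotonicity check: raising the savings amount bucket must never
 *  turn a 'good' prediction into 'bad'. The rule set below is a simplified stand-in. */
public class MonotonicityCheck {
    public static void main(String[] args) {
        // toy model: (savings bucket, duration bucket) -> class; an assumption, not Table 2
        BiFunction<Integer, Integer, String> model =
            (savings, duration) -> (savings < 2 && duration > 1) ? "bad" : "good";

        for (int duration = 0; duration <= 2; duration++) {
            for (int savings = 0; savings < 3; savings++) {      // ordered savings buckets
                String lower = model.apply(savings, duration);
                String higher = model.apply(savings + 1, duration);
                if (lower.equals("good") && higher.equals("bad"))
                    System.out.println("violation at savings bucket " + savings
                                       + ", duration bucket " + duration);
            }
        }
        System.out.println("check finished");
    }
}
```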
We can conclude that this first step of verifying and validating the model is relieved significantly thanks to the nature of the induced rule sets (exhaustive and exclusive) and the incorporation of monotonicity constraints. This does not mean, however, that this phase is no longer needed, as the domain expert still needs to check whether the model is suitable. From that perspective, decision tables remain a very useful tool.

5.2 Traffic Light Decision Support System

Once the rule set has been verified and validated, it needs to be implemented as a decision support system (DSS) which can be used by the credit officers to make the actual credit decision: accept or reject. The DSS can be implemented using a traffic light indicator approach that gives three possible outcomes: a green, an orange or a red light [? ]. A green light indicates that the rule set is confident enough to classify a customer as a good payer, and credit should be accepted. An orange light indicates a doubt case for which human intervention is needed. This can be due to, for example, low confidence of the rule set, external information obtained from a credit bureau (e.g. Equifax, Experian), a customer who is borderline rejected by the rule set but is very profitable on other financial products, and/or a new marketing campaign in which the financial institution decides to grant credit to some of the more risky customers. The orange light can allow for model overrides by the credit expert: a low side override means that a customer rejected by the rule set is accepted, and a high side override the reverse. A red light indicates that the rule set is confident enough to classify a customer as a bad payer, and credit should be rejected. Note that this traffic light indicator approach can also be implemented using four colors (green, yellow, orange, red) or gauges in a dashboard application. An implementation of a traffic light indicator approach using four colors could be as follows: red when the rule set predicts a bad customer and this is confirmed by the credit bureau information; orange when the rule set predicts a bad customer, but the credit bureau says the customer is a good risk; yellow when the rule set predicts a bad customer, but its confidence is very low and the credit bureau says the customer is a good risk; and green when the rule set says the customer is good and the credit bureau agrees. Financial institutions can decide for themselves on the number of colors and their meaning.
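A minimal sketch of the four-color logic just described could look as follows. The confidence threshold and the handling of the one combination the text leaves unspecified (rule set says good, bureau disagrees) are assumptions.

```java
/** Sketch of the four-color traffic light logic; thresholds are assumptions. */
public class TrafficLight {
    enum Light { GREEN, YELLOW, ORANGE, RED }

    static Light decide(boolean ruleSetSaysBad, double ruleSetConfidence, boolean bureauSaysGoodRisk) {
        if (!ruleSetSaysBad && bureauSaysGoodRisk) return Light.GREEN; // both agree: good
        if (!ruleSetSaysBad) return Light.ORANGE;  // disagreement not specified in the text: human review
        if (!bureauSaysGoodRisk) return Light.RED; // model says bad, confirmed by the bureau
        return ruleSetConfidence < 0.6             // assumed 'very low confidence' cut-off
             ? Light.YELLOW                        // bad, low confidence, bureau disagrees
             : Light.ORANGE;                       // bad, bureau disagrees: human review
    }

    public static void main(String[] args) {
        System.out.println(decide(true, 0.9, false)); // RED
        System.out.println(decide(true, 0.5, true));  // YELLOW
    }
}
```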
5.3 Interface to Basel II Calculation Engine
The extracted rule set must also interface with a Basel II calculation engine, which will use the rule outputs to calculate the expected loss and the regulatory capital that a financial institution needs to set aside in order to cover unexpected credit losses. Therefore, in a calibration phase, each rule should be accompanied by a PD estimate, which should be forward looking and based on five years of historical data.

Once the estimates for the LGD and EAD have been obtained, the expected loss and the regulatory capital can be calculated. The expected loss (EL) can be calculated as $EL = PD \times LGD \times EAD$. It represents the long-run average credit loss and will be used for debt provisioning. The regulatory safety capital can then be calculated based on the formulas provided in the Basel II Accord. E.g., for retail exposures the formulas are as follows:
$$K = LGD \cdot \left( \Phi\!\left( \sqrt{\tfrac{1}{1-\rho}}\, \Phi^{-1}(PD) + \sqrt{\tfrac{\rho}{1-\rho}}\, \Phi^{-1}(0.999) \right) - PD \right) \qquad (7)$$

$$\text{regulatory capital} = K \cdot EAD$$

whereby $\Phi$ ($\Phi^{-1}$) represents the (inverse) cumulative standard normal distribution, and $\rho$ the asset correlation factor, which is fixed in the Accord [? ] (e.g. 0.15 for residential mortgage exposures).
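To make the calculation concrete, the following sketch evaluates the expected loss and (7). It assumes the Apache Commons Math library (not used in the paper) for the normal distribution; all input values are purely illustrative.

```java
import org.apache.commons.math3.distribution.NormalDistribution;

/** Expected loss and retail capital requirement, Eq. (7); inputs are illustrative. */
public class RetailCapital {
    public static void main(String[] args) {
        double pd = 0.02, lgd = 0.45, ead = 100_000.0; // illustrative risk parameters
        double rho = 0.15;                             // asset correlation, residential mortgages

        double el = pd * lgd * ead;                    // expected loss: EL = PD x LGD x EAD

        NormalDistribution n = new NormalDistribution(); // standard normal
        double arg = Math.sqrt(1.0 / (1.0 - rho)) * n.inverseCumulativeProbability(pd)
                   + Math.sqrt(rho / (1.0 - rho)) * n.inverseCumulativeProbability(0.999);
        double k = lgd * (n.cumulativeProbability(arg) - pd); // capital per unit of EAD
        double regulatoryCapital = k * ead;

        System.out.printf("EL = %.2f, K = %.4f, capital = %.2f%n", el, k, regulatoryCapital);
    }
}
```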
5.4 Evaluating the Model over Time: Backtesting and Benchmarking
The Basel II Capital Accord requires credit risk systems to be validated at least annually. The Accord distinguishes between backtesting, which compares the outcome predicted by the rule set with the realized outcome, and benchmarking, which compares the predicted outcome of the rule set with the outcomes of models of other parties in the industry (such as credit bureaus, other financial institutions, or financial regulators). From a backtesting perspective, the performance of the rule set needs to be monitored. Again, a traffic light indicator approach can be adopted, with three outcomes: green light, orange light, red light [? ]. The decision of which light to switch on can be based on the outcome of a test statistic that monitors the classification accuracy (e.g. McNemar's test [? ]). A green light indicates that the rule set performance is stable, e.g. no significant differences at the 5% level are reported; the rule set can continue to be used. An orange light may indicate, e.g., a difference at the 5% level but not at the 1% level of significance: a performance difference which requires no immediate action but needs to be closely monitored in the future. A red light then indicates a significant performance difference at the 1% level, meaning that the model is no longer appropriate for the current data, possibly due to a change of the population (often referred to as population drift) or a new strategy of the financial institution. In other words, the model needs to be rebuilt, which in our context would mean extracting a new rule set using AntMiner+. From a benchmarking perspective, a similar process can be conducted, whereby the traffic lights now indicate how much the two parties agree or disagree on their credit decisions.
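One possible mapping from McNemar's test statistic to the three lights is sketched below. The cut-offs 3.841 and 6.635 are the standard chi-squared critical values at the 5% and 1% significance levels for one degree of freedom; the example counts are illustrative, and the exact monitoring setup is a design choice of the institution.

```java
/** Backtesting traffic light via McNemar's test (illustrative sketch). */
public class BacktestLight {
    /** b, c: the two off-diagonal (discordant) counts of the 2x2 contingency table. */
    static String light(int b, int c) {
        // McNemar's test statistic with continuity correction
        double chi2 = Math.pow(Math.abs(b - c) - 1.0, 2) / (b + c);
        if (chi2 >= 6.635) return "RED";    // significant at the 1% level: rebuild the model
        if (chi2 >= 3.841) return "ORANGE"; // significant at 5% but not 1%: monitor closely
        return "GREEN";                     // no significant difference: keep using the rule set
    }

    public static void main(String[] args) {
        System.out.println(light(42, 24)); // chi2 ~ 4.38 -> ORANGE
    }
}
```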

6 Conclusion

The introduction of the Basel II Capital Accord has encouraged financial institutions to build efficient and high-performing credit risk models assessing the creditworthiness of their counterparties. Ideally, these models should be both powerful, in terms of discriminating defaulters from non-defaulters, and comprehensible, in terms of explanatory power. In this paper, we discussed how Ant Colony Optimization can be used to build credit risk models for Basel II. More specifically, we used the AntMiner+ algorithm, a rule induction technique based on the principles of MAX-MIN Ant System. AntMiner+ distinguishes itself by the comprehensibility of the induced models, which are in line with existing domain knowledge. We have also shown how decision tables can be used to provide even more insight into the classification model.

Experiments were conducted using three real-life credit risk data sets: one in retail, one for SMEs, and one for bank ratings. It was illustrated that for each of these data sets AntMiner+ extracted a powerful and concise rule set. Furthermore, it was discussed how the induced rule sets fit into a global credit risk management strategy and architecture. An interesting topic for further research is to extend the algorithm to handle continuous targets and generate regression rules, which could be useful, e.g., for modeling LGD and EAD.

Acknowledgment

We extend our gratitude to the (associate) editor and the anonymous reviewers, whose many constructive and detailed remarks certainly contributed much to the quality of this paper. Further, we would like to thank the Flemish Research Council (FWO, Grant G.0615.05) and the Microsoft and KBC-Vlekho-K.U.Leuven Research Chairs for their financial support.

References

A. Abraham and V. Ramos. Web usage mining using artificial ant colony clustering. In Congress on Evolutionary Computation, pages 1384-1391. IEEE Press, 2003.

B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J.A.K. Suykens, and J. Vanthienen. Benchmarking state of the art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6):627-635, 2003.

Basel Committee on Banking Supervision. International convergence of capital measurement and capital standards: a revised framework. Technical report, BIS, June 2006.

C. Blum. Beam-ACO - hybridizing ant colony optimization with beam search: An application to open shop scheduling. Computers & Operations Research, 32(6):1565-1591, 2005.

B. Bullnheimer, R.F. Hartl, and C. Strauss. A new rank based version of the ant system: A computational study. Central European Journal for Operations Research and Economics, 7(1):25-38, 1999.

B. Bullnheimer, R.F. Hartl, and C. Strauss. Applying the ant system to the vehicle routing problem. In S. Voss, S. Martello, I.H. Osman, and C. Roucairol, editors, Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, 1999.

G. Di Caro and M. Dorigo. AntNet: Distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, 9:317-365, 1998.

A. Colorni, M. Dorigo, V. Maniezzo, and M. Trubian. Ant system for job-shop scheduling. Journal of Operations Research, Statistics and Computer Science, 34(1):39-53, 1994.

V.S. Desai, J.N. Crook, and G.A. Overstreet Jr. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95(1):24-37, 1996.

T.G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1923, 1998.

M. Dorigo and L.M. Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1):53-66, April 1997.

M. Dorigo, V. Maniezzo, and A. Colorni. Positive feedback as a search strategy. Technical Report 91-016, Dipartimento di Elettronica e Informatica, Politecnico di Milano, IT, 1991.

M. Dorigo, V. Maniezzo, and A. Colorni. Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, 26(1):29-41, 1996.

M. Dorigo and T. Stützle. Ant Colony Optimization. MIT Press, Cambridge, MA, 2004.

U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1022-1029, Chambery, France, 1993. Morgan Kaufmann.

L.M. Gambardella and M. Dorigo. Ant-Q: A reinforcement learning approach to the traveling salesman problem. In A. Prieditis and S. Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 252-260, Palo Alto, CA, 1995. Morgan Kaufmann Publishers Inc.

D. Hand. Pattern detection and discovery. In D. Hand, N. Adams, and R. Bolton, editors, Pattern Detection and Discovery, volume 2447 of Lecture Notes in Computer Science, pages 1-12. Springer, 2002.

J. Handl, J. Knowles, and M. Dorigo. Ant-based clustering and topographic mapping. Artificial Life, 12(1):35-61, 2006.

W.E. Henley and D.J. Hand. Construction of a k-nearest-neighbour credit-scoring system. IMA Journal of Mathematics Applied in Business and Industry, 8:305-321, 1997.

B. Liu, H.A. Abbass, and B. McKay. Density-based heuristic for rule discovery with ant-miner. In 6th Australasia-Japan Joint Workshop on Intelligent and Evolutionary Systems (AJWIS2002), Canberra, Australia, 2002.

B. Liu, H.A. Abbass, and B. McKay. Classification rule discovery with ant colony optimization. In IAT, pages 83-88. IEEE Computer Society, 2003.

D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen. Ant-based approach to the knowledge fusion problem. In Proceedings of the Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence, Lecture Notes in Computer Science, pages 85-96. Springer, 2006.

D. Martens, M. De Backer, R. Haesen, M. Snoeck, J. Vanthienen, and B. Baesens. Classification with ant colony optimization. IEEE Transactions on Evolutionary Computation, 11(5):651-665, 2007.

R. Montemanni, L.M. Gambardella, A.E. Rizzoli, and A. Donati. Ant colony system for a dynamic vehicle routing problem. Journal of Combinatorial Optimization, 10(4):327-343, 2005.

R.S. Parpinelli, H.S. Lopes, and A.A. Freitas. An ant colony based system for data mining: Applications to medical data. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pages 791-797, San Francisco, California, USA, 2001. Morgan Kaufmann.

D. Quintana, C. Luque, and P. Isasi. Evolutionary rule-based system for IPO underpricing prediction. In GECCO '05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, pages 983-989, New York, NY, 2005. ACM Press.

D.J. Sheskin. Handbook of Parametric and Nonparametric Statistical Procedures. Chapman and Hall/CRC, 2000.

K. Socha, J. Knowles, and M. Sampels. A MAX-MIN ant system for the university timetabling problem. In M. Dorigo, G. Di Caro, and M. Sampels, editors, Proceedings of ANTS 2002 - Third International Workshop on Ant Algorithms, volume 2463 of Lecture Notes in Computer Science, pages 1-13, Berlin, Germany, September 2002. Springer-Verlag.

A. Steenackers and M.J. Goovaerts. A credit scoring model for personal loans. Insurance: Mathematics and Economics, 8:31-34, 1989.

T. Stützle and H.H. Hoos. Improving the ant-system: A detailed report on the MAX-MIN ant system. Technical Report AIDA 96-12, FG Intellektik, TU Darmstadt, Germany, 1996.

T. Stützle and H.H. Hoos. MAX-MIN ant system. Future Generation Computer Systems, 16(8):889-914, 2000.

D. Tasche. Traffic lights approach to PD validation. Technical report, 2003.

E. Tsang, P. Yung, and J. Li. EDDIE-Automation, a decision support tool for financial forecasting. Decision Support Systems, 37(4):559-565, September 2004.

T. Van Gestel, B. Baesens, P. Van Dijcke, J. Garcia, J.A.K. Suykens, and J. Vanthienen. A process model to develop an internal rating system: sovereign credit ratings. Decision Support Systems, 42(2):1131-1151, 2006.

T. Van Gestel, B. Baesens, P. Van Dijcke, J.A.K. Suykens, J. Garcia, and T. Alderweireld. Linear and nonlinear credit scoring by combining logistic regression and support vector machines. Journal of Credit Risk, 1(4), 2005.

J. Vanthienen, C. Mues, and A. Aerts. An illustration of verification and validation in the modelling phase of KBS development. Data and Knowledge Engineering, 27(3):337-352, 1998.

J. Vanthienen and G. Wets. From decision tables to expert system shells. Data and Knowledge Engineering, 13(3):265-282, 1994.

A. Wade and S. Salhi. An ant system algorithm for the mixed vehicle routing problem with backhauls. In Metaheuristics: Computer Decision-Making, pages 699-719, Norwell, MA, 2004. Kluwer Academic Publishers.

D. West. Neural network credit scoring models. Computers and Operations Research, 27:1131-1152, 2000.

I.H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2000.

M.B. Yobas, J.N. Crook, and P. Ross. Credit scoring using neural and evolutionary techniques. IMA Journal of Mathematics Applied in Business and Industry, 11:111-125, 2000.

Appendix: Screenshots of AntMiner+ GUI

Several screenshots of the AntMiner+ Graphical User Interface are provided in Figs. 7 and 8.

Fig. 7 shows the initial menu of AntMiner+, allowing the user to choose the number of ants and the evaporation rate $\rho$. The 'minimal fraction uncovered data' input variable can be used as an alternative to the early stopping criterion: no more rules will be extracted once all but x% of the data have been covered by the extracted rule set. Note that all experiments were conducted with the early stopping criterion.

Fig. 8 shows the construction graph for the SME data set during different stages of execution, from initialization (top) to convergence (bottom), with the width of the edges proportional to their pheromone level. In the bottom box of each screenshot, the extracted rules and their accuracy on the training, validation and test sets are displayed.


Fig. 1. Path selection directed by pheromone: the more pheromone on a path, the
more likely an ant will follow the path. This simple mechanism of indirect communication is sufficient for the overall ant colony to find short paths from the nest to
the food source.
Fig. 2. Example of a path described by an ant on a credit scoring construction graph defined by AntMiner+. The rule corresponding to the chosen path is if Purpose = car and Savings Account ∈ [0e, 500e] then class = bad.


Fig. 3. Multiclass construction graph of AntMiner+, with the inclusion of weight parameters.


Fig. 4. Credit risk management system with the use of AntMiner+. The induced
rule set is verified and validated, after which it can be used as a decision support
system to make actual credit risk decisions (accept or deny credit), and to calculate
capital requirements. Finally, backtesting and benchmarking validate the credit risk
management system over time.

Fig. 5. DT quadrants: condition subjects (top left), action subjects (top right), condition entries (bottom left) and action entries (bottom right).


Fig. 6. Minimizing the number of columns of a lexicographically ordered DT [? ]: (a) expanded decision table; (b) contracted decision table.

Fig. 7. Screenshot of AntMiner+ initial menu.


Fig. 8. Screenshots of AntMiner+ run on the SME credit risk data set during different stages of execution: from initialization (top) to convergence (bottom).


Table 1
Illustration of Quality Measure Q+ on the five data instances i1-i5

              R1      R2
Confidence    3/4     1/1
Coverage      3/5     1/5
Q+            1.35    1.2

Table 2
Example credit scoring rule set
R1: if (Checking Account < 100e and Duration > 15 m and
Credit History = no credits taken and Savings Account < 500e)
then class = bad
R2: else if (Purpose = new car/repairs/education/others and
Credit History = no credits taken/all credits paid back duly at this bank and
Savings Account < 500e)
then class = bad
R3: else if (Checking Account < 0e and
Purpose = furniture/domestic appliances/business and
Credit History = no credits taken/all credits paid back duly at this bank and
Savings Account < 250e)
then class = bad
R4: else if (Checking Account < 0e and Duration > 15 m and
Credit History = critical account and Savings Account < 250e)
then class = bad
R5: else class = good

Table 3
Example SME bankruptcy rule set
R1: if (Capital & Reserves (Tr) < -0.001 and Turnover (% TA) < 0.16 and
Current profit/Current loss (R) < -25000)
then class = default
R2: else if (Turnover(Tr) < -0.001 and Solvency Ratio (%)(Tr) < -20 and
Total Assets (Tr) < 0)
then class = default
R3: else class = non-default


Table 4
Example bank rating rule set
R1: if Region = not EU15 and Loan Loss Res/Gross Loans ≥ 3 and
    ln(Total Assets) ≤ 8.6
    then class = low rating
R2: else if Loan Loss Prov/Net Int Rev ≥ 10.5 and Return on Avg Equity ≤ -3.4
    then class = low rating
R3: else if Region = not EU15 and Total Capital Ratio ≤ 10 and
    Net Interest Margin ≤ 2.1
    then class = low rating
R4: else if Region = EU Next or Others and Loan Loss Prov/Net Int Rev ≥ 42
    then class = low rating
R5: else if Region = JPY or EU Next or Others and Cost to Income Ratio ≥ 80 and
    Net Loans/Cust&ST Funding ≥ 46
    then class = low rating
R6: else if Region = JPY or EU Next or Others and Loan Loss Prov/Net Int Rev ≥ 42 and
    Net Interest Margin ≤ 2.1
    then class = low rating
R7: else class = good rating

Table 5
Average out-of-sample performances

                              german    SME    banks   Average
Accuracy      AntMiner+        71.9     86.2    84.3    80.8
              C4.5             74.2     82.7    85.6    80.8
              SVM              73.7     86.3    87.7    82.6
              Majority Vote    66.7     83.2    61.0    70.3
Number of     AntMiner+         5.7      2.6     6.4     4.9
Rules         C4.5             14.8      7.4    17      13.1

Table 6
Decision table predicting retail loan defaults. Condition subjects: Duration (≤ 15 m / > 15 m), Purpose, Checking Account, Savings Account and Credit History; action subjects: the classes Bad and Good.
