Credit Risk Management With ACO Rev1
Abstract
The introduction of the Basel II Capital Accord has encouraged financial institutions to build internal rating systems assessing the credit risk of their various credit
portfolios. One of the key outputs of an internal rating system is the probability
of default (PD), which reflects the likelihood that a counterparty will default on
his/her financial obligation. Since the PD modeling problem basically boils down
to a discrimination problem (defaulter or not), one may rely on the myriad of classification techniques that have been suggested in the literature. However, since the
credit risk models will be subject to supervisory review and evaluation, they must
be easy to understand and transparent. Hence, techniques such as neural networks
or support vector machines are less suitable due to their black box nature. Building
upon previous research, we will use AntMiner+ to build internal rating systems for
credit risk. AntMiner+ infers a propositional rule set from a given data
set, thereby using the principles of Ant Colony Optimization. Experiments will be
conducted using various types of credit data sets (retail, small- and medium-sized
enterprises (SMEs) and banks). It will be shown that the extracted rule sets are both
powerful in terms of discriminatory power and comprehensible. Furthermore, a
framework will be presented describing how AntMiner+ fits into a global Basel II
credit risk management system.
29 October 2008
Key words: Ant Colony Optimization, Classification, Credit Scoring, Bankruptcy
Prediction, Basel II
Introduction
Over the past decades, financial institutions have seen an ever growing need for
quantitative analysis techniques to optimize and monitor decisions related to
risk and investment management. The gradual adoption of data warehousing
and knowledge discovery in data (KDD) technology is allowing these institutions to analyze ever larger amounts of data, using a range of powerful
techniques from various disciplines such as conventional statistics, machine
learning, neurocomputing, and operations research. This process is only being further accelerated by the recent implementation of several international
financial and accounting standards (such as Basel II, Solvency II, Sarbanes-Oxley and IFRS). For example, by allowing banks to use their internal credit
risk assessment models as input for the minimum regulatory capital calculations, the Basel II framework is providing financial institutions with additional
incentives to refine existing credit scoring models since more accurate predictions require less conservative capital requirements. Hence, there has been a
growing interest throughout the financial world in research on novel data mining techniques and information technologies to support the implementation of
such compliance frameworks.
As a result of a longstanding interest from the research community, a myriad
of techniques have been proposed for many of the aforementioned problems,
in particular for classification problems such as credit scoring and bankruptcy
prediction. However, not all of these approaches have proven readily transferable from the academic domain to financial practice. Many of the representations applied by the suggested algorithms cannot be easily interpreted and
validated by humans. For example, neural networks are considered a black
box technique, since the reasoning behind how the non-linear prediction models reach their conclusions cannot easily be obtained from their structure.
This has not only hindered their acceptance by practitioners, but also fails to
address the increasing need for transparency under various regulatory frameworks. Credit risk analysts are unlikely to accept black box techniques such
as neural networks to make credit decisions, since under the Basel II accord,
they are now required to demonstrate and periodically validate their models,
and present reports to the national regulator for approval. Therefore, recent
research proposed the use of rule-based classification techniques to generate
have been proposed, such as Ant Colony System [? ], rank-based Ant System [? ], Elitist Ant System [? ] and MAX-MIN Ant System [? ]. As the latter is
the one employed in the AntMiner+ classification technique, the main features
of MAX-MIN Ant System are discussed next.
Stützle et al. [? ] advocate that a better exploitation of the best solutions
can be obtained by only adding pheromone to the path of the best ant. To
avoid early search stagnation, which is the situation where all ants take the
same path and thus describe the same solution, possible pheromone values are
limited to the interval $[\tau_{min}, \tau_{max}]$. Finally, initializing the pheromone values
to $\tau_{max}$ entails a higher exploration at the beginning of the algorithm.
ACO has been applied to a wide variety of problems [? ], such as the vehicle
routing problem [? ? ? ], scheduling [? ? ], timetabling [? ], the traveling salesman problem [? ? ? ] and routing in packet-switched networks [? ]. Recently,
ACO has also entered the data mining domain, addressing both the clustering [? ? ] and classification task [? ? ? ], which is the topic of interest in this
paper. The first application of ACO to the classification task is reported by
Parpinelli et al. in [? ] and was named AntMiner. Extensions were put forward
by Liu et al. in AntMiner2 [? ] and AntMiner3 [? ]. Our approach, AntMiner+,
differs from these previous AntMiner versions in several ways, resulting in an
improved performance, as described in [? ]. Next follows a brief discussion of
the principles and workings of AntMiner+.
ACO can be used to induce comprehensible and accurate rule-based classification models from data, as done in the AntMiner+ classification technique [? ].
First of all, an environment needs to be defined in which the ants operate.
When an ant moves through the environment from the Start to the Stop vertex, it
should incrementally construct a solution to the problem at hand, in this case
the classification problem. In order to build a set of classification rules, we define the construction graph in such a way that each ant's path will implicitly
describe a classification rule. For each variable $V_i$ a vertex $v_{i,j}$ is created for
each of its values $Value_{i,j}$. The set of vertices for one variable is defined as
a vertex group. To allow for rules where not all variables are involved, hence
shorter rules, an extra dummy vertex is added to each variable whose value
is undetermined, meaning it can take any of the values available. Although
only categorical variables are allowed, we make a distinction between nominal
variables (no apparent ordering in the values, e.g. sex and purpose of loan) and ordinal
variables (a clear ordering of the values, e.g. amount on savings or checking
account and income). Each nominal variable has one vertex group (including
the mentioned dummy vertex), whereas for each ordinal variable we build two vertex groups to allow for intervals to be chosen by the
ants. The first vertex group corresponds to the lower bound of the interval
and should thus be interpreted as $\langle V_{i+1} \geq Value_{i,k} \rangle$; the second vertex
group determines the upper bound, giving $\langle V_{i+2} \leq Value_{i+1,l} \rangle$ (of course,
the choice of the upper bound is constrained by the lower bound). This allows
for fewer, shorter and actually better rules. To extract a rule set that is
exhaustive, such that all future data points can be classified, the majority
class is not included in the vertex group of the class variable, and will be the
predicted class for the final else clause.
An example AntMiner+ construction graph for a credit scoring data set with
only three variables (purpose of the loan, amount on savings account and credit
history of the applicant) is shown in Fig. 2. The path denoted in bold describes
the rule if Purpose = car and Savings Account ≥ 0€ and Savings Account
≤ 500€ and Credit History = any then class = bad. A formal illustration of
the construction graph is provided in Fig. 3, for a data set with d classes, n
variables, of which the first and last variable are nominal and $V_2$ is ordinal
(hence the two vertex groups). The weight parameters $\alpha$ and $\beta$ determine the
relative importance of the pheromone and heuristic values, as
described by (1).
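The construction graph described above can be illustrated with a minimal Python sketch. It is not the AntMiner+ implementation (which the paper states is written in Java); the names `Variable`, `vertex_groups` and `path_to_rule` are hypothetical, the variable values are taken loosely from the Fig. 2 example, and the sketch assumes that only nominal variables receive a dummy vertex. It does not enforce that the upper bound must lie above the lower bound.

```python
from dataclasses import dataclass

DUMMY = "any"  # dummy vertex: the variable does not appear in the rule


@dataclass(frozen=True)
class Variable:
    name: str
    values: tuple          # ordinal values are listed in increasing order
    ordinal: bool = False


def vertex_groups(variables):
    """Return the ordered vertex groups an ant walks through."""
    groups = []
    for var in variables:
        if var.ordinal:
            # Two groups: one for the lower bound, one for the upper bound.
            groups.append((var.name, ">=", var.values))
            groups.append((var.name, "<=", var.values))
        else:
            # One group, extended with the dummy vertex (assumption).
            groups.append((var.name, "=", var.values + (DUMMY,)))
    return groups


def path_to_rule(groups, path, predicted_class):
    """Translate one chosen vertex per group into a rule string."""
    terms = [
        f"{name} {op} {value}"
        for (name, op, _), value in zip(groups, path)
        if value != DUMMY
    ]
    condition = " and ".join(terms) if terms else "true"
    return f"if {condition} then class = {predicted_class}"


variables = [
    Variable("Purpose", ("car", "education", "business")),
    Variable("Savings Account", (0, 100, 250, 500, 1000, 3000), ordinal=True),
    Variable("Credit History", ("all paid", "none taken", "critical")),
]
groups = vertex_groups(variables)
rule = path_to_rule(groups, ("car", 0, 500, DUMMY), "bad")
print(rule)
```

The ordinal variable contributes two vertex groups, so a path through this three-variable graph visits four groups before reaching the class.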
Now that the environment is defined, we can explain the workings of the technique.
All ants begin in the Start vertex and walk through their environment to the
Stop vertex, gradually constructing a rule. Only the ant that describes the
best rule will update the pheromone of its path, as imposed by the MAX-MIN Ant System approach. Evaporation decreases the pheromone of all
edges, while the pheromone levels are constrained to lie within the given interval $[\tau_{min}, \tau_{max}]$. Then another iteration occurs with ants walking from Start
to Stop. Convergence occurs when all the edges of one path have a pheromone
level $\tau_{max}$ and all other edges have pheromone level $\tau_{min}$. Next, the rule corresponding to the path with $\tau_{max}$ is extracted and added to the rule set. Finally,
training data covered by this rule is removed from the training set. This iterative process is repeated until the stop criterion is met, which is early
stopping. This procedure monitors the accuracy on a separate validation set,
and will stop inducing rules when the validation accuracy starts to decrease.
Next we will have a closer look at the algorithm specifics, such as the edge
probabilities and rule quality measure.
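The pheromone dynamics just described (evaporation on all edges, reinforcement of the best ant's path only, clamping to the allowed interval, and the convergence test) can be sketched as follows. This is an illustrative sketch, not the paper's update rule: the function names, the bounds 0.1 and 1.0, the deposit amount, and the reading of 0.85 as a multiplicative retention factor are all assumptions.

```python
def update_pheromone(tau, best_path_edges, deposit,
                     rho=0.85, tau_min=0.1, tau_max=1.0):
    """One MAX-MIN style update: evaporate, reinforce the best path, clamp."""
    new_tau = {}
    for edge, value in tau.items():
        value = rho * value                  # evaporation on every edge
        if edge in best_path_edges:
            value += deposit                 # only the best ant reinforces
        new_tau[edge] = min(tau_max, max(tau_min, value))
    return new_tau


def has_converged(tau, path_edges, tau_min=0.1, tau_max=1.0, tol=1e-9):
    """True when one path sits at tau_max and every other edge at tau_min."""
    for edge, value in tau.items():
        target = tau_max if edge in path_edges else tau_min
        if abs(value - target) > tol:
            return False
    return True


# Two competing edges, both initialised at tau_max for early exploration.
tau = {("Start", "car"): 1.0, ("Start", "education"): 1.0}
best = {("Start", "car")}
iterations = 0
while not has_converged(tau, best):
    tau = update_pheromone(tau, best, deposit=0.5)
    iterations += 1
print(iterations, tau)
```

In a full run the best path would be re-evaluated in every iteration; here it is held fixed to show how the bounds force the losing edge down to the lower limit instead of to zero.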
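$$P_{ij}(t) = \frac{[\tau_{(v_{i-1,k},v_{i,j})}(t)]^{\alpha} \cdot [\eta_{v_{i,j}}(t)]^{\beta}}{\sum_{l=1}^{p_i} [\tau_{(v_{i-1,k},v_{i,l})}(t)]^{\alpha} \cdot [\eta_{v_{i,l}}(t)]^{\beta}} \qquad (1)$$
$$\eta_{ij} = \frac{|V_i = Value_{i,j} \;\&\; CLASS = class_{ant}|}{|V_i = Value_{i,j}|} \qquad (2)$$
[Equations (3) and (4) are not recoverable from the source; the only surviving term is $Q^{+}_{best}/10$.]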
The edge to choose when an ant arrives at a vertex $v_{i-1,k}$, and thus the term
to add next, depends on the pheromone value of the edge between vertices
$v_{i-1,k}$ and $v_{i,j}$ ($\tau_{(v_{i-1,k},v_{i,j})}$) and the heuristic value of the vertex $v_{i,j}$ ($\eta_{v_{i,j}}$),
normalized over all possible vertices, providing a probability $P_{ij}$ for each of
the possible vertices, according to (1). As the heuristic function is problem-dependent, we have defined the heuristic value $\eta_{ij}$ of vertex $v_{i,j}$, corresponding
to the term $V_i = Value_{i,j}$, as the fraction of training cases that are correctly
covered (described) by this term, as defined by (2). Let us illustrate this definition with a simplified credit scoring data set of five data instances $i_1, i_2, \ldots, i_5$
and three variables: Sex, Term of the loan, and the nominal variable Real Estate,
stating what kind of real estate the applicant owns. Consider the vertex corresponding to Sex = male. As this is a binary classification problem, the only
class in the construction graph is the bad class, giving a heuristic value for
this vertex of:
$$\frac{|Sex = male \;\&\; CLASS = bad|}{|Sex = male|} = 3/4 \qquad (5)$$
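The heuristic value and the edge probability can be reproduced with a small Python sketch. The five instances below are hypothetical: only the class labels and the 3/4 outcome for Sex = male come from the text, while the Sex assignments are chosen so that the result matches. The pheromone and heuristic numbers passed to `edge_probabilities` are likewise made up for illustration.

```python
def heuristic(instances, variable, value, target_class):
    """Fraction of instances covered by `variable = value` that have the target class."""
    covered = [row for row in instances if row[variable] == value]
    if not covered:
        return 0.0
    correct = sum(1 for row in covered if row["Customer"] == target_class)
    return correct / len(covered)


def edge_probabilities(tau, eta, alpha=1.0, beta=1.0):
    """Normalise tau^alpha * eta^beta over all candidate vertices."""
    weights = {v: (tau[v] ** alpha) * (eta[v] ** beta) for v in tau}
    total = sum(weights.values())
    if total == 0:
        # No information at all: fall back to a uniform choice.
        return {v: 1.0 / len(weights) for v in weights}
    return {v: w / total for v, w in weights.items()}


# Hypothetical data set: four male applicants, three of whom are bad payers.
instances = [
    {"Sex": "male", "Customer": "Bad"},
    {"Sex": "male", "Customer": "Bad"},
    {"Sex": "male", "Customer": "Good"},
    {"Sex": "male", "Customer": "Bad"},
    {"Sex": "female", "Customer": "Good"},
]
eta_male = heuristic(instances, "Sex", "male", "Bad")
print(eta_male)

# Illustrative pheromone and heuristic values for two candidate vertices.
probabilities = edge_probabilities(
    tau={"male": 1.0, "female": 0.5},
    eta={"male": 0.75, "female": 0.25},
)
print(probabilities)
```

With equal weights `alpha = beta = 1`, a vertex with both more pheromone and a higher heuristic value dominates the choice, yet the other vertex keeps a non-zero probability.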
$$Q^{+} = \underbrace{\frac{|rule^{c}_{ant}|}{|rule_{ant}|}}_{confidence} + \underbrace{\frac{|rule^{c}_{ant}|}{|Cov = 0|}}_{coverage} \qquad (6)$$
For example, returning to our simple data set (see Table 1), suppose we have the
following two rules:
R1 : if Sex = M and Term ≥ 1 y and Term ≤ 15 y
then customer = Bad
R2 : if Sex = M and Term ≥ 1 y and Term ≤ 1 y and Real Estate = A
then customer = Bad
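The quality values reported in Table 1 for these two rules can be checked with a few lines of Python. The function name and argument names are hypothetical, and the sketch assumes that $|Cov = 0|$ denotes the number of training instances not yet covered by a previously extracted rule (all five instances here).

```python
def rule_quality(n_covered, n_correct, n_uncovered):
    """Q+ = confidence + coverage for a single candidate rule.

    n_covered   -- instances matched by the rule antecedent
    n_correct   -- matched instances that also have the predicted class
    n_uncovered -- instances not yet covered by an earlier rule (assumption)
    """
    confidence = n_correct / n_covered
    coverage = n_correct / n_uncovered
    return confidence + coverage


q_r1 = rule_quality(n_covered=4, n_correct=3, n_uncovered=5)  # 3/4 + 3/5
q_r2 = rule_quality(n_covered=1, n_correct=1, n_uncovered=5)  # 1/1 + 1/5
print(round(q_r1, 2), round(q_r2, 2))
```

R1 scores higher than R2 although R2 is perfectly confident, because the coverage term rewards the more general rule.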
a credit scoring context, and reduces the Validation & Verification process of
the model dramatically (see Section 5.1, further in the text).
AntMiner+ is implemented in the platform-independent, object-oriented Java
programming environment, with usage of the MySQL open source database
server. Example screenshots of the Graphical User Interface (GUI) of AntMiner+
are included in the Appendix.
In this section, we will illustrate how AntMiner+ can be used to build credit
risk systems in three different contexts: retail banking, small and medium sized
enterprises (SMEs), and bank ratings.
As AntMiner+ can only deal with categorical variables, a discretization preprocessing step takes place in which the continuous variables are turned into
discrete variables. This is done automatically with the Weka
workbench [? ] according to the criterion of Fayyad and Irani [? ]. All experiments were
run with 1000 ants and the evaporation rate $\rho$ set at 0.85, as suggested in [? ].
4.1 Retail Banking
In this section, we will illustrate how AntMiner+ can be used to develop application scoring models in a retail banking context. The purpose of application
scoring is to provide a score or classification of a credit applicant given the
application characteristics provided. The data set that we will use is the German credit data set, which is a publicly available application scoring data set
(see www.ics.uci.edu/~mlearn/MLRepository.html) having 1000 observations and 20 application characteristics. Table 2 presents the rules that were
extracted using AntMiner+.
The extracted rule set is concise and easy to understand. Only 5 of the original
20 application characteristics are used for making the discrimination. This
clearly has a beneficial impact on interpretability, but also on operational cost
and efficiency.
4.2 SME Bankruptcy Prediction
Under the IRB approach for corporate credits, the Basel II Capital Accord
allows banks to separately distinguish exposures to SME borrowers (defined
as corporate exposures where the reported sales for the consolidated group of
which the firm is a part is less than 50 million €) from those to large firms.
The SME data set consists of 422 observations: 74 bankrupt and 348 solvent
companies. The default data were collected from 1989-1997, while the other
data were extracted from the period 1996-1997 only. A total of 40
candidate input variables was selected from financial statement data, using,
among others, liquidity, profitability and solvency measures (see [? ] for an extensive
description of this data set).
Table 3 represents the rules that were extracted by AntMiner+. Again, only
5 of the 40 original inputs are used in making the discrimination decision.
Note that the numbers were rounded and one variable was scaled randomly
for confidentiality reasons.
4.3 Rating Prediction
For retail and SME portfolios, one typically has a sufficient number of default
observations in order to make statistical discrimination meaningful. However,
when modeling credit risk for entities such as banks, sovereigns, or insurance
companies, the lack of default observations necessitates the use of an alternative modeling approach. That is why many financial institutions opt for
a mapping to external ratings in this context. In this section, we will study
how AntMiner+ can be used to model credit risk for bank entities. The data
was retrieved from the Bankscope database, which contains financial statements of more than 15,000 banks. For each of these banks the Moody's rating will be used as the basis of the target variable (low/speculative-grade or
good/investment-grade rating). These ratings were retrieved for the period
1998-2003. The rating at the end of May of the year $T+1$ is predicted based
on a 3-year history of inputs observed during years $T$, $T-1$, $T-2$. A variety
of different inputs was selected covering, amongst others, asset quality, capital, operational result and liquidity. The size variable Total Assets was also
included as well as a geographical indicator Region (Euro-zone, dollar-zone,
EU accession countries, Japan and others). After data preprocessing, the data
set consisted of a cleaned database of 2996 observations with 37 inputs (see
[? ] for a more extensive description).
4.4 Classification Model Performance
Table 5 shows the results of the classification models induced by AntMiner+,
C4.5, support vector machine (SVM) and majority vote. The experimental
setup is the same for all included data sets. The data set is split up into
training, validation and test set according to the following fractions: 4/9, 2/9 and 3/9,
A first set of tools can be used to verify and validate (V&V) the extracted rule
set. Verification will attempt to look for syntax-based anomalies in the rule set.
Whether the rule set is exhaustive (all cases being covered) and exclusive (a
case only covered by one rule) will be investigated in this step. Because of the if-then-else nature of the AntMiner+ rule sets, they are by definition exhaustive
and exclusive, making the verification step obsolete. In the validation step,
it will be investigated whether the rules adequately model the risk involved
from a human interpretation viewpoint. The financial credit expert will also
be consulted and asked to interpret the rule set in this step.
In order to facilitate the verification and validation step, decision tables may
be adopted [? ]. Decision tables provide an alternative way of representing the
AntMiner+ rule sets in a user-friendly way. A decision table (DT) consists
of four quadrants, separated by double-lines, both horizontally and vertically
(cf. Fig. 5). The vertical line divides the table into a condition part (left),
specifying the inputs to be checked, and an action part (right) specifying the
classes assigned.
Each condition entry describes a relevant subset of values (called a state)
for a given input, or contains a dash symbol ('-') if its value is irrelevant
within the context of that column. Subsequently, every action entry holds a
value assigned to the outcome class. True, false and unknown action values
are typically abbreviated by '×', '-', and '.', respectively. Every row in the
entry part of the DT thus comprises a classification rule, indicating what class
results from a certain combination of inputs. If each row only contains simple
states (no contracted or irrelevant entries), the table is called an expanded
DT, whereas otherwise the table is called a contracted DT. Table contraction
can be achieved by combining rows that lead to the same outcome class. The
number of rows in the contracted table can then be further minimised by
changing the order of the conditions. It is obvious that a DT with a minimal
number of rows is to be preferred since it provides a more parsimonious and
comprehensible representation of the extracted rule set than an expanded DT.
This is illustrated in Fig. 6.
In the literature, several kinds of DTs have been proposed. We will require
that the condition entry part of a DT satisfies the following two criteria:
- completeness: all possible combinations of input values are included;
- exclusivity: no combination is covered by more than one column.
As such, we deliberately restrict ourselves to single-hit tables, wherein columns
have to be mutually exclusive, because of their advantages with respect to verification and validation [? ]. It is this type of DT that can be easily checked
for potential anomalies, such as inconsistencies (a particular counterparty being assigned to more than one class) or incompleteness (no class assigned).
The decision table formalism thus allows for easy verification of the extracted
AntMiner+ rules. Additionally, for ease of legibility, the rows are arranged
in lexicographical order, in which entries at lower rows alternate first. As a
result, a tree structure emerges in the condition entry part of the DT, which
lends itself very well to a top-down evaluation procedure: starting at the first
column, and then working one's way to the right of the table by choosing
from the relevant condition states, one safely arrives at the outcome class for
a given case. This condition-oriented inspection approach often proves to be
more intuitive, faster, and less prone to human error, than evaluating a set of
rules one by one.
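The completeness and exclusivity criteria for a single-hit decision table can be checked mechanically by enumerating all input combinations. The sketch below is one possible way to do so; the data layout (a dictionary of condition states per column, with an omitted condition standing for an irrelevant '-' entry) and the function name `check_single_hit` are assumptions, not part of any decision table tool referred to in the text.

```python
from itertools import product


def check_single_hit(domains, columns):
    """Return (uncovered, multiply_covered) input combinations.

    domains -- dict: condition name -> list of possible states
    columns -- list of (conditions, outcome); `conditions` maps a condition
               name to the set of states it accepts, and a missing name
               means the condition is irrelevant for that column
    """
    names = list(domains)
    uncovered, multiply_covered = [], []
    for combo in product(*(domains[name] for name in names)):
        case = dict(zip(names, combo))
        hits = [
            outcome
            for conditions, outcome in columns
            if all(case[name] in accepted for name, accepted in conditions.items())
        ]
        if not hits:
            uncovered.append(case)           # violates completeness
        elif len(hits) > 1:
            multiply_covered.append(case)    # violates exclusivity
    return uncovered, multiply_covered


domains = {"Condition1": ["yes", "no"], "Condition2": ["yes", "no"]}
table = [
    ({"Condition1": {"yes"}}, "Class1"),                          # Condition2 irrelevant
    ({"Condition1": {"no"}, "Condition2": {"yes"}}, "Class2"),
    ({"Condition1": {"no"}, "Condition2": {"no"}}, "Class1"),
]
print(check_single_hit(domains, table))        # both lists empty: single-hit table
print(check_single_hit(domains, table[:2]))    # one uncovered combination
```

Exhaustive enumeration grows exponentially with the number of conditions, which is acceptable for the handful of inputs used by the extracted rule sets but would need a smarter check for wide tables.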
Decision tables can also be usefully adopted for validation purposes, as they can
easily be checked for potential anomalies, such as inconsistency with monotonicity constraints: by placing the presumably monotone variable in the last
column, adjacent rows are found with data entries that are equal in all variables except the last one. It can then be easily seen whether or not the class
variable changes in the expected manner. As AntMiner+ has the supplementary benefit of incorporating such monotonicity constraints, as demonstrated
in Section 3.2, the decision table will no longer reveal any counter-intuitive patterns. For example, Table 6 depicts the decision table corresponding to the
rule set extracted for the German credit scoring data set (see Table 2). Based
on this table, we can easily check that credit history can only have a positive
effect on the applicant's assessment, if any.
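The adjacent-row comparison described above can also be automated. The following sketch groups the rows of an expanded decision table by all inputs except the presumably monotone one and reports pairs where the class gets worse as that input improves. The function name, the three-level ordering of the monotone input and the example rows are hypothetical and are not taken from Table 6.

```python
def monotonicity_violations(rows, variable, order, class_rank):
    """Find row pairs where a better `variable` state yields a worse class.

    rows       -- list of (case, outcome); case maps input name -> state
    variable   -- name of the input assumed to be monotone
    order      -- states of `variable`, from worst to best
    class_rank -- outcome -> rank, higher meaning more favourable
    """
    groups = {}
    for case, outcome in rows:
        key = tuple(sorted((k, v) for k, v in case.items() if k != variable))
        groups.setdefault(key, []).append(
            (order.index(case[variable]), class_rank[outcome], case)
        )
    violations = []
    for members in groups.values():
        members.sort(key=lambda member: member[0])
        for (_, rank_low, case_low), (_, rank_high, case_high) in zip(members, members[1:]):
            if rank_high < rank_low:
                violations.append((case_low, case_high))
    return violations


order = ["poor", "fair", "good"]             # hypothetical credit history levels
class_rank = {"bad": 0, "good": 1}
rows = [
    ({"Purpose": "car", "History": "poor"}, "bad"),
    ({"Purpose": "car", "History": "fair"}, "bad"),
    ({"Purpose": "car", "History": "good"}, "good"),
    ({"Purpose": "education", "History": "poor"}, "good"),
    ({"Purpose": "education", "History": "fair"}, "bad"),   # counter-intuitive
]
violations = monotonicity_violations(rows, "History", order, class_rank)
print(violations)
```

An empty result corresponds to a table in which the monotone input can only have a favourable effect, if any.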
We can conclude that this first step of verifying and validating the model
has been eased significantly thanks to the nature of the induced rule sets
(exhaustive and exclusive) and because of the incorporation of monotonicity
constraints. This does not mean, however, that this phase is no longer needed,
as the domain expert still needs to check whether the model is suitable. From
that perspective, decision tables are still a very useful tool.
Once the rule set has been verified and validated, it needs to be implemented
as a decision support system (DSS) which can be used by the credit officers
so as to make the actual credit decision: accept or reject. The DSS can be
implemented using a traffic light indicator approach that gives three possible
outcomes: a green light, an orange light or a red light [? ]. A green light
indicates that the rule set is confident enough to classify a customer as a good
payer and credit should be accepted. An orange light indicates a doubt case
for which human intervention is needed. This can be due to, for example, low
confidence of the rule set, external information obtained from a credit bureau
(e.g. Equifax, Experian), a customer who is borderline rejected by the rule
set but is very profitable on other financial products, and/or a new marketing
campaign in which the financial institution decides to grant credit to some of
the more risky customers. The orange light can allow for model overrides by
the credit expert. A low side override means that a customer rejected by the
rule set is accepted, and a high side override vice versa. A red light indicates
that the rule set is confident enough to classify a customer as a bad payer and
credit should be rejected. Note that this traffic light indicator approach can
also be implemented using four colors (green, yellow, orange, red) or gauges
in a dashboard application. An implementation of a traffic light indicator
approach using four colors could be as follows. Red when the rule set predicts
bad customer and this is confirmed by the credit bureau information; Orange
when the rule set predicts bad customer, but credit bureau says customer is
good risk; Yellow when the rule set predicts bad customer, but confidence is
very low and the credit bureau says customer is good risk; and Green when the
rule set says good customer and the credit bureau says customer is good risk.
Note that the financial institutions can decide for themselves on the number
of colors and their meaning.
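The four-color scheme sketched above maps directly onto a small decision function. The Python below is only an illustration: the function name, the confidence threshold of 0.6 and the handling of the case "rule set predicts good, credit bureau disagrees" (which the text does not specify) are assumptions that an institution would set for itself.

```python
def traffic_light(predicted, confidence, bureau_good, low_confidence=0.6):
    """Map a rule set prediction and credit bureau information to a color.

    predicted      -- "good" or "bad", as predicted by the rule set
    confidence     -- confidence of the rule that fired, between 0 and 1
    bureau_good    -- True if the credit bureau rates the customer a good risk
    low_confidence -- hypothetical threshold for "very low" confidence
    """
    if predicted == "good":
        # Assumption: a disagreeing bureau sends the case to a human expert.
        return "green" if bureau_good else "orange"
    if not bureau_good:
        return "red"                  # bad prediction confirmed by the bureau
    if confidence < low_confidence:
        return "yellow"               # bad prediction, but weak and contradicted
    return "orange"                   # bad prediction contradicted by the bureau


print(traffic_light("bad", 0.90, bureau_good=False))   # red
print(traffic_light("bad", 0.90, bureau_good=True))    # orange
print(traffic_light("bad", 0.55, bureau_good=True))    # yellow
print(traffic_light("good", 0.80, bureau_good=True))   # green
```

Cases that end in orange or yellow are the ones where a low side or high side override by the credit expert would be recorded.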
5.3 Interface to Basel II Calculation Engine
The extracted rule set must also interface with a Basel II calculation engine
which will use the rule outputs to calculate expected loss and the regulatory
capital that a financial institution needs to set aside in order to cover unexpected credit losses. Therefore, in a calibration phase, each rule should be
accompanied by a PD estimate which should be forward looking and based
on five years of historical data.
Once the estimates for the LGD and EAD have been obtained, the expected
loss and the regulatory capital can be calculated. The expected loss (EL) can
be calculated as $EL = PD \cdot LGD \cdot EAD$. It represents the long-run average
credit loss and will be used for debt provisioning. The regulatory safety capital
can then also be calculated based on the formulas provided in the Basel II
Accord. E.g., for retail exposures the formula is as follows:
$$K = LGD \cdot \left( \Phi\left( \sqrt{\frac{1}{1-\rho}} \, \Phi^{-1}(PD) + \sqrt{\frac{\rho}{1-\rho}} \, \Phi^{-1}(0.999) \right) - PD \right) \qquad (7)$$
whereby $\Phi$ ($\Phi^{-1}$) represents the (inverse) cumulative standard normal distribution, and $\rho$ the asset correlation factor, which is fixed in the Accord [? ] (e.g.
0.15 for residential mortgage exposures).
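As a worked illustration of the expected loss and of the retail capital formula (7), the following Python sketch uses only the standard library. The function names and the example figures (a PD of 1%, an LGD of 45%, an EAD of 100,000) are hypothetical; the asset correlation defaults to the 0.15 mentioned for residential mortgage exposures and appears in the code as `rho`.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def expected_loss(pd, lgd, ead):
    """EL = PD * LGD * EAD, the long-run average credit loss."""
    return pd * lgd * ead


def retail_capital(pd, lgd, rho=0.15):
    """Capital requirement K per unit of exposure, following formula (7)."""
    z = (sqrt(1.0 / (1.0 - rho)) * _STD_NORMAL.inv_cdf(pd)
         + sqrt(rho / (1.0 - rho)) * _STD_NORMAL.inv_cdf(0.999))
    return lgd * (_STD_NORMAL.cdf(z) - pd)


pd_estimate, lgd, ead = 0.01, 0.45, 100_000.0   # hypothetical rule-level inputs
el = expected_loss(pd_estimate, lgd, ead)
k = retail_capital(pd_estimate, lgd)
print(f"EL = {el:.2f}, K = {k:.4f}, capital = {k * ead:.2f}")
```

Because K is expressed per unit of exposure, multiplying it by the EAD gives the capital amount for the exposure; `inv_cdf` requires a PD strictly between 0 and 1.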
5.4 Evaluating the Model over Time: Backtesting and Benchmarking
The Basel II Capital Accord requires credit risk systems to be validated at
least annually. The accord distinguishes between backtesting, which is comparing
the predicted outcome by the rule set with the realized outcome, and
benchmarking, which is comparing the predicted outcome of the rule set with
the outcomes of models of other parties in the industry (such as credit bureaus, other financial institutions, or financial regulators). From a backtesting
perspective, the performance of the rule set needs to be monitored. Again,
a traffic light indicator approach can be adopted with three outcomes: green
light, orange light, red light [? ]. The decision which light to switch on can
be determined based on the outcome of a test statistic which monitors the
classification accuracy (e.g. McNemar's test [? ]). A green light indicates that
the rule set performance is stable, e.g. no significant differences at the 5%
level are reported. It means the rule set can continue to be used. An orange
light may indicate e.g. a difference at the 5% level but not at the 1% level of
significance. It indicates a performance difference which requires no immediate action but needs to be closely monitored in the future. A red light then
indicates a significant performance difference at the 1% level. It indicates that
the model is no longer appropriate for the current data which could possibly
be due to a change of the population (often referred to as population drift)
or a new strategy of the financial institution. In other words, the model needs
to be rebuilt, which in our context would mean extracting a new rule set using AntMiner+. From a benchmarking perspective, a similar process can be
conducted, whereby the traffic lights now indicate how much the two parties
agree or disagree on their credit decisions.
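The backtesting traffic light can be sketched with McNemar's test on the two discordant counts of a paired comparison. The sketch below uses the uncorrected statistic $(b - c)^2 / (b + c)$, which follows a chi-square distribution with one degree of freedom under the null hypothesis; the function names and the example counts are hypothetical, and a continuity-corrected variant could be substituted.

```python
from math import erfc, sqrt


def mcnemar_p_value(b, c):
    """p-value of the uncorrected McNemar test for discordant counts b and c."""
    if b + c == 0:
        return 1.0                       # no disagreement at all
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(statistic / 2.0))


def backtest_light(b, c):
    """Green: not significant at 5%; orange: significant at 5% but not at 1%; red: significant at 1%."""
    p_value = mcnemar_p_value(b, c)
    if p_value < 0.01:
        return "red"
    if p_value < 0.05:
        return "orange"
    return "green"


print(backtest_light(10, 10))   # no performance difference
print(backtest_light(20, 8))    # difference at the 5% level only
print(backtest_light(30, 10))   # difference at the 1% level
```

A red outcome would trigger the rebuild described above, that is, extracting a new rule set with AntMiner+.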
Conclusion
The introduction of the recently suggested Basel II Capital Accord has encouraged financial institutions to build efficient and high-performing credit risk
models assessing the creditworthiness of their counterparties. Ideally, these
models should be both powerful, in terms of discriminating defaulters from
non-defaulters, and comprehensible, in terms of explanatory power. In this
paper, we discussed how Ant Colony Optimization can be used to build credit
risk models for Basel II. More specifically, we used the AntMiner+ algorithm,
which is a rule induction technique based on the principles of MAX-MIN
Ant System. AntMiner+ distinguishes itself by the comprehensibility of the
induced models which are in line with existing domain knowledge. We have
also shown how decision tables can be useful to provide even more insight into
the classification model.
Experiments were conducted using three real-life credit risk data sets: one
in retail, one for SMEs, and one for bank ratings. It was illustrated that for
each of these data sets AntMiner+ extracted a powerful and concise rule set.
Furthermore, it was also discussed how the induced rule sets could fit into a
global credit risk management strategy and architecture. An interesting topic
Acknowledgment
We extend our gratitude to the (associate) editor and the anonymous reviewers, as their many constructive and detailed remarks certainly contributed
much to the quality of this paper. Further, we would like to thank the Flemish Research Council (FWO, Grant G.0615.05), and the Microsoft and KBC-Vlekho-K.U.Leuven Research Chairs for financial support to the authors.
References
A. Abraham and V. Ramos. Web usage mining using artificial ant colony clustering. In the Congress on Evolutionary Computation, pages 1384-1391. IEEE Press, 2003.
B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J.A.K. Suykens, and J. Vanthienen. Benchmarking state of the art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6):627-635, 2003.
Basel Committee on Banking Supervision. International convergence of capital measurement and capital standards: a revised framework. Technical report, BIS, June 2006.
C. Blum. Beam-ACO hybridizing ant colony optimization with beam search: An application to open shop scheduling. Computers & Operations Research, 32(6):1565-1591, 2005.
B. Bullnheimer, R.F. Hartl, and C. Strauss. A new rank based version of the ant system: A computational study. Central European Journal for Operations Research and Economics, 7(1):25-38, 1999.
B. Bullnheimer, R.F. Hartl, and C. Strauss. Applying the ant system to the vehicle routing problem. In S. Voss, S. Martello, I.H. Osman, and C. Roucairol, editors, Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, 1999.
G. Di Caro and M. Dorigo. Antnet: Distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, 9:317-365, 1998.
A. Colorni, M. Dorigo, V. Maniezzo, and M. Trubian. Ant system for jobshop scheduling. Journal of Operations Research, Statistics and Computer Science, 34(1):39-53, 1994.
V.S. Desai, J.N. Crook, and G.A. Overstreet Jr. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95(1):24-37, 1996.
T.G. Dietterich. Approximate statistical test for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1923, 1998.
M. Dorigo and L.M. Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1):53-66, April 1997.
M. Dorigo, V. Maniezzo, and A. Colorni. Positive feedback as a search strategy. Technical Report 91016, Dipartimento di Elettronica e Informatica, Politecnico di Milano, IT, 1991.
M. Dorigo, V. Maniezzo, and A. Colorni. Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 26(1):29-41, 1996.
M. Dorigo and T. Stützle. Ant Colony Optimization. MIT Press, Cambridge, MA, 2004.
U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1022-1029, Chambery, France, 1993. Morgan Kaufmann.
L.M. Gambardella and M. Dorigo. Ant-Q: A reinforcement learning approach to the traveling salesman problem. In A. Prieditis and S. Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 252-260, Palo Alto, CA, 1995. Morgan Kaufmann Publishers Inc.
D. Hand. Pattern detection and discovery. In D. Hand, N. Adams, and R. Bolton, editors, Pattern Detection and Discovery, volume 2447 of Lecture Notes in Computer Science, pages 1-12. Springer, 2002.
J. Handl, J. Knowles, and M. Dorigo. Ant-based clustering and topographic mapping. Artificial Life, 12(1):35-61, 2006.
W.E. Henley and D.J. Hand. Construction of a k-nearest neighbour credit-scoring system. IMA Journal of Mathematics Applied In Business and Industry, 8:305-321, 1997.
B. Liu, H.A. Abbass, and B. McKay. Density-based heuristic for rule discovery with ant-miner. In 6th Australasia-Japan Joint Workshop on Intelligent and Evolutionary Systems (AJWIS2002), Canberra, Australia, 2002.
B. Liu, H.A. Abbass, and B. McKay. Classification rule discovery with ant colony optimization. In IAT, pages 83-88. IEEE Computer Society, 2003.
D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen. Ant-based approach to the knowledge fusion problem. In Proceedings of the Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence, Lecture Notes in Computer Science, pages 85-96. Springer, 2006.
20
[Figure 1, panels (a) and (b); path labels: 50% / 50% and 33% / 67%]
Fig. 1. Path selection directed by pheromone: the more pheromone on a path, the
more likely an ant will follow the path. This simple mechanism of indirect
communication is sufficient for the overall ant colony to find short paths from
the nest to the food source.
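The caption of Fig. 1 can be illustrated with a minimal sketch in which the probability of taking a path is simply proportional to the pheromone on it. The function names and the pheromone values are assumptions chosen for illustration (they reproduce the 50%/50% and 33%/67% labels of the figure); a full ACO transition rule typically also weights a problem-specific heuristic, which is left out here.

```python
import random


def path_probabilities(pheromone):
    """Return the probability of choosing each path, proportional to its pheromone."""
    total = sum(pheromone)
    return [level / total for level in pheromone]


def choose_path(pheromone, rng=random):
    """Sample one path index using pheromone levels as selection weights."""
    return rng.choices(range(len(pheromone)), weights=pheromone, k=1)[0]


# Equal pheromone on both paths: each path is equally likely.
equal = path_probabilities([1.0, 1.0])
# One path carries twice as much pheromone: it is chosen about 67% of the time.
biased = path_probabilities([1.0, 2.0])

print([round(p, 2) for p in equal])
print([round(p, 2) for p in biased])
print("sampled path:", choose_path([1.0, 2.0]))
```

Because ants that take the shorter path return sooner and reinforce it more often, repeating this sampling step with updated pheromone values is what shifts the colony towards the short path.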
[Figure 2: construction graph with Start and Stop vertices and the vertex groups
Purpose (car, education, business, any), two Savings Account groups (0€, 100€,
250€, 500€, 1000€, 3000€), Credit History (all paid, none taken, critical, any)
and Class (bad).]
Fig. 2. Example of a path described by an ant for a credit scoring construction graph
defined by AntMiner+. The rule corresponding to the chosen path is: if Purpose =
car and Savings Account ∈ [0€, 500€] then class = bad.
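As a rough illustration of how a path through the construction graph of Fig. 2 maps to a rule, the sketch below turns a set of chosen vertices into a rule string. The dictionary encoding, the function name `path_to_rule`, and the use of a tuple for the interval bounds are assumptions made for this example, not the representation used by AntMiner+ itself.

```python
def path_to_rule(path, target="class"):
    """Turn a construction-graph path into a rule string.

    `path` maps each variable to the value chosen by the ant. The value
    "any" means the variable does not appear in the rule. Interval
    variables are given as a (lower, upper) tuple. This encoding is an
    illustrative assumption.
    """
    terms = []
    for variable, value in path.items():
        if variable == target or value == "any":
            continue  # the class is the consequent; "any" adds no condition
        if isinstance(value, tuple):
            lower, upper = value
            terms.append(f"{variable} in [{lower}, {upper}]")
        else:
            terms.append(f"{variable} = {value}")
    return f"if {' and '.join(terms)} then {target} = {path[target]}"


# The path of Fig. 2: Purpose = car, Savings Account between 0 and 500,
# Credit History left unconstrained, class = bad.
path = {
    "Purpose": "car",
    "Savings Account": (0, 500),
    "Credit History": "any",
    "class": "bad",
}
print(path_to_rule(path))
```

Selecting the "any" vertex for a variable is what keeps the resulting rules short: only variables with a specific value or interval contribute a term.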
Fig. 3. [Construction graph figure: Start vertex, vertex groups V0, V1, V2, V3, ..., Vm
with values v_{i,1}, v_{i,2}, ..., and Stop vertex.]
[Figure 4: flow diagram with the components data, AntMiner+, V&V, Decision Support
System, PD, LGD, EAD, Capital Requirements, and Backtesting & Benchmarking.]
Fig. 4. Credit risk management system with the use of AntMiner+. The induced
rule set is verified and validated, after which it can be used as a decision support
system to make actual credit risk decisions (accept or deny credit), and to calculate
capital requirements. Finally, backtesting and benchmarking validate the credit risk
management system over time.
condition subjects    condition entries
action subjects       action entries
Fig. 5. DT quadrants.
Figs. 6 and 7. [Decision table examples with condition subjects Condition1, Condition2
and Condition3, action subjects Class1 and Class2, and yes/no condition entries.]
Fig. 8. Screenshots of AntMiner+ run on the SME credit risk data set during different stages of execution: from initialization (top) to convergence (bottom).
Table 1
Illustration of Quality Measure Q+

      Sex   Term   Real Estate   Customer
i1                               Bad
i2                               Bad
i3          15                   Good
i4          10                   Bad
i5          15                   Good

             R1     R2
Confidence   3/4    1/1
Coverage     3/5    1/5
Q+           1.35   1.2
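The values in Table 1 are consistent with Q+ being the sum of confidence and coverage (3/4 + 3/5 = 1.35 and 1/1 + 1/5 = 1.2). The sketch below reproduces these numbers under that reading; the function name and argument names are assumptions, and the counts are read off the fractions in the table (for R1: 3 correctly classified among 4 covered instances, 5 instances in total).

```python
from fractions import Fraction


def q_plus(covered_correct, covered, total):
    """Return (confidence, coverage, Q+) for one rule.

    confidence = correctly covered instances / covered instances
    coverage   = correctly covered instances / all instances
    Q+         = confidence + coverage (as inferred from Table 1)
    """
    confidence = Fraction(covered_correct, covered)
    coverage = Fraction(covered_correct, total)
    return confidence, coverage, confidence + coverage


# Counts taken from the fractions listed in Table 1.
for name, counts in {"R1": (3, 4, 5), "R2": (1, 1, 5)}.items():
    confidence, coverage, quality = q_plus(*counts)
    print(name, confidence, coverage, float(quality))
```

Under this measure R1 is preferred over R2: although R2 is perfectly accurate, it covers only one instance, and the coverage term rewards the more general rule.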
Table 2
Example credit scoring rule set
R1: if (Checking Account < 100€ and Duration > 15 m and
Credit History = no credits taken and Savings Account < 500€)
then class = bad
R2: else if (Purpose = new car/repairs/education/others and
Credit History = no credits taken/all credits paid back duly at this bank and
Savings Account < 500€)
then class = bad
R3: else if (Checking Account < 0€ and
Purpose = furniture/domestic appliances/business and
Credit History = no credits taken/all credits paid back duly at this bank and
Savings Account < 250€)
then class = bad
R4: else if (Checking Account < 0€ and Duration > 15 m and
Credit History = critical account and Savings Account < 250€)
then class = bad
R5: else class = good
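The rule set of Table 2 is an ordered list: the rules are tried from R1 to R4 and the first one that fires determines the class, with R5 as the default. A minimal sketch of this evaluation order follows. The dictionary keys (`checking`, `duration`, `purpose`, `history`, `savings`) and the reading of a slash-separated value list as set membership are assumptions made for illustration; amounts are in euros and the duration is in months.

```python
def classify_applicant(a):
    """Apply rules R1-R5 of Table 2 in order; the first rule that fires wins."""
    no_or_paid = {"no credits taken", "all credits paid back duly at this bank"}

    # R1
    if (a["checking"] < 100 and a["duration"] > 15
            and a["history"] == "no credits taken" and a["savings"] < 500):
        return "bad"
    # R2
    if (a["purpose"] in {"new car", "repairs", "education", "others"}
            and a["history"] in no_or_paid and a["savings"] < 500):
        return "bad"
    # R3
    if (a["checking"] < 0
            and a["purpose"] in {"furniture", "domestic appliances", "business"}
            and a["history"] in no_or_paid and a["savings"] < 250):
        return "bad"
    # R4
    if (a["checking"] < 0 and a["duration"] > 15
            and a["history"] == "critical account" and a["savings"] < 250):
        return "bad"
    # R5: default class when no earlier rule applies
    return "good"


# Hypothetical applicant: overdrawn account, long loan, critical credit history.
applicant = {
    "checking": -50,
    "duration": 24,
    "purpose": "business",
    "history": "critical account",
    "savings": 100,
}
print(classify_applicant(applicant))
```

For this hypothetical applicant R1 to R3 do not fire (the credit history does not match), R4 does, so the applicant is classified as bad.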
Table 3
Example SME bankruptcy rule set
R1: if (Capital & Reserves (Tr) < -0.001 and Turnover (% TA) < 0.16 and
Current profit/Current loss (R) < -25000)
then class = default
R2: else if (Turnover (Tr) < -0.001 and Solvency Ratio (%) (Tr) < -20 and
Total Assets (Tr) < 0)
then class = default
R3: else class = non-default
Table 4
Example bank rating rule set
R1: if Region = not EU15 and Loan Loss Res/Gross Loans 3 and
ln(Total Assets) 8.6
then class = low rating
R2: else if Loan Loss Prov/Net Int Rev 10.5 and Return on Avg Equity -3.4
then class = low rating
R3: else if Region = not EU15 and Total capital Ratio 10 and
Net Interest Margin 2.1
then class = low rating
R4: else if Region = EU Next or Others and Loan Loss Prov/Net Int Rev 42
then class = low rating
R5: else if Region = JPY or EU Next or Others and Cost to Income Ratio 80 and
Net Loans/Cust&ST Funding 46
then class = low rating
R6: else if Region = JPY or EU Next or Others and Loan Loss Prov/Net Int Rev 42 and
Net Interest Margin 2.1
then class = low rating
R7: else class = good rating
Table 5
Average out-of-sample performances

            Accuracy (%)                                  Number of Rules
Data set    AntMiner+   C4.5   SVM    Majority Vote       AntMiner+   C4.5
german      71.9        74.2   73.7   66.7                5.7         14.8
SME         86.2        82.7   86.3   83.2                2.6         7.4
banks       84.3        85.6   87.7   61.0                6.4         17
Average     80.8        80.8   82.6   70.3                4.9         13.1
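The Average row of Table 5 appears to be the unweighted mean over the three data sets (german, SME, banks). The short check below recomputes it from the per-data-set values; the variable names are assumptions, and the column order follows the table (four accuracy columns, then two rule-count columns).

```python
# Per-data-set values from Table 5, in column order:
# accuracy of AntMiner+, C4.5, SVM, Majority Vote; number of rules of AntMiner+, C4.5.
columns = ["AntMiner+ acc", "C4.5 acc", "SVM acc", "Majority acc",
           "AntMiner+ rules", "C4.5 rules"]
results = {
    "german": [71.9, 74.2, 73.7, 66.7, 5.7, 14.8],
    "SME":    [86.2, 82.7, 86.3, 83.2, 2.6, 7.4],
    "banks":  [84.3, 85.6, 87.7, 61.0, 6.4, 17],
}

# Unweighted mean of each column, rounded to one decimal as in the table.
averages = [round(sum(values) / len(values), 1)
            for values in zip(*results.values())]

for name, value in zip(columns, averages):
    print(f"{name}: {value}")
```

The recomputed values match the Average row: AntMiner+ and C4.5 reach the same mean accuracy, while AntMiner+ needs on average fewer than half as many rules.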
Table 6
Decision table predicting retail loan defaults
Conditions (with entries):
Duration: ≤ 15m; > 15m
Purpose: car(old)/others; furniture/business; radio/television; car(new)/retraining
Checking Account: < 0€; ≥ 0€ and < 100€ or no checking account; ≥ 100€; ≥ 0€ or no checking account
Savings Account: < 250€ or unknown/no savings; ≥ 250€ and < 500€; ≥ 500€ or unknown/no savings
Credit History
Actions: Bad; Good