
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 42, NO. 4, JULY 2012

Machine Learning in Financial Crisis Prediction: A Survey

Wei-Yang Lin, Ya-Han Hu, and Chih-Fong Tsai

Abstract—For financial institutions, the ability to predict or forecast business failures is crucial, as incorrect decisions can have direct financial consequences. Bankruptcy prediction and credit scoring are the two major research problems in the accounting and finance domain. In the literature, a number of models have been developed to predict whether borrowers are in danger of bankruptcy and whether they should be considered a good or bad credit risk. Since the 1990s, machine-learning techniques, such as neural networks and decision trees, have been studied extensively as tools for bankruptcy prediction and credit score modeling. This paper reviews 130 related journal papers from the period between 1995 and 2010, focusing on the development of state-of-the-art machine-learning techniques, including hybrid and ensemble classifiers. Related studies are compared in terms of classifier design, datasets, baselines, and other experimental factors. This paper presents the current achievements and limitations associated with the development of bankruptcy-prediction and credit-scoring models employing machine learning. We also provide suggestions for future research.

Index Terms—Bankruptcy prediction, credit scoring, ensemble classifiers, hybrid classifiers, machine learning.

Manuscript received February 24, 2011; revised July 18, 2011; accepted September 24, 2011. Date of publication November 3, 2011; date of current version June 13, 2012. This work was supported in part by the National Science Council of Taiwan under Grant NSC 96-2416-H-194-010-MY3. This paper was recommended by Associate Editor B. Chaib-draa.
W.-Y. Lin is with the Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi County 62102, Taiwan (e-mail: wylin@cs.ccu.edu.tw).
Y.-H. Hu is with the Department of Information Management, National Chung Cheng University, Chiayi County 62102, Taiwan (e-mail: yahan.hu@mis.ccu.edu.tw).
C.-F. Tsai is with the Department of Information Management, National Central University, Taoyuan County 32001, Taiwan (e-mail: cftsai@mgt.ncu.edu.tw).
Digital Object Identifier 10.1109/TSMCC.2011.2170420
1094-6977/$26.00 © 2011 IEEE

I. INTRODUCTION

BANKRUPTCY prediction has a significant influence on the decisions made by financial institutions, as erroneous decisions can have dire financial consequences. This has made bankruptcy prediction one of the major problems faced by decision makers in the realm of finance, leading to the development of numerous bankruptcy-prediction models and attracting considerable attention both from academics and those in the business community [71].

Credit scoring, which focuses on credit admission evaluation, is another serious issue for those in financial institutions, who must determine whether loan customers belong to a high-risk group [56].

Bankruptcy prediction and credit scoring are the two most important financial issues that decision makers in financial institutions face, and both can be regarded as binary classification problems that assign new observations to one of two predefined classes (e.g., "good" and "bad" risk classes) [139].

In machine learning, the neural network is a well-known technique that has been widely used for the development of bankruptcy-prediction and credit-scoring models since the 1990s [1], [35], [44], [169]. For example, in [18] 27 related studies were reviewed that concern the evaluation of business risk, of which 12 focused on bankruptcy. In [155], 66 papers in business were reviewed, 13 of which focused on bankruptcy prediction. The review in [156] covered 54 papers in the field of finance, of which 11 addressed the problem of bankruptcy prediction. In [150], 38 of 48 papers in the finance domain focused on bankruptcy prediction.

A number of researchers have reviewed the literature on bankruptcy prediction and credit scoring prior to 2000. In [35], papers were classified in terms of country, industrial sector, and period of data, as well as the financial ratios and models or methods employed. In [137], both statistical and operational techniques used to support credit scoring were surveyed. Similarly, in [44] a wide range of statistical methods was reviewed for the problem of credit scoring.

Recently, in [7] an extensive review of traditional statistical methods for the prediction of bankruptcy was presented, without addressing machine-learning-based methods. In [71], the work performed prior to 2005 was reviewed, in which statistical and machine-learning techniques were applied to address the bankruptcy-prediction problem. In particular, they highlighted the source of datasets, financial ratios, country of origin, time line of study, and the comparative performance of various techniques in terms of prediction accuracy. In contrast, in [33] several related studies were reviewed to assess consumer credit risk using statistical and machine-learning techniques, comparing the predictive accuracy of various classifiers. The most recent survey [151] presented a comprehensive review of hybrid- and ensemble-based techniques used for the prediction of bankruptcy, focusing on the means by which various techniques are combined rather than the results obtained.

Each of the aforementioned review papers looked at these issues from a different perspective. Unfortunately, most of this research is out of date and provides little insight into the techniques that have been developed recently, making it difficult to determine trends for future research. In addition, many previous reviews have focused only on statistical methods and/or neural networks as core techniques. Many bankruptcy-prediction and credit-scoring models that use machine-learning techniques have been investigated recently (cf. Section III),

although most of the reviews provide only a general view of the associated techniques. In other words, no comprehensive comparison of these techniques has been presented. What kinds of techniques have been most widely developed? What techniques are used as a representative baseline for comparison? What evaluation strategies, model training, and datasets are commonly employed?

This paper provides a statistical survey of papers related to the prediction of bankruptcy and credit scoring published between 1995 and 2010. In particular, we examine the techniques that have been used (in addition to neural networks), examine the experiments that have been conducted, and consider directions for future work from the perspective of machine learning. We reviewed a total of 130 journal papers.

This paper is organized as follows. In Section II, we provide a comparison of related studies using various machine-learning techniques, divided into single and soft classification techniques. In Section III, we compare related studies in terms of evaluation methodology, including baselines, datasets, and experimental results. A discussion, conclusions, and suggestions for future research are presented in Section IV.

II. MACHINE-LEARNING TECHNIQUES

A. Pattern Classification

The goal of classification is to allocate an (unknown) instance that is represented by particular features into one correct class from a finite set of classes. The task of learning (or training) involves the computation of a classifier or model by approximating the mapping between input–output examples, which enables the correct labeling of the training set at a particular level of accuracy. After the model is generated or trained, it can be used to classify unknown instances into one of the class labels learned from the training set [37], [103].

For cases of bankruptcy prediction and credit scoring, each data sample in the selected dataset comprises a number of financial ratios and a corresponding binary class label, representing bankruptcy/nonbankruptcy for bankruptcy prediction and good/bad credit for credit scoring, respectively. This enables the training and testing of the classifier as a bankruptcy-prediction or credit-scoring model based on one specific classification technique.

B. Single Classification Techniques

In general, the problem of predicting financial crises can be approached by designating a single classifier. Many studies have been limited to a single classification technique with which to develop bankruptcy-prediction and credit-scoring models. Table I lists related studies on constructing financial crisis prediction models based on single classifiers. More specifically, these single classification techniques are divided into four groups: supervised learning, unsupervised learning, statistics-based learning, and other techniques. In addition, these models are compared with other single baseline models in terms of prediction performance to reach a final conclusion.

In addition to comparative studies of individual classifiers, several studies have focused on whether feature selection influences the performance of specific single classifiers [86], [90], [94], [116], [117], [128], [135], [140], [160]. All of these studies confirmed that feature selection to reduce dimensionality improves the performance of single classifiers beyond what is possible without it. However, it is interesting to note that of the 130 related studies addressed in this paper, only half considered feature selection prior to the training of classifiers. Most studies without feature selection employed the original features provided in the datasets, which are available for public download (cf. Section III-B), or selected features according to the advice of domain experts. Note that a discussion of related financial ratios as important features is beyond the scope of this survey; see [35] and [71] for further details.

In the comparison of single classifiers shown in Table I, neural networks, support vector machines, and genetic algorithms are the three major machine-learning techniques used in the prediction of financial crises. Fig. 1 shows the year-wise distribution of articles using these three techniques. Observe that neural networks are the most widely used technique for the development of prediction models. This corresponds to important survey literature depicting the popularity of these techniques in many domain problems [17], [38], [126], [155], [156], [169].

Fig. 1. Year-wise distribution of articles for neural networks, support vector machines, and genetic algorithms.

C. Soft Classification Techniques

This section comprises two parts. The first part is an overview of two well-known types of soft classification techniques: classifier ensembles and hybrid classifiers. They are widely considered the core techniques showing superiority over many single classification techniques in the prediction of financial crises. Next, the second part provides a comparison of related studies using soft classification techniques.

1) Classifier Ensembles: In pattern recognition and machine learning, the combination of classifiers has recently become an area of active research [43].


TABLE I
COMPARISONS OF SINGLE CLASSIFIERS

The idea of combining classifiers can be expressed using a probabilistic framework

p(t \mid x) = \sum_{k=1}^{K} w_k \, p(t \mid x, k)    (1)

where p(t | x) is the conditional distribution given the input variable x, k = 1, 2, ..., K indexes a set of possible models, and w_k represents the probability of each model. Such combinations are called classifier ensembles or multiple classifiers. They were proposed to improve the classification performance of single classifiers [67]. That is, the combination compensates for errors made by individual classifiers on different parts of the input space. In this manner, the performance of modular classifiers generally exceeds that of even the best single classifiers used in isolation.

Classifier ensembles are based on the divide-and-conquer principle, in which a complex problem is divided into subproblems (i.e., simpler tasks), which are then solved using various classification techniques. The solution is then reassembled from the results of the subtasks.

In the literature, several combination methods have been used to combine multiple classifiers. The simplest method is majority voting. After the outputs of individual classifiers are produced, the class with the highest number of votes is selected as the final classification decision. The other two representative methods are bagging and boosting. In bagging, several classifiers are trained independently using various training sets via the bootstrap method [15]. For example, consider a problem in which we are trying to predict credit risk based on the input data x. We can generate K bootstrap datasets by randomly sampling with replacement from the original training data. Then, we use each bootstrap dataset to train a separate predictive model y_k(x), where k = 1, 2, ..., K. By doing so, each training example may appear repeatedly in any particular bootstrap dataset, or not at all. The final prediction of credit risk is given by

y(x) = \frac{1}{K} \sum_{k=1}^{K} y_k(x).    (2)

This procedure has been extensively analyzed in the machine-learning literature and has proved to be an effective means for combining multiple models.
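As a concrete illustration, the bootstrap-and-average procedure behind (2) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the one-dimensional credit scores, the threshold-stump base learner, and K = 11 are illustrative choices, not taken from any of the surveyed studies.

```python
import random

def bootstrap(data):
    """Draw len(data) samples with replacement (the bootstrap method)."""
    return [random.choice(data) for _ in data]

def train_stump(sample):
    """Toy base learner: pick the threshold t minimizing training error,
    predicting class 1 (good risk) for x >= t and class 0 otherwise."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in sample}):
        err = sum(((x >= t) != bool(y)) for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return lambda x, t=best_t: 1 if x >= t else 0

def bagged_predict(models, x):
    """Eq. (2): average the K models' outputs; rounding the average
    yields a hard {0, 1} decision (equivalent to majority voting)."""
    return round(sum(m(x) for m in models) / len(models))

random.seed(1)
# hypothetical credit data: (score, label), 0 = bad risk, 1 = good risk
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = [train_stump(bootstrap(data)) for _ in range(11)]  # K = 11
print(bagged_predict(models, 2), bagged_predict(models, 8))
```

Although every stump sees a different bootstrap sample, the averaged prediction is stable: individual sampling quirks are voted away, which is precisely the error-compensation effect described above.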


For instance, the theoretical analysis in [37] suggests that the average error of a model can be reduced to 1/K of its original value by averaging K versions of the model.

In boosting, as in bagging, each classifier is trained using a different training set; however, the K classifiers are trained not in a parallel and independent way but sequentially. More specifically, each weak classifier y_k(x), where k = 1, 2, ..., K, is trained using a weighted form of the original training data, i.e., each data point has an associated weighting coefficient. At each iteration of the boosting algorithm, the weighting coefficients are adjusted so that the training samples misclassified by the current weak classifier receive greater weight. In the next iteration, a new weak classifier is trained using the updated weighted dataset. As a result, the previously misclassified data points exert more influence in training subsequent classifiers. After K iterations, we obtain a strong classifier of the following form:

y(x) = \operatorname{sign}\left( \sum_{k=1}^{K} \alpha_k \, y_k(x) \right)    (3)

where α_k represents a weighted measure of the classification accuracy of the weak classifier y_k(x); in other words, α_k assigns greater weight to more accurate weak classifiers [42].

2) Hybrid Classifiers: The concept of hybrid classifiers is based on combining two or more heterogeneous machine-learning techniques. There are three ways to build hybrid classifiers. The first involves cascading different classifiers. This approach is similar to boosting, in which multiple classifiers are combined in a sequential manner. However, cascaded hybrid classifiers are based on combining heterogeneous classifiers, and the number of cascaded classifiers is usually small, such as two. More specifically, first-level classifiers are trained for a specific problem, the output of one classifier becomes the input for the second-level classifier, and so on. To name just a few, one well-known hybrid approach is based on neuro-fuzzy techniques [61], [96], in which a neural network model represents the first-level classifier, whose outputs are used as inputs to create fuzzy rules in the second-level classifier as a fuzzy inference system.

Cascading two classification techniques can be defined as follows. Given a training dataset D that contains m training examples, D is used to train and test the first classifier. Since it is not possible to obtain 100% classification accuracy over the training set, the data D′ correctly classified by the first classifier are collected, where D′ contains n examples (n < m and D′ ⊂ D). Then, D′ is used to train the second classifier. In [141], the second classifier provides better classification results over a given testing set than single classifiers trained on the original dataset D.

The second approach is based on serially combining clustering and classification techniques. In the first component, a clustering algorithm, such as self-organizing maps (SOM) [69], is used to preprocess the given dataset. Generally, this stage has two purposes. The first is to preprocess the input samples to eliminate nonrepresentative training examples from each class. The second is to identify classes of patterns for subsequent supervised classification [60]. Clustering results can be used for either the preclassification of unlabeled samples or the identification of major populations in a given dataset. The clustering results are then used as training examples to design classifiers. This second stage is the same as that used for training or constructing classifiers.

For instance, Hsieh [47] introduces a hybrid method for the credit-scoring problem. He first utilizes K-means clustering to generate clusters with new class labels and to eliminate the unrepresentative samples from each cluster. In principle, K-means clustering aims to partition a dataset {x_1, x_2, ..., x_N} into K subsets so as to minimize the distortion measure

\sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \lVert x_n - \mu_k \rVert^2    (4)

where the binary indicator r_{nk} denotes that the data point x_n is assigned to the kth cluster, and μ_k denotes the mean of the kth cluster. After performing the K-means algorithm, the samples with new class labels are used to train the neural network. The experimental results demonstrate that clustering helps construct a more effective neural network for credit scoring.

The third method of constructing hybrid classifiers is based on integrating two different techniques into a "single" module for classification, in which the integrated techniques complement each other. One well-known example is that training a single classifier, such as the backpropagation neural network, requires a set of parameters to be determined to obtain an optimal classifier. To this end, a number of algorithms, such as genetic algorithms [70], can be used to identify optimal parameter settings with which to train the neural network.

For instance, Hu [49] employs genetic algorithms to determine the connection weights of the single-layer perceptron (SLP). When an input pattern x_i is presented to the proposed SLP in [49], the corresponding actual value is defined as

s_i = \frac{1}{m} \left( S_i^{+} - S_i^{-} \right)    (5)

where m represents the number of training samples, and S_i^{+} and S_i^{-}, respectively, denote the outranking and outranked characters of x_i over all training samples. The experimental results show that the genetic algorithm (GA)-based SLP outperforms the traditional SLP in bankruptcy prediction.

Note that many hybrid-based studies do not include the hybrid classifiers defined in this paper; they instead focus on the performance of feature selection [59] as the first component, with the output, i.e., the selected features, used to train specific individual classifier(s) as the second component.

3) Comparisons of Related Work: Before comparing related work, we provide a list of techniques and their corresponding abbreviations in Table II to make later tables more readable.

Table III lists studies that developed prediction models using classifier ensembles, comparing them with single classifiers in terms of prediction performance. Their results show that ensemble techniques outperform individual techniques. More specifically, a number of studies have attempted to construct different classifier ensembles for comparison, such as [105], [108], and [145].
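The boosting procedure behind (3), described earlier, can likewise be made concrete. The following AdaBoost-style sketch uses a decision stump as the weak learner and a six-point toy dataset; both are illustrative assumptions rather than the setups of the surveyed papers.

```python
import math

def train_stump(data, w):
    """Weak learner: weighted decision stump h(x) = s if x >= t else -s,
    chosen to minimize the weighted training error."""
    best = None
    for t in sorted({x for x, _ in data}):
        for s in (1, -1):
            err = sum(wi for (x, y), wi in zip(data, w)
                      if (s if x >= t else -s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    err, t, s = best
    return (lambda x, t=t, s=s: s if x >= t else -s), err

def adaboost(data, K):
    """Build the strong classifier of (3): y(x) = sign(sum_k alpha_k y_k(x))."""
    n = len(data)
    w = [1.0 / n] * n                              # uniform initial weights
    ensemble = []
    for _ in range(K):
        h, err = train_stump(data, w)
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # accuracy-based weight
        ensemble.append((alpha, h))
        # misclassified points receive larger weights for the next round
        w = [wi * math.exp(-alpha * y * h(x)) for (x, y), wi in zip(data, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# toy data: -1 = bankrupt, +1 = healthy
data = [(1, -1), (2, -1), (3, -1), (6, 1), (7, 1), (8, 1)]
clf = adaboost(data, K=5)
print([clf(x) for x, _ in data])  # → [-1, -1, -1, 1, 1, 1]
```

The reweighting step is the key difference from bagging: each new weak classifier is steered toward the examples its predecessors got wrong, and α_k implements the accuracy-based weighting of (3).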


TABLE II
A LIST OF TECHNIQUES AND THEIR CORRESPONDING ABBREVIATIONS

TABLE III
RELATED WORK USING ENSEMBLE TECHNIQUES

TABLE IV
RELATED WORK USING HYBRID CLASSIFICATION TECHNIQUES

The results in Table III indicate that a great deal of work has gone into adopting neural network ensembles for the prediction of financial crises.

Table IV lists related work associated with three types of hybrid classifiers: cascaded, cluster + single, and integrated-based hybrid classifiers.

Regarding the comparisons shown in Tables III and IV, many recent studies have constructed different classifier ensembles and hybrid classifiers for the prediction of financial crises. More specifically, hybrid classification techniques are more widely considered than ensembles. There are two possible reasons for this. First, it is necessary to train and test a certain number of classifiers for later combination, and no exact solution is available concerning the number of classifiers that should be combined. This results in excessive computational cost. Second, the issue of diversity among the combined classifiers is critical for success. That is, individual classifiers must be as diversified as possible to ensure that classifier ensembles perform better than single classifiers. Consequently, it is necessary to use different training sets to train the individual classifiers and different classification techniques for combination. This increases the complexity of the experiments.

Recently, hybrid classifiers have been constructed more often than classifier ensembles; therefore, it is worthwhile determining how many studies have focused on cascaded, cluster + single, and integrated-based hybrid classifiers. Fig. 2 shows the year-wise distribution of articles for the three types of hybrid classifiers.

Clearly, integrated-based and cascaded hybrid techniques are the two most commonly used. In integrated-based hybrid techniques, two algorithms are usually selected for integration, in which one performs the task of classification and the other is used to tune some parameters of the classifier. The results show that SVM [149] and MLP are the two most commonly used classifiers, and GA is widely used to optimize the parameters for training these two classifiers.
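A minimal sketch of such an integrated hybrid follows, in which a toy genetic algorithm tunes a single parameter of a trivial scoring classifier (here a decision threshold). The fitness function, GA settings, and data are illustrative assumptions, not the GA+SVM or GA+MLP configurations of the surveyed studies.

```python
import random

# hypothetical training data: (score, label), 1 = good risk
DATA = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.85, 1), (0.9, 1)]

def fitness(threshold):
    """Training accuracy of the rule 'good risk iff score >= threshold'."""
    return sum((x >= threshold) == bool(y) for x, y in DATA) / len(DATA)

def evolve(generations=30, pop_size=20, mutation=0.1):
    """Toy GA: elitism, tournament selection, arithmetic crossover,
    and Gaussian mutation over candidate thresholds in [0, 1]."""
    pop = [random.random() for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)]             # elitism: keep the best
        while len(nxt) < pop_size:
            p1 = max(random.sample(pop, 3), key=fitness)  # tournament
            p2 = max(random.sample(pop, 3), key=fitness)
            child = (p1 + p2) / 2                 # arithmetic crossover
            if random.random() < mutation:        # Gaussian mutation
                child += random.gauss(0, 0.05)
            nxt.append(min(max(child, 0.0), 1.0))
        pop = nxt
    return max(pop, key=fitness)

random.seed(0)
best = evolve()
print(round(fitness(best), 2))
```

The same loop applies unchanged when the "parameter" is, e.g., a kernel width or a set of network weights and the fitness is validation accuracy; only `fitness` and the encoding of candidates change.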


TABLE V
BASELINE CLASSIFIERS USED IN THE LITERATURE

Fig. 2. Year-wise distribution of articles for hybrid classifiers.

On the other hand, because there are many possibilities in cascading learning algorithms, it is difficult to identify the most popular ones.

In short, although several recent studies have focused mainly on developing individual classifiers (where MLP and SVM are the two most commonly used), soft classification techniques are currently the trend in the prediction of financial crises. This is because many of these techniques outperform single classification techniques. More specifically, MLP classifier ensembles, and GA+SVM or GA+MLP integrated-based hybrid classifiers, have been widely developed in the literature.

III. MODEL EVALUATION

In addition to comparing related studies on machine-learning techniques, we must also examine experimental setups and evaluation methodology to assess the prediction performance of constructed classifiers. This is because evaluation is critical to fully understanding the performance of the developed models. In the following, we survey a number of important issues, such as representative baseline classifiers for comparison and the datasets used in experiments.

A. Baseline Classifiers

Generally, researchers select specific baseline classifiers as models against which to validate classifiers for the prediction of financial crises. The aim of this section is to verify the existence of generally agreed upon baseline models in the literature. Table V lists 19 baseline classifiers used in various studies: CBR, DT, ensembles, fuzzy sets, GA, hybrid classifiers, k-NN, k-means, LDA/MDA, LR, MARS, MLP, naïve Bayes, probabilistic neural networks (PNN), radial basis function neural networks (RBFN), rough sets [109], SOM, SVM, and wavelet NN. In addition, Table VI shows the year-wise distribution of articles for these baseline classifiers. More specifically, Fig. 3 shows the top five most commonly used baseline models in related studies: MLP, LR, LDA/MDA, DT, and SVM.

These statistics show that MLP is the most widely used baseline classifier for the prediction of financial crises, and LR and LDA/MDA are the two most commonly used statistical methods for developing baseline models. However, since 2005, many studies have adopted SVM as their baseline classifier.

For studies that focus on ensemble or hybrid classifiers, most baselines have been based on some of the single classifiers mentioned earlier. In other words, they report that their soft classification techniques outperform single classification approaches; however, they are not compared with other ensembles and/or hybrid classifiers for further validation. Some exceptions include [58], in which a hybrid classifier based on SOM and MLP is proposed and compared with SOM ensembles. In [166], cascaded hybrid classifiers are developed, using MLP ensembles for comparison. For ensemble classifiers, in [167] DT ensembles are constructed using the bagging method and compared with MLP ensembles.

B. Datasets

In addition to selecting baseline models for comparison, the foremost issue in such experiments is the selection of dataset(s). In the literature, different studies have used different datasets, some of which are publicly downloadable, such as the Australian,1 German,2 and Japanese3 datasets for the prediction of bankruptcy and credit scoring, while others collected data from local companies in specific countries. Table VII lists the datasets used in related studies, including their sizes and numbers of attributes. Note that, except for the public datasets, different studies used different numbers of attributes and data samples over the same datasets.

1 http://archive.ics.uci.edu/ml/datasets/Statlog+(Australian+Credit+Approval)
2 http://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
3 http://archive.ics.uci.edu/ml/datasets/Japanese+Credit+Screening


TABLE VI
YEAR-WISE DISTRIBUTION OF ARTICLES FOR BASELINE CLASSIFIERS

Table VIII shows the year-wise distribution of articles using the selected datasets. Fig. 4 shows the top seven most commonly used datasets in the literature.

Fig. 3. Top five most widely used baseline models.

These results show that the Australian and German datasets are the two most widely selected for the prediction of financial crises. Although they are publicly available, only 47 related studies are based on these datasets. Most related studies have used their own collected datasets for experiments. The advantage of using public datasets is that they enable future researchers to make direct and fair comparisons; however, they lack applicability to a number of specific real-world problems. Conversely, local datasets can be applied directly to practical problems; however, it is very difficult for future researchers to compare the results.

Following the selection of a dataset, it must be divided into training and testing subsets to evaluate the prediction performance of the constructed classifiers. In this survey, we found no generally agreed upon proportion for training and testing sets, even when using the same datasets. Table IX shows the proportions of training and testing sets used in related studies. Note that studies failing to specify this information are not presented in Table IX.

These results show that different studies used different sizes of training and testing data for the design and evaluation of classifiers, even when using the same datasets. This indicates a need for sensitivity analysis with regard to the size of training and testing sets before reaching a final conclusion. This issue has been considered by only a few researchers, such as [135] and [139].

To conduct a reliable experiment, the n-fold cross-validation strategy should be considered. This method eliminates variability in samples, which may influence the performance of prediction models, and minimizes the effect of bias. However, only 53 of the 130 studies used cross-validation strategies in performing the tests. This implies that the conclusions made by a number of the studies are not necessarily reliable.

From the point of view of machine learning, the aim of financial crisis prediction is to develop well-performing classifiers. Because it is difficult to collect large numbers of real cases of financial crisis (i.e., the dataset size is usually small), the design of experiments is critical to obtaining reliable findings.
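The n-fold cross-validation strategy discussed above can be sketched generically as follows; the 1-nearest-neighbor model is only a hypothetical stand-in for whatever classifier is under evaluation, and the well-separated toy data are an illustrative assumption.

```python
import random

def n_fold_cv(data, n, train, predict):
    """Split data into n folds; each fold serves once as the test set
    while the remaining n-1 folds form the training set. Returns the
    accuracy averaged over the n test folds."""
    data = data[:]
    random.shuffle(data)
    folds = [data[i::n] for i in range(n)]
    accs = []
    for i in range(n):
        test = folds[i]
        train_set = [d for j, f in enumerate(folds) if j != i for d in f]
        model = train(train_set)
        correct = sum(predict(model, x) == y for x, y in test)
        accs.append(correct / len(test))
    return sum(accs) / n

# stand-in classifier: 1-nearest neighbor on a single feature
def train_1nn(train_set):
    return train_set

def predict_1nn(model, x):
    return min(model, key=lambda d: abs(d[0] - x))[1]

random.seed(0)
# toy, well-separated data: 0 = bad risk, 1 = good risk
data = [(v / 10, 0) for v in range(10)] + [(v / 10, 1) for v in range(20, 30)]
acc = n_fold_cv(data, n=5, train=train_1nn, predict=predict_1nn)
print(acc)  # → 1.0 (the classes are perfectly separated)
```

Because every sample is used for testing exactly once, the averaged accuracy is far less sensitive to a single lucky or unlucky train/test split than the fixed-proportion splits discussed above.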


TABLE VII
DATASETS USED IN THE LITERATURE

Unfortunately, very few related studies have been concerned with this issue.

C. Prediction Performances

Investigating the prediction performance of various classification techniques provides a deeper understanding of the current state of affairs, enabling future researchers to take the next step. In addition, although datasets are used differently in different studies, it may be worth examining the relationship between selected datasets, prediction accuracy, and whether feature selection is involved in the experiments.

In the literature, a variety of measurements are used to assess the prediction performance of classifiers, such as the accuracy rate and the Type I and II error rates. This study considers only the accuracy rate because performance is examined in nearly all related studies, and this is the most straightforward indicator of performance. In particular, we present only the best performance (i.e., the highest rate of accuracy) for each dataset in each year between 1995 and 2010.

Table X shows the relationship between datasets and their prediction accuracy. Note that A., B., C., G., J., K., S., Tai., and Tur. represent Australian, Benelux, Compustat, German, Japanese, Korean, Spanish, Taiwanese, and Turkish, respectively. In addition, a "Y" or "N" following the prediction accuracy indicates performing or not performing feature selection, respectively.

With regard to Table X, only the U.K. and U.S. datasets showed improvements in performance, from 2005 to 2010 and 1996 to 2010, respectively. It is surprising that the use of the public datasets, i.e., Australian, German, and Japanese, in recent studies involving advanced techniques does not perform better than in previous studies. Similarly, other datasets demonstrate the same situation.

TABLE VIII
YEAR-WISE DISTRIBUTION OF ARTICLES FOR THE CHOSEN DATASETS

Fig. 4. Top seven most widely used datasets.

demonstrate the same situation. This reveals the need for a generally agreed upon evaluation methodology with regard to datasets, the proportions of training and testing sets, and feature selection, in order to fully understand the progress of machine-learning techniques in the prediction of financial crises.

Because many of these datasets are difficult to collect, we present the average performances over the Australian, German, and Japanese datasets for future researchers. On average, current classification techniques provide 89.59%, 85.33%, and 87.93% accuracy over these three datasets, respectively. This implies that future classification techniques should provide higher accuracy rates than the average results using these three datasets.

D. Comparisons of Bankruptcy-Prediction Models

In this section, the bankruptcy-prediction models using single, ensemble, and hybrid machine-learning techniques are compared in terms of their prediction performances. This comparison makes it possible to determine which methods are superior to others. However, since related studies used different numbers of attributes and data samples over the same datasets, it is not feasible to compare these models directly. Nevertheless, many


TABLE IX
PROPORTIONS OF TRAINING AND TESTING SETS USED IN RELATED DATASETS

studies used similar experimental settings over the public datasets, i.e., the Australian, German, and Japanese datasets. Therefore, the prediction performances obtained using different machine-learning techniques over these three datasets are compared, and Table XI shows the comparative results. Note that the best performance for each technique is underlined.

It is interesting that there is no exact winner over these three datasets. That is, the single, ensemble, and hybrid models perform best over the German, Australian, and Japanese datasets, respectively. However, the hybrid model combining clustering analysis with a single classifier [47] provides the highest rate of prediction accuracy. In particular, it combines the k-means clustering algorithm with the multilayer perceptron (MLP) neural network. On the other hand, ensemble classifiers based on bagging of MLP neural networks [105] perform best and second best over the Japanese and Australian datasets, respectively.

Fig. 5 shows the average accuracy of the bankruptcy-prediction models using single, ensemble, and hybrid machine-learning techniques over the Australian, German, and Japanese datasets. On average, hybrid prediction models outperform single and ensemble classifiers. It is surprising that single learning techniques outperform the other two over the German dataset, where the performances of the single and ensemble techniques differ significantly. However, although single techniques perform slightly better than ensemble ones on average, it is hard to conclude that single techniques are superior to ensemble ones, since the ensemble techniques perform better than the single ones over the Australian and Japanese datasets.

To sum up, soft techniques have potential for better bankruptcy prediction. Developing ensemble classifiers is relatively complicated and requires a higher computational cost than developing hybrid classifiers; for example, a number of diversified classifiers need to be trained and combined. However, constructing hybrid prediction models, especially by combining clustering analysis with single classifiers, is relatively easy, and they can provide reasonably good prediction performances.

IV. DISCUSSION

The results that are shown in the previous sections indicate several issues that have not been comprehensively examined in the literature, providing opportunities for future researchers. They can be divided into feature selection, model development, baseline models, and benchmark datasets.

A. Feature Selection

First of all, feature selection is a critical task for data preprocessing, which has considerable potential to improve the prediction performance of financial crises. Feature selection can be defined as the process of choosing a minimum subset of S features from the original dataset of T features (S < T) so that the feature space (i.e., the dimensionality) is optimally reduced.

Because of the lack of generally agreed upon variables (or financial ratios) as the most representative features for predicting bankruptcy and credit scoring, collected variables must first be examined to evaluate their importance and explanatory power in the selected dataset. This step filters out noise features, which reduces dimensionality and enhances the performance of classifiers over that of classifiers without feature selection.

Related studies have shown that performing feature selection can make prediction models provide better results than those without feature selection, such as [86], [90], [94], [116], [117], [128], [135], [140], and [160]. However, since there are many feature selection algorithms in the literature, it is currently unknown which feature selection method performs best over many datasets. In other words, statistical-based methods, such as


TABLE X
RELATIONSHIP BETWEEN DATASETS, PREDICTION ACCURACY, AND FEATURE SELECTION

principal component analysis and stepwise regression, should be compared with intelligent methods, such as genetic algorithms.

B. Model Development

The second issue is the development of prediction models. Because many related studies have illustrated the superiority of machine-learning techniques over statistical methods, one should consider machine-learning-based classification algorithms for the prediction of financial crises to obtain better results. However, combining both types of techniques would be a reasonable solution, as they could complement each other in classifier ensembles or hybrid classifiers rather than compete with each other. This leads to a related issue that involves examining the performance of combining homogeneous and heterogeneous classification techniques.

On the other hand, since the hybrid models combining clustering analysis with single classification techniques perform better than single classifiers and classifier ensembles over the three public datasets (cf., Section III-D), one possible hybrid approach can be based on combining one specific clustering algorithm with classifier ensembles. That is, the second component of the hybrid model is based on classifier ensembles rather than one specific single classifier.

Furthermore, it is worth examining the impact of performing feature selection on the prediction performance of single classifiers, classifier ensembles, and hybrid classifiers. In other words, single classifiers with the best feature selection algorithm may achieve reasonably good performance, or perform comparably with classifier ensembles and hybrid classifiers that follow feature selection.

Since DT/MLP ensembles, GA + SVM/MLP, and SOM + MLP are the most highly developed in the literature (cf., Section II-C3), one future direction is to conduct a comprehensive study comparing these classifier ensembles and hybrid classifiers.

C. Baseline Models

To conduct a reliable experimental study, it is important to determine a baseline(s) with which to validate proposed classifiers.
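Such a baseline comparison can be sketched in a few lines of pure Python. The two classifiers below are deliberately simple stand-ins (a nearest-centroid rule against a majority-class baseline), not the MLP, SVM, or DT models used in the surveyed studies; the point is only the protocol of evaluating the proposed model and its baseline on identical data:

```python
def nearest_centroid_fit(X, y):
    """A stand-in 'proposed' classifier: store the mean vector per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def nearest_centroid_predict(model, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    dist2 = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], x))

def majority_class(y):
    """The baseline: always predict the most frequent training label."""
    return max(set(y), key=list(y).count)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data: features are (debt ratio, liquidity); class 1 = distressed firm.
X_train = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3),
           (0.2, 0.8), (0.1, 0.9), (0.3, 0.7), (0.25, 0.75)]
y_train = [1, 1, 1, 0, 0, 0, 0]
X_test, y_test = [(0.85, 0.15), (0.15, 0.85)], [1, 0]

model = nearest_centroid_fit(X_train, y_train)
proposed_preds = [nearest_centroid_predict(model, x) for x in X_test]
baseline_preds = [majority_class(y_train)] * len(X_test)

proposed_acc = accuracy(y_test, proposed_preds)
baseline_acc = accuracy(y_test, baseline_preds)
```

In a published study, the same protocol would simply be repeated over several datasets and splits, with the stand-ins replaced by the actual proposed and baseline classifiers.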


TABLE XI
COMPARISONS OF BANKRUPTCY-PREDICTION MODELS BY SINGLE, ENSEMBLE, AND HYBRID MACHINE-LEARNING TECHNIQUES

Fig. 5. Average accuracy of the bankruptcy-prediction models by single, ensemble, and hybrid machine-learning techniques.

Because MLP and SVM are the two most widely developed single classifiers in the literature (cf., Section II-B), and MLP, DT, and SVM are the three most popular baselines other than statistical models (cf., Section III-A), future studies must choose some of these classifiers for comparison.

However, since many recent prediction models are based on soft classification techniques, comparing these models only with single classification techniques is inadequate. This is because soft classification techniques have been shown to outperform single methods (cf., Section II-C). Therefore, it would be advisable to consider some soft classifiers, such as DT/MLP ensembles, GA + SVM/MLP, and SOM + MLP, in the experimental design in order to reliably assess the performance of the proposed classifiers.

D. Benchmark Datasets

In addition to the consideration of feature selection and the development of prediction models by soft classification techniques, the selection of suitable datasets for experiments must be carefully considered. In general, there are two strategies for selecting benchmark datasets. The first is based on the adoption of public datasets, such as the Australian, German, and Japanese datasets, and the second on specific real cases collected individually. However, very few studies considered both strategies in their experiments.

We believe that it is always better to use a number of different datasets for model evaluation instead of limiting the study to only one specific dataset. Moreover, the approach to evaluation using training and testing sets of different proportions and/or cross validation is also important. This would increase the likelihood of fully understanding the value and performance of the proposed classifiers and increase the reliability of the conclusions made according to these evaluation results.

V. CONCLUSION

This study has reviewed 130 related studies between 1995 and 2010 on the prediction of bankruptcy and credit scoring using machine-learning techniques. Unlike previous review articles, this paper has reviewed more recent papers, including those using a variety of soft classification techniques for the prediction of financial crises. More specifically, this statistical review has compared related studies in terms of single and soft classifiers, baseline classifiers, and datasets.

The development of models for the prediction of financial crises using machine learning still has a long way to go. Soft classification techniques appear to be the direction for future research as the core techniques used in the development of prediction models. These include numerous techniques that involve classifier ensembles and hybrid classifiers.

In addition to the development of prediction models, data preprocessing for the selection of features is another area in which prediction performance can be improved. Determining baseline classifiers with which to compare models is an important issue in demonstrating the robustness of developed classifiers. For example, both single and soft classifiers can be used as baselines. Another important issue is to carefully select the dataset(s) for experiments. More specifically, using various datasets with different proportions of training and testing sets and cross validation is likely to provide a better understanding of the performance of the classifiers and provide more reliable conclusions.

REFERENCES

[1] M. Adya and F. Collopy, “How effective are neural networks at forecasting and prediction? A review and evaluation,” J. Forecast., vol. 17, pp. 481–495, 1998.
[2] H. Ahn and K.-J. Kim, “Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach,” Appl. Soft Comput., vol. 9, pp. 599–607, 2009.
[3] E. Alfaro, N. Garcia, M. Gamez, and D. Elizondo, “Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks,” Decision Support Syst., vol. 45, pp. 110–122, 2008.
[4] E. Angelini, G. di Tollo, and A. Roli, “A neural network approach for credit risk evaluation,” Quart. Rev. Econ. Finance, vol. 48, pp. 733–755, 2008.
[5] A. F. Atiya, “Bankruptcy prediction for credit risk using neural networks: A survey and new results,” IEEE Trans. Neural Netw., vol. 12, no. 4, pp. 929–935, Jul. 2001.
[6] B. Back, T. Laitinen, and K. Sere, “Neural networks and genetic algorithms for bankruptcy predictions,” Expert Syst. Appl., vol. 11, pp. 407–413, 1996.
[7] S. Balcaen and H. Ooghe, “35 years of studies on business failure: An overview of the classic statistical methodologies and their related problems,” British Account. Rev., vol. 38, pp. 63–93, 2006.


[8] R. Barniv, A. Agarwal, and R. Leach, “Predicting the outcome following bankruptcy filing: A three-stage classification using neural networks,” Intell. Syst. Account., Finance Manag., vol. 6, pp. 177–194, 1997.
[9] T. Bellotti and J. Crook, “Support vector machines for credit scoring and discovery of significant features,” Expert Syst. Appl., vol. 36, pp. 3302–3308, 2009.
[10] A. Ben-David and E. Frank, “Accuracy of machine learning models versus “hand crafted” expert systems—A credit scoring case study,” Expert Syst. Appl., vol. 36, pp. 5264–5271, 2008.
[11] M. Bensic, N. Sarlija, and M. Zekic-Susac, “Modelling small-business credit scoring by using logistic regression, neural networks and decision trees,” Intell. Syst. Account., Finance Manag., vol. 13, pp. 133–150, 2005.
[12] D. Berg, “Bankruptcy prediction by generalized additive models,” Appl. Stochastic Models Bus. Ind., vol. 23, pp. 129–143, 2007.
[13] J. E. Boritz and D. B. Kennedy, “Effectiveness of neural network types for prediction of business failure,” Expert Syst. Appl., vol. 9, pp. 503–512, 1995.
[14] M. A. Boyacioglu, Y. Kara, and O. K. Baykan, “Predicting bank financial failures using neural networks, support vector machines and multivariate statistical methods: A comparative analysis in the sample of savings deposit insurance fund (SDIF) transferred banks in Turkey,” Expert Syst. Appl., vol. 36, pp. 3355–3366, 2009.
[15] L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[16] L. Breiman, J. H. Friedman, R. A. Olshen, and P. J. Stone, Classification and Regression Trees. CA: Wadsworth International Group, 1984.
[17] H. Byun and S.-W. Lee, “A survey on pattern recognition applications of support vector machines,” Int. J. Pattern Recog. Artif. Intell., vol. 17, no. 3, pp. 459–486, 2003.
[18] T. G. Calderon and J. J. Cheh, “A roadmap for future neural networks research in auditing and risk assessment,” Int. J. Account. Inf. Syst., vol. 3, pp. 203–236, 2002.
[19] S. Canbas, A. Cabuk, and S. B. Kilic, “Prediction of commercial bank failure via multivariate statistical analysis of financial structures: The Turkish case,” Eur. J. Oper. Res., vol. 166, no. 2, pp. 528–546, 2005.
[20] A. E. Celik and Y. Karatepe, “Evaluating and forecasting banking crises through neural network models: An application for Turkish banking sector,” Expert Syst. Appl., vol. 33, pp. 809–815, 2006.
[21] D. K. Chandra, V. Ravi, and I. Bose, “Failure prediction of dotcom companies using hybrid intelligent techniques,” Expert Syst. Appl., vol. 36, pp. 4830–4837, 2009.
[22] N. Chauhan, V. Ravi, and D. K. Chandra, “Differential evolution trained wavelet neural networks: Application to bankruptcy prediction in banks,” Expert Syst. Appl., vol. 36, pp. 7659–7665, 2009.
[23] F.-L. Chen and F.-C. Li, “Combination of feature selection approaches with SVM in credit scoring,” Expert Syst. Appl., vol. 37, pp. 4902–4909, 2010.
[24] L.-H. Chen and H.-D. Hsiao, “Feature selection to diagnose a business crisis by using a real GA-based support vector machine: An empirical study,” Expert Syst. Appl., vol. 35, pp. 1145–1155, 2008.
[25] W. Chen, C. Ma, and L. Ma, “Mining the customer credit using hybrid support vector machine technique,” Expert Syst. Appl., vol. 36, pp. 7611–7616, 2009.
[26] W.-S. Chen and Y.-K. Du, “Using neural networks and data mining techniques for the financial distress prediction model,” Expert Syst. Appl., vol. 36, pp. 4075–4086, 2009.
[27] C.-B. Cheng, C.-L. Chen, and C.-J. Fu, “Financial distress prediction by a radial basis function network with logit analysis learning,” Comput. Math. Appl., vol. 51, pp. 579–588, 2006.
[28] M.-C. Cheng and S.-H. Huang, “Credit scoring and rejected instances reassigning through evolutionary computation techniques,” Expert Syst. Appl., vol. 24, pp. 433–441, 2003.
[29] L.-C. Chi and T.-C. Tang, “Bankruptcy prediction: Application of logit analysis in export credit risks,” Australian J. Manag., vol. 31, no. 1, pp. 17–28, 2006.
[30] S. Cho, H. Hong, and B.-C. Ha, “A hybrid approach based on the combination of variable selection using decision trees and case-based reasoning using the Mahalanobis distance: For bankruptcy prediction,” Expert Syst. Appl., vol. 37, pp. 3482–3488, 2010.
[31] S. Cho, J. Kim, and J. K. Bae, “An integrative model with subject weight based on neural network learning for bankruptcy prediction,” Expert Syst. Appl., vol. 36, pp. 403–410, 2009.
[32] C.-L. Chuang and R.-H. Lin, “Constructing a reassigning credit scoring model,” Expert Syst. Appl., vol. 36, pp. 1685–1694, 2009.
[33] J. N. Crook, D. B. Edelman, and L. C. Thomas, “Recent developments in consumer credit risk assessment,” Eur. J. Oper. Res., vol. 183, pp. 1447–1465, 2007.
[34] V. S. Desai, J. N. Crook, and G. A. Overstreet, Jr., “A comparison of neural networks and linear scoring models in the credit union environment,” Eur. J. Oper. Res., vol. 95, no. 1, pp. 24–37, 1996.
[35] A. I. Dimitras, S. H. Zanakis, and C. Zopounidis, “A survey of business failures with an emphasis on prediction methods and industrial applications,” Eur. J. Oper. Res., vol. 90, pp. 487–513, 1996.
[36] Y. Ding, X. Song, and Y. Zen, “Forecasting financial condition of Chinese listed companies based on support vector machine,” Expert Syst. Appl., vol. 34, pp. 3081–3089, 2008.
[37] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
[38] P. G. Espejo, S. Ventura, and F. Herrera, “A survey on the application of genetic programming to classification,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 2, pp. 121–144, Mar. 2010.
[39] H. Etemadi, A. A. Anvary, A. Rostamy, and H. F. Dehkordi, “A genetic programming model for bankruptcy prediction: Empirical evidence from Iran,” Expert Syst. Appl., vol. 36, pp. 3199–3207, 2009.
[40] S. Finlay, “Are we modelling the right thing? The impact of incorrect problem specification in credit scoring,” Expert Syst. Appl., vol. 36, pp. 9065–9071, 2009.
[41] S. Finlay, “Credit scoring for profitability objectives,” Eur. J. Oper. Res., vol. 202, pp. 528–537, 2010.
[42] Y. Freund and R. E. Schapire, “Experiments with a new boosting algorithm,” in Proc. Int. Conf. Mach. Learning, Bari, Italy, 1996, pp. 148–156.
[43] D. Frosyniotis, A. Stafylopatis, and A. Likas, “A divide-and-conquer method for multi-net classifiers,” J. Pattern Analysis Appl., vol. 6, no. 1, pp. 32–40, 2003.
[44] D. J. Hand and W. E. Henley, “Statistical classification methods in consumer credit scoring: A review,” J. Royal Statistical Soc., Series A, vol. 160, pp. 523–541, 1997.
[45] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. NJ: Prentice-Hall, 1999.
[46] F. Hoffmann, B. Baesens, C. Mues, T. Van Gestel, and J. Vanthienen, “Inferring descriptive and approximate fuzzy rules for credit scoring using evolutionary algorithms,” Eur. J. Oper. Res., vol. 177, pp. 540–555, 2007.
[47] N.-C. Hsieh, “Hybrid mining approach in the design of credit scoring models,” Expert Syst. Appl., vol. 28, pp. 655–665, 2005.
[48] Y.-C. Hu, “Incorporating a non-additive decision making method into multi-layer neural network and its application to financial distress analysis,” Know.-Based Syst., vol. 21, pp. 383–390, 2008.
[49] Y.-C. Hu, “Bankruptcy prediction using ELECTRE-based single-layer perceptron,” Neurocomputing, vol. 72, pp. 3150–3157, 2009.
[50] Y.-C. Hu and F.-M. Tseng, “Functional-link net with fuzzy integral for bankruptcy prediction,” Neurocomputing, vol. 70, pp. 2959–2968, 2007.
[51] Z. Hua, Y. Wang, X. Xu, B. Zhang, and L. Liang, “Predicting corporate financial distress based on integration of support vector machine and logistic regression,” Expert Syst. Appl., vol. 33, pp. 434–440, 2007.
[52] C.-L. Huang, M.-C. Chen, and C.-J. Wang, “Credit scoring with a data mining approach based on support vector machines,” Expert Syst. Appl., vol. 33, pp. 847–856, 2007.
[53] J. J. Huang, G.-H. Tzeng, and C.-S. Ong, “Two-stage genetic programming (2SGP) for the credit scoring model,” Appl. Math. Comput., vol. 174, pp. 1039–1053, 2006.
[54] S.-M. Huang, C.-F. Tsai, D. C. Yen, and Y.-L. Cheng, “A hybrid financial analysis model for business failure prediction,” Expert Syst. Appl., vol. 35, pp. 1034–1040, 2008.
[55] Y.-M. Huang, C.-M. Hung, and H. C. Jiau, “Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem,” Nonlinear Anal.: Real World Appl., vol. 7, pp. 720–747, 2006.
[56] Z. Huang, H. Chen, C.-J. Hsu, W.-H. Chen, and S. Wu, “Credit rating analysis with support vector machines and neural networks: A market comparative study,” Decision Support Syst., vol. 37, pp. 543–558, 2004.
[57] C. Hung and J.-H. Chen, “A selective ensemble based on expected probabilities for bankruptcy prediction,” Expert Syst. Appl., vol. 36, pp. 5297–5303, 2009.
[58] J. Huysmans, B. Baesens, J. Vanthienen, and T. Van Gestel, “Failure prediction with self organizing maps,” Expert Syst. Appl., vol. 30, pp. 479–487, 2006.
[59] A. Jain and D. Zongker, “Feature selection: Evaluation, application, and small sample performance,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 2, pp. 153–158, Feb. 1997.


[60] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: A review,” ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, 1999.
[61] J.-S. Jang, C.-T. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. NJ: Prentice-Hall, 1996.
[62] H. Jo and I. Han, “Integration of case-based forecasting, neural network, and discriminant analysis for bankruptcy prediction,” Expert Syst. Appl., vol. 11, pp. 415–422, 1996.
[63] H. Jo, J. Han, and H. Lee, “Bankruptcy prediction using case-based reasoning, neural networks, and discriminant analysis,” Expert Syst. Appl., vol. 13, pp. 97–108, 1997.
[64] S. Kaski, J. Sinkkonen, and J. Peltonen, “Bankruptcy analysis with self-organizing maps in learning metrics,” IEEE Trans. Neural Netw., vol. 12, no. 4, pp. 936–947, Jul. 2001.
[65] M.-J. Kim and I. Han, “The discovery of experts’ decision rules from qualitative bankruptcy data using genetic algorithms,” Expert Syst. Appl., vol. 25, pp. 637–646, 2003.
[66] M.-J. Kim and D.-K. Kang, “Ensemble with neural networks for bankruptcy prediction,” Expert Syst. Appl., vol. 37, pp. 3373–3379, 2010.
[67] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, “On combining classifiers,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226–239, Mar. 1998.
[68] P.-C. Ko and P.-C. Lin, “An evolution-based approach with modularized evaluations to forecast financial distress,” Know.-Based Syst., vol. 19, pp. 84–91, 2006.
[69] T. Kohonen, “Self-organized formation of topologically correct feature maps,” Biol. Cybern., vol. 43, pp. 59–69, 1982.
[70] J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press, 1992.
[71] P. R. Kumar and V. Ravi, “Bankruptcy prediction in banks and firms via statistical and intelligent techniques—A review,” Eur. J. Oper. Res., vol. 180, pp. 1–28, 2007.
[72] R. C. Lacher, P. K. Coats, S. C. Sharma, and L. F. Fant, “A neural network for classifying the financial health of a firm,” Eur. J. Oper. Res., vol. 85, pp. 53–65, 1995.
[73] G. Lanine and R. V. Vennet, “Failure prediction in the Russian bank sector with logit and trait recognition models,” Expert Syst. Appl., vol. 30, pp. 463–478, 2006.
[74] Y.-C. Lee, “Application of support vector machines to corporate credit rating prediction,” Expert Syst. Appl., vol. 33, pp. 67–74, 2007.
[75] Y.-C. Lee and H.-L. Teng, “Predicting the financial crisis by Mahalanobis–Taguchi system—Examples of Taiwan’s electronic sector,” Expert Syst. Appl., vol. 36, pp. 7469–7478, 2009.
[76] K. Lee, D. Booth, and P. Alam, “A comparison of supervised and unsupervised neural networks in predicting bankruptcy of Korean firms,” Expert Syst. Appl., vol. 29, pp. 1–16, 2005.
[77] K. C. Lee, I. Han, and Y. Kwon, “Hybrid neural network models for bankruptcy predictions,” Decision Support Syst., vol. 18, pp. 63–72, 1996.
[78] T.-S. Lee, C.-C. Chiu, Y.-C. Chou, and C.-J. Lu, “Mining the customer credit using classification and regression tree and multivariate adaptive regression splines,” Comput. Statist. Data Anal., vol. 50, pp. 1113–1130, 2006.
[79] T.-S. Lee, C.-C. Chiu, C.-J. Lu, and I.-F. Chen, “Credit scoring using the hybrid neural discriminant technique,” Expert Syst. Appl., vol. 23, pp. 245–254, 2002.
[80] T. Lensberg, A. Eilifsen, and T. E. McKee, “Bankruptcy theory development and classification via genetic programming,” Eur. J. Oper. Res., vol. 169, pp. 677–697, 2006.
[81] M. Leshno and Y. Spector, “Neural network prediction analysis: The bankruptcy case,” Neurocomputing, vol. 10, pp. 125–147, 1996.
[82] H. Li and J. Sun, “Ranking-order case-based reasoning for financial distress prediction,” Know.-Based Syst., vol. 21, pp. 868–878, 2008.
[83] H. Li and J. Sun, “Majority voting combination of multiple case-based reasoning for financial distress prediction,” Expert Syst. Appl., vol. 36, pp. 4363–4373, 2009.
[84] H. Li and J. Sun, “Gaussian case-based reasoning for business failure prediction with empirical data in China,” Inf. Sci., vol. 179, pp. 89–108, 2009.
[85] H. Li and J. Sun, “Predicting business failure using multiple case-based reasoning combined with support vector machine,” Expert Syst. Appl., vol. 36, pp. 10085–10096, 2009.
[86] H. Li, H.-B. Huang, J. Sun, and C. Lin, “On sensitivity of case-based reasoning to optimal feature selection subsets in business failure prediction,” Expert Syst. Appl., vol. 37, pp. 4811–4821, 2010.
[87] H. Li, J. Sun, and B.-L. Sun, “Financial distress prediction based on OR-CBR in the principle of k-nearest neighbors,” Expert Syst. Appl., vol. 36, pp. 643–659, 2009.
[88] M. K. Lim and S. Y. Sohn, “Cluster-based dynamic scoring model,” Expert Syst. Appl., vol. 32, pp. 427–431, 2006.
[89] F. Y. Lin and S. McClean, “A data mining approach to the prediction of corporate failure,” Know.-Based Syst., vol. 14, pp. 189–195, 2001.
[90] R.-H. Lin, Y.-T. Wang, C.-H. Wu, and C.-L. Chung, “Developing a business failure prediction model via RST, GRA and CBR,” Expert Syst. Appl., vol. 36, pp. 1593–1600, 2009.
[91] S. L. Lin, “A new two-stage hybrid approach of credit risk in banking industry,” Expert Syst. Appl., vol. 36, pp. 8333–8341, 2009.
[92] T.-H. Lin, “A cross model study of corporate financial distress prediction in Taiwan: Multiple discriminant analysis, logit, probit and neural networks models,” Neurocomputing, vol. 72, pp. 3507–3516, 2009.
[93] C.-H. Liu, L.-S. Chen, and C.-C. Hsu, “An association-based case reduction technique for case-based reasoning,” Inf. Sci., vol. 178, pp. 3347–3355, 2008.
[94] Y. Liu and M. Schumann, “Data mining feature selection for credit scoring models,” J. Oper. Res. Soc., vol. 56, no. 9, pp. 1099–1108, 2005.
[95] S.-T. Luo, B.-W. Cheng, and C.-H. Hsieh, “Prediction model building with clustering-launched classification and support vector machines in credit scoring,” Expert Syst. Appl., vol. 36, pp. 7562–7566, 2009.
[96] R. Malhotra and D. K. Malhotra, “Differentiating between good credits and bad credits using neuro-fuzzy systems,” Eur. J. Oper. Res., vol. 136, pp. 190–211, 2002.
[97] D. Martens, B. Baesens, T. Van Gestel, and J. Vanthienen, “Comprehensible credit scoring models using rule extraction from support vector machines,” Eur. J. Oper. Res., vol. 183, no. 3, pp. 1466–1476, 2007.
[98] T. E. McKee and T. Lensberg, “Genetic programming and rough sets: A hybrid approach to bankruptcy classification,” Eur. J. Oper. Res., vol. 138, pp. 436–451, 2002.
[99] J. H. Min and C. Jeong, “A binary classification method for bankruptcy prediction,” Expert Syst. Appl., vol. 36, pp. 5256–5263, 2009.
[100] J. H. Min and Y.-C. Lee, “Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters,” Expert Syst. Appl., vol. 28, pp. 603–614, 2005.
[101] J. H. Min and Y.-C. Lee, “A practical approach to credit scoring,” Expert Syst. Appl., vol. 35, pp. 1762–1770, 2008.
[102] S.-H. Min, J. Lee, and I. Han, “Hybrid genetic algorithms and support vector machines for bankruptcy prediction,” Expert Syst. Appl., vol. 31, pp. 652–660, 2006.
[103] T. Mitchell, Machine Learning. New York: McGraw Hill, 1997.
[104] S. Nanda and P. Pendharkar, “Linear models for minimizing misclassification costs in bankruptcy prediction,” Int. J. Intell. Syst. Account., Finance Manag., vol. 10, pp. 155–168, 2001.
[105] L. Nanni and A. Lumini, “An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring,” Expert Syst. Appl., vol. 36, pp. 3028–3033, 2009.
[106] H. Ogut, R. Aktas, A. Alp, and M. M. Doganay, “Prediction of financial information manipulation by using support vector machine and probabilistic neural network,” Expert Syst. Appl., vol. 36, pp. 5419–5423, 2009.
[107] C.-S. Ong, J.-J. Huang, and G.-H. Tzeng, “Building credit scoring models using genetic programming,” Expert Syst. Appl., vol. 29, pp. 41–47, 2005.
[108] G. Paleologo, A. Elisseeff, and G. Antonini, “Subagging for credit scoring models,” Eur. J. Oper. Res., vol. 201, pp. 490–499, 2010.
[109] Z. Pawlak, “Rough sets,” Int. J. Comput. Inf. Sci., vol. 11, no. 5, pp. 341–356, 1982.
[110] P. C. Pendharkar, “A threshold-varying artificial neural network approach for classification and its application to bankruptcy prediction problem,” Comput. Oper. Res., vol. 32, no. 10, pp. 2561–2582, 2005.
[111] S. Piramuthu, “On preprocessing data for financial credit risk evaluation,” Expert Syst. Appl., vol. 30, pp. 489–497, 2006.
[112] P. P. M. Pompe and J. Bilderbeek, “Bankruptcy prediction: The influence of the year prior to failure selected for model building and the effects in a period of economic decline,” Intell. Syst. Account., Finance Manag., vol. 13, pp. 95–112, 2005.
[113] P. P. M. Pompe and J. Bilderbeek, “The prediction of bankruptcy of small- and medium-sized industrial firms,” J. Bus. Venturing, vol. 20, pp. 847–868, 2005.
[114] V. Ravi, H. Kurniawan, P. N. K. Thai, and P. R. Kumar, “Soft computing system for bank performance prediction,” Appl. Soft Comput., vol. 8, pp. 305–315, 2008.

Authorized licensed use limited to: TKR Educational Society. Downloaded on September 05,2020 at 17:55:02 UTC from IEEE Xplore. Restrictions apply.
[115] V. Ravi and C. Pramodh, “Threshold accepting trained principal component neural network and feature subset selection: Application to bankruptcy prediction in banks,” Appl. Soft Comput., vol. 8, pp. 1539–1548, 2008.
[116] P. Ravisankar and V. Ravi, “Financial distress prediction in banks using Group Method of Data Handling neural network, counter propagation neural network and fuzzy ARTMAP,” Know.-Based Syst., vol. 23, pp. 823–831, 2010.
[117] P. Ravisankar, V. Ravi, and I. Bose, “Failure prediction of dotcom companies using neural network-genetic programming hybrids,” Inf. Sci., vol. 180, pp. 1257–1267, 2010.
[118] Y. U. Ryu and W. T. Yue, “Firm bankruptcy prediction: Experimental comparison of isotonic separation and other classification approaches,” IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 35, no. 5, pp. 727–737, 2005.
[119] S. Salcedo-Sanz, J.-L. Fernández-Villacañas, M. J. Segovia-Vargas, and C. Bousoño-Calzón, “Genetic programming for the prediction of insolvency in non-life insurance companies,” Comput. Oper. Res., vol. 32, pp. 749–765, 2005.
[120] A. Sanchis, M. J. Segovia, J. A. Gil, A. Heras, and J. L. Vilar, “Rough sets and the role of the monetary policy in financial stability (macroeconomic problem) and the prediction of insolvency in insurance sector (microeconomic problem),” Eur. J. Oper. Res., vol. 181, pp. 1554–1573, 2007.
[121] H.-V. Seow and L. C. Thomas, “Using adaptive learning in credit scoring to estimate take-up probability distribution,” Eur. J. Oper. Res., vol. 173, pp. 880–892, 2006.
[122] R. Setiono, B. Baesens, and C. Mues, “A note on knowledge discovery using neural networks and its application to credit card screening,” Eur. J. Oper. Res., vol. 192, pp. 326–332, 2009.
[123] C. Serrano-Cinca, “Feedforward neural networks in the classification of financial information,” Eur. J. Finance, vol. 3, pp. 183–202, 1997.
[124] K.-S. Shin and Y.-J. Lee, “A genetic algorithm application in bankruptcy prediction modeling,” Expert Syst. Appl., vol. 23, pp. 321–328, 2002.
[125] K.-S. Shin, T. S. Lee, and H.-J. Kim, “An application of support vector machines in bankruptcy prediction model,” Expert Syst. Appl., vol. 28, pp. 127–135, 2005.
[126] K. A. Smith and J. N. D. Gupta, “Neural networks in business: Techniques and applications for the operations researcher,” Comput. Oper. Res., vol. 27, pp. 1023–1044, 2000.
[127] T. Sueyoshi and M. Goto, “DEA-DA for bankruptcy-based performance assessment: Misclassification analysis of Japanese construction industry,” Eur. J. Oper. Res., vol. 199, pp. 576–594, 2009.
[128] J. Sun and H. Li, “Data mining method for listed companies’ financial distress prediction,” Know.-Based Syst., vol. 21, pp. 1–5, 2008.
[129] J. Sun and H. Li, “Listed companies’ financial distress prediction based on weighted majority voting combination of multiple classifiers,” Expert Syst. Appl., vol. 35, pp. 818–827, 2008.
[130] J. Sun and H. Li, “Financial distress prediction based on serial combination of multiple classifiers,” Expert Syst. Appl., vol. 36, pp. 8659–8666, 2009.
[131] L. Sun and P. P. Shenoy, “Using Bayesian networks for bankruptcy prediction: Some methodological issues,” Eur. J. Oper. Res., vol. 180, pp. 738–753, 2007.
[132] T. K. Sung, N. Chang, and G. Lee, “Dynamics of modeling in data mining: Interpretive approach to bankruptcy prediction,” J. Manag. Inf. Syst., vol. 16, no. 1, pp. 63–85, 1999.
[133] M. Sustersic, D. Mramor, and J. Zupan, “Consumer credit scoring models with limited data,” Expert Syst. Appl., vol. 36, pp. 4736–4744, 2009.
[134] T.-C. Tang and L.-C. Chi, “Predicting multilateral trade credit risks: Comparisons of logit and fuzzy logic models using ROC curve analysis,” Expert Syst. Appl., vol. 28, pp. 547–556, 2005.
[135] T.-C. Tang and L.-C. Chi, “Neural networks analysis in business failure prediction of Chinese importers: A between-countries approach,” Expert Syst. Appl., vol. 29, pp. 244–255, 2005.
[136] F. E. H. Tay and L. Shen, “Economic and financial prediction using rough sets model,” Eur. J. Oper. Res., vol. 141, pp. 641–659, 2002.
[137] L. C. Thomas, “A survey of credit and behavioural scoring: Forecasting financial risk of lending to consumers,” Int. J. Forecast., vol. 16, pp. 149–172, 2000.
[138] L. C. Thomas, K. M. Jung, S. D. Thomas, and Y. Wu, “Modeling consumer acceptance probabilities,” Expert Syst. Appl., vol. 30, pp. 499–506, 2006.
[139] C.-F. Tsai, “Financial decision support using neural networks and support vector machines,” Expert Syst., vol. 25, no. 4, pp. 380–393, 2008.
[140] C.-F. Tsai, “Feature selection in bankruptcy prediction,” Know.-Based Syst., vol. 22, no. 2, pp. 120–127, 2009.
[141] C.-F. Tsai and M.-L. Chen, “Credit rating by hybrid machine learning techniques,” Appl. Soft Comput., vol. 10, pp. 374–380, 2010.
[142] C.-F. Tsai and J.-W. Wu, “Using neural network ensembles for bankruptcy prediction and credit scoring,” Expert Syst. Appl., vol. 34, no. 4, pp. 2639–2649, 2008.
[143] A. Tsakonas, G. Dounias, M. Doumpos, and C. Zopounidis, “Bankruptcy prediction with neural logic networks by means of grammar-guided genetic programming,” Expert Syst. Appl., vol. 30, pp. 449–461, 2006.
[144] F.-M. Tseng and Y.-C. Hu, “Comparing four bankruptcy prediction models: Logit, quadratic interval logit, neural and fuzzy neural networks,” Expert Syst. Appl., vol. 37, pp. 1846–1853, 2010.
[145] B. Twala, “Multiple classifier application to credit risk assessment,” Expert Syst. Appl., vol. 37, pp. 3326–3336, 2010.
[146] T. Van Gestel, B. Baesens, and D. Martens, “From linear to non-linear kernel based classifiers for bankruptcy prediction,” Neurocomputing, vol. 73, pp. 2955–2970, 2010.
[147] T. Van Gestel, B. Baesens, J. A. K. Suykens, D. Van den Poel, D.-E. Baestaens, and M. Willekens, “Bayesian kernel based classification for financial distress detection,” Eur. J. Oper. Res., vol. 172, pp. 979–1003, 2006.
[148] T. Van Gestel, B. Baesens, P. Van Dijcke, J. Garcia, J. A. K. Suykens, and J. Vanthienen, “A process model to develop an internal rating system: Sovereign credit rating,” Decision Support Syst., vol. 42, pp. 1131–1151, 2006.
[149] V. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[150] A. Vellido, P. J. G. Lisboa, and J. Vaughan, “Neural networks in business: A survey of applications (1992–1998),” Expert Syst. Appl., vol. 17, pp. 51–70, 1999.
[151] A. Verikas, Z. Kalsyte, M. Bacauskiene, and A. Gelzinis, “Hybrid and ensemble-based soft computing techniques in bankruptcy prediction: A survey,” Soft Comput., vol. 14, pp. 995–1010, 2010.
[152] Y. Wang, S. Wang, and K. K. Lai, “A new fuzzy support vector machine to evaluate credit risk,” IEEE Trans. Fuzzy Syst., vol. 13, no. 6, pp. 820–831, Dec. 2005.
[153] D. West, “Neural network credit scoring models,” Comput. Oper. Res., vol. 27, no. 11/12, pp. 1131–1152, 2000.
[154] D. West, S. Dellana, and J. Qian, “Neural network ensemble strategies for financial decision applications,” Comput. Oper. Res., vol. 32, pp. 2543–2559, 2005.
[155] B. K. Wong and Y. Selvi, “Neural network applications in business: A review and analysis of literature (1990–1996),” Inf. Manag., vol. 34, pp. 129–139, 1998.
[156] B. K. Wong, T. A. Bodnovich, and Y. Selvi, “Neural network applications in business: A review and analysis of the literature (1988–95),” Decision Support Syst., vol. 19, pp. 301–320, 1997.
[157] C.-H. Wu, G.-H. Tzeng, Y.-J. Goo, and W.-C. Fang, “A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy,” Expert Syst. Appl., vol. 32, no. 2, pp. 397–408, 2007.
[158] D. Wu, L. Liang, and Z. Yang, “Analyzing the financial distress of Chinese public companies using probabilistic neural networks and multivariate discriminate analysis,” Socio-Economic Planning Sci., vol. 42, pp. 206–220, 2008.
[159] W.-W. Wu, “Beyond business failure prediction,” Expert Syst. Appl., vol. 37, pp. 2371–2376, 2010.
[160] X. Xu and Y. Wang, “Financial failure prediction using efficiency as a predictor,” Expert Syst. Appl., vol. 36, pp. 366–373, 2009.
[161] X. Xu, C. Zhou, and Z. Wang, “Credit scoring algorithm based on link analysis ranking with support vector machine,” Expert Syst. Appl., vol. 36, pp. 2625–2632, 2009.
[162] Y. Yang, “Adaptive credit scoring with kernel learning methods,” Eur. J. Oper. Res., vol. 183, pp. 1521–1535, 2007.
[163] Z. R. Yang, M. B. Platt, and H. D. Platt, “Probabilistic neural networks in bankruptcy prediction,” J. Bus. Res., vol. 44, pp. 67–74, 1999.
[164] I.-C. Yeh and C.-H. Lien, “The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients,” Expert Syst. Appl., vol. 36, pp. 2473–2480, 2009.
[165] J. S. Yoon and Y. S. Kwon, “A practical approach to bankruptcy prediction for small businesses: Substituting the unavailable financial data for credit card sales information,” Expert Syst. Appl., vol. 37, pp. 3624–3629, 2010.
[166] L. Yu, S. Wang, and K. K. Lai, “An intelligent-agent-based fuzzy group decision making model for financial multicriteria decision support: The case of credit scoring,” Eur. J. Oper. Res., vol. 195, pp. 942–959, 2009.

[167] D. Zhang, X. Zhou, S. C. H. Leung, and J. Zheng, “Vertical bagging decision trees model for credit scoring,” Expert Syst. Appl., vol. 37, pp. 7838–7843, 2010.
[168] G. Zhang, M. Y. Hu, B. E. Patuwo, and D. C. Indro, “Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis,” Eur. J. Oper. Res., vol. 116, pp. 16–32, 1999.
[169] G. Zhang, B. E. Patuwo, and M. Y. Hu, “Forecasting with artificial neural networks: The state of the art,” Int. J. Forecast., vol. 14, pp. 35–62, 1998.
[170] H. Zhao, A. Sinha, and W. Ge, “Effects of feature construction on classification performance: An empirical study in bank failure prediction,” Expert Syst. Appl., vol. 36, pp. 2633–2644, 2009.
[171] Z. Zhu, H. He, J. A. Starzyk, and C. Tseng, “Self-organizing learning array and its application to economic and financial problems,” Inf. Sci., vol. 177, pp. 1180–1192, 2007.

Wei-Yang Lin received the B.S.E.E. degree from National Sun Yat-Sen University, Kaohsiung, Taiwan, in 1994, and the M.S.E.E. and Ph.D. degrees from the University of Wisconsin-Madison, Madison, in 2004 and 2006, respectively.
Since 2006, he has been with the Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi County, Taiwan, where he is currently an Assistant Professor. His research interests include computer vision, biometric authentication, and multimedia signal processing.

Ya-Han Hu received the Ph.D. degree in information management from National Central University, Taoyuan County, Taiwan, in 2007.
He is currently an Assistant Professor with the Department of Information Management, National Chung Cheng University, Chiayi County, Taiwan. His current research interests include data mining and knowledge discovery, decision support systems, and EC technologies. His research has appeared in Data & Knowledge Engineering, Decision Support Systems, and the Journal of Information Science.

Chih-Fong Tsai received the Ph.D. degree from the School of Computing and Technology, University of Sunderland, U.K., in 2005.
He is currently an Associate Professor with the Department of Information Management, National Central University, Taoyuan County, Taiwan. He has published more than 50 technical publications in journals, book chapters, and international conference proceedings. His current research interests include multimedia information retrieval and data mining.
Dr. Tsai received the Highly Commended Award (Emerald Literati Network 2008 Awards for Excellence) from Online Information Review and the award for top ten cited articles in 2008 from Expert Systems with Applications.
