Comparative Analysis of Artificial Neural Network Models
Charalambous et al. (2000)
Abstract. This study compares the predictive performance of three neural network methods, namely the
learning vector quantization, the radial basis function, and the feedforward network that uses the conjugate
gradient optimization algorithm, with the performance of the logistic regression and the backpropagation
algorithm. All these methods are applied to a dataset of 139 matched-pairs of bankrupt and non-bankrupt
US firms for the period 1983–1994. The results of this study indicate that the contemporary neural network
methods applied in the present study provide superior results to those obtained from the logistic regression
method and the backpropagation algorithm.
1. Introduction
Artificial Neural Networks (ANN) are steadily becoming a very popular research subject
with applications in many areas, such as medicine, business, politics and technology.
Their application in business, and specifically in bankruptcy prediction has become even
more important as recent evidence suggests that ANN models can effectively capture and
represent complex relationships in areas where other statistical methods do not perform
that well. In addition, ANN models can overcome some limitations imposed by such
statistical methods [15].
Thus far, most researchers have experimented mainly with simple feedforward net-
works trained by the Back-Propagation (BP) algorithm. Evidence regarding the predic-
tive ability of ANN models based on other ANN structures is limited. The purpose of
this study is to compare the predictive performance of three contemporary ANN meth-
ods with the performance of the logistic regression and the simple feedforward network
using the BP algorithm. The present study differs from prior research in the following
respects: (a) it uses the Kohonen Learning Vector Quantization (LVQ) training algorithms [17], the Radial Basis Function (RBF) network [8], and the feedforward network
that minimizes the Least Squares Error Function (LSEF) with and without a penalty term
using conjugate gradient optimization algorithm [9], in addition to the common feedfor-
ward network trained by the BP algorithm [23], (b) it compares the results of these ANN
methods with the results of the logistic regression method, by applying these meth-
ods to bankruptcy prediction. We chose to apply these ANN methodologies to bank-
ruptcy prediction because bankruptcy is considered one of the most significant threats
for many businesses today. Evidence shows that in the past two decades business fail-
ures worldwide have occurred at rates higher than at any time since the early 1930s.
404 CHARALAMBOUS ET AL.
Bankruptcy affects not only the organization itself but also the overall economy. Specifically, investors, creditors, management and employees are severely affected by business failures. Since the economic cost of corporate failure is large,
there is a great need for models giving timely and robust prediction for this particular
event [5,16,25,28,31,33].
Our data set consists of 139 matched-pairs of bankrupt and non-bankrupt US firms
for the period 1983–1994. Our empirical findings indicate that the contemporary ANN
methods employed achieve better prediction results than the feedforward-BP model and
the logistic regression in all three years prior to bankruptcy.
The remainder of this study is organized as follows: section 2 motivates the study;
section 3 discusses the methodology employed in this study; empirical evidence from
the application of the ANN models in bankruptcy prediction is given in section 4; the
final section provides summary and conclusions.
2. Motivation of the study

Extant evidence shows that several researchers have assessed the predictive ability of ANN models on bankruptcy prediction. However, most of them employed the feedforward network structure in conjunction with the standard BP algorithm. More specifically,
Odom and Sharda [19], Rahimian et al. [22], Coats and Fant [11], and Wilson and
Sharda [32] applied the Altman’s model [1] to compare the predictive ability of the feed-
forward networks with the performance of the multiple discriminant analysis (MDA).
These researchers conclude that the classification results of the ANN models are su-
perior to the results provided by the MDA method. Moreover, Altman [3] concludes
that the feedforward ANN models consisting of more than two layers achieve better
classification results than the MDA models. Leshno and Spector [18] also conclude
that (a) the prediction results of the ANN models are superior to the results of the
MDA model, and (b) the prediction accuracy is improved when learning is performed
through enhanced techniques and when financial information from distinct periods is
presented to the network. Furthermore, other researchers have experimented extensively with a number of statistical methods, including logistic regression, multiple discriminant analysis and ANN models, on the bankruptcy problem. Boritz et
al. [6], Tam and Kiang [29,30], Charitou and Charalambous [10], Huang et al. [15],
Brockett et al. [7], Salchenberger et al. [24] and Fanning and Cogger [12] compared
the predictive ability of feedforward networks with the performance of other statisti-
cal methods, such as the k-nearest neighbor method, the non-parametric discriminant
analysis, and the logit and probit models [20]. These researchers conclude that the
ANN models are superior to the other statistical methods, partly because the ANN
models are not constrained to a particular distribution. Contrary to the above evi-
dence, Barniv et al. [4] find no significant differences in the predictive ability of the
feedforward ANN, MDA, and logit methods. Finally, Ragupathi et al. [21] and Suh
and Kim [27] applied different ANN models to bankruptcy prediction. More specif-
ically, Ragupathi et al. [21] conclude that the two-hidden-layer networks outperform
COMPARATIVE ANALYSIS OF ARTIFICIAL NEURAL NETWORK MODELS 405
the single-hidden layer networks because a single-hidden-layer network could not effec-
tively capture the complex relationships between the predictor financial variables. Suh
and Kim [27] also tried to discriminate between healthy, bankrupt and financially distressed firms using single-hidden-layer feedforward networks, but the results regarding the separation of the sub-sample consisting of the bankrupt and distressed firms were inconclusive.
In summary, the aforementioned bankruptcy studies indicate that the prediction re-
sults of the feedforward networks trained by the BP algorithm are superior to the results
obtained from the standard statistical techniques (i.e., logistic regression and multiple
discriminant analysis). Even though other contemporary ANN methods exist, researchers have not applied them to determine whether they provide superior classification results. The objective of this study is to compare the predictive performance of three neural network methods, namely the learning vector quantization, the radial basis function, and the feedforward network that uses the conjugate gradient optimization algorithm, with the performance of the logistic regression and the backpropagation algorithm.
3. Data set
Our data set consists of 139 matched-pairs of bankrupt and non-bankrupt US firms for
the period 1983–1994. The bankrupt firms were matched with non-bankrupt firms on the
basis of their industry classification, size, and year of bankruptcy filing. All data used
in this study were collected from the Compustat database. The data were separated into
two sub-samples, the training and the testing samples. The training sample consists of all
firms, bankrupt and healthy, included in the data set for the period 1983–1991 (total of
192 firms). The testing or forecasting sample consists of the remaining 86 firms included
in the data set for the period 1992–1994. In order to identify the predictor variables we
use twenty-seven financial variables that were found to be the most significant in the
literature. By employing univariate and stepwise regression analysis we find that the following seven variables are the most significant in predicting bankruptcy, and these are the ones used subsequently: CHETA, CLTA, DAR, DER, OPNI2N, UCFFOM and WCFOM (defined in table 2 in the appendix).
4. Methodology

The following ANN models and training algorithms are used in this study:
1. Kohonen's self-organizing map plus the three learning vector quantization training algorithms, versions 1–3.
2. The radial basis function network, with optimization.
3. The simple feedforward network when:
(i) trained by the standard back-propagation algorithm, and
(ii) trained by the conjugate gradient optimization algorithm, minimising the least-
squares error function with and without a penalty term.
A detailed discussion of these ANN models follows:
Kohonen's Self-Organizing Feature Map (SOFM) gives a solution to the following problem: given a set of input patterns X(i) ∈ RN , i = 1, 2, . . . , s, with an unknown probability density function
p(x), we seek to find the weight vectors associated with the n-neurons of Kohonen’s
2-dimensional array such that their density function is an approximation of p(x) in an
orderly fashion. By orderly fashion we imply that patterns that are close in the input
space will give winning neurons that are close in the 2-dimensional array map. It is
crucial to the formation of ordered maps that during the self-organizing process, the
winner neuron and its neighboring neurons update their weights such that they move
closer to the input pattern. Hence one of the main activities of the Kohonen SOFM is
the clustering of the input patterns into n-clusters.
The following steps describe Kohonen’s SOFM algorithm.
Step 0: Set the grid size (i.e., values of nx and ny );
Set the initial value of the learning rate parameter α to α0 ;
Set the initial value of the radius R;
Initialize the weight vector for each neuron; in this study we adopt a random
initialization of the weight vectors within the range of the input space.
Step 1: While stopping condition is false do steps 1.1–1.3.
Step 1.1: For each input pattern X, do steps (i), (ii):
(i) Determine the winner neuron I ∗ ;
(ii) Update the network:
wi(new) = wi(old) + α(X − wi(old)),   i ∈ N(I∗, R),
wi(new) = wi(old)   otherwise.
Step 1.2: Reduce the value of the learning rate α;
Reduce the value of R.
Step 1.3: Test stopping condition: the condition may specify fixed number of
iterations.
The value of R is relatively large at the start, so that all neurons are included. As the learning process continues it decreases until it ultimately reaches zero, at which point only the weight of the winner neuron is updated.
The learning rate α is a slowly decreasing function of training epochs, ep. In this
work α was updated using the formula:
α = α0 (1 − ep/epmax),
where epmax is the maximum number of epochs.
The Kohonen SOFM's main activity is clustering the input patterns; on its own it cannot be used as a pattern classifier.
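The steps above can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation; the grid size, α0, initial radius and epoch count are placeholder assumptions, and the neighborhood N(I∗, R) is taken to be a square (Chebyshev) neighborhood on the grid.

```python
import numpy as np

def train_sofm(X, nx=5, ny=5, alpha0=0.5, R0=3, ep_max=100, seed=0):
    """Minimal Kohonen SOFM sketch following steps 0-1.3 above."""
    rng = np.random.default_rng(seed)
    n, dim = nx * ny, X.shape[1]
    # Step 0: random initialization within the range of the input space
    lo, hi = X.min(axis=0), X.max(axis=0)
    W = rng.uniform(lo, hi, size=(n, dim))
    grid = np.array([(i, j) for i in range(nx) for j in range(ny)])  # 2-D positions
    for ep in range(ep_max):                      # step 1.3: fixed iteration count
        alpha = alpha0 * (1 - ep / ep_max)        # step 1.2: alpha = alpha0(1 - ep/epmax)
        R = round(R0 * (1 - ep / ep_max))         # step 1.2: shrink the radius to zero
        for x in X:                               # step 1.1
            winner = np.argmin(np.linalg.norm(W - x, axis=1))  # (i) winner neuron I*
            # (ii) move the winner and its neighbours N(I*, R) toward the pattern
            nbhd = np.max(np.abs(grid - grid[winner]), axis=1) <= R
            W[nbhd] += alpha * (x - W[nbhd])
    return W
```

With R reduced to zero, the final epochs update only the winner, so the weight vectors settle into cluster representatives of the input patterns.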
LVQ1
Step 0: Set the number of neurons representing each class; For each class use Kohonen
SOFM to find the initial position of the reference vectors of these neurons.
Initialize the learning rate α.
Step 1: While stopping condition is false, do steps 1.1–1.3:
Step 1.1: For each training input pattern X do steps (i), (ii):
(i) Determine the winner neuron I ∗ .
(ii) Update the network:
wI∗(new) = wI∗(old) + α(X − wI∗(old))   if neuron I∗ and pattern X belong in the same class,
wI∗(new) = wI∗(old) − α(X − wI∗(old))   if neuron I∗ and pattern X belong in different classes,
wi(new) = wi(old)   for i ≠ I∗.
Step 1.2: Reduce the value of the learning rate α.
Step 1.3: Test stopping condition.
In the LVQ1 method only the reference vector that corresponds to the winning
neuron is updated. In the improved algorithms (LVQ2 and LVQ3), two vectors, the winner and the runner-up, are updated if several conditions are satisfied. The idea is that if the input is approximately the same distance from both the winner and the runner-up, then both corresponding reference vectors should be updated.
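The LVQ1 loop above admits a compact NumPy sketch. The SOFM-based initialization of step 0 is replaced here by caller-supplied starting reference vectors, so this is a hypothetical harness rather than the authors' code:

```python
import numpy as np

def train_lvq1(X, y, W0, labels, alpha0=0.1, ep_max=50):
    """LVQ1 sketch: labels[i] is the class of reference vector W0[i]."""
    W = W0.astype(float).copy()
    for ep in range(ep_max):
        alpha = alpha0 * (1 - ep / ep_max)          # step 1.2: decreasing learning rate
        for x, c in zip(X, y):
            i_star = np.argmin(np.linalg.norm(W - x, axis=1))  # (i) winner neuron
            if labels[i_star] == c:                 # same class: attract the winner
                W[i_star] += alpha * (x - W[i_star])
            else:                                   # different class: repel the winner
                W[i_star] -= alpha * (x - W[i_star])
    return W

def lvq_predict(X, W, labels):
    """Classify each pattern by the class of its nearest reference vector."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]
```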
LVQ2
In this case step 1.1 is modified as follows:
(i) Determine the winner neuron I ∗ and the runner-up J ∗ .
(ii) Update both wI ∗ and wJ ∗ , if the following two conditions hold:
LVQ3
This algorithm is the same as the LVQ2 algorithm except part (iii) of step 1.1, which is
modified as follows:
(iii) If X, wI ∗ and wJ ∗ belong to the same class then,
wk(new) = wk(old) + εα(X − wk(old))   for k ∈ {I∗, J∗},
wk(new) = wk(old)   for k ∉ {I∗, J∗},
else (this case occurs if I ∗ and J ∗ belong to two different classes and the conditions
stated in (ii) do not hold)
wk(new) = wk(old) for all k.
An illustrative example
This example illustrates the learning behavior of the LVQ algorithms as pattern clas-
sifiers. We would like to distinguish between two classes of “overlapping”, two-
dimensional, normally-distributed patterns, labeled C1 and C2 .
Class C1 consists of 200 sample points and has the following parameters:
µ1 = mean = [0 0]^T,   σ1² = variance = 1.
Class C2 consists of 200 sample points and has the following parameters:
µ2 = [2 0]^T,   σ2² = 4.
Figure 2 shows the joint scatter plots of the two classes. Two reference vectors are used
for each class, denoted by circles for C1 and by stars for C2 . Figures 2–4 show the final
position of the reference vectors and the decision boundaries obtained by applying the
LVQ1, the LVQ2 and the LVQ3 algorithms. The classification accuracies obtained by
the three methods are:
LVQ1: 98% for class 1 and 66.5% for class 2, giving 82.3% overall correct classifica-
tion.
LVQ2: 95% for class 1 and 70% for class 2, giving 82.8% overall correct classification.
LVQ3: 87% for class 1 and 74% for class 2, giving 80.5% overall correct classification.
It is important to note that the decision boundary between the two class regions is made up of linear pieces. As shown in figure 4, even though we started with two reference vectors for C1, these vectors overlap, resulting in a single reference vector for C1. The probability of correct classification produced by the Bayesian (optimum)
classifier is calculated to be 81.5% (see [14]). That this optimum result was exceeded by the LVQ1 and LVQ2 algorithms is attributed to the finite number of sample points used for the two classes.
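The 81.5% Bayes-optimal figure can be checked by simulation. The sketch below draws a large sample from the two Gaussian densities (equal priors) and applies the optimal likelihood-ratio rule; with finite samples the estimate fluctuates around the theoretical value.

```python
import numpy as np

def bayes_accuracy_estimate(n=100_000, seed=0):
    """Monte-Carlo estimate of the Bayes-optimal classification accuracy for
    C1 ~ N([0,0], I) and C2 ~ N([2,0], 4I) with equal priors."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal([0, 0], 1.0, size=(n, 2)),    # C1: variance 1
                   rng.normal([2, 0], 2.0, size=(n, 2))])   # C2: variance 4 (std 2)
    y = np.r_[np.zeros(n), np.ones(n)]

    def log_density(X, mu, var):
        # isotropic 2-D Gaussian log-density, dropping the shared constant
        return -np.sum((X - mu) ** 2, axis=1) / (2 * var) - np.log(var)

    # optimal rule: pick the class with the larger (equal-prior) density
    pred = (log_density(X, [2, 0], 4.0) > log_density(X, [0, 0], 1.0)).astype(float)
    return (pred == y).mean()
```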
Furthermore, in order to examine the connection between the LVQ’s architecture
and the single perceptron architecture, we consider the case where one reference vector
for each class is used, for a two-class problem. Let w1 and w2 be the final positions of
the two reference vectors. In this case the resulting decision boundary between the two
class regions obtained by the LVQ method is the hyperplane that is orthogonal to the line
segment joining the two reference vectors and cuts the line segment in the middle (see
figure 5(a)).
Figure 5. Both the Kohonen architecture and that of the single perceptron lead to the same hyperplane as the decision boundary. (a) A two-class decision boundary; (b) Kohonen's architecture; (c) single perceptron architecture.
Let X be any point on the decision boundary. Using the orthogonality property
between vectors X − (w1 + w2 )/2 and (w1 − w2 ) we obtain the equation of the decision
boundary,
(w1 − w2)^T (X − (w1 + w2)/2) = 0,
which can be expressed in the form
w^T x + w0 = 0,
where
w = w1 − w2,   w0 = −(1/2)(w1^T w1 − w2^T w2).
A perceptron neuron is shown in figure 5(c). Its analog output ψ is given by
ψ = w^T x + w0,
where w is the perceptron's N-dimensional weight vector, w = [w1, w2, . . . , wN]^T, and w0 is its threshold weight. The analog output is passed through a hard-limit function to produce the binary output y: y = 1 if ψ ≥ 0, and y = −1 if ψ < 0. The hyperplane w^T x + w0 = 0 (ψ = 0) divides the input space, RN, into two subregions. For a two
class classification problem we want to choose weights w and w0 such that we get the
“best” separation between the two classes. Hence in this case the LVQ network shown
in figure 5(b) and the perceptron network shown in figure 5(c) will produce the same
decision boundary if weights w1 , w2 of the LVQ network and weights w, w0 of the
perceptron network are related by the above equations.
By noting that the difference between the single perceptron model and the logistic
model is the replacement of the hard-limiting function by the log-sigmoid function, the
connection between the LVQ’s architecture and that of the logistic model can be seen.
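The equivalence derived above is easy to verify numerically: for any pair of reference vectors, nearest-reference classification agrees everywhere with the perceptron whose weights are w = w1 − w2 and w0 = −(1/2)(w1^T w1 − w2^T w2). A sketch with random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=2), rng.normal(size=2)   # final reference vectors
X = rng.normal(size=(1000, 2))                    # arbitrary test points

# LVQ rule: assign to class 1 when x is nearer to w1 than to w2
lvq = np.linalg.norm(X - w1, axis=1) < np.linalg.norm(X - w2, axis=1)

# Equivalent perceptron: w = w1 - w2, w0 = -(1/2)(w1'w1 - w2'w2)
w = w1 - w2
w0 = -0.5 * (w1 @ w1 - w2 @ w2)
perceptron = X @ w + w0 > 0

assert np.array_equal(lvq, perceptron)  # identical decisions at every point
```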
The second ANN structure examined is the Radial Basis Function (RBF) network shown in figure 6. Although this network has a feedforward structure, consisting of a single hidden layer and an output layer, it differs from the simple feedforward network in that its hidden layer has centers rather than weights. The basic attribute
of this network is that all neurons in the hidden layer have locally tuned response charac-
teristics. These neurons are fully interconnected to a number of linear units in the output
layer (in this case there is only one output neuron). The purpose of RBF networks is to
transform a non-linearly separable classification problem into a linearly separable one.
Once an input vector is presented to the network, the hidden unit outputs are obtained
by calculating the closeness of the input vector X to the weight vector (center) of each
one of the hidden units. The function used to calculate the closeness is as follows:
φ(ri) = exp(−ri² / (2σi²)),
where:
ri = ‖X − wi‖: the distance between X and wi;
wi : weight vector associated with neuron i in the hidden layer;
σi2 : variance associated with neuron i;
X: input vector.
The above function gives an appreciable value (close to 1) only when the distance
between the input vector X and the weight vector of neuron i, wi is small; otherwise
it gives a value very close to zero. In order to reach the final output of the network,
we have to multiply the output vector of the hidden layer by the corresponding weight
vector associated with the neuron in the output layer. This weight vector is computed
using the pseudoinverse method, and the given targeted values. Since the output neuron
is linear, the actual output of the network is:
y = v0 + Σ_{i=1}^{H} vi φ(ri),
We then minimise the least-squares error function between the actual output and the corresponding targeted output, using the weight vectors wj and the weights of the output layer, v0 and vi, as variables. The initial values of the wj are those obtained above by using the SOFM, and the initial values of v0 and vi are those obtained by using the pseudoinverse method. The values of σi are kept fixed, as above.
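The RBF pipeline described above can be sketched as follows. Centers and variances are assumed given (e.g. from a SOFM); the first phase solves for the output weights with the pseudoinverse, and the subsequent gradient fine-tuning of the centers is omitted for brevity.

```python
import numpy as np

def rbf_hidden(X, centers, sigma2):
    """Hidden-layer outputs phi(r_i) = exp(-r_i^2 / (2 sigma_i^2))."""
    r2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # squared distances
    return np.exp(-r2 / (2 * sigma2))

def fit_output_weights(X, t, centers, sigma2):
    """Pseudoinverse solution for v0 and v against the targeted values t."""
    A = np.hstack([np.ones((len(X), 1)), rbf_hidden(X, centers, sigma2)])
    coef = np.linalg.pinv(A) @ t          # least-squares solution
    return coef[0], coef[1:]              # v0, v

def rbf_output(X, centers, sigma2, v0, v):
    """Linear output neuron: y = v0 + sum_i v_i phi(r_i)."""
    return v0 + rbf_hidden(X, centers, sigma2) @ v
```

Because the hidden layer maps the inputs non-linearly, a problem such as XOR becomes linearly separable in the hidden-unit space and is fitted exactly when the centers coincide with the patterns.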
Figure 7 shows the final ANN architecture applied in this study which is based on a
simple feedforward network. The network used consists of three layers, the input layer,
the hidden layer with a number of hidden neurons, and the output layer with a single
neuron. The hidden layer uses the hyperbolic tangent sigmoid activation function fH (·),
while the output layer uses the log-sigmoid activation function f0 (·). (The hyperbolic
tangent sigmoid activation function can also be used in the output layer.) In this study,
the simple feedforward network is applied with two different training algorithms: (a) the
standard back-propagation algorithm, and (b) the conjugate gradient optimization algo-
rithm which minimises the least-squares error function plus a penalty term.
Pf(W) = F(W) + P(W),
where
F(W) = 0.5 Σ_{i=1}^{s} (yi(W) − ti)²,
P(W) = e1 Σ_{l=1}^{N} Σ_{m=1}^{H} β(wlm)² / (1 + β(wlm)²) + e2 Σ_{l=1}^{N} Σ_{m=1}^{H} (wlm)².
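As a sketch, the penalised error function can be minimised with an off-the-shelf conjugate-gradient routine (scipy here, whereas the paper uses the algorithm of [9]). The one-hidden-layer tanh/log-sigmoid network mirrors figure 7; the values of e1, e2 and β below are illustrative assumptions, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def make_objective(X, t, N, H, e1=1e-4, e2=1e-5, beta=10.0):
    """Penalised least-squares error Pf(W) = F(W) + P(W) for a network with
    a tanh hidden layer and a log-sigmoid output neuron."""
    n_theta = H * (N + 1) + (H + 1)       # hidden weights+biases, output weights+bias

    def forward(theta):
        Wh = theta[:H * (N + 1)].reshape(H, N + 1)
        Wo = theta[H * (N + 1):]
        Xb = np.hstack([X, np.ones((len(X), 1))])
        hid = np.tanh(Xb @ Wh.T)                          # hidden layer f_H
        z = np.hstack([hid, np.ones((len(X), 1))]) @ Wo   # output neuron input
        return 1.0 / (1.0 + np.exp(-z))                   # log-sigmoid output

    def objective(theta):
        w2 = theta[:H * (N + 1)].reshape(H, N + 1)[:, :N] ** 2   # (w_lm)^2 terms
        F = 0.5 * np.sum((forward(theta) - t) ** 2)              # F(W)
        P = e1 * np.sum(beta * w2 / (1 + beta * w2)) + e2 * np.sum(w2)  # P(W)
        return F + P

    return objective, n_theta

# usage sketch: conjugate-gradient minimisation from a small random start
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
t = (X[:, 0] > 0).astype(float)
objective, n_theta = make_objective(X, t, N=2, H=2)
theta0 = rng.normal(0.0, 0.5, n_theta)
res = minimize(objective, theta0, method="CG", options={"maxiter": 50})
```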
5. Empirical results
In this section we discuss (a) an overview of the data and (b) a comparison of the prediction results obtained by the logistic regression and by the ANN models.
Figure 8 shows the trend of the seven significant variables used in all our models. Con-
sistent with our expectations, these trends reveal that there are major differences between
the two groups of companies (i.e., the bankrupt and healthy companies). As the year of
bankruptcy approaches, the medians of the WCFOM, OPNI2N and DER variables drop
substantially, whereas the medians of these variables for the healthy group remain stable
over the three-year period tested. Moreover, the median of the CLTA and DAR variables
for the bankrupt companies increases as the firm approaches bankruptcy, whereas it re-
mains relatively stable and at a lower level for the healthy companies. The median of the
UCFFOM variable also shows a significant difference between the two groups of firms.
These results show that as the firms get closer to the year of bankruptcy they are unable
to generate cash from their operations. Finally, as it was expected, the median of the
CHETA variable is lower for the bankrupt firms in all years examined.
Table 2 in the appendix shows the coefficients resulting from the application of
the logistic regression method on data taken over the period 1983–1991 for 96 matched-
pairs of bankrupt and non-bankrupt firms. These coefficients are then used to estimate
the probability of bankruptcy for a testing sample consisting of 43 matched-pairs of
bankrupt and non-bankrupt firms over the period 1992–1994.
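The estimation step can be sketched with a plain-NumPy logit fit (gradient ascent on the log-likelihood). The learning rate and iteration count are arbitrary choices; the table 2 coefficients come from standard maximum-likelihood estimation.

```python
import numpy as np

def fit_logit(X, y, iters=500, lr=0.1):
    """Logistic regression sketch: rows of X are firms (e.g. the seven ratios),
    y is 1 for bankrupt and 0 for non-bankrupt firms."""
    Xb = np.hstack([np.ones((len(X), 1)), X])       # constant term first
    b = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ b))           # estimated P(bankrupt)
        b += lr * Xb.T @ (y - p) / len(X)           # ascend the log-likelihood
    return b

def predict_proba(X, b):
    """Estimated probability of bankruptcy for a testing sample."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ b))
```

Classifying at the usual 0.5 cutoff on `predict_proba` reproduces the training/testing scheme described above on any fitted sample.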
Table 3 in the appendix shows the final weights of the training phase of the five
types of neural networks applied over the three years prior to bankruptcy. Annotation
H represents the number of neurons in the hidden layer for the feedforward and RBF
networks, and the number of reference neurons for the LVQ network (the best prediction
results of the LVQ2 and LVQ3 are quoted). The ith row of matrix W represents the
weight vector wi associated with neuron i. For the feedforward network, the last column
of W and V correspond to threshold weights.
Table 1 presents training and prediction results obtained by applying logistic regression
and neural network methods. The results of the logistic regression method on the training
sample show that the model classifies correctly 82.3%, 74.5% and 69.8% of the total
sample, for the first, second and third year prior to bankruptcy, respectively. To validate
these prediction results we use an out-of-sample ex-ante test. The testing results
show that the logistic regression model classifies correctly 77.9%, 68.6% and 64%, one,
two and three years prior to bankruptcy respectively. Comparing the type I and type II
error rates of these models we observe that the type I error rates are much higher in all
Figure 8. Trends of the significant variables used in the models. Medians are presented for each variable
for all three years prior to bankruptcy.
Table 1
Empirical results.
years tested.1 These results are not encouraging, since evidence shows that type I errors could be 35 times more costly than type II errors [2].
Since the major objective of our study is to compare the predictive performance
of various ANN methods with the classification accuracy of the logistic approach, we
also apply the ANN methods described in the previous section. The first three models
applied are based on the feedforward structure. The first model uses common BP learn-
ing algorithm while the second and the third models make use of the conjugate gradient
optimization algorithm, which minimises the LSEF with and without the penalty term.
All three networks consist of two layers of weights and have the same number of hidden
neurons in their hidden layer. The number of neurons in the hidden layer is based on the
1 Type I error is the misclassification of a bankrupt firm as healthy and type II error is the misclassification
of a healthy firm as bankrupt.
For the LVQ model with one reference vector per class, the decision surface between the two classes is a single hyperplane. Comparing these results with
the results given by the logistic regression method we observe that for this problem
the results given by the decision hyperplane of the LVQ model are superior to those
results given by the decision hyperplane of the logistic regression. Hence, if we are
interested in a single hyperplane to classify our data, it is worthwhile to use the LVQ method with one reference vector for each class in addition to the logistic regression method.
6. Conclusions
This study provides new evidence on the predictive ability of three contemporary Neural
Network (NN) methods in predicting bankruptcy. Specifically, our results indicate that
the three contemporary NN methods applied in the present study provide superior re-
sults to those obtained from the logistic regression method and from the BP algorithm.
The results of this study also encourage further research that may improve our understanding of the applicability of these contemporary NN methods to other business issues.
Since the predictive performance of these NN methods may depend on the characteris-
tics of the dataset and on the complexity of the issue under examination, researchers
are encouraged to apply these NN methods to other business and non-business is-
sues in order to determine whether these methods indeed provide superior results to
those results obtained from the most common statistical and NN methods applied thus
far.
Appendix
Table 2
Logistic regression models.
Years prior to Constant CHETA CLTA DAR DER OPNI2N UCFFOM WCFOM
bankruptcy
1 −2.1571 −7.5201 3.3017 0.2619 1.8878 1.7422 −0.017 −0.8003
(0.0001) (0.0076) (0.0014) (0.0388) (0.0165) (0.0002) (0.0544) (0.0056)
2 −1.9448 −0.9192 3.4995 0.0499 1.4871 1.3414 −0.0045 −0.6698
(0.0003) (0.4934) (0.0003) (0.5536) (0.0858) (0.0023) (0.6574) (0.0506)
3 −2.142 0.3761 3.5297 0.2625 2.1247 2.0053 −0.0133 −0.2581
(0.7243) (0.2914) (0.0147) (0.0003) (0.2655) (0.6334)
The dependent variable takes the value of 1 if the firm belongs in the bankrupt group and the value of 0 if it belongs in the non-bankrupt group; CHETA: Cash and equivalents/Total assets; CLTA: Current liabilities/Total assets; DAR: Change in accounts receivables; DER: (Debt due in one year + Long-term debt)/Total assets; OPNI2N: Dummy for operating income, 1 if negative for the last two years, 0 otherwise; UCFFOM: Change in cash flow from operations/Market value of equity; WCFOM: Working capital from operations/Market value of equity at fiscal year end.
Table 3
Neural network models.
One year prior to bankruptcy
Panel A1: Feedforward network – BP H = 2
W = 1.1378   6.035    0.4225   5.6004   10.0373   −69.0788   −25.447   10.2213
    0.4319   −1.1075  0.0573   −1.0423  −2.414    26.2243    7.6243    −2.1903
Panel D2: LVQ H = 2 (1 neuron in each class)
W = −0.3616   1.5204   0.3626   0.9658   1.6687   −0.9212   −0.9856
    0.9059    −1.6916  −0.8440  −1.3930  −1.6048  0.7174    1.0205
Acknowledgements
References
[1] E. Altman, Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, Journal
of Finance XXIII (September 1968).
[2] E. Altman, R. Halderman and P. Narayaman, Zeta analysis, Journal of Banking and Finance (June
1977).
[3] E. Altman, G. Marco and F. Varetto, Corporate distress diagnosis: Comparisons using linear dis-
criminant analysis and neural networks (the Italian experience), Journal of Banking and Finance 18
(1994).
[4] R. Barniv, A. Agarwal and R. Leach, Predicting the outcome following bankruptcy filing: A three-
state classification using neural networks, International Journal of Intelligent Systems in Accounting,
Finance and Management 6 (1997).
[5] J.E. Boritz, The “Going Concern” assumption: Accounting and auditing implications, Research Re-
port, CICA (1991).
[6] J.E. Boritz, D.B. Kennedy and A. de Miranda e Albuquerque, Predicting corporate failure using a neural network approach, International Journal of Intelligent Systems in Accounting, Finance and Management 14 (1995).
[7] P.L. Brockett, W.W. Cooper, L.L. Golden and U. Pitaktong, A neural network model for obtaining an
early warning of insurer insolvency, Journal of Risk and Insurance 61(3) (September 1994).
[8] D.S. Broomhead and D. Lowe, Multivariate functional interpolation and adaptive networks, Complex
Systems 2 (1988) 321–355.
[9] C. Charalambous, Conjugate gradient algorithm for efficient training of artificial neural networks,
IEEE Proceedings 139(3) (June 1992).
[10] A. Charitou and C. Charalambous, The prediction of earnings using financial statement information:
Empirical evidence with logit models and artificial neural networks, International Journal of Intelli-
gent Systems in Accounting, Finance and Management 5 (1996).
[11] P. Coats and F.L. Fant, Recognizing financial distress patterns using a neural network tool, Financial
Management (Autumn 1993).
[12] K. Fanning and K. Cogger, A comparative analysis of artificial neural networks using financial distress
prediction, International Journal of Intelligent Systems in Accounting, Finance and Management 3
(1994).
[13] R. Fletcher, Practical Optimization (Wiley, 1980).
[14] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. (Prentice-Hall, 1999).
[15] C.S. Huang, R.E. Dorsey and M.A. Boose, Life insurer financial distress prediction: A neural network
model, Journal of Insurance Regulation 13(2) (1995).
[16] F. Jones, Current techniques in bankruptcy prediction, Journal of Accounting Literature 6 (1987).
[17] T. Kohonen, The self-organizing map, Proc. IEEE 78(9) (September 1990).
[18] M. Leshno and Y. Spector, Neural network prediction analysis: The bankruptcy case, Neurocomputing
10 (1996).
[19] M. Odom and R. Sharda, A neural network model for bankruptcy prediction, in: Proc. IEEE Interna-
tional Conference on Neural Networks (San Diego, CA, 1990).
[20] J. Ohlson, Financial ratios and the probabilistic prediction of bankruptcy, Journal of Accounting Re-
search 18(1) (Spring 1980).
[21] W. Ragupathi, L.L. Schkade and B.S. Raju, A neural network approach to bankruptcy prediction, in:
Proc. IEEE 24th Annual Hawaii International Conference on Systems Science (1991).
[22] E. Rahimian, S. Singh, T. Thammachote and R. Virmani, Bankruptcy prediction by neural network,
in: Neural Networks in Finance and Investing, eds. R.R. Trippi and E. Turban (Probus, Chicago,
1992).
[23] D. Rumelhart, G. Hinton and G. Williams, Learning internal representations by error propagation, in:
Parallel Distributed Processing, Vol. 1, eds. D. Rumelhart and J. McCleland (MIT Press, 1986).
[24] L. Salchenberger, E. Cinar and N. Lash, Neural networks: A new tool for predicting thrift failures,
Decision Sciences 23 (1992).
[25] J. Scott, The probability of bankruptcy: A comparison of empirical predictions and theoretical models,
Journal of Banking and Finance 5 (1981).
[26] R. Setiono and H. Liu, Neural-network feature selector, IEEE Transactions on Neural Networks 8(3)
(1997) 654–661.
[27] Y. Suh and J. Kim, Current artificial neural network models for bankruptcy prediction, Journal of
Accounting & Business Research 4 (1996).
[28] T.S. Suan and K.H. Chye, Neural network applications in accounting and business, Accounting and
Business Review 4(2) (July 1997).
[29] K.Y. Tam and M.Y. Kiang, Managerial applications of neural networks: The case of bank failure
predictions, Management Science 28 (1992).
[30] K.Y. Tam and M.Y. Kiang, Predicting bank failures: A neural network approach, Applied Artificial
Intelligence 4 (1990).
[31] D. Trigueiros and R. Taffler, Neural networks and empirical research in accounting, Accounting and
Business Research 26(4) (1996).
[32] R. Wilson and R. Sharda, Bankruptcy prediction using neural networks, Decision Support Systems 11
(1994).
[33] C. Zavgren, The prediction of corporate failure: The state of the art, Journal of Accounting Literature
2 (1983).