
Expert Systems with Applications 27 (2004) 623–633

www.elsevier.com/locate/eswa

An integrated data mining and behavioral scoring model for analyzing bank customers
Nan-Chen Hsieh*
Department of Information Management, National Taipei College of Nursing, No. 365, Min Te Road 11257, Taipei, Taiwan, ROC

Abstract
Analyzing bank databases for customer behavior management is difficult since bank databases are multi-dimensional, comprising
monthly account records and daily transaction records. This study proposes an integrated data mining and behavioral scoring model to
manage existing credit card customers in a bank. A self-organizing map neural network was used to identify groups of customers based on
repayment behavior and recency, frequency, monetary (RFM) behavioral scoring predictors. It also classified bank customers into three
major profitable groups of customers. The resulting groups of customers were then profiled by customers' feature attributes determined
using an Apriori association rule inducer. This study demonstrates that identifying customers by a behavioral scoring model helps
characterize customers and facilitates marketing strategy development.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Data mining; Behavioral scoring model; Customer segmentation; Neural network; Association rule

* Tel./fax: +2-822-7101-2220.
E-mail address: nchsieh@ntcn.edu.tw.

1. Introduction

Contemporary marketing strategies perceive customers as important resources to an enterprise.
Therefore, it is essential for enterprises to successfully acquire new customers and retain
high-value customers. To achieve these aims, many enterprises have gathered significant numbers
of large databases, which can then be analyzed and applied to develop new business strategies
and opportunities.

However, instead of targeting all customers equally or providing the same incentive offers to
all customers, enterprises can select only those customers who meet certain profitability
criteria based on their individual needs or purchasing behaviors (Dyche & Dych, 2001). Credit
scoring and behavioral scoring are techniques that help decision makers to understand their
customers. Credit scoring models help to decide whether to grant credit to new applicants based
on customer characteristics such as age, income and marital status (Chen & Huang, 2003).
Behavioral scoring models help to analyze the purchasing behavior of existing customers
(Setiono, Thong, & Yap, 1998). These two scoring models are highly related to the field of
classification analysis by statistical analysis (Hand, 1981; Johnson & Wichern, 1998),
especially classification analysis by neural networks in the field of data mining (Lancher,
Coats, Shanker, & Fant, 1995).

Until now, most existing data mining approaches have focused on discovering general rules
(Agrawal, Imielinski, & Swami, 1993; Bult & Wansbeek, 1995; Setiono et al., 1998), predicting
personal bankruptcy (Dasgupta, Dispensa, & Ghose, 1994; Desai, Crook, & Overstreet, 1996; Zhang,
Hu, Patuwo, & Indro, 1999) and credit scoring (Kim & Sohn, 2004; Lancher et al., 1995; Sharda &
Wilson, 1996) in bank databases. Few works have studied the mining of bank databases from the
viewpoint of customer behavioral scoring (Sharda & Wilson, 1996). More specifically, we wanted
to look at both the account data of the customers and their credit card transactions. With these
data, the aim was to discover interesting patterns that could provide clues about what
incentives a company could offer as better marketing strategies to its customers. As shown in
Fig. 1, this study presents a two-stage approach for behavioral scoring analysis of implicit
knowledge using bank customer account and transaction data. Topics discussed include data

0957-4174/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2004.06.007
624 N.-C. Hsieh / Expert Systems with Applications 27 (2004) 623–633

preprocessing, customer behavior scoring modelling, sensitivity analysis of relative importance
attributes contributing to the customer profiling, and the two stages of the behavioral scoring
model itself.

Fig. 1. Two-stage behavioral scoring modeling.

The key feature of the two-stage behavioral scoring model is a cascade involving a
self-organizing map (SOM) and an Apriori association rule inducer. An SOM (Kim & Sohn, 2004;
Kohonen, 1995) is an unsupervised learning algorithm that maps similar multi-dimensional input
vectors to the same region of a neuron map, and Apriori (Agrawal et al., 1993) is mainly used to
find potential relationships between items or features that occur together in the database. In
the first stage of the approach presented here, a conceptual customer behavioral scoring model
was established to predict profitable groups of customers based on previous repayment behavior
and RFM (Bult & Wansbeek, 1995) behavioral scoring predictors. This SOM was employed to classify
customers into three major profitable groups of customers: revolver users, transactor users, and
convenience users.

Once the SOM identified the profitable groups of customers, an Apriori association rule inducer
profiled each group of customers, focusing on demographic and geographic characteristics, for
building and maintaining the most profitable customer base. The customer profile was then used
to describe a representative case in each group of customers, and served as a tool for
establishing better bank marketing strategies. After analyzing the bank database, this study
demonstrates that customer behavior scoring models are an effective method for banks to identify
their most profitable customers. We conclude by analyzing target groups of customers using the
proposed two-stage behavioral scoring model.

For a better understanding of our solutions, this study is organized as follows. Section 2
describes the analysis methodology and presents an integrated data mining and behavioral scoring
model. Section 3 assesses neural networks as a tool for customer segmentation, using past
repayment behavior and RFM scoring variables to build behavioral scoring models. Section 3 also
presents the process of creating customer profiles according to their feature attributes as
determined by an Apriori association rule inducer. Finally, conclusions are drawn in Section 4.

2. Description of the analysis methodology

2.1. Credit and behavioral scoring models

Credit and behavioral scoring models (Thomas, 2000) are among the most successful applications
of statistical and operational research modelling in finance and banking, and the number of
scoring analysts in the industry is constantly increasing. The main objective of both credit and
behavioral scoring models is to classify customers into groups (Lancher et al., 1995). Hence
scoring problems are related to the field of classification analysis (Hand, 1981; Johnson &
Wichern, 1998; Morrison, 1990). Applied to bank databases, classification analysis for credit
scoring is used to categorize a new applicant as either accepted or rejected with respect to
features such as age, income and marital status (Chen & Huang, 2003). On the other hand,
classification analysis for behavioral scoring is used to describe the behavior of existing
customers by using behavioral scoring variables and also to predict future purchasing behavior
or credit status of existing customers (Setiono et al., 1998).

Until now, the building of both scoring models has always been based on a pragmatic approach;
because of this, a best and most standard scoring model for every unique circumstance most
certainly does not exist. Most previous studies have focused on building more accurate credit or
behavioral scoring models and increasing the accuracy of the classification model with various
kinds of statistical

techniques. However, analyzing bank databases for customer behavior management is difficult
since bank databases are multi-dimensional, comprising monthly account records and daily
transaction records (Donato et al., 1999). Therefore, even with highly accurate scoring models,
some misclassification patterns appear frequently.

This study draws heavily on data mining perspectives. A general integrated data mining and
behavioral scoring model for customer behavior analysis, which includes the necessary
preprocessing of real-world data sets, derivation of scoring predictors and customer profiling
in order to support a standard model building process, will be of great utility. The framework
of the two-stage behavioral scoring model serves as a tool to validate the effect of data mining
techniques in practical scoring analysis applications.

2.2. Neural networks for segmentation analysis

For credit scoring or behavioral scoring analysis, many studies have shown that neural networks
perform significantly better than statistical techniques such as linear discriminant analysis
(LDA), multiple discriminant analysis (MDA), logistic regression analysis (LRA) and so on (Desai
et al., 1996; Lancher et al., 1995; Malhotra & Malhotra, 2003; Sharda & Wilson, 1996; Zhang et
al., 1999). The application of neural networks to segmentation analysis is a promising research
area and is a challenge for a variety of marketers (Vellido, Lisboa, & Vaughan, 1999).

Baesens, Viaene, Poel, Vanthienen, & Dedene (2002) applied Bayesian neural networks to repeat
purchase behavior modelling in direct marketing. Davies, Moutinho, & Curry (1996) and Moutinho,
Davies, & Curry (1996) analyzed how different bank customer groups hold different expectations
of the automatic teller machine (ATM) service. Rather than profiling segments based on
demographic or geographic characteristics, Dasgupta et al. (1994) characterized potential
customer segments in terms of lifestyle variables. Balakrishnan, Cooper, Jacob, & Lewis (1996)
accomplished a six-segment classification study using coffee brand switching probabilities
derived from scanner data at a sub-household level. Mazanec (1992) grouped tourists using a
benefit approach. Setiono et al. (1998) utilized a rule-extraction neural network to target
companies for the promotion of new information technology. Fish, Barnes, & Aiken (1995) proposed
a new methodology for industrial market segmentation by neural networks. Lee, Chiu, Lu, & Chen
(2002) explored the performance of credit scoring by integrating back propagation neural
networks with traditional discriminant analysis. Kim & Sohn (2004) used neural networks to
manage customer loans.

Among these studies, only Balakrishnan et al. used the frequency sensitive competitive learning
(FSCL) algorithm in segmentation analysis. The rest of the studies used supervised feed-forward
multilayer perceptrons (MLP) trained by back propagation and gradient descent, or similar
alternatives.

2.3. Properties of the built behavioral scoring model

In the business world, the most successful application of behavioral scoring models is embedded
in databases: it is an approach of analyzing customer histories, looking for similar behavioral
patterns among existing customer preferences and using those patterns for a targeted selection
of existing or future customers. The decisions to be made include which target groups of
customers will be encouraged to spend more, what credit line to assign, whether to promote new
products to particular groups of customers, and, if the repayment ability turns bad, how to
manage debt recovery. Therefore, a behavioral scoring model is an information-driven marketing
process that enables marketers to develop, test, implement, measure and appropriately modify
customized marketing programs and strategies.

In addition to the customer values that credit scoring models use as major scoring information,
repayment behavior patterns and customer purchasing histories are also required in a behavioral
scoring model. Behavioral scoring models are intended to establish associations between the
input predictors and the output scores in order to model the behavior of different customers.
More precisely, behavioral scoring models try to group customers that share behavior patterns.
This is carried out by assigning behavior scores to each customer and grouping customers into
classes of similar score value using an SOM neural network. The behavior score is given by a
mathematical function of the form:

$$\text{behavior score} = f_{\mathrm{SOM}}(\text{predictor}_1, \text{predictor}_2, \ldots).$$

In this study, four predictors, namely repayment behavior and the RFM values, are used to
classify three profitable groups of customers. Individual customer scores are updated on a
yearly basis in this study.

3. Assessing the neural network for customer segmentation

3.1. Preparing the data sets

For this study, bank databases were provided by a major Taiwanese credit card issuer. Data
preprocessing was required to ensure data field consistency in behavioral scoring model
building. Not all the data are related to the chosen purposes, so knowledge extraction from the
bank databases included the following three sub-actions. The first sub-action was intended to
organize the raw data. Two data sets were obtained: a set containing effective credit card
account information of 158,126 customers until June 2003, and another set storing over

20 million individual transaction records for these accounts from January 2000 to June 2003.
Then, the two data sets were joined using a customer identifier to create a single
behavioral-oriented data set. The second sub-action was the extraction of only those data
considered useful for the analysis. Unnecessary data fields and records containing incomplete or
missing data were removed from the data sets (Fish et al., 1995). The third sub-action was the
application of simple statistics to calculate an aggregate of new behavioral scoring predictors.

The aim of calculating the aggregate was to emphasize the customer repayment behavior and RFM
(Bult & Wansbeek, 1995) information hidden in the 12-month observation period. In this case,
values derived from the database such as the maximum, minimum and average of a set of variables
(e.g. repayment states, payment cycle days, number of credit card purchases, consumption amount,
interest on credit balance, and so on) for the monthly activity over the past 12 months were
considered for the purpose of building a behavioral scoring model. As mentioned, the desired
outcome is to be able to predict which customer belongs to which profitable group. The ranges of
values of each numerical predictor are split into intervals so that each interval contains as
many customers as possible that have significantly homogeneous behavior. Multiple predictors can
be grouped together to obtain the same effect. To derive the most profitable customers, it was
chosen to identify similar repayment behavior with respect to RFM values found in the real
world.

3.2. Analyzing the behavior of customers

To establish a better relationship with customers, banks constantly seek ways of differentiating
their offerings and developing more appropriate services for distinct market segments. An
important observation on current state-of-the-art segmentation analysis is its use of past
transaction data. The results produced are based on the assumption that customer behavior
follows patterns similar to past patterns and will repeat in the future. Therefore, there could
not be a better time than now to recognize the importance of an effective new marketing strategy
using data mining techniques. Increasing the amount of purchases while improving customer
satisfaction is a major goal.

Segmentation analysis is a method of achieving more targeted communication with customers and is
a pioneering step towards classifying individual customers according to previously defined
groups of customers. The process of segmentation analysis describes the characteristics of
groups of customers within the data, and puts customers into segments according to their
affinities or similar characteristics.

This study tries to construct a behavioral scoring model for direct marketing and encouraging
consumption (Lancher et al., 1995). These two goals are similar for analyzing potential credit
card customer behavior. However, attempts to achieve good customer behavior management may be
limited by poor data relevance and quality, the volume of data needing to be processed, or
difficulty in viewing the data. Therefore, the original data set could not be used directly to
predict customer behavior, so extra behavioral scoring predictors were needed for prediction.

As mentioned, banks have three types of profitable customers: revolver users, transactor users
and convenience users. Revolver users always carry a credit card balance, rolling over part of
the bill to the next month, instead of paying off the balance in full each month. Revolver users
are highly profitable customers because every month they pay considerable interest on their
outstanding balance. Meanwhile, transactor users pay in full on or before the due date of the
interest-free credit period and do not incur any interest payments or finance charges.
Transactor users do not contribute significant revenue through interest on their credit
balances, but the discount on each transaction they make still provides an important source of
bank revenue. Finally, convenience users are customers who periodically charge large bills, such
as for vacations or large purchases, to their credit card, and then pay these bills off over
several months. Convenience users thus contribute significant amounts of interest on their
credit balance.

Fig. 2 presents the conceptual framework used to answer the questions posed in this study. This
figure shows the two components, customer segmentation and customer profiling, which serve as
the major issues to be discussed here. Generally, credit card issuers make money from annual
fees, interest on credit balance, and the discount collected from merchants on each transaction.
In this framework, account and transaction data sets are assumed to be input sets to customer
segmentation. The values of RFM and repayment behavior are assumed to be behavioral scoring
predictors affecting customer segmentation.

The recency (R) value measures the average time distance between the day a customer makes a
charge and the day the customer pays the bill, the frequency (F) value measures the average
number of credit card purchases made, and the monetary (M) value measures the amount spent
during a yearly time period. Next, variables such as customer attributes and credit card usage
are assumed to influence customer profiling. Finally, clusters and the associated customer
profiles are assumed to be outputs, as well as influences on credit card marketing strategies.
In Fig. 2, repayment behavior is highly related to customer segmentation, but is an implicit
variable which cannot be retrieved directly from the data set. We needed to develop a method for
modeling the customer repayment behavior.

As shown in the following equation, this study employs 'Repayment Ability' (RA) to model
repayment behavior:

$$\text{Repayment Ability} = \frac{\text{no. of months without delayed pay off}}{\text{no. of months of holding the card}}.$$
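
As a minimal sketch of the Repayment Ability ratio, the following Python function computes RA from one flag per month. The function name `repayment_ability` and the boolean-per-month input layout are assumptions made for illustration; the paper does not specify how the monthly repayment states are stored.

```python
from typing import Sequence


def repayment_ability(paid_on_time: Sequence[bool]) -> float:
    """Return RA = months without delayed pay off / months of holding the card.

    `paid_on_time` is a hypothetical input layout: one flag per month in the
    observation window, True when the bill was paid off without delay.
    """
    months_held = len(paid_on_time)
    if months_held == 0:
        raise ValueError("the customer must have held the card for at least one month")
    months_without_delay = sum(1 for flag in paid_on_time if flag)
    return months_without_delay / months_held


# Hypothetical customer: 8 months paid off without delay in a 12-month window.
history = [True] * 8 + [False] * 4
ra = repayment_ability(history)
print(round(ra, 4))  # 0.6667
```

An RA of 1.0 corresponds to a customer who never delayed a payment in the window, and an RA of 0.0 to one who delayed every month.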

Fig. 2. A conceptual framework of customer behavioral modeling.

The default observation range is assumed to be 12 months, and RA is computed as the 'no. of
months without delayed pay off' divided by the 'no. of months of holding the card'. For example,
if a customer carries no credit card balance for 8 months, then the degree of RA is computed as
8/12. For each customer, if RA approaches one, then that customer is considered a transactor
user. Meanwhile, if RA lies between zero and one, then that customer is considered a convenience
user. Finally, if RA approaches zero, then that customer is considered a revolver user.

3.3. Assessing the SOM for customer behavioral scoring

In recent years, the SOM (Kohonen, 1995) has gained in popularity as a classification analysis
tool in business-related areas (Vellido et al., 1999). In this study, the SOM is built with data
from existing customers, which include variables from the account and transaction data sets. All
of the existing customers' data are used to build the behavioral scoring model in order to
predict potential customer behavior.

The behavioral scoring model utilized in this study is arranged to form a two-dimensional SOM
with a 4×4 rectangular array of neurons. Each of these neurons is connected to the input vectors
through synaptic weights which are adjusted during learning. The first phase of the SOM is a
rough estimation phase, used to capture the gross data patterns. The second phase is a tuning
phase, used to adjust the map to model the fine features of the data. During the learning
process, when a pattern is presented as an input to the neural network, the Euclidean distance
between the pattern and each neuron is calculated using RA first and then RFM as input
variables. For inputs to the SOM, each feature is scaled by subtracting the mean and dividing by
the standard deviation, resulting in each scaled feature having a mean of zero and a standard
deviation of one. Once the most similar neuron is determined, the neighborhood of that neuron is
identified. The neighborhood of a neuron is defined as all the neurons within a given link
distance of the matched neuron. All neurons in the neighborhood are adjusted to have feature
values closer to the current case. The adjustment amount of the neuron weights is controlled by
the learning rate.

The SOM map is shown in Fig. 3; the repayment behavior, number of customers, ratio of the number
of customers relative to all customers, RA and RFM are shown for each neuron. Fig. 4 illustrates
the overall distribution of customers with respect to the three major profitable types of
customers. The bulk of the cases are distributed over neurons 9–16, where the number of
customers is 104,979 and the repayment behavior is revolver user. Neurons 3, 4, 7 and 8 indicate
that a total of 21,202 customers are convenience users. Neurons 1, 2, 5 and 6 indicate that a
total of 31,945 customers are transactor users.

On the basis that no meaningful conclusions can be drawn from small numbers of customers, no
further analysis needs to be performed on the clusters with fewer than 1000 cases (i.e. neurons
6, 12 and 15). The next major step is to choose the target groups of customers, so as to choose
the target customers for direct marketing and encourage

Fig. 3. Neurons in a 4×4 map, each neuron defines a cluster.

consumption. The repayment behavior can be used to indicate the risk of customers; the risk
degrees among the three profitable groups of customers are 'transactor user' ≤ 'convenience
user' ≤ 'revolver user'. Moreover, the clusters of each profitable group whose RFM values tend
towards R↓F↑M↑ are selected as target ones; all customers who belong to these clusters become
candidates for conducting suitable marketing strategies for a bank, and attract the most
attention.

3.4. Determining the relative importance variables

After the segmentation of the existing customers, it is possible to infer the characteristics of
each group of customers and from that propose appropriate management strategies. Customer
profiling (Setiono et al., 1998) provides a basis for enterprises to offer customers better
services and retain good customers. Customer profiling is done by assembling collected
information on customers and their potential behavior. We first used a neural network
sensitivity analysis (Zurada, Malinowski, & Cloete, 1994) test on all customers to determine
whether there are significant differences between customers and to minimize the input variables,
and then inferred customer profiles with an Apriori association rule inducer.

The data set obtained after data preprocessing contained 32 attributes: 10 character attributes
and 22 continuous attributes. The neural network sensitivity analysis was used to retain the
relatively important attributes, with repayment behavior and RFM values chosen as the predicted
variables for all customers. As recommended by Hornik, Stinchcombe, & White (1989), a network
with one hidden layer is sufficient to model a complex system with any desired accuracy, and the
employed neural network model has just one hidden layer.

Fig. 4. Customer distribution to repayment behavior.

Table 1 lists the distribution of the relative importance for each input variable using the
neural network. The sensitivity analysis of the neural network and the order of most significant
input variables indicate those variables that

are worth looking at in more detail. Factors with a relative importance of 0.00997 and above
were used in successive customer profiling. In Table 1, Amount_of_Consumption, CardAge_Month and
CreditLine were the three most differentiating variables. On the other hand, Marital_Status,
Age_Segments and Sex were the least differentiating variables.

Table 1
Sensitivity analysis of the relative importance input variables

Neural network model
  Input layer (no. of neurons)           32
  First hidden layer (no. of neurons)    20
  Output layer (no. of neurons)          4
  Predicted accuracy                     96.16%

Relative importance to RA and RFM
  Variable name            Relative importance   Comments
  Amount_of_Consumption    0.40658               Monthly amount of consumption
  Cardage_Month            0.30048               Number of months for which the card has been held
  Creditline               0.19980               Credit line
  Total_Consumption        0.15002               Yearly amount of consumption
  Blockcod                 0.12431               Card usage limit or not
  Occupation               0.03722               Encoded field
  Cardtype                 0.02546               Encoded field
  Marital_Status           0.01779               0, single; 1, married; 2, divorced; 3, separation
  Age_Segments             0.01499               1, <25; 2, 25–30; 3, 30–35; 4, 35–40; 5, 40–45; 6, 45–50; 7, 50–60; 8, >60
  Sex                      0.00997               1, male; 0, female
  ⋮                        ⋮

3.5. Create customer profiles

The study's aim is to discover hidden patterns in bank databases so that the bank can better
understand the different characteristics of different customers and develop new strategies to
provide better service. In the previous sections, we used a behavioral scoring model to classify
customers into clusters with shared characteristics. Association rule mining was then used to
create a customer profile for each cluster. The purpose of association rule extraction is to
discover significant relationships between items or features that occur frequently in a
transaction database.

Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items. Let $DB$ be a transaction database, where
each transaction $T$ consists of a set of items such that $T \subseteq I$. Given a set of items
$X \subseteq I$, a transaction $T$ contains $X$ if and only if $X \subseteq T$.
$\mathrm{Support}(X, DB)$ denotes the rate of $X$ in $DB$. An association rule is an implication
of the form '$X \Rightarrow Y\ (s\%, c\%, l)$', where $X \subseteq I$, $Y \subseteq I$ and
$X \cap Y = \emptyset$. An association rule $X \Rightarrow Y$ holds in $DB$ with support $s\%$
if the probability of a transaction in $DB$ containing both $X$ and $Y$ is $s\%$ (i.e.
$\mathrm{Support}(X \Rightarrow Y) = \mathrm{Support}(X \cup Y, DB)$). An association rule
$X \Rightarrow Y$ holds in $DB$ with confidence $c\%$ if the probability that a transaction in
$DB$ which contains $X$ also contains $Y$ is $c\%$ (i.e.
$\mathrm{Confidence}(X \Rightarrow Y) = \mathrm{Support}(X \cup Y, DB)/\mathrm{Support}(X, DB)$).
The well-known Apriori algorithm (Agrawal et al., 1993) has been proposed for mining association
rules in a transaction database. To find association rules is to discover all the association
rules whose support is larger than a minimum support (minsup) threshold and whose confidence is
larger than a minimum confidence (minconf) threshold. The association rules must satisfy two
conditions: $\mathrm{Support}(X \Rightarrow Y) \geq \mathit{minsup}$ and
$\mathrm{Confidence}(X \Rightarrow Y) \geq \mathit{minconf}$.

When all of the association rules are generated, the simplest way to determine the positive
tendency of each association rule is to use the lift measure. Lift is the ratio of confidence to
expected confidence. Expected confidence is the number of transactions that include the
consequent divided by the total number of transactions. Suppose that we use $X \Rightarrow Y$ to
determine a customer's tendency of purchasing $Y$; then the expected confidence of the product
purchase is $\mathrm{Support}(Y, DB)$, and the lift is computed as:

$$\mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Confidence}(X \Rightarrow Y)}{\mathrm{Expected\_Confidence}(X \Rightarrow Y)}.$$

According to the SOM results, the customers fall into three major profitable groups of customers
dispersed over 16 clusters. The 10 variables derived from the sensitivity analysis were chosen
as predictor variables for association rule analysis. For simplicity of explanation, we chose
only cluster-1 and cluster-2 for mining association rules. Parameters imposed on the Apriori
association rule inducer were set to identify association rules that had at least 85% confidence
and 5% support.

Table 2 lists the cluster profile of cluster-1 in the form of association rules, where each rule
represents a customer profile that was dominant or most strongly associated with the customers
matching that cluster. For discriminating purposes, we have grouped customers with shared
behavioral characteristics. From this, marketers can create more accurate campaigns towards each
target group of customers for cross-selling and encouraging consumption. After briefly reviewing
the 16 clusters using cluster profiles, the customers whose values tend towards R↓F↑M↑ can be
targeted with greater accuracy. However, the risk arising from the different profitable groups
of customers in practical applications should be considered.

3.6. Merging redundant association rules

After customers were classified by the behavioral scoring model, the resulting clusters are then
profiled by feature attributes determined using an Apriori association

Table 2
Cluster-1 profile

Rule ID  Association rules                                                           Support  Confidence  Lift  R. to Rules
1        Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2 & CardType=100  5.6%     87.9%       1.91
2        Marital_Status=0 ⇐ Sex=1 & Age_Segments=2                                   6.7%     87.8%       1.91  1
3        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=2                         17.3%    87.8%       0.99
4        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=3                         14.8%    88.2%       1.00  7
5        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=4                         6.5%     89.1%       1.01
6        Occupation=4010 ⇐ Marital_Status=0 & CardType=113                           6.2%     94.2%       1.06
7        Occupation=4010 ⇐ Marital_Status=0 & Sex=1 & Age_Segments=3                 6.7%     87.8%       0.99
8        Occupation=4010 ⇐ Marital_Status=0 & Sex=1                                  17.9%    86.1%       0.97  7
9        Occupation=4010 ⇐ Marital_Status=0                                          46.0%    87.9%       0.99  3~8
10       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=3                         10.8%    88.0%       0.99  18
11       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=4                         13.2%    87.9%       0.99  16,19
12       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=5                         9.5%     88.8%       1.00
13       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=6                         6.2%     91.0%       1.03
14       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=7                         5.0%     89.4%       1.01
15       Occupation=4010 ⇐ Marital_Status=1 & CardType=113                           5.7%     92.7%       1.05
16       Occupation=4010 ⇐ Marital_Status=1 & Sex=1 & Age_Segments=4                 6.1%     86.6%       0.98
17       Occupation=4010 ⇐ Marital_Status=1 & Sex=1                                  22.7%    88.1%       1.00  16,20
18       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=3          5.2%     88.9%       1.01
19       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=4          5.9%     88.0%       0.99
20       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Sex=1                   10.5%    87.9%       0.99
21       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n'                           22.7%    88.4%       1.00  18~20
22       Occupation=4010 ⇐ Marital_Status=1                                          49.4%    88.5%       1.00  10~21
⋮        ⋮                                                                           ⋮        ⋮           ⋮
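
To make the Support, Confidence and Lift columns of Table 2 concrete, here is a minimal sketch of the standard association rule measures in Python. It runs on a tiny made-up transaction list, not the bank data; the helper names `support` and `rule_metrics`, the item strings and the code `Occupation=2020` are illustrative assumptions. Support here means the joint support of antecedent and consequent, and lift is confidence divided by the share of transactions containing the consequent.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Share of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    x, y = frozenset(antecedent), frozenset(consequent)
    support_xy = support(x | y, transactions)
    support_x = support(x, transactions)
    support_y = support(y, transactions)
    if support_x == 0 or support_y == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = support_xy / support_x
    lift = confidence / support_y  # expected confidence is the consequent's support
    return support_xy, confidence, lift


# Made-up customers, NOT the bank data; "Occupation=2020" is a hypothetical code.
toy_db = [
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=4010", "Marital_Status=0"}),
    frozenset({"Occupation=4010", "Marital_Status=1"}),
    frozenset({"Occupation=2020", "Marital_Status=0"}),
]
s, c, l = rule_metrics(["Marital_Status=0"], ["Occupation=4010"], toy_db)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

For this toy data the lift is below 1, meaning the antecedent makes the consequent slightly less likely than its overall rate; a lift near 1, as in most rows of Table 2, indicates that the antecedent adds little information about the consequent.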

rule inducer. An association rule is considered relevant for decision making if it has support
and confidence at least equal to some minimal support and confidence thresholds defined by the
user. As shown in Table 2, the set of extracted association rules is usually very large, due to
the presence of a huge proportion of redundant rules conveying the same information. Many of the
rules may contain redundant, irrelevant information or describe trivial knowledge. We present
interactive strategies for pruning redundant association rules on the basis of equivalence
relations to enhance readability.

Several methods have been proposed in the literature to reduce the number of extracted
association rules. Silverstein, Brin, & Motwani (1998) used Pearson's correlation statistic
measure in replacement of the confidence measure. Srikant & Agrawal (1995) defined generalized
association rules using a taxonomy of the item set. Heckerman (1996) and Silberschatz & Tuzhilin
(1996) measured the distance between association rules by evaluating the deviation according to
the rules' support and confidence. Bayardo, Agrawal, & Gunopulos (1999) used item constraints,
which are Boolean expressions defined by the user, to specify the form of association rules.
Pasquier, Bastide, Taouil, & Lakhal (1999) adapted the Duquenne-Guigues basis for global
implications, and the proper basis for partial implications, to the framework of association
rules. Klemettinen, Mannila, Ronkainen, Toivonen, & Verkamo (1994) simplified a relatively
significant number of

Table 3
The redundant-free cluster profile of cluster-1 (merged)

Rule ID  Association rules                                                           Support  Confidence  Lift
1        Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2 & CardType=100  5.6%     87.9%       1.91
3        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=2                         17.3%    87.8%       0.99
5        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=4                         6.5%     89.1%       1.01
6        Occupation=4010 ⇐ Marital_Status=0 & CardType=113                           6.2%     94.2%       1.06
7        Occupation=4010 ⇐ Marital_Status=0 & Sex=1 & Age_Segments=3                 6.7%     87.8%       0.99
12       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=5                         9.5%     88.8%       1.00
13       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=6                         6.2%     91.0%       1.03
14       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=7                         5.0%     89.4%       1.01
15       Occupation=4010 ⇐ Marital_Status=1 & CardType=113                           5.7%     92.7%       1.05
16       Occupation=4010 ⇐ Marital_Status=1 & Sex=1 & Age_Segments=4                 6.1%     86.6%       0.98
18       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=3          5.2%     88.9%       1.01
19       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=4          5.9%     88.0%       0.99
20       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Sex=1                   10.5%    87.9%       0.99
...
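
The 'R. to Rules' column of Table 2 and the rows that survive into Table 3 are consistent with a simple containment check: a rule is dropped when another rule with the same consequent has a strictly larger antecedent. A minimal sketch of that check, assuming this reading of the two tables (it is inferred from the listed rows, not an algorithm stated in the study); the rule encoding and the function names are hypothetical:

```python
from typing import Dict, List, Set, Tuple

OCC, MS0, MS1 = "Occupation=4010", "Marital_Status=0", "Marital_Status=1"
SEX, BLK, C100, C113 = "Sex=1", "Blockcod='n'", "CardType=100", "CardType=113"


def age(n: int) -> str:
    return f"Age_Segments={n}"


# Rule ID -> (consequent, antecedent items), transcribed from the rows of Table 2.
RULES: Dict[int, Tuple[str, Set[str]]] = {
    1: (MS0, {OCC, SEX, age(2), C100}),
    2: (MS0, {SEX, age(2)}),
    3: (OCC, {MS0, age(2)}),
    4: (OCC, {MS0, age(3)}),
    5: (OCC, {MS0, age(4)}),
    6: (OCC, {MS0, C113}),
    7: (OCC, {MS0, SEX, age(3)}),
    8: (OCC, {MS0, SEX}),
    9: (OCC, {MS0}),
    10: (OCC, {MS1, age(3)}),
    11: (OCC, {MS1, age(4)}),
    12: (OCC, {MS1, age(5)}),
    13: (OCC, {MS1, age(6)}),
    14: (OCC, {MS1, age(7)}),
    15: (OCC, {MS1, C113}),
    16: (OCC, {MS1, SEX, age(4)}),
    17: (OCC, {MS1, SEX}),
    18: (OCC, {MS1, BLK, age(3)}),
    19: (OCC, {MS1, BLK, age(4)}),
    20: (OCC, {MS1, BLK, SEX}),
    21: (OCC, {MS1, BLK}),
    22: (OCC, {MS1}),
}


def redundant_to(rules: Dict[int, Tuple[str, Set[str]]]) -> Dict[int, List[int]]:
    """For each rule, list the rules with the same consequent whose
    antecedent is a strict superset of this rule's antecedent."""
    return {
        rid: [other for other, (cons2, ante2) in rules.items()
              if other != rid and cons2 == cons and ante < ante2]
        for rid, (cons, ante) in rules.items()
    }


def prune(rules: Dict[int, Tuple[str, Set[str]]]) -> List[int]:
    """IDs of the rules that are not flagged as redundant to any other rule."""
    return [rid for rid, covers in redundant_to(rules).items() if not covers]


print(redundant_to(RULES)[9])
print(prune(RULES))
```

On the 22 listed rows, the surviving IDs are the Rule IDs shown in Table 3; rows hidden behind the table's ellipsis are not covered by this sketch.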

Table 4
The redundant-free cluster profile of cluster-2 (merged)

Rule ID  Association rules                                            Support  Confidence  Lift
2        Marital_Status=0 ⇐ CardType=100 & Age_Segments=2             6.3%     86.1%       1.87
3        Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2  5.3%     85.7%       1.86
6        Marital_Status=0 ⇐ Blockcod='n' & Sex=1 & Age_Segments=2     5.3%     85.6%       1.86
7        Occupation=4010 ⇐ CardType=100 & Marital_Status=2            8.3%     92.7%       1.35
8        Occupation=4010 ⇐ CardType=113                               6.7%     93.8%       1.36
9        Occupation=4010 ⇐ CardType=821                               5.5%     97.0%       1.41
...

association rules via a visualization technique. Bastide, Pasquier, Taouil, Stumme, & Lakhal (2000) used the Galois connection as a basis to discover minimal non-redundant association rules. Bayardo & Agrawal (1999) proposed A-maximal rules, which are rules with maximal antecedents, i.e. rules for which adding an item to the antecedent reduces the population of objects concerned.

We intend to provide strategies to preserve useful, relevant and non-redundant association rules. Thus, redundant rules, which in certain databases represent the majority of extracted rules, particularly in the case of dense or correlated data for which the total number of valid rules is very large, will be pruned. Using the concept of equivalence class, the redundant rules will be collected in the same equivalence class. Only the most informative non-redundant association rules will be presented to the user, where the union of the antecedents (or consequents) is equal to the union of the antecedents (or consequents) of all the association rules valid in the context. The resulting rules will have minimal antecedents and maximal consequents in the same equivalence class. Extracting such a set of rules, without any loss of information, conveys all the information in the set of association rules that are valid according to the context. With this method it is possible to deduce efficiently, without access to the original dataset, all valid association rules with their supports and confidences from these bases.

An association rule X1 ⇒ Y1 is redundant-free if and only if there does not exist another association rule X2 ⇒ Y2 such that X2 ⊆ X1 and Y1 ⊆ Y2. For example, in Table 2, rule 9 is redundant to rules 3–8, because rule 9 does not convey additional information to the user. Therefore, rule 9 can be removed from the cluster profile. The two rule merging principles are illustrated below.

(1) Let X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2) be two association rules in the same cluster profile, where X2 ⊆ X1 or Y1 ⊆ Y2. Then X1 ⇒ Y1 (s1%, c1%, l1) is a redundant association rule and can be removed directly from the cluster profile. For example, Table 3 presents the redundant-free cluster profile of cluster-1 (Table 2). The last field in Table 2, 'R. to Rules', indicates the corresponding redundant association rules.

(2) Let X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2) be two association rules in different cluster profiles, where X2 ⊆ X1 or Y1 ⊆ Y2, and let t1 and t2 be the numbers of cases representing X1 ⇒ Y1 (s1%, c1%, l1) and X2 ⇒ Y2 (s2%, c2%, l2), respectively. Then X1 ⇒ Y1 (s1%, c1%, l1) is a redundant association rule and should be removed from
Table 5
The redundant-free cluster profile of cluster-1 and cluster-2 (merged)

Rule ID  Association rules                                                           Support  Confidence  Lift  U. to Rule
1        Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2 & CardType=100  5.58%    87.9%       1.91  (c2,id2), (c2,id3)
3        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=2                         17.3%    87.8%       0.99
5        Occupation=4010 ⇐ Marital_Status=0 & Age_Segments=4                         6.5%     89.1%       1.01
6        Occupation=4010 ⇐ Marital_Status=0 & CardType=113                           6.28%    94.1%       1.10  (c2,id8)
7        Occupation=4010 ⇐ Marital_Status=0 & Sex=1 & Age_Segments=3                 6.7%     87.8%       0.99
12       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=5                         9.5%     88.8%       1.00
13       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=6                         6.2%     91.0%       1.03
14       Occupation=4010 ⇐ Marital_Status=1 & Age_Segments=7                         5.0%     89.4%       1.01
15       Occupation=4010 ⇐ Marital_Status=1 & CardType=113                           5.88%    92.9%       1.09  (c2,id8)
16       Occupation=4010 ⇐ Marital_Status=1 & Sex=1 & Age_Segments=4                 6.1%     86.6%       0.98
18       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=3          5.2%     88.9%       1.01
19       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Age_Segments=4          5.9%     88.0%       0.99
20       Occupation=4010 ⇐ Marital_Status=1 & Blockcod='n' & Sex=1                   10.5%    87.9%       0.99
6        Marital_Status=0 ⇐ Blockcod='n' & Sex=1 & Age_Segments=2                    5.3%     85.6%       1.86
7        Occupation=4010 ⇐ CardType=100 & Marital_Status=2                           8.3%     92.7%       1.35
9        Occupation=4010 ⇐ CardType=821                                              5.5%     97.0%       1.41
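
The updated values in Table 5 combine the statistics of a rule pair across two cluster databases, weighting each cluster by its number of cases. A minimal sketch of such a case-weighted update, assuming t1 and t2 are the numbers of cases in the two cluster databases and that the supports of X ∪ Y, X and Y are known for each cluster; the class name, field names and all input numbers are hypothetical and are not values from the study:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RuleStats:
    cases: int         # t: number of cases in the cluster database (assumed meaning)
    support_xy: float  # Support(X ∪ Y, DB)
    support_x: float   # Support(X, DB), the antecedent
    support_y: float   # Support(Y, DB), the consequent


def merge(r1: RuleStats, r2: RuleStats) -> Tuple[float, float, float]:
    """Return the case-weighted (support, confidence, lift) of two rules
    that are treated as one rule across both cluster databases."""
    total = r1.cases + r2.cases
    joint = r1.cases * r1.support_xy + r2.cases * r2.support_xy
    s = joint / total
    c = joint / (r1.cases * r1.support_x + r2.cases * r2.support_x)
    pooled_y = (r1.cases * r1.support_y + r2.cases * r2.support_y) / total
    lift = c / pooled_y
    return s, c, lift


# Hypothetical inputs, for illustration only.
rule_c1 = RuleStats(cases=1000, support_xy=0.056, support_x=0.064, support_y=0.46)
rule_c2 = RuleStats(cases=500, support_xy=0.053, support_x=0.062, support_y=0.46)

s, c, lift = merge(rule_c1, rule_c2)
print(f"support={s:.3%} confidence={c:.1%} lift={lift:.2f}")
```

Because every term is weighted by the case counts, the merged support always lies between the two original supports, and merging a rule with an identical copy of itself leaves all three values unchanged.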

the cluster profile. The three judgment standards, support, confidence and lift, of X2 ⇒ Y2 (s′%, c′%, l′) are updated as:

$$
X_2 \Rightarrow Y_2:\quad
\begin{cases}
s' = \dfrac{t_1\,\mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2\,\mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1 + t_2},\\[2ex]
c' = \dfrac{t_1\,\mathrm{Support}(X_1 \cup Y_1, DB_1) + t_2\,\mathrm{Support}(X_2 \cup Y_2, DB_2)}{t_1\,\mathrm{Support}(X_1, DB_1) + t_2\,\mathrm{Support}(X_2, DB_2)},\\[2ex]
l' = \dfrac{c'}{\dfrac{t_1\,\mathrm{Support}(Y_1, DB_1) + t_2\,\mathrm{Support}(Y_2, DB_2)}{t_1 + t_2}}.
\end{cases}
$$

For example, Tables 3 and 4 present the redundant-free cluster profiles of cluster-1 and cluster-2, respectively. Suppose that these two tables are the customer profiles of the 'transactor user'; the (c2, id2) rule in Table 4 is then a redundant rule to the (c1, id1) rule in Table 3, so it can be removed from Table 4, and the judgment standards of the first rule in Table 3 are updated as: Marital_Status=0 ⇐ Occupation=4010 & Sex=1 & Age_Segments=2 & CardType=100 (5.6%, 87.8%, 1.90). The judgment standards of the remaining redundant association rules are updated accordingly, as in Table 5. Here, (cx, idy) denotes the association rule with Rule ID y in cluster-x, and the last field in Table 5, 'U. to Rule', indicates the association rule according to which the judgment standards were updated.

4. Conclusion

Credit and behavioral scoring have become useful tools for modeling financial problems. However, most studies have concentrated on building an accurate credit scoring model to decide whether or not to grant credit to new applicants. In order to strengthen customer behavior management for existing credit card customers, we created a behavioral scoring model using neural networks and an association rule inducer. The existing customers were divided into three profitable groups of customers according to their shared behavior and characteristics. Marketers can then infer the profiles of customers in each group and propose management strategies appropriate to the characteristics of each group. This study provides a practical method of analyzing bank databases. Beyond simply understanding customer value, the bank gains the opportunity to establish better customer relationships while increasing customer loyalty and revenue. Additionally, this two-stage behavioral scoring model can also be applied to the account database to predict personal bankruptcy among bank customers. Further research may aim at time-series behavioral scoring models that include the change of credit status in every period. Credit card customers could be segmented into more subgroups according to newly developed predictors. Thanks to this paper and many others, more detailed management and marketing strategies can be implemented according to more detailed customer sub-groups.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the SIGMOD'93, Washington, DC, 207–216.
Baesens, B., Viaene, S., Poel, D., Vanthienen, J., & Dedene, G. (2002). Bayesian neural network for repeat purchase modelling in direct marketing. European Journal of Operational Research, 138, 191–211.
Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S., & Lewis, P. A. (1996). Comparative performance of the FSCL neural net and K-means algorithm for market segmentation. European Journal of Operational Research, 93, 346–357.
Bastide, Y., Pasquier, N., Taouil, R., Stumme, G., & Lakhal, L. (2000). Mining minimal non-redundant association rules using frequent closed item sets. Lecture Notes in Computer Science, 1861, 972–986.
Bayardo, R. J., & Agrawal, R. (1999). Mining the most interesting rules. Proceedings of KDD Conference, 145–154.
Bayardo, R. J., Agrawal, R., & Gunopulos, D. (1999). Constraint-based rule mining in large, dense databases. Proceedings of ICDE Conference, 188–197.
Bult, J. R., & Wansbeek, T. (1995). Optimal selection for direct mail. Marketing Science, 14(4), 378–381.
Chen, M. C., & Huang, S. H. (2003). Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications, 24, 433–441.
Dasgupta, C. G., Dispensa, G. S., & Ghose, S. (1994). Comparing the predictive performance of a neural network model with some traditional market response models. International Journal of Forecasting, 10, 235–244.
Davies, F., Moutinho, L., & Curry, B. (1996). ATM attitudes: a neural network analysis. Marketing Intelligence and Planning, 14(2), 26–32.
Desai, V. S., Crook, J. N., & Overstreet, G. A., Jr. (1996). A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95, 24–37.
Donato, J. M., Schryver, J. C., Hinkel, G. C., Schmoyer, R. L., Leuze, M. R., & Grandy, N. W. (1999). Mining multi-dimensional data for decision support. Future Generation Computer Systems, 15, 433–441.
Dyche, J., & Dych, J. (2001). The CRM handbook: a business guide to customer relationship management. Reading, MA: Addison-Wesley.
Fish, K. E., Barnes, J. H., & Aiken, M. W. (1995). Artificial neural networks - a new methodology for industrial market segmentation. Industrial Marketing Management, 24, 431–438.
Hand, D. J. (1981). Discrimination and classification. New York: Wiley.
Heckerman, D. (1996). Bayesian networks for knowledge discovery. Advances in knowledge discovery and data mining, 273–305.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximations. Neural Networks, 2, 336–359.

Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical analysis (4th Ed.). Upper Saddle River, NJ: Prentice-Hall.
Kim, Y. S., & Sohn, S. Y. (2004). Managing loan customers using misclassification patterns of credit scoring model. Expert Systems with Applications, 26, 567–573.
Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., & Verkamo, A. I. (1994). Finding interesting rules from large sets of discovered association rules. Proceedings of CIKM Conference, 401–407.
Kohonen, T. (1995). Self-organizing maps. Berlin: Springer.
Lancher, R. C., Coats, P. K., Shanker, C. S., & Fant, L. F. (1995). A neural network for classifying the financial health of a firm. European Journal of Operational Research, 85(1), 53–65.
Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using the hybrid neural discriminate technique. Expert Systems with Applications, 23, 245–254.
Malhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using neural networks. Omega, 31(2), 83–96.
Mazanec, J. A. (1992). Classifying tourists into market segments: a neural network approach. Journal of Travel and Tourism Marketing, 1(1), 39–59.
Morrison, D. F. (1990). Multivariate statistical methods. New York, NY: McGraw-Hill.
Moutinho, L., Davies, F., & Curry, B. (1996). The impact of gender on car buyer satisfaction and loyalty. Journal of Retailing and Consumer Sciences, 3(3), 135–144.
Pasquier, N., Bastide, Y., Taouil, R., & Lakhal, L. (1999). Closed set based discovery of small covers for association rules. Proceedings of BDA Conference, 361–381.
Setiono, R., Thong, J. Y. L., & Yap, C. S. (1998). Symbolic rule extraction from neural networks - an application to identifying organizations adopting IT. Information and Management, 34(2), 91–101.
Sharda, R., & Wilson, R. (1996). Neural network experiments in business failures predication: a review of predictive performance issues. International Journal of Computational Intelligence and Organizations, 1(2), 107–117.
Silberschatz, A., & Tuzhilin, A. (1996). What makes patterns interesting in knowledge discovery systems. IEEE Transactions on Knowledge and Data Engineering, 8(6), 970–974.
Silverstein, C., Brin, S., & Motwani, R. (1998). Beyond market baskets: generalizing association rules to dependence rules. Data Mining and Knowledge Discovery, 2(1), 39–68.
Srikant, R., & Agrawal, R. (1995). Mining generalized association rules. Proceedings of VLDB Conference, 407–419.
Thomas, L. C. (2000). A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers. International Journal of Forecasting, 16, 149–172.
Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in business: a survey of applications (1992–1998). Expert Systems with Applications, 17, 51–70.
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: general framework and cross-validation analysis. European Journal of Operational Research, 116, 16–32.
Zurada, J. M., Malinowski, A., & Cloete, I. (1994). Sensitivity analysis for minimization of input data dimension for feedforward neural network. IEEE International Symposium on Circuits and Systems, London, May 20–June 3.
