
Article

Hybrid genetic algorithms and case-based reasoning systems for customer classification
Hyunchul Ahn,1 Kyoung-jae Kim2 and Ingoo Han1
(1) Graduate School of Management, Korea Advanced Institute of Science and Technology, 207-43 Cheongrangri-Dong, Dongdaemun-Gu, Seoul 130-722, South Korea E-mails: hcahn@kaist.ac.kr; ighan@kgsm.kaist.ac.kr (2) Department of Management Information Systems, Dongguk University, 3-26 Pil-Dong, Chung-Gu, Seoul 100-715, South Korea E-mail: kjkim@dongguk.edu

Abstract: Because of its convenience and strength in complex problem solving, case-based reasoning (CBR) has been widely used in various areas. One of these areas is customer classification, which classifies customers into either purchasing or non-purchasing groups. Nonetheless, compared to other machine learning techniques, CBR has been criticized because of its low prediction accuracy. Generally, in order to obtain successful results from CBR, effective retrieval of useful prior cases for the given problem is essential. However, designing a good matching and retrieval mechanism for CBR systems is still a controversial research issue. Most previous studies have tried to optimize the weights of the features or the selection process of appropriate instances, but these approaches have been performed independently until now. Simultaneous optimization of these components may lead to better performance than naive models. In particular, there have been few attempts to simultaneously optimize the weights of the features and the selection of instances for CBR. Here we suggest a simultaneous optimization model of these components using a genetic algorithm. To validate the usefulness of our approach, we apply it to two real-world cases of customer classification. Experimental results show that simultaneously optimized CBR may improve the classification accuracy and outperform various optimized models of CBR as well as other classification models including logistic regression, multiple discriminant analysis, artificial neural networks and support vector machines.

Keywords: case-based reasoning, genetic algorithms, feature weighting, instance selection, customer classification, customer relationship management

1. Introduction
One of the important issues in customer relationship management is customer classification, by which a company classifies its customers into predefined groups with similar behavior patterns. Usually, companies build a customer classification model to find the prospects for a specific product. In this case, they classify prospects into either purchasing or non-purchasing groups. This kind of knowledge may create a variety of marketing opportunities for the company such as one-to-one marketing, direct mailing, and sales promotion via telephone or e-mail. Consequently, many leading companies including Ford (automobile company), Allstate (insurance company) and 1-800-flowers.com (dot-com company) analyze their customers' profiles and buying behavior, and build their customer classification models based on the probability of product purchase.¹

¹SAS Institute, Success Stories, http://www.sas.com/success/.


In this study, we suggest a customer classification model for the direct marketing of an Internet shopping mall which uses collected information about customers as inputs to predict product purchase. As an implementation algorithm for customer classification, case-based reasoning (CBR) has been popular because it is easy to apply and maintain (Chiu, 2002). However, CBR has a critical limitation as a classification technique: its prediction accuracy is generally much lower than that of other artificial intelligence techniques such as artificial neural networks and support vector machines. Thus, an important research issue is to improve the classification accuracy when applying CBR to customer classification. In this study, we suggest a hybrid CBR model as a tool to improve the classification accuracy.

Typical CBR is a problem-solving technique that reuses past experiences to find a solution. It often improves the effectiveness of complex and unstructured decision making, and so it has been applied to various problem-solving areas including manufacturing, finance and marketing (see Shin & Han, 1999; Kim & Han, 2001; Chiu, 2002; Yin et al., 2002; Chiu et al., 2003). However, it is not easy to obtain successful results with high classification accuracy by applying CBR, because typical CBR provides no mechanism for designing effective systems. In particular, it is very important to design an appropriate mechanism for case retrieval. In this respect, the selection of appropriate feature and instance subsets and the determination of feature weights in the case retrieval step have been the most popular research issues (see Wang & Ishii, 1997; Shin & Han, 1999; Kim & Han, 2001; Chiu, 2002; Kim, 2004).

Recently, simultaneous optimization of several components of CBR has attracted the interest of researchers. Pioneering studies proposed combining feature selection and instance selection simultaneously (Kuncheva & Jain, 1999; Rozsypal & Kubat, 2003). Feature weighting assigns each feature a weight between 0 and 1, whereas feature selection is just a binary choice, 0 or 1, so feature weighting subsumes feature selection. Consequently, feature weighting may improve the effectiveness of CBR systems more than feature selection. Nonetheless, there have been few attempts to optimize feature weighting and instance selection simultaneously. In this paper, we propose a new hybrid approach using genetic algorithms (GAs) to optimize feature weights and instance selection simultaneously. In addition, we apply the proposed model to two real-world cases of customer classification and present experimental results from the application.

The paper is organized as follows. Section 2 provides a brief review of previous research and the next section describes our proposed model, the GA approach to optimizing feature weighting and instance selection simultaneously. In Section 4, the research design and experiments are explained. In the fifth section, the empirical results are summarized and discussed. The final section presents the contributions and limitations of this study.

2. Previous research
In this section, we first review the basic concepts of CBR and GAs. After that, we introduce previous studies that attempt to optimize components of CBR such as feature selection (weighting) and instance selection. Finally, we review some studies that tried to optimize the parameters of a CBR system simultaneously.

2.1. GAs as an optimization tool for CBR

The basic idea of CBR is to find solutions to new problems by adapting solutions that have been used in the past. Although most artificial intelligence techniques pursue generalized relationships between problem descriptors and conclusions, CBR just refers to specific knowledge of previously experienced, concrete problem situations, so it is effective for complex and unstructured problems and easy to update (Shin & Han, 1999). Because of these strengths, CBR has been applied to various areas of problem solving, e.g. intelligent product catalogs for Internet shopping malls, conflict resolution in air traffic control, medical diagnosis and even the design of semiconductors (Turban & Aronson, 2001).

The general process of CBR is shown graphically in Figure 1 (Turban & Aronson, 2001). Among the steps of the flowchart, the second process, case retrieval, is the most important step because the performance of CBR systems usually depends on it (Kolodner, 1993). In this step, the CBR system retrieves the most similar cases from the case memory, and these become the basis for the solution of the input problem. Thus, it is crucial to determine appropriate similar cases. A schematic sketch of this cycle is given below.
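To make the cycle concrete, the following is a minimal Python sketch of the loop in Figure 1. The retrieve, modify, test and repair callbacks are placeholders for whatever matching and adaptation logic a concrete system supplies; they are illustrative assumptions, not part of the original paper.

```python
def cbr_cycle(problem, case_memory, retrieve, modify, test_fn, repair):
    """Schematic of Figure 1: retrieve the most similar prior case (2),
    modify its solution (3), test the proposed solution (4); accepted
    solutions are stored as new cases (5), rejected ones are explained
    and repaired (6) before being added to the case memory."""
    prior = retrieve(problem, case_memory)        # 2. retrieve
    proposed = modify(prior, problem)             # 3. modify
    accepted, failure = test_fn(proposed)         # 4. test
    if not accepted:
        proposed = repair(proposed, failure)      # 6a/6b. explain and repair
    case_memory.append((problem, proposed))       # 5. add the new case
    return proposed
```

Everything in this paper concerns step 2: how the "most similar" prior cases are found.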

In particular, feature weighting (selection) and instance selection for measuring similarity have been controversial issues in designing CBR systems. There have been many studies to determine these factors. Among many methods of instance selection and feature weighting, GAs are increasingly being used in CBR systems. The GA is a popular optimization method that attempts to incorporate ideas of natural evolution. Its procedure improves the search results by constantly trying various possible solutions with some kinds of genetic operations. In general, the process of the GA proceeds as follows.

[Figure 1 shows the CBR flowchart: (1) the input is assigned indexes and matched against the case memory (case base); (2) the most similar prior case is retrieved; (3) its prior solution is modified into a proposed solution; (4) the proposed solution is tested; when accepted, the working solution is assigned indexes (5a) and stored (5b) as a new case; when rejected, the failure description is explained via causal analysis (6a) and repaired (6b), and the failed case is revised (6) before the new solution is added to the case memory.]

Figure 1: CBR flowchart.



First, the GA generates a set of solutions randomly, which is called an initial population. Each solution is called a chromosome and is usually in the form of a binary string. After the generation of the initial population, a new population is formed that consists of the fittest chromosomes as well as offspring of these chromosomes, based on the notion of survival of the fittest. The fitness value of each chromosome is calculated from a user-defined function; typically, classification accuracy (performance) is used as the fitness function for classification problems. In general, offspring are generated by applying genetic operators. Among the various genetic operators, selection, crossover and mutation are the most fundamental and popular. The selection operator determines which chromosomes will survive. In crossover, substrings from pairs of chromosomes are exchanged to form new pairs of chromosomes. In mutation, with a very small mutation rate, arbitrarily selected bits in a chromosome are inverted. These steps of evolution continue until the stopping conditions are satisfied (Han & Kamber, 2001; Chiu, 2002).

2.2. Feature selection and weighting approaches

Features are the elements that characterize each case in CBR. Thus, how features are defined and how their relative importance is determined affect the prediction results of CBR considerably, because the similarities between cases vary according to the features and their weights. Consequently, appropriate settings for the features in CBR have been one of the most popular research topics. Feature selection is the process of picking a subset of features that are relevant to the target concept and removing irrelevant or redundant features. Feature weighting is assigning a weight to each feature according to its relative importance. Feature weighting can reflect the relative importance with sophistication, but feature selection just determines whether the model includes a specific feature or not. That is, feature selection is a special case of feature weighting. These are important factors that determine the performance of most classification models, including CBR.

In the case of feature selection, Siedlecki and Sklanski (1989) proposed a feature selection algorithm based on genetic search, and Cardie (1993) presented a decision tree approach to feature subset selection. Skalak (1994) and Domingos (1997) proposed a hill climbing algorithm and a clustering method respectively for feature subset selection. In addition, Cardie and Howe (1997) used a mixed feature weighting and feature subset selection method: they first selected relevant features using a decision tree, and then assigned weights to the remaining features using the value of information gain for each feature. Jarmulak et al. (2000) selected relevant features using a decision tree algorithm including C4.5 and assigned feature weights using a GA. Regarding feature weighting, Wettschereck et al. (1997) presented various feature weighting methods based on distance metrics in the machine learning literature. Kelly and Davis (1991) proposed a GA-based feature weighting method for k-nearest neighbors. Similar methods have been applied to the prediction of corporate bond ratings (Shin & Han, 1999) and to failure-mechanism identification (Liao et al., 2000). In addition, Kim and Shin (2000) presented feature weighting methods based on GAs and artificial neural networks.

2.3. Instance selection approaches

Instance selection is the technique of selecting an appropriate reduced subset of a case base and applying the nearest-neighbor rule to the selected subset. Reducing the whole case base to a small subset that consists only of representative cases benefits conventional CBR systems in two ways. First, it reduces the search space, which saves computing time spent searching for nearest neighbors. Second, it produces quality results because it may eliminate noise in the case base. It has therefore long been another popular research issue in CBR systems.

There exist many different approaches for selecting appropriate instances. First, Hart (1968) suggested a condensed nearest-neighbor algorithm and Wilson (1972) proposed Wilson's method. Their algorithms are based on simple information gain, so they are easy to apply. Recent studies of instance selection use mathematical tools or artificial intelligence techniques to improve accuracy. For example, Sanchez et al. (1997) proposed a proximity graph approach, and Lipowezky (1998) suggested linear programming methods for instance selection. Yan (1993) and Huang et al. (2002) presented artificial-neural-network-based instance selection methods. In addition, a GA approach was proposed by Babu and Murty (2001).

2.4. Simultaneous optimization approaches

The first simultaneous optimization approach was proposed by Kuncheva and Jain (1999), so few studies exist because of the short time that has elapsed since. Kuncheva and Jain proposed simultaneous optimization of feature selection and instance selection using a GA, and compared their model to a sequential combination of traditional feature selection and instance selection algorithms. Rozsypal and Kubat (2003) also tried simultaneous optimization of feature and instance selection using a GA, but they differentiated their model by applying the value encoding method and a more effective design of the fitness function. They showed that their model outperforms the model of Kuncheva and Jain.

As mentioned, feature weighting includes feature selection, since selection is a special case of weighting with binary weights. Consequently, a simultaneous optimization model of feature weighting and instance selection may improve on the performance of a model of feature and instance selection. In this vein, Yu et al. (2003) proposed a global optimization model of feature weighting and instance selection for collaborative filtering. Collaborative filtering is a recommendation algorithm that is very similar to CBR but differs from it in essence. Furthermore, they applied not artificial intelligence techniques but an information-theoretic approach to the optimization model. Thus, in the strict sense of the words, their model is not a simultaneous optimization model but a sequential combination of the two approaches.
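As an illustration of the GA process described in Section 2.1, here is a minimal sketch in Python. It is a stand-in for the commercial GA tool used later in the paper, not the authors' implementation; the toy fitness function and the elite fraction are assumptions for the example, while the crossover rate, mutation rate, population size and generation count echo the settings reported in Section 4.2.

```python
import random

def evolve(fitness, n_bits, pop_size=100, cx_rate=0.7, mut_rate=0.1, generations=20):
    """Minimal GA: binary chromosomes, survival of the fittest,
    uniform crossover and bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        elite = scored[: pop_size // 2]          # fittest chromosomes survive
        children = []
        while len(elite) + len(children) < pop_size:
            mom, dad = random.sample(elite, 2)
            if random.random() < cx_rate:        # uniform crossover: each bit
                child = [random.choice(pair) for pair in zip(mom, dad)]
            else:
                child = mom[:]
            for i in range(n_bits):              # bit-flip mutation
                if random.random() <= mut_rate:
                    child[i] = 1 - child[i]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Toy fitness (placeholder): count of 1 bits. The paper instead uses the
# classification accuracy of the CBR system on the test set (Section 3).
best = evolve(fitness=sum, n_bits=32)
```

The same loop serves any chromosome encoding; only the fitness callback changes.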

3. GAs for simultaneous feature weighting and instance selection


In order to enhance the performance of typical CBR systems, this study proposes a GA as a simultaneous optimization tool for feature weighting and instance selection. Our proposed model employs a GA to select a relevant instance subset and to optimize the weights of each feature simultaneously using the reference and test case bases. We name it FWISCBR (feature weighting and instance selection simultaneously for CBR) in this study. The framework of FWISCBR is shown in Figure 2. The detailed explanation of each phase of FWISCBR is as follows.

[Figure 2 shows the framework of FWISCBR: genetic learning (selection/crossover/mutation) supplies candidate parameters; cases are indexed and retrieved from the reference case base to generate solutions for the test cases, which feed the fitness function; the final parameters are then validated by generating solutions for the hold-out cases.]

Figure 2: Framework of FWISCBR.

In the first phase, the system searches the space to find optimal or near-optimal parameters (feature weights and selection codes for each instance). To apply the GA to search for these optimal parameters, they have to be coded on a chromosome, a form of binary string. The structure of the chromosomes for FWISCBR is presented in Figure 3. As shown in Figure 3, the length of each chromosome for FWISCBR is (14 × k) + n bits, where k is the number of features and n is the number of instances. Here, we set the feature weights as precisely as 1/10,000. They range from 0 to 1, so 14 binary bits are required to express them with 1/10,000th precision because 8192 = 2^13 < 10,000 ≤ 2^14 = 16,384 (Michalewicz, 1996). These 14 bit binary numbers are transformed into decimal floating-point numbers, ranging from 0 to 1, by applying the equation

x' = x/(2^14 − 1) = x/16,383    (1)

where x is the decimal value of the binary code for each feature weight.

[Figure 3 shows the gene structure for FWISCBR: a sample chromosome consisting of k 14-bit feature-weight genes (V_1, V_2, ..., V_k) followed by n single-bit instance-selection genes (I_1, ..., I_n).]

Figure 3: Gene structure for FWISCBR.

For example, the binary code for feature 1 of the sample chromosome in Figure 3 is (10110010010110)_2. Its decimal value is (11414)_10 and it is interpreted as

11414/16,383 = 0.696697796 ≈ 0.6967

The value of the code for instance selection is set to 0 or 1: 0 means the corresponding instance is not selected and 1 means it is selected. A sign for each instance selection requires just 1 bit, so n bits are required to implement instance selection by the GA, where n is the total number of instances.
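A small Python sketch of this decoding step follows; decode_chromosome is a hypothetical helper that mirrors equation (1) and the instance-selection flags.

```python
def decode_chromosome(bits, n_features, n_instances):
    """Split a GA chromosome into feature weights and instance flags.
    The first 14*k bits hold k feature weights (equation (1)); the
    remaining n bits are 0/1 instance-selection flags."""
    weights = []
    for f in range(n_features):
        gene = bits[14 * f : 14 * (f + 1)]
        x = int("".join(map(str, gene)), 2)      # 14-bit binary -> integer
        weights.append(x / (2 ** 14 - 1))        # x' = x / 16383, eq. (1)
    flags = bits[14 * n_features :]              # 1 = instance selected
    return weights, flags

# Worked example from the text: (10110010010110)_2 = 11414 -> ~0.6967
w, _ = decode_chromosome([1,0,1,1,0,0,1,0,0,1,0,1,1,0], 1, 0)
print(round(w[0], 4))  # 0.6967
```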

The population (a set of seed chromosomes for finding optimal parameters) is initialized with random values before the search process. The population is searched to find the encoded chromosome maximizing the specific fitness function. The objective of the study is to determine appropriate feature weights and to select relevant instances for the CBR system that produce the highest classification accuracy for the test data. Thus, we set the classification accuracy for the test data as the fitness function for the GA (Shin & Han, 1999; Kim, 2004). The fitness function f_T for the test set T can be expressed as

f_T = (1/n) \sum_{i=1}^{n} CA_i    (2)

CA_i = 1 if PO_i = AO_i for item I_i
CA_i = 0 if PO_i ≠ AO_i for item I_i

where CA_i represents whether the resulting class is correct or not for the ith test case I_i, which is denoted by 1 (correct) or 0 (incorrect), PO_i is the classified output from the model for the ith test case, AO_i is the actual output for the ith test case, and the test set T is {I_1, I_2, I_3, . . ., I_n}.

In the second phase, the parameters that are set in the first stage are applied to the CBR system and the general reasoning process of CBR begins. As shown in equation (3), the similarity between an input case and stored cases is calculated as the weighted sum of feature similarities (Watson, 1997; Jarmulak et al., 2000). As a similarity function, we use the Euclidean distance for each feature:

similarity = \sum_{i=1}^{n} W_i × sim(f_i^I, f_i^R) / \sum_{i=1}^{n} W_i    (3)

where W_i is the weight of the ith feature, f_i^I is the value of the ith feature for the input case, f_i^R is the value of the ith feature for the retrieved case and sim( ) is the similarity function for f_i^I and f_i^R.

In order to mitigate the size effect of each feature, we apply min-max normalization to all of the non-Boolean variables. Min-max normalization performs a linear transformation on the original data. Suppose that min_A and max_A are the minimum and maximum values of attribute A. Min-max normalization maps a value v of A to v' in the range [new_min_A, new_max_A] by computing

v' = ((v − min_A)/(max_A − min_A)) × (new_max_A − new_min_A) + new_min_A    (4)

Min-max normalization is usually employed to enhance the performance of CBR systems because it ensures that larger-value input features do not overwhelm smaller-value input features (Han & Kamber, 2001). In this study, all the features of our data sets range from 0 to 1. Also, we use k-nearest-neighbor (k-NN) matching as the method of case retrieval. The system searches for the k nearest neighbors of an input case and suggests the final classification result by letting them vote. In this paper, we determine the k parameter of k-NN that shows the best prediction accuracy among 1-NN, 3-NN, 5-NN, 7-NN and 9-NN. After the reasoning process has been applied to all the test cases, the value of the fitness function f_T for the test set T is updated.

In the third phase, the GA's evolution proceeds in the direction that maximizes the value of the fitness function. It includes selection of the fittest, crossover and mutation. The second and third phases are iterated repeatedly until the stopping conditions are satisfied. In the last stage, the system determines the parameters (the optimal weights of features and the selection of instances) whose performance for the test data is the best. It applies them to the hold-out data to check the generalizability of the selected parameters. Sometimes, parameters optimized by the GA do not fit unknown data. Thus, this phase is required to check for the possibility of overfitting.
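The following Python sketch (using NumPy) illustrates the second-phase computations under stated assumptions. In particular, turning the per-feature Euclidean distance into a similarity via 1 − |difference| is our reading, since the paper does not spell out the sim( ) function; the function and variable names are hypothetical.

```python
import numpy as np

def min_max(X, new_min=0.0, new_max=1.0):
    """Equation (4): linear rescaling of each column to [new_min, new_max]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0) * (new_max - new_min) + new_min

def classify(query, cases, labels, weights, flags, k=3):
    """Weighted k-NN retrieval, equation (3): per-feature similarities are
    weighted, summed and normalized; the k most similar of the GA-selected
    instances vote on the class (features assumed already scaled to [0, 1])."""
    w = np.asarray(weights, dtype=float)
    keep = np.asarray(flags, dtype=bool)            # GA instance selection
    cases, labels = cases[keep], labels[keep]
    sim = 1.0 - np.abs(cases - query)               # assumed per-feature sim()
    scores = (sim * w).sum(axis=1) / w.sum()        # equation (3)
    top = np.argsort(scores)[-k:]                   # k nearest neighbors
    return int(labels[top].sum() * 2 > k)           # majority vote (binary)

def fitness(test_X, test_y, cases, labels, weights, flags):
    """Equation (2): share of test cases classified correctly (GA fitness)."""
    preds = [classify(x, cases, labels, weights, flags) for x in test_X]
    return float(np.mean(np.asarray(preds) == test_y))
```

Passing all-ones weights and flags reproduces conventional CBR, so the same routine can serve every CBR variant compared in Section 4.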

4. The research design and experiments


4.1. Application data: two cases

To validate the usefulness of our approach, we applied our proposed CBR model to two real-world data sets. The first data set was collected from an online diet portal site in Korea that offers all kinds of services for online dieting, such as information, community services and a shopping mall. For convenience, we call this case Case 1 hereafter. In the case of online diet portals, the customers usually have a clear objective for using the service and they are very sensitive to personalized information services. Therefore, they are eager to give their personal information in order to get more precise service from the site. As a result, the target company of our research possesses detailed and accurate customer information, and it wants to use this information as a source of direct marketing for selling its products. Based on this motivation, we built a customer classification model to predict the potential buyers of the company's products among the users of its website.

The data set for Case 1 includes 980 cases that consist of purchasing and non-purchasing customers. It contains demographic variables and the status of purchase or non-purchase for the corresponding user. The status of purchase for each user is categorized as 0 or 1 and is used as the dependent variable: 0 means that the user has not purchased the company's products and 1 means that he or she made a purchase. We collected 46 independent variables in total, including demographic and other personal information. To eliminate irrelevant variables and make the reasoning process more efficient, we adopted two statistical methods: two-sample t tests for ratio variables and χ² tests for nominal variables. Finally, we selected only the 14 factors that proved to be the most influential in the purchase of the company's products.

The second data set for our validation is the public data from the Ninth GVU's WWW User Survey, part of which concerns Internet users' opinions on online shopping (Kehoe et al., 1998). This case is called Case 2 hereafter. The data set consists of 53 independent features on general user opinions of using the World Wide Web for shopping and on user opinions of Web-based vendors, plus demographic information, for 800 Internet users. The dependent variable is set as the Boolean variable which indicates experience of online shopping. As in Case 1, 0 means the user has not purchased from any Web-based vendor and 1 means he or she has purchased online. By applying two-sample t tests, we selected the 43 features among the total independent variables that affect the dependent variable with statistical significance at the 95% level. Appendix A presents detailed information on the selected factors of these two cases.
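As an illustration of this screening step, here is a hedged sketch using SciPy. The function name and the assumed data layout (continuous and nominal features in separate arrays, binary target y) are our own; the paper does not say which software performed the tests.

```python
import numpy as np
from scipy import stats

def screen_features(X_cont, X_nom, y, alpha=0.05):
    """Keep continuous features whose two-sample t test, and nominal
    features whose chi-squared test, separate buyers (y=1) from
    non-buyers (y=0) at the 95% significance level."""
    keep_cont = []
    for j in range(X_cont.shape[1]):
        _, p = stats.ttest_ind(X_cont[y == 1, j], X_cont[y == 0, j])
        if p < alpha:
            keep_cont.append(j)
    keep_nom = []
    for j in range(X_nom.shape[1]):
        # Contingency table: rows = classes (0/1), columns = feature values.
        table = np.array([[np.sum((X_nom[:, j] == v) & (y == c))
                           for v in np.unique(X_nom[:, j])] for c in (0, 1)])
        _, p, _, _ = stats.chi2_contingency(table)
        if p < alpha:
            keep_nom.append(j)
    return keep_cont, keep_nom
```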

Table 1: The portion of each case base

Case base    Portion   Sample size of Case 1   Sample size of Case 2
Reference    60%       588                     480
Test         20%       196                     160
Hold-out     20%       196                     160
Total        100%      980                     800

To apply GA optimization, we split the data sets into three groups: reference (training), test and hold-out case bases. Table 1 shows the assigned portion and the sample size of each case base.

4.2. Research design and system development

We set the population size, crossover rate, mutation rate and stopping condition as the controlling parameters of the GA search for our experiments. However, there are few theories that can concisely guide the assignment of these values (Chiu, 2002). Thus, we determine the values of these parameters in the light of previous studies that combined CBR and GAs. Most prior studies use 50-100 organisms as the population size; their crossover rates range from 0.5 to 0.7, and their mutation rates range from 0.06 to 0.12 (Shin & Han, 1999; Chiu, 2002; Shin & Lee, 2002). However, because the search space for our GA is much larger than in these studies, we set these parameters to somewhat larger values: we use 100-200 organisms in the population and set the crossover rate at 0.7 and the mutation rate at 0.1. As a stopping condition, we use 2000-4000 trials (20 generations).

Regarding genetic operators, this study performs crossover using a uniform crossover routine. The uniform crossover method is considered better at preserving the schema and can generate any schema from the two parents, while single-point and two-point crossover methods may bias the search with an irrelevant position of the features. For the mutation method, this study generates a random number between 0 and 1 for each of the features in the organism. If a feature gets a number that is less than or equal to the mutation rate, then that feature is mutated.

To test the effectiveness of the proposed model, we also apply five different CBR models to our experimental data sets. The first model, labeled ConvCBR (conventional CBR), uses a conventional approach for the reasoning process of CBR. This model considers all initially available features as the feature subset; that is, there is no special process of feature subset selection. The relative importance of each feature is not considered because many conventional CBR models have no general feature selection or feature weighting algorithm. In addition, instance selection is also not considered here, so all instances are selected in this model. The second model, named FSCBR (feature selection for CBR), selects relevant features using the GA. In this model, we try to optimize feature selection by the GA, but we are still unconcerned with instance selection. The third model assigns relevant feature weights via genetic search. It is named FWCBR (feature weighting for CBR). Similar models have been suggested previously by Kelly and Davis (1991), Shin and Han (1999), Kim and Shin (2000) and Liao et al. (2000). This model again does not deal with instance selection. The fourth model, ISCBR (instance selection for CBR), uses the GA to select a relevant instance subset. In this model, we try to optimize instance selection by the GA, but we are unconcerned with feature selection or weighting; therefore, we set all feature weights to 1. Babu and Murty (2001) proposed a similar model. In the fifth model, the GA searches simultaneously for the optimal relevant features and instances. We call this FISCBR (feature and instance selection simultaneously for CBR). The suggested models of Kuncheva and Jain (1999) and Rozsypal and Kubat (2003) are similar.

These experiments are done with our own prototype software, developed using Microsoft Excel 2003 and Evolver Industrial Version 4.08 (Palisade Software, www.palisade.com), a commercial GA tool. The k-NN algorithm is implemented in VBA (Visual Basic for Applications) in Microsoft Excel 2003 and the VBA code is incorporated with Evolver to optimize the parameters by the GA. Figure 4 represents the working screen of the developed experimental system.

To examine the prediction performance of FWISCBR compared to other statistical and artificial intelligence methods, we also applied four other comparative models to the given data sets: LOGIT (logistic regression), MDA (multiple discriminant analysis), ANNs (artificial neural networks) and SVM (support vector machine). For LOGIT, we use a forward selection procedure and set the probability for stepwise entry to 0.05. In the case of MDA, we again use a stepwise method using Wilks's λ for the selection of input variables, and we use the F value as the criterion for entry or removal of the input variables. In the case of ANNs, we adopt a standard three-layer back-propagation network and set the learning rate to 0.1 and the momentum term to 0.1. The hidden and output nodes use the sigmoid transfer function. We perform the experiments repeatedly by varying the number of nodes in the hidden layer over n/2, n, 3n/2 and 2n, where n is the number of input features. For the stopping criterion of the ANNs, this study allows 150 learning epochs. In the case of SVM, the linear kernel, the polynomial kernel and the Gaussian radial basis function (RBF) are used as the kernel function. Equations (5)-(7) show each function in detail.

Linear kernel function:

K(x_i, x_j) = x_i^T x_j    (5)

Polynomial kernel function of power d:

K(x_i, x_j) = (1 + x_i^T x_j)^d    (6)

Gaussian RBF kernel function:

K(x_i, x_j) = exp(−(1/δ^2) ||x_i − x_j||^2)    (7)
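Equations (5)-(7) translate directly into code; a small NumPy sketch follows. The default d = 1 and δ² = 75 simply echo the parameter values that Table 3 later reports as selected, and are otherwise arbitrary.

```python
import numpy as np

def linear_kernel(xi, xj):
    """Equation (5): inner product of the two input vectors."""
    return float(xi @ xj)

def polynomial_kernel(xi, xj, d=1):
    """Equation (6): polynomial kernel of power d."""
    return float((1.0 + xi @ xj) ** d)

def rbf_kernel(xi, xj, delta2=75.0):
    """Equation (7): Gaussian RBF kernel with width parameter delta^2."""
    return float(np.exp(-np.sum((xi - xj) ** 2) / delta2))
```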

Figure 4: Working screen of the experimental system.

Tay and Cao (2001) showed that the upper bound C and the kernel parameters like d or δ² play an important role in the performance of SVMs. Improper selection of these two parameters may cause overfitting or underfitting problems. Thus, we varied the parameters to select optimal values for the best prediction performance. This study uses LIBSVM Version 2.8 as the tool for implementing SVM.²

²LIBSVM is available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
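A sketch of such a parameter search follows, using scikit-learn's SVC (which is itself built on LIBSVM) as a stand-in for the LIBSVM 2.8 setup; the grid values are illustrative assumptions, not the paper's search space, and gamma corresponds to 1/δ² in equation (7).

```python
from sklearn.svm import SVC

def tune_svm(train_X, train_y, test_X, test_y):
    """Grid search over C and the RBF width, keeping the model that
    classifies the test set best (mirroring how the paper varied C and
    delta^2; the grid values here are assumptions for the example)."""
    best, best_acc = None, -1.0
    for C in (1, 10, 55, 100):
        for delta2 in (25, 50, 75, 100):
            model = SVC(C=C, kernel="rbf", gamma=1.0 / delta2)
            model.fit(train_X, train_y)
            acc = model.score(test_X, test_y)
            if acc > best_acc:
                best, best_acc = model, acc
    return best, best_acc
```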

5. Experimental results
For ConvCBR, we apply the k-NN algorithm varying the parameter k over the odd numbers from 1 to 9. As a result, we find that 3-NN shows the best performance for both Case 1 and Case 2, as illustrated in Table 2. So we apply 3-NN to all the other CBR models, including FSCBR, FWCBR, ISCBR, FISCBR and FWISCBR.

Table 2: Classification accuracy of k-NN for ConvCBR

k                                             1        3        5        7        9
Performance for hold-out case base, Case 1    52.04%   56.12%   55.61%   54.08%   53.57%
Performance for hold-out case base, Case 2    66.25%   73.13%   70.63%   70.63%   68.75%

Appendix B and Table 3 present the results for FSCBR, FWCBR, ISCBR, FISCBR and FWISCBR. In Appendix B, the tables show the finally selected instances, features and feature weights for the two cases. Table 3 presents the overall prediction performances of all comparative models.

Table 3: Average prediction accuracy of the models

Case 1
Model     Training   Test      Hold-out   Remarks for finally selected model
LOGIT     63.30%               62.76%     Forward selection
MDA       63.10%               62.76%     Stepwise (Wilks's λ)
ANN       68.37%               61.73%     14 nodes in hidden layer
SVM       65.82%     66.84%    63.27%     Gaussian RBF kernel function with C = 1 and δ² = 75
ConvCBR              62.24%    56.12%     k of k-NN = 3
FSCBR                62.76%    60.20%     k of k-NN = 3
FWCBR                65.82%    61.73%     k of k-NN = 3
ISCBR                66.33%    63.27%     k of k-NN = 3
FISCBR               70.92%    64.29%     k of k-NN = 3
FWISCBR              72.96%    67.86%     k of k-NN = 3

Case 2
Model     Training   Test      Hold-out   Remarks for finally selected model
LOGIT     85.63%               78.75%     Enter
MDA       85.21%               76.25%     Stepwise (Wilks's λ)
ANN       86.46%               80.00%     43 nodes in hidden layer
SVM       87.29%     87.50%    81.25%     Polynomial kernel function with C = 55 and d = 1
ConvCBR              75.00%    73.13%     k of k-NN = 3
FSCBR                82.50%    80.00%     k of k-NN = 3
FWCBR                82.50%    78.13%     k of k-NN = 3
ISCBR                83.13%    80.63%     k of k-NN = 3
FISCBR               85.63%    81.88%     k of k-NN = 3
FWISCBR              85.63%    83.75%     k of k-NN = 3

From Appendix B, we find that ISCBR, FISCBR and FWISCBR for both cases use only a portion of the training case base; these three techniques select about 55%-90% of the total training samples. Thus our data sets seem to contain quite a few irrelevant or harmful cases. Comparing the weights of FWCBR, FISCBR and FWISCBR in Case 1, we find that the pattern of the weights of FWISCBR is similar to that of FISCBR rather than FWCBR, because the weights of ADD0, OCCU0 and SEX are small in both FWISCBR and FISCBR. Also, in Case 2 we find that the pattern of the weights of FWISCBR is similar to that of FISCBR rather than FWCBR when considering the weights of many features including IC4, IC9, IC12, IC30, IC33, CMA4, CMA7 and CMA14. Thus, instance selection seems to affect the results of feature weighting or feature selection.

As shown in Table 3, FWISCBR has the highest level of accuracy (67.86% for Case 1 and 83.75% for Case 2) among the models for the given hold-out data set, followed by FISCBR (64.29% for Case 1 and 81.88% for Case 2) and SVM (63.27% for Case 1 and 81.25% for Case 2). The result shows that FWISCBR may improve the prediction accuracy of typical CBR systems dramatically, by about 11.73% for Case 1 and 10.62% for Case 2. Thus, the experimental results show that FWISCBR may be an excellent option for building a customer classification model that not only has the inherent strength of CBR but also produces prediction results as accurate as other advanced artificial intelligence algorithms such as ANNs and SVM.

Table 4: Z values for the hold-out data

Case 1
          MDA      ANN      SVM      ConvCBR    FSCBR     FWCBR     ISCBR     FISCBR    FWISCBR
LOGIT     0.0000   0.2084   0.1046   1.3372**   0.5189    0.2084    0.1046    0.3148    1.0611*
MDA                0.2084   0.1046   1.3372**   0.5189    0.2084    0.1046    0.3148    1.0611*
ANN                         0.3130   1.1293*    0.3106    0.0000    0.3130    0.5231    1.2690*
SVM                                  1.4416**   0.6235    0.3130    0.0000    0.2102    0.9566*
ConvCBR                                         0.8191    1.1293*   1.4416**  1.6510**  2.3932***
FSCBR                                                     0.3106    0.6235    0.8335    1.5787**
FWCBR                                                               0.3130    0.5231    1.2690*
ISCBR                                                                         0.2102    0.9566*
FISCBR                                                                                  0.7467

Case 2
          MDA      ANN      SVM      ConvCBR    FSCBR     FWCBR     ISCBR     FISCBR    FWISCBR
LOGIT     0.5355   0.2763   0.5590   1.1770*    0.2763    0.1359    0.4168    0.7029    1.1458*
MDA                0.8113   1.0932*  0.6428     0.8113    0.3997    0.9515*   1.2366*   1.6771**
ANN                         0.2829   1.4516**   0.0000    0.4122    0.1406    0.4270    0.8707*
SVM                                  1.7318**   0.2829    0.6947    0.1423    0.1442    0.5885
ConvCBR                                         1.4516**  1.0416*   1.5910**  1.8742**  2.3108***
FSCBR                                                     0.4122    0.1406    0.4270    0.8707*
FWCBR                                                               0.5526    0.8385    1.2809*
ISCBR                                                                         0.2864    0.7305
FISCBR                                                                                  0.4445

*Significant at the 10% level. **Significant at the 5% level. ***Significant at the 1% level.

Table 3 also shows that ISCBR outperforms FSCBR and FWCBR. This means that selection of appropriate instances is more important than proper selection or weighting of the features for these data sets.

We use the two-sample test for proportions to examine whether the differences in predictive accuracy between FWISCBR and the other comparative algorithms are statistically significant. By applying this test, it is possible to check whether there exists a difference between two probabilities when the prediction accuracy of the left-vertical methods is compared with that of the right-horizontal methods (Harnett & Soni, 1991). In the test, the null hypothesis is H_0: p_i − p_j = 0 where i = 1, . . ., 8 and j = 2, . . ., 9, while the alternative hypothesis is H_a: p_i − p_j > 0 where i = 1, . . ., 8 and j = 2, . . ., 9; p_k denotes the classification performance of the kth method. Table 4 shows Z values for the pairwise comparison of performance between models.

As shown in Table 4, FWISCBR outperforms ConvCBR at the 1% statistical significance level and FSCBR at the 5% (10% for Case 2) statistical significance level. It also outperforms all of the other comparative algorithms except for FISCBR (and ISCBR for Case 2) at the 10% statistical significance level. Thus, we may conclude that our proposed approach shows better prediction performance as a whole. However, FISCBR, which optimizes two dimensions of CBR (i.e. features and instances) simultaneously, also shows promising prediction accuracy that is as good as our model's in practice.
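For reference, the Table 4 statistics can be reproduced with a few lines of Python. The pooled-proportion form below is the standard two-sample test for proportions with equal sample sizes; it is our reconstruction, but it is consistent with the reported values.

```python
from math import sqrt

def z_two_proportions(p_i, p_j, n):
    """Two-sample test for proportions (Harnett & Soni, 1991): Z statistic
    for H0: p_i - p_j = 0 with a pooled proportion and equal sizes n."""
    p = (p_i + p_j) / 2.0                     # pooled proportion
    se = sqrt(p * (1.0 - p) * (2.0 / n))      # standard error of the difference
    return (p_i - p_j) / se

# FWISCBR (133/196 correct) vs ConvCBR (110/196) on the Case 1 hold-out set:
print(round(z_two_proportions(133 / 196, 110 / 196, 196), 4))  # 2.3932
```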

6. Conclusions
We have suggested a new kind of hybrid system of GAs and CBR to improve the performance of the typical CBR system for customer classification. This paper uses a GA as a tool to optimize the feature weights and instance selection simultaneously. From the results of the experiments, we show that FWISCBR, our proposed model, may outperform other comparative algorithms such as ConvCBR, FSCBR, FWCBR and ISCBR as well as FISCBR in two cases of customer classification.


Moreover, we have found that it may also outperform other statistical and artificial intelligence methods such as LOGIT, MDA, ANNs and SVM.

The major contribution of our study is that our proposed model, FWISCBR, may provide prediction results as accurate as other advanced artificial intelligence techniques for customer classification. In fact, CBR is a very useful algorithm for customer classification (Watson, 1997). First, it is not complex to implement, and it is easy to combine with most companies' customer databases in a relational format. Moreover, CBR can classify a target customer into the purchasing group or the non-purchasing group, and its prediction model (i.e. the reference case base) is updated in real time (Kolodner, 1993). However, in practice, the low prediction accuracy of CBR hinders its use as a customer classification model. Thus, FWISCBR may be useful in practical applications because it has all the advantages of CBR as well as the ability to produce accurate prediction results.

In addition, FWISCBR can condense the case base, which enables efficient problem solving. As shown in Appendix B, FWISCBR only uses 55%-60% of the reference cases, so it can save much time in making a prediction. CBR, by nature, requires more time as the size of the case base grows. Thus, condensing the case base is also an important issue when applying CBR to a problem that has many prior cases. FWISCBR can be used as a solution to this scalability problem.

However, this study also has some limitations. First, it takes much computational time to obtain optimal parameters for FWISCBR. As mentioned, FWISCBR iterates the case retrieval process whenever genetic evolution occurs. In general, the case retrieval process in CBR takes much computational time because the whole case base must be searched to find just one solution. So, efforts to make FWISCBR more efficient are needed before our model can be applied to general cases in reality. Second, there are other factors that enhance the performance of the CBR system and that might be incorporated into the simultaneous optimization model. For example, the k parameter of k-NN (the number of cases to combine) may be another parameter to optimize (Lee & Park, 1999; Ahn et al., 2003). In addition, instance weighting may also be another research issue, though the search space for it would be extremely large. Building a universal simultaneous optimization model for CBR including feature weights and instance selection as well as other factors like the k parameter may improve the overall performance of the CBR system. Finally, the results of this study may depend on the experimental data sets. In particular, all of our research data sets were collected for customer classification, and this is a binary classification problem. Consequently, in order to validate the general applicability of FWISCBR, it should be applied to other data sets in various problem domains in the future.


Appendix A: Selected features and their descriptions

Case 1

Feature name   Description                                                    Range
AGE            Age                                                            Continuous (years)
ADD0           Residences are located in Seoul (the capital of South Korea)   0: False, 1: True
ADD1           Residences are located in big cities                           0: False, 1: True
OCCU0          Customers work for companies                                   0: False, 1: True
OCCU2          Customers are students                                         0: False, 1: True
OCCU4          Customers run their own businesses                             0: False, 1: True
SEX            Gender                                                         0: Male, 1: Female
LOSS4          Customers need to lose weight around their legs and thighs     0: Not exist, 1: Exist
PUR0           Customers diet for beauty                                      0: False, 1: True
HEIGHT         Height                                                         Continuous (m)
BMI            Body mass index (BMI) is the measure of body fat based on      Continuous (kg/m²)
               height and weight that applies to both adult men and women.
               It is calculated as BMI (kg/m²) = weight (kg) / height (m)²
E01            Prior experience with functional diet food                     0: Not exist, 1: Exist
E02            Prior experience with diet drugs                               0: Not exist, 1: Exist
E05            Prior experience with one food diet                            0: Not exist, 1: Exist

Case 2 (all IC and CMA items are measured on a 5-point Likert scale)

Feature name   Description                                                    Range
AGE            Age                                                            Continuous (years)
INCOME         Current household income                                       Continuous (US$1000)
HOURS          Average hours spent using the WWW (per week)                   Continuous
IC1            Shopping over the WWW would enhance my effectiveness at shopping
IC3            Shopping over the WWW would require me to purchase equipment which would be beyond my financial means
IC4            Shopping over the WWW fits into my shopping style
IC7            Shopping over the WWW would increase my shopping productivity
IC9            I am able to experiment with shopping over the WWW as necessary
IC11           I would trust online vendors enough to feel safe shopping over the WWW
IC12           Learning to shop over the WWW would be easy for me
IC13           Shopping over the WWW would be expensive since it would require me to pay for access to the Internet
IC15           Shopping over the WWW would be very risky
IC16           Shopping over the WWW would allow me to have better item selection in my shopping
IC17           I could afford to buy the equipment needed to shop over the WWW
IC18           Shopping over the WWW would improve my image with those around me
IC19           I would trust an Internet service provider with transmitting personal information necessary for me to shop over the WWW
IC21           Shopping over the WWW would be completely compatible with my current situation
IC22           Shopping over the WWW would give me greater control over my shopping
IC23           Shopping over the WWW would be a safe way to shop
IC24           Shopping over the WWW would require a lot of mental effort
IC25           I could afford to pay a monthly fee to an Internet service provider in order to shop over the WWW
IC26           Shopping over the WWW would allow me to do my shopping more quickly
IC27           People who shop over the WWW have greater prestige than those who do not
IC28           Shopping over the WWW would be compatible with all aspects of the way I shop
IC30           Shopping over the WWW would improve my shopping abilities
IC31           Overall, I believe that shopping over the WWW would be easy to do
IC32           Shopping over the WWW would allow me to get better prices when shopping
IC33           Shopping over the WWW would be clear and understandable
IC34           I've had a great deal of opportunity to try shopping over the WWW
CMA1           It is easier to find a Web-based vendor that sells the items I wish to purchase
CMA2           I can quickly gather information about products and services I wish to purchase from Web-based vendors
CMA3           Web-based vendors deliver orders/services in a more timely manner
CMA4           It is easier to place orders with Web-based vendors
CMA5           Web-based vendors provide better customer service and after-sales support
CMA6           Placing an order for an item takes less time with Web-based vendors
CMA7           I can gather more information from Web-based vendors about an item I want to purchase
CMA9           Web-based vendors are better at providing me easy access to the opinions of experts about products I wish to purchase
CMA10          Paying for an item purchased is easier with Web-based vendors
CMA11          Web-based vendors are better at providing information about updates on products I've purchased
CMA12          It is easier to compare similar items between different Web-based vendors
CMA14          It is more risky to make payment to Web-based vendors when purchasing an item
CMA15          I would prefer to gather purchase-related information through Web-based vendors
CMA16          It takes longer to receive the item purchased from Web-based vendors

Appendix B: The feature weights and instance selection of optimized CBR models

Case 1

Feature weights
Feature name   FSCBR   FWCBR    ISCBR   FISCBR   FWISCBR
AGE            0       0.5678   1       1        0.3264
ADD0           1       0.9035   1       0        0.0102
ADD1           1       0.8532   1       1        0.3997
OCCU0          0       0.9715   1       0        0.0135
OCCU2          0       0.9313   1       1        0.8071
OCCU4          0       0.7782   1       1        0.8974
SEX            1       0.8097   1       0        0.0124
LOSS4          1       1.0000   1       1        0.4335
PUR0           0       0.8836   1       1        0.8093
HEIGHT         0       0.6249   1       1        0.8475
BMI            1       0.8093   1       0        0.3038
E01            1       0.8836   1       1        0.7450
E02            1       0.5022   1       1        0.2632
E05            1       1.0000   1       1        0.6381

Instance selections
Number         588     588      524     478      347
Ratio (%)      100%    100%     89.12%  81.29%   59.01%

Case 2

Feature weights
Feature name   FSCBR   FWCBR    ISCBR   FISCBR   FWISCBR
AGE            0       0.8928   1       1        0.8994
INCOME         0       0.0411   1       0        0.0706
HOURS          0       0.9140   1       0        0.0031
IC1            1       0.0129   1       0        0.0227
IC3            1       0.0746   1       0        0.0922
IC4            1       0.5451   1       0        0.1504
IC7            1       0.9406   1       1        0.8358
IC9            1       0.1926   1       1        0.9837
IC11           0       0.3861   1       1        0.5390
IC12           0       0.3147   1       0        0.1599
IC13           1       0.3671   1       1        0.2306
IC15           0       0.2070   1       1        0.6272
IC16           0       0.4209   1       0        0.4093
IC17           1       0.6595   1       1        0.9724
IC18           1       0.6363   1       1        0.9351
IC19           1       0.2196   1       1        0.9050
IC21           1       0.7791   1       0        0.6544
IC22           0       0.0772   1       1        0.2969
IC23           1       0.3187   1       0        0.1135
IC24           1       0.6392   1       1        0.6199
IC25           1       0.2500   1       0        0.0521
IC26           1       0.1046   1       1        0.7675
IC27           1       0.7938   1       1        0.9437
IC28           1       0.4904   1       1        0.7516
IC30           1       0.7833   1       0        0.1817
IC31           0       0.3689   1       1        0.9083
IC32           0       0.4298   1       1        0.0151
IC33           1       0.5126   1       0        0.0982
IC34           0       0.5898   1       1        0.9390
CMA1           1       0.7996   1       1        0.5566
CMA2           1       0.5865   1       1        0.7806
CMA3           0       0.6532   1       1        0.8728
CMA4           1       0.5867   1       0        0.0837
CMA5           0       0.9458   1       0        0.7387
CMA6           1       0.0405   1       0        0.0903
CMA7           1       0.5696   1       0        0.1364
CMA9           1       0.7379   1       1        0.9881
CMA10          0       0.1717   1       0        0.3495
CMA11          0       0.1085   1       0        0.1015
CMA12          1       0.4741   1       1        0.9572
CMA14          0       0.0474   1       1        0.7645
CMA15          1       0.8064   1       1        0.8380
CMA16          1       0.7233   1       1        0.8177

Instance selections
Number         480     480      312     293      267
Ratio (%)      100%    100%     65.00%  61.04%   55.63%

References

AHN, H., K.-J. KIM and I. HAN (2003) Determining the optimal number of cases to combine in an effective case-based reasoning system using genetic algorithms, Proceedings of the International Conference of Korea Intelligent Information Systems Society 2003 (ICKIISS 2003), Seoul: Korea Intelligent Information Systems Society, 178-184.
BABU, T.R. and M.N. MURTY (2001) Comparison of genetic algorithm based prototype selection schemes, Pattern Recognition, 34, 523-525.
CARDIE, C. (1993) Using decision trees to improve case-based learning, Proceedings of the 10th International Conference on Machine Learning, San Francisco, CA: Morgan Kaufmann, 25-32.
CARDIE, C. and N. HOWE (1997) Improving minority class prediction using case-specific feature weights, Proceedings of the 14th International Conference on Machine Learning, San Francisco, CA: Morgan Kaufmann, 57-65.
CHIU, C. (2002) A case-based customer classification approach for direct marketing, Expert Systems with Applications, 22, 163-168.
CHIU, C., P.C. CHANG and N.H. CHIU (2003) A case-based expert support system for due-date assignment in a wafer fabrication factory, Journal of Intelligent Manufacturing, 14, 287-296.
DOMINGOS, P. (1997) Context-sensitive feature selection for lazy learners, Artificial Intelligence Review, 11, 227-253.
HAN, J. and M. KAMBER (2001) Data Mining: Concepts and Techniques, San Francisco, CA: Morgan Kaufmann.
HARNETT, D.L. and A.K. SONI (1991) Statistical Methods for Business and Economics, Reading, MA: Addison-Wesley.
HART, P.E. (1968) The condensed nearest neighbor rule, IEEE Transactions on Information Theory, 14, 515-516.
HUANG, Y.S., C.C. CHIANG, J.W. SHIEH and E. GRIMSON (2002) Prototype optimization for nearest-neighbor classification, Pattern Recognition, 35, 1237-1245.
JARMULAK, J., S. CRAW and R. ROWE (2000) Self-optimizing CBR retrieval, Proceedings of the 12th IEEE International Conference on Tools with Artificial Intelligence, Washington, DC: IEEE Computer Society, 376-383.
KEHOE, C., J. PITKOW and J.D. ROGERS (1998) Ninth GVU's WWW User Survey, URL: http://www.gvu.gatech.edu/user_surveys/survey-1998-04/.
KELLY, J.D.J. and L. DAVIS (1991) Hybridizing the genetic algorithm and the k nearest neighbors classification algorithm, Proceedings of the 4th International Conference on Genetic Algorithms, San Diego, CA: Morgan Kaufmann, 377-383.
KIM, K. (2004) Toward global optimization of case-based reasoning systems for financial forecasting, Applied Intelligence, 21 (3), 239-249.
KIM, K. and I. HAN (2001) Maintaining case-based reasoning systems using a genetic algorithms approach, Expert Systems with Applications, 21, 139-145.
KIM, S.H. and S.W. SHIN (2000) Identifying the impact of decision variables for nonlinear classification tasks, Expert Systems with Applications, 18, 201-214.
KOLODNER, J. (1993) Case-based Reasoning, San Mateo, CA: Morgan Kaufmann.
KUNCHEVA, L.I. and L.C. JAIN (1999) Nearest neighbor classifier: simultaneous editing and feature selection, Pattern Recognition Letters, 20, 1149-1156.
LEE, H.Y. and K.N. PARK (1999) Methods for determining the optimal number of cases to combine in an effective case based forecasting system, Korean Journal of Management Research, 27, 1239-1252.
LIAO, T.W., Z.M. ZHANG and C.R. MOUNT (2000) A case-based reasoning system for identifying failure mechanisms, Engineering Applications of Artificial Intelligence, 13, 199-213.
LIPOWEZKY, U. (1998) Selection of the optimal prototype subset for 1-NN classification, Pattern Recognition Letters, 19, 907-918.
MICHALEWICZ, Z. (1996) Genetic Algorithms + Data Structures = Evolution Programs, 3rd edn, Berlin: Springer.
ROZSYPAL, A. and M. KUBAT (2003) Selecting representative examples and attributes by a genetic algorithm, Intelligent Data Analysis, 7, 291-304.
SANCHEZ, J.S., F. PLA and F.J. FERRI (1997) Prototype selection for the nearest neighbour rule through proximity graphs, Pattern Recognition Letters, 18, 507-513.
SHIN, K.-S. and I. HAN (1999) Case-based reasoning supported by genetic algorithms for corporate bond rating, Expert Systems with Applications, 16, 85-95.
SHIN, K.-S. and Y.-J. LEE (2002) A genetic algorithm application in bankruptcy prediction modeling, Expert Systems with Applications, 23 (3), 321-328.
SIEDLECKI, W. and J. SKLANSKI (1989) A note on genetic algorithms for large-scale feature selection, Pattern Recognition Letters, 10, 335-347.
SKALAK, D.B. (1994) Prototype and feature selection by sampling and random mutation hill climbing algorithms, Proceedings of the 11th International Conference on Machine Learning, San Francisco, CA: Morgan Kaufmann, 293-301.
TAY, F.E.H. and L.J. CAO (2001) Application of support vector machines in financial time series forecasting, Omega, 29, 309-317.
TURBAN, E. and J.E. ARONSON (2001) Decision Support Systems and Intelligent Systems, 6th edn, Upper Saddle River, NJ: Prentice-Hall.
WANG, Y. and N. ISHII (1997) A method of similarity metrics for structured representations, Expert Systems with Applications, 12, 89-100.
WATSON, I. (1997) Applying Case-based Reasoning: Techniques for Enterprise Systems, San Francisco, CA: Morgan Kaufmann.
WETTSCHERECK, D., D.W. AHA and T. MOHRI (1997) A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms, Artificial Intelligence Review, 11, 273-314.
WILSON, D.L. (1972) Asymptotic properties of nearest neighbor rules using edited data, IEEE Transactions on Systems, Man, and Cybernetics, 2 (3), 408-421.
YAN, H. (1993) Prototype optimization for nearest neighbor classifier using a two-layer perceptron, Pattern Recognition, 26, 317-324.
YIN, W.J., M. LIU and C. WU (2002) A genetic learning approach with case-based memory for job-shop scheduling problems, Proceedings of the 1st International Conference on Machine Learning and Cybernetics, Piscataway, NJ: IEEE Press, Vol. 2, 1683-1687.
YU, K., X. XU, M. ESTER and H.-P. KRIEGEL (2003) Feature weighting and instance selection for collaborative filtering: an information-theoretic approach, Knowledge and Information Systems, 5, 201-224.

The authors
Hyunchul Ahn
Hyunchul Ahn is a PhD candidate in the Graduate School of Management of the Korea Advanced Institute of Science and Technology (KAIST). He received the BS and ME degrees from KAIST. His research interests are in the areas of data mining in marketing and finance, and artificial intelligence techniques such as case-based reasoning, genetic algorithms and support vector machines.

Kyoung-jae Kim
Kyoung-jae Kim is an assistant professor in the Department of Management Information Systems at Dongguk University. He received his PhD from KAIST. He has published in Applied Intelligence, Expert Systems, Expert Systems with Applications, Intelligent Data Analysis, Intelligent Systems in Accounting, Finance and Management, Neural Computing and Applications, Neurocomputing and other journals. His research interests include data mining, knowledge management and intelligent agents.

Ingoo Han
Ingoo Han is a professor at the Graduate School of Management of KAIST. He received his PhD from the University of Illinois at Urbana-Champaign. His papers have been published in Decision Support Systems, Information and Management, International Journal of Electronic Commerce, Expert Systems, Expert Systems with Applications, International Journal of Intelligent Systems in Accounting, Finance and Management and other journals. His research interests include applications of artificial intelligence for finance and marketing.

