Business Data Mining

1.
Three techniques for classification and prediction are listed below:
Classification Trees
Classification trees are used
1. When the target variable is categorical
2. When the goal is to generate understandable and explainable rules
3. To pick a good set of variables to be used as inputs to another modeling technique
such as an ANN
4. In the early phase of most data mining projects, as they reveal so much about
the data
Classification trees are not used

1. When there is more than one output variable.
2. For continuous target variables, since trees generate lumpy estimates, i.e. all the records
reaching the same leaf are assigned the same estimated value.
3. When the number of training examples per class gets small.
4. When the tree would need many levels and/or many branches per node.
Possible issues that should be considered are:

1. Over-fitting:
As the tree grows (nodes get smaller), it over-fits the training set by identifying
patterns specific to that set, thereby allowing the idiosyncrasies of the training set
to dominate. As a result we end up with an unstable tree that makes bad predictions. The
solution is pruning.
2. Misclassification costs:
Sometimes more accurate classification is desired for some
classes than others for reasons unrelated to relative class sizes.
3. Determining when to stop:
One characteristic of classification trees is that if no limit is placed on the number of
splits that are performed, eventually "pure" classification will be achieved, with each
terminal node containing only one class of cases or objects.
4. Selecting the right sized tree:
It should be sufficiently complex to account for the known facts, but at the same time
it should be as simple as possible. It should exploit information that increases
predictive accuracy and ignore information that does not.
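The stopping and sizing issues above can be made concrete with a small impurity calculation. The sketch below is only an illustration, not a method taken from this document: it assumes Gini impurity as the split criterion, and the function names `gini` and `best_threshold` and the sample data are hypothetical. It shows that a node holding a single class has impurity 0, which is the "pure" state a tree reaches if splitting is never stopped.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split on one numeric input.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, gini(labels)) when no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical data: years on the internet vs. usage class.
years = [1, 2, 3, 8, 9, 10]
usage = ["Light", "Light", "Light", "Heavy", "Heavy", "Heavy"]
threshold, impurity = best_threshold(years, usage)
```

In this toy data a single split separates the classes completely, so further splitting would add complexity without adding accuracy.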
Application of Classification Trees:

Classifying an internet user as a Light, Medium or Heavy user based on demographic
characteristics like age, education, profession, years on the internet, income etc. This
enables an Internet Service Provider to target customers better. Based on the customer details
collected, the ISP decides whether to offer the user a Light, Medium or Heavy package to
choose from.
Business Data Mining Page 1

Artificial Neural networks
An Artificial Neural Network (ANN) is used
1. When there exists a complex/unknown relationship between input and output
variables. The input variables can be both categorical and continuous.
2. When the input fields can be mapped to the range -1 to +1, where an ANN learns best.
3. When the number of input variables is limited, where ANNs work best.
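Mapping an input field to the range -1 to +1 can be sketched as a linear (min-max) rescaling. This is one common choice assumed here for illustration; the function name `scale_to_unit_range` and the sample ages are hypothetical.

```python
def scale_to_unit_range(values):
    """Linearly map numeric values onto [-1, +1] (min -> -1, max -> +1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant field carries no information
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

ages = [18, 30, 42, 66]  # hypothetical input field
scaled = scale_to_unit_range(ages)
```

The minimum and maximum should be taken from the training set and reused unchanged when new records are scored.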
ANN is not used

1. When there is very little understanding of the input variables. An ANN model makes
no attempt to choose which variables to use.
2. When there are a large number of input variables with very little significance,
since variable selection then becomes an issue.
3. When the goal is to understand the model, its rules, and how they were derived.
Possible issues that should be considered:

1. Large training set:
The training set has to be large enough to cover the range of inputs available for
each feature. In addition, at least 30 training examples are required for each weight
in the network.
2. Static model:
The model is static in nature. The training set has to be updated by adding the most
recent examples, and the network has to be retrained explicitly for the model to
remain useful.
3. Over-training:
The training set should be diverse enough to prevent over-training of the model.
4. Black box:
Neural networks can generate valid predictions but cannot identify the
specific nature of the interrelations between the variables on which the predictions are
based, i.e. it is virtually impossible to interpret the solution of a neural network in
analytical terms.
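The 30-examples-per-weight guideline can be turned into a quick sizing calculation. The sketch below assumes a fully connected feed-forward network and counts bias terms as weights; both are assumptions made for illustration, and the function names are hypothetical.

```python
def count_weights(layer_sizes, include_bias=True):
    """Number of trainable weights in a fully connected feed-forward network.

    layer_sizes lists the units per layer, e.g. [10, 5, 1] for
    10 inputs, 5 hidden units and 1 output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out
        if include_bias:
            total += n_out  # one bias term per receiving unit (assumption)
    return total

def min_training_examples(layer_sizes, examples_per_weight=30):
    """Apply the guideline of at least 30 training examples per weight."""
    return examples_per_weight * count_weights(layer_sizes)

weights = count_weights([10, 5, 1])          # 10*5 + 5 + 5*1 + 1 = 61
needed = min_training_examples([10, 5, 1])   # 30 * 61 = 1830
```

Even this small network calls for well over a thousand training examples, which is why limiting the number of input variables matters.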
Application of Neural Networks: Credit Card Fraud Detection

Generally banks use a combination of the following methods:
i. External – Here banks make use of a third-party database containing transaction
information from many companies to identify fraud patterns. The bank cross-references
these patterns with patterns identified in its internal database to
identify fraud.
ii. Internal – Here fraud patterns are identified using only the bank's own internal
database.
Artificial neural networks help to find fraudulent transactions by detecting potentially
fraudulent PIN-based and signature-based debit transactions. The model considers
transaction data and cardholder characteristics to come up with a transaction score, which is
a number between 1 and 999. The lower the score, the lower the probability of the
transaction being fraudulent.

Discriminant Analysis
Discriminant Analysis is a very useful tool
1. For detecting the variables that allow the researcher to discriminate between
different (naturally occurring) groups
2. For classifying cases into different groups with better-than-chance accuracy
3. For categorical dependent variables with two or more categories
Discriminant Analysis cannot be used

1. When the variance among the groups formed by the independent variables is zero
2. When the dependent variable is ordinal in nature
3. When the independent variables are not metric
Possible issues to be considered:

1. Discriminant analysis is not an interdependence technique, so make sure that a
distinction exists between dependent and independent variables.
2. Models with low variance among the groups formed by the independent variables
are generally poor.
3. Make sure that there is no multicollinearity among the input variables.
4. The analysis is based on the fundamental assumption that the independent variables
are normally distributed.
Application of Discriminant Analysis:

As long as we can transform the problem into a classification problem, we may apply this
technique in a variety of contexts such as pattern recognition, bankruptcy prediction and product
marketing. In product marketing, discriminant analysis is used to determine the
factors which distinguish different types of customers and/or products on the basis of
surveys or other forms of collected data. This helps us to select the
features that describe customer group membership, i.e. whether or not a customer buys the product.
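A minimal sketch of the classification step, assuming a single metric independent variable, normally distributed values and a common (pooled) variance across groups. The function names and the survey figures are hypothetical and not from the original document.

```python
import math
from statistics import mean

def fit_lda_1d(samples):
    """Fit a linear discriminant on a single metric variable.

    samples maps each class label to a list of observed values.
    Assumes normally distributed values with a common (pooled) variance.
    """
    n_total = sum(len(v) for v in samples.values())
    means = {k: mean(v) for k, v in samples.items()}
    # Pooled within-class variance: squared deviations from each class mean.
    ss_within = sum((x - means[k]) ** 2 for k, v in samples.items() for x in v)
    pooled_var = ss_within / (n_total - len(samples))
    priors = {k: len(v) / n_total for k, v in samples.items()}
    return means, pooled_var, priors

def classify(x, model):
    """Assign x to the class with the largest linear discriminant score."""
    means, pooled_var, priors = model

    def score(k):
        return (x * means[k] / pooled_var
                - means[k] ** 2 / (2 * pooled_var)
                + math.log(priors[k]))

    return max(means, key=score)

# Hypothetical survey data: monthly spend of buyers vs. non-buyers.
model = fit_lda_1d({"buy": [80, 90, 100, 110], "not buy": [20, 30, 40, 50]})
```

With equal group sizes the decision boundary falls midway between the two group means. A pooled variance of zero would make the scores undefined, so this sketch only works when the variable actually varies within the groups.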
2. An online e-book retailer has purchase details of 50,000 customers. Any
customer who buys from this site needs to register with the site. As of now, only a user
login, password and address are captured while registering. Now, the retailer wants to
use data mining to improve customer satisfaction and overall profitability. You need to
put together a data mining plan that will help him achieve his objectives. The plan
must also recommend what data elements he needs and possible means to capture the
same. Please state any assumptions that you make in drawing up this plan.
Business Context: Online book retailer
Business Objective: Customer satisfaction and Overall profitability
Requirement
1) Data Mining plan
2) Recommend what data elements he needs and possible means to capture them
By addressing customer satisfaction issues, the e-book retailer can greatly improve the
customer conversion ratio and hence profitability.
Other ways to improve profitability include suggesting related books to purchase in similar
fields, for example:
1. Cross-selling: Customers who bought this item also bought…
This requires Market Basket Analysis on the 50,000 records available about
customer purchases.
Frequently bought together:
The top 2 results obtained from market basket analysis will appear here along with
their respective percentages.
2. Up-selling: Suggesting that customers buy the entire Home Alone
DVD series or Harry Potter book series when they buy one of the books/DVDs.
This recommendation engine can be improved by collecting demographic details
of the customer when he makes a purchase.
a. Age
b. Education
c. Profession
d. Languages known
e. Interests – travel, cooking, sports and other options. This will help predict
the broad areas in which the customer is interested.
f. Top 3 most-read genres – fiction, non-fiction, crime, thriller, romance,
poems, comedy etc.
This field can be updated as and when the customer buys books.
How to capture: Offer book club membership – benefits include special
discounts, exclusive previews before annual sales, members-only dinner meets etc.
Mining Objectives:
1. Build a recommendation Engine which finds clusters of books that usually sell
together
2. Text Mining: Gain insights into customers' views and opinions about books and
our website
Techniques that are worth using:

Market Basket Analysis, Association rules, Neural networks, Text Mining
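Cross-selling suggestions of the form "customers who bought this item also bought" can be sketched with a simple co-occurrence count. The percentage computed here is the share of orders containing the item that also contain the other item (the confidence of the association rule). The function name `also_bought` and the order data are hypothetical.

```python
from collections import Counter

def also_bought(orders, item, top_n=2):
    """Return the top_n items most often bought together with `item`.

    orders is a list of sets of item names (one set per order).
    Each result is (other_item, percentage of orders containing `item`
    that also contain other_item).
    """
    with_item = [order for order in orders if item in order]
    if not with_item:
        return []
    counts = Counter(other for order in with_item for other in order if other != item)
    return [(other, 100.0 * n / len(with_item)) for other, n in counts.most_common(top_n)]

# Hypothetical purchase history.
orders = [
    {"Book A", "Book B"},
    {"Book A", "Book B", "Book C"},
    {"Book A", "Book C"},
    {"Book A", "Book B"},
    {"Book D"},
]
suggestions = also_bought(orders, "Book A")
```

On 50,000 customer records the same counting would normally be restricted to item pairs with enough support, so that rare coincidences are not shown as recommendations.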
Data to be used & possible means to capture the same:

The following elements improve customer satisfaction by providing all the details he
wants to know about the product.
1. Product details – Pages/Paperback/hardbound/publisher/edition/Book
Category/About the Authors - Easily available from ISBN data
2. Editorial Review – Again Available from publisher
3. Customer Reviews – Incentivize customers who have bought/used this product
to come back and write comprehensive, well-thought-out reviews listing all the
good and bad points of the book. New customers who plan to buy this book
should find such reviews very useful in deciding whether the book is for
them or not.
Incentives include: X% commission:

For every new customer who finds a review useful, likes it, and goes on to
purchase the book, the person who wrote the review will be paid a
commission. This will encourage people to write good reviews.
How to capture:
1. Have a template to capture customer background/business context, pros, cons,
and who they think should buy the book
2. Did the book meet their requirements, on a scale of 1 to 10
3. Overall rating on a scale of 1 to 5
4. Average customer rating – this is derived from the ratings provided by customers
in their reviews.
5. Inside this book – Cover pages, TOC, First few pages, Surprise me (Random
sample pages) - This is available with the book
6. Shipping Details – Available with the retailer
7. What do customers ultimately buy after viewing this item?
If the majority bought the same book, it is a signal to the customer that the book is
good/relevant to their requirements.
How to capture:
1) The retailer has to track each customer's visits to different books (their respective
web pages) and maintain a record of all the books he has browsed.
2) When a customer buys a particular book, his browsing records are checked to
see if he viewed this book before, and the field "What do
customers ultimately buy after viewing this item?" is updated.
This is done for all the books visited by him as per his browsing records.
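The two capture steps above can be sketched as a small in-memory tracker. This is only an illustration: the class and method names are hypothetical, and a real site would persist these records in its database.

```python
from collections import Counter, defaultdict

class ViewToPurchaseTracker:
    """Tracks, for each viewed book, which books viewers ultimately bought."""

    def __init__(self):
        self.viewed = defaultdict(set)                 # customer -> books browsed
        self.bought_after_view = defaultdict(Counter)  # viewed book -> purchase counts

    def record_view(self, customer, book):
        """Step 1: remember every book page the customer has browsed."""
        self.viewed[customer].add(book)

    def record_purchase(self, customer, book):
        """Step 2: credit the purchase to every book in the browsing record."""
        for seen in self.viewed[customer]:
            self.bought_after_view[seen][book] += 1

    def top_purchases(self, viewed_book, top_n=3):
        """Most frequent purchases made by customers who viewed `viewed_book`."""
        return self.bought_after_view[viewed_book].most_common(top_n)

# Hypothetical browsing and purchase events.
tracker = ViewToPurchaseTracker()
tracker.record_view("c1", "Book X")
tracker.record_view("c1", "Book Y")
tracker.record_purchase("c1", "Book Y")
tracker.record_view("c2", "Book X")
tracker.record_purchase("c2", "Book X")
tracker.record_view("c3", "Book X")
tracker.record_view("c3", "Book Y")
tracker.record_purchase("c3", "Book Y")
```

Here most customers who viewed Book X ended up buying Book Y, which is the kind of signal the field is meant to surface.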
Special considerations:
Host a customer discussion forum on the e-book retailer's website – user-driven content.
Allow customers to discuss the books they bought/like/hate etc. and also what
they love and hate about the website, things to improve etc.
a. Perform text mining to capture the good and bad points of the book/retailer
website
b. Review the feedback, consolidate it and make changes

3. A supermarket chain wants to develop a targeted marketing plan for the next
financial year. The marketing team also wants to explore if any niche marketing
opportunities are available. They already run a loyalty program for their customers
and as part of this program capture family demographics and all customer
transactions in their database. They have data for the last 5 years. Recommend a data
mining plan stating mining objectives, data to be used, techniques that are worth using
and any special considerations to enable the supermarket to draw up its marketing
plan.
Business Context: Supermarket Chain
Business Objective:
Develop targeted marketing plan
Explore if any niche marketing opportunities are available
Assumptions:
In addition to Groceries, the supermarket has several departments such as Apparel, Cosmetics,
Home Furnishings, Furniture and Electronic Home Appliances.
Examples: Big Bazaar, SPAR etc.
Mining Objectives:
1. Classify customers into Brand Loyalists, Brand Switchers and Alternators for every
Brand sold in the store
2. Identify clusters of products that tend to be purchased by the same person over time
3. Link the demographic details with the share of wallet for each category sold in the
supermarket (apparel, home appliances etc.)
Techniques that are worth using:

Clustering, Classification Trees, Association rules, Neural networks
Available Data: Loyalty program details – last 5 years

1. Family Demographics
2. All Customer transactions
Methodology
Product/ Brand wise analysis:
Use data mining to analyze previous transactions of customers and classify them into
1. Loyalists – who do not switch brands
2. Switchers – who switch brands based on the promotions offered by the respective
brands
3. Alternators – who change to a new brand by turns (anti-incumbency effect)
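One rough way to operationalize the three labels is a rule over each customer's purchase sequence within a product category. The sketch below is a heuristic assumed for illustration, not a method from this document: the function name, the `promo_share` cut-off of 0.5 and the sample histories are all hypothetical.

```python
def classify_brand_behaviour(purchases, promo_share=0.5):
    """Label one customer's purchase history within a product category.

    purchases: list of (brand, bought_on_promotion) tuples in time order.
    promo_share: assumed cut-off; if at least this share of brand changes
    went to a promoted brand, the customer is treated as a Switcher.
    """
    # For every change of brand, note whether the new brand was on promotion.
    switches = [
        on_promo
        for (prev_brand, _), (brand, on_promo) in zip(purchases, purchases[1:])
        if brand != prev_brand
    ]
    if not switches:
        return "Loyalist"      # never changed brand
    if sum(switches) / len(switches) >= promo_share:
        return "Switcher"      # brand changes follow promotions
    return "Alternator"        # changes brand regardless of promotions

# Hypothetical purchase histories.
loyal = [("A", False), ("A", True), ("A", False)]
promo_driven = [("A", False), ("B", True), ("B", False), ("A", True)]
restless = [("A", False), ("B", False), ("A", False), ("B", False)]
```

In practice a classification tree trained on labelled histories could replace the fixed cut-off.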
Example Targeted marketing campaigns:

1. Aimed at Switchers – make them buy more in one go, or offer them coupons
on the same brand for their next purchase.
Tie up with the respective brands to offer consumer promotions like Buy 2 get
Rs5 off, Buy 3 get 30% off etc.
2. Loyalists & Alternators need no promotions, since Loyalists will stick
to the brand anyway while Alternators will change the brand anyway due to
anti-incumbency.
Customer-based analysis:
• Classify customers into different clusters, based on demographic details, such that people
within the same cluster are as similar as possible while those in different clusters are as
dissimilar as possible.
• Identify clusters of products that are usually purchased by the same person over time.
Customers who have purchased some but not all of the products within a cluster
are good targets for the missing elements.
• When such a customer comes for billing, give him next-purchase discounts/consumer
promotion coupons for the missing elements.
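Finding the missing elements for a customer reduces to set differences once the product clusters are known. In the sketch below the cluster contents, the function name and the `min_owned` parameter are hypothetical.

```python
def missing_elements(product_clusters, purchased, min_owned=1):
    """For each product cluster, list items the customer has not bought yet.

    product_clusters: dict of cluster name -> set of products.
    purchased: set of products the customer has bought.
    Only clusters where the customer owns some but not all items are returned.
    """
    targets = {}
    for name, products in product_clusters.items():
        owned = products & purchased
        missing = products - purchased
        if len(owned) >= min_owned and missing:
            targets[name] = sorted(missing)
    return targets

# Hypothetical product clusters and one customer's purchases.
clusters = {
    "new home": {"sofa", "dining table", "refrigerator"},
    "school": {"uniform", "stationery", "shoes"},
}
coupons = missing_elements(clusters, {"sofa", "refrigerator", "rice"})
```

The result can be looked up at the billing counter to decide which coupons to print.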
Example clusters:
1) Bachelors with less than 1 year of work experience: Offer discounts on electric rice
cookers, sandwich makers, utensils etc.
2) Newly married: Offer discounts on home furnishings; home appliances like LCD
TVs, refrigerators, washing machines and microwave ovens; furniture like sofas, dining
tables etc.
3) Married with kids: Offer discounts on kids' clothes during festival time; bundle
kids' uniforms with school stationery, shoes etc.
4) Housewives: Offer special discounts during weekdays to avoid long billing queues
over weekends. This offer can be best utilized only by people staying at home during
weekdays.
Identify loyalty at store level – increase in the share of wallet:
Classify the customers into categories based on their income, profession, location,
purchase details across several months etc. and predict what kind of
customers typically spend what percentage of their income on what categories of products.
Examples:
Suppose data mining shows that a typical IT professional who is married, has 2
kids and stays within a 3 km radius of the store spends approximately 30% of his
income on apparel.
If a customer belongs to this cluster but spends only 5% of his
income on apparel in the store, then he should be offered incentives to spend his full quota of
approximately 30% of his income in the store and not elsewhere.
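The gap in this example can be computed directly. The sketch below works in percentage points; the function name and the income figure are hypothetical, while the 30% and 5% shares come from the example above.

```python
def wallet_gap(income, category_spend, typical_pct):
    """Compare a customer's in-store category spend with the segment's typical share.

    Returns (actual_pct, gap_pct, gap_amount), where gap_amount is the extra
    spend the store could still capture from this customer.
    """
    actual_pct = 100 * category_spend / income
    gap_pct = max(0.0, typical_pct - actual_pct)
    return actual_pct, gap_pct, income * gap_pct / 100

# Hypothetical customer: income 100000, of which 5000 is spent on apparel in the store.
actual_pct, gap_pct, gap_amount = wallet_gap(100000, 5000, typical_pct=30)
```

Customers can then be ranked by `gap_amount` so that incentives go first to those with the largest untapped spend.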
Special Considerations:
Since the supermarket is a chain, there may be customers who buy from various outlets
spread across the city. To get a comprehensive view of purchase details, it is
necessary to integrate the databases of all outlets in the chain. It will also be interesting to find out
what kind of people buy from multiple outlets.

4. Fashion Plus Co. has two product lines. One is an exclusive boutique line
consisting of customized very expensive products. The other product line is priced
lower to appeal to the larger base of upper middle class customers. The product
managers of the two lines were presented with two models that have the following
gains for identifying potential customers. Which model would you recommend to each
of the product managers? Why?
[Figure: Gains Chart. Y-axis: # Hits (0 to 2500); X-axis: Quintile (1 to 5); series: No model, Model 1, Model 2.]
Observations about models

Model #1: The maximum benefit of the lift curve is seen near quintile 2.5, with about 1500 hits
as against ~1200 with no model. The relative gain is better for a larger number of
customers.
Model #2: The maximum benefit of the lift curve is seen near quintile 1.5, with about
1250 hits as against 750 with no model. The relative gain is better for a smaller number of
customers.
Observations about the product lines

Product #1: Exclusive boutique line
Since the product is very expensive, we can safely assume that the target segment is
small. When the number of target customers is small, the gains achieved by
Model #2 are relatively greater than those of Model #1.
Considering that the product will be customized to the needs of every individual
customer, if the number of customers gets larger it will be very difficult to service their
needs with utmost satisfaction and a personal relationship. Hence the product should be
aimed at a smaller number of customers.
Hence Model #2 is recommended for product #1.
Product #2: Less expensive mass product

Since the product is priced lower, the company can recover its money when the product is targeted at a
larger base of customers. When the number of target customers is large, the gains
achieved by Model #1 are relatively greater than those of Model #2.
Even though the product is aimed at a larger base, since the target customers belong to the
upper middle class, their number cannot be more than that covered by quintile 3.5.
Hence Model #1 is recommended for product #2.
Beyond quintile 3.5, both models converge to the gains achieved by using no model. Hence
when the number of target customers is very large (>70%), it is better not to use any
model.
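The comparison above rests on cumulative hits per quintile for each model against random targeting. A minimal sketch of how such a gains table can be computed from scored prospects follows; the function name and the ten sample records are hypothetical and are not the data behind the chart.

```python
def cumulative_gains(scored, n_bins=5):
    """Cumulative hits per quantile for a model and for random targeting.

    scored: list of (model_score, is_hit) pairs, one per prospect.
    Returns a list of (bin_number, model_hits, baseline_hits) tuples.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    total_hits = sum(1 for _, hit in ranked if hit)
    rows = []
    for b in range(1, n_bins + 1):
        cutoff = total * b // n_bins             # prospects contacted so far
        model_hits = sum(1 for _, hit in ranked[:cutoff] if hit)
        baseline_hits = total_hits * b / n_bins  # expected hits with no model
        rows.append((b, model_hits, baseline_hits))
    return rows

# Hypothetical scored prospects: (model score, responded?).
scored = [
    (0.90, True), (0.80, True), (0.70, True), (0.60, False), (0.50, True),
    (0.40, False), (0.30, False), (0.20, False), (0.10, False), (0.05, False),
]
rows = cumulative_gains(scored)
```

Dividing `model_hits` by `baseline_hits` in each row gives the lift at that depth. A model is worth using only up to the quintile where this ratio stays clearly above 1.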
