Orientation: Business Analytics
Orientation
www.proschoolonline.com
Agenda
• What is Business Analytics?
• Some BA Examples
• BA Process
• Job Profiles
• Our Course: Why? What? How?
• Evaluation
Why BA?
• Credit ratings/targeted marketing:
Given a database of 100,000 names, which persons are the least likely to default on their credit
cards?
Identify likely responders to sales promotions
• Fraud detection
Which types of transactions are likely to be fraudulent, given the demographics and transactional
history of a particular customer?
• Customer relationship management:
Which of my customers are likely to be the most loyal, and which are most likely to leave for a competitor?
What is Business Analytics?
• Analytics is the use of:
Data,
Information technology,
Statistical Analysis,
Quantitative Methods,
Mathematical or Computer-based models
To help managers gain improved insight about their business operations and make better, fact-based
decisions.
What is Business Analytics?
• Process of semi-automatically analyzing large databases to find patterns that are:
valid: hold on new data with some certainty
novel: non-obvious to the system
useful: should be possible to act on the item
understandable: humans should be able to interpret the pattern
• Also known as Knowledge Discovery in Databases (KDD)
Applications
• Banking: loan/credit card approval
Predict good customers based on old customers
• Customer relationship management:
Identify those who are likely to leave for a competitor.
• Targeted marketing:
Identify likely responders to promotions
• Fraud detection: telecommunications, financial transactions
From an online stream of events, identify fraudulent events
• Manufacturing and production:
Automatically adjust knobs when process parameters change
• Medicine: disease outcome, effectiveness of treatments
Analyze patient disease history: find relationships between diseases
• Molecular/Pharmaceutical: identify new drugs
• Scientific data analysis: Identify new galaxies by searching for sub clusters
• Web site/store design and promotion: Find affinity of visitor to pages and modify layout
Some Basic Operations
• Predictive:
Regression
Classification
• Descriptive:
Clustering / similarity matching
Association rules and variants
Classification (Supervised learning)
Classification
• Given old data about customers and payments, predict new applicant’s loan eligibility.
[Figure: training data with columns Age, Salary, Profession, Location and Customer type; example decision rules Salary > 5 L and Prof. = Exec; output Good/bad.]
Nearest Neighbor
• Define proximity between instances, find neighbors of new instance and assign majority class
• Case based reasoning: when attributes are more complicated than real-valued.
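The nearest-neighbor idea above can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance as the proximity measure and a simple majority vote. The function name `knn_predict` and the applicant data are hypothetical and not taken from the slides.

```python
from collections import Counter
from math import dist

def knn_predict(train, new_point, k=3):
    """Return the majority class among the k training points nearest to new_point.

    train is a list of (features, label) pairs; proximity is Euclidean distance.
    """
    neighbors = sorted(train, key=lambda pair: dist(pair[0], new_point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical applicants: (age in years, salary in lakhs) -> repayment class
train = [
    ((25, 2.0), "bad"),
    ((30, 3.0), "bad"),
    ((28, 2.5), "bad"),
    ((45, 9.0), "good"),
    ((50, 12.0), "good"),
    ((40, 8.0), "good"),
]
prediction = knn_predict(train, (42, 8.5), k=3)
```

In practice, attributes measured on very different scales are usually normalized first, otherwise the largest-valued attribute dominates the distance.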
Decision Trees
• Tree where internal nodes are simple decision rules on one or more attributes and leaf nodes are
predicted class labels.
[Figure: example decision tree whose root node tests Salary < 1 M.]
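A small tree of this kind can be written directly as nested rules. In the sketch below, only the root test (Salary < 1 M) comes from the slide. The other tests, the leaf labels and the function name `classify_applicant` are hypothetical, chosen only to show how internal nodes and leaves fit together.

```python
def classify_applicant(salary, profession, age):
    """Walk a small hand-written decision tree and return a class label.

    Only the root test (salary < 1 M) comes from the slide; the remaining
    tests and all leaf labels are hypothetical.
    """
    if salary < 1_000_000:            # internal node: rule on Salary
        if profession == "Exec":      # internal node: rule on Profession
            return "good"             # leaf: predicted class
        return "bad"                  # leaf
    if age < 30:                      # internal node: rule on Age
        return "bad"                  # leaf
    return "good"                     # leaf

label = classify_applicant(salary=500_000, profession="Exec", age=40)
```

A learning algorithm would choose these tests from data rather than have them written by hand. The point here is only the structure: each path from the root to a leaf is a conjunction of simple rules.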
Neural networks
• Useful for learning complex tasks such as handwriting, speech and image recognition
[Figure: decision boundaries.]
Clustering (Unsupervised Learning)
Clustering
• Unsupervised learning: used when old data with class labels is not available, e.g. when introducing a new product.
• Group/cluster existing customers based on time series of payment history such that similar customers are in the same cluster.
• Key requirement: Need a good measure of similarity between instances.
• Identify micro-markets and develop policies for each
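One common way to form such clusters is k-means, sketched below in plain Python. The slides do not name a specific algorithm, so k-means is an assumption here, as are the Euclidean similarity measure, the two summary features used in place of a full payment time series, and the customer data.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest center, then recompute centers.

    Initial centers are simply the first k points (an assumption made to keep
    the sketch deterministic; real implementations choose them more carefully).
    """
    centers = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest center under Euclidean distance
        assignment = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its assigned points
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return assignment, centers

# Hypothetical customers: (average monthly payment, number of late payments)
customers = [(100, 0), (900, 5), (110, 1), (950, 6), (105, 0), (920, 4)]
labels, centers = kmeans(customers, k=2)
```

The result depends heavily on the similarity measure, which is why the slide calls a good measure the key requirement.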
Applications
• Customer segmentation e.g. for targeted marketing
Group/cluster existing customers based on time series of payment history such that similar customers are in the same cluster.
Identify micro-markets and develop policies for each
• Collaborative filtering:
Group based on common items purchased
• Text clustering
• Compression
Association
Association rules
• Given set T of groups of items
• Example: set of item sets purchased
T: (Milk, cereal); (Tea, milk); (Tea, rice, bread); (cereal)
• Goal: find all rules on itemsets of the form a --> b such that
Support of a and b > user threshold s
Conditional probability (confidence) of b given a > user threshold c
• Example: Milk --> bread
• Purchase of product A --> service B
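The two thresholds above can be made concrete with a short sketch. It computes support as the fraction of transactions containing an itemset, and confidence as support(a and b) divided by support(a). The function names and the baskets are hypothetical, chosen for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, a, b):
    """Estimated conditional probability of b given a: support(a and b) / support(a)."""
    return support(transactions, set(a) | set(b)) / support(transactions, a)

def rule_holds(transactions, a, b, s, c):
    """True if rule a --> b exceeds both the support threshold s and confidence threshold c."""
    joint = support(transactions, set(a) | set(b))
    return joint > s and confidence(transactions, a, b) > c

# Hypothetical baskets, not data from the slides
T = [
    {"milk", "cereal"},
    {"tea", "milk"},
    {"tea", "rice", "bread"},
    {"milk", "bread", "cereal"},
]
s_milk_cereal = support(T, {"milk", "cereal"})        # 2 of 4 baskets
c_milk_cereal = confidence(T, {"milk"}, {"cereal"})   # 2 of the 3 milk baskets
```

Note that `confidence` divides by the support of a, so it is only defined for antecedents that occur at least once.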
Prevalent ≠ Interesting
• Analysts already know about prevalent rules
• Interesting rules are those that deviate from prior expectation
• Mining’s payoff is in finding surprising phenomena
[Figure: 1995: "Milk and cereal sell together!"]
What makes a Rule Surprising?
• Does not match prior expectation
Prior expectation: the correlation between milk and cereal remains roughly constant over time
• Cannot be trivially derived from simpler rules
Milk 10%, cereal 10%
Milk and cereal 10% … surprising (independence would give 10% × 10% = 1%)
Eggs 10%
Milk, cereal and eggs 0.1% … surprising! (expected 10% × 10% = 1%)
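The arithmetic behind this example can be checked with a short sketch. It compares each observed support with the value expected if the parts occurred independently. The ratio is commonly called lift, a term the slides do not use, and the function names are hypothetical.

```python
def expected_support(*supports):
    """Support expected for a combination if the parts occurred independently."""
    expected = 1.0
    for s in supports:
        expected *= s
    return expected

def lift(observed, *supports):
    """Ratio of observed to expected support; values far from 1 are surprising."""
    return observed / expected_support(*supports)

# Figures from the slide
milk, cereal, eggs = 0.10, 0.10, 0.10
milk_and_cereal = 0.10
all_three = 0.001

lift_pair = lift(milk_and_cereal, milk, cereal)        # observed 10% vs expected 1%
lift_triple = lift(all_three, milk_and_cereal, eggs)   # observed 0.1% vs expected 1%
```

The pair occurs about ten times more often than independence predicts, and the triple about ten times less often. Both deviations are what make the rules surprising.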
Applications of fast itemset counting
• Find correlated events:
Applications in medicine: find redundant tests
Cross selling in retail, banking
Business Analytics in Practice
Application Areas
Industry                  Application
Finance                   Credit card analysis
Insurance                 Claims, fraud analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            Promotion analysis
Data service providers    Value added data
Utilities                 Power usage analysis
Why Now?
• Data is being produced and warehoused
• The computing power is available and affordable
• The competitive pressures are strong
• Commercial products are available
• Growth in data: the volume of data generated from the beginning of time to 2003 is now generated in five minutes
• The BA industry is currently worth $15 billion and is growing at 9 to 10% per annum
• Data Scientist: "Sexiest Job of the 21st Century" (HBR)
• Approx. 4.4 million data scientists will be required by the end of 2015
Analytics in Use
• The US Government uses Data Mining to track fraud
• A Supermarket becomes an information broker
• Basketball teams use it to track game strategy
• Cross Selling
• Target Marketing
• Holding on to Good Customers
• Weeding out Bad Customers
Some Success stories
• Network intrusion detection using a combination of sequential rule discovery and classification trees on 4 GB of DARPA data
Won over the (manual) knowledge engineering approach
http://www.cs.columbia.edu/~sal/JAM/PROJECT/ provides a good, detailed description of the entire process
• Major US bank: customer attrition prediction
First segment customers based on financial behavior: found 3 segments
Build attrition models for each of the 3 segments
40-50% of attritions were predicted, a factor-of-18 increase
• Targeted credit marketing: major US banks
Find customer segments based on 13 months credit balances
Build another response model based on surveys
Increased response rate 4 times, to 2%
Another Success Story
• Example 1.5 A Sales-Promotion Model
• In the grocery industry, managers typically need to know how best to use pricing, coupons and
advertising strategies to influence sales.
• Using Business Analytics, a grocer can develop a model that predicts sales using price, coupons and
advertising.
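A model of this kind can be sketched as a linear function of the three inputs. The coefficient values below are hypothetical placeholders, not estimates from the example. In practice they would be fitted to historical sales data, for instance by regression.

```python
def predicted_sales(price, coupons, advertising,
                    b0=500.0, b_price=-0.05, b_coupons=30.0, b_adv=0.08):
    """Linear sales model: sales = b0 + b_price*price + b_coupons*coupons + b_adv*advertising.

    All default coefficients are hypothetical and chosen only for illustration.
    """
    return b0 + b_price * price + b_coupons * coupons + b_adv * advertising

# Compare two hypothetical promotion plans at the same price and advertising spend
with_coupons = predicted_sales(price=100, coupons=1, advertising=1000)
without_coupons = predicted_sales(price=100, coupons=0, advertising=1000)
```

With a fitted model, a manager can compare plans like these before committing to a promotion.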
Decision Models
Business Analytics Process
Problem Solving and Decision Making
• BA represents only a portion of the overall problem solving and decision making process.
• Six steps in the problem solving process
1. Recognizing the problem
2. Defining the problem
3. Structuring the problem
4. Analyzing the problem
5. Interpreting results and making a decision
6. Implementing the solution
Problem Solving and Decision Making
1. Recognizing the Problem
• Problems exist when there is a gap between what is happening and what we think should be happening.
• For example, costs are too high compared with competitors.
2. Defining the Problem
• Clearly defining the problem is not a trivial task.
• Complexity increases when the following occur:
Large number of courses of action
Several competing objectives
External groups are affected
Problem owner and problem solver are not the same person
Time constraints exist
3. Structuring the Problem
• Stating goals and objectives
• Characterizing the possible decisions
• Identifying any constraints or restrictions
Problem Solving and Decision Making
4. Analyzing the Problem
• Identifying and applying appropriate Business Analytics techniques
• Typically involves experimentation, statistical analysis, or a solution process
Much of this course is devoted to learning BA techniques for use in Step 4.
5. Interpreting Results and Making a Decision
• Managers interpret the results from the analysis phase.
• Incorporate subjective judgment as needed.
• Understand limitations and model assumptions.
• Make a decision utilizing the above information.
6. Implementing the Solution
• Translate the results of the model back to the real world.
• Make the solution work in the organization by providing adequate training and resources.
Job Profiles
Project Mechanism
• Recognizing the problem
Domain Expertise
• Defining the problem
Domain Expertise
• Structuring the problem
A technology expert, together with the domain expert and someone who knows statistics
• Analyzing the problem
Statistical Expertise
• Interpreting results and making a decision
Statistical Expertise along with Domain Expertise
• Implementing the solution
Domain Expertise
Job Profiles
• The Data / Technology Profile
Comfortable with Analytics
RDBMS, Operating Systems
Data Gatherer
• The Domain Profile
Comfortable with Analytics and IT, Can do basic Analytics
Knows the business, industry
Liaises with the Data Gatherer and the Stats Expert
• The Statistics Profile
Deep knowledge of Statistics and Modeling
Comfortable with IT
Our Course
Our Course – Why?
• Makes you very comfortable with Analytics and IT
• Focus is on
Popular Statistical Techniques
Commonly used Tools and Technologies
Brief Overview of Domains
• Based on your Background
IT
Domain
Stats
Our Course – What?
• Basic Statistics with Sampling and Hypothesis Testing
Lay down the foundation
• Linear and Multiple Regression
Most common numeric prediction technique
• Logistic Regression and Classification
Most common application of BA
• Clustering
The backbone of Customer Segmentation
• Market Basket Analysis
Highly effective in the Retail Industry
Our Course – What?
• MS Excel
90% of organizations do 90% of their work in Excel. Need we say more?
• R
The open source data mining language, used and accepted by more and more companies.
• SAS
An old and still strong mining tool. The best tool for visualization
• Overview of Big Data, Mobile BA
What to expect from the future
• 3 Case Studies from 3 Domains
Introduction to the domains and common applications within it
Our Course – How?
• Each Technique is taught with at least one tool and at least one dataset
• Example – Basic Statistics
Course taught in Excel
Class Data Set
Home Assignment Data Set
Evaluation
• IMS Proschool Certification
End Term Test (80%)
• NSE Certification
2 hours
75 MCQs
Thank You.