
PREDICTIVE MODELLING BATC601
WEEK 1 – Chapter 1
This chapter gives an overview of predictive analytics.
Learning Objectives
What is analytics?
What is predictive analytics?
Supervised vs unsupervised learning
Business intelligence
Predictive analytics vs statistics
Predictive analytics vs data mining
Four types of Analytics
1. Descriptive: What is happening?
- Preliminary stage of data processing.
- Creates a summary of historical data.
- Provides information about what has happened.
2. Diagnostic: Why is it happening?
- Advanced analytics to answer the question “Why did it happen?”
- Uses drill-down, data discovery, data mining and correlations.
3. Predictive: What is likely to happen?
- Forecasting.
- Predictive models, machine learning, and data mining typically use a variety of variables to make the prediction.
4. Prescriptive: What do I need to do?
- “What-might-happen” analysis to help the user determine the best course of action to take.
- Typically concerns more than one individual action.
- Is related to both descriptive and predictive analytics.
What is Predictive Analytics?
Predictive analytics is the practice of extracting insights from an existing data set with the help of data mining, statistical modelling and machine learning techniques, and using them to predict unobserved or unknown events.
• Identifying cause-effect relationships across the variables from the historical data.
• Discovering hidden insights and patterns with the help of data mining techniques.
• Applying observed patterns to unknowns in the past, present or future.


Predictive Analytics Process Cycle

[Figure: predictive analytics process cycle]
Common Predictive Analytics Methods
• Regression:
• Predicting an output variable using its cause-effect relationship with input variables. OLS regression, GLM, random forests, ANN etc.

• Classification:
• Predicting the class of an item. Decision tree, logistic regression, ANN, SVM, Naïve Bayes classifier etc.

• Time Series Forecasting:
• Predicting future events given past history. AR, MA, ARIMA, triple exponential smoothing (Holt-Winters) etc.
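As an illustration of the regression item above, here is a minimal sketch of OLS with a single input variable, using only the Python standard library. The data (advertising spend vs sales) and the names `fit_ols`, `ad_spend` and `sales` are made up for this example and are not from the course material.

```python
from statistics import mean

def fit_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: the output is exactly 1 + 2 * input.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_ols(ad_spend, sales)
predicted = intercept + slope * 6.0  # prediction for an unseen input value
print(intercept, slope, predicted)
```

Real data will not lie exactly on a line, so the fitted line would then only approximate the observations.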
Common Predictive Analytics Methods (Contd.)
• Association rule mining:
• Mining items that occur together. Apriori algorithm.

• Clustering:
• Finding natural groups or clusters in the data. K-means, hierarchical, spectral, density-based and EM-algorithm clustering etc.

• Text mining:
• Modelling and structuring the information content of textual sources.
• Sentiment analysis, NLP.
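Association rule mining rests on two simple measures, support and confidence. The sketch below computes them for a made-up set of shopping baskets. It is not the Apriori algorithm itself, only the quantities Apriori works with, and the basket contents and function names are assumptions for illustration.

```python
# Hypothetical transactions: each set is one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

bread_milk_support = support({"bread", "milk"}, transactions)
bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
print(bread_milk_support, bread_to_milk)
```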
Predictive Analytics Tools in Market
Stages in Data Analytics
• Business Understanding: Determine Business Objectives; Assess Situation; Determine Data Mining Goals; Produce Project Plan
• Data Understanding: Collect Initial Data; Describe Data; Explore Data; Verify Data Quality
• Data Preparation: Select Data; Clean Data; Construct Data; Integrate Data
• Modeling: Select Modeling Technique; Generate Test Design; Build Model; Assess Model
• Evaluation: Evaluate Results; Review Process; Determine Next Steps
• Deployment: Plan Deployment; Plan Monitoring & Maintenance; Produce Final Report; Review Project
Supervised vs. Unsupervised Learning
• In supervised learning, you train the machine using data that is well "labeled", meaning some data is already tagged with the correct answer. It can be compared to learning that takes place in the presence of a supervisor or a teacher.
• A supervised learning algorithm learns from labeled training data and helps you predict outcomes for unseen data. Successfully building, scaling, and deploying an accurate supervised machine learning model takes time and technical expertise from a team of highly skilled data scientists.
• Ex: classification, regression.
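To make the idea of learning from labelled data concrete, here is a minimal sketch of a 1-nearest-neighbour classifier. This is one simple supervised method chosen for illustration; the features (hours studied, hours slept), the labels and the function name `predict` are made up.

```python
import math

# Hypothetical labelled training data: (hours_studied, hours_slept) -> outcome.
training = [
    ((1.0, 4.0), "fail"),
    ((2.0, 5.0), "fail"),
    ((6.0, 7.0), "pass"),
    ((7.0, 8.0), "pass"),
]

def predict(point, examples):
    """Return the label of the closest labelled example (1-nearest neighbour)."""
    nearest = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return nearest[1]

# Predict outcomes for records the model has not seen before.
high = predict((6.5, 6.0), training)
low = predict((1.5, 4.5), training)
print(high, low)
```

The labels act as the "teacher": the prediction for a new record is copied from the most similar record whose answer is already known.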
Unsupervised learning
• Unsupervised learning is a machine learning technique in which you do not need to supervise the model. Instead, you allow the model to work on its own to discover information. It mainly deals with unlabelled data.
• Unsupervised learning algorithms allow you to perform more complex processing tasks than supervised learning. However, unsupervised learning can be more unpredictable than other learning methods, such as deep learning and reinforcement learning.
• Ex: clustering.
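A minimal sketch of clustering with plain k-means follows. No labels are supplied, and the algorithm groups the points on its own. The points and starting centroids are made up, and choosing the starting centroids by hand is a simplification for this example.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # New centroid = coordinate-wise mean of the cluster; keep the old one if empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```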
Parametric vs Non-parametric Models
• Algorithms for predictive analytics include both parametric and non-parametric algorithms.
• Parametric algorithms assume known distributions in the data.
• ML algorithms typically do not assume distributions and are therefore considered non-parametric or distribution-free models.
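The contrast can be sketched with one small estimation task: the chance that a value exceeds a threshold. The parametric route assumes a normal distribution and estimates its two parameters. The non-parametric route assumes nothing and uses the observed fraction. The sample (daily order counts) and the threshold are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of daily order counts, with one unusually busy day.
sample = [12, 15, 11, 14, 13, 40, 12, 13, 14, 16]
threshold = 20

# Parametric: assume normality, estimate mean and standard deviation.
fitted = NormalDist(mu=mean(sample), sigma=stdev(sample))
p_parametric = 1 - fitted.cdf(threshold)

# Non-parametric: no distributional assumption, use the empirical fraction.
p_empirical = sum(x > threshold for x in sample) / len(sample)

print(p_parametric, p_empirical)
```

The two estimates can differ noticeably when the normality assumption fits the data poorly, as with the outlier here.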
Business Intelligence vs Predictive Analytics
• Business Intelligence is about descriptive analytics (or
looking at what happened), slicing-and-dicing across
dimensional models with massive dissemination to all
business users.
• Predictive analytics, on the other hand, builds analytic
models at the lowest levels of the business—at the
individual customer, product, campaign, store, and device
levels—and looks for predictable behaviours, propensities,
and business rules (as can be expressed by an analytic or
mathematical formula) that can be used to predict the
likelihood of certain behaviours and actions.
• Predictive analytics is about finding and quantifying hidden
patterns in the data using complex mathematical models
that can be used to predict future outcomes.
Predictive Analytics & Business Intelligence
• Predictive analytics takes the questions that business
intelligence is answering to the next level, moving from a
retrospective set of answers to a set of answers focused
on predicting performance and prescribing specific actions
or recommendations.
• For example, if we change the three key business
questions that we asked earlier (most valuable customers,
most important products, most successful campaigns) to a
future tense, then you can see that we need a predictive
analytics approach that is completely different from the
conventional BI approach (see table below).
Points to be discussed
• Similarities between business intelligence and predictive analytics
• Predictive analytics vs statistics

Statistics                                    Predictive analytics
Models based on theory                        Models often based on non-parametric algorithms; no guaranteed optimum
Models typically linear                       Models typically non-linear
Data typically smaller; algorithms often      Scales to big data; algorithms not as efficient
geared toward accuracy with small data        or stable for small data
Model is king                                 Data is king
Predictive Analytics vs. Data Mining
• Difference between predictive analytics and data mining: data mining is the discovery of hidden patterns in data through machine learning, and sophisticated algorithms are the mining tools. Predictive analytics is the process of refining that data resource, using business knowledge to extract hidden value from those newly discovered patterns.
Who uses Predictive Analytics?
• Predictive analytics is used to determine customer responses or purchases, as well as to promote cross-sell opportunities. Predictive models help businesses attract, retain and grow their most profitable customers.
• Improving operations: many companies use predictive models to forecast inventory and manage resources.
Challenges in using predictive analytics
• Obstacles in management
• Obstacles with data
• Obstacles with modelling
• Obstacles in deployment
Chapter 2 – Setting Up the Problem
• Predictive Analytics Processing Steps: CRISP-DM


• Business understanding
• Data understanding
• Data preparation
• Modelling
• Evaluation
• Deployment
Unit of analysis
• Suppose you are building models for the same hospitality organization and that the business objectives include identifying customer behaviour so they can customize marketing creative content to better match the type of visitor they are contacting.
• If each record in the data represents a single visit, the modelling algorithms would not know that a particular customer had visited multiple times, nor would they know there is a connection between the visits.
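One way to give the algorithms that connection is to change the unit of analysis from the visit to the customer by rolling visit rows up into one row per customer. A minimal sketch follows. The field names (`customer_id`, `nights`, `spend`) and the records are hypothetical.

```python
from collections import defaultdict

# Hypothetical visit-level records: one row per visit.
visits = [
    {"customer_id": "C1", "nights": 2, "spend": 300.0},
    {"customer_id": "C2", "nights": 1, "spend": 120.0},
    {"customer_id": "C1", "nights": 3, "spend": 450.0},
]

def to_customer_level(visit_rows):
    """Roll visit-level rows up to one summary row per customer."""
    summary = defaultdict(lambda: {"visit_count": 0, "total_nights": 0, "total_spend": 0.0})
    for visit in visit_rows:
        row = summary[visit["customer_id"]]
        row["visit_count"] += 1
        row["total_nights"] += visit["nights"]
        row["total_spend"] += visit["spend"]
    return dict(summary)

customers = to_customer_level(visits)
print(customers)
```

With customer-level rows, repeat visits show up as an explicit input (`visit_count`) rather than as unrelated records.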
Target variable
• For models that estimate or predict a specific value, a necessary step in the Business Understanding stage is to identify one or more target variables to predict.
• A target variable is a column in the modelling data that contains values to be estimated or predicted as defined in the business objectives. The target variable can be numeric or categorical depending on the type of model that will be built.
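A minimal sketch of separating a target column from the input columns, and of reading the model type off the target's data type. The column names and the rule "numeric target means estimation, otherwise classification" are simplifying assumptions for illustration.

```python
# Hypothetical modelling data: "responded" is the target column.
rows = [
    {"age": 34, "visits": 5, "responded": "yes"},
    {"age": 51, "visits": 1, "responded": "no"},
    {"age": 29, "visits": 3, "responded": "yes"},
]

def split_target(data, target):
    """Separate the target column from the input columns."""
    inputs = [{k: v for k, v in row.items() if k != target} for row in data]
    targets = [row[target] for row in data]
    return inputs, targets

def model_type(targets):
    """Numeric target -> estimation; anything else -> classification."""
    numeric = all(isinstance(t, (int, float)) and not isinstance(t, bool) for t in targets)
    return "estimation" if numeric else "classification"

inputs, targets = split_target(rows, "responded")
kind = model_type(targets)
print(kind)
print(model_type([25.0, 0.0, 10.0]))  # a numeric target, e.g. an amount
```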
Temporal considerations for target variable
• For most modelling projects focused on predicting future behaviour, careful consideration of the timeline is essential. Predictive modelling data, as is the case with all data, is historical, meaning that it was collected in the past. Building a model that predicts so-called future actions from historical data requires shifting the timeline in the data itself. (Refer to page 31 of the textbook.)
Temporal considerations for target variable (continued)
• The “Time needed to affect decision” gap can be critical in models, allowing time for a treatment to mature. For example, if you are building models to identify churn, the lead time to predict when churn might take place is critical to putting churn mitigation programs in place. You might, therefore, want to predict whether churn will occur 30-60 days in the future, in which case there must be a 30-day gap between the most recent input variable timestamp and the churn timestamp.
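The 30-60 day window can be sketched as a labelling rule: a record gets target value 1 only if churn falls 30 to 60 days after the most recent input timestamp. The function name, parameter names and dates are hypothetical, and treating both window ends as inclusive is an assumption.

```python
from datetime import date

def churn_label(input_timestamp_max, churn_date, min_gap_days=30, max_gap_days=60):
    """Return 1 if churn falls within the gap window after the last input timestamp."""
    if churn_date is None:  # customer has not churned
        return 0
    gap = (churn_date - input_timestamp_max).days
    return int(min_gap_days <= gap <= max_gap_days)

# Inputs are observed up to 1 January 2024 in each case.
inside_window = churn_label(date(2024, 1, 1), date(2024, 2, 15))  # 45-day gap
too_soon = churn_label(date(2024, 1, 1), date(2024, 1, 20))       # 19-day gap
no_churn = churn_label(date(2024, 1, 1), None)
print(inside_window, too_soon, no_churn)
```

Churn that happens sooner than 30 days is excluded because there would be no time to act on the prediction.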
Temporal considerations for target variable (continued)
• The “Timeframe for model inputs” range has two endpoints, one of which is the “Input Variable Timestamp Max”.
Measures of Success for predictive models
• Success criteria for predictive models
• Success criteria for estimation
• Other customized success criteria
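A few commonly used measures are sketched below, as examples only: accuracy, precision and recall for a classification model, and root mean squared error (RMSE) for an estimation model. The choice of these particular measures and the sample values are assumptions for illustration, not a list taken from the slides.

```python
import math

def classification_metrics(actual, predicted, positive="yes"):
    """Accuracy, precision and recall for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    correct = sum(a == p for a, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

def rmse(actual, predicted):
    """Root mean squared error for numeric predictions."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical actual vs predicted values.
metrics = classification_metrics(
    ["yes", "no", "yes", "no", "yes"],
    ["yes", "yes", "no", "no", "yes"],
)
error = rmse([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(metrics, error)
```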
Steps to take
• Building models first
• Early model deployment
• Discussion of case study: Recovering Lapsed Donors
