Notes For Business Analytics Part II

Regression Analysis

12/21/2020 Slides used for Educational Purpose only


Need for Regression

▪ The correlation coefficient gives you just the degree of relationship or association.
▪ It cannot help you estimate or predict the response variable for a given independent variable.
▪ The response variable is called the dependent variable.
▪ In the present problem involving Sigma Property, ‘% Occupancy’ is the independent variable and ‘Revenue’ is the dependent variable.

Objectives of Regression Analysis
▪ Explain the variations in the dependent variable in terms of
one or more independent variables.
▪ Describe the nature of the relationship precisely by
way of an equation.
▪ Validate the regression equation statistically.
▪ Predict the value of the dependent variable based on the
target values of the independent variables.
▪ Remove variables that do not contribute
much toward explaining variations in the dependent variable.

Part I-Simple Linear Regression

Regression Model
Simple Linear Regression Model

▪ In this model, the dependent variable is a linear function of one independent variable.
For the present case, Revenue may be modelled as a linear function of %
occupancy.
▪ Based on sample data collected for the dependent and independent variables, a
model is postulated connecting the dependent variable with the independent
variable in linear equation form. Symbolically, we write the sample regression
line as follows:
\[ \hat{y} = b_0 + b_1 x_1 \]
where
\(\hat{y}\) is the estimate of the dependent variable (revenue),
\(x_1\) is the independent variable (% individual occupancy),
\(b_0\) and \(b_1\) are determined by the statistical least squares method; \(b_1\) is called the regression coefficient (slope)
and \(b_0\) is the constant term (intercept).

Historical Perspective

For background, the estimates of b0 and b1 obtained by the least squares method
are called ‘Best Linear Unbiased Estimates’ (BLUE). This result is due to
Gauss and Markov (the Gauss-Markov theorem) and was established in the context of
General Linear Models, which cover Multiple Linear Regression as well.

Values of b0 and b1 in the case of the simple linear regression model

\[ \sum y = n b_0 + b_1 \sum x_1 \]

\[ \sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 \]

Here n denotes the sample size.

Solving these two normal equations,

\[ b_1 = \frac{\sum (x_1 - \bar{x}_1)(y - \bar{y})}{\sum (x_1 - \bar{x}_1)^2} \]

\[ b_0 = \bar{y} - b_1 \bar{x}_1 \]

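As a brief sketch of where the two normal equations come from (standard least squares reasoning that the slide does not spell out; the symbol S for the sum of squared residuals is introduced here for illustration): the least squares method chooses b0 and b1 to minimize S, and setting each partial derivative to zero gives one normal equation.

```latex
S(b_0, b_1) = \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_{1i} \right)^2

\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_{1i} \right) = 0
\;\Longrightarrow\; \sum y = n b_0 + b_1 \sum x_1

\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} x_{1i} \left( y_i - b_0 - b_1 x_{1i} \right) = 0
\;\Longrightarrow\; \sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2
```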
Simple Linear Regression
To understand the details of simple regression, let us take the present problem, for
which the relevant data are given below (refer to file Hotel1.csv).
Revenue PercentOccupancy
514.44 65.70
463.12 61.10
598.18 78.20
454.92 65.40
453.80 63.50
502.23 70.60
626.26 81.20
498.70 72.00
514.46 72.90
623.29 81.70
454.77 62.10
385.57 53.40
You postulate the model for the population in the standard form as follows:
Y = β0 + β1X1

Y is the Revenue measured in $1000, β0 is the intercept and β1 is the slope
corresponding to the independent variable X1 (PercentOccupancy).

Scatter Diagram-Revenue versus Percent Occupancy

Simple Linear Regression
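The estimated regression model used to test the population model is
\[ \hat{y} = b_0 + b_1 x_1 \]
where \(\hat{y}\) is the estimated dependent variable (Revenue),
\(x_1\) is the independent variable (% Occupancy in the sample data), and
\(b_0\) and \(b_1\) are the intercept and slope determined by the statistical least squares method.

With Revenue in $1000 and \(x_1\) in percentage points, the fitted line for the Hotel1.csv data is
\[ \hat{y} = -60.3747 + 8.2317\,x_1 \]
so each additional percentage point of occupancy adds about 8.23 (roughly $8,232) to the predicted revenue.
If the % Occupancy is projected at 85%, then the predicted
Revenue (in $1000) upon substitution is about 639.32.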
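The fit can be checked directly from the twelve data points listed earlier. The following is a minimal sketch in plain Python, assuming Revenue is in $1000 and occupancy is in percentage points; the function name fit_simple_regression is made up for illustration and is not from the slides.

```python
# Hotel1.csv data as listed in the notes: revenue in $1000, occupancy in percent.
revenue = [514.44, 463.12, 598.18, 454.92, 453.80, 502.23,
           626.26, 498.70, 514.46, 623.29, 454.77, 385.57]
occupancy = [65.70, 61.10, 78.20, 65.40, 63.50, 70.60,
             81.20, 72.00, 72.90, 81.70, 62.10, 53.40]


def fit_simple_regression(x, y):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: forces the line through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_simple_regression(occupancy, revenue)
predicted_at_85 = b0 + b1 * 85.0
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, prediction at 85% = {predicted_at_85:.2f}")
```

The same two formulas apply to any paired sample; only the data lists change.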

Backtracking Ability of the Model

[Figure: actual revenue (red) versus revenue predicted by the model (blue)]

Part II-Multiple Linear Regression

Multiple Linear Regression
Multiple Linear Regression is an extension of the simple linear regression model
in which there is more than one independent variable. In the
present context of Sigma Property, we add one more independent variable,
namely % Group Occupancy.

You postulate the model for the population in the standard form as follows:

Y = β0 + β1X1 + β2X2

Y is the Revenue measured in $, β0 is the intercept, β1 is the slope
corresponding to the independent variable X1 (% Individual Occupancy) and β2 is the
slope corresponding to the independent variable X2 (% Group Occupancy).

Estimated Regression Model
\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \]

where

\(\hat{y}\) is the estimate of the dependent variable (revenue),
\(x_1\) is the independent variable (% individual occupancy),
\(x_2\) is the independent variable (% group occupancy), and
\(b_0\), \(b_1\) and \(b_2\) represent the intercept and the slopes of the two independent variables,
respectively.
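The notes do not list the group occupancy data, so the sketch below uses made-up numbers purely to illustrate how b0, b1 and b2 can be obtained. It assumes the usual least squares approach of solving the normal equations (X'X)b = X'y; the helper names and the data are hypothetical, and the revenue values are generated from a known relationship so the recovered coefficients can be checked.

```python
def solve_linear_system(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - known) / m[r][r]
    return z


def fit_multiple_regression(rows, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 + ... via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical (% individual occupancy, % group occupancy) pairs, not from the notes.
predictors = [(40.0, 20.0), (45.0, 15.0), (50.0, 25.0),
              (55.0, 10.0), (60.0, 20.0), (65.0, 12.0)]
# Revenue built from an assumed exact relationship, so the fit should recover it.
revenue = [10.0 + 6.0 * x1 + 3.0 * x2 for x1, x2 in predictors]

b0, b1, b2 = fit_multiple_regression(predictors, revenue)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

With real data the fit will not be exact, and a statistical package would normally be used instead of hand-rolled elimination.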

Linear Programming

Definition of LPP

Why use Linear Programming?

Characteristics of a Linear Programming Problem

Graphical Representation of LP

Example 2

"List five areas of application of Linear Programming (LP)
and discuss the usefulness of LP in three of these areas."

• Linear Programming is basically a planning
tool for finding the best solution under
constraints or limiting factors. The main
objective of a linear programming problem is
to maximize or to minimize some numerical
value.
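To make "maximize some numerical value under constraints" concrete, here is a small sketch of a two-product mix problem. All numbers are hypothetical. Rather than the simplex method, it uses the corner-point idea behind the graphical method: for a bounded two-variable LP an optimum lies at a corner of the feasible region, so the code intersects every pair of constraint boundaries, keeps the feasible corners, and picks the most profitable one.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
# x and y are units of two products; the resource limits are assumed values.
constraints = [
    (1.0, 1.0, 40.0),   # assumed machine-hours limit
    (2.0, 1.0, 60.0),   # assumed labour-hours limit
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]
profit = (40.0, 30.0)   # assumed profit per unit of x and y


def intersection(c1, c2):
    """Point where two constraint boundaries meet, or None if they are parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    # Cramer's rule for the 2x2 system.
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def is_feasible(point, tol=1e-9):
    """True if the point satisfies every constraint."""
    x, y = point
    return all(a * x + b * y <= c + tol for a, b, c in constraints)


def objective(point):
    return profit[0] * point[0] + profit[1] * point[1]


corners = []
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and is_feasible(point):
        corners.append(point)

best = max(corners, key=objective)
best_profit = objective(best)
print(f"best plan: x = {best[0]:.1f}, y = {best[1]:.1f}, profit = {best_profit:.1f}")
```

Corner enumeration is only practical for tiny problems; larger models are solved with the simplex method or other LP solvers.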

Various Applications of this technique
• 1. Industrial Applications:
– Product Mix problems
– Blending problems
– Production scheduling
– Assembly line balancing
– Inventory management

• 2. Management applications
– Media selection problems
– Portfolio selection
– Profit/Sales Maximization
– Transportation problems

• 3. It can also be used to solve diet problems, flight scheduling,
agriculture problems and many more.

• Linear Programming is a mathematical optimization tool for finding optimal
solutions to real-world problems in which the variables are linearly
related to each other and are subject to a set of constraints. In simple
terms, a business scenario can involve many parameters, with the aim of
minimizing loss or cost or of maximizing profit. Such problems can be formulated as a
Linear Programming model by identifying the decision variables and
constraints involved, and then solved using the simplex method.

• The various industry applications of this technique are:

• Advertising: Maximizing the reach of an advertisement based on
slot availability, the cost to advertise in each slot and the budget
available for advertising the product.

• Manufacturing: Maximizing the profit earned on products manufactured,
based on the available raw materials and the cost to manufacture
each product, including the cost of labour and of advertising each product.

• Petroleum Industry: Maximizing the number of barrels sold to gain
optimum profit, based on the combinations of crude oil to be used and
subject to constraints on parameters like hydrocarbon and sulphur content per
unit of petroleum.

• Resource Management (Shift Scheduling): Finding the
optimum number of resources to be employed at a given time
to minimize cost and maximize profit, based on the
shift allowance per employee, the project budget for the
week/month and the number of leaves to be given per
week to each employee.

• Transportation: Finding the optimum route or number of
drivers to be utilized based on peak-time traffic conditions,
the cost of petrol and time taken to transport for a given route, and
the profit gained per shipment.

• Portfolio Optimization: Finding the optimum portfolio to
invest in, given a budget allocated for investment, based on
the % growth of the shares and the dividends received per dollar
invested.

• Civil Engineering Applications: LP has been adopted in many
construction planning tasks, such as steel cutting, template building and earthwork
blending. It can be used to optimize the use of
construction equipment and so help yield higher profits.
• Cost management in public transportation can be optimized using LP.
For example, in Pune's local bus services there are a few routes on which
multiple buses are run daily but all of them are overcrowded.
Similarly, there are a few routes on which the buses are comparatively less
crowded. During certain peak hours, a few buses from less crowded routes can
be diverted to the heavily crowded routes. This allocation can be optimized
using LP.
• Managing the shelf life of any perishable product is always a challenge in
demand and supply. With the help of LP, the supply of a product to the stores
in a given area can be determined from the demand for that product in
that area.
• LP can be used to determine the optimal utilization of manpower in
any operation. Companies can identify the proper training and
development of the current workforce based on operational requirements;
this will help the company increase each person's productivity and
make the workforce more skilful and accurate in handling the
work.

Introduction to Machine Learning

Learning from Data
• Can we learn about the world around us using
data?
• Model building from data
– Take data as input
– Find patterns in the data
– Summarize the pattern in a mathematically precise
way
• Machine learning automates this model building.
The Challenge
• Data unfortunately contains noise. If not,
machine learning would be trivial!
• Think of Data = Information + Noise
• The challenge is to identify the information
content and distill away the noise.
• To help do this, machine learning uses a train and
test approach.
Overfitting vs underfitting
• If the model we finish with ends up
– modeling the noise as well, we call it “overfitting”:
bad for prediction!
– not modeling all the information, we call it “underfitting”:
bad for prediction!
• The hope is that the model that does best on the
testing data manages to capture all the
information but leaves out all the noise.
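The train and test idea can be sketched with three toy models on synthetic data. Everything here is an assumption made for illustration: the data (a straight line plus Gaussian noise), the even/odd split, and the three candidate models (a mean-only model that tends to underfit, a least squares line, and a 1-nearest-neighbour model that memorises the training points and so tends to overfit).

```python
import random

random.seed(0)
# Assumed toy data: information (y = 2x + 1) plus Gaussian noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 3.0) for x in xs]

# Simple deterministic split: even positions for training, odd for testing.
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 2 == 0]
test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 2 == 1]


def mean_model(data):
    """Underfitting candidate: ignores x and always predicts the mean of y."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def linear_model(data):
    """Least squares straight line."""
    n = len(data)
    x_bar = sum(x for x, _ in data) / n
    y_bar = sum(y for _, y in data) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in data)
          / sum((x - x_bar) ** 2 for x, _ in data))
    b0 = y_bar - b1 * x_bar
    return lambda x: b0 + b1 * x


def nearest_neighbour_model(data):
    """Overfitting candidate: returns the y of the closest training x, noise included."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]


def mse(model, data):
    """Mean squared error of a model on a data set."""
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)


results = {}
for name, builder in [("mean", mean_model), ("linear", linear_model),
                      ("1-NN", nearest_neighbour_model)]:
    model = builder(train)
    results[name] = (mse(model, train), mse(model, test))
    print(f"{name:7s} train MSE = {results[name][0]:8.2f}   test MSE = {results[name][1]:8.2f}")
```

The memorising model always scores a perfect zero on its own training data, which is exactly why performance is judged on the held-out test data instead.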
Machine Learning tasks
1. Supervised learning: Building a mathematical model using
data that contains both the inputs and the desired outputs
(ground truth).
– Examples:
• Determining if an image has a horse. The data would include images
with and without the horse (the input), and for each image we would
have a label (the output) indicating if there is a horse in that image.
• Determining if a client might default on a loan
• Determining if a call center employee is likely to quit
– Since we have desired outputs, model performance can be
evaluated by comparisons.
Machine Learning Tasks
2. Unsupervised learning: Building a mathematical model
using data that contains only inputs and no desired outputs.
– Used to find structure in the data, like grouping or clustering of
data points, in order to discover patterns and group the inputs into
categories.
– Example: an advertising platform segments the population into
smaller groups with similar demographics and purchasing
habits, helping advertisers reach their target market with
relevant ads.
– Since no labels are provided, there is no specific way to compare
model performance in most unsupervised learning methods.
Tools and techniques
• Supervised learning
– Regression: desired output is a continuous number
– Classification: desired output is a category
• Unsupervised learning
– Clustering: Grouping data
– Dimensionality reduction: Compressing data
– Association rule learning: If X then Y
Intro to Clustering

Clustering
• Clustering is an Unsupervised Learning Technique
• A Cluster: collection of objects that are similar
• Objective is to group similar data points into a group
– Segmenting customers into similar groups
– Automatically organizing similar files/emails into folders
• Simplifies data by reducing many data points into a few
clusters
Distance
• To define “similarity” you need a measure of
distance
• Examples of common distance measures
– Euclidean Distance
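As a short illustration, Euclidean distance is the square root of the sum of squared differences across all features. The function name and the two sample points below are made up for the example.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of features."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Two hypothetical objects described by two features each.
d = euclidean_distance((3.0, 4.0), (0.0, 0.0))
print(d)
```

Because every feature contributes in its own units, features measured on larger scales dominate the distance unless the data are standardized first.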
Types of Clustering
1. Connectivity based clustering (Hierarchical clustering): based on the idea that related
objects are closer to each other. This lets us create a hierarchy of clusters/groups.
– Useful when you want flexibility in how many clusters you ultimately want. For
example, imagine grouping items on an online marketplace like Etsy or Amazon.
– In terms of outputs from the algorithm, in addition to cluster assignments you also
build a tree (dendrogram) that tells you about the hierarchies between the
clusters. You can then pick the number of clusters you want from this tree.
– In a dendrogram, the y-axis marks the distance at which the clusters merge, while
the objects are placed along the x-axis.
– Algorithms can be agglomerative (start with each object as its own cluster and merge them into
larger clusters) or divisive (start with the complete data as one cluster and divide it into partitions).
Types of Clustering
2. Centroid based clustering (e.g. K-Means clustering):
The objective is to find K clusters/groups. Each
group is defined by creating a centroid for
it. The centroids are like the heart of the
cluster: they “capture” the points closest to them and
add them to the cluster.
– A large K produces smaller groups and a small K
produces larger groups
– K-Means uses Euclidean distances and is the most
popular
– Other variants like K-medians and K-medoids use
other distance measures
Clustering

Data we will work with
– Customer Spend Data
• AVG_Mthly_Spend: The average monthly amount spent by customer
• No_of_Visits: The number of times a customer visited in a month
• Item Counts: Count of Apparel, Fruits and Vegetable, Staple Items
purchased

• Can we cluster similar customers together?


Connectivity Based: Hierarchical Clustering

• Hierarchical Clustering techniques create clusters
in a hierarchical tree-like structure
• Any type of distance measure can be used as a
measure of similarity
• The cluster-tree output is called a Dendrogram
• Techniques either start with individual objects
and sequentially combine them (Agglomerative),
or start from one cluster of all objects and
sequentially divide them (Divisive)
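The agglomerative idea can be sketched in a few lines. The slides do not say how the distance between two clusters is measured, so this example assumes single linkage (the distance between the closest pair of members) with Euclidean distance; the function name and the five points are hypothetical.

```python
import math


def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns the final clusters (lists of point indices) and the merge history
    as (distance, merged members) pairs, which is what a dendrogram draws.
    """
    clusters = [[i] for i in range(len(points))]  # start: every object alone
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, sorted(clusters[a] + clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


# Hypothetical objects with two features each.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 5.5), (10.0, 10.0)]
clusters, merges = single_linkage(points, n_clusters=3)
print(clusters)
for distance, members in merges:
    print(f"merged {members} at distance {distance:.2f}")
```

Stopping at a different n_clusters corresponds to cutting the dendrogram at a different height.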
Distance between objects
Centroid based: K-Means Clustering
• K-Means is probably the most used clustering technique
• Aims to partition the n observations into k clusters so as to
minimize the within-cluster sum of squares (i.e. variance).
• Computationally less expensive compared to hierarchical
techniques.
• Have to pre-define K, the number of clusters

Choosing the optimal K
• Usually subjective, based on striking a good
balance between compression and accuracy
• The “elbow” method is commonly used
Lloyd’s algorithm
1. Assume K centroids.
2. Compute the squared Euclidean distance of each object from these K
centroids. Assign each object to the closest centroid, forming clusters.
3. Compute the new centroid (mean) of each cluster based on the
objects assigned to that cluster.
4. Repeat steps 2 and 3 till convergence, usually defined as the point at which
there is no movement of objects between clusters.
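The four steps map almost line for line onto code. Below is a minimal sketch in plain Python; the six points and the two starting centroids are hypothetical, and real implementations usually choose starting centroids at random and rerun several times.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(initial_centroids)          # Step 1: assumed K centroids
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each object to its closest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Step 4: stop when no object changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each centroid to the mean of its assigned objects.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


def within_cluster_ss(points, centroids, assignment):
    """Within-cluster sum of squares, the quantity K-Means tries to minimize."""
    return sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignment))


# Hypothetical objects with two features each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = lloyd_kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
wcss = within_cluster_ss(points, centroids, assignment)
print(assignment, centroids, round(wcss, 3))
```

Running this for several values of K and plotting the within-cluster sum of squares against K is the basis of the “elbow” method.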
TIME SERIES FORECASTING

VISUALIZING TIME SERIES COMPONENTS

Steps in Forecasting
1. Problem definition
2. Gathering information
3. Preliminary (exploratory) analysis
4. Choosing and fitting models
5. Using and evaluating a forecasting model

The objective of this lesson is to explore several time series data
sets and apply visual methods using R to extract information.

Problem Definition
Time series forecasting involves
• Understanding the historical pattern of the data
• Using past knowledge to forecast the future

Before a forecasting problem is taken up, a decision needs to be
made regarding the forecast horizon.

Forecast Range
Different industries need different forecast ranges for different purposes.

Example: Airline industry: interested in passenger volume forecasts.
Passenger volume is the driving force behind all its operations.

• Long-term forecast: 5-10 years
– Required for strategic decision making
– Acknowledging limited reliability of these forecasts
• Mid-term forecast: 2-5 years
– Manpower hiring
– Decision on addition/alteration in new and existing routes
• Short-term forecast: 2 weeks – 6 months
– Manpower rostering
– Dynamic pricing

Forecast Range
• Supply Chain: Responds to customer demand
– A very long range forecast will not serve the purpose well
– In addition to taking into account the past demand, lead time and
planned advertising and other marketing activity must be
incorporated into the forecast horizon

• Contract Research Organization doing clinical trials
– 2000 trials running simultaneously across the world
– Need to forecast monthly for each of some 5000 items required
for trials for the next 6 months

Gathering Information
Historical data are required for future prediction.

If the volume of data is limited, forecasts will not
be reliable enough.

If data are available from the very distant past, the oldest data may
not be useful at all.

Example: Clay Brick Production

Series not stable

Use stable part for forecast

Example: Crest Toothpaste
[Figure: Weekly Market Share (vertical axis 0 to 0.6) plotted by week, from 0001W07 to 0006W13]
Use later part only for forecast

Components of Time Series
Graphs highlight the variety of patterns inherent in a time series (TS).

A TS can be split into several components, each representing one of the
underlying categories of patterns.

Time Series Components

▪ Systematic component
– Trend
– Seasonal component
– Cyclic component
▪ Irregular component (Error or Random Component)

Trend
• Long term movement of a series: either increasing or
decreasing

Example: Demand of Bricks
[Figure: monthly demand for bricks, 1988-01 to 1991-12 (vertical axis 75 to 215), with low demand marked in January and high demand marked in May]
Across each year, demand for bricks follows a
repetitive pattern.

In a particular month (Jan), demand is the
lowest.

In some other months, demand fluctuates.

Seasonality
• Represents intra-year stable fluctuations that repeat year
after year with respect to timing, direction and magnitude
• Normal variations that recur every year to the
same extent
• A yearly series does not have seasonality

Seasonality
• Demand for winter clothes
• Airlines and train ticket demands
• Incidence of influenza or vector-borne
diseases

Stock prices typically will not show any
seasonal pattern

Example: Sale of Shoes

2011-13 demand increasing
2013-15 stable demand
2015 onwards demand declining

Cyclical Component
• In addition to the within-year stable fluctuation,
demand for this particular style of shoes shows an
increase over the years for a period and then a decrease

Systematic Components
• Trend, Seasonality, Cyclicality are part of
systematic component
• These patterns are interpretable
• These can be estimated
• Forecast of time series involves estimation
and extrapolation of these components

We focus on Trend and Seasonality only
Irregular Component
The error or variability associated with the series is the Irregular
component.

This component is a random component.

The part of the series that cannot be explained through the Systematic
component forms the Irregular Component.

Other names for this component are Error and White Noise.

This component is assumed to have a normal distribution with mean 0 and
constant variance σ².
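The split into systematic and irregular parts can be illustrated with a small decomposition sketch. It assumes an additive model (series = trend + seasonal + irregular), which the slides do not state, and uses a centred moving average to estimate the trend. The series is synthetic and noise-free, built from a known trend and a known quarterly pattern, so the recovered components can be checked against the truth.

```python
def centered_moving_average(series, period):
    """Trend estimate via a centred moving average (even period assumed)."""
    half = period // 2
    trend = [None] * len(series)       # undefined at both ends of the series
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # End points get half weight so the window covers exactly one full period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def seasonal_indices(series, trend, period):
    """Average detrended value for each position in the seasonal cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for t, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            sums[t % period] += value - level
            counts[t % period] += 1
    return [s / c for s, c in zip(sums, counts)]


period = 4                                         # assumed quarterly data
true_trend = [100.0 + 2.0 * t for t in range(16)]  # assumed linear trend
true_seasonal = [5.0, -3.0, -6.0, 4.0]             # assumed pattern, sums to zero
series = [true_trend[t] + true_seasonal[t % period] for t in range(16)]

trend = centered_moving_average(series, period)
seasonal = seasonal_indices(series, trend, period)
# Whatever trend and seasonality do not explain is the irregular component.
irregular = [y - tr - seasonal[t % period]
             for t, (y, tr) in enumerate(zip(series, trend)) if tr is not None]
print([round(s, 3) for s in seasonal])
```

With real data the irregular values would not be zero; checking that they look like random noise around 0 with roughly constant spread is one way to judge whether the systematic components have been captured.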
