Download as pdf or txt
Download as pdf or txt
You are on page 1of 116

Predictive Analytics : QM901.

1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Those who have knowledge don’t predict.


Those who predict don’t have knowledge.

- Lao Tzu

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

I think there is a world market for may be 5 computers


- Thomas Watson, Chairman of IBM 1943

Computers in future weigh no more than 1.5 tons


- Popular Mechanics, 1949

640K ought to be enough for everybody


- Bill Gates, 1981???

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Forecasting
Prof U Dinesh Kumar, IIMB

• Forecasting is a process of estimation of an unknown event/parameter


such as demand for a product.

• Forecasting is commonly used to refer time series data.

• Time series is a sequence of data points measured at successive time


intervals.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Corporate Prof U Dinesh Kumar, IIMB
Strategy

Business Product and Market Product and Market Strategy


Forecasting Planning

Aggregate Resource Planning -


Aggregate Production
Medium to Long Range
Forecasting Planning
Planning

Item Production Capacity - Short


Master Production
Forecasting Planning Range Planning

Demand Materials Capacity Requirement


forecasting at SKU Requirement Planning
Level Planning

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Forecasting methods

• Qualitative Techniques.
– Expert opinion (or Astrologers)

• Quantitative Techniques.
– Time series techniques such as exponential smoothing

• Casual Models.
– Uses information about relationship between system elements
(e.g regression).

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Why Time Series Analysis ?


Prof U Dinesh Kumar, IIMB

• Time series analysis helps to identify and explain:

– Any systemic variation in the series of data which is due to seasonality.


– Cyclical pattern that repeat.
– Trends in the data.
– Growth rates in the trends.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Time Series Components


Prof U Dinesh Kumar, IIMB

Trend Cyclical

Seasonal Irregular

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Trend Component

• Persistent upward or downward pattern

• Due to consumer behaviour, population, economy, technology etc.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Cyclical Component
Prof U Dinesh Kumar, IIMB

• Repeating up and down movements.

• Due to interaction of factors influencing economy such as recession.

• Usually 2-10 years duration.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Seasonal Component
Prof U Dinesh Kumar, IIMB

• Regular pattern of up and down movements.

• Due to weather, customs, festivals etc.

• Occurs within one year.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Seasonal Vs Cyclical
Prof U Dinesh Kumar, IIMB

• When a cyclical pattern in the data has a period of less than one
year, it is referred as seasonal variation.

• When the cyclical pattern has a period of more than one year we
refer to it as cyclical variation.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Irregular Component
Prof U Dinesh Kumar, IIMB

• Erratic fluctuations

• Due to random variation or unforeseen events

• White Noise

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
More
than
one

Demand
Demand
year gap

Random
movement

Time Time
(a) Trend (b) Cycle

Less
than

Demand
Demand

one
year gap

Time Time
(c) Seasonal pattern (d) Trend with seasonal pattern

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Time Series Data – Additive and Multiplicative Models


1. Additive Forecasting Model

 Seasonalit
Trend  y Cyclical
 Random

Yt  Tt  St  Ct  Rt

2. Multiplicative Forecasting Model

 Seasonalit
Trend  y Cyclical
 Random

Yt  Tt  St  Ct  Rt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Time Series Data Decomposition


Multiplicative Forecasting Model

 Seasonalit
Trend  y
Yt  Tt  St

Yt
St 
Tt Yt / Tt is called
deseasonalized data

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Time Series Techniques


Prof U Dinesh Kumar, IIMB

• Moving Average.

• Exponential Smoothing.

• Auto-regression Models (AR Models).

• ARIMA (Auto-regressive Integrated Moving Average) Models.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Moving Average Method

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Moving Average (Rolling Average)


Prof U Dinesh Kumar, IIMB

• Simple moving average.


– Used mainly to capture trend and smooth short term fluctuations.
– Most recent data are given equal weights.

• Weighted moving average


– Uses unequal weights for data

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Data

• Demand for continental breakfast at the Die Another Day Hospital.

• Daily data between 1 October 2014 – 23 January 2015 (115 days)

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Simple moving average


Prof U Dinesh Kumar, IIMB

• The forecast for period t+1 (Ft+1) is given by the average of the ‘n’ most
recent data.
1 t
Ft 1   Yi
n i  t  n 1

Ft 1  Forecast for period t  1

Yi  Data corresponding to time period i

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Simple moving average


Prof U Dinesh Kumar, IIMB

• The forecast for period t+1 (Ft+1) is given by the average of the ‘n’ most
recent data.
1 t
Ft 1   Yi
n i  t  n 1

Ft 1  Forecast for period t  1

Yi  Data corresponding to time period i

© All Rights Reserved, Indian Institute of Management Bangalore


Demand for Continental Breakfast at DAD Hospital – Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Moving Average
60
50
40
30
20
10
0

Actual Demand Forecast

1 t
Ft 1   Yi
7 i  t  7 1

© All Rights Reserved, Indian Institute of Management Bangalore


Measures of aggregate error
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Mean absolute error MAE 1 n


MAE   Et
n t 1

Mean absolute percentage 1 n Et


MAPE  
error MAPE n t 1 Yt

Mean squared error MSE 1 n 2


MSE   Et
n t 1

Root mean squared error RMSE 


1 n 2
 Et
RMSE n t 1

Et = Ft - Yt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

DAD Forecasting – MAPE and RMSE

Mean absolute percentage 0.1068 or 10.68%


error MAPE
Root mean squared error 5.8199
RMSE

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Exponential Smoothing
Prof U Dinesh Kumar, IIMB

• Form of Weighted Moving Average


– Weights decline exponentially.
– Largest weight is given to the present observation, less weight to
immediately preceding observation and so on.

• Requires smoothing constant ()


– Ranges from 0 to 1

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Exponential Smoothing

Next forecast =   (present actual value) + (1-)  present forecast

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Simple Exponential Smoothing Equations


Prof U Dinesh Kumar, IIMB

• Smoothing Equations

Ft 1   * Yt  (1   ) * Ft

F1  Y1

Ft+1 is the forecasted value at time t+1

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Simple Exponential Smoothing Equations


Prof U Dinesh Kumar, IIMB

• Smoothing Equations

Ft  Yt 1   (1   )Yt  2   (1   ) 2 Yt 3  ....

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Exponential Smoothing Forecast


Prof U Dinesh Kumar, IIMB

60
Demand Exponential Smoothing Forecast

50

40

30

20

10

0
18-09-2014 08-10-2014 28-10-2014 17-11-2014 07-12-2014 27-12-2014 16-01-2015 05-02-2015

Ft 1  0.8098 * Yt  (1  0.8098) * Ft

© All Rights Reserved, Indian Institute of Management Bangalore


DAD Forecasting – MAPE and RMSE
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Exponential Smoothing

Mean absolute percentage 0.0906 or 9.06%


error MAPE
Root mean squared error 5.3806
RMSE

 = 0.8098

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Choice of 

For “smooth” data, try a high value of a , forecast responsive to most


current data.

For “noisy” data try low a  forecast more stable—less responsive

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Optimal 
Prof U Dinesh Kumar, IIMB

 n 
 Y  F 
2
Min  t t / n

 t 1 

Ft  Yt 1  (1   ) Ft 1

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Double exponential Smoothing – Holt’s model

Simple exponential smoothing may produce consistently biased forecasts in


the presence of a trend.

Holt's method (double exponential smoothing) is appropriate when demand


has a trend but no seasonality.

Systematic component of demand = Level + Trend

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Holt’s method
Prof U Dinesh Kumar, IIMB

• Holt’s method can be used to forecast when there is a linear trend present
in the data.

• The method requires separate smoothing constants for slope and intercept.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Holt’s Method
Prof U Dinesh Kumar, IIMB

Equation for
intercept or
• Holt’s Equations
level

(i ) Lt    Yt  (1   )  ( Lt 1  Tt 1 )
(ii ) Tt    ( Lt  Lt 1 )  (1   )  Tt 1
Equation for
Slope (Trend)
• Forecast Equation

Ft 1  Lt  Tt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x

Initial values of Lt and Tt


Prof U Dinesh Kumar, IIMB

• L1 is in general set to Y1.

• T1 can be set to any one of the following values (or use regression to get initial values):

T1  (Y 2Y1 )
T1  (Y2  Y1 )  (Y3  Y2 )  (Y4  Y3 ) / 3
T1  (Yn  Y1 ) /(n  1)

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
60

50

40

30

20

10

0
0 20 40 60 80 100 120 140

Demand Forecast

Double exponential smoothing


 = 0.8098;  = 0.05
© All Rights Reserved, Indian Institute of Management Bangalore
DAD Forecasting – MAPE and RMSE
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Double Exponential Smoothing

Mean absolute percentage 0.0930 or 9.3%


error MAPE
Root mean squared error 5.5052
RMSE

 = 0.8098;  = 0.05

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Forecasting Accuracy Prof U Dinesh Kumar, IIMB

• The forecast error is the difference between the forecast value and the actual value for
the corresponding period.

E t  Yt  Ft
E t  Forecast error at period t
Yt  Actual value at time period t
Ft  Forecast for time period t

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Measures of aggregate error


Prof U Dinesh Kumar, IIMB

Mean absolute error MAE 1 n


MAE   Et
n t 1

Mean absolute percentage 1 n Et


MAPE  
error MAPE n t 1 Yt

Mean squared error MSE 1 n 2


MSE   Et
n t 1

Root mean squared error 1 n 2


RMSE   Et
RMSE n t 1

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Forecasting Power of a Model

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Theil’s coefficient (U Statistic)


Prof U Dinesh Kumar, IIMB

n
2
 (Yt 1  Ft 1 )
U  t 1
n
2
 (Yt 1  Yt )
t 1

The value U is the relative forecasting power of the method against naïve technique. If U < 1,
the technique is better than naïve forecast If U > 1, the technique is no better than the naïve
forecast.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Theil’s coefficient for DAD Hospital

Method U

Moving Average with 7 periods 1.221

Exponential Smoothing 1.704

Double exponential Smoothing 1.0310

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Theil’s Coefficient

2
n  Ft 1  Yt 1 
n -1

 t t
Y  F 2

t 1 
 
U1  t 1
, U2 
Yt 
2
n n
 Yt 1  Yt 
t   t
n -1

  
2 2
Y F
t 1 t 1 t 1  Yt 

U1 is bounded between 0 and 1, with values closure to zero indicating greater accuracy.

If U2 = 1, there is no difference between naïve forecast and the forecasting technique


If U2 < 1, the technique is better than naïve forecast
If U2 > 1, the technique is no better than the naïve forecast.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Seasonal Effect

• Seasonal effect is defined as the repetitive and predictable pattern of data


behaviour in a time-series around the trend line.

• In seasonal effect the time period must be less than one year, such as, days,
weeks, months or quarters.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Seasonal effect
Prof U Dinesh Kumar, IIMB

• Identification of seasonal effect provides better understanding of the time


series data.

• Seasonal effect can be eliminated from the time-series. This process is called
deseasonalization or seasonal adjusting.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Seasonal Adjusting

Seasonal adjustment in multiplicative model


Tt St Yt
Seasonal effect   100
Tt Tt
Seasonal adjustment in additive model
Yt  St  Tt  St  St  Tt

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Seasonal Index
Prof U Dinesh Kumar, IIMB

• Method of simple averages

• Ratio-to-moving average method

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Method of simple averages


Prof U Dinesh Kumar, IIMB

• Average the unadjusted data by period ( for example daily or monthly).

• Calculate the average of daily (or monthly) averages.

• Seasonal index for day i (or month i) is the ratio of daily average of day i (or
month i) to the average of daily (or monthly) averages times 100.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
Example: DAD Example - Demand for Continental Breakfast
5 October – 1 November 2014 Data

Week Week 2 Week 3 Week 4


DAY (5-11 October) (12-18 October) (19-25 OCT) (26 OCT - 1 NOV)
Sunday 41.00 40.00 40.00 41.00
Monday 30.00 44.00 43.00 41.00
Tuesday 40.00 49.00 41.00 40.00
Wednesday 40.00 50.00 46.00 43.00
Thursday 40.00 45.00 41.00 46.00
Friday 40.00 40.00 40.00 45.00
Saturday 40.00 42.00 32.00 45.00

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Seasonality Index
Prof U Dinesh Kumar, IIMB

Week1 Week 2 Week 3 Week 4 Daily Seasonality


DAY (5-11 October) (12-18 October) (19-25 OCT) (26 Oct-1 Nov) Average Index
Sunday 41.00 40.00 40.00 41.00 40.50 97.34%
Monday 30.00 44.00 43.00 41.00 39.50 94.94%
Tuesday 40.00 49.00 41.00 40.00 42.50 102.15%
Wednesday 40.00 50.00 46.00 43.00 44.75 107.55%
Thursday 40.00 45.00 41.00 46.00 43.00 103.34%
Friday 40.00 40.00 40.00 45.00 41.25 99.14%
Saturday 40.00 42.00 32.00 45.00 39.75 95.54%
41.61 700

Average of daily Total = Number of


averages seasons x 100

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Deseasonalized Data
Prof U Dinesh Kumar, IIMB

Date Demand Seasonal Index De-seasonalized Data


10/5/2014 41 97.34% 42.12
10/6/2014 30 94.94% 31.60
10/7/2014 40 102.15% 39.16
10/8/2014 40 107.55% 37.19
10/9/2014 40 103.34% 38.70
10/10/2014 40 99.14% 40.35
10/11/2014 40 95.54% 41.87

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Trend
Prof U Dinesh Kumar, IIMB

• Trend is calculated using regression on deseasonalized data.

• Deseasonalized data is obtained by dividing the actual data with its


seasonality index.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.268288
R Square 0.071978
Adjusted R Square 0.036285
Standard Error 3.715395
Observations 28

ANOVA
df SS MS F Significance F
Regression 1 27.83729 27.83729 2.016587 0.167469
Residual 26 358.9082 13.80416
Total 27 386.7455

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 39.81731 1.442768 27.59786 8.7E-21 36.85166 42.78296
Day 0.123437 0.086923 1.420066 0.167469 -0.05524 0.30211

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Forecast
Prof U Dinesh Kumar, IIMB

Ft = Tt x St

F29 = T29 x S29

F29 = (39.81 + 0.1234 x 29) x 0.9733 = 42.2372

Y29 = 46

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Forecasting using method of averages in the


presence of seasonality

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Auto-correlation
• Auto correlation is a correlation of a variable observed at two time
points (e.g. Yt and Yt-1 or Yt and Yt-3).

• Auto-correlation of lag k, k, is given by:


n    
  Yt  k  Y  Yt  Y 
t  k 1   
k  
n
2
 (Yt  Y )
t 1
n  total number of observations

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Auto-correlation Function (ACF)

• A k-period plot of autocorrelations is called autocorrelation


function (ACF) or a correlogram.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Auto-Correlation
Prof U Dinesh Kumar, IIMB

• Auto-correlation of lag k is auto-correlation between Yt and Yt+k.

• To test whether the autocorrelation at lag k is significantly different


from 0, the following hypothesis test is used:
H0: k = 0
HA:  k ≠ 0

• For any k, reject H0 if |  k| > 1.96/√n. Where n is the number of


observations.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ACF for Demand for Continental Prof U Dinesh Kumar, IIMB

Breakfast

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Partial Auto-correlation
Prof U Dinesh Kumar, IIMB

Partial auto correlation of lag k is an auto correlation between Yt and Yt-k


with linear dependence between the intermedia values (Yt-k+1, Yt-k+2, …, Yt-1)
removed.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Partial Auto-correlation Function

A k-period plot of partial autocorrelations is called partial


autocorrelation function (PACF).

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Partial Auto-Correlation
Prof U Dinesh Kumar, IIMB

• Partial auto-correlation of lag k is auto-correlation between Yt and Yt+k after the


removal of linear dependence of Yt+1 to Yt+k-1.

• To test whether the partial autocorrelation at lag k is significantly different from 0,


the following hypothesis test is used:
H0: pk = 0
HA: pk ≠ 0

• For any k, reject H0 if | pk| > 1.96/√n. Where n is the number of observations.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

PACF – Demand for Continental Breakfast at


DAD hospital

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Stationarity
Prof U Dinesh Kumar, IIMB

• A time series is stationary, if:

– Mean () is constant over time.


– Variance () is constant over time.
– The covariance between two time periods (Yt) and (Yt+k) depends
only on the lag k not on the time t.

We assume that the time series is stationary before


applying forecasting models

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ACF Plot of non-stationary and stationary


Prof U Dinesh Kumar, IIMB

process

Non-stationary Stationary
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x

White Noise
Prof U Dinesh Kumar, IIMB

• White noise is a data uncorrelated across time that follow normal


distribution with mean 0 and constant standard deviation .

• In forecasting we assume that the residuals are white Noise.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Residual White Noise

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Auto-Regression
Prof U Dinesh Kumar, IIMB

• Auto-regression is a regression of Y on itself observed at different


time points.

• That is, we use Yt as the response variable and Yt-1, Yt-2 etc. as
explanatory variables.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

AR(1) Parameter Estimation


Prof U Dinesh Kumar, IIMB

Yt  Yt 1   t
n n
 t
2
  Yt  Yt 1  2
t 2 t 2
n
  Yt Yt 1
OLS
 t 2 Estimate
n
2
 t 1
Y
t 2

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

AR(1) Process
Prof U Dinesh Kumar, IIMB

Yt = Yt-1 + t

Yt =  (Yt-2 + t-1) + t
…………………………………………………………

Yt = t X0 + t-1 1 + t-2 2 +…+ 1 t-1 + t

t 1
Yt   X 0    i  t  i
t
i 0

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Auto-regressive process (AR(p))


Prof U Dinesh Kumar, IIMB

• Assume {Yt} is purely random with mean zero and constant standard
deviation  (White Noise).

• Then the autoregressive process of order p or AR(p) process is

Yt  0  1Yt 1  2Yt  2  ...   pYt  p   t

AR(p) process models each future observation


as a function “p” previous observations.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Model Identification in AR(p)


Process

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Pure AR Model Identification


Prof U Dinesh Kumar, IIMB

Model ACF PACF


AR(1) Exponential Decay: Positive side if ø1 Spike at lag 1, then cuts off to zero.
> 0 and alternating in sign starting on Spike positive if ø1 > 0 and negative
negative side if ø1 < 0. side if ø1 < 0.

AR(p) Exponential decay: pattern depends on Spikes at lags 1 to p, then cuts of to


signs of ø1, ø2, etc zero.

Yt  0  1Yt 1  2Yt  2  ...   pYt  p   t

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Partial Auto-Correlation
Prof U Dinesh Kumar, IIMB

• Partial auto-correlation of lag k is auto-correlation between Yt and Yt+k after the


removal of linear dependence of Yt+1 to Yt+k-1.

• To test whether the autocorrelation at lag k is significantly different from 0, the


following hypothesis test is used:
H0: k = 0
HA:  k ≠ 0

• For any k, reject H0 if |  k| > 1.96/√n. Where n is the number of observations.

© All Rights Reserved, Indian Institute of Management Bangalore


PACF Function – DAD Continental Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Breakfast

First two
PAC are
different
from
zero.

© All Rights Reserved, Indian Institute of Management Bangalore


ACF Function - DAD Continental Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Breakfast

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Actual Vs Fit- Continental Breakfast


Prof U Dinesh Kumar, IIMB

Data

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
AR(2) Model

MAPE = 8.8%

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Residual White Noise


Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

(Yt – 41.252) = 0.663 (Yt-1 – 41.252) – 0.204 (Yt-2 – 41.252)

Yt = (Xt - )

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Moving Average Process


Prof U Dinesh Kumar, IIMB

• A moving average process is a time series regression model in which


the value at time t, Yt, is a linear function of past errors.

• First order moving average, MA(1), is given by:

Yt = 0 + 1 t-1 + t

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Moving Average Process MA(q)

{Yt} is a moving average process of order q (written MA(q)) if for some


constants 0, 1, . . . q

We have.,
Yt   0  1 t   2 t  2  ...   q t  q   t

MA(q) models each future observation as a function of


“q” previous errors

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Pure MA Model Identification


Prof U Dinesh Kumar, IIMB

Model ACF PACF

MA(1) Spike at lag 1 then cuts of to zero. Spike Exponential decay. On negative side if
positive if 1 > 0 and negative side if 1> 0 on positive side if 1< 0.
1 < 0.

MA(q) Spikes at lags 1 to q, then cuts off to zero. Exponential decay or sine wave. Exact
pattern depends on signs of 1, 2 etc.

© All Rights Reserved, Indian Institute of Management Bangalore


ACF Function - DAD Continental Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Breakfast

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

SPSS Output
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Actual Vs Fitted Value

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Residual Plot
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARMA(p,q) Model
Prof U Dinesh Kumar, IIMB

AR(p)
Model

Yt   0  1Yt 1   2Yt 2  ...   pYt  p 


  0  1 t 1   2 t 2  ...   q t q  t

MA(q)
Model

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARMA(p,q) Model Identification


Prof U Dinesh Kumar, IIMB

• ARMA(p,q) models are not easy to identify. We usually start with pure
AR and MA process. The following thump rule may be used.

Process ACF PACF

ARMA(p,q) Tails off after q lags Tails off to zero after p


lags

• The final ARMA model may be selected based on parameters such as


RMSE, MAPE, AIC and BIC.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

ACF PACF

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARMA(2,1) Model Output


Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


AR(2) Model
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

MAPE = 8.8%

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Residual White Noise


Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

Forecasting Model Evaluation


Akaike’s information criteria

AIC = -2LL + 2m
Where m is the number of variables estimated in the model
AIC and BIC can
Bayesian Information Criteria be interpreted as
distance from
BIC = -2LL + m ln(n) true model

Where m is the number of variables estimated in the model and n is the number of
observations

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Final Model Selection


Prof U Dinesh Kumar, IIMB

Model AR(2) MA(1) ARMA(2,1)


BIC 3.295 3.301 3.343

AR(2) has the


least BIC

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARIMA
Prof U Dinesh Kumar, IIMB

ARIMA has the following three components:

• Auto-regressive component: Function of past values of the time series.

• Integration Component: Differencing the time series to make it a stationary process.

• Moving Average Component: Function of past error values.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Integration (d)
Prof U Dinesh Kumar, IIMB

• Used when the process is non-stationary.

• Instead of observed values, differences between observed values are considered.

• When d=0, the observations are modelled directly. If d = 1, the differences


between consecutive observations are modelled. If d = 2, the differences of the
differences are modelled.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARIMA (p, d, q)
Prof U Dinesh Kumar, IIMB

• The q and p values are identified using auto-correlation function (ACF)


and Partial auto-correlation function (PACF) respectively. The value d
identifies the level of differencing.

• Usually p+q <= 4 and d <= 2.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Differencing
Prof U Dinesh Kumar, IIMB

• Differencing is a process of making a non-stationary process into


stationary process.

• In differencing, we create a new process Xt, where Xt = Yt – Yt-1.

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

ARIMA(p,1,q) Process
Prof U Dinesh Kumar, IIMB

X t   0  1 X t 1   2 X t  2  ...   p X t  p 
  0  1 t 1   2 t  2  ...   q t  q  t

Where Xt = Yt – Yt-1

© All Rights Reserved, Indian Institute of Management Bangalore


Ljung-Box Test
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

• Ljung-Box is a test on the autocorrelations of residuals. The test statistic is:

rk2
m
Qm  n(n  2)
k 1 n  k
n = number of observations in the time series.
k = particular time lag checked
m = the number of time lags to be tested
rk = sample autocorrelation function of the kth residual term.

H0: The model does not exhibit lack of fit


HA: The model exhibits lack of fit

Q statistic is approximate chi-square distribution with m – p – q degrees of freedom if ARMA orders


are correctly specified.
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x

AR(2) Ljung-Box Test


Prof U Dinesh Kumar, IIMB

P-value is 0.740, thus the model doesn’t show


lack of fit

© All Rights Reserved, Indian Institute of Management Bangalore


Predictive Analytics : QM901.1x

Box-Jenkins Methodology
Prof U Dinesh Kumar, IIMB

• Identification: Identify the ARIMA model using ACF & PACF plots. This
would give the values of p, q and d.

• Estimation: Estimate the model parameters (using maximum likelihood)

• Diagnostics: Check the residual for any issue such as not providing White
Noise.

© All Rights Reserved, Indian Institute of Management Bangalore

You might also like