Professional Documents
Culture Documents
Predictive Analytics: QM901.1x Prof U Dinesh Kumar, IIMB
Predictive Analytics: QM901.1x Prof U Dinesh Kumar, IIMB
1x
Prof U Dinesh Kumar, IIMB
- Lao Tzu
Forecasting
Prof U Dinesh Kumar, IIMB
Forecasting methods
• Qualitative Techniques.
– Expert opinion (or Astrologers)
• Quantitative Techniques.
– Time series techniques such as exponential smoothing
• Casual Models.
– Uses information about relationship between system elements
(e.g regression).
Trend Cyclical
Seasonal Irregular
Trend Component
Cyclical Component
Prof U Dinesh Kumar, IIMB
Seasonal Component
Prof U Dinesh Kumar, IIMB
Seasonal Vs Cyclical
Prof U Dinesh Kumar, IIMB
• When a cyclical pattern in the data has a period of less than one
year, it is referred as seasonal variation.
• When the cyclical pattern has a period of more than one year we
refer to it as cyclical variation.
Irregular Component
Prof U Dinesh Kumar, IIMB
• Erratic fluctuations
• White Noise
Demand
Demand
year gap
Random
movement
Time Time
(a) Trend (b) Cycle
Less
than
Demand
Demand
one
year gap
Time Time
(c) Seasonal pattern (d) Trend with seasonal pattern
Seasonalit
Trend y Cyclical
Random
Yt Tt St Ct Rt
Seasonalit
Trend y Cyclical
Random
Yt Tt St Ct Rt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
Seasonalit
Trend y
Yt Tt St
Yt
St
Tt Yt / Tt is called
deseasonalized data
• Moving Average.
• Exponential Smoothing.
Data
• The forecast for period t+1 (Ft+1) is given by the average of the ‘n’ most
recent data.
1 t
Ft 1 Yi
n i t n 1
• The forecast for period t+1 (Ft+1) is given by the average of the ‘n’ most
recent data.
1 t
Ft 1 Yi
n i t n 1
Moving Average
60
50
40
30
20
10
0
1 t
Ft 1 Yi
7 i t 7 1
Et = Ft - Yt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB
Exponential Smoothing
Prof U Dinesh Kumar, IIMB
Exponential Smoothing
• Smoothing Equations
Ft 1 * Yt (1 ) * Ft
F1 Y1
• Smoothing Equations
60
Demand Exponential Smoothing Forecast
50
40
30
20
10
0
18-09-2014 08-10-2014 28-10-2014 17-11-2014 07-12-2014 27-12-2014 16-01-2015 05-02-2015
Ft 1 0.8098 * Yt (1 0.8098) * Ft
Exponential Smoothing
= 0.8098
Choice of
Optimal
Prof U Dinesh Kumar, IIMB
n
Y F
2
Min t t / n
t 1
Ft Yt 1 (1 ) Ft 1
Holt’s method
Prof U Dinesh Kumar, IIMB
• Holt’s method can be used to forecast when there is a linear trend present
in the data.
• The method requires separate smoothing constants for slope and intercept.
Holt’s Method
Prof U Dinesh Kumar, IIMB
Equation for
intercept or
• Holt’s Equations
level
(i ) Lt Yt (1 ) ( Lt 1 Tt 1 )
(ii ) Tt ( Lt Lt 1 ) (1 ) Tt 1
Equation for
Slope (Trend)
• Forecast Equation
Ft 1 Lt Tt
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
• T1 can be set to any one of the following values (or use regression to get initial values):
T1 (Y 2Y1 )
T1 (Y2 Y1 ) (Y3 Y2 ) (Y4 Y3 ) / 3
T1 (Yn Y1 ) /(n 1)
50
40
30
20
10
0
0 20 40 60 80 100 120 140
Demand Forecast
= 0.8098; = 0.05
• The forecast error is the difference between the forecast value and the actual value for
the corresponding period.
E t Yt Ft
E t Forecast error at period t
Yt Actual value at time period t
Ft Forecast for time period t
n
2
(Yt 1 Ft 1 )
U t 1
n
2
(Yt 1 Yt )
t 1
The value U is the relative forecasting power of the method against naïve technique. If U < 1,
the technique is better than naïve forecast If U > 1, the technique is no better than the naïve
forecast.
Method U
Theil’s Coefficient
2
n Ft 1 Yt 1
n -1
t t
Y F 2
t 1
U1 t 1
, U2
Yt
2
n n
Yt 1 Yt
t t
n -1
2 2
Y F
t 1 t 1 t 1 Yt
U1 is bounded between 0 and 1, with values closure to zero indicating greater accuracy.
Seasonal Effect
• In seasonal effect the time period must be less than one year, such as, days,
weeks, months or quarters.
Seasonal effect
Prof U Dinesh Kumar, IIMB
• Seasonal effect can be eliminated from the time-series. This process is called
deseasonalization or seasonal adjusting.
Seasonal Adjusting
Seasonal Index
Prof U Dinesh Kumar, IIMB
• Seasonal index for day i (or month i) is the ratio of daily average of day i (or
month i) to the average of daily (or monthly) averages times 100.
Seasonality Index
Prof U Dinesh Kumar, IIMB
Deseasonalized Data
Prof U Dinesh Kumar, IIMB
Trend
Prof U Dinesh Kumar, IIMB
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.268288
R Square 0.071978
Adjusted R Square 0.036285
Standard Error 3.715395
Observations 28
ANOVA
df SS MS F Significance F
Regression 1 27.83729 27.83729 2.016587 0.167469
Residual 26 358.9082 13.80416
Total 27 386.7455
Forecast
Prof U Dinesh Kumar, IIMB
Ft = Tt x St
Y29 = 46
Auto-correlation
• Auto correlation is a correlation of a variable observed at two time
points (e.g. Yt and Yt-1 or Yt and Yt-3).
Auto-Correlation
Prof U Dinesh Kumar, IIMB
Breakfast
Partial Auto-correlation
Prof U Dinesh Kumar, IIMB
Partial Auto-Correlation
Prof U Dinesh Kumar, IIMB
• For any k, reject H0 if | pk| > 1.96/√n. Where n is the number of observations.
Stationarity
Prof U Dinesh Kumar, IIMB
process
Non-stationary Stationary
© All Rights Reserved, Indian Institute of Management Bangalore
Predictive Analytics : QM901.1x
White Noise
Prof U Dinesh Kumar, IIMB
Auto-Regression
Prof U Dinesh Kumar, IIMB
• That is, we use Yt as the response variable and Yt-1, Yt-2 etc. as
explanatory variables.
Yt Yt 1 t
n n
t
2
Yt Yt 1 2
t 2 t 2
n
Yt Yt 1
OLS
t 2 Estimate
n
2
t 1
Y
t 2
AR(1) Process
Prof U Dinesh Kumar, IIMB
Yt = Yt-1 + t
Yt = (Yt-2 + t-1) + t
…………………………………………………………
t 1
Yt X 0 i t i
t
i 0
• Assume {Yt} is purely random with mean zero and constant standard
deviation (White Noise).
Partial Auto-Correlation
Prof U Dinesh Kumar, IIMB
Breakfast
First two
PAC are
different
from
zero.
Breakfast
Data
MAPE = 8.8%
Yt = (Xt - )
Yt = 0 + 1 t-1 + t
We have.,
Yt 0 1 t 2 t 2 ... q t q t
MA(1) Spike at lag 1 then cuts of to zero. Spike Exponential decay. On negative side if
positive if 1 > 0 and negative side if 1> 0 on positive side if 1< 0.
1 < 0.
MA(q) Spikes at lags 1 to q, then cuts off to zero. Exponential decay or sine wave. Exact
pattern depends on signs of 1, 2 etc.
Breakfast
SPSS Output
Prof U Dinesh Kumar, IIMB
Residual Plot
Prof U Dinesh Kumar, IIMB
ARMA(p,q) Model
Prof U Dinesh Kumar, IIMB
AR(p)
Model
MA(q)
Model
• ARMA(p,q) models are not easy to identify. We usually start with pure
AR and MA process. The following thump rule may be used.
ACF PACF
MAPE = 8.8%
AIC = -2LL + 2m
Where m is the number of variables estimated in the model
AIC and BIC can
Bayesian Information Criteria be interpreted as
distance from
BIC = -2LL + m ln(n) true model
Where m is the number of variables estimated in the model and n is the number of
observations
ARIMA
Prof U Dinesh Kumar, IIMB
Integration (d)
Prof U Dinesh Kumar, IIMB
ARIMA (p, d, q)
Prof U Dinesh Kumar, IIMB
Differencing
Prof U Dinesh Kumar, IIMB
ARIMA(p,1,q) Process
Prof U Dinesh Kumar, IIMB
X t 0 1 X t 1 2 X t 2 ... p X t p
0 1 t 1 2 t 2 ... q t q t
Where Xt = Yt – Yt-1
rk2
m
Qm n(n 2)
k 1 n k
n = number of observations in the time series.
k = particular time lag checked
m = the number of time lags to be tested
rk = sample autocorrelation function of the kth residual term.
Box-Jenkins Methodology
Prof U Dinesh Kumar, IIMB
• Identification: Identify the ARIMA model using ACF & PACF plots. This
would give the values of p, q and d.
• Diagnostics: Check the residual for any issue such as not providing White
Noise.