Topic 3
FORECASTING
Rupika Abeynayake
Professor in Applied Statistics
Introduction
Frequently there is a time lag between awareness of an impending
event or need and the occurrence of that event.
Mean percentage error (MPE): $\mathrm{MPE} = \frac{1}{n}\sum_{t=1}^{n} PE_t$
Mean absolute percentage error (MAPE): $\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} |PE_t|$
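The two measures can be sketched in a few lines of Python. The text does not define PE_t, so this sketch assumes the common definition PE_t = 100 (Y_t - F_t) / Y_t, where Y_t is the actual value and F_t the forecast; the function names and sample numbers are illustrative only.

```python
def percentage_errors(actual, forecast):
    """PE_t = 100 * (Y_t - F_t) / Y_t (assumed definition; Y_t must be non-zero)."""
    return [100.0 * (y - f) / y for y, f in zip(actual, forecast)]


def mpe(actual, forecast):
    """Mean percentage error: positive and negative errors can cancel."""
    pe = percentage_errors(actual, forecast)
    return sum(pe) / len(pe)


def mape(actual, forecast):
    """Mean absolute percentage error: the sign of each error is ignored."""
    pe = percentage_errors(actual, forecast)
    return sum(abs(e) for e in pe) / len(pe)


# Hypothetical actual values and forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mpe(actual, forecast), mape(actual, forecast))
```

Because MPE keeps the sign of each error, it mainly indicates bias, while MAPE indicates the typical size of the error.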
Main components of time series data
Moving averages rank among the most popular techniques for the
preprocessing of time series. They are used to filter random "white
noise" from the data and so make the time series smoother.
To obtain the simple moving average, take a certain number of past periods,
add their values together, and divide by the number of periods.
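As a minimal sketch of that calculation, the function below averages each run of k consecutive observations. The function name and the sample values are illustrative assumptions.

```python
def simple_moving_average(values, k):
    """Return the mean of each run of k consecutive observations."""
    if k < 1 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


# Hypothetical series of six observations, smoothed with k = 3.
sales = [20, 30, 39, 60, 40, 51]
ma3 = simple_moving_average(sales, 3)
print(ma3)
```

A series of n observations yields n - k + 1 averages; a common convention is to use the most recent average as the forecast for the next period.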
Moving Average
[Figure: actual, smoothed, and forecast values plotted against time. Accuracy measures: MAPE 32.03, MAD 57.39, MSD 5165.47.]
Example…
[Figure: moving average of length 5, with actual, smoothed, and forecast values plotted against time.]
Centered Moving Average
The center of the first moving average is at 2.5, while the center
of the second moving average is at 3.5.
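The centers 2.5 and 3.5 arise with a 4-period moving average: the first average covers periods 1 to 4 and the second covers periods 2 to 5. Averaging the two gives a value centered on period 3. The sketch below assumes that 4-period case; the function name is hypothetical.

```python
def centered_moving_average_4(values):
    """2 x 4 centered moving average: mean of two consecutive 4-term averages."""
    ma4 = [sum(values[i:i + 4]) / 4 for i in range(len(values) - 3)]
    return [(ma4[i] + ma4[i + 1]) / 2 for i in range(len(ma4) - 1)]


y = [20, 30, 39, 60, 40, 51]
cma = centered_moving_average_4(y)
# cma[0] is centered on period 3, cma[1] on period 4.
print(cma)
```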
w1 + w2 + w3 = 1
That is, different solutions for α are tried, starting with α = 0.1 and going up to
0.9, with increments of 0.1.
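A minimal sketch of this trial-and-error search, assuming the constant being tuned is the smoothing constant α of single exponential smoothing and that the criterion is the mean squared one-step-ahead error. The starting value F_1 = Y_1 is also an assumption, as are all function names.

```python
def ses_forecasts(values, alpha):
    """One-step-ahead forecasts F_{t+1} = alpha*Y_t + (1 - alpha)*F_t, with F_1 = Y_1."""
    forecasts = [values[0]]
    for y in values[:-1]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def mse(values, forecasts):
    """Mean squared error, skipping period 1 where the forecast is the observation."""
    errors = [(y - f) ** 2 for y, f in zip(values[1:], forecasts[1:])]
    return sum(errors) / len(errors)


def best_alpha(values):
    """Try alpha = 0.1, 0.2, ..., 0.9 and keep the value with the smallest MSE."""
    grid = [round(0.1 * i, 1) for i in range(1, 10)]
    return min(grid, key=lambda a: mse(values, ses_forecasts(values, a)))


# A steadily rising hypothetical series favours a large alpha.
print(best_alpha([1, 2, 3, 4, 5, 6, 7, 8]))
```

A finer grid or a numerical optimizer could replace the 0.1 steps; the coarse grid simply mirrors the procedure described in the text.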
[Figure: observed values of shipments by month, months 1 to 12.]
Holt’s linear method
Holt (1957) extended single exponential smoothing to linear
exponential smoothing to allow forecasting of data with trends
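With a multiplicative seasonal component of period s (Winters, 1960), the smoothing equations are:
Level: $L_t = \alpha \frac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + b_{t-1})$
Trend: $b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1}$
Seasonal: $S_t = \gamma \frac{Y_t}{L_t} + (1-\gamma) S_{t-s}$
Forecast: $F_{t+m} = (L_t + b_t m) S_{t-s+m}$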
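The level, trend, seasonal, and forecast equations can be sketched as a single pass over the data. This is an illustration under stated assumptions, not a full implementation: the starting level, trend, and seasonal indices are supplied by the caller, the smoothing constants are named alpha (level), beta (trend), and gamma (seasonal), and the series is hypothetical.

```python
def holt_winters_multiplicative(y, alpha, beta, gamma, level, trend, seasonals):
    """Apply the level, trend, and seasonal updates to each observation in y.

    level, trend : starting values (assumed to be supplied by the caller).
    seasonals    : s starting seasonal indices for the first s periods.
    Returns the final level, the final trend, and the list of seasonal
    indices (the s starting values followed by one update per observation).
    """
    S = list(seasonals)
    L, b = level, trend
    for t, obs in enumerate(y):
        prev_L = L
        # S[t] plays the role of S_{t-s}: the latest index for this season.
        L = alpha * obs / S[t] + (1 - alpha) * (prev_L + b)
        b = beta * (L - prev_L) + (1 - beta) * b
        S.append(gamma * obs / L + (1 - gamma) * S[t])
    return L, b, S


def forecast(L, b, S, n, m):
    """F_{t+m} = (L_t + b_t*m) * S_{t-s+m} for 1 <= m <= s, after n observations."""
    return (L + b * m) * S[n - 1 + m]


# Hypothetical quarterly series: level 100 rising by 5 per period, times a seasonal index.
true_seasonals = [0.8, 1.0, 1.2, 1.0]
y = [(100 + 5 * (t + 1)) * true_seasonals[t % 4] for t in range(8)]
L, b, S = holt_winters_multiplicative(y, 0.2, 0.2, 0.2, 100.0, 5.0, true_seasonals)
next_quarter = forecast(L, b, S, len(y), 1)
print(L, b, next_quarter)
```

Because the example series follows the model exactly and starts from the true components, the updates leave the trend and seasonal indices unchanged. With real data the starting values have to be estimated, and the result depends on that choice.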
Example…
[Figure: actual, predicted, and forecast values of Yt plotted against time. Smoothing constants: Alpha (level) 0.200, Gamma (trend) 0.200, Delta (season) 0.200. Accuracy measures: MAPE 5.163, MAD 12.557, MSD 356.695.]
Seasonal Factor
Ratio-to-moving-average
Year Q1 Q2 Q3 Q4
2008 20 30 39 60
2009 40 51 62 81
2010 50 64 74 85
Time Period  Quarter  Time index  Sales  Centered MA (4)  Sales/MA*100
2008         Q1        1          20
2008         Q2        2          30
2008         Q3        3          39
2008         Q4        4          60     39.750           150.943
2009         Q1        5          40     44.875            89.136
2009         Q2        6          51     50.375           101.241
2009         Q3        7          62     55.875           110.962
2009         Q4        8          81     59.750           135.565
2010         Q1        9          50     62.625            79.840
2010         Q2       10          64     65.750            97.338
2010         Q3       11          74     67.750           109.225
2010         Q4       12          85
Year            Q1       Q2      Q3       Q4        Total
2008                                      150.94
2009            89.14    101.24  110.96   135.56
2010            79.84    97.33   109.22
Mean            84.49    99.29   110.095  143.25    437.125
AF              0.915    0.915   0.915    0.915
Seasonal Index  77.3085  90.85   100.736  131.0737  399.9693
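The last three rows of the ratio table can be reproduced with a short sketch: average the ratios for each quarter, compute the adjustment factor AF as 400 divided by the sum of the four means, and scale each mean by AF. The function name is an assumption. Because the code keeps AF at full precision instead of rounding it to 0.915, the later decimals differ slightly from the table and the indices sum to exactly 400.

```python
# Ratios (Sales / MA * 100) grouped by quarter, copied from the table.
ratios = {
    "Q1": [89.14, 79.84],
    "Q2": [101.24, 97.33],
    "Q3": [110.96, 109.22],
    "Q4": [150.94, 135.56],
}


def seasonal_indices(ratios, target=400.0):
    """Return (quarterly means, adjustment factor, adjusted seasonal indices)."""
    means = {q: sum(v) / len(v) for q, v in ratios.items()}
    af = target / sum(means.values())
    indices = {q: m * af for q, m in means.items()}
    return means, af, indices


means, af, indices = seasonal_indices(ratios)
print(means, af, indices)
```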
General overview of forecasting techniques
models (1970)
Box-Jenkins Modeling Approach to Forecasting
[Flowchart: Is the variance stable? If no, apply a transformation. If yes, obtain ACFs and PACFs. Are the residuals uncorrelated? If no, modify the model. If yes, are the parameters significant? If yes, proceed to forecasting.]
Autocorrelation function
The key statistic in time series analysis is the autocorrelation
coefficient (the correlation of the time series with itself, lagged by
1, 2, or more periods), which is given by the following formula:
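$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$
Example:
$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2} = \frac{29}{52} = 0.557692$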
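The autocorrelation formula translates directly into code. The sketch below is a plain-Python illustration with a hypothetical function name and a small made-up series; it is not the data behind the 29/52 example.

```python
def autocorrelation(y, k):
    """r_k: lag-k autocorrelation of the series y (0-based list)."""
    n = len(y)
    mean = sum(y) / n
    # Numerator: products of deviations k periods apart, t = k+1..n in 1-based terms.
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    # Denominator: total sum of squared deviations.
    den = sum((v - mean) ** 2 for v in y)
    return num / den


series = [1, 2, 3, 4, 5]
print([autocorrelation(series, k) for k in range(3)])
```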
Sampling distribution of Auto Correlation
Portmanteau Tests (317)
An alternative to this would be to examine a whole set of r_k values,
say the first 10 of them (r_1 to r_10), all at once and then test
whether the set is significantly different from a zero set. Such a test
is known as a portmanteau test, and the two most common are the
Box-Pierce test and the Ljung-Box Q* statistic.
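The text names the two statistics without giving their formulas. The sketch below assumes the standard definitions, Q = n Σ r_k² for Box-Pierce and Q* = n(n+2) Σ r_k²/(n-k) for Ljung-Box, where n is the number of observations and the sums run over the first h lags. The function names and input values are hypothetical.

```python
def box_pierce(r, n):
    """Q = n * sum of squared autocorrelations r_1..r_h (assumed standard form)."""
    return n * sum(rk ** 2 for rk in r)


def ljung_box(r, n):
    """Q* = n(n+2) * sum of r_k^2 / (n - k), k = 1..h (assumed standard form)."""
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))


# Hypothetical autocorrelations at lags 1 and 2 from a series of 10 observations.
r = [0.5, -0.2]
print(box_pierce(r, 10), ljung_box(r, 10))
```

Either statistic is usually compared with a chi-square distribution; a large value suggests that the set of autocorrelations is not a zero set.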
[Figure: ACF of the residuals e_i against lag number, lags 1 to 16.]
First difference: $Y'_t = Y_t - Y_{t-1}$
The differenced series will have only n-1 values, since it is not possible
to calculate a difference for the first observation.
Seasonal difference (monthly data): $Y'_t = Y_t - Y_{t-12}$
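Both kinds of differencing are the same operation with a different lag, as this minimal sketch shows. The function name is an assumption; the first sample uses the first four values of the monthly data table in this deck, and the second is a made-up 24-month series.

```python
def difference(y, lag=1):
    """Return Y_t - Y_{t-lag}; the result has len(y) - lag values."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# First difference: four observations give three differences.
print(difference([112, 118, 132, 129]))

# Seasonal difference at lag 12 on a hypothetical 24-month series.
monthly = list(range(24))
print(difference(monthly, lag=12))
```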
Backshift Notation
• A very useful notational device is the backward
shift operator, B which is used as follows:
$BY_t = Y_{t-1}$
$B(BY_t) = B^2 Y_t = Y_{t-2}$
$B^3 Y_t = Y_{t-3}$
$B^d Y_t = Y_{t-d}$
First Difference: $Y'_t = Y_t - Y_{t-1} = Y_t - BY_t = (1-B)Y_t$
Second Difference: $Y_t - Y_{t-2} = Y_t - B^2 Y_t = (1-B^2)Y_t$
Third Difference: $Y_t - Y_{t-3} = (1-B^3)Y_t$
First Order Difference
$Y'_t = Y_t - Y_{t-1} = Y_t - BY_t = (1-B)Y_t$
Second Order Difference
$Y''_t = Y'_t - Y'_{t-1}$
$= (Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2})$
$= Y_t - Y_{t-1} - Y_{t-1} + Y_{t-2}$
$= Y_t - 2Y_{t-1} + Y_{t-2}$
$= (1 - 2B + B^2)Y_t$
$= (1-B)^2 Y_t$
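The identity between differencing twice and the expanded form Y_t - 2Y_{t-1} + Y_{t-2} can be checked numerically. This is a small sketch with hypothetical function names and a made-up series of squares, whose second-order differences are constant.

```python
def first_difference(y):
    """(1 - B) Y_t = Y_t - Y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


def second_order_difference(y):
    """(1 - 2B + B^2) Y_t = Y_t - 2*Y_{t-1} + Y_{t-2}."""
    return [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]


y = [1, 4, 9, 16, 25]
twice = first_difference(first_difference(y))
expanded = second_order_difference(y)
print(twice, expanded)
```

Each pass of first differencing shortens the series by one value, so the second-order difference of n observations has n-2 values.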
Linear time series models
ARIMA(p,d,q): $\nabla^d Y_t = (1-B)^d Y_t$
The major tools used in the identification phase are plots of the
series and correlograms of the autocorrelation (ACF) and partial
autocorrelation (PACF) functions.
[Figure: autocorrelation and partial autocorrelation plots.]
ARIMA(p,d,q)(P,D,Q)S:
Month  '49  '50  '51  '52  '53  '54  '55  '56  '57  '58  '59  '60
Jan    112  115  145  171  196  204  242  284  315  340  360  417
Feb    118  126  150  180  196  188  233  277  301  318  342  391
Mar    132  141  178  193  236  135  267  317  356  362  406  419
Apr    129  135  163  181  235  227  269  313  348  348  396  461
May    121  125  172  183  229  234  270  318  355  363  420  472
Jun    135  149  178  218  243  264  315  374  422  435  472  535
Jul    148  170  199  230  264  302  364  413  465  491  548  622
Aug    148  170  199  242  272  293  347  405  467  505  559  606
Sep    136  158  184  209  237  259  312  355  404  404  463  508
Oct    119  133  162  191  211  229  274  306  347  359  407  461
Nov    104  114  146  172  180  203  237  271  205  310  362  390
Dec    118  140  166  194  201  229  278  306  336  337  405  432
Time series plot for original data
[Figure: value of yt (100 to 700) plotted against case number.]
Transformed data (natural logarithm of yt)
Month  '49   '50   '51   '52   '53   '54   '55   '56   '57   '58   '59   '60
Jan    4.72  4.74  4.98  5.14  5.28  5.32  5.49  5.65  5.75  5.83  5.89  6.03
Feb    4.77  4.84  5.01  5.19  5.28  5.24  5.45  5.62  5.71  5.76  5.83  5.97
Mar    4.88  4.95  5.18  5.26  5.46  4.91  5.59  5.76  5.87  5.89  6.01  6.04
Apr    4.86  4.91  5.09  5.20  5.46  5.42  5.59  5.75  5.85  5.85  5.98  6.13
May    4.80  4.83  5.15  5.21  5.43  5.46  5.60  5.76  5.87  5.89  6.04  6.16
Jun    4.91  5.00  5.18  5.38  5.49  5.58  5.75  5.92  6.05  6.08  6.16  6.28
Jul    5.00  5.14  5.29  5.44  5.58  5.71  5.90  6.02  6.14  6.20  6.31  6.43
Aug    5.00  5.14  5.29  5.49  5.61  5.68  5.85  6.00  6.15  6.22  6.33  6.41
Sep    4.91  5.06  5.21  5.34  5.47  5.56  5.74  5.87  6.00  6.00  6.14  6.23
Oct    4.78  4.89  5.09  5.25  5.35  5.43  5.61  5.72  5.85  5.88  6.01  6.13
Nov    4.64  4.74  4.98  5.15  5.19  5.31  5.47  5.60  5.32  5.74  5.89  5.97
Dec    4.77  4.94  5.11  5.27  5.30  5.43  5.63  5.72  5.82  5.82  6.00  6.07
Time series plot for transformed data
[Figure: value of Log_yt (4.50 to 6.50) plotted against case number.]
Time series plots after differencing
[Figure: two panels plotted against case number, labelled SDIFF(Yt,1,12) and DIFF(Yt_D1,1).]
[Figure: ACF by lag number (lags 1 to 36), with upper and lower confidence limits.]
[Figure: partial ACF of DIFF(Yt_D1,1) by lag number (lags 1 to 36), with upper and lower confidence limits.]
Comparison of ARIMA models with AIC
where L = likelihood,
m = p + q + P + Q,
σ² = variance of the residuals.
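The formula that these symbols belong to did not survive on the slide. The sketch below assumes the usual definition AIC = -2 log L + 2m, together with the Gaussian approximation -2 log L ≈ n(1 + log 2π) + n log σ², where n is the number of observations. Both the formula and the function name are assumptions for illustration.

```python
import math


def approx_aic(n, sigma2, m):
    """Approximate AIC from the residual variance (assumed Gaussian likelihood).

    n      : number of observations used in fitting
    sigma2 : variance of the residuals
    m      : number of estimated parameters, m = p + q + P + Q
    """
    neg2_log_lik = n * (1 + math.log(2 * math.pi)) + n * math.log(sigma2)
    return neg2_log_lik + 2 * m


# Hypothetical values: each extra parameter adds 2 unless it lowers sigma2 enough.
print(approx_aic(10, 1.0, 2), approx_aic(10, 1.0, 3))
```

When comparing candidate models fitted to the same data, the model with the smallest AIC is preferred.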
Comparison of ARIMA models with AIC
S. No.  Model                     AIC
1       ARIMA(1,1,1)(0,1,1)_12    1172.362
2       ARIMA(0,1,1)(0,1,1)_12    1169.650
3       ARIMA(0,1,2)(0,1,1)_12    1171.052
4       ARIMA(0,1,1)(0,1,2)_12    1172.295
5       ARIMA(0,1,1)(1,1,1)_12    1172.631
6       ARIMA(0,1,3)(0,1,1)_12    1172.205
7       ARIMA(1,1,1)(1,1,1)_12    1171.065
8       ARIMA(0,1,1)(1,1,0)_12    1171.286
9       ARIMA(1,1,1)(1,1,0)_12    1170.986
10      ARIMA(1,1,0)(0,1,1)_12    1185.438
The class of ARIMA models is useful for both stationary and
non-stationary time series.
Holt, C. C. (1957). Forecasting seasonals and trends by exponentially weighted moving averages,
Office of Naval Research, Research Memorandum No. 52.
Makridakis, S., Wheelwright, S. C. and Hyndman, R. J. (1998). Forecasting: Methods and
Applications, 3rd edition, John Wiley & Sons, New York.
Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages,
Management Science, 6, 324-342.
Yar, M. and Chatfield, C. (1990). Prediction intervals for the Holt-Winters forecasting procedure,
International Journal of Forecasting, 6, 127-137.
ARCH Model
[Figure: series values (0 to 700) plotted by year, 1949 to 1960.]