Forecasting Slides
Francis X. Diebold
University of Pennsylvania
1. Forecasting in Action
Modeling trend
Forecasting trend
Modeling seasonality
Characterizing Cycles
White noise
Forecasting Cycles
Optimal forecasts
Vector autoregressions
Predictive causality
Smoothing
Special insights

Journals
Journal of Forecasting
General:
EViews
S+
Minitab
SAS
R
Python
Many more...

Cross-section:
Stata

Open-ended:
Matlab

Online Information
Skewness
Asymmetry
Kurtosis
Leptokurtosis
Normal, or Gaussian, Distribution
Marginal Distribution
Joint Distribution
Covariance
Correlation
Conditional Distribution
Conditional Moment
Conditional Mean
Conditional Variance

Population Distribution
Sample
Estimator
Statistic, or Sample Statistic
Sample Mean
Sample Variance
Sample Standard Deviation
Sample Skewness
Sample Kurtosis
χ² Distribution
t Distribution
F Distribution
Jarque-Bera Test
OLS estimation:

(β̂₀, β̂₁) = argmin_{β₀,β₁} Σ_{t=1}^T [y_t − β₀ − β₁ x_t]²

Fitted values: ŷ_t = β̂₀ + β̂₁ x_t

Residuals: e_t = y_t − ŷ_t

Simple regression: y_t = β₀ + β₁ x_t + ε_t,  ε_t ~ iid(0, σ²)

Multiple regression: y_t = β₀ + β₁ x_t + β₂ z_t + ε_t,  ε_t ~ iid(0, σ²)
Sample mean: ȳ = (1/T) Σ_{t=1}^T y_t

Standard deviation: SD = sqrt( Σ_{t=1}^T (y_t − ȳ)² / (T − 1) )

Sum of squared residuals: SSR = Σ_{t=1}^T e_t²

F-statistic: F = [(SSR_res − SSR) / (k − 1)] / [SSR / (T − k)]

s² = Σ_{t=1}^T e_t² / (T − k)

Standard error of the regression: SER = √s² = sqrt( Σ_{t=1}^T e_t² / (T − k) )
R² = 1 − Σ_{t=1}^T e_t² / Σ_{t=1}^T (y_t − ȳ)²

or

R² = 1 − [(1/T) Σ_{t=1}^T e_t²] / [(1/T) Σ_{t=1}^T (y_t − ȳ)²]

Adjusted R²:

R̄² = 1 − [Σ_{t=1}^T e_t² / (T − k)] / [Σ_{t=1}^T (y_t − ȳ)² / (T − 1)]
Serial correlation: ε_t = φ ε_{t−1} + v_t

Durbin-Watson statistic:

DW = Σ_{t=2}^T (e_t − e_{t−1})² / Σ_{t=1}^T e_t²
Regression of y on x and z

Scatterplot of y versus x
Shrinkage Principle
Signal vs. noise
Smaller is often better
Even incorrect restrictions can help
Quadratic Loss
Absolute Loss
Asymmetric Loss
Forecast Statement
Extrapolation Forecast
Univariate, multivariate
Time series vs. distributional shape
Relational graphics
Anscombe's Quartet
Anscombe's Quartet

            (1)             (2)             (3)             (4)
     x1      y1      x2      y2      x3      y3      x4      y4
   10.0    8.04    10.0    9.14    10.0    7.46     8.0    6.58
    8.0    6.95     8.0    8.14     8.0    6.77     8.0    5.76
   13.0    7.58    13.0    8.74    13.0   12.74     8.0    7.71
    9.0    8.81     9.0    8.77     9.0    7.11     8.0    8.84
   11.0    8.33    11.0    9.26    11.0    7.81     8.0    8.47
   14.0    9.96    14.0    8.10    14.0    8.84     8.0    7.04
    6.0    7.24     6.0    6.13     6.0    6.08     8.0    5.25
    4.0    4.26     4.0    3.10     4.0    5.39    19.0   12.50
   12.0   10.84    12.0    9.13    12.0    8.15     8.0    5.56
    7.0    4.82     7.0    7.26     7.0    6.42     8.0    7.91
    5.0    5.68     5.0    4.74     5.0    5.73     8.0    6.89
Liquor Sales

[Graphs]
Exponential trend:

T_t = β₀ e^{β₁ TIME_t}

ln(T_t) = ln(β₀) + β₁ TIME_t

Trend estimation:

(β̂₀, β̂₁) = argmin_{β₀,β₁} Σ_{t=1}^T [y_t − β₀ − β₁ TIME_t]²

(β̂₀, β̂₁, β̂₂) = argmin_{β₀,β₁,β₂} Σ_{t=1}^T [y_t − β₀ − β₁ TIME_t − β₂ TIME_t²]²

(β̂₀, β̂₁) = argmin_{β₀,β₁} Σ_{t=1}^T [y_t − β₀ e^{β₁ TIME_t}]²

or, for the exponential trend, by regression in logs:

(β̂₀, β̂₁) = argmin_{β₀,β₁} Σ_{t=1}^T [ln y_t − ln β₀ − β₁ TIME_t]²

Forecasting trend (linear-trend case):

y_{T+h} = β₀ + β₁ TIME_{T+h} + ε_{T+h}

ŷ_{T+h,T} = β̂₀ + β̂₁ TIME_{T+h}

Interval forecast: ŷ_{T+h,T} ± 1.96 σ̂

Density forecast: N(ŷ_{T+h,T}, σ̂²)
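The trend regressions above can be sketched in a few lines. This is an illustrative example, not from the slides: the data are simulated, and the linear, quadratic, and exponential (log-linear) trends are all fit by least squares as in the argmin expressions above.

```python
# Sketch: fit linear, quadratic, and exponential (log-linear) trends by OLS,
# then produce an h-step-ahead point forecast. Simulated data (assumption).
import numpy as np

rng = np.random.default_rng(0)
T = 100
time = np.arange(1, T + 1, dtype=float)
y = 10 + 0.5 * time + rng.normal(0, 1, T)   # data with a true linear trend

# Linear trend: y_t = b0 + b1*TIME_t
X = np.column_stack([np.ones(T), time])
b_lin, *_ = np.linalg.lstsq(X, y, rcond=None)

# Quadratic trend adds TIME_t^2 as a regressor
Xq = np.column_stack([np.ones(T), time, time**2])
b_quad, *_ = np.linalg.lstsq(Xq, y, rcond=None)

# Exponential trend via the log-linear regression ln y_t = ln b0 + b1*TIME_t
b_exp, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

# h-step-ahead point forecast from the linear trend: b0 + b1*TIME_{T+h}
h = 12
forecast = b_lin[0] + b_lin[1] * (T + h)
```

The same `lstsq` call handles all three cases because each is linear in its parameters once the exponential trend is estimated in logs.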
MSE = Σ_{t=1}^T e_t² / T

R² = 1 − Σ_{t=1}^T e_t² / Σ_{t=1}^T (y_t − ȳ)²

R̄² = 1 − [Σ_{t=1}^T e_t² / (T − k)] / [Σ_{t=1}^T (y_t − ȳ)² / (T − 1)] = 1 − s² / [Σ_{t=1}^T (y_t − ȳ)² / (T − 1)]

s² = Σ_{t=1}^T e_t² / (T − k) = [T / (T − k)] · Σ_{t=1}^T e_t² / T

AIC = e^{2k/T} · Σ_{t=1}^T e_t² / T

SIC = T^{k/T} · Σ_{t=1}^T e_t² / T

Consistency
Efficiency
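The AIC and SIC formulas above are easy to compute directly from a fitted model's residuals. A minimal sketch, applied to the residuals of a trend regression on simulated data (the data and setup are assumptions for illustration):

```python
# Sketch: compute the slide's AIC and SIC penalized-MSE criteria for a
# fitted regression with k parameters. Simulated data (assumption).
import numpy as np

rng = np.random.default_rng(1)
T = 120
time = np.arange(1, T + 1, dtype=float)
y = 2 + 0.3 * time + rng.normal(0, 1, T)

X = np.column_stack([np.ones(T), time])     # k = 2 estimated parameters
k = X.shape[1]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta                            # residuals

mse = np.sum(e**2) / T
aic = np.exp(2 * k / T) * mse               # AIC = e^(2k/T) * sum(e^2)/T
sic = T ** (k / T) * mse                    # SIC = T^(k/T) * sum(e^2)/T
```

Since ln T > 2 for T > e², the SIC penalty factor exceeds the AIC factor, which is why SIC selects more parsimonious models.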
Retail Sales
87 / 323
88 / 323
89 / 323
90 / 323
91 / 323
92 / 323
93 / 323
94 / 323
95 / 323
                     AIC      SIC
Linear Trend        19.35    19.37
Quadratic Trend     15.83    15.86
Exponential Trend   17.15    17.17
Seasonal dummy variables (e.g., quarterly, s = 4):

D1 = (1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, ...)
D2 = (0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, ...)
D3 = (0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, ...)
D4 = (0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, ...)

Pure seasonal dummy model:

y_t = Σ_{i=1}^s γ_i D_it + ε_t

With trend:

y_t = β₁ TIME_t + Σ_{i=1}^s γ_i D_it + ε_t

With trend and holiday / trading-day variation:

y_t = β₁ TIME_t + Σ_{i=1}^s γ_i D_it + Σ_{i=1}^{v1} δ_i^{HD} HDV_it + Σ_{i=1}^{v2} δ_i^{TD} TDV_it + ε_t
Forecasting:

y_{T+h} = β₁ TIME_{T+h} + Σ_{i=1}^s γ_i D_{i,T+h} + Σ_{i=1}^{v1} δ_i^{HD} HDV_{i,T+h} + Σ_{i=1}^{v2} δ_i^{TD} TDV_{i,T+h} + ε_{T+h}

Point forecast:

ŷ_{T+h,T} = β̂₁ TIME_{T+h} + Σ_{i=1}^s γ̂_i D_{i,T+h} + Σ_{i=1}^{v1} δ̂_i^{HD} HDV_{i,T+h} + Σ_{i=1}^{v2} δ̂_i^{TD} TDV_{i,T+h}

Density forecast: N(ŷ_{T+h,T}, σ̂²)
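A minimal sketch of the trend-plus-seasonal-dummy regression and its point forecast, on simulated monthly data (the data, and the omission of the HDV/TDV terms, are assumptions for illustration):

```python
# Sketch: trend + full set of seasonal dummies, estimated by OLS, then an
# h-step-ahead point forecast. Simulated monthly data (assumption); the
# holiday/trading-day terms from the slide are omitted here.
import numpy as np

rng = np.random.default_rng(2)
T, s = 144, 12
time = np.arange(1, T + 1, dtype=float)
season = np.arange(T) % s                       # month index 0..11
D = np.eye(s)[season]                           # seasonal dummy matrix
gamma_true = np.linspace(-2, 2, s)
y = 0.1 * time + gamma_true[season] + rng.normal(0, 0.5, T)

X = np.column_stack([time, D])                  # no intercept: s dummies span it
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b1, gamma = beta[0], beta[1:]

# Forecast for period T+h: beta1_hat*TIME_{T+h} + gamma_hat for that season
h = 3
season_Th = (T + h - 1) % s
y_hat = b1 * (T + h) + gamma[season_Th]
```

Note the design: with a full set of s dummies there is no separate intercept, exactly as in the slide's specification.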
Gasoline Sales

Liquor Sales

Residual Plot

Housing Starts
Characterizing Cycles

1. Covariance Stationary Time Series

Realization
Sample Path
Covariance Stationary

Mean: E(y_t) = μ_t; under covariance stationarity, E(y_t) = μ

Autocovariance: γ(t, τ) = cov(y_t, y_{t−τ}) = E(y_t − μ)(y_{t−τ} − μ)

Under covariance stationarity: γ(t, τ) = γ(τ)

Autocorrelation:

corr(x, y) = cov(x, y) / (σ_x σ_y)

ρ(τ) = cov(y_t, y_{t−τ}) / [√var(y_t) · √var(y_{t−τ})] = γ(τ) / √(γ(0) γ(0)) = γ(τ) / γ(0),  τ = 0, 1, 2, ...
2. White Noise

y_t ~ WN(0, σ²)

y_t ~ iid(0, σ²)

y_t ~ iid N(0, σ²)

Unconditional and conditional moments:

E(y_t) = 0
var(y_t) = σ²
E(y_t | Ω_{t−1}) = 0
var(y_t | Ω_{t−1}) = E[(y_t − E(y_t | Ω_{t−1}))² | Ω_{t−1}] = σ²

The lag operator:

B(L) = Σ_{i=0}^∞ b_i L^i

B(L) ε_t = b₀ ε_t + b₁ ε_{t−1} + b₂ ε_{t−2} + ... = Σ_{i=0}^∞ b_i ε_{t−i}

The general linear process:

y_t = B(L) ε_t = Σ_{i=0}^∞ b_i ε_{t−i}

ε_t ~ WN(0, σ²), where b₀ = 1 and Σ_{i=0}^∞ b_i² < ∞

Moment structure:

E(y_t) = E(Σ_{i=0}^∞ b_i ε_{t−i}) = Σ_{i=0}^∞ b_i E(ε_{t−i}) = Σ_{i=0}^∞ b_i · 0 = 0

var(y_t) = var(Σ_{i=0}^∞ b_i ε_{t−i}) = Σ_{i=0}^∞ b_i² var(ε_{t−i}) = Σ_{i=0}^∞ b_i² σ² = σ² Σ_{i=0}^∞ b_i²

var(y_t | Ω_{t−1}) = E[(y_t − E(y_t | Ω_{t−1}))² | Ω_{t−1}] = E(ε_t² | Ω_{t−1}) = E(ε_t²) = σ²
Rational approximation to the Wold representation:

B(L) ≈ Θ(L) / Φ(L)

Θ(L) = Σ_{i=0}^q θ_i L^i

Φ(L) = Σ_{i=0}^p φ_i L^i
Sample mean: ȳ = (1/T) Σ_{t=1}^T y_t

Population autocorrelation: ρ(τ) = E[(y_t − μ)(y_{t−τ} − μ)] / E[(y_t − μ)²]

Sample autocorrelation:

ρ̂(τ) = [(1/T) Σ_{t=τ+1}^T (y_t − ȳ)(y_{t−τ} − ȳ)] / [(1/T) Σ_{t=1}^T (y_t − ȳ)²] = Σ_{t=τ+1}^T (y_t − ȳ)(y_{t−τ} − ȳ) / Σ_{t=1}^T (y_t − ȳ)²

Sample partial autocorrelation p̂(τ): the coefficient β̂_τ in the regression

y_t = ĉ + β̂₁ y_{t−1} + ... + β̂_τ y_{t−τ}

Distribution results (under white noise):

ρ̂(τ) ~ N(0, 1/T)

√T ρ̂(τ) ~ N(0, 1)

T ρ̂²(τ) ~ χ²₁

Box-Pierce statistic: Q_BP = T Σ_{τ=1}^m ρ̂²(τ)

Ljung-Box statistic: Q_LB = T(T + 2) Σ_{τ=1}^m [1 / (T − τ)] ρ̂²(τ)
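The sample autocorrelations and the Ljung-Box statistic above can be computed directly from their definitions. A sketch on simulated white noise (the data are an assumption for illustration):

```python
# Sketch: sample autocorrelations rhohat(tau) and the Ljung-Box statistic
# Q_LB = T(T+2) * sum_{tau=1}^m rhohat(tau)^2 / (T - tau), on white noise.
import numpy as np

rng = np.random.default_rng(3)
T = 500
y = rng.normal(0, 1, T)                        # white noise (assumption)
ybar = y.mean()
denom = np.sum((y - ybar) ** 2)

def rho_hat(tau):
    # sample autocorrelation at lag tau, per the slide's formula
    return np.sum((y[tau:] - ybar) * (y[:-tau] - ybar)) / denom

m = 12
rho = np.array([rho_hat(tau) for tau in range(1, m + 1)])
q_lb = T * (T + 2) * np.sum(rho**2 / (T - np.arange(1, m + 1)))
```

Under white noise, Q_LB is approximately χ² with m degrees of freedom, so values near m (here 12) are typical.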
Lag   Acorr.   P. Acorr.   Std. Error   Ljung-Box   p-value
 1    0.949     0.949        .088        118.07      0.000
 2    0.877     0.244        .088        219.66      0.000
 3    0.795     0.101        .088        303.72      0.000
 4    0.707     0.070        .088        370.82      0.000
 5    0.617     0.063        .088        422.27      0.000
 6    0.526     0.048        .088        460.00      0.000
 7    0.438     0.033        .088        486.32      0.000
 8    0.351     0.049        .088        503.41      0.000
 9    0.258     0.149        .088        512.70      0.000
10    0.163     0.070        .088        516.43      0.000
11    0.073     0.011        .088        517.20      0.000
12    0.005     0.016        .088        517.21      0.000
var(y_t | Ω_{t−1}) = E[(y_t − E(y_t | Ω_{t−1}))² | Ω_{t−1}] = E(ε_t² | Ω_{t−1}) = E(ε_t²)

The MA(q) process:

y_t = ε_t + θ₁ ε_{t−1} + ... + θ_q ε_{t−q} = Θ(L) ε_t

ε_t ~ WN(0, σ²), where Θ(L) = 1 + θ₁ L + ... + θ_q L^q
The AR(1) process:

y_t = φ y_{t−1} + ε_t

ε_t ~ WN(0, σ²)

If covariance stationary:

y_t = ε_t + φ ε_{t−1} + φ² ε_{t−2} + ...

Moment Structure

E(y_t) = E(ε_t + φ ε_{t−1} + φ² ε_{t−2} + ...) = E(ε_t) + φ E(ε_{t−1}) + φ² E(ε_{t−2}) + ... = 0

var(y_t) = var(ε_t + φ ε_{t−1} + φ² ε_{t−2} + ...) = σ² + φ² σ² + φ⁴ σ² + ... = σ² Σ_{i=0}^∞ φ^{2i} = σ² / (1 − φ²)
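The AR(1) moment results can be checked by simulation. A sketch with assumed parameter values φ = 0.6, σ = 1, so var(y_t) = 1/(1 − 0.36) = 1.5625 and ρ(1) = 0.6:

```python
# Simulation sketch checking the AR(1) moments: for y_t = phi*y_{t-1} + eps_t,
# var(y_t) = sigma^2 / (1 - phi^2) and rho(1) = phi. Parameters are assumptions.
import numpy as np

rng = np.random.default_rng(4)
phi, sigma, T = 0.6, 1.0, 200_000
eps = rng.normal(0, sigma, T)
y = np.empty(T)
y[0] = eps[0]
for t in range(1, T):
    y[t] = phi * y[t - 1] + eps[t]   # AR(1) recursion

var_theory = sigma**2 / (1 - phi**2)          # = 1.5625
var_sample = y.var()
rho1_sample = np.corrcoef(y[1:], y[:-1])[0, 1]
```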
Autocovariance structure:

y_t = φ y_{t−1} + ε_t

y_t y_{t−τ} = φ y_{t−1} y_{t−τ} + ε_t y_{t−τ}

For τ ≥ 1, γ(τ) = φ γ(τ − 1)  (Yule-Walker equation).

But γ(0) = σ² / (1 − φ²). Thus

γ(τ) = φ^τ σ² / (1 − φ²),  τ = 0, 1, 2, ...

and

ρ(τ) = φ^τ,  τ = 0, 1, 2, ...

Partial autocorrelations: p(1) = φ, and p(τ) = 0 for τ > 1.
The ARMA(1,1) process:

y_t = [(1 + θL) / (1 − φL)] ε_t

ε_t ~ WN(0, σ²)

The ARMA(p,q) process:

Φ(L) y_t = Θ(L) ε_t
[Simulated AR(1) realizations and correlograms, φ = .4 and φ = .95]
Lag   Acorr.   Std. Error   Ljung-Box   p-value
 1    0.345      .088        15.614      0.000
 2    0.614      .088        73.089      0.000
 3    0.426      .088       111.01       0.000
 4    0.042      .088       135.49       0.000
 5   -0.398      .088       151.79       0.000
 6    0.145      .088       183.70       0.000
 7    0.118      .088       185.71       0.000
 8    0.048      .088       202.46       0.000
 9    0.019      .088       205.50       0.000
10    0.066      .088       206.96       0.000
11    0.098      .088       207.89       0.000
12    0.113      .088       208.01       0.000
AIC values:

                        MA Order
                  0      1      2      3      4
AR Order   0      -     2.86   2.32   2.47   2.20
           1     1.01    .83    .79    .80    .81
           2     .762    .77    .78    .80    .80
           3     .77     .761   .77    .78    .79
           4     .79     .79    .77    .79    .80

SIC values:

                        MA Order
                  0      1      2      3      4
AR Order   0      -     2.91   2.38   2.56   2.31
           1     1.05    .90    .88    .91    .94
           2     .83     .86    .89    .92    .96
           3     .86     .87    .90    .94    .96
           4     .90     .92    .93    .97   1.00
Lag   Acorr.   Std. Error   Ljung-Box   p-value
 1    0.032      .09         0.1376
 2    0.040      .09         0.3643
 3    0.017      .09         0.3904
 4    0.047      .09         0.6970
 5    0.007      .09         0.7013      0.402
 6    0.009      .09         0.7246      0.696
 7    0.019      .09         0.7650      0.858
 8    0.060      .09         1.3384      0.855
 9    0.097      .09         2.5182      0.774
10    0.040      .09         2.7276      0.842
11    0.022      .09         2.7659      0.906
12    0.153      .09         5.4415      0.710
Forecasting Cycles

Information sets:

Ω_T = {y_T, y_{T−1}, y_{T−2}, ...}

Ω_T = {ε_T, ε_{T−1}, ε_{T−2}, ...}

Optimal Point Forecasts for Infinite-Order Moving Averages

y_t = Σ_{i=0}^∞ b_i ε_{t−i},  ε_t ~ WN(0, σ²),  b₀ = 1,  Σ_{i=0}^∞ b_i² < ∞

Optimal forecast: y_{T+h,T} = Σ_{i=0}^∞ b_{h+i} ε_{T−i}

Forecast error: e_{T+h,T} = Σ_{i=0}^{h−1} b_i ε_{T+h−i}

Forecast-error variance: σ_h² = σ² Σ_{i=0}^{h−1} b_i²

h-step-ahead interval forecast: y_{T+h,T} ± 1.96 σ_h

h-step-ahead density forecast: N(y_{T+h,T}, σ_h²)

Making the Forecasts Operational

The Chain Rule of Forecasting
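The chain rule of forecasting is easiest to see in the AR(1) case: each h-step forecast is built recursively from the (h − 1)-step one. A sketch with assumed parameter values:

```python
# Sketch of the chain rule of forecasting for an AR(1):
# y_{T+h,T} = phi * y_{T+h-1,T}, starting from the last observation y_T.
# Model parameters and y_T are assumptions for illustration.
import numpy as np

phi, sigma, y_T = 0.8, 1.0, 2.5

H = 6
forecasts = np.empty(H)
prev = y_T
for h in range(H):
    prev = phi * prev             # chain rule: each step feeds the next
    forecasts[h] = prev

# Forecast-error variance grows with h: sigma_h^2 = sigma^2 * sum_{i=0}^{h-1} phi^(2i)
var_h = np.array([sigma**2 * sum(phi**(2 * i) for i in range(h + 1)) for h in range(H)])
```

The point forecasts decay geometrically toward the unconditional mean while the forecast-error variance rises toward the unconditional variance, exactly as the MA(∞) formulas above imply.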
Full model (trend + seasonals + cyclical disturbances):

y_t = T_t(θ) + Σ_{i=1}^s γ_i D_it + Σ_{i=1}^{v1} δ_i^{HD} HDV_it + Σ_{i=1}^{v2} δ_i^{TD} TDV_it + ε_t

Φ(L) ε_t = Θ(L) v_t

Φ(L) = 1 − φ₁L − ... − φ_pL^p

Θ(L) = 1 + θ₁L + ... + θ_qL^q

v_t ~ WN(0, σ²).
Point Forecasting

y_{T+h} = T_{T+h}(θ) + Σ_{i=1}^s γ_i D_{i,T+h} + Σ_{i=1}^{v1} δ_i^{HD} HDV_{i,T+h} + Σ_{i=1}^{v2} δ_i^{TD} TDV_{i,T+h} + ε_{T+h}

y_{T+h,T} = T_{T+h}(θ) + Σ_{i=1}^s γ_i D_{i,T+h} + Σ_{i=1}^{v1} δ_i^{HD} HDV_{i,T+h} + Σ_{i=1}^{v2} δ_i^{TD} TDV_{i,T+h} + ε_{T+h,T}

ŷ_{T+h,T} = T_{T+h}(θ̂) + Σ_{i=1}^s γ̂_i D_{i,T+h} + Σ_{i=1}^{v1} δ̂_i^{HD} HDV_{i,T+h} + Σ_{i=1}^{v2} δ̂_i^{TD} TDV_{i,T+h} + ε̂_{T+h,T}

Interval Forecasting:

ŷ_{T+h,T} ± z_{α/2} σ̂_h

e.g.: (95% interval) ŷ_{T+h,T} ± 1.96 σ̂_h

Density Forecasting:

N(ŷ_{T+h,T}, σ̂_h²)
Recursive Estimation

y_t = Σ_{k=1}^K β_k x_kt + ε_t

ε_t ~ iid N(0, σ²),  t = 1, ..., T.

OLS estimation uses the full sample, t = 1, ..., T.
Recursive least squares uses an expanding sample.
Begin with the first K observations and estimate the model.
Then estimate using the first K + 1 observations, and so on.
At the end we have a set of recursive parameter estimates:
β̂_{k,t}, for k = 1, ..., K and t = K, ..., T.
Recursive Residuals

At each t, t = K, ..., T − 1, compute a 1-step forecast,

ŷ_{t+1,t} = Σ_{k=1}^K β̂_{kt} x_{k,t+1}.

The corresponding forecast errors, or recursive residuals, satisfy

e_{t+1,t} = y_{t+1} − ŷ_{t+1,t} ~ N(0, σ² r_t)

where r_t > 1 for all t.

Standardized recursive residuals:

w_{t+1,t} = e_{t+1,t} / (σ √r_t),  t = K, ..., T − 1.

Under the maintained assumptions,

w_{t+1,t} ~ iid N(0, 1).

Then

CUSUM_t ≡ Σ_{τ=K}^t w_{τ+1,τ},  t = K, ..., T − 1
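The recursive-residual and CUSUM construction above can be sketched directly. Here the data are simulated with stable parameters (an assumption), so the standardized recursive residuals should behave like iid N(0, 1) draws and the CUSUM should wander without trending:

```python
# Sketch: recursive least squares, standardized recursive residuals, and the
# CUSUM sequence. Simulated stable-parameter data (assumption); sigma known.
import numpy as np

rng = np.random.default_rng(5)
T, sigma = 120, 1.0
x = np.column_stack([np.ones(T), rng.normal(0, 1, T)])   # K = 2 regressors
beta_true = np.array([1.0, 0.5])
y = x @ beta_true + rng.normal(0, sigma, T)

K = x.shape[1]
w = []                                                   # standardized recursive residuals
for t in range(K, T):
    Xt, yt = x[:t], y[:t]
    beta_t, *_ = np.linalg.lstsq(Xt, yt, rcond=None)     # estimate on first t obs
    x_next = x[t]
    e = y[t] - x_next @ beta_t                           # 1-step forecast error
    # r_t = 1 + x'(X'X)^{-1}x > 1, so e_{t+1,t} ~ N(0, sigma^2 * r_t)
    r = 1 + x_next @ np.linalg.inv(Xt.T @ Xt) @ x_next
    w.append(e / (sigma * np.sqrt(r)))

cusum = np.cumsum(w)
```

In practice σ is replaced by an estimate; it is taken as known here to keep the sketch short.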
Lag   Acorr.   P. Acorr.   Std. Error   Ljung-Box
 1    0.117     0.117        .056         4.3158
 2    0.149     0.165        .056        11.365
 3    0.106     0.069        .056        14.943
 4    0.014     0.017        .056        15.007
 5    0.142     0.125        .056        21.449
 6    0.041     0.004        .056        21.979
 7    0.134     0.175        .056        27.708
 8    0.029     0.046        .056        27.975
 9    0.136     0.080        .056        33.944
10    0.205     0.206        .056        47.611
11    0.056     0.080        .056        48.632
12    0.888     0.879        .056       306.26
13    0.055     0.507        .056       307.25
14    0.187     0.159        .056       318.79
15    0.159     0.144        .056       327.17
Lag   Acorr.   P. Acorr.   Std. Error   Ljung-Box   p-value
 1    0.700     0.700        .056        154.34      0.000
 2    0.686     0.383        .056        302.86      0.000
 3    0.725     0.369        .056        469.36      0.000
 4    0.569     0.141        .056        572.36      0.000
 5    0.569     0.017        .056        675.58      0.000
 6    0.577     0.093        .056        782.19      0.000
 7    0.460     0.078        .056        850.06      0.000
 8    0.480     0.043        .056        924.38      0.000
 9    0.466     0.030        .056        994.46      0.000
10    0.327     0.188        .056       1029.1       0.000
11    0.364     0.019        .056       1072.1       0.000
12    0.355     0.089        .056       1113.3       0.000
13    0.225     0.119        .056       1129.9       0.000
14    0.291     0.065        .056       1157.8       0.000
15    0.211     0.119        .056       1172.4       0.000
Lag   Std. Error   Ljung-Box   p-value
 1      .056        0.9779      0.323
 2      .056        1.4194      0.492
 3      .056        1.6032      0.659
 4      .056        3.8256      0.430
 5      .056        3.8415      0.572
 6      .056        5.1985      0.519
 7      .056        5.7288      0.572
 8      .056        7.2828      0.506
 9      .056        9.3527      0.405
10      .056       18.019       0.055
11      .056       18.045       0.081
12      .056       24.938       0.015
13      .056       26.750       0.013
14      .056       34.034       0.002
15      .056       34.532       0.003
Conditional forecasting models:

y_t = β₀ + β₁ x_t + ε_t

ε_t ~ N(0, σ²)

y_{T+h,T} | x_{T+h} = β₀ + β₁ x_{T+h}

Density forecast: N(y_{T+h,T} | x_{T+h}, σ²)

Scenario analysis, contingency analysis

No "forecasting the RHS variables" problem

Unconditional forecasting models:

y_{T+h,T} = β₀ + β₁ x_{T+h,T}

Forecasting the RHS variables problem
Could fit a model to x (e.g., an autoregressive model)
Preferably, regress y on x_{t−h}, x_{t−h−1}, ...
Distributed Lags

Start with unconditional forecast model:

y_t = β₀ + δ x_{t−1} + ε_t

Generalize to

y_t = β₀ + Σ_{i=1}^{Nx} δ_i x_{t−i} + ε_t

Polynomial distributed lags:

min_{β₀, δ_i} Σ_{t=Nx+1}^T [ y_t − β₀ − Σ_{i=1}^{Nx} δ_i x_{t−i} ]²

subject to

δ_i = P(i) = a + b i + c i²,  i = 1, ..., Nx

Lag weights constrained to lie on low-order polynomial
Additional constraints can be imposed, such as P(Nx) = 0
Smooth lag distribution
Parsimonious
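The polynomial restriction δ_i = a + bi + ci² turns the constrained problem into a small unconstrained OLS: instead of Nx lag coefficients, regress y on three constructed regressors. A sketch on simulated data (data and true weights are assumptions):

```python
# Sketch of a polynomial distributed lag: restrict delta_i = a + b*i + c*i^2
# and estimate (a, b, c) by OLS on transformed regressors. Simulated data.
import numpy as np

rng = np.random.default_rng(6)
T, Nx = 2000, 6
x = rng.normal(0, 1, T + Nx)
i = np.arange(1, Nx + 1)
delta_true = 1.0 + 0.2 * i - 0.05 * i**2               # weights on a quadratic
lags = np.column_stack([x[Nx - j:T + Nx - j] for j in i])  # columns x_{t-1}..x_{t-Nx}
y = lags @ delta_true + rng.normal(0, 0.5, T)

# delta_i = a + b*i + c*i^2  =>  y depends on three constructed regressors:
# sum_i x_{t-i}, sum_i i*x_{t-i}, sum_i i^2*x_{t-i}
Z = np.column_stack([lags @ np.ones(Nx), lags @ i.astype(float), lags @ (i**2).astype(float)])
abc, *_ = np.linalg.lstsq(np.column_stack([np.ones(T), Z]), y, rcond=None)
delta_hat = abc[1] + abc[2] * i + abc[3] * i**2        # recovered lag weights
```

Only 4 parameters are estimated regardless of Nx, which is exactly the parsimony the slide emphasizes.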
Rational distributed lags:

y_t = [A(L) / B(L)] x_t + ε_t

Equivalently,

B(L) y_t = A(L) x_t + B(L) ε_t

Lags of x and y included
Important to allow for lags of y, one way or another

Another way: distributed lag regression with lagged dependent variables

y_t = β₀ + Σ_{i=1}^{Ny} α_i y_{t−i} + Σ_{j=1}^{Nx} β_j x_{t−j} + ε_t

Another way: distributed lag regression with ARMA disturbances

y_t = β₀ + Σ_{i=1}^{Nx} δ_i x_{t−i} + ε_t

ε_t = [Θ(L) / Φ(L)] v_t

v_t ~ WN(0, σ²)

Comparison:

Distributed lag with lagged dependent variables:

B(L) y_t = A(L) x_t + ε_t,  or  y_t = [A(L) / B(L)] x_t + [1 / B(L)] ε_t

Distributed lag with ARMA disturbances:

y_t = A(L) x_t + [C(L) / D(L)] ε_t
Vector Autoregressions

e.g., bivariate VAR(1):

y_{1,t} = φ₁₁ y_{1,t−1} + φ₁₂ y_{2,t−1} + ε_{1,t}
y_{2,t} = φ₂₁ y_{1,t−1} + φ₂₂ y_{2,t−1} + ε_{2,t}

ε_{1,t} ~ WN(0, σ₁²),  ε_{2,t} ~ WN(0, σ₂²),  cov(ε_{1,t}, ε_{2,t}) = σ₁₂

Estimation by OLS
Order selection by information criteria
Impulse-response functions, variance decompositions, predictive causality
Forecasts via Wold's chain rule
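Wold's chain rule applies to the bivariate VAR(1) exactly as in the univariate case: stack the equations in matrix form and iterate. A sketch with an assumed coefficient matrix:

```python
# Sketch: h-step forecasts from a bivariate VAR(1) via Wold's chain rule,
# y_{T+h,T} = Phi @ y_{T+h-1,T}. Phi and y_T are assumptions for illustration.
import numpy as np

Phi = np.array([[0.5, 0.1],
                [0.2, 0.4]])        # [[phi11, phi12], [phi21, phi22]]
y_T = np.array([1.0, -1.0])         # last observed values of (y1, y2)

H = 4
forecasts = []
prev = y_T
for h in range(H):
    prev = Phi @ prev               # chain rule: each step feeds the next
    forecasts.append(prev)
forecasts = np.array(forecasts)
```

With all eigenvalues of Phi inside the unit circle, the forecasts shrink toward the unconditional mean of zero as h grows.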
Starts Correlogram

Sample: 1968:01 1991:12
Included observations: 288

Lag   Acorr.   P. Acorr.   Std. Error   Ljung-Box   p-value
 1    0.937     0.937        0.059        255.24      0.000
 2    0.907     0.244        0.059        495.53      0.000
 3    0.877     0.054        0.059        720.95      0.000
 4    0.838     0.077        0.059        927.39      0.000
 5    0.795     0.096        0.059       1113.7       0.000
 6    0.751     0.058        0.059       1280.9       0.000
 7    0.704     0.067        0.059       1428.2       0.000
 8    0.650     0.098        0.059       1554.4       0.000
 9    0.604     0.004        0.059       1663.8       0.000
10    0.544     0.129        0.059       1752.6       0.000
11    0.496     0.029        0.059       1826.7       0.000
12    0.446     0.008        0.059       1886.8       0.000
13    0.405     0.076        0.059       1936.8       0.000
14    0.346     0.144        0.059       1973.3       0.000
Completions Correlogram

Sample: 1968:01 1991:12
Included observations: 288

Lag   Acorr.   P. Acorr.   Std. Error   Ljung-Box   p-value
 1    0.939     0.939        0.059        256.61      0.000
 2    0.920     0.328        0.059        504.05      0.000
 3    0.896     0.066        0.059        739.19      0.000
 4    0.874     0.023        0.059        963.73      0.000
 5    0.834     0.165        0.059       1168.9       0.000
 6    0.802     0.067        0.059       1359.2       0.000
 7    0.761     0.100        0.059       1531.2       0.000
 8    0.721     0.070        0.059       1686.1       0.000
 9    0.677     0.055        0.059       1823.2       0.000
10    0.633     0.047        0.059       1943.7       0.000
11    0.583     0.080        0.059       2046.3       0.000
12    0.533     0.073        0.059       2132.2       0.000
Lag   P. Acorr.   Std. Error   Ljung-Box   p-value
 1     0.001        0.059        0.0004     0.985
 2     0.003        0.059        0.0029     0.999
 3     0.006        0.059        0.0119     1.000
 4     0.023        0.059        0.1650     0.997
 5     0.013        0.059        0.2108     0.999
 6     0.021        0.059        0.3463     0.999
 7     0.038        0.059        0.7646     0.998
 8     0.048        0.059        1.4362     0.994
 9     0.056        0.059        2.3528     0.985
10     0.116        0.059        6.1868     0.799
11     0.038        0.059        6.6096     0.830
12     0.028        0.059        6.8763     0.866
13     0.193        0.059       17.947      0.160
14     0.021        0.059       18.010      0.206
ε_t ~ WN(0, σ²)

h-step-ahead linear least-squares forecast:

y_{t+h,t} = μ + b_h ε_t + b_{h+1} ε_{t−1} + ...

Corresponding h-step-ahead forecast error:

e_{t+h,t} = y_{t+h} − y_{t+h,t} = ε_{t+h} + b₁ ε_{t+h−1} + ... + b_{h−1} ε_{t+1}

with variance

σ_h² = σ² (1 + Σ_{i=1}^{h−1} b_i²)
Forecast evaluation statistics:

ME = (1/T) Σ_{t=1}^T e_{t+h,t}

EV = (1/T) Σ_{t=1}^T (e_{t+h,t} − ME)²

MSE = (1/T) Σ_{t=1}^T e²_{t+h,t}

MSPE = (1/T) Σ_{t=1}^T p²_{t+h,t}

RMSE = sqrt( (1/T) Σ_{t=1}^T e²_{t+h,t} )
Forecast encompassing:

y_{t+h} = β_a y^a_{t+h,t} + β_b y^b_{t+h,t} + ε_{t+h,t}

Optimal combining weight:

ω* = (σ²_bb − σ_ab) / (σ²_aa + σ²_bb − 2σ_ab)

Regression-based combination:

y_{t+h} = β₀ + β₁ y^a_{t+h,t} + β₂ y^b_{t+h,t} + ε_{t+h,t}
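The optimal combining weight above depends only on the variances and covariance of the two forecast-error series. A sketch with simulated errors (the error processes are assumptions for illustration):

```python
# Sketch: variance-minimizing combining weight
# omega* = (sigma_bb - sigma_ab) / (sigma_aa + sigma_bb - 2*sigma_ab),
# applied to two simulated, correlated forecast-error series.
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
e_a = rng.normal(0, 1.0, n)                 # errors of forecast a
e_b = 0.3 * e_a + rng.normal(0, 1.5, n)     # correlated errors of forecast b

ea_c, eb_c = e_a - e_a.mean(), e_b - e_b.mean()
saa = (ea_c**2).mean()                      # sigma_aa
sbb = (eb_c**2).mean()                      # sigma_bb
sab = (ea_c * eb_c).mean()                  # sigma_ab
omega = (sbb - sab) / (saa + sbb - 2 * sab)

combined = omega * e_a + (1 - omega) * e_b  # combined forecast error
```

By construction the combined error variance is no larger than that of either forecast alone, which is the point of combining.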
Random Walk

y_t = y_{t−1} + ε_t

ε_t ~ WN(0, σ²)

Random walk with drift:

y_t = δ + y_{t−1} + ε_t

ε_t ~ WN(0, σ²)

Stochastic trend vs deterministic trend

Random walk:

y_t = y₀ + Σ_{i=1}^t ε_i

E(y_t) = y₀
var(y_t) = t σ²
lim_{t→∞} var(y_t) = ∞

Random walk with drift:

y_t = t δ + y₀ + Σ_{i=1}^t ε_i

E(y_t) = y₀ + t δ
var(y_t) = t σ²
lim_{t→∞} var(y_t) = ∞

ARIMA(p,1,q) model

ARIMA(p,d,q) model

Random walk forecasts:

e_{T+h,T} = Σ_{i=0}^{h−1} ε_{T+h−i}

σ_h² = h σ²
Unit root test:

y_t = φ y_{t−1} + ε_t

ε_t ~ iid N(0, σ²)

Test statistic for H₀: φ = 1:

τ̂ = (φ̂ − 1) / [ s / sqrt(Σ_{t=2}^T y²_{t−1}) ]

Dickey-Fuller distribution

Trick regression:

y_t − y_{t−1} = (φ − 1) y_{t−1} + ε_t
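The "trick regression" can be sketched directly: regress Δy_t on y_{t−1} and form the t-ratio on the slope, which is (φ̂ − 1) divided by its standard error. The data here are a simulated random walk (an assumption), so the true process has a unit root; the statistic follows the nonstandard Dickey-Fuller distribution, not the normal.

```python
# Sketch of the Dickey-Fuller trick regression on a simulated random walk:
# Delta y_t = (phi - 1)*y_{t-1} + eps_t; the t-ratio is the DF statistic.
import numpy as np

rng = np.random.default_rng(8)
T = 500
y = np.cumsum(rng.normal(0, 1, T))          # random walk: true unit root

dy = np.diff(y)                             # y_t - y_{t-1}
ylag = y[:-1]
rho = (ylag @ dy) / (ylag @ ylag)           # OLS slope = phihat - 1
resid = dy - rho * ylag
s2 = resid @ resid / (len(dy) - 1)          # s^2
se = np.sqrt(s2 / (ylag @ ylag))            # s / sqrt(sum y_{t-1}^2)
df_stat = rho / se                          # Dickey-Fuller statistic
```

Under the unit root, φ̂ is superconsistent, so the estimated slope φ̂ − 1 is very close to zero even though the t-ratio's distribution is nonstandard.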
AR(p):

y_t − Σ_{j=1}^p φ_j y_{t−j} = ε_t

Rewrite:

y_t = ρ₁ y_{t−1} + Σ_{j=2}^p ρ_j (y_{t−j+1} − y_{t−j}) + ε_t

where p ≥ 2, ρ₁ = Σ_{j=1}^p φ_j, and ρ_i = −Σ_{j=i}^p φ_j, i = 2, ..., p.

Allowing for a nonzero mean:

(y_t − μ) − Σ_{j=1}^p φ_j (y_{t−j} − μ) = ε_t

or

y_t = α + ρ₁ y_{t−1} + Σ_{j=2}^p ρ_j (y_{t−j+1} − y_{t−j}) + ε_t,

where α = μ(1 − Σ_{j=1}^p φ_j), and the other parameters are as above.
In the unit root case, the intercept vanishes, because Σ_{j=1}^p φ_j = 1.

Allowing for deterministic linear trend:

(y_t − a − b TIME_t) − Σ_{j=1}^p φ_j (y_{t−j} − a − b TIME_{t−j}) = ε_t

or

y_t = k₁ + k₂ TIME_t + ρ₁ y_{t−1} + Σ_{j=2}^p ρ_j (y_{t−j+1} − y_{t−j}) + ε_t

where

k₁ = a(1 − Σ_{i=1}^p φ_i) + b Σ_{i=1}^p i φ_i

and

k₂ = b(1 − Σ_{i=1}^p φ_i)

In the unit root case, Σ_{i=1}^p φ_i = 1, so k₁ = b Σ_{i=1}^p i φ_i and k₂ = 0.

Augmented Dickey-Fuller regressions:

y_t = ρ₁ y_{t−1} + Σ_{j=2}^{k−1} ρ_j (y_{t−j+1} − y_{t−j}) + ε_t

y_t = α + ρ₁ y_{t−1} + Σ_{j=2}^{k−1} ρ_j (y_{t−j+1} − y_{t−j}) + ε_t

y_t = k₁ + k₂ TIME_t + ρ₁ y_{t−1} + Σ_{j=2}^{k−1} ρ_j (y_{t−j+1} − y_{t−j}) + ε_t
Exponential Smoothing

Smoothed series, {ȳ_t}, t = 1, ..., T (estimate of the local level)
Forecasts, ŷ_{T+h,T}

1. Initialize at t = 1: ȳ₁ = y₁
2. Update: ȳ_t = α y_t + (1 − α) ȳ_{t−1},  t = 2, ..., T
3. Forecast: ŷ_{T+h,T} = ȳ_T

Equivalently, the smoothed level is a one-sided weighted moving average:

ȳ_t = Σ_{j=0}^{t−1} w_j y_{t−j}

where w_j = α(1 − α)^j
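The three-step algorithm above translates directly into code. A minimal sketch (the sample series and α = 0.5 are assumptions for illustration):

```python
# Sketch of the three-step exponential smoothing algorithm above.
def exp_smooth(y, alpha):
    """Return the smoothed series ybar_t and the flat h-step forecast ybar_T."""
    ybar = [y[0]]                            # 1. initialize: ybar_1 = y_1
    for t in range(1, len(y)):
        # 2. update: ybar_t = alpha*y_t + (1 - alpha)*ybar_{t-1}
        ybar.append(alpha * y[t] + (1 - alpha) * ybar[-1])
    return ybar, ybar[-1]                    # 3. forecast: yhat_{T+h,T} = ybar_T

series = [10.0, 12.0, 11.0, 13.0, 12.5]
smoothed, forecast = exp_smooth(series, alpha=0.5)
# smoothed -> [10.0, 11.0, 11.0, 12.0, 12.25]; forecast -> 12.25
```

Because the forecast is flat at ȳ_T for every horizon h, exponential smoothing is an optimal method only for a local-level (random-walk-plus-noise) type of process.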
Holt-Winters Smoothing
B(L) = Σ_{i=0}^∞ b_i L^i,  Σ_{i=0}^∞ b_i² < ∞,  b₀ = 1

ARCH(p) process:

ε_t | Ω_{t−1} ~ N(0, σ_t²)

σ_t² = ω + γ(L) ε_t²

γ(L) = Σ_{i=1}^p γ_i L^i,  ω > 0,  γ_i ≥ 0 for all i,  Σ_{i=1}^p γ_i < 1.
ARCH(1):

σ_t² = ω + α r²_{t−1}

Conditional variance:

E([r_t − E(r_t | Ω_{t−1})]² | Ω_{t−1}) = ω + α r²_{t−1}
GARCH(p,q) process:

y_t = ε_t

ε_t | Ω_{t−1} ~ N(0, σ_t²)

σ_t² = ω + α(L) ε_t² + β(L) σ_t²

α(L) = Σ_{i=1}^p α_i L^i,  β(L) = Σ_{i=1}^q β_i L^i

ω > 0,  α_i ≥ 0,  β_i ≥ 0,  Σ α_i + Σ β_i < 1.
Important result:

The above equation is simply

r_t² = ω + (α + β) r²_{t−1} + ν_t − β ν_{t−1},  where ν_t ≡ r_t² − σ_t²

so that

r_t² = σ_t² + ν_t

Thus r_t² is a noisy indicator of σ_t².

Connection to exponential smoothing of r_t²:

σ̂_t² = Σ_j w_j r²_{t−j},  where w_j = γ(1 − γ)^j

GARCH(1,1):

σ_t² = ω + α r²_{t−1} + β σ²_{t−1} = ω / (1 − β) + α Σ_{j=1}^∞ β^{j−1} r²_{t−j}
Estimation (Gaussian log-likelihood):

ln L ≈ −[(T − p)/2] ln(2π) − (1/2) Σ_{t=p+1}^T ln σ_t²(θ) − (1/2) Σ_{t=p+1}^T r_t² / σ_t²(θ)

Exogenous variables
TARCH:

σ_t² = ω + α r²_{t−1} + γ r²_{t−1} D_{t−1} + β σ²_{t−1}

where D_{t−1} = 1 if r_{t−1} < 0, and 0 otherwise.

positive return (good news): α effect on volatility
negative return (bad news): α + γ effect on volatility
γ ≠ 0: Asymmetric news response
γ > 0: Leverage effect
GARCH with exogenous variables:

r_t | Ω_{t−1} ~ N(0, σ_t²)

σ_t² = ω + α r²_{t−1} + β σ²_{t−1} + γ′ X_t

where:
γ is a parameter vector
X is a set of positive exogenous variables.
Component GARCH

Standard GARCH:

(σ_t² − σ̄²) = α (r²_{t−1} − σ̄²) + β (σ²_{t−1} − σ̄²)

Component GARCH, with a time-varying long-run component q_t:

(σ_t² − q_t) = α (r²_{t−1} − q_{t−1}) + γ (r²_{t−1} − q_{t−1}) D_{t−1} + β (σ²_{t−1} − q_{t−1}) + ...
Regression with GARCH disturbances:

y_t = x_t′ β + ε_t

ε_t | Ω_{t−1} ~ N(0, σ_t²)

GARCH-M model is a special case:

y_t = x_t′ β + γ σ_t² + ε_t

ε_t | Ω_{t−1} ~ N(0, σ_t²)