VAR Models: Gloria González-Rivera

VAR Models
Gloria González-Rivera
University of California, Riverside
and
Jesús Gonzalo, U. Carlos III de Madrid
Some References
Hamilton, chapter 11
Enders, chapter 5
Palgrave Handbook of Econometrics, chapter 12 by Lütkepohl
Any of the books of Lütkepohl on Multiple Time Series
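Multivariate Models
VARMAX Models as a multivariate generalization of the univariate ARMA models:
$$\sum_{s=0}^{p} \Phi_s L^s\, Y_t = \sum_{i=0}^{r} G_i L^i\, X_t + \sum_{j=0}^{q} \Theta_j L^j\, a_t$$
where $\Phi_s$ is $n \times n$, $Y_t$ is $n \times 1$, $G_i$ is $n \times k$, $X_t$ is $k \times 1$, $\Theta_j$ is $n \times n$ and $a_t$ is $n \times 1$.
Structural VAR Models:
$$B Y_t = \Gamma_1 Y_{t-1} + \dots + \Gamma_p Y_{t-p} + \varepsilon_t$$
VAR Models (reduced form):
$$Y_t = \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p} + a_t$$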
Multivariate Models (cont)
where the error term is a vector white noise:
$$E(a_t a_s') = \begin{cases} \Omega & \text{if } s = t \\ 0 & \text{otherwise} \end{cases}$$
To avoid parameter redundancy, we need to assume certain structure on $\Phi(L)$ and $\Theta(L)$.
This is similar to univariate models.

A Structural VAR(1)
Consider a bivariate $Y_t = (y_t, x_t)'$, first-order VAR model:
$$y_t = b_{10} - b_{12} x_t + \gamma_{11} y_{t-1} + \gamma_{12} x_{t-1} + \varepsilon_{yt}$$
$$x_t = b_{20} - b_{21} y_t + \gamma_{21} y_{t-1} + \gamma_{22} x_{t-1} + \varepsilon_{xt}$$
The error terms (structural shocks) $\varepsilon_{yt}$ and $\varepsilon_{xt}$ are white noise innovations with standard deviations $\sigma_y$ and $\sigma_x$ and a zero covariance.
The two variables y and x are endogenous (Why?)
Note that shock $\varepsilon_{yt}$ affects y directly and x indirectly.
There are 10 parameters to estimate.
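From a Structural VAR to a Standard VAR
The structural VAR is not a reduced form.
In a reduced form representation y and x are just functions of lagged y and x.
To solve for a reduced form write the structural VAR in matrix form as:
$$\begin{bmatrix} 1 & b_{12} \\ b_{21} & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$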
From a Structural VAR to a Standard VAR (cont)
Premultiplication by $B^{-1}$ allows us to obtain a standard VAR(1):
$$B Y_t = \Gamma_0 + \Gamma_1 Y_{t-1} + \varepsilon_t$$
$$Y_t = B^{-1}\Gamma_0 + B^{-1}\Gamma_1 Y_{t-1} + B^{-1}\varepsilon_t$$
$$Y_t = \Phi_0 + \Phi_1 Y_{t-1} + a_t$$
This is the reduced form we are going to estimate (by OLS equation by equation).
Before estimating it, we will present the stability conditions (the roots of some characteristic polynomial have to be outside the unit circle) for a VAR(p).
After estimating the reduced form, we will discuss what information we get from the estimates (Granger-causality, Impulse Response Function) and also how we can recover the structural parameters (notice that we have only 9 parameters now).
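The premultiplication step can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; all parameter values are hypothetical and chosen only for the example, and the names `Phi0`, `Phi1` and `Omega` stand for the reduced-form intercept, lag matrix and innovation covariance.

```python
import numpy as np

# Hypothetical structural parameters (illustrative values, not from the slides).
B = np.array([[1.0, 0.4],
              [0.2, 1.0]])           # contemporaneous matrix [[1, b12], [b21, 1]]
Gamma0 = np.array([0.5, 1.0])        # intercepts (b10, b20)
Gamma1 = np.array([[0.6, 0.1],
                   [0.2, 0.5]])      # lag coefficients gamma_ij
D = np.diag([1.0 ** 2, 0.5 ** 2])    # diagonal covariance of the structural shocks

B_inv = np.linalg.inv(B)
Phi0 = B_inv @ Gamma0                # reduced-form intercept
Phi1 = B_inv @ Gamma1                # reduced-form lag matrix
Omega = B_inv @ D @ B_inv.T          # covariance of a_t = B^{-1} eps_t

print(Phi0, Phi1, Omega, sep="\n")
```

Even though the structural shocks are uncorrelated, `Omega` has a non-zero off-diagonal element: the reduced-form innovations mix both structural shocks.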
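A bit of history .... Once Upon a Time
Sims (1980), "Macroeconomics and Reality", Econometrica, 48
Generalization of univariate analysis to an array of random variables
$$Y_t = \begin{bmatrix} Z_t \\ X_t \\ V_t \end{bmatrix}, \quad \text{i.e. } Z_t = \text{money supply},\ X_t = \text{interest rate},\ V_t = \text{income}$$
VAR(p):
$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$
$$E(a_t) = 0, \qquad E(a_t a_\tau') = \begin{cases} \Omega & t = \tau \\ 0 & t \neq \tau \end{cases}$$
$\Phi_i$ are matrices:
$$\Phi_1 = \begin{bmatrix} \phi_{11}^{(1)} & \phi_{12}^{(1)} & \phi_{13}^{(1)} \\ \phi_{21}^{(1)} & \phi_{22}^{(1)} & \phi_{23}^{(1)} \\ \phi_{31}^{(1)} & \phi_{32}^{(1)} & \phi_{33}^{(1)} \end{bmatrix}$$
A typical equation of the system is
$$Z_t = c_1 + \phi_{11}^{(1)} Z_{t-1} + \phi_{12}^{(1)} X_{t-1} + \phi_{13}^{(1)} V_{t-1} + \dots + \phi_{11}^{(p)} Z_{t-p} + \phi_{12}^{(p)} X_{t-p} + \phi_{13}^{(p)} V_{t-p} + a_{1t}$$
Each equation has the same regressors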
Stability Conditions
$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$
$$(I_n - \Phi_1 L - \Phi_2 L^2 - \dots - \Phi_p L^p)\, Y_t = c + a_t$$
$$\Phi(L)\, Y_t = c + a_t$$
$\Phi(L)$ is an $n \times n$ matrix polynomial in the lag operator L; the $(i,j)$ element of $\Phi(L)$ is
$$[\delta_{ij} - \phi_{ij}^{(1)} L - \phi_{ij}^{(2)} L^2 - \dots - \phi_{ij}^{(p)} L^p], \qquad \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
A VAR(p) for $Y_t$ is STABLE if the roots x of the characteristic polynomial
$$|I_n - \Phi_1 x - \Phi_2 x^2 - \dots - \Phi_p x^p| = 0$$
are outside of the unit circle. In that case the mean is
$$\mu = (I_n - \Phi_1 - \Phi_2 - \dots - \Phi_p)^{-1} c$$
If the VAR is stable then a MA($\infty$) representation exists.
This representation will be the key to study the impulse response function of a given shock.
$$Y_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots = \mu + \Psi(L)\, a_t$$
$$\Psi(L) = [I_n + \Psi_1 L + \Psi_2 L^2 + \dots]$$
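Re-writing the system in deviations from its mean
$$Y_t - \mu = \Phi_1 (Y_{t-1} - \mu) + \Phi_2 (Y_{t-2} - \mu) + \dots + \Phi_p (Y_{t-p} - \mu) + a_t$$
Stack the vector as
$$\xi_t = \begin{bmatrix} Y_t - \mu \\ Y_{t-1} - \mu \\ \vdots \\ Y_{t-p+1} - \mu \end{bmatrix}_{(np \times 1)}, \quad F = \begin{bmatrix} \Phi_1 & \Phi_2 & \cdots & \Phi_{p-1} & \Phi_p \\ I_n & 0 & \cdots & 0 & 0 \\ 0 & I_n & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I_n & 0 \end{bmatrix}_{(np \times np)}, \quad v_t = \begin{bmatrix} a_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{(np \times 1)}$$
$$\xi_t = F \xi_{t-1} + v_t, \qquad E(v_t v_\tau') = \begin{cases} H & t = \tau \\ 0 & t \neq \tau \end{cases}$$
where
$$H = \begin{bmatrix} \Omega & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}_{(np \times np)}$$
STABLE: eigenvalues of F lie inside of the unit circle (WHY?).
VAR(p) → VAR(1)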
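The companion-form stability check can be sketched as follows. This is a minimal illustration assuming NumPy; the helper names `companion` and `is_stable` and the coefficient values are hypothetical, introduced only for this example.

```python
import numpy as np

def companion(phis):
    """Stack the n x n matrices Phi_1..Phi_p into the (np x np) matrix F."""
    n, p = phis[0].shape[0], len(phis)
    F = np.zeros((n * p, n * p))
    F[:n, :] = np.hstack(phis)             # first block row: Phi_1 ... Phi_p
    F[n:, :-n] = np.eye(n * (p - 1))       # identity blocks below the first block row
    return F

def is_stable(phis):
    """True when every eigenvalue of F lies strictly inside the unit circle."""
    eigenvalues = np.linalg.eigvals(companion(phis))
    return bool(np.max(np.abs(eigenvalues)) < 1.0)

# Illustrative VAR(2) coefficients (assumed values).
Phi1 = np.array([[0.5, 0.1], [0.0, 0.3]])
Phi2 = np.array([[0.2, 0.0], [0.1, 0.1]])
print(is_stable([Phi1, Phi2]))
print(is_stable([np.eye(2)]))   # a bivariate random walk has unit eigenvalues
```

The eigenvalues of F are the reciprocals of the roots of the characteristic polynomial, which is why "roots outside the unit circle" and "eigenvalues inside the unit circle" describe the same condition.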
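Estimation of VAR models
Estimation: Conditional MLE
$$f(Y_T, Y_{T-1}, \dots, Y_1 \mid Y_0, Y_{-1}, \dots, Y_{-p+1}; \theta) = \prod_{t=1}^{T} f(Y_t \mid Y_{t-1}, Y_{t-2}, \dots, Y_{-p+1}; \theta)$$
$$Y_t \mid Y_{t-1}, Y_{t-2}, \dots \sim N(c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p},\ \Omega)$$
$$\Pi' \equiv [c \ \ \Phi_1 \ \ \Phi_2 \ \dots \ \Phi_p] \quad (n \times (np+1))$$
$$X_t \equiv [1 \ \ Y_{t-1}' \ \ Y_{t-2}' \ \dots \ Y_{t-p}']' \quad ((np+1) \times 1)$$
$$Y_t = \Pi' X_t + a_t$$
$$\mathcal{L}(\theta) = \sum_{t=1}^{T} \log f(Y_t \mid \text{past}; \theta) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\Omega^{-1}| - \frac{1}{2}\sum_{t=1}^{T} (Y_t - \Pi' X_t)'\, \Omega^{-1} (Y_t - \Pi' X_t)$$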
Claim: OLS estimates equation by equation are good!!!
$$\hat\Pi'_{mle} = \hat\Pi'_{ols} = \left[\sum_{t=1}^{T} Y_t X_t'\right] \left[\sum_{t=1}^{T} X_t X_t'\right]^{-1}$$

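Proof:
$$\sum_{t=1}^{T} (Y_t - \Pi' X_t)'\, \Omega^{-1} (Y_t - \Pi' X_t)$$
$$= \sum_{t=1}^{T} (Y_t - \hat\Pi_{ols}' X_t + \hat\Pi_{ols}' X_t - \Pi' X_t)'\, \Omega^{-1} (Y_t - \hat\Pi_{ols}' X_t + \hat\Pi_{ols}' X_t - \Pi' X_t)$$
$$= \sum_{t=1}^{T} [\hat a_t + (\hat\Pi_{ols} - \Pi)' X_t]'\, \Omega^{-1} [\hat a_t + (\hat\Pi_{ols} - \Pi)' X_t]$$
$$= \sum_{t=1}^{T} \hat a_t' \Omega^{-1} \hat a_t + 2 \sum_{t=1}^{T} \hat a_t' \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t \ (*) + \sum_{t=1}^{T} X_t' (\hat\Pi_{ols} - \Pi)\, \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t$$
The cross term (*) is zero because the OLS residuals are orthogonal to the regressors:
$$(*)\quad \sum_t \hat a_t' \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t = tr\left[\sum_t \hat a_t' \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t\right] = tr\left[\sum_t \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t \hat a_t'\right] = tr\left[\Omega^{-1} (\hat\Pi_{ols} - \Pi)' \sum_t X_t \hat a_t'\right] = 0$$
Therefore
$$\min_\Pi \sum_{t=1}^{T} (Y_t - \Pi' X_t)'\, \Omega^{-1} (Y_t - \Pi' X_t) = \min_\Pi \left[\sum_{t=1}^{T} \hat a_t' \Omega^{-1} \hat a_t + \sum_{t=1}^{T} X_t' (\hat\Pi_{ols} - \Pi)\, \Omega^{-1} (\hat\Pi_{ols} - \Pi)' X_t\right]$$
Because $\Omega$ is a positive definite matrix, $\Omega^{-1}$ is positive definite, so the smallest value is achieved when $\Pi = \hat\Pi_{ols}$.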
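Maximum Likelihood of $\Omega$
Evaluate the log-likelihood at $\hat\Pi$, then
$$\mathcal{L}(\Omega, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\Omega^{-1}| - \frac{1}{2}\sum_{t=1}^{T} \hat a_t' \Omega^{-1} \hat a_t$$
$$\frac{\partial \mathcal{L}(\Omega, \hat\Pi)}{\partial \Omega^{-1}} = \frac{T}{2}\Omega' - \frac{1}{2}\sum_{t=1}^{T} \hat a_t \hat a_t' = 0 \ \Rightarrow\ \hat\Omega = \frac{1}{T}\sum_{t=1}^{T} \hat a_t \hat a_t'$$
$$\text{diagonal elements: } \hat\sigma_i^2 = \frac{1}{T}\sum_{t=1}^{T} \hat a_{it}^2, \qquad \text{off-diagonal elements: } \hat\sigma_{ij} = \frac{1}{T}\sum_{t=1}^{T} \hat a_{it}\hat a_{jt}$$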
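The two estimation results (OLS for $\Pi$, average outer product of residuals for $\Omega$) can be sketched together. This is a minimal illustration assuming NumPy; the function name `fit_var_ols`, the simulated data and the coefficient values are hypothetical and serve only as an example.

```python
import numpy as np

def fit_var_ols(Y, p):
    """OLS fit of a VAR(p) with constant; Y has shape (T_total, n)."""
    T_total, n = Y.shape
    # Row t of X is X_t' = [1, Y_{t-1}', ..., Y_{t-p}'].
    lagged = [Y[p - k:T_total - k] for k in range(1, p + 1)]
    X = np.hstack([np.ones((T_total - p, 1))] + lagged)
    Y_dep = Y[p:]
    # lstsq solves all n equations at once; each column is one equation's OLS fit.
    Pi_hat, *_ = np.linalg.lstsq(X, Y_dep, rcond=None)
    resid = Y_dep - X @ Pi_hat
    Omega_hat = resid.T @ resid / resid.shape[0]   # ML estimate divides by T
    return Pi_hat, Omega_hat, resid

# Simulate an illustrative bivariate VAR(1) (assumed coefficients).
rng = np.random.default_rng(0)
Phi = np.array([[0.5, 0.1], [0.2, 0.3]])
Y = np.zeros((500, 2))
for t in range(1, 500):
    Y[t] = Phi @ Y[t - 1] + rng.standard_normal(2)

Pi_hat, Omega_hat, resid = fit_var_ols(Y, p=1)
print("estimated Phi_1:\n", Pi_hat[1:].T)
print("estimated Omega:\n", Omega_hat)
```

Because every equation shares the same regressors, solving the n least-squares problems jointly gives the same coefficients as running them one equation at a time.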
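Testing Hypotheses in a VAR model
Likelihood ratio test in VAR
$$\mathcal{L}(\hat\Omega, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat\Omega^{-1}| - \frac{1}{2}\sum_{t=1}^{T} \hat a_t' \hat\Omega^{-1} \hat a_t$$
$$\frac{1}{2}\sum_{t=1}^{T} \hat a_t' \hat\Omega^{-1} \hat a_t = \frac{1}{2}\, trace\left[\sum_{t=1}^{T} \hat a_t' \hat\Omega^{-1} \hat a_t\right] = \frac{1}{2}\, trace\left[\sum_{t=1}^{T} \hat\Omega^{-1} \hat a_t \hat a_t'\right] = \frac{1}{2}\, trace\left[\hat\Omega^{-1} T \hat\Omega\right] = \frac{1}{2}\, trace(T I_n) = \frac{Tn}{2}$$
$$\mathcal{L}(\hat\Omega, \hat\Pi) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat\Omega^{-1}| - \frac{Tn}{2}$$
Testing the number of lags:
$$H_0: VAR(p_0), \qquad H_1: VAR(p_1), \qquad p_1 > p_0$$
Under $H_0$: perform n OLS regressions of each variable on a constant and $p_0$ lags
$$\mathcal{L}_0^* = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat\Omega_0^{-1}| - \frac{Tn}{2}$$
Under $H_1$: perform n OLS regressions of each variable on a constant and $p_1$ lags
$$\mathcal{L}_1^* = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat\Omega_1^{-1}| - \frac{Tn}{2}$$
$$LR = 2(\mathcal{L}_1^* - \mathcal{L}_0^*) = T\{\log|\hat\Omega_1^{-1}| - \log|\hat\Omega_0^{-1}|\} = T\{\log|\hat\Omega_0| - \log|\hat\Omega_1|\} \to \chi^2_m$$
$$m = \text{number of restrictions} = n^2 (p_1 - p_0)$$
Each equation has $(p_1 - p_0)$ restrictions on each variable, that is, $n(p_1 - p_0)$ restrictions in each equation.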
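The LR statistic only needs the two residual covariance estimates. The sketch below assumes NumPy; the function name `lr_lag_test` and the two covariance matrices are hypothetical stand-ins for $\hat\Omega_0$ and $\hat\Omega_1$ obtained from OLS fits with $p_0$ and $p_1$ lags on the same sample.

```python
import numpy as np

def lr_lag_test(Omega0, Omega1, T, p0, p1):
    """Return (LR, m): LR = T (log|Omega0| - log|Omega1|), m = n^2 (p1 - p0)."""
    n = Omega0.shape[0]
    _, logdet0 = np.linalg.slogdet(Omega0)   # log-determinant, numerically stable
    _, logdet1 = np.linalg.slogdet(Omega1)
    lr = T * (logdet0 - logdet1)
    m = n * n * (p1 - p0)
    return lr, m

# Hypothetical residual covariances from a VAR(1) and a VAR(2) fit.
Omega0 = np.array([[1.00, 0.20], [0.20, 1.00]])
Omega1 = np.array([[0.90, 0.20], [0.20, 0.95]])
lr, m = lr_lag_test(Omega0, Omega1, T=100, p0=1, p1=2)
print(lr, m)   # compare lr with a chi-square(m) critical value
```

With n = 2 and one extra lag there are m = 4 restrictions; the 5% critical value of a $\chi^2_4$ is about 9.49, so a statistic above that rejects the shorter lag length.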

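In general, linear hypotheses can be tested directly as usual, and their asymptotic distribution (A.D.) follows from the next asymptotic result:
Let $\hat\pi_T = vec(\hat\Pi_T)$ denote the $(nk \times 1)$ vector of coefficients (with $k = 1 + np$ the number of parameters estimated per equation) resulting from OLS regressions of each of the elements of $y_t$ on $x_t$ for a sample of size T:
$$\hat\pi_T = \begin{bmatrix} \hat\pi_{1.T} \\ \vdots \\ \hat\pi_{n.T} \end{bmatrix}, \quad \text{where } \hat\pi_{iT} = \left[\sum_{t=1}^{T} x_t x_t'\right]^{-1} \left[\sum_{t=1}^{T} x_t y_{it}\right]$$
Asymptotic distribution of $\hat\pi_T$ is
$$\sqrt{T}(\hat\pi_T - \pi) \to N(0,\ \Omega \otimes M^{-1}),$$
and the coefficients of regression i
$$\sqrt{T}(\hat\pi_{iT} - \pi_i) \to N(0,\ \sigma_i^2 M^{-1}) \quad \text{with } M = plim\ \frac{1}{T}\sum_{t} x_t x_t'$$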
Information Criterion in a Standard VAR(p)
$$AIC = \ln|\hat\Omega| + \frac{2(n^2 p + n)}{T}$$
$$SBC = \ln|\hat\Omega| + \frac{(n^2 p + n)\ln(T)}{T}$$
In the same way as in the univariate AR(p) models, Information Criteria (IC) can be used to choose the right number of lags in a VAR(p): choose the $\hat p$ that minimizes IC(p) for p = 1, ..., P.
Similar consistency results to the ones obtained in the univariate world are obtained in the multivariate world. The only difference is that as the number of variables gets bigger, it is more unlikely that the AIC ends up overparametrizing (see Gonzalo and Pitarakis (2002), Journal of Time Series Analysis)
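Lag selection by information criteria can be sketched as below, assuming NumPy. The function name `var_ic` and the three residual covariance matrices are hypothetical; in practice each $\hat\Omega$ would come from an OLS fit with the corresponding lag length on a common sample.

```python
import numpy as np

def var_ic(Omega_hat, T, p):
    """AIC and SBC of a VAR(p) with a constant, using the formulas on the slide."""
    n = Omega_hat.shape[0]
    k = n * n * p + n                        # total number of mean parameters
    _, logdet = np.linalg.slogdet(Omega_hat)
    aic = logdet + 2 * k / T
    sbc = logdet + k * np.log(T) / T
    return aic, sbc

# Hypothetical residual covariance estimates for p = 1, 2, 3 (T = 200).
candidates = {
    1: np.array([[1.00, 0.20], [0.20, 1.00]]),
    2: np.array([[0.80, 0.15], [0.15, 0.85]]),
    3: np.array([[0.79, 0.15], [0.15, 0.84]]),
}
ic = {p: var_ic(Om, T=200, p=p) for p, Om in candidates.items()}
best_aic = min(ic, key=lambda p: ic[p][0])
best_sbc = min(ic, key=lambda p: ic[p][1])
print(best_aic, best_sbc)
```

The third lag lowers $\ln|\hat\Omega|$ only slightly in this made-up example, so the penalty term dominates and both criteria stay at two lags.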
Granger Causality
Granger (1969): "Investigating Causal Relations by Econometric Models and Cross-Spectral Methods", Econometrica, 37
Consider two random variables $X_t, Y_t$
Two forecasts of $X_t$, s periods ahead:
$$(1)\quad \hat X_t^{(1)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots)$$
$$(2)\quad \hat X_t^{(2)}(s) = E(X_{t+s} \mid X_t, X_{t-1}, \dots, Y_t, Y_{t-1}, \dots)$$
$$MSE(\hat X_t(s)) = E(X_{t+s} - \hat X_t(s))^2$$
If $MSE(\hat X_t^{(1)}(s)) = MSE(\hat X_t^{(2)}(s))$ for all $s > 0$, then $Y_t$ does not Granger-cause $X_t$.
$Y_t$ is not linearly informative to forecast $X_t$.
Test for Granger-causality
Assume a lag length of p
$$X_t = c_1 + \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + a_t$$
Estimate by OLS and test for the following hypothesis
$$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0 \quad (Y_t \text{ does not Granger-cause } X_t)$$
$$H_1: \text{any } \beta_i \neq 0$$
Unrestricted sum of squared residuals: $RSS_1 = \sum_t \hat a_t^2$
Restricted sum of squared residuals: $RSS_2 = \sum_t \tilde a_t^2$
$$F = \frac{RSS_2 - RSS_1}{RSS_1 / (T - 2p - 1)}$$
Under general conditions $F \to \chi^2(p)$
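The test can be sketched with two least-squares fits. This is a minimal illustration assuming NumPy; the function name `granger_stat` and the simulated series are hypothetical, with the data generated so that lagged y enters the equation for x but not the other way round.

```python
import numpy as np

def granger_stat(x, y, p):
    """(RSS2 - RSS1) / (RSS1 / (T - 2p - 1)) for H0: y does not Granger-cause x."""
    n_obs = len(x)

    def lags(z):
        # Columns are z_{t-1}, ..., z_{t-p} aligned with the dependent variable.
        return np.column_stack([z[p - k:n_obs - k] for k in range(1, p + 1)])

    def rss(X, dep):
        beta, *_ = np.linalg.lstsq(X, dep, rcond=None)
        return float(np.sum((dep - X @ beta) ** 2))

    dep = x[p:]
    T = len(dep)
    X_restricted = np.column_stack([np.ones(T), lags(x)])
    X_unrestricted = np.column_stack([X_restricted, lags(y)])
    rss1 = rss(X_unrestricted, dep)   # unrestricted: own lags and lags of y
    rss2 = rss(X_restricted, dep)     # restricted: lags of y excluded
    return (rss2 - rss1) / (rss1 / (T - 2 * p - 1))

# Simulated example (assumed coefficients): y helps predict x, not vice versa.
rng = np.random.default_rng(1)
n_obs = 500
y = np.zeros(n_obs)
x = np.zeros(n_obs)
for t in range(1, n_obs):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()
    x[t] = 0.3 * x[t - 1] + 0.5 * y[t - 1] + rng.standard_normal()

stat_y_to_x = granger_stat(x, y, p=2)
stat_x_to_y = granger_stat(y, x, p=2)
print(stat_y_to_x, stat_x_to_y)
```

Each statistic is compared with a $\chi^2(p)$ critical value (about 5.99 at the 5% level for p = 2); the statistic is never negative because adding regressors cannot increase the residual sum of squares.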
Impulse Response Function (IRF)
Objective: the reaction of the system to a shock
$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + a_t$$
If the system is stable,
$$Y_t = \mu + \Psi(L)\, a_t = \mu + a_t + \Psi_1 a_{t-1} + \Psi_2 a_{t-2} + \dots$$
$$\Psi(L) = [\Phi(L)]^{-1}$$
Redating at time $t + s$:
$$Y_{t+s} = \mu + a_{t+s} + \Psi_1 a_{t+s-1} + \Psi_2 a_{t+s-2} + \dots + \Psi_s a_t + \Psi_{s+1} a_{t-1} + \dots$$
$$\frac{\partial Y_{t+s}}{\partial a_t'} = \Psi_s \ (n \times n), \qquad \frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)}$$
Reaction of the i-variable to a unit change in innovation j (multipliers)
Impulse Response Function (cont)
Impulse-response function: response of $y_{i,t+s}$ to one-time impulse in $y_{jt}$ with all other variables dated t or earlier held constant.
$$\frac{\partial y_{i,t+s}}{\partial a_{jt}} = \psi_{ij}^{(s)}$$
[Figure: $\psi_{ij}^{(s)}$ plotted against the horizon s = 1, 2, 3, ...]
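Example: IRF for a VAR(1)
$$\begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1t} \\ a_{2t} \end{bmatrix}; \qquad \Omega = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}$$
$t < 0$: $y_{1t} = y_{2t} = 0$
$t = 0$: $a_{10} = 0$, $a_{20} = 1$ ($y_{2t}$ increases by 1 unit) (impulse)
$t > 0$: $a_{1t} = a_{2t} = 0$ (no more shocks occur)
Reaction of the system
$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \phi_{12} \\ \phi_{22} \end{bmatrix}$$
$$\begin{bmatrix} y_{12} \\ y_{22} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
$$\vdots$$
$$\begin{bmatrix} y_{1s} \\ y_{2s} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{1,s-1} \\ y_{2,s-1} \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}^s \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$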
If you work with the MA representation:
$$\Psi(L) = [\Phi(L)]^{-1}$$
$$\Psi_1 = \Phi_1, \quad \Psi_2 = \Phi_1^2, \quad \dots, \quad \Psi_s = \Phi_1^s$$
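The VAR(1) recursion and its MA shortcut can be sketched in a few lines, assuming NumPy. The coefficient matrix `Phi` below holds hypothetical values chosen only for illustration.

```python
import numpy as np

# Illustrative VAR(1) coefficient matrix (assumed values).
Phi = np.array([[0.5, 0.2],
                [0.1, 0.4]])
impulse = np.array([0.0, 1.0])       # a_10 = 0, a_20 = 1; no shocks afterwards

# Iterate the system forward: y_s = Phi y_{s-1}.
responses = [impulse]
for s in range(1, 6):
    responses.append(Phi @ responses[-1])
responses = np.array(responses)      # row s holds (y_1s, y_2s)

# Same numbers from the MA coefficients Psi_s = Phi^s (second column = shock 2).
Psi_3 = np.linalg.matrix_power(Phi, 3)
print(responses[3], Psi_3[:, 1])
```

Row s of `responses` is the second column of $\Phi_1^s$, which is the response of both variables to the unit impulse in the second innovation.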
In this example, the variance-covariance matrix of the innovations is not diagonal, i.e. $\sigma_{12} \neq 0$.
There is contemporaneous correlation between shocks, then
$$\begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
This is not very realistic.
To avoid this problem, the variance-covariance matrix has to be diagonalized (the shocks have to be orthogonal), and this is where serious problems appear.
Reminder:
$\Omega$ is a positive definite (symmetric) matrix $\Rightarrow \exists\ Q$ (non-singular) such that $Q \Omega Q' = I$.
Then, the MA representation:
$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i a_{t-i}, \qquad \Psi_0 = I_n$$
$$Y_t = \mu + \sum_{i=0}^{\infty} \Psi_i Q^{-1} Q a_{t-i}$$
Let us call $M_i = \Psi_i Q^{-1}$; $w_t = Q a_t$ $\Rightarrow$ $Y_t = \mu + \sum_{i=0}^{\infty} M_i w_{t-i}$
$$E[w_t w_t'] = E[Q a_t a_t' Q'] = Q E[a_t a_t'] Q' = Q \Omega Q' = I_n$$
$w_t$ has components that are all uncorrelated and unit variance
$$\frac{\partial Y_{t+s}}{\partial w_t'} = M_s = \Psi_s Q^{-1}$$
Orthogonalized impulse-response function.
Problem: Q is not unique
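One common choice of Q comes from the Cholesky factor of $\Omega$, and the non-uniqueness can be seen by rotating it. The sketch below assumes NumPy; `Omega`, `Phi` and the rotation angle are hypothetical values used only to illustrate the point for a VAR(1), where $\Psi_s = \Phi_1^s$.

```python
import numpy as np

# Illustrative innovation covariance and VAR(1) coefficients (assumed values).
Omega = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
Phi = np.array([[0.5, 0.2],
                [0.1, 0.4]])

P = np.linalg.cholesky(Omega)        # lower triangular, Omega = P P'
Q = np.linalg.inv(P)                 # then Q Omega Q' = I
# Orthogonalized responses M_s = Psi_s Q^{-1} = Phi^s P.
M = [np.linalg.matrix_power(Phi, s) @ P for s in range(4)]

# Any orthogonal rotation R gives another valid Q, with different responses.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Q_alt = R @ Q
M_alt = [np.linalg.matrix_power(Phi, s) @ np.linalg.inv(Q_alt) for s in range(4)]

print(M[1])
print(M_alt[1])
```

Both `Q` and `Q_alt` orthogonalize the shocks, yet they imply different impulse responses; the Cholesky choice additionally depends on the ordering of the variables.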
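Variance decomposition
Contribution of the j-th orthogonalized innovation to the MSE of the s-period ahead forecast
$$MSE(\hat Y_t(s)) = E[(Y_{t+s} - \hat Y_t(s))(Y_{t+s} - \hat Y_t(s))']$$
$$e_t(s) = Y_{t+s} - \hat Y_t(s) = a_{t+s} + \Psi_1 a_{t+s-1} + \dots + \Psi_{s-1} a_{t+1}$$
$$E[e_t(s)\, e_t(s)'] = \Omega + \Psi_1 \Omega \Psi_1' + \dots + \Psi_{s-1} \Omega \Psi_{s-1}'$$
$$MSE(s) = Q^{-1} Q \Omega Q' Q^{-1\prime} + \Psi_1 Q^{-1} Q \Omega Q' Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q \Omega Q' Q^{-1\prime} \Psi_{s-1}'$$
$$= Q^{-1} Q^{-1\prime} + \Psi_1 Q^{-1} Q^{-1\prime} \Psi_1' + \dots + \Psi_{s-1} Q^{-1} Q^{-1\prime} \Psi_{s-1}'$$
$$= M_0 M_0' + M_1 M_1' + \dots + M_{s-1} M_{s-1}'$$
recall that $M_i = \Psi_i Q^{-1}$ and $M_0 = \Psi_0 Q^{-1} = Q^{-1}$, since $\Psi_0 = I$.
Each product $M_i M_i'$ is the sum over j of the outer products of the j-th column of $M_i$ with itself, so the terms built from the first columns give the contribution of the first orthogonalized innovation to the MSE (do it for a two variables VAR model).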
Example: Variance decomposition in a two variables (y, x) VAR
The s-step ahead forecast error for variable y is:
$$y_{t+s} - E_t y_{t+s} = M_0(1,1)\varepsilon_{y,t+s} + M_1(1,1)\varepsilon_{y,t+s-1} + \dots + M_{s-1}(1,1)\varepsilon_{y,t+1} + M_0(1,2)\varepsilon_{x,t+s} + M_1(1,2)\varepsilon_{x,t+s-1} + \dots + M_{s-1}(1,2)\varepsilon_{x,t+1}$$
Denote the variance of the s-step ahead forecast error of $y_{t+s}$ as $\sigma_y(s)^2$:
$$\sigma_y(s)^2 = \sigma_y^2 [M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2] + \sigma_x^2 [M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2]$$
The forecast error variance decompositions are proportions of $\sigma_y(s)^2$.
$$\text{due to shocks to } y: \quad \sigma_y^2 [M_0(1,1)^2 + M_1(1,1)^2 + \dots + M_{s-1}(1,1)^2] \,/\, \sigma_y(s)^2$$
$$\text{due to shocks to } x: \quad \sigma_x^2 [M_0(1,2)^2 + M_1(1,2)^2 + \dots + M_{s-1}(1,2)^2] \,/\, \sigma_y(s)^2$$
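The proportions can be computed directly from the matrices $M_0, \dots, M_{s-1}$. The sketch below assumes NumPy; the function name `fevd`, the VAR(1) coefficients and the covariance matrix are hypothetical, and the orthogonalization uses a Cholesky factor, which is one possible choice of $Q^{-1}$ (so the shocks have unit variance).

```python
import numpy as np

def fevd(M_list, shock_var):
    """Forecast error variance shares for horizon s = len(M_list).

    M_list[k][i, j] is the response of variable i to orthogonal shock j after
    k periods; shock_var[j] is the variance of shock j. Rows sum to one.
    """
    # Sum of squared responses, with column j scaled by the variance of shock j.
    contrib = sum(M ** 2 for M in M_list) * np.asarray(shock_var)
    return contrib / contrib.sum(axis=1, keepdims=True)

# Illustrative VAR(1) (assumed values), Cholesky-orthogonalized shocks.
Phi = np.array([[0.5, 0.2],
                [0.1, 0.4]])
Omega = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
P = np.linalg.cholesky(Omega)
M = [np.linalg.matrix_power(Phi, k) @ P for k in range(4)]

shares = fevd(M, shock_var=[1.0, 1.0])   # 4-step-ahead decomposition
print(shares)
```

With a lower-triangular factor, the one-step forecast error of the first variable is attributed entirely to the first shock, which is another way of seeing that the ordering matters.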
Identification in a Standard VAR(1)
Remember that we started with a structural VAR model, and jumped into the reduced form or standard VAR for estimation purposes.
Is it possible to recover the parameters in the structural VAR from the estimated parameters in the standard VAR? No!!
There are 10 parameters in the bivariate structural VAR(1) and only 9 estimated parameters in the standard VAR(1).
The VAR is underidentified.
If one parameter in the structural VAR is restricted the standard VAR is exactly identified.
Sims (1980) suggests a recursive system to identify the model letting $b_{21} = 0$:
$$\begin{bmatrix} 1 & b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
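Identification in a Standard VAR(1) (cont.)
$b_{21} = 0$ implies
$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} b_{10} \\ b_{20} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} 1 & -b_{12} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{xt} \end{bmatrix}$$
$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} \phi_{10} \\ \phi_{20} \end{bmatrix} + \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix}$$
The parameters of the structural VAR can now be identified from the following 9 equations
$$\phi_{10} = b_{10} - b_{12} b_{20}, \qquad \phi_{20} = b_{20}, \qquad var(e_1) = \sigma_y^2 + b_{12}^2 \sigma_x^2$$
$$\phi_{11} = \gamma_{11} - b_{12}\gamma_{21}, \qquad \phi_{21} = \gamma_{21}, \qquad var(e_2) = \sigma_x^2$$
$$\phi_{12} = \gamma_{12} - b_{12}\gamma_{22}, \qquad \phi_{22} = \gamma_{22}, \qquad cov(e_1, e_2) = -b_{12}\sigma_x^2$$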
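The nine equations can be inverted in closed form once $b_{21} = 0$ is imposed. The sketch below assumes NumPy; the function name `recover_structural` and all numerical values are hypothetical, and the example checks the inversion by a round trip from known structural parameters.

```python
import numpy as np

def recover_structural(phi0, Phi1, Sigma_e):
    """Invert the nine identification equations under the restriction b21 = 0."""
    sigma_x2 = Sigma_e[1, 1]                       # var(e2) = sigma_x^2
    b12 = -Sigma_e[0, 1] / sigma_x2                # cov(e1, e2) = -b12 sigma_x^2
    sigma_y2 = Sigma_e[0, 0] - b12 ** 2 * sigma_x2
    B = np.array([[1.0, b12], [0.0, 1.0]])
    return {"b12": b12, "Gamma0": B @ phi0, "Gamma1": B @ Phi1,
            "sigma_y2": sigma_y2, "sigma_x2": sigma_x2}

# Round trip with illustrative structural values (assumed, not from the slides).
b12_true = 0.4
B_inv = np.array([[1.0, -b12_true], [0.0, 1.0]])
Gamma0_true = np.array([0.5, 1.0])
Gamma1_true = np.array([[0.6, 0.1], [0.2, 0.5]])
D_true = np.diag([1.0, 0.25])                      # sigma_y^2, sigma_x^2

phi0 = B_inv @ Gamma0_true                         # reduced-form intercepts
Phi1 = B_inv @ Gamma1_true                         # reduced-form lag matrix
Sigma_e = B_inv @ D_true @ B_inv.T                 # reduced-form error covariance

est = recover_structural(phi0, Phi1, Sigma_e)
print(est)
```

In applied work `phi0`, `Phi1` and `Sigma_e` would be replaced by their OLS estimates; the algebra of the recovery step is unchanged.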
Identification in a Standard VAR(1) (cont.)
Note both structural shocks can now be identified from the residuals of the standard VAR.
$b_{21} = 0$ implies y does not have a contemporaneous effect on x.
This restriction manifests itself such that both $\varepsilon_{yt}$ and $\varepsilon_{xt}$ affect y contemporaneously but only $\varepsilon_{xt}$ affects x contemporaneously.
The residuals $e_{2t}$ are due to pure shocks to x.
Decomposing the residuals of the standard VAR in this triangular fashion is called the Choleski decomposition.
There are other methods used to identify models, like Blanchard and Quah (1989) decomposition (it will be covered on the blackboard).
Criticisms of VAR
A VAR model can be a good forecasting model, but in a sense it is an atheoretical model (as all the reduced form models are).
To calculate the IRF, the order matters: remember that Q is not unique.
Sensitive to the lag selection
Dimensionality problem.
THINK of TWO MORE weak points of VAR modelling
