
2 Reduced Form VARs

2.1 Representations
The Vector Autoregressive (VAR) Representation

If the MA matrix of lag polynomials is invertible, then a unique VAR representation exists.

We define C(L)−1 as an (n × n) lag polynomial such that C(L)−1 C(L) = I; i.e. when these
lag polynomial matrices are matrix-multiplied, all the lag terms cancel out. This operation
in effect converts lags of the errors into lags of the vector of dependent variables.

Thus we move from MA coefficients to VAR coefficients.

Define A(L) = C(L)−1. Then given the (invertible) MA coefficients, it is easy to map these
into the VAR coefficients:

Yt = C(L)εt
A(L)Yt = εt (3)

where A(L) = A0 L0 + A1 L1 + A2 L2 + ... and Aj for all j are (n × n) matrices of coefficients.

To show that this matrix lag polynomial exists and how it maps into the coefficients in C(L),
note that by assumption we have the identity

(A0 L0 + A1 L1 + A2 L2 + ...)(C0 L0 + C1 L1 + C2 L2 + ...) = I


After distributing, the identity implies that the coefficient on L0 must equal the identity and the coefficients on all higher lag operators must be zero, which implies the following recursive solution for the VAR coefficients:

A0 = I
A1 = −A0 C1
Ak = −A0 Ck − A1 Ck−1 − ... − Ak−1 C1
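
To make the recursion concrete, here is a minimal NumPy sketch (the function name and the example C1 are illustrative, not from the notes) mapping truncated MA lag matrices into VAR lag matrices:

```python
import numpy as np

def ma_to_var_coeffs(C, K):
    """Map invertible MA lag matrices C = [C1, C2, ...] (C0 = I implied)
    into VAR lag matrices A1..AK via the recursion
    A0 = I, Ak = -(A0 Ck + A1 Ck-1 + ... + Ak-1 C1)."""
    n = C[0].shape[0]
    A = [np.eye(n)]                       # A0 = I
    for k in range(1, K + 1):
        Ak = np.zeros((n, n))
        for j in range(k):                # terms Aj @ C_{k-j}
            if k - j - 1 < len(C):        # C_m = 0 beyond the MA order
                Ak -= A[j] @ C[k - j - 1]
        A.append(Ak)
    return A[1:]                          # [A1, ..., AK]

# Example with a hypothetical invertible VMA(1): Yt = eps_t + C1 eps_{t-1}
C1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A = ma_to_var_coeffs([C1], K=3)
# For a VMA(1) the recursion gives Ak = (-C1)^k, e.g. A2 = C1 @ C1
assert np.allclose(A[1], C1 @ C1)
```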

As noted, the VAR is possibly of infinite order (i.e. infinite number of lags required to fully
represent joint density). In practice, the VAR is usually restricted for estimation by truncating
the lag-length.

The pth-order vector autoregression, denoted VAR(p) is given by

Yt = A1 Yt−1 + A2 Yt−2 + ... + Ap Yt−p + εt (4)

Note: Here we are considering zero mean processes. In case the mean of Yt is not zero we
should add a constant in the VAR equations.
Alternative representations: VAR(1) Any VAR(p) can be rewritten as a VAR(1). To
form a VAR(1) from the general model we define et = [εt′, 0, ..., 0]′, Yt = [Yt′, Yt−1′, ..., Yt−p+1′]′ and

    [ A1   A2   ...  Ap−1  Ap ]
    [ In   0    ...  0     0  ]
A = [ 0    In   ...  0     0  ]
    [ ...  ...       ...   ...]
    [ 0    0    ...  In    0  ]

Therefore we can rewrite the VAR(p) as a VAR(1)

Yt = AYt−1 + et

This is also known as the companion form of the VAR(p).
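
A minimal NumPy sketch of this construction (the helper name and coefficient values are illustrative):

```python
import numpy as np

def companion(A_list):
    """Build the (np x np) companion matrix A of the VAR(1) form
    from the VAR(p) lag matrices A_list = [A1, ..., Ap], each n x n."""
    n, p = A_list[0].shape[0], len(A_list)
    top = np.hstack(A_list)                      # [A1 A2 ... Ap]
    below = np.hstack([np.eye(n * (p - 1)),      # identities shifting the lags down
                       np.zeros((n * (p - 1), n))])
    return np.vstack([top, below])

# n = 2, p = 2 with illustrative coefficient matrices
A1 = np.array([[0.5, 0.3], [0.02, 0.8]])
A2 = np.array([[0.1, 0.0], [0.0, 0.1]])
A = companion([A1, A2])                          # 4 x 4, as in the example below
```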

An Example: Suppose n = 2 and p = 2. The VAR will be

[ y1t ]   [ a^1_11  a^1_12 ] [ y1t−1 ]   [ a^2_11  a^2_12 ] [ y1t−2 ]   [ ε1t ]
[ y2t ] = [ a^1_21  a^1_22 ] [ y2t−1 ] + [ a^2_21  a^2_22 ] [ y2t−2 ] + [ ε2t ]

We can rewrite the above model as

[ y1t   ]   [ a^1_11  a^1_12  a^2_11  a^2_12 ] [ y1t−1 ]   [ ε1t ]
[ y2t   ] = [ a^1_21  a^1_22  a^2_21  a^2_22 ] [ y2t−1 ] + [ ε2t ]
[ y1t−1 ]   [ 1       0       0       0      ] [ y1t−2 ]   [ 0   ]
[ y2t−1 ]   [ 0       1       0       0      ] [ y2t−2 ]   [ 0   ]

Setting

     [ y1t   ]        [ a^1_11  a^1_12  a^2_11  a^2_12 ]        [ ε1t ]
Yt = [ y2t   ],   A = [ a^1_21  a^1_22  a^2_21  a^2_22 ],  et = [ ε2t ]
     [ y1t−1 ]        [ 1       0       0       0      ]        [ 0   ]
     [ y2t−1 ]        [ 0       1       0       0      ]        [ 0   ]

we obtain the previous VAR(1) representation.
SUR representation The VAR(p) can be stacked as

Y = XΓ + u

where X = [X1, ..., XT]′, Xt = [Yt−1′, Yt−2′, ..., Yt−p′]′, Y = [Y1, ..., YT]′, u = [ε1, ..., εT]′ and Γ = [A1, ..., Ap]′
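
Since the stacked form is linear in Γ, it can be estimated equation by equation with OLS. A minimal sketch, assuming the data are held in a (T × n) array (names are illustrative):

```python
import numpy as np

def estimate_var(Y, p):
    """OLS on the stacked form Y = X Gamma + u for a zero-mean VAR(p).
    Y is a (T, n) data array; returns Gamma_hat of shape (n*p, n),
    whose block rows are A1', A2', ..., Ap'."""
    T, n = Y.shape
    # X_t = [Y_{t-1}', ..., Y_{t-p}']' stacked as rows for t = p..T-1
    X = np.hstack([Y[p - j: T - j] for j in range(1, p + 1)])
    Gamma_hat, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)
    return Gamma_hat

# Usage: Gamma_hat = estimate_var(Y, p=2); A1_hat = Gamma_hat[:n].T
```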

Vec representation Let vec denote the column-stacking operator, i.e. if

    [ X11  X12 ]
X = [ X21  X22 ]
    [ X31  X32 ]

then vec(X) = [X11, X21, X31, X12, X22, X32]′

Let γ = vec(Γ), then the VAR can be rewritten as

Yt = (In ⊗ Xt′)γ + εt
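
In NumPy the vec operator is a column-major reshape; the following sketch (with randomly generated, purely illustrative inputs) verifies the representation above:

```python
import numpy as np

# vec stacks columns; in NumPy this is a column-major (Fortran-order) reshape
X = np.arange(1, 7).reshape(3, 2)
vecX = X.reshape(-1, order="F")              # [X11, X21, X31, X12, X22, X32]

# Check Yt = (In kron Xt') gamma on random inputs
n, p = 2, 3
Gamma = np.random.randn(n * p, n)            # [A1, ..., Ap]'
Xt = np.random.randn(n * p)                  # [Y_{t-1}', ..., Y_{t-p}']'
gamma = Gamma.reshape(-1, order="F")         # gamma = vec(Gamma)
assert np.allclose(Gamma.T @ Xt, np.kron(np.eye(n), Xt) @ gamma)
```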
Stationarity of a VAR

Stability and stationarity Consider the VAR(1)

Yt = µ + AYt−1 + εt

Substituting backward we obtain

Yt = µ + AYt−1 + εt
= µ + A(µ + AYt−2 + εt−1 ) + εt
= (I + A)µ + A2 Yt−2 + Aεt−1 + εt
..
.
Yt = (I + A + ... + Aj−1)µ + Aj Yt−j + Σ_{i=0}^{j−1} Ai εt−i

If all the eigenvalues of A are smaller than one in modulus then

1. Aj = P Λj P−1 → 0;

2. the sequence Ai, i = 0, 1, ... is absolutely summable;

3. the infinite sum Σ_{i=0}^{∞} Ai εt−i exists in mean square (see e.g. proposition C.10L);

4. (I + A + ... + Aj−1)µ → (I − A)−1µ and Aj → 0 as j goes to infinity.


Therefore if the eigenvalues are smaller than one in modulus then Yt has the following
representation

Yt = (I − A)−1µ + Σ_{i=0}^{∞} Ai εt−i

Note that the eigenvalues (λ) of A satisfy det(Iλ − A) = 0. Therefore the eigenvalues
correspond to the reciprocals of the roots of the determinant of A(z) = I − Az.

A VAR(1) is called stable if det(I − Az) ≠ 0 for |z| ≤ 1. Equivalently, stability requires
that all the eigenvalues of A are smaller than one in absolute value.

A condition for stability: For a VAR(p) the stability condition likewise requires that all the
eigenvalues of A (the AR matrix of the companion form of Yt) are smaller than one in mod-
ulus or, equivalently, that all the roots are larger than one in modulus. Therefore a VAR(p) is called stable if
det(I − A1 z − A2 z2 − ... − Ap zp) ≠ 0 for |z| ≤ 1.
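
A minimal sketch of the stability check via the companion matrix (helper name illustrative; the coefficient matrix reuses the stationary example below):

```python
import numpy as np

def is_stable(A_list):
    """A VAR(p) is stable iff every eigenvalue of its companion matrix
    is strictly smaller than one in modulus."""
    n, p = A_list[0].shape[0], len(A_list)
    A = np.vstack([np.hstack(A_list),
                   np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])])
    return np.max(np.abs(np.linalg.eigvals(A))) < 1.0

A1 = np.array([[0.5, 0.3], [0.02, 0.8]])
print(is_stable([A1]))        # True: eigenvalues are about 0.81 and 0.48
```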

A condition for stationarity: A stable VAR process is stationary.

Notice that the converse is not true. An unstable process can be stationary.

Notice that the vector MA(∞) representation of a stationary VAR satisfies the absolute
summability condition, so that the assumptions of 10.2H hold.
Example A stationary VAR(1)

                        [ 0.5   0.3 ]                   [ 1    0.3 ]        [ 0.81 ]
Yt = AYt−1 + εt,   A =  [ 0.02  0.8 ],  Ω = E(εt εt′) = [ 0.3  0.1 ],  λ =  [ 0.48 ]

[Figure 1: blue: Y1, green: Y2]
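
A minimal simulation sketch of this example, assuming Gaussian errors with the Ω printed above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.5, 0.3], [0.02, 0.8]])
Omega = np.array([[1.0, 0.3], [0.3, 0.1]])   # Omega as printed above
L = np.linalg.cholesky(Omega)                # to draw eps_t ~ N(0, Omega)

T = 300
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A @ Y[t - 1] + L @ rng.standard_normal(2)
# Both components fluctuate around zero, as expected for a stable VAR
```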


Finding the Wold Representation

Rewriting the VAR(p) as a VAR(1) is particularly useful in order to find the Wold representation of Yt.

We know how to find the MA(∞) representation of a stationary AR(1).

We can proceed similarly for the VAR(1). Substituting backward in the companion form we
have
Yt = Aj Yt−j + Aj−1 et−j+1 + ... + A et−1 + et

If the conditions for stationarity are satisfied, the series Σ_{j=0}^{∞} Aj converges and Yt has a VMA(∞)
representation in terms of the Wold shock et given by

Yt = (I − AL)−1 et
   = Σ_{j=0}^{∞} Aj et−j
   = C̃(L)et

where C̃0 = A0 = I, C̃1 = A, C̃2 = A2, ..., C̃k = Ak. The Wold coefficient Cj of the original series Yt is the n × n upper-left block of C̃j.
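
A minimal sketch computing the Wold matrices as the upper-left blocks of companion-matrix powers (helper name illustrative):

```python
import numpy as np

def wold_coeffs(A_list, K):
    """Wold (VMA) matrices C0..CK of a stable VAR(p): Cj is the upper-left
    n x n block of the j-th power of the companion matrix."""
    n, p = A_list[0].shape[0], len(A_list)
    A = np.vstack([np.hstack(A_list),
                   np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])])
    C, Aj = [], np.eye(n * p)
    for _ in range(K + 1):
        C.append(Aj[:n, :n].copy())       # upper-left block of A^j
        Aj = Aj @ A
    return C                              # C[0] = I, C[1] = A1, ...

C = wold_coeffs([np.array([[0.5, 0.3], [0.02, 0.8]])], K=10)
```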
Second Moments

Autocovariance of a VAR(p). Let us consider the companion form of a stationary (zero
mean for simplicity) VAR(p) defined earlier

Yt = AYt−1 + et (5)

The variance of Yt is given by

Σ̃ = E[(Yt)(Yt)′]
   = AΣ̃A′ + Ω̃ (6)

A closed form solution to (6) can be obtained in terms of the vec operator.

Let A, B, C be matrices such that the product ABC exists. A property of the vec operator is
that
vec(ABC) = (C′ ⊗ A)vec(B)
Applying the vec operator to both sides of (6) we have

vec(Σ̃) = (A ⊗ A)vec(Σ̃) + vec(Ω̃)

If we define 𝒜 = (A ⊗ A) then we have

vec(Σ̃) = (I − 𝒜)−1 vec(Ω̃)


The jth autocovariance of Yt (denoted Σ̃j) can be found by post-multiplying (5) by Yt−j′ and
taking expectations:
E(Yt Yt−j′) = AE(Yt−1 Yt−j′) + E(et Yt−j′)
Thus
Σ̃j = AΣ̃j−1
or
Σ̃j = Aj Σ̃
The variance Σ and the jth autocovariance Σj of the original series Yt are given by the first n
rows and columns of Σ̃ and Σ̃j respectively.
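
A minimal sketch of these formulas for the companion VAR(1), using the vec solution for the variance (the values reuse the earlier stationary example):

```python
import numpy as np

def companion_autocovariances(A, Omega, J):
    """Solve vec(Sigma) = (I - A kron A)^{-1} vec(Omega) for the companion
    VAR(1), then Sigma_j = A^j Sigma for j = 0..J."""
    m = A.shape[0]
    vec_Sigma = np.linalg.solve(np.eye(m * m) - np.kron(A, A),
                                Omega.reshape(-1, order="F"))
    Sigma = vec_Sigma.reshape(m, m, order="F")
    return [np.linalg.matrix_power(A, j) @ Sigma for j in range(J + 1)]

A = np.array([[0.5, 0.3], [0.02, 0.8]])
Omega = np.array([[1.0, 0.3], [0.3, 0.1]])
Sigmas = companion_autocovariances(A, Omega, J=4)   # Sigmas[0] is the variance
```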
Model specification

Specification of the VAR is key for empirical analysis. We have to decide about the following:

1. Number of lags p.

2. Which variables.

3. Type of transformations.
Number of lags As in the univariate case, care must be taken to account for all systematic
dynamics in multivariate models. In VAR models, this is usually done by choosing a sufficient
number of lags to ensure that the residuals in each of the equations are white noise.

AIC: Akaike information criterion Choosing the p that minimizes the following

AIC(p) = T ln |Ω̂| + 2(n2 p)

BIC: Bayesian information criterion Choosing the p that minimizes the following

BIC(p) = T ln |Ω̂| + (n2 p) ln T

HQ: Hannan-Quinn information criterion Choosing the p that minimizes the following

HQ(p) = T ln |Ω̂| + 2(n2 p) ln ln T


The p̂ obtained using BIC and HQ are consistent, while the one obtained with AIC is not:
AIC overestimates the true order with positive probability and underestimates the true order
with zero probability.

Suppose a VAR(p) is fitted to Y1, ..., YT (Yt not necessarily stationary). In small samples
the following relations hold:

p̂BIC ≤ p̂AIC if T ≥ 8

p̂BIC ≤ p̂HQ for all T

p̂HQ ≤ p̂AIC if T ≥ 16
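
The criteria above can be computed directly. A minimal sketch (holding the estimation sample fixed across p and normalizing Ω̂ by the effective sample size are design choices, not prescriptions from the notes):

```python
import numpy as np

def lag_order_criteria(Y, pmax):
    """Evaluate AIC, BIC and HQ for p = 1..pmax using the formulas above,
    with Omega_hat the OLS residual covariance on a common sample."""
    T, n = Y.shape
    out = {}
    for p in range(1, pmax + 1):
        X = np.hstack([Y[pmax - j: T - j] for j in range(1, p + 1)])
        Yp = Y[pmax:]                                  # common sample across p
        B, *_ = np.linalg.lstsq(X, Yp, rcond=None)
        U = Yp - X @ B                                 # residuals
        Teff = T - pmax
        Omega = U.T @ U / Teff
        ld = Teff * np.log(np.linalg.det(Omega))
        out[p] = {"AIC": ld + 2 * n**2 * p,
                  "BIC": ld + n**2 * p * np.log(Teff),
                  "HQ":  ld + 2 * n**2 * p * np.log(np.log(Teff))}
    return out

# p_hat for each criterion is the argmin over p of out[p][criterion]
```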
Type of variables Variable selection is a key step in the specification of the model.

VAR models are small scale models, so usually 2 to 8 variables are used.

Variable selection depends on the particular application; in general, all the variables conveying
relevant information should be included.

We will see several examples.


Type of transformations: many economic time series display a trend over time and are clearly
nonstationary (their mean is not constant).

Trend-stationary series
Yt = µ + bt + εt , εt ∼ W N.

Difference-stationary series
Yt = µ + Yt−1 + εt , εt ∼ W N.
[Figure: blue: log(GDP), green: log(CPI)]
These series can be thought of as being generated by some nonstationary process. Here are
some examples.

Example: Difference stationary

[ Y1t ]   [ 0.01 ]   [ 0.7   0.3 ] [ Y1t−1 ]   [ ε1t ]        [ 1    0.3 ]        [ 1   ]
[ Y2t ] = [ 0.02 ] + [ −0.2  1.2 ] [ Y2t−1 ] + [ ε2t ],  Ω =  [ 0.3  0.1 ],  λ =  [ 0.9 ]

Example: Trend stationary

[ Y1t ]   [ 0    ]     [ 0.5   0.3 ] [ Y1t−1 ]   [ ε1t ]      [ 1    0.3 ]        [ 0.81 ]
[ Y2t ] = [ 0.01 ] t + [ 0.02  0.8 ] [ Y2t−1 ] + [ ε2t ],  Ω = [ 0.3  0.1 ],  λ = [ 0.48 ]
So:

1) How do I know whether the series are stationary?

2) What should I do if they are nonstationary?


Dickey-Fuller test In 1979, Dickey and Fuller proposed the following test for station-
arity:

1. Estimate with OLS the following equation

∆xt = b + γxt−1 + εt

2. Test the null γ = 0 against the alternative γ < 0.

3. If the null is not rejected then

xt = b + xt−1 + εt

which is a random walk with drift.

4. On the contrary, if γ < 0, then xt is a stationary AR

xt = b + axt−1 + εt

with a = 1 + γ < 1.

An alternative is to specify the equation augmented by a deterministic trend

∆xt = b + γxt−1 + ct + εt

With this specification, under the alternative the process is stationary around a deterministic
linear trend.
Augmented Dickey-Fuller test. In the augmented version of the test, p lags of ∆xt are
added, i.e.
A(L)∆xt = b + γxt−1 + εt
or
A(L)∆xt = b + γxt−1 + ct + εt

• If the test statistic is smaller than the (negative) critical value, then the null hypothesis of a
unit root is rejected.
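
A minimal sketch using the ADF implementation in statsmodels (the simulated random walk is illustrative):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
x = np.cumsum(0.1 + rng.standard_normal(500))     # random walk with drift

# regression="c": constant only; regression="ct": constant plus trend
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(x, regression="c")
print(f"ADF statistic: {stat:.2f}, 5% critical value: {crit['5%']:.2f}")
# Here the statistic exceeds the critical value: the unit-root null is not rejected
```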
Transformations I: first differences Let ∆ = 1 − L be the first differences filter, i.e. a filter
such that ∆Yt = Yt − Yt−1 and let us consider the simple case of a random walk with drift

Yt = µ + Yt−1 + εt

where εt is WN. By applying the first differences filter (1 − L) the process is transformed into
a stationary process
∆Yt = µ + εt
Let us now consider a process with deterministic trend

Yt = µ + δt + εt

By differencing the process we obtain

∆Yt = δ + ∆εt

which is a stationary process but is not invertible because it contains a unit root in the MA
part.
[Figure: log(GDP) and log(CPI) in first differences]
Transformations II: removing deterministic trend Removing a deterministic trend (linear
or quadratic) from a trend-stationary variable is fine.

However this is not enough if the process is a unit root with drift. To see this, consider again
the process
Yt = µ + Yt−1 + εt
This can be written as

Yt = µt + Y0 + Σ_{j=1}^{t} εj
By removing the deterministic trend the mean of the process becomes constant but the
variance grows over time so the process is not stationary.
[Figure: log(GDP) and log(CPI) linearly detrended]
Transformations of trending variables: Hodrick-Prescott filter The filter separates the
trend from the cyclical component of a scalar time series. Suppose yt = gt + ct , where gt
is the trend component and ct is the cycle. The trend is obtained by solving the following
minimization problem
min_{ {gt} }  Σ_{t=1}^{T} ct² + λ Σ_{t=2}^{T−1} [(gt+1 − gt) − (gt − gt−1)]²

The parameter λ is a positive number (for quarterly data usually λ = 1600) which penalizes
variability in the growth (trend) component, while the first term penalizes the cyclical
component. The larger λ, the smoother the trend component.
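
A minimal sketch using the HP filter implementation in statsmodels (the simulated series is illustrative):

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

rng = np.random.default_rng(0)
y = np.cumsum(0.05 + 0.5 * rng.standard_normal(200))   # a trending series

cycle, trend = hpfilter(y, lamb=1600)    # lamb = 1600 for quarterly data
assert np.allclose(y, trend + cycle)     # y decomposes as g_t + c_t
```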
