Professional Documents
Culture Documents
Dynamic 2
Dynamic 2
= Xit + uit
Di¤erent assumptions about the properties of xit will imply di¤erent sets
instruments.
xit may be correlated or uncorrelated with i.
ment conditions.
And assuming that xit is strictly exogenous wrt vit is more restrictive than
are valid, but would imply inconsistency if they are not valid.
tested.
Example
xit is correlated with the individual e¤ects, and predetermined wrt the
i.e. xi;t 1 (as well as xi;t 2 and longer lags) is uncorrelated with vit =
vit vi;t 1:
The complete set of (linear) moment conditions can be written as E(Zi0 vi) =
0, where now
0 1
B yi1 xi1 xi2 0 0 0 0 0 ::: 0 ::: 0 0 ::: 0 C
B C
B C
B 0 0 0 yi1 yi2 xi1 xi2 xi3 : : : 0 : : : 0 0 ::: 0 C
B C
Zi = B C
B . .. .. .. .. .. .. .. . . . .. .. .. .. C
B . C
B C
@ A
0 0 0 0 0 0 0 0 : : : yi1 : : : yi;T 2 xi1 : : : xi;T 1
bGM M = ( X 0ZWN Z 0 X) 1
X 0ZWN Z 0 y
Alternative choices of WN produce one step and two step GMM estimators,
remain valid.
If xit is correlated with vit, then xi;t 1 is not a valid instrument for the
In this case the treatment of xis and yis in the instrument matrix is sym-
metric.
If instead xit is strictly exogenous wrt the vit shocks (E(xisvit) = 0 for all
would be valid.
Strict exogeneity implies that all past, present and future values of xis are
xit is uncorrelated with the individual e¤ects, and strictly exogenous wrt
the shocks.
This introduces some new issues, since now we have valid moment condi-
correlated with i, and we use all the moment conditions that this implies
for the equations in …rst-di¤erences (as discussed above), this gives a further
There are many equivalent ways to write these. One possibility that is
quite elegant (but less useful in practice if we have to deal with unbalanced
panel data) is
where uiT = i + viT is the error term for the untransformed equation in the
E(Zi+0u+i ) = 0 where
0 1
B yi1 xi1 : : : xiT 0 0 0 ::: 0 ::: 0 ::: 0 0 ::: 0 0 ::: 0 C
B C
B C
B 0 0 : : : 0 yi1 yi2 xi1 : : : xiT : : : 0 : : : 0 0 ::: 0 0 ::: 0 C
B C
B C
+ B .. C
Zi = B .. .. .. .. .. .. .. . . . .. .. .. .. .. C
B C
B C
B C
B 0 0 ::: 0 0 0 0 : : : 0 : : : yi1 : : : yi;T 2 xi1 : : : xiT 0 ::: 0 C
B C
@ A
0 0 ::: 0 0 0 0 ::: 0 ::: 0 ::: 0 0 ::: 0 xi1 : : : xiT
or 0 1
D
+ B i 0 ::: 0 C
Z
Zi = @ A
0 xi1 : : : xiT
where ZiD is the corresponding instrument set (i.e. assuming strict exogene-
the levels equation for the …nal period, and augment the instrument matrix
all individuals, the observation for the …nal period may be missing for some
where yi+ and Xi+ are de…ned analogously to u+i , and y +; X + and Z + are
mators is that there is no compelling choice for the one step weight matrix
2
proportional to a known matrix (since the variance of uiT depends on as
2
well as v ):
For slightly obscure reasons, the GAUSS and OX implementations now use
N
! 1
1 X +0 + +
WN = Zi H Zi
N i=1
where 0 1
+BH 0C
H =@ A
0 I
and H is the (T 2) (T 2) matrix with 2 on the main diagonal, -1 on
is no reason to suppose that the one step weight matrix we use for this
where
XN
1
VbN = Zi+0u
b+i u
bi+0Zi
N i=1
and
b+i = yi+
u Xi+b
These are used to obtain E(Zi+0u+i ) = 0 where Zi+ again has the form
0 1
D
+ B i 0 ::: 0 C
Z
Zi = @ A
0 xi1 : : : xiT
but now ZiD contains the smaller set of instruments that are valid for the
This arises when xit is correlated with i, but some known function of xit
is uncorrelated with i, and thus can be used to form valid instruments for
) E[(xit xi;t 1) i] = ! i !i = 0
( xi2; xi3; :::; xiT ) as instruments for the period T equation in levels
If xit is correlated with vit, we can only use ( xi2; xi3; :::; xi;T 1) as
Details will not be given here, but a similar case will be considered shortly.
When we have variables (e.g. xit itself, or xit) that are uncorrelated with
the individual e¤ects, so that the levels equations can be used in estimation,
where the wi are observed explanatory variables that do not vary over time.
eliminate the unobserved i from the error term, the transformation also
mated under the assumption that E[wi( i + vit)] = 0, in which case the
sistently without invoking this assumption about wi itself (and hence also
introduced into the static panel data literature by Hausman and Taylor
(Econometrica, 1981).
itself is correlated with i raises the possibility that yit may also be uncor-
related with i.
To consider this further, and to explore when this initial conditions restric-
considered previously.
‘System GMM’
= yi;t 1 + uit
E( i) = E(vit) = E( ivit) = 0
E(visvit) = 0 for s 6= t
conditions
E( yi2 i) = 0
implies
:
t 3
X
t 2 s
= yi2 + vi;t s for t = 3; :::; T
s=0
This then implies an additional T 2 non-redundant linear moment con-
ditions for the equations in levels, which can be written (for example) as
ii) Given these additional linear moment restrictions, the quadratic moment
For example
then be written as
and
ditions for the equations in levels can in some cases provide a dramatic
In this case the correlation between yi;t 1 and lagged levels yi;t s for
s 2 becomes weaker, and the …rst-di¤erenced GMM has poor …nite sample
It turns out that exploiting the additional linear moment conditions im-
plied by this restriction on the initial conditions provides much more dra-
matic gains, provided that the additional initial conditions restriction is valid
The AR(1) speci…cation determines yi2 given yi1, so to guarantee that yi2
The representation
t 3
X
t 2 s
yit = yi2 + vi;t s for t = 3; :::; T
s=0
suggests (using backward recursion for earlier periods) that if the same model
has generated the yit series for long enough prior to our sample period, the
the process to have become negligibly small (which in turn depends on the
true value of ).
De…ne (wlog)
i
ei1 = yi1
1
i
yi1 = + ei1
1
Then
i
yi2 = ( 1) +( 1)ei1 + i + vi2
1
= ( 1)ei1 + vi2
E(ei1 i) = 0
Interpretation
1
i
is the level that our model speci…es the yit series will converge towards
ei1 = yi1 1
i
is the deviation from this convergent level at the start of
The initial observations yi1 can deviate randomly, but not systematically,
does not impose any restriction on the variance. Sometimes known as ‘mean
stationarity’.
As noted earlier, the restriction will hold automatically if the same process
has generated the yit series for long enough before the start of our sample
period.
Thus if we believe the AR(1) speci…cation, and there is nothing special
to hold.
But if our …rst observation corresponds to the true start-up of the process,
the case where levels (or …rst-di¤erences) of some xit variable can be used to
yi+ = yi(+ 1) + u+
i
0 1 0 1 0 1
B yi3 C B yi2 C B vi3 C
B C B C B C
B C B C B C
.
B . C B .
. C B .. C
B C B C B C
B C = B C+B C
B C B C B C
B yiT C B yi;T 1 C B viT C
B C B C B C
@ A @ A @ A
yiT yi;T 1 i + viT
and write the complete set of moment conditions as E(Zi+0u+i ) = 0, where
0 1
D
+
Z
B i 0 ::: 0 C
Zi = @ A
0 yi2 : : : yi;T 1
1
PN +0 +
De…ning bN ( ) = N i=1 i ui ( ) and choosing
Z to minimise JN ( ) =
bN ( )0WN bN ( ) gives
Consider the basic AR(1) model with serially uncorrelated shocks, prede-
di¤erences
tions in levels
Suppose that we use the 3 moment conditions for the equations in …rst-
= E[ yi2( i + vi3)]
Then knowing E[yi2 vi4] = 0 and E[yi1 vi4] = 0 and E[ yi2( i + vi4)] = 0
Thus if we use the 3 moment conditions E[yi2 vi4] = 0 and E[yi1 vi4] = 0
But conversely
= E[ yi2( i + vi4)]
Again, knowing E[yi2 vi4] = 0 and E[yi1 vi4] = 0 and E[ yi2( i +vi3)] = 0
Although we cannot express all the available moment conditions using only
Keeping all the available moment conditions for the equations in …rst-
Given that we use all 3 moment conditions for the 2 equations in …rst-
E[ yi2( i + vi3)] = 0
E[ yi3( i + vi4)] = 0
which requires the use of 2 equations in levels.
For balanced panels, these two alternative representations are com-
In balanced panels, where all i = 1; 2; :::; N individuals are observed for all
lent. The one-step and two-step GMM estimators of obtained from the
The system with only 1 equation in levels, for the …nal time period (here
for T = 4), is more compact to write, which is why I have used it on the
earlier slides.
However the system with both equations in levels extends more easily to
and the moment condition E[yi1 vi3] = 0; and the levels equation
and the moment condition E[ yi2( i + vi3)] = 0, for those individuals with
which is not observed for the individuals with period 4 data missing, we
would lose all the information in the additional moment conditions, implied
Similar considerations apply to panels with more than 4 time periods, and
MA errors
Example
E("is"it) = 0 for s 6= t
for t = 3; :::; T:
The …rst equation for which we have a valid moment condition is now
available.
NB. Models with MA errors arise naturally in dynamic models with (seri-
then the model for the observed series yit has the form
which has an MA(1) error component if the measurement errors mit are
serially uncorrelated.
AR errors
Example
E("is"it) = 0 for s 6= t
Lagged values of both yit and xit are correlated with past "it shocks, and
E(yi;t s vit) 6= 0 and E(xi;t s vit) 6= 0 for all the lagged observations on
how to estimate. The explanatory variables (xit and xi;t 1) are correlated
( 1; 2; 3):
tor.
Final remark on estimation
panels in the context of models with lagged dependent variables, for example
or
where xit is correlated with i and not strictly exogenous wrt vit, then we
can use these GMM methods to obtain estimators of that are consistent
as N ! 1 with T …xed.
For example, if we want to allow xit to be correlated with vit (but not
correlated with future values vi;t+s of the (serially uncorrelated) vit shocks),
we can use the equations in …rst-di¤erences
with lagged values of xi;t s dated t 2 and earlier as instruments (we may also
want to use lagged yi;t s dated t 2 and earlier as additional instruments).
Speci…cation tests
The basic speci…cation test for GMM estimators is the Sargan (1958)/Hansen
Otherwise we reject.
Very loosely, this is like testing for correlation between the model residuals
yi = Xi + ui i = 1; :::; N
E(Zi0ui ) = 0
S = N JN (bGM M )
optimal (two step) GMM estimator (bGM M ), and JN (bGM M ) is the value of
N N N
1 Xb 0 1 X 0 1 X
b
= N( bi Zi)(
u Zi u bi 0Zi) 1(
bi u Zi0u
bi )
N i=1 N i=1 N i=1
where
bi = yi
u Xi b
b
bi = yi
u Xi bGM M
b is an initial consistent estimator, s.t. ( 1 PN Z 0u
b b 0 1
N i=1 i i i Zi)
u = WN is the
weight matrix used to calculate the optimal two step estimator.
b
bGM M is the optimal two step estimator, so that u
bi are the two step resid-
uals.
If we have L columns in Zi and K columns in Xi (or K rows in ),
a 2
S (L K)
Values of the Sargan/Hansen test statistic that are too large, relative to
2
critical values of the appropriate distribution, reject the validity of the
This test procedure can be used to test nested hypotheses, where (only)
a subset of the moment conditions that are valid under the null hypothesis
H0 : vit M A(0)
H1 : vit M A(1)
For the basic AR(1) model, under the null we have the set of moment
conditions
While under the alternative we have the smaller set of moment conditions
If the null is valid, neither S nor S A should reject, and the di¤erence
between them should not be too large.
If the null if invalid but the alternative is valid, then S should reject while
S A should not reject. The di¤erence between them should be larger.
If both the null and the alternative are invalid, then both S and S A should
reject. Again in this case (though less obviously) the di¤erence between
them should be larger than in the case where the null is valid.
a 2
DS = S SA (L LA) under H0 : vit M A(0)
NB. Rejecting the null does not imply acceptance of the speci…c alternative
that was considered. If we reject the null of no serial correlation against the
MA(1) alternative, the next step should be to test the null of MA(1) shocks
The Hausman test has the same sequential reasoning as the Di¤erence
vectors under the null and under the alternative, rather than on the di¤erence
Let bGM M denote the optimal two step GMM estimator under H0 (i.e.
A
b
using instruments Zi) and let GM M denote the optimal two step GMM
If the null is invalid, then bGM M (at least) is inconsistent, and in this case
A
the di¤erence b bGM M should be larger.
GM M
Formally
A A 1 bA a
b
h = ( GM M GM M ) [avar( GM M ) avar( GM M )] ( GM M bGM M )
b 0 b b 2
(R)
A
b
under the null hypothesis, where R = rank[avar( GM M ) avar(bGM M )]:
of parameters in :
When this covariance matrix is not of full rank, we can either use a gener-
Or, more simply, apply the Hausman test using a subset of the parameters
Arellano and Bond (1991) proposed a direct test for serial correlation in
Since vit = vit vi;t 1 is MA(1) under the null hypothesis that E(visvit) =
…rst-di¤erenced residuals.
Then
cv 0 2 cv a
m2 = N (0; 1)
se
under H0 : E( vit vi;t 2) = 0:
The expression for the standard error of this autocovariance (se) can be
Values outside the range 1.96 thus reject the null at the 95% level.
Note that the sign of the test statistic is also informative about the sign of
cv i = yi XibGM M
and test for the absence of second-order serial correlation in these …rst-
di¤erenced residuals.
The corresponding test for the absence of …rst-order serial correlation in
the …rst-di¤erenced residuals (m1) can - and should - be used to check that
used to test assumptions about the status of xit explanatory variables, and
to test the initial conditions restriction required for lagged values of yis to
We compute the ‘system GMM’ estimator under the null and the basic
The Hausman test considers whether these parameter estimates are similar
in cases where the moment conditions available for the equations in …rst-
close to 1 in the simple AR(1) model; or if some of the xit series are close to