GMM 2
Lecture outline:
1. Introduction:
4. Hypothesis testing.
Today:
Why GMM?
Statistical antecedents.
Contemporary example.
IV estimator in linear regression model.
1. Introduction
Hansen (1982, Econometrica) introduced the
Generalized Method of Moments (GMM) estimator.
Two advantages:
T^{-1} Σ_{t=1}^T v_t − μ̂ = 0
T^{-1} Σ_{t=1}^T v_t² − (μ̂² + σ̂²) = 0
This implies
μ̂ = T^{-1} Σ_{t=1}^T v_t
σ̂² = T^{-1} Σ_{t=1}^T (v_t − μ̂)²
Key idea: population moment conditions provide information upon which estimation of parameters can be based.
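The mean/variance example above can be run in a few lines; here is a minimal numerical sketch (the simulated normal sample, with true mean 2 and variance 9, is an illustrative assumption, not data from the lecture):

```python
import numpy as np

# Method-of-moments sketch: solve the two sample moment conditions
#   T^{-1} sum v_t - mu_hat = 0
#   T^{-1} sum v_t^2 - (mu_hat^2 + sigma2_hat) = 0
# The simulated sample below is an illustrative assumption.
rng = np.random.default_rng(0)
v = rng.normal(loc=2.0, scale=3.0, size=10_000)  # true mu = 2, sigma^2 = 9

T = v.size
mu_hat = v.sum() / T                         # mu_hat = T^{-1} sum v_t
sigma2_hat = ((v - mu_hat) ** 2).sum() / T   # sigma2_hat = T^{-1} sum (v_t - mu_hat)^2

# The second moment condition holds exactly at these estimates:
resid = (v ** 2).mean() - (mu_hat ** 2 + sigma2_hat)
```

The residual of the second condition is zero by construction; that identity is exactly the algebra that turns the two conditions into the closed-form estimators.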
T Σ_{i=1}^k [p̂_i − h(i; θ₀)]² / p̂_i  →^d  χ²_{k−1}
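A small simulation of this statistic (the three cell probabilities and the sample size are illustrative assumptions; 5.99 is the familiar 95% critical value of the χ² distribution with 2 degrees of freedom):

```python
import numpy as np

# Sketch of the goodness-of-fit statistic T * sum_i (p_hat_i - h_i)^2 / p_hat_i.
# The cell probabilities h and the sample size are illustrative assumptions.
rng = np.random.default_rng(1)
h = np.array([0.2, 0.3, 0.5])             # hypothesised probabilities h(i; theta_0)
T = 5_000
outcomes = rng.choice(3, size=T, p=h)     # k = 3 groups, simulated under the null
p_hat = np.bincount(outcomes, minlength=3) / T   # observed cell frequencies

stat = T * np.sum((p_hat - h) ** 2 / p_hat)
# Asymptotically chi-squared with k - 1 = 2 degrees of freedom under the null;
# the 95% critical value of chi2_2 is about 5.99.
reject = stat > 5.99
```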
Connection to moments:
Let {D_t(i); i = 1, 2, ..., k; t = 1, 2, ..., T} satisfy:
D_t(i) = 1 if the t-th outcome is in the i-th group
D_t(i) = 0 else
⇒ P(D_t(i) = 1) = h(i; θ₀)
⇒ E[D_t(i)] = h(i; θ₀)
E[ (D_t(1) − h(1; θ₀), D_t(2) − h(2; θ₀), ..., D_t(k) − h(k; θ₀))' ] = 0

and its sample analogue is

(p̂_1 − h(1; θ), p̂_2 − h(2; θ), ..., p̂_k − h(k; θ))' = 0
GFT(θ) = T (p̂_1 − h(1; θ), ..., p̂_k − h(k; θ)) diag(p̂_1^{-1}, p̂_2^{-1}, ..., p̂_k^{-1}) (p̂_1 − h(1; θ), ..., p̂_k − h(k; θ))'
q_t^D = q_t^S = q_t    (1)
Wright's solution:
Find z_t^D such that Cov(z_t^D, u_t^D) = 0.
Then (1) ⇒
Cov(z_t^D, q_t) − β_1 Cov(z_t^D, p_t) = 0    (2)
β_1 = Cov(z_t^D, q_t) / Cov(z_t^D, p_t)    (3)
and so if E[u_t^D] = 0 then
β̂_1 = Σ_{t=1}^T z_t^D q_t / Σ_{t=1}^T z_t^D p_t    (4)
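A simulated sketch of estimator (4); the demand model q_t = β₁ p_t + u_t^D and the data-generating process below are illustrative assumptions:

```python
import numpy as np

# IV sketch: beta1_hat = sum(z_t q_t) / sum(z_t p_t), as in (4).
# The simple DGP below (q_t = beta1 * p_t + u_t) is an illustrative assumption.
rng = np.random.default_rng(2)
T = 20_000
z = rng.normal(size=T)                  # instrument with Cov(z, u) = 0
u = rng.normal(size=T)                  # demand shock
p = z + 0.5 * u + rng.normal(size=T)    # price is endogenous: correlated with u
beta1 = 2.0
q = beta1 * p + u                       # demand equation

beta1_hat = np.sum(z * q) / np.sum(z * p)
```

OLS of q on p would be biased here because Cov(p_t, u_t^D) ≠ 0; the instrument removes that bias, which is Wright's point.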
for all t.

{T^{-1} Σ_{t=1}^T f(v_t; θ)}' W_T {T^{-1} Σ_{t=1}^T f(v_t; θ)}

If p = q: Σ_{t=1}^T f(v_t; θ̂_T) = 0.
max E[ Σ_{i=0}^∞ δ^i U(c_{t+i}) | Ω_t ]
subject to
c_t + p_t q_t = r_t q_{t−1} + w_t    (5)
for all t.
U(c_t) = c_t^γ / γ
and so (6) becomes
E[ δ(r_{t+1}/p_t)(c_{t+1}/c_t)^{γ−1} | Ω_t ] − 1 = 0    (7)
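The moment in (7) is easy to evaluate on data; a tiny sketch with made-up numbers (δ, γ and all series below are illustrative assumptions):

```python
import numpy as np

# Euler-equation moment (7): delta*(r_{t+1}/p_t)*(c_{t+1}/c_t)^(gamma-1) - 1.
# All parameter values and series below are illustrative assumptions.
delta, gamma = 0.97, 0.5
r_next = np.array([1.06, 0.98, 1.10])     # payoffs r_{t+1}
p = np.array([1.00, 1.00, 1.00])          # asset prices p_t
c_growth = np.array([1.02, 0.99, 1.03])   # consumption growth c_{t+1}/c_t

e = delta * (r_next / p) * c_growth ** (gamma - 1.0) - 1.0
g_bar = e.mean()   # sample analogue of the moment, should be near zero at theta_0
```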
Consider:
y_t = x_t'θ₀ + u_t,    t = 1, 2, ..., T
y_t is scalar, observed;
x_t is a (p×1) vector of observed regressors;
u_t is the unobserved error term;
z_t is a (q×1) vector of instruments.
Problem: to estimate θ₀.
F = W^{1/2} E[z_t x_t']

F(F'F)^{-1} F' W^{1/2} E[z_t u_t(θ₀)] = 0
identifying restrictions
Remainder is
(I_q − F(F'F)^{-1}F') W^{1/2} E[z_t u_t(θ₀)] = 0
overidentifying restrictions.
Sample analogs:
Overidentifying restrictions: W_T^{1/2} T^{-1} Z'u(θ̂_T)
Now,
Q_T(θ̂_T) = ||W_T^{1/2} T^{-1} Z'u(θ̂_T)||²
24
^T is consistent for 0
q
a
^T;ii=T
N (0; 1) where
(^T;i 0;i)= V
^T = (X 0ZWT Z 0X)1X 0Z SZ
^ 0X(X 0ZWT Z 0X)1
{ V
p
^T !
{ S
limT !1V ar[T 1=2Z 0u]
25
S = lim_{T→∞} Var{ T^{-1/2} Σ_{t=1}^T z_t u_t } = E[u_t² z_t z_t']

Therefore,
Ŝ_T = T^{-1} Σ_{t=1}^T u_t(θ̂_T)² z_t z_t'
J_T →^d χ²_{q−p}
Lecture outline:
1. Introduction
4. Hypothesis testing.
Today:
Identification
The Estimator
Identifying and overidentifying restrictions
Asymptotic properties
Covariance Matrix estimation
Strict Stationarity
The (r×1) random vectors {v_t; −∞ < t < ∞} form a strictly stationary process with sample space V ⊆ ℝ^r.
Population Moment Condition
Let θ₀ be a vector of unknown parameters which are to be estimated, v_t be a vector of random variables and f(·) a vector of functions; then a population moment condition takes the form
E[f(v_t; θ₀)] = 0    for all t.    (1)
Global Identification
The parameter vector θ₀ is globally identified by the population moment condition in Assumption 3.3 if and only if E[f(v_t; θ)] ≠ 0 for all θ ∈ Θ such that θ ≠ θ₀.
"Global" ⇒ the population moment condition only holds at one value in the entire parameter space.
Identification failures can sometimes be diagnosed, but it is often difficult.
2. GMM estimation
GMM minimand is:
Q_T(θ) = {T^{-1} Σ_{t=1}^T f(v_t; θ)}' W_T {T^{-1} Σ_{t=1}^T f(v_t; θ)}

The first-order conditions are:
{T^{-1} Σ_{t=1}^T ∂f(v_t; θ̂_T)/∂θ'}' W_T {T^{-1} Σ_{t=1}^T f(v_t; θ̂_T)} = 0
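A minimal sketch of minimising Q_T(θ) for a scalar θ with linear IV moments f(v_t, θ) = z_t(y_t − x_tθ); the DGP, the first-step weight W_T = (T^{-1}Z'Z)^{-1}, and the grid search are illustrative assumptions:

```python
import numpy as np

# GMM minimand sketch: Q_T(theta) = g_T(theta)' W_T g_T(theta) with
# f(v_t, theta) = z_t (y_t - x_t theta). DGP and grid search are assumptions.
rng = np.random.default_rng(3)
T = 5_000
Z = rng.normal(size=(T, 3))                       # q = 3 instruments
x = Z @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=T)
theta0 = 1.5
y = theta0 * x + rng.normal(size=T)

W = np.linalg.inv(Z.T @ Z / T)                    # a common first-step choice

def Q(theta):
    g = Z.T @ (y - x * theta) / T                 # g_T(theta) = T^{-1} sum f(v_t, theta)
    return g @ W @ g

# Scalar theta, so a crude grid search keeps the logic transparent.
grid = np.linspace(0.0, 3.0, 3001)
theta_hat = grid[np.argmin([Q(th) for th in grid])]
```

With this particular W_T the minimiser coincides with two-stage least squares; in practice a derivative-based routine replaces the grid search.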
W_T^{1/2} g_T(θ̂_T) = {I_q − P_T(θ̂_T)} W_T^{1/2} g_T(θ̂_T)
4. Asymptotic properties of θ̂_T
Consistency: θ̂_T →^p θ₀
Asymptotic normality: T^{1/2}(θ̂_T − θ₀) →^d N(0, MSM')
where
M = (G₀'W G₀)^{-1} G₀'W
G₀ = E[∂f(v_t; θ₀)/∂θ']
S = lim_{T→∞} Var[T^{1/2} g_T(θ₀)]
MVT: g_T(θ̂_T) = g_T(θ₀) + G_T(θ̂_T, θ₀, λ_T)(θ̂_T − θ₀)
Premultiply both sides by G_T(θ̂_T)'W_T, use the first-order conditions and rearrange to give:
T^{1/2}(θ̂_T − θ₀) = −M_T T^{1/2} g_T(θ₀) + o_p(1)
S = Γ₀ + Σ_{i=1}^∞ (Γ_i + Γ_i')

Γ̂₀ = T^{-1} Σ_{t=1}^T f̂_t f̂_t'
Ŝ = {I_q − Σ_{i=1}^k Â_i(k)}^{-1} Σ̂(k) {I_q − Σ_{i=1}^k Â_i(k)'}^{-1}
where
Σ̂(k) = T^{-1} Σ_{t=1}^T ê_t(k) ê_t(k)'.
Choice of k:
Ŝ_HAC = Γ̂₀ + Σ_{i=1}^{b(T)} ω_{iT} (Γ̂_i + Γ̂_i')
where
Γ̂_i = T^{-1} Σ_{t=i+1}^T f̂_t f̂_{t−i}'
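A scalar sketch of Ŝ_HAC with the Bartlett kernel ω_iT = 1 − i/(b(T)+1), the Newey-West choice; the AR(1) moment series and the bandwidth are illustrative assumptions:

```python
import numpy as np

# HAC sketch for a scalar moment series f_t using Bartlett (Newey-West) weights
# omega_iT = 1 - i/(b(T)+1). The AR(1) series and bandwidth are assumptions.
rng = np.random.default_rng(4)
T = 10_000
f = np.empty(T)
f[0] = rng.normal()
for t in range(1, T):
    f[t] = 0.5 * f[t - 1] + rng.normal()  # slowly decaying autocovariances
f = f - f.mean()                          # centre, mimicking f_hat_t

bT = 20                                   # bandwidth b(T)
gamma0 = (f @ f) / T                      # Gamma_hat_0
S_hat = gamma0
for i in range(1, bT + 1):
    gamma_i = (f[i:] @ f[:-i]) / T        # Gamma_hat_i = T^{-1} sum f_t f_{t-i}
    w = 1.0 - i / (bT + 1.0)              # Bartlett weight keeps S_hat >= 0
    S_hat += 2.0 * w * gamma_i            # Gamma_i + Gamma_i' in the scalar case
```

For this AR(1) with coefficient 0.5 the long-run variance is 1/(1 − 0.5)² = 4, so Ŝ should land near 4, slightly below because the kernel downweights higher lags.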
Evidence suggests HAC estimators do not perform well if f_t has slowly decaying autocovariances, i.e. a strong autoregressive component.
→ prewhitening and recolouring
{I_q − Σ_{i=1}^k Â_i(k)}^{-1} Ŝ_HAC {I_q − Σ_{i=1}^k Â_i(k)'}^{-1}
Numerical optimization
Three important aspects of numerical optimization routines:
The starting value for θ, θ^(1).
The iterative search method by which the candidate value of θ̂ is updated on the j-th step.
The convergence criterion used to judge when the minimum has been reached.
Example of iterative routine: Newton-Raphson (NR) algorithm

θ̂_T^(j) = θ̂_T^(j−1) − [∂²Q_T(θ̂_T^(j−1))/∂θ∂θ']^{-1} ∂Q_T(θ̂_T^(j−1))/∂θ

Convergence criterion:
||θ̂_T^(j) − θ̂_T^(j−1)|| < ε
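A self-contained sketch of the NR iteration with the convergence criterion above, applied to a simple one-parameter objective (the quartic Q below is an illustrative assumption, not a GMM minimand):

```python
# Newton-Raphson sketch: theta_new = theta - Q''(theta)^{-1} Q'(theta),
# stopping when |theta_new - theta| < eps. The toy objective is an assumption.
def Q(theta):
    return (theta - 1.0) ** 4 + (theta - 1.0) ** 2

def dQ(theta):                            # first derivative of Q
    return 4.0 * (theta - 1.0) ** 3 + 2.0 * (theta - 1.0)

def d2Q(theta):                           # second derivative of Q
    return 12.0 * (theta - 1.0) ** 2 + 2.0

theta = 3.0                               # starting value theta^{(1)}
eps = 1e-10                               # convergence tolerance
for _ in range(100):                      # cap on the number of iterations
    theta_new = theta - dQ(theta) / d2Q(theta)
    converged = abs(theta_new - theta) < eps
    theta = theta_new
    if converged:
        break
```

In a real GMM problem Q is Q_T and the derivatives are taken with respect to the full parameter vector, so the division becomes a linear solve against the Hessian.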
Lecture outline:
1. Introduction
4. Hypothesis testing.
Today:
M = (G₀'W G₀)^{-1} G₀'W
G₀ = E[∂f(v_t; θ₀)/∂θ']
S = lim_{T→∞} Var[T^{1/2} g_T(θ₀)]
If p = q then MSM' = (G₀'S^{-1}G₀)^{-1}
(independent of W).
Two-step procedure:
1. Estimate with sub-optimal W_T → θ̂_T(1) → Ŝ_T(1).
2. Estimate with W_T = Ŝ_T(1)^{-1} → θ̂_T(2).
Can iterate further:
1. θ̂_T(i−1) → Ŝ_T(i−1).
2. Estimate with W_T = Ŝ_T(i−1)^{-1} → θ̂_T(i).
Continue until ||θ̂_T(i−1) − θ̂_T(i)|| < ε or i = i_max.
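A sketch of the two-step procedure for linear IV moments f(v_t, θ) = z_t(y_t − x_tθ) with heteroskedastic errors; the DGP and the first-step W_T are illustrative assumptions:

```python
import numpy as np

# Two-step GMM sketch for linear IV moments f(v_t, theta) = z_t (y_t - x_t theta).
# The heteroskedastic DGP and the first-step W_T are illustrative assumptions.
rng = np.random.default_rng(5)
T = 5_000
Z = rng.normal(size=(T, 3))
x = Z @ np.array([1.0, 0.8, 0.6]) + rng.normal(size=T)
theta0 = 2.0
u = rng.normal(size=T) * np.sqrt(1.0 + Z[:, 0] ** 2)   # heteroskedastic errors
y = theta0 * x + u

def gmm(W):
    """Closed form for scalar theta: (x'Z W Z'x)^{-1} x'Z W Z'y."""
    a = x @ Z @ W @ Z.T @ x
    b = x @ Z @ W @ Z.T @ y
    return b / a

# Step 1: sub-optimal W_T
theta1 = gmm(np.linalg.inv(Z.T @ Z / T))

# Step 2: W_T = S_hat(1)^{-1}, with S_hat = T^{-1} sum u_t^2 z_t z_t'
u1 = y - x * theta1
S_hat = (Z * (u1 ** 2)[:, None]).T @ Z / T
theta2 = gmm(np.linalg.inv(S_hat))
```

Iterating steps 1 and 2 until ||θ̂_T(i−1) − θ̂_T(i)|| falls below a tolerance gives the iterated estimator described above.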
and
e_t(δ, γ) = δ(r_{t+1}/p_t)(c_{t+1}/c_t)^{γ−1} − 1
Single asset = equally weighted NYSE index (EWR) or value weighted NYSE index (VWR)
c_t = aggregate per capita consumption
z_t = (1, c_t/c_{t−1}, c_{t−1}/c_{t−2}, r_t/p_{t−1}, r_{t−1}/p_{t−2})'
Sample: 1960.1 to 1991.12
As before, for the first step use W_T = 10^{-5}I_5 and (T^{-1}Z'Z)^{-1}.
2. Impact of transformations
Consider five types of transformation.
(i) Scaling of the data:
So θ̃_T = c θ̂_T.
But the interpretation of θ₀ has changed!
(ii) Reparameterization:
θ₀ satisfies:
globally identified
can be written as θ₀ = h(φ₀) where h : ℝ^p → ℝ^p is a continuous, differentiable bijective mapping.
The GMM estimator is invariant to reparameterization in the sense that the two parameterizations yield logically consistent estimators.
Q_{φ,T}(φ) = GMM minimand associated with the reparameterized model, i.e. Q_{φ,T}(φ) = Q_T(h(φ)).
φ̂_T = argmin_φ Q_{φ,T}(φ)
Can calculate φ̂_T as follows:
min Q_T(h(φ)) wrt h(φ) → ĥ_T
ĥ_T = h(φ̂_T) → φ̂_T.
But ĥ_T = θ̂_T and so by construction
θ̂_T = h(φ̂_T)
θ̂_T = [(T^{-1} Σ_{t=1}^T x_{1,t} z_t') W_T (T^{-1} Σ_{t=1}^T z_t x_{1,t})]^{-1} (T^{-1} Σ_{t=1}^T x_{1,t} z_t') W_T (T^{-1} Σ_{t=1}^T z_t R_{1,t})

θ̃_T = [(T^{-1} Σ_{t=1}^T x_{2,t} z_t') W_T (T^{-1} Σ_{t=1}^T z_t x_{2,t})]^{-1} (T^{-1} Σ_{t=1}^T x_{2,t} z_t') W_T (T^{-1} Σ_{t=1}^T z_t S_t)

θ̂_T and θ̃_T are not logically consistent but exhibit this property in the limit.
Σ_{t=1}^T f(v_t; θ̂_T) = 0.
p_t c_t^{γ₀−1} = δ₀ E[r_{t+1} c_{t+1}^{γ₀−1} | Ω_t]
Since p_t c_t^{γ₀−1} ∈ Ω_t, both sides of this equation were divided by c_t^{γ₀−1} p_t to give
E[δ₀(r_{t+1}/p_t)(c_{t+1}/c_t)^{γ₀−1} − 1 | Ω_t] = 0
However the FOC also implies
E[δ₀ r_{t+1} c_{t+1}^{γ₀−1} − p_t c_t^{γ₀−1} | Ω_t] = 0
T Q_{cont,T}(θ₀) →^d χ²_q
Therefore an asymptotically valid 100(1−α)% confidence set for θ₀ is then given by
{ θ : T Q_{cont,T}(θ) < c_q(α) }
where c_q(α) is the 100(1−α)% percentile of the χ²_q distribution.
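A sketch of inverting the statistic to get a confidence set for a scalar mean; the moment f(v_t, θ) = v_t − θ, the DGP, and the grid are illustrative assumptions, and 3.841 is the 95% critical value of the χ² distribution with 1 degree of freedom:

```python
import numpy as np

# Confidence set by test inversion: keep every theta with T*Q(theta) < c_q(alpha).
# Moment f(v_t, theta) = v_t - theta (q = 1); DGP and grid are assumptions.
rng = np.random.default_rng(6)
T = 2_000
v = 1.0 + rng.normal(size=T)              # true theta_0 = 1

def TQ(theta):
    g = np.mean(v - theta)                # g_T(theta)
    s = np.mean((v - theta) ** 2)         # variance estimate used as weight
    return T * g ** 2 / s

crit = 3.841                              # 95% percentile of chi2 with 1 df
grid = np.linspace(0.0, 2.0, 2001)
conf_set = [th for th in grid if TQ(th) < crit]
lo, hi = min(conf_set), max(conf_set)     # the set is an interval here
```

The weight s is re-evaluated at each θ, in the spirit of the continuous-updating objective Q_{cont,T}; for this simple moment the resulting interval is close to the usual mean plus or minus 1.96 standard errors.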