Instrumental Variables
Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009
March 8, 2009
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Efficient GMM
Conditional Homoskedasticity and 2SLS
Motivation
Endogeneities: Examples
Simultaneous Equations
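The simplest supply and demand system:
$$q_i^s = x_{si}\,\beta_{1s} + \beta_{2s}\, p_i + \epsilon_{si}$$
$$q_i^d = x_{di}\,\beta_{1d} + \beta_{2d}\, p_i + \epsilon_{di}$$
$$q_i^s = q_i^d$$
In equilibrium:
$$p_i = (\beta_{2s} - \beta_{2d})^{-1}\,(x_{di}\,\beta_{1d} - x_{si}\,\beta_{1s} + \epsilon_{di} - \epsilon_{si})$$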
In both the supply and the demand equation, the explanatory variable $p_i$ depends on the error term, so simple OLS is not consistent.
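A small simulation can make the inconsistency visible. The sketch below is not part of the original notes: the function name `simulate_market`, the unit coefficients on the shifters, the slopes (1 for supply, -1 for demand), and the standard normal shifters and errors are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_market(n, rng, b2s=1.0, b2d=-1.0):
    """Draw n equilibrium observations from a toy supply/demand system.

    Assumed for illustration only: shifter coefficients equal to 1,
    independent standard normal shifters and errors.
    """
    xs, xd = rng.normal(size=n), rng.normal(size=n)   # supply / demand shifters
    es, ed = rng.normal(size=n), rng.normal(size=n)   # supply / demand errors
    # Equilibrium price: p = (b2s - b2d)^{-1} (xd - xs + ed - es)
    p = (xd - xs + ed - es) / (b2s - b2d)
    q = xd + b2d * p + ed                             # quantity from the demand equation
    return q, p, xd, ed

rng = np.random.default_rng(0)
q, p, xd, ed = simulate_market(200_000, rng)

# OLS on the demand equation: regress q on (xd, p)
X = np.column_stack([xd, p])
beta_ols = np.linalg.solve(X.T @ X, X.T @ q)

cov_p_ed = np.cov(p, ed)[0, 1]
print("sample Cov(p, demand error):", cov_p_ed)
print("OLS price coefficient:", beta_ols[1], "(true demand slope is -1)")
```

Under these assumed values the price is positively correlated with the demand error, and the OLS price coefficient settles near -1/3 instead of the true -1, no matter how large the sample.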
Omitted Variables
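$$Y = \beta_1 X_1 + \beta_2 X_2 + u$$
If we omit $X_2$, the estimated model is
$$Y = \beta_1 X_1 + \epsilon, \qquad \epsilon \equiv u + \beta_2 X_2.$$
Then the predeterminedness assumption will be violated unless $X_1$ and $X_2$ are orthogonal or $\beta_2 = 0$.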
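To see where the two exceptions come from, here is a worked version of the probability limit of OLS in the short regression. It assumes scalar regressors and $E(X_1 u) = 0$; both assumptions are made here only to keep the algebra short.

```latex
\operatorname{plim}\hat\beta_1
  = \frac{E(X_1 Y)}{E(X_1^2)}
  = \frac{E\left[X_1(\beta_1 X_1 + \beta_2 X_2 + u)\right]}{E(X_1^2)}
  = \beta_1 + \beta_2\,\frac{E(X_1 X_2)}{E(X_1^2)}
```

The second term vanishes only if $E(X_1 X_2) = 0$ (orthogonality) or $\beta_2 = 0$.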
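Now, with the observed regressor $X_i = X_i^* + \omega_i$ and the composite error $\epsilon_i = -\beta_2\,\omega_i + u_i$,
$$\mathrm{Cov}(X_i, \epsilon_i) = E(X_i\,\epsilon_i) = E\left[(X_i^* + \omega_i)(-\beta_2\,\omega_i + u_i)\right] = -\beta_2\,\sigma_\omega^2 \neq 0$$
Then the OLS estimator that regresses $Y_i$ on $X_i$ is inconsistent.
We will derive more details in the next homework.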
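The covariance above can be checked numerically. This is a minimal sketch under assumptions that are not in the notes: a single regressor with no intercept, a slope of 2, and independent standard normal draws for the true regressor, the measurement error, and the structural error.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta2 = 200_000, 2.0          # illustrative values, not from the notes

x_star = rng.normal(size=n)      # true regressor (unobserved)
omega = rng.normal(size=n)       # measurement error, independent of x_star and u
u = rng.normal(size=n)           # structural error

y = beta2 * x_star + u
x = x_star + omega               # observed, mismeasured regressor

eps = y - beta2 * x              # composite error: u - beta2 * omega
cov_x_eps = np.mean(x * eps)     # sample analogue of -beta2 * Var(omega)
slope_ols = (x @ y) / (x @ x)    # OLS without intercept (all variables have mean zero)

print("Cov(x, eps):", cov_x_eps)
print("OLS slope:", slope_ols, "vs. true slope", beta2)
```

With unit variances for the true regressor and the measurement error, the OLS slope is pulled towards zero: it settles near 1 rather than 2.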
Linearity: $y_i = x_i'\beta_0 + u_i$, $\quad i = 1, \ldots, n$.
$$\frac{1}{n}\sum_{i=1}^{n} z_i\,(y_i - x_i'\hat\beta_{IV}) = 0$$
$$\hat\beta_{IV} = \Big(\sum_{i=1}^{n} z_i x_i'\Big)^{-1} \sum_{i=1}^{n} z_i y_i = (Z'X)^{-1} Z'Y$$
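$$\hat\beta_{IV} = \beta_0 + (Z'X)^{-1} Z'u = \beta_0 + \Big(\frac{Z'X}{n}\Big)^{-1} \frac{Z'u}{n}$$
$$\frac{Z'X}{n} \xrightarrow{p} \Sigma_{zx} < \infty, \qquad \frac{Z'u}{n} \xrightarrow{p} 0$$
$$\sqrt{n}\,(\hat\beta_{IV} - \beta_0) \xrightarrow{d} N\big(0,\ \Sigma_{zx}^{-1}\, S\, \Sigma_{zx}'^{-1}\big).$$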
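The estimator and its asymptotic variance translate directly into a few lines of linear algebra. The following is a sketch, not the course's implementation: the function name `iv_estimate` and the simulated design (one endogenous regressor, one instrument, a constant) are assumptions, and $S$ is estimated by the sample average of $\hat u_i^2 z_i z_i'$.

```python
import numpy as np

def iv_estimate(Z, X, y):
    """Exactly identified IV: beta = (Z'X)^{-1} Z'y, with standard errors
    from the sample analogue of Sigma_zx^{-1} S Sigma_zx'^{-1}."""
    n = len(y)
    beta = np.linalg.solve(Z.T @ X, Z.T @ y)
    u_hat = y - X @ beta
    Sigma_zx = Z.T @ X / n
    S_hat = (Z * u_hat[:, None] ** 2).T @ Z / n     # (1/n) sum u_i^2 z_i z_i'
    A = np.linalg.inv(Sigma_zx)
    avar = A @ S_hat @ A.T
    return beta, np.sqrt(np.diag(avar) / n)

# Hypothetical design: x is endogenous because u shares the component v with it
rng = np.random.default_rng(2)
n = 50_000
z = rng.normal(size=n)
v = rng.normal(size=n)
u = 0.8 * v + 0.6 * rng.normal(size=n)
x = z + v
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
beta_iv, se_iv = iv_estimate(Z, X, y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

print("IV :", beta_iv, "se:", se_iv)
print("OLS:", beta_ols)
```

In this design the IV slope is close to the true value of 2, while the OLS slope stays visibly above it.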
Instrument Validity
IV validity 1): rank. $E(z_i x_i') = \Sigma_{zx}$, finite and invertible.
Intuitively, this requires the instruments to be correlated with
the variables to be instrumented. This can be checked
empirically. More later.
IV validity 2): orthogonality. $E(z_{ik} u_i) = 0$. Instruments must
be uncorrelated with every unobserved determinant of $y_i$.
This depends on things we do not observe
and on how we set up the model.
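The rank condition has a simple sample counterpart: look at how far $Z'X/n$ is from being singular. The sketch below is only a descriptive diagnostic and not a formal test; the helper name `min_singular_value` and the two simulated designs are assumptions introduced for illustration.

```python
import numpy as np

def min_singular_value(Z, X):
    """Smallest singular value of the sample analogue of Sigma_zx = E(z_i x_i')."""
    n = Z.shape[0]
    return np.linalg.svd(Z.T @ X / n, compute_uv=False).min()

rng = np.random.default_rng(3)
n = 100_000
z = rng.normal(size=n)
v = rng.normal(size=n)
Z = np.column_stack([np.ones(n), z])

x_relevant = 1.0 * z + v      # the instrument moves x
x_irrelevant = 0.0 * z + v    # the instrument is unrelated to x

sv_relevant = min_singular_value(Z, np.column_stack([np.ones(n), x_relevant]))
sv_irrelevant = min_singular_value(Z, np.column_stack([np.ones(n), x_irrelevant]))

print("relevant instrument  :", sv_relevant)
print("irrelevant instrument:", sv_irrelevant)
```

A value close to zero signals that the sample version of $\Sigma_{zx}$ is nearly singular. The orthogonality condition has no analogous check in the exactly identified case, since $u_i$ is not observed.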
$$\frac{1}{n}\sum_{i=1}^{n} z_i\,(y_i - x_i'\hat\beta_{IV}) = 0$$
Linearity: $y_i = x_i'\beta_0 + u_i$, $\quad i = 1, \ldots, n$.
$$\frac{1}{n}\sum_{i=1}^{n} z_i\,(y_i - x_i'\hat\beta) = 0$$
$$\sum_{i=1}^{n} z_i\,(y_i - x_i' b)$$
$$\hat\beta_g = (X'ZW_nZ'X)^{-1}\, X'ZW_nZ'Y$$
Comments:
The existence of $\hat\beta_g$ requires $X'ZW_nZ'X$ to be invertible. Check
that our assumptions guarantee this asymptotically. (Careful
what you do with $n$!)
When $p = K$ this reduces to $\hat\beta_g = \hat\beta_{IV} = (Z'X)^{-1} Z'Y$. $W_n$
plays no role.
What is the value of $J(\hat\beta_g)$ in this latter case? This will be
crucial later on.
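The second and third comments can be explored numerically. The sketch below assumes the criterion is $J(b) = n\, g_n(b)' W_n\, g_n(b)$ with $g_n(b) = \frac{1}{n} Z'(Y - Xb)$; that definition, the function name `linear_gmm`, and the simulated data are assumptions, not taken from the notes.

```python
import numpy as np

def linear_gmm(Z, X, y, W):
    """Linear GMM: beta = (X'Z W Z'X)^{-1} X'Z W Z'y, plus the criterion
    J(beta) = n * g_n(beta)' W g_n(beta), with g_n(b) = (1/n) Z'(y - X b)."""
    n = len(y)
    ZX, Zy = Z.T @ X, Z.T @ y
    beta = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)
    g = Z.T @ (y - X @ beta) / n
    return beta, n * g @ W @ g

rng = np.random.default_rng(4)
n = 1_000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = z + v
y = 1.0 + 2.0 * x + 0.8 * v + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])      # K = 2 regressors
Z = np.column_stack([np.ones(n), z])      # p = 2 instruments: exactly identified

beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
A = rng.normal(size=(2, 2))
W_other = A @ A.T + np.eye(2)             # an arbitrary positive definite weight
beta_1, J_1 = linear_gmm(Z, X, y, np.eye(2))
beta_2, J_2 = linear_gmm(Z, X, y, W_other)

print(beta_iv, beta_1, beta_2)
print(J_1, J_2)
```

Both weighting matrices return the IV estimate, and printing `J_1` and `J_2` gives a numerical answer to the last question.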
The plan
Asymptotic Properties of $\hat\beta_g$
Prelude: Identification
Before showing consistency we need to guarantee that the
estimation problem is well defined.
In our case, this means that the GMM problem has a unique
solution in the population.
Consider the moment equations
$$E(z_i u_i) = E\left[z_i\,(y_i - x_i' b)\right] = 0$$
which are satisfied with $b = \beta_0$.
Identification: $\beta_0$ is identified if it is the only solution.
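Writing the moment equations out makes the uniqueness requirement concrete. The symbol $\Sigma_{zy} \equiv E(z_i y_i)$ is introduced here for convenience and does not appear in the notes.

```latex
E\left[z_i\,(y_i - x_i' b)\right] = \Sigma_{zy} - \Sigma_{zx}\, b = 0,
\qquad \Sigma_{zy} \equiv E(z_i y_i), \quad \Sigma_{zx} \equiv E(z_i x_i')
```

This is a linear system of $p$ equations in $K$ unknowns. Since $b = \beta_0$ solves it, the solution is unique exactly when $\Sigma_{zx}$ has full column rank $K$, which in turn requires $p \geq K$.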
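Consistency: $\hat\beta_g \xrightarrow{p} \beta_0$
Start with:
$$\hat\beta_g = (X'ZW_nZ'X)^{-1}\, X'ZW_nZ'Y = \beta_0 + (X'ZW_nZ'X)^{-1}\, X'ZW_nZ'u$$
$$= \beta_0 + \Big(\frac{X'Z}{n}\, W_n\, \frac{Z'X}{n}\Big)^{-1} \frac{X'Z}{n}\, W_n\, \frac{Z'u}{n}$$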
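$$\frac{1}{n} Z'X = \frac{1}{n}\sum_{i=1}^{n} z_i x_i' \xrightarrow{p} E[z_i x_i'] = \Sigma_{zx}$$
$$\frac{1}{n} Z'u = \frac{1}{n}\sum_{i=1}^{n} z_i u_i \xrightarrow{p} E[z_i u_i] = 0$$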
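Multiply by $\sqrt{n}$ to get:
$$\sqrt{n}\,(\hat\beta_g - \beta_0) = \underbrace{\Big(\frac{X'Z}{n}\, W_n\, \frac{Z'X}{n}\Big)^{-1} \frac{X'Z}{n}\, W_n}_{\xrightarrow{p}\ H}\ \sqrt{n}\,\frac{Z'u}{n}$$
where $H \equiv (\Sigma_{zx}' W \Sigma_{zx})^{-1} \Sigma_{zx}' W$ and $W$ is the probability limit of $W_n$.
$$\sqrt{n}\,\frac{Z'u}{n} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i u_i \xrightarrow{d} N(0, S)$$
$$\sqrt{n}\,(\hat\beta_g - \beta_0) \xrightarrow{d} N(0,\ HSH').$$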
$$AV(\hat\beta_g) = HSH' = (\Sigma_{zx}' W \Sigma_{zx})^{-1}\, \Sigma_{zx}' W\, S\, \left[(\Sigma_{zx}' W \Sigma_{zx})^{-1}\, \Sigma_{zx}' W\right]'$$
Note the close relationship with the White estimator we have seen before.
Efficient GMM
Our GMM estimator is consistent and asymptotically normal (AN) for any choice of
$W_n$ and its limit $W$.
Recall that the asymptotic variance of the GMM estimator is
$$AV(\hat\beta_g) = HSH'$$
The optimal weighting matrix, $W^o$, is the value that minimizes
$AV(\hat\beta_g)$.
Result: $W^o \propto S^{-1}$.
In that case, replacing $W = S^{-1}$ in $AV(\hat\beta_g)$ leads to:
$$AV_o(\hat\beta_g) = (\Sigma_{zx}'\, S^{-1}\, \Sigma_{zx})^{-1}$$
which provides a lower bound for the asymptotic variance of the generic GMM
estimator.
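$$\hat\beta_g = (X'ZW_nZ'X)^{-1}\, X'ZW_nZ'Y$$
and, with $W_n = \hat S^{-1}$,
$$\hat\beta_g = (X'Z\,\hat S^{-1} Z'X)^{-1}\, X'Z\,\hat S^{-1} Z'Y$$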
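Since $\hat S$ depends on residuals, a feasible version needs a preliminary consistent estimate. One common way to do this is a two-step procedure, sketched below under assumptions that are not in the notes: the first step uses $W_n = (Z'Z/n)^{-1}$, $\hat S$ is the sample average of $\hat u_i^2 z_i z_i'$, and the data come from a hypothetical heteroskedastic design with two instruments and one endogenous regressor.

```python
import numpy as np

def gmm(Z, X, y, W):
    """beta = (X'Z W Z'X)^{-1} X'Z W Z'y."""
    ZX, Zy = Z.T @ X, Z.T @ y
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

def two_step_gmm(Z, X, y):
    n = len(y)
    # Step 1: a consistent but not necessarily efficient choice of W_n
    W1 = np.linalg.inv(Z.T @ Z / n)
    b1 = gmm(Z, X, y, W1)
    u1 = y - X @ b1
    S_hat = (Z * u1[:, None] ** 2).T @ Z / n          # (1/n) sum u_i^2 z_i z_i'
    # Step 2: W_n = S_hat^{-1}
    S_inv = np.linalg.inv(S_hat)
    b2 = gmm(Z, X, y, S_inv)
    # Estimated asymptotic variances: sandwich H S H' and (Sigma' S^{-1} Sigma)^{-1}
    Sigma = Z.T @ X / n
    H = np.linalg.solve(Sigma.T @ W1 @ Sigma, Sigma.T @ W1)
    avar_step1 = H @ S_hat @ H.T
    avar_step2 = np.linalg.inv(Sigma.T @ S_inv @ Sigma)
    return b1, b2, avar_step1, avar_step2

rng = np.random.default_rng(5)
n = 50_000
z1, z2, v, e = rng.normal(size=(4, n))
x = z1 + 0.5 * z2 + v
u = (0.8 * v + e) * np.sqrt(0.5 + z1 ** 2)            # heteroskedastic, E(z u) = 0
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])                  # K = 2
Z = np.column_stack([np.ones(n), z1, z2])             # p = 3: overidentified
b1, b2, avar1, avar2 = two_step_gmm(Z, X, y)

gap = avar1 - avar2
min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()
print(b1, b2, min_eig)
```

Both steps give estimates close to the true slope of 2. The difference between the two estimated variance matrices has no negative eigenvalue beyond rounding error, in line with the lower-bound result.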
So we need to show that
$$\Sigma_{zx}' S^{-1} \Sigma_{zx} - \Sigma_{zx}' W \Sigma_{zx}\,\left(\Sigma_{zx}' W S W \Sigma_{zx}\right)^{-1}\,\Sigma_{zx}' W \Sigma_{zx}$$
is psd.
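$$\Sigma_{zx}' S^{-1} \Sigma_{zx} - \Sigma_{zx}' W \Sigma_{zx}\,\left[\Sigma_{zx}' W S W \Sigma_{zx}\right]^{-1}\,\Sigma_{zx}' W \Sigma_{zx}$$
$$= \Sigma_{zx}' P'P\, \Sigma_{zx} - \Sigma_{zx}' W \Sigma_{zx}\,\left[\Sigma_{zx}' W (P'P)^{-1} W \Sigma_{zx}\right]^{-1}\,\Sigma_{zx}' W \Sigma_{zx}$$
where $P'P = S^{-1}$, $P$ invertible.
$$= \Sigma_{zx}' P' \left[ I - P'^{-1} W \Sigma_{zx}\,\left(\Sigma_{zx}' W P^{-1} P'^{-1} W \Sigma_{zx}\right)^{-1}\,\Sigma_{zx}' W P^{-1} \right] P\, \Sigma_{zx}$$
$$= \Sigma_{zx}' P' \left[ I - B(B'B)^{-1}B' \right] P\, \Sigma_{zx}$$
for $B \equiv P'^{-1} W \Sigma_{zx}$.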
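For any conformable vector $c$, with $M_B \equiv I - B(B'B)^{-1}B'$ symmetric and idempotent,
$$c'\,\Sigma_{zx}' P' M_B P\, \Sigma_{zx}\, c = q'q \geq 0 \qquad (q \equiv M_B P\, \Sigma_{zx}\, c)$$
since $q$ is a vector.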
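The algebra of the proof can be checked on arbitrary matrices. In the sketch below the dimensions, the random draws, and the use of a Cholesky factor to build $P$ are assumptions made for illustration; any invertible $P$ with $P'P = S^{-1}$ would do.

```python
import numpy as np

rng = np.random.default_rng(7)
p, K = 4, 2

Sigma = rng.normal(size=(p, K))                 # stands in for Sigma_zx (p x K)
A = rng.normal(size=(p, p))
S = A @ A.T + p * np.eye(p)                     # positive definite S
C = rng.normal(size=(p, p))
W = C @ C.T + np.eye(p)                         # positive definite, symmetric W

S_inv = np.linalg.inv(S)
SWS = Sigma.T @ W @ Sigma

# Left-hand side: Sigma' S^{-1} Sigma - Sigma'W Sigma (Sigma'W S W Sigma)^{-1} Sigma'W Sigma
D = Sigma.T @ S_inv @ Sigma - SWS @ np.linalg.inv(Sigma.T @ W @ S @ W @ Sigma) @ SWS

# Projection form: Sigma' P' [I - B (B'B)^{-1} B'] P Sigma with P'P = S^{-1}
L = np.linalg.cholesky(S_inv)                   # S_inv = L L'
P = L.T                                         # hence P'P = S_inv
B = np.linalg.solve(P.T, W @ Sigma)             # B = P'^{-1} W Sigma
M_B = np.eye(p) - B @ np.linalg.inv(B.T @ B) @ B.T
D_proj = Sigma.T @ P.T @ M_B @ P @ Sigma

min_eig = np.linalg.eigvalsh((D + D.T) / 2).min()
print(np.allclose(D, D_proj), min_eig)
```

The two expressions agree, and the smallest eigenvalue is nonnegative up to rounding error.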
Now note that $S = V(z_i u_i) = E(u_i^2 z_i z_i')$ simplifies (using LIE) to:
$$S = E(u_i^2 z_i z_i') = \sigma^2 E(z_i z_i') = \sigma^2 \Sigma_z$$
which can be consistently estimated by
$$\hat S = \hat\sigma^2\, \frac{1}{n}\sum_{i=1}^{n} z_i z_i' = \hat\sigma^2\, \frac{1}{n}\, Z'Z$$
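With $W_n = \hat S^{-1}$, the factors $\hat\sigma^2$ and $n$ cancel and the efficient GMM estimator becomes
$$\hat\beta_{TSLS} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1} X'Z(Z'Z)^{-1}Z'Y$$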
Remarks on TSLS
TSLS is a very particular case of a GMM estimator, for a
particular choice of $W^o$ that arises under conditional
homoskedasticity.
You have to be very careful with the efficiency assessments
that you want to make about TSLS. From the GMM
perspective it is not more efficient than other choices of $W^o$
that satisfy the optimality criterion. Why? Can you quickly
produce another one?
Careful with the steps. From the GMM perspective there is
only one step, since we do not need to produce an estimate of
$\sigma^2$ to implement the estimator (it cancelled out!).
TSLS as an IV estimator
In the standard literature it is called a two-stage estimator
because $P_z \equiv Z(Z'Z)^{-1}Z'$ is idempotent ($P_z'P_z = P_z$) and
symmetric, so:
$$\hat\beta_{TSLS} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1} X'Z(Z'Z)^{-1}Z'Y$$
$$= (X'P_zX)^{-1}\, X'P_zY$$
$$= (X'P_z'P_zX)^{-1}\, X'P_z'P_zY$$
$$= (\hat X'\hat X)^{-1}\, \hat X'\hat Y$$
with $\hat X \equiv P_z X$ and $\hat Y \equiv P_z Y$.
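The chain of equalities can be verified numerically by computing the estimator three ways. This is a sketch on simulated data; the design, the helper `gmm`, and the arbitrary scale factor standing in for $\hat\sigma^2$ are assumptions made for illustration.

```python
import numpy as np

def gmm(Z, X, y, W):
    """beta = (X'Z W Z'X)^{-1} X'Z W Z'y."""
    ZX, Zy = Z.T @ X, Z.T @ y
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

rng = np.random.default_rng(6)
n = 5_000
z1, z2, v = rng.normal(size=(3, n))
x = z1 + 0.5 * z2 + v
y = 1.0 + 2.0 * x + 0.8 * v + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z1, z2])

# (a) GMM with W = (Z'Z)^{-1}
b_gmm = gmm(Z, X, y, np.linalg.inv(Z.T @ Z))

# (b) GMM with W = (s2 * Z'Z / n)^{-1}; s2 is an arbitrary positive number
#     standing in for sigma^2-hat, and it cancels out of the estimator
s2 = 3.7
b_scaled = gmm(Z, X, y, np.linalg.inv(s2 * Z.T @ Z / n))

# (c) Two stages: project X and Y on Z, then run OLS of Y_hat on X_hat
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
Y_hat = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
b_two_stage = np.linalg.lstsq(X_hat, Y_hat, rcond=None)[0]

print(b_gmm, b_scaled, b_two_stage)
```

All three vectors coincide up to rounding. Only the point estimates are compared here; standard errors from a naive second-stage OLS are a separate matter that this sketch does not address.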