Example | Identification Failure | Bias and Inference | Diagnosis Checks | Possible Solutions | Remarks

Linear Models
Instrumental Variables: Weak Instruments
Cristine Campos de Xavier Pinto
CEDEPLAR/UFMG
May 2010
Cristine Campos de Xavier Pinto Institute
Linear Models
The literature considers two sides of the weak instrument problem:
Just-identified case: With a weak instrument, conventional 2SLS confidence intervals tend to be wide. In general, these intervals do not have the correct nominal coverage over all parts of the parameter space.
High degree of overidentification: In this case, 2SLS can be biased, and its estimated standard errors can be severely biased too. Conventional methods for inference can be misleading.
This example is based on Imbens and Wooldridge (2007).
Angrist and Krueger (1991) study the returns to education:
$Y_i = \alpha + \beta E_i + \varepsilon_i$
where $Y_i$ is the log of earnings and $E_i$ is years of education.
Endogeneity: Ability affects both the schooling choice and earnings given the education level.
Instruments: The authors exploit the variation in schooling levels that arises from differences in compulsory schooling laws.
In general, school districts require a student to have turned six
by January before entering 1st grade. Since individuals are
required to stay in school until they are sixteen, individuals
born in the 1st quarter have lower required minimum
schooling levels than individuals born in the last quarter.
These compulsory laws generate variation in schooling levels
by quarter of birth.
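The ratio of the differences in means between the fourth and first quarters of birth is the Wald estimator:
$\hat\beta = \frac{\bar Y_4 - \bar Y_1}{\bar E_4 - \bar E_1} = 0.0893 \ (0.0105)$
where $\bar Y_q$ and $\bar E_q$ are the average log earnings and years of education of individuals born in quarter $q$, and the standard error is in parentheses.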
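The Wald ratio is easy to compute by hand. The sketch below uses simulated data with made-up parameters (it does not reproduce the Angrist and Krueger sample); the names `wald_estimate`, `late`, `educ` and `log_wage` are hypothetical. It also shows OLS on the same data, which is contaminated by the unobserved ability term.

```python
import numpy as np

def wald_estimate(y, x, z):
    """Wald estimator for a binary instrument z: ratio of two differences in means."""
    y, x, z = np.asarray(y, float), np.asarray(x, float), np.asarray(z, bool)
    dy = y[z].mean() - y[~z].mean()   # difference in mean outcome
    dx = x[z].mean() - x[~z].mean()   # difference in mean treatment (schooling)
    return dy / dx

# Simulated data with made-up parameters (not the Angrist-Krueger sample).
rng = np.random.default_rng(0)
n = 200_000
ability = rng.normal(size=n)                      # unobserved confounder
late = rng.integers(0, 2, size=n).astype(bool)    # True plays the role of "fourth quarter"
educ = 12 + 0.5 * late + ability + rng.normal(size=n)
log_wage = 5 + 0.09 * educ + 0.3 * ability + rng.normal(scale=0.5, size=n)

beta_wald = wald_estimate(log_wage, educ, late)
beta_ols = np.polyfit(educ, log_wage, 1)[0]       # OLS slope, contaminated by ability
print(f"Wald: {beta_wald:.3f}   OLS: {beta_ols:.3f}   true return: 0.090")
```

With a binary instrument, this ratio coincides with the ratio of sample covariances Cov(Y, Z) / Cov(E, Z), which is the just-identified IV estimator.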
They also present estimates based on an additional set of instruments. They interact the quarter of birth with 50 state and 9 year-of-birth dummies, leading to a large vector of instruments, $Z_i = (W_i', Q_i W_i')$, where $W_i$ is the set of state and year-of-birth dummies and $Q_i$ are the quarter-of-birth dummies.
In this case, using the 2SLS approach,
$\hat\beta^{2SLS} = 0.073 \ (0.008)$
Bound, Jaeger and Baker (1996) found that there are potential problems with these results. Despite the large sample, the normal approximation may be poor because the instruments are only very weakly correlated with the endogenous regressor. They show some evidence of this based on simulations.
Their exercise: re-calculate the estimates after replacing the actual quarter of birth by random indicators with the same marginal distribution.
In this case, the instruments are uncorrelated with the endogenous regressor by construction. They did these calculations for both the single-instrument and the multiple-instrument cases.
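A placebo exercise of this kind can be imitated on simulated data. The sketch below is an illustration under assumed parameters, not a replication: it uses continuous random draws rather than random quarter-of-birth indicators, and the helpers `ols` and `tsls` are hypothetical names. With many irrelevant instruments, 2SLS lands close to the (biased) OLS estimate rather than the true slope.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def tsls(y, X, Z):
    """2SLS: project the columns of X on Z, then regress y on the projections."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

rng = np.random.default_rng(42)
n, L, beta = 5_000, 100, 0.1                # made-up sample size, instrument count, true slope
v = rng.normal(size=n)
eps = 0.8 * v + 0.6 * rng.normal(size=n)    # error correlated with the regressor
x = 12 + v                                  # regressor does not depend on any instrument
y = 1 + beta * x + eps

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), rng.normal(size=(n, L))])   # placebo instruments

b_ols = ols(y, X)[1]
b_2sls = tsls(y, X, Z)[1]
print(f"true: {beta:.2f}   OLS: {b_ols:.3f}   2SLS with placebo instruments: {b_2sls:.3f}")
```

The reported 2SLS standard error in such a run can still look small, which is why a precise-looking estimate is not evidence that the instruments are informative.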
Let's consider the model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$X_i = \pi_0 + \pi_1 Z_i + v_i$
where $Y_i, Z_i, X_i \in \mathbb{R}$, and $(\varepsilon_i, v_i) \perp Z_i$ are jointly normal with covariance matrix $\Sigma$.
We need to assume that $\mathrm{Cov}[\varepsilon_i, v_i] \neq 0$, which leads to endogeneity.
The reduced form for the first equation is
$Y_i = \gamma_0 + \gamma_1 Z_i + u_i$
where the parameter of interest is $\beta_1 = \gamma_1 / \pi_1$.
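The ratio form of the parameter of interest can be checked by substitution. A short derivation, writing the structural equation as $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ and the first stage as $X_i = \pi_0 + \pi_1 Z_i + v_i$ (the labels $\gamma_0$, $\gamma_1$ and $u_i$ for the reduced-form intercept, slope and error are notation assumed here):

$$Y_i = \beta_0 + \beta_1(\pi_0 + \pi_1 Z_i + v_i) + \varepsilon_i = (\beta_0 + \beta_1\pi_0) + \beta_1\pi_1 Z_i + (\varepsilon_i + \beta_1 v_i)$$

Matching terms gives $\gamma_0 = \beta_0 + \beta_1\pi_0$, $\gamma_1 = \beta_1\pi_1$ and $u_i = \varepsilon_i + \beta_1 v_i$, so $\beta_1 = \gamma_1/\pi_1$ whenever $\pi_1 \neq 0$. When $\pi_1 = 0$ the reduced-form slope is zero for every value of $\beta_1$, and the ratio is not defined.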
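In this problem, the 2SLS estimator is
$\hat\beta^{2SLS} = \left(\frac{1}{N}\sum_{i=1}^N Z_i' X_i\right)^{-1}\left(\frac{1}{N}\sum_{i=1}^N Z_i' Y_i\right)$
where, in this expression, $X_i$ and $Z_i$ denote the row vectors $(1, X_i)$ and $(1, Z_i)$, and
$\hat\beta_1^{2SLS} = \frac{\frac{1}{N}\sum_{i=1}^N (Y_i - \bar Y)(Z_i - \bar Z)}{\frac{1}{N}\sum_{i=1}^N (X_i - \bar X)(Z_i - \bar Z)}$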
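The matrix form and the covariance-ratio form of the just-identified estimator give the same slope. A minimal numerical check on simulated data (all parameter values below are made up for illustration), which also confirms that the ratio of the two reduced-form slopes returns the same number:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000
z = rng.normal(size=n)
v = rng.normal(size=n)
eps = 0.5 * v + rng.normal(size=n)     # Cov(eps, v) != 0, so x is endogenous
x = 1.0 + 0.8 * z + v                  # first stage
y = 2.0 + 1.5 * x + eps                # structural equation

X = np.column_stack([np.ones(n), x])   # rows (1, X_i)
Z = np.column_stack([np.ones(n), z])   # rows (1, Z_i)

# Matrix form: solve (Z'X / n) b = (Z'Y / n)
beta_matrix = np.linalg.solve(Z.T @ X / n, Z.T @ y / n)

# Ratio of sample covariances
zc = z - z.mean()
beta_ratio = np.mean((y - y.mean()) * zc) / np.mean((x - x.mean()) * zc)

# Ratio of reduced-form slope (y on z) to first-stage slope (x on z)
beta_indirect = np.polyfit(z, y, 1)[0] / np.polyfit(z, x, 1)[0]

print(beta_matrix[1], beta_ratio, beta_indirect)
```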
In this case,
$\sqrt{N}\left(\frac{1}{N}\sum_{i=1}^N (Y_i - \bar Y)(Z_i - \bar Z) - \mathrm{Cov}(Y, Z)\right) \to_d N(0, V_{YZ})$
$\sqrt{N}\left(\frac{1}{N}\sum_{i=1}^N (X_i - \bar X)(Z_i - \bar Z) - \mathrm{Cov}(X, Z)\right) \to_d N(0, V_{XZ})$
If $\pi_1 \neq 0$, as the sample size gets large, the ratio will be well approximated by a normal distribution.
If $\pi_1 \approx 0$, the ratio may be better approximated by a Cauchy distribution, which does not have moments.
In the second case, $\hat\beta^{2SLS}$ is not consistent.
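The contrast between the two cases shows up clearly in a small Monte Carlo experiment. The sketch below is illustrative only: the function `iv_draws` and all parameter values are assumptions, and the model is simulated without intercepts. It compares the spread of the just-identified IV slope when the first-stage slope is 1 and when it is exactly 0.

```python
import numpy as np

def iv_draws(pi1, n=200, reps=2000, beta1=1.0, rho=0.5, seed=0):
    """Monte Carlo draws of the just-identified IV slope for first-stage slope pi1."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(reps, n))
    v = rng.normal(size=(reps, n))
    eps = rho * v + np.sqrt(1 - rho**2) * rng.normal(size=(reps, n))  # Corr(eps, v) = rho
    x = pi1 * z + v
    y = beta1 * x + eps
    zc = z - z.mean(axis=1, keepdims=True)
    # Ratio of sample covariances; centering z alone suffices because zc sums to zero.
    return (zc * y).sum(axis=1) / (zc * x).sum(axis=1)

for pi1 in (1.0, 0.0):
    err = np.abs(iv_draws(pi1) - 1.0)
    print(f"pi1={pi1}: median |error|={np.median(err):.3f}, "
          f"99th percentile={np.quantile(err, 0.99):.3f}")
```

Quantiles are reported instead of means and variances because, in the unidentified case, the sampling distribution has tails too heavy for sample moments to settle down.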
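Let's think about a more general model
$Y_i = X_i\beta + \varepsilon_i$
$X_i = Z_i\pi + v_i$
where $X$ is $N \times 1$ and $Z$ is $N \times L$, with $L \geq 1$, and $(\varepsilon_i, v_i) \perp Z_i$ are jointly normal with covariance matrix $\Sigma$.
Hahn and Hausman (2002) derive the bias of 2SLS up to second order:
$E[\hat\beta^{2SLS}] - \beta \approx \frac{L\,\sigma_{\varepsilon v}}{N}\left(\pi' Z'Z \pi\right)^{-1}$
where $\sigma_{\varepsilon v} = \mathrm{Cov}(\varepsilon_i, v_i)$ and $Z'Z$ is understood as scaled by $1/N$.
The finite sample bias of 2SLS is present even if we do not have weak instruments.
However, when the instruments are weak ($\pi \approx 0$), the bias will be exacerbated.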
The bias is an increasing function of the degree of endogeneity ($\sigma_{\varepsilon v}$) and of the number of instrumental variables used ($L$).
The bias of the OLS estimator also depends on $\sigma_{\varepsilon v}$, so 2SLS is biased toward OLS.
Hausman specification tests may incorrectly fail to reject the use of OLS because of this bias.
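Hahn and Hausman (2002) derive the asymptotic distribution of 2SLS when $L \to \infty$ and $N \to \infty$ such that $L/\sqrt{N} = \mu + o(1)$:
$\sqrt{N}\left(\hat\beta^{2SLS} - \beta\right) \to_d N\left(\mu\,\sigma_{\varepsilon v}\left(\pi' Z'Z \pi\right)^{-1},\ \sigma_\varepsilon^2\left(\pi' Z'Z \pi\right)^{-1}\right)$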
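The dependence of the bias on the number of instruments can be illustrated by simulation. The sketch below is a rough illustration under assumed parameters (one relevant instrument, the rest irrelevant, no intercept); the names `tsls_slope` and `median_bias` are hypothetical. It reports the median of the 2SLS error, which is more stable than the mean in small designs, next to the second-order approximation $L\sigma_{\varepsilon v}/(\pi' Z'Z\pi)$ computed with the unscaled $Z'Z$. The approximation is for the mean and is only expected to be accurate when $L$ is small relative to $\pi' Z'Z\pi$.

```python
import numpy as np

def tsls_slope(y, x, Z):
    """2SLS slope for one endogenous regressor (no intercept; variables are mean zero)."""
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # first-stage fitted values
    return (x_hat @ y) / (x_hat @ x)

def median_bias(L, n=500, reps=1000, beta=1.0, pi1=0.3, sigma_ev=0.5, seed=0):
    """Median of (2SLS estimate - beta) over Monte Carlo replications with L instruments."""
    rng = np.random.default_rng(seed)
    errors = np.empty(reps)
    for r in range(reps):
        Z = rng.normal(size=(n, L))
        v = rng.normal(size=n)
        eps = sigma_ev * v + rng.normal(size=n)   # Cov(eps, v) = sigma_ev
        x = pi1 * Z[:, 0] + v                     # only the first instrument is relevant
        y = beta * x + eps
        errors[r] = tsls_slope(y, x, Z) - beta
    return np.median(errors)

results = {L: median_bias(L) for L in (3, 10, 30)}
for L, b in results.items():
    approx = L * 0.5 / (500 * 0.3**2)   # L * sigma_ev / (pi'Z'Z pi), with Z'Z close to n * I
    print(f"L={L:2d}  simulated median bias={b:.3f}  second-order approximation={approx:.3f}")
```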
To estimate the asymptotic variance, we need to estimate $\sigma_\varepsilon^2$.
We use the residuals of the 2SLS regression. However, the bias of the 2SLS estimator can cause a downward bias in the estimate of $\sigma_\varepsilon^2$.
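$E[\hat\sigma^2_{\varepsilon,2SLS}] = \sigma_\varepsilon^2 - \frac{2}{N}\left[(L - 2)\,\sigma_{\varepsilon v}^2\left(\pi' Z'Z \pi\right)^{-1}\right] - \frac{1}{N}\sigma_{\varepsilon v}^2 + \frac{1}{N}\left[\sigma_\varepsilon^2\,\sigma_{vv}\left(\pi' Z'Z \pi\right)^{-1}\right]$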
The bias can be quite large when the instruments are weak ($\pi \approx 0$).
The downward bias becomes larger as the number of instruments increases.
Monte Carlo results show that the test of overidentifying restrictions rejects too often when the instruments are weak.
There is no formal test for weak IV.
In practice, people look at the F statistic from the first stage. If F < 10, it indicates that the IVs are weak.
Stock et al. (2002) propose a test based on the F statistic of the reduced-form equation.
Hahn and Hausman (2002) propose a specification test based on a statistic that uses the reverse of the 2SLS estimator.
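The first-stage F statistic behind the rule of thumb can be computed directly from two residual sums of squares. The sketch below assumes homoskedastic errors and no exogenous covariates other than the intercept; `first_stage_F` is a hypothetical helper and the simulated slopes are made up.

```python
import numpy as np

def first_stage_F(x, Z):
    """F statistic for H0: all instrument coefficients are zero in x = c + Z pi + v."""
    x = np.asarray(x, float)
    Z = np.asarray(Z, float).reshape(len(x), -1)
    n, L = Z.shape
    Z1 = np.column_stack([np.ones(n), Z])
    resid = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
    rss_u = resid @ resid                      # unrestricted residual sum of squares
    rss_r = np.sum((x - x.mean()) ** 2)        # restricted model: intercept only
    return ((rss_r - rss_u) / L) / (rss_u / (n - L - 1))

rng = np.random.default_rng(3)
n = 1_000
z = rng.normal(size=n)
v = rng.normal(size=n)
x_strong = 0.5 * z + v      # clearly relevant instrument
x_weak = 0.02 * z + v       # nearly irrelevant instrument

F_strong = first_stage_F(x_strong, z)
F_weak = first_stage_F(x_weak, z)
print(f"F (strong first stage): {F_strong:.1f}   F (weak first stage): {F_weak:.2f}")
```

With a single instrument, this F statistic equals $(N - 2)R^2/(1 - R^2)$, where $R^2$ is the squared sample correlation between the regressor and the instrument.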
Solution 1: Use a (second-order) unbiased estimator, such as a maximum likelihood estimator under the normality assumption.
These estimators do not have finite sample moments and sometimes perform poorly with weak instruments.
In actual empirical situations, the lack of moments can cause problems for inference.
Solution 2: Use the jackknife 2SLS estimator.
This estimator omits the $j$th observation when calculating the first-stage coefficients used to construct the fitted value for the $j$th observation. It uses $N - 1$ observations to estimate $\pi$, instead of $N$.
It eliminates the finite sample bias.
Because this estimator is a linear combination of $N$ 2SLS estimators, it adds some degree of overidentification to the model.
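The leave-one-out first stage does not require re-running the regression $N$ times. The sketch below is one way to implement the idea, using the standard leverage shortcut for leave-one-out predictions; this variant is often called jackknife IV (JIVE) in the literature and may differ in detail from the estimator the slides have in mind. The function name `jive_slope` and the simulated design (no intercept, made-up parameters) are assumptions.

```python
import numpy as np

def jive_slope(y, x, Z):
    """Jackknife IV slope for one endogenous regressor (variables assumed mean zero)."""
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    h = np.einsum("ij,jk,ik->i", Z, ZtZ_inv, Z)   # leverage h_i = Z_i (Z'Z)^{-1} Z_i'
    x_fit = Z @ (ZtZ_inv @ (Z.T @ x))             # full-sample first-stage fitted values
    x_loo = (x_fit - h * x) / (1.0 - h)           # fitted value with pi estimated without i
    return (x_loo @ y) / (x_loo @ x), x_loo

rng = np.random.default_rng(7)
n, L, beta = 300, 10, 1.0
Z = rng.normal(size=(n, L))
v = rng.normal(size=n)
eps = 0.5 * v + rng.normal(size=n)
x = 0.3 * Z[:, 0] + v
y = beta * x + eps

b_jive, x_loo = jive_slope(y, x, Z)
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
b_2sls = (x_hat @ y) / (x_hat @ x)
print(f"2SLS: {b_2sls:.3f}   jackknife IV: {b_jive:.3f}   true: {beta:.3f}")
```

Because the fitted value for observation $i$ no longer uses $X_i$ itself, it is not mechanically correlated with $\varepsilon_i$ through $v_i$, which is the channel that produces the many-instrument bias of 2SLS.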
Solution 3: Rather than focusing on parameter estimators, Kleibergen (2002) corrects the statistical-inference problem under weak instruments.
For the just-identified case, Anderson and Rubin (1949) propose a test that does not depend on the strength of the instrument. Define the statistic
$S(\beta_1) = \frac{1}{N}\sum_{i=1}^N (Z_i - \bar Z)(Y_i - \beta_1 X_i)$
Under the null hypothesis that $\beta_1 = \beta_1^0$ and conditional on the instruments,
$\sqrt{N}\,S(\beta_1^0) \sim N\left(0,\ \frac{1}{N}\sum_{i=1}^N (Z_i - \bar Z)^2\,\sigma_\varepsilon^2\right)$
Anderson and Rubin test:
$H_0: \beta_1 = \beta_1^0$ vs $H_1: \beta_1 \neq \beta_1^0$
Define the following statistic
$AR(\beta_1^0) = \frac{N\,S(\beta_1^0)^2}{\frac{1}{N}\sum_{i=1}^N (Z_i - \bar Z)^2}\left[(1, -\beta_1^0)\,\Omega\,(1, -\beta_1^0)'\right]^{-1} \sim \chi^2_1$
where $\Omega$ is the variance-covariance matrix of $(u, v)$.
We can use this statistic to get a confidence interval for $\beta_1$.
In this case, the 95% confidence interval for $\beta_1$ is
$CI^{\beta_1}_{0.95} = \{\beta_1 : AR(\beta_1) \leq 3.84\}$
This confidence interval cannot be empty, because $AR(\hat\beta_1^{IV}) = 0$, so $\hat\beta_1^{IV}$ is always in the CI.
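An Anderson-Rubin confidence set can be obtained by evaluating the statistic on a grid of candidate values and keeping those that are not rejected. The sketch below is a minimal illustration for a single instrument on simulated data with made-up parameters. It normalizes the statistic by the variance of $\sqrt{N}S$, and it estimates $\Omega$ from reduced-form residuals, which is an assumption (with an estimated $\Omega$ the chi-square distribution is only approximate). The names `ar_stat`, `grid` and `inside` are hypothetical.

```python
import numpy as np

def ar_stat(beta0, y, x, z, omega):
    """Anderson-Rubin statistic for H0: beta1 = beta0 with a single instrument z."""
    n = len(y)
    zc = z - z.mean()
    S = np.mean(zc * (y - beta0 * x))        # S(beta0)
    a = np.array([1.0, -beta0])
    sigma2 = a @ omega @ a                   # (1, -beta0) Omega (1, -beta0)'
    return n * S**2 / (np.mean(zc**2) * sigma2)

rng = np.random.default_rng(11)
n = 500
z = rng.normal(size=n)
v = rng.normal(size=n)
eps = 0.5 * v + rng.normal(size=n)
x = 0.5 + 0.4 * z + v
y = 1.0 + 1.0 * x + eps                      # true beta1 = 1 (made-up value)

# Omega estimated from the residuals of the two reduced-form regressions on (1, z).
Z1 = np.column_stack([np.ones(n), z])
YX = np.column_stack([y, x])
U = YX - Z1 @ np.linalg.lstsq(Z1, YX, rcond=None)[0]   # columns: u_hat, v_hat
omega = U.T @ U / (n - 2)

grid = np.linspace(-1.0, 3.0, 801)
ar = np.array([ar_stat(b, y, x, z, omega) for b in grid])
inside = grid[ar <= 3.84]                    # 95% critical value of chi-square(1)

zc = z - z.mean()
beta_iv = np.mean(zc * y) / np.mean(zc * x)
print(f"IV estimate: {beta_iv:.3f}   AR set on grid: [{inside.min():.3f}, {inside.max():.3f}]")
```

With a very weak instrument the accepted region can be unbounded or a union of disjoint pieces, so reporting only the smallest and largest accepted grid points is appropriate only when the region is a single interval inside the grid.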
In the overidentified case, $\sqrt{N}\,S(\beta_1^0)$ has a multivariate normal distribution. In this case, the AR statistic is
$AR(\beta_1^0) = N\,S(\beta_1^0)'\left[\frac{1}{N}\sum_{i=1}^N (Z_i - \bar Z)(Z_i - \bar Z)'\right]^{-1} S(\beta_1^0)\left[(1, -\beta_1^0)\,\Omega\,(1, -\beta_1^0)'\right]^{-1} \sim \chi^2_L$
The problem is that the CI based on this statistic can be empty.
In this case, the test does not only test whether $\beta_1 = \beta_1^0$, but also whether the instruments are valid.
Kleibergen (2002) modifies this test by looking at the correlation between a particular linear combination of the instruments and the residuals.
His test statistic no longer has an exact chi-square distribution, but in large samples it is approximately chi-square with one degree of freedom.
The resulting confidence intervals are asymptotically valid regardless of the strength of the first stage.
They are NOT valid if one first inspects the first stage (by an F test) and, conditional on its strength, decides to proceed.
So far, we have assumed valid instruments, in the sense that $Z$ is orthogonal to $\varepsilon$.
Let's suppose that the instruments are slightly correlated with $\varepsilon$, but much less correlated than $v$ is with $\varepsilon$.
Hahn and Hausman consider the "large sample bias" in this case:
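$\operatorname{plim}\left(\hat\beta^{2SLS}\right) - \beta \approx \frac{\pi'\sigma_{Z,\varepsilon}}{\pi' Z'Z \pi}$
where $\sigma_{Z,\varepsilon}$ is the covariance between the instruments and $\varepsilon$.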
If we have weak instruments, the bias will not decrease with the sample size.
Let's compare the OLS and 2SLS estimators in the simple case $K = L = 1$.
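In this very simple case,
$\hat\beta_1^{OLS} - \beta_1 = \frac{\frac{1}{N}\sum_{i=1}^N (X_i - \bar X)(\varepsilon_i - \bar\varepsilon)}{\frac{1}{N}\sum_{i=1}^N (X_i - \bar X)^2}$
$\hat\beta_1^{2SLS} - \beta_1 = \frac{\frac{1}{N}\sum_{i=1}^N (\varepsilon_i - \bar\varepsilon)(Z_i - \bar Z)}{\frac{1}{N}\sum_{i=1}^N (X_i - \bar X)(Z_i - \bar Z)}$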
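Using the WLLN,
$\operatorname{plim}_{N\to\infty}\hat\beta_1^{OLS} - \beta_1 = \frac{\mathrm{Cov}(X, \varepsilon)}{\mathrm{Var}[X]} = \mathrm{corr}(X, \varepsilon)\sqrt{\frac{\mathrm{Var}[\varepsilon]}{\mathrm{Var}[X]}}$
$\operatorname{plim}_{N\to\infty}\hat\beta_1^{2SLS} - \beta_1 = \frac{\mathrm{Cov}(Z, \varepsilon)}{\mathrm{Cov}(X, Z)} = \frac{\mathrm{corr}(Z, \varepsilon)}{\mathrm{corr}(Z, X)}\sqrt{\frac{\mathrm{Var}[\varepsilon]}{\mathrm{Var}[X]}}$
In this case, if $\mathrm{corr}(Z, \varepsilon)$ is not equal to zero, the asymptotic bias of $\hat\beta_1^{2SLS}$ is larger than that of $\hat\beta_1^{OLS}$ if
$\left|\frac{\mathrm{corr}(Z, \varepsilon)}{\mathrm{corr}(Z, X)}\right| > \left|\mathrm{corr}(X, \varepsilon)\right|$
which is likely to happen when $\mathrm{corr}(Z, X) \approx 0$.
When we have weak instruments, OLS may do better than 2SLS.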
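The comparison is simple arithmetic once the three correlations are fixed. The short sketch below plugs made-up correlation values into the two probability limits; the function name `asymptotic_bias` and the numbers are purely illustrative.

```python
def asymptotic_bias(corr_x_eps, corr_z_eps, corr_z_x, sd_eps=1.0, sd_x=1.0):
    """Return (OLS bias, 2SLS bias) in the probability limit for K = L = 1."""
    scale = sd_eps / sd_x
    return corr_x_eps * scale, (corr_z_eps / corr_z_x) * scale

# A slightly invalid instrument, corr(Z, eps) = 0.02, with a weakening first stage.
for corr_z_x in (0.5, 0.1, 0.02):
    b_ols, b_2sls = asymptotic_bias(corr_x_eps=0.3, corr_z_eps=0.02, corr_z_x=corr_z_x)
    worse = "2SLS" if abs(b_2sls) > abs(b_ols) else "OLS"
    print(f"corr(Z,X)={corr_z_x}: OLS bias={b_ols:.2f}, 2SLS bias={b_2sls:.2f}, larger bias: {worse}")
```

In this illustration the 2SLS bias overtakes the OLS bias only in the last row, once corr(Z, X) falls below corr(Z, eps) / corr(X, eps), which is about 0.067 for these values.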
References
Angrist, J. and A. Krueger (1991), "Does compulsory school attendance affect schooling and earnings?", Quarterly Journal of Economics, 106:979-1014.
Bound, J., D. Jaeger and R. Baker (1996), "Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak", Journal of the American Statistical Association, 90:443-450.
Donald, S. and W. Newey (2001), "Choosing the number of instruments", Econometrica, 69:1161-1191.
Flores-Lagunes, A. (2007), "Finite sample evidence of IV estimators under weak instruments", Journal of Applied Econometrics, 22:677-694.
Hahn, J. and J. Hausman (2003), "Weak instruments: diagnosis and cures in empirical econometrics", American Economic Review, 93:118-125.
Staiger, D. and J. Stock (1997), "Instrumental variables regression with weak instruments", Econometrica, 65:557-586.