2013-01-18 Hansen IV Slides
Instrumental Variables Methods
Christian Hansen
Booth School of Business
University of Chicago
Introduction
• Many studies in social sciences interested in
inferring structural/causal/treatment effects
– Price elasticity of demand
– Effect of smoking on birthweight
– Effect of 401(k) participation on saving
– Effect of job training on wages/employment
– Effect of schooling on wages
– …
• Only have observational data
– Conventional statistical methods may not recover the desired effect
Example 1: Supply and Demand
[Figure: supply and demand diagram, with the supply curve labeled]
Observed relationship between price and quantity reveals neither supply nor
demand! (Simultaneity)
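A short worked sketch of why the fitted line is neither curve. The linear functional forms, the slope symbols $b$ and $d$, and the uncorrelated shocks $u$ and $w$ are illustrative assumptions, not taken from the slides.

```latex
\begin{align*}
\text{Demand: } & q = a - b\,p + u, \qquad \text{Supply: } q = c + d\,p + w, \qquad \operatorname{Cov}(u, w) = 0 \\
\text{Equilibrium price: } & p = \frac{a - c + u - w}{b + d} \\
\text{Slope of } q \text{ on } p\text{: } & \frac{\operatorname{Cov}(q, p)}{\operatorname{Var}(p)}
  = \frac{d\,\sigma_u^2 - b\,\sigma_w^2}{\sigma_u^2 + \sigma_w^2}
\end{align*}
```

Under these assumptions the regression slope is a variance-weighted average of the supply slope $d$ and the demand slope $-b$. It equals the demand slope only if there are no demand shocks ($\sigma_u^2 = 0$) and the supply slope only if there are no supply shocks ($\sigma_w^2 = 0$).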
Example 2: Job Training
• Observe data on earnings for people who have and have not
completed job training.
• Want to infer the causal effect of job training on earnings
Common Structure:
• “Structural” Model: $y_i = x_i \beta + \varepsilon_i$
• y – outcome of interest
• x – observed “treatment” variable
• β – treatment/structural/causal effect (NOT regression coefficient)
• $E[x_i \varepsilon_i] \neq 0$ (Endogeneity)
• Intuition:
Movements in the instrument $z$ are unrelated to movements in $\varepsilon$ but are related to movements in $x$
Movements in the supply curve induced by changing z trace out the demand curve
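The statement above can be checked with a small simulation. This is a minimal sketch under assumed linear demand and supply curves with made-up intercepts, slopes, and unit-variance shocks. The function names `cov` and `simulate_market` are hypothetical helpers, not part of the slides.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


def simulate_market(n=20000, demand_slope=-1.0, supply_slope=0.5, seed=0):
    """Draw equilibrium prices and quantities plus a supply shifter z.

    Assumed (illustrative) model:
      demand: q = 10 + demand_slope * p + u
      supply: q = 2 + supply_slope * p + z + w
    """
    rng = random.Random(seed)
    prices, quantities, shifters = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # shifts supply only (the instrument)
        u = rng.gauss(0.0, 1.0)  # unobserved demand shock
        w = rng.gauss(0.0, 1.0)  # unobserved supply shock
        # Market clearing: set demand equal to supply and solve for the price.
        p = (10.0 - 2.0 + u - z - w) / (supply_slope - demand_slope)
        q = 10.0 + demand_slope * p + u
        prices.append(p)
        quantities.append(q)
        shifters.append(z)
    return prices, quantities, shifters


prices, quantities, shifters = simulate_market()

# OLS slope of quantity on price mixes the two curves.
ols_slope = cov(quantities, prices) / cov(prices, prices)
# IV slope uses only the price movements induced by the supply shifter z.
iv_slope = cov(shifters, quantities) / cov(shifters, prices)

print(f"OLS slope of quantity on price: {ols_slope:.3f}")
print(f"IV slope using z:               {iv_slope:.3f}")
```

With these assumed parameters the population OLS slope is -0.5, which is neither the demand slope (-1.0) nor the supply slope (0.5), while the IV ratio should land close to the demand slope.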
How do instruments help?
• Quasi-mathematically. IV model
• First-Stage Equation:
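The equations for this slide did not survive extraction. A hedged sketch of the standard single-instrument IV model, consistent with the later statements about $E[z_i x_i]$ and β; the symbols $\varepsilon_i$, $\pi$, $v_i$, $\gamma$, and $u_i$ are assumed notation:

```latex
\begin{align*}
y_i &= x_i \beta + \varepsilon_i, & E[z_i \varepsilon_i] &= 0 && \text{(structural equation, exclusion restriction)} \\
x_i &= z_i \pi + v_i, & E[z_i v_i] &= 0 && \text{(first stage)} \\
y_i &= z_i \pi \beta + (v_i \beta + \varepsilon_i) = z_i \gamma + u_i, & E[z_i u_i] &= 0 && \text{(reduced form)} \\
\beta &= \gamma / \pi && && \text{(requires } \pi \neq 0\text{)}
\end{align*}
```

Both $\gamma$ and $\pi$ are coefficients from regressions on $z$, so each can be estimated consistently by least squares. Their ratio recovers β as long as the first-stage coefficient is not zero.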
Example: Job Training
• OLS Results (from Stata):
------------------------------------------------------------------------------
             |               Robust
    earnings |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       train |   3753.362   536.3832     7.00   0.000      2701.82    4804.904
.
.
.
Example: Job Training
• First-Stage Results (from Stata):
------------------------------------------------------------------------------
             |               Robust
       train |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       offer |   .6088885   .0087478    69.60   0.000      .591739    .6260379
.
.
.
Strong evidence that $E[z_i x_i] \neq 0$
Example: Job Training
• Reduced-Form Results (from Stata):
------------------------------------------------------------------------------
             |               Robust
    earnings |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       offer |    970.043   545.6179     1.78   0.075    -99.60296    2039.689
.
.
.
Moderate evidence of a non-zero treatment effect
(maintaining exclusion restriction)
Example: Job Training
Note: Some software reports R² after IV regression. This object is NOT meaningful and should not be used.
• IV Results (from Stata):
ivreg earnings (train = offer) x1-x13 , robust
------------------------------------------------------------------------------
             |               Robust
    earnings |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       train |   1593.137   894.7528     1.78   0.075    -160.9632    3347.238
.
.
.
Moderate evidence of a positive treatment effect (maintaining
exclusion restriction). Substantially attenuated relative to OLS,
consistent with intuition.
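With one instrument and one endogenous variable, the IV coefficient equals the reduced-form coefficient divided by the first-stage coefficient. A quick arithmetic check using the numbers reported in the Stata output (the variable names are only labels for this sketch):

```python
# Coefficients copied from the reported Stata output.
reduced_form = 970.043    # coefficient on offer in the earnings regression
first_stage = 0.6088885   # coefficient on offer in the train regression
iv_reported = 1593.137    # coefficient on train from ivreg

# Ratio of reduced form to first stage (indirect least squares).
wald_ratio = reduced_form / first_stage
print(round(wald_ratio, 3))  # 1593.137
```

The ratio reproduces the reported IV coefficient to the displayed precision, which is what the exactly identified case predicts when the same controls enter all three regressions.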
Two-Stage Least Squares
• May have more instruments than endogenous variables
• In principle, many IV estimators can be constructed
• 2SLS is the minimum variance (under homoskedasticity) linear
combination of the potential IV estimators (otherwise may use GMM)
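• 2SLS is the GMM estimator using the full set of orthogonality conditions
implied by the exclusion restriction $E[z_i \varepsilon_i] = 0$, where $\varepsilon_i$ is the structural error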
• 2SLS and IV are numerically equivalent when # of endogenous
variables = # of instruments
• First-Stage Equation:
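The claim that 2SLS is a linear combination of the individual IV estimators can be illustrated numerically. This is a minimal sketch on simulated data with one endogenous regressor and two instruments. The data-generating values, the omission of an intercept (all variables are mean zero by construction), and the helper `dot` are assumptions made for illustration. It verifies the combination identity only, not the minimum-variance property.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


rng = random.Random(1)
n, beta = 5000, 2.0
z1, z2, x, y = [], [], [], []
for _ in range(n):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # two instruments
    v = rng.gauss(0.0, 1.0)                          # first-stage error
    eps = 0.8 * v + 0.6 * rng.gauss(0.0, 1.0)        # correlated with v: x is endogenous
    xi = 1.0 * a + 0.5 * b + v
    z1.append(a)
    z2.append(b)
    x.append(xi)
    y.append(beta * xi + eps)

# First stage: solve the 2x2 normal equations for the regression of x on (z1, z2).
s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
r1, r2 = dot(z1, x), dot(z2, x)
det = s11 * s22 - s12 * s12
pi1 = (s22 * r1 - s12 * r2) / det
pi2 = (s11 * r2 - s12 * r1) / det
x_hat = [pi1 * a + pi2 * b for a, b in zip(z1, z2)]

# 2SLS: use the first-stage fitted values as the single instrument.
tsls = dot(x_hat, y) / dot(x_hat, x)

# IV estimators that each use one instrument on its own.
iv1 = dot(z1, y) / dot(z1, x)
iv2 = dot(z2, y) / dot(z2, x)

# Weights implied by the first stage; they sum to one.
w1 = pi1 * dot(z1, x) / dot(x_hat, x)
w2 = pi2 * dot(z2, x) / dot(x_hat, x)
combo = w1 * iv1 + w2 * iv2

print(f"2SLS: {tsls:.4f}  weighted IV combination: {combo:.4f}")
```

The two printed numbers agree up to floating-point error because the fitted value is `pi1 * z1 + pi2 * z2`, so the 2SLS numerator and denominator split into the corresponding single-instrument pieces.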
Example: Returns to Schooling
• OLS Results (from Stata):
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    .067339   .0003883   173.40   0.000     .0665778    .0681001
.
.
.
If intuition about the source of endogeneity is correct, this should be an overestimate of the effect of schooling.
Example: Returns to Schooling
• First-Stage Results (from Stata):
xi: regress educ i.qob i.sob i.yob , robust
Linear regression Number of obs = 329509
F( 62,329446) = 292.87
Prob > F = 0.0000
R-squared = 0.0572
Root MSE = 3.1863
------------------------------------------------------------------------------
             |               Robust
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iqob_2 |   .0455652    .015977     2.85   0.004     .0142508    .0768797
     _Iqob_3 |   .1060082   .0155308     6.83   0.000     .0755683     .136448
     _Iqob_4 |   .1525798   .0157993     9.66   0.000     .1216137    .1835459
.
.
.
testparm _Iqob*
( 1) _Iqob_2 = 0
( 2) _Iqob_3 = 0
( 3) _Iqob_4 = 0
F( 3,329446) = 36.06
Prob > F = 0.0000
First-stage F-statistic: 36.06
Example: Returns to Schooling
• Reduced-Form Results (from Stata):
xi: regress lwage i.qob i.sob i.yob , robust
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iqob_2 |   .0028362   .0033445     0.85   0.396    -.0037188    .0093912
     _Iqob_3 |   .0141472   .0032519     4.35   0.000     .0077736    .0205207
     _Iqob_4 |   .0144615   .0033236     4.35   0.000     .0079472    .0209757
.
.
testparm _Iqob*
( 1) _Iqob_2 = 0
( 2) _Iqob_3 = 0
( 3) _Iqob_4 = 0
F( 3,329446) = 10.43
Prob > F = 0.0000
Example: Returns to Schooling
• 2SLS Results (from Stata):
xi: ivregress 2sls lwage (educ = i.qob) i.yob i.sob , robust
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1076937   .0195571     5.51   0.000     .0693624     .146025
.
.
.
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1077817   .0195588     5.51   0.000     .0694472    .1461163
.
.
.
• IV Estimates:
– Q2: .166 (.071)
– Q3: .209 (.076)
– Q4: .085 (.026)
$$\hat\beta_{2SLS} = (X' P_Z X)^{-1} (X' P_Z Y) = (X'X)^{-1} (X'Y) = \hat\beta_{OLS}$$
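The second equality in the chain above is not general: it requires $P_Z X = X$. One case where this happens, offered as an assumption about the lost slide context rather than something the extracted text states, is having as many linearly independent instruments as observations, so that $Z$ is $n \times n$ and invertible:

```latex
\begin{align*}
P_Z &= Z (Z'Z)^{-1} Z' = Z Z^{-1} (Z')^{-1} Z' = I_n \\
\hat\beta_{2SLS} &= (X' P_Z X)^{-1} X' P_Z Y = (X'X)^{-1} X'Y = \hat\beta_{OLS}
\end{align*}
```

In that case the first stage fits $X$ perfectly, the fitted values carry all of the endogenous variation in $X$, and 2SLS inherits the OLS bias.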
• Potential impacts:
– “public use”: removing economic blight and/or promoting economic development
– redistribution of wealth from groups with little political power
– distortions in the efficient investment of capital
• underinvestment due to uncertainty induced by potential seizure
• overinvestment when property owners anticipate receiving higher
than market compensation
Example: Eminent Domain
• Want to understand effect of number of decisions that
favor private ownership (go against government seizure)
on economic outcomes
– Real estate prices, GDP
– log(GDP):
• OLS: .0099 (.0048)
• 2SLS (after LASSO): .013 (.016) (1 instrument selected)
Weak Identification
• Consider the IV estimator: $\hat\beta_{IV} = (Z'X)^{-1} Z'Y$
• “Weak Identification”
– $E[z_i x_i'] = 0$ may hold in the population, but $Z'X$ will never be exactly 0 in a finite sample.
• Any estimator that depends on $(Z'X)^{-1}$ will always suggest you can learn about β in finite samples
– $Z'X$ may be non-zero but close to zero. Dividing by something close to 0 causes problems.
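The points above can be illustrated with a small Monte Carlo. This is a minimal sketch assuming one instrument, one endogenous regressor, no intercept, and made-up parameter values. The functions `iv_estimates` and `median_abs_error` are hypothetical helpers.

```python
import random
import statistics


def iv_estimates(first_stage_coef, n=200, reps=500, beta=1.0, seed=0):
    """Simulate the just-identified IV estimator (z'x)^{-1} z'y repeatedly.

    Returns the list of estimates and the smallest |z'x| seen across samples.
    """
    rng = random.Random(seed)
    estimates = []
    smallest_abs_zx = float("inf")
    for _rep in range(reps):
        zx = 0.0
        zy = 0.0
        for _obs in range(n):
            z = rng.gauss(0.0, 1.0)
            v = rng.gauss(0.0, 1.0)
            eps = 0.8 * v + 0.6 * rng.gauss(0.0, 1.0)  # endogeneity: eps correlated with v
            x = first_stage_coef * z + v
            y = beta * x + eps
            zx += z * x
            zy += z * y
        smallest_abs_zx = min(smallest_abs_zx, abs(zx))
        estimates.append(zy / zx)
    return estimates, smallest_abs_zx


def median_abs_error(estimates, beta=1.0):
    """Median absolute deviation of the estimates from the true beta."""
    return statistics.median([abs(b - beta) for b in estimates])


# Relevant instrument: E[z x] = 1.
strong, _ = iv_estimates(first_stage_coef=1.0)
# Irrelevant instrument: E[z x] = 0 in the population.
irrelevant, smallest_zx = iv_estimates(first_stage_coef=0.0)

print(f"smallest |z'x| with an irrelevant instrument: {smallest_zx:.4f}")
print(f"median |error|, relevant instrument:   {median_abs_error(strong):.3f}")
print(f"median |error|, irrelevant instrument: {median_abs_error(irrelevant):.3f}")
```

Even when the instrument is irrelevant in the population, the sample cross-product `z'x` is never exactly zero, so the estimator always returns a number. That number is far from the true β much more often than when the instrument is relevant.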
Weak Identification
• Extreme case as an illustration:
– From the exclusion restriction, we know since we are evaluating at the true
values
Weak first-stage
• First-Stage 2
1. Controls 1: Reject
2. Controls 2: Reject
regress newy lnmort latitude , robust