Important Natural Experiments RDD
November 2010
Clément de Chaisemartin
Outline
Difference in differences
  Intuition
  Identification of a causal effect
  Discussion of the assumption
  Examples
Regression Discontinuity Design
  Two policy questions
  Intuition
  Formal Analysis
  Discussion of the assumptions
  Applications
Correction of the Stata exercise
Conclusion
Gold standard in impact evaluation = randomized experiments (session 6). When it is not possible to run a randomized experiment, three alternative possibilities:
- Instrumental variables (session 7)
- Difference in differences (today)
- Regression discontinuity (today)
Contrary to randomized experiments, these methods are not assumption free. Rely on assumptions which cannot be tested => should be carefully discussed.
Before-After analysis
Assume you want to measure the impact of the minimum wage on employment. It is not possible to run a randomized experiment, and it is tough to find an instrumental variable.
You have a data set of 200 American fast-food restaurants (FF) located in New Jersey (NJ), with the number of people employed during year N and year N+1. FF are highly relevant for this type of analysis: they do pay the minimum wage to their employees.
We assume that the minimum wage in NJ changed between N and N+1.
You could say that impact of the change of the minimum wage = average number of employees in N+1 - average number of employees in N.
Issue = the minimum wage is not the only thing that changed between the before and the after period (business cycle...).
Cross-sectional analysis
Assume that on top of this data set, you have another one with the same data but for FF located in Pennsylvania (Penn), where there was no change in the minimum wage from N to N+1 and where the minimum wage in N and N+1 was similar to that of NJ in N.
=> you could say that impact of the change of the minimum wage = average number of employees in NJ firms in N+1 - average number of employees in Penn firms in N+1.
Issue = NJ and Penn differ on more aspects than just the minimum wage.
=> your estimate might capture more than just the minimum wage effect (maybe fast-food restaurants are just bigger in one state than in the other: larger population, larger cities...)
=> any idea to circumvent these two issues?
Difference in differences
impact of the change of the minimum wage =
(average number of employees in NJ FF in N+1 - average number of employees in NJ FF in N)
- (average number of employees in Penn FF in N+1 - average number of employees in Penn FF in N).
This is called difference in differences, or double differences.
Intuition: you observe a change in FF size from N to N+1 in NJ. Some of it is due to the change in the minimum wage, some of it would have happened anyway. To measure the change that would have happened anyway, you do the same computation but in a state where there has been no change in the minimum wage.
=> effect of the minimum wage = total change - change that would have happened anyway.
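The three estimators discussed so far (before-after, cross-sectional, difference in differences) can be compared on a small numerical sketch. The four averages below are made-up values chosen for illustration only, and the function name `did_estimate` is hypothetical.

```python
def did_estimate(treated_before, treated_after, control_before, control_after):
    """Difference in differences computed from four group averages."""
    total_change = treated_after - treated_before          # change observed in the treated state
    counterfactual_change = control_after - control_before  # change that would have happened anyway
    return total_change - counterfactual_change


# Made-up average numbers of employees per restaurant (illustration only).
nj_before, nj_after = 20.0, 22.0      # New Jersey, year N and year N+1
penn_before, penn_after = 24.0, 23.0  # Pennsylvania, year N and year N+1

before_after = nj_after - nj_before    # ignores what else changed over time
cross_section = nj_after - penn_after  # ignores permanent differences between states
did = did_estimate(nj_before, nj_after, penn_before, penn_after)

print("before-after:", before_after)
print("cross-section:", cross_section)
print("difference in differences:", did)
```

With these made-up numbers the three approaches give three different answers, which is the point of the slide: only the last one removes both the common time change and the permanent gap between states.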
Two groups (NJ and Penn, test and control group), represented by a dummy G.
Two periods of time (before and after), represented by a dummy T.
One treatment, represented by a dummy D, with $D = T \times G$: only observations from the test group (G = 1) in period 1 (T = 1) are treated.
Potential outcomes:
- Y(1): what happens to someone if he receives the treatment.
- Y(0): what happens to someone if he does not receive the treatment.
Y = observed outcome.
For the test group in period 1, is Y equal to Y(1) or to Y(0)? For the control group in period 0?
What we want to identify is
\[ E(Y(1) - Y(0) \mid T = 1, G = 1) = E(Y(1) \mid T = 1, G = 1) - E(Y(0) \mid T = 1, G = 1). \]
Which of these two expectations can we easily estimate from the sample? Which one is missing?
FF example: average number of employees in NJ FF after the minimum wage increase, minus the average number of employees in NJ FF after the minimum wage increase if the minimum wage had actually not increased.
Summary
         T = 0          T = 1
G = 0    0% treated     0% treated
G = 1    0% treated     100% treated
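Assumption (common trends):
\[ E(Y(0) \mid T = 1, G = 1) - E(Y(0) \mid T = 0, G = 1) = E(Y(0) \mid T = 1, G = 0) - E(Y(0) \mid T = 0, G = 0). \]
Interpretation of the assumption in the minimum wage example: without the rise of the minimum wage, the average number of employees in NJ FF would have followed the same evolution from period 0 to period 1 as the one observed in Pennsylvania.
Under this assumption:
\[ E(Y(0) \mid T = 1, G = 1) = E(Y(0) \mid T = 0, G = 1) + E(Y(0) \mid T = 1, G = 0) - E(Y(0) \mid T = 0, G = 0) \]
and therefore, since Y = Y(1) for the test group in period 1 and Y = Y(0) in the three other cells,
\[ E(Y(1) - Y(0) \mid T = 1, G = 1) = E(Y \mid T = 1, G = 1) - E(Y \mid T = 0, G = 1) - \left[ E(Y \mid T = 1, G = 0) - E(Y \mid T = 0, G = 0) \right]. \]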
Regression analysis
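If you run the following regression:
\[ Y = \alpha + \beta \, 1\{G = 1\} + \gamma \, 1\{T = 1\} + \delta \, 1\{G = 1\} 1\{T = 1\} + \varepsilon, \]
then $\delta$ = DID. Indeed,
\[ E(Y \mid T = 0, G = 0) = \alpha, \quad E(Y \mid T = 1, G = 0) = \alpha + \gamma, \]
\[ E(Y \mid T = 0, G = 1) = \alpha + \beta, \quad E(Y \mid T = 1, G = 1) = \alpha + \beta + \gamma + \delta, \]
so that $[(\alpha + \beta + \gamma + \delta) - (\alpha + \beta)] - [(\alpha + \gamma) - \alpha] = \delta$.
=> you can easily get the difference in differences estimator and its t-statistic from this regression.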
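A minimal sketch of this equivalence on simulated data, assuming NumPy is available. The sample size, coefficient values and noise level are made up for illustration, and the t-statistic uses the classical homoskedastic OLS variance formula, which is an assumption of this sketch rather than something the slides specify.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated data (all numbers are made up for illustration).
G = rng.integers(0, 2, size=n)   # 1 = test group, 0 = control group
T = rng.integers(0, 2, size=n)   # 1 = after, 0 = before
true_effect = 3.0
Y = 20.0 + 2.0 * G - 1.0 * T + true_effect * G * T + rng.normal(0.0, 1.0, size=n)

# OLS of Y on a constant, G, T and the interaction G*T.
X = np.column_stack([np.ones(n), G, T, G * T])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
delta_hat = coef[3]  # coefficient on the interaction term


def cell_mean(g, t):
    """Average outcome in group g and period t."""
    return Y[(G == g) & (T == t)].mean()


# Difference in differences computed directly from the four cell averages.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Classical (homoskedastic) standard error and t-statistic of delta_hat.
residuals = Y - X @ coef
sigma2 = residuals @ residuals / (n - X.shape[1])
cov_matrix = sigma2 * np.linalg.inv(X.T @ X)
t_stat = delta_hat / np.sqrt(cov_matrix[3, 3])

print("interaction coefficient:", delta_hat)
print("DID from cell means:    ", did_means)
print("t-statistic:            ", t_stat)
```

Because the regression has one parameter per (G, T) cell, the interaction coefficient and the difference in differences of the cell averages coincide up to floating-point error.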
Not testable by a formal test.
Graphical test: if you have several years of data, you can check whether the outcome variable followed parallel trends in the two groups over the period, except in the year of the reform.
Assume that the minimum wage reform was implemented on 01/01/1993. If you also have data on the same FF in 1991 and 1992, you can compute the same DID but from 1991 to 1992, that is to say over two years when there was no change in the minimum wage. This is what we call a placebo difference in differences.
If your common trend assumption is true, do you expect to find that this placebo difference is large or small?
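The placebo idea can be sketched on simulated data in which the common trend assumption holds by construction. Everything below (state levels, trend, effect size, 200 restaurants per state and year) is a made-up assumption for illustration, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_state = 200
years = [1991, 1992, 1993]

# Made-up data-generating process: both states share the same trend,
# and only NJ is affected by the reform, from 1993 onwards.
common_trend = {1991: 0.0, 1992: -1.0, 1993: -2.0}
true_effect = 3.0


def simulate_mean(state_level, year, treated):
    """Average employment in a simulated sample of restaurants."""
    effect = true_effect if treated else 0.0
    noise = rng.normal(0.0, 1.0, size=n_per_state)
    return (state_level + common_trend[year] + effect + noise).mean()


mean_employment = {}
for year in years:
    mean_employment[("NJ", year)] = simulate_mean(20.0, year, treated=(year == 1993))
    mean_employment[("Penn", year)] = simulate_mean(23.0, year, treated=False)


def did(start, end):
    """Difference in differences between two years, NJ versus Penn."""
    nj_change = mean_employment[("NJ", end)] - mean_employment[("NJ", start)]
    penn_change = mean_employment[("Penn", end)] - mean_employment[("Penn", start)]
    return nj_change - penn_change


placebo_did = did(1991, 1992)  # no reform between these two years
reform_did = did(1992, 1993)   # the reform happens between these two years

print("placebo DID (1991 to 1992):", placebo_did)
print("reform DID (1992 to 1993): ", reform_did)
```

In this simulation the placebo difference is close to zero while the reform-year difference is close to the simulated effect. A large placebo difference in real data would cast doubt on the common trend assumption.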
Issues with the paper: they do not have several years of data => cannot run the placebo tests described above. However, they have employment data on high-end restaurants => compute the difference in differences estimator on them; it is 0, as expected (a rise in the minimum wage should not have any impact on them).
Cons:
- unemployment benefits = an incentive for the unemployed not to look for a job.
By comparing the probability of finding a job of individuals slightly below 50 to the same probability but for individuals slightly above 50.
\[ \lim_{A \to 50, A \geq 50} E(Y \mid A) - \lim_{A \to 50, A < 50} E(Y \mid A) = \lim_{A \to 50, A \geq 50} E(Y_1 \mid A) - \lim_{A \to 50, A < 50} E(Y_0 \mid A) \]
\[ = E(Y_1 \mid A = 50) - E(Y_0 \mid A = 50) = E(Y_1 - Y_0 \mid A = 50), \]
thanks to the continuity assumption.
An interesting quantity, $E(Y_1 - Y_0 \mid A = 50)$, is equal to something we can estimate from the sample:
\[ \lim_{A \to 50, A \geq 50} E(Y \mid A) - \lim_{A \to 50, A < 50} E(Y \mid A). \]
To estimate this difference of limits, run the following regression:
\[ Y = \alpha + \beta_1 (A - 50) + \beta_2 (A - 50)^2 + \dots + \beta_k (A - 50)^k + \gamma_1 (A - 50) 1\{A \geq 50\} + \gamma_2 (A - 50)^2 1\{A \geq 50\} + \dots + \gamma_k (A - 50)^k 1\{A \geq 50\} + \delta \, 1\{A \geq 50\} + \varepsilon. \]
You can check that under this model, $\delta = \lim_{A \to 50, A \geq 50} E(Y \mid A) - \lim_{A \to 50, A < 50} E(Y \mid A)$.
Intuition
Estimate Y as a continuous (polynomial) function of the running variable on the left-hand side of the threshold, and as another one on the right-hand side, and see whether these two functions connect at the threshold.
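A minimal sketch of this polynomial regression on simulated data, assuming NumPy is available. The outcome, the age range, the polynomial order k = 2 and the size of the jump are all made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
cutoff = 50.0
true_jump = -0.10  # made-up discontinuity in the outcome at age 50

# Simulated ages around the threshold and an indicator for being at or above it.
A = rng.uniform(40.0, 60.0, size=n)
above = (A >= cutoff).astype(float)
x = A - cutoff  # age centered at the threshold

# Made-up outcome: smooth in age, except for a jump at the threshold.
Y = 0.5 - 0.01 * x + 0.0005 * x**2 + true_jump * above + rng.normal(0.0, 0.05, size=n)

# Regressors: constant, polynomial in (A - 50), the same polynomial interacted
# with 1{A >= 50}, and finally the indicator 1{A >= 50} itself.
k = 2
poly = [x**p for p in range(1, k + 1)]
X = np.column_stack([np.ones(n)] + poly + [term * above for term in poly] + [above])

coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
delta_hat = coef[-1]  # estimated jump at the threshold

print("estimated jump at the threshold:", delta_hat)
```

The interaction terms let the polynomial differ on each side of the threshold, so the coefficient on the indicator measures only the gap between the two fitted curves at A = 50.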
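=> you instrument CS (class size) by the threshold variable $1\{SS \geq 40\}$, where SS is the size of the school. Then you run the regression:
\[ Y = \alpha + \beta_1 (SS - 40) + \beta_2 (SS - 40)^2 + \dots + \beta_k (SS - 40)^k + \gamma_1 (SS - 40) 1\{SS \geq 40\} + \gamma_2 (SS - 40)^2 1\{SS \geq 40\} + \dots + \gamma_k (SS - 40)^k 1\{SS \geq 40\} + \delta \, \widehat{CS} + \varepsilon. \]
This is a 2SLS procedure. $\widehat{CS}$ = variation in class size only due to the size of the school. What will be the value of $\delta$?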
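The two stages can be sketched by hand on simulated data, assuming NumPy is available. The data-generating process is entirely made up: class size is halved once school size reaches 40, an unobserved factor `u` affects both class size and the outcome, the polynomial order is k = 1, and the effect of class size is set to -0.3. None of these values come from the course.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4000
threshold = 40.0
true_effect = -0.3  # made-up effect of one more student per class on the outcome

# Simulated school size and threshold indicator.
SS = rng.uniform(20.0, 60.0, size=n)
above = (SS >= threshold).astype(float)
x = SS - threshold

# Made-up rule: class size drops by half once school size reaches 40.
# The unobserved factor u affects both class size and the outcome.
u = rng.normal(0.0, 1.0, size=n)
CS = np.where(above == 1.0, SS / 2.0, SS) + 3.0 * u + rng.normal(0.0, 1.0, size=n)
Y = 50.0 + 0.2 * x + true_effect * CS + 2.0 * u + rng.normal(0.0, 1.0, size=n)


def ols(X, y):
    """OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Polynomial controls in (SS - 40), here with k = 1, allowed to differ above the threshold.
controls = np.column_stack([np.ones(n), x, x * above])

# First stage: class size on the controls and the threshold indicator (the instrument).
Z = np.column_stack([controls, above])
first_stage = ols(Z, CS)
CS_hat = Z @ first_stage

# Second stage: outcome on the controls and the predicted class size.
delta_hat = ols(np.column_stack([controls, CS_hat]), Y)[-1]

print("first-stage jump in class size:", first_stage[-1])
print("2SLS estimate of the class size effect:", delta_hat)
```

In this sketch the second-stage coefficient equals the jump in Y at the threshold divided by the jump in CS. Running the second stage by hand gives the right coefficient but not the right standard errors, so a dedicated 2SLS routine would be needed for inference.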
Informal tests
No formal test of the continuity assumption.
In the class size example, you can plot, for instance, some student characteristics (X = parents' income...) as a function of the size of the school. What you expect = no big jump at the 40-student threshold.
Intuition: Y jumps at the threshold, and so does CS. If those are the only two things which jump, then we can attribute the jump in Y to the jump in CS (people are comparable in every respect on the left and on the right of the threshold, except with respect to CS). But if other things jump at the threshold, such as parents' income, then we do not know whether the jump in Y is due to the jump in CS or to the jump in parents' income.
Example of a micro-credit program in Mexico: eligibility rule = having a plot of land smaller than 1 ha (10,000 m²).
=> some people temporarily sold part of their land to benefit from the program.
=> now, people on the left and on the right of the threshold are no longer comparable. On the left: people whose land is really below 1 ha + dishonest people with land above 1 ha. On the right: only honest people whose land is above 1 ha.
=> you are comparing apples and bananas again!
OLS results
Stata exercise
Exam (1/2)
List of the proofs that can be asked in the exam (5% of final grade):
Session 2:
- deriving $\hat{\alpha}$ and $\hat{\beta}$ from the OLS minimization problem
- proof of the fact that under the univariate linear model assumptions, $\beta = \frac{cov(X, Y)}{V(X)}$
- consistency of $\hat{\beta}$
Session 3:
- proof of the fact that in the multivariate regression model, $\hat{\beta}_1 = \frac{cov_e(r_1, Y)}{V_e(r_1)}$, where $r_1$ is the estimated residual in the regression of $X_1$ on all the other explanatory variables.
- proof that if $cov_e(X_1, X_2) = 0$, then $\hat{\beta}_1^1$, the coefficient of $X_1$ when Y is regressed on $X_1$ only, is the same as $\hat{\beta}_1^2$, the coefficient of $X_1$ when Y is regressed on $X_1$ and $X_2$.
Exam (2/2)
Session 6:
- proof of the fact that when you run a randomized experiment, $E(Y_1) - E(Y_0) = E(Y_1 \mid T = 1) - E(Y_0 \mid T = 0)$ (assuming $Y_1$ and $Y_0$ are discrete)
Session 7:
- proof of the fact that under the 2 IV assumptions, $\beta = \frac{cov(y, z)}{cov(x, z)}$.
- value of $\hat{\beta}_1$ when there is an omitted variable bias. Interpretation of the change in regression coefficients induced by this result.
10% of final grade: questions related to the course.
35% of final grade: questions on a paper.
Issue of highly heterogeneous initial background of students. Do you think that:
- The course should be kept as it is
- The course should be kept as it is but should become an elective, so that only students who think their initial background is sufficient will choose to follow it
- The course should remain a compulsory course, but to allow students with little initial background in maths to follow it, it should become a fully applied econometrics course, that is to say a course in which all sessions take place in the PC lab and consist of Stata exercises
- There should not be an econometrics course in the program
Stata projects were (to some extent) business oriented. But the empirical articles we discussed in class were mostly policy oriented. You are business school students => those papers might be of little interest to you.
=> last question = do you think that the empirical papers we studied in class were interesting?
- Very interesting
- Somewhat interesting
- Not very interesting
- Not interesting at all
Methods to help you forecast future outcomes based on information available today: regressions.
- time series models to predict future sales...
- logistic regression to predict whether a customer will default or not...