Problem Set 2: Schools
Matteo Vasca – ID 3020928
1. School vouchers
a. I would expect β̂1 to be biased downward relative to β1. Intuitively, the estimated effect of
school vouchers would have been larger if every student who received a voucher had been able to
use it.
b. If we treat every student who did not attend private school as part of the control group, we
will find an upwardly biased β̂1, because the students who drop out of private school are more likely
to be low-scoring students; our treatment group will then be made up “only” of high-scoring students.
We find the mean score of the control group, which now contains the original control students
(score 60) plus the 20% of voucher students who dropped out (score 40):
[ 60*(100%) + 40*(20%) ] / 1.20 = [ 60 + 8 ] / 1.20 = 56.67
Now we calculate β̂1 by subtracting this result from 75 (the score of the students who went to
private school) and we obtain β̂1 = 18.33 > β1 = 10.
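The arithmetic in part b can be checked with a short sketch. It assumes, as the weights 100% and 20% suggest, that the two original groups are the same size, so the dropouts add a weight of 0.20 to a control group of weight 1.00. The helper name `weighted_mean` is illustrative and not part of the problem set.

```python
def weighted_mean(scores, weights):
    """Mean of `scores` weighted by (possibly unnormalised) group shares."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

# Assumption: treatment and control groups start out with equal size (weight 1.00 each).
control_mean = weighted_mean([60, 40], [1.00, 0.20])  # original controls + dropouts
treated_mean = 75                                     # students who stayed in private school
beta1_hat = treated_mean - control_mean

print(round(control_mean, 2), round(beta1_hat, 2))  # 56.67 18.33
```

The estimate of 18.33 exceeds the true effect of 10, which is the upward bias described above.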
c. In this case we have a problem of attrition: some students (20%) whom we counted as part of the
treated group did not receive any treatment (private school). We can use an intent-to-treat (ITT)
approach by simply keeping this 20% in the treated group. In this case we expect a downwardly
biased β̂1rand, for two reasons: first, some students did not receive any private instruction even
though they were supposed to; second, the students who drop out of private school are probably the
weaker ones, so their scores lower the average of the treatment group.
In order to calculate β̂1rand, we find the mean score in the treatment group:
75*(80%) + 40*(20%) = 60 + 8 = 68
From this result, β̂1rand = 68 – 60 = 8. Since the true β1 is 10 points on a scale of 100, this
confirms that our estimate is biased downward.
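As a rough check of the intent-to-treat calculation in part c, the sketch below keeps the 20% of dropouts in the treatment group and compares the resulting mean with the control mean of 60. All variable names are illustrative.

```python
# Shares and scores taken from part c; variable names are illustrative.
share_stayed, share_dropped = 0.80, 0.20
score_private, score_dropout = 75, 40
control_mean = 60
true_beta1 = 10

# ITT: everyone who was assigned a voucher stays in the treatment group.
treated_mean_itt = share_stayed * score_private + share_dropped * score_dropout
beta1_rand = treated_mean_itt - control_mean

print(round(treated_mean_itt, 2), round(beta1_rand, 2))  # 68.0 8.0
print(beta1_rand < true_beta1)                            # True
```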
d. In this case, equation 1 also counts as part of the treatment group those low-income
students who managed to obtain a scholarship to a private school. Since they obtained a scholarship,
we can deduce that they are more likely to be high-scoring students, so we expect β̂1 to be
biased upward (only the low-scoring students remain in the control group).
e. In this example the students who attended private school without a voucher scored relatively low.
This pulls β̂1 down, so it will be biased downward.
Let’s check. We calculate the average score of the students who went to private schools:
[ 70*(100%) + 60*(20%) ] / 1.20 = [ 70 + 12 ] / 1.20 = 68.33
And so we find that β̂1 = 68.33 – 62.5 = 5.83 < β1 (downward bias).
f. In this case, with the ITT analysis, β̂1rand will again be biased downward. In the “public
school” part of the control group, the lowest-scoring students are no longer included (so the
average score of this subgroup is higher), and the students who went to private school without
a voucher would probably have scored even lower had they remained in public school (so the effect
on the average score of the control group is upward).
Let’s calculate the average score of the control group:
62.5*(80%) + 60*(20%) = 50 + 12 = 62
And so we have β̂1rand = 70 – 62 = 8 < β1.
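Parts e and f use the same numbers, so both estimates can be reproduced side by side. This is a minimal sketch that again assumes equal-sized original groups; `weighted_mean` and the variable names are illustrative.

```python
def weighted_mean(scores, weights):
    """Mean of `scores` weighted by (possibly unnormalised) group shares."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

true_beta1 = 10
voucher_private = 70     # voucher students, all in private school
control_public = 62.5    # control students who stayed in public school (80%)
control_private = 60     # control students who went private without a voucher (20%)

# Part e: compare students by the school they actually attended.
private_mean = weighted_mean([voucher_private, control_private], [1.00, 0.20])
beta1_hat = private_mean - control_public

# Part f (ITT): compare students by their original assignment.
control_mean_itt = weighted_mean([control_public, control_private], [0.80, 0.20])
beta1_rand = voucher_private - control_mean_itt

print(round(beta1_hat, 2), round(beta1_rand, 2))  # 5.83 8.0
```

Both values fall below the true effect of 10, in line with the downward bias found in parts e and f.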
g. From this analysis we can conclude that the ITT estimator is biased downward in every case
considered here. This happens because we can have an upward effect on the control group
(noncompliance) or a negative effect on the treatment group (attrition often reduces β̂1). On the
other hand, for equation 1 we do not know the direction of the bias ex ante and have to work it out
case by case.
2. Schools construction in Indonesia (Duflo, 2001)
a. We report the results of the regression.
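. reg educ Young Low Young_x_Low

      Source |       SS           df       MS      Number of obs   =    30,720
-------------+----------------------------------   F(3, 30716)     =    282.28
       Model |  12555.4369         3  4185.14564   Prob > F        =    0.0000
    Residual |  455397.438    30,716  14.8260658   R-squared       =    0.0268
-------------+----------------------------------   Adj R-squared   =    0.0267
       Total |  467952.875    30,719  15.2333369   Root MSE        =    3.8505

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Young |   .3749735   .0684179     5.48   0.000     .2408716    .5090754
         Low |    1.32188   .0619635    21.33   0.000     1.200429    1.443331
 Young_x_Low |  -.1171635   .0893285    -1.31   0.190     -.292251     .057924
       _cons |   8.539234   .0478549   178.44   0.000     8.445436    8.633031
------------------------------------------------------------------------------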
b. Here is the table, filled in terms of β0, β1, β2 , and β3:
Where:
➢ (A – E) indicates the value of having been young when the program was implemented in a
low-intensity region;
➢ (B – F) indicates the difference between having been young rather than old in 1974 in a
high-intensity region;
➢ (F – E) indicates the “added” value of being in a high-intensity region if you were young in 1974;
➢ (B – A) is the same as (F – E), but calculated for children who were old in 1974;
➢ (F – B) – (E – A), which can also be written as “(F – E) – (B – A)” (with a different
interpretation), can be seen as the real value of the program. It is the difference, between a
high-intensity and a low-intensity region, in the gap between having been young when the program
started and having been old. It therefore indicates the effect of the program on children who
were young in 1974, independently of whether they were in a high- or low-intensity region:
a difference-in-differences analysis.
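The difference-in-differences logic can be illustrated from the regression coefficients themselves. The sketch below assumes the model educ = β0 + β1·Young + β2·Low + β3·Young·Low and uses descriptive cell names rather than the table letters. It also assumes the interaction coefficient is negative (−0.1171635), which is the value consistent with the midpoint of its reported confidence interval.

```python
# Coefficients from the regression of educ on Young, Low and Young_x_Low.
# Assumption: the interaction coefficient is negative (see the lead-in).
b0, b1, b2, b3 = 8.539234, 0.3749735, 1.32188, -0.1171635

def cell_mean(young, low):
    """Predicted mean years of education for a (Young, Low) cell."""
    return b0 + b1 * young + b2 * low + b3 * young * low

diff_high = cell_mean(1, 0) - cell_mean(0, 0)  # young minus old, high-intensity: b1
diff_low = cell_mean(1, 1) - cell_mean(0, 1)   # young minus old, low-intensity: b1 + b3
did = diff_low - diff_high                     # difference-in-differences: b3

print(round(diff_high, 4), round(diff_low, 4), round(did, 4))  # 0.375 0.2578 -0.1172
```

Because the dummy is defined as Low rather than High, the double difference for high-intensity relative to low-intensity regions is −β3, that is, about +0.117 years under this sign assumption.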
d. β̂2 indicates the difference, for a child who was old in 1974 (when school construction
started), between having been in a low-intensity region and having been in a high-intensity one.
Since we expect the program to have had no effect on children who were old in 1974, and the
low-intensity regions had a higher level of education (in terms of years of schooling) than the
high-intensity regions, we expect this coefficient to be positive (as it is in the regression).
e. β̂1 indicates the effect of being young rather than old in 1974 in a high-intensity region. It
says that a young child tends to have more years of schooling than an old one when both are in a
high-intensity region. For this reason, we expect β̂1 to be positive (and in fact β̂1 = 0.3750,
statistically significant at the 1% level).
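The significance claim for β̂1 can be checked from the reported coefficient and standard error. This is only a large-sample sketch: it uses the normal critical values 1.96 and 2.576 as approximations, which is reasonable with 30,720 observations.

```python
coef, se = 0.3749735, 0.0684179  # "Young" row of the regression output
t_stat = coef / se

# Approximate 95% interval using the normal critical value 1.96.
ci_low, ci_high = coef - 1.96 * se, coef + 1.96 * se

print(round(t_stat, 2), round(ci_low, 4), round(ci_high, 4))  # 5.48 0.2409 0.5091
print(t_stat > 2.576)  # True: significant at the 1% level
```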
g. We can say that the high-enrollment regions were the lower-density regions in our previous figure
on slide 23 [1] and so, as the Problem Set 2 text asks, we have to suppose that the government of
Indonesia wants to improve the areas in which there is already a high level of enrollment (this
is how I interpret the statement that the government “targeted the program […] to districts which had
initially high enrollment, but parallel trends assumption still holds”).
Here is the new appropriate version of the figure on slide 23.
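Do-file
* Set the working directory and open a log
cd "\\Client\A$\Università\3_anno\Northwestern\COURSES\Economics_of_developing_countries\PS2"
log using MVasca.log, replace
use duflo_winter17.dta, clear
sum agegroup
* Drop age groups 2 and 4
drop if agegroup == 2
drop if agegroup == 4
* Young = 1 for age group 1; Low = 1 when high == 0
gen Young = (agegroup == 1)
gen Low = (high == 0)
gen Young_x_Low = Young * Low
* Regression reported in question 2a
reg educ Young Low Young_x_Low
log close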
[1] E. Duflo, in her paper (pp. 2, 9), writes that “the government’s effort [was] to allocate more
schools to the regions where initial enrollment was low” and also that “the program provision that
more schools would be built in lower enrollment regions is reflected in the differences between the
education in low and high program regions”.