14 - 382 - Pset - 5 (1) - Merged
382 Pset 5
Parker Whitfill
March 2024
1 Problem 1
a.
Following the notation from class, let
$$F_{j|k}(y) \equiv \int F_{Y_j|X_j}(y|x)\, dF_{X_k}(x), \quad y \in T,$$
where Y is log wage and X are the covariates. Then we can define exactly as in the notes
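As a hedged sketch of what such a definition typically looks like, assume $m$ indexes men and $w$ indexes women (labels introduced here for illustration, not taken from the notes). The observed gap can then be split by adding and subtracting a counterfactual distribution:

```latex
F_{m|m}(y) - F_{w|w}(y)
  = \underbrace{\left[ F_{m|m}(y) - F_{m|w}(y) \right]}_{\text{composition effect}}
  + \underbrace{\left[ F_{m|w}(y) - F_{w|w}(y) \right]}_{\text{structure (``discrimination'') effect}}
```

Here $F_{m|w}$ combines men's conditional wage distribution with women's covariates. Using $F_{w|m}$ as the counterfactual instead gives a different split of the same total gap, so the choice of reference group is itself an assumption.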
b.
No, because there may be omitted variables that are correlated with gender and that also
determine the distribution of wages. For example, suppose gender is correlated with hours worked.
If one gender works more hours and is therefore paid more in overall wages, there is no
discrimination in the traditional labor-market sense. However, this decomposition would load that
difference into the 'discrimination effect', since we do not have an X variable for hours worked.
c.
[Figure: Observed and Counterfactual Quantiles (90% CI); horizontal axis: quantile index.]
[Figure: Gender wage gap (90% CI); axis labels: quantile difference, probability.]
[Figure: histograms of women's log hourly wages and men's log hourly wages; vertical axis: frequency.]
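The counterfactual distribution $F_{j|k}$ from part a can also be computed by hand, which may help in reading the figures. The following is a minimal base-R sketch on simulated data, not the `discreteQ` implementation: it has no weights, no bootstrap bands, and a single made-up covariate. All variable names, sample sizes, and coefficients are assumptions chosen for illustration.

```r
# Minimal sketch with simulated data; every name and number here is illustrative.
set.seed(1)
n <- 2000
x_w <- rnorm(n, mean = 0.0)             # covariate for women
x_m <- rnorm(n, mean = 0.5)             # covariate for men (shifted distribution)
y_w <- 0.8 + 0.4 * x_w + rnorm(n)       # simulated log wage for women
y_m <- 1.0 + 0.5 * x_m + rnorm(n)       # simulated log wage for men

# Grid of wage thresholds at which the distributions are evaluated
ys <- quantile(c(y_w, y_m), probs = seq(0.1, 0.9, by = 0.1))

# Estimate F_{j|k}(y): fit P(Y_j <= y | X_j) by a logit at each threshold,
# then average the fitted probabilities over the covariates of group k.
cf_cdf <- function(y_j, x_j, x_k, grid) {
  sapply(grid, function(y) {
    fit <- glm(z ~ x, family = binomial(link = "logit"),
               data = data.frame(z = as.numeric(y_j <= y), x = x_j))
    mean(predict(fit, newdata = data.frame(x = x_k), type = "response"))
  })
}

F_mm <- cf_cdf(y_m, x_m, x_m, ys)  # men's wage structure, men's covariates
F_ww <- cf_cdf(y_w, x_w, x_w, ys)  # women's wage structure, women's covariates
F_mw <- cf_cdf(y_m, x_m, x_w, ys)  # counterfactual: men's structure, women's covariates

composition_eff <- F_mm - F_mw     # same wage structure, different covariates
structure_eff   <- F_mw - F_ww     # same covariates, different wage structure
total           <- F_mm - F_ww
print(round(cbind(total, structure_eff, composition_eff), 3))
```

The two effects add up to the total gap at every threshold by construction. Because each logit includes an intercept, `F_mm` and `F_ww` reproduce the empirical distribution functions of the two groups on the grid.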
In words, the potential outcomes depend only on this period's treatment, not on treatment in any
other period. Formally, we are assuming that $D_{ij}$ for $j < t$ is independent of the potential outcomes.
Then we can write
where the second line holds because $g(0)$, $g(1)$ do not depend on $D_{ij}$ for $j < t$ by the assumption I
made, and we have independence for $D_{it}$ by random assignment.
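A hedged sketch of the two-line argument this paragraph describes, assuming the potential outcome can be written as $Y_{it}(d) = g(d_t)$ (the function $g$ is the one named in the text; the exact display in the original solution may differ):

```latex
\begin{aligned}
E[Y_{it} \mid D_{it} = 1] - E[Y_{it} \mid D_{it} = 0]
  &= E[g(1) \mid D_{it} = 1] - E[g(0) \mid D_{it} = 0] \\
  &= E[g(1)] - E[g(0)] .
\end{aligned}
```

The last expression is the average contemporaneous effect of treatment in period $t$.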
b.
We can use the fixed effects estimator, since we proved in class that strict exogeneity is a sufficient
condition to estimate $\alpha$ via fixed effects.
c.
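Define the estimator $\hat\beta$
$$\begin{aligned}
\hat\beta &\equiv \frac{1}{T}\sum_{t=1}^{T} \Big( E[Y_{it} \mid D_{it} = 1] - E[Y_{it} \mid D_{it} = 0] \Big) \\
&= \frac{1}{T}\sum_{t=1}^{T} \Big( E[Y_{it}(D_{i1}, D_{i2}, \dots, D_{it}, \dots, D_{iT}) \mid D_{it} = 1] - E[Y_{it}(D_{i1}, D_{i2}, \dots, D_{it}, \dots, D_{iT}) \mid D_{it} = 0] \Big) \\
&= \frac{1}{T}\sum_{t=1}^{T} \Big( E[Y_{it}(D_{i1}, D_{i2}, \dots, 1, \dots, D_{iT})] - E[Y_{it}(D_{i1}, D_{i2}, \dots, 0, \dots, D_{iT})] \Big) \quad \text{by random assignment}
\end{aligned}$$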
d.
Suppose $Y_{it}(d_1, \dots, d_t, \dots, d_T) = g(d_{t-1}, d_t)$, so that the outcome depends on the previous period's treatment as well as the current one. Then
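A hedged illustration of how strict exogeneity can fail in an example of this kind, assuming a linear form for $g$ and the fixed effects model $Y_{it} = \alpha D_{it} + a_i + \epsilon_{it}$ (the coefficient $\gamma$, the effect $a_i$, and the shock $u_{it}$ are notation introduced here, not taken from the problem):

```latex
\begin{aligned}
g(d_{t-1}, d_t) &= \alpha d_t + \gamma d_{t-1} + a_i + u_{it},
  \qquad E[u_{it} \mid D_{i1}, \dots, D_{iT}, a_i] = 0, \\
\epsilon_{it} &= Y_{it} - \alpha D_{it} - a_i = \gamma D_{i,t-1} + u_{it}, \\
E[\epsilon_{it} \mid D_{i1}, \dots, D_{iT}, a_i] &= \gamma D_{i,t-1},
\end{aligned}
```

which is not identically zero when $\gamma \neq 0$. Treatment can be randomly assigned in every period and the error still depends on the treatment path, so strict exogeneity fails under this assumed form.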
e.
No, this is immediate from part d, since we produced an example where random assignment
holds but strict exogeneity does not. Moreover, we saw in class that with fixed effects we
have a full characterization of what is needed to estimate the causal effect. We need
f.
Expanding we have
Now it is no longer possible that $Y_{it}(d) = g(d_t)$, since the other treatment periods explicitly
appear in the equation for the outcome. However, we could proceed similarly as before, assuming
$D_{it}$ is randomly assigned in each period, with the additional assumption that $D_{ij}$ is
independent of $Y_{it}(d)$ for $j < t$. This is not guaranteed by random assignment of $D_{ij}$;
it is an assumption that the potential outcomes in a period do not depend on treatment in
previous periods. This also ensures that the predetermined-regressor condition from the textbook holds, since
$Y_{i,t-1}(D_{i,t-1}, D_{i,t-2}, \dots) \perp \epsilon_{it}$ is implied by the absence of serial correlation in $\epsilon$ and by
the fact that $D_{ij}$ for $j < t$ is independent of $Y_{it}(d)$ and thus of $\epsilon_{it}$. Finally, we can verify that $\alpha$ measures the
average contemporaneous effect by repeating the same logic as in part a, with one minor twist:
instead of the potential outcomes depending only on $D_{it}$ in a stochastic sense, the
potential outcomes depend only on $D_{it}$ in a mean sense, by the independence assumption we
are making.
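To see why earlier treatments enter the outcome equation, here is a hedged sketch that assumes the model in this part adds a lagged outcome, $Y_{it} = \alpha D_{it} + \rho Y_{i,t-1} + a_i + \epsilon_{it}$. The functional form and the coefficient name $\rho$ are assumptions made here for illustration. Substituting the same equation for $Y_{i,t-1}$ once gives:

```latex
\begin{aligned}
Y_{it} &= \alpha D_{it} + \rho \left( \alpha D_{i,t-1} + \rho Y_{i,t-2} + a_i + \epsilon_{i,t-1} \right) + a_i + \epsilon_{it} \\
       &= \alpha D_{it} + \rho \alpha D_{i,t-1} + \rho^2 Y_{i,t-2} + (1 + \rho) a_i + \epsilon_{it} + \rho \epsilon_{i,t-1} .
\end{aligned}
```

Under this assumed form, $D_{i,t-1}$ appears with coefficient $\rho\alpha$, and further substitution brings in $D_{i,t-2}, D_{i,t-3}, \dots$ with geometrically declining weights.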
g.
We are letting previous treatment periods have a causal effect on the current treatment,
which is why we need some restriction under which previous treatments do not change
selection into treatment in the current period.
## 14.382 PS5 Q1, Parker Whitfill worked with Kao, Yucheng, Ryo
# This empirical example uses the data from the CPS to illustrate the use of distribution regression and counterfactual
# distributions in the estimation of the gender gap
# Description of the data: the sample selection and variable construction follow
# Mulligan, Casey B., and Yona Rubinstein. "Selection, investment, and women's relative wages over time." The Quarterly Journal
# of Economics (2008): 1061-1110.
# Sample selection: white non-Hispanic, ages 25-54, working full time full year (35+ hours per week at least 50 weeks), exclude
# living in group quarters, self-employed, military, agricultural, and private household sector, allocated earning,
# inconsistent report on earnings and employment, missing data
############
#install.packages("devtools")
#devtools::install_github("bmelly/discreteQ")
####################
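rm(list = ls());
# Packages used below: xtable() for the summary table, discreteQ() for the estimation
library(xtable);
library(discreteQ);
# Assumes `data` has been loaded and attached, and that weight, reg, taus, ys, R, alpha and my_cl are already defined
# Descriptive statistics: weighted means for the full sample, for women, and for men
vars <- c("lnw","female","married","widowed","separated","divorced","nevermarried","lhs","hsg","sc","cg","ad","ne","mw","so","we","exp1");
options(digits=2);
dstats <- cbind(sapply(data[,vars], weighted.mean, weight),
                apply(data[female==1,vars], 2, weighted.mean, weight[female==1]),
                apply(data[female==0,vars], 2, weighted.mean, weight[female==0]));
xtable(dstats);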
# Estimation;
dr.fit <- discreteQ(y = lnw, d = 1-female, x = reg, w = weight, decomposition = TRUE, q.range = range(taus),
                    method = "logit", bsrep = R, alpha = alpha, ys = ys, cl = my_cl)
# Hints:
# Define the treatment (d) as male to obtain positive values for the gender gap, d=1-female
# The R command "discreteQ" will give you everything you want here in one step. Note that you want w=weight,
# decomposition = TRUE, q.range=range(taus), method="logit", bsrep=R, alpha=alpha, ys=ys, cl= my_cl
# Part IV: Make figures based on results
# Wide light lines mark the confidence bands; the second legend overlays the thin point-estimate lines
legend(quantile(ys, .01), 1, c(' ', ' ', ' '), col = c('light blue','light green','light grey'), lwd = c(4,4,4),
       horiz = FALSE, bty = 'n');
legend(quantile(ys, .01), 1, c('Observed women distribution','Observed men distribution', 'Counterfactual distribution'),
       col = c(4,'dark green','dark grey'), lwd = c(1,1,1), horiz = FALSE, bty = 'n');
dev.off();
par(mfrow=c(1,1));
# Same two-layer legend for the quantile plot; the color name 'dark green' must stay on one line
legend(min(taus), max(ys), c(' ', ' ', ' '), col = c('light blue','light green','light grey'), lwd = c(4,4,4),
       horiz = FALSE, bty = 'n');
legend(min(taus), max(ys), c('Observed women quantiles', 'Observed men quantiles', 'Counterfactual quantiles'),
       col = c(4,'dark green','dark grey'), lwd = c(1,1,1), horiz = FALSE, bty = 'n');
dev.off();
par(mfrow=c(3,1));
dev.off();