
14.382 Pset 5
Parker Whitfill
March 2024

1 Problem 1
a.
Following the notation from class, let

\[
F_{j|k}(y) \equiv \int F_{Y_j \mid X_j}(y \mid x)\, dF_{X_k}(x), \qquad y \in T,
\]

where Y is log wage and X are the covariates. Then we can define exactly as in the notes

\[
F_{m|m} - F_{w|w} = \underbrace{\left(F_{m|m} - F_{m|w}\right)}_{\text{composition effect}} + \underbrace{\left(F_{m|w} - F_{w|w}\right)}_{\text{discrimination effect}}
\]
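
As a rough illustration of this decomposition, here is a minimal Python sketch. It is not the code used for the figures in part c. It assumes a single binary covariate (a hypothetical "college" indicator), simulated log wages with made-up parameters, and a cell-by-cell empirical CDF in place of the conditional distribution model from class. The names `ecdf`, `counterfactual_cdf`, and `simulate` are introduced only for this example.

```python
import random

def ecdf(values, y):
    """Empirical CDF of `values` evaluated at y."""
    return sum(v <= y for v in values) / len(values)

def counterfactual_cdf(y, wages_j, x_j, x_k):
    """Estimate F_{j|k}(y) with one discrete covariate.

    Group j's conditional wage CDF is averaged over group k's covariate
    cells, so the integral in the definition becomes a weighted sum.
    Assumes every cell observed in x_k is also observed in x_j.
    """
    total = 0.0
    for cell in set(x_k):
        weight = x_k.count(cell) / len(x_k)  # share of group k in this cell
        wages_in_cell = [w for w, x in zip(wages_j, x_j) if x == cell]
        total += weight * ecdf(wages_in_cell, y)
    return total

def simulate(rng, n, p_college, premium):
    """Hypothetical data: log wage = 2.5 + 0.5*college + premium + noise."""
    x = [1 if rng.random() < p_college else 0 for _ in range(n)]
    w = [2.5 + 0.5 * xi + premium + rng.gauss(0, 0.3) for xi in x]
    return w, x

rng = random.Random(0)
w_m, x_m = simulate(rng, 5000, p_college=0.5, premium=0.2)  # men (assumed)
w_w, x_w = simulate(rng, 5000, p_college=0.4, premium=0.0)  # women (assumed)

y = 3.0
F_mm = counterfactual_cdf(y, w_m, x_m, x_m)  # observed men
F_mw = counterfactual_cdf(y, w_m, x_m, x_w)  # men's wage structure, women's covariates
F_ww = counterfactual_cdf(y, w_w, x_w, x_w)  # observed women

gap = F_mm - F_ww
composition = F_mm - F_mw
discrimination = F_mw - F_ww
print(gap, composition, discrimination)
```

The two components sum to the gap by construction. At a fixed y, a negative difference in CDFs means the first distribution puts less mass below y, so in this simulated example men's wages lie to the right of women's.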

b.
No, because there may be omitted variables that are correlated with gender and that help determine the distribution of wages. For example, suppose gender is correlated with hours worked. If one gender earns more in overall wages only because it works more hours, then in the traditional labor market sense there is no discrimination. However, this decomposition would load that difference into the 'discrimination effect', since we do not have an X variable for hours worked.

c.

Figure 1: Observed and Counterfactual Distributions (90% CI). [Plot: observed women distribution, observed men distribution, and counterfactual distribution; x-axis: log of hourly wage; y-axis: probability.]

Figure 2: Observed and Counterfactual Quantiles (90% CI). [Plot: observed women quantiles, observed men quantiles, and counterfactual quantiles; x-axis: quantile index; y-axis: log of hourly wage.]

Figure 3: Gender Wage Gap Decomposition (90% CI). [Three panels: Gender wage gap, Discrimination, Composition; x-axis: probability; y-axis: quantile difference.]

Figure 4: Women vs Men Log Hourly Wages. [Two histograms: women log hourly wages and men log hourly wages; x-axis: log of hourly wage; y-axis: frequency.]


2 Problem 2
a.
Here is a sufficient condition. Suppose that for all $i, t$

\[
Y_{it}(D_{i1}, D_{i2}, \dots, d_{it}, \dots, D_{iT}) = g(d_{it})
\]

In words, the potential outcomes depend only on the current period's treatment, not on treatment in any other period.
Formally, we are assuming that $D_{ij}$ for $j < t$ is independent of the potential outcomes.
Then we can write

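\[
Y_{it} = g(0) + [g(1) - g(0)] D_{it}
\]

\[
Y_{it} = E[g(0)] + E[g(1) - g(0)] D_{it} + \epsilon_{it}
\]

where

\[
\epsilon_{it} \equiv g(0) - E[g(0)] + \big[ g(1) - g(0) - E[g(1) - g(0)] \big] D_{it}
\]

We can verify that strict exogeneity holds for all $j < t$:

\begin{align*}
E[D_{ij} \epsilon_{it}] &= E\big[D_{ij} \big( g(0) - E[g(0)] \big)\big] + E\big[D_{ij} D_{it} \big( g(1) - g(0) - E[g(1) - g(0)] \big)\big] \\
&= E[D_{ij}] \, E\big[g(0) - E[g(0)]\big] + E[D_{it} D_{ij}] \, E\big[g(1) - g(0) - E[g(1) - g(0)]\big] \\
&= 0
\end{align*}

where the second line holds because $g(0)$ and $g(1)$ do not depend on $D_{ij}$ for $j < t$ by the assumption I made, and because $D_{it}$ is independent of them by random assignment. The last line follows because each remaining expectation is that of a mean-zero deviation.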
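
The orthogonality condition can be checked numerically. The following Python sketch is a simulation under assumptions that are mine, not the problem's: unit-level $g(0)$ and $g(1) - g(0)$ are drawn as independent normals with known means, and both treatments are independent fair coin flips. It computes the sample analogue of $E[D_{ij} \epsilon_{it}]$ for one $j < t$.

```python
import random

rng = random.Random(1)
n = 100_000
# E[g(0)] and E[g(1) - g(0)]; values chosen arbitrarily for the simulation
MEAN_G0, MEAN_EFFECT = 1.0, 2.0

cross_moment = 0.0
for _ in range(n):
    g0 = rng.gauss(MEAN_G0, 1.0)             # unit's untreated outcome g(0)
    effect = rng.gauss(MEAN_EFFECT, 1.0)     # unit's effect g(1) - g(0)
    d_prev = 1 if rng.random() < 0.5 else 0  # D_ij for some j < t
    d_now = 1 if rng.random() < 0.5 else 0   # D_it, assigned independently
    # Error term from the regression representation of Y_it
    eps = (g0 - MEAN_G0) + (effect - MEAN_EFFECT) * d_now
    cross_moment += d_prev * eps
cross_moment /= n
print(cross_moment)  # sample analogue of E[D_ij * eps_it], close to zero
```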

b.
We can use the fixed effects estimator, since we proved in class that strict exogeneity is a sufficient condition to estimate $\alpha$ via fixed effects.

c.
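Define the estimator $\hat{\beta}$:

\begin{align*}
\hat{\beta} &\equiv \frac{1}{T} \sum_{t=1}^{T} \Big( E[Y_{it} \mid D_{it} = 1] - E[Y_{it} \mid D_{it} = 0] \Big) \\
&= \frac{1}{T} \sum_{t=1}^{T} \Big( E[Y_{it}(D_{i1}, D_{i2}, \dots, D_{it}, \dots, D_{iT}) \mid D_{it} = 1] - E[Y_{it}(D_{i1}, D_{i2}, \dots, D_{it}, \dots, D_{iT}) \mid D_{it} = 0] \Big) \\
&= \frac{1}{T} \sum_{t=1}^{T} \Big( E[Y_{it}(D_{i1}, D_{i2}, \dots, 1, \dots, D_{iT})] - E[Y_{it}(D_{i1}, D_{i2}, \dots, 0, \dots, D_{iT})] \Big) && \text{by random assignment}
\end{align*}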
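
To see the period-by-period difference in means at work, here is a small Python simulation. It is only a sketch: the outcome equation, the carryover term, and all parameter values are assumptions made for the example, with treatment assigned by an independent fair coin flip in every period. The carryover term is included to show that the difference in means does not need the restriction from part a.

```python
import random

rng = random.Random(2)
n, T = 50_000, 3
MEAN_EFFECT, CARRYOVER = 2.0, 1.0  # assumed average effect and carryover

sums = [[0.0, 0.0] for _ in range(T)]  # sums[t][d]: sum of Y_it among D_it = d
counts = [[0, 0] for _ in range(T)]    # counts[t][d]: number of units with D_it = d
for _ in range(n):
    a = rng.gauss(0.0, 1.0)            # unit effect
    tau = rng.gauss(MEAN_EFFECT, 1.0)  # unit's contemporaneous effect
    d_prev = 0                         # no treatment before the first period
    for t in range(T):
        d = 1 if rng.random() < 0.5 else 0
        y = a + tau * d + CARRYOVER * d_prev  # hypothetical outcome equation
        sums[t][d] += y
        counts[t][d] += 1
        d_prev = d

# Average over periods of the treated-minus-untreated difference in means
beta_hat = sum(
    sums[t][1] / counts[t][1] - sums[t][0] / counts[t][0] for t in range(T)
) / T
print(beta_hat)  # close to MEAN_EFFECT
```

Because $D_{i,t-1}$ is independent of $D_{it}$, the carryover term has the same mean in both treatment arms and cancels in the difference.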

d.
Suppose $Y_{it}(d_1, \dots, d_t, \dots, d_T) = g(d_{t-1}, d_t)$, so the outcome depends on the current and the previous period's treatment. Then

\begin{align*}
Y_{it} &= g(D_{i,t-1}, 0) + \big[ g(D_{i,t-1}, 1) - g(D_{i,t-1}, 0) \big] D_{it} \\
&= E[g(D_{i,t-1}, 0)] + E\big[ g(D_{i,t-1}, 1) - g(D_{i,t-1}, 0) \big] D_{it} + \epsilon_{it}
\end{align*}

where

\[
\epsilon_{it} = g(D_{i,t-1}, 0) - E[g(D_{i,t-1}, 0)] + \Big[ g(D_{i,t-1}, 1) - g(D_{i,t-1}, 0) - E\big[ g(D_{i,t-1}, 1) - g(D_{i,t-1}, 0) \big] \Big] D_{it}.
\]

Strict exogeneity requires that $E[D_{ij} \epsilon_{it}] = 0$ for all $j \le t$. But in this case the condition need not hold for $j = t - 1$, because $\epsilon_{it}$ depends explicitly on $D_{i,t-1}$, so we may not have orthogonality.
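
A concrete case makes the failure visible. The Python sketch below assumes a hypothetical additive form $g(d_{t-1}, d_t) = a + \tau d_t + \gamma d_{t-1}$ with constant $\tau$ and fair-coin assignment; none of these choices come from the problem. Under them the treatment effect term in $\epsilon_{it}$ drops out, $\epsilon_{it} = a - E[a] + \gamma (D_{i,t-1} - 1/2)$, and $E[D_{i,t-1} \epsilon_{it}] = \gamma / 4 \neq 0$.

```python
import random

rng = random.Random(3)
n = 100_000
CARRYOVER = 1.0  # gamma in the assumed form g(d_prev, d) = a + tau*d + gamma*d_prev

cross_moment = 0.0
for _ in range(n):
    a = rng.gauss(0.0, 1.0)                  # unit effect with mean zero
    d_prev = 1 if rng.random() < 0.5 else 0  # D_{i,t-1}, randomly assigned
    # g(d_prev, 0) = a + gamma*d_prev has mean gamma/2; the effect tau is
    # constant, so its deviation from the mean contributes nothing to eps.
    eps = a + CARRYOVER * (d_prev - 0.5)
    cross_moment += d_prev * eps
cross_moment /= n
print(cross_moment)  # sample analogue of E[D_{i,t-1} * eps_it], near gamma / 4
```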

e.
No. This is immediate from part d, since we produced an example where random assignment holds but strict exogeneity does not. Moreover, we saw in class that with fixed effects we have a full characterization of what is needed to estimate the causal effect. We need

f.
Expanding we have

\[
Y_{it} = \alpha_i (1 + p + p^2 + \dots + p^t) + \alpha (D_{it} + D_{i,t-1} p + D_{i,t-2} p^2 + \dots + D_{i,1} p^t) + (\epsilon_{it} + \epsilon_{i,t-1} p + \epsilon_{i,t-2} p^2 + \dots + \epsilon_{i,1} p^t)
\]

Now it is no longer possible that $Y_{it}(d) = g(d_t)$, since the other treatment periods explicitly appear in the equation for the outcome. However, we can proceed similarly to before. We assume $D_{it}$ is randomly assigned in each period, and we add the assumption that $D_{ij}$ is independent of $Y_{it}(d)$ for $j < t$. This is not guaranteed by random assignment of $D_{ij}$. It is an assumption that the potential outcomes in a period do not depend on treatment in previous periods.

This also ensures that the predetermined regressor condition from the textbook holds: $Y_{i,t-1}(D_{i,t-1}, D_{i,t-2}, \dots) \perp \epsilon_{it}$ is implied by the absence of serial correlation in $\epsilon$ together with the fact that $D_{ij}$ for $j < t$ is independent of $Y_{it}(d)$ and thus of $\epsilon_{it}$.

Finally, we can verify that $\alpha$ measures the average contemporaneous effect by repeating the logic of part a with one minor twist. In part a the potential outcomes depended only on $D_{it}$, even in a stochastic sense. Here, by the independence assumption we are making, the potential outcomes depend only on $D_{it}$ in the mean.

g.
We are letting previous treatment periods have a causal effect on the current treatment, which is why we need some restriction under which previous treatments do not change selection into treatment in the current period.
