
Applied Economics: The Experimental Ideal

Philipp Ager
University of Mannheim and CEPR

Lecture 2 – Week 1, September 9, 2021

Introduction

The goal of program evaluation is to assess the causal effect
of public policy interventions. Examples:
Job training programs on earnings and employment
Class size on test scores
Minimum wage on employment
...

In addition, we may be interested in the effect of variables that
do not represent public policy interventions. Examples:
Immigration on wages
Natural disasters on economic growth
...

The Problem
How do we obtain a causal effect?

“Ideal scenario”: clone each treated individual and observe the
impact of the treatment on the outcomes of interest

What is the impact of giving Lisa a textbook on her test score?
Impact = Lisa’s score with a book − Lisa’s score without a book

In the real world, we observe Lisa either with or without a textbook
⇒ We never observe the counterfactual
The Problem
To measure the causal impact of giving Lisa a book on her test
score, we need to find a similar child that did not receive a book

“Causal impact” is the difference in test scores between the
treatment group and the comparison group
Impact = Lisa’s score with a book − Bart’s score without a book

As this example illustrates, finding a good comparison group
is hard ⇒ potential solution: quasi-experimental approaches
(more on that soon . . . )
Causality with Potential Outcomes

Goal: find the relationship between a treatment and some outcome
that may be affected by the treatment (e.g., wages)

Treatment indicator Di for unit i:

Di = 1 if unit i received the treatment
Di = 0 otherwise

For each individual, there are two potential outcomes:

Potential Outcome = Y0i if Di = 0
Potential Outcome = Y1i if Di = 1

Causality with Potential Outcomes

Outcome Yi : Observed outcome of interest for unit i

We can write the outcome in terms of potential outcomes:

Yi = Y1i Di + Y0i (1 − Di )

or equivalently: Yi = Y0i if Di = 0, and Yi = Y1i if Di = 1

Treatment Effect for unit i is: Y1i − Y0i

Fundamental problem of causal inference:

⇒ We cannot observe both potential outcomes (Y1i , Y0i ) for the same unit
(the simulation sketch below makes this concrete)
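In the simulated data below, both potential outcomes exist for every unit, but only one is ever observed (a minimal sketch; all numbers are my own illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Simulated "science table": both potential outcomes exist for every unit.
y0 = rng.normal(50, 10, n)      # Y0i: outcome without treatment
y1 = y0 + 5                     # Y1i: outcome with treatment (true effect = 5)
d = rng.integers(0, 2, n)       # treatment indicator Di

# In practice we only observe Yi = Y1i*Di + Y0i*(1 - Di).
y = np.where(d == 1, y1, y0)

for i in range(n):
    missing = y0[i] if d[i] == 1 else y1[i]
    print(f"unit {i}: D={d[i]}, observed Y={y[i]:.1f}, "
          f"unobserved counterfactual={missing:.1f}")
```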

Average Causal Effect

What we actually want to know is the average causal effect,
but that is not what we get from a difference-in-means comparison

Difference in group means = average causal effect of the program
on participants + selection bias

Even in a large sample:
People will choose to participate in a program when they expect
the program to make them better off (i.e., when Y1i − Y0i > 0)
Participants are likely to be different from those who choose not
to participate . . . even in the absence of the program

Selection Bias

Difference in Group Means

E[Yi |Di = 1] − E[Yi |Di = 0] = E[Y1i |Di = 1] − E[Y0i |Di = 1]

+ E[Y0i |Di = 1] − E[Y0i |Di = 0]

The left-hand side is the observed difference in means between
treated and non-treated individuals
The two terms on the right-hand side are the average
treatment effect on the treated (ATET) and the selection bias:
E[Y1i |Di = 1] − E[Y0i |Di = 1] = E[Y1i − Y0i |Di = 1] = αATET
E[Y0i |Di = 1] − E[Y0i |Di = 0] = Selection Bias
(the simulation sketch below illustrates this decomposition)
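This sketch is illustrative (the numbers and the selection rule are my own assumptions, not from the lecture): units self-select into treatment, so the naive difference in means equals the ATET plus a non-zero selection bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

y0 = rng.normal(50, 10, n)        # Y0i: potential outcome without treatment
gain = rng.normal(5, 10, n)       # individual treatment effect Y1i - Y0i
y1 = y0 + gain

# Self-selection: units participate when they expect to gain, and
# (illustratively) low-Y0 units are more likely to participate.
d = ((gain > 0) & (y0 < 55)).astype(int)

y = np.where(d == 1, y1, y0)      # observed outcome

diff_means = y[d == 1].mean() - y[d == 0].mean()
atet = (y1 - y0)[d == 1].mean()                     # E[Y1i - Y0i | Di = 1]
sel_bias = y0[d == 1].mean() - y0[d == 0].mean()    # E[Y0i|Di=1] - E[Y0i|Di=0]

print(f"difference in means: {diff_means:6.2f}")
print(f"ATET:                {atet:6.2f}")
print(f"selection bias:      {sel_bias:6.2f}")
print(f"ATET + bias:         {atet + sel_bias:6.2f}")  # equals the difference
```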

Random Assignment Solves the Selection Problem

Recall: Difference in Group Means = ATET + Selection Bias

Random assignment of units to the treatment forces the selection
bias to be zero

The treatment and control groups will tend to be similar along
all characteristics (including Y0 )

Random assignment of Di ⇒ expected outcomes are the
same in the absence of the program

Random Assignment Solves the Selection Problem

If treatment is randomly assigned (so that the potential outcomes
are independent of Di ), we have:

E[Y0i |Di = 1] = E[Y0i |Di = 0] = E[Y0i ]

Difference in Group Means

= E[Yi |Di = 1] − E[Yi |Di = 0]

= E[Y1i |Di = 1] − E[Y0i |Di = 0]

= E[Y1i |Di = 1] − E[Y0i |Di = 1] + E[Y0i |Di = 1] − E[Y0i |Di = 0]

= E[Y1i |Di = 1] − E[Y0i |Di = 1] + E[Y0i ] − E[Y0i ]

= E[Y1i ] − E[Y0i ] (using that Y1i is also independent of Di )

The difference in means is then αATE , the average treatment effect
(see the simulation sketch below)
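Continuing the same illustrative sketch: assigning Di by coin flip makes Di independent of the potential outcomes, so the difference in means recovers the ATE (again, all numbers are my own assumptions, not lecture material).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

y0 = rng.normal(50, 10, n)
y1 = y0 + rng.normal(5, 10, n)      # heterogeneous effects, true ATE = 5

d = rng.integers(0, 2, n)           # random assignment: independent of (Y0, Y1)
y = np.where(d == 1, y1, y0)

diff_means = y[d == 1].mean() - y[d == 0].mean()
print(f"difference in means: {diff_means:.2f}")        # close to 5
print(f"true ATE:            {(y1 - y0).mean():.2f}")  # selection bias is gone
```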

Example – Randomized Experiment on Class Size
Many studies of education production using non-experimental
data suggest there is little or no link between class size and
student learning
Krueger (1999, QJE) provides an econometric analysis of the
Tennessee STAR experiment on class size
Project STAR was a longitudinal study in which kindergarten
students and their teachers were randomly assigned to one of
three groups beginning in the 1985-1986 school year
Small classes (13-17 students per teacher), regular classes
(22-25 students), and regular/aide classes (22-25 students)
which also included a full-time teacher’s aide
Random assignment took place within schools (80 schools
participated, 11,600 students over 4 years)
Example – Randomized Experiment on Class Size

First Check
Does randomization successfully balance observables across
different treatment groups?

Common: compare pre-treatment outcomes or other covariates
across groups, as sketched below
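A minimal sketch of such a balance check in Python (the variables and data are hypothetical stand-ins, not the actual STAR data): compare covariate means across assignment groups and test the differences.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical experimental data: 'small' is the assignment indicator;
# 'free_lunch' and 'age' stand in for pre-treatment covariates.
rng = np.random.default_rng(3)
n = 2_000
df = pd.DataFrame({
    "small": rng.integers(0, 2, n),
    "free_lunch": rng.integers(0, 2, n),
    "age": rng.normal(5.5, 0.3, n),
})

for cov in ["free_lunch", "age"]:
    treated = df.loc[df["small"] == 1, cov]
    control = df.loc[df["small"] == 0, cov]
    t_stat, p_val = stats.ttest_ind(treated, control)
    print(f"{cov}: treated mean {treated.mean():.3f}, "
          f"control mean {control.mean():.3f}, p-value {p_val:.2f}")
```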

Example – Randomized Experiment on Class Size

Source: Krueger (1999, Table 1)

No or only small differences in the fraction of students on free lunch,
the racial mix, and the average age of students across class-size groups
Example – Randomized Experiment on Class Size
Randomization eliminates selection bias: Difference in group means
captures the (unbiased) average causal effect of class size

School fixed effects (FE) are necessary because randomization was done
within schools (a sketch of such a specification follows below)

Source: Krueger (1999, Table 5)
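A hedged sketch of a within-school specification (hypothetical variable names and simulated data; Krueger's actual specification differs in its details): school fixed effects enter as one dummy per school via the C(school) term in statsmodels' formula interface.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_schools, per_school = 20, 100
school = np.repeat(np.arange(n_schools), per_school)

# School-level differences in outcomes: the reason FE are needed
# when randomization happens within schools.
school_effect = rng.normal(0, 5, n_schools)[school]
small = rng.integers(0, 2, n_schools * per_school)   # randomized within schools
score = (50 + 5 * small + school_effect
         + rng.normal(0, 10, n_schools * per_school))

df = pd.DataFrame({"score": score, "small": small, "school": school})

# OLS with school fixed effects (one dummy per school).
res = smf.ols("score ~ small + C(school)", data=df).fit()
print(f"small-class effect: {res.params['small']:.2f}")  # close to the true 5
```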

What about adding controls in this case?


Regression Analysis of Experiments

Useful tool for the study of causal questions
(much more on that later)

Assume the treatment effect is the same for everyone, i.e.,
Y1i − Y0i = ρ. Then we can write:

Yi = α + ρDi + ηi

Constant: α = E(Y0i )
Treatment effect: ρ = Y1i − Y0i
Random part of Y0i : ηi = Y0i − E(Y0i )
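In code, estimating ρ is a single OLS regression of Yi on Di. A sketch with simulated data (illustrative parameter values of my own; uses statsmodels' formula API):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 5_000

alpha, rho = 50.0, 5.0                 # alpha = E(Y0i), constant effect rho
d = rng.integers(0, 2, n)              # randomly assigned treatment
eta = rng.normal(0, 10, n)             # eta_i = Y0i - E(Y0i)
y = alpha + rho * d + eta              # Yi = alpha + rho*Di + eta_i

res = smf.ols("y ~ d", data=pd.DataFrame({"y": y, "d": d})).fit()
print(res.params)   # Intercept ~ 50, coefficient on d ~ 5
```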

Regression Analysis of Experiments
Let’s evaluate the conditional expectation of this equation with
treatment status switched on and off:

E[Yi |Di = 1] = α + ρ + E[ηi |Di = 1]

E[Yi |Di = 0] = α + E[ηi |Di = 0]

Hence . . .

E[Yi |Di = 1] − E[Yi |Di = 0] = ρ + E[ηi |Di = 1] − E[ηi |Di = 0]

Selection bias amounts to the correlation between ηi and Di ,
reflecting the difference in (no-treatment) potential outcomes
between those who get treated and those who don’t

⇒ E[Y0i |Di = 1] − E[Y0i |Di = 0]


Regression Analysis of Experiments

In Krueger (1999), where Di is randomly assigned (within schools),
the selection term disappears, and a regression of Yi on Di
estimates the causal effect of interest, ρ

If other student characteristics, call them Xi , are uncorrelated
with the treatment Di , then they will not affect the estimate of ρ

Yi = α + ρDi + Xi′ γ + ηi

Yet, adding controls might help to obtain more precise estimates
of the treatment effect (see the sketch below)
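A sketch of that precision point with simulated data (hypothetical variable names and values of my own): a covariate that predicts the outcome but is independent of the random assignment leaves the estimate of ρ roughly unchanged while shrinking its standard error.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 5_000

x = rng.normal(0, 1, n)            # pre-treatment covariate
d = rng.integers(0, 2, n)          # random assignment, independent of x
y = 50 + 5 * d + 8 * x + rng.normal(0, 5, n)

df = pd.DataFrame({"y": y, "d": d, "x": x})
short = smf.ols("y ~ d", data=df).fit()
long = smf.ols("y ~ d + x", data=df).fit()

print(f"without controls: rho_hat = {short.params['d']:.2f} "
      f"(se {short.bse['d']:.3f})")
print(f"with controls:    rho_hat = {long.params['d']:.2f} "
      f"(se {long.bse['d']:.3f})")
```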

Regression Analysis of Experiments

Adding student and teacher characteristics leaves the estimated
effect of small classes on student achievement essentially unchanged

Source: Krueger (1999, Table 5)

Threats to the Validity of Randomized Experiments

Internal validity: can we estimate the treatment effect for our
particular sample?
Fails when there are differences between treated and controls
(other than the treatment itself) that affect the outcome and that
we cannot control for

External validity: can we extrapolate our estimates to other
populations?
Fails when the treatment effect is different outside the evaluation
environment

Threats to the Validity of Randomized Experiments

Most common threats to Internal Validity:
Failure of randomization
Non-compliance with experimental protocol
Attrition

Most common threats to External Validity:
Non-representative sample
Non-representative program

Outlook

General issue: randomized trials are often not feasible
(non-compliance, high costs, rejection on ethical grounds)

Potential remedy: find natural or quasi-experiments that mimic
a randomized trial
Quasi-experimental approaches:
Difference-in-Differences
Instrumental Variables
Regression Discontinuity

