
Quasi-Experimental Methods

Rema Hanna
Associate Professor of Public Policy
Evidence for Policy Design, Harvard Kennedy School
August 29, 2012

Lecture Overview
1. Quasi-Experimental Methods
   a) Regression Discontinuity
   b) Matching
   c) Difference-in-Differences
2. Difference-in-Differences
   a) Strengths
   b) Threats
   c) Validity Tests
3. Example: School Construction, Indonesia

Quasi-Experimental Methods
Quasi-experimental methods are alternative evaluation methods that:
- Aim to find "natural experiments" in existing data
- Are not as rigorous as randomization

Quasi-Experimental Methods
1. Regression Discontinuity
2. Matching
3. Difference-in-Differences

1. Regression Discontinuity
The program is assigned using a cutoff score, and participants and non-participants are compared while controlling for the cutoff criterion.
Compare people who just met the eligibility criteria with those who were just excluded.
- Assumption 1: There is a discontinuity in program participation at some eligibility threshold.
- Assumption 2: There are no discontinuities at that point in any other factor that affects the outcome of interest.

Regression Discontinuity

[Figure: eligibility by land size. Households on one side of the land-size cutoff obtain microcredit; those on the other side do not. The impact is estimated from the discontinuity at the cutoff.]

Local Average Treatment Effect


Regression discontinuity gives an estimate of the program impact for people near the cutoff point.

Can answer: Should the program cutoff be expanded at the margin?
Cannot answer: Should the program exist?

Strengths & Challenges


Strengths:
1. Takes advantage of existing program rules
2. Yields unbiased estimates of impact in the vicinity of the cutoff
Challenges:
1. Cannot tell you program impact for those far from the cutoff
2. Need the cutoff to be strictly enforced
3. Need a large sample near the cutoff
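A minimal sketch of how such an estimate can be computed in Python (not part of the lecture; the cutoff value, bandwidth, variable names, and simulated data are illustrative assumptions): restrict to observations near the threshold, allow a separate slope on each side, and read the impact off the jump at the cutoff.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: 'land' is the running variable; households below the
# (assumed) cutoff are eligible for microcredit. All values are simulated.
rng = np.random.default_rng(0)
df = pd.DataFrame({"land": rng.uniform(0, 4, 2000)})
CUTOFF, BANDWIDTH = 2.0, 0.5                       # assumed cutoff and window
df["treated"] = (df["land"] < CUTOFF).astype(int)  # sharp eligibility rule
df["outcome"] = 5 + 0.8 * df["treated"] + 0.3 * df["land"] + rng.normal(0, 1, len(df))

# Keep observations near the cutoff and allow different slopes on each side.
near = df[(df["land"] - CUTOFF).abs() <= BANDWIDTH].copy()
near["dist"] = near["land"] - CUTOFF
fit = smf.ols("outcome ~ treated + dist + treated:dist", data=near).fit()

# The coefficient on 'treated' is the estimated jump at the cutoff:
# the local average treatment effect for households near the threshold.
print(fit.params["treated"])
```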

2. Matching
Construct a comparison group that is similar on observable characteristics, especially those factors that could influence the likelihood of participating in the intervention.
Use available baseline data on participants and non-participants, for example census or household survey data.

Matching
Vote 2002 Campaign
- Have background characteristics on the intervention group
- Have data on 2,000,000 unaffected individuals

Matching

Source: Arceneaux, Gerber, and Green (2004)

Matching
Estimation Method: Compare outcomes for those who are similar
Strengths:
1. Equivalent on observable characteristics
Challenges:
1. Unobservable characteristics
2. Requires lots of data (expensive)
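A minimal sketch of one-to-one nearest-neighbour matching in Python, assuming illustrative simulated covariates and outcomes (the variable names and data are not from the lecture): for each intervention individual, find the most similar comparison individual on standardized observables and compare mean outcomes.

```python
import numpy as np

# Illustrative data: observable characteristics for an intervention group
# and a much larger pool of unaffected individuals (all simulated).
rng = np.random.default_rng(1)
X_treat = rng.normal(0, 1, size=(500, 3))      # covariates, intervention group
X_pool = rng.normal(0.2, 1, size=(50_000, 3))  # covariates, comparison pool
y_treat = rng.normal(1.0, 1, size=500)         # outcomes, intervention group
y_pool = rng.normal(0.7, 1, size=50_000)       # outcomes, comparison pool

# Standardize covariates so each characteristic gets comparable weight.
mu, sd = X_pool.mean(axis=0), X_pool.std(axis=0)
Zt, Zp = (X_treat - mu) / sd, (X_pool - mu) / sd

# For each treated unit, pick the comparison unit with the smallest
# Euclidean distance in covariate space (1-to-1 matching with replacement).
match_idx = np.array([np.argmin(((Zp - z) ** 2).sum(axis=1)) for z in Zt])

# Matching estimate: mean outcome gap between treated units and their matches.
print(y_treat.mean() - y_pool[match_idx].mean())
```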

3. Difference-in-Differences
Compare the change in outcomes for the intervention group with the change in outcomes for the comparison group.

Start by reviewing:
- Pre-post
- Simple difference

Review: Pre-Post
Compares what happened before and after the intervention group received the program.
Assumption: The outcome variable would have remained constant in the absence of the program.
Source of Bias: We don't know what the change in outcome would have been without the program.

Pre-Post: Impact Appears Positive

[Figure: primary outcome of the intervention group before and after the intervention; the outcome is higher after the program, so the impact appears positive.]

But, Outcome Already Improving

[Figure: the intervention group's outcome was already trending upward before the intervention, so the pre-post comparison overstates the program's impact.]

Review: Simple Difference


Measures the difference in outcomes between groups that did and did not receive the program.
Assumption: The intervention and comparison groups would have the same outcome in the absence of the program.
Source of Bias: The two groups may be systematically different.

Impact Appears Positive

[Figure: primary outcomes of the intervention and comparison groups after the intervention; the intervention group's outcome is higher, so the impact appears positive.]

But, Two Groups Always Different

[Figure: the two groups differ even before the intervention, so the simple difference reflects pre-existing differences as well as any program impact.]

Difference-in-Differences
Combines Pre-Post and Simple Difference by comparing change in outcome over time between intervention and comparison groups.

Simple Difference
Biased because groups may be systematically different.
          Intervention   Comparison   Difference
After     P2             C2           P2 - C2
Before

Difference-in-Differences
Subtract out systematic differences using pre-intervention data: (P2 - C2) - (P1 - C1)
          Intervention   Comparison   Difference
After     P2             C2           P2 - C2
Before    P1             C1           P1 - C1

Pre-Post
Pre-post is biased because we do not know the trend in the outcome.
             Intervention   Comparison
After        P2
Before       P1
Difference   P2 - P1

Difference-in-Differences
Subtract out trend from comparison group: (P2 - P1) - (C2 - C1)
             Intervention   Comparison
After        P2             C2
Before       P1             C1
Difference   P2 - P1        C2 - C1

Graphically:
[Figure: primary outcome over time for the intervention group (P1 to P2) and the comparison group (C1 to C2), before and after the intervention.]

D-in-D = (P2 - C2) - (P1 - C1)
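Equivalently, the estimate can be recovered from a regression with a group dummy, a post-period dummy, and their interaction; the interaction coefficient equals (P2 - P1) - (C2 - C1). A minimal sketch in Python with illustrative numbers (the data and variable names are assumptions, not from the lecture):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: two observations per group-period cell.
# treat = 1 for the intervention group, post = 1 for the post-program period.
df = pd.DataFrame({
    "treat":   [1, 1, 1, 1, 0, 0, 0, 0],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],
    "outcome": [10.0, 10.2, 14.1, 13.9, 9.0, 9.2, 11.1, 10.9],
})

# Cell means: P1 = 10.1, P2 = 14.0, C1 = 9.1, C2 = 11.0, so the
# difference-in-differences is (14.0 - 10.1) - (11.0 - 9.1) = 2.0.
fit = smf.ols("outcome ~ treat + post + treat:post", data=df).fit()

# The coefficient on the interaction term is the D-in-D estimate.
print(fit.params["treat:post"])
```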

When Can D-in-D be used?


1. Policies where some cities or regions are affected, but not others
2. Policies that affected certain population groups, but not others
Data requirements: Data on outcomes for areas both with and without the program, both before and after the program. This is a larger data requirement than for randomization, which only requires post-intervention data.

Strengths
Eliminates Common Shocks: Removes bias from shocks that affect both the program and comparison groups in the same way, e.g. a change in national financial regulation.
Removes Systematic Differences: Subtracts out fixed differences in characteristics between the program and comparison groups that persist over time, e.g. if the program group always has a larger proportion of small business owners than the comparison group.

Remaining Source of Bias


Shocks that differentially affect intervention and comparison groups.

Parallel Trends Assumption

Requires that, in the absence of the program, the outcome variable would look similar in both intervention and comparison groups over time.

Trends the Same Prior to Program


[Figure: outcomes of the intervention group (P1 to P2) and comparison group (C1 to C2) move in parallel prior to the program.]

Parallel Trends - Violations


1. An unrelated program or policy that affects the outcomes in either the intervention or comparison group. Example: another microfinance provider moves into the area of the comparison group.
2. Underlying differences in trends between the two groups. Example: one region has a faster rate of growth.
3. Random variation over time. Example: the rate of new business creation fluctuates, high in some periods and low in others.

Different Underlying Trends

[Figure: the intervention and comparison groups follow different underlying trends even in the absence of the program, so the parallel trends assumption is violated.]

Mean Reversion

[Figure: a group selected when its outcome was unusually low (or high) tends to drift back toward the mean over time, which can be mistaken for a program effect.]

Validity Tests
It is impossible to prove the validity of the parallel trends assumption. However, several tests can be used to assess it.
1. Check pre-intervention trends
2. Placebo intervention group
3. Placebo outcome

1. Pre-Program Trends
If the outcomes of the intervention and comparison groups moved in parallel before the program started, it is more likely that they would have continued to move in parallel in the absence of the program.

1. Pre-Program Trends
[Figure: primary outcome over time for both groups. Are the trends parallel before the intervention period?]
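A minimal sketch of this check in Python, assuming an illustrative panel of group-level means over several pre-intervention periods (the data and variable names are not from the lecture): estimate each group's pre-program slope and compare them.

```python
import numpy as np
import pandas as pd

# Illustrative data: average outcome by group for three pre-intervention periods.
pre = pd.DataFrame({
    "period":       [1, 2, 3, 1, 2, 3],
    "group":        ["intervention"] * 3 + ["comparison"] * 3,
    "mean_outcome": [10.0, 10.5, 11.0, 9.0, 9.4, 10.1],
})

# Fit a line to each group's pre-period means; similar slopes support
# the parallel trends assumption.
slopes = {
    g: np.polyfit(d["period"], d["mean_outcome"], deg=1)[0]
    for g, d in pre.groupby("group")
}
print(slopes)  # e.g. {'comparison': 0.55, 'intervention': 0.5}
```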

2. Placebo Intervention
Choose a fake intervention group that was not affected by the program.
Compare two regions with no microfinance program

The Difference-in-Differences estimate for this group should be close to zero; otherwise there is a problem.

3. Placebo Outcome
Choose an outcome that cannot logically be affected by the program.
Example: Did microfinance program affect rainfall?

Make sure the Difference-in-Differences effect is very close to zero.

What is the effect of a large-scale school construction program on years of schooling obtained?
Between 1973 and 1978, the Indonesian government built more than 61,000 primary schools.

Source: Duflo, E. (2001). Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment. American Economic Review, 91(4), 795-813.

Program Theory
Could improve educational outcomes by increasing the supply of education. Could also reduce the quality of education because new teachers are not as well trained, canceling out the supply effect.

Evaluation Method #1
Compare average education levels of 18-year-olds in 1973 and 1990. What is the problem with this evaluation strategy?

Evaluation Method #2
Compare average education levels of 18-year-olds in 1990 across areas where the program was more intensive and less intensive. What is the problem with this evaluation strategy?

Difference-in-Differences
More schools were built in regions that initially had the fewest schools per capita.
Children over the age of 12 when the program began were not affected, since they had already completed primary school.
Compare the change in educational outcomes between younger and older children in low program intensity regions with the change in outcomes in high program intensity regions.

Results:
Estimated impact: D-in-D estimate of gain of 0.12 years of education.
                         Years of education:   Years of education:   Difference
                         age 2-6 in 1974       age 12-17 in 1974
High program intensity   8.50                  8.02                  0.48
Low program intensity    9.76                  9.40                  0.36
Difference               -1.26                 -1.38                 0.12
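Applying the D-in-D formula to the table: (8.50 - 8.02) - (9.76 - 9.40) = 0.48 - 0.36 = 0.12 years of education.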

Validity Testing: Placebo


An estimated impact very close to zero supports the parallel trends assumption.
                         Years of education:   Years of education:   Difference
                         age 12-17 in 1974     age 18-24 in 1974
High program intensity   8.00                  7.70                  0.30
Low program intensity    9.41                  9.12                  0.29
Difference               -1.41                 -1.42                 0.01
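As a check on the arithmetic: (8.00 - 7.70) - (9.41 - 9.12) = 0.30 - 0.29 = 0.01, i.e. essentially no differential change between high and low intensity regions for cohorts too old to benefit from the program.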

Key Takeaways
- Difference-in-Differences can remove some of the bias found in the Pre-Post and Simple Difference methods.
- D-in-D does not fully solve the problem of selection bias, so one must still be cautious.
- D-in-D often requires more data than randomization (it may be more expensive unless administrative data is available).
