Matching Methods

Introduction Estimation
Meeting 10: Matching
Gumilang Aryo Sahadewo
Universitas Gadjah Mada
November 12, 2019

Motivation
1 Introduction
Motivation
Matching Methods
Finding a match
Propensity Score Matching
2 Estimation
Estimation
Matching Methods
Syntax
Motivation
The Problem of Counterfactual

Potential Outcome Framework
Recall the two potential outcomes:

$$Y_i = \begin{cases} Y_{1i} & \text{if } D_i = 1 \\ Y_{0i} & \text{if } D_i = 0 \end{cases}$$
Yi = Y0i + (Y1i − Y0i ) Di
The causal effect of a program is (Y1i − Y0i )

What is the problem with estimating the causal effect?
The fundamental problem of evaluation is that the counterfactual is never observed

ATT = E [Y1i | Di = 1] − E [Y0i | Di = 1]
ATNT = E [Y1i | Di = 0] − E [Y0i | Di = 0]
ATE = E [Y1i − Y0i ] = P (D = 1) · ATT + P (D = 0) · ATNT
The methods that we’ve studied so far seek to construct a
valid comparison group
Matching is another method that applies statistical techniques
to construct a comparison group
Matching identifies average unobserved counterfactuals
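The three parameters and the role of the missing counterfactual can be illustrated with a small numerical sketch. The potential outcomes below are hypothetical numbers chosen for illustration; in real data only one of Y1i and Y0i is observed for each unit.

```python
# Hypothetical potential outcomes for six units (illustrative numbers only).
# In real data we never see both Y1 and Y0 for the same unit.
units = [
    # (D, Y1, Y0)
    (1, 10, 6),
    (1, 12, 7),
    (1, 9, 5),
    (0, 8, 6),
    (0, 7, 4),
    (0, 6, 5),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


treated = [u for u in units if u[0] == 1]
untreated = [u for u in units if u[0] == 0]

att = mean(y1 - y0 for _, y1, y0 in treated)
atnt = mean(y1 - y0 for _, y1, y0 in untreated)
ate = mean(y1 - y0 for _, y1, y0 in units)

# ATE is the share-weighted average of ATT and ATNT.
p_treated = len(treated) / len(units)
ate_from_parts = p_treated * att + (1 - p_treated) * atnt

# What observed data alone would give: E[Y1 | D = 1] - E[Y0 | D = 0].
naive = mean(y1 for _, y1, _ in treated) - mean(y0 for _, _, y0 in untreated)

# The gap between the naive comparison and ATT is selection bias:
# E[Y0 | D = 1] - E[Y0 | D = 0].
selection_bias = (mean(y0 for _, _, y0 in treated)
                  - mean(y0 for _, _, y0 in untreated))
```

In this sketch the naive comparison equals ATT plus the selection bias term, which is why a valid comparison group is needed.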
Matching Methods
Selection on Observables
The identifying assumption is selection on observables:
(Y (0) , Y (1)) ⊥ D | X
This is equivalent to:
Pr (D = 1 | Y (0) , Y (1) , X ) = Pr (D = 1 | X )
E (D | Y (0) , Y (1) , X ) = E (D | X )
Differences between the treatment and comparison groups are captured in X
ATT : Y0i ⊥ D | X → E [Y0i | D = 1, X ] = E [Y0i | D = 0, X ]
ATNT : Y1i ⊥ D | X → E [Y1i | D = 1, X ] = E [Y1i | D = 0, X ]
ATE : (Y0i , Y1i ) ⊥ D | X
What would be the threat to identification?
Just as in the standard OLS framework, the threat is that differences between the treatment and comparison groups are not fully captured in X
The two groups may also differ in unobservable characteristics
Common Support
We observe individuals in both the treatment and non-treatment groups with the same characteristics
ATT : P (D = 1 | X ) < 1
ATNT : 0 < P (D = 1 | X )
ATE : 0 < P (D = 1 | X ) < 1
Imagine a scenario:
When X = x , then Pr (D = 1 | X = x ) = 1.
We won’t observe individuals in the control group with X = x
Thus, we cannot obtain a valid comparison group for treated units with X = x
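A simple way to look for this problem, assuming a single discrete characteristic X, is to compute the share of treated units within each value of X and flag values where the share is exactly 0 or 1. The data and the function names below are hypothetical and serve only as a sketch.

```python
from collections import defaultdict


def treatment_share_by_cell(x_values, d_values):
    """Return {x: share of units with D = 1} for each observed value of X."""
    counts = defaultdict(lambda: [0, 0])  # x -> [number treated, number of units]
    for x, d in zip(x_values, d_values):
        counts[x][0] += d
        counts[x][1] += 1
    return {x: treated / total for x, (treated, total) in counts.items()}


def cells_off_support(shares):
    """Values of X where everyone (share 1) or no one (share 0) is treated."""
    return sorted(x for x, share in shares.items() if share in (0.0, 1.0))


# Hypothetical data: X is an education category, D is enrollment.
x = ["low", "low", "low", "mid", "mid", "mid", "high", "high"]
d = [1, 1, 1, 1, 0, 0, 0, 1]

shares = treatment_share_by_cell(x, d)
off_support = cells_off_support(shares)
```

Here every unit with X = "low" is treated, so no control unit with that characteristic exists and the ATT condition P (D = 1 | X ) < 1 fails for that value.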
Selection on Observables & Common Support
If the assumptions hold, we can use the observed average outcome of the non-treatment units to estimate the counterfactual outcome
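The reasoning for the ATT can be written out as a short derivation. This is a sketch in the notation used above, where the outer expectation is taken over the distribution of X among treated units.

```latex
\begin{align*}
E[Y_{0i} \mid D_i = 1]
  &= E_X\big[\, E[Y_{0i} \mid D_i = 1, X] \,\big|\, D_i = 1 \big]
     && \text{(iterated expectations)} \\
  &= E_X\big[\, E[Y_{0i} \mid D_i = 0, X] \,\big|\, D_i = 1 \big]
     && \text{(selection on observables)} \\
  &= E_X\big[\, E[Y_i \mid D_i = 0, X] \,\big|\, D_i = 1 \big]
     && \text{(}Y_i = Y_{0i} \text{ when } D_i = 0\text{)} \\[4pt]
ATT &= E[Y_i \mid D_i = 1]
     - E_X\big[\, E[Y_i \mid D_i = 0, X] \,\big|\, D_i = 1 \big]
\end{align*}
```

Common support, P (D = 1 | X ) < 1, is what guarantees that the inner expectation for non-treated units is defined at every value of X found among the treated.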
Finding a match
The goal of matching is to approximate the characteristics that explain an individual's decision to enroll
This procedure requires a large data set. Why?
Treated                      Untreated
Months unemployed   Poor     Months unemployed   Poor
5                   1        2                   1
10                  0        12                  1
3                   0        8                   1
20                  0        14                  0
2                   1        4                   0
8                   1        6                   1
6                   1        1                   1
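The table can be used to try exact matching by hand. The sketch below, with variable names chosen for illustration, looks for untreated units that share both characteristics with each treated unit.

```python
# Rows are (months unemployed, poor), copied from the table above.
treated = [(5, 1), (10, 0), (3, 0), (20, 0), (2, 1), (8, 1), (6, 1)]
untreated = [(2, 1), (12, 1), (8, 1), (14, 0), (4, 0), (6, 1), (1, 1)]

untreated_profiles = set(untreated)

# A treated unit has an exact match if an untreated unit shares both values.
matched = [unit for unit in treated if unit in untreated_profiles]
unmatched = [unit for unit in treated if unit not in untreated_profiles]
```

Only three of the seven treated units find an exact match on just two characteristics, which hints at why exact matching needs a large data set.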
The problem of matching
It is difficult to identify a match for each of the units in the treatment group
The list of observed characteristics is large
Each characteristic can take on many values
We can easily run into the curse of dimensionality
Dilemma:
Limit the set of observed characteristics, but...
Increase the number of observed characteristics, but...
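The curse of dimensionality can be made concrete by counting the distinct covariate profiles that exact matching would have to cover. The numbers of values per characteristic and the sample size below are hypothetical.

```python
from math import prod


def number_of_cells(values_per_characteristic):
    """Number of distinct covariate profiles an exact match must cover."""
    return prod(values_per_characteristic)


# Hypothetical: two binary characteristics, then three more with many values.
few = number_of_cells([2, 2])
many = number_of_cells([2, 2, 5, 10, 20])

# With a fixed sample, the average number of units per profile shrinks quickly.
sample_size = 1000
average_units_per_cell = sample_size / many
```

With five characteristics there are already more profiles than observations in this sketch, so most treated units would have no exact match.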
Propensity Score Matching
Idea
A solution to the curse of dimensionality is propensity score matching
The method computes the probability that the unit is enrolled
in the program using the observable characteristics
We do this for treatment and non-treatment units
Note that we only use the baseline or pre-treatment
observable characteristics
The propensity score is between 0 and 1
The propensity score is:
e (x) = P (D = 1 | X = x)
The score is a balancing score, meaning that:
X ⊥ D | e (X )
Combining this with selection on observables and common support:
(Y1 , Y0 ) ⊥ D | X and 0 < e (x) < 1
implies:
(Y1 , Y0 ) ⊥ D | e (X )
Match treatment and non-treatment units with the closest propensity score.
The matched non-treatment units become the comparison group
The average difference in outcomes between the treatment group and the matched comparison group is the estimate of the impact
Propensity score matching mimics a randomized experiment with respect to the observed characteristics
Treatment and comparison units have similar propensities.
Steps to PSM
Find representative surveys to identify the treatment and non-treatment units
Pool the sample and estimate the probability that each individual receives the treatment (based on observable characteristics)
It is important to include relevant variables to avoid a biased estimate. Choose them using:
theory and previous empirical findings
formal statistical tests
Always remember the tradeoff:
Small set of characteristics: the selection on observables assumption is less plausible
Large set of characteristics: common support is harder to satisfy
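The estimation step can be sketched with a logit model fitted on the pooled sample. The example below is a minimal illustration under several assumptions: a single baseline characteristic, hypothetical data, and a plain gradient-ascent fit written out by hand; in practice a statistical package's logit or probit routine would be used.

```python
import math


def fit_logit(x, d, learning_rate=0.5, steps=5000):
    """Fit P(D = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x))) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        # Gradient of the average log-likelihood with respect to b0 and b1.
        g0 = g1 = 0.0
        for xi, di in zip(x, d):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += (di - p) / n
            g1 += (di - p) * xi / n
        b0 += learning_rate * g0
        b1 += learning_rate * g1
    return b0, b1


def propensity_score(xi, b0, b1):
    """Predicted probability of treatment for a unit with characteristic xi."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


# Hypothetical pooled sample: x is a centered baseline characteristic.
x = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
d = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logit(x, d)
scores = [propensity_score(xi, b0, b1) for xi in x]
```

Scores are computed for treated and non-treated units alike, and each lies strictly between 0 and 1.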
Obtain the propensity scores
Restrict the sample to units with a common support
For each enrolled unit, locate a subgroup of non-treated units with similar propensity scores
Test whether the means for the treatment and non-treated units are statistically different
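One common way to restrict the sample to a common support is a minimum-maximum rule: keep only units whose propensity score lies in the range where the treated and non-treated score distributions overlap. The rule, the scores, and the function names below are assumptions for illustration and are not necessarily the rule used in any particular package.

```python
def common_support_bounds(scores, d):
    """Overlap region of the treated and untreated propensity score ranges."""
    treated = [s for s, di in zip(scores, d) if di == 1]
    untreated = [s for s, di in zip(scores, d) if di == 0]
    lower = max(min(treated), min(untreated))
    upper = min(max(treated), max(untreated))
    return lower, upper


def restrict_to_support(scores, d):
    """Indices of units whose score lies inside the overlap region."""
    lower, upper = common_support_bounds(scores, d)
    return [i for i, s in enumerate(scores) if lower <= s <= upper]


# Hypothetical estimated propensity scores.
scores = [0.05, 0.20, 0.35, 0.50, 0.30, 0.55, 0.80, 0.95]
d = [0, 0, 0, 0, 1, 1, 1, 1]

lower, upper = common_support_bounds(scores, d)
kept = restrict_to_support(scores, d)
```

In this sketch three treated units with scores above 0.50 are dropped, so the estimate would then apply only to treated units inside the overlap region.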
The measure of the impact is the difference between the average outcomes of the treatment group and the matched comparison group.
Estimation
Matching Strategy and ATT
The matching strategy is:
Pair each treatment unit i with one or more comparable non-treated units
Associate with the outcome $Y_i^{obs}$ a matched outcome $\hat{Y}_i(0)$ given by the weighted outcomes of its neighbors:
$$\hat{Y}_i(0) = \sum_{j \in C(i)} w_{ij} Y_j^{obs}$$
C (i) is the set of neighbors with D = 0 of the treated subject i
wij is the weight of non-treated j, with $\sum_{j \in C(i)} w_{ij} = 1$
The ATT:
E [Yi (1) − Yi (0) | Di = 1]
is estimated as:
$$\widehat{ATT} = \frac{1}{N^T} \sum_{i: D_i = 1} \left[ Y_i^{obs} - \hat{Y}_i(0) \right]$$
$N^T$ is the number of matched treated units in the sample
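The estimator can be traced through a small hypothetical matched sample. The outcomes, weights, and function names below are made up for illustration; each treated unit's matched outcome is the weighted average of its neighbors' observed outcomes, and the ATT estimate is the average gap.

```python
def matched_outcome(weights, neighbor_outcomes):
    """Weighted average of the neighbors' observed outcomes."""
    # The weights of the neighbors of a treated unit must sum to 1.
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * y for w, y in zip(weights, neighbor_outcomes))


def estimate_att(treated_outcomes, matched_outcomes):
    """Average of observed minus matched outcome over matched treated units."""
    differences = [y - y0_hat
                   for y, y0_hat in zip(treated_outcomes, matched_outcomes)]
    return sum(differences) / len(differences)


# Hypothetical matched sample of three treated units.
treated_outcomes = [10.0, 12.0, 9.0]
neighbors = [
    ([1.0], [6.0]),              # one neighbor
    ([0.5, 0.5], [7.0, 9.0]),    # two equally weighted neighbors
    ([0.25, 0.75], [4.0, 8.0]),  # two unequally weighted neighbors
]

y0_hat = [matched_outcome(w, y) for w, y in neighbors]
att_hat = estimate_att(treated_outcomes, y0_hat)
```

The different matching methods differ only in how the neighbor set C (i) and the weights wij are chosen.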

Matching Methods
One-to-one matching is still the most desirable
But it is difficult to observe two units with the same propensity scores
Matching methods have been developed to deal with this problem:
Nearest-neighbor matching
Radius matching
Kernel matching
Stratification matching
Nearest Neighbor Matching
The absolute difference between the estimated propensity scores of the treated and control individuals is minimized
The group of control individuals is selected such that:
C (Pi ) = minj |Pi − Pj |
where:
Pi is the estimated propensity score for treated individual i
Pj is the estimated propensity score for control individual j
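The selection rule can be sketched as follows, assuming matching with replacement and hypothetical propensity scores; the function name is illustrative.

```python
def nearest_neighbor(p_i, control_scores):
    """Index j of the control whose score minimizes |p_i - p_j|."""
    return min(range(len(control_scores)),
               key=lambda j: abs(p_i - control_scores[j]))


# Hypothetical estimated propensity scores.
treated_scores = [0.30, 0.55, 0.80]
control_scores = [0.10, 0.28, 0.50, 0.62]

# Matching with replacement: a control may serve more than one treated unit.
matches = [nearest_neighbor(p_i, control_scores) for p_i in treated_scores]
```

Note that the third treated unit is matched to a control whose score is 0.18 away, which shows that the nearest neighbor is not always a close one.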
Radius and Kernel Matching
Radius: each individual in the treatment group is matched with individuals in the control group whose scores are within a predefined interval of the treatment individual's propensity score.
Kernel: each individual in the treatment group is matched with a weighted average of all control individuals' outcomes, with weights that decline as the distance between propensity scores grows.
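The two rules can be compared on the same hypothetical scores. The radius, the bandwidth, and the choice of a Gaussian kernel below are assumptions made for this sketch; other kernels and tuning values are possible.

```python
import math


def radius_matches(p_i, control_scores, radius):
    """Indices of controls whose score is within `radius` of p_i."""
    return [j for j, p_j in enumerate(control_scores)
            if abs(p_i - p_j) <= radius]


def kernel_weights(p_i, control_scores, bandwidth):
    """Gaussian kernel weights over all controls, normalized to sum to 1."""
    raw = [math.exp(-0.5 * ((p_i - p_j) / bandwidth) ** 2)
           for p_j in control_scores]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical control scores and outcomes, and one treated unit's score.
control_scores = [0.10, 0.28, 0.50, 0.62]
control_outcomes = [4.0, 5.0, 7.0, 8.0]
p_i = 0.55

in_radius = radius_matches(p_i, control_scores, radius=0.1)
weights = kernel_weights(p_i, control_scores, bandwidth=0.1)
kernel_counterfactual = sum(w * y for w, y in zip(weights, control_outcomes))
```

Radius matching uses only the controls inside the interval, while kernel matching uses every control but gives distant ones very little weight.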
Syntax
PSCORE
The Stata command pscore calculates propensity scores
pscore also tests the balancing hypothesis through this algorithm:
Split the sample in k equally spaced intervals of e (X )
Within each interval test that the average e (X ) of treated and untreated do not differ
If the test fails, split the interval and test again
Continue until, in all intervals, the average e (X ) of treated and untreated units do not differ
Within each interval, test that the means of each characteristic do not differ between treated and untreated
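The interval-splitting part of the algorithm can be sketched in a few lines. This is not the pscore implementation: it assumes a single starting interval, halves a failing interval at its midpoint, uses a Welch t statistic with a cutoff of 1.96, and stops when a group has fewer than two units. It covers only the test on the average score, not the final test on each characteristic. All names and data are hypothetical.

```python
import math


def t_statistic(a, b):
    """Welch t statistic for the difference in means of two samples."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0:
        return 0.0 if mean_a == mean_b else math.inf
    return (mean_a - mean_b) / se


def balanced_blocks(scores, d, lower, upper, threshold=1.96, min_size=2):
    """Split [lower, upper) until mean scores of the two groups do not differ."""
    treated = [s for s, di in zip(scores, d) if di == 1 and lower <= s < upper]
    untreated = [s for s, di in zip(scores, d) if di == 0 and lower <= s < upper]
    # Stop when either group is too small to test.
    if len(treated) < min_size or len(untreated) < min_size:
        return [(lower, upper)]
    if abs(t_statistic(treated, untreated)) <= threshold:
        return [(lower, upper)]
    # The test failed: split the interval in half and test each half again.
    middle = (lower + upper) / 2
    return (balanced_blocks(scores, d, lower, middle, threshold, min_size)
            + balanced_blocks(scores, d, middle, upper, threshold, min_size))


# Hypothetical scores: treated units cluster high, untreated units cluster low.
treated_scores = [0.2, 0.3] + [0.7, 0.8] * 4
untreated_scores = [0.2, 0.3] * 4 + [0.7, 0.8]
scores = treated_scores + untreated_scores
d = [1] * len(treated_scores) + [0] * len(untreated_scores)

blocks = balanced_blocks(scores, d, 0.0, 1.0)
```

Over the whole range the average scores differ, but within each half they do not, so the sketch stops after one split.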
Estimation
Use the Stata commands psmatch2 and pstest for estimation and balance checks
