When Should You Adjust Standard Errors For Clustering?
Alberto Abadie, Susan Athey, Guido Imbens, & Jeffrey Wooldridge
Second Question: Would your answer change if the RA suggested clustering by gender?
How would you explain the answers (and possibly the difference between the answer for clustering on state and the answer for clustering on gender)?
Some Views from the Econometric Literature
The RA goes ahead anyway, and, using the Liang-Zeger
(STATA) clustered standard errors, comes back with:
• Adjusting standard errors for clustering is common in empirical work.
  1 | X X 1 | ? ? 0 | ? ? 0 | ...
  2 | ? ? 0 | ? ? 0 | ? ? 0 | ...
  3 | ? ? 0 | X X 1 | X X 1 | ...
  4 | ? ? 0 | X X 1 | ? ? 0 | ...
... |  ...  |  ...  |  ...  | ...
  M | X X 1 | ? ? 0 | ? ? 0 | ...
Design-based Uncertainty (X is observed, ? is missing)
  1 | X ? 1 1 | X ? 1 1 | ? X
  2 | ? X 0 1 | ? X 0 1 | ? X
  3 | ? X 0 1 | X ? 1 1 | X ?
  4 | ? X 0 1 | ? X 0 1 | X ?
... |   ...   |   ...   | ...
  M | X ? 1 1 | ? X 0 1 | ? X
Clustering Setup
Yi is outcome
Least squares estimator (not generalized least squares)
\[
(\hat\alpha, \hat\tau) = \arg\min \sum_{i=1}^{N} \left( Y_i - \alpha - \tau \cdot D_i \right)^2, \qquad \hat\beta = (\hat\alpha, \hat\tau)'
\]
Residuals
ε̂i = Yi − α̂ − τ̂ · Di
Focus of the paper is on properties of τ̂ :
• What is variance of τ̂ ?
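As a minimal sketch of the least squares step, the following plain-Python snippet fits Y on a constant and D using the textbook closed form and returns the residuals. The function name `ols_fit` and the toy data are illustrative assumptions, not part of the slides.

```python
from statistics import fmean

def ols_fit(y, d):
    """Least squares of y on a constant and d.

    Returns (alpha_hat, tau_hat, residuals). Illustrative helper only.
    """
    y_bar, d_bar = fmean(y), fmean(d)
    # Slope: sample covariance of (D, Y) divided by sample variance of D.
    s_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    s_dd = sum((di - d_bar) ** 2 for di in d)
    tau_hat = s_dy / s_dd
    alpha_hat = y_bar - tau_hat * d_bar
    # Residuals: eps_hat_i = Y_i - alpha_hat - tau_hat * D_i
    resid = [yi - alpha_hat - tau_hat * di for yi, di in zip(y, d)]
    return alpha_hat, tau_hat, resid

# Hypothetical toy data: two control units, two treated units.
y = [1.0, 2.0, 4.0, 5.0]
d = [0, 0, 1, 1]
alpha_hat, tau_hat, resid = ols_fit(y, d)
print(alpha_hat, tau_hat, resid)
```

With a binary D, the fitted slope coincides with the difference in group means, which is why the variance of τ̂ is the natural object of interest.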
Standard Textbook Approach:
Common Variance estimators (normalized by sample size)
Eicker-Huber-White, standard robust var (zero error covar):
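\[
\hat V_{\text{robust}} = N \left( \sum_{i=1}^{N} X_i X_i' \right)^{-1} \left( \sum_{i=1}^{N} X_i X_i' \hat\varepsilon_i^2 \right) \left( \sum_{i=1}^{N} X_i X_i' \right)^{-1}
\]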
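Liang-Zeger, STATA, standard clustering adjustment (unrestricted within-cluster covariance matrix):
\[
\hat V_{\text{cluster}} = N \left( \sum_{i=1}^{N} X_i X_i' \right)^{-1} \sum_{g=1}^{G} \left( \sum_{i:G_i=g} X_i \hat\varepsilon_i \right) \left( \sum_{i:G_i=g} X_i \hat\varepsilon_i \right)' \left( \sum_{i=1}^{N} X_i X_i' \right)^{-1}
\]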
Moulton/Kloek (constant covariance within-clusters)
\[
\hat V_{\text{moulton}} = \hat V_{\text{ols}} \cdot \left( 1 + \rho_\varepsilon \cdot \rho_D \cdot \frac{N}{G} \right)
\]
where ρε, ρD are the within-cluster correlations of ε̂ and D.
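The sandwich formulas can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the slides' own code: it assumes the regressor vector is X_i = (1, D_i)', takes the residuals as given, and uses hypothetical helper names (`v_robust`, `v_cluster`, `moulton_factor`). The Moulton factor is computed from user-supplied correlations, since the slides do not spell out how ρε and ρD are estimated.

```python
def outer(u, v):
    return [[u[r] * v[c] for c in range(2)] for r in range(2)]

def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

ZERO = [[0.0, 0.0], [0.0, 0.0]]

def sandwich(x_rows, meat):
    """N * (sum X X')^{-1} * meat * (sum X X')^{-1}."""
    bread = ZERO
    for x in x_rows:
        bread = add(bread, outer(x, x))
    bread_inv = inv2(bread)
    v = matmul(matmul(bread_inv, meat), bread_inv)
    n = len(x_rows)
    return [[n * v[r][c] for c in range(2)] for r in range(2)]

def v_robust(d, resid):
    """Eicker-Huber-White: meat = sum_i X_i X_i' * eps_i^2."""
    x_rows = [(1.0, float(di)) for di in d]  # assumed X_i = (1, D_i)'
    meat = ZERO
    for x, e in zip(x_rows, resid):
        xx = outer(x, x)
        meat = add(meat, [[xx[r][c] * e * e for c in range(2)] for r in range(2)])
    return sandwich(x_rows, meat)

def v_cluster(d, resid, g):
    """Liang-Zeger: meat = sum_g (sum_{i in g} X_i eps_i)(sum_{i in g} X_i eps_i)'."""
    x_rows = [(1.0, float(di)) for di in d]  # assumed X_i = (1, D_i)'
    scores = {}
    for x, e, gi in zip(x_rows, resid, g):
        s = scores.setdefault(gi, [0.0, 0.0])
        s[0] += x[0] * e
        s[1] += x[1] * e
    meat = ZERO
    for s in scores.values():
        meat = add(meat, outer(s, s))
    return sandwich(x_rows, meat)

def moulton_factor(rho_eps, rho_d, n, n_clusters):
    """Inflation factor 1 + rho_eps * rho_D * N / G, with the rhos supplied by the user."""
    return 1.0 + rho_eps * rho_d * n / n_clusters

# Hypothetical toy data; with binary D the OLS residuals are deviations from group means.
y = [1.0, 2.0, 4.0, 5.0, 3.0, 7.0]
d = [0, 0, 1, 1, 0, 1]
g = [0, 0, 1, 1, 2, 2]
mean_by_d = {k: sum(yi for yi, di in zip(y, d) if di == k) / d.count(k) for k in (0, 1)}
resid = [yi - mean_by_d[di] for yi, di in zip(y, d)]
print("robust  var(tau):", v_robust(d, resid)[1][1])
print("cluster var(tau):", v_cluster(d, resid, g)[1][1])
```

The lower-right entry of each matrix is the normalized variance of τ̂. When every unit forms its own cluster, the two estimators coincide, which is a quick sanity check on any implementation.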
Related Literature
Population of size M .
Ri ∈ {0, 1} is sampling indicator, \(\sum_{i=1}^{M} R_i = N\) is sample size.
1. Descriptive Question:
Outcome Yi
definitions:
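\[
\sigma_g^2 = \frac{1}{M_g - 1} \sum_{i:G_i=g} \left( Y_i - \bar Y_{M,g} \right)^2, \qquad \bar Y_g = \frac{G}{M} \sum_{i:G_i=g} Y_i
\]
\[
\sigma_{\text{cluster}}^2 = \frac{1}{G-1} \sum_{g=1}^{G} \left( \bar Y_g - \bar Y \right)^2
\]
\[
\sigma_{\text{cond}}^2 = \frac{1}{G} \sum_{g=1}^{G} \sigma_g^2
\]
\[
\rho = \frac{G}{M(M-G)} \sum_{i \neq j,\, G_i = G_j} \frac{(Y_i - \bar Y)(Y_j - \bar Y)}{\sigma^2} \approx \frac{\sigma_{\text{cluster}}^2}{\sigma_{\text{cluster}}^2 + \sigma_{\text{cond}}^2}
\]
\[
\sigma^2 = \frac{1}{M-1} \sum_{i=1}^{M} \left( Y_i - \bar Y \right)^2 \approx \sigma_{\text{cluster}}^2 + \sigma_{\text{cond}}^2
\]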
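To make these population quantities concrete, here is a small sketch that computes the between-cluster variance, the average within-cluster variance, the overall variance, and the within-cluster correlation for a finite population. It assumes equal-sized clusters with at least two units each (as the G/M factor in the cluster mean suggests); the function name `variance_components` and the simulated population are hypothetical.

```python
import random
from statistics import fmean, variance

def variance_components(y, g):
    """Population variance components for outcomes y with cluster labels g.

    Assumes equal-sized clusters, so the mean of cluster means equals the overall mean.
    """
    clusters = {}
    for yi, gi in zip(y, g):
        clusters.setdefault(gi, []).append(yi)
    M, G = len(y), len(clusters)
    y_bar = fmean(y)

    sigma2 = variance(y)  # 1/(M-1) * sum_i (Y_i - Ybar)^2
    # 1/(G-1) * sum_g (Ybar_g - Ybar)^2
    sigma2_cluster = variance([fmean(v) for v in clusters.values()])
    # 1/G * sum_g sigma_g^2, each sigma_g^2 using the M_g - 1 divisor
    sigma2_cond = fmean([variance(v) for v in clusters.values()])

    # Sum over ordered pairs i != j in the same cluster of (Y_i - Ybar)(Y_j - Ybar).
    cross = 0.0
    for v in clusters.values():
        dev = [yi - y_bar for yi in v]
        cross += sum(dev) ** 2 - sum(x * x for x in dev)
    rho = G / (M * (M - G)) * cross / sigma2

    return {
        "sigma2": sigma2,
        "sigma2_cluster": sigma2_cluster,
        "sigma2_cond": sigma2_cond,
        "rho": rho,
        "rho_approx": sigma2_cluster / (sigma2_cluster + sigma2_cond),
    }

# Hypothetical population: 50 clusters of 40 units, outcome = cluster effect + noise.
rng = random.Random(0)
y, g = [], []
for c in range(50):
    effect = rng.gauss(0.0, 1.0)
    for _ in range(40):
        y.append(effect + rng.gauss(0.0, 1.0))
        g.append(c)
print(variance_components(y, g))
```

The approximations for ρ and σ² are large-G, large-M statements, so in very small populations the exact and approximate values can differ noticeably.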
Estimator is
\[
\hat\theta = \frac{1}{N} \sum_{i=1}^{M} R_i \cdot Y_i
\]
What do the variance estimators give us here?
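\[
E\left[ \hat V_{\text{robust}} \mid \text{RS} \right] \approx \sigma^2 \qquad \text{(correct)}
\]
\[
E\left[ \hat V_{\text{cluster}} \mid \text{RS} \right] \approx \sigma^2 \cdot \left( 1 + \rho \cdot \left( \frac{N}{G} - 1 \right) \right) \qquad \text{(wrong)}
\]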
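A small Monte Carlo sketch can illustrate the comparison under random sampling of units. Everything here is an assumption made for illustration: the population design (cluster effect plus noise), the sizes, the seed, and the function name `simulate`. For the sample mean, where the regressor is just a constant, the general sandwich formulas reduce to V̂robust = (1/N) Σ ε̂i² and V̂cluster = (1/N) Σg (Σ_{i:Gi=g} ε̂i)², which is what the code computes.

```python
import random
from statistics import fmean, pvariance, variance

def simulate(seed=0, G=20, per_cluster=500, N=200, reps=500):
    rng = random.Random(seed)
    # Hypothetical fixed population: Y_i = cluster effect + unit-level noise.
    y, g = [], []
    for c in range(G):
        effect = rng.gauss(0.0, 1.0)
        for _ in range(per_cluster):
            y.append(effect + rng.gauss(0.0, 1.0))
            g.append(c)
    M = len(y)
    sigma2 = variance(y)

    thetas, robust, cluster = [], [], []
    for _ in range(reps):
        idx = rng.sample(range(M), N)  # random sampling of units, ignoring clusters
        ys = [y[i] for i in idx]
        theta = fmean(ys)
        resid = [v - theta for v in ys]
        # Robust estimator for the sample mean: (1/N) * sum of squared residuals.
        robust.append(sum(e * e for e in resid) / N)
        # Cluster estimator: (1/N) * sum over clusters of squared residual sums.
        sums = {}
        for i, e in zip(idx, resid):
            sums[g[i]] = sums.get(g[i], 0.0) + e
        cluster.append(sum(s * s for s in sums.values()) / N)
        thetas.append(theta)

    return {
        "sigma2": sigma2,
        "N_times_var_theta": N * pvariance(thetas),  # Monte Carlo normalized variance
        "mean_V_robust": fmean(robust),
        "mean_V_cluster": fmean(cluster),
    }

result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this design the Monte Carlo variance of the sample mean (times N) should sit close to σ² and to the average robust estimate, while the average cluster estimate should be several times larger, in line with the inflation factor 1 + ρ · (N/G − 1).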
Why is the cluster variance wrong here?
Two takeaways:
2. You cannot tell from the data which variance is correct,
because it depends on the question.
Consider a model-based approach:
• (clustered sampling) Suppose we randomly select H clusters
out of G, and then select N/H units randomly from each of
the sampled clusters:
\[
\mathrm{pr}(R = r) = \binom{G}{H}^{-1} \cdot \binom{M/G}{N/H}^{-H},
\]
for all r s.t. \(\forall g:\ \sum_{i:G_i=g} r_i = N/H \ \vee\ \sum_{i:G_i=g} r_i = 0\).
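The two-stage design described above can be sketched as follows. This is an illustrative implementation under the assumptions that all clusters have the same size M/G and that N is divisible by H; the function names `clustered_sample` and `sample_probability` are hypothetical.

```python
import random
from fractions import Fraction
from math import comb

def clustered_sample(cluster_of, H, N, rng):
    """Draw H clusters at random, then N/H units at random within each drawn cluster.

    cluster_of[i] is the cluster label of unit i. Returns the 0/1 sampling indicator R.
    """
    clusters = sorted(set(cluster_of))
    chosen = rng.sample(clusters, H)  # stage 1: sample clusters
    r = [0] * len(cluster_of)
    for c in chosen:
        members = [i for i, gi in enumerate(cluster_of) if gi == c]
        for i in rng.sample(members, N // H):  # stage 2: sample units within cluster
            r[i] = 1
    return r

def sample_probability(G, H, M, N):
    """Probability of any one admissible sample: 1 / (C(G, H) * C(M/G, N/H)^H)."""
    return Fraction(1, comb(G, H) * comb(M // G, N // H) ** H)

# Hypothetical population: M = 12 units in G = 4 clusters of 3; sample N = 4 via H = 2 clusters.
cluster_of = [i // 3 for i in range(12)]
rng = random.Random(0)
r = clustered_sample(cluster_of, H=2, N=4, rng=rng)
print(r, sample_probability(G=4, H=2, M=12, N=4))
```

Every admissible sampling vector has the same probability, so the exact value follows from counting: the number of ways to choose the clusters times the number of ways to choose units within each chosen cluster.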
\[
\varepsilon_i(d) = Y_i(1) - \frac{1}{N} Y_i(d), \qquad \bar\varepsilon_g(d) = \frac{1}{M_g} \sum_{i:G_i=g} \varepsilon_i(d)
\]
Two special cases where we know what to do:
• Should cluster.
Question: what to do if assignment is correlated within clusters (but not perfectly correlated)?
In between case: Random Sampling of Units, Imperfectly Correlated Assignment
• random assignment if σ = 0
normalized var (τ̂ ) = robust var + cluster adj
\[
\text{robust var} \approx \frac{1}{M} \sum_{i=1}^{M} 2 \left( \varepsilon_i(1)^2 + \varepsilon_i(-1)^2 \right)
\]
\[
\text{correct cluster adj} = \frac{4 \sigma^2 N}{C} \cdot \frac{1}{C} \sum_{c=1}^{C} p_c^2 \left( \varepsilon_c(1) + \varepsilon_c(-1) \right)^2
\]
Conclusion
References