Professional Documents
Culture Documents
Experimental Design and Its Role in Data Science: Tirthankar Dasgupta CS 109 / Stat 121 November 17, 2015
Experimental Design and Its Role in Data Science: Tirthankar Dasgupta CS 109 / Stat 121 November 17, 2015
Treatment Effect after one hour Treatment Effect after one hour
Cured Not cured Cured Not cured
Took aspirin 70 30 Took aspirin 90 10
(Treatment) (Treatment)
Did nothing 50 50 Did nothing 10 90
(Control) (Control)
Modern-day experiments
• Education
• Marketing
• Stem Cell
• Nanotechnology
• Law
• Internet: A/B testing
How it all started
• Agricultural experiments at the Rothamstead
experimental station, U.K.
• Sir R. A. Fisher hired in 1919, first edition of Design of
Experiments" published in 1935.
Aerial view of Rothamstead in 2013 (www.bbsrc.ac.uk)
Potential Outcomes
A potential outcome for each unit when exposed to each treatment level
11!
= 462
5!6!
Step-4: Is the observed value of the
test statistic “unusual”?
Randomization distribution of test statistic
tobs=1.83
200
150
pvalue=0.62
Frequency
100
50
0
0 2 4 6 8 10
test statistic
The role of randomization
Which plots receive A and which receive B?
A A B B A B B B A A B
A B A B A A B B B A B
… A A B B A B B B A A B 1.83
462 B B B B B B A A A A A 5.32
What is the expected value of the statistic over all possible randomizations?
(-2.82+…+1.83+…+5.31)/462 =?
A A A A A B B B B B B
Block 1: assign 3 A’s and 3 B’s Block 2: assign 2 A’s and 3 B’s
FISHER:
• “Block what you can, randomize what
you cannot”
• “Analyze as you randomize”
• Twenty animals
available.
1200
1000
800
pvalue=0.0132
Frequency
600
tobs=0.41
400
200
0
test statistic
Three fundamental principles of
experimentation (Fisher 1925)
• Randomization
• Replication
• Blocking
How experiments are changing
• Hundreds of covariates associated with each
experimental unit
– e.g., patients in clinical trials
• Multiple treatment factors
– Thirty chemical modulators in stem-cell
experiments
• Complex randomization restrictions
– Non-compliance
– Multi-stratum
Designs that balance several covariates
over treatment groups
• The more covariates, the more likely at least
one covariate will be imbalanced across
treatment groups
• Covariate imbalance not limited to “unlucky”
randomizations
• Blocking not intuitive
• The solution: Re-randomization
Define a measure of “balance”
between treatment and control groups
• Define a measure
• Small values of the measure are acceptable
• Large values of the measure indicate lack of
balance and are unacceptable
A A B B A B B B A A B
A A B B A B B B A A B
(3+4+6+7+8+11)/6-(1+2+5+9+10)/5=1.10
A B A B A A B B B A B
(2+4+7+8+9+11)/6-(1+3+5+6+10)/5=1.83
A A A A A B B B B B B
(6+7+8+9+10+11)/6-(1+2+3+4+5)/5=5.5
Design and analysis of re-randomized
experiments
• Decide acceptable criterion (which
randomizations to accept)
• Randomize until the acceptability criterion is met
• Analysis using randomization test:
– Calculate observed value of test statistic
– Assuming hypothesis to be true, impute missing potential
outcomes
– Generate distribution of test statistic using repeated
assignments under same mechanism (i.e., accepting
randomizations that are acceptable)
– Determine if observed value of “unusual”
Visualization for two continuous
covariates
Visualization for two continuous
covariates
Visualization for two continuous
covariates
Visualization for two continuous
covariates
Visualization for two continuous
covariates
Criterion for re-randomization
• Mahalanobis distance M (a multivariate distance
between group mean vectors)
• Acceptance criterion: M ≤a
• Here a is a pre-determined constant
⎡ Yi (t ) ⎤ ⎛ ⎡ µt ⎤ 2 ⎡ 1 ρ⎤⎞
⎢Y (c)⎥ ~ N ⎜⎜ ⎢ µ ⎥, σ ⎢ ρ ⎥ ⎟
1 ⎦ ⎟⎠
⎣ i ⎦ ⎝⎣ c⎦ ⎣
⎡ µt ⎤ ⎛ ⎡0⎤ ⎡τ 2 0 ⎤ ⎞
⎢ µ ⎥ ~ N ⎜⎜ ⎢0⎥, ⎢ 2⎥⎟
⎟
⎣ c⎦ ⎝⎣ ⎦ ⎣ 0 τ ⎦⎠
Super population of all
NY schools
⎡ Yi (t ) ⎤ ⎛ ⎡ µt ⎤ 2 ⎡ 1 ρ⎤⎞
⎢Y (c)⎥ ~ N ⎜⎜ ⎢ µ ⎥, σ ⎢ ρ ⎥ ⎟
1 ⎦ ⎟⎠
⎣ i ⎦ ⎝⎣ c⎦ ⎣
obs
Y ⎡ µt ⎤ ⎛ ⎡0⎤ ⎡τ 2 0 ⎤ ⎞
⎢ µ ⎥ ~ N ⎜⎜ ⎢0⎥, ⎢ 2⎥⎟
⎟
⎣ c⎦ ⎝⎣ ⎦ ⎣ 0 τ ⎦⎠
Fixed to random potential outcomes,
Finite to super-population inference
⎡ Yi (t ) ⎤ ⎛ ⎡ µt ⎤ 2 ⎡ 1 ρ⎤⎞
⎢Y (c)⎥ ~ N ⎜⎜ ⎢ µ ⎥, σ ⎢ ρ ⎥ ⎟
1 ⎦ ⎟⎠
⎣ i ⎦ ⎝⎣ c⎦ ⎣
obs
Y ⎡ µt ⎤ ⎛ ⎡0⎤ ⎡τ 2 0 ⎤ ⎞
⎢ µ ⎥ ~ N ⎜⎜ ⎢0⎥, ⎢ 2⎥⎟
⎟
⎣ c⎦ ⎝⎣ ⎦ ⎣ 0 τ ⎦⎠
⎡ µt ⎤ obs
⎢µ ⎥ | Y , W
⎣ c⎦
Back to finite-population inference
⎡ Yi (t ) ⎤ ⎛ ⎡ µt ⎤ 2 ⎡ 1 ρ⎤⎞
⎢Y (c)⎥ ~ N ⎜⎜ ⎢ µ ⎥, σ ⎢ ρ ⎥ ⎟
1 ⎦ ⎟⎠
⎣ i ⎦ ⎝⎣ c⎦ ⎣
Y mis | Y obs , W ⎡ µt ⎤ ⎛ ⎡0⎤ ⎡τ 2 0 ⎤ ⎞
⎢ µ ⎥ ~ N ⎜⎜ ⎢0⎥, ⎢ 2⎥⎟
⎟
⎣ c⎦ ⎝⎣ ⎦ ⎣ 0 τ ⎦⎠
⎡ µt ⎤ obs
⎢µ ⎥ | Y , W
⎣ c⎦
Finite-population inference: Impute missing
potential outcomes “stochastically”
Plot of land Fertilizer A (old) Fertilizer B (new) Assignment (W)
1 29.2 ? 0
2 11.4 ? 0
3 ? 26.6 1
4 ? 23.7 mis obs 1
5 25.3 ?
Y | Y ,W 0
6 ? 28.5 1
7 ? 14.2 1
8 ? 17.9 1
9 16.5 ? 0
10 21.1 ? 0
11 ? 24.3 1
Average 20.70 22.53
To learn more, take Stat 140/240