Experimental Design

Dr Yunita Ismail
Environmental Engineering Study Program
Engineering Faculty
President University
What is Experimental Design?

Experimental design includes both


• Strategies for organizing data collection
• Data analysis procedures matched to those data collection
strategies

Classical treatments of design stress analysis procedures


based on the analysis of variance (ANOVA)

Other analysis procedures, such as those based on hierarchical linear models or on the analysis of aggregates (e.g., class or school means), are also appropriate
Why Do We Need Experimental Design?

Because of variability
We wouldn’t need a science of experimental design

• If all units (students, teachers, & schools) were identical, and
• If all units responded identically to treatments

We need experimental design to control variability so


that treatment effects can be identified
Problem statement

• Complexity of Environmental Problems
  • Too many variables in the system
  • Interactive/nonlinear structure
  • Difficulty in physical experimentation
• Proposed solutions
  • Implement Design of Experiments (DOE)
  • In the laboratory or on simulation models
  • EVOP to physical experimentation
Examples of Environmental Projects

• Salinity, pH, temperature, invasive species
  • In the survival of indigenous species
• Best mining and agricultural practices
  • In the life (length, quality) of specific species
• Contaminants, light, water velocity, flora
  • On indigenous species of the ecosystem
• Dam building and ecosystem destruction
  • Difficulty to experiment in actual environment
  • Or to re-create the complete environment in lab
Goals of Experimental Design

• Avoid experimental artifacts


• Eliminate bias
1. Use a simultaneous control group
2. Randomization
3. Blinding
• Reduce sampling error
1. Replication
2. Balance
3. Blocking
Experimental Artifacts
• Experimental artifacts: a bias in a measurement
produced by unintended consequences of
experimental procedures

• Conduct your experiments under conditions that are as natural as possible to avoid artifacts
• Example: diving birds
Control Group
• A control group is a group of subjects left untreated
for the treatment of interest but otherwise
experiencing the same conditions as the treated
subjects

• Example: one group of patients is given an inert


placebo
The Placebo Effect
• Patients treated with placebos, including sugar pills,
often report improvement
• Example: up to 40% of patients with chronic back
pain report improvement when treated with a
placebo
• Even “sham surgeries” can have a positive effect
• This is why you need a control group!
Randomization
• Randomization is the random assignment of
treatments to units in an experimental study

• Breaks the association between potential


confounding variables and the explanatory
variables
[Figure: confounding variable, experimental units, and treatments. Without randomization, the confounding variable differs among treatments.]

[Figure: confounding variable, experimental units, and treatments. With randomization, the confounding variable does not differ among treatments.]
Blinding
• Blinding is the concealment of information from
the participants and/or researchers about which
subjects are receiving which treatments
• Single blind: subjects are unaware of treatments
• Double blind: subjects and researchers are unaware
of treatments
Blinding
• Example: testing heart medication
• Two treatments: drug and placebo
• Single blind: the patients don’t know which group
they are in, but the doctors do
• Double blind: neither the patients nor the doctors
administering the drug know which group the
patients are in
Replication
• Experimental unit: the individual unit to which treatments are assigned

[Figure: three experiments]
• Experiment 1: 2 experimental units
• Experiment 2 (Tank 1, Tank 2): 2 experimental units; counting the individuals inside each tank as replicates is pseudoreplication
• Experiment 3 (all separate tanks): 8 experimental units
Why is pseudoreplication bad?
[Figure: Experiment 2, Tank 1 and Tank 2]

• Problem: the treatment is confounded with the tank, and there is no true replication
• Imagine that something strange happened, by
chance, to tank 2 but not to tank 1
• Example: light burns out
• All four lizards in tank 2 would be smaller
• You might then think that the difference was due to
the treatment, but it’s actually just random chance
Why is replication good?
• Consider the formula for the standard error of the mean:

$\mathrm{SE}_{\bar{Y}} = \dfrac{s}{\sqrt{n}}$

• Larger n → smaller SE
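
The relationship between n and the standard error can be checked numerically. The sketch below is only an illustration: the function name `standard_error` and the value s = 2.0 are assumptions made for the example, not values from the slides.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Standard error of the mean, s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return s / math.sqrt(n)

s = 2.0  # hypothetical sample standard deviation
for n in (4, 16, 64):
    # Quadrupling n halves the standard error.
    print(n, standard_error(s, n))
```

Because n enters through a square root, each halving of the standard error costs four times as many experimental units.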
Balance
• In a balanced experimental design, all treatments have equal sample size
• A balanced design is better than an unbalanced one
• For a fixed total sample size, balance maximizes power
• It also makes tests more robust to violations of their assumptions
Blocking
• Blocking is the grouping of experimental units that
have similar properties
• Within each block, treatments are randomly
assigned to experimental units
• Randomized block design
Randomized Block Design
• Example: cattle tanks in a field

[Figure: a field with a light gradient from "Very sunny" to "Not so sunny", divided into Block 1, Block 2, Block 3, and Block 4]
What good is blocking?
• Blocking allows you to remove extraneous variation
from the data
• Like replicating the whole experiment multiple
times, once in each block
• Paired design is an example of blocking
Basic Ideas of Design:
Independent Variables (Factors)

The values of independent variables are called levels


Some independent variables can be manipulated, others can’t

Treatments are independent variables that can be manipulated


Blocks and covariates are independent variables that cannot be
manipulated

These concepts are simple, but are often confused

Remember:
You can randomly assign treatment levels but not blocks
Basic Ideas of Design (Crossing)

Relations between independent variables

Factors (treatments or blocks) are crossed if every level of one


factor occurs with every level of another factor

Example
The Tennessee class size experiment assigned students to one
of three class size conditions. All three treatment conditions
occurred within each of the participating schools

Thus treatment was crossed with schools


Basic Ideas of Design (Nesting)

Factor B is nested in factor A if every level of factor B occurs


within only one level of factor A

Example
The Tennessee class size experiment actually assigned
classrooms to one of three class size conditions. Each
classroom occurred in only one treatment condition

Thus classrooms were nested within treatments

(But treatment was crossed with schools)


Where Do These Terms Come From?
(Nesting)

An agricultural experiment where blocks are literally blocks or plots of land

Blocks       1    2    …   n
Treatment    T1   T2   …   T1

Here each block is literally nested within a treatment condition
Where Do These Terms Come From?
(Crossing)
An agricultural experiment: blocks were literally blocks of land, and plots of land within blocks were assigned different treatments

Blocks   1    2    …   n
         T1   T2   …   T1
         T2   T1   …   T2

Here treatment literally crosses the blocks


Where Do These Terms Come From?
(Crossing)
The experiment is often depicted like this. What is
wrong with this as a field layout?

              Blocks
              1   2   …   n
Treatment 1
Treatment 2

Consider possible sources of bias
Think About These Designs

A study assigns a reading treatment (or control) to children in


20 schools. Each child is classified into one of three groups
with different risk of reading failure.

A study assigns T or C to 20 teachers. The teachers are in five


schools, and each teacher teaches 4 science classes

Two schools in each district are picked to participate. Each


school has two grade 4 teachers. One of them is assigned to
T, the other to C.
Three Basic Designs

The completely randomized design


Treatments are assigned to individuals

The randomized block design


Treatments are assigned to individuals within blocks
(This is sometimes called the matched design, because
individuals are matched within blocks)

The hierarchical design


Treatments are assigned to blocks, the same treatment is
assigned to all individuals in the block
The Completely Randomized Design

Individuals are randomly assigned to one of two treatments


Treatment        Control
Individual 1     Individual 1
Individual 2     Individual 2
…                …
Individual nT    Individual nC
The Randomized Block Design

              Block 1              …   Block m
Treatment 1   Individual 1         …   Individual 1
              …                        …
              Individual n1        …   Individual nm
Treatment 2   Individual n1 + 1    …   Individual nm + 1
              …                        …
              Individual 2n1       …   Individual 2nm

The Hierarchical Design

Treatment                             Control
Block 1         …   Block m           Block m+1         …   Block 2m
Individual 1    …   Individual 1      Individual 1      …   Individual 1
Individual 2    …   Individual 2      Individual 2      …   Individual 2
…                   …                 …                     …
Individual n1   …   Individual nm     Individual nm+1   …   Individual n2m
Randomization Procedures

Randomization has to be done as an explicit process devised by the


experimenter

• Haphazard is not the same as random

• Unknown assignment is not the same as random

• “Essentially random” is technically meaningless

• Alternation is not random, even if you alternate from a random


start

This is why R.A. Fisher was so explicit about randomization


processes
Randomization Procedures

R.A. Fisher on how to randomize an experiment with small


sample size and 5 treatments

A satisfactory method is to use a pack of cards numbered


from 1 to 100, and to arrange them in random order by
repeated shuffling. The varieties [treatments] are
numbered from 1 to 5, and any card such as the number 33,
for example is deemed to correspond to variety [treatment]
number 3, because on dividing by 5 this number is found as
the remainder. (Fisher, 1935, p.51)
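
Fisher's card procedure can be sketched in a few lines. This is an illustration, not Fisher's own wording: the function name `fisher_card_order` is hypothetical, and mapping a remainder of 0 to treatment 5 is an assumption, since the quoted passage does not say how multiples of 5 are handled.

```python
import random

def fisher_card_order(n_cards=100, n_treatments=5, rng=None):
    """Shuffle cards numbered 1..n_cards and map each card to a treatment."""
    rng = rng if rng is not None else random.Random()
    cards = list(range(1, n_cards + 1))
    rng.shuffle(cards)  # "repeated shuffling" of the pack
    # Treatment = remainder on dividing the card number by n_treatments.
    # Assumption: a remainder of 0 is read as the last treatment.
    return [(card % n_treatments) or n_treatments for card in cards]

order = fisher_card_order()
print(order[:10])  # treatments for the first ten plots
```

With 100 cards and 5 treatments, every treatment number appears exactly 20 times in the shuffled sequence, so card 33 maps to treatment 3 as in the quotation.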
Randomization Procedures

You may want to use a table of random numbers, but be sure to pick
an arbitrary start point!

Beware of random number generators: they typically depend on seed values, so be sure to vary the seed value (if the generator does not do so automatically)

Otherwise you will reliably generate the same sequence of random numbers every time

This is no different from starting in the same place in a table of random numbers
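
The seed problem is easy to reproduce. The sketch below uses Python's standard `random` module; the helper name `draw` and the seed 2024 are arbitrary choices made for illustration.

```python
import random

def draw(seed, k=10):
    """Draw k random numbers from 1 to 2 using a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.randint(1, 2) for _ in range(k)]

# The same seed reproduces the same "random" sequence every time.
print(draw(2024) == draw(2024))  # True

# Creating the generator without a seed lets Python seed it from
# operating-system entropy or the clock, so runs differ.
unseeded = random.Random()
print([unseeded.randint(1, 2) for _ in range(10)])
```

A fixed seed is still useful for documenting an assignment so that it can be audited later, as long as the seed itself was chosen arbitrarily.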
Randomization Procedures

Completely Randomized Design


(2 treatments, 2n individuals)

Make a list of all individuals

For each individual, pick a random number from 1 to 2 (odd or


even)

Assign the individual to treatment 1 if even, 2 if odd

When one treatment is assigned n individuals, stop assigning


more individuals to that treatment
Randomization Procedures

Completely Randomized Design


(pn individuals, p treatments)

Make a list of all individuals

For each individual, pick a random number from 1 to p

One way to do this is to get a random number of any size, divide by


p, the remainder R is between 0 and (p – 1), so add 1 to the
remainder to get R + 1

Assign the individual to treatment R + 1

Stop assigning individuals to any treatment after it gets n individuals
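
The steps above can be sketched as follows. This is one reading of the procedure under stated assumptions: when a draw lands on a treatment that already has n individuals, the sketch simply draws again. The function name `completely_randomized` and the roster are hypothetical.

```python
import random

def completely_randomized(individuals, p, rng=None):
    """Assign each individual to one of p treatments, n per treatment."""
    rng = rng if rng is not None else random.Random()
    if len(individuals) % p != 0:
        raise ValueError("number of individuals must be a multiple of p")
    n = len(individuals) // p
    counts = [0] * (p + 1)  # counts[t] = individuals already in treatment t
    assignment = {}
    for person in individuals:
        while True:
            big = rng.randrange(1, 10**9)  # a random number "of any size"
            treatment = big % p + 1        # remainder R in 0..p-1, then R + 1
            if counts[treatment] < n:      # assumption: redraw if treatment is full
                break
        counts[treatment] += 1
        assignment[person] = treatment
    return assignment

roster = [f"student_{i}" for i in range(1, 13)]  # hypothetical list of 12 individuals
groups = completely_randomized(roster, p=3)
print(groups)
```

With p = 2 this reduces to the odd/even rule of the two-treatment procedure.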


Randomization Procedures

Randomized Block Design with 2 Treatments


(m blocks, 2n individuals per block)

Make a list of all individuals in the first block

For each individual, pick a random number from 1 to 2 (odd or


even)

Assign the individual to treatment 1 if even, 2 if odd

Stop assigning a treatment after it is assigned n individuals in the block

Repeat the same process with every block


Randomization Procedures

Randomized Block Design with p Treatments


(m blocks, pn individuals per block)

Make a list of all individuals in the first block

For each individual, pick a random number from 1 to p

Assign the individual to the treatment corresponding to the number

Stop assigning a treatment after it is assigned n individuals in the block

Repeat the same process with every block
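
A compact way to randomize within blocks is sketched below. Note that it swaps in a different technique from the draw-and-stop rule described above: it shuffles a balanced list of treatment labels inside each block, which also yields exactly n individuals per treatment per block. The function name and the school rosters are hypothetical.

```python
import random

def randomize_within_blocks(blocks, p, rng=None):
    """blocks maps a block name to its list of individuals (pn per block)."""
    rng = rng if rng is not None else random.Random()
    assignment = {}
    for block, members in blocks.items():
        if len(members) % p != 0:
            raise ValueError(f"block {block!r} size is not a multiple of p")
        n = len(members) // p
        # n copies of each treatment label 1..p, shuffled within this block only
        labels = [t for t in range(1, p + 1) for _ in range(n)]
        rng.shuffle(labels)
        for person, treatment in zip(members, labels):
            assignment[(block, person)] = treatment
    return assignment

schools = {  # hypothetical blocks with 2n = 4 individuals each
    "school_A": ["a1", "a2", "a3", "a4"],
    "school_B": ["b1", "b2", "b3", "b4"],
}
print(randomize_within_blocks(schools, p=2))
```

Each block is randomized separately, so every treatment appears equally often in every block.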


Randomization Procedures

Hierarchical Design with 2 Treatments


(m blocks per treatment, n individuals per block)

Make a list of all blocks

For each block, pick a random number from 1 to 2

Assign the block to treatment 1 if even, treatment 2 if odd

Stop assigning a treatment after it is assigned m blocks

Every individual in a block is assigned to the same treatment


Randomization Procedures

Hierarchical Design with p Treatments


(m blocks per treatment, n individuals per block)

Make a list of all blocks

For each block, pick a random number from 1 to p

Assign the block to treatment corresponding to the number

Stop assigning a treatment after it is assigned m blocks

Every individual in a block is assigned to the same treatment
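
For the hierarchical design the unit of randomization is the block, not the individual. The sketch below shuffles a balanced list of treatment labels across blocks, a swapped-in technique that gives m blocks per treatment, and then copies each block's treatment to its members. The function name and the rosters are hypothetical.

```python
import random

def randomize_blocks(blocks, p, rng=None):
    """Assign whole blocks to p treatments, m blocks per treatment."""
    rng = rng if rng is not None else random.Random()
    names = list(blocks)
    if len(names) % p != 0:
        raise ValueError("number of blocks must be a multiple of p")
    m = len(names) // p
    labels = [t for t in range(1, p + 1) for _ in range(m)]
    rng.shuffle(labels)
    block_treatment = dict(zip(names, labels))
    # Every individual inherits the treatment of the block it belongs to.
    individual_treatment = {
        person: block_treatment[name]
        for name, members in blocks.items()
        for person in members
    }
    return block_treatment, individual_treatment

schools = {f"school_{j}": [f"s{j}_{k}" for k in range(3)] for j in range(4)}
by_block, by_person = randomize_blocks(schools, p=2)
print(by_block)
```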


Inferential Population and Inference Models

The inferential population or inference model has


implications for analysis and therefore for the
design of experiments
Do we make inferences to the schools in this sample
or to a larger population of schools?

Inferences to the schools or classes in the sample are


called conditional inferences
Inferences to a larger population of schools or classes
are called unconditional inferences
Inferential Population and Inference Models

Note that the inferences (what we are estimating) are


different in conditional versus unconditional inference
models

• In a conditional inference, we are estimating the mean (or


treatment effect) in the observed schools

• In unconditional inference we are estimating the mean (or


treatment effect) in the population of schools from which
the observed schools are sampled

We are still estimating a mean (or a treatment effect) but they


are different parameters with different uncertainties
Fixed and Random Effects
When the levels of a factor (e.g., particular blocks
included) in a study are sampled and the inference
model is unconditional, that factor is called random
and its effects are called random effects

When the levels of a factor (e.g., particular blocks


included) in a study constitute the entire inference
population and the inference model is conditional,
that factor is called fixed and its effects are called
fixed effects
Applications to Experimental Design

We will look in detail at the two most widely used


experimental designs in education

• Randomized blocks designs

• Hierarchical designs
Experimental Designs
For each design we will look at
• Structural Model for data (and what it means)
• Two inference models
• What does ‘treatment effect’ mean in principle
• What is the estimate of treatment effect
• How do we deal with context effects
• Two statistical analysis procedures
• How do we estimate and test treatment effects
• How do we estimate and test context effects
• What is the sensitivity of the tests
The Randomized Block Design
The population (the sampling frame)

We wish to compare two treatments

• We assign treatments within schools

• Many schools with 2n students in each

• Assign n students to each treatment in each school


The Randomized Block Design
The experiment

Compare two treatments in an experiment

• We assign treatments within schools

• With m schools with 2n students in each

• Assign n students to each treatment in each school


The Randomized Block Design

Diagram of the design

             Schools
Treatment    1   2   …   m
1                    …
2                    …

[In the original slide the column for School 1 is then highlighted: each school contains both treatments]
The Conceptual Model
The statistical model for the observation on the kth
person in the jth school in the ith treatment is

$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{ijk}$

where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\alpha\beta_{ij}$ is the difference between the average effect of treatment i and the effect of that treatment in school j,
$\varepsilon_{ijk}$ is a residual
Effect of Context

$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{ijk}$

Context Effect
Two-level Randomized Block Design
With No Covariates (HLM Notation)
Level 1 (individual level)

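$Y_{ijk} = \beta_{0j} + \beta_{1j}T_{ijk} + \varepsilon_{ijk}$, with $\varepsilon_{ijk} \sim N(0, \sigma_W^2)$

Level 2 (school Level)

$\beta_{0j} = \pi_{00} + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$

$\beta_{1j} = \pi_{10} + \xi_{1j}$, with $\xi_{1j} \sim N(0, \sigma_{T \times S}^2)$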

If we code the treatment Tijk = ½ or - ½ , then the parameters are


identical to those in standard ANOVA
Effects and Estimates
The population mean of treatment 1 in school j is

α1 + αβ1j

The population mean of treatment 2 in school j is

α2 + αβ2j

The estimate of the mean of treatment 1 in school j is

α1 + αβ1j + ε1j●

The estimate of the mean of treatment 2 in school j is

α2 + αβ2j + ε2j●
Effects and Estimates
The comparative treatment effect in any given school j is

(α1 – α2) + (αβ1j – αβ2j)

The estimate of comparative treatment effect in school j is

(α1 – α2) + (αβ1j – αβ2j) + (ε1j● – ε2j●)

The mean treatment effect in the experiment is

(α1 – α2) + (αβ1● – αβ2●)

The estimate of the mean treatment effect in the experiment is

(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)


Inference Models
Two different kinds of inferences about effects

Unconditional Inference (Schools Random)


Inference to the whole universe of schools
(requires a representative sample of schools)

Conditional Inference (Schools Fixed)


Inference to the schools in the experiment
(no sampling requirement on schools)
Statistical Analysis Procedures
Two kinds of statistical analysis procedures

Mixed Effects Procedures (Schools Random)


Treat schools in the experiment as a sample from a
population of schools
(only strictly correct if schools are a sample)

Fixed Effects Procedures (Schools Fixed)


Treat schools in the experiment as a population
Unconditional Inference
(Schools Random)
The estimate of the mean treatment effect in the experiment is

(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)

The average treatment effect we want to estimate is

(α1 – α2)

The term (ε1●● – ε2●●) depends on the students in the schools in the
sample

The term (αβ1● – αβ2●) depends on the schools in sample

Both (ε1●● – ε2●●) and (αβ1● – αβ2●) are random and average to 0
across students and schools, respectively
Conditional Inference
(Schools Fixed)
The estimate of the mean treatment effect in the experiment is still

(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)

Now the average treatment effect we want to estimate is

(α1 + αβ1●) – (α2 + αβ2●) = (α1 – α2) + (αβ1● – αβ2●)

The term (ε1●● – ε2●●) depends on the students in the schools in the
sample

The term (αβ1● – αβ2●) depends on the schools in sample, but the
treatment effect in the sample of schools is the effect we want to
estimate
Expected Mean Squares
Randomized Block Design
(Two Levels, Schools Random)

Source          df           E{MS}
Treatment (T)   1            $\sigma_W^2 + n\sigma_{T \times S}^2 + nm\sum\alpha_i^2$
Schools (S)     m – 1        $\sigma_W^2 + 2n\sigma_S^2$
T × S           m – 1        $\sigma_W^2 + n\sigma_{T \times S}^2$
Within Cells    2m(n – 1)    $\sigma_W^2$


Mixed Effects Procedures
(Schools Random)
The test for treatment effects has
H0: (α1 – α2) = 0

Estimated mean treatment effect in the experiment is

(α1 – α2) + (αβ1● – αβ2●) + (ε1●● – ε2●●)

The variance of the estimated treatment effect is

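$\dfrac{2[\sigma_W^2 + n\sigma_{T \times S}^2]}{mn} = \dfrac{2[1 + (n\omega_S - 1)\rho]\sigma^2}{mn}$

Here $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ and $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$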


Mixed Effects Procedures
The test for treatment effects:

FT = MST/MSTxS with (m – 1) df

The test for context effects (treatment by schools interaction) is

FTxS = MSTxS/MSWS with 2m(n – 1) df

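Power is determined by the operational effect size

$\dfrac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\dfrac{n}{1 + (n\omega_S - 1)\rho}}$

where $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ and $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$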
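
The two forms of the variance and the operational effect size can be checked against each other numerically. The sketch below is only an illustration: the function name `rbd_mixed` and all numerical inputs are assumptions made for the example, not values from the slides.

```python
import math

def rbd_mixed(alpha_diff, sigma_w2, sigma_s2, sigma_txs2, m, n):
    """Variance of the treatment effect (both forms) and operational effect size."""
    sigma2 = sigma_s2 + sigma_w2          # total variance
    rho = sigma_s2 / sigma2               # intraclass correlation
    omega_s = sigma_txs2 / sigma_s2       # treatment-by-school variance ratio
    var_direct = 2 * (sigma_w2 + n * sigma_txs2) / (m * n)
    var_rho = 2 * (1 + (n * omega_s - 1) * rho) * sigma2 / (m * n)
    op_effect = (alpha_diff / math.sqrt(sigma2)) * math.sqrt(
        n / (1 + (n * omega_s - 1) * rho)
    )
    return var_direct, var_rho, op_effect

# Hypothetical inputs: 10 schools, 20 students per treatment per school
v1, v2, delta_op = rbd_mixed(0.5, 0.8, 0.2, 0.05, m=10, n=20)
print(v1, v2, delta_op)
```

The two variance expressions agree, and multiplying the operational effect size by sqrt(m/2) gives the estimated effect divided by its standard error.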


Expected Mean Squares
Randomized Block Design
(Two Levels, Schools Fixed)

Source          df           E{MS}
Treatment (T)   1            $\sigma_W^2 + nm\sum\alpha_i^2$
Schools (S)     m – 1        $\sigma_W^2 + 2n\sum\beta_j^2/(m - 1)$
T × S           m – 1        $\sigma_W^2 + n\sum\sum\alpha\beta_{ij}^2/(m - 1)$
Within Cells    2m(n – 1)    $\sigma_W^2$


Fixed Effects Procedures
The test for treatment effects has

H0: (α1 – α2) + (αβ1● – αβ2●) = 0

Estimated mean treatment effect in the experiment is

(α1 – α2) + (αβ1● – αβ2●) + (ε1●● – ε2●●)

The variance of the estimated treatment effect is

2σW2 /mn
Fixed Effects Procedures
The test for treatment effects:

FT = MST/MSWS with 2m(n – 1) df

The test for context effects (treatment by schools interaction) is

FC = MSTxS/MSWS with 2m(n – 1) df

Power is determined by the operational effect size

$\dfrac{(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})}{\sigma_W}\sqrt{n}$

with 2m(n – 1) df
Comparing Fixed and Mixed Effects Statistical
Procedures
(Randomized Block Design)
                          Fixed                          Mixed
Inference model           Conditional                    Unconditional
Estimand                  (α1 – α2) + (αβ1● – αβ2●)      (α1 – α2)
Contaminating factors     (ε1●● – ε2●●)                  (αβ1● – αβ2●) + (ε1●● – ε2●●)
df                        2m(n – 1)                      (m – 1)
Power                     higher                         lower

Operational effect size (Fixed): $\dfrac{(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})}{\sigma_W}\sqrt{n}$

Operational effect size (Mixed): $\dfrac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\dfrac{n}{1 + (n\omega_S - 1)\rho}}$


Comparing Fixed and Mixed Effects Procedures
(Randomized Block Design)

Conditional and unconditional inference models


• estimate different treatment effects
• have different contaminating factors that add uncertainty

Mixed procedures are good for unconditional inference

The fixed procedures are good for conditional inference

The fixed procedures have higher power


The Hierarchical Design
The universe (the sampling frame)

We wish to compare two treatments

• We assign treatments to whole schools


• Many schools with n students in each
• Assign all students in each school to the same
treatment
The Hierarchical Design
The experiment

We wish to compare two treatments

• We assign treatments to whole schools


• We have 2m schools with n students in each
• Assign all students in each school to the same
treatment
The Hierarchical Design
Diagram of the experiment

             Schools
Treatment    1   2   …   m   m+1   m+2   …   2m
1            Treatment 1 schools (1 to m)
2            Treatment 2 schools (m+1 to 2m)
The Conceptual Model
The statistical model for the observation on the kth person in the jth school in the ith treatment is

$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{jk(i)} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{jk(i)}$

where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\alpha\beta_{ij}$ is the difference between the average effect of treatment i and the effect of that treatment in school j,
$\varepsilon_{jk(i)}$ is a residual,
and $\beta_{j(i)} = \beta_j + \alpha\beta_{ij}$ is a term for the combined effect of schools within treatments (the context effects)
Two-level Hierarchical Design
With No Covariates (HLM Notation)
Level 1 (individual level)

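$Y_{ijk} = \beta_{0j} + \varepsilon_{ijk}$, with $\varepsilon_{ijk} \sim N(0, \sigma_W^2)$

Level 2 (school Level)

$\beta_{0j} = \pi_{00} + \pi_{01}T_j + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$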

If we code the treatment $T_j = ½$ or $-½$, then

$\pi_{00} = \mu$, $\pi_{01} = \alpha_1 - \alpha_2$, $\xi_{0j} = \beta_{j(i)}$

The intraclass correlation is ρ = σS2/(σS2 + σW2) = σS2/σ2


Effects and Estimates
The comparative treatment effect in any given school j is still

(α1 – α2) + (αβ1j – αβ2j)

But we cannot estimate the treatment effect in a single school because each
school gets only one treatment

The mean treatment effect in the experiment is

(α1 – α2) + (β●(1) – β●(2))

= (α1 – α2) +(β1● – β2● )+ (αβ1● – αβ2●)

The estimate of the mean treatment effect in the experiment is

(α1 – α2) + (β● (1) – β● (2)) + (ε1●● – ε2●●)


Inference Models
Two different kinds of inferences about effects
(as in the randomized block design)

Unconditional Inference (schools random)


Inference to the whole universe of schools
(requires a representative sample of schools)

Conditional Inference (schools fixed)


Inference to the schools in the experiment
(no sampling requirement on schools)
Unconditional Inference
(Schools Random)
The average treatment effect we want to estimate is

(α1 – α2)

The term (ε1●● – ε2●●) depends on the students in the


schools in the sample

The term (β●(1) – β●(2)) depends on the schools in sample

Both (ε1●● – ε2●●) and (β●(1) – β●(2)) are random and


average to 0 across students and schools, respectively
Conditional Inference
(Schools Fixed)
The average treatment effect we want to (can) estimate is

(α1 + β●(1)) – (α2 + β●(2)) = (α1 – α2) + (β●(1) – β●(2))

= (α1 – α2) + (β1● – β2● )+ (αβ1● – αβ2●)

The term (β●(1) – β●(2)) depends on the schools in sample,


but we want to estimate the effect of treatment in the
schools in the sample

Note that this treatment effect is not quite the same as in the
randomized block design, where we estimate
(α1 – α2) + (αβ1● – αβ2●)
Statistical Analysis Procedures
Two kinds of statistical analysis procedures
(as in the randomized block design)

Mixed Effects Procedures


Treat schools in the experiment as a sample from a
universe

Fixed Effects Procedures


Treat schools in the experiment as a universe
Expected Mean Squares
Hierarchical Design
(Two Levels, Schools Random)

Source           df           E{MS}
Treatment (T)    1            $\sigma_W^2 + n\sigma_S^2 + nm\sum\alpha_i^2$
Schools (S)      2(m – 1)     $\sigma_W^2 + n\sigma_S^2$
Within Schools   2m(n – 1)    $\sigma_W^2$


Mixed Effects Procedures
(Schools Random)
The test for treatment effects has

H0: (α1 – α2) = 0

Estimated mean treatment effect in the experiment is

(α1 – α2) + (β●(1) – β●(2)) + (ε1●● – ε2●●)

The variance of the estimated treatment effect is

$\dfrac{2[\sigma_W^2 + n\sigma_S^2]}{mn} = \dfrac{2[1 + (n - 1)\rho]\sigma^2}{mn}$

where $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$
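
The step between the two forms of the variance is short but worth writing out. It uses only the definition of ρ above, with σ² denoting the total variance σS² + σW².

```latex
\begin{align*}
\sigma_S^2 &= \rho\,\sigma^2, \qquad \sigma_W^2 = (1 - \rho)\,\sigma^2 \\
\frac{2\left[\sigma_W^2 + n\sigma_S^2\right]}{mn}
  &= \frac{2\left[(1 - \rho) + n\rho\right]\sigma^2}{mn}
   = \frac{2\left[1 + (n - 1)\rho\right]\sigma^2}{mn}
\end{align*}
```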


Mixed Effects Procedures
(Schools Random)
The test for treatment effects:

FT = MST/MSBS with 2(m – 1) df

There is no omnibus test for context effects

Power is determined by the operational effect size

$\dfrac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\dfrac{n}{1 + (n - 1)\rho}}$

where $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$
Expected Mean Squares
Hierarchical Design
(Two Levels, Schools Fixed)

Source           df           E{MS}
Treatment (T)    1            $\sigma_W^2 + nm\sum(\alpha_i + \beta_{\bullet(i)})^2$
Schools (S)      2(m – 1)     $\sigma_W^2 + n\sum\sum\beta_{j(i)}^2/[2(m - 1)]$
Within Schools   2m(n – 1)    $\sigma_W^2$


Fixed Effects Procedures
(Schools Fixed)
The test for treatment effects has

H0: (α1 – α2) + (β●(1) – β●(2)) = 0

Note that the school effects are confounded with treatment effects

Estimated mean treatment effect in the experiment is

(α1 – α2) + (β●(1) – β●(2)) + (ε1●● – ε2●●)

The variance of the estimated treatment effect is


2σW2 /mn
Fixed Effects Procedures
(Schools Fixed)
The test for treatment effects:

FT = MST/MSWS with 2m(n – 1) df

There is no omnibus test for context effects, because each school gets only one treatment

Power is determined by the operational effect size

$\dfrac{(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})}{\sigma_W}\sqrt{n}$

and 2m(n – 1) df
Comparing Fixed and Mixed Effects Procedures
(Hierarchical Design)

                          Fixed                          Mixed
Inference model           Conditional                    Unconditional
Estimand                  (α1 – α2) + (β●(1) – β●(2))    (α1 – α2)
Contaminating factors     (ε1●● – ε2●●)                  (β●(1) – β●(2)) + (ε1●● – ε2●●)
df                        2m(n – 1)                      2(m – 1)
Power                     higher                         lower

Effect size (Fixed): $\dfrac{(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})}{\sigma_W}\sqrt{n}$

Effect size (Mixed): $\dfrac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\dfrac{n}{1 + (n - 1)\rho}}$


Comparing Fixed and Mixed Effects Statistical
Procedures (Hierarchical Design)
Conditional and unconditional inference models
• estimate different treatment effects
• have different contaminating factors that add uncertainty

Mixed procedures are good for unconditional inference

The fixed procedures are not generally recommended

The fixed procedures have higher power


Comparing Hierarchical Designs to
Randomized Block Designs
Randomized block designs usually have higher power, but
assignment of different treatments within schools or classes may
be

• practically difficult
• politically infeasible
• theoretically impossible

It may be methodologically unwise because of potential for

• Contamination or diffusion of treatments


• compensatory rivalry or demoralization
Applications to Experimental Design
We will address the two most widely used experimental designs in
education

• Randomized blocks designs with 2 levels


• Randomized blocks designs with 3 levels
• Hierarchical designs with 2 levels
• Hierarchical designs with 3 levels

We also examine the effect of covariates

Hereafter, we generally take schools to be random


Complications
Which matchings do we have to take into account in design (e.g.,
schools, districts, regions, states, regions of the country, country)?

Ignore some, control for effects of others as fixed blocking factors

Justify this as part of the population definition

For example, we define the inference population as these five


districts within these two states

But, doing so obviously constrains generalizability


Precision of the Estimated Treatment Effect

Precision is the standard error of the estimated treatment


effect

Precision in simple (simple random sample) designs depends


on:

• Standard deviation in the population σ

• Total sample size N

The precision is $SE = \dfrac{2\sigma}{\sqrt{N}}$
Precision of the Estimated Treatment Effect

Precision in complex (clustered sample) designs depends on:

• The (total) standard deviation σT

• Sample size at each level of sampling


(e.g., m clusters, n individuals per cluster)

• Intraclass correlation structure

It is a little harder to compute than in simple designs, but


important because it helps you see what matters in design
Intraclass Correlations in
Two-level Designs
In two-level designs the intraclass correlation structure is
determined by a single intraclass correlation

This intraclass correlation is the proportion of the total variance that is between schools (clusters)

$\rho = \dfrac{\sigma_S^2}{\sigma_S^2 + \sigma_W^2} = \dfrac{\sigma_S^2}{\sigma_T^2}$
Precision in Two-level Hierarchical Design
With No Covariates
The standard error of the treatment effect is

$SE = \sigma_T\sqrt{\dfrac{2}{m}}\sqrt{\dfrac{1 + (n - 1)\rho}{n}}$

SE decreases as m (number of schools) increases

SE decreases as n increases, but only up to a point

SE increases as ρ increases
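
These three statements can be seen directly by evaluating the formula. The sketch below is an illustration only; the function name `se_hierarchical` and the values of m, n, and ρ are assumptions chosen for the example.

```python
import math

def se_hierarchical(sigma_t, m, n, rho):
    """SE of the treatment effect: m schools per treatment, n students per school."""
    return sigma_t * math.sqrt(2 / m) * math.sqrt((1 + (n - 1) * rho) / n)

m, rho = 20, 0.2  # hypothetical design values
for n in (5, 20, 100, 1000):
    print(n, round(se_hierarchical(1.0, m, n, rho), 4))

# Lower bound that no increase in n can beat: sigma_T * sqrt(2 * rho / m)
print("floor", round(math.sqrt(2 * rho / m), 4))
```

Adding students per school pushes the SE toward the floor set by ρ and m, so beyond a moderate n the only way to gain precision is to add schools.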
Statistical Power
Power in simple (simple random sample) designs depends on:

• Significance level

• Effect size

• Sample size

Look power up in a table for sample size and effect size


Fragment of Cohen’s Table 2.3.5 (entries are power × 100)

       d
n      0.10   0.20   …   0.80   1.00   1.20   1.40
8      05     07     …   31     46     60     73
9      06     07     …   35     51     65     79
10     06     07     …   39     56     71     84
11     06     07     …   43     63     76     87
Computing Statistical Power
Power in complex (clustered sample) designs depends on:

• Significance level

• Effect size δ

• Sample size at each level of sampling


(e.g., m clusters, n individuals per cluster)

• Intraclass correlation structure

This makes it seem a lot harder to compute


Computing Statistical Power
Computing statistical power in complex designs is only a little
harder than computing it for simple designs

Compute operational effect size (incorporates sample design


information) ΔT

Look power up in a table for operational sample size and


operational effect size

This is the same table that you use for simple designs
Power in Two-level Hierarchical Design
With No Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the two-level hierarchical design with no covariates

$\Delta_T = \delta\sqrt{\dfrac{n}{1 + (n - 1)\rho}}$

Operational sample size is number of schools (clusters)
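The two-step procedure (compute the operational effect size, then look up power using the number of schools as the sample size) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the power function is a normal approximation standing in for the table lookup, m is assumed to count schools per treatment group, and the design values are made up.

```python
import math
from statistics import NormalDist

def operational_effect_size(delta, n, rho):
    """Delta_T = delta * sqrt(n / (1 + (n - 1) * rho))."""
    return delta * math.sqrt(n / (1.0 + (n - 1) * rho))

def approx_power(effect, size, alpha=0.05):
    """Normal-approximation stand-in for the table lookup (two-tailed)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    ncp = effect * math.sqrt(size / 2.0)
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)

# Hypothetical design: delta = 0.25, n = 25 students per school,
# rho = 0.15, m = 20 schools per treatment.
delta_t = operational_effect_size(delta=0.25, n=25, rho=0.15)
print(round(delta_t, 3), round(approx_power(delta_t, size=20), 3))
```

The same `approx_power` call would serve a simple design by passing δ and the per-group sample size directly, which mirrors the point that one table covers both cases.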


Power in Two-level Hierarchical Design
With No Covariates
As m (number of schools) increases, power increases

As effect size increases, power increases

Other influences occur through the design effect

$$\sqrt{\frac{n}{1 + (n-1)\rho}} = \sqrt{\frac{1}{\frac{1}{n} + \left(1 - \frac{1}{n}\right)\rho}}$$

As ρ increases, the design effect (and power) decreases

No matter how large n gets, the maximum design effect is

$$\frac{1}{\sqrt{\rho}}$$

Thus power only increases up to some limit as n increases
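The ceiling on the design effect is easy to check numerically. In this sketch the function name `design_effect` and the value ρ = 0.2 are assumptions chosen for illustration. The limit 1/√ρ follows from letting n grow in n / (1 + (n − 1)ρ).

```python
import math

def design_effect(n, rho):
    """sqrt(n / (1 + (n - 1) * rho)), the multiplier applied to delta."""
    return math.sqrt(n / (1.0 + (n - 1) * rho))

rho = 0.2  # hypothetical intraclass correlation
ceiling = 1.0 / math.sqrt(rho)
for n in (10, 100, 1000, 10000):
    print(n, round(design_effect(n, rho), 4), "ceiling", round(ceiling, 4))
```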
Two-level Hierarchical Design
With Covariates (HLM Notation)
Level 1 (individual level)

$$Y_{ijk} = \beta_{0j} + \beta_{1j} X_{ijk} + \varepsilon_{ijk}, \qquad \varepsilon \sim N(0, \sigma_{AW}^2)$$

Level 2 (school level)

$$\beta_{0j} = \pi_{00} + \pi_{01} T_j + \pi_{02} W_j + \xi_{0j}, \qquad \xi \sim N(0, \sigma_{AS}^2)$$

$$\beta_{1j} = \pi_{10}$$

Note that the covariate effect β1j = π10 is a fixed effect

If we code the treatment Tj = ½ or -½, then the parameters are identical to those in standard ANCOVA
Precision in Two-level Hierarchical Design
With Covariates
The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}{n}}$$

SE decreases as m increases

SE decreases as n increases, but only up to a point

SE increases as ρ increases

SE decreases as $R_W^2$ and $R_S^2$ increase
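A short sketch shows how covariates at each level reduce the standard error. It assumes the formula SE = σT·sqrt(2/m)·sqrt((1 + (n − 1)ρ − [R²W + (nR²S − R²W)ρ]) / n); the function name and all numeric values are hypothetical.

```python
import math

def se_two_level_cov(m, n, rho, r2_w, r2_s, sigma_t=1.0):
    """SE of the treatment effect with covariates.

    r2_w and r2_s are the squared covariate-outcome correlations
    at the within-school and school levels.
    """
    adjust = r2_w + (n * r2_s - r2_w) * rho
    numer = 1.0 + (n - 1) * rho - adjust
    return sigma_t * math.sqrt(2.0 / m) * math.sqrt(numer / n)

# Hypothetical design: m = 20, n = 25, rho = 0.2.
for r2_w, r2_s in ((0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)):
    print(r2_w, r2_s, round(se_two_level_cov(20, 25, 0.2, r2_w, r2_s), 4))
```

Note that the numerator simplifies to (1 − ρ)(1 − R²W) + nρ(1 − R²S), so with many students per school the school-level covariate does most of the work.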
Power in Two-level Hierarchical Design
With Covariates
Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the two-level hierarchical design with covariates

$$\Delta_T^A = \delta \sqrt{\frac{n}{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$

The covariates increase the design effect


Power in Two-level Hierarchical Design
With Covariates
As m and effect size increase, power increases

Other influences occur through the design effect

$$\sqrt{\frac{n}{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$

As ρ increases, the design effect (and power) decreases

Now the maximum design effect as n gets large is

$$\frac{1}{\sqrt{(1 - R_S^2)\rho}}$$

As the covariate-outcome correlations $R_W^2$ and $R_S^2$ increase, the design effect (and power) increases
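The covariate-adjusted design effect and its large-n ceiling can be compared directly. This is a sketch under the formulas stated above; the function names and numeric values are illustrative assumptions.

```python
import math

def design_effect_cov(n, rho, r2_w, r2_s):
    """sqrt(n / (1 + (n-1)*rho - [r2_w + (n*r2_s - r2_w)*rho]))."""
    denom = 1.0 + (n - 1) * rho - (r2_w + (n * r2_s - r2_w) * rho)
    return math.sqrt(n / denom)

def max_design_effect_cov(rho, r2_s):
    """Large-n limit: 1 / sqrt((1 - r2_s) * rho)."""
    return 1.0 / math.sqrt((1.0 - r2_s) * rho)

rho, r2_w, r2_s = 0.2, 0.3, 0.5  # hypothetical values
for n in (10, 100, 10000):
    print(n, round(design_effect_cov(n, rho, r2_w, r2_s), 4))
print("ceiling", round(max_design_effect_cov(rho, r2_s), 4))
print("ceiling without covariates", round(1.0 / math.sqrt(rho), 4))
```

Only the school-level correlation appears in the ceiling, so a within-school covariate alone cannot raise the limit on power.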
Three-level Hierarchical Design

Here there are three factors


• Treatment
• Schools (clusters) nested in treatments
• Classes (subclusters) nested in schools

Suppose there are


• m schools (clusters) per treatment
• p classes (subclusters) per school (cluster)
• n students (individuals) per class (subcluster)
Three-level Hierarchical Design
With No Covariates
The statistical model for the observation on the lth person in the kth class in the jth school in the ith treatment is

$$Y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{ijkl}$$

where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_{j(i)}$ is the average effect of being in school j in treatment i,
$\gamma_{k(ij)}$ is the average effect of being in class k in school j in treatment i,
$\varepsilon_{ijkl}$ is a residual
Three-level Hierarchical Design
With No Covariates (HLM Notation)
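Level 1 (individual level)

$$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}, \qquad \varepsilon \sim N(0, \sigma_W^2)$$

Level 2 (classroom level)

$$\beta_{0jk} = \gamma_{0j} + \eta_{0jk}, \qquad \eta \sim N(0, \sigma_C^2)$$

Level 3 (school level)

$$\gamma_{0j} = \pi_{00} + \pi_{01} T_j + \xi_{0j}, \qquad \xi \sim N(0, \sigma_S^2)$$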

If we code the treatment Tj = ½ or -½, then

$$\pi_{00} = \mu, \quad \pi_{01} = \alpha_1, \quad \xi_{0j} = \beta_{j(i)}, \quad \eta_{0jk} = \gamma_{k(ij)}$$
Three-level Hierarchical Design
Intraclass Correlations
In three-level designs there are two levels of clustering and two
intraclass correlations

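At the school (cluster) level

$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$

At the classroom (subcluster) level

$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$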
Precision in Three-level Hierarchical Design
With No Covariates
The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (pn-1)\rho_S + (n-1)\rho_C}{pn}}$$

SE decreases as m increases

SE decreases as p and n increase, but only up to a point

SE increases as ρS and ρC increase
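The sketch below computes the two intraclass correlations from variance components and then evaluates the three-level standard error. The function names and the variance components (15, 10, 75) are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def icc_three_level(var_s, var_c, var_w):
    """Return (rho_S, rho_C): school and classroom shares of total variance."""
    total = var_s + var_c + var_w
    return var_s / total, var_c / total

def se_three_level(m, p, n, rho_s, rho_c, sigma_t=1.0):
    """SE = sigma_T * sqrt(2/m) * sqrt((1 + (pn-1)*rho_S + (n-1)*rho_C) / (pn))."""
    numer = 1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c
    return sigma_t * math.sqrt(2.0 / m) * math.sqrt(numer / (p * n))

# Hypothetical variance components: school 15, classroom 10, student 75.
rho_s, rho_c = icc_three_level(var_s=15.0, var_c=10.0, var_w=75.0)
for m in (10, 20, 40):
    print(m, round(se_three_level(m, p=4, n=20, rho_s=rho_s, rho_c=rho_c), 4))
```

With p = 1 and ρC = 0 the function reduces to the two-level formula, which is a quick sanity check on the reconstruction.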


Power in Three-level Hierarchical Design
With No Covariates
Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the three-level hierarchical design with no covariates

$$\Delta_T = \delta \sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C}}$$

The operational sample size is the number of schools


Power in Three-level Hierarchical Design
With No Covariates
As m and the effect size increase, power increases

Other influences occur through the design effect

$$\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C}}$$

As ρS or ρC increases, the design effect decreases

No matter how large n gets, the maximum design effect is

$$\frac{1}{\sqrt{\rho_S + \frac{1}{p}\rho_C}}$$

Thus power only increases up to some limit as n increases
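A numerical check of the three-level ceiling, as a minimal sketch. The function names and the values p = 4, ρS = 0.15, ρC = 0.10 are assumptions for illustration; the limit 1/√(ρS + ρC/p) comes from letting n grow in the design effect.

```python
import math

def design_effect_three_level(p, n, rho_s, rho_c):
    """sqrt(pn / (1 + (pn - 1) * rho_S + (n - 1) * rho_C))."""
    return math.sqrt(p * n / (1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c))

def max_design_effect_three_level(p, rho_s, rho_c):
    """Large-n limit: 1 / sqrt(rho_S + rho_C / p)."""
    return 1.0 / math.sqrt(rho_s + rho_c / p)

p, rho_s, rho_c = 4, 0.15, 0.10  # hypothetical values
for n in (10, 100, 10000):
    print(n, round(design_effect_three_level(p, n, rho_s, rho_c), 4))
print("ceiling", round(max_design_effect_three_level(p, rho_s, rho_c), 4))
```

Adding classes per school (larger p) raises the ceiling because it shrinks the ρC/p term, whereas adding students per class does not.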
Three-level Hierarchical Design
With Covariates (HLM Notation)
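Level 1 (individual level)

$$Y_{ijkl} = \beta_{0jk} + \beta_{1jk} X_{ijkl} + \varepsilon_{ijkl}, \qquad \varepsilon \sim N(0, \sigma_{AW}^2)$$

Level 2 (classroom level)

$$\beta_{0jk} = \gamma_{00j} + \gamma_{01j} Z_{jk} + \eta_{0jk}, \qquad \eta \sim N(0, \sigma_{AC}^2)$$

$$\beta_{1jk} = \gamma_{10j}$$

Level 3 (school level)

$$\gamma_{00j} = \pi_{00} + \pi_{01} T_j + \pi_{02} W_j + \xi_{0j}, \qquad \xi \sim N(0, \sigma_{AS}^2)$$

$$\gamma_{01j} = \pi_{01}$$

$$\gamma_{10j} = \pi_{10}$$

The covariate effects β1jk = γ10j = π10 and γ01j = π01 are fixed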
Precision in Three-level Hierarchical Design
With Covariates
The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$

SE decreases as m increases

SE decreases as p and n increase, but only up to a point

SE increases as ρS and ρC increase

SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase


Power in Three-level Hierarchical Design
With Covariates
Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the three-level hierarchical design with covariates

$$\Delta_T^A = \delta \sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$

The operational sample size is the number of schools


Power in Three-level Hierarchical Design
With Covariates
As m and the effect size increase, power increases

Other influences occur through the design effect

$$\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$

As ρS or ρC increases, the design effect decreases

No matter how large n gets, the maximum design effect is

$$\frac{1}{\sqrt{(1 - R_S^2)\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C}}$$

Thus power only increases up to some limit as n increases
Randomized Block Designs
Two-level Randomized Block Design
With No Covariates (HLM Notation)
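Level 1 (individual level)

$$Y_{ijk} = \beta_{0j} + \beta_{1j} T_{ijk} + \varepsilon_{ijk}, \qquad \varepsilon \sim N(0, \sigma_W^2)$$

Level 2 (school level)

$$\beta_{0j} = \pi_{00} + \xi_{0j}, \qquad \xi_{0j} \sim N(0, \sigma_S^2)$$

$$\beta_{1j} = \pi_{10} + \xi_{1j}, \qquad \xi_{1j} \sim N(0, \sigma_{T \times S}^2)$$

If we code the treatment Tijk = ½ or -½, then the parameters are identical to those in standard ANOVA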
Randomized Block Designs
In randomized block designs, as in hierarchical designs, the intraclass correlation has an impact on precision and power

However, in randomized block designs there is also a parameter reflecting the degree of heterogeneity of treatment effects across schools

We define this heterogeneity parameter ωS in terms of the amount of heterogeneity of treatment effects relative to the heterogeneity of school means

Thus

$$\omega_S = \frac{\sigma_{T \times S}^2}{\sigma_S^2}$$
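As a small worked example, ωS can be computed from the two variance components, or the treatment-by-school variance can be recovered from ωS using σS² = ρσT² (which follows from the definition of the intraclass correlation). The function names and numbers below are hypothetical.

```python
def heterogeneity_ratio(var_txs, var_s):
    """omega_S = sigma_TxS^2 / sigma_S^2."""
    if var_s <= 0:
        raise ValueError("var_s must be positive")
    return var_txs / var_s

def treatment_by_school_variance(omega_s, rho, var_total):
    """Recover sigma_TxS^2 = omega_S * rho * sigma_T^2."""
    return omega_s * rho * var_total

# Hypothetical components: treatment-by-school variance 2, school variance 20.
omega_s = heterogeneity_ratio(var_txs=2.0, var_s=20.0)
print(omega_s)
print(treatment_by_school_variance(omega_s, rho=0.2, var_total=100.0))
```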
Precision in Two-level Randomized Block Design
With No Covariates

The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (n\omega_S - 1)\rho}{n}}$$

SE decreases as m (number of schools) increases

SE decreases as n increases, but only up to a point

SE increases as ρ increases

SE increases as $\omega_S = \sigma_{T \times S}^2 / \sigma_S^2$ increases
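To see how the heterogeneity parameter drives precision, the following sketch evaluates the randomized block standard error, taken here as σT·sqrt(2/m)·sqrt((1 + (nωS − 1)ρ)/n), for several values of ωS. The function name and design values are assumptions for illustration.

```python
import math

def se_two_level_rb(m, n, rho, omega_s, sigma_t=1.0):
    """SE for the two-level randomized block design, no covariates."""
    numer = 1.0 + (n * omega_s - 1) * rho
    return sigma_t * math.sqrt(2.0 / m) * math.sqrt(numer / n)

# Hypothetical design: m = 20, n = 25, rho = 0.2.
for omega_s in (0.0, 0.1, 0.5, 1.0):
    print(omega_s, round(se_two_level_rb(20, 25, 0.2, omega_s), 4))
```

At ωS = 1 the expression equals the hierarchical-design formula with the same m, n, and ρ, so the gain from blocking in this formula comes entirely from treatment effects varying less across schools than school means do.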
Power in Two-level Randomized Block Design
With No Covariates

Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the two-level randomized block design with no covariates

$$\Delta_T = \delta \sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho}}$$

The operational sample size is the number of schools (clusters)


Precision in Two-level Randomized Block Design
With Covariates

The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}{n}}$$

SE decreases as m increases

SE decreases as n increases, but only up to a point

SE increases as ρ increases

SE increases as $\omega_S = \sigma_{T \times S}^2 / \sigma_S^2$ increases

SE (generally) decreases as $R_W^2$ and $R_S^2$ increase


Power in Two-level Randomized Block Design
With Covariates

Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the two-level randomized block design with covariates

$$\Delta_T^A = \delta \sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}}$$

The covariates increase the design effect


Three-level Randomized Block Designs
Three-level Randomized Block Design
With No Covariates
Here there are three factors
• Treatment
• Schools (clusters) crossed with treatments
• Classes (subclusters) nested in schools and treatments

Suppose there are


• m schools (clusters) per treatment
• 2p classes (subclusters) per school (cluster)
• n students (individuals) per class (subcluster)
Three-level Randomized Block Design
With No Covariates
The statistical model for the observation on the lth person in the kth class in the ith treatment in the jth school is

$$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \alpha\beta_{ij} + \varepsilon_{ijkl}$$

where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\gamma_k$ is the effect of being in the kth class,
$\alpha\beta_{ij}$ is the difference between the average effect of treatment i and the effect of that treatment in school j,
$\varepsilon_{ijkl}$ is a residual
Three-level Randomized Block Design
With No Covariates (HLM Notation)
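Level 1 (individual level)

$$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}, \qquad \varepsilon \sim N(0, \sigma_W^2)$$

Level 2 (classroom level)

$$\beta_{0jk} = \gamma_{00j} + \gamma_{01j} T_j + \eta_{0jk}, \qquad \eta \sim N(0, \sigma_C^2)$$

Level 3 (school level)

$$\gamma_{00j} = \pi_{00} + \xi_{0j}, \qquad \xi_{0j} \sim N(0, \sigma_S^2)$$

$$\gamma_{01j} = \pi_{10} + \xi_{1j}, \qquad \xi_{1j} \sim N(0, \sigma_{T \times S}^2)$$

If we code the treatment Tj = ½ or -½, then

$$\pi_{00} = \mu, \quad \pi_{10} = \alpha_1, \quad \xi_{0j} = \beta_j, \quad \xi_{1j} = \alpha\beta_{ij}, \quad \eta_{0jk} = \gamma_k$$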


Three-level Randomized Block Design
Intraclass Correlations
In three-level designs there are two levels of clustering and two
intraclass correlations

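At the school (cluster) level

$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$

At the classroom (subcluster) level

$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$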
Three-level Randomized Block Design
Heterogeneity Parameters
In three-level designs, as in two-level randomized block designs, there is also a parameter reflecting the degree of heterogeneity of treatment effects across schools

We define this parameter ωS in terms of the amount of heterogeneity of treatment effects relative to the heterogeneity of school means (just like in two-level designs)

Thus

$$\omega_S = \frac{\sigma_{T \times S}^2}{\sigma_S^2}$$
Precision in Three-level Randomized Block Design
With No Covariates

The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}{pn}}$$

SE decreases as m increases

SE decreases as p and n increase, but only up to a point

SE increases as ωS increases

SE increases as ρS and ρC increase


Power in Three-level Randomized Block Design
With No Covariates

Basic Idea:

Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the three-level randomized block design with no covariates

$$\Delta_T = \delta \sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}}$$

The operational sample size is the number of schools


Power in Three-level Randomized Block Design
With No Covariates

As m and the effect size increase, power increases

Other influences occur through the design effect

$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}}$$

As ρS or ρC increases, the design effect decreases

No matter how large n gets, the maximum design effect is

$$\sqrt{\frac{1/2}{\omega_S\rho_S + \frac{1}{p}\rho_C}}$$

Thus power only increases up to some limit as n increases
Precision in Three-level Randomized Block Design
With Covariates

The standard error of the treatment effect

$$SE = \sigma_T \sqrt{\frac{2}{m}} \sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$

SE decreases as m increases

SE decreases as p and n increase, but only up to a point

SE increases as ρS, ρC, and ωS increase

SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase


Power in Three-level Randomized Block Design
With Covariates

Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)

ΔT = δ x (Design Effect)

For the three-level randomized block design with covariates

$$\Delta_T^A = \delta \sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$

The operational sample size is the number of schools
Power in Three-level Randomized Block Design
With Covariates

As m and the effect size increase, power increases

Other influences occur through the design effect

$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$

As ρS or ρC increases, the design effect decreases

No matter how large n gets, the maximum design effect is

$$\sqrt{\frac{1/2}{(1 - R_S^2)\omega_S\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C}}$$

Thus power only increases up to some limit as n increases
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Experiments cannot estimate the causal effect on any individual

Experiments estimate average causal effects on the units that have been randomized

• If you randomize schools, the (average) causal effects are effects on schools

• If you randomize classes, the (average) causal effects are on classes

• If you randomize individuals, the (average) causal effects estimated are on individuals
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Theoretical Considerations

Decide what level you care about, then randomize at that level

Randomization at lower levels may impact generalizability of the causal inference (and it is generally a lot more trouble)

Suppose you randomize classrooms. Should you also randomly assign students to classes?

It depends: Are you interested in the average causal effect of treatment on naturally occurring classes or on randomly assembled ones?
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Relative power/precision of treatment effect

Assign Schools (Hierarchical Design)

$$\sqrt{\frac{1 + (pn-1)\rho_S + (n-1)\rho_C}{pn}}$$

Assign Classrooms (Randomized Block)

$$\sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}{pn}}$$

Assign Students (Randomized Block)

$$\sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n\omega_C - 1)\rho_C}{pn}}$$
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Precision of estimates or statistical power dictates assigning at the lowest level possible

But the individual (or even classroom) level will not always be feasible or even theoretically desirable
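The relative-precision expressions for assigning schools, classrooms, or students can be compared with a few lines of Python. This is a sketch under stated assumptions: a single hypothetical function `relative_se` covers all three cases by setting ωS and ωC to 1 when the corresponding level is not blocked, ωC is taken to be the classroom-level analogue of ωS, and every numeric value is made up for illustration.

```python
import math

def relative_se(p, n, rho_s, rho_c, omega_s=1.0, omega_c=1.0):
    """sqrt((1 + (pn*omega_S - 1)*rho_S + (n*omega_C - 1)*rho_C) / (pn)).

    omega_s = omega_c = 1 reproduces the hierarchical (assign schools) case.
    """
    numer = 1.0 + (p * n * omega_s - 1) * rho_s + (n * omega_c - 1) * rho_c
    return math.sqrt(numer / (p * n))

# Hypothetical values: 4 classes per school, 20 students per class.
p, n, rho_s, rho_c = 4, 20, 0.15, 0.10
schools = relative_se(p, n, rho_s, rho_c)
classrooms = relative_se(p, n, rho_s, rho_c, omega_s=0.2)
students = relative_se(p, n, rho_s, rho_c, omega_s=0.2, omega_c=0.2)
print(round(schools, 4), round(classrooms, 4), round(students, 4))
```

With heterogeneity parameters below 1, the multiplier shrinks as assignment moves from schools to classrooms to students, which is the sense in which precision favors the lowest feasible level.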
Thank You !
