Lab Exam
• When: Nov 27 - Dec 1
• Length: 1 hour
– each lab section divided in two
[Figure: ice cream and violent crime, with hot weather as an alternative explanation for their association]
Correlation is not causation
Why do experiments?
• Observational studies are prone to
confounding variables: Variables that
mask or distort the association between
measured variables in a study
– Example: hot weather
• In an experiment, you can use random
assignments of treatments to individuals to
avoid confounding variables
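A minimal sketch of random assignment, assuming eight individuals and two treatments. The function name `assign_treatments` and the labels "control" and "treatment" are illustrative, not from the lecture. Shuffling the individuals before dealing them out means no property of an individual can systematically influence which treatment it receives.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign treatments to units in (near-)equal numbers."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in turn.
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

# Hypothetical example: individuals numbered 1 to 8.
assignment = assign_treatments(range(1, 9), ["control", "treatment"], seed=1)
for unit in sorted(assignment):
    print(unit, assignment[unit])
```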
Goals of Experimental Design
• Avoid experimental artifacts
• Eliminate bias
1. Use a simultaneous control group
2. Randomization
3. Blinding
• Reduce sampling error
1. Replication
2. Balance
3. Blocking
Experimental Artifacts
• Experimental artifacts: a bias in a
measurement produced by unintended
consequences of experimental procedures
Randomization
• Without randomization, the confounding variable differs among treatments
• With randomization, the confounding variable does not differ among treatments
[Figure: confounding variable, experimental units, and treatments, shown without and with randomization]
Blinding
• Blinding is the concealment of information
from the participants and/or researchers
about which subjects are receiving which
treatments
• Single blind: subjects are unaware of
treatments
• Double blind: subjects and researchers
are unaware of treatments
Blinding
• Example: testing heart medication
• Two treatments: drug and placebo
• Single blind: the patients don’t know which
group they are in, but the doctors do
• Double blind: neither the patients nor the
doctors administering the drug know which
group the patients are in
Replication
• Experimental unit: the individual unit to which treatments are assigned
[Figure: three experimental layouts]
• Experiment 1: 2 experimental units
• Experiment 2 (Tank 1, Tank 2): 2 experimental units (pseudoreplication)
• Experiment 3 (all separate tanks): 8 experimental units
Why is pseudoreplication bad?
[Figure: Experiment 2, Tank 1 and Tank 2]
• Imagine that something strange happened, by chance, to tank 2 but not to tank 1
• You might then think that the difference was due to the treatment, but it’s actually
just random chance
Why is replication good?
• Consider the formula for standard error of
the mean:
$$SE_{\bar{Y}} = \frac{s}{\sqrt{n}}$$
• Larger n gives a smaller SE
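As a quick numerical check of the formula for the standard error of the mean, the sketch below evaluates it for a few sample sizes. The standard deviation of 10 and the sample sizes are arbitrary illustrative values.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# Quadrupling n halves the standard error.
for n in (4, 16, 64):
    print(n, standard_error(10.0, n))
```

Because n enters through a square root, halving the standard error takes four times as many replicates.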
Balance
• In a balanced experimental design, all treatments have equal sample size
[Figure: a balanced design is better than an unbalanced design]
• This maximizes power
• Also makes tests more robust to violating
assumptions
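One way to see why balance helps is to look at the standard error of a difference between two treatment means. The sketch below assumes both treatments share the same standard deviation sigma and uses the standard formula sigma * sqrt(1/n1 + 1/n2), which is not stated on the slides. The total of 20 units is an arbitrary illustrative value.

```python
import math

def se_difference(sigma, n1, n2):
    """Standard error of the difference between two means (equal sigma assumed)."""
    return sigma * math.sqrt(1 / n1 + 1 / n2)

# Same total of 20 experimental units, split in different ways.
for n1, n2 in [(10, 10), (15, 5), (18, 2)]:
    print(n1, n2, round(se_difference(1.0, n1, n2), 3))
```

For a fixed total, the equal split gives the smallest standard error, and so the most precise comparison.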
Blocking
• Blocking is the grouping of experimental
units that have similar properties
• Within each block, treatments are
randomly assigned to experimental
units
• Randomized block design
Randomized Block Design
• Example: cattle tanks in a field
[Figure: a field ranging from very sunny to not so sunny, divided into Blocks 1 to 4]
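A randomized block design for the cattle tank example could be sketched as follows. The tank labels, the three treatments "A", "B", and "C", and the function name `randomized_block_design` are hypothetical. The sketch also assumes each block holds exactly one tank per treatment.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Randomly assign every treatment once within each block."""
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("this sketch assumes one unit per treatment in each block")
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        design[block] = dict(zip(units, order))
    return design

# Hypothetical layout: four blocks along the sun gradient, three tanks each.
blocks = {f"Block {b}": [f"tank {b}{letter}" for letter in "abc"]
          for b in range(1, 5)}
design = randomized_block_design(blocks, ["A", "B", "C"], seed=42)
for block, tanks in design.items():
    print(block, tanks)
```

Every treatment appears in every block, so differences in sunniness among blocks cannot be mistaken for treatment effects.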
What good is blocking?
• Blocking allows you to remove extraneous
variation from the data
• Like replicating the whole experiment
multiple times, once in each block
• Paired design is an example of blocking
Experiments with 2 Factors
• Factorial design – investigates all
treatment combinations of two or more
variables
• Factorial design allows us to test for
interactions between treatment variables
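Enumerating the treatment combinations of a factorial design is a Cartesian product of the factor levels. The sketch below uses pH and temperature as the two factors. The specific levels are taken from the axis labels of the accompanying figure and should be treated as illustrative.

```python
from itertools import product

ph_levels = [5.5, 6.5, 7.5]
temperatures = [25, 30, 35, 40]

# Every pH level is crossed with every temperature.
treatment_combinations = list(product(ph_levels, temperatures))
for ph, temp in treatment_combinations:
    print(f"pH {ph}, temperature {temp}")
print(len(treatment_combinations), "treatment combinations")
```

Three pH levels crossed with four temperatures give 12 treatment combinations, and each one needs its own replicates.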
Factorial Design
[Figure: growth rate (0 to 50) versus temperature (25 to 40) for pH 5.5, 6.5, and 7.5]
Interpretations of 2-way ANOVA Terms
[Figure: growth rate versus temperature (25 to 40) for pH 5.5, pH 6.5, and pH 7.5. Panel title: Effect of pH and Temperature, with interaction]
What if you can’t do experiments?
• Sometimes you can’t do experiments
• One strategy:
– Matching
– Every individual in the treatment group is
matched to a control individual having the
same or closely similar values for known
confounding variables
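Matching can be sketched as a greedy nearest-neighbour search on one known confounding variable. Everything specific here is an assumption made for illustration: the function name `match_controls`, the use of age as the confounder, and the made-up individuals. Real studies often match on several variables at once.

```python
def match_controls(treated, controls, key):
    """Pair each treated individual with the closest unused control.

    `key` extracts the value of the known confounding variable.
    """
    if len(controls) < len(treated):
        raise ValueError("need at least one control per treated individual")
    available = list(controls)
    pairs = []
    for t in treated:
        best = min(available, key=lambda c: abs(key(c) - key(t)))
        available.remove(best)  # match without replacement
        pairs.append((t, best))
    return pairs

# Hypothetical individuals, with age as the confounding variable.
treated = [{"id": "T1", "age": 34}, {"id": "T2", "age": 51}]
controls = [{"id": "C1", "age": 50}, {"id": "C2", "age": 29}, {"id": "C3", "age": 36}]
pairs = match_controls(treated, controls, key=lambda person: person["age"])
for t, c in pairs:
    print(t["id"], "matched to", c["id"])
```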
What if you can’t do experiments?
• Example: Do species on islands change
their body size compared to species in
mainland habitats?
• For each island species, identify a closely
related species living on a nearby
mainland area
Power Analysis
• Before carrying out an experiment you
must choose a sample size
• Too small: no chance to detect treatment
effect
• Too large: too expensive
• We can use power analysis to choose our
sample size
Power Analysis
• Example: confidence interval
• For a two-sample t-test, the approximate
width of a 95% confidence interval for the
difference in means is:
$$\text{precision} = 4\sigma\frac{\sqrt{2}}{\sqrt{n}}$$
(assuming that the data are a random
sample from a normal distribution)
Power Analysis
• Example: confidence interval
• The sample size needed for a particular
level of precision is:
$$n = 32\left(\frac{\sigma}{\text{precision}}\right)^2$$
Power Analysis
• Assume that the standard deviation of exam scores for a class is 10.
I want to compare scores between two lab sections. A. How many
exams do I need to mark to obtain a confidence interval for the
difference in mean exam scores between the two sections that has a
width (precision) of 5?
$$n = 32\left(\frac{\sigma}{\text{precision}}\right)^2 = 32\left(\frac{10}{5}\right)^2 = 128$$
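The sample size calculation for a target precision can be wrapped in a small helper. The function name `n_for_precision` is illustrative. Rounding up to a whole exam is an added convention, and n is assumed to be the sample size in each of the two groups.

```python
import math

def n_for_precision(sigma, precision):
    """Sample size for a 95% CI of the given width: n = 32 * (sigma / precision)^2."""
    return math.ceil(32 * (sigma / precision) ** 2)

# Exam example: sigma = 10, desired width (precision) = 5.
print(n_for_precision(10, 5))
```

Halving the desired width quadruples the required sample size, because precision enters the formula squared.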
Power Analysis
• Example: power
• Remember, power = 1 - β
• β = Pr[Type II error]
• Typical goal is power = 0.80
• For a two-sample t-test, the sample size
needed for a power of 80% to detect a
difference of D is:
$$n = 16\left(\frac{\sigma}{D}\right)^2$$
Power Analysis
• Assume that the standard deviation of exam scores for a class is 10.
I want to compare scores between two lab sections. B. How many
exams do I need to mark to have sufficient power (80%) to detect a
mean difference of 10 points between the sections?
$$n = 16\left(\frac{\sigma}{D}\right)^2 = 16\left(\frac{10}{10}\right)^2 = 16$$
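The power calculation can be sketched the same way. The helper name `n_for_power_80` is illustrative. It applies only to the 80% power rule of thumb for a two-sample t-test, rounds up to a whole exam as an added convention, and assumes n is the sample size in each group.

```python
import math

def n_for_power_80(sigma, difference):
    """Sample size for 80% power to detect a difference D: n = 16 * (sigma / D)^2."""
    return math.ceil(16 * (sigma / difference) ** 2)

# Exam example: sigma = 10, with smaller and smaller differences to detect.
for d in (10, 5, 2):
    print(d, n_for_power_80(10, d))
```

Smaller differences are much more expensive to detect: halving D quadruples the required sample size.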