
Lecture 12

Analysis of Variance
 Review and Preview

 One-Way ANOVA

 Two-Way ANOVA

SIS 1037Y 2020 - 2021 2


 We have looked at methods for comparing
the means from two independent samples.



 Now we look to test the equality of three or
more means by using the method of one-way
analysis of variance (ANOVA).



One-Way ANOVA


• This part introduces the method of one-way analysis of variance, which is used to test the hypothesis that three or more population means are all equal.
• Because the calculations are demanding, we emphasize the interpretation of results obtained by using technology.



 Understand that a small P-value (such
as 0.05 or less) leads to rejection of
the null hypothesis of equal means.
With a large P-value (such as greater
than 0.05), fail to reject the null
hypothesis of equal means.
 Develop an understanding of the
underlying rationale by studying the
examples in this part.



Definition:
One-way analysis of variance (ANOVA) is a
method of testing the equality of three or more
population means by analyzing sample
variances.
One-way analysis of variance is used with data
categorized with one factor (or treatment),
which is a characteristic that allows us to
distinguish the different populations from one
another.



• In real life, comparisons rarely involve just two groups
• Running many two-sample t-tests is problematic
◦ Each additional test increases the risk of a Type I error
◦ At the .05 level of significance, with 100 comparisons of groups that do not actually differ, about 5 are expected to show a difference by chance alone
◦ So the more t-tests we run, the greater the risk of a Type I error (rejecting the null hypothesis when there is no difference)
• ANOVA allows us to see if there are differences between means with a single test
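
To make the inflation of the Type I error rate concrete, the short Python sketch below computes the chance of at least one false rejection when m tests are each run at level α. It assumes the tests are independent, which pairwise t-tests on shared groups are not exactly, so the numbers are illustrative only. The function name is hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one Type I error in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

# How quickly the overall error rate grows with the number of tests.
for m in (1, 3, 10, 100):
    print(m, round(familywise_error_rate(0.05, m), 3))
```

With three tests at the .05 level the overall rate is already about 14%, which is the motivation for a single ANOVA test.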



 Data must be experimental
 If we do not have access to statistical
software, an ANOVA can be computed by
hand
 With many experimental designs, the sample
sizes must be equal for the various factor
level combinations
 A regression analysis will accomplish the
same goal as an ANOVA.
 ANOVA formulas change from one
experimental design to another



• A representation of the spread of scores
• What contributes to differences in scores?
◦ Individual differences
◦ Which group you are in
• Variance can be separated into two major components
◦ Within groups – variability or differences in particular groups (individual differences)
◦ Between groups – differences depending on what group one is in or what treatment is received



• It was of interest to investigate the effect of a student’s grade in O Level Maths on their final examination mark in SIS 1037Y.

[Figure: final mark in SIS 1037Y (axis from 20 to 80) by O Level grade (A, B, C). Annotations: spreads similar; A centred highest, then B and C.]

• We have more than one group and we wish to compare their means.
 We are applying the variance concept to
means
◦ How do means of different groups compare to the
overall mean
 Do the means vary so greatly from each other
that they exceed individual differences within
the groups?



 We are able to compare MULTIPLE means
 Null hypothesis: no difference in means
 Alternative hypothesis: difference in means



Here we would almost certainly reject the null hypothesis.



Here we would fail to reject the null hypothesis.



 We are comparing “variance estimates”
◦ Variance = SS/df
 The idea is to partition the variance into
between and within group variance
◦ Between-group variance reflects
differences in the way the groups were
treated
◦ Within-group variance reflects individual
differences
 How does the between group variance
compare with the within group variance?



 We are examining the ratio of differences
(variances) from treatment to variances from
individual differences
 If the ratio is large there is a significant
impact from treatment.
 We know if a ratio is “large enough” by
calculating the ratio of the MST to MSE and
conducting an F test.
◦ Mean squares (MS) are measures of variance,
◦ T = Treatment
◦ E = Error



 One-factor completely randomized designs
Total SS = Treatment SS + Error SS
SS(Total) = SST + SSE
 Randomized Block Designs
Total SS = Treatment SS + Block SS + Error SS
SS(Total) = SST + SSB + SSE
 Two-Factor Factorial Experiments
Total SS = Main effect SS Factor A + Main effect SS
Factor B + AB Interaction SS + Error SS
SS(Total) = SS(A) + SS(B) + SS(AB) + SSE



 Motivating Example
 Analysis of Variance
 Model & Assumptions
 Data Estimates of the Model
 Analysis of Variance
 Multiple Comparisons
 Checking Assumptions
 One-way ANOVA Transformations

• Analysis of Variance is a widely used statistical technique that partitions the total variability in our data into components of variability that are used to test hypotheses.

• In One-way ANOVA, we wish to test the hypothesis:
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
against:
$H_a$: Not all population means are the same



• In ANOVA, we compare the between-group variation with the within-group variation to assess whether there is a difference in the population means.

• Thus by comparing these two measures of variance (spread) with one another, we are able to detect if there are true differences among the underlying group population means.



• If the variation between the sample means is large, relative to the variation within the samples, then we would be likely to detect significant differences among the population means.



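Analysis of Variance
Between-group variation is large compared to the within-group variation. If we sampled from these populations, we would expect to reject H0.

[Figure: three population distributions with well-separated means $\mu_1$, $\mu_2$, $\mu_3$. Annotations: $\tau_2 = \mu_2 - \mu$ (effect of group 2) and $\varepsilon_{3j} = y_{3j} - \mu_3$ (error for observation j in group 3).]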

 If the variation between the sample means is
small, relative to the variation within the
samples, then there would be considerable
overlap of observations in the different
samples, and we would be unlikely to detect
any differences among the population means.



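Analysis of Variance
If we sampled from these populations, we would not expect to reject H0.

[Figure: three population distributions with a common mean, $\mu = \mu_1 = \mu_2 = \mu_3$, so all effects $\tau_i = \mu_i - \mu = 0$. Annotation: $\varepsilon_{2j} = y_{2j} - \mu_2$.]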
• If we consider all of the data together, regardless of which sample the observation belongs to, we can measure the overall total variability in the data by:
$$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$
• This is the Total Sum of Squares (SSTotal).
• We have k groups and group i has $n_i$ elements, so there are $N = n_1 + \dots + n_k$ observations in total.
• If we divide this sum of squares by its degrees of freedom (N - 1), we will have a measure of variance.
Analysis of Variance
• Now, the deviation of every observation from the overall (grand) mean can be partitioned as:
$$(x_{ij} - \bar{x}) = (\bar{x}_i - \bar{x}) + (x_{ij} - \bar{x}_i)$$
• Squaring and summing across all observations (the cross-product terms sum to zero), we get:
$$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{x}_i - \bar{x})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$
• The first term on the right is the Treatment Sum of Squares (SSTreat), or Between Group Sum of Squares. It measures variation due to the fact that different treatments are used.
• The second term on the right is the Error Sum of Squares (SSError), or Within Group Sum of Squares. It measures error variation, the variation in response when the same treatment is applied.
• This equation is the ANOVA Identity:
◦ SSTotal = SSB + SSW, where SSB = SSTreat (between groups) and SSW = SSError (within groups)
• This partitions the Total Sum of Squares into two components of interest for our hypothesis.
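
The identity can be checked numerically. The Python sketch below computes the three sums of squares directly from their definitions and confirms that the between and within parts add up to the total. The data values and the function name are hypothetical, chosen only for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def sums_of_squares(groups):
    """Return (ss_total, ss_between, ss_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Total: every observation against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each group mean against the grand mean, once per observation.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: every observation against its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within

# Hypothetical marks for three groups of unequal size.
groups = [[62, 70, 68, 75], [55, 60, 58], [48, 52, 50, 45, 55]]
ss_total, ss_between, ss_within = sums_of_squares(groups)
print(ss_total, ss_between + ss_within)
```

The two printed numbers agree up to floating-point rounding, whatever data are supplied.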



• To convert Sums of Squares (SS) into comparable measures of variance, we need to divide the SS by their respective degrees of freedom.

• This gives us mean squares (MS) which are measures of variance:
MSTreat = SSTreat / dfTreat = SSTreat / (k – 1)
MSError = SSError / dfError = SSError / (N – k)



Test Statistic for One-Way ANOVA

$$F = \frac{\text{variance between samples}}{\text{variance within samples}}$$



1. The F distribution is not symmetric; it is
skewed to the right.
2. The values of F cannot be negative.
3. The exact shape of the F distribution
depends on the two different degrees of
freedom.

• Our test statistic is the F-ratio (or F-statistic) which compares these two mean squares:
$$F_0 = \frac{MS_{Treat}}{MS_{Error}}$$

Note that the greater the natural variability within the groups, the larger the effects ($\tau_i = \mu_i - \mu$) will need to be (as estimated by MSTreat) for us to detect any significant differences.



• Traditionally the Analysis of Variance calculations have been presented in an ANOVA Table.
• The format of the table is:

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F-Ratio           P-value
Treatment             k – 1                SSTreat          MSTreat       MSTreat/MSError   Tail Area
Error                 N – k                SSError          MSError
Total                 N – 1                SSTotal

The Degrees of Freedom and Sum of Squares columns each add up to the Total row; each Mean Square is SS/df.



F0 = 307.32/57.68 = 5.42
• A large F-statistic provides evidence against H0 while a small F-statistic indicates that the data and H0 are compatible.

• To calculate a P-value to test H0, we compare the F-statistic we obtained from our data to the distribution it would have under a true H0, i.e. an F-distribution with (k - 1) and (N – k) degrees of freedom.

• Note that F0 cannot be negative and only large values count as evidence against H0, so this is always a one-tailed (right-tailed) test.



Analysis of Variance
When H0 is true, F0 ~ F(df1, df2).

For example, consider the F-distribution with 4 and 30 df. Let’s say our observed value for F was F0 = 2.5. Then the P-value = 0.06.

[Figure: density curve of the F-distribution with 4 and 30 df, plotted for values from 0 to 4; the P-value is the area to the right of F0 = 2.5.]
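
The tail area in this example can be approximated without statistical software. The sketch below is one possible approach, not the method used to produce the slide: it integrates the F density numerically with Simpson's rule and subtracts the result from 1. The function names are hypothetical, and the code assumes the numerator degrees of freedom are at least 2 so that the density is finite at zero.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom (d1 >= 2)."""
    if x == 0:
        return 1.0 if d1 == 2 else 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (0.5 * d1 * math.log(d1 * x) + 0.5 * d2 * math.log(d2)
               - 0.5 * (d1 + d2) * math.log(d1 * x + d2)
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

def f_p_value(f0, d1, d2, steps=2000):
    """Right-tail area beyond f0, using Simpson's rule on [0, f0] (steps must be even)."""
    h = f0 / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f0, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

p = f_p_value(2.5, 4, 30)
print(round(p, 2))
```

For F0 = 2.5 with 4 and 30 df this gives a value that rounds to 0.06, in agreement with the P-value quoted above.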


Analysis of Variance
> summary.1way(fit1)
ANOVA Table:
               Df Sum Squares Mean Square F-statistic p-value
Between Groups  2  1787.90127   893.95064   224.77131       0
Within Groups  87   346.0126      3.97716
Total          89  2133.91387

F0 = 893.95/3.977 = 224.77: very strong evidence against H0.

Numeric Summary:
         Sample size     Mean   Median  Std Dev Midspread
All Data          90 50.03461 49.90676  4.89659   8.15498
1                 30 44.60219 44.85629  2.21488   3.28144
2                 30 49.98225 49.63903  1.95331   2.41903
3                 30 55.51939 55.58918  1.79175   2.42022

Table of Effects: (GrandMean and deviations from GM)
 typ.val        1        2       3
50.03461 -5.43242 -0.05236 5.48478
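
As a check on how the pieces of this output fit together, the Python sketch below rebuilds the ANOVA table from the group sizes, means and standard deviations in the Numeric Summary alone. It is only an arithmetic illustration and makes no claim about how `summary.1way` works internally; the variable names are arbitrary.

```python
# Group summaries copied from the fit1 Numeric Summary: (sample size, mean, standard deviation).
groups = [
    (30, 44.60219, 2.21488),
    (30, 49.98225, 1.95331),
    (30, 55.51939, 1.79175),
]

k = len(groups)
n_total = sum(n for n, _, _ in groups)
grand_mean = sum(n * m for n, m, _ in groups) / n_total

# Between-groups SS: squared deviation of each group mean from the grand mean, weighted by n.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
# Within-groups SS: each sample variance times its degrees of freedom (n - 1).
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f0 = ms_between / ms_within

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"MS between = {ms_between:.2f}, MS within = {ms_within:.3f}, F0 = {f0:.1f}")
```

Up to rounding of the summary statistics, the results match the Sum Squares, Mean Square and F-statistic columns of the table.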
Analysis of Variance
Variation between groups is large compared to the variation within groups

[Figure: plot of `y1` by levels of `group` (1, 2, 3), with TUKEY intervals (95%, pooled SDs); y axis from 0 to 100.]
Analysis of Variance
> summary.1way(fit2)
ANOVA Table:
               Df Sum Squares Mean Square F-statistic p-value
Between Groups  2  1446.60015   723.30007     1.95683 0.14748
Within Groups  87 32157.7458    369.62926
Total          89 33604.34595

No evidence against H0.

Numeric Summary:
         Sample size     Mean   Median  Std Dev Midspread
All Data          90 49.98675 49.70776 19.43134  24.86226
1                 30 45.08567 42.91017 19.83452  23.30580
2                 30 49.96857 55.10369 19.79035  26.50691
3                 30 54.90600 52.83934 17.99504  26.97609

Table of Effects: (GrandMean and deviations from GM)
 typ.val        1        2       3
49.98675 -4.90108 -0.01817 4.91926

Means and effects are similar to the previous example.
Analysis of Variance
Variation between groups is small compared to the variation within groups

[Figure: plot of `y2` by levels of `group` (1, 2, 3), with TUKEY intervals (95%, pooled SDs); y axis from 0 to 100.]
• A significant F-test tells us that at least two of the underlying population means are different, but it does not tell us which ones differ from the others.
• We need extra tests to compare all the means, which we call Multiple Comparisons.
• We look at the difference between every pair of group population means, as well as the confidence interval for each difference.
• When we have k groups, there are:
$$\binom{k}{2} = \frac{k!}{2!\,(k-2)!} = \frac{k(k-1)}{2}$$
possible pair-wise comparisons.
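
A quick way to see the count is to list the pairs. The sketch below uses three group labels borrowed from the blood lead example later in the lecture and checks the listing against the formula; `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb

groups = ["low", "medium", "high"]   # illustrative labels
pairs = list(combinations(groups, 2))
print(pairs)

# The number of pairs agrees with k choose 2 = k(k - 1)/2.
k = len(groups)
print(len(pairs), comb(k, 2), k * (k - 1) // 2)
```

With five groups the same formula gives 10 comparisons, so the number of pairwise tests grows quickly with k.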



• If we estimate each comparison separately with 95%
confidence, the overall error rate will be greater
than 5%.
• So, using ordinary pair-wise comparisons (i.e. lots
of individual t-tests), we tend to find too many
significant differences between our sample means.
• We need to modify our intervals so that they
simultaneously contain the true differences with 95%
confidence across the entire set of comparisons.
• The modified intervals are known as:
simultaneous confidence intervals OR
multiple comparison procedures
Requirements
1. The populations have approximately normal
distributions.
2. The populations have the same variance σ²
(or standard deviation σ).
3. The samples are simple random samples of
quantitative data.
4. The samples are independent of each other.
5. The different samples are from populations
that are categorized in only one way.



To test $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$

1. Use technology to obtain results.
2. Identify the P-value from the display.
3. Form a conclusion based on these criteria:
If the P-value ≤ α, reject the null hypothesis of equal means. Conclude at least one mean is different from the others.
If the P-value > α, fail to reject the null hypothesis of equal means.



When we conclude that there is sufficient
evidence to reject the claim of equal
population means, we cannot conclude from
ANOVA that any particular mean is different
from the others.
There are several other tests that can be used
to identify the specific means that are
different, and some of them are discussed in
Part 2 of this section.



Example
Use the performance IQ scores of children listed in
the following table and a significance level of α =
0.05 to test the claim that the three samples come
from populations with means that are all equal.



Example - Continued

Here are summary statistics from the collected data:



Example - Continued
Requirement Check:
1. The three samples appear to come from
populations that are approximately normal
(normal quantile plots OK).
2. The three samples have standard deviations that
are not dramatically different.
3. We can treat the samples as simple random
samples.
4. The samples are independent of each other and
the IQ scores are not matched in any way.
5. The three samples are categorized according to a single factor: low lead, medium lead, and high lead.
Example - Continued
The hypotheses are:

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one of the means is different from the others.

The significance level is α = 0.05.

Technology results are presented on the next slides.

Example - Continued

The displays all show that the P-value is 0.020 when rounded.
Because the P-value is less than the significance level of α = 0.05, we reject the null hypothesis.
There is sufficient evidence that the three samples come from populations with means that are not all equal.
We cannot conclude formally that any particular mean is different from the others, but it appears that greater blood lead levels are associated with lower performance IQ scores.



P-Value and Test Statistic
Larger values of the test statistic result in smaller P-
values, so the ANOVA test is right-tailed.
The figure on the next slide shows the relationship
between the F test statistic and the P-value.
Assuming that the populations have the same
variance σ² (as required for the test), the F test
statistic is the ratio of these two estimates of σ²:
(1) variation between samples (based on variation
among sample means)
(2) variation within samples (based on the sample
variances).



Relationship Between F Test Statistic and P-Value



Caution

When testing for equality of three or more population means, use analysis of variance.
Do not use multiple hypothesis tests with two samples at a time.

Designing the Experiment
When performing ANOVA, we use one factor as the
basis for partitioning the data into several categories.
If we conclude that there is a significant difference
among means, we can’t be absolutely certain the
differences are explained by the factor being used.
One way to reduce the effect of extraneous factors is
to run a completely randomized design, in which
each sample value is given the same chance of
belonging to different factor groups.
Another way to reduce the effect of extraneous
factors is to use a rigorously controlled design, in
which sample values are carefully chosen so that all
other factors have no variability.
Identifying Which Means Are Different
After conducting ANOVA, there are several informal
methods for determining which means are different:
• Construct boxplots of the different samples to see
if one or more of them is very different from the
others.
• Construct confidence interval estimates of the
means for the different samples, then compare
those confidence intervals to see if one or more of
them does not overlap with the others.



Identifying Which Means Are Different
There are several formal procedures to determine
which means are different.
• Range tests allow us to identify subsets of means that are not significantly different from each other.
• Multiple comparison tests use pairs of means, making adjustments to overcome the problem of having a significance level that increases as the number of tests increases.
• There are many multiple comparison tests; we introduce just one: the Bonferroni Multiple Comparison Test.



Bonferroni Multiple Comparison Test
Step 1: Do a separate t test for each pair of samples,
but make the following adjustments:
Step 2: Use the value of MS(error), which uses all available sample data, as an estimate for the variance σ². This value is obtained when conducting ANOVA.
Calculate the test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{MS(\text{error}) \cdot \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$



Bonferroni Multiple Comparison Test
Step 3: Make the following adjustments:
P-value: Use the test statistic t with df = N – k, where
N is the total number of sample values and k
is the number of samples. Find the P-value
as per usual, but adjust the P-value by
multiplying it by the number of different
possible pairings of two samples (e.g. with
three samples, there are three possible
pairings, so multiply by 3).
Critical Value: When finding the critical value, adjust
α by dividing it by the number of possible
pairings of two samples.



Example

We previously concluded, for the IQ score test, that there is sufficient evidence to warrant rejection of the claim of equal means.
Use the Bonferroni test with a 0.05 level of significance to identify which mean is different from the others.



Example - Continued
The null hypotheses to be tested are:

$H_0: \mu_1 = \mu_2$    $H_0: \mu_1 = \mu_3$    $H_0: \mu_2 = \mu_3$

We begin with the first, and using the sample data presented earlier, arrive at (using technology):
$n_1 = 78$, $\bar{x}_1 = 102.705128$
$n_2 = 22$, $\bar{x}_2 = 94.136364$
MS(error) = 248.424127



Example - Continued
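The test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{MS(\text{error}) \cdot \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{102.705128 - 94.136364}{\sqrt{248.424127 \left( \frac{1}{78} + \frac{1}{22} \right)}} = 2.252$$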



Example - Continued
The test statistic of t = 2.252 has N – k = 121 – 3 = 118 df.
The two-tailed P-value is 0.026172, but it needs to be multiplied by 3 because there are three possible pairings of sample means.
Thus, the P-value = 0.079 rounded.
Because this adjusted P-value is greater than 0.05, we fail to reject the null hypothesis.
It appears that Samples 1 and 2 do not have significantly different means.
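
The arithmetic of this comparison can be reproduced in a few lines of Python. This is a sketch of the calculation only: the function name is hypothetical, and the unadjusted two-tailed P-value is copied from the text rather than computed, since the t distribution is not available in the standard library.

```python
from math import comb, sqrt

def pooled_t(mean1, n1, mean2, n2, ms_error):
    """t statistic for one pair of samples, using MS(error) as the variance estimate."""
    return (mean1 - mean2) / sqrt(ms_error * (1 / n1 + 1 / n2))

k = 3                    # number of samples
n_pairs = comb(k, 2)     # number of possible pairings
alpha = 0.05

t = pooled_t(102.705128, 78, 94.136364, 22, 248.424127)

# Unadjusted two-tailed P-value taken from the example (df = 118), not computed here.
p_unadjusted = 0.026172
p_bonferroni = min(1.0, n_pairs * p_unadjusted)

print(round(t, 3), round(p_bonferroni, 3), p_bonferroni <= alpha)
```

This reproduces t = 2.252 and an adjusted P-value of about 0.079, so the null hypothesis of equal means for Samples 1 and 2 is not rejected. Equivalently, the unadjusted P-value could be compared with α divided by the number of pairings.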



Example - Continued
Technology can display Bonferroni test results:



Example - Continued

Conclusion:
Although the ANOVA test tells us that at
least one of the sample means is
different from the others, the
Bonferroni test results do not identify
any one particular sample mean that is
significantly different from the others.



Two-Way ANOVA


Key Concepts
We introduce the method of two-way analysis
of variance, which is used with data
partitioned into categories according to two
factors.
The methods of this section require that we
begin by testing for an interaction between
the two factors.
Then we test whether the row or column
factors have effects.



The data in the table are categorized with two
factors:
1. Gender: Male or Female
2. Blood Lead Level: Low, Medium, or High
The subcategories are called cells, and the
response variable is IQ score.
The data are presented on the next slide:

There is an interaction between two factors if
the effect of one of the factors changes for
different categories of the other factor.



Let us explore the IQ data in the table by calculating the mean
for each cell and constructing an interaction graph.



An interaction effect is suggested if the line segments are far from being parallel.
No interaction effect is suggested if the line segments are approximately parallel.
For the IQ scores, it appears there is an interaction effect:
• Females with high lead exposure appear to have lower IQ scores, while males with high lead exposure appear to have higher IQ scores.

1. For each cell, the sample values come from a
population with a distribution that is
approximately normal.
2. The populations have the same variance σ².
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The sample values are categorized two ways.
6. All of the cells have the same number of sample
values (a balanced design – this section does
not include methods for a design that is not
balanced).
Two-Way ANOVA calculations are quite involved, so it will be assumed that a software package or programming language is being used.

Minitab, Excel, StatCrunch, STATDISK, or other software can be used.



Step 1: Interaction Effect - test the null hypothesis that there is no interaction

Step 2: Row/Column Effects - if we conclude there is no interaction effect, proceed with these two hypothesis tests
Row Factor: no effects from row
Column Factor: no effects from column

All tests use the F distribution and it is assumed technology will be used.
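
For readers who want to see what the software does in the balanced case, here is a minimal Python sketch of the three F statistics. It follows the standard two-factor partition SS(Total) = SS(A) + SS(B) + SS(AB) + SSE and assumes every cell has the same number of observations. The function name and the data are hypothetical, and P-values are left to statistical software.

```python
def mean(values):
    return sum(values) / len(values)

def two_way_anova(cells):
    """cells[i][j] holds the observations for row level i and column level j.
    Assumes a balanced design (same number of observations in every cell)."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "rows": b * n * sum((m - grand) ** 2 for m in row_means),
        "columns": a * n * sum((m - grand) ** 2 for m in col_means),
        "interaction": n * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(a) for j in range(b)),
        "error": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for x in cells[i][j]),
    }
    df = {"rows": a - 1, "columns": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_error = ss["error"] / df["error"]
    # Each F statistic is the effect's mean square divided by MS(error).
    f = {name: (ss[name] / df[name]) / ms_error
         for name in ("interaction", "rows", "columns")}
    return ss, f

# Hypothetical scores: 2 row levels x 3 column levels, 3 observations per cell.
cells = [
    [[85, 90, 88], [92, 95, 91], [80, 84, 79]],
    [[83, 87, 90], [94, 90, 93], [78, 85, 82]],
]
ss, f_stats = two_way_anova(cells)
for effect, value in f_stats.items():
    print(effect, round(value, 2))
```

The interaction statistic is listed first because it is the one to examine first; the row and column statistics are interpreted only if no interaction is found.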

Procedure for
Two-Way ANOVA



Given the performance IQ scores in the table at the
beginning of this section, use two-way ANOVA to
test for an interaction effect, an effect from the row
factor of gender, and an effect from the column
factor of blood lead level.

Use a 0.05 level of significance.



Requirement Check:

1. For each cell, the sample values appear to be from a normally distributed population.
2. The variances of the cells are 95.3, 146.7, 130.8, 812.7,
142.3, and 143.8, which are considerably different from
each other. We might have some reservations that the
population variances are equal – but for the purposes of
this example, we will assume the requirement is met.
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The sample values are categorized in two ways.
6. All the cells have the same number of values.



An output is displayed below:



Step 1: Test that there is no interaction between the two factors.

The test statistic is F = 0.43 and the P-value is 0.655, so we fail to reject the null hypothesis.

It does not appear that the performance IQ scores are affected by an interaction between gender and blood lead level.

There does not appear to be an interaction effect, so we proceed to test for row and column effects.



Step 2: We now test:
H0: There are no effects from the row factor (gender).
H0: There are no effects from the column factor (blood lead level).

For the row factor, F = 0.07 and the P-value is 0.791. We fail to reject the null hypothesis; there is no evidence that IQ scores are affected by the gender of the subject.

For the column factor, F = 0.10 and the P-value is 0.906. We fail to reject the null hypothesis; there is no evidence that IQ scores are affected by the level of lead exposure.



Interpretation:

Based on the sample data, we conclude that IQ scores do not appear to be affected by gender or blood lead level.



Two-way analysis of variance is not one-way
analysis of variance done twice.

Be sure to test for an interaction between the two factors.



Special Case:
One Observation per Cell and
No Interaction
If our sample data consist of only one observation per cell, there is no variation within individual cells and sample variances cannot be calculated for individual cells.

If it seems reasonable to assume there is no interaction between the two factors, make that assumption and test separately:
H0: There are no effects from the row factor.
H0: There are no effects from the column factor.

(The test details are the same as presented earlier.)


Comments?
