Chapter 11
Comparing the Means of Many Independent Samples
11.1 Introduction
• Chapter 7: Compare the means of 2 populations using 2 independent samples.
  $H_0: \mu_1 = \mu_2$
• Chapter 11: Compare 2 or more (in general $I$) population means using $I$ independent samples.
  $H_0: \mu_1 = \mu_2 = \dots = \mu_I$
Example 11.1 (p463)

                     Treatment
          1       2       3       4       5
        16.5    11       8.5    16      13
        15      15      13      14.5    10.5
         ⋮       ⋮       ⋮       ⋮       ⋮
        10.5     5       9       8.5    11
Mean    11.5     9.6    10.3    11.1    12.3
SD       3.5     2.4     2.0     3.1     2.9
n       12      12      12      12      12
• Note the following:
  ◦ The sample means differ from each other.
  ◦ There is considerable variation within each group.
  ◦ Also look at Figure 11.1 (p464).
• Notation to be used:
  ◦ Population means: $\mu_1, \mu_2, \dots, \mu_I$
  ◦ Population standard deviations: $\sigma_1, \sigma_2, \dots, \sigma_I$
• Q: How do we test
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
  $H_A$: The $\mu_i$'s are not all equal
  A: Use the method called Analysis of Variance (ANOVA) to test this hypothesis.
• Q: Why not repeated t tests?
  A: There are three reasons:
  1. $P(\text{type I error}) = P(\text{Reject } H_0 \mid H_0 \text{ true})$.
     This probability increases as the number of repeated t tests increases.
  2. The ANOVA technique combines the information on variability from all the samples simultaneously and therefore increases the precision of the analysis.
  3. The structure of the treatment groups makes the t tests ineffective.
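To get a feel for the first reason, the following is a minimal sketch of how the overall type I error risk grows when all pairs of the five treatments of Example 11.1 are compared with separate t tests. It assumes that the tests are independent, which is only an approximation because the pairs share data, so the number should be read as a rough indication.

```python
from math import comb

alpha = 0.05   # significance level used for each individual t test
groups = 5     # number of treatment groups in Example 11.1

# Number of distinct pairs of groups, i.e. the number of pairwise t tests.
n_tests = comb(groups, 2)

# Assumption: the tests are independent (an approximation only).
# Probability of at least one false rejection among all the tests.
familywise_error = 1 - (1 - alpha) ** n_tests

print(n_tests)                     # 10
print(round(familywise_error, 3))  # 0.401
```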
A Graphical Perspective on ANOVA
• This is one of the first steps in any analysis. (Use a graph to see what is going on.)
• We will consider 2 sources of variability:
  ◦ between the group sample means
  ◦ within the groups
11.2 The Basic Analysis of Variance
• Before we can use the ANOVA procedure to test the hypothesis
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
  we first need to do a few calculations that describe the variability between as well as within the different groups.
• Notation:
  ◦ $y_{ij}$ denotes observation $j$ in group $i$, with $i = 1, \dots, I$ and $j = 1, \dots, n_i$
  ◦ $n_i$ is the sample size of group $i$
  ◦ $I$ is the number of groups
  ◦ $\bar{y}_{i\cdot} = \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i}$ is the mean for group $i$
  ◦ $n_\cdot = n_1 + \dots + n_I$ is the total sample size
  ◦ $\bar{y}_{\cdot\cdot} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij}}{n_\cdot}$ is the grand mean
Example 11.2: (p11.2)

                          Diet 1            Diet 2            Diet 3
                          $y_{11} = 8$      $y_{21} = 9$      $y_{31} = 15$
                          $y_{12} = 16$     $y_{22} = 16$     $y_{32} = 10$
                          $y_{13} = 9$      $y_{23} = 21$     $y_{33} = 17$
                                            $y_{24} = 11$     $y_{34} = 6$
                                            $y_{25} = 18$
$n_i$                     $n_1 = 3$         $n_2 = 5$         $n_3 = 4$
Sum $\sum_j y_{ij}$       33                75                48
Mean $\bar{y}_{i\cdot}$   33/3 = 11         75/5 = 15         48/4 = 12

The overall mean:
$\bar{y}_{\cdot\cdot} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij}}{n_\cdot} = \frac{33 + 75 + 48}{3 + 5 + 4} = 13$
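The group means and the grand mean of Example 11.2 can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names are chosen here and are not part of the textbook.

```python
# Observations of Example 11.2, one list per diet.
diets = {
    "Diet 1": [8, 16, 9],
    "Diet 2": [9, 16, 21, 11, 18],
    "Diet 3": [15, 10, 17, 6],
}

# Mean of each group: sum of the group's observations divided by n_i.
group_means = {name: sum(ys) / len(ys) for name, ys in diets.items()}

# Grand mean: sum of all observations divided by the total sample size.
n_total = sum(len(ys) for ys in diets.values())
grand_mean = sum(sum(ys) for ys in diets.values()) / n_total

print(group_means)  # {'Diet 1': 11.0, 'Diet 2': 15.0, 'Diet 3': 12.0}
print(grand_mean)   # 13.0
```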
Variation Within Groups
• Notation:
  ◦ $SS_{within} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$, the sum of squares within groups
  ◦ $df_{within} = n_\cdot - I = (n_1 - 1) + \dots + (n_I - 1)$, the within-groups degrees of freedom
  ◦ $MS_{within} = \frac{SS_{within}}{df_{within}}$, the mean square within groups
  ◦ $s_{pooled} = \sqrt{MS_{within}}$, the pooled standard deviation
Example 11.3 & 11.4: (p469 &471)
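                                                     Diet 1   Diet 2   Diet 3
                                                       8        9       15
                                                      16       16       10
                                                       9       21       17
                                                               11        6
                                                               18
$n_i$                                                  3        5        4
Mean                                                  11       15       12
$\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$      38       98       74

$\sum_{j=1}^{3} (y_{1j} - \bar{y}_{1\cdot})^2 = (8 - 11)^2 + \dots + (9 - 11)^2 = 38$
$\sum_{j=1}^{5} (y_{2j} - \bar{y}_{2\cdot})^2 = (9 - 15)^2 + \dots + (18 - 15)^2 = 98$
$\sum_{j=1}^{4} (y_{3j} - \bar{y}_{3\cdot})^2 = (15 - 12)^2 + \dots + (6 - 12)^2 = 74$
• $SS_{within} = 38 + 98 + 74 = 210$
• $df_{within} = 12 - 3 = 9$
• $MS_{within} = \frac{210}{9} = 23.333$
• $s_{pooled} = \sqrt{\frac{210}{9}} = 4.83$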
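The within-group calculations of Examples 11.3 and 11.4 can be checked with the short sketch below. The helper name `sum_sq_dev` is made up for this illustration.

```python
from math import sqrt

# Observations of the three diets.
diets = [[8, 16, 9], [9, 16, 21, 11, 18], [15, 10, 17, 6]]

def sum_sq_dev(ys):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

ss_within = sum(sum_sq_dev(ys) for ys in diets)         # 38 + 98 + 74
df_within = sum(len(ys) for ys in diets) - len(diets)   # n. - I
ms_within = ss_within / df_within
s_pooled = sqrt(ms_within)

print(ss_within, df_within)                      # 210.0 9
print(round(ms_within, 3), round(s_pooled, 2))   # 23.333 4.83
```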
More on SSwithin and MSwithin
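$SS_{within} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$
$= \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1\cdot})^2 + \dots + \sum_{j=1}^{n_I} (y_{Ij} - \bar{y}_{I\cdot})^2$
$= \frac{n_1 - 1}{n_1 - 1} \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1\cdot})^2 + \dots + \frac{n_I - 1}{n_I - 1} \sum_{j=1}^{n_I} (y_{Ij} - \bar{y}_{I\cdot})^2$
$= (n_1 - 1) \frac{\sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1\cdot})^2}{n_1 - 1} + \dots + (n_I - 1) \frac{\sum_{j=1}^{n_I} (y_{Ij} - \bar{y}_{I\cdot})^2}{n_I - 1}$
$= (n_1 - 1) s_1^2 + \dots + (n_I - 1) s_I^2$

$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{(n_1 - 1) s_1^2 + \dots + (n_I - 1) s_I^2}{n_\cdot - I} = \frac{(n_1 - 1) s_1^2 + \dots + (n_I - 1) s_I^2}{(n_1 - 1) + \dots + (n_I - 1)}$
• $MS_{within}$ can therefore also be calculated from the individual sample standard deviations, i.e. using $s_1^2, s_2^2, \dots, s_I^2$.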
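As a small check of this result, the sketch below computes $MS_{within}$ for the diet data from the individual sample variances only. It relies on `statistics.variance` from the Python standard library, which uses the divisor $n_i - 1$.

```python
from statistics import variance

# Diet data, written as floats so that the variances are plain floats.
diets = [[8.0, 16.0, 9.0], [9.0, 16.0, 21.0, 11.0, 18.0], [15.0, 10.0, 17.0, 6.0]]

sizes = [len(ys) for ys in diets]
variances = [variance(ys) for ys in diets]   # s_i^2 for each group

# Weighted average of the sample variances with weights n_i - 1.
df_within = sum(n - 1 for n in sizes)
ms_within = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / df_within

print(df_within)            # 9
print(round(ms_within, 3))  # 23.333
```

The result agrees with the value 210/9 obtained directly from the sums of squares.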
Variation Between Groups
• Notation:
  ◦ $SS_{between} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$, the sum of squares between groups
  ◦ $df_{between} = I - 1$, the degrees of freedom between groups
  ◦ $MS_{between} = \frac{SS_{between}}{df_{between}}$, the mean square between groups
Example 11.5 (p11.5)
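              Diet 1                      Diet 2                      Diet 3
$n_i$         $n_1 = 3$                   $n_2 = 5$                   $n_3 = 4$
Mean          $\bar{y}_{1\cdot} = 11$     $\bar{y}_{2\cdot} = 15$     $\bar{y}_{3\cdot} = 12$
Grand mean    $\bar{y}_{\cdot\cdot} = 13$

$SS_{between} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 3(11 - 13)^2 + 5(15 - 13)^2 + 4(12 - 13)^2 = 36$
$df_{between} = I - 1 = 3 - 1 = 2$
$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{36}{2} = 18$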
Fundamental Relationship of ANOVA
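Total Sum of Squares

$(y_{ij} - \bar{y}_{\cdot\cdot}) = (y_{ij} - \bar{y}_{i\cdot}) + (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})$
which leads to
$\sum_i \sum_j (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_i \sum_j (y_{ij} - \bar{y}_{i\cdot})^2 + \sum_i \sum_j (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
$= \sum_i \sum_j (y_{ij} - \bar{y}_{i\cdot})^2 + \sum_i n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
$= SS_{within} + SS_{between}$
(The cross-product term vanishes because $\sum_j (y_{ij} - \bar{y}_{i\cdot}) = 0$ within every group.)
$SS_{total} = SS_{within} + SS_{between}$

Total degrees of freedom

$df_{total} = n_\cdot - 1 = (n_\cdot - I) + (I - 1) = df_{within} + df_{between}$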
Example 11.6: (p474)
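$SS_{total} = \sum_i \sum_j (y_{ij} - \bar{y}_{\cdot\cdot})^2$
$= (8 - 13)^2 + \dots + (9 - 13)^2$
$+ (9 - 13)^2 + \dots + (18 - 13)^2$
$+ (15 - 13)^2 + \dots + (6 - 13)^2$
$= 246$
$SS_{within} = 210$
$SS_{between} = 36$
$df_{total} = n_\cdot - 1 = 12 - 1 = 11$
$df_{within} = 9$
$df_{between} = 2$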
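The fundamental relationship can be verified numerically for the diet data. The sketch below computes the three sums of squares separately and then compares them; it is an illustration only.

```python
# Observations of the three diets.
groups = [[8, 16, 9], [9, 16, 21, 11, 18], [15, 10, 17, 6]]

all_obs = [y for ys in groups for y in ys]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(ys) / len(ys) for ys in groups]

# Total: deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
# Within: deviations of every observation from its own group mean.
ss_within = sum((y - m) ** 2 for ys, m in zip(groups, group_means) for y in ys)
# Between: deviations of the group means from the grand mean, weighted by n_i.
ss_between = sum(len(ys) * (m - grand_mean) ** 2 for ys, m in zip(groups, group_means))

print(ss_total, ss_within, ss_between)   # 246.0 210.0 36.0

# Degrees of freedom add up in the same way.
df_total = len(all_obs) - 1
df_within = len(all_obs) - len(groups)
df_between = len(groups) - 1
print(df_total == df_within + df_between)   # True
```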
The ANOVA table
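• This is a summary of all the formulae and calculations.

Source     df              SS                                                                        MS
Between    $I - 1$         $\sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$           $SS_{between} / df_{between}$
Within     $n_\cdot - I$   $\sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$            $SS_{within} / df_{within}$
Total      $n_\cdot - 1$   $\sum_{i} \sum_{j} (y_{ij} - \bar{y}_{\cdot\cdot})^2$

Source     df    SS     MS
Between     2     36    18
Within      9    210    23.333
Total      11    246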
11.3 The Analysis of Variance Model
Think of the ANOVA in terms of the following statistical model:
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
$\mu$: the grand population mean
$\tau_i$: the effect of group $i$, i.e. the difference between the population mean for group $i$, $\mu_i$, and the grand mean, $\mu$. Therefore, $\tau_i = \mu_i - \mu$.
• If $\tau_i > 0$ (positive): the observations from group $i$ tend to be greater than the overall average.
• If $\tau_i < 0$ (negative): the observations from group $i$ tend to be smaller than the overall average.
$\varepsilon_{ij}$: random error
The following two null hypotheses are equivalent:
$H_0: \mu_1 = \dots = \mu_I$
and
$H_0: \tau_1 = \dots = \tau_I = 0$
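Thus, the statistical model can be stated in words as:
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
observation = overall average + group effect + random error
Parameter estimates:
$\hat{\mu} = \bar{y}_{\cdot\cdot}$
$\hat{\mu}_i = \bar{y}_{i\cdot}$
$\hat{\tau}_i = \hat{\mu}_i - \hat{\mu} = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$
$\hat{\varepsilon}_{ij} = y_{ij} - \bar{y}_{i\cdot}$
Thus,
$y_{ij} = \bar{y}_{\cdot\cdot} + (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) + (y_{ij} - \bar{y}_{i\cdot})$
so that
$y_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\varepsilon}_{ij}$
$SS_{between} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = \sum_{i=1}^{I} n_i \hat{\tau}_i^2$
$SS_{within} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2 = \sum_{i=1}^{I} \sum_{j=1}^{n_i} \hat{\varepsilon}_{ij}^2$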
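To make the parameter estimates concrete, the sketch below applies them to the diet data used in the earlier examples. The names `mu_hat`, `tau_hat` and `residuals` are chosen for this illustration only.

```python
# Observations of the three diets.
diets = [[8, 16, 9], [9, 16, 21, 11, 18], [15, 10, 17, 6]]

n_total = sum(len(ys) for ys in diets)
mu_hat = sum(sum(ys) for ys in diets) / n_total        # estimate of the grand mean
group_means = [sum(ys) / len(ys) for ys in diets]      # estimates of the group means
tau_hat = [m - mu_hat for m in group_means]            # estimated group effects
residuals = [[y - m for y in ys] for ys, m in zip(diets, group_means)]

# The sums of squares written in terms of the estimates.
ss_between = sum(len(ys) * t ** 2 for ys, t in zip(diets, tau_hat))
ss_within = sum(e ** 2 for row in residuals for e in row)

print(tau_hat)                 # [-2.0, 2.0, -1.0]
print(ss_between, ss_within)   # 36.0 210.0
```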
11.4 The Global F Test
In this section we use all the calculations done in the previous sections to test the hypothesis
$H_0: \mu_1 = \dots = \mu_I$
• This is a compound hypothesis.
• Upon rejection of $H_0$ we do not know which means differ from which.
• Further analysis is needed.
The F Distribution $F_{\nu_1, \nu_2}$
Q: What does this distribution look like?
[Figure: density of the F distribution with 9 and 5 degrees of freedom, plotted for x from 0 to 10]
• The distribution is heavily skewed to the right.
• It depends on two parameters:
  ◦ Numerator degrees of freedom
  ◦ Denominator degrees of freedom
• Critical values are found in Table 10.
The F Test
1. $H_0: \mu_1 = \dots = \mu_I$
2. Choose $\alpha = ?$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{between}}{MS_{within}}$
   with
   Numerator df = $df_{between}$
   Denominator df = $df_{within}$
4. Find the p-value and compare it to the chosen $\alpha$-value, or find the critical value from Table 10 and compare it to the calculated test statistic value.
5. Conclusion: Reject or do not reject $H_0$.
Example 11.9: (p479)
1. $H_0: \mu_1 = \mu_2 = \mu_3$
   $H_A$: The $\mu_i$'s are not all equal
2. Choose $\alpha = 0.05$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{between}}{MS_{within}} = \frac{18}{23.333} = 0.77$
   with
   Numerator df = $df_{between} = 2$
   Denominator df = $df_{within} = 9$
4. The p-value > 0.2 and the critical value $F_{2,9,0.05} = 4.26$.
5. Conclusion: Since the p-value > 0.2 > $\alpha = 0.05$, or since $F_s = 0.77 < F_{2,9,0.05} = 4.26$, we do not reject $H_0$.
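The test of Example 11.9 can be sketched in Python as follows. The critical value 4.26 is taken from the example. The p-value uses a closed form that holds only when the numerator degrees of freedom equal 2, namely $P(F > f) = (1 + 2f/\nu_2)^{-\nu_2/2}$; for other cases a table or a statistics library would be needed, so this should be read as an illustration rather than a general method.

```python
ms_between = 36 / 2      # SS_between / df_between
ms_within = 210 / 9      # SS_within / df_within
df_between, df_within = 2, 9

f_stat = ms_between / ms_within

# Assumption: closed-form upper tail probability, valid ONLY for
# numerator df = 2 (as is the case here).
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

critical_value = 4.26    # F_{2,9,0.05} as quoted in the example
reject_h0 = f_stat > critical_value

print(round(f_stat, 2))    # 0.77
print(round(p_value, 2))   # 0.49
print(reject_h0)           # False
```

The computed p-value is consistent with the statement that the p-value exceeds 0.2.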
More ways to calculate F S
$F_s = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between} / df_{between}}{SS_{within} / df_{within}} = \frac{df_{within} \times SS_{between}}{df_{between} \times SS_{within}}$
11.5 Applicability of Methods
The calculations and the interpretations of the ANOVA are based on certain conditions:
1. Design conditions:
   • It should be reasonable to regard the groups of observations as random samples from their respective populations. The observations within each sample must be independent of each other.
   • The $I$ samples must be independent of each other.
2. Population conditions:
   • The $I$ population distributions must be (approximately) normal with equal standard deviations, i.e. $\sigma_1 = \sigma_2 = \dots = \sigma_I$.
11.6 Two-way ANOVA
Analysis of Variance for:
1. Randomized complete block design
2. Two factors
Randomized complete block design
Example 11.12 (p487)
• The effect of different amounts of acid on the growth rate of plants, while at the same time taking into account the differing amounts of sunlight.
                       Low     High    Control   Block Mean
Block 1                1.58    1.10    2.47      1.717
Block 2                1.15    1.05    2.15      1.450
Block 3                1.27    0.50    1.46      1.077
Block 4                1.25    1.00    2.36      1.537
Block 5                1.00    1.50    1.00      1.167
$n_i$                  5       5       5
$\bar{y}_{i\cdot}$     1.25    1.03    1.888
[Figure: dotplots of height (cm) for the 3 treatment groups: low, high, control]
• Hypothesis:
  $H_0: \mu_1 = \mu_2 = \mu_3$, but we need to take into account the differences between blocks.
• Statistical model:
  $y_{ijk} = \mu + \tau_i + \beta_j + \varepsilon_{ijk}$
• Notation:
  $y_{ijk}$: the $k$'th observation when treatment $i$ is applied in block $j$
  $\mu$: grand population mean
  $\tau_i$: effect of group $i$
  $\beta_j$: effect of block $j$
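$SS_{total} = SS_{within} + SS_{treatments} + SS_{blocks}$
Formulae:
Treatments:
$SS_{treatments} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
$df_{treatments} = I - 1$
$MS_{treatments} = \frac{SS_{treatments}}{df_{treatments}}$
Blocks:
$SS_{blocks} = \sum_{j=1}^{B} m_j (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$, where $m_j$ is the number of observations in block $j$
$df_{blocks} = B - 1$
$MS_{blocks} = \frac{SS_{blocks}}{df_{blocks}}$
Total:
$SS_{total} = \sum_{i} \sum_{j} \sum_{k} (y_{ijk} - \bar{y}_{\cdot\cdot})^2$
$df_{total} = n_\cdot - 1$
Within:
$SS_{within} = \sum_{i=1}^{I} \sum_{j=1}^{B} (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2$ (one observation per treatment in each block, so $n_i = B$)
$df_{within} = n_\cdot - I - B + 1$
$MS_{within} = \frac{SS_{within}}{df_{within}}$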
Example 11.13
                       Low     High    Control   $m_j$           $\bar{y}_{\cdot j}$
Block 1                1.58    1.10    2.47      3               1.717
Block 2                1.15    1.05    2.15      3               1.450
Block 3                1.27    0.50    1.46      3               1.077
Block 4                1.25    1.00    2.36      3               1.537
Block 5                1.00    1.50    1.00      3               1.167
$n_i$                  5       5       5         $n_\cdot = 15$
$\bar{y}_{i\cdot}$     1.25    1.03    1.888                     $\bar{y}_{\cdot\cdot} = 1.389$

$SS_{treatments} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 5(1.25 - 1.389)^2 + 5(1.03 - 1.389)^2 + 5(1.888 - 1.389)^2 = 1.986$
$df_{treatments} = I - 1 = 3 - 1 = 2$
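$MS_{treatments} = \frac{SS_{treatments}}{df_{treatments}} = \frac{1.986}{2} = 0.993$
$SS_{blocks} = \sum_{j=1}^{B} m_j (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = 3(1.717 - 1.389)^2 + \dots + 3(1.167 - 1.389)^2 = 0.840$
$df_{blocks} = B - 1 = 5 - 1 = 4$
$MS_{blocks} = \frac{SS_{blocks}}{df_{blocks}} = \frac{0.840}{4} = 0.210$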
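$SS_{total} = \sum_{i} \sum_{j} \sum_{k} (y_{ijk} - \bar{y}_{\cdot\cdot})^2 = (1.58 - 1.389)^2 + (1.15 - 1.389)^2 + \dots + (1.00 - 1.389)^2 = 4.278$
$df_{total} = n_\cdot - 1 = 15 - 1 = 14$
$SS_{within} = SS_{total} - SS_{treatments} - SS_{blocks} = 4.278 - 1.986 - 0.840 = 1.452$
OR
$SS_{within} = \sum_{i} \sum_{j} (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2 = (1.58 - 1.25 - 1.717 + 1.389)^2 + \dots + (1.0 - 1.888 - 1.167 + 1.389)^2 = 1.452$
$df_{within} = df_{total} - df_{treatments} - df_{blocks} = 14 - 2 - 4 = 8$
OR
$df_{within} = n_\cdot - I - B + 1 = 15 - 3 - 5 + 1 = 8$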
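The calculations of Example 11.13 can be reproduced from the raw data with the sketch below. It assumes one observation per treatment in each block, as in the table of Example 11.12. The p-value again uses the closed form that is valid only for 2 numerator degrees of freedom.

```python
# Rows of the data table: one list per treatment, ordered by block 1..5.
data = {
    "Low":     [1.58, 1.15, 1.27, 1.25, 1.00],
    "High":    [1.10, 1.05, 0.50, 1.00, 1.50],
    "Control": [2.47, 2.15, 1.46, 2.36, 1.00],
}
treatments = list(data.values())
I = len(treatments)        # number of treatments
B = len(treatments[0])     # number of blocks
n_total = I * B

grand_mean = sum(sum(t) for t in treatments) / n_total
treatment_means = [sum(t) / B for t in treatments]
block_means = [sum(t[j] for t in treatments) / I for j in range(B)]

ss_treatments = sum(B * (m - grand_mean) ** 2 for m in treatment_means)
ss_blocks = sum(I * (m - grand_mean) ** 2 for m in block_means)
ss_total = sum((y - grand_mean) ** 2 for t in treatments for y in t)
ss_within = ss_total - ss_treatments - ss_blocks

df_treatments = I - 1
df_within = n_total - I - B + 1
f_stat = (ss_treatments / df_treatments) / (ss_within / df_within)

# Assumption: closed form valid ONLY because the numerator df equals 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(round(ss_treatments, 3), round(ss_blocks, 3))   # 1.986 0.84
print(round(ss_total, 3), round(ss_within, 3))        # 4.278 1.452
print(round(f_stat, 2), p_value < 0.05)               # 5.47 True
```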
Summarize all the calculations in an ANOVA table:

Source           df    SS       MS        F-ratio
Treatments        2    1.986    0.993     5.47
Blocks            4    0.840    0.210
Within groups     8    1.452    0.1815
Total            14    4.278

Thus,
1. $H_0: \mu_1 = \mu_2 = \mu_3$
   $H_A$: The $\mu_i$'s are not all equal
2. Choose $\alpha = 0.05$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{treatments}}{MS_{within}} = \frac{0.993}{0.1815} = 5.47$
   with
   Numerator df = $df_{treatments} = 2$
   Denominator df = $df_{within} = 8$
4. The p-value < 0.05 and the critical value $F_{2,8,0.05} = 4.46$.
5. Conclusion: Since the p-value < $\alpha = 0.05$, or since $F_s = 5.47 > F_{2,8,0.05} = 4.46$, we reject $H_0$.
Factorial ANOVA
• Two or more factors influence the response variable simultaneously, i.e. there is more than one explanatory variable.
Example 11.15: (p490)
• The effect of stress (control & stress) and light (low & moderate) on the growth of soybean plants
• A 2 × 2 factorial experiment
Representation of a 2 × 2 factorial experiment:

                 B1 (Low)               B2 (Moderate)
A1 (Control)     (1, 1) Treatment 1     (1, 2) Treatment 3
A2 (Stress)      (2, 1) Treatment 2     (2, 2) Treatment 4
• Factor A: Mechanical stress
  ◦ Level 1: Control, i.e. no stress
  ◦ Level 2: Stress
• Factor B: Light
  ◦ Level 1: Low
  ◦ Level 2: Moderate
• Treatment 1: Control & low light
• Treatment 2: Stress & low light
• Treatment 3: Control & moderate light
• Treatment 4: Stress & moderate light
The data of example 11.15:
        Treatment 1               Treatment 2               Treatment 3               Treatment 4
        264                       235                       314                       283
        200                       188                       320                       312
        225                       195                       310                       291
         ⋮                         ⋮                         ⋮                         ⋮
        288                       255                       282                       282
        230                       202                       273                       257
Mean    $\bar{y}_{11} = 245.3$    $\bar{y}_{21} = 212.9$    $\bar{y}_{12} = 304.1$    $\bar{y}_{22} = 268.8$
SD      $SD_{11} = 27.0$          $SD_{21} = 27.0$          $SD_{12} = 27.0$          $SD_{22} = 27.0$
n       13                        13                        13                        13
Statistical Model for no interaction
$y_{ijk} = \mu + \tau_i + \beta_j + \varepsilon_{ijk}$
Notation
$y_{ijk}$: the $k$'th observation of level $i$ of the first factor and level $j$ of the second factor
$\tau_i$: the effect of level $i$ of the first factor
$\beta_j$: the effect of level $j$ of the second factor
where
$\sum_{i=1}^{I} \tau_i = 0$
Arrange sample means in a table:
              B1                        B2                        Difference
A1            $\bar{y}_{11} = 245.3$    $\bar{y}_{12} = 304.1$    58.8
A2            $\bar{y}_{21} = 212.9$    $\bar{y}_{22} = 268.8$    55.9
Difference    -32.4                     -35.3

[Figure: plot of the sample means against A1 and A2, with one line for B1 and one for B2]
• Additive factors: the influence of the two factors together is equal to the sum of their separate influences, i.e. there is no interaction between the two factors.
• Take note of simple effects and main effects (see page 492).
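The differences in the table of sample means can be reproduced with the sketch below. If the factors were exactly additive, the effect of B would be the same at both levels of A, so the difference between the two simple effects gives a rough indication of how far the means are from additivity. The names used are illustrative only.

```python
# Cell means from the table, indexed by (level of A, level of B).
cell_means = {
    ("A1", "B1"): 245.3, ("A1", "B2"): 304.1,
    ("A2", "B1"): 212.9, ("A2", "B2"): 268.8,
}

# Simple effects of factor B at each level of factor A.
effect_b_at_a1 = cell_means[("A1", "B2")] - cell_means[("A1", "B1")]
effect_b_at_a2 = cell_means[("A2", "B2")] - cell_means[("A2", "B1")]

# Simple effects of factor A at each level of factor B.
effect_a_at_b1 = cell_means[("A2", "B1")] - cell_means[("A1", "B1")]
effect_a_at_b2 = cell_means[("A2", "B2")] - cell_means[("A1", "B2")]

# Zero would mean perfectly parallel lines in the interaction graph.
interaction_contrast = effect_b_at_a2 - effect_b_at_a1

print(round(effect_b_at_a1, 1), round(effect_b_at_a2, 1))   # 58.8 55.9
print(round(effect_a_at_b1, 1), round(effect_a_at_b2, 1))   # -32.4 -35.3
print(round(interaction_contrast, 1))                       # -2.9
```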
Interaction graphs:
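[Figure: lines for A1 and A2 plotted against B1 and B2]
• No interaction in a factorial experiment (parallel lines)
[Figure: lines for A1 and A2 plotted against B1 and B2]
• Interaction as a difference in magnitude of response
[Figure: lines for A1 and A2 plotted against B1 and B2]
• Interaction as a difference in direction of response
[Figure: lines for A1 and A2 plotted against B1 and B2]
• Interaction due to a difference in magnitude and direction of response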
Example 11.17 (p493)
              Unfertilized              Fertilized                Difference
Ambient       $\bar{y}_{11} = 0.289$    $\bar{y}_{12} = 0.347$    0.058
Elevated      $\bar{y}_{21} = 0.227$    $\bar{y}_{22} = 0.496$    0.269
Difference    -0.062                    0.149

[Figure: carbon absorption values against CO2 concentration (ambient, elevated), with one line for fertilized and one for unfertilized]
Suppose that
• Factor 1: CO2 concentration
  ◦ Level 1: Ambient
  ◦ Level 2: Elevated
• Factor 2: Soil type
  ◦ Level 1: Unfertilized
  ◦ Level 2: Fertilized
Statistical Model for Interaction
$y_{ijk} = \mu + \tau_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$
Notation
$y_{ijk}$: the $k$'th observation of level $i$ of the first factor and level $j$ of the second factor
$\tau_i$: the effect of level $i$ of the first factor
$\beta_j$: the effect of level $j$ of the second factor
$\gamma_{ij}$: the interaction effect between level $i$ of factor 1 and level $j$ of factor 2
where
$\sum_{i=1}^{I} \tau_i = 0$
$\sum_{j=1}^{J} \beta_j = 0$
$\sum_{i=1}^{I} \gamma_{ij} = \sum_{j=1}^{J} \gamma_{ij} = 0$
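Assume that we have a balanced design, i.e. we have an equal number of observations (or measurements) in each of the treatments. Furthermore, suppose that we have $r$ observations per treatment.

            1                        2                        ⋯    J                        Marginal
1           $\bar{y}_{11\cdot}$      $\bar{y}_{12\cdot}$      ⋯    $\bar{y}_{1J\cdot}$      $\bar{y}_{1\cdot\cdot}$
2           $\bar{y}_{21\cdot}$      $\bar{y}_{22\cdot}$      ⋯    $\bar{y}_{2J\cdot}$      $\bar{y}_{2\cdot\cdot}$
⋮           ⋮                        ⋮                        ⋱    ⋮                        ⋮
I           $\bar{y}_{I1\cdot}$      $\bar{y}_{I2\cdot}$      ⋯    $\bar{y}_{IJ\cdot}$      $\bar{y}_{I\cdot\cdot}$
Marginal    $\bar{y}_{\cdot 1\cdot}$ $\bar{y}_{\cdot 2\cdot}$ ⋯    $\bar{y}_{\cdot J\cdot}$ $\bar{y}_{\cdot\cdot\cdot}$

$SS_{total} = SS_{factor1} + SS_{factor2} + SS_{interaction} + SS_{within}$

Source         df                  SS
Factor1        $I - 1$             $rJ \sum_{i=1}^{I} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
Factor2        $J - 1$             $rI \sum_{j=1}^{J} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
Interaction    $(I - 1)(J - 1)$    $r \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$
Within         $n_\cdot - IJ$      $\sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{r} (y_{ijk} - \bar{y}_{ij\cdot})^2$
Total          $n_\cdot - 1$       $\sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{r} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$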
Source         df    SS          MS          F
Factor1         1    0.005678    0.005678     1.19
Factor2         1    0.080197    0.080197    16.79
Interaction     1    0.033391    0.033391     6.99
Within          8    0.038202    0.004775
Total          11    0.157468

The three null hypotheses that we need to test:
1. $H_0: \tau_1 = \tau_2 = \dots = \tau_I = 0$
2. $H_0: \beta_1 = \beta_2 = \dots = \beta_J = 0$
3. $H_0: \gamma_{11} = \dots = \gamma_{IJ} = 0$
Take note:
Inferences about individual factor effects depend upon the presence or absence of interaction. The significance of the interaction is determined before the significance of the main effects of the factors, since a significant interaction can modify any inferences based on significant differences among the marginal means of the factors.
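The entries of this ANOVA table can be cross-checked with a short sketch. It takes the interaction sum of squares as 0.033391, the value that is consistent with the reported total of 0.157468 and the F ratio of 6.99. The critical value 5.32 for $F_{1,8,0.05}$ is the one quoted in the tests that follow.

```python
# (df, SS) for each source, taken from the ANOVA table.
sources = {
    "Factor1": (1, 0.005678),
    "Factor2": (1, 0.080197),
    "Interaction": (1, 0.033391),
}
df_within, ss_within = 8, 0.038202
ms_within = ss_within / df_within

# F ratio of each source: its mean square divided by MS_within.
f_ratios = {name: (ss / df) / ms_within for name, (df, ss) in sources.items()}

# The sums of squares and the degrees of freedom should add up to the totals.
ss_total = sum(ss for _, ss in sources.values()) + ss_within
df_total = sum(df for df, _ in sources.values()) + df_within

critical_value = 5.32   # F_{1,8,0.05} as quoted in the text
significant = {name: f > critical_value for name, f in f_ratios.items()}

print({name: round(f, 2) for name, f in f_ratios.items()})
# {'Factor1': 1.19, 'Factor2': 16.79, 'Interaction': 6.99}
print(round(ss_total, 6), df_total)   # 0.157468 11
print(significant)
# {'Factor1': False, 'Factor2': True, 'Interaction': True}
```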
Testing for Interaction effects
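1. $H_0: \gamma_{11} = \dots = \gamma_{IJ} = 0$
   $H_A$: The $\gamma_{ij}$'s are not all equal to zero
2. Choose $\alpha = 0.05$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{interaction}}{MS_{within}} = 6.99$
   with
   Numerator df = $df_{interaction} = (I - 1)(J - 1) = 1$
   Denominator df = $df_{within} = n_\cdot - IJ = 8$
4. Find the critical value and compare it to the calculated test statistic value. In this case $F_s = 6.99 > F_{1,8,0.05} = 5.32$.
5. Conclusion: Reject or do not reject $H_0$. In this case we reject $H_0$.
Take note:
• If $H_0$ is rejected we should be careful in interpreting main effects.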
Testing for main effects of factor 1
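1. $H_0: \tau_1 = \tau_2 = \dots = \tau_I = 0$
   $H_A$: The $\tau_i$'s are not all equal to zero
2. Choose $\alpha = 0.05$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{factor1}}{MS_{within}} = 1.19$
   with
   Numerator df = $df_{factor1} = I - 1 = 1$
   Denominator df = $df_{within} = 8$
4. Find the critical value and compare it to the calculated test statistic value. In this case $F_s = 1.19 < F_{1,8,0.05} = 5.32$.
5. Conclusion: In this case we do not reject $H_0$.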
Testing for main effects of factor 2
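1. $H_0: \beta_1 = \beta_2 = \dots = \beta_J = 0$
   $H_A$: The $\beta_j$'s are not all equal to zero
2. Choose $\alpha = 0.05$
3. Calculate the F test statistic value
   $F_s = \frac{MS_{factor2}}{MS_{within}} = 16.79$
   with
   Numerator df = $df_{factor2} = J - 1 = 1$
   Denominator df = $df_{within} = 8$
4. Find the critical value and compare it to the calculated test statistic value. In this case $F_s = 16.79 > F_{1,8,0.05} = 5.32$.
5. Conclusion: In this case we reject $H_0$.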
Chapter 11: Exercises
11.1 11.2 11.3 11.4 11.5 11.6 p476
11.8 11.9 11.10 p481
11.17 11.18 11.19 11.20 11.21 11.22 p497
Take note:
• These exercises are part of the textbook and can be included in any class test, semester test or exam!
