
Advanced Statistics for College Students

2.3 ANOVA (Analysis of Variance)


The Analysis of Variance (ANOVA) is an extension of the two-sample
independent t-test. It tests whether the means of measurements from 3 or more
independent groups differ significantly. Examples of when you might want to
compare different groups:

• A group of psychiatric patients are trying three different therapies:
counseling, medication and biofeedback. You want to see if one therapy is
better than the others.
• A manufacturer has two different processes to make light bulbs. They want
to know if one process is better than the other.

Properties of ANOVA:

1. ANOVA employs an additive data decomposition, and its sums of squares
indicate the variance of each component of the decomposition.
2. It compares mean squares (between groups against within groups) through
an F-test.
3. ANOVA provides a strong statistical analysis.
4. It has been adapted to the analysis of a variety of experimental designs.

One-way ANOVA

• Random samples of size n are selected from each of k populations. The k
different populations are classified on the basis of a single criterion, such
as different treatments or groups.
• It is used to test for differences among the means of 3 or more
independent groups.
• It compares 3 or more population means.
• It is also called single-factor ANOVA.

Methods for testing the hypothesis:

𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = ⋯ = 𝜇𝑘, where k is the number of populations

𝐻1 : at least one mean differs from the others

MARLON S. FRIAS, Ph.D. | BUKIDNON STATE UNIVERSITY 1



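Group 1      Group 2      Group 3      ⋯    Group k
$x_{11}$     $x_{21}$     $x_{31}$     ⋯    $x_{k1}$
$x_{12}$     $x_{22}$     $x_{32}$     ⋯    $x_{k2}$
$x_{13}$     $x_{23}$     $x_{33}$     ⋯    $x_{k3}$
⋮            ⋮            ⋮                 ⋮
$x_{1n}$     $x_{2n}$     $x_{3n}$     ⋯    $x_{kn}$
$T_{1.}$     $T_{2.}$     $T_{3.}$     ⋯    $T_{k.}$     $T_{..}$

Here $x_{ij}$ is the $j$th observation in group $i$, $T_{i.}$ is the total of
group $i$, and $T_{..}$ is the grand total of all observations.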

ANOVA TABLE FOR EQUAL SAMPLE SIZES

Source of variation    Sum of Squares    Df          Mean Square                     F computed
Between                SSB               k − 1       $s_1^2 = \frac{SSB}{k-1}$       $F = \frac{s_1^2}{s_2^2}$
Within                 SSW               k(n − 1)    $s_2^2 = \frac{SSW}{k(n-1)}$
Total                  SST               nk − 1

In the ANOVA table above, 𝑘 is the number of groups while 𝑛 is the sample
size for each group given that the sample sizes of the groups are equal.

For Equal Sample Size:

SST (Sum of Squares Total)

$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n} x_{ij}^2 - \frac{T_{..}^2}{nk}$$

SSB (Sum of Squares Between)

$$SSB = \frac{\sum_{i=1}^{k} T_{i.}^2}{n} - \frac{T_{..}^2}{nk}$$

SSW (Sum of Squares Within)

$$SSW = SST - SSB$$
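
The equal-sample-size formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `anova_equal_sizes` and the toy data are assumptions made for the example and do not come from the text.

```python
def anova_equal_sizes(groups):
    """Sums of squares, degrees of freedom and F for k groups of equal size n."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this shortcut assumes equal sample sizes")

    group_totals = [sum(g) for g in groups]        # T_i. for each group
    grand_total = sum(group_totals)                # T..
    correction = grand_total ** 2 / (n * k)        # T..^2 / (nk)

    sst = sum(x * x for g in groups for x in g) - correction
    ssb = sum(t * t for t in group_totals) / n - correction
    ssw = sst - ssb

    df_between, df_within = k - 1, k * (n - 1)
    f_value = (ssb / df_between) / (ssw / df_within)
    return {
        "SST": sst, "SSB": ssb, "SSW": ssw,
        "df_between": df_between, "df_within": df_within,
        "F": f_value,
    }


# Hypothetical toy data (not from the text): 3 groups of 3 observations each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = anova_equal_sizes(toy_groups)
print(result)
```

For the toy data the group totals are 6, 9 and 18, so SST = 32, SSB = 26, SSW = 6 and F = 13 with 2 and 6 degrees of freedom.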


ANOVA TABLE FOR UNEQUAL SAMPLE SIZES

Source of variation    Sum of Squares    Df       Mean Square                   F computed
Between                SSB               k − 1    $s_1^2 = \frac{SSB}{k-1}$     $F = \frac{s_1^2}{s_2^2}$
Within                 SSW               N − k    $s_2^2 = \frac{SSW}{N-k}$
Total                  SST               N − 1

For Unequal Sample Size:

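$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2 - \frac{T_{..}^2}{N}$$

$$SSB = \sum_{i=1}^{k} \frac{T_{i.}^2}{n_i} - \frac{T_{..}^2}{N}$$

$$SSW = SST - SSB$$

Here $n_i$ is the sample size of group $i$ and $N = n_1 + n_2 + \cdots + n_k$
is the total number of observations.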
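
As a consistency check, the unequal-size formulas reduce to the equal-size ones when every group has the same size n. Writing n_i for the size of group i, so that N is the sum of the n_i, the substitution n_i = n can be sketched as:

```latex
\begin{align*}
N &= \sum_{i=1}^{k} n_i = nk, \\
SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2 - \frac{T_{..}^2}{N}
     = \sum_{i=1}^{k} \sum_{j=1}^{n} x_{ij}^2 - \frac{T_{..}^2}{nk}, \\
SSB &= \sum_{i=1}^{k} \frac{T_{i.}^2}{n_i} - \frac{T_{..}^2}{N}
     = \frac{\sum_{i=1}^{k} T_{i.}^2}{n} - \frac{T_{..}^2}{nk}, \\
N - k &= nk - k = k(n - 1), \qquad N - 1 = nk - 1.
\end{align*}
```

The last line shows that the degrees of freedom in the two ANOVA tables agree as well.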

Example 1:

Below are the ages at which females got married in Valencia City, Malaybalay
City and Maramag. At a 0.05 level of significance, perform an ANOVA test to see
whether the average age at marriage is equal across these 3 municipalities/cities.

Valencia City    Malaybalay City    Maramag
18               18                 21
19               20                 22
20               16                 17
21               20                 18
22               21                 22
23               20                 19
18               18                 21
19               19                 20
20               17                 18
21               13                 23


Steps:

1. 𝐻0 : 𝜇1 = 𝜇2 = 𝜇3
𝐻𝑎 : at least 2 means are not equal
2. 𝛼 = 0.05
3. Critical Region: Reject 𝐻0 if 𝐹 > 3.35
(see the critical values of the F distribution in Table A.2 for 𝛼 = 0.05;
$F_{0.05,\,2,\,27} = 3.35$ since 𝛼 = 0.05, 𝑑𝑓1 = 3 − 1 = 2 and
𝑑𝑓2 = 3(10 − 1) = 27; see the figure below)
4. Test-statistic: F-test
5. Computation:

Valencia City     Malaybalay City    Maramag
18                18                 21
19                20                 22
20                16                 17
21                20                 18
22                21                 22
23                20                 19
18                18                 21
19                19                 20
20                17                 18
21                13                 23
$T_{1.} = 201$    $T_{2.} = 182$     $T_{3.} = 201$     $T_{..} = 584$

$\sum\sum x_{ij}^2 = 11{,}506$ (Sum of squares of all observations)
$T_{1.} = 201$ (Total of the first column)
$T_{2.} = 182$ (Total of the second column)
$T_{3.} = 201$ (Total of the third column)
$T_{..} = 584$ (Grand total)

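$$
\begin{aligned}
SST &= 18^2 + 19^2 + 20^2 + \cdots + 23^2 - \frac{584^2}{30} \\
    &= 11{,}506 - 11{,}368.53333 \\
    &= 137.4667
\end{aligned}
$$

$$
\begin{aligned}
SSB &= \frac{201^2 + 182^2 + 201^2}{10} - \frac{584^2}{30} \\
    &= 11{,}392.6 - 11{,}368.53333 \\
    &= 24.0667
\end{aligned}
$$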


$$SSW = 137.4667 - 24.0667 = 113.4$$

Note:
The formula used in the computation is for equal sample sizes of the groups.

ANOVA TABLE
Source of variation    Sum of Squares    Df    Mean Square    F computed
Between                24.067            2     12.033         2.87
Within                 113.400           27    4.200
Total                  137.467           29

6. Decision: The computed F-value of 2.87 is less than the critical value of 3.35, so we fail to reject 𝐻0 .

7. Conclusion: There is no significant difference in the mean age at which
females get married across the three municipalities/cities.
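
The result of Example 1 can be cross-checked with a short Python sketch. Instead of the totals-based shortcut, this version uses the equivalent deviation form of the sums of squares (squared deviations of group means from the grand mean, and of observations from their group mean), so it serves as an independent check. The variable names are illustrative, and the critical value 3.35 is copied from step 3 rather than computed.

```python
from statistics import fmean

# Ages at marriage from Example 1, one list per locality.
ages = {
    "Valencia City":   [18, 19, 20, 21, 22, 23, 18, 19, 20, 21],
    "Malaybalay City": [18, 20, 16, 20, 21, 20, 18, 19, 17, 13],
    "Maramag":         [21, 22, 17, 18, 22, 19, 21, 20, 18, 23],
}

groups = list(ages.values())
k = len(groups)                      # number of groups
n = len(groups[0])                   # common sample size
all_values = [x for g in groups for x in g]
grand_mean = fmean(all_values)

# Deviation form: between-group and within-group sums of squares.
ssb = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - fmean(g)) ** 2 for g in groups for x in g)
sst = ssb + ssw

df_between, df_within = k - 1, k * (n - 1)
f_value = (ssb / df_between) / (ssw / df_within)

critical_value = 3.35                # F(0.05; 2, 27), taken from the text
reject_h0 = f_value > critical_value

print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
print(f"F = {f_value:.2f}, reject H0: {reject_h0}")
```

Both routes give the same sums of squares (SSB ≈ 24.0667, SSW ≈ 113.4), and F ≈ 2.87 falls below 3.35.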

Example 2:

A study was done to see whether the source of dietary fat affects visual
discrimination. Rats were placed on one of four diets for 2 months: Diet 1 had 5%
corn oil; Diet 2 was the same as Diet 1 with the addition of 20% safflower oil; Diet 3
was Diet 1 with 20% added coconut oil; Diet 4 was Diet 1 with 20% added olive oil.
All the rats were trained on a simple visual discrimination task, and their errors
before achieving a certain criterion were recorded. Test the data to see whether the
different diets affected learning of the task. Use a 0.01 level of significance.

Diet 1    Diet 2    Diet 3    Diet 4
13        22        25        28
20        20        24        35
31        22        26        37
18        27        24        31
11        25        28        36
11        21        25        37
11        18        26        35
12        12        23
12        22        25
12        21


Answer:

1. 𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = 𝜇4

𝐻1 : at least two means (one pair) are not equal

2. 𝛼 = 0.01
3. Critical Region: Reject 𝐻0 if the computed 𝐹 > 4.46 (approximately)
(𝑑𝑓1 = 4 − 1 = 3, 𝑑𝑓2 = 36 − 4 = 32,
𝛼 = 0.01. See Table A.3)
4. Test Statistic: F-test / ANOVA
5. Computation:

Diet 1            Diet 2            Diet 3            Diet 4
13                22                25                28
20                20                24                35
31                22                26                37
18                27                24                31
11                25                28                36
11                21                25                37
11                18                26                35
12                12                23
12                22                25
12                21
$T_{1.} = 151$    $T_{2.} = 210$    $T_{3.} = 226$    $T_{4.} = 239$    $T_{..} = 826$

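$$
\begin{aligned}
SST &= 13^2 + 20^2 + 31^2 + \cdots + 35^2 - \frac{826^2}{36} \\
    &= 21{,}126 - 18{,}952.1111 = 2{,}173.8889
\end{aligned}
$$

$$
\begin{aligned}
SSB &= \frac{151^2}{10} + \frac{210^2}{10} + \frac{226^2}{9} + \frac{239^2}{7} - \frac{826^2}{36} \\
    &= 20{,}525.3540 - 18{,}952.1111 = 1{,}573.2429
\end{aligned}
$$

$$SSW = 2{,}173.8889 - 1{,}573.2429 = 600.646$$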

ANOVA TABLE
Source of variation    Sum of Squares    Df    Mean Square    F computed
Between                1,573.2429        3     524.4143       27.94
Within                 600.6460          32    18.7702
Total                  2,173.8889        35


6. Decision: Reject 𝐻0 since the computed 𝐹 = 27.94 exceeds the critical value.


7. Conclusion:

Therefore, there is a significant difference in the rats' performance on
the learning task across the 4 different diets. In other words, the diets
have something to do with visual discrimination on the task.
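
The unequal-sample-size computation in Example 2 can be reproduced with the sketch below. It uses the error counts as listed in the computation table of step 5, whose column totals are 151, 210, 226 and 239. The variable names are illustrative assumptions, and the critical value is left to the F table rather than computed here.

```python
# Errors recorded per rat, one list per diet (group sizes 10, 10, 9 and 7).
diets = {
    "Diet 1": [13, 20, 31, 18, 11, 11, 11, 12, 12, 12],
    "Diet 2": [22, 20, 22, 27, 25, 21, 18, 12, 22, 21],
    "Diet 3": [25, 24, 26, 24, 28, 25, 26, 23, 25],
    "Diet 4": [28, 35, 37, 31, 36, 37, 35],
}

groups = list(diets.values())
k = len(groups)                                  # number of groups
N = sum(len(g) for g in groups)                  # total number of observations
grand_total = sum(sum(g) for g in groups)        # T..
correction = grand_total ** 2 / N                # T..^2 / N

# Totals-based formulas for unequal sample sizes.
sst = sum(x * x for g in groups for x in g) - correction
ssb = sum(sum(g) ** 2 / len(g) for g in groups) - correction
ssw = sst - ssb

df_between, df_within = k - 1, N - k
ms_between = ssb / df_between
ms_within = ssw / df_within
f_value = ms_between / ms_within

print(f"N = {N}, grand total = {grand_total}")
print(f"SST = {sst:.4f}, SSB = {ssb:.4f}, SSW = {ssw:.4f}")
print(f"df = ({df_between}, {df_within}), F = {f_value:.2f}")
```

Each group total is divided by its own sample size, which is the only change from the equal-size shortcut.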
