
Name: LEARNING ACTIVITY SHEET 5

Two-Way Analysis of Variance

Analysis of Variance (ANOVA)

Suppose we are interested in the effect of three teaching approaches (conventional, modular, blended
learning) on the performance of students in Mathematics. If there is no difference between the
approaches, then we would expect the mean performances under all of them to be approximately equal.
Otherwise, we would expect the mean performances to differ. This situation calls for a test that analyzes
data from more than two populations. Such tests deal with treatment (e.g. teaching
approach) effects, and include tests that take into account other factors that may affect the response (e.g.
Mathematics performance). The hypothesis that the population means are equal is considered equivalent
to the hypothesis that there is no difference in treatment effects. The analytical method we will use in such
problems is called the analysis of variance (ANOVA). The initial development of this method is credited
to Sir Ronald A. Fisher, who introduced the technique for the analysis of agricultural field experiments; it
is now also used in social science and educational research.

Two-way ANOVA

Now, consider a situation where it is of interest to study the effect of two factors, A and B, on some
response. For instance, in addition to the teaching approach as a factor affecting the Mathematics
performance of the students, we may also consider the time when their classes are scheduled (i.e.
morning or afternoon). In this case, we will use a randomized block design, or the two-way analysis of
variance. It considers the case of 𝑛 replications of each treatment combination (we call each combination
a block here) determined by 𝑎 levels of factor 𝐴 (e.g. schedule) and 𝑏 levels of factor 𝐵 (e.g. teaching
approach). This means that there will be 𝑎𝑏 blocks in all. It is important to determine not only whether
each of the two factors has an influence on the response (e.g. Mathematics performance), but also whether
there is a significant interaction between the two factors. Hence, there are three pairs of hypotheses.

𝐻𝑜′: The means are equal when grouped according to factor A.


𝐻𝑎′: At least two of the means are unequal when grouped according to factor A.

𝐻𝑜′′: The means are equal when grouped according to factor B.


𝐻𝑎′′: At least two of the means are unequal when grouped according to factor B.

𝐻𝑜′′′: There is no interaction between factor A and factor B.


𝐻𝑎′′′: There is interaction between factor A and factor B.

The two-way ANOVA table has the following format with the corresponding formula for the value that
falls on each of its cells.

Source of Variation   Sum of Squares                      Degrees of Freedom     Mean Squares          F-statistic
Factor A              𝑆𝑆𝑎 = 𝑏𝑛 ∑(𝑦̅𝑎 − 𝑦̅)²                𝑑𝑓𝑎 = 𝑎 − 1            𝑀𝑆𝑎 = 𝑆𝑆𝑎 / 𝑑𝑓𝑎       𝐹𝑎 = 𝑀𝑆𝑎 / 𝑀𝑆𝑤
Factor B              𝑆𝑆𝑏 = 𝑎𝑛 ∑(𝑦̅𝑏 − 𝑦̅)²                𝑑𝑓𝑏 = 𝑏 − 1            𝑀𝑆𝑏 = 𝑆𝑆𝑏 / 𝑑𝑓𝑏       𝐹𝑏 = 𝑀𝑆𝑏 / 𝑀𝑆𝑤
Interaction           𝑆𝑆𝑎𝑏 = 𝑆𝑆𝑡 − 𝑆𝑆𝑎 − 𝑆𝑆𝑏 − 𝑆𝑆𝑤        𝑑𝑓𝑎𝑏 = (𝑑𝑓𝑎)(𝑑𝑓𝑏)      𝑀𝑆𝑎𝑏 = 𝑆𝑆𝑎𝑏 / 𝑑𝑓𝑎𝑏    𝐹𝑎𝑏 = 𝑀𝑆𝑎𝑏 / 𝑀𝑆𝑤
Within                𝑆𝑆𝑤 = ∑(𝑦𝑎𝑏 − 𝑦̅𝑎𝑏)²                𝑑𝑓𝑤 = 𝑎𝑏(𝑛 − 1)        𝑀𝑆𝑤 = 𝑆𝑆𝑤 / 𝑑𝑓𝑤
Total                 𝑆𝑆𝑡 = ∑(𝑦 − 𝑦̅)²                    𝑑𝑓𝑡 = 𝑎𝑏𝑛 − 1
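For readers who prefer explicit indices, the same sums of squares can be written out as below. This is only a sketch under an assumed notation that the sheet itself does not use: y_{ijk} is the k-th replication in the block at level i of Factor A and level j of Factor B, and a dot marks an index that has been averaged over. The direct form of the interaction term is the standard one for a design with the same number of replications in every block, where it gives the same value as the subtraction in the table.

```latex
\begin{align*}
SS_a    &= bn \sum_{i=1}^{a} \left(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SS_b    &= an \sum_{j=1}^{b} \left(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SS_{ab} &= n \sum_{i=1}^{a}\sum_{j=1}^{b}
           \left(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}\right)^2
         = SS_t - SS_a - SS_b - SS_w \\
SS_w    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left(y_{ijk} - \bar{y}_{ij\cdot}\right)^2 \\
SS_t    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} \left(y_{ijk} - \bar{y}_{\cdot\cdot\cdot}\right)^2
\end{align*}
```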

Advanced Statistics /LAS 5/Page 1 of 9


As with one-way ANOVA, the formulas involved may look overwhelming even though they have
already been simplified. To see how the value in each cell of the table is obtained,
we will work through one example.

Example 2.2.2. The samples presented in the table below represent test scores from six classes of
Statistics taught using three different approaches (conventional, modular and blended
learning) set at different schedules (morning and afternoon) and are independently
obtained. Assume that the populations are normal with equal variances. At α = 0.05 level of
significance, test for equality of population means.

Notice that there are six blocks with 4 replications each (so 𝑛 = 4). There are 2 types of schedule, implying
that there are 2 levels of Factor A (so 𝑎 = 2). There are 3 types of teaching approach, implying
that there are 3 levels of Factor B (so 𝑏 = 3).
Factor A              Factor B (Teaching Approach)
(Schedule)            Conventional   Modular   Blended
Morning               10             7         4
                      12             9         5
                      11             8         6
                      9              12        5
Afternoon             12             13        6
                      13             15        6
                      10             12        4
                      13             12        4

In this problem, we will test the hypotheses that:


1. 𝐻𝑜′: There is no significant difference between the test scores when grouped according to their
schedule.
𝐻𝑎′: There is a significant difference between the test scores when grouped according to their
schedule.

2. 𝐻𝑜′′: There is no significant difference between the test scores when grouped according to the teaching
approach used.
𝐻𝑎′′: There is a significant difference between the test scores when grouped according to the teaching
approach used.

3. 𝐻𝑜′′′: There is no significant interaction between the schedule and the teaching approach used.
𝐻𝑎′′′: There is a significant interaction between the schedule and the teaching approach used.

Here are the steps of calculations involving two-way ANOVA:


Step 1: Calculate the required Means
a. Block means (𝑦̅𝑎𝑏 )
b. Factor A mean or row means (𝑦̅𝑎 )
c. Factor B mean or column means (𝑦̅𝑏 )
d. Total mean (𝑦̅).
Factor A              Factor B (Teaching Approach)
(Schedule)            Conventional   Modular   Blended
Morning               10             7         4
                      12             9         5
                      11             8         6
                      9              12        5
Afternoon             12             13        6
                      13             15        6
                      10             12        4
                      13             12        4

Calculation of the Means
                      Factor B
Factor A              Conventional   Modular   Blended   Row Means (𝑦̅𝑎)
Morning               10.50          9.00      5.00      8.167
Afternoon             12.00          13.00     5.00      10.000
Column Means (𝑦̅𝑏)    11.250         11.000    5.000     Total Mean 𝑦̅ = 9.083
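The four kinds of means in Step 1 can be checked with a few lines of Python. This is a minimal sketch, not part of the original sheet; the dictionary layout and the variable names are assumptions made for illustration.

```python
from statistics import mean

# Test scores from the example, keyed by (schedule, teaching approach).
scores = {
    ("Morning", "Conventional"): [10, 12, 11, 9],
    ("Morning", "Modular"): [7, 9, 8, 12],
    ("Morning", "Blended"): [4, 5, 6, 5],
    ("Afternoon", "Conventional"): [12, 13, 10, 13],
    ("Afternoon", "Modular"): [13, 15, 12, 12],
    ("Afternoon", "Blended"): [6, 6, 4, 4],
}
schedules = ["Morning", "Afternoon"]                 # levels of Factor A
approaches = ["Conventional", "Modular", "Blended"]  # levels of Factor B

# a. Block means: one mean per (schedule, approach) block.
block_means = {cell: mean(values) for cell, values in scores.items()}

# b. Row means: pool every score that shares a schedule.
row_means = {s: mean(y for t in approaches for y in scores[(s, t)]) for s in schedules}

# c. Column means: pool every score that shares a teaching approach.
col_means = {t: mean(y for s in schedules for y in scores[(s, t)]) for t in approaches}

# d. Total mean: all 24 scores together.
total_mean = mean(y for values in scores.values() for y in values)

print(block_means)
print(row_means)
print(col_means)
print(round(total_mean, 3))
```

Keeping the means unrounded at this stage avoids small rounding differences in the sums of squares later on.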



Step 2: Calculate the Sums of Squares (𝑆𝑆)
a. Factor A
𝑆𝑆𝑎 = 𝑏𝑛 ∑(𝑦̅𝑎 − 𝑦̅)²
This formula means that we are going to
i. Subtract the total mean from each of the row means
ii. Square each of the differences
iii. Sum the results from (ii)
iv. Multiply the result from (iii) by the product 𝑏𝑛
Here 𝑏 = 3 and 𝑛 = 4.
𝑦̅𝑎 − 𝑦̅                        (𝑦̅𝑎 − 𝑦̅)²
8.167 – 9.083 = –0.916        0.839
10.000 – 9.083 = 0.917        0.841
Sum                           1.680
With the rounded values shown, 3 × 4 × 1.680 = 20.16; carrying the unrounded means (8.1667 and 9.0833)
through the calculation gives 20.17, which is the value used from here on.
Therefore, 𝑆𝑆𝑎 = 20.17

b. Factor B
𝑆𝑆𝑏 = 𝑎𝑛 ∑(𝑦̅𝑏 − 𝑦̅)²
Similar to (a), this formula means that we are going to
i. Subtract the total mean from each of the column means
ii. Square each of the differences
iii. Sum the results from (ii)
iv. Multiply the result from (iii) by the product 𝑎𝑛
Here 𝑎 = 2 and 𝑛 = 4.
𝑦̅𝑏 − 𝑦̅                        (𝑦̅𝑏 − 𝑦̅)²
11.250 – 9.083 = 2.167        4.696
11.000 – 9.083 = 1.917        3.675
5.000 – 9.083 = –4.083        16.671
Sum                           25.042
With the rounded values shown, 2 × 4 × 25.042 = 200.34; carrying the unrounded total mean (9.0833)
through the calculation gives 200.33.
Therefore, 𝑆𝑆𝑏 = 200.33
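Parts (a) and (b) follow the same four steps, so they can be sketched with one small helper. The function name `factor_ss` is hypothetical, and the sketch assumes every level of a factor holds the same number of scores, as in this example.

```python
from statistics import mean

def factor_ss(groups):
    """Sum of squares for one factor.

    `groups` holds one list of scores per level of the factor.
    Assumes every level has the same number of scores.
    """
    total_mean = mean(y for g in groups for y in g)
    scores_per_level = len(groups[0])  # bn for Factor A, an for Factor B
    return scores_per_level * sum((mean(g) - total_mean) ** 2 for g in groups)

# Factor A (schedule): scores pooled over the three teaching approaches.
morning = [10, 12, 11, 9, 7, 9, 8, 12, 4, 5, 6, 5]
afternoon = [12, 13, 10, 13, 13, 15, 12, 12, 6, 6, 4, 4]

# Factor B (teaching approach): scores pooled over the two schedules.
conventional = [10, 12, 11, 9, 12, 13, 10, 13]
modular = [7, 9, 8, 12, 13, 15, 12, 12]
blended = [4, 5, 6, 5, 6, 6, 4, 4]

ss_a = factor_ss([morning, afternoon])
ss_b = factor_ss([conventional, modular, blended])
print(round(ss_a, 2))  # 20.17
print(round(ss_b, 2))  # 200.33
```

Because the helper keeps the means unrounded, it returns 20.17 for Factor A rather than the 20.16 obtained from three-decimal means.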

c. Within or Error
𝑆𝑆𝑤 = ∑(𝑦𝑎𝑏 − 𝑦̅𝑎𝑏)²
This formula implies that we are going to
i. Get the difference between each data entry in a block and the
corresponding block mean
ii. Square each of the differences
iii. Sum the results from (ii)
(𝑦𝑎𝑏 − 𝑦̅𝑎𝑏)² for each entry:
             Conventional              Modular                 Blended
Morning      (10 – 10.50)² = 0.25      (7 – 9.00)² = 4         (4 – 5.00)² = 1
             (12 – 10.50)² = 2.25      (9 – 9.00)² = 0         (5 – 5.00)² = 0
             (11 – 10.50)² = 0.25      (8 – 9.00)² = 1         (6 – 5.00)² = 1
             (9 – 10.50)² = 2.25       (12 – 9.00)² = 9        (5 – 5.00)² = 0
Afternoon    (12 – 12.00)² = 0         (13 – 13.00)² = 0       (6 – 5.00)² = 1
             (13 – 12.00)² = 1         (15 – 13.00)² = 4       (6 – 5.00)² = 1
             (10 – 12.00)² = 4         (12 – 13.00)² = 1       (4 – 5.00)² = 1
             (13 – 12.00)² = 1         (12 – 13.00)² = 1       (4 – 5.00)² = 1

Sum = 37
Therefore, 𝑆𝑆𝑤 = 37
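The within sum of squares can be sketched as a loop over the six blocks. The list layout and names below are assumptions for illustration only.

```python
from statistics import mean

# One list per block, in the order Morning (Conventional, Modular, Blended),
# then Afternoon (Conventional, Modular, Blended).
blocks = [
    [10, 12, 11, 9], [7, 9, 8, 12], [4, 5, 6, 5],
    [12, 13, 10, 13], [13, 15, 12, 12], [6, 6, 4, 4],
]

per_block = []
for block in blocks:
    block_mean = mean(block)
    # Squared deviations of each score from its own block mean.
    per_block.append(sum((y - block_mean) ** 2 for y in block))

ss_w = sum(per_block)
print(per_block)
print(ss_w)
```

Listing the contribution of each block separately makes it easy to compare against the three columns of the hand calculation.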

d. Total
𝑆𝑆𝑡 = ∑(𝑦𝑎𝑏 − 𝑦̅)²
This formula implies that we are going to
i. Subtract the total mean from each of the data
ii. Square each of the differences
iii. Sum the results from (ii)



(𝑦𝑎𝑏 − 𝑦̅)² for each entry:
             Conventional                Modular                     Blended
Morning      (10 – 9.083)² = 0.841       (7 – 9.083)² = 4.339        (4 – 9.083)² = 25.837
             (12 – 9.083)² = 8.509       (9 – 9.083)² = 0.007        (5 – 9.083)² = 16.671
             (11 – 9.083)² = 3.675       (8 – 9.083)² = 1.173        (6 – 9.083)² = 9.505
             (9 – 9.083)² = 0.007        (12 – 9.083)² = 8.509       (5 – 9.083)² = 16.671
Afternoon    (12 – 9.083)² = 8.509       (13 – 9.083)² = 15.343      (6 – 9.083)² = 9.505
             (13 – 9.083)² = 15.343      (15 – 9.083)² = 35.011      (6 – 9.083)² = 9.505
             (10 – 9.083)² = 0.841       (12 – 9.083)² = 8.509       (4 – 9.083)² = 25.837
             (13 – 9.083)² = 15.343      (12 – 9.083)² = 8.509       (4 – 9.083)² = 25.837

Sum = 273.83
Therefore, 𝑆𝑆𝑡 = 273.83
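The total sum of squares can be sketched in Python as below. The second line of the calculation uses the standard shortcut identity ∑(𝑦 − 𝑦̅)² = ∑𝑦² − (∑𝑦)²/𝑁, which the sheet does not use; it is included here only as a cross-check, and the variable names are assumptions.

```python
from statistics import mean

# All 24 test scores from the example, in any order.
all_scores = [
    10, 12, 11, 9, 7, 9, 8, 12, 4, 5, 6, 5,        # Morning
    12, 13, 10, 13, 13, 15, 12, 12, 6, 6, 4, 4,    # Afternoon
]

total_mean = mean(all_scores)

# Definition: squared deviation of every score from the total mean.
ss_t = sum((y - total_mean) ** 2 for y in all_scores)

# Cross-check with the shortcut: sum of squares minus (sum)^2 / N.
ss_t_shortcut = sum(y * y for y in all_scores) - sum(all_scores) ** 2 / len(all_scores)

print(round(ss_t, 2))           # 273.83
print(round(ss_t_shortcut, 2))  # 273.83
```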

e. Interaction
The total sum of squares is the sum of the other four sums of squares:
𝑆𝑆𝑡 = 𝑆𝑆𝑎 + 𝑆𝑆𝑏 + 𝑆𝑆𝑎𝑏 + 𝑆𝑆𝑤
Solving this equation for 𝑆𝑆𝑎𝑏, given 𝑆𝑆𝑡 from (d), 𝑆𝑆𝑎 from (a), 𝑆𝑆𝑏
from (b) and 𝑆𝑆𝑤 from (c), we have
𝑆𝑆𝑎𝑏 = 𝑆𝑆𝑡 − 𝑆𝑆𝑎 − 𝑆𝑆𝑏 − 𝑆𝑆𝑤
Therefore, 𝑆𝑆𝑎𝑏 = 273.83 − 20.17 − 200.33 − 37 = 16.33
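As a cross-check on the subtraction, the interaction sum of squares can also be computed directly from the means in Step 1. The direct formula 𝑛 ∑(𝑦̅𝑎𝑏 − 𝑦̅𝑎 − 𝑦̅𝑏 + 𝑦̅)² is the standard one for equal block sizes and is not given in the sheet; the sketch below assumes it, and the names are illustrative.

```python
n = 4  # replications per block

# Means from Step 1, kept unrounded (98/12 = 8.1667, 218/24 = 9.0833).
block_means = {
    ("Morning", "Conventional"): 10.5, ("Morning", "Modular"): 9.0,
    ("Morning", "Blended"): 5.0, ("Afternoon", "Conventional"): 12.0,
    ("Afternoon", "Modular"): 13.0, ("Afternoon", "Blended"): 5.0,
}
row_means = {"Morning": 98 / 12, "Afternoon": 10.0}
col_means = {"Conventional": 11.25, "Modular": 11.0, "Blended": 5.0}
total_mean = 218 / 24

# Direct form: what is left of each block mean after removing
# the row effect and the column effect.
ss_ab_direct = n * sum(
    (m - row_means[s] - col_means[t] + total_mean) ** 2
    for (s, t), m in block_means.items()
)

# The sheet's route: subtract the other sums of squares from the total.
ss_ab_by_subtraction = 273.83 - 20.17 - 200.33 - 37

print(round(ss_ab_direct, 2))          # 16.33
print(round(ss_ab_by_subtraction, 2))  # 16.33
```

Both routes agree to two decimals, which is a useful sanity check on the four sums of squares computed earlier.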

Step 3: Calculate the Degrees of Freedom (𝑑𝑓)
a. Factor A
𝑑𝑓𝑎 = 𝑎 − 1 = 2 − 1 = 1
b. Factor B
𝑑𝑓𝑏 = 𝑏 − 1 = 3 − 1 = 2
c. Interaction
𝑑𝑓𝑎𝑏 = (𝑑𝑓𝑎)(𝑑𝑓𝑏) = (1)(2) = 2
d. Within or Error
𝑑𝑓𝑤 = 𝑎𝑏(𝑛 − 1) = (2)(3)(4 − 1) = 18
e. Total
𝑑𝑓𝑡 = 𝑎𝑏𝑛 − 1 = (2)(3)(4) − 1 = 24 − 1 = 23
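The degrees of freedom are simple enough to compute by hand, but a short sketch makes one useful check explicit: the four component values must add up to the total. Variable names are assumptions.

```python
a, b, n = 2, 3, 4  # levels of Factor A, levels of Factor B, replications per block

df_a = a - 1              # Factor A
df_b = b - 1              # Factor B
df_ab = df_a * df_b       # Interaction
df_w = a * b * (n - 1)    # Within or Error
df_t = a * b * n - 1      # Total

# The component degrees of freedom partition the total.
assert df_a + df_b + df_ab + df_w == df_t

print(df_a, df_b, df_ab, df_w, df_t)  # 1 2 2 18 23
```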

Step 4: Calculate the Mean Squares (𝑀𝑆)


a. Factor A
𝑀𝑆𝑎 = 𝑆𝑆𝑎 / 𝑑𝑓𝑎 = 20.17 / 1 = 20.17
b. Factor B
𝑀𝑆𝑏 = 𝑆𝑆𝑏 / 𝑑𝑓𝑏 = 200.33 / 2 = 100.17
c. Interaction
𝑀𝑆𝑎𝑏 = 𝑆𝑆𝑎𝑏 / 𝑑𝑓𝑎𝑏 = 16.33 / 2 = 8.17
d. Within or Error
𝑀𝑆𝑤 = 𝑆𝑆𝑤 / 𝑑𝑓𝑤 = 37 / 18 = 2.06

Step 5: Calculate the F-statistics (𝐹)


a. Factor A
𝐹𝑎 = 𝑀𝑆𝑎 / 𝑀𝑆𝑤 = 20.17 / 2.06 = 9.79
b. Factor B
𝐹𝑏 = 𝑀𝑆𝑏 / 𝑀𝑆𝑤 = 100.17 / 2.06 = 48.63
c. Interaction
𝐹𝑎𝑏 = 𝑀𝑆𝑎𝑏 / 𝑀𝑆𝑤 = 8.17 / 2.06 = 3.97
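Steps 4 and 5 can be sketched together, since each F-statistic is just a ratio of two mean squares. The sketch assumes the sums of squares from Step 2 carried to four decimals (the sheet reports two), and it computes the F-statistics twice: once rounding the mean squares to two decimals first, as the hand calculation does, and once without intermediate rounding. The dictionary names are illustrative.

```python
# Sums of squares from Step 2, carried to four decimals
# (an assumption of this sketch; the sheet reports two decimals).
ss = {"Factor A": 20.1667, "Factor B": 200.3333, "Interaction": 16.3333, "Within": 37.0}
df = {"Factor A": 1, "Factor B": 2, "Interaction": 2, "Within": 18}

# Step 4: each mean square is its sum of squares divided by its degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

effects = ["Factor A", "Factor B", "Interaction"]

# Step 5, following the hand calculation: round the mean squares first.
f_rounded = {s: round(round(ms[s], 2) / round(ms["Within"], 2), 2) for s in effects}

# Step 5 without intermediate rounding.
f_unrounded = {s: round(ms[s] / ms["Within"], 2) for s in effects}

print(f_rounded)    # {'Factor A': 9.79, 'Factor B': 48.63, 'Interaction': 3.97}
print(f_unrounded)  # {'Factor A': 9.81, 'Factor B': 48.73, 'Interaction': 3.97}
```

The two versions differ slightly in the second decimal place for Factor A and Factor B. Both sets of values lie well above the critical values used in Step 6, so the decisions are the same either way.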

Step 6: To decide whether to accept or reject each null hypothesis, we also need to
calculate either of the following:
a. F-critical for each of the sources of variation
Textbooks and other references suggest looking it up in the Table of Critical Values
for the F-Distribution, usually found in the appendices of Statistics books. If you
have a computer with Microsoft Excel, you can instead enter the formula
=F.INV.RT(α, df of the source, df within)
Note that when using this method to decide whether to accept or reject the
null hypothesis:
• If F-statistic < F-critical, accept Ho.
• If F-statistic > F-critical, reject Ho and conclude Ha.



Since we set 𝛼 = 0.05:
Source of Variation   df   F-statistic   Excel formula             F-critical   Decision
Factor A              1    9.79          =F.INV.RT(0.05, 1, 18)    4.41         Reject Ho
Factor B              2    48.63         =F.INV.RT(0.05, 2, 18)    3.55         Reject Ho
Interaction           2    3.97          =F.INV.RT(0.05, 2, 18)    3.55         Reject Ho
Within                18

b. P-value of each of the sources of variation


As an alternative to (a), we can obtain the p-value using Microsoft Excel by
entering the formula
=FDIST(F-statistic, df of the source, df within)
Note that when using this method to decide whether to accept or reject the
null hypothesis:
• If P-value < α, reject Ho and conclude Ha.
• If P-value > α, accept Ho.
Source of Variation   df   F-statistic   Excel formula            P-value   Decision
Factor A              1    9.79          =FDIST(9.79, 1, 18)      0.006     Reject Ho
Factor B              2    48.63         =FDIST(48.63, 2, 18)     0.000     Reject Ho
Interaction           2    3.97          =FDIST(3.97, 2, 18)      0.037     Reject Ho
Within                18
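Without Excel, both the p-values and the critical values can be approximated with the Python standard library. The sketch below is one possible approach, not the sheet's method: it evaluates the right-tail area of the F-distribution through the regularized incomplete beta function (computed with a standard continued-fraction expansion) and finds each critical value by bisection. All function names are hypothetical.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # The even and odd terms of the fraction are applied in turn.
        for aa in (m * (b - m) * x / ((a - 1.0 + m2) * (a + m2)),
                   -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_right_tail(f_stat, df1, df2):
    """P(F > f_stat): the right-tail area that serves as the p-value."""
    return reg_inc_beta(df2 / 2, df1 / 2, df2 / (df2 + df1 * f_stat))

def f_critical(alpha, df1, df2):
    """F value whose right-tail area is alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_right_tail(hi, df1, df2) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_right_tail(mid, df1, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

alpha, df_within = 0.05, 18
for source, f_stat, df_source in [("Factor A", 9.79, 1),
                                  ("Factor B", 48.63, 2),
                                  ("Interaction", 3.97, 2)]:
    p = f_right_tail(f_stat, df_source, df_within)
    crit = f_critical(alpha, df_source, df_within)
    decision = "Reject Ho" if p < alpha else "Accept Ho"
    print(f"{source}: F-critical = {crit:.2f}, p-value = {p:.3f}, {decision}")
```

The two decision rules always agree: the F-statistic exceeds F-critical exactly when the p-value falls below α, so either column of the table leads to the same decision.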

The ANOVA table below summarizes the calculations above.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-statistic   F-critical   P-value   Decision
Factor A              20.17            1                    20.17          9.79          4.41         0.006     Reject Ho
Factor B              200.33           2                    100.17         48.63         3.55         0.000     Reject Ho
Interaction           16.33            2                    8.17           3.97          3.55         0.037     Reject Ho
Within                37.00            18                   2.06
Total                 273.83           23

We therefore conclude the following:


1. There is a significant difference between the test scores when grouped according to their
schedule.
2. There is a significant difference between the test scores when grouped according to the
teaching approach used.
3. There is a significant interaction between the schedule and the teaching approach used.



AS_5a.
An agronomy student sought help from a Statistics student for the statistical analysis of his experiment
involving four types of fertilizer exposed to three different levels of humidity. This gives 12 combinations of
fertilizer and humidity level, each with 3 plants. A partially completed ANOVA table is given. Help the
Statistics student fill in the missing entries (the blank cells) and test the relevant hypotheses using the 0.05 level of
significance.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-statistic   F-critical   P-value   Decision
Factor A              1157             3
Factor B              350              2
Interaction
Within                1501             24
Total                 3779

1. Formulate the null and alternative hypotheses.
a. 𝐻𝑜′:

𝐻𝑎′:

b. 𝐻𝑜′′:

𝐻𝑎′′:

c. 𝐻𝑜′′′:

𝐻𝑎′′′:
2. Decide the level of significance, 𝛼.
The hypotheses will be tested using 𝛼 = ______ level of significance.
3. Choose the appropriate test statistic.
The test that will be used is ___________________________________ because __________________________
_______________________________________________________________________________________________.
4. Compute the value of the statistical test.
Computation:
(Show calculation for each of the missing entries.)
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-statistic   F-critical   P-value   Decision
Factor A              1157             3
Factor B              350              2
Interaction
Within                1501             24
Total                 3779



5. Decide whether to accept or reject the null hypothesis.
Explain the decisions you put on the table above.

6. Draw a conclusion.
Conclusion:



AS_5b.
Thirty-two pupils were selected as samples for an action research study investigating the effect of the type of
grouping of the students and of the remedial intervention given to them. Each pupil was assigned to one of the 8
blocks involved in the study. The collected data are shown below. State all possible hypotheses that
can be formulated for the study and test them at the 0.05 level of significance.

                         Remedial Intervention Given
Type of Groupings        No intervention   Videos   Activity Sheets   Videos & Activity Sheets
Heterogeneous grouping   23                34       38                39
                         29                37       39                32
                         31                36       41                30
                         32                32       38                27
Homogeneous grouping     43                38       39                35
                         39                38       40                36
                         39                39       37                38
                         42                35       36                39
