Two-Way Anova


TWO WAY ANALYSIS OF VARIANCE WITH EQUAL SAMPLE SIZE

Here we study the effect of two factors simultaneously, say factor A and factor B. The studies
can be based on experimental or observational data.

Data for two way Anova

Notations

$Y$ - the response variable

Factor A with levels $i = 1, 2, \ldots, a$

Factor B with levels $j = 1, 2, \ldots, b$

A particular combination of levels is called a treatment or a cell. There are $ab$ treatments.

$Y_{ijk}$ is the $k$th observation for treatment $(i, j)$, where $k = 1, 2, \ldots, n$.

For now, we assume equal sample size in each treatment combination. This is called a balanced
design.

Example

We study the effect on sales of three selling prices (Kshs 59, Kshs 60, Kshs 64) and two types of
promotional campaign (radio advertisement or newspaper advertisement).

Price      Promotional campaign (advertisement)

Kshs 59    Radio
Kshs 60    Radio
Kshs 64    Radio
Kshs 59    Newspaper
Kshs 60    Newspaper
Kshs 64    Newspaper

Here the response $Y$ is the number of items sold. Factor A is price, which has 3 levels, i.e., $a = 3$;
factor B is the type of promotional campaign, which has 2 levels, i.e., $b = 2$. In this case there are
6 treatments, i.e., $ab = 6$.
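
As a minimal sketch, the six treatments of this example can be enumerated by pairing every price level with every campaign type. The variable names `prices`, `campaigns` and `treatments` are illustrative assumptions and not part of the notes.

```python
from itertools import product

# Levels taken from the example; the variable names are illustrative only.
prices = ["Kshs 59", "Kshs 60", "Kshs 64"]   # factor A, a = 3
campaigns = ["Radio", "Newspaper"]           # factor B, b = 2

# Each pair of levels (i, j) is one treatment (cell).
treatments = {
    (i, j): (price, campaign)
    for (i, price), (j, campaign) in product(
        enumerate(prices, start=1), enumerate(campaigns, start=1)
    )
}

for (i, j), (price, campaign) in treatments.items():
    print(f"treatment ({i}, {j}): {price}, {campaign}")

print("number of treatments ab =", len(treatments))
```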

The above can be arranged in rows and columns as a two-way layout:

                                 Factor B (type of promotional campaign)
                                 $j = 1$ (Radio)                          $j = 2$ (Newspaper)
Factor A    $i = 1$ (Kshs 59)    $Y_{111}, Y_{112}, \ldots, Y_{11n}$      $Y_{121}, Y_{122}, \ldots, Y_{12n}$
(price)     $i = 2$ (Kshs 60)    $Y_{211}, Y_{212}, \ldots, Y_{21n}$      $Y_{221}, Y_{222}, \ldots, Y_{22n}$
            $i = 3$ (Kshs 64)    $Y_{311}, Y_{312}, \ldots, Y_{31n}$      $Y_{321}, Y_{322}, \ldots, Y_{32n}$

Model assumptions

We assume that the observations of the response variable are independent and normally
distributed, with a mean that may depend on the levels of factors A and B and a constant
variance $\sigma^2$.

CELL MEANS MODEL

Model formulation

We express the analysis of variance model in terms of the cell (treatment) means $\mu_{ij}$. The model
is of the form

$$Y_{ijk} = \mu_{ij} + \varepsilon_{ijk} \qquad (*)$$

where $\mu_{ij}$ is the theoretical mean, or expected value, of all observations in cell $(i, j)$, and the errors
$\varepsilon_{ijk} \sim N(0, \sigma^2)$ are iid. Consequently the $Y_{ijk} \sim N(\mu_{ij}, \sigma^2)$ are independent.

The model has $ab + 1$ parameters: the $ab$ cell means $\mu_{ij}$, $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, b$, and the common variance $\sigma^2$.

Matrix of the cell means model

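                 Factor B
Factor A    1            2            3            ...    b            Mean
1           $\mu_{11}$   $\mu_{12}$   $\mu_{13}$   ...    $\mu_{1b}$   $\mu_{1\cdot}$
2           $\mu_{21}$   $\mu_{22}$   $\mu_{23}$   ...    $\mu_{2b}$   $\mu_{2\cdot}$
3           $\mu_{31}$   $\mu_{32}$   $\mu_{33}$   ...    $\mu_{3b}$   $\mu_{3\cdot}$
...         ...          ...          ...          ...    ...          ...
a           $\mu_{a1}$   $\mu_{a2}$   $\mu_{a3}$   ...    $\mu_{ab}$   $\mu_{a\cdot}$
Mean        $\mu_{\cdot 1}$   $\mu_{\cdot 2}$   $\mu_{\cdot 3}$   ...    $\mu_{\cdot b}$   $\mu_{\cdot\cdot}$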

Parameter estimates

We estimate $\mu_{ij}$ by the mean of the observations in cell $(i, j)$:

$$\bar{Y}_{ij\cdot} = \frac{\sum_{k=1}^{n} Y_{ijk}}{n},$$

that is, $\hat{\mu}_{ij} = \bar{Y}_{ij\cdot}$, which is the treatment mean.

Parameter definitions

The overall (grand) mean is

$$\mu_{\cdot\cdot} = \frac{\sum_i \sum_j \mu_{ij}}{ab} = \frac{\sum_i \mu_{i\cdot}}{a} = \frac{\sum_j \mu_{\cdot j}}{b}$$

Treatment mean

The mean response for a given treatment is denoted by $\mu_{ij}$, where $i$ refers to the level of factor A
($i = 1, 2, \ldots, a$) and $j$ refers to the level of factor B ($j = 1, 2, \ldots, b$).

Factor level means

The column average for the $j$th column is defined by

$$\mu_{\cdot j} = \frac{\sum_{i=1}^{a} \mu_{ij}}{a}$$

The row average for the $i$th row is defined by

$$\mu_{i\cdot} = \frac{\sum_{j=1}^{b} \mu_{ij}}{b}$$

Example

Suppose we want to determine whether the brand of laundry detergent and the water
temperature affect the amount of dirt removed from laundry. To this end, we buy
two brands of detergent (“super” and “best”) and choose three temperature
levels (“cold”, “warm” and “hot”). We then divide the laundry randomly into $6n$ piles of
equal size and assign $n$ piles to each of the six combinations of detergent and temperature.
The amount of dirt removed, $Y_{ijk}$, when washing pile $k$ ($k = 1, 2, 3, 4$) with
detergent $i$ ($i = 1, 2$) at temperature $j$ ($j = 1, 2, 3$) is recorded in the table below.

                          Factor B (temperature)
                          Cold          Warm              Hot
Factor A       Super      4, 5, 6, 5    7, 9, 8, 12       10, 12, 11, 9
(detergent)    Best       6, 6, 4, 4    13, 15, 12, 12    12, 13, 10, 13

Compute treatment means, factor level means and overall mean.

Solution

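Treatment means

$$\hat{\mu}_{ij} = \bar{Y}_{ij\cdot} = \frac{\sum_k Y_{ijk}}{n}$$

For example, $\hat{\mu}_{13} = (10 + 12 + 11 + 9)/4 = 10.5$.

                          Factor B (temperature)
                          Cold                       Warm                        Hot                            Mean
Factor A       Super      $\hat\mu_{11} = 5$         $\hat\mu_{12} = 9$          $\hat\mu_{13} = 10.5$          $\hat\mu_{1\cdot} = 8.167$
(detergent)    Best       $\hat\mu_{21} = 5$         $\hat\mu_{22} = 13$         $\hat\mu_{23} = 12$            $\hat\mu_{2\cdot} = 10$
               Mean       $\hat\mu_{\cdot 1} = 5$    $\hat\mu_{\cdot 2} = 11$    $\hat\mu_{\cdot 3} = 11.25$    $\hat\mu_{\cdot\cdot} = 9.083$

Factor level means

Column: $\hat\mu_{\cdot j} = \frac{\sum_{i=1}^{a} \hat\mu_{ij}}{a}$, for example $\hat\mu_{\cdot 3} = (10.5 + 12)/2 = 11.25$

Row: $\hat\mu_{i\cdot} = \frac{\sum_{j=1}^{b} \hat\mu_{ij}}{b}$, for example $\hat\mu_{1\cdot} = (5 + 9 + 10.5)/3 = 8.167$

Overall/grand mean

$$\hat\mu_{\cdot\cdot} = \frac{\sum_i \sum_j \hat\mu_{ij}}{ab} = \frac{\sum_i \hat\mu_{i\cdot}}{a} = \frac{\sum_j \hat\mu_{\cdot j}}{b} = \frac{54.5}{6} = 9.083$$

The combination of “best” and “warm” has the largest treatment mean.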
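
The treatment means, factor level means and overall mean can be reproduced with a few lines of Python. This is only a sketch: the nested-list layout `data[i][j]` and all variable names are assumptions made for illustration.

```python
from statistics import mean

# Laundry data from the example: data[i][j] holds the n = 4 observations for
# detergent i (0 = super, 1 = best) at temperature j (0 = cold, 1 = warm, 2 = hot).
data = [
    [[4, 5, 6, 5], [7, 9, 8, 12], [10, 12, 11, 9]],
    [[6, 6, 4, 4], [13, 15, 12, 12], [12, 13, 10, 13]],
]
a, b = len(data), len(data[0])

# Treatment (cell) means
cell_means = [[mean(cell) for cell in row] for row in data]

# Factor level means: average the cell means along a row or a column
row_means = [mean(cell_means[i]) for i in range(a)]
col_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

# Overall mean: average of the row means (valid because the design is balanced)
grand_mean = mean(row_means)

print("cell means:", cell_means)
print("row means:", row_means)
print("column means:", col_means)
print("grand mean:", grand_mean)
```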

Note:

i) $E(Y_{ijk}) = \mu_{ij}$
ii) The Anova model (*) above can also be expressed in matrix form as $Y = X\beta + \varepsilon$

Illustration

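For a two-factor study with each factor having 2 levels, i.e., $a = b = 2$, and two replicates, that is,
$n = 2$ observations in each cell, $Y$, $X$, $\beta$ and $\varepsilon$ can be defined as

$$
Y = \begin{pmatrix} Y_{111} \\ Y_{112} \\ Y_{121} \\ Y_{122} \\ Y_{211} \\ Y_{212} \\ Y_{221} \\ Y_{222} \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
\beta = \begin{pmatrix} \mu_{11} \\ \mu_{12} \\ \mu_{21} \\ \mu_{22} \end{pmatrix} \quad \text{and} \quad
\varepsilon = \begin{pmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \end{pmatrix}
$$

so that

$$
E(Y) = X\beta =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \mu_{11} \\ \mu_{12} \\ \mu_{21} \\ \mu_{22} \end{pmatrix}
=
\begin{pmatrix} \mu_{11} \\ \mu_{11} \\ \mu_{12} \\ \mu_{12} \\ \mu_{21} \\ \mu_{21} \\ \mu_{22} \\ \mu_{22} \end{pmatrix}
$$
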
FACTOR EFFECTS MODEL

The factor effects model is obtained by replacing the treatment means $\mu_{ij}$ with an equivalent expression
in terms of factor effects. For the two-way Anova model, we have $\mu_{ij} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij}$.
Replacing $\mu_{ij}$ in model (*), we get

$$Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

with $\varepsilon_{ijk} \sim N(0, \sigma^2)$, which is the factor effects model for a two-factor study. Note that

$$E(Y_{ijk}) = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij}$$

Since $\mu_{ij} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij}$, this implies $E(Y_{ijk}) = \mu_{ij}$.

Where:

$\mu_{\cdot\cdot}$ - overall mean / grand mean

$\alpha_i$ - main effect of factor A

$\beta_j$ - main effect of factor B

$(\alpha\beta)_{ij}$ - interaction effect between factors A and B


Note: $(\alpha\beta)_{ij}$ is the name of a parameter all on its own and does not refer to the product of $\alpha$ and
$\beta$. This term is referred to as the interaction term.

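Parameter definitions

1. Mean for the $j$th level of factor B

$$\mu_{\cdot j} = \frac{\sum_i \mu_{ij}}{a}$$

2. Mean for the $i$th level of factor A

$$\mu_{i\cdot} = \frac{\sum_j \mu_{ij}}{b}$$

3. Main effect of factor A at the $i$th level

$$\alpha_i = \mu_{i\cdot} - \mu_{\cdot\cdot} \quad \Rightarrow \quad \mu_{i\cdot} = \mu_{\cdot\cdot} + \alpha_i$$

4. Main effect of factor B at the $j$th level

$$\beta_j = \mu_{\cdot j} - \mu_{\cdot\cdot}$$

5. $(\alpha\beta)_{ij}$ is the difference between $\mu_{ij}$ and $\mu_{\cdot\cdot} + \alpha_i + \beta_j$. Starting from
$\mu_{ij} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij}$,

$$
\begin{aligned}
(\alpha\beta)_{ij} &= \mu_{ij} - (\mu_{\cdot\cdot} + \alpha_i + \beta_j) \\
&= \mu_{ij} - \mu_{\cdot\cdot} - (\mu_{i\cdot} - \mu_{\cdot\cdot}) - (\mu_{\cdot j} - \mu_{\cdot\cdot}) \\
&= \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu_{\cdot\cdot}
\end{aligned}
$$

If $\mu_{ij} = \mu_{i\cdot} + \mu_{\cdot j} - \mu_{\cdot\cdot}$, then $(\alpha\beta)_{ij} = 0$ and hence there is no interaction.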

Interacting factor effects

When $\mu_{ij} \neq \mu_{i\cdot} + \mu_{\cdot j} - \mu_{\cdot\cdot}$ we say that the factors interact. The interaction of the $i$th level of
factor A and the $j$th level of factor B is denoted by $(\alpha\beta)_{ij}$ and is defined as above.

Remarks:

i) A model without the interaction term, that is, $\mu_{ij} = \mu_{\cdot\cdot} + \alpha_i + \beta_j$, is called an additive
model.
ii) As in the one-way Anova model, we now have too many parameters, hence we need
several constraints. The constraints are:

• $\sum_i \alpha_i = 0$
• $\sum_j \beta_j = 0$
• $\sum_j (\alpha\beta)_{ij} = 0$ for all $i$
• $\sum_i (\alpha\beta)_{ij} = 0$ for all $j$
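
These constraints are consistent with the definitions $\alpha_i = \mu_{i\cdot} - \mu_{\cdot\cdot}$ and $(\alpha\beta)_{ij} = \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu_{\cdot\cdot}$. A short check, writing the interaction term as $(\alpha\beta)_{ij}$ and using $\sum_i \mu_{i\cdot} = a\mu_{\cdot\cdot}$, $\sum_j \mu_{ij} = b\mu_{i\cdot}$ and $\sum_j \mu_{\cdot j} = b\mu_{\cdot\cdot}$:

```latex
\begin{aligned}
\sum_{i=1}^{a} \alpha_i
  &= \sum_{i=1}^{a} (\mu_{i\cdot} - \mu_{\cdot\cdot})
   = a\mu_{\cdot\cdot} - a\mu_{\cdot\cdot} = 0, \\
\sum_{j=1}^{b} (\alpha\beta)_{ij}
  &= \sum_{j=1}^{b} \mu_{ij} - b\mu_{i\cdot} - \sum_{j=1}^{b} \mu_{\cdot j} + b\mu_{\cdot\cdot}
   = b\mu_{i\cdot} - b\mu_{i\cdot} - b\mu_{\cdot\cdot} + b\mu_{\cdot\cdot} = 0 .
\end{aligned}
```

The same argument with the roles of $i$ and $j$ exchanged gives $\sum_j \beta_j = 0$ and $\sum_i (\alpha\beta)_{ij} = 0$.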

Estimates for the factor effects model

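Parameter                                                                            Estimate
Grand mean $\mu_{\cdot\cdot}$                                                        $\hat\mu_{\cdot\cdot} = \bar Y_{\cdot\cdot\cdot} = \frac{\sum_i \sum_j \sum_k Y_{ijk}}{abn}$
$\alpha_i = \mu_{i\cdot} - \mu_{\cdot\cdot}$                                         $\hat\alpha_i = \hat\mu_{i\cdot} - \hat\mu_{\cdot\cdot} = \bar Y_{i\cdot\cdot} - \bar Y_{\cdot\cdot\cdot}$
$\beta_j = \mu_{\cdot j} - \mu_{\cdot\cdot}$                                         $\hat\beta_j = \hat\mu_{\cdot j} - \hat\mu_{\cdot\cdot} = \bar Y_{\cdot j\cdot} - \bar Y_{\cdot\cdot\cdot}$
$(\alpha\beta)_{ij} = \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu_{\cdot\cdot}$    $\widehat{(\alpha\beta)}_{ij} = \bar Y_{ij\cdot} - \bar Y_{i\cdot\cdot} - \bar Y_{\cdot j\cdot} + \bar Y_{\cdot\cdot\cdot}$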

Example

Suppose we want to determine whether the brand of laundry detergent and the water
temperature affect the amount of dirt removed from laundry. To this end, we buy
two brands of detergent (“super” and “best”) and choose three temperature
levels (“cold”, “warm” and “hot”). We then divide the laundry randomly into $6n$ piles of
equal size and assign $n$ piles to each of the six combinations of detergent and temperature.
The amount of dirt removed, $Y_{ijk}$, when washing pile $k$ ($k = 1, 2, 3, 4$) with
detergent $i$ ($i = 1, 2$) at temperature $j$ ($j = 1, 2, 3$) is recorded in the table below.

                          Factor B (temperature)
                          Cold          Warm              Hot
Factor A       Super      4, 5, 6, 5    7, 9, 8, 12       10, 12, 11, 9
(detergent)    Best       6, 6, 4, 4    13, 15, 12, 12    12, 13, 10, 13

Compute:

i) Overall mean
ii) Main effect of factor A
iii) Main effect of factor B
iv) Interaction effects

Solution

The overall mean is given by

$$\hat\mu_{\cdot\cdot} = \bar Y_{\cdot\cdot\cdot} = \frac{\sum_i \sum_j \sum_k Y_{ijk}}{abn} = \frac{218}{2 \times 3 \times 4} = 9.083$$

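The main effect of factor A

$$\hat\alpha_i = \hat\mu_{i\cdot} - \hat\mu_{\cdot\cdot} = \bar Y_{i\cdot\cdot} - \bar Y_{\cdot\cdot\cdot}$$

The detergent means are $\bar Y_{1\cdot\cdot} = 98/12 = 8.167$ and $\bar Y_{2\cdot\cdot} = 120/12 = 10$, and the overall mean is $218/24 = 9.083$, so

$\hat\alpha_1 = 98/12 - 218/24 = -22/24 = -0.917$
$\hat\alpha_2 = 120/12 - 218/24 = 22/24 = 0.917$

This indicates that the “best” laundry detergent increases the amount of dirt removed by an
average of 0.917 relative to the overall mean, while the “super” laundry detergent decreases it by an
average of 0.917.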

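Main effect of factor B

$$\hat\beta_j = \hat\mu_{\cdot j} - \hat\mu_{\cdot\cdot} = \bar Y_{\cdot j\cdot} - \bar Y_{\cdot\cdot\cdot}$$

The temperature means are $\bar Y_{\cdot 1\cdot} = 40/8 = 5$, $\bar Y_{\cdot 2\cdot} = 88/8 = 11$ and $\bar Y_{\cdot 3\cdot} = 90/8 = 11.25$, so

$\hat\beta_1 = 5 - 9.083 = -4.083$

$\hat\beta_2 = 11 - 9.083 = 1.917$

$\hat\beta_3 = 11.25 - 9.083 = 2.167$

Warm and hot water increase the amount of dirt removed by averages of 1.917 and 2.167 respectively, while cold
water decreases it by an average of 4.083.

Note: $\sum_i \hat\alpha_i = 0$ and $\sum_j \hat\beta_j = 0$ (up to rounding).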

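Interaction effects

$$\widehat{(\alpha\beta)}_{ij} = \hat\mu_{ij} - \hat\mu_{i\cdot} - \hat\mu_{\cdot j} + \hat\mu_{\cdot\cdot} = \hat\mu_{ij} - \hat\mu_{\cdot\cdot} - \hat\alpha_i - \hat\beta_j$$

$\widehat{(\alpha\beta)}_{11} = \hat\mu_{11} - \hat\mu_{\cdot\cdot} - \hat\alpha_1 - \hat\beta_1 = 5 - 9.083 - (-0.917) - (-4.083) = 0.917$

$\widehat{(\alpha\beta)}_{12} = \hat\mu_{12} - \hat\mu_{\cdot\cdot} - \hat\alpha_1 - \hat\beta_2 = 9 - 9.083 - (-0.917) - 1.917 = -1.083$

$\widehat{(\alpha\beta)}_{13} = \hat\mu_{13} - \hat\mu_{\cdot\cdot} - \hat\alpha_1 - \hat\beta_3 = 10.5 - 9.083 - (-0.917) - 2.167 = 0.167$

$\widehat{(\alpha\beta)}_{21} = -0.917$, $\widehat{(\alpha\beta)}_{22} = 1.083$, $\widehat{(\alpha\beta)}_{23} = -0.167$

Note: $\sum_j \widehat{(\alpha\beta)}_{ij} = 0$ for each $i$ and $\sum_i \widehat{(\alpha\beta)}_{ij} = 0$ for each $j$.
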
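
The factor effects estimates and their sum-to-zero properties can be checked numerically. The sketch below assumes the laundry data are stored as a nested list `data[i][j]`; that layout and the variable names are illustrative assumptions, not part of the notes.

```python
from statistics import mean

# data[i][j]: observations for detergent i (super, best) at temperature j (cold, warm, hot)
data = [
    [[4, 5, 6, 5], [7, 9, 8, 12], [10, 12, 11, 9]],
    [[6, 6, 4, 4], [13, 15, 12, 12], [12, 13, 10, 13]],
]
a, b = len(data), len(data[0])

grand = mean(y for row in data for cell in row for y in cell)
cell = [[mean(c) for c in row] for row in data]
row_m = [mean(y for c in data[i] for y in c) for i in range(a)]
col_m = [mean(y for i in range(a) for y in data[i][j]) for j in range(b)]

# Main effects: factor level mean minus overall mean
alpha = [row_m[i] - grand for i in range(a)]
beta = [col_m[j] - grand for j in range(b)]

# Interaction effects: cell mean - row mean - column mean + overall mean
inter = [
    [cell[i][j] - row_m[i] - col_m[j] + grand for j in range(b)]
    for i in range(a)
]

print("alpha:", [round(x, 3) for x in alpha])
print("beta:", [round(x, 3) for x in beta])
print("interaction:", [[round(x, 3) for x in r] for r in inter])
print("sum of alpha:", round(sum(alpha), 9), "sum of beta:", round(sum(beta), 9))
```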
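
Partitioning of total sum of squares

$$SST = \sum_i \sum_j \sum_k (y_{ijk} - \bar y_{\cdot\cdot\cdot})^2$$

$$SSTr = n \sum_i \sum_j (\bar y_{ij\cdot} - \bar y_{\cdot\cdot\cdot})^2$$

$$SSE = \sum_i \sum_j \sum_k (y_{ijk} - \bar y_{ij\cdot})^2$$

$$SST = SSTr + SSE$$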
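
As a quick numerical check of this identity, the sketch below computes the three sums of squares for the laundry data used in the examples. The nested-list layout and variable names are assumptions for illustration.

```python
from statistics import mean

# data[i][j]: the n = 4 observations in cell (i, j) of the laundry example
data = [
    [[4, 5, 6, 5], [7, 9, 8, 12], [10, 12, 11, 9]],
    [[6, 6, 4, 4], [13, 15, 12, 12], [12, 13, 10, 13]],
]
n = len(data[0][0])

observations = [y for row in data for cell in row for y in cell]
grand = mean(observations)

# Total: every observation against the overall mean
sst = sum((y - grand) ** 2 for y in observations)
# Treatments: every cell mean against the overall mean, weighted by n
sstr = n * sum((mean(cell) - grand) ** 2 for row in data for cell in row)
# Error: every observation against its own cell mean
sse = sum((y - mean(cell)) ** 2 for row in data for cell in row for y in cell)

print(f"SST = {sst:.3f}, SSTr = {sstr:.3f}, SSE = {sse:.3f}")
print("SST - (SSTr + SSE) =", round(sst - (sstr + sse), 9))
```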


Partitioning of treatment sum of squares

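We shall decompose the estimated treatment mean deviation $\bar y_{ij\cdot} - \bar y_{\cdot\cdot\cdot}$ into components
reflecting the factor A main effect, the factor B main effect and the interaction of factors A and B:

$$\bar y_{ij\cdot} - \bar y_{\cdot\cdot\cdot} = (\bar y_{i\cdot\cdot} - \bar y_{\cdot\cdot\cdot}) + (\bar y_{\cdot j\cdot} - \bar y_{\cdot\cdot\cdot}) + (\bar y_{ij\cdot} - \bar y_{i\cdot\cdot} - \bar y_{\cdot j\cdot} + \bar y_{\cdot\cdot\cdot})$$

Squaring both sides and summing over all observations (the cross-product terms vanish in a balanced design) gives

$$SSTr = SSA + SSB + SSAB$$

where

$$SSA = nb \sum_{i=1}^{a} (\bar y_{i\cdot\cdot} - \bar y_{\cdot\cdot\cdot})^2$$

$$SSB = na \sum_{j=1}^{b} (\bar y_{\cdot j\cdot} - \bar y_{\cdot\cdot\cdot})^2$$

$$SSAB = n \sum_i \sum_j (\bar y_{ij\cdot} - \bar y_{i\cdot\cdot} - \bar y_{\cdot j\cdot} + \bar y_{\cdot\cdot\cdot})^2$$

The interaction sum of squares can also be obtained as a remainder, since

$$SSAB = SST - SSA - SSB - SSE \quad \text{or} \quad SSAB = SST - SSE - SSA - SSB$$

But $SSTr = SST - SSE$, therefore

$$SSAB = SSTr - SSA - SSB$$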


Partitioning degrees of freedom

For a two-factor study with $n$ cases for each treatment, there are a total of $n_T = nab$ cases and
$r = ab$ treatments. Hence SST has $nab - 1$ degrees of freedom, SSTr has $ab - 1$ and SSE has $nab - ab = ab(n - 1)$.

Further partitioning of the SSTr degrees of freedom yields the following:

SSA has $a - 1$ degrees of freedom

SSB has $b - 1$ degrees of freedom

SSAB has $(ab - 1) - (a - 1) - (b - 1)$ degrees of freedom, which is equal to $(a - 1)(b - 1)$
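
The partition of the treatment sum of squares and of its degrees of freedom can be verified numerically. The following sketch uses the laundry data from the examples ($a = 2$, $b = 3$, $n = 4$); the data layout and all names are illustrative assumptions.

```python
from statistics import mean

data = [
    [[4, 5, 6, 5], [7, 9, 8, 12], [10, 12, 11, 9]],
    [[6, 6, 4, 4], [13, 15, 12, 12], [12, 13, 10, 13]],
]
a, b, n = len(data), len(data[0]), len(data[0][0])

grand = mean(y for row in data for cell in row for y in cell)
cell = [[mean(c) for c in row] for row in data]
row_m = [mean(y for c in data[i] for y in c) for i in range(a)]
col_m = [mean(y for i in range(a) for y in data[i][j]) for j in range(b)]

ssa = n * b * sum((m - grand) ** 2 for m in row_m)
ssb = n * a * sum((m - grand) ** 2 for m in col_m)
ssab = n * sum(
    (cell[i][j] - row_m[i] - col_m[j] + grand) ** 2
    for i in range(a) for j in range(b)
)
sstr = n * sum((cell[i][j] - grand) ** 2 for i in range(a) for j in range(b))

# Degrees of freedom for each source of variation
df = {
    "A": a - 1,
    "B": b - 1,
    "AB": (a - 1) * (b - 1),
    "Tr": a * b - 1,
    "E": a * b * (n - 1),
    "T": n * a * b - 1,
}

print(f"SSA = {ssa:.3f}, SSB = {ssb:.3f}, SSAB = {ssab:.3f}, SSTr = {sstr:.3f}")
print("degrees of freedom:", df)
```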

Questions of interest in two way Anova

1. Is there a significant interaction between the two factors being studied?

This question needs to be answered first because if we conclude that there is significant
interaction, then both effects are important and hence each effect cannot be discussed
individually.

If we conclude that there is no significant interaction between the factors being studied, then we
can test the effects individually.

2. Is there a significant factor A effect?


3. Is there a significant factor B effect?

Hypothesis for two-way Anova

The three questions of interest in a two-way Anova can be formulated in terms of the following
parameter values.

1. For testing interaction between factors A and B, we have

$H_0: (\alpha\beta)_{ij} = 0$ for all treatment combinations
$H_1: (\alpha\beta)_{ij} \neq 0$ for some treatment combinations
2. For testing factor A effects, we have
$H_0: \alpha_i = 0$ for all $i$
$H_1: \alpha_i \neq 0$ for some $i$
3. For testing factor B effects, we have
$H_0: \beta_j = 0$ for all $j$
$H_1: \beta_j \neq 0$ for some $j$

Mean squares

The mean squares are the sum of squares divided by their degrees of freedom.

$$MSA = \frac{SSA}{a - 1}, \quad MSB = \frac{SSB}{b - 1}, \quad MSAB = \frac{SSAB}{(a - 1)(b - 1)} \quad \text{and} \quad MSE = \frac{SSE}{ab(n - 1)}$$

Test statistic

The sampling distribution of each test statistic under $H_0$ is an F-distribution. The test statistic for:

i) Interaction
$$F_{AB} = \frac{MSAB}{MSE} \sim F_{(a-1)(b-1),\, ab(n-1)}$$
Reject $H_0$ if $F_{AB} > F_{\alpha,\,(a-1)(b-1),\, ab(n-1)}$; otherwise do not reject $H_0$.
ii) Main effects of factor A
$$F_A = \frac{MSA}{MSE} \sim F_{a-1,\, ab(n-1)}$$
Reject $H_0$ if $F_A > F_{\alpha,\, a-1,\, ab(n-1)}$.
iii) Main effects of factor B
$$F_B = \frac{MSB}{MSE} \sim F_{b-1,\, ab(n-1)}$$
Reject $H_0$ if $F_B > F_{\alpha,\, b-1,\, ab(n-1)}$.

Anova table for two-way factor study with fixed effects model

Source of variation    Sum of squares    Degrees of freedom    Mean squares                              F-ratio
Factor A               SSA               $a - 1$               $MSA = \frac{SSA}{a - 1}$                 $F_A = \frac{MSA}{MSE}$
Factor B               SSB               $b - 1$               $MSB = \frac{SSB}{b - 1}$                 $F_B = \frac{MSB}{MSE}$
Interaction AB         SSAB              $(a - 1)(b - 1)$      $MSAB = \frac{SSAB}{(a - 1)(b - 1)}$      $F_{AB} = \frac{MSAB}{MSE}$
Error                  SSE               $ab(n - 1)$           $MSE = \frac{SSE}{ab(n - 1)}$
Total                  SST               $nab - 1$
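
The table can be filled in mechanically once the four sums of squares are known. The sketch below is one possible way to do so; the function name `anova_table` and the numeric sums of squares passed to it are hypothetical and serve only to show the arithmetic.

```python
def anova_table(ssa, ssb, ssab, sse, a, b, n):
    """Return rows (source, SS, df, MS, F) of the balanced two-way Anova table."""
    df_e = a * b * (n - 1)
    mse = sse / df_e
    rows = []
    for source, ss, df in [
        ("Factor A", ssa, a - 1),
        ("Factor B", ssb, b - 1),
        ("Interaction AB", ssab, (a - 1) * (b - 1)),
    ]:
        ms = ss / df
        rows.append((source, ss, df, ms, ms / mse))   # F-ratio = MS / MSE
    rows.append(("Error", sse, df_e, mse, None))
    rows.append(("Total", ssa + ssb + ssab + sse, n * a * b - 1, None, None))
    return rows


# Hypothetical sums of squares, for illustration only
table = anova_table(ssa=30.0, ssb=60.0, ssab=12.0, sse=36.0, a=2, b=3, n=4)
for row in table:
    print(row)
```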

Example

Suppose we want to determine whether the brand of laundry detergent and the water
temperature affect the amount of dirt removed from laundry. To this end, we buy
two brands of detergent (“super” and “best”) and choose three temperature
levels (“cold”, “warm” and “hot”). We then divide the laundry randomly into $6n$ piles of
equal size and assign $n$ piles to each of the six combinations of detergent and temperature.
The amount of dirt removed, $Y_{ijk}$, when washing pile $k$ ($k = 1, 2, 3, 4$) with
detergent $i$ ($i = 1, 2$) at temperature $j$ ($j = 1, 2, 3$) is recorded in the table below.

                          Factor B (temperature)
                          Cold          Warm              Hot
Factor A       Super      4, 5, 6, 5    7, 9, 8, 12       10, 12, 11, 9
(detergent)    Best       6, 6, 4, 4    13, 15, 12, 12    12, 13, 10, 13

Test whether the amount of dirt removed depends on the type of detergent and the water temperature.
Use $\alpha = 5\%$.

Solution

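We first test for interaction.

$H_0: (\alpha\beta)_{ij} = 0$ for all $(i, j)$
$H_1: (\alpha\beta)_{ij} \neq 0$ for some $(i, j)$

From the data, the treatment means are $\bar y_{11\cdot} = 5$, $\bar y_{12\cdot} = 9$, $\bar y_{13\cdot} = 10.5$, $\bar y_{21\cdot} = 5$, $\bar y_{22\cdot} = 13$ and $\bar y_{23\cdot} = 12$; the detergent means are $\bar y_{1\cdot\cdot} = 8.167$ and $\bar y_{2\cdot\cdot} = 10$; the temperature means are $\bar y_{\cdot 1\cdot} = 5$, $\bar y_{\cdot 2\cdot} = 11$ and $\bar y_{\cdot 3\cdot} = 11.25$; and the overall mean is $\bar y_{\cdot\cdot\cdot} = 9.083$.

Computing the interaction sum of squares:

$$SSAB = n \sum_i \sum_j (\bar y_{ij\cdot} - \bar y_{i\cdot\cdot} - \bar y_{\cdot j\cdot} + \bar y_{\cdot\cdot\cdot})^2 = 4 \sum_{i=1}^{2} \sum_{j=1}^{3} (\bar y_{ij\cdot} - \bar y_{i\cdot\cdot} - \bar y_{\cdot j\cdot} + \bar y_{\cdot\cdot\cdot})^2$$

$$= 4\left[(5 - 8.167 - 5 + 9.083)^2 + (9 - 8.167 - 11 + 9.083)^2 + (10.5 - 8.167 - 11.25 + 9.083)^2 + (5 - 10 - 5 + 9.083)^2 + (13 - 10 - 11 + 9.083)^2 + (12 - 10 - 11.25 + 9.083)^2\right]$$

$$= 4\left[0.917^2 + (-1.083)^2 + 0.167^2 + (-0.917)^2 + 1.083^2 + (-0.167)^2\right] = 4(4.083) = 16.333$$

The degrees of freedom are $(a - 1)(b - 1) = 2$.

$$MSAB = \frac{16.333}{2} = 8.167$$

$$SSE = \sum_{i=1}^{2} \sum_{j=1}^{3} \sum_{k=1}^{4} (y_{ijk} - \bar y_{ij\cdot})^2 = \left[(4 - 5)^2 + (5 - 5)^2 + (6 - 5)^2 + (5 - 5)^2\right] + \cdots + \left[(12 - 12)^2 + (13 - 12)^2 + (10 - 12)^2 + (13 - 12)^2\right]$$

$$= 2 + 14 + 5 + 4 + 6 + 6 = 37$$

The degrees of freedom are $ab(n - 1) = 2 \times 3 \times 3 = 18$.

$$MSE = \frac{37}{18} = 2.056$$

$$F_{AB} = \frac{8.167}{2.056} = 3.973$$

$$F_{0.05,\,2,\,18} = 3.555$$

Since $3.973 > 3.555$, we reject $H_0$ at $\alpha = 5\%$. This means that the interaction between detergent and temperature is
significant, so the factor A and factor B effects should be interpreted together with the interaction rather than
individually. We nevertheless compute the tests for the factor A effects and the factor B effects.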

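Computing for factor A

$H_0: \alpha_i = 0$ for all $i$
$H_1: \alpha_i \neq 0$ for some $i$

$$SSA = nb \sum_{i=1}^{a} (\bar y_{i\cdot\cdot} - \bar y_{\cdot\cdot\cdot})^2 = 4 \times 3\left[(8.167 - 9.083)^2 + (10 - 9.083)^2\right] = 12(0.840 + 0.840) = 20.167$$

Degrees of freedom $= a - 1 = 1$

$$MSA = \frac{20.167}{1} = 20.167, \quad F_A = \frac{MSA}{MSE} = \frac{20.167}{2.056} = 9.811$$

$$F_{0.05,\,1,\,18} = 4.414$$

Since $9.811 > 4.414$, we reject $H_0$ at $\alpha = 5\%$. Therefore, the amount of dirt removed depends on the
type of detergent used.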

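Computing for factor B

$H_0: \beta_j = 0$ for all $j$
$H_1: \beta_j \neq 0$ for some $j$

$$SSB = na \sum_{j=1}^{b} (\bar y_{\cdot j\cdot} - \bar y_{\cdot\cdot\cdot})^2 = 4 \times 2\left[(5 - 9.083)^2 + (11 - 9.083)^2 + (11.25 - 9.083)^2\right] = 8(16.674 + 3.674 + 4.694) = 200.333$$

Degrees of freedom $= b - 1 = 2$

$$MSB = \frac{SSB}{b - 1} = \frac{200.333}{2} = 100.167$$

$$F_B = \frac{MSB}{MSE} = \frac{100.167}{2.056} = 48.730$$

$$F_{0.05,\,2,\,18} = 3.555$$

Since $48.730 > 3.555$, we reject $H_0$ at $\alpha = 5\%$. Therefore, the amount of dirt removed depends on
the temperature of water used.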
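
The whole test procedure for a balanced two-way layout can be sketched in a short function. The function name `two_way_anova`, the nested-list data layout and the dictionary of tabulated 5% critical values are assumptions made for illustration; the critical values are read from an F table rather than computed.

```python
from statistics import mean


def two_way_anova(data):
    """Return the F-ratios for factor A, factor B and interaction AB.

    data[i][j] is the list of n observations in cell (i, j); the design
    is assumed to be balanced (same n in every cell).
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    grand = mean(y for row in data for c in row for y in c)
    cell = [[mean(c) for c in row] for row in data]
    row_m = [mean(y for c in data[i] for y in c) for i in range(a)]
    col_m = [mean(y for i in range(a) for y in data[i][j]) for j in range(b)]

    ssa = n * b * sum((m - grand) ** 2 for m in row_m)
    ssb = n * a * sum((m - grand) ** 2 for m in col_m)
    ssab = n * sum(
        (cell[i][j] - row_m[i] - col_m[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    sse = sum(
        (y - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for y in data[i][j]
    )
    mse = sse / (a * b * (n - 1))
    return {
        "A": ssa / (a - 1) / mse,
        "B": ssb / (b - 1) / mse,
        "AB": ssab / ((a - 1) * (b - 1)) / mse,
    }


laundry = [
    [[4, 5, 6, 5], [7, 9, 8, 12], [10, 12, 11, 9]],       # super
    [[6, 6, 4, 4], [13, 15, 12, 12], [12, 13, 10, 13]],   # best
]
f_ratios = two_way_anova(laundry)

# Tabulated upper 5% points: F(2, 18) for AB and B, F(1, 18) for A
critical = {"AB": 3.555, "A": 4.414, "B": 3.555}

for effect in ("AB", "A", "B"):
    decision = "reject H0" if f_ratios[effect] > critical[effect] else "do not reject H0"
    print(f"{effect}: F = {f_ratios[effect]:.3f}, critical = {critical[effect]}, {decision}")
```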
