
One way and Two Way ANOVA, ANCOVA

Amanpreet Singh, (Ph.D.),

Assistant Professor, School of Management Studies, Punjabi University, Patiala
• ANOVA is used to compare means when differences among more than two groups need to be evaluated.
• Although ANOVA is an extension of the t test, it differs from the t test in several ways:
• In a two-sample t test there is only one predictor variable with two categories (men vs. women, brand ‘A’ loyalists vs. brand ‘B’ loyalists); in ANOVA, a predictor with more than two categories, or two or more predictors, can define the groups and help us understand the data better.
Then why not multiple t tests, i.e. Gp1 vs. Gp2, Gp2 vs. Gp3 and Gp3 vs. Gp1?

• At a significance level of 0.05, the probability of a Type 1 error in a single test is only 5% and the probability of no error is 95%.
• With three independent t tests, the probability of no error in any of them is 0.95 x 0.95 x 0.95 = 0.857.
• The probability of at least one error is therefore 1 - 0.857 = 0.143, or 14.3%.
• With more experimental conditions the error rate is even higher: error = 1 - (0.95)^n, where n is the number of tests.
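The error-rate formula can be checked with a minimal Python sketch. The function name `familywise_error_rate` is illustrative and not from the slides, and the calculation assumes the tests are independent, as the slide does.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type 1 error across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Three pairwise t tests among three groups, each run at alpha = 0.05.
rate_three_tests = familywise_error_rate(0.05, 3)
print(f"3 tests: {rate_three_tests:.3f}")

# The rate keeps growing as more comparisons are added.
for n in (1, 3, 6, 10):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

With three tests the sketch reproduces the 14.3% figure worked out above.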
Source: Marketing Research by N K Malhotra, 6th edition

Store   Coupon   Promotion   Sales   Clientele
1       1.00     1.00        10.00   9.00
2       1.00     1.00        9.00    10.00
3       1.00     1.00        10.00   8.00
4       1.00     1.00        8.00    4.00
5       1.00     1.00        9.00    6.00
6       1.00     2.00        8.00    8.00
7       1.00     2.00        8.00    4.00
8       1.00     2.00        7.00    10.00
9       1.00     2.00        9.00    6.00
10      1.00     2.00        6.00    9.00
11      1.00     3.00        5.00    8.00
12      1.00     3.00        7.00    9.00
13      1.00     3.00        6.00    6.00
14      1.00     3.00        4.00    10.00
15      1.00     3.00        5.00    4.00
16      2.00     1.00        8.00    10.00
17      2.00     1.00        9.00    6.00
18      2.00     1.00        7.00    8.00
19      2.00     1.00        7.00    4.00
20      2.00     1.00        6.00    9.00
21      2.00     2.00        4.00    6.00
22      2.00     2.00        5.00    8.00
23      2.00     2.00        5.00    10.00
24      2.00     2.00        6.00    4.00
25      2.00     2.00        4.00    9.00
26      2.00     3.00        2.00    4.00
27      2.00     3.00        3.00    6.00
28      2.00     3.00        2.00    10.00
29      2.00     3.00        1.00    9.00
30      2.00     3.00        2.00    8.00
The completely randomized design (CRD): One-Way ANOVA

• When there is only one independent variable (a single factor), the procedure is called a completely randomized design (CRD) or one-way ANOVA.
• Y = f(X), where Y is the dependent variable and X is the independent variable.
• In the data set, sales is the dependent variable and the independent variables are coupon and promotion.
• If we wish to measure the effect of in-store promotion alone on sales, the analysis is a CRD or one-way ANOVA (OWA).
• Single factor: in-store promotion, with 3 levels (High, Medium and Low).
• Ho: the category means are equal in the population, i.e. µ1 = µ2 = µ3.
• H1: the category means are not all equal; at least one mean is significantly different,
• i.e. µ1 ≠ µ2 ≠ µ3 or µ1 = µ2 ≠ µ3 or µ1 ≠ µ2 = µ3.
• Sources of variation: SST = SSA + SSW

Total Variation (SST) = Among-Groups Variation (SSA) + Within-Group Variation (SSW)
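Decomposition of total variance

\[ SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 \]

\[ SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2 \]

\[ SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \]

where X_{ij} is the i-th observation in group j, \bar{X}_j is the mean of group j, \bar{X} is the grand mean, c is the number of groups and n_j is the number of observations in group j.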
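As a hedged illustration of the decomposition, the sketch below computes SST, SSA and SSW for the sales figures in the data table, grouped by in-store promotion level. It assumes the third column of the table is the promotion level and the fourth is sales; the function and variable names are illustrative.

```python
from statistics import mean

# Sales grouped by in-store promotion level (assumed: column 3 = promotion,
# column 4 = sales in the data table above).
sales_by_promotion = {
    1: [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    2: [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    3: [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}


def sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a dict of group label -> observations."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Total variation: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Among-groups variation: group means against the grand mean.
    ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group variation: observations against their own group mean.
    ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return sst, ssa, ssw


sst, ssa, ssw = sums_of_squares(sales_by_promotion)
print(f"SST={sst:.3f}  SSA={ssa:.3f}  SSW={ssw:.3f}")
```

Under that column assumption, the values agree with the SSA and SST figures quoted later in the effect-size calculation, and SSA + SSW adds up to SST.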

In regression terms, SSA is also called the model sum of squares (SSM) and SSW is called the residual sum of squares (SSR).
• SSA, or SSM, is the variation explained by the model.
• SSW, or SSR, is the variation not explained by the model, i.e. variation due to extraneous factors.
• A mean square is a sum of squares divided by its degrees of freedom.
Degrees of Freedom
• We divide by the degrees of freedom (Df), and not by the number of values used to compute the sum of squares, because we are estimating population quantities from a sample, so some parameters are held constant.
• Df for SST = n - 1 (the grand mean is fixed), i.e. 30 - 1 = 29
• Df for SSA = c - 1 (c group means, with the grand mean fixed), i.e. 3 - 1 = 2
• Df for SSW = n - c (the 3 group means are fixed), i.e. 30 - 3 = 27
Mean Square
• Mean Square Among (MSA) = SSA/(c - 1)
• Mean Square Within (MSW) = SSW/(n - c)
• Mean Square Total (MST) = SST/(n - 1)
• F ratio = MSA/MSW = systematic variation / unsystematic variation
• In regression terms, the F ratio is the ratio of the variation explained by the model to the variation due to unsystematic factors,
• or the ratio of how good the model is to how bad it is.
• Reject Ho if F > Fu (the upper-tail critical value of the F distribution); otherwise do not reject Ho.
• For 2 and 27 Df at the 0.05 level, Fu is 3.35. Here F = (106.067/2)/(79.800/27) ≈ 17.94 > Fu, therefore we reject the null hypothesis.
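The mean squares and the F ratio can be reproduced with a short Python sketch. It takes SSA = 106.067 and SST = 185.867 from the effect-size slide, n = 30 and c = 3 from the degrees-of-freedom slide, and the critical value 3.35 as quoted above; the function name `one_way_f` is illustrative.

```python
def one_way_f(ssa: float, sst: float, n: int, c: int):
    """Return (MSA, MSW, F) for a one-way ANOVA with n observations, c groups."""
    ssw = sst - ssa          # SST = SSA + SSW
    msa = ssa / (c - 1)      # among-groups mean square
    msw = ssw / (n - c)      # within-group mean square
    return msa, msw, msa / msw


msa, msw, f_ratio = one_way_f(ssa=106.067, sst=185.867, n=30, c=3)

# Critical value for 2 and 27 Df at the 0.05 level, as quoted in the slides.
F_CRITICAL = 3.35
reject_h0 = f_ratio > F_CRITICAL

print(f"MSA={msa:.3f}  MSW={msw:.3f}  F={f_ratio:.2f}  reject Ho: {reject_h0}")
```

The critical value is hard-coded here to keep the sketch dependency-free; a statistics library could look it up from the F distribution instead.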
Other Statistics
Estimate of effect size (η²): the size of the effect of ‘X’ on ‘Y’. The value of η² lies between 0 and 1.

η² = (SST - SSW)/SST = SSA/SST

0 means no effect: all category means are equal.

1 means there is no variability within each category of X (SSW = 0) but there is some variability between categories.
η² = 106.067/185.867 = 0.57 (medium-sized effect), i.e. about 57% of the variation in sales is explained by in-store promotion.
Further analysis
• Once it has been shown that the category means are not all equal, several possibilities remain:
(a) µ1 ≠ µ2 ≠ µ3, i.e. all means are significantly different
(b) µ1 = µ2 ≠ µ3 = µ1, i.e. only µ2 and µ3 are significantly different
(c) µ1 ≠ µ2 = µ3 = µ1, i.e. only µ1 and µ2 are significantly different
(d) µ1 ≠ µ3 = µ2 = µ1, i.e. only µ1 and µ3 are significantly different
Post Hoc Test
• One way to find out is to run t tests between pairs of groups, but, as already discussed, this inflates the Type 1 error rate.
• A second way is to adjust the significance level in such a way that the total Type 1 error rate does not exceed 5%.
• To test which means are significantly different and which are not, we do a post hoc test.
• A basic criterion a post hoc test should fulfil is that it controls the Type 1 error rate without losing power (i.e. without inflating the Type 2 error rate).
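One simple way to adjust the significance level is the Bonferroni correction, which divides the overall level by the number of comparisons. The sketch below is only an illustration of that idea for three groups; the group labels and names are placeholders, and the familywise figure it prints assumes independent tests.

```python
from itertools import combinations

FAMILY_ALPHA = 0.05  # total Type 1 error rate we are willing to accept

groups = ["Gp1", "Gp2", "Gp3"]
pairs = list(combinations(groups, 2))  # every pairwise comparison

# Bonferroni: run each pairwise test at alpha / (number of comparisons).
per_test_alpha = FAMILY_ALPHA / len(pairs)

# Familywise error rate if the adjusted tests were independent.
familywise = 1 - (1 - per_test_alpha) ** len(pairs)

print(f"{len(pairs)} comparisons, per-test alpha = {per_test_alpha:.4f}")
print(f"familywise error rate = {familywise:.4f}")
```

The price of this control is reduced power: each individual comparison now needs stronger evidence to be declared significant.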
Different post hoc tests and their reliabilities
• REGWQ has tight control over the Type 1 error rate and is best when all pairs are to be compared, so it should be done.
• Hochberg’s GT2 is designed for situations where sample sizes differ, but it is not reliable when population variances are unequal.
• Tamhane’s T2, Dunnett’s T3, Games-Howell and Dunnett’s C are designed for situations where population variances are unequal; Tamhane’s T2 keeps very tight control over the Type 1 error rate.
• The Games-Howell procedure is the most powerful but is not reliable when the sample size is small.
Summary of post hoc tests
• REGWQ and Tukey tests: should be done when sample sizes are equal and population variances are equal.
• Bonferroni is used when you want very tight control over the Type 1 error rate.
• If sample sizes are slightly different and population variances are equal, use the Gabriel procedure.
• If sample sizes are very different and population variances are equal, use Hochberg’s GT2 test.
• The Games-Howell test is a must because it is used when population variances are not equal. It is recommended to always run this test along with the other tests.
Assumptions of ANOVA
• ANOVA is a parametric test.
• The three assumptions of a parametric test apply:
• Randomness and independence
• Normality
• Homogeneity of variance
• ANOVA is fairly robust against departures from the condition of normality.
• If you have equal sample sizes in each group, inferences based on the F distribution are not seriously affected by unequal variances.
• In case of departure from normality or homogeneity of variance, non-parametric tests are used or the data are transformed.
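As a rough, informal screen for the homogeneity-of-variance assumption, the sample variances of the groups can be compared directly. The sketch below does this for sales grouped by promotion level, assuming the column layout of the data table (third column promotion, fourth column sales). It is not a formal test; in practice a procedure such as Levene's test would be used.

```python
from statistics import variance

# Sales grouped by in-store promotion level (assumed column layout).
sales_by_promotion = {
    1: [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    2: [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    3: [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

# Sample variance (n - 1 denominator) of each group.
group_variances = {
    level: variance(values) for level, values in sales_by_promotion.items()
}

# Ratio of the largest to the smallest variance; values far above 1
# suggest the equal-variance assumption deserves a formal test.
variance_ratio = max(group_variances.values()) / min(group_variances.values())

for level, var in group_variances.items():
    print(f"promotion level {level}: variance = {var:.3f}")
print(f"largest/smallest variance ratio = {variance_ratio:.2f}")
```

Because all three groups have the same size here, the F test is, as noted above, not seriously affected by moderate differences in variance.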
Two Way (or n ways) ANOVA
• How do consumers’ intentions to buy a brand vary with different levels of price and different levels of distribution?
• How do advertising levels (high, medium and low) interact with price levels (high, medium and low) to influence a brand’s preference?
• Do educational level (high school, UG, PG etc.) and age (below 30, 30 to 50, above 50) affect consumption of a brand?
• What is the effect of consumers’ familiarity with a department store (high, medium, low) and store image (positive, neutral and negative) on preference for the store?
Two Way (or n ways) ANOVA
• Used when we study the effect of two factors on one dependent variable.
• Interaction: an interaction between two factors occurs when the effect of one factor on the dependent variable depends on the level or category of the other factor.
• The procedure for two-way/n-way ANOVA is similar to one-way ANOVA and almost all outputs are also similar.
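A minimal sketch of the two-way decomposition for a balanced design follows, using the sales figures from the data table arranged by coupon level and promotion level. It assumes the second column of the table is coupon level, the third is promotion level and the fourth is sales; all names are illustrative, and the shortcut used for the interaction term is valid only because every cell has the same number of stores.

```python
from statistics import mean

# (coupon level, promotion level) -> sales of the five stores in that cell.
cells = {
    (1, 1): [10, 9, 10, 8, 9],
    (1, 2): [8, 8, 7, 9, 6],
    (1, 3): [5, 7, 6, 4, 5],
    (2, 1): [8, 9, 7, 7, 6],
    (2, 2): [4, 5, 5, 6, 4],
    (2, 3): [2, 3, 2, 1, 2],
}


def two_way_anova(cells):
    """Sums of squares, Df and F ratios for a balanced two-factor design."""
    all_values = [y for ys in cells.values() for y in ys]
    grand = mean(all_values)

    def main_effect(axis):
        # Pool observations by the levels of one factor, ignoring the other.
        levels = sorted({key[axis] for key in cells})
        ss = 0.0
        for level in levels:
            ys = [y for key, vals in cells.items() if key[axis] == level for y in vals]
            ss += len(ys) * (mean(ys) - grand) ** 2
        return ss, len(levels) - 1

    ss_a, df_a = main_effect(0)  # coupon
    ss_b, df_b = main_effect(1)  # promotion
    ss_cells = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in cells.values())
    ss_ab, df_ab = ss_cells - ss_a - ss_b, df_a * df_b  # interaction
    ss_err = sum((y - mean(ys)) ** 2 for ys in cells.values() for y in ys)
    df_err = len(all_values) - len(cells)
    ms_err = ss_err / df_err

    def row(ss, df):
        return {"ss": ss, "df": df, "f": (ss / df) / ms_err}

    return {
        "coupon": row(ss_a, df_a),
        "promotion": row(ss_b, df_b),
        "interaction": row(ss_ab, df_ab),
        "error": {"ss": ss_err, "df": df_err, "f": None},
    }


result = two_way_anova(cells)
for source, r in result.items():
    f_text = "" if r["f"] is None else f"  F={r['f']:.2f}"
    print(f"{source:12s} SS={r['ss']:.3f}  Df={r['df']}{f_text}")
```

The F ratios for the two main effects come out far larger than the one for the interaction, which is the pattern a two-way analysis is designed to reveal; significance still has to be judged against the appropriate critical values or p-values.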
• The test of homogeneity is not significant, which means we fail to reject the null hypothesis that the variances are equal.
• The test results for coupon and promotion are significant, which means that higher levels of coupon and promotion are associated with higher sales.
• The interaction between coupon and store promotion is non-significant, hence the effect of one does not depend on the level of the other.
• Management can therefore increase sales by increasing storewide distribution of coupons or in-store promotion separately.

ANCOVA (Analysis of Covariance)

Y = aX1 + bX2 + cX3 + … + mXn + constant
1 continuous DV with normal distribution
2 (or more) categorical or continuous IVs with normal distribution
• Continuous variables that are not part of the main experimental manipulation but have an influence on the DV are known as covariates.
• Suppose we wanted to determine the effect of in-store promotion and couponing on sales while controlling for the effect of clientele (affluence).
• Covariate: clientele
• Null hypothesis (Ho): affluence of the clientele does not have an effect on the sales of the department store.
• Alternate hypothesis (H1): affluence of the clientele affects the sales of the department store.
• The sum of squares for the covariate is 0.838 with 1 Df, which gives an identical mean square value.
• The associated F value is 0.838/0.972 = 0.862, which is not significant (p = 0.363 > 0.05). Therefore we fail to reject the null hypothesis.
• Thus, the conclusion is that the affluence of the clientele does not have an effect on the sales of the department store.
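The covariate test above is plain arithmetic on the reported ANCOVA output, which the short sketch below retraces. The sum of squares, error mean square and p-value are the figures quoted in the slides; the variable names are illustrative.

```python
# Figures reported in the ANCOVA output quoted above.
ss_covariate = 0.838   # sum of squares for clientele
df_covariate = 1
ms_error = 0.972       # error (residual) mean square
p_value = 0.363        # significance reported for the covariate
ALPHA = 0.05

ms_covariate = ss_covariate / df_covariate  # identical to SS because Df = 1
f_covariate = ms_covariate / ms_error

# Decision rule: reject Ho only when the p-value falls below alpha.
reject_h0 = p_value < ALPHA

print(f"F = {f_covariate:.3f}, reject Ho: {reject_h0}")
```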
