Anova Ancova Aman-Seen

One way and Two Way ANOVA, ANCOVA
Amanpreet Singh, (Ph.D.),

Assistant Professor, School of Management Studies, Punjabi University, Patiala
ANOVA
• ANOVA is used to compare means when we need to evaluate differences among more than two populations.
• Although ANOVA is an extension of the t test, it differs from the t test in several ways:
  - In a two-sample t test there is only one predictor variable (men vs. women, brand A loyalists vs. brand B loyalists); in ANOVA two or more predictor variables can define the groups and help us understand the data better.
Then why not multiple t tests, i.e. Gp1 vs. Gp2, Gp2 vs. Gp3 and Gp3 vs. Gp1?
• At a significance level of 0.05, the probability of a Type 1 error in a single test is 5% and the probability of no error is 95%.
• With three independent t tests, the probability of making no Type 1 error is 0.95 x 0.95 x 0.95 = 0.857.
• The probability of at least one Type 1 error is therefore 1 - 0.857 = 0.143, or 14.3%.
• With more experimental conditions the error rate is even higher: error = 1 - (0.95)^n, where n is the number of tests.
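The error-rate formula above can be checked with a few lines of Python. This is a minimal sketch that assumes the tests are independent, as the slide does; the function name `familywise_error` is only illustrative.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type 1 error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Three pairwise t tests at alpha = 0.05, as in the example above.
three_tests = familywise_error(0.05, 3)
print(f"3 tests: {three_tests:.3f}")

# The error rate keeps growing as more comparisons are added.
for n in (6, 10):
    print(f"{n} tests: {familywise_error(0.05, n):.3f}")
```

For three tests this reproduces the 14.3% figure quoted above.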
Source: Marketing Research by N K Malhotra, 6th edition

STORE  COUPON  PROMOTION  SALES  CLIENTELE
    1    1.00       1.00  10.00       9.00
    2    1.00       1.00   9.00      10.00
    3    1.00       1.00  10.00       8.00
    4    1.00       1.00   8.00       4.00
    5    1.00       1.00   9.00       6.00
    6    1.00       2.00   8.00       8.00
    7    1.00       2.00   8.00       4.00
    8    1.00       2.00   7.00      10.00
    9    1.00       2.00   9.00       6.00
   10    1.00       2.00   6.00       9.00
   11    1.00       3.00   5.00       8.00
   12    1.00       3.00   7.00       9.00
   13    1.00       3.00   6.00       6.00
   14    1.00       3.00   4.00      10.00
   15    1.00       3.00   5.00       4.00
   16    2.00       1.00   8.00      10.00
   17    2.00       1.00   9.00       6.00
   18    2.00       1.00   7.00       8.00
   19    2.00       1.00   7.00       4.00
   20    2.00       1.00   6.00       9.00
   21    2.00       2.00   4.00       6.00
   22    2.00       2.00   5.00       8.00
   23    2.00       2.00   5.00      10.00
   24    2.00       2.00   6.00       4.00
   25    2.00       2.00   4.00       9.00
   26    2.00       3.00   2.00       4.00
   27    2.00       3.00   3.00       6.00
   28    2.00       3.00   2.00      10.00
   29    2.00       3.00   1.00       9.00
   30    2.00       3.00   2.00       8.00
The Completely Randomized Design (CRD): One-Way ANOVA
• When there is only one independent variable (a single factor), the procedure is called CRD or one-way ANOVA.
• Y = f(X), where Y is the dependent variable and X is the independent variable.
• In the data set, Sales is the dependent variable and the available independent variables are Coupon and Promotion.
• If we wish to measure the effect of in-store promotion alone on sales, the analysis is a CRD or one-way ANOVA.
• Single factor: in-store promotion, with 3 levels (High, Medium and Low).
Hypothesis
• H0: Category means are equal in the population, i.e. µ1 = µ2 = µ3.
• H1: Category means are not all equal, i.e. at least one mean is significantly different.
• For example µ1 ≠ µ2 ≠ µ3, or µ1 = µ2 ≠ µ3, or µ1 ≠ µ2 = µ3.
• Sources of variation: SST = SSA + SSW
  - Total variation (SST)
  - Among-groups variation (SSA)
  - Within-group variation (SSW)
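Decomposition of total variance

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Here X_ij is observation i in group j, \bar{X}_j is the mean of group j, \bar{X} is the grand mean, n_j is the size of group j and c is the number of groups.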
In regression, SSA is also called the model sum of squares (SSM) and SSW is called the residual sum of squares (SSR).
• SSA or SSM is the model sum of squares, i.e. the total variance explained by the model.
• SSW or SSR is the variance not explained by the model, i.e. variance due to extraneous factors.
• A mean square is a sum of squares divided by its degrees of freedom.
Degrees of Freedom
• We divide by the degrees of freedom (Df) rather than by the number of observations because we are estimating population values from a sample, so some parameters are held constant.
• Df for SST = n - 1 (the grand mean is fixed), i.e. 30 - 1 = 29
• Df for SSA = c - 1 (the c group means are constrained by the grand mean), i.e. 3 - 1 = 2
• Df for SSW = n - c (the 3 group means are fixed), i.e. 30 - 3 = 27
Mean Square
• Mean Square Among (MSA) = SSA/(c - 1)
• Mean Square Within (MSW) = SSW/(n - c)
• Mean Square Total (MST) = SST/(n - 1)
• F ratio = MSA/MSW, or systematic variation / unsystematic variation
• In regression, the F ratio compares the variation explained by the model with the variation due to unsystematic factors.
• In other words, it is the ratio of how good the model is to how bad it is.
Output
• Reject H0 if F > Fu (the upper critical value of F); otherwise do not reject H0.
• For 2 and 27 Df, Fu is 3.35 at the 0.05 level. The computed F exceeds Fu, therefore we reject the null hypothesis.
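The decision rule can be traced numerically. The sketch below starts from the sums of squares reported in these slides (SSA = 106.067, SST = 185.867) and the quoted critical value Fu = 3.35; the variable names are illustrative.

```python
# Values taken from the slides.
ssa, sst = 106.067, 185.867   # among-groups and total sums of squares
n, c = 30, 3                  # number of stores and number of promotion levels
f_critical = 3.35             # Fu for (2, 27) degrees of freedom at the 0.05 level

ssw = sst - ssa               # within-group sum of squares, since SST = SSA + SSW
df_among, df_within = c - 1, n - c

msa = ssa / df_among          # mean square among groups
msw = ssw / df_within         # mean square within groups
f_ratio = msa / msw

reject_h0 = f_ratio > f_critical
print(f"MSA = {msa:.3f}, MSW = {msw:.3f}, F = {f_ratio:.2f}, reject H0: {reject_h0}")
```

With these inputs F comes out at roughly 17.9, well above 3.35, which is why the null hypothesis of equal means is rejected.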
Other Statistics
• Estimate of effect size (η²): the size of the effect of X on Y. The value of η² lies between 0 and 1.
• η² = SSA/SST = (SST - SSW)/SST
• 0 means no effect: all category means are equal.
• 1 means there is no variability within each category of X (SSW = 0) but there is some variability between categories.
• η² = 106.067/185.867 = 0.57 (medium-sized effect)
Further analysis
• Once it is established that the category means are not all equal, several patterns are possible:
(a) µ1 ≠ µ2 ≠ µ3, i.e. all means are significantly different
(b) µ1 = µ2, µ1 = µ3, µ2 ≠ µ3, i.e. only µ2 and µ3 are significantly different
(c) µ1 ≠ µ2, µ2 = µ3, µ3 = µ1, i.e. only µ1 and µ2 are significantly different
(d) µ1 ≠ µ3, µ3 = µ2, µ2 = µ1, i.e. only µ1 and µ3 are significantly different
Post Hoc Test
• One way to find out is to run t tests between pairs of groups, but as already discussed this inflates the Type 1 error rate.
• A second way is to adjust the significance level so that the overall Type 1 error rate does not exceed 5%.
• To test which means are significantly different and which are not, we run a post hoc test.
• A basic criterion for a post hoc test is that it should control the Type 1 error rate without losing power (i.e. without inflating the Type 2 error rate).
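One simple way to adjust the significance level is the Bonferroni correction, which divides the overall level by the number of comparisons. The sketch below is only an illustration of that idea for three groups; the function name is hypothetical, and the error-rate check assumes independent tests.

```python
from itertools import combinations


def bonferroni_alpha(alpha: float, n_groups: int):
    """Return the per-comparison alpha and the number of pairwise comparisons."""
    n_comparisons = len(list(combinations(range(n_groups), 2)))
    return alpha / n_comparisons, n_comparisons


per_test_alpha, n_comparisons = bonferroni_alpha(0.05, 3)

# Overall Type 1 error rate if every pairwise test uses the adjusted level
# (assuming the tests are independent).
overall_error = 1 - (1 - per_test_alpha) ** n_comparisons

print(f"{n_comparisons} comparisons, alpha per test = {per_test_alpha:.4f}")
print(f"overall Type 1 error rate = {overall_error:.4f}")
```

The price of this adjustment is lower power for each individual comparison, which is the trade-off a post hoc test has to balance.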
Different post hoc tests and their reliabilities
• REGWQ has tight control over Type 1 error and works best when all pairs of means are to be compared, so it is a recommended choice.
• Hochberg's GT2 is designed for situations where sample sizes differ, but it is not reliable when population variances are unequal.
• Tamhane's T2, Dunnett's T3, Games-Howell and Dunnett's C are designed for situations where population variances are unequal; Tamhane's T2 keeps very tight control over Type 1 error.
• The Games-Howell procedure is the most powerful of these but is not reliable when sample sizes are small.
Summary of post hoc tests
• REGWQ and Tukey: use when sample sizes are equal and population variances are the same.
• Bonferroni: use when you want very tight control over Type 1 error.
• Gabriel's procedure: use when sample sizes are slightly different and population variances are the same.
• Hochberg's GT2: use when sample sizes are very different and population variances are the same.
• Games-Howell: use when population variances are not the same. It is recommended to run this test alongside the others in all cases.
Assumptions of ANOVA
• ANOVA is a parametric test.
• The three assumptions of parametric tests apply:
  - Randomness and independence
  - Normality
  - Homogeneity of variance
• ANOVA is fairly robust against departures from normality.
• If the sample size is equal in each group, inferences based on the F distribution are not seriously affected by unequal variances.
• When the normality or equal-variance assumptions are violated, non-parametric tests are used or the data are transformed.
Two Way (or n ways) ANOVA
Applications:
• How do consumers' intentions to buy a brand vary with different levels of price and different levels of distribution?
• How do advertising levels (high, medium and low) interact with price levels (high, medium and low) to influence brand preference?
• Do educational level (high school, UG, PG etc.) and age (below 30, 30 to 50, above 50) affect consumption of a brand?
• What is the effect of consumers' familiarity with a department store (high, medium, low) and store image (positive, neutral and negative) on preference for the store?
Two Way (or n ways) ANOVA
• Used when we study the effect of two factors on one dependent variable.
• Interaction: an interaction between two factors occurs when the effect of one factor on the dependent variable depends on the level or category of the other factor.
• The procedure for two-way or n-way ANOVA is similar to one-way ANOVA, and almost all outputs are similar too.
Output
Interpretation
• The test of homogeneity of variance is not significant, so we fail to reject the null hypothesis that the variances are equal.
• The tests for coupon and promotion are both significant, which means that higher levels of couponing and promotion are associated with higher sales.
• The interaction between coupon and in-store promotion is not significant; hence the effect of one factor does not depend on the level of the other.
• Management can therefore increase sales by increasing storewide distribution of coupons or in-store promotion separately.
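The two-way decomposition behind this interpretation can be sketched in plain Python from the store table, with sales grouped by (coupon, promotion) cell. This is an illustrative calculation for a balanced design (5 stores per cell), not the statistical package output the slides refer to; all names are hypothetical.

```python
# Sales keyed by (coupon level, promotion level); 5 stores per cell.
cells = {
    (1, 1): [10, 9, 10, 8, 9],
    (1, 2): [8, 8, 7, 9, 6],
    (1, 3): [5, 7, 6, 4, 5],
    (2, 1): [8, 9, 7, 7, 6],
    (2, 2): [4, 5, 5, 6, 4],
    (2, 3): [2, 3, 2, 1, 2],
}


def mean(values):
    return sum(values) / len(values)


all_sales = [x for v in cells.values() for x in v]
grand_mean = mean(all_sales)


def factor_ss(position):
    """Sum of squares for the factor stored at `position` in the cell key."""
    levels = {}
    for key, values in cells.items():
        levels.setdefault(key[position], []).extend(values)
    return sum(len(v) * (mean(v) - grand_mean) ** 2 for v in levels.values())


ss_coupon = factor_ss(0)
ss_promotion = factor_ss(1)
ss_cells = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in cells.values())
ss_interaction = ss_cells - ss_coupon - ss_promotion
ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

df_coupon, df_promotion = 1, 2
df_interaction = df_coupon * df_promotion
df_error = len(all_sales) - len(cells)
ms_error = ss_error / df_error

f_coupon = (ss_coupon / df_coupon) / ms_error
f_promotion = (ss_promotion / df_promotion) / ms_error
f_interaction = (ss_interaction / df_interaction) / ms_error

print(f"F coupon = {f_coupon:.2f}, F promotion = {f_promotion:.2f}, "
      f"F interaction = {f_interaction:.2f}")
```

The F ratios for the two main effects are large while the interaction F is small, in line with the interpretation above. Judging significance still requires critical values or p-values from an F table or a statistics package.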
ANCOVA (ANALYSIS OF COVARIANCE)
Y = aX1 + bX2 + cX3 + … + mXn + b0, where b0 is the constant term
• 1 continuous DV with normal distribution
• 2 (or more) categorical or continuous IVs with normal distribution
• Continuous variables that are not part of the main experimental manipulation but have an influence on the DV are known as covariates.
Applications
• Suppose we want to determine the effect of in-store promotion and couponing on sales while controlling for the effect of clientele (affluence).
• Covariate: Clientele
Hypothesis
• Null hypothesis (H0): Affluence of the clientele does not have an effect on the sales of the department store.
• Alternate hypothesis (H1): Affluence of the clientele affects the sales of the department store.
Output
Interpretation
• The sum of squares for the covariate is 0.838 with 1 Df, so the mean square is also 0.838.
• The associated F value is 0.838/0.972 = 0.862, which is not significant (p = 0.363 > 0.05). Therefore we fail to reject the null hypothesis.
• Thus, the conclusion is that the affluence of the clientele does not have an effect on the sales of the department store.
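The arithmetic in this interpretation can be traced in a few lines. The sketch below only re-derives the F value from the figures quoted in the slides; the p-value of 0.363 is taken from the reported output, not computed here, and the variable names are illustrative.

```python
# Figures quoted in the interpretation above.
ss_covariate, df_covariate = 0.838, 1
ms_error = 0.972            # error mean square from the ANCOVA output
p_value = 0.363             # reported significance of the covariate
alpha = 0.05

ms_covariate = ss_covariate / df_covariate   # equals the sum of squares, since Df = 1
f_covariate = ms_covariate / ms_error

reject_h0 = p_value < alpha
print(f"F = {f_covariate:.3f}, reject H0: {reject_h0}")
```

Because the reported p-value is above 0.05, the covariate is not treated as having a significant effect on sales.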
