

Analysis of Variance


History of ANOVA
 The ANOVA test dates back to 1918. The concept was introduced
by Sir Ronald Fisher, which is why it is also called the
Fisher Analysis of Variance.
 He had originally wished to publish his work in the journal
Biometrika, but since he was on “not so good” terms with
Karl Pearson, the arrangement could not take place. He
eventually settled on the Journal of Agricultural Science.
What is ANOVA?
 A statistical technique that is used to check if the means of
two or more groups are significantly different from each
other.
 It checks the impact of one or more factors by comparing the
means of different samples.
 It can be used to identify whether one process, among several
processes, performs better than the others.
Identify which is the independent and
dependent variable:
1. A study to determine whether how long a student sleeps affects test
scores.
IV: Length of time spent sleeping
DV: Test scores
2. You want to compare brands of paper towels, to see which holds the
most liquid.
IV: Brand of paper towels
DV: the amount of liquid absorbed
3. You are asked to determine the effect of intermittent fasting on blood
sugar levels.
IV: Presence or absence of intermittent fasting
DV: blood sugar levels
Assumptions of ANOVA
 An ANOVA can only be conducted if there is no relationship between
the subjects in each sample. This means that the subjects in the first
group cannot also be in the second group.
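 The different groups/levels should have equal sample sizes where
possible (a balanced design). ANOVA can still be run with unequal sizes,
but it then becomes more sensitive to unequal variances.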
 An ANOVA can only be conducted if the dependent variable is
normally distributed, so that the middle scores are the most frequent
and extreme scores are least frequent.
 Population variances must be equal, that is, there must be Homogeneity of
Variance.
Terminologies in ANOVA
1. Means (Grand and Sample)
Sample mean (μn) represents the average value for a group.
 Grand mean (μ) represents the average value of sample means of different groups
or mean of all the observations combined.
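
The two kinds of mean can be sketched in a few lines of Python. The group names and scores below are hypothetical, chosen only for illustration. Note that the mean of the sample means equals the mean of all observations only when the groups have equal sizes, as they do here.

```python
from statistics import mean

# Hypothetical scores for three groups (illustrative data, not from the slides).
groups = {
    "A": [4, 5, 6],
    "B": [7, 8, 9],
    "C": [10, 11, 12],
}

# Sample mean: the average value within each group.
sample_means = {name: mean(values) for name, values in groups.items()}

# Grand mean: the average of all observations combined.
all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# With equal group sizes, the mean of the sample means gives the same number.
mean_of_means = mean(list(sample_means.values()))

print(sample_means)
print(grand_mean, mean_of_means)
```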

2. F-Statistic
The F-statistic is the value computed by an ANOVA test. It is the ratio of the
variation between sample means to the variation within the samples, and it is
used to judge whether the differences among the groups are significant.
 The F-statistic is used to make decisions in support of or against the null hypothesis. The
decision rule is that:
1. If the F-statistic is GREATER than the F-critical value, the test is SIGNIFICANT.
2. If the F-statistic is LESS than the F-critical value, the test is NOT SIGNIFICANT.

The higher the F-value in an ANOVA, the higher the variation between sample means
relative to the variation within the samples. The higher the F-value, the lower the
corresponding p-value.
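
As a rough illustration of the decision rule, the sketch below computes F for two hypothetical data sets that have the same spread within each group but different spacing between the group means. The function name `f_statistic`, the data, and the critical value 5.14 (the tabled value for F(2, 6) at the 0.05 level) are assumptions made for this example.

```python
def f_statistic(groups):
    """Return the one-way ANOVA F-value for a list of groups (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation between sample means, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within the samples.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Assumed critical value for F(2, 6) at alpha = 0.05, read from a standard F table.
F_CRITICAL = 5.14

# Hypothetical data: same within-group spread, different distance between means.
close_means = [[4, 5, 6], [5, 6, 7], [6, 7, 8]]
far_means = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

for label, data in [("close", close_means), ("far", far_means)]:
    f_value = f_statistic(data)
    verdict = "SIGNIFICANT" if f_value > F_CRITICAL else "NOT SIGNIFICANT"
    print(f"{label}: F = {f_value:.2f} -> {verdict}")
```

With these numbers the closely spaced groups give F = 3.00 (not significant) and the widely spaced groups give F = 27.00 (significant), which matches the idea that a higher F reflects more variation between sample means relative to the variation within the samples.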
3. Sums of Squares
In the ANOVA test, it is used while computing the value of F.

As the sum of squares tells you about the deviation from the mean, it is
also known as variation.

While calculating the value of F, we need to find SSTotal, which is equal
to the sum of SSEffect and SSError:
SSTotal = SSEffect + SSError
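
The partition SSTotal = SSEffect + SSError can be checked numerically. This is a minimal sketch on hypothetical data; the variable names are illustrative, with `ss_effect` measuring deviations of group means from the grand mean and `ss_error` measuring deviations of observations from their own group mean.

```python
# Hypothetical observations for three groups (illustrative only).
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# SSTotal: squared deviations of every observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# SSEffect: squared deviations of each group mean from the grand mean,
# counted once per observation in that group.
ss_effect = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# SSError: squared deviations of each observation from its own group mean.
ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ss_total, ss_effect, ss_error)
```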
4. Degrees of Freedom (Df)
Degrees of Freedom refers to the maximum number of logically independent
values that have the freedom to vary in a data set.
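
For a one-way ANOVA, the degrees of freedom are usually split in the same way as the sums of squares. The symbols k (number of groups) and N (total number of observations) are introduced here for illustration and do not appear in the original slides.

$$
df_{\text{Effect}} = k - 1, \qquad
df_{\text{Error}} = N - k, \qquad
df_{\text{Total}} = N - 1 = df_{\text{Effect}} + df_{\text{Error}}
$$

For example, 3 groups of 3 observations each would give df_Effect = 2, df_Error = 6, and df_Total = 8.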

5. Mean Squared Error (MSE)

The Mean Squared Error tells us about the average error in a data set. To find
the mean squared error, we just divide the sum of squares by the degrees of
freedom.
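
Written out, and assuming the same division is applied to both the effect and the error sums of squares, the mean squares and the F-statistic are:

$$
MS_{\text{Effect}} = \frac{SS_{\text{Effect}}}{df_{\text{Effect}}}, \qquad
MS_{\text{Error}} = \frac{SS_{\text{Error}}}{df_{\text{Error}}}, \qquad
F = \frac{MS_{\text{Effect}}}{MS_{\text{Error}}}
$$

As a worked example with hypothetical numbers, SSEffect = 54 on 2 degrees of freedom and SSError = 6 on 6 degrees of freedom give:

$$
MS_{\text{Effect}} = \frac{54}{2} = 27, \qquad
MS_{\text{Error}} = \frac{6}{6} = 1, \qquad
F = \frac{27}{1} = 27
$$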
6. Hypothesis (Alternate and Null)
In the ANOVA test, we use a Null Hypothesis (H0) and an Alternate Hypothesis
(H1). The Null Hypothesis states that the group means are equal, that is,
that there is no significant difference among them.
The Alternate Hypothesis states that at least one of the group means is
different from the others.
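
In symbols, for k groups (k is a symbol assumed here, with μ1 to μk standing for the group means):

$$
H_0:\ \mu_1 = \mu_2 = \dots = \mu_k
\qquad \text{versus} \qquad
H_1:\ \mu_i \neq \mu_j \ \text{for at least one pair } (i, j)
$$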

7. Group Variability (Within-group and Between-group)

To understand group variability, we should know about groups first. In the
ANOVA test, a group is the set of samples within the independent variable.
There are variations among the individual groups as well as within the
group. This gives rise to the two terms: Within-group variability and
Between-group variability.
Variation between the sample distributions of the different groups, in
particular between their means, is called between-group variability.
On the other hand, variation in the sample distribution within an
individual group is called within-group variability.

8. p-value of the F-statistic - This shows how likely it is that the F-value
calculated from the test would have occurred if the null hypothesis of no
difference among group means were true.
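
One way to make this definition concrete is a permutation approximation. If the null hypothesis were true, the group labels would be interchangeable, so we can shuffle the observations between groups many times and count how often the shuffled F is at least as large as the observed F. This is a sketch of that swapped-in technique, not the F-distribution lookup a statistics package would normally use. The function names, the data, the number of shuffles, and the seed are all assumptions made for illustration.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F-value; returns infinity if there is no within-group variation."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Approximate the p-value of F by reshuffling observations across groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = f_statistic(groups)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # Cut the shuffled pool back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical data with clearly separated group means.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
p_value = permutation_p_value(groups)
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value here means that an F-value as large as the observed one rarely occurs when the observations are assigned to groups at random.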
1. One-way ANOVA - generally the most used method of
performing the ANOVA test. It is also referred to as
one-factor ANOVA, between-subjects ANOVA, and
independent-factor ANOVA. It is used to compare the means
of two or more independent groups.

To carry out the one-way ANOVA test, you need exactly one
independent variable with at least two levels. With only two
levels, one-way ANOVA gives the same result as an independent
two-sample t-test.
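
The link between the two tests can be sketched numerically: with two groups, the one-way ANOVA F-value equals the square of the pooled-variance (equal-variance) two-sample t-statistic. The data below are hypothetical and the variable names are illustrative.

```python
from statistics import mean, variance

# Hypothetical scores for two independent groups (illustrative only).
a = [4, 5, 6, 7]
b = [6, 8, 9, 9]
n1, n2 = len(a), len(b)

# Independent two-sample t-statistic with pooled (equal) variance.
pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
t_value = (mean(a) - mean(b)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# One-way ANOVA F-statistic for the same two groups.
grand_mean = mean(a + b)
ss_between = n1 * (mean(a) - grand_mean) ** 2 + n2 * (mean(b) - grand_mean) ** 2
ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
f_value = (ss_between / (2 - 1)) / (ss_within / (n1 + n2 - 2))

print(f"t^2 = {t_value ** 2:.4f}, F = {f_value:.4f}")
```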
2. Two-way ANOVA - carried out when you have two independent
variables. It is an extension of one-way ANOVA. You can use the two-way
ANOVA test when your experiment has a quantitative outcome and there
are two independent variables.
Two-way ANOVA is performed in two ways:

2.1. Two-way ANOVA with replication: It is performed when there are two
groups and the members of these groups are doing more than one thing.
2.2. Two-way ANOVA without replication: This is used when you have
only one group but you are double-testing that group.
Two-way ANOVA without replication

A certain type of therapy is used on different individuals, and
their blood pressure levels are observed before, in the middle, and
at the end of therapy.
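
A design like this one has a single reading per individual per time point, so the total variation can be split into a part due to individuals, a part due to time point, and a residual. With only one observation per cell, any interaction between the two factors cannot be separated from the residual. The sketch below uses made-up blood pressure readings and illustrative variable names.

```python
# Hypothetical blood pressure readings (illustrative only): one row per
# individual, one column per time point (before, middle, end of therapy).
readings = [
    [140, 135, 130],
    [150, 146, 139],
    [138, 134, 127],
    [145, 140, 136],
]

r, c = len(readings), len(readings[0])
grand_mean = sum(sum(row) for row in readings) / (r * c)
row_means = [sum(row) / c for row in readings]
col_means = [sum(row[j] for row in readings) / r for j in range(c)]

# Variation attributed to each factor.
ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)  # individuals
ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)  # time points

# Residual variation left after removing both factor effects.
ss_error = sum(
    (readings[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
    for i in range(r)
    for j in range(c)
)
ss_total = sum((x - grand_mean) ** 2 for row in readings for x in row)

# F-statistic for the time-point factor.
df_cols, df_error = c - 1, (r - 1) * (c - 1)
f_time = (ss_cols / df_cols) / (ss_error / df_error)
print(f"F for time point = {f_time:.2f} on ({df_cols}, {df_error}) df")
```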
Two-way ANOVA with replication

Three types of therapy are used in different groups, and each
individual's blood pressure level is measured before, in the
middle, and at the end of the therapy.
Sample Data
Assumptions for Two-way ANOVA
• The population must be close to a normal distribution.
• Samples must be independent.
• Population variances must be equal.
• Groups must have equal sample sizes.
When we have two or more dependent variables, we use MANOVA (multivariate
analysis of variance). The main purpose of the MANOVA test is to find out the
effect of a change in the independent variable on the dependent/response variables.
It answers the following questions:
• Does the change in the independent variable significantly affect the dependent
variables?
• What are interactions among the dependent variables?
• What are interactions between independent variables?
MANOVA is advantageous as compared to ANOVA because it allows you to test
multiple dependent variables at once and protects against Type I errors, where
we reject a true null hypothesis.
Identify what type of ANOVA will be used in
the following:
1. A group of patients who are suffering from fever are given three
different medicines. You want to determine the effectiveness of each
medicine.

Two-Way ANOVA without replication

2. You are researching which type of fertilizer and planting density
produces the greatest crop yield in a field experiment. You assigned
different plots in a field.

Two-Way ANOVA with Replication

3. As a crop researcher, you want to test the effect of three different
fertilizer mixtures on crop yield.

