ECON 256 LECTURE 10 Analysis of Variance


ANALYSIS OF VARIANCE

Pt 1
Introduction
In the previous lectures we developed methods to determine whether there is a
difference between two population means. What if we wanted to compare more
than two population means? The two-sample test used in the previous chapter
requires that the population means be compared two at a time. This has two drawbacks:
1. It is time consuming.
2. There is a build-up of Type I error, i.e. the total value of α becomes
quite large as the number of comparisons increases.
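To see how quickly the Type I error builds up, the short sketch below counts the pairwise comparisons needed for k means and the chance of at least one false rejection. It assumes, purely for illustration, that the comparisons are independent and each is run at α = 0.05; the function names are hypothetical.

```python
from math import comb

def pairwise_comparisons(k):
    """Number of two-sample tests needed to compare k means two at a time."""
    return comb(k, 2)

def familywise_error(k, alpha=0.05):
    """Chance of at least one Type I error across all pairwise tests,
    assuming (for illustration only) that the tests are independent."""
    m = pairwise_comparisons(k)
    return 1 - (1 - alpha) ** m

for k in (3, 4, 5):
    print(k, pairwise_comparisons(k), round(familywise_error(k), 3))
```

Under these assumptions the overall error rate grows well beyond 0.05 as soon as more than two means are compared, which is the motivation for a single simultaneous test.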
Introduction
ANOVA is a statistical technique for testing whether several
populations have the same mean.
A second test compares two sample variances to determine
whether the population variances are equal. This test is useful in
validating a requirement of the two-sample t-test presented
in the previous lecture.
That t-test assumed that the population standard deviations
were equal but unknown.
The F Distribution
The F distribution was named to honour Sir Ronald Fisher, one of the founders
of modern-day statistics. This probability distribution is used as the distribution
of the test statistic in several situations.
It is used to test whether two samples are from populations having equal
variances
It is also applied when we want to compare several population means
simultaneously.
The simultaneous comparison of several population means is called analysis of
variance (ANOVA).
In these situations the populations must be normal and the data must be at least
interval-scale.
The F Distribution
The F Distribution is a continuous probability distribution where F is always 0 or
positive. The distribution is positively skewed. It is based on 2 parameters, the df
in the numerator and the df in the denominator.

CHARACTERISTICS OF THE F DISTRIBUTION


1. The F distribution is positively skewed.
2. F cannot be negative; the smallest value it can assume is 0.
3. F is a continuous distribution. This means it can assume an infinite number of
values between 0 and plus infinity (+∞).
4. The F distribution is asymptotic. As the value of F increases, the curve approaches
the x-axis but never touches it.
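The characteristics listed above can be checked informally by simulation. The sketch below is an illustration, not part of the lecture: it draws pairs of samples from a standard normal population and forms the ratio of their sample variances, which follows an F distribution. The df values 6 and 7, the seed, and the function name are arbitrary choices.

```python
import random
import statistics

def simulate_f(df_num, df_den, draws=5000, seed=1):
    """Simulate F ratios as s1^2 / s2^2 from two standard normal samples.
    A sample of size df + 1 has a variance with df degrees of freedom."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        a = [rng.gauss(0, 1) for _ in range(df_num + 1)]
        b = [rng.gauss(0, 1) for _ in range(df_den + 1)]
        ratios.append(statistics.variance(a) / statistics.variance(b))
    return ratios

ratios = simulate_f(6, 7)
# F is never negative
print(min(ratios) >= 0)
# Positive skew: the long right tail pulls the mean above the median
print(statistics.mean(ratios) > statistics.median(ratios))
```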
CHART
COMPARING TWO POPULATION VARIANCES
The F-distribution in this section is used to test the hypothesis that the variance
of one normal population equals the variance of another normal population.
That is, this test is useful for determining whether one normal population has
more variance than the other.
The value of the test statistic is determined using this formula

TEST STATISTIC FOR COMPARING TWO VARIANCES

$F = \frac{s_1^2}{s_2^2}$

where $s_1^2$ and $s_2^2$ are the variances of the first and second samples, respectively.
Comparing Two Variances
The F ratio is determined by putting the larger of the two sample variances in the
numerator and the smaller in the denominator.
Thus the F ratio is always at least 1.00.
This allows us to always use the upper tail of the F distribution and avoids the need for more
extensive F tables.
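A minimal sketch of this convention is shown below. The helper name `f_ratio` and the input values are made up for illustration; the function takes each sample's standard deviation and size and returns the ratio together with the matching degrees of freedom.

```python
def f_ratio(s1, n1, s2, n2):
    """Return (F, df_numerator, df_denominator) with the larger
    sample variance placed in the numerator."""
    var1, var2 = s1 ** 2, s2 ** 2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    # Swap the samples so that the ratio is at least 1
    return var2 / var1, n2 - 1, n1 - 1

# Hypothetical inputs: s = 3.0 with n = 10, and s = 6.0 with n = 12
print(f_ratio(3.0, 10, 6.0, 12))
```

Note that the degrees of freedom travel with their variance: when the samples are swapped, the numerator df comes from the sample that supplied the larger variance.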
Regardless of whether we want to determine if one population has more
variation than another population or validate an assumption for a statistical test,
we first state the null hypothesis.
• The null hypothesis is that the variance of one normal population, $\sigma_1^2$, equals the
variance of the other normal population, $\sigma_2^2$.
• The alternate hypothesis could be that the variances differ. This test of
hypothesis is written as:
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
Comparing Two Variances
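For a two-tailed test, the critical value of F is found by dividing the significance level in half ($\alpha/2$) and
referring to the F table at the appropriate numbers of df. A two-tailed test therefore usually uses a significance
level of 0.10 or 0.02, which gives an upper-tail area of 0.05 or 0.01.
For a one-tailed test at α = 0.05, do not divide α by two.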
Example:
Metro Mass offers bus services from Kumasi to Techiman. The MD of the
company is considering two routes. One is via Offinso and the other is via Seko. He
wants to study the two routes and compare the results. He collected the following
sample data. Using the 0.10 significance level, is there a difference in the
variation in travel time between the two routes?
ROUTE      MEAN TIME (MINUTES)   STANDARD DEVIATION (MINUTES)   SAMPLE SIZE
OFFINSO    56                    12                             7
SEKO       58                    5                              8
Solution
The usual hypothesis testing procedure is employed.
Step 1: State the null and alternate hypotheses

$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$

The test is two-tailed since we are looking for a difference in the variation of the two routes.
We are not trying to find out whether one route has more variation than the other.

Step 2: State the level of significance

The 0.10 significance level is selected.

Step 3: Select the appropriate test statistic. The test statistic follows the F distribution:

$F = \frac{s_1^2}{s_2^2}$
Solution
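Step 4: State the decision rule
Because the test is two-tailed, the upper-tail area is 0.05, found by $\alpha/2 = 0.10/2 = 0.05$.
df numerator = $n_1 - 1$ = 7 - 1 = 6
df denominator = $n_2 - 1$ = 8 - 1 = 7
Using 6 and 7 df, find the critical value in the F table. The critical value is 3.87.
The decision rule is: if the ratio of the sample variances, $s_1^2 / s_2^2$, exceeds 3.87, the null
hypothesis $H_0$ is rejected.
• Reject $H_0$ if $F > F_{0.05,\,6,\,7}$, that is, if F > 3.87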
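As a rough cross-check on the table value of 3.87, the sketch below estimates the point that cuts off the upper 5% of the F distribution with 6 and 7 df by simulation. This is an illustration rather than the lecture's method (the lecture reads the value from a table); the function names, number of draws, and seed are arbitrary, and the estimate will vary slightly with them.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulated_critical_value(df_num, df_den, upper_tail=0.05,
                             draws=40000, seed=7):
    """Estimate the F value with `upper_tail` area above it by simulating
    variance ratios from two normal populations with equal variances."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        a = [rng.gauss(0, 1) for _ in range(df_num + 1)]
        b = [rng.gauss(0, 1) for _ in range(df_den + 1)]
        ratios.append(sample_variance(a) / sample_variance(b))
    ratios.sort()
    # Empirical (1 - upper_tail) quantile of the simulated ratios
    return ratios[int((1 - upper_tail) * draws)]

estimate = simulated_critical_value(6, 7)
print(round(estimate, 2))  # should land near the table value of 3.87
```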
Step 5: Determine the value of the test statistic by taking the ratio of the two
sample variances using the formula:
$F = \frac{s_1^2}{s_2^2} = \frac{12^2}{5^2} = \frac{144}{25} = 5.76$
Solution
Since F = 5.76 exceeds the critical value of 3.87, the null hypothesis is rejected. We conclude there is a
difference in the variation in the travel time along the routes.
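The steps of this example can be retraced in a few lines. The sketch uses the summary statistics from the table and takes the critical value 3.87 from the F table as quoted in Step 4 rather than computing it; the variable names are illustrative.

```python
# Summary statistics from the example table
s_offinso, n_offinso = 12, 7
s_seko, n_seko = 5, 8
critical_value = 3.87  # F table value for 6 and 7 df, upper-tail area 0.05

# Offinso has the larger variance, so it goes in the numerator
f_stat = s_offinso ** 2 / s_seko ** 2
df_num, df_den = n_offinso - 1, n_seko - 1

reject_h0 = f_stat > critical_value
print(f_stat, df_num, df_den, reject_h0)
```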
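When the raw observations (the X's) are given instead of the summary statistics, the sample
standard deviations are computed first.
Offinso 52 67 56 45 70 54 64
Seko 59 60 61 51 56 63 57 65

Offinso
$\bar{X} = \frac{\sum X}{n} = \frac{408}{7} = 58.29$, $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{485.43}{6} = 80.90$, $s = 8.99$

Seko
$\bar{X} = \frac{472}{8} = 59.00$, $s^2 = \frac{134}{7} = 19.14$, $s = 4.38$

For these observations, $F = \frac{80.90}{19.14} = 4.23$, which also exceeds the critical value of 3.87.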
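The sample statistics for the raw travel times listed above can be checked with Python's standard `statistics` module, which uses the n - 1 divisor for `variance` and `stdev`. This is a sketch for verification only; the variable names are illustrative.

```python
import statistics

offinso = [52, 67, 56, 45, 70, 54, 64]
seko = [59, 60, 61, 51, 56, 63, 57, 65]

# Sample variances (n - 1 divisor)
var_offinso = statistics.variance(offinso)
var_seko = statistics.variance(seko)

# Larger variance in the numerator
f_stat = max(var_offinso, var_seko) / min(var_offinso, var_seko)

print(round(statistics.mean(offinso), 2), round(statistics.stdev(offinso), 2))
print(round(statistics.mean(seko), 2), round(statistics.stdev(seko), 2))
print(round(f_stat, 2))
```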
Solution
The null hypothesis is rejected and the alternate accepted. We
conclude that there is a difference in the variation in the travel time
along the two routes.
