Chapter 4


Measure of Dispersion

CHAPTER 4

Abdul Wali Khan University Mardan Pakistan. www.awkum.edu.pk 1


Measure of Dispersion or Variability

Meaning of Variability:
• Variability means ‘Scatter’ or ‘Spread’.
• Measures of variability refer to the scatter or spread of scores around
their central tendency.
• The measures of variability indicate how the values of a distribution scatter above
and below the central tendency.
Definitions of Variability:
◦ Dispersion is the spread of a distribution.
◦ Dispersion is the measure of the variation of the items.
◦ Dispersion or spread is the degree of the scatter or variation of the
variables about a central value.
Thus the property which denotes the extent to which the values are
dispersed about the central values is called dispersion. It also indicates
the lack of uniformity in the size of items of a distribution.



Need of Variability

1. It helps to ascertain the measures of deviation:
The measures of variability help us to measure the degree of deviation
that exists in the data. With them we can determine the limits within which the
data vary in some measurable quantity or quality.
2. It helps to compare different groups:
With the help of measures of variability we can compare the original data
expressed in different units.
3. It is useful to supplement the information provided by the measures of
central tendency.
4. It is useful to calculate further advanced statistics based on the measures
of dispersion.



Method of Measures of Variability

There are four measures of variability:


1. The Range
2. The Semi-Interquartile Range or Quartile Deviation
3. The Mean Deviation or the Average Deviation
4. The Variance and Standard Deviation



Classification of Measures of Dispersion

The measure of dispersion is categorized as:


(i) An absolute measure of dispersion:
An absolute measure of dispersion gives an idea about the
amount of dispersion/spread in a set of observations. These
quantities measure the dispersion in the same units as the units
of the original data. They are of two kinds:
• Measures which express the scattering of observations in
terms of distances, i.e., range and quartile deviation.
• Measures which express the variation in terms of the
average of deviations of observations, like mean deviation and
standard deviation.



Continue

(ii) A relative measure of dispersion:


The relative measures of dispersion are used to compare the
distributions of two or more data sets. Such a measure compares
values without units and is expressed in the form of a ratio or
percentage. It is also called a coefficient of dispersion. Common
relative dispersion methods include:

• Coefficient of Range
• Coefficient of Quartile Deviation
• Coefficient of Mean Deviation
• Coefficient of Variation
• Coefficient of Standard Deviation



Range
The range is the most common and easily understandable measure of
dispersion. It is the difference between the two extreme observations of the data
set. If X max and X min are the two extreme observations then
Range = X max – X min

Coefficient of Dispersion = $\frac{X_{max} - X_{min}}{X_{max} + X_{min}}$

Merits of Range
It is the simplest of the measures of dispersion
Easy to calculate
Easy to understand
Independent of change of origin



Continue
Demerits of Range
It is based on only the two extreme observations, so it is strongly affected by extreme values and sampling fluctuations
It is not a reliable measure of dispersion
Dependent on change of scale

Uses of Range:
1. Range is used as a measure of dispersion when variations in the value of the
variable are not much.
2. Range is the best measure of variability when the data are too scattered or too
scant.
3. Range is used when the knowledge of extreme score or total spread is wanted.
4. When a quick estimate of variability is wanted range is used.



Example

Calculate the range and also the relative dispersion of the numbers
3, 8, 6, 10, 12, 9, 11, 10, 12, 7.
Solution:
Range = X max – X min
X min = 3
X max = 12
Range = 12 – 3
Range = 9
Coefficient of Dispersion = $\frac{X_{max} - X_{min}}{X_{max} + X_{min}}$
Coefficient of Dispersion = $\frac{12 - 3}{12 + 3} = \frac{9}{15}$ = 0.6
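
The range example can be reproduced with a minimal Python sketch. The function names `data_range` and `coefficient_of_range` are illustrative choices, not part of the slides; the coefficient is taken as (X max − X min) divided by (X max + X min), which matches the worked answer of 0.6.

```python
def data_range(values):
    """Range = X max - X min."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative dispersion: (X max - X min) / (X max + X min)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


numbers = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]
print(data_range(numbers))            # 9
print(coefficient_of_range(numbers))  # 0.6
```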



Quartile Deviation

Interquartile Range
The difference between the upper and lower quartile is known as the interquartile
range.
The formula for the interquartile range is given below:
Interquartile range = Upper Quartile – Lower Quartile = Q3 – Q1

where Q1 is the first quartile and Q3 is the third quartile of the series



Semi Interquartile Range or Quartile Deviation

The semi-interquartile range is a measure of dispersion defined as half of
the interquartile range. It is computed as one half of the difference between
the 75th percentile (Q3) and the 25th percentile (Q1), i.e. between the third
and first quartiles.

The formula for quartile deviation is

Q.D = $\frac{Q_3 - Q_1}{2}$
Coefficient of Q.D = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
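
As a rough illustration, the quartile deviation can be computed in Python, taking Q.D as half of Q3 − Q1 and its coefficient as (Q3 − Q1) divided by (Q3 + Q1). This sketch assumes the quartile convention of `statistics.quantiles` with its default "exclusive" method, which places Q1 at position (n + 1)/4; other textbooks and packages use slightly different conventions and can give slightly different quartiles. The data are borrowed from the range example and the function names are illustrative.

```python
import statistics


def quartile_deviation(values):
    """Half the interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default method="exclusive"
    return (q3 - q1) / 2


def coefficient_of_qd(values):
    """Relative dispersion: (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / (q3 + q1)


numbers = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]
print(statistics.quantiles(numbers, n=4))  # [6.75, 9.5, 11.25]
print(quartile_deviation(numbers))         # 2.25
print(coefficient_of_qd(numbers))          # 0.25
```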



Merits and Demerits of Quartile Deviation

Merits of Quartile Deviation


1. It is not affected by extreme values, which overcomes the main drawback of the range
2. It is based on the middle half of the data
3. Independent of change of origin
4. The best measure of dispersion for open-end classification
Demerits of Quartile Deviation
1. It ignores 50% of the data
2. Dependent on change of scale
3. Not a reliable measure of dispersion



Uses of Quartile Deviation

1. When the median is the measure of central tendency, Q
is used as the measure of dispersion.
2. When extreme scores affect the S.D. or the scores are widely scattered,
Q is used as the measure of variability.
3. When our primary interest is the concentration around
the median, i.e. the middle 50% of cases, Q is used.
4. When the class intervals are open-ended, Q is used as
the measure of dispersion.



Mean Deviation

“Average deviation is the arithmetic mean of all the deviations of
different scores from the mean value of the scores without regard
for the sign of the deviation.”
Thus average deviation is the arithmetic mean of the deviations of a series
computed from some measure of central tendency. So average deviation is
the mean of the deviations taken from their mean (sometimes from the median
or mode).
or
“Average deviation is the mean of the absolute values of the
differences between the values of a variable and the mean of its
distribution.”
No account is taken of signs, and all deviations, whether +ve or −ve, are
treated as positive.



Mean Deviation From Mean

Formula for ungrouped data

For Population Data: M.D = $\frac{\sum |x - \mu|}{N}$
For Sample Data: M.D = $\frac{\sum |x - \bar{x}|}{n}$



Mean Deviation From Mean

Formula for grouped data

For Population Data: M.D = $\frac{\sum f|x - \mu|}{\sum f}$
For Sample Data: M.D = $\frac{\sum f|x - \bar{x}|}{\sum f}$

Coefficient of M.D = $\frac{M.D}{\text{Mean}}$



Mean Deviation From Median

Formula for ungrouped data

For Population Data: M.D = $\frac{\sum |x - \text{Median}|}{N}$
For Sample Data: M.D = $\frac{\sum |x - \text{Median}|}{n}$



Mean Deviation From Median

Formula for grouped data

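For Population Data: M.D = $\frac{\sum f|x - \text{Median}|}{\sum f}$
For Sample Data: M.D = $\frac{\sum f|x - \text{Median}|}{\sum f}$

Coefficient of M.D = $\frac{M.D}{\text{Median}}$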



Mean Deviation for ungrouped data

Example:
Find M.D from mean and from median of the following 9 scores given
below:
23, 34, 16, 27, 28, 39, 45, 26, 18
Solution:
Mean
$\bar{x} = \frac{\sum x}{n} = \frac{256}{9}$ = 28.44444
Median
First arrange the observations in ascending order (or descending order).
16, 18, 23, 26, 27, 28, 34, 39, 45
The number of observations is 9, which is an odd number.



Continue
Median is the value of the $\left(\frac{n+1}{2}\right)$th item
Median is the value of the $\left(\frac{9+1}{2}\right)$th item
Median is the value of the $\left(\frac{10}{2}\right)$th item
Median is the value of the 5th item, which is 27
Median = 27

x        |x − x̄|      |x − Median|
16       12.44444     11
18       10.44444     9
23       5.44444      4
26       2.44444      1
27       1.44444      0
28       0.44444      1
34       5.55556      7
39       10.55556     12
45       16.55556     18
Total    65.33332     63



Mean Deviation From Mean

For sample data


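M.D = $\frac{\sum |x - \bar{x}|}{n}$
M.D = $\frac{65.33332}{9}$ = 7.25926

Coefficient of M.D = $\frac{M.D}{\text{Mean}}$
Coefficient of M.D = $\frac{7.25926}{28.44444}$
Coefficient of M.D = 0.255208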



Mean Deviation From Median

For sample data


M.D = $\frac{\sum |x - \text{Median}|}{n}$
M.D = $\frac{63}{9}$ = 7

Coefficient of M.D = $\frac{M.D}{\text{Median}}$
Coefficient of M.D = $\frac{7}{27}$
Coefficient of M.D = 0.259259
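
The two mean deviations for the nine scores can be checked with a short Python sketch. It assumes the slides' convention of dividing the sum of absolute deviations by n; `mean_deviation` is an illustrative helper name, not something defined in the slides.

```python
import statistics


def mean_deviation(values, centre):
    """Average absolute deviation of the values from a chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)


scores = [23, 34, 16, 27, 28, 39, 45, 26, 18]
mean = statistics.mean(scores)      # 256 / 9
median = statistics.median(scores)  # 5th of the 9 sorted scores

md_mean = mean_deviation(scores, mean)
md_median = mean_deviation(scores, median)

print(round(md_mean, 5), round(md_mean / mean, 6))        # 7.25926 0.255208
print(round(md_median, 5), round(md_median / median, 6))  # 7.0 0.259259
```

The deviation taken from the median (7) is smaller than the one taken from the mean (7.25926), as expected, since the mean deviation is smallest when measured from the median.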



Mean Deviation for grouped data

Example: In a foreign language class there are 4 languages. The number of
students learning each language and the frequency of
lectures per week are given as:

Language                      Sanskrit   Spanish   French   English
No. of students (xi)          6          5         9        12
Frequency of lectures (fi)    5          7         4        9

A. Calculate the mean deviation about the mean and about the median
for the given data.
B. Calculate the relative dispersion of both mean deviations.



Calculation for Mean and Median

Solution:

x      f      f.x     C.F
6      5      30      5
5      7      35      12
9      4      36      16
12     9      108     25
       ∑f = 25   ∑f.x = 209

Mean
$\bar{x} = \frac{\sum f.x}{\sum f} = \frac{209}{25}$
= 8.36



Continue
Median
Median is the value of the $\left(\frac{\sum f + 1}{2}\right)$th item
Median is the value of the $\left(\frac{25 + 1}{2}\right)$th item
Median is the value of the 13th item
From the cumulative frequency table, we find that the median, i.e. the 13th
item, is 9.
Hence Median = 9



Continue

x      f      |x − x̄|    f|x − x̄|    |x − Median|    f|x − Median|
6      5      2.36       11.8        3               15
5      7      3.36       23.52       4               28
9      4      0.64       2.56        0               0
12     9      3.64       32.76       3               27
∑f = 25                  70.64                       70



Mean Deviation From Mean

For sample data


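M.D = $\frac{\sum f|x - \bar{x}|}{\sum f}$
M.D = $\frac{70.64}{25}$ = 2.8256

Coefficient of M.D = $\frac{M.D}{\text{Mean}}$
Coefficient of M.D = $\frac{2.8256}{8.36}$
Coefficient of M.D = 0.33799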



Mean Deviation From Median

For sample data


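M.D = $\frac{\sum f|x - \text{Median}|}{\sum f}$
M.D = $\frac{70}{25}$ = 2.8

Coefficient of M.D = $\frac{M.D}{\text{Median}}$
Coefficient of M.D = $\frac{2.8}{9}$
Coefficient of M.D = 0.31111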
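
The grouped calculation can be sketched the same way, weighting each absolute deviation by its frequency. The helper names are illustrative, and `grouped_median` is a simplification that returns the value of the ((∑f + 1)/2)th item, which is exact here because ∑f = 25 is odd.

```python
def grouped_mean(xs, fs):
    """Weighted mean: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def grouped_median(xs, fs):
    """Value of the ((sum(f) + 1) / 2)th item, using cumulative frequencies."""
    position = (sum(fs) + 1) / 2
    cumulative = 0
    for x, f in sorted(zip(xs, fs)):  # cumulate over x in ascending order
        cumulative += f
        if cumulative >= position:
            return x


def grouped_mean_deviation(xs, fs, centre):
    """sum(f * |x - centre|) / sum(f)."""
    return sum(f * abs(x - centre) for x, f in zip(xs, fs)) / sum(fs)


xs = [6, 5, 9, 12]  # x values from the example
fs = [5, 7, 4, 9]   # frequencies from the example

mean = grouped_mean(xs, fs)
median = grouped_median(xs, fs)
md_mean = grouped_mean_deviation(xs, fs, mean)
md_median = grouped_mean_deviation(xs, fs, median)

print(mean, median)                                       # 8.36 9
print(round(md_mean, 4), round(md_mean / mean, 5))        # 2.8256 0.33799
print(round(md_median, 4), round(md_median / median, 5))  # 2.8 0.31111
```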



Merits and Demerits of Mean Deviation

Merits of Mean Deviation


1. Based on all observations
2. It provides a minimum value when the deviations are taken from the
median
3. Independent of change of origin
Demerits of Mean Deviation
1. Not easily understandable
2. Its calculation is not easy and is time-consuming
3. Dependent on the change of scale
4. Ignoring the negative signs is artificial and makes it unsuitable
for further mathematical treatment



Variance

Variance is a measurement of the spread between numbers in a data set.


That is, it measures how far each number in the set is from the mean and
therefore from every other number in the set.
The Variance is defined as:
The average of the squared differences from the Mean

When it is calculated from the entire population, the variance is called the
population variance and is denoted by σ².
If instead the data from a sample are used to calculate the variance, it is
referred to as the sample variance and is denoted by S². The variance is
also denoted by Var(X).



Continue
• A large variance indicates that numbers in the set are far from the
mean and from each other, while a small variance indicates the
opposite.
• Variance cannot be negative. A variance value of zero indicates
that all values within a set of numbers are identical.
• All variances that are not zero will be positive numbers.



Formula for ungrouped data

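For Population Data: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
For Sample Data: $S^2 = \frac{\sum (x - \bar{x})^2}{n}$

Simplified Formula
For Population Data: $\sigma^2 = \frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2$, i.e. $\sigma^2 = \frac{\sum x^2}{N} - \mu^2$
For Sample Data: $S^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$, i.e. $S^2 = \frac{\sum x^2}{n} - \bar{x}^2$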
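
The simplified formula follows from expanding the square in the definition of the variance. A short derivation for the population case, using $\mu = \frac{\sum x}{N}$ (the sample case is identical with n and $\bar{x}$ in place of N and $\mu$):

```latex
\begin{aligned}
\sigma^2 &= \frac{\sum (x - \mu)^2}{N}
          = \frac{\sum x^2 - 2\mu \sum x + N\mu^2}{N} \\
         &= \frac{\sum x^2}{N} - 2\mu \cdot \mu + \mu^2
          = \frac{\sum x^2}{N} - \mu^2
          = \frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2
\end{aligned}
```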
Formula for grouped data

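For Population Data: $\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f}$
For Sample Data: $S^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$

Simplified Formula
For Population Data: $\sigma^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2$, i.e. $\sigma^2 = \frac{\sum f x^2}{\sum f} - \mu^2$
For Sample Data: $S^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2$, i.e. $S^2 = \frac{\sum f x^2}{\sum f} - \bar{x}^2$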
Advantages and Disadvantages of Variance
• The advantage of variance is that it treats all deviations
from the mean the same regardless of their direction.
The squared deviations cannot cancel each other out and so cannot give
the appearance of no variability at all in the data.
• One drawback of variance is that it gives added weight
to outliers, the numbers that are far from the mean.
Squaring these deviations can skew the result.
• Another drawback of variance is that it is not easily
interpreted, because it is expressed in squared units. Users of variance often employ it primarily
in order to take the square root of its value, which
gives the standard deviation of the data set.



Standard Deviation

• Standard deviation is a measure of the dispersion of a data set
relative to its mean and is calculated as the square root of the variance.
• The variance itself is obtained from the deviation of
each data point from the mean.
• If the data points are further from the mean, there is a higher deviation
within the data set; thus, the more spread out the data, the higher the
standard deviation.
Definition
The standard deviation is the positive square root of the arithmetic mean
of the squares of the deviations of the given values from their
arithmetic mean.
The population standard deviation is denoted by the Greek letter sigma, σ, and
the sample standard deviation is denoted by S. It is also referred to as the root
mean square deviation.



Formula for ungrouped data

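For Population Data: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
For Sample Data: $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

Simplified Formula

For Population Data: $\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$, i.e. $\sigma = \sqrt{\frac{\sum x^2}{N} - \mu^2}$
For Sample Data: $S = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$, i.e. $S = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$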



Formula for grouped data

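For Population Data: $\sigma = \sqrt{\frac{\sum f(x - \mu)^2}{\sum f}}$
For Sample Data: $S = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$

Simplified Formula

For Population Data: $\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$, i.e. $\sigma = \sqrt{\frac{\sum f x^2}{\sum f} - \mu^2}$
For Sample Data: $S = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$, i.e. $S = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}$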



Coefficient of Variation
• The coefficient of variation (relative standard deviation) is a statistical
measure of the dispersion of data points around the mean, expressed relative to the mean.
• It is commonly used to compare the data dispersion between distinct
series of data.
• The coefficient of variation provides a relatively simple and quick tool to
compare different data series.
• If the mean in the denominator of the coefficient of variation
formula is negative or zero, the result could be misleading.



Coefficient of Variation

Formula

For Population Data: C.V = $\frac{\sigma}{\mu}$ x 100
For Sample Data: C.V = $\frac{S}{\bar{x}}$ x 100

Coefficient of S.D = $\frac{S.D}{\text{Mean}}$



Example for ungrouped data

A student scored 85, 91, 88, 78, 85 in a series of exams. Calculate the variance and
C.V of his test scores.

Solution:
Mean
$\bar{x} = \frac{\sum x}{n}$
$\bar{x} = \frac{427}{5}$
$\bar{x}$ = 85.4

x        x − x̄     (x − x̄)²
85       -0.4      0.16
91       5.6       31.36
88       2.6       6.76
78       -7.4      54.76
85       -0.4      0.16
∑x = 427           ∑(x − x̄)² = 93.2



Continue
Sample variance
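$S^2 = \frac{\sum (x - \bar{x})^2}{n}$
$S^2 = \frac{93.2}{5}$
S² = 18.64
Sample Standard Deviation
The standard deviation is the positive square root of the variance.
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
$S = \sqrt{\frac{93.2}{5}}$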



Continue
$S = \sqrt{18.64}$
S = 4.31741
Coefficient of variation
C.V = $\frac{S}{\bar{x}}$ x 100
C.V = $\frac{4.31741}{85.4}$ x 100
C.V = 0.050555 x 100 = 5.06%
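
These figures can be reproduced with Python's `statistics` module. The worked example divides the sum of squared deviations by n, so this sketch uses `statistics.pvariance`; `statistics.variance` divides by n − 1 instead and would give 23.3 for the same scores.

```python
import math
import statistics

scores = [85, 91, 88, 78, 85]

mean = statistics.mean(scores)           # 427 / 5
variance = statistics.pvariance(scores)  # divides by n, as in the worked example
sd = math.sqrt(variance)                 # positive square root of the variance
cv = sd / mean * 100                     # coefficient of variation, in percent

print(mean)               # 85.4
print(round(variance, 2)) # 18.64
print(round(sd, 5))       # 4.31741
print(round(cv, 2))       # 5.06
```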



Example for grouped data

The table below shows the total number of man-days lost to sickness during one
week's operation of a small chemical plant.

Days Lost    1-3    4-6    7-9    10-12    13-15
Frequency    8      7      10     9        6

Calculate the variance and standard deviation of the number of lost days.
Solution
Mean
$\bar{x} = \frac{\sum f.x}{\sum f}$
$\bar{x} = \frac{314}{40}$
$\bar{x}$ = 7.85 days



Continue

Class Interval    Mid-value (x)    Freq (f)    f.x     f.x²
1-3               2                8           16      32
4-6               5                7           35      175
7-9               8                10          80      640
10-12             11               9           99      1089
13-15             14               6           84      1176
Total                              ∑f = 40     ∑f.x = 314    ∑f.x² = 3112



Continue

Sample variance
$S^2 = \frac{\sum f.x^2}{\sum f} - \left(\frac{\sum f.x}{\sum f}\right)^2$
$S^2 = \frac{3112}{40} - (7.85)^2$
S² = 77.8 - 61.6225
S² = 16.1775
Sample Standard Deviation
The standard deviation is the positive square root of the
variance.



Continue

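$S = \sqrt{\frac{\sum f.x^2}{\sum f} - \left(\frac{\sum f.x}{\sum f}\right)^2}$
$S = \sqrt{\frac{3112}{40} - (7.85)^2}$
$S = \sqrt{16.1775}$
S = 4.02213
Coefficient of variation
C.V = $\frac{S}{\bar{x}}$ x 100
C.V = $\frac{4.02213}{7.85}$ x 100
C.V = 0.51237 x 100 = 51.24%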
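
The grouped calculation can be sketched in Python using the class mid-values and the simplified formula, variance = ∑f.x²/∑f − (∑f.x/∑f)². The function name `grouped_variance` is illustrative. As in any grouped calculation, every observation in a class is treated as if it sat at the class mid-value.

```python
import math


def grouped_variance(midpoints, freqs):
    """Simplified formula: sum(f*x^2)/sum(f) - (sum(f*x)/sum(f))^2."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    mean_of_squares = sum(f * x * x for x, f in zip(midpoints, freqs)) / total
    return mean_of_squares - mean ** 2


midpoints = [2, 5, 8, 11, 14]  # mid-values of 1-3, 4-6, 7-9, 10-12, 13-15
freqs = [8, 7, 10, 9, 6]

mean = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)
variance = grouped_variance(midpoints, freqs)
sd = math.sqrt(variance)
cv = sd / mean * 100

print(mean)                # 7.85
print(round(variance, 4))  # 16.1775
print(round(sd, 5))        # 4.02213
print(round(cv, 2))        # 51.24
```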



Merits and Demerits Standard Deviation

Merits of Standard Deviation


1. Squaring the deviations overcomes the drawback of ignoring signs in
mean deviations
2. Suitable for further mathematical treatment
3. Least affected by sampling fluctuations
4. The standard deviation is zero if all the observations are equal
5. Independent of change of origin
6. Standard deviation is never negative.

Demerits of Standard Deviation


1. Not easy to calculate
2. Difficult to understand.
3. Dependent on the change of scale



Properties of Variance and Standard Deviation

The variance, var(X) and standard deviation S.D (X) of a random variable X
have the following useful properties.
1. The variance or standard deviation of a constant is zero.
Symbolically, if X = a (a constant), then
Var(a) = 0 and S.D(a) = 0
2. The variance and standard deviation are independent of origin, i.e. they
remain unchanged when all the values are increased or decreased by a
constant.
Symbolically,
Var(X + a) = Var(X), and S.D(X + a) = S.D(X),
Var(X − a) = Var(X), and S.D(X − a) = S.D(X),
where ‘a’ is a constant.



Continue
3. When all the values are multiplied or divided by a constant, the variance of these
values is multiplied or divided by the square of the constant and the standard
deviation is multiplied or divided by the absolute value of the constant.
Symbolically,
Var(aX) = a².Var(X), where a is a constant.
Var(aX + b) = a².Var(X), where a and b are constants.
S.D(aX) = |a|.S.D(X), and S.D(aX + b) = |a|.S.D(X)
Var(X/a) = (1/a²).Var(X), and S.D(X/a) = (1/|a|).S.D(X), where a ≠ 0
4. The variance of the sum or difference of two independent random variables is
the sum of their respective variances. Thus if X and Y are independent random
variables,
Var(X + Y) = Var(X) + Var(Y)
and Var(X − Y) = Var(X) + Var(Y)



Continue
5. The variance has the minimal property. This means that the mean of the squared
deviations is a minimum if and only if the deviations are taken
from the mean. In other words,
$\frac{\sum (x - a)^2}{n}$ is a minimum if and only if a = $\bar{x}$.
This property provides the basis for defining the standard deviation as above.

6. For Normal distributions:


When analyzing normally distributed data, standard deviation can be used in
conjunction with the mean in order to calculate data intervals.



Continue
If $\bar{x}$ = mean, S = standard deviation and
x = a value in the data set, then

about 68% of the data lie in the interval: $\bar{x} - S < x < \bar{x} + S$.

about 95% of the data lie in the interval: $\bar{x} - 2S < x < \bar{x} + 2S$.

about 99.7% of the data lie in the interval: $\bar{x} - 3S < x < \bar{x} + 3S$.

7. If σ1 and σ2 are the standard deviations of two series of sizes n1 and n2 with means ȳ1 and ȳ2, then the
variance of the combined series of size n1 + n2 is:

σ² = [n1 (σ1² + d1²) + n2 (σ2² + d2²)] ÷ (n1 + n2)

where d1 = ȳ1 − ȳ, d2 = ȳ2 − ȳ, and ȳ = (n1 ȳ1 + n2 ȳ2) ÷ (n1 + n2).
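
Several of the properties listed above can be checked numerically. The sketch below is only an illustration under stated assumptions: it reuses the five exam scores from the ungrouped example, divides by n as the slides do (`statistics.pvariance`), and uses arbitrary constants a = −3 and b = 10 that do not come from the slides. The negative constant is deliberate, since it shows that the standard deviation scales by the absolute value of the constant. The normal-distribution percentages come from `statistics.NormalDist`.

```python
import math
import statistics
from statistics import NormalDist

scores = [85, 91, 88, 78, 85]
var_x = statistics.pvariance(scores)  # divides by n, as in the slides
sd_x = statistics.pstdev(scores)

a, b = -3, 10  # illustrative constants, not taken from the slides

# Property 2: adding a constant changes neither the variance nor the S.D.
shifted = [x + b for x in scores]
shift_ok = math.isclose(statistics.pvariance(shifted), var_x)

# Property 3: Var(aX + b) = a^2 Var(X) and S.D(aX + b) = |a| S.D(X).
scaled = [a * x + b for x in scores]
scale_var_ok = math.isclose(statistics.pvariance(scaled), a ** 2 * var_x)
scale_sd_ok = math.isclose(statistics.pstdev(scaled), abs(a) * sd_x)


# Property 5: the mean squared deviation is smallest when taken from the mean.
def mean_squared_deviation(values, centre):
    return sum((x - centre) ** 2 for x in values) / len(values)


mean_x = statistics.mean(scores)
candidates = [mean_x - 2, mean_x - 1, mean_x, mean_x + 1, mean_x + 2]
best_centre = min(candidates, key=lambda c: mean_squared_deviation(scores, c))


# Property 6: share of a normal distribution within k standard deviations.
def share_within(k):
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


print(shift_ok, scale_var_ok, scale_sd_ok)             # True True True
print(best_centre == mean_x)                           # True
print([round(share_within(k), 4) for k in (1, 2, 3)])  # [0.6827, 0.9545, 0.9973]
```

Property 4 is not checked here because it concerns independent random variables rather than a single fixed data set.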



Example on Relative Dispersion

Problem: The table below shows the results for two
companies, A and B.

                                  Company A    Company B
No. of employees                  900          1000
Average daily wage (Rs.)          250          220
Variance of wage distribution     100          144

1. Which company has the larger wage bill?
2. Calculate the coefficients of variation for both companies.
3. Calculate the average daily wage and the variance of the distribution of
wages of all the employees in the firms A and B taken together.



Solution
For Company A
No. of employees = n1 = 900, and average daily wages = ȳ 1 = Rs. 250
We know, average daily wage = Total wages ⁄ Total number of employees
or, Total wages = Total employees × average daily wage
= 900 × 250 = Rs. 225000 … (i)
For Company B
No. of employees = n2 = 1000, and average daily wages = ȳ2 = Rs. 220
So, Total wages = Total employees × average daily wage
= 1000 × 220 = Rs. 220000 … (ii)
Comparing (i), and (ii), we see that Company A has a larger wage bill.



Continue

For Company A
Variance of distribution of wages = σ1² = 100
C.V. of distribution of wages = 100 × standard deviation of distribution of
wages ⁄ average daily wages
or, C.V. A = 100 × √100 ⁄ 250 = 100 × 10 ⁄ 250 = 4 … (i)
For Company B
Variance of distribution of wages = σ2² = 144
C.V. B = 100 × √144 ⁄ 220 = 100 × 12 ⁄ 220 = 5.45 … (ii)
Comparing (i) and (ii), we see that Company B has greater variability.



Continue
For Company A and B, taken together
The average daily wages for both the companies taken together
ȳ = (n1 ȳ 1 + n2 ȳ 2) ⁄ ( n1 + n2)
= (900 × 250 + 1000 × 220) ÷ (900 + 1000)
= 445000 ⁄ 1900 = Rs. 234.21
The combined variance, σ² = [n1 (σ1² + d1²) + n2 (σ2² + d2²)] ⁄ (n1 + n2)
Here, d1 = ȳ1 − ȳ = 250 – 234.21 = 15.79,
d2 = ȳ2 − ȳ = 220 – 234.21 = – 14.21.
Hence, σ² = [900 × (100 + 15.79²) + 1000 × (144 + (– 14.21)²)] ⁄ (900 + 1000)
or, σ² = (314391.69 + 345924.10) ⁄ 1900 = 347.53.
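
The combined mean and variance can be verified with a small Python sketch. It treats the combined variance as the size-weighted average of (σi² + di²), where di is the distance of each group mean from the combined mean. The function name `combined_mean_variance` is illustrative.

```python
def combined_mean_variance(n1, mean1, var1, n2, mean2, var2):
    """Pooled mean and variance of two series taken together."""
    total = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / total
    d1 = mean1 - mean  # distance of group 1 mean from the combined mean
    d2 = mean2 - mean  # distance of group 2 mean from the combined mean
    variance = (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / total
    return mean, variance


# Company A: 900 employees, mean wage Rs. 250, variance 100
# Company B: 1000 employees, mean wage Rs. 220, variance 144
mean, variance = combined_mean_variance(900, 250, 100, 1000, 220, 144)
print(round(mean, 2))      # 234.21
print(round(variance, 2))  # 347.53
```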



Assignment 4
Q1: What is the range for the following set of numbers?

57, -5, 11, 39, 56, 82, -2, 11, 64, 18, 37, 15, 68
Also find
a) Quartile Deviation
b) Mean Deviation from Mean and from Median
c) Coefficient of Variation

Q2: The weights of a number of students were recorded in kg.


Weight (kg) 30 ≤ w < 35 35 ≤ w < 40 40 ≤ w < 45 45 ≤ w < 50 50 ≤ w < 55
Frequency 10 11 15 7 4

Estimate the mean deviation from the mean and from the median, and also calculate the relative dispersion.



Continue
Q3: In an experiment, 50 people were asked to estimate the length of a rod to
the nearest centimeter. The results were recorded.
Length (cm) 20 21 22 23 24 25 26 27 28 29
Frequency 0 4 6 7 9 10 7 5 2 0

Calculate Mean Deviation from Median and also relative dispersion.


Q4: A teacher notes the number of correct answers given by a class on a
multiple-choice test.
Correct answers    1 – 10    11 – 20    21 – 30    31 – 40    41 – 50
Frequency          2         8          15         11         3

Calculate coefficient of variation



Continue

Q5: Cost of Flying. Listed below are costs (in dollars) of round-trip flights from
JFK airport in New York City to San Francisco. All flights involve one stop
and a two-week stay. The airlines are US Air, Continental, Delta, United,
American, Alaska and Northwest. Compare the variability.
30 Days in Advance    244    260    264    264    278    318     280
1 Day in Advance      456    614    567    943    628    1088    536
