Lecture - 4 Dispersion
DISPERSION
Consider the series (i) 4, 5, 6, 7, 8 (ii) 2, 3, 6, 9, 10 (iii) -3, -1, 7, 12, 15. In
all these cases the number of observations n is 5 and the mean $\bar{x}$ is 6.
If we are told only that the mean of 5 observations is 6, we cannot tell
whether it is the average of the first series, the second series, the third series
or of any other series of five observations whose sum is 30. Thus the measures
of central tendency are inadequate to give us a complete idea of the distribution.
They must be supported and supplemented by some other measures. One such
measure is dispersion.
We study dispersion to have an idea about the homogeneity or
heterogeneity of the distribution. In the above case we say that series (i) is more
homogeneous (less dispersed) than the series (ii) or we say that series (iii) is more
heterogeneous (more scattered) than the series (i) or (ii).
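The point can be checked numerically. The sketch below is only an illustration: it takes the three series from the text and uses the range, assumed here in its usual form (largest value minus smallest value), as a quick indicator of spread.

```python
from statistics import mean

# The three series from the text.
series = {
    "i": [4, 5, 6, 7, 8],
    "ii": [2, 3, 6, 9, 10],
    "iii": [-3, -1, 7, 12, 15],
}

# Same mean for every series, but very different spread.
means = {name: mean(values) for name, values in series.items()}
ranges = {name: max(values) - min(values) for name, values in series.items()}

for name in series:
    print(f"series ({name}): mean = {means[name]}, range = {ranges[name]}")
```

All three means agree, while the ranges grow from series (i) to series (iii).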
MEASURES OF DISPERSION
Measures of dispersion that have the same units as the original variable
are termed absolute measures of dispersion.
Range
Quartile Deviation
\[ Q.D. = \frac{Q_3 - Q_1}{2} \]
where $Q_1$ and $Q_3$ are the first and third quartiles respectively of the distribution.
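A minimal sketch of the quartile deviation, assuming the usual definition Q.D. = (Q3 - Q1)/2. Quartiles of a small sample depend on the interpolation convention. The choice of Python's `statistics.quantiles` with `method="exclusive"` is an assumption of this sketch, not something the lecture specifies, and other conventions give slightly different values.

```python
from statistics import quantiles

def quartile_deviation(values):
    """Q.D. = (Q3 - Q1) / 2, with quartiles from statistics.quantiles."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / 2

for name, values in {"i": [4, 5, 6, 7, 8], "ii": [2, 3, 6, 9, 10]}.items():
    print(f"series ({name}): Q.D. = {quartile_deviation(values)}")
```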
Mean Deviation
The arithmetic mean of the absolute deviations of the individual values of a
variable from their average A (usually mean, median or mode) is called the mean
deviation.
Case I. If $x_i$ ($i = 1, 2, \ldots, n$) is the ith value of a variable, then
\[ M.D. = \frac{1}{n}\sum_{i=1}^{n} |x_i - A| \]
Case II. If $x_i$ ($i = 1, 2, \ldots, k$) is the value of the ith class with corresponding
frequency $f_i$ such that $\sum_{i=1}^{k} f_i = n$, then
\[ M.D. = \frac{1}{n}\sum_{i=1}^{k} f_i\,|x_i - A| \]
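Both cases of the mean deviation can be sketched in a few lines. The function names and the small frequency table are hypothetical, chosen only for illustration. The average A defaults to the arithmetic mean, but any other average can be passed in.

```python
def mean_deviation(values, about=None):
    """Case I: mean of |x_i - A| for ungrouped data."""
    n = len(values)
    a = sum(values) / n if about is None else about
    return sum(abs(x - a) for x in values) / n

def mean_deviation_grouped(class_values, freqs, about=None):
    """Case II: (1/n) * sum of f_i * |x_i - A|, with n = sum of f_i."""
    n = sum(freqs)
    if about is None:
        about = sum(f * x for x, f in zip(class_values, freqs)) / n
    return sum(f * abs(x - about) for x, f in zip(class_values, freqs)) / n

# Series (i) and (ii) from the text, about their mean of 6.
print(mean_deviation([4, 5, 6, 7, 8]))
print(mean_deviation([2, 3, 6, 9, 10]))
# Hypothetical frequency table: values 1, 2, 3 with frequencies 1, 2, 1.
print(mean_deviation_grouped([1, 2, 3], [1, 2, 1]))
```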
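Standard Deviation
\[ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2} \quad \text{[for ungrouped data]} \]
\[ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{k} f_i (X_i - \mu)^2} \quad \text{[for grouped data]} \]
\[ s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{[for ungrouped data]} \]
\[ s = \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \quad \text{[for grouped data]} \]
\[ s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{[for ungrouped data]} \]
\[ s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \quad \text{[for grouped data]} \]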
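The formulas above differ mainly in the divisor (n or n - 1). The sketch below computes both for ungrouped data and compares them with `statistics.pstdev` (divisor n) and `statistics.stdev` (divisor n - 1) from the Python standard library. The helper names are hypothetical.

```python
import math
from statistics import pstdev, stdev

def sd_divisor_n(values):
    """Standard deviation with divisor n."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / n)

def sd_divisor_n_minus_1(values):
    """Standard deviation with divisor n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

values = [4, 5, 6, 7, 8]  # series (i) from the text
print(sd_divisor_n(values), pstdev(values))
print(sd_divisor_n_minus_1(values), stdev(values))
```

For series (i) the sum of squared deviations is 10, so the two results are the square roots of 10/5 and 10/4.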
Variance: The square of the standard deviation is called the variance i.e., the
arithmetic mean of the squared deviations of observations taken from their mean
is known as the variance.
Sum of Squares: The quantity $\sum_{i=1}^{n} (x_i - \bar{x})^2$ is often referred to as the
corrected sum of squares or simply sum of squares (S.S.) of the observed values
$x_1, x_2, \ldots, x_n$. It is called the corrected sum of squares as it can be expressed
as the raw sum of squares minus the correction term. We can write:
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i^2 - 2\bar{x}x_i + \bar{x}^2) = \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \]
\[ = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \]
The terms $\sum_{i=1}^{n} x_i^2$ and $\left(\sum_{i=1}^{n} x_i\right)^2 / n$ are usually called the raw sum of squares (RSS)
and the correction term respectively.
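The identity can be verified numerically. This sketch computes the corrected sum of squares both directly and as raw sum of squares minus correction term, using series (iii) from the opening example. The function names are hypothetical.

```python
def corrected_ss_direct(values):
    """Sum of squared deviations from the mean."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values)

def corrected_ss_shortcut(values):
    """Raw sum of squares minus the correction term (sum x)^2 / n."""
    raw_ss = sum(x * x for x in values)
    correction = sum(values) ** 2 / len(values)
    return raw_ss - correction

values = [-3, -1, 7, 12, 15]
print(corrected_ss_direct(values), corrected_ss_shortcut(values))
```

Here the raw sum of squares is 428 and the correction term is 900/5 = 180, so both routes give 248.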
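For grouped data, with divisor n:
\[ s^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \frac{1}{n}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{n}\left[\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{n}\sum_{i=1}^{k} f_i x_i^2 - \bar{x}^2 \]
and with divisor n - 1:
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2\right] \]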
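A short sketch of the grouped-data variance using the raw-sum-of-squares form, with the divisor selectable. The frequency table is hypothetical and the function name is an illustration only.

```python
def grouped_variance(class_values, freqs, divisor_n=True):
    """Variance from a frequency table via raw SS minus correction term."""
    n = sum(freqs)
    raw_ss = sum(f * x * x for x, f in zip(class_values, freqs))
    correction = sum(f * x for x, f in zip(class_values, freqs)) ** 2 / n
    corrected_ss = raw_ss - correction
    return corrected_ss / n if divisor_n else corrected_ss / (n - 1)

# Hypothetical table: values 1, 2, 3 with frequencies 1, 2, 1 (n = 4).
print(grouped_variance([1, 2, 3], [1, 2, 1]))                   # divisor n
print(grouped_variance([1, 2, 3], [1, 2, 1], divisor_n=False))  # divisor n - 1
```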
Suitability of standard deviation: The standard deviation is by far the
most widely encountered measure of dispersion.
Merits:
Demerits:
(1) It is affected markedly by extreme values.
(2) It is more difficult to compute than other measures of dispersion.
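(ii) The variance is the minimum of all mean squared deviations (MSD) and the
standard deviation is the minimum of all root mean squared deviations
(RMSD), i.e., for any value A,
\[ \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \le \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^2 \]
and
\[ \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \le \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^2} \]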
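The minimum property can be illustrated by evaluating the mean squared deviation about several trial values of A. The candidate values below are arbitrary choices for this sketch.

```python
from statistics import mean

def msd(values, a):
    """Mean squared deviation of the values about an arbitrary point a."""
    return sum((x - a) ** 2 for x in values) / len(values)

values = [2, 3, 6, 9, 10]          # series (ii) from the text
x_bar = mean(values)
candidates = [0, 3, 5, 6, 7.5, 10]  # arbitrary trial values of A

for a in candidates:
    print(f"A = {a}: MSD = {msd(values, a)}")
```

The smallest value occurs at A = 6, the mean. Every other A increases the MSD by the squared distance between A and the mean.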
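(iii) Variance of the combined series: If $n_1$, $n_2$ are the sizes, $\bar{x}_1$, $\bar{x}_2$ are the
means, and $s_1^2$, $s_2^2$ are the variances of two sets of data ($x_{11}, x_{12}, \ldots, x_{1n_1}$) and
($x_{21}, x_{22}, \ldots, x_{2n_2}$), then the variance $s^2$ of the combined series is given by
\[ s^2 = \frac{n_1 s_1^2 + n_2 s_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2} \]
where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and $\bar{x} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$.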
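The combined-variance formula can be checked against a direct calculation on the pooled data. This sketch assumes variances computed with divisor n (`statistics.pvariance`), which is the form for which the formula holds exactly. The second data set is hypothetical.

```python
from statistics import mean, pvariance

def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Variance of two series pooled together, from their summary figures."""
    x_bar = (n1 * mean1 + n2 * mean2) / (n1 + n2)  # combined mean
    d1, d2 = mean1 - x_bar, mean2 - x_bar
    return (n1 * var1 + n2 * var2 + n1 * d1 ** 2 + n2 * d2 ** 2) / (n1 + n2)

first = [4, 5, 6, 7, 8]   # series (i) from the text
second = [10, 12, 14]     # hypothetical second series

s2 = combined_variance(len(first), mean(first), pvariance(first),
                       len(second), mean(second), pvariance(second))
pooled = pvariance(first + second)  # direct calculation for comparison
print(s2, pooled)
```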
(iv) Standard error: The standard deviation of any statistic is termed as the
standard error of that statistic.
Standard error of the sample mean:
The standard deviation can be viewed as a parameter which can provide a lot of
information when combined with other techniques. It is particularly useful when
the population has a special type of frequency distribution, called the normal
distribution. It is then possible to find the percentage of observations falling
within a distance of one, two or three σ's from the mean. About 68.27 percent,
95.45 percent and 99.73 percent of the observations will lie within the regions
(μ ± σ), (μ ± 2σ) and (μ ± 3σ) respectively, where μ and σ are the mean and
standard deviation of the normal distribution. Thus, in a normal curve, a distance
of 3σ on either side of the mean covers practically the whole range of the
values in the distribution.
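The three percentages quoted above can be reproduced from the error function. For a normal variable, the probability of falling within k standard deviations of the mean is erf(k/√2). The function name in this sketch is hypothetical.

```python
import math

def coverage_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * coverage_within(k):.2f} percent")
```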
\[ C.D. = \frac{A - B}{A + B}, \]
where A and B are the greatest and smallest items respectively in the series,
\[ C.D. = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
\[ C.D. = \frac{\text{Mean deviation}}{\text{Average from which it is calculated}} \]
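The three coefficients of dispersion above are unit-free ratios. The sketch below evaluates them for series (ii) of the opening example. The quartile convention (`statistics.quantiles` with `method="exclusive"`) and the use of the mean as the average for the mean deviation are assumptions of this sketch.

```python
from statistics import mean, quantiles

def cd_range(values):
    """(A - B) / (A + B), with A the greatest and B the smallest item."""
    a, b = max(values), min(values)
    return (a - b) / (a + b)

def cd_quartile(values):
    """(Q3 - Q1) / (Q3 + Q1)."""
    q1, _median, q3 = quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / (q3 + q1)

def cd_mean_deviation(values):
    """Mean deviation about the mean, divided by the mean."""
    m = mean(values)
    md = sum(abs(x - m) for x in values) / len(values)
    return md / m

values = [2, 3, 6, 9, 10]
print(cd_range(values), cd_quartile(values), cd_mean_deviation(values))
```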
Co-efficient of Variation
Professor Dr. Khandoker Saif Uddin Lecture # 4, Page 8
Measures of Dispersion, Skewness and Kurtosis
SKEWNESS
Measures of Skewness
(1) $S_k = M - M_d$ (2) $S_k = M - M_0$
where M is the mean, $M_d$ the median and $M_0$ the mode of the distribution.
For comparison we calculate the relative measures, called the co-efficients of
skewness, which are pure numbers independent of units of measurement. The
following are the co-efficients of skewness.
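\[ S_k = \frac{M - M_0}{\sigma} \]
where $\sigma$ is the standard deviation of the distribution.
\[ S_k = \frac{3(M - M_d)}{\sigma} \]
\[ S_k = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1} \]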
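A minimal sketch of the median-based and quartile-based co-efficients of skewness. Several choices here are assumptions of the sketch rather than of the lecture: σ is computed with divisor n (`statistics.pstdev`), quartiles use `statistics.quantiles` with `method="exclusive"`, and the right-skewed data set is hypothetical. The mode-based co-efficient is left out because a small sample may not have a unique mode.

```python
from statistics import mean, median, pstdev, quantiles

def skewness_median_based(values):
    """3 * (mean - median) / sigma."""
    return 3 * (mean(values) - median(values)) / pstdev(values)

def skewness_quartile_based(values):
    """(Q3 + Q1 - 2 * Md) / (Q3 - Q1)."""
    q1, md, q3 = quantiles(values, n=4, method="exclusive")
    return (q3 + q1 - 2 * md) / (q3 - q1)

symmetric = [4, 5, 6, 7, 8]          # series (i) from the text
skewed = [1, 2, 2, 3, 4, 5, 9]       # hypothetical data with a long right tail

for data in (symmetric, skewed):
    print(skewness_median_based(data), skewness_quartile_based(data))
```

Both co-efficients are zero for the symmetric series and positive for the data with the long right tail.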
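\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} \]
When
(i) $\gamma_1 = 0$, then the distribution is symmetrical.
\[ \mu_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^r, \quad \text{i.e.,} \quad \mu_2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \quad \text{and} \quad \mu_3 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^3 \]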
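The central moments defined above translate directly into code. This sketch computes the rth central moment from a frequency table and, assuming the usual moment co-efficient of skewness γ1 = μ3 / μ2^(3/2), evaluates it for two hypothetical tables.

```python
def central_moment(class_values, freqs, r):
    """mu_r = (1/n) * sum of f_i * (x_i - mean)^r, with n = sum of f_i."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(class_values, freqs)) / n
    return sum(f * (x - x_bar) ** r for x, f in zip(class_values, freqs)) / n

def gamma_1(class_values, freqs):
    """Moment co-efficient of skewness, mu_3 / mu_2^(3/2)."""
    mu2 = central_moment(class_values, freqs, 2)
    mu3 = central_moment(class_values, freqs, 3)
    return mu3 / mu2 ** 1.5

# Hypothetical frequency tables: one symmetric, one with a long right tail.
print(gamma_1([1, 2, 3], [1, 2, 1]))
print(gamma_1([1, 2, 3, 4], [4, 3, 2, 1]))
```

In the second table the mean is 2, μ2 = 1 and μ3 = 0.6, so γ1 = 0.6, indicating positive skewness.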
KURTOSIS
When
(i) $\gamma_2 = 0$, then the curve is mesokurtic.
(ii) $\gamma_2 > 0$, then the curve is leptokurtic.
(iii) $\gamma_2 < 0$, then the curve is platykurtic.
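The classification can be sketched as follows. The definition used here, γ2 = μ4 / μ2² - 3, is the standard moment co-efficient of kurtosis and is an assumption of this sketch, since the lecture text does not show the formula. The frequency tables and the tolerance for treating γ2 as zero are hypothetical.

```python
def central_moment(class_values, freqs, r):
    """mu_r = (1/n) * sum of f_i * (x_i - mean)^r, with n = sum of f_i."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(class_values, freqs)) / n
    return sum(f * (x - x_bar) ** r for x, f in zip(class_values, freqs)) / n

def gamma_2(class_values, freqs):
    """Assumed definition: mu_4 / mu_2^2 - 3."""
    mu2 = central_moment(class_values, freqs, 2)
    mu4 = central_moment(class_values, freqs, 4)
    return mu4 / mu2 ** 2 - 3

def classify(g2, tol=1e-9):
    """Name the curve type from the sign of gamma_2."""
    if abs(g2) <= tol:
        return "mesokurtic"
    return "leptokurtic" if g2 > 0 else "platykurtic"

# Hypothetical frequency tables.
print(classify(gamma_2([-1, 0, 1], [1, 4, 1])))
print(classify(gamma_2([-3, 0, 3], [1, 10, 1])))
print(classify(gamma_2([1, 2, 3, 4], [4, 3, 2, 1])))
```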