
Measures of Dispersion, Skewness and Kurtosis

DISPERSION

Averages or the measures of central tendency give us an idea of the concentration of the observations about the central part of the distribution. If we know the average alone, we cannot form a complete idea about the distribution, as will be clear from the following example.

Consider the series (i) 4, 5, 6, 7, 8 (ii) 2, 3, 6, 9, 10 (iii) -3, -1, 7, 12, 15. In all these cases we see that n, the number of observations, is 5 and the mean x̄ is 6.
If we are given that the mean of 5 observations is 6, we cannot form an idea as to
whether it is the average of first series or second series or third series or of any
other series of five observations whose sum is 30. Thus we see that the measures
of central tendency are inadequate to give us a complete idea of the distribution.
They must be supported and supplemented by some other measures. One such
measure is dispersion.
We study dispersion to have an idea about the homogeneity or
heterogeneity of the distribution. In the above case we say that series (i) is more
homogeneous (less dispersed) than the series (ii) or we say that series (iii) is more
heterogeneous (more scattered) than the series (i) or (ii).
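
As a quick illustration (not part of the original notes), the following Python sketch computes the mean and the population standard deviation of the three series; all three means are 6, but the spread grows from series (i) to series (iii).

# Minimal sketch: same mean, very different dispersion.
from statistics import mean, pstdev

series = {
    "(i)": [4, 5, 6, 7, 8],
    "(ii)": [2, 3, 6, 9, 10],
    "(iii)": [-3, -1, 7, 12, 15],
}

for name, values in series.items():
    # pstdev is the population standard deviation (divisor n)
    print(name, "mean =", mean(values), "std. dev. =", round(pstdev(values), 2))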

MEASURES OF DISPERSION

The literal meaning of dispersion is scatteredness or variability. The measurement of the scatter of the values of a data set among themselves is called a measure of dispersion or variation.

Purposes of measures of dispersion: The measure of dispersion, in conjunction with an average, gives us a description of the structure of the distribution and the role of the individual values in it. A measure of dispersion serves two purposes:

(i) It provides one of the most important characteristics of a frequency distribution.
(ii) It helps us to compare two or more frequency distributions.

Characteristics of an ideal measure of dispersion: The characteristics of an ideal measure of dispersion are the same as those of an ideal measure of central tendency, viz.:

(i) It should be rigidly defined


(ii) It should be easy to calculate and easy to understand.
(iii) It should be based on all observations.


(iv) It should be amenable to further mathematical treatment.


(v) It should be affected as little as possible by fluctuations of sampling.

Types of measures of dispersion: There are two types of measures of dispersion.

(i) Absolute measures of dispersion


(ii) Relative measures of dispersion.

Absolute Measures of Dispersion

A measure of dispersion having the same unit as the original variable is termed an absolute measure of dispersion.

The absolute measures of dispersion are:


(i) Range,
(ii) Quartile deviation or semi-interquartile range,
(iii) Mean deviation,
(iv) Variance,
(v) Standard deviation,
(vi) Standard error.

Range

The range is the difference between the two extreme observations of the distribution. It is usually denoted by R. If A and B are the greatest and the smallest observations respectively in a distribution, then its range is

R = A - B.

Suitability of range: Range is the simplest but a crude measure of dispersion. Since it is based on two extreme observations, it is not at all a reliable measure of dispersion.
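
A minimal Python sketch of the range, using a small hypothetical data set:

# The range is simply the largest observation minus the smallest one.
data = [2, 3, 6, 9, 10]

A = max(data)   # greatest observation
B = min(data)   # smallest observation
R = A - B       # range
print("Range R =", R)   # 10 - 2 = 8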

Quartile Deviation

Quartile deviation or semi-interquartile range is defined as half of the difference between the third quartile and the first quartile, i.e.,

$$Q.D. = \frac{1}{2}(Q_3 - Q_1)$$


Where, Q1 and Q3 are the first and third quartiles respectively of the distribution.

Suitability of quartile deviation: It is definitely a better measure than the range as it makes use of 50% of the data. But since it ignores the other 50% of the data, it cannot be regarded as a reliable measure.
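
A short Python sketch of the quartile deviation follows. The data are hypothetical, and note that several conventions exist for computing quartiles; statistics.quantiles with method="inclusive" is only one of them, so a hand computation based on a different rule may give a slightly different answer.

# Quartile deviation (semi-interquartile range) under one quartile convention.
from statistics import quantiles

data = [2, 3, 6, 9, 10, 12, 15, 18, 20]

q1, _, q3 = quantiles(data, n=4, method="inclusive")
qd = (q3 - q1) / 2          # semi-interquartile range
print("Q1 =", q1, "Q3 =", q3, "Q.D. =", qd)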

Mean Deviation
The arithmetic mean of the absolute deviations of the individual values of a
variable from their average A (usually mean, median or mode) is called the mean
deviation.

Case I. If $x_i$ $(i = 1, 2, \ldots, n)$ is the $i$th value of a variable, then

$$M.D. = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - A \rvert$$

Case II. If $x_i$ $(i = 1, 2, \ldots, k)$ is the value of the $i$th class with corresponding frequency $f_i$ such that $\sum_{i=1}^{k} f_i = n$, then

$$M.D. = \frac{1}{n}\sum_{i=1}^{k} f_i \lvert x_i - A \rvert$$

Case III. If $x_i$ $(i = 1, 2, \ldots, k)$ is the mid-value of the $i$th class with corresponding frequency $f_i$ such that $\sum_{i=1}^{k} f_i = n$, then

$$M.D. = \frac{1}{n}\sum_{i=1}^{k} f_i \lvert x_i - A \rvert$$

Suitability of mean deviation: Since mean deviation is based on all the observations, it is a better measure of dispersion than the range or quartile deviation. But the step of ignoring the signs of the deviations $(x_i - A)$ creates artificiality and makes it useless for further mathematical treatment.

Important result: Mean deviation is least when taken from the median, i.e., when A = Md.
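
The following Python sketch, using hypothetical data, computes the mean deviation about the mean and about the median, and illustrates the result just stated: the mean deviation about the median is never larger.

# Mean deviation about an arbitrary reference value A.
from statistics import mean, median

def mean_deviation(values, A):
    # arithmetic mean of the absolute deviations from A
    return sum(abs(x - A) for x in values) / len(values)

data = [2, 3, 6, 9, 20]

md_about_mean = mean_deviation(data, mean(data))
md_about_median = mean_deviation(data, median(data))
print("M.D. about mean   =", md_about_mean)
print("M.D. about median =", md_about_median)   # never larger than the line above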


Standard Deviation and Variance

Standard Deviation: The positive square root of the arithmetic mean of the squared deviations of observations taken from their mean is known as the standard deviation.

Standard deviation of population values is denoted by $\sigma$ and is defined as

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2} \quad \text{[for ungrouped data]}$$

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{k} f_i (X_i - \mu)^2} \quad \text{[for grouped data]}$$

where N is the population size and $\mu$ is the population mean.

Standard deviation of sample values is denoted by s and is defined as

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} \quad \text{[for ungrouped data]}$$

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \quad \text{[for grouped data]}$$

To get an unbiased estimate of the population variance from a sample of small size, the following formula is often used:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} \quad \text{[for ungrouped data]}$$

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \quad \text{[for grouped data]}$$


Variance: The square of the standard deviation is called the variance i.e., the
arithmetic mean of the squared deviations of observations taken from their mean
is known as the variance.
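
A short Python sketch of these definitions follows, using hypothetical data. The standard library's statistics module uses the same two conventions: pvariance and pstdev divide by n, while variance and stdev divide by n - 1.

# Population-style (divisor n) versus sample (divisor n - 1) variance and SD.
from statistics import pstdev, pvariance, stdev, variance

data = [2, 3, 6, 9, 10]

print("variance with divisor n        :", pvariance(data))
print("std. deviation with divisor n  :", pstdev(data))
print("variance with divisor n - 1    :", variance(data))
print("std. deviation with divisor n-1:", stdev(data))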
Sum of Squares: The quantity $\sum_{i=1}^{n}(x_i - \bar{x})^2$ is often referred to as the corrected sum of squares, or simply the sum of squares (S.S.), of the observed values $x_1, x_2, \ldots, x_n$. It is called the corrected sum of squares as it can be expressed as the raw sum of squares minus the correction term. We can write:

$$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i^2 - 2\bar{x}x_i + \bar{x}^2) = \sum_{i=1}^{n}x_i^2 - 2\bar{x}\sum_{i=1}^{n}x_i + n\bar{x}^2 = \sum_{i=1}^{n}x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n}x_i^2 - \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{n}$$

The terms $\sum_{i=1}^{n}x_i^2$ and $\left(\sum_{i=1}^{n}x_i\right)^2\!/\,n$ are usually called the raw sum of squares (RSS) and the correction term respectively.
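
The identity above is easy to verify numerically; the following sketch (hypothetical data) computes the corrected sum of squares both directly and as the raw sum of squares minus the correction term.

# Corrected sum of squares = raw sum of squares - correction term.
data = [2, 3, 6, 9, 10]
n = len(data)
xbar = sum(data) / n

corrected_ss = sum((x - xbar) ** 2 for x in data)
raw_ss = sum(x ** 2 for x in data)
correction_term = sum(data) ** 2 / n

print(corrected_ss)                  # 50.0
print(raw_ss - correction_term)      # also 50.0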

Similarly, for frequency or grouped data,

$$\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}$$

Hence, for a large sample,

$$s^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \frac{1}{n}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{n}\left[\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{n}\sum_{i=1}^{k} f_i x_i^2 - \bar{x}^2$$


For a small sample,

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2\right]$$

Suitability of standard deviation: The standard deviation is by far the
most widely encountered measure of dispersion.

Merits:

(1) It is rigidly defined.


(2) It is based on all observations and is readily understood.
(3) It is amenable to algebraic treatment.
(4) It is the most important and most reliable among all the measures of
absolute dispersion. The standard deviation possesses a majority of
the properties which are desirable in a measure of dispersion.
(5) It is easy to use mathematically. Many statistical theorems are built
around it.

Demerits:
(1) It is affected markedly by extreme values.
(2) It is more difficult to compute than other measures of dispersion.

Some important properties of standard deviation:

(i) It is independent of origin but not of scale of measurement, i.e., if a variable x is transformed to another variable u by $u_i = \dfrac{x_i - A}{C}$, then $s_x = C\,s_u$.

(ii) The variance is the minimum of all mean squared deviations (MSD) and the standard deviation is the minimum of all root mean squared deviations (RMSD),

i.e., $$\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \le \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^2$$

and $$\sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2} \le \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^2}$$


where A is any quantity other than the mean.

(iii) Variance of the combined series: If $n_1$, $n_2$ are the sizes, $\bar{x}_1$, $\bar{x}_2$ the means, and $s_1^2$, $s_2^2$ the variances of two sets of data $(x_{11}, x_{12}, \ldots, x_{1n_1})$ and $(x_{21}, x_{22}, \ldots, x_{2n_2})$, then the variance $s^2$ of the combined series is given by

$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}$$

where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and $\bar{x} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$. A numerical check of this formula is sketched below.
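
A small Python sketch (hypothetical data) of the combined-series formula, checked against a direct computation on the pooled observations. Note that the formula uses the divisor n for each variance, i.e., pvariance.

# Combined variance of two series versus variance of the pooled data.
from statistics import mean, pvariance

x1 = [4, 5, 6, 7, 8]
x2 = [2, 3, 6, 9, 10]

n1, n2 = len(x1), len(x2)
m1, m2 = mean(x1), mean(x2)
s1_sq, s2_sq = pvariance(x1), pvariance(x2)

m = (n1 * m1 + n2 * m2) / (n1 + n2)          # combined mean
d1, d2 = m1 - m, m2 - m
s_sq = (n1 * s1_sq + n2 * s2_sq + n1 * d1 ** 2 + n2 * d2 ** 2) / (n1 + n2)

print(s_sq, pvariance(x1 + x2))              # the two values agree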

(iv) Standard error: The standard deviation of any statistic is termed as the
standard error of that statistic.

Standard error of the sample mean: The standard error of the sample mean x̄, computed from a sample of size n, is σ/√n. The standard deviation can be viewed as a parameter which provides a lot of information when combined with other techniques. It is particularly useful when the population has a special type of frequency distribution, called the normal distribution. It is then possible to find the percentage of observations falling within a distance of one, two or three σ's from the mean. About 68.27 percent, 95.45 percent and 99.73 percent of the observations will lie within the regions (μ ± σ), (μ ± 2σ) and (μ ± 3σ) respectively, where μ and σ are the mean and standard deviation of the normal distribution. Thus, in a normal curve, 3σ on either side of the mean covers practically the whole range of the values in the distribution.
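
The percentages quoted above can be checked by simulation. The sketch below is only an illustration: it draws values from a normal distribution with an assumed mean of 50 and standard deviation of 10, then counts how many fall within 1, 2 and 3 sigma of the mean.

# Empirical check of the 68.27 / 95.45 / 99.73 percent rule.
import random

random.seed(0)
mu, sigma, n = 50.0, 10.0, 100_000
sample = [random.gauss(mu, sigma) for _ in range(n)]

for k in (1, 2, 3):
    inside = sum(1 for x in sample if abs(x - mu) <= k * sigma)
    print(f"within {k} sigma: {100 * inside / n:.2f}%")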


Relative Measures of Dispersion

A measure of dispersion having no unit of measurement, unlike the original variable, is termed a relative measure of dispersion. Whenever we want to compare the variability of two series which differ widely in their averages, or which are measured in different units, we do not merely calculate the absolute measures of dispersion but the relative measures of dispersion. Relative measures of dispersion are usually termed the co-efficients of dispersion, which are pure numbers.

The co-efficients of dispersion (C.D.) based on the different absolute measures of dispersion are as follows:

(i) Based upon range:

$$C.D. = \frac{A - B}{A + B}$$

where A and B are the greatest and smallest items respectively in the series.

(ii) Based upon quartile deviation:

$$C.D. = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

(iii) Based upon mean deviation:

$$C.D. = \frac{\text{Mean deviation}}{\text{Average from which it is calculated}}$$

For example, when $M_0$ is used to calculate the mean deviation, then

$$C.D. = \frac{\text{Mean deviation about } M_0}{M_0}$$

(iv) Based upon standard deviation:

$$C.D. = \frac{\text{Standard deviation}}{\text{Mean}} = \frac{\sigma}{\bar{x}}$$

Co-efficient of Variation

The co-efficient of dispersion based upon standard deviation, when multiplied by 100, is called the co-efficient of variation (C.V.), i.e.,

$$C.V. = \frac{\sigma}{\bar{x}} \times 100$$

According to Professor Karl Pearson, who suggested this measure, C.V. is the percentage variation in the mean, the standard deviation being considered as the total variation in the mean.

Suitability of C.V.: For comparing the variability of two series, we calculate the co-efficient of variation for each series. The series having the greater C.V. is said to have more variability than the other, and the series having the lesser C.V. is said to have more consistency than the other.
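
For example, the following Python sketch compares two hypothetical series measured in different units by their coefficients of variation:

# Comparing variability of series on different scales via the C.V.
from statistics import mean, pstdev

def coefficient_of_variation(values):
    # C.V. = (standard deviation / mean) * 100
    return pstdev(values) / mean(values) * 100

marks_series_a = [55, 60, 65, 70, 75]         # hypothetical marks
heights_series_b = [150, 152, 155, 158, 160]  # hypothetical heights in cm

print(f"C.V. of series A = {coefficient_of_variation(marks_series_a):.2f}%")
print(f"C.V. of series B = {coefficient_of_variation(heights_series_b):.2f}%")
# The series with the larger C.V. is the more variable (less consistent) one.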

SKEWNESS

Literally, skewness means 'lack of symmetry'. We study skewness to have an idea about the shape of the curve which we can draw with the help of the given data. A distribution is said to be skewed if

(i) Mean, median and mode fall at different points, i.e., Mean ≠ Median ≠ Mode,
(ii) Quartiles are not equidistant from the median, and
(iii) The curve drawn with the help of the given data is not symmetrical but is stretched more to one side than to the other.

Measures of Skewness

Various Measures of skewness are

(1) Sk = M - Md (2) Sk = M – M0

where, M is the mean, Md, the median and M0, the mode of the distribution.

(3) Sk = (Q3 – Md) – (Md – Q1)

These are the absolute measures of skewness. As in dispersion, for comparing two series we do not calculate these absolute measures, but we
calculate the relative measures, called the co-efficients of skewness, which are
pure numbers independent of units of measurement. The following are the co-
efficients of skewness.

I. Prof. Karl Pearson's Co-efficient of Skewness: It is based on the averages (mean, median and mode) and is defined as

$$S_k = \frac{M - M_0}{\sigma}$$

where $\sigma$ is the standard deviation of the distribution.

If the mode is ill defined, then using the relation $M - M_0 = 3(M - M_d)$, which holds for a moderately asymmetrical distribution, we get

$$S_k = \frac{3(M - M_d)}{\sigma}$$

Skewness is positive if M > M0 or M > Md, and negative if M < M0 or M < Md.
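
A short Python sketch of Pearson's coefficient, using hypothetical right-skewed data and the median-based form (which avoids estimating the mode):

# Pearson's coefficient of skewness, Sk = 3(M - Md) / sigma.
from statistics import mean, median, pstdev

data = [2, 3, 3, 4, 5, 6, 9, 12, 18]     # hypothetical, right-skewed data

M, Md, sigma = mean(data), median(data), pstdev(data)
sk = 3 * (M - Md) / sigma
print(f"Sk = {sk:.3f}")                  # positive here, since M > Md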

II. Prof. Bowley's Co-efficient of Skewness: It is based on quartiles and is defined as

$$S_k = \frac{(Q_3 - M_d) - (M_d - Q_1)}{(Q_3 - M_d) + (M_d - Q_1)} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$
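
The same hypothetical data can be used to compute Bowley's coefficient; as before, the numerical quartiles depend on the convention chosen.

# Bowley's quartile-based coefficient of skewness.
from statistics import median, quantiles

data = [2, 3, 3, 4, 5, 6, 9, 12, 18]

q1, _, q3 = quantiles(data, n=4, method="inclusive")
md = median(data)
sk_bowley = (q3 + q1 - 2 * md) / (q3 - q1)
print(f"Bowley's Sk = {sk_bowley:.3f}")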

III. Co-efficient of Skewness Based on Moments: From a theoretical point of view, the most important measure of skewness is based upon the corrected moments. A measure of skewness may be obtained by using the third corrected moment $\mu_3$. But $\mu_3$ is a measure of absolute skewness. The measure of relative skewness is given by

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}.$$

Skewness is also sometimes measured by

$$\gamma_1 = \pm\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}.$$

When
(i) $\gamma_1 = 0$, then the distribution is symmetrical.

(ii) $\gamma_1 > 0$, then the distribution is positively skewed.
(iii) $\gamma_1 < 0$, then the distribution is negatively skewed.

It is worth mentioning that the rth corrected moment of a distribution, denoted by $\mu_r$, is defined as

$$\mu_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^r, \quad \text{i.e.,} \quad \mu_2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \quad \text{and} \quad \mu_3 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^3$$

KURTOSIS

If we know the measures of central tendency, dispersion and skewness, we still cannot form a complete idea about the distribution. In addition to these measures we should know one more measure, which Prof. Karl Pearson calls the 'convexity of a curve' or kurtosis.

Kurtosis enables us to have an idea about the flatness or peakedness of the curve. It is measured by the co-efficient $\beta_2$, given by

$$\beta_2 = \frac{\mu_4}{\mu_2^2},$$

and also by $\gamma_2$, which is derived from $\beta_2$ and given by

$$\gamma_2 = \beta_2 - 3.$$

When
(i) $\gamma_2 = 0$, then the curve is mesokurtic.
(ii) $\gamma_2 > 0$, then the curve is leptokurtic.
(iii) $\gamma_2 < 0$, then the curve is platykurtic.
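
The moment-based measures of skewness and kurtosis can be computed directly from the definition of the corrected moments given earlier; the sketch below uses hypothetical data.

# Central ("corrected") moments and the shape measures gamma_1 and gamma_2.
from statistics import mean

def central_moment(values, r):
    xbar = mean(values)
    return sum((x - xbar) ** r for x in values) / len(values)

data = [2, 3, 3, 4, 5, 6, 9, 12, 18]

mu2 = central_moment(data, 2)
mu3 = central_moment(data, 3)
mu4 = central_moment(data, 4)

gamma1 = mu3 / mu2 ** 1.5        # skewness: > 0 means positively skewed
gamma2 = mu4 / mu2 ** 2 - 3      # excess kurtosis: > 0 means leptokurtic
print(f"gamma_1 = {gamma1:.3f}, gamma_2 = {gamma2:.3f}")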
