
UNIT 3 SKEWNESS AND KURTOSIS

Structure
3.1 Introduction
Objectives
3.2 Moments and Quantiles
Moments of a Frequency Distribution
Quantiles of a Frequency Distribution
3.3 Skewness
3.4 Kurtosis
3.5 Summary
3.6 Solutions and Answers

3.1 INTRODUCTION
In Unit 2, we talked about the central tendency and dispersion of frequency
distributions. We have also seen how to compute some measures of central tendency
and dispersion. Now, in this unit, we shall discuss two additional features of
frequency distributions. These are : skewness and kurtosis. A measure of skewness
would tell us how far the frequency curve of the given frequency distribution deviates
from a symmetric one. On the other hand, a measure of kurtosis gives us some
information about the degree of flatness (or peakedness) of the frequency curve. So,
these two features, along with the two discussed in the previous unit should give us
a good idea about the given frequency distribution.
The measures of skewness and kurtosis that we are going to discuss here, make use
of moments and quantiles. So, we shall first introduce these in Section 3.2.
While studying this unit, you will need to look back at the tables of data given in
Unit 1. You will also need your calculator. Check all the calculations in the solved
examples, so that you don't have any difficulty in solving the exercises later.
With this unit, we end our discussion of the descriptive measures of univariate data.
In the next unit, Unit 4, we'll talk about bivariate data, i.e., data concerning two
variables.

Objectives
After reading this unit, you should be able to :
calculate the moments and the quantiles of a given frequency distribution,
compute some measures of skewness and kurtosis,
discuss the relative advantages and disadvantages of these measures.

3.2 MOMENTS AND QUANTILES


As we have mentioned in the Introduction, using moments and quantiles, we can
define some measures of skewness and kurtosis. So we can say that moments and
quantiles give us some information about the nature of a given frequency distribution.
You will see that the mean of a frequency distribution can also be considered as its
moment, whereas the median can be considered as its quantile. Now let's study these
one by one.

3.2.1 Moments of a Frequency Distribution


If you have studied a little physics, you would have come across the word "moment".
Moment, in physics, measures the tendency of a force to produce rotation. If there
are n forces $f_1, f_2, \ldots, f_n$ acting at distances $x_1, x_2, \ldots, x_n$ from the origin,
then the moment of the total force is
\[ \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} \tag{1} \]

Now isn't this a familiar expression? If in (1), we take $f_i$ to be the frequency of $x_i$,
i = 1, 2, ..., n, then (1) also gives us the mean of the distribution of x-values. It is
because of this similarity, that the term, "moment" has found its way into Statistics.
Now let's try to understand this term in the context of Statistics.
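You know that the mean and the variance of data on a variable x are given by
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - 0) \quad \text{and} \quad s^2 = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 , \]
respectively, where $n = \sum_{i=1}^{k} f_i$ is the total frequency.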

Now, taking a cue from this, if A is any number, we define the rth moment of x about
A to be the mean of the rth power of the deviations of x from A. We denote this by
$m'_r(A)$, or simply $m'_r$, if there is no confusion about the origin chosen. Thus,
\[ m'_r(A) = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r . \tag{2} \]
For raw data, we can write $m'_r(A) = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r$.

If we take $A = \bar{x}$, then we get what are called the central moments. The rth central
moment is denoted by $m_r$. Thus
\[ m_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^r . \tag{3} \]
For raw data, $m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$.

We are sure you would agree with the following :

$m'_0 = m_0 = 1$ ... (4)
$m'_1(0) = \bar{x}$ ... (5)
$m_1 = 0$ ... (6)
$m_2 = s^2$ ... (7)

(5) and (7) say that the mean of a variable is its first moment about zero, while the
variance is its second central moment. Recall that we have already established (6) in
Section 2.3.4.
Now let us try to establish a connection between central moments and moments about
any value A. For simplicity we consider raw data only.

For every i, we have
\[ x_i - \bar{x} = (x_i - A) - (\bar{x} - A) = (x_i - A) - m'_1 . \]
Therefore, using the binomial theorem, we get
\[ (x_i - \bar{x})^r = (x_i - A)^r - \binom{r}{1}(x_i - A)^{r-1} m'_1 + \binom{r}{2}(x_i - A)^{r-2}(m'_1)^2 - \cdots + (-1)^r (m'_1)^r . \]
If we sum both sides over all i, i = 1, 2, ..., n, and divide the result by n, we get
\[ m_r = m'_r - \binom{r}{1} m'_{r-1} m'_1 + \binom{r}{2} m'_{r-2} (m'_1)^2 - \cdots + (-1)^r (m'_1)^r . \tag{8} \]
Now let us take r = 1. In this case, (8) becomes
\[ m_1 = m'_1 - \binom{1}{1} m'_1 = 0 . \]
We have already stated this in (6). Now, if we put r = 2 in (8), we get
\[ m_2 = m'_2 - \binom{2}{1} (m'_1)^2 + \binom{2}{2} (m'_1)^2 . \]
This gives us
\[ m_2 = m'_2 - (m'_1)^2 . \tag{9} \]
Putting r = 3 in (8) gives us
\[ m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3 . \tag{10} \]
Check that if you put r = 4 in (8), you get
\[ m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4 . \tag{11} \]
Further, we have
\[ \bar{x} = A + m'_1 . \tag{12} \]
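Relation (8) is easy to check numerically. The sketch below is only an illustration and not part of the original unit: the data values, the choice A = 5 and the helper names `moment_about` and `central_from_raw` are all assumptions made for the example. It computes the moments about A, converts them to central moments through the binomial expansion, and compares the result with the central moments computed directly from the deviations about the mean.

```python
from math import comb

def moment_about(xs, a, r):
    """r-th moment of the raw data xs about the point a."""
    return sum((x - a) ** r for x in xs) / len(xs)

def central_from_raw(raw, r):
    """Relation (8): raw[k] is the k-th moment about A (raw[0] == 1)."""
    m1 = raw[1]
    return sum((-1) ** k * comb(r, k) * raw[r - k] * m1 ** k
               for k in range(r + 1))

xs = [2.0, 3.0, 5.0, 7.0, 11.0]   # hypothetical raw data
A = 5.0                           # arbitrary origin near the centre of the range

raw = [moment_about(xs, A, k) for k in range(5)]
mean = A + raw[1]                 # relation (12)
via_relation = [central_from_raw(raw, r) for r in range(5)]
direct = [moment_about(xs, mean, r) for r in range(5)]

# Print both versions of m_1, ..., m_4 side by side.
for r in range(1, 5):
    print(r, round(via_relation[r], 6), round(direct[r], 6))
```

The two columns printed should agree for every r, whatever origin A is chosen; only rounding error in the floating-point arithmetic can separate them.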
We can use these relationships between the central moments and the moments about
any A, for simplifying the calculations involved in the computation of central
moments. We choose A to be a value near the centre of the range of values of x.
Here is an example to show how this is done.
Example 1 : Let's evaluate the mean and central moments of the milk yield data (in
litres) of a dairy farm first used in Example 1 in Unit 2. Let us take A = 200 litres
and first obtain the moments about A. The values of u = x-A and of the squares,
cubes and fourth powers of u are shown in the table below. The last column is taken
to serve as a check on the calculations, as you will see later.
          u_i       u_i^2        u_i^3         u_i^4     (u_i+1)^4

         18.2      331.24     6028.568     109719.94     135895.45
         -0.3        0.09       -0.027          0.01          0.24
          7.3       53.29      389.017       2839.82       4745.83
        -14.6      213.16    -3112.136      45437.19      34210.20
         13.7      187.69     2571.353      35227.54      46694.89
        -15.3      234.09    -3581.577      54798.13      41816.16
        -20.5      420.25    -8615.125     176610.06     144590.06
         -5.6       31.36     -175.616        983.45        447.75
         24.3      590.49    14348.907     348678.44     409715.21
          3.5       12.25       42.875        150.06        410.06

Total    10.7     2073.91     7896.239     774444.64     818525.85

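The last row of the table shows that
$\sum u_i = 10.7$, $\sum u_i^2 = 2073.91$, $\sum u_i^3 = 7896.239$, $\sum u_i^4 = 774444.64$, n = 10.
Now to check these calculations, we make use of the last column. We have
$\sum (u_i + 1)^4 = 818525.85$.
But $\sum (u_i + 1)^4$ also equals
$\sum u_i^4 + 4\sum u_i^3 + 6\sum u_i^2 + 4\sum u_i + n$
= 774444.64 + 31584.96 + 12443.46 + 42.80 + 10 = 818525.86.
Since we get the same value (apart from rounding in the last digit) for $\sum (u_i + 1)^4$
by these two different methods, we can be sure that the computations are correct.
Now
$m'_1 = 10.7/10 = 1.07$,
$m'_2 = 2073.91/10 = 207.391$,
$m'_3 = 7896.239/10 = 789.624$,
$m'_4 = 774444.64/10 = 77444.46$.
Hence $\bar{x} = 200 + m'_1 = 201.07$ litres (from (12)),
$m_2 = m'_2 - (m'_1)^2$ (from (9))
= 207.391 - 1.145 = 206.246 (litre)^2,
$m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3$ (from (10))
= 789.624 - 3 x 221.908 + 2 x 1.225
= 792.074 - 665.724 = 126.35 (litre)^3,
and $m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4$ (from (11))
= 77444.46 - 4 x 844.90 + 6 x 237.44 - 3 x 1.31
= 78869.10 - 3383.53 = 75485.6 (litre)^4.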
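The arithmetic of Example 1 can be reproduced in a few lines. This is a sketch only: the list `u` copies the deviations u = x - 200 from the first column of the table, and the variable names are assumptions of the example rather than notation from the unit.

```python
# Deviations u = x - A for the ten milk yields, with A = 200 litres.
u = [18.2, -0.3, 7.3, -14.6, 13.7, -15.3, -20.5, -5.6, 24.3, 3.5]
A = 200.0
n = len(u)

# Column totals of u, u^2, u^3, u^4 (sums[0] is simply n).
sums = [sum(v ** r for v in u) for r in range(5)]

# Check column: the sum of (u + 1)^4 computed in two different ways.
check_direct = sum((v + 1) ** 4 for v in u)
check_expanded = sums[4] + 4 * sums[3] + 6 * sums[2] + 4 * sums[1] + n

# Moments about A, then the mean and central moments from (9)-(12).
m1p, m2p, m3p, m4p = (sums[r] / n for r in range(1, 5))
mean = A + m1p
m2 = m2p - m1p ** 2
m3 = m3p - 3 * m2p * m1p + 2 * m1p ** 3
m4 = m4p - 4 * m3p * m1p + 6 * m2p * m1p ** 2 - 3 * m1p ** 4

print(round(mean, 2), round(m2, 3), round(m3, 2), round(m4, 1))
```

Working with the small deviations u instead of the yields themselves keeps every intermediate number short, which is the whole point of choosing A near the centre of the data.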
Try to do this exercise now. The result in this exercise also helps to simplify the
computation of central moments.

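E1) Suppose the data are subjected to a change of both origin and scale, i.e., let
\[ u_i = \frac{x_i - A}{c}, \quad i = 1, 2, \ldots, n . \]
If $v'_r = \frac{1}{n}\sum_{i=1}^{n} u_i^r$, show that
\[ \bar{x} = A + c\,v'_1, \qquad m_2 = c^2\,[v'_2 - (v'_1)^2], \]
\[ m_3 = c^3\,[v'_3 - 3 v'_2 v'_1 + 2 (v'_1)^3], \qquad m_4 = c^4\,[v'_4 - 4 v'_3 v'_1 + 6 v'_2 (v'_1)^2 - 3 (v'_1)^4] . \]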

Notice how we have used this result in simplifying the computation of the mean and
central moments in our next example.
Example 2 : For the frequency table of petiole length of leaves of a pipal tree, let us
take u = (x - 4.95)/0.8. The table below shows the steps to be followed in obtaining
$v'_1$, $v'_2$, $v'_3$ and $v'_4$. Here, again, we take the last column to provide a check on the
computations.

We have $\sum_{i=1}^{k} f_i (u_i + 1)^4 = 12320$.
Also, $\sum f_i u_i^4 + 4\sum f_i u_i^3 + 6\sum f_i u_i^2 + 4\sum f_i u_i + n$
= 7062 + 4 x 148 + 6 x 690 + 4 x 82 + 198 = 12320.
Hence, we are sure that the column totals are free of errors.
We now have
$v'_1$ = 82/198 = 0.41414,
$v'_2$ = 690/198 = 3.4848,
$v'_3$ = 148/198 = 0.74747,
$v'_4$ = 7062/198 = 35.667.
The mean and central moments of petiole length are
$\bar{x}$ = 4.95 + 0.8 x 0.41414 = 5.2813 cm,
$m_2$ = (0.8)^2 [3.4848 - (0.41414)^2]
= 0.64 x 3.3133 = 2.1205 (cm)^2,
$m_3$ = (0.8)^3 [0.74747 - 3 x 3.4848 x 0.41414 + 2 x (0.41414)^3]
= 0.512 x (-3.44006) = -1.7613 (cm)^3,
$m_4$ = (0.8)^4 [35.667 - 4 x 0.74747 x 0.41414 + 6 x 3.4848 x
(0.41414)^2 - 3 x (0.41414)^4]
= 0.4096 x 37.927 = 15.535 (cm)^4.
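The change of origin and scale used in Example 2 can be sketched in code as follows. The inputs are the column totals implied by the quotients quoted in the example (82, 690 and 148 over n = 198; the fourth total, 7062, is an assumption read back from the quotient 35.667), together with A = 4.95 and scale 0.8. The variable names are hypothetical.

```python
n = 198        # total frequency
A = 4.95       # origin used in u = (x - A) / c
c = 0.8        # scale (the class width)

# Column totals of f*u, f*u^2, f*u^3, f*u^4 (the last one is inferred).
totals = {1: 82, 2: 690, 3: 148, 4: 7062}

# Check column: the total of f*(u + 1)^4 obtained from the binomial expansion.
check = totals[4] + 4 * totals[3] + 6 * totals[2] + 4 * totals[1] + n

# Moments of u about zero.
v1, v2, v3, v4 = (totals[r] / n for r in (1, 2, 3, 4))

# Mean and central moments of x: each central moment of u is scaled by c^r.
mean = A + c * v1
m2 = c ** 2 * (v2 - v1 ** 2)
m3 = c ** 3 * (v3 - 3 * v2 * v1 + 2 * v1 ** 3)
m4 = c ** 4 * (v4 - 4 * v3 * v1 + 6 * v2 * v1 ** 2 - 3 * v1 ** 4)

print(check, round(mean, 4), round(m2, 4), round(m3, 4), round(m4, 3))
```

The square root of the second central moment obtained this way is about 1.456 cm, which is the standard deviation quoted for this distribution later in the unit.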
We may also encounter an exactly opposite situation, that is, given the mean and
central moments of a variable x, we may like to express the moments about some
other origin, say A, in terms of these quantities. So, we would like to have some
formulas which express each $m'_r$ in terms of the central moments. We believe you are
now in a position to derive the required formulas.

E2) Prove that
\[ m'_r = m_r + \binom{r}{1} m_{r-1} d + \binom{r}{2} m_{r-2} d^2 + \cdots + \binom{r}{r-2} m_2 d^{r-2} + \binom{r}{r} d^r , \]
where $d = \bar{x} - A$.
Hint : Use the fact $x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$.
E3) In Sections 2.3.4 and 2.4.4, you have seen formulas for the composite mean and
composite variance in terms of the group means and group variances, when
several groups of data on a variable are taken together. Obtain similar formulas
for the composite third and fourth central moments, using E2.

Let us now turn our attention to quantiles.

3.2.2 Quantiles of a Frequency Distribution


Remember how we define median of a given set of data? It is a value x, such that at
most half of the observations are below x, and at most half are above x. Now, instead
of half, if we take a proportion p, 0 < p < 1, then we get a p-quantile.
Thus, by the p-quantile (or p-fractile, or quantile of order p) of a variable x, we mean
a value, say $z_p$, of the variable such that at most a proportion p of the observations
is below $z_p$ and at most a proportion (1-p) is above $z_p$. So, the median is the quantile
of order 1/2. With p = 1/4, 1/2 and 3/4 we have the three quartiles $z_{1/4}$, $z_{1/2}$ and $z_{3/4}$
(which are also denoted by $q_1$, $q_2$ and $q_3$). Taking p = 0.1, 0.2, ..., 0.9, we get the
nine deciles and with p = 0.01, 0.02, ..., 0.99, we get the ninety-nine percentiles.
Like the median, the p-quantile for a set of observations may not be unique. Now
let's see how to compute this p-quantile. When a frequency table for a continuous
variable is given, we first decide which of the class-intervals contains $z_p$. If $x_l$ and $x_u$
are its lower and upper boundaries and $F_l$ and $F_u$ the corresponding cumulative
frequencies, then the p-quantile $z_p$ may be determined approximately by using the
formula
\[ \frac{z_p - x_l}{x_u - x_l} = \frac{np - F_l}{F_u - F_l}, \quad \text{i.e.,} \quad z_p = x_l + \frac{np - F_l}{f_0} \times c , \tag{13} \]
where $c = x_u - x_l$ is the width of the interval and $f_0 = F_u - F_l$ is its frequency. In (13),
if we put p = 1/2, we get the formula for the median. Now let's use Formula (13) to
compute the quartiles in an example.
Example 3 : For the frequency distribution of petiole length of 198 leaves of a pipal
tree, the median (i.e., the second quartile) was evaluated in Example 5 of Unit 2.
We'll now find the first and third quartiles.

Here n/4 = 49.5 and 3n/4 = 148.5.
On going through the cumulative frequency table (Table 8, Unit 1), we find that $q_1$
lies in the interval 3.75 - 4.55 (cm) and $q_3$ in the interval 6.15 - 6.95 (cm). So, after
putting the appropriate values from Tables 7 and 8 of Unit 1 in (13), we get

$q_1$ = 3.75 + ((49.5 - 26)/24) x 0.8
= 4.533 cm,

$q_3$ = 6.15 + ((148.5 - 145)/33) x 0.8
= 6.235 cm.
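The interpolation in Formula (13) can be written as a small function. This is a minimal sketch: the function name `grouped_quantile` and its parameter names are assumptions, while the numbers passed to it (lower boundaries 3.75 and 6.15, cumulative frequencies 26 and 145 below those boundaries, class frequencies 24 and 33, width 0.8, n = 198) are the ones substituted in Example 3.

```python
def grouped_quantile(p, n, lower, cum_below, freq, width):
    """Approximate p-quantile from a frequency table by linear interpolation.

    lower     -- lower boundary of the class containing the quantile
    cum_below -- cumulative frequency up to that lower boundary
    freq      -- frequency of the class itself
    width     -- width of the class
    """
    if freq <= 0:
        raise ValueError("the class containing the quantile must be non-empty")
    return lower + (n * p - cum_below) / freq * width

n = 198
q1 = grouped_quantile(0.25, n, lower=3.75, cum_below=26, freq=24, width=0.8)
q3 = grouped_quantile(0.75, n, lower=6.15, cum_below=145, freq=33, width=0.8)
print(round(q1, 3), round(q3, 3))
```

With these inputs the function gives about 4.533 cm for the first quartile and about 6.235 cm for the third. Finding the class that contains the quantile is left to the caller, just as it is done by inspection of the cumulative frequency table in the example.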

See if you can solve this exercise now.


E4) Find the three quartiles for the age-distribution of the Indian population
according to the 1981 census (given in Unit 2). Check that $q_2 - q_1 < q_3 - q_2$.

If we are given the first few moments or a small set of quantiles of a frequency
distribution, we can get a fairly good idea about the distribution. In fact, for most
purposes, it will be enough to state the values of $\bar{x}$, $m_2$, $m_3$ and $m_4$ or those of the
three quartiles (or of the nine deciles).
We'll come back to this later. Now we introduce a very useful concept, that of
weighted mean.

Weighted Mean
Consider this situation. Students are admitted to a B.Sc. course in Statistics on the
basis of their performance in the Higher Secondary, or an equivalent examination.
Then don't you think that their scores on the mathematics papers should be
considered more important than those on the physics papers? Similarly, shouldn't the
scores on language papers be considered least important? It is necessary in such a
situation to take into account the relative importance (or weight) of the different
observations while evaluating the mean.
Suppose $w_i \ge 0$ is the weight attached to the value $x_i$ (i.e., to the value of x for the
ith individual). Then the appropriate mean would be
\[ \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} . \]
This measure is called a weighted mean of x. This concept is particularly useful in
economic studies, in the construction of a price index number. This will become
clear from the next example.

Example 4 : The price increases from 1985 to 1989 for five food items have been (in
percentage terms) as follows:
132.1 153.4 144.3 119.7 120.1
If the figures given below indicate the relative importance of these items in a typical
citizen's diet,
34 19 24 12 11,
then the average price increase for these items should be taken to be

(34 x 132.1 + 19 x 153.4 + 24 x 144.3 + 12 x 119.7 + 11 x 120.1)/(34 + 19 + 24 + 12 + 11)
= 13626.7/100 = 136.27 per cent.
Observe that the formula
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i \]
that we used to compute the mean of x from grouped data is nothing but the weighted
mean of $x_1, \ldots, x_k$, the respective frequencies now serving as the weights.
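The weighted mean of Example 4 can be sketched as below. The function name `weighted_mean` and the variable names are assumptions made for the illustration; the price increases and weights are those given in the example.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w*x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * x for x, w in zip(values, weights)) / total

increases = [132.1, 153.4, 144.3, 119.7, 120.1]   # percentage price increases
importance = [34, 19, 24, 12, 11]                 # relative importance in the diet

average_increase = weighted_mean(increases, importance)
print(round(average_increase, 2))

# With equal weights the weighted mean reduces to the ordinary mean.
simple_mean = weighted_mean(increases, [1] * len(increases))
```

Here the weighted figure (about 136.27) is higher than the simple mean of the five increases (133.92), because the two items with the largest weights also show above-average increases.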
Try this exercise now:

E5) A student gets 85, 76 and 82 marks in the three tests for the course MTE-11.
She gets 79 marks in the final examination. What are her average marks if the
weightages given to the tests and the final examination are 10, 10, 10 and 70,
respectively?

In the next two sections, we'll see how moments and quantiles lead to measures of
skewness and kurtosis of a frequency distribution.

3.3 SKEWNESS
In Unit 1, you saw that frequency distributions may be classified as symmetrical and
skewed (or asymmetrical). Skewed distributions can again be classified as positively
skewed or negatively skewed, according as the longer tail of the distribution is
towards the higher or the lower values of the variable (see Fig. 1).
Fig. 1 : (i) Symmetrical (ii) positively skewed and (iii) negatively skewed distributions.

Now, the degree of skewness is the extent to which the given distribution departs
from symmetry. A good measure of the degree of skewness has to fulfil the following
criteria:
i) It should be a pure number, i.e., should be free of the units in which the variable
is measured.

ii) It should be zero, positive and negative for a symmetrical distribution, a
positively skew distribution and a negatively skew distribution, respectively.

iii) It should vary between two definite limits, say, -k and +k, as the nature of a
distribution changes from extreme negative asymmetry to extreme positive
asymmetry.

Here are some commonly used measures (assuming s > 0) :

\[ Sk_1 = \frac{\bar{x} - \hat{x}}{s}, \quad Sk_2 = \frac{3(\bar{x} - \tilde{x})}{s}, \quad Sk_3 = \frac{(q_3 - q_2) - (q_2 - q_1)}{q_3 - q_1} \quad \text{and} \quad Sk_4 = \frac{m_3}{s^3} . \]
As usual, $\bar{x}$ denotes the mean, $\tilde{x}$ the median, $\hat{x}$ the mode, $q_i$ the ith quartile,
$m_3$ the third moment about $\bar{x}$ and s the standard deviation.

$Sk_1$ may not be defined, since the mode may not be defined. To get over this
difficulty, we use the empirical relation $\bar{x} - \hat{x} \approx 3(\bar{x} - \tilde{x})$ to get the measure $Sk_2$. $Sk_2$
and $Sk_3$, too, may not be unique since the median and the first and third quartiles may
not be unique. The square of $Sk_4$, $Sk_4^2 = m_3^2 / m_2^3$, is called Pearson's coefficient and is
denoted by $b_1$.
All the four measures above are free of units. Secondly, for a symmetrical (unimodal)
distribution,
mean = median = mode
and $q_3 - q_2 = q_2 - q_1$.
Further, for most of the distributions which we see in practice, we have the following
two observations:
1) For a positively skew distribution,
mean > median > mode,
and $q_3 - q_2 > q_2 - q_1$.

2) For a negatively skew distribution,
mean < median < mode,
and $q_3 - q_2 < q_2 - q_1$. See Fig. 2(a) and (b).
Fig. 2 : (a) and (b), frequency curves plotted against the variable value.
For a symmetrical distribution, $m_3$ (or for that matter, any odd-order central
moment) is zero. $m_3 > 0$ for a positively skew distribution, and $m_3 < 0$ for a
negatively skew distribution. Hence, all the four measures meet the second criterion.
As to the third criterion, we have the following results :
i) For any distribution with s > 0,
$-3 \le Sk_2 \le +3$.
ii) For any distribution,
$-1 \le Sk_3 \le +1$.
Thus, $Sk_2$ and $Sk_3$ meet the third criterion. Since we have the empirical relation
mean - mode ≈ 3(mean - median),
we can say that $Sk_1$, too, roughly meets this criterion. However, $Sk_4 = \pm\sqrt{b_1}$ may take
any value between $-\infty$ and $+\infty$ and hence, is inferior to the other measures.
Let's now calculate these four measures for the data on petiole length.
Example 5 : For the frequency distribution of petiole length, we have
$\bar{x}$ = 5.281 cm, $\tilde{x}$ (= $q_2$) = 5.359 cm, $\hat{x}$ = 5.607 cm.
Also, for this distribution, the first and third quartiles are
$q_1$ = 4.533 cm, $q_3$ = 6.235 cm,
while the standard deviation and third central moment are
s = 1.456 cm, $m_3$ = -1.761 (cm)^3.

Hence,
$Sk_1$ = (5.281 - 5.607)/1.456 = -0.326/1.456 = -0.224,
$Sk_2$ = 3(5.281 - 5.359)/1.456 = -0.234/1.456 = -0.161,
$Sk_3$ = (6.235 - 2 x 5.359 + 4.533)/(6.235 - 4.533) = 0.050/1.702 = 0.029, and
$Sk_4$ = -1.761/(1.456)^3 = -1.761/3.087 = -0.571.
These values indicate that the distribution is only mildly asymmetric. $Sk_1$, $Sk_2$ and
$Sk_4$ are negative, so it is a case of negative asymmetry ($Sk_3$ is very close to zero).
This is also apparent from the histogram of the distribution (see Fig. 4 in Unit 1).
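The four measures can be evaluated together from the summary figures of the petiole-length data. The sketch below is illustrative only: the function name `skewness_measures` is an assumption, the formulas are the standard ones for the mode-based, median-based, quartile-based and moment-based coefficients, and the first quartile is taken as 4.533 cm, the value computed in Example 3.

```python
def skewness_measures(mean, median, mode, q1, q3, s, m3):
    """Return (Sk1, Sk2, Sk3, Sk4) for a distribution with s > 0."""
    if s <= 0 or q3 <= q1:
        raise ValueError("need s > 0 and q3 > q1")
    sk1 = (mean - mode) / s                        # mode-based measure
    sk2 = 3 * (mean - median) / s                  # median-based measure
    sk3 = ((q3 - median) - (median - q1)) / (q3 - q1)   # quartile-based measure
    sk4 = m3 / s ** 3                              # moment-based measure
    return sk1, sk2, sk3, sk4

sk1, sk2, sk3, sk4 = skewness_measures(
    mean=5.281, median=5.359, mode=5.607,
    q1=4.533, q3=6.235, s=1.456, m3=-1.761,
)
print(round(sk1, 3), round(sk2, 3), round(sk3, 3), round(sk4, 3))
```

The quartile-based coefficient depends only on the middle half of the data, so it can disagree in sign with the other three when the asymmetry sits mainly in the tails.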
Now here is an exercise for you.

E6) Show that, for a distribution symmetrical about a, the mean as well as the
median is a and the odd-order central moments are all equal to zero.
Hint : You may take the values of x to be, say $a \pm h_1, a \pm h_2, \ldots, a \pm h_k$
($h_i \ge 0$ for each i) with frequencies $f_i$ for $a - h_i$ and also for $a + h_i$.

Now that we have seen how to measure the skewness of a frequency distribution, let
us talk about its kurtosis.

3.4 KURTOSIS
We now focus attention on another feature of a frequency distribution that
determines the shape of the distribution. It is the degree of steepness or pointedness
of the distribution or, to use a Greek word, the kurtosis of the distribution. Some
distributions are flat-topped; some are highly peaked; most distributions will be in
between these two extreme types, not too peaked and not too flat-topped either.
In Fig. 3, we have the frequency curves of a distribution that is highly peaked, one
that is of moderate kurtosis and a third one which is rather flat-topped.

Fig. 3 :Three symmetrical distributions with same mean and s.d. but of varying kurtosis.

It has been observed that for two distributions having the same dispersion and the
same degree of skewness, the one with higher kurtosis usually has higher fourth
powers of deviations from the mean and hence a higher value of $m_4$. This observation
is used to define a measure of kurtosis (under the assumption that s > 0) as follows :
\[ b_2 = \frac{m_4}{s^4} = \frac{m_4}{m_2^2} . \]

The division by s4 makes the measure free of units. It also ensures that the measure
takes into account that part of the peakedness of the distribution which is
independent of (or is in addition to) the part that is due to the variance.
For a normal distribution (about which you will learn in Block 3), $b_2 = 3$. This value
is taken as a standard against which the kurtosis of other distributions is judged. Any
distribution with $b_2 = 3$ is called mesokurtic (i.e., of moderate kurtosis); one with
$b_2 > 3$ is said to be leptokurtic, while one with $b_2 < 3$ is said to be platykurtic. (Here
meso means moderate, lepto means thin and platy means flat.) Thus, in
Fig. 3, (i) is leptokurtic, (ii) is mesokurtic, while (iii) is platykurtic.
For any univariate distribution with s > 0, we have
$b_2 \ge 1$.

Let's prove this.

Proof : Let $u_i = \dfrac{x_i - \bar{x}}{s}$ for each i.

Then $\sum_{i=1}^{n} u_i = \frac{1}{s}\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
and $\sum_{i=1}^{n} u_i^2 = \frac{1}{s^2}\sum_{i=1}^{n} (x_i - \bar{x})^2 = n$.

Also, we must have
$\sum_{i=1}^{n} (u_i^2 - 1)^2 \ge 0$, since the L.H.S. is a sum of squares.

This means $\sum_{i=1}^{n} u_i^4 - 2\sum_{i=1}^{n} u_i^2 + n \ge 0$,
or $\frac{1}{s^4} \times n m_4 - 2n + n \ge 0$,
or $n(b_2 - 1) \ge 0$.
Since n > 0, this implies that $b_2 - 1 \ge 0$, so that the result is established.
Note that we have $b_2 = 1$ iff $n(b_2 - 1) = 0$.
So we can also say that $b_2 = 1$ iff $\sum_{i=1}^{n} (u_i^2 - 1)^2 = 0$, i.e.,
iff $u_i^2 = 1$ for each i,
i.e., iff $x_i = \bar{x} \pm s$ for each i.
Thus, the coefficient of kurtosis $b_2 = 1$, iff the variable x can assume just two values
with equal frequencies (so that the mean may be exactly midway between the two
values).
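The lower bound just proved can be seen at work on small data sets. In the sketch below, the function name `kurtosis_b2` and both data sets are hypothetical: the first takes just two values with equal frequencies, so the coefficient should come out as exactly 1, and the second is an evenly spread set for which the coefficient is larger than 1 but still below 3.

```python
def kurtosis_b2(xs):
    """Coefficient of kurtosis m4 / m2^2 for raw data with positive variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0:
        raise ValueError("b2 is undefined when all values are equal")
    return m4 / m2 ** 2

two_point = [3.0, 7.0] * 5                  # two values, equal frequencies
evenly_spread = [1.0, 2.0, 3.0, 4.0, 5.0]   # flat-topped toy data

b2_two_point = kurtosis_b2(two_point)
b2_spread = kurtosis_b2(evenly_spread)
print(b2_two_point, b2_spread)
```

For the second data set the mean is 3, the second central moment is 2 and the fourth is 6.8, so the coefficient is 6.8/4 = 1.7, which would be classed as platykurtic.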
Now let's calculate the coefficient of kurtosis for our favourite distribution.

Example 6 : For the frequency distribution of petiole length, we have, from Example 2,
$m_2$ = (0.8)^2 x 3.3133 = 2.1205 (cm)^2, $m_4$ = 15.535 (cm)^4.
Hence, the kurtosis coefficient $b_2$ for this distribution is given by
$b_2$ = 15.535/(2.1205)^2 = 15.535/4.4965
= 3.455.
So, we find that this distribution is slightly leptokurtic.
Try to do this exercise now.

E7) Comment on the skewness and kurtosis of the age-distribution of the Indian
population (1981 census) given in E3) in Unit 2, by evaluating appropriate
coefficients and also by considering the histogram of the distribution.

Let us now summarise the points covered in this unit.

3.5 SUMMARY
In this unit, you have seen
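1) what is meant by moments and quantiles of different orders; the rth moment about A is
\[ m'_r = \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - A)^r \] for data in the form of a frequency distribution
and
\[ m'_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)^r \] for raw data.
2) how central moments, i.e., moments about $\bar{x}$, are related to moments about any A:
\[ m_r = m'_r - \binom{r}{1} m'_{r-1} m'_1 + \binom{r}{2} m'_{r-2} (m'_1)^2 - \cdots + (-1)^r (m'_1)^r . \]
3) when weighting of the observations would be appropriate for the computation
of mean,
4) what is meant by skewness and kurtosis :
skewness is the departure from symmetry and kurtosis gives the degree of flatness
of a frequency distribution.
5) how to compute measures of skewness and kurtosis of a frequency distribution :
Measures of Skewness :
\[ Sk_1 = \frac{\text{mean} - \text{mode}}{s}, \quad Sk_2 = \frac{3(\text{mean} - \text{median})}{s}, \quad Sk_3 = \frac{(q_3 - q_2) - (q_2 - q_1)}{(q_3 - q_2) + (q_2 - q_1)}, \quad Sk_4 = \frac{m_3}{s^3}, \]
provided $s \ne 0$.

Measure of kurtosis :
\[ b_2 = \frac{m_4}{s^4} \quad (s \ne 0). \]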

3.6 SOLUTIONS AND ANSWERS


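E1) $v'_1 = \frac{1}{n}\sum_{i=1}^{n} u_i = \frac{1}{n}\sum_{i=1}^{n} \frac{x_i - A}{c} = \frac{\bar{x} - A}{c}$, so that $\bar{x} = A + c\,v'_1$.

Since $x_i - \bar{x} = c\,(u_i - v'_1)$, we have
\[ m_2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = c^2 \cdot \frac{1}{n}\sum_{i=1}^{n} (u_i - v'_1)^2 = c^2\,[v'_2 - (v'_1)^2] . \]

Similarly, solve for $m_3$ and $m_4$.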

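E2) $x_i - A = (x_i - \bar{x}) + (\bar{x} - A)$
$= (x_i - \bar{x}) + d$, say.

$\therefore (x_i - A)^r = (x_i - \bar{x})^r + \binom{r}{1}(x_i - \bar{x})^{r-1} d + \binom{r}{2}(x_i - \bar{x})^{r-2} d^2 + \cdots + d^r .$

On summing over i and dividing the result by n, and noting that $m_1 = 0$, we get
\[ m'_r = m_r + \binom{r}{1} m_{r-1} d + \binom{r}{2} m_{r-2} d^2 + \cdots + \binom{r}{r-2} m_2 d^{r-2} + d^r , \]
where $d = \bar{x} - A$.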
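E3) Suppose there are k sets of data, the jth having $n_j$ observations, mean $\bar{x}_j$
and central moments $m_{2j}$, $m_{3j}$, $m_{4j}$, j = 1, 2, ..., k. Let $n = \sum_j n_j$, let $\bar{x}$ be the
composite mean and write $d_j = \bar{x}_j - \bar{x}$.
Then
\[ n\,m_3 = \sum_j \sum_i (x_{ji} - \bar{x})^3 = \sum_j \Big[ \sum_i (x_{ji} - \bar{x}_j)^3 + 3 d_j \sum_i (x_{ji} - \bar{x}_j)^2 + 3 d_j^2 \sum_i (x_{ji} - \bar{x}_j) + n_j d_j^3 \Big] \]
\[ = \sum_j n_j \,( m_{3j} + 3 m_{2j} d_j + d_j^3 ), \]
since $\sum_i (x_{ji} - \bar{x}_j) = 0$ for each j.

Similarly,
\[ n\,m_4 = \sum_j n_j \,( m_{4j} + 4 m_{3j} d_j + 6 m_{2j} d_j^2 + d_j^4 ) . \]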

E4) $q_1$ = 9.41 years, $q_2$ = 20.48 years, $q_3$ = 37.73 years.

E5) Weighted mean = (10 x 85 + 10 x 76 + 10 x 82 + 70 x 79)/100 = 7960/100 = 79.6

E6) The first moment about a is
\[ m'_1 = \frac{1}{n}\Big[ \sum_{i=1}^{k} [(a + h_i) - a]\, f_i + \sum_{i=1}^{k} [(a - h_i) - a]\, f_i \Big] = \frac{1}{n}\Big[ \sum_{i=1}^{k} (h_i - h_i)\, f_i \Big] = 0 . \]
Hence the mean is a. In the same way, for any odd r,
$m_r = \frac{1}{n}\sum_{i=1}^{k} [h_i^r + (-h_i)^r]\, f_i = 0$.

If a is not a possible value of x, then the total frequency of values less than a
equals the total frequency of values exceeding a. Hence a is the median. If a is
a possible value (occurring with frequency $f_0$, say), then also, total frequency
below a equals total frequency above a, hence a is again the median.
E7) The moments about 27.5 (years) are
$m'_1$ = -0.488 x 5,
$m'_2$ = 14.4954 x 5^2,
$m'_3$ = 19.0042 x 5^3,
$m'_4$ = 464.9984 x 5^4.
Hence $\bar{x}$ = 25.06, $m_2$ = 14.2573 x 5^2, $m_3$ = 39.9931 x 5^3, $m_4$ = 522.6385 x 5^4,
$\sqrt{b_1}$ = 0.743, $b_2$ = 2.571.
The distribution is moderately skew and platykurtic.
This is also indicated by the histogram.
