
UNIT IV - MEASURES OF DISPERSION AND SKEWNESS

DEFINITION:

“The degree to which numerical data tend to spread about an average value is called the variation
or dispersion of the data.” - Spriegel.

Measures of dispersion are also called “averages of the second order” for the reason that these
measures give an average of the differences of the various items from an average.

Objectives:

1. To determine the dependability of an average: measures of variation reveal the extent to
which an average is representative of the mass. When variation is small, the average is
reliable, and vice versa.
2. To serve as a basis for control of the variability: measures of dispersion help to reveal the
cause and effect relationship, thus acting as a basis for control. For instance, variations in blood
pressure, body temperature, etc., act as a basis for reasoning and control.
3. To compare two or more series with regard to their variability: measures of dispersion
make possible a comparative study of two or more sets of data with regard to the degree of
consistency, uniformity, reliability, etc. A low degree of variation means more consistency
and a high degree of variation means lack of uniformity.
4. To facilitate the use of other statistical techniques: measures of dispersion are essential
and used for computing other statistical measures like correlation analysis, regression
analysis, testing hypothesis, time series etc.

Characteristics of an ideal measure of dispersion.


1. It should be simple to understand.
2. It should be easy to calculate.
3. It should be rigidly defined.
4. It should be based on each and every item of distribution.
5. It should be suitable for further algebraic and arithmetical manipulation.
6. It should have sampling stability.
7. It should not be unduly affected by extreme items.

Methods of studying variation:

1. The range.
2. The interquartile range and the quartile deviation.
3. The mean deviation.
4. The standard deviation.
5. The Lorenz curve.

Of the above, the range, quartile deviation, mean deviation and standard deviation are
mathematical methods, and the Lorenz curve is a graphical method.

RANGE:
The range is the simplest measure of dispersion. When the data are arranged in order, the
difference between the largest value and the smallest value in the group is called the
range:
Range = L – S
L = largest value
S = smallest value
This is the range as an absolute measure. As a relative measure, the coefficient of range is
given as:

Coefficient of range = Absolute range ÷ Sum of the two extremes

Coefficient of range = (L – S) ÷ (L + S)
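
The two range formulas can be checked with a short Python sketch. The function names are illustrative only (they are not part of these notes), and the data are the six values from problem 1(a) below.

```python
def value_range(values):
    """Absolute range: largest value minus smallest value (L - S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative range: (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


data = [200, 250, 222, 275, 280, 230]
print(value_range(data))                     # 80
print(round(coefficient_of_range(data), 4))  # 0.1667
```

Only the two extreme values enter the calculation, which is why the range is easy to compute but sensitive to extreme items.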

1. Range – individual observations:


a. Find the range and its coefficient from the following data.
200,250,222,275,280,230
b. Compute the range and the coefficient of range of the series and state which series is
more dispersed and which series is more uniform.
Series Value of variables
A 9 10 12 8 7 4
B 12 10 6 5 13 14
C 1 8 9 18 12 2

2. Range – discrete series.


a. From the following data compute range and its co-efficient
X 10 20 30 40 50 60 70 80
F 8 4 31 50 67 36 4 10

3. Range – continuous series


a. Compute range and coefficient of range
X 110-120 120-130 130-140 140-150 150-160 160-170
F 22 34 48 59 36 15

Uses of range:
The range is very useful in quality control, in weather forecasting and in studying the
fluctuations in share prices and stock market indices.

Merits of range:
a. Simplest measure of dispersion.
b. Simplest to understand and to compute.
Demerits of range:
a. It is largely affected by extreme values.
b. It is not based on each and every observation.
c. It is not amenable to further mathematical treatment.
d. Not suitable for open-end class distribution.

INTER-QUARTILE RANGE AND QUARTILE DEVIATION

The quartiles are the points that divide the whole data into
approximately four equal parts. There are three quartiles: Q1, Q2 (the median) and Q3.
The inter-quartile range is the difference between the third quartile and the first quartile.

Inter-quartile range = Q3 – Q1

When the inter-quartile range is halved to give the semi-inter-quartile range, it is called the
quartile deviation (Q.D.):

Q.D. = (Q3 – Q1) ÷ 2

Quartile deviation gives the average amount by which the two quartiles differ from the median.
Quartile deviation is an absolute measure of dispersion. Its relative measure is the coefficient of
quartile deviation. It is given as:

Coefficient of Q.D. = (Q3 – Q1) ÷ (Q3 + Q1)
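
A minimal Python sketch of these measures for individual observations follows. It assumes the common textbook rule that the k-th quartile is the value at position k(N + 1) ÷ 4 in the sorted data, with linear interpolation between neighbouring items; that rule is not stated in these notes, and other conventions give slightly different quartiles. The function name is illustrative, and the data are the wages from problem 1(a) below.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) at position k*(N+1)/4 of the sorted data.

    Assumption: positional rule with linear interpolation between
    neighbouring items; other quartile conventions exist.
    """
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1) / 4            # 1-based position in the sorted data
    lower = int(pos)                 # whole part of the position
    frac = pos - lower               # fractional part used for interpolation
    lower = min(max(lower, 1), n)    # keep the index inside the data
    upper = min(lower + 1, n)
    return data[lower - 1] + frac * (data[upper - 1] - data[lower - 1])


wages = [200, 220, 250, 150, 175, 260, 190]
q1, q3 = quartile(wages, 1), quartile(wages, 3)

iqr = q3 - q1                        # inter-quartile range
qd = iqr / 2                         # quartile deviation
coeff_qd = (q3 - q1) / (q3 + q1)     # coefficient of Q.D.

print(q1, q3)                        # 175.0 250.0
print(iqr, qd)                       # 75.0 37.5
print(round(coeff_qd, 4))            # 0.1765
```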

1. Q.D – individual series


a. Calculate the value of quartile deviation and its coefficient from the following data:
Wages: 200, 220, 250, 150, 175, 260, 190
b. Calculate quartile deviation and its coefficient
Weight: 55, 60, 61, 63, 68, 69, 71, 72, 73, 75.

2. Q.D – Discrete series


a. Compute the inter-quartile range, Q.D. and its coefficient
X 10 20 30 40 50 60 70 80
F 2 5 12 28 14 7 3 1

3. Q.D – continuous series


a. Compute Q.D and its co-efficient from the following data.
X 10-20 20-30 30-40 40-50 50-60 60-70 70-80
F 12 25 55 120 60 30 13

Merits
1. It is easy to compute and easy to understand.
2. It is based on the central 50% of the observations. Therefore extremes are avoided.
3. It is useful in open-end distribution.

Demerits.
1. It ignores 50% of items. Therefore it is not based on all items.
2. It is not amenable to further mathematical treatment.
3. Quartile deviation is not computed from any central value. Therefore some experts argue
that Q.D is not a measure of dispersion.
4. Q.D is not affected by the change in the distribution outside the quartiles.

Mean Deviation Or Average Deviation.

Mean deviation or average deviation is the average difference between the items in a distribution
and the median or mean or mode of that series.

In simple words, the arithmetic average of the deviations (ignoring signs) from the mean, median
or mode is known as mean deviation.

Formulas:

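Individual observations: M.D. = ∑|D| ÷ N

Discrete series: M.D. = ∑f|D| ÷ N

Continuous series: M.D. = ∑f|D| ÷ N

Here |D| is the deviation of an item (or of a class midpoint, in a continuous series) from the
chosen average, ignoring signs, and N is the total number of observations (∑f in a frequency
distribution).

Coefficient of M.D. = Mean Deviation ÷ Mean (if M.D. is based on the mean)

Coefficient of M.D. = Mean Deviation ÷ Median (if M.D. is based on the median)

Coefficient of M.D. = Mean Deviation ÷ Mode (if M.D. is based on the mode)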
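
As a sketch of the individual-observation formula, the Python snippet below computes the mean deviation about the mean and about the median, with the matching coefficients. The helper name mean_deviation is illustrative, and the data are the ten marks from problem 1 below.

```python
import statistics


def mean_deviation(values, centre):
    """Average of the absolute deviations |D| from the chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)


marks = [60, 55, 75, 80, 50, 44, 48, 56, 78, 54]

mean = statistics.mean(marks)        # 60
median = statistics.median(marks)    # 55.5

md_mean = mean_deviation(marks, mean)
md_median = mean_deviation(marks, median)

print(md_mean, round(md_mean / mean, 4))        # 10.6 0.1767
print(md_median, round(md_median / median, 4))  # 9.8 0.1766
```

The deviation about the median (9.8) is smaller than the deviation about the mean (10.6), which is the general behaviour: the sum of absolute deviations is least when taken from the median.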

Problems:

1. The following are the marks obtained by 10 students in an examination. Find the mean
deviation about both the mean and the median, and also find the coefficient of M.D.

Students Roll No. 1 2 3 4 5 6 7 8 9 10
Marks 60 55 75 80 50 44 48 56 78 54

2. Find out the value of Mean Deviation and its coefficient from the following:
45 , 70 , 78 , 52 , 75 , 83 , 110 , 98 , 64.

3. Calculate Mean Deviation and its coefficient from the following data:

Heights (in inches) : 61 62 63 64 65 66
No. of students : 8 18 45 15 10 4

4. Calculate the Mean Deviation and its coefficient from mean as well as median of the
following data:
X 115 125 135 145 155 165 175
F 31 48 72 116 60 22 3

5. Calculate the Mean Deviation and its coefficient from both mean and median for the
following:

Marks 0-10 10-20 20-30 30-40 40-50
No. of Students 10 16 30 32 12

6. Compute Mean Deviation about median and the coefficient of mean deviation :

Heights in inches 5.1-6.0 6.1-7.0 7.1-8.0 8.1-9.0 9.1-10.0 10.1-11.0 11.1-12.0
No. of plants 11 27 25 17 15 11 9

7. Compute the Mean Deviation from the following data :

Wages (Rs.) 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
No. of persons 8 10 15 25 20 18 9 5

8. Calculate Mean Deviation from the following data :

Marks (Less than) 10 20 30 40 50 60 70
No. of Students 4 10 20 40 48 55 60

Merits of Mean Deviation :

a) Average deviation is relatively simple to understand and easy to compute.
b) It takes into account each and every item of the data.
c) Compared with standard deviation, it is less affected by the values of extreme items.
d) It is widely used in economics, socio-economic and business fields.

Demerits of Mean Deviation :


a) The major drawback of mean deviation is that it is based on absolute values, i.e., deviations
are calculated ignoring + or – signs. This is not justified mathematically.
b) It is generally not useful for statistical conclusions when deviations are taken from the mode.
Again, if deviations are taken from the median, it suffers from the limitations of the median, such
as the median being positional and not based on all observations.
c) M.D. is not capable of further algebraic treatment.
d) Because of the superiority of standard deviation as a measure of dispersion, mean deviation
is rarely used.

Standard Deviation (Root-Mean-Square Deviation)

Standard Deviation (S.D.) is the square root of the arithmetic mean of the squared deviations of
values from their arithmetic mean. It is generally denoted by the symbol σ (read as sigma).

Formulas:

Individual observations:

Actual Mean Method: σ = √( ∑x² ÷ N ), where x = (X – Mean)
Assumed Mean Method: σ = √( ∑d² ÷ N – (∑d ÷ N)² ), where d = (X – A) and A = assumed mean
Step Deviation Method: σ = √( ∑d'² ÷ N – (∑d' ÷ N)² ) × i, where d' = (X – A) ÷ i,
A = assumed mean and i = common value

Discrete and continuous series (N = ∑f):

Actual Mean Method: σ = √( ∑fx² ÷ N ), where x = (X – Mean), or x = (m – Mean)
[where m = midpoint in continuous series]
Assumed Mean Method: σ = √( ∑fd² ÷ N – (∑fd ÷ N)² ), where d = (X – A), or d = (m – A) in
continuous series
Step Deviation Method: σ = √( ∑fd'² ÷ N – (∑fd' ÷ N)² ) × i, where d' = (X – A) ÷ i in case of
discrete series and d' = (M – A) ÷ i in case of continuous series;
A = assumed mean, i = class interval or common value, M = midpoint
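
The three methods are algebraically equivalent, which the following Python sketch illustrates for a discrete series. The function names are illustrative, the assumed mean A = 40 and common value i = 10 are arbitrary choices made for this example, and the data are taken from problem 7 below.

```python
import math


def sd_actual_mean(xs, fs):
    """sigma = sqrt(sum(f * x^2) / N), with x = X - Mean and N = sum(f)."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n)


def sd_assumed_mean(xs, fs, a):
    """sigma = sqrt(sum(f * d^2) / N - (sum(f * d) / N)^2), with d = X - A."""
    n = sum(fs)
    sum_fd = sum(f * (x - a) for x, f in zip(xs, fs))
    sum_fd2 = sum(f * (x - a) ** 2 for x, f in zip(xs, fs))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


def sd_step_deviation(xs, fs, a, i):
    """Same as the assumed mean method with d' = (X - A) / i, rescaled by i."""
    n = sum(fs)
    sum_fd = sum(f * (x - a) / i for x, f in zip(xs, fs))
    sum_fd2 = sum(f * ((x - a) / i) ** 2 for x, f in zip(xs, fs))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * i


x = [10, 20, 30, 40, 50, 60, 70]
f = [5, 11, 19, 22, 15, 6, 2]

s1 = sd_actual_mean(x, f)
s2 = sd_assumed_mean(x, f, a=40)            # A = 40 is an arbitrary choice
s3 = sd_step_deviation(x, f, a=40, i=10)    # i = 10 is the common difference

print(round(s1, 2), round(s2, 2), round(s3, 2))  # 14.07 14.07 14.07
```

The choice of A and i changes only the size of the intermediate figures, not the result, which is why the shortcut methods are preferred for hand calculation.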
Problems:

1. Calculate S.D. of the following data:


Marks : 35 38 40 43 45 46 50 55

2. Compute S.D. from the following data:


X 10 12 15 18 20 23

3. Calculate S.D. from the following data under assumed mean method :
X 10 12 15 18 20 23

4. Calculate S.D. from the following data:


Runs : 50 98 103 76 5 39 55

5. Compute S.D. from the following data:

X 3 4 5 6 7 8 9
F 3 9 11 14 12 7 4

6. Compute S.D. from the following data:

X 10-20 20-30 30-40 40-50 50-60 60-70


F 1 4 14 8 2 1

7. Compute S.D. from the following data:

X 10 20 30 40 50 60 70
F 5 11 19 22 15 6 2

8. Find S.D. from the following data:

X 14-18 18-22 22-26 26-30 30-34 34-38


F 7 10 12 16 7 8

9. Compute S.D. from the following data ( use Step Deviation Method ) :
Marks : 5 10 15 20 25 30 35 40

10. Compute S.D. from the following data:

X 1000 2000 3000 4000 5000 6000 7000


F 8 17 30 32 16 15 2

11. Calculate S.D. from the following data:

Age 10-20 20-30 30-40 40-50 50-60 60-70 70-80


Persons (in 000s) 2 5 7 11 6 3 1

Coefficient of Variation [ Relative measure of S.D.]

The S.D. is the absolute measure of dispersion. The corresponding relative measure is known as
the Coefficient of Variation (C.V.). It is given as: C.V. = (σ ∕ mean ) × 100.

Note: C.V. is always expressed as a percentage.

Variance

The term variance is used to describe the square of the S.D.

Variance = σ² and therefore σ = √Variance


Problems:

1. The following table shows the scores of two batsmen, Rahul and Raju, in a recent cricket
tournament. Find out who is the better run getter and who is more consistent.

Rahul 103 85 8 50 0 17 28 142 35 49
Raju 47 79 9 105 111 51 67 78 4 28

Rahul

• Mean = 51.7
• Std. Deviation = 43.05
• Coefficient of Variation = Std. Deviation/Mean × 100
= 43.05/51.7 × 100 = 83.27%

Raju

• Mean = 57.9
• Std. Deviation = 35.11
• Coefficient of Variation = Std. Deviation/Mean × 100
= 35.11/57.9 × 100 = 60.64%

• Raju is the better run getter, as the mean of Raju's runs is higher.
• The C.V. of Raju is lower than the C.V. of Rahul; therefore Raju is more consistent than Rahul.
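
The figures in this solution can be reproduced with Python's standard library. This is only a checking sketch: statistics.pstdev is the population standard deviation (divisor N), which matches the definition of σ used in these notes, and the function name is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """C.V. = (sigma / mean) * 100, using the population S.D. (divisor N)."""
    return statistics.pstdev(values) / statistics.mean(values) * 100


rahul = [103, 85, 8, 50, 0, 17, 28, 142, 35, 49]
raju = [47, 79, 9, 105, 111, 51, 67, 78, 4, 28]

for name, scores in (("Rahul", rahul), ("Raju", raju)):
    print(
        name,
        round(statistics.mean(scores), 1),          # mean
        round(statistics.pstdev(scores), 2),        # standard deviation
        round(coefficient_of_variation(scores), 2)  # C.V. in percent
    )
# Rahul 51.7 43.05 83.27
# Raju 57.9 35.11 60.64
```

The higher mean identifies the better run getter and the lower C.V. identifies the more consistent batsman; here both point to Raju.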

2. Find which of the following batsmen is more consistent in scoring. Would you accept
him as a better run getter? Why?

Batsman A 5 7 16 27 39 53 56 61 80 101 105
Batsman B 0 4 16 21 41 43 57 78 83 90 95

3. You are given below the daily wages paid to workers in two factories A and B.
a) Which factory pays higher average wages?
b) In which factory are wages more variable?

Daily wages (in Rs.) 50-60 60-70 70-80 80-90 90-100 100-110
No. of workers A 10 15 22 23 10 5
No. of workers B 7 16 24 23 8 7

4. The coefficients of variation of two series are 60% and 80%. Their S.D.s are 20 and 16
respectively. What are their arithmetic means?
C.V. = S.D./Mean × 100, so Mean = S.D./C.V. × 100
a. Mean = 20/60 × 100 = 33.33
b. Mean = 16/80 × 100 = 20

5. The average marks of 2nd sem B.Com students in Business Statistics of a college increased
from 65 to 68 and the S.D. increased from 4.5 to 5.2. Have the marks in Business Statistics
of that college become more consistent than before?
Ans:
Before:
• Mean = 65
• S.D. = 4.5
• C.V. = 4.5/65 × 100 = 6.92%

After:
• Mean = 68
• S.D. = 5.2
• C.V. = 5.2/68 × 100 = 7.65%
Since the C.V. has increased, the marks have not become more consistent.

6. The mean (M) and S.D. of two brands of bulbs are given below:

Brand A Brand B
M 1000 hours 820 hours
S.D. 100 hours 65 hours
Calculate a measure of relative dispersion for the two brands and interpret the results.

7. Following particulars relate to wage paid by two factories M and N belonging to the same
industry :

Factory M Factory N
No. of workers 856 684
Average wages Rs. 552 Rs. 574
Variance 144 196
a) Which factory pays higher wages ?
b) Which factory has greater variability in wage ?

Solution:

A) Total wages = No. of workers × Average wages
• Factory M = 856 × 552 = Rs. 4,72,512
• Factory N = 684 × 574 = Rs. 3,92,616
• Factory M pays the larger total amount of wages, although the average wage per worker is
higher in Factory N (Rs. 574 against Rs. 552).
B) Coefficient of variation
• Factory M: S.D. = 12 (square root of 144)
• Mean = 552
• C.V. = 12/552 × 100 = 2.174%
• Factory N: S.D. = 14 (square root of 196)
• Mean = 574
• C.V. = 14/574 × 100 = 2.439%
• The C.V. of Factory N is greater; therefore the wages of Factory N are more variable.

8. An organization has two units A and B. An analysis of weekly wages paid to workers
gave the following results.

Unit A Unit B
No. of wage earners 500 670
Average weekly wages (Rs.) 65 72
S.D. (Rs.) 9 9
a) Which unit pays the larger amount as weekly wages?
b) In which unit is there greater variability in wage distribution?
c) Find the combined average wage and the combined S.D. of wages for the
whole organization.
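
Part (c) needs the combined mean and combined S.D. of two groups, which these notes do not state. A sketch under the standard formulas is given below: combined mean = (N1·M1 + N2·M2) ÷ (N1 + N2), and combined σ = √( [N1(σ1² + d1²) + N2(σ2² + d2²)] ÷ (N1 + N2) ), where d1 and d2 are the differences between each group mean and the combined mean. The function names are illustrative.

```python
import math


def combined_mean(n1, m1, n2, m2):
    """Weighted mean of two group means."""
    return (n1 * m1 + n2 * m2) / (n1 + n2)


def combined_sd(n1, m1, s1, n2, m2, s2):
    """Combined S.D. of two groups from their sizes, means and S.D.s."""
    m = combined_mean(n1, m1, n2, m2)
    d1, d2 = m1 - m, m2 - m          # group mean minus combined mean
    total = n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)
    return math.sqrt(total / (n1 + n2))


# Unit A: 500 earners, mean 65, S.D. 9; Unit B: 670 earners, mean 72, S.D. 9
mean_ab = combined_mean(500, 65, 670, 72)
sd_ab = combined_sd(500, 65, 9, 670, 72, 9)

print(round(mean_ab, 2))   # 69.01
print(round(sd_ab, 2))     # 9.64
```

The combined S.D. exceeds 9 even though both units have an S.D. of 9, because the difference between the two unit means adds to the overall spread.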

Skewness:

Skewness means lack of symmetry (balance or equal spread). If a distribution is asymmetrical, it
is called a skewed distribution.

Definition:

“A distribution is said to be ‘skewed’ when the mean and the median fall at different points in
the distribution, and the balance (or centre of gravity) is shifted to one side or the other, to left
or right.” - Garret

“Skewness refers to the asymmetry or lack of symmetry in the shape of a frequency
distribution.” - Morris Hamburg

Karl Pearson’s Coefficient of Skewness:

Karl Pearson’s coefficient of skewness, or simply the Pearsonian coefficient of skewness (Skp), is
based on the difference between the mean and the mode. This difference is divided by the standard
deviation to get a relative measure. That is:

Skp = (Mean – Mode) ÷ Standard Deviation

In the absence of mode, the empirical formula [Mode = 3 Median – 2 Mean] is used instead.
Substituting it gives Mean – Mode = 3 (Mean – Median), so in such a case Pearson’s coefficient of
skewness is calculated using the formula:

Skp = 3 (Mean – Median) ÷ Standard Deviation
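
Both forms of the coefficient can be sketched in Python as follows. The function names are illustrative; the raw data are the eight values from problem 1 below, and the summary figures (mean 100, median 90, S.D. 10) are those of Distribution A in problem 6. The standard deviation used is the population S.D. (divisor N), as defined earlier in these notes.

```python
import statistics


def pearson_skewness_mode(mean, mode, sd):
    """Skp = (Mean - Mode) / S.D."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Skp = 3 * (Mean - Median) / S.D., used when the mode is not available."""
    return 3 * (mean - median) / sd


data = [22, 33, 28, 30, 32, 33, 29, 33]
mean = statistics.mean(data)     # 30
mode = statistics.mode(data)     # 33
sd = statistics.pstdev(data)     # population S.D., about 3.54

sk_raw = pearson_skewness_mode(mean, mode, sd)
print(round(sk_raw, 4))          # -0.8485

# From summary figures only: mean 100, median 90, S.D. 10
sk_summary = pearson_skewness_median(100, 90, 10)
print(sk_summary)                # 3.0
```

A negative value, as in the first case, indicates that the mean lies below the mode; a positive value indicates the opposite.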

Problems :

1. From the following data calculate Karl Pearson’s coefficient of skewness:
22 33 28 30 32 33 29 33
2. From the following data calculate Karl Pearson’s coefficient of skewness:
40, 36, 42, 53, 45, 50, 68, 61, 55.
3. The wages paid to 550 workers per day in a manufacturing unit is as follows:

Wages (Rs.) 100 200 300 400 500 600 700 800 900
No. of Workers 35 40 48 100 125 87 43 22 50
Calculate Karl Pearson’s coefficient of Skewness .
4. Calculate Karl Pearson’s coefficient of skewness from the following table:

Lifetime (in hours) 300-400 400-500 500-600 600-700 700-800 800-900 900-1000
No. of bulbs 25 56 60 75 48 30 15

5. Find Karl Pearson’s coefficient of skewness from the following table:

Wages (Rs.) 270-280 280-290 290-300 300-310 310-320 320-330 330-340 340-350
No. of workers 12 18 35 42 50 45 20 8

6. Consider the following data :

Distribution A Distribution B
Mean 100 90
Median 90 80
S.D. 10 10
State whether the following statements are true or false.
a) Distribution A has the same degree of variation as Distribution B.
b) Both distributions have the same degree of skewness.
Solution:

a) Coefficient of variation
• Distribution A: C.V. = 10/100 × 100 = 10%
• Distribution B: C.V. = 10/90 × 100 = 11.11%
• False: the degrees of variation of distributions A and B are not the same.
b) Coefficient of skewness
• Distribution A: Skp = 3 (100 – 90)/10 = 3
• Distribution B: Skp = 3 (90 – 80)/10 = 3
• True: both distributions have the same degree of skewness.

7. In a certain distribution, Mean = 45, Mode = 44 and Skp = 0.4. Find the S.D.
8. Calculate the coefficient of skewness. Mean = 59.5; Variance = 110; Median = 55.7.
9. Given C.V. = 30%; Mean = 25; Mode = 16. Find Skp.
10. In a moderately skewed frequency distribution the mean is 20 and the median is 18.5. If
the coefficient of variation is 30%, find the Pearsonian coefficient of skewness of the
distribution.
11. The following information was obtained from the record of a factory relating to wages :
Arithmetic Mean = Rs. 560
Median = Rs. 565
S.D. = 25.76.
Calculate the coefficient of variation and skewness.
