Measures of Variation
Group: 5
Name
Jannatul Fardous
Shurovi
Israt Jahan Mou
Momena Khatun
Md. Nahidul Hasan
Sadia Maliha Trisha
Md. Uzzal Miah
Md. Khairul islam
Md. Moin khan
Md. Nazmul Hasan
Roll
15132537
15132538
15132539
15132540
15132541
15132542
15132543
15132544
15132545
Table of content
Measures of variation
Introduction
Significance of measures of variation
Properties of a good measure of variation
Measuring variation
Range
Definition of range
Example of range
Merits of range
Limitations of range
Uses of range
Quartile Deviation
Definition of quartile deviation
Example of quartile deviation
Merits of quartile deviation
Limitation of quartile deviation
Average deviation
Definition of average deviation
Example of average deviation
Merits of average deviation
Limitation of average deviation
Standard deviation
Definition of standard deviation
Example of standard deviation
Merits of standard deviation
Limitation of standard deviation
Mathematical properties of standard deviation
Lorenz curve
Which measure of variation to use
Introduction
Measures of central tendency locate the center of a distribution, but they do not reveal
how the items are spread out on either side of that center. This characteristic of a
frequency distribution is commonly referred to as variation. The items in a series are
rarely all equal; the values differ from one another, and the degree of this difference
is evaluated by various measures of variation. Small variation indicates high uniformity
of the items, while large variation indicates less uniformity.
Measuring variation
The following are the important methods of studying variation.
1. Range,
2. The Inter-quartile range or Quartile deviation,
3. The Average deviation,
4. The Standard deviation,
5. The Lorenz curve.
Of these, the first four are mathematical methods and the last is a graphical one.
Range
Range is the simplest and quickest measure of dispersion. Being a positional measure, it
takes into account only the difference between the highest and the lowest observation in
a data series.
So, R = L - S
Here, R = Range
L = Largest value.
S = Smallest value.
Coefficient of range
The relative measure corresponding to the range is called the coefficient of range.
Coefficient of range = $\frac{L - S}{L + S}$
Example of range
Example 1: Find the value of range and its co-efficient for the following data.
7, 9, 6, 8, 11, 10, 4
Solution:
L=11, S = 4.
Range = L - S = 11 - 4 = 7
Coefficient of range = $\frac{L - S}{L + S} = \frac{11 - 4}{11 + 4} = \frac{7}{15} = 0.467$
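The two quantities in Example 1 can be checked with a minimal Python sketch. The function names `value_range` and `coefficient_of_range` are illustrative choices, not part of the original text.

```python
def value_range(values):
    """Range R = L - S: largest value minus smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


data = [7, 9, 6, 8, 11, 10, 4]  # data of Example 1
print(value_range(data))                     # 7
print(round(coefficient_of_range(data), 3))  # 0.467
```

Because only the two extreme values enter the calculation, changing any of the middle observations leaves both results unchanged.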
Example 2: L = 75, S = 60.
Coefficient of range = $\frac{L - S}{L + S} = \frac{75 - 60}{75 + 60} = \frac{15}{135} = 0.11$
Example 3 : The yields (kg per plot) of a cotton variety from five plots are 8, 9, 8, 10 and
11. Find the range and coefficient of range.
Solution
L=11, S = 8.
Range = L - S = 11 - 8 = 3
Coefficient of range = $\frac{L - S}{L + S} = \frac{11 - 8}{11 + 8} = \frac{3}{19} = 0.158$
Example 4: The following are the prices of shares of a company from Monday to Saturday.
Day          Price (Rs.)     Day          Price (Rs.)
Monday       200             Thursday     160
Tuesday      210             Friday       220
Wednesday    208             Saturday     250
Calculate the range and the coefficient of range.
Solution: Range = L - S = 250 - 160 = 90
Coefficient of range = $\frac{L - S}{L + S} = \frac{250 - 160}{250 + 160} = \frac{90}{410} = 0.22$
No. of cos.
8
10
12
No. of cos.
8
4
Solution: Range = L - S = 60 - 10 = 50
Coefficient of range = $\frac{L - S}{L + S} = \frac{60 - 10}{60 + 10} = \frac{50}{70} = 0.714$
Merits of range
1. Simple to compute.
2. Easy to understand.
3. Quickly calculated.
Limitations of range
1. It is very much affected by the extreme items.
2. It is based on only two extreme observations.
3. It cannot be calculated from open-end class intervals.
4. It is not suitable for mathematical treatment.
5. It is a very rarely used measure.
Uses of range
1. Quality control.
2. Studying fluctuations in share prices.
3. Weather forecasting.
Quartile Deviation
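Quartile deviation is half of the difference between the third and the first quartiles.
Hence, it is also called the semi-interquartile range. The interquartile range and the
quartile deviation are better measures of variation in a distribution than the range.
They ignore the lowest 25 percent and the highest 25 percent of the distribution and use
only the middle 50 percent. In other words, the interquartile range denotes the
difference between the third quartile and the first quartile.
Symbolically, interquartile range = Q3 - Q1
The interquartile range is often reduced to the semi-interquartile range, or quartile
deviation:
Quartile deviation (Q.D.) = $\frac{Q_3 - Q_1}{2}$
Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
For grouped data each quartile is found by interpolation within its class, for example
$Q_1 = l + \frac{N/4 - p.c.f.}{f} \times i$, where l is the lower limit of the quartile
class, p.c.f. the cumulative frequency of the preceding class, f the frequency of the
quartile class and i the class width.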
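Cumulative frequencies (c.f.): 8, 24, 63, 121, 181, 221, 243, 258, 273, 282, 292
N = 292
Median = size of the N/2 th observation = 292/2 = 146th observation
Median = $l + \frac{N/2 - p.c.f.}{f} \times i = 1410 + \frac{146 - 121}{60} \times 20 = 1410 + 8.333 = 1418.333$
Q1 = size of the N/4 th observation = 292/4 = 73rd observation
Q1 = $l + \frac{N/4 - p.c.f.}{f} \times i = 1390 + \frac{73 - 63}{58} \times 20 = 1390 + 3.448 = 1393.448$
Q3 = size of the 3N/4 th observation = (3 × 292)/4 = 219th observation
Q3 = $l + \frac{3N/4 - p.c.f.}{f} \times i = 1430 + \frac{219 - 181}{40} \times 20 = 1430 + 19 = 1449$
Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{1449 - 1393.448}{1449 + 1393.448} = \frac{55.552}{2842.448} = 0.020$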
Example 2: Based on the frequency distribution given below, compute the following
statistical measure to characterize the distribution.
Annual tax paid (Rs. thousand)    No. of managers
5-10                              18
10-15                             30
15-20                             46
20-25                             28
25-30                             20
30-35                             12
35-40                             6
Calculate the inter-quartile range.
Solution:
Calculation of the inter-quartile range
Annual tax paid (Rs. thousand)    No. of managers (f)    c.f.
5-10                              18                     18
10-15                             30                     48
15-20                             46                     94
20-25                             28                     122
25-30                             20                     142
30-35                             12                     154
35-40                             6                      160
                                  N = 160
Inter-quartile range = Q3 - Q1
Q1 = size of the N/4 th observation = 160/4 = 40th observation
Q1 = $l + \frac{N/4 - p.c.f.}{f} \times i = 10 + \frac{40 - 18}{30} \times 5 = 10 + 3.67 = 13.67$
Q3 = size of the 3N/4 th observation = (3 × 160)/4 = 120th observation
Q3 = $l + \frac{3N/4 - p.c.f.}{f} \times i = 20 + \frac{120 - 94}{28} \times 5 = 20 + 4.64 = 24.64$
Inter-quartile range = 24.64 - 13.67 = 10.97
161-165    28    360
N = 360
Q1 = size of the N/4 th observation = 360/4 = 90th observation
Q1 = $l + \frac{N/4 - p.c.f.}{f} \times i = 135.5 + \frac{90 - 75}{48} \times 5 = 135.5 + 1.5625 = 137.06$
Q3 = size of the 3N/4 th observation = (3 × 360)/4 = 270th observation
Q3 = $l + \frac{3N/4 - p.c.f.}{f} \times i = 150.5 + \frac{270 - 234}{55} \times 5 = 150.5 + 3.27 = 153.77$
Quartile deviation = $\frac{Q_3 - Q_1}{2} = \frac{153.77 - 137.06}{2} = 8.355$
Number of consumption (f): 9, 18, 27, 32, 45, 38, 20, 2
c.f.: 9, 27, 54, 86, 131, 169, 189, 191
Classes (last five): 600-800, 800-1000, 1000-1500, 1500-2000, 2000 and above
N = 191
Q1 = size of the N/4 th observation = 191/4 = 47.75th observation
Q1 = $l + \frac{N/4 - p.c.f.}{f} \times i = 400 + \frac{47.75 - 27}{27} \times 200 = 400 + 153.70 = 553.70$
Q3 = size of the 3N/4 th observation = (3 × 191)/4 = 143.25th observation
Q3 = $l + \frac{3N/4 - p.c.f.}{f} \times i = 1000 + \frac{143.25 - 131}{38} \times 500 = 1000 + 161.18 = 1161.18$
Quartile deviation = $\frac{Q_3 - Q_1}{2} = \frac{1161.18 - 553.70}{2} = 303.74$
Example 5: You are given the data pertaining to kilowatt hours of electricity consumed
by 100 persons in Delhi.
Consumption (kilowatt hours)    No. of users
0 but less than 10              6
10 but less than 20             25
20 but less than 30             36
30 but less than 40             20
40 but less than 50             13
Calculate the range within which the middle 50% of the consumers fall.
Solution:
Calculation of the range within which the middle 50% of the consumers fall
Consumption (kilowatt hours)    No. of users (f)    c.f.
0 but less than 10              6                   6
10 but less than 20             25                  31
20 but less than 30             36                  67
30 but less than 40             20                  87
40 but less than 50             13                  100
                                N = 100
Q1 = size of the N/4 th observation = 100/4 = 25th observation
Q1 = $l + \frac{N/4 - p.c.f.}{f} \times i = 10 + \frac{25 - 6}{25} \times 10 = 10 + 7.6 = 17.6$
Q3 = size of the 3N/4 th observation = (3 × 100)/4 = 75th observation
Q3 = $l + \frac{3N/4 - p.c.f.}{f} \times i = 30 + \frac{75 - 67}{20} \times 10 = 30 + 4 = 34$
Range within which the middle 50% of the consumers fall = 34 - 17.6 = 16.4
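The interpolation used for Q1 and Q3 in the grouped examples can be sketched in Python as follows. This is a minimal illustration that assumes continuous classes listed in ascending order; the function name `grouped_quantile` is hypothetical.

```python
def grouped_quantile(classes, position):
    """Interpolate the value at a cumulative position in grouped data.

    classes:  list of (lower_limit, upper_limit, frequency), ascending.
    position: N/4 for Q1, N/2 for the median, 3N/4 for Q3.
    """
    cumulative = 0  # cumulative frequency of the preceding classes (p.c.f.)
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            width = upper - lower  # class width i
            return lower + (position - cumulative) / freq * width
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


# Data of Example 5 (electricity consumption of 100 persons)
classes = [(0, 10, 6), (10, 20, 25), (20, 30, 36), (30, 40, 20), (40, 50, 13)]
n = sum(freq for _, _, freq in classes)
q1 = grouped_quantile(classes, n / 4)
q3 = grouped_quantile(classes, 3 * n / 4)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))  # 17.6 34.0 16.4
```

Half of the printed difference gives the quartile deviation, here 8.2.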
Average Deviation:
The range and the quartile deviation are not based on all observations. They are
positional measures of dispersion and do not show the scatter of the observations around
an average. The average deviation is a measure of dispersion based on all items in a
distribution.
Average deviation is the average amount of scatter of the items in a distribution from
either the mean or the median, ignoring the signs of the deviations.
Mathematically, the following formulas represent the concept of A.D.
Case 1: Ungrouped data series;
A.D. (from the median) = $\frac{\sum |X - Med|}{N}$
A.D. (from the mean) = $\frac{\sum |X - \bar{X}|}{N}$
If the average deviation has been computed from the median, the coefficient of average
deviation is obtained by dividing the average deviation by the median.
Coefficient of average deviation = $\frac{A.D.}{Median}$
If the average deviation has been computed from the mean, the coefficient of average
deviation is obtained by dividing the average deviation by the mean.
Coefficient of average deviation = $\frac{A.D.}{Mean}$
Solution:
Calculation of average deviation
Branch 1 (Med = 4400)
Income (Rs.)    |X - Med|
4000            400
4200            200
4400            0
4600            200
4800            400
N = 5           Σ|X - Med| = 1200
Branch 2 (Med = 4400)
Income (Rs.)    |X - Med|
3000            1400
4000            400
4200            200
4400            0
4600            200
4800            400
5800            1400
N = 7           Σ|X - Med| = 4000
Branch 1:
A.D. = $\frac{\sum |X - Med|}{N} = \frac{1200}{5} = 240$
Coeff. of A.D. = $\frac{A.D.}{Median} = \frac{240}{4400} = 0.055$
Branch 2:
A.D. = $\frac{\sum |X - Med|}{N} = \frac{4000}{7} = 571.43$
Coeff. of A.D. = $\frac{A.D.}{Median} = \frac{571.43}{4400} = 0.13$
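The branch calculation can be reproduced with a short Python sketch. The helper name `average_deviation` is an assumption made for illustration; deviations are taken from the median unless another center is supplied.

```python
from statistics import median


def average_deviation(values, center=None):
    """Mean of the absolute deviations from `center` (the median by default)."""
    if center is None:
        center = median(values)
    return sum(abs(x - center) for x in values) / len(values)


branch_1 = [4000, 4200, 4400, 4600, 4800]
branch_2 = [3000, 4000, 4200, 4400, 4600, 4800, 5800]

for name, incomes in (("Branch 1", branch_1), ("Branch 2", branch_2)):
    ad = average_deviation(incomes)
    coefficient = ad / median(incomes)
    print(name, round(ad, 2), round(coefficient, 3))
```

Both branches have the same median, yet the larger coefficient for Branch 2 shows that its incomes are more widely scattered.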
Solution:
Sales (in thousand)   m.p. X   No. of days f   d = (X-35)/10   fd    |X - Mean|   f|X - Mean|
10-20                 15       3               -2              -6    18           54
20-30                 25       6               -1              -6    8            48
30-40                 35       11              0               0     2            22
40-50                 45       3               1               3     12           36
50-60                 55       2               2               4     22           44
                               N = 25                          Σfd = -5           Σf|X - Mean| = 204
$\bar{X} = A + \frac{\sum fd}{N} \times i = 35 - \frac{5}{25} \times 10 = 33$
A.D. = $\frac{\sum f|X - \bar{X}|}{N} = \frac{204}{25} = 8.16$
Example 3: Calculate the mean deviation for the following frequency distribution.
No. of colds experienced in 12 months    No. of persons
0                                        15
1                                        46
2                                        91
3                                        162
4                                        110
5                                        95
6                                        82
7                                        26
8                                        13
9                                        2
Solution:
No. of colds (X)   No. of persons (f)   d = X - A (A = 5)   fd      |X - Mean|   f|X - Mean|
0                  15                   -5                  -75     3.78         56.70
1                  46                   -4                  -184    2.78         127.88
2                  91                   -3                  -273    1.78         161.98
3                  162                  -2                  -324    0.78         126.36
4                  110                  -1                  -110    0.22         24.20
5                  95                   0                   0       1.22         115.90
6                  82                   1                   82      2.22         182.04
7                  26                   2                   52      3.22         83.72
8                  13                   3                   39      4.22         54.86
9                  2                    4                   8       5.22         10.44
                   N = 642                                  Σfd = -785           Σf|X - Mean| = 944.08
$\bar{X} = A + \frac{\sum fd}{N} \times i = 5 + \frac{-785}{642} \times 1 = 3.78$
M.D. = $\frac{\sum f|X - \bar{X}|}{N} = \frac{944.08}{642} = 1.47$
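For a frequency distribution, the mean and the mean deviation about it can be computed directly from the values and their frequencies. The sketch below is a minimal illustration; the function name `mean_deviation` is an assumption, and it works with the unrounded mean, so intermediate sums may differ slightly from a hand calculation that rounds the mean to 3.78.

```python
def mean_deviation(values, freqs):
    """Return (mean, mean deviation about the mean) of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    return mean, md


colds = list(range(10))                                # X = 0, 1, ..., 9
persons = [15, 46, 91, 162, 110, 95, 82, 26, 13, 2]    # f
mean, md = mean_deviation(colds, persons)
print(round(mean, 2), round(md, 2))  # 3.78 1.47
```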
STANDARD DEVIATION
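It is a measure of spread or variability in the sample. It is defined as the square root of
the arithmetic mean of the squared deviations of the individual values from their mean.
Mathematically, the following formulas represent the concept of S.D.
Case 1: Ungrouped data series;
There are two formulas: 1) deviations taken from the actual mean and
2) deviations taken from an assumed mean.
1) Actual mean: $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$
2) Assumed mean: $\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$, where $d = X - A$
Case 2: Grouped data series;
1) Actual mean: $\sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}}$, where $\bar{X} = \frac{\sum fX}{N}$
2) Assumed mean: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$, where $d = \frac{X - A}{i}$
Coefficient of variation
The standard deviation discussed so far is an absolute measure of variation. The
corresponding relative measure is known as the coefficient of variation.
C.V. = $\frac{\sigma}{\bar{X}} \times 100$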
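Solution:
Worker   Weekly wages X   d = X - A (A = 1320)   d²
A        1320             0                      0
B        1310             -10                    100
C        1315             -5                     25
D        1322             2                      4
E        1326             6                      36
F        1340             20                     400
G        1325             5                      25
H        1321             1                      1
I        1320             0                      0
J        1331             11                     121
N = 10                    Σd = 30                Σd² = 712
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{712}{10} - \left(\frac{30}{10}\right)^2} = 7.89$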
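The assumed-mean shortcut for ungrouped data can be sketched in Python as below. The function name `sd_assumed_mean` is illustrative; the wages are the ten values of the example, with the assumed mean A = 1320.

```python
from math import sqrt


def sd_assumed_mean(values, assumed_mean):
    """Standard deviation via deviations d = X - A from an assumed mean A."""
    n = len(values)
    deviations = [x - assumed_mean for x in values]
    sum_d = sum(deviations)
    sum_d2 = sum(d * d for d in deviations)
    return sqrt(sum_d2 / n - (sum_d / n) ** 2)


wages = [1320, 1310, 1315, 1322, 1326, 1340, 1325, 1321, 1320, 1331]
print(round(sd_assumed_mean(wages, 1320), 2))  # 7.89
print(round(sd_assumed_mean(wages, 1300), 2))  # 7.89
```

The second call illustrates that the result does not depend on which value is chosen as the assumed mean; the choice only affects how convenient the arithmetic is.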
Example 2: The following table gives the fluctuations in the prices of shares of a company.
Price: 318, 322, 325, 312, 324, 315, 308, 318
Calculate the mean, standard deviation and coefficient of variation.
Solution: Calculation of mean, standard deviation and coefficient of variation.
X           d = X - A (A = 324)   d²
318         -6                    36
322         -2                    4
325         1                     1
312         -12                   144
324         0                     0
315         -9                    81
308         -16                   256
318         -6                    36
ΣX = 2542   Σd = -50              Σd² = 558
$\bar{X} = \frac{\sum X}{N} = \frac{2542}{8} = 317.75$
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{558}{8} - \left(\frac{-50}{8}\right)^2} = 5.54$
C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{5.54}{317.75} \times 100 = 1.74\%$
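The same three quantities can be obtained with Python's standard `statistics` module. This sketch uses `pstdev`, the population standard deviation (division by N), which matches the formula used in the text.

```python
from statistics import mean, pstdev

prices = [318, 322, 325, 312, 324, 315, 308, 318]

average = mean(prices)
sd = pstdev(prices)            # population S.D.: divides by N, not N - 1
cv = sd / average * 100        # coefficient of variation in percent

print(round(average, 2), round(sd, 2), round(cv, 2))  # 317.75 5.54 1.74
```

Because the C.V. is a percentage, it can be compared across series measured in different units or at different price levels.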
Example 3: Prices of shares of a company.
Price: 2542, 2522, 2534, 2532, 2542, 2530, 2556, 2530
Calculate the mean, standard deviation and coefficient of variation.
Solution: Calculation of mean, standard deviation and coefficient of variation.
X            d = X - A (A = 2542)   d²
2542         0                      0
2522         -20                    400
2534         -8                     64
2532         -10                    100
2542         0                      0
2530         -12                    144
2556         14                     196
2530         -12                    144
ΣX = 20288   Σd = -48               Σd² = 1048
$\bar{X} = \frac{\sum X}{N} = \frac{20288}{8} = 2536$
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{1048}{8} - \left(\frac{-48}{8}\right)^2} = \sqrt{131 - 36} = 9.75$
C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{9.75}{2536} \times 100 = 0.38\%$
Example 4: Blood serum cholesterol levels of 10 persons are as under:
240, 260, 290, 245, 255, 288, 272, 263, 277, 250
Calculate the standard deviation with the help of an assumed mean.
Solution:
Calculation of standard deviation
X           d = X - A (A = 255)   d²
240         -15                   225
260         5                     25
290         35                    1225
245         -10                   100
255         0                     0
288         33                    1089
272         17                    289
263         8                     64
277         22                    484
250         -5                    25
ΣX = 2640   Σd = 90               Σd² = 3526
$\bar{X} = \frac{\sum X}{N} = \frac{2640}{10} = 264$
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{3526}{10} - \left(\frac{90}{10}\right)^2} = \sqrt{352.6 - 81} = 16.48$
C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{16.48}{264} \times 100 = 6.24\%$
Example 5: The index number of the price of cotton in April 2008 was as under:
Month    Jan.   Feb.   March   April   May   June   July   Aug.   Sep.   Oct.
Cotton   188    178    173     164     172   184    184    185    211    217
Solution:
Calculation of standard deviation
X           d = X - A (A = 172)   d²
188         16                    256
178         6                     36
173         1                     1
164         -8                    64
172         0                     0
184         12                    144
184         12                    144
185         13                    169
211         39                    1521
217         45                    2025
ΣX = 1856   Σd = 136              Σd² = 4360
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{4360}{10} - \left(\frac{136}{10}\right)^2} = 15.84$
Solution:
Calculation of mean and standard deviation
No. of rejects per operator   m.p. X   No. of operators f   d = (X-38)/5   fd     fd²
20.5-25.5                     23       5                    -3             -15    45
25.5-30.5                     28       15                   -2             -30    60
30.5-35.5                     33       28                   -1             -28    28
35.5-40.5                     38       42                   0              0      0
40.5-45.5                     43       15                   1              15     15
45.5-50.5                     48       12                   2              24     48
50.5-55.5                     53       3                    3              9      27
                                       N = 120                             Σfd = -25   Σfd² = 223
Let $d = \frac{X - A}{i}$ with A = 38 and i = 5.
Mean:
$\bar{X} = A + \frac{\sum fd}{N} \times i = 38 - \frac{25}{120} \times 5 = 36.96$
Standard deviation:
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{223}{120} - \left(\frac{-25}{120}\right)^2} \times 5 = 6.736$
Example 2: The breaking strength of 80 test pieces of a certain alloy is given in the
following table, the unit being given to the nearest thousand pounds per square inch.
Breaking strength    No. of pieces
44-46                3
46-48                24
48-50                27
50-52                21
52-54                5
Calculate the average breaking strength of the alloy and the standard deviation.
Solution:
Calculation of mean and standard deviation
Breaking strength   No. of pieces f   Mid point X   d = (X-A)/i   fd     fd²
44-46               3                 45            -2            -6     12
46-48               24                47            -1            -24    24
48-50               27                49            0             0      0
50-52               21                51            1             21     21
52-54               5                 53            2             10     20
                    N = 80                                        Σfd = 1   Σfd² = 77
$\bar{X} = A + \frac{\sum fd}{N} \times i = 49 + \frac{1}{80} \times 2 = 49.025$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{77}{80} - \left(\frac{1}{80}\right)^2} \times 2 = 1.962$
Thus the average breaking strength of the alloy is 49.025 and the standard deviation is 1.962.
Example 3: An association doing charity work decided to give old age pensions to
people over sixty years of age.
The scales of pensions were fixed as follows:
Age group 60 to 65---Rs. 2500 per month
Age group 65 to 70---Rs. 3000 per month
Age group 70 to 75---Rs. 3500 per month
Age group 75 to 80---Rs. 4000 per month
Age group 80 to 85---Rs. 4500 per month
The ages of the 25 persons who secured the pension benefits are given below:
75   62   84   72   83
72   81   64   71   63
61   60   61   67   74
64   79   73   75   76
69   78   66   67   68
Solution:
Age group   Pension X (Rs.)   f        d = (X-3500)/500   fd     fd²
60-65       2500              7        -2                 -14    28
65-70       3000              5        -1                 -5     5
70-75       3500              6        0                  0      0
75-80       4000              4        1                  4      4
80-85       4500              3        2                  6      12
                              N = 25                      Σfd = -9   Σfd² = 49
$\bar{X} = A + \frac{\sum fd}{N} \times i = 3500 - \frac{9}{25} \times 500 = 3320$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{49}{25} - \left(\frac{-9}{25}\right)^2} \times 500 = 676.5$
Thus the average monthly pension is Rs. 3320 and the standard deviation is Rs. 676.5.
Example 4: Suppose that samples of polythene bags from a manufacturer are tested by a
prospective buyer for bursting pressure, with the following results:
Bursting pressure (lbs)    Number of bags
5.0-9.9                    2
10.0-14.9                  9
15.0-19.9                  29
20.0-24.9                  54
25.0-29.9                  11
30.0-34.9                  5
Solution:
Calculation of mean, standard deviation and C.V.
Bursting pressure (lbs)   Number of bags f   Mid point X   d = (X-A)/i   fd     fd²
5.0-9.9                   2                  7.45          -2            -4     8
10.0-14.9                 9                  12.45         -1            -9     9
15.0-19.9                 29                 17.45         0             0      0
20.0-24.9                 54                 22.45         1             54     54
25.0-29.9                 11                 27.45         2             22     44
30.0-34.9                 5                  32.45         3             15     45
                          N = 110                                        Σfd = 78   Σfd² = 160
$\bar{X} = A + \frac{\sum fd}{N} \times i = 17.45 + \frac{78}{110} \times 5 = 21$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{160}{110} - \left(\frac{78}{110}\right)^2} \times 5 = 4.878$
C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{4.878}{21} \times 100 = 23.23\%$
Example 5: A purchasing agent obtained samples of 60 watt bulbs from a company. He
had the samples tested in his laboratory for length of life with the following results.
Length of life (in hours)    Samples
1700 and under 1900          10
1900 and under 2100          16
2100 and under 2300          20
2300 and under 2500          8
2500 and under 2700          6
Calculate the mean and standard deviation.
Solution:
Length of life (in hours)   Samples f   Mid point X   d = (X-A)/i   fd     fd²
1700-1900                   10          1800          -2            -20    40
1900-2100                   16          2000          -1            -16    16
2100-2300                   20          2200          0             0      0
2300-2500                   8           2400          1             8      8
2500-2700                   6           2600          2             12     24
                            N = 60                                  Σfd = -16   Σfd² = 88
$\bar{X} = A + \frac{\sum fd}{N} \times i = 2200 + \frac{-16}{60} \times 200 = 2146.67$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{88}{60} - \left(\frac{-16}{60}\right)^2} \times 200 = 236.3$
C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{236.3}{2146.67} \times 100 = 11\%$
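The step-deviation method used in the grouped examples can be sketched as a small Python function. The name `grouped_mean_sd` is illustrative; it assumes equal class widths and takes the class mid points, the frequencies, the assumed mean A and the class width i.

```python
from math import sqrt


def grouped_mean_sd(midpoints, freqs, assumed_mean, width):
    """Mean and S.D. of grouped data by the step-deviation method."""
    n = sum(freqs)
    steps = [(x - assumed_mean) / width for x in midpoints]   # d = (X - A) / i
    sum_fd = sum(f * d for f, d in zip(freqs, steps))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))
    mean = assumed_mean + sum_fd / n * width
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
    return mean, sd


# Data of Example 5 (length of life of bulbs)
midpoints = [1800, 2000, 2200, 2400, 2600]
freqs = [10, 16, 20, 8, 6]
mean, sd = grouped_mean_sd(midpoints, freqs, assumed_mean=2200, width=200)
print(round(mean, 2), round(sd, 1), round(sd / mean * 100, 1))  # 2146.67 236.3 11.0
```

Dividing the deviations by the class width keeps the intermediate numbers small, which is the reason the method is preferred for hand calculation.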
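1. Combined standard deviation: The combined standard deviation of three groups can be
obtained by the following formula.
$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$
where $d_1 = \bar{X}_1 - \bar{X}_{123}$, $d_2 = \bar{X}_2 - \bar{X}_{123}$ and $d_3 = \bar{X}_3 - \bar{X}_{123}$.
2. Standard deviation of natural numbers: The standard deviation of the first N natural
numbers can be obtained by the following formula.
$\sigma = \sqrt{\frac{1}{12}(N^2 - 1)}$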
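The formula for the natural numbers can be checked numerically. The short sketch below compares it with the population standard deviation computed directly from the numbers 1 to N; the function name `sd_natural_numbers` and the choice N = 10 are illustrative.

```python
from math import sqrt
from statistics import pstdev


def sd_natural_numbers(n):
    """S.D. of the first n natural numbers: sqrt((n^2 - 1) / 12)."""
    return sqrt((n * n - 1) / 12)


n = 10
by_formula = sd_natural_numbers(n)
directly = pstdev(list(range(1, n + 1)))
print(round(by_formula, 4), round(directly, 4))
```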
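Branch   No. of workers   Weekly wages   S.D.
A        50               1413           60
B        60               1420           70
C        90               1415           80
Solution:
Combined mean of the three branches
$\bar{X}_{123} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3} = \frac{(50 \times 1413) + (60 \times 1420) + (90 \times 1415)}{50 + 60 + 90} = \frac{70650 + 85200 + 127350}{200} = 1416$
Combined standard deviation of the three branches
$d_1 = \bar{X}_1 - \bar{X}_{123} = 1413 - 1416 = -3$
$d_2 = \bar{X}_2 - \bar{X}_{123} = 1420 - 1416 = 4$
$d_3 = \bar{X}_3 - \bar{X}_{123} = 1415 - 1416 = -1$
$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}} = \sqrt{\frac{180000 + 294000 + 576000 + 450 + 960 + 90}{200}} = \sqrt{5257.5} = 72.51$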
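The combined mean and combined standard deviation of the three branches can be reproduced with the following Python sketch. The function name `combined_mean_sd` is an illustrative choice; it works for any number of groups.

```python
from math import sqrt


def combined_mean_sd(sizes, means, sds):
    """Combined mean and S.D. of several groups from their N, mean and S.D."""
    total = sum(sizes)
    combined_mean = sum(n * m for n, m in zip(sizes, means)) / total
    # Variation within the groups: sum of N_k * sigma_k^2
    within = sum(n * s ** 2 for n, s in zip(sizes, sds))
    # Variation between the groups: sum of N_k * d_k^2, d_k = mean_k - combined mean
    between = sum(n * (m - combined_mean) ** 2 for n, m in zip(sizes, means))
    return combined_mean, sqrt((within + between) / total)


sizes = [50, 60, 90]          # number of workers in branches A, B, C
means = [1413, 1420, 1415]    # weekly wages
sds = [60, 70, 80]            # standard deviations
combined_mean, combined_sd = combined_mean_sd(sizes, means, sds)
print(round(combined_mean, 2), round(combined_sd, 2))  # 1416.0 72.51
```

Here almost all of the combined variation comes from within the branches, since the branch means differ from the combined mean by only a few units.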
Limitations
1. Compared with the other measures, it is difficult to compute.
2. It gives more weight to extreme values and less to those which are near the
mean. This is because the squares of large deviations are proportionately
greater than the squares of deviations which are comparatively small.
LORENZ CURVE
This measure of dispersion is graphical. It is known as the Lorenz curve named after Dr.
Max Lorenz. It is generally used to show the extent of concentration of income and
wealth. The steps involved in plotting the Lorenz curve are:
1. Convert a frequency distribution into a cumulative frequency table.
2. Calculate percentage for each item taking the total equal to 100.
3. Choose a suitable scale and plot the cumulative percentages of the persons and income.
Use the horizontal X-axis to depict the percentages of persons and the vertical Y-axis to
depict the percentages of income.
4. Show the line of equal distribution, which joins 0 on the X-axis with 100 on the Y-axis.
5. The curve obtained in (3) above can now be compared with the straight line of equal
distribution obtained in (4) above. If the Lorenz curve is close to the line of equal
distribution, the dispersion is small. If, on the contrary, the Lorenz curve is far
from the line of equal distribution, the dispersion is considerable.
The Lorenz curve is a simple graphical device to show the disparities of distribution
in any phenomenon. It is used in business and economics to represent inequalities in
income, wealth, production, savings, and so on.
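The first two steps, which produce the points to be plotted, can be sketched in Python. The data below are hypothetical and not taken from the text; the groups are assumed to be ordered from the poorest to the richest, and the function name `lorenz_points` is illustrative.

```python
def lorenz_points(persons, incomes):
    """Cumulative percentage points (x = % of persons, y = % of income)."""
    total_persons, total_income = sum(persons), sum(incomes)
    points = [(0.0, 0.0)]
    cum_persons = cum_income = 0
    for p, inc in zip(persons, incomes):
        cum_persons += p
        cum_income += inc
        points.append((100 * cum_persons / total_persons,
                       100 * cum_income / total_income))
    return points


# Hypothetical data: number of persons and total income of each group,
# groups ordered from lowest to highest income per person.
persons = [40, 30, 20, 10]
incomes = [10, 20, 30, 40]
points = lorenz_points(persons, incomes)
for x, y in points:
    print(f"{x:6.1f}% of persons receive {y:6.1f}% of income")
```

On the line of equal distribution every point would have y equal to x; the further the computed points fall below that line, the greater the inequality.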
Figure 1 shows two Lorenz curves by way of illustration. The straight line AB is the
line of equal distribution, whereas AEB shows complete inequality. Curve ACB and
Figure 1: Lorenz Curve