
Module on Measures of Variability
Measures of Variability
Once we know the average value of a set of
measurements, our next question should be: how
typical is that average of all the measurements in
the data set? In other words, how spread out are
the measurements about their average value? The
importance of looking beyond the average value is
borne out by the fact that many individuals make use
of the concept of variability in everyday
decision-making, whether or not they compute a
numerical measure of the dispersion.
The degree to which numerical data tend to
spread out about the average value is called
dispersion or variation of the data. Various
measures of the dispersion (or variation) are available,
the most common being the range, quartile deviation
or semi-interquartile range, percentile, decile, and
standard deviation.
Range

The range refers to the difference
between the highest and the lowest
score. It is the easiest and simplest to
determine among the measures of
variability because it depends only on
the pair of extreme values.
Example: Find the range of the frequency distribution below.

Class Interval    Frequency
118 – 126             3
127 – 135             5
136 – 144             9
145 – 153            12
154 – 162             5
163 – 171             4
172 – 180             2

Solution:

Range (R) = upper limit of the highest class interval
            – lower limit of the lowest class interval
          = 180 – 118
          = 62
Although the range is the easiest to
compute and easiest to understand, it is also the most
unstable because its value easily fluctuates with a
change in either the highest or the lowest score.
Consider the test scores of two girls below:

            Linda    Pinky
              17       18
              18       10
               7       17
              15       11
              14       18
              13       10
ΣX       =    84       84
Mean (X̄) =    14       14
R        =    11        8
If we compare the test scores, we will see
that Linda’s test scores have a higher range,
(18 – 7) = 11, than that of Pinky’s test scores,
(18 – 10) = 8. These ranges tell us that Linda’s
scores are apparently more scattered than Pinky’s.

However, if we look closely at Linda’s scores,
we will see that, except for the lowest score of seven,
her scores are quite consistent – in fact, more
consistent or clustered than Pinky’s scores. Without
Linda’s lowest score of seven, the range of her scores
would only be five (18 – 13) = 5, whereas if we exclude
the lowest score of Pinky (10, which occurs twice), her
range would be seven (18 – 11) = 7. Can we then really
say that Linda’s scores are more scattered or variable
than Pinky’s?
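
The comparison above can be checked with a short sketch. The scores are copied from the Linda/Pinky table; the helper names are illustrative only. Because Pinky's lowest score of 10 appears twice, the sketch assumes both occurrences are dropped, which is what the (18 – 11) = 7 in the text implies.

```python
# Scores copied from the Linda/Pinky table.
linda = [17, 18, 7, 15, 14, 13]
pinky = [18, 10, 17, 11, 18, 10]


def value_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


def range_without_lowest(scores):
    """Range after dropping every occurrence of the lowest score (illustrative helper)."""
    lowest = min(scores)
    trimmed = [s for s in scores if s != lowest]
    return max(trimmed) - min(trimmed)


linda_range = value_range(linda)              # 18 - 7
pinky_range = value_range(pinky)              # 18 - 10
linda_trimmed = range_without_lowest(linda)   # 18 - 13
pinky_trimmed = range_without_lowest(pinky)   # 18 - 11

print(linda_range, pinky_range, linda_trimmed, pinky_trimmed)
```

A single extreme score changes Linda's range from 11 to 5, which is the instability the text describes.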
Quartile Deviation or Semi-Interquartile Range

The quartile deviation, symbol Q, is
frequently called the semi-interquartile range. It
measures half the spread of the middle 50 percent
of the scores or values in a distribution. If
the distribution is symmetrical, one quartile deviation
above and below the median will include the middle
50% of all the scores. The quartile deviation or
semi-interquartile range of a set of data is defined
by the formula,

$$Q = \frac{Q_3 - Q_1}{2}$$

where: Q3 and Q1 are the third and first quartiles
of the data
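Using lower boundary:

$$Q_1 = LB + \left(\frac{N/4 - f_{up}}{F_m}\right)(i)$$

$$Q_2 = LB + \left(\frac{N/2 - f_{up}}{F_m}\right)(i)$$

$$Q_3 = LB + \left(\frac{3N/4 - f_{up}}{F_m}\right)(i)$$

where: LB = lower boundary of the class containing the quartile
f up = cumulative frequency, counted upward from the
lowest class, below that class
Fm = frequency of that class
i = class interval size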
Example: Determination of quartile range.

X          f    Cum. f up
50 – 52    2      30
47 – 49    2      28
44 – 46    1      26
41 – 43    1      25
38 – 40    9      24    (Q3 class)
35 – 37    3      15
32 – 34    5      12    (Q1 class)
29 – 31    1       7
26 – 28    4       6
23 – 25    2       2

N = 30
N/4 = 7.5
N/2 = 15
3N/4 = 22.5
Solution:

Using lower boundary:

$$Q_1 = 31.5 + \left(\frac{7.5 - 7}{5}\right)(3) = 31.8$$

$$Q_2 = 34.5 + \left(\frac{15 - 12}{3}\right)(3) = 37.5$$

$$Q_3 = 37.5 + \left(\frac{22.5 - 15}{9}\right)(3) = 40$$

Hence Q = (Q3 – Q1)/2 = (40 – 31.8)/2 = 4.1.
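
The lower-boundary computation can be sketched as a small routine. This is an illustrative sketch, not a standard library function: the class limits and frequencies are copied from the example table, and the function name and the assumption that each boundary lies 0.5 below an integer class limit are choices made here.

```python
# (lower limit, upper limit, frequency), lowest class first, from the example table.
classes = [
    (23, 25, 2), (26, 28, 4), (29, 31, 1), (32, 34, 5), (35, 37, 3),
    (38, 40, 9), (41, 43, 1), (44, 46, 1), (47, 49, 2), (50, 52, 2),
]
WIDTH = 3  # class interval size i


def quartile_from_lower_boundary(classes, position, width):
    """Return LB + ((position - f_up) / Fm) * i for the class containing `position`."""
    cum_below = 0  # cumulative frequency below the current class (f up)
    for lower, _upper, freq in classes:
        if cum_below + freq >= position:
            lb = lower - 0.5  # assumption: integer limits, boundary sits 0.5 below
            return lb + (position - cum_below) / freq * width
        cum_below += freq
    raise ValueError("position exceeds the total frequency")


n = sum(freq for _, _, freq in classes)  # total number of cases
q1 = quartile_from_lower_boundary(classes, n / 4, WIDTH)
q2 = quartile_from_lower_boundary(classes, n / 2, WIDTH)
q3 = quartile_from_lower_boundary(classes, 3 * n / 4, WIDTH)
q = (q3 - q1) / 2  # quartile deviation

print(round(q1, 2), round(q2, 2), round(q3, 2), round(q, 2))
```

The values agree with the hand computation: 31.8, 37.5, and 40 for the three quartiles.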
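Using the upper boundary:

$$Q_1 = UB - \left(\frac{3N/4 - f_{down}}{F_m}\right)(i)$$

$$Q_2 = UB - \left(\frac{N/2 - f_{down}}{F_m}\right)(i)$$

$$Q_3 = UB - \left(\frac{N/4 - f_{down}}{F_m}\right)(i)$$

Because f down is cumulated from the highest class
downward, Q1 is reached after 3N/4 cases and Q3 after
N/4 cases. UB is the upper boundary of the class
containing the quartile, and f down is the cumulative
frequency above that class.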
Example: Determination of quartile range.

X          f    Cum. f down
50 – 52    2       2
47 – 49    2       4
44 – 46    1       5
41 – 43    1       6
38 – 40    9      15    (Q3 class)
35 – 37    3      18
32 – 34    5      23    (Q1 class)
29 – 31    1      24
26 – 28    4      28
23 – 25    2      30

N = 30
N/4 = 7.5
3N/4 = 22.5
Using the upper boundary:

$$Q_3 = 40.5 - \left(\frac{7.5 - 6}{9}\right)(3) = 40$$

$$Q_1 = 34.5 - \left(\frac{22.5 - 18}{5}\right)(3) = 31.8$$
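
The upper-boundary version can be sketched the same way, this time cumulating from the highest class downward. As before, the function name and the 0.5 boundary offset are assumptions made for illustration; the data come from the example table.

```python
# (lower limit, upper limit, frequency), highest class first, from the example table.
classes_desc = [
    (50, 52, 2), (47, 49, 2), (44, 46, 1), (41, 43, 1), (38, 40, 9),
    (35, 37, 3), (32, 34, 5), (29, 31, 1), (26, 28, 4), (23, 25, 2),
]
WIDTH = 3  # class interval size i


def quartile_from_upper_boundary(classes_desc, position, width):
    """Return UB - ((position - f_down) / Fm) * i, counting from the top class."""
    cum_above = 0  # cumulative frequency above the current class (f down)
    for _lower, upper, freq in classes_desc:
        if cum_above + freq >= position:
            ub = upper + 0.5  # assumption: integer limits, boundary sits 0.5 above
            return ub - (position - cum_above) / freq * width
        cum_above += freq
    raise ValueError("position exceeds the total frequency")


n = sum(freq for _, _, freq in classes_desc)
# Counting downward, Q3 lies N/4 cases from the top and Q1 lies 3N/4 from the top.
q3 = quartile_from_upper_boundary(classes_desc, n / 4, WIDTH)
q1 = quartile_from_upper_boundary(classes_desc, 3 * n / 4, WIDTH)

print(round(q1, 2), round(q3, 2))
```

Both directions of cumulation give the same quartiles, which is a useful check on a hand computation.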
Percentile

Percentile divides the distribution
into 100 equal parts. Just like the quartile,
the percentile is computed in the same way
as the median.
Decile
A decile, by definition, divides the
distribution into 10 equal parts. The
computation is similar to that of the percentile,
except that the multiplier is a multiple
of 1/10; e.g., D3 is located at the 3N/10-th
case. Using N = 30, the position of D3 is
3(30)/10 = 9 in this case. Then compute the
required decile in the same way as the median.
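
As a rough sketch of how deciles and percentiles reuse the median-style interpolation, the routine below locates the k-th of `parts` equal divisions. It assumes that the N = 30 in the text refers to the grouped distribution of the quartile example; the function name and the 0.5 boundary offset are illustrative choices, and the resulting D3 value is computed here rather than taken from the slides.

```python
# (lower limit, upper limit, frequency), lowest class first; assumed to be the
# N = 30 distribution from the quartile example.
classes = [
    (23, 25, 2), (26, 28, 4), (29, 31, 1), (32, 34, 5), (35, 37, 3),
    (38, 40, 9), (41, 43, 1), (44, 46, 1), (47, 49, 2), (50, 52, 2),
]
WIDTH = 3  # class interval size i


def grouped_quantile(classes, k, parts, width):
    """Value of the k-th of `parts` equal divisions (parts=10 decile, 100 percentile)."""
    n = sum(freq for _, _, freq in classes)
    position = k * n / parts  # e.g. 3N/10 for D3
    cum_below = 0
    for lower, _upper, freq in classes:
        if cum_below + freq >= position:
            lb = lower - 0.5  # assumption: integer class limits
            return lb + (position - cum_below) / freq * width
        cum_below += freq
    raise ValueError("position exceeds the total frequency")


d3 = grouped_quantile(classes, 3, 10, WIDTH)     # position 3(30)/10 = 9
d5 = grouped_quantile(classes, 5, 10, WIDTH)     # same position as the median
p75 = grouped_quantile(classes, 75, 100, WIDTH)  # same position as Q3

print(round(d3, 2), round(d5, 2), round(p75, 2))
```

The 5th decile coincides with the median and the 75th percentile with Q3, since they share the same positions N/2 and 3N/4.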
Standard Deviation
The standard deviation, denoted by s, is
the positive square root of the arithmetic mean of
the squared deviations from the mean of the
distribution. It is important as a measure of
heterogeneity or unevenness within a set of
observations.

The standard deviation of a sample of
ungrouped data is defined as,

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}$$

Where: Xi = value of each item
X̄ = computed mean
N = number of cases/observations
To illustrate the calculation of standard
deviation, let us use the following data.

Xi    (Xi – X̄)    (Xi – X̄)²
15       -2           4
15       -2           4
17        0           0
18        1           1
20        3           9
X̄ = 17            Σ(Xi – X̄)² = 18

Solution:

$$\bar{X} = \frac{15 + 15 + 17 + 18 + 20}{5} = 17$$

$$s = \sqrt{\frac{18}{5 - 1}} = 2.12$$
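
The same computation can be sketched in a few lines, following the formula step by step and then comparing against `statistics.stdev` from the Python standard library, which also divides by N – 1. The variable names are illustrative.

```python
import math
import statistics

data = [15, 15, 17, 18, 20]  # the five values from the example

n = len(data)
mean = sum(data) / n                               # X-bar
sum_sq_dev = sum((x - mean) ** 2 for x in data)    # sum of squared deviations
s = math.sqrt(sum_sq_dev / (n - 1))                # sample standard deviation

# Cross-check with the standard library's sample standard deviation.
s_library = statistics.stdev(data)

print(mean, sum_sq_dev, round(s, 2))
```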
There are three methods of
computing the standard
deviation of grouped data.
These are the [1] long method
A, [2] long method B, and [3]
coded deviation method.
Long Method A

$$s = \sqrt{\frac{\sum f (X_1 - \bar{X})^2}{N}}$$

Where: X1 = midpoint of each class interval
X̄ = computed mean
f = frequency
N = number of cases/observations
Example: Determine the standard deviation of the frequency
distribution below.

Class interval    f     X1    (X1 – X̄)    (X1 – X̄)²    f(X1 – X̄)²
118 – 126         3    122      -25          625          1875
127 – 135         5    131      -16          256          1280
136 – 144         9    140       -7           49           441
145 – 153        12    149        2            4            48
154 – 162         5    158       11          121           605
163 – 171         4    167       20          400          1600
172 – 180         2    176       29          841          1682
N = 40                                  Σ f(X1 – X̄)² = 7531

Solution:

$$\bar{X} = \frac{\sum f X_1}{N} = \frac{5879}{40} \approx 147$$

$$s = \sqrt{\frac{\sum f (X_1 - \bar{X})^2}{N}}$$

$$s = \sqrt{\frac{7531}{40}}$$

s = 13.7
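
A minimal sketch of Long Method A follows, using the midpoints and frequencies from the table. It assumes, as the worked solution does, that the mean is rounded to the nearest whole number before the deviations are taken; the variable names are illustrative.

```python
import math

# (midpoint X1, frequency f) for each class interval in the example.
data = [(122, 3), (131, 5), (140, 9), (149, 12), (158, 5), (167, 4), (176, 2)]

n = sum(f for _, f in data)
exact_mean = sum(f * x for x, f in data) / n
mean = round(exact_mean)  # assumption: rounded to a whole number, as in the solution

# Sum of f * (X1 - mean)^2 over all classes.
sum_f_sq_dev = sum(f * (x - mean) ** 2 for x, f in data)

s = math.sqrt(sum_f_sq_dev / n)  # Long Method A divides by N

print(n, mean, sum_f_sq_dev, round(s, 1))
```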
Long Method B

$$s = \sqrt{\frac{\sum f X_1^2}{N - 1} - \frac{\left(\sum f X_1\right)^2}{N(N - 1)}}$$
Using the previous data (in Long Method A):

f      X1     f X1     (X1)²     f (X1)²
3     122      366     14884      44652
5     131      655     17161      85805
9     140     1260     19600     176400
12    149     1788     22201     266412
5     158      790     24964     124820
4     167      668     27889     111556
2     176      352     30976      61952
N = 40    Σ f X1 = 5879    Σ f (X1)² = 871597
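Solution:

$$s = \sqrt{\frac{\sum f X_1^2}{N - 1} - \frac{\left(\sum f X_1\right)^2}{N(N - 1)}}$$

$$s = \sqrt{\frac{871597}{39} - \frac{(5879)^2}{40(39)}}$$

s = 13.9

The result is slightly larger than the 13.7 from Long
Method A because this formula divides by N – 1 rather
than N.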
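
Long Method B needs only the two column totals, so it can be sketched without computing the mean first. The data are the midpoints and frequencies of the example; the variable names are illustrative.

```python
import math

# (midpoint X1, frequency f) for each class interval in the example.
data = [(122, 3), (131, 5), (140, 9), (149, 12), (158, 5), (167, 4), (176, 2)]

n = sum(f for _, f in data)
sum_fx = sum(f * x for x, f in data)         # sum of f * X1
sum_fx2 = sum(f * x ** 2 for x, f in data)   # sum of f * X1^2

# Long Method B: both terms use the N - 1 divisor.
variance = sum_fx2 / (n - 1) - sum_fx ** 2 / (n * (n - 1))
s = math.sqrt(variance)

print(sum_fx, sum_fx2, round(s, 1))
```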
Coded Deviation Method

$$s = (i)\sqrt{\frac{\sum f d^2}{N - 1} - \frac{\left(\sum f d\right)^2}{N(N - 1)}}$$

Where:
i = class interval size
d = coded deviation = (X1 – A)/i
A = assumed mean (the midpoint of the class assigned d = 0)
Using the previous data

f      d     f d    (d)²    f (d)²
3     -3     -9      9       27
5     -2    -10      4       20
9     -1     -9      1        9
12     0      0      0        0
5      1      5      1        5
4      2      8      4       16
2      3      6      9       18
N = 40    Σ f d = -9    Σ f (d)² = 95
Solution:

$$s = (i)\sqrt{\frac{\sum f d^2}{N - 1} - \frac{\left(\sum f d\right)^2}{N(N - 1)}}$$

$$s = (9)\sqrt{\frac{95}{39} - \frac{(-9)^2}{40(39)}}$$

s = 13.9
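
The coded deviation method can be sketched as follows. The sketch assumes the origin is the midpoint 149 of the 145 – 153 class (the row with d = 0 in the table) and a class interval size of 9; both are read off the example rather than stated in the formula, and the variable names are illustrative.

```python
import math

# (midpoint X1, frequency f) for each class interval in the example.
data = [(122, 3), (131, 5), (140, 9), (149, 12), (158, 5), (167, 4), (176, 2)]

ORIGIN = 149  # assumption: midpoint of the class assigned d = 0
WIDTH = 9     # class interval size i

n = sum(f for _, f in data)
# Coded deviation d = (X1 - origin) / i; exact integers for these midpoints.
coded = [((x - ORIGIN) // WIDTH, f) for x, f in data]

sum_fd = sum(f * d for d, f in coded)         # sum of f * d
sum_fd2 = sum(f * d ** 2 for d, f in coded)   # sum of f * d^2

s = WIDTH * math.sqrt(sum_fd2 / (n - 1) - sum_fd ** 2 / (n * (n - 1)))

print(sum_fd, sum_fd2, round(s, 1))
```

Working with small coded integers keeps the column totals easy to check by hand, which is the main appeal of the method.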
Coefficient of Variation
The coefficient of variation is used to express
the standard deviation as a percentage of the mean. To
compare two distributions with different means and
standard deviations, compute the coefficient of
variation of each to express them in the same unit,
that is, in terms of percentage. Use the formula:

$$V = \frac{s}{\bar{X}} (100)$$

Where: V = coefficient of variation
s = standard deviation
X̄ = mean
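
As an illustration, the formula can be applied to two results obtained earlier in this module: the grouped distribution (mean 147, s = 13.7 by Long Method A) and the five ungrouped values (mean 17, s = 2.12). Pairing these two data sets is only for demonstration, and the function name is illustrative.

```python
def coefficient_of_variation(s, mean):
    """V = (s / mean) * 100, the standard deviation as a percentage of the mean."""
    return s / mean * 100


# Values taken from the earlier worked examples in this module.
v_grouped = coefficient_of_variation(13.7, 147)
v_ungrouped = coefficient_of_variation(2.12, 17)

print(round(v_grouped, 2), round(v_ungrouped, 2))
```

Although the grouped data have the larger standard deviation, their variation relative to the mean is smaller, which is the kind of comparison the coefficient is meant to support.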
Skewness and Kurtosis
The distribution is said to be skewed when the spread of the
measurements is greater on one side than on the other side of the point of
central tendency. A distribution that is skewed to the right has a positive
value for skewness. A distribution that is skewed to the left has a negative
value for skewness, and a normal or symmetrical distribution has zero
skewness.

A simple formula for skewness is,

$$sk = \frac{3(\bar{X} - Md)}{s}$$

where: X̄ = mean
Md = median
s = standard deviation
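
A brief sketch of the skewness formula follows, applied to the test scores of Linda and Pinky from the discussion of the range. It uses the sample standard deviation (N – 1 divisor) from Python's `statistics` module, which is an assumption about which s is intended; the function name is illustrative.

```python
import statistics


def simple_skewness(scores):
    """sk = 3 * (mean - median) / s, using the sample standard deviation."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    s = statistics.stdev(scores)
    return 3 * (mean - median) / s


linda = [17, 18, 7, 15, 14, 13]
pinky = [18, 10, 17, 11, 18, 10]

sk_linda = simple_skewness(linda)  # mean 14 is below the median 14.5
sk_pinky = simple_skewness(pinky)  # mean and median are both 14

print(round(sk_linda, 2), round(sk_pinky, 2))
```

Linda's value is negative because her single low score of 7 pulls the mean below the median, which corresponds to a distribution skewed to the left.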

Kurtosis is a measure of the degree of peakedness or flatness of
a distribution. There are three types of kurtosis: [1] leptokurtosis,
[2] platykurtosis, and [3] mesokurtosis.
Types of Kurtosis
Leptokurtic or tall distributions involve an
unusually large number of scores or values at the center
of the distribution. They are more peaked than the normal
curve since the scores are concentrated within a very
narrow interval at the center. Their tails are high and long.

Platykurtic distributions are flat.
The values or scores are distributed over a wider range
about the center, making the hump of the curve flat. They
are flatter than the normal distribution. Their tails are short.

Mesokurtic distributions refer to the normal or
symmetrical distributions. The values or scores are
moderately distributed about the center of the
distribution. They are neither too flat nor too peaked.
Illustrations: [Figures: leptokurtic, platykurtic, and mesokurtic distribution curves]
Given the following score distribution of a quiz in Statistics:

X          f
90 – 94    5
85 – 89    7
80 – 84   10
75 – 79    9
70 – 74   12
65 – 69   10
60 – 64   11
55 – 59    8
50 – 54    3
N = 75

a. Compute the range and semi-interquartile range.
b. What is the value of the 5th decile?
c. Compute the value of the 75th percentile.
d. Determine the standard deviation of the distribution.
