Measures of Dispersion


Dispersion or Variation

The degree to which numerical data tend to spread about an average value is called
the dispersion, or variation, of the data. For example, the following groups all have
the same mean, 4.25:

Group A: 2, 3, 4, 8

Group B: 1, 2, 4, 10

Group C: 0, 1, 5, 11

It is clear that Group B is more variable (shows a larger spread in the numbers)
than Group A, and Group C is more variable than Group B. But we need a
quantitative measure of this variability.

1. Range

The range is the simplest measure of variation to find. It is simply the highest value
minus the lowest value.

Range = Maximum Value – Minimum Value

Since the range only uses the largest and smallest values, it is greatly affected by
extreme values.
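As a quick illustration, the sketch below applies the range formula to Groups A, B, and C from the introduction. The helper name `value_range` is an illustrative choice, not something from the source.

```python
# Groups A, B and C from the text; all three have mean 4.25.
groups = {
    "A": [2, 3, 4, 8],
    "B": [1, 2, 4, 10],
    "C": [0, 1, 5, 11],
}


def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


ranges = {name: value_range(vals) for name, vals in groups.items()}
for name, r in ranges.items():
    print(f"Group {name}: range = {r}")
```

The ranges come out as 6, 9, and 11, which matches the ordering of variability described above (A < B < C).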

The Range for Grouped Data

There are two ways of defining the range for grouped data.

First method

Range = class mark of highest class - class mark of lowest class

Second method

Range = upper class boundary of highest class - lower class boundary of lowest
class
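The two definitions generally give different answers. The sketch below compares them on a small, purely hypothetical set of classes (the boundaries are invented for illustration and do not come from the source).

```python
# Hypothetical classes as (lower boundary, upper boundary), lowest class first.
classes = [(0, 10), (10, 20), (20, 30)]


def class_mark(cls):
    """Class mark = midpoint of the class boundaries."""
    lower, upper = cls
    return (lower + upper) / 2


lowest, highest = classes[0], classes[-1]

# First method: difference between the class marks of the extreme classes.
range_by_marks = class_mark(highest) - class_mark(lowest)

# Second method: upper boundary of highest class minus lower boundary of lowest class.
range_by_boundaries = highest[1] - lowest[0]

print(range_by_marks, range_by_boundaries)
```

For these assumed classes the first method gives 20 and the second gives 30; the second method is always at least as large as the first.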

2. Average Deviation or Mean Deviation

The mean deviation, or average deviation, of a set of N numbers X1, X2, . . . , XN is
abbreviated MD and is defined by:

$$\mathrm{MD} = \frac{\sum_{i=1}^{N} |X_i - \bar{X}|}{N}$$

Where: $\bar{X}$ is the arithmetic mean of the numbers.

$|X_i - \bar{X}|$ is the absolute value of the deviation of $X_i$ from $\bar{X}$.

(The absolute value of a number is the number without the associated sign and is
indicated by two vertical lines placed around the number; thus |−4| = 4, |+3| = 3,
|6| = 6, and |−0.84| = 0.84.)

If X1, X2, . . . , XK occur with frequencies f1, f2, . . . , fK, respectively, the mean
deviation can be written as:

$$\mathrm{MD} = \frac{\sum_{j=1}^{K} f_j\,|X_j - \bar{X}|}{N}$$

Where: $N = \sum_{j=1}^{K} f_j$ is the total frequency.

This form is also useful for grouped data, where the Xj's represent class marks
and the fj's are the corresponding class frequencies.

Note that it would be more appropriate to use the terminology mean absolute
deviation than mean deviation.
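To make the definition concrete, the sketch below computes the mean (absolute) deviation for Groups A, B, and C from the introduction, and then in the frequency form. The function names and the small frequency table are illustrative assumptions, not part of the source.

```python
def mean_deviation(values):
    """Mean absolute deviation about the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mean_deviation_freq(marks, freqs):
    """Frequency form: each value (or class mark) X_j occurs f_j times."""
    n = sum(freqs)  # N = sum of the frequencies
    mean = sum(f * x for x, f in zip(marks, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(marks, freqs)) / n


md_a = mean_deviation([2, 3, 4, 8])    # Group A
md_b = mean_deviation([1, 2, 4, 10])   # Group B
md_c = mean_deviation([0, 1, 5, 11])   # Group C
print(md_a, md_b, md_c)

# Hypothetical frequency table: value 1 occurs twice, 2 once, 3 once,
# which is the same data as the raw list [1, 1, 2, 3].
md_freq = mean_deviation_freq([1, 2, 3], [2, 1, 1])
print(md_freq, mean_deviation([1, 1, 2, 3]))
```

The three groups give 1.875, 2.875, and 3.75, so the mean deviation ranks them in the same order as the informal comparison at the start of the document.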

3. The Standard Deviation

The standard deviation of a set of N numbers X1, X2, . . . , XN is denoted by s and is
defined by:

$$s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}}$$

Where $x = X_i - \bar{X}$ represents the deviation of each of the numbers $X_i$ from the
mean $\bar{X}$. Thus, s is the root mean square (RMS) of the deviations from the mean, or,
as it is sometimes called, the root-mean-square deviation.

Example 5. Compute the standard deviation for the set 6, 7, 9, 11, 12, 12, 13.
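The computation for this example can be sketched as follows, assuming the root-mean-square definition (divisor N, not N − 1). The cross-check uses `statistics.pstdev` from the Python standard library, which also divides by N.

```python
import math
import statistics

data = [6, 7, 9, 11, 12, 12, 13]
n = len(data)

mean = sum(data) / n                             # 70 / 7 = 10
squared_devs = [(x - mean) ** 2 for x in data]   # 16, 9, 1, 1, 4, 4, 9
sum_sq = sum(squared_devs)                       # 44
s = math.sqrt(sum_sq / n)                        # sqrt(44 / 7)

print(f"mean = {mean}, sum of squared deviations = {sum_sq}, s = {s:.3f}")

# Cross-check against the population standard deviation in the standard library.
assert math.isclose(s, statistics.pstdev(data))
```

Under this assumption the mean is 10, the sum of squared deviations is 44, and s is approximately 2.507.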

If X1, X2, . . . , XK occur with frequencies f1, f2, . . . , fK, respectively, the
standard deviation can be written as:

$$s = \sqrt{\frac{\sum_{j=1}^{K} f_j (X_j - \bar{X})^2}{N}}, \qquad N = \sum_{j=1}^{K} f_j$$

In this form, it is useful for grouped data.

• We must compute the midpoint of each class.

• We must compute the arithmetic mean ($\bar{X}$).

• We must compute the deviation of each class midpoint ($X_j$) from the
mean ($\bar{X}$), which is ($X_j - \bar{X}$).
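The three steps above can be sketched on a small frequency table. The table below is hypothetical (it is not the table of any example in the source), and the divisor is N, following the root-mean-square definition.

```python
import math

# Hypothetical frequency table: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]

# Step 1: midpoint (class mark) of each class.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]  # 5, 15, 25
freqs = [f for _, _, f in classes]
n = sum(freqs)                                                     # N = 10

# Step 2: arithmetic mean of the grouped data.
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n            # 140 / 10 = 14

# Step 3: deviations of the midpoints from the mean, squared and weighted.
weighted_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))  # 490

s = math.sqrt(weighted_sq / n)                                     # sqrt(49) = 7
print(f"mean = {mean}, s = {s}")
```

For this assumed table the mean is 14 and the standard deviation is 7.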

Example 4. For the following frequency distribution table, compute the standard
deviation.

4. The Variance

The variance of a set of data is defined as the square of the standard deviation and
is thus given by $s^2$.

5. Quartiles and Interquartile Range

If a set of data is arranged in order of magnitude, the median (the middle value or
the arithmetic mean of the two middle values) divides the set into two equal parts.
Extending this idea, we can consider the values that divide the set into four
equal parts. These values, denoted by Q1, Q2, and Q3, are called the first, second,
and third quartiles, respectively. The value Q2 (the 50th percentile) is equal to
the median. The 25th and 75th percentiles correspond to the first and third
quartiles, respectively.

• Q1 – First Quartile – 25% of the observations are smaller than Q1 and 75% of the
observations are larger than Q1.

• Q2 – Second Quartile – 50% of the observations are smaller than Q2 and 50% of
the observations are larger than Q2.

• Q3 – Third Quartile – 75% of the observations are smaller than Q3 and 25% of the
observations are larger than Q3.

NOTE:

• The quartiles, like the median, either take the value of one of the observations or
the average of two observations.

Example 8. Compute the quartiles and the interquartile range for the set 2, 7, 5, 0,
6, 10, 9, 10, 3, 12, 8, 4.

Solution:

1) Put the data in order
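The remaining steps can be sketched in code. This assumes the position rule used in the n = 24 example later in this document: compute p(n + 1), take that observation if the position is a whole number, and otherwise average the two neighbouring observations. The interquartile range is taken as Q3 − Q1. The function name `quartile` is illustrative.

```python
import math


def quartile(sorted_values, p):
    """Observation at 1-based position p * (n + 1); when the position is not
    a whole number, average the observations on either side of it."""
    n = len(sorted_values)
    pos = p * (n + 1)
    lo = min(max(math.floor(pos), 1), n)  # clamp to a valid position
    hi = min(max(math.ceil(pos), 1), n)
    return (sorted_values[lo - 1] + sorted_values[hi - 1]) / 2


# Step 1: put the data in order.
data = sorted([2, 7, 5, 0, 6, 10, 9, 10, 3, 12, 8, 4])
# -> 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 12  (n = 12)

q1 = quartile(data, 0.25)   # position 3.25: average of 3rd and 4th values
q2 = quartile(data, 0.50)   # position 6.50: average of 6th and 7th values
q3 = quartile(data, 0.75)   # position 9.75: average of 9th and 10th values
iqr = q3 - q1

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Under this rule, Q1 = 3.5, Q2 = 6.5, Q3 = 9.5, and the interquartile range is 6. Other quartile conventions exist and can give slightly different values.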

For grouped data:

Example: Find the first and third quartiles of the data

The sample size is n = 24. To find the first quartile, compute (0.25)(n + 1) =
(0.25)(25) = 6.25. The first quartile is therefore found by averaging the 6th and 7th
data points when the sample is arranged in increasing order. This yields
(105 + 126)/2 = 115.5. To find the third quartile, compute (0.75)(n + 1) =
(0.75)(25) = 18.75. We average the 18th and 19th data points to obtain
(242 + 245)/2 = 243.5.

Coefficient of Variation

A dimensionless quantity, the coefficient of variation is the ratio between the
standard deviation and the mean for the same set of data, expressed as a
percentage (that is, multiplied by 100%):

$$\mathrm{CV} = \frac{s}{\bar{X}} \times 100\%$$
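As an illustration, the sketch below computes the coefficient of variation for the data set of Example 5 (6, 7, 9, 11, 12, 12, 13). It assumes the standard deviation with divisor N (`statistics.pstdev`); the function name is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    s = statistics.pstdev(values)  # divisor N, matching the RMS definition
    return s / mean * 100


data = [6, 7, 9, 11, 12, 12, 13]
cv = coefficient_of_variation(data)
print(f"CV = {cv:.2f}%")
```

Because the units cancel in the ratio, the result (about 25.07% here) can be compared across data sets measured in different units.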

C. Shape
A third important property of a set of data is its shape. A simple way to
describe the shape is to compare the mean and the median:
1) If these two measures are equal, the data are described as symmetric (zero
skewness).
Mean = Median
2) If the mean exceeds the median, the data may generally be described as
positively skewed (right skewed).
Mean > Median
3) If the median exceeds the mean, the data may generally be described as
negatively skewed (left skewed).
Mean < Median
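The comparison rule above can be sketched as a small function. The function name and the third data set are illustrative assumptions; the first two data sets are Group A and the set from Example 5.

```python
import statistics


def describe_shape(values):
    """Classify shape by comparing the mean with the median (exact comparison;
    with measured data a tolerance may be more appropriate)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right skewed (positive)"
    if mean < median:
        return "left skewed (negative)"
    return "symmetric (zero skewness)"


shape_a = describe_shape([2, 3, 4, 8])                  # mean 4.25 > median 3.5
shape_ex5 = describe_shape([6, 7, 9, 11, 12, 12, 13])   # mean 10 < median 11
shape_sym = describe_shape([1, 2, 3, 4, 5])             # hypothetical: mean = median = 3
print(shape_a, shape_ex5, shape_sym, sep="\n")
```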

Skewness:
Skewness means lack of symmetry. We study skewness to have an idea about the
shape of the curve which we can draw with the help of the given data.
Measures of skewness:
The important measures of skewness are:
(i) Karl Pearson’s coefficient of skewness.
(ii) Bowley’s coefficient of skewness.

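(i) Karl Pearson’s coefficient of skewness:

A relative measure of skewness, called Karl Pearson’s coefficient of
skewness, is given by:

$$Sk_P = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$$

When the mode is ill-defined, the coefficient can be determined by the formula:

$$Sk_P = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$$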
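A minimal sketch of both versions follows, assuming the usual forms of Pearson's coefficient: (Mean − Mode) / s, and 3(Mean − Median) / s when the mode is ill-defined. It uses the data set of Example 5 and the standard deviation with divisor N; the function names are illustrative.

```python
import statistics


def pearson_skewness_mode(values):
    """(mean - mode) / standard deviation."""
    s = statistics.pstdev(values)
    return (statistics.mean(values) - statistics.mode(values)) / s


def pearson_skewness_median(values):
    """3 * (mean - median) / standard deviation, for an ill-defined mode."""
    s = statistics.pstdev(values)
    return 3 * (statistics.mean(values) - statistics.median(values)) / s


data = [6, 7, 9, 11, 12, 12, 13]             # mean 10, median 11, mode 12
sk_mode = pearson_skewness_mode(data)
sk_median = pearson_skewness_median(data)
print(f"mode form: {sk_mode:.4f}, median form: {sk_median:.4f}")
```

Both values are negative for this data set, which agrees with the mean lying below the median (left skew). The two forms need not give the same number.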

(ii) Bowley’s coefficient of skewness:

In Karl Pearson’s method of measuring skewness, the whole of the series is
needed. Bowley has suggested a formula based on the relative position of the
quartiles. In a symmetrical distribution, the quartiles are equidistant from the
median; i.e. Median – Q1 = Q3 – Median. But in a skewed distribution, the quartiles
will not be equidistant from the median. Hence, Bowley has suggested the following
formula:

$$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$
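A short sketch of the calculation, assuming the usual form of Bowley's coefficient, (Q3 + Q1 − 2·Median) / (Q3 − Q1). The first call uses the quartiles of the set in Example 8 (Q1 = 3.5, Median = 6.5, Q3 = 9.5 under the p(n + 1) position rule); the second uses hypothetical quartiles chosen only to show a positive result.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's quartile-based skewness: (Q3 + Q1 - 2*Median) / (Q3 - Q1)."""
    if q3 == q1:
        raise ValueError("undefined when Q3 equals Q1")
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Quartiles of 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 12 (Example 8).
sk_example8 = bowley_skewness(3.5, 6.5, 9.5)

# Hypothetical quartiles: Q3 is much farther from the median than Q1 is.
sk_hypothetical = bowley_skewness(2, 3, 8)

print(sk_example8, round(sk_hypothetical, 4))
```

In the first case the quartiles are equidistant from the median, so the coefficient is 0. In the hypothetical case Q3 − Median exceeds Median − Q1 and the coefficient is positive. Since only the quartiles enter the formula, the result always lies between −1 and +1.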
