Measures of Dispersion
Dispersion or Variation
The degree to which numerical data tend to spread about an average value is called
the dispersion, or variation, of the data. For example, the following groups all have
the same mean, 4.25:
Group A: 2, 3, 4, 8
Group B: 1, 2, 4, 10
Group C: 0, 1, 5, 11
It is clear that Group B is more variable (shows a larger spread in the numbers)
than Group A, and Group C is more variable than Group B. But we need a
quantitative measure of this variability.
1. Range
The range is the simplest measure of variation to find. It is simply the highest value
minus the lowest value.
Range = Maximum Value - Minimum Value
Since the range only uses the largest and smallest values, it is greatly affected by
extreme values.
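As a quick check of the definition, the short sketch below computes the range of Groups A, B, and C from the example above. The helper name `value_range` is only an illustrative choice, not something from the text.

```python
# Groups from the example above; all three share the mean 4.25.
groups = {
    "A": [2, 3, 4, 8],
    "B": [1, 2, 4, 10],
    "C": [0, 1, 5, 11],
}

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

means = {name: sum(vals) / len(vals) for name, vals in groups.items()}
ranges = {name: value_range(vals) for name, vals in groups.items()}

print(means)   # {'A': 4.25, 'B': 4.25, 'C': 4.25}
print(ranges)  # {'A': 6, 'B': 9, 'C': 11}
```

The ranges grow from A to C, which matches the informal ordering of variability given above even though the means are identical.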
There are two ways of defining the range for grouped data.
First method
Range = class mark of highest class - class mark of lowest class
Second method
Range = upper class boundary of highest class - lower class boundary of lowest class
(The absolute value of a number is the number without the associated sign and is
indicated by two vertical lines placed around the number; thus |−4| = 4, |+3| = 3,
|6| = 6, and |−0.84| = 0.84.)
If X1, X2, . . . , XK occur with frequencies f1, f2, . . . , fK, respectively, the mean
deviation can be written as:
$$ \text{M.D.} = \frac{\sum_{i=1}^{K} f_i \, |X_i - \bar{X}|}{N} $$
where N = Σf is the total frequency.
This form is also useful for grouped data, where the Xi's represent class marks
and the fi's are the corresponding class frequencies.
Note that it would be more appropriate to use the terminology mean absolute
deviation than mean deviation.
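The mean (absolute) deviation can be sketched in a few lines. The function below is a minimal illustration, assuming the deviations are taken about the arithmetic mean and averaged over the total frequency N = Σf; the name `mean_deviation` and the optional `freqs` argument are illustrative choices rather than anything defined in the text.

```python
def mean_deviation(values, freqs=None):
    """Mean absolute deviation about the arithmetic mean.

    values: raw observations, or class marks when freqs is given.
    freqs:  optional frequencies; defaults to 1 for every value.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)                                          # N = sum of f
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

# Groups A, B and C from the start of the section.
md_a = mean_deviation([2, 3, 4, 8])
md_b = mean_deviation([1, 2, 4, 10])
md_c = mean_deviation([0, 1, 5, 11])
print(md_a, md_b, md_c)  # 1.875 2.875 3.75
```

Passing frequencies gives the same result as listing repeated values one by one, for example `mean_deviation([6, 7, 9, 11, 12, 13], [1, 1, 1, 1, 2, 1])` against the raw list `[6, 7, 9, 11, 12, 12, 13]`.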
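The standard deviation of a set of N numbers X1, X2, . . . , XN is denoted by s and
is defined by
$$ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}} $$
where x represents the deviations of each of the numbers Xi from the mean $\bar{X}$.
Thus, s is the root mean square (RMS) of the deviations from the mean, or, as it is
sometimes called, the root-mean-square deviation.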
Example 5. Compute the standard deviation for the set 6, 7, 9, 11, 12, 12, 13.
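The worked solution is not reproduced here, so the following is a minimal sketch of the computation, assuming the root-mean-square form described above (dividing by N rather than N - 1). Variable names are illustrative.

```python
import math

data = [6, 7, 9, 11, 12, 12, 13]
n = len(data)

mean = sum(data) / n                              # 70 / 7 = 10
squared_devs = [(x - mean) ** 2 for x in data]    # 16, 9, 1, 1, 4, 4, 9
s = math.sqrt(sum(squared_devs) / n)              # sqrt(44 / 7)

print(round(s, 3))  # 2.507
```

Python's standard library function `statistics.pstdev` uses the same divide-by-N convention, so it can serve as a cross-check; `statistics.stdev` divides by N - 1 and gives a slightly larger value.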
If X1, X2, . . . , XK occur with frequencies f1, f2, . . . , fK, respectively, the
standard deviation can be written as:
$$ s = \sqrt{\frac{\sum_{i=1}^{K} f_i \, (X_i - \bar{X})^2}{N}} $$
where N = Σf is the total frequency.
In this form, it is useful for grouped data.
• We must compute the deviation of each class midpoint (Xi) from the
mean ($\bar{X}$), which is (Xi − $\bar{X}$).
Example 4. For the following frequency distribution table, compute the standard
deviation.
The variance of a set of data is defined as the square of the standard deviation and
is thus given by s^2.
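The grouped-data calculation and its link to the variance can be sketched as follows. The class marks and frequencies here are hypothetical values chosen for round results, not the table from the example above, and the function name `grouped_variance` is an illustrative choice. The divide-by-N convention is assumed.

```python
import math

def grouped_variance(class_marks, freqs):
    """Variance of grouped data: sum of f * (X - mean)^2, divided by N = sum of f."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(class_marks, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(class_marks, freqs)) / n

# Hypothetical frequency table (class marks and class frequencies).
class_marks = [10, 20, 30, 40]
freqs = [3, 5, 5, 3]

variance = grouped_variance(class_marks, freqs)   # 1600 / 16
s = math.sqrt(variance)                           # standard deviation

print(variance, s)  # 100.0 10.0
```

The standard deviation is obtained simply as the square root of the variance, so one helper covers both quantities.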
5. Quartiles and Interquartile Range
If a set of data is arranged in order of magnitude, the median (the middle value or
the arithmetic mean of the two middle values) divides the set into two equal parts. By
extending this idea, we can think of the values that divide the set into four
equal parts. These values, denoted by Q1, Q2, and Q3, are called the first, second,
and third quartiles, respectively. The value Q2 (the 50th percentile) is equal to
the median. The 25th and 75th percentiles correspond to the first and third
quartiles, respectively.
• Q1 – First Quartile – 25% of the observations are smaller than Q1 and 75% of the
observations are larger than Q1.
• Q2 – Second Quartile – 50% of the observations are smaller than Q2 and 50% of
the observations are larger than Q2.
• Q3 – Third Quartile – 75% of the observations are smaller than Q3 and 25% of the
observations are larger than Q3.
NOTE:
• The quartiles, like the median, take either the value of one of the observations or
the average of two observations.
Example 8. Compute the quartiles and the interquartile range for the set 2, 7, 5, 0,
6, 10, 9, 10, 3, 12, 8, 4.
Solution:
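The worked solution is not reproduced in the text, so the sketch below is one way to carry it out. It assumes the position rule p(n + 1) on the sorted data: a whole-number position picks that observation, and any other position averages the two neighbouring observations. It also takes the interquartile range to be Q3 - Q1. The function name `quartile` is illustrative, and other quartile conventions can give slightly different values.

```python
def quartile(sorted_data, p):
    """Quartile at 1-based position p * (n + 1) in already sorted data.

    Assumes the position falls between 1 and n, which holds for
    p = 0.25, 0.5, 0.75 when there are at least 3 observations.
    """
    n = len(sorted_data)
    pos = p * (n + 1)
    lower = int(pos)                       # observation at or just below pos
    if pos == lower:
        return sorted_data[lower - 1]      # position is a whole number
    return (sorted_data[lower - 1] + sorted_data[lower]) / 2

data = sorted([2, 7, 5, 0, 6, 10, 9, 10, 3, 12, 8, 4])
# sorted: 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 12  (n = 12)

q1 = quartile(data, 0.25)   # position 3.25 -> (3 + 4) / 2
q2 = quartile(data, 0.50)   # position 6.5  -> (6 + 7) / 2
q3 = quartile(data, 0.75)   # position 9.75 -> (9 + 10) / 2
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.5 6.5 9.5 6.0
```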
For grouped data:
The sample size is n = 24. To find the first quartile, compute (0.25)(n + 1) =
(0.25)(25) = 6.25. The first quartile is therefore found by averaging the 6th and 7th
data points when the sample is arranged in increasing order. This yields
(105 + 126)/2 = 115.5. To find the third quartile, compute (0.75)(n + 1) =
(0.75)(25) = 18.75. We average the 18th and 19th data points to obtain
(242 + 245)/2 = 243.5.
C. Shape
A third important property of a set of data is its shape. A simple way to
describe the shape is to compare the mean and the median:
1) If these two measures are equal, the data are described as symmetric (zero
skewness).
Mean = Median
2) If the mean exceeds the median, the data are generally described as
positively skewed (right skewed).
Mean > Median
3) If the median exceeds the mean, the data are generally described as
negatively skewed (left skewed).
Mean < Median
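The three cases above can be turned into a small check. This is only a sketch of the mean-versus-median comparison; the function name `describe_shape` and the sample lists (apart from Group A) are illustrative assumptions. With real measurements an exact tie between mean and median is rare, so a tolerance may be more practical than strict equality.

```python
from statistics import mean, median

def describe_shape(values):
    """Classify shape by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right skewed (positive)"
    if m < med:
        return "left skewed (negative)"
    return "symmetric (zero skewness)"

print(describe_shape([2, 3, 4, 8]))       # mean 4.25 > median 3.5
print(describe_shape([1, 2, 3, 4, 5]))    # mean 3 = median 3
print(describe_shape([1, 7, 8, 9, 10]))   # mean 7 < median 8
```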
Skewness:
Skewness means lack of symmetry. We study skewness to have an idea about the
shape of the curve which we can draw with the help of the given data.
Measures of skewness:
The important measures of skewness are:
(i) Karl Pearson's coefficient of skewness.
(ii) Bowley's coefficient of skewness.
If the mode is ill-defined, the coefficient can be determined by the formula:
$$ \text{Skewness} = \frac{3\,(\text{Mean} - \text{Median})}{\text{Standard Deviation}} $$
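Karl Pearson's coefficient is usually written as (mean - mode) / standard deviation, with 3 (mean - median) / standard deviation as the common alternative when the mode is ill-defined. The sketch below assumes that median-based form and the divide-by-N standard deviation; the function name `pearson_skewness_median` is an illustrative choice, and the data are taken from Example 5.

```python
import math
from statistics import mean, median

def pearson_skewness_median(values):
    """Median-based Pearson coefficient: 3 * (mean - median) / s,
    where s is the root-mean-square deviation from the mean."""
    m = mean(values)
    s = math.sqrt(sum((x - m) ** 2 for x in values) / len(values))
    return 3 * (m - median(values)) / s

data = [6, 7, 9, 11, 12, 12, 13]    # mean 10, median 11
sk = pearson_skewness_median(data)
print(round(sk, 3))  # -1.197
```

The negative value agrees with the mean being smaller than the median, that is, a left-skewed set.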