Chapter 4 Dispersion of Data


Dispersion is the spread of data about its average value. The three histograms below all have the
same centre, but from left to right the dispersion of values increases.

After measures of the centre of data, measures of dispersion are the most frequently quoted
descriptive statistics. We will look at three:
• range
• inter-quartile range
• standard deviation

The range is the arithmetic difference between the largest and smallest data values.
When the data set 5.6 4.1 7.2 5.4 6.0 5.5 is ordered, it
becomes 4.1 5.4 5.5 5.6 6.0 7.2
The range is therefore 7.2 − 4.1 = 3.1

This statistic obviously measures the dispersion of data, but has the disadvantage that it only
considers the two outside measurements, which could be outliers.
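
As a minimal sketch, the range of the six values above can be computed in Python as follows. The function name `data_range` and the extra value 15.0 are illustrative assumptions, not part of the data set; the second call shows how a single extreme value dominates the statistic.

```python
def data_range(values):
    """Return the difference between the largest and smallest value."""
    if not values:
        raise ValueError("data_range needs at least one value")
    return max(values) - min(values)


data = [5.6, 4.1, 7.2, 5.4, 6.0, 5.5]
print(round(data_range(data), 1))          # 3.1

# Only the two outermost values are used, so one extreme value
# changes the range completely. 15.0 is a made-up value.
with_outlier = data + [15.0]
print(round(data_range(with_outlier), 1))  # 10.9
```

The results are rounded only for display, because floating-point subtraction can leave a tiny error in the last decimal places.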

The inter-quartile range is the arithmetic difference between the 75th percentile (or 3rd
quartile) and the 25th percentile (or 1st quartile). The data set above is rather too small for
percentiles to be calculated reliably, but the first quartile is about 5.4 (the middle of the three
lowest values) and the third quartile about 6.0 (the middle of the three highest values), so
the inter-quartile range is about 6.0 − 5.4 = 0.6.
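
The quartile estimates above are consistent with taking the median of the lower half and of the upper half of the ordered data. The sketch below assumes that convention; the function name `quartiles_by_halves` is hypothetical. Spreadsheets and statistical packages often interpolate between values instead, and can report somewhat different quartiles for so small a data set.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    With an odd number of values the middle value is left out of both
    halves. This is only one of several quartile conventions.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # smallest half of the data
    upper = ordered[-half:]   # largest half of the data
    return median(lower), median(upper)


data = [5.6, 4.1, 7.2, 5.4, 6.0, 5.5]
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, q3, round(iqr, 1))  # 5.4 6.0 0.6
```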

The standard deviation is calculated using the formula

$$\text{Standard Deviation} = \sqrt{\frac{1}{n-1}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right)}$$

Alternative formula for ungrouped data:

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}{n-1}}$$

where, as before, x represents the data set, $\bar{x}$ is its mean, and n is the number of data values in the set.
In most instances we would use a statistical calculator, spreadsheet, or statistical computer
program to calculate the standard deviation.
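
For readers who want to see the arithmetic, here is a minimal sketch of the sample standard deviation in its sum-of-squares form, s = sqrt((Σx² − (Σx)²/n) / (n − 1)). The function name `sample_sd` is an assumption for illustration, and the data are the six values used for the range.

```python
import math


def sample_sd(values):
    """Sample standard deviation from the sum-of-squares formula."""
    n = len(values)
    if n < 2:
        raise ValueError("sample_sd needs at least two values")
    sum_x = sum(values)                   # sum of the values
    sum_x2 = sum(x * x for x in values)   # sum of the squared values
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return math.sqrt(variance)


data = [5.6, 4.1, 7.2, 5.4, 6.0, 5.5]
print(round(sample_sd(data), 3))  # 1.001
```

This form is convenient by hand, but it subtracts two large, nearly equal numbers, so for real work a library routine is usually the safer choice.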

There are actually two formulas for standard deviation. The one above, with an (n-1) in the
denominator, is called the sample standard deviation, and usually indicated by the letter s. A
second formula, where the (n-1) in the denominator is replaced by n, is called the population
standard deviation, and usually indicated by the Greek letter σ (sigma). Statistical
calculators give both calculations. We will not explain the distinction between these two
calculations - you can use either, as long as you indicate which you are using.

We will assume that you always have a calculator, spreadsheet, or statistical computer package
available and do not have to use the formula. If you use your calculator you will find that the set
of values
5.6 4.1 7.2 5.4 6.0 5.5
has the standard deviation s = 1.001 (or σ = 0.914).
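
If you use Python rather than a calculator, its standard `statistics` module gives both calculations: `stdev` divides by n − 1 (sample) and `pstdev` divides by n (population). The sketch below assumes the data set is the one introduced with the range, with smallest value 4.1.

```python
from statistics import pstdev, stdev

data = [5.6, 4.1, 7.2, 5.4, 6.0, 5.5]

s = stdev(data)        # sample standard deviation, divisor n - 1
sigma = pstdev(data)   # population standard deviation, divisor n

print(round(s, 3), round(sigma, 3))  # 1.001 0.914
```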

• The inter-quartile range must be less than the range. Why?
• The population standard deviation must be less than the sample standard
deviation. Why?
• Unlike the mean, median and mode, which were often very similar, the
three measures of dispersion are quite different. Statisticians get a “feel” for each,
and what it means for any set of data. But as a rule of thumb, for a typical set of data:
  - the standard deviation is usually about one quarter of the range
  - the inter-quartile range is usually a bit less than one half of the range
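
The rule of thumb can be checked against any data set by dividing each statistic by the range. The sketch below is a rough illustration only: the function name `dispersion_ratios` is hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data. For the six-value example the ratios come out at about 0.32 and 0.19, which shows how loosely the rule holds for a very small data set.

```python
from statistics import median, stdev


def dispersion_ratios(values):
    """Return (sd / range, IQR / range) for a data set."""
    ordered = sorted(values)
    spread = ordered[-1] - ordered[0]
    if len(ordered) < 2 or spread == 0:
        raise ValueError("need at least two distinct values")
    half = len(ordered) // 2
    # Quartiles taken as medians of the lower and upper halves (an assumption).
    iqr = median(ordered[-half:]) - median(ordered[:half])
    return stdev(ordered) / spread, iqr / spread


data = [5.6, 4.1, 7.2, 5.4, 6.0, 5.5]
sd_ratio, iqr_ratio = dispersion_ratios(data)
print(round(sd_ratio, 2), round(iqr_ratio, 2))  # 0.32 0.19
```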
Calculating a range, inter-quartile range or standard deviation is pointless if you do not
understand and cannot interpret the information that the statistic conveys. Below is a
histogram that indicates these three statistics. You should be able to justify the range and
inter-quartile range. For the standard deviation, note our earlier rule of thumb that it is
approximately a quarter of the range. It is often indicated by a distance either side of the mean.

