
DESCRIPTIVE STATISTICS (Chapter 17)

MEASURING THE CENTRE OF DATA

We can get a better understanding of a data set if we can locate the middle or centre of the data and get an indication of its spread. Knowing one of these without the other is often of little use. There are three statistics that are used to measure the centre of a data set. These are: the mode, the mean and the median.

THE MODE
For discrete numerical data, the mode is the most frequently occurring value in the data set. For continuous numerical data, we cannot talk about a mode in this way because no two data values will be exactly equal. Instead we talk about a modal class, which is the class with the highest frequency.
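As a minimal sketch of these two definitions, the Python below finds the mode of discrete data and the modal class of continuous data. The function names, the class width and the starting value are illustrative assumptions, not part of the text.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (there may be more than one)."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


def modal_class(values, class_width, start=0):
    """Group data into classes [lower, lower + class_width) and return the
    class with the highest frequency as a (lower, upper) pair.

    Assumption: classes begin at `start`. If two classes tie, the one
    met first in the data is returned.
    """
    counts = Counter((v - start) // class_width for v in values)
    index, _ = max(counts.items(), key=lambda item: item[1])
    lower = start + index * class_width
    return (lower, lower + class_width)


# Hypothetical data, chosen only to illustrate the definitions.
shoe_sizes = [7, 8, 8, 9, 9, 9, 10]
heights_cm = [151.2, 153.8, 156.1, 157.4, 158.9, 162.3, 166.0]

print(modes(shoe_sizes))
print(modal_class(heights_cm, class_width=5, start=150))
```

Note that the modal class depends on how the classes are chosen, so a different class width can give a different answer.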

THE MEAN
The mean of a data set is the statistical name for the arithmetic average.

$$\text{mean} = \frac{\text{sum of all data values}}{\text{the number of data values}}$$

The mean gives us a single number which indicates a centre of the data set. It is usually not a member of the data set. For example, a mean test mark of 73% tells us that there are several marks below 73% and several above it. 73% is at the centre, but it does not necessarily mean that one of the students scored 73%.

If we let
x be a data value,
n be the number of data values in the sample or population,
$\bar{x}$ represent the mean of a sample, and
$\mu$ (read 'mu') represent the mean of a population,

then the mean is either: $\bar{x} = \dfrac{\sum x}{n}$ or $\mu = \dfrac{\sum x}{n}$.
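The calculation can be sketched in a few lines of Python. The marks below are hypothetical, chosen so that the mean is 73 even though no student scored 73.

```python
def mean(values):
    """Sum of all data values divided by the number of data values."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n


# Hypothetical test marks (in %), for illustration only.
marks = [65, 70, 72, 78, 80]

print(mean(marks))           # 73.0
print(mean(marks) in marks)  # False: the mean is not one of the data values
```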

THE MEDIAN
The median is the middle value of an ordered data set. An ordered data set is obtained by listing the data, usually from smallest to largest. The median splits the data in halves. Half of the data are less than or equal to the median and half are greater than or equal to it. For example, if the median mark for a test is 73% then you know that half the class scored less than or equal to 73% and half scored greater than or equal to 73%. Note: For an odd number of data, the median is one of the data. For an even number of data, the median is the average of the two middle values and may not be one of the original data.

Here is a rule for finding the median:

If there are n data values, find $\dfrac{n+1}{2}$. The median is the $\left(\dfrac{n+1}{2}\right)$th data value.

For example:

If n = 13, $\dfrac{13+1}{2} = 7$, so the median = 7th ordered data value.

If n = 14, $\dfrac{14+1}{2} = 7.5$, so the median = average of the 7th and 8th ordered data values.
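The rule translates directly into code. The sketch below is one possible implementation, with an illustrative function name. It orders the data, computes the position (n + 1)/2, and averages the two neighbouring values when that position is not a whole number.

```python
def median(values):
    """Median by the (n + 1)/2 rule, counting positions from 1."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    position = (n + 1) / 2
    if position.is_integer():
        # Odd n: the median is the data value at that position.
        return ordered[int(position) - 1]
    # Even n: average the values on either side of the position.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


# Illustrative data: the whole numbers 1..13 and 1..14.
print(median(range(1, 14)))  # n = 13, the 7th ordered value: 7
print(median(range(1, 15)))  # n = 14, average of the 7th and 8th values: 7.5
```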

THE MERITS OF THE MEAN AND MEDIAN AS MEASURES OF CENTRE


The median is the only measure of centre that will locate the true centre regardless of the data set's features. It is unaffected by the presence of extreme values. It is called a resistant measure of centre.

The mean is an accurate measure of centre if the distribution is symmetrical or approximately symmetrical. If it is not, then unbalanced high or low values will drag the mean toward them and cause it to be an inaccurate measure of the centre. The mean is called a non-resistant measure of centre because it is influenced by all data values in the set. If the mean is considered inaccurate, it should not be used in discussion.
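A small experiment illustrates the difference in resistance. The sketch below uses Python's standard `statistics` module on hypothetical data in which one value is replaced by an extreme value.

```python
from statistics import mean, median

# Hypothetical data set, and the same set with its largest value made extreme.
typical = [4, 5, 5, 6, 7]
with_outlier = [4, 5, 5, 6, 70]

print(mean(typical), median(typical))            # 5.4 5
print(mean(with_outlier), median(with_outlier))  # 18 5
```

The single extreme value drags the mean from 5.4 to 18, above every other value in the set, while the median stays at 5.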

THE RELATIONSHIP BETWEEN THE MEAN AND THE MEDIAN FOR DIFFERENT DISTRIBUTIONS
For distributions that are symmetric, the mean and median will be approximately equal.

[Figure: two symmetric distributions, each with the mean and median marked together at the centre.]

If the data set has symmetry, both the mean and the median should accurately measure the centre of the distribution. If the data set is not symmetric, it may be positively or negatively skewed:

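[Figure: a positively skewed distribution, with the mode, median and mean marked in that order from left to right, and a negatively skewed distribution, with the mean, median and mode marked in that order from left to right.]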

Notice that the mean and median are clearly different for these skewed distributions.
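To see this numerically, the sketch below compares the mean and median of three hypothetical data sets. The helper `skew_direction` and its `tolerance` parameter are illustrative assumptions. Comparing the two measures is only a rough indicator of skew, not a formal test.

```python
from statistics import mean, median


def skew_direction(values, tolerance=0.0):
    """Rough indicator only: report which side of the median the mean lies on."""
    gap = mean(values) - median(values)
    if gap > tolerance:
        return "positive"   # mean dragged toward high values
    if gap < -tolerance:
        return "negative"   # mean dragged toward low values
    return "symmetric"


# Hypothetical data sets, for illustration only.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
left_tailed = [15 - x for x in right_tailed]  # mirror image of the first set
balanced = [1, 2, 3, 4, 5]

for data in (right_tailed, left_tailed, balanced):
    print(mean(data), median(data), skew_direction(data))
```

For the right-tailed set the mean (4.5) lies above the median (3), and for its mirror image the mean (10.5) lies below the median (12), matching the ordering in the skewed distributions described above.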
