
PNG University of Technology

Mathematics & Computer Science Department

MA 339
LECTURE 4

Measures of Central
Tendency and Dispersions

Part 1
Statistical Measures for
ungrouped data
The various measures of central
tendency and dispersion for
ungrouped data will be developed
with reference to Example 1 below.

Example 1.
Table 1 lists the number of employees for a sample of 8 major
Australian building societies, drawn from the 1992 KPMG Peat
Marwick financial database.

Table 1 Number of employees for a sample of 8 Australian building societies.

Building Society (BS)          Number of Employees
Base and Equitable BS (Tas)     46
The Co-operative BS (SA)       517
Home BS (WA)                   272
Hume BS Limited (NSW)           50
Ioof BS (Vic)                  146
Mutual Community BS (SA)        34
The Rock BS (Qld)               46
Suncorp BS (Qld)               631
Let us examine the central tendency and
dispersion of these numbers of employees.

1.1 Measures of Central Tendency


The measures of central tendency introduced here include the mean, median, mode and
quartiles.

[Take note that for this first part, we are studying measures of central tendency for UNGROUPED
data.]
The Mean
The mean or average is one of the most useful
measures we can derive for sample data,
providing one indication of the “central”
magnitude of values.

The mean is obtained using the formula:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

In terms of the sample data of Example 1, the
mean is calculated to be:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{46 + 517 + 272 + \dots + 631}{8} = 217.75 \text{ employees}$$

or

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{8}(1742) = 217.75 \text{ employees}$$

The Median
The median (Me) is simply the central value by
virtue of its position. To obtain this “middle
value” we begin by rearranging the observations
of Table 1 in ascending order, as in Figure 1 below:

34   46   |   46   50   |   146   272   |   517   631
    1st Quartile  2nd Quartile    3rd Quartile

Figure 1 Ascending array of number of building society employees

With n = 8 observations there is no single middle value, so the median is the average of the 4th and 5th values: Me = (50 + 146)/2 = 98 employees.
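
The sort-then-take-the-middle procedure can be sketched in Python as follows. The sketch assumes an even number of observations, as in Example 1, and cross-checks the result against the median function in Python's standard statistics module; the variable names are illustrative only.

```python
from statistics import median

employees = [46, 517, 272, 50, 146, 34, 46, 631]

ordered = sorted(employees)    # ascending array, as in Figure 1
n = len(ordered)

# For even n the two middle values sit at positions n/2 and n/2 + 1
# (indices n//2 - 1 and n//2 when counting from zero).
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
me = sum(middle_pair) / 2      # average of the two middle values

print(ordered)
print(middle_pair, me, median(employees))
```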

The Quartile and allied measures
Like the median, quartiles are measures of location
by virtue of position. However, instead of
dividing the ordered data into halves, they divide it into quarters,
hence the term quartiles.

• 1st Quartile ($Q_1$) has one quarter of the values below it and three quarters of the values above it.
• 2nd Quartile ($Q_2$) (which is in fact the median) has half of the values above it and half of the values below it.
• 3rd Quartile ($Q_3$) has one quarter of the values above it and three quarters of the values below it.

For sample data, the values of $Q_1$ and $Q_3$ are approximated as

$$Q_1 = \frac{46 + 46}{2} = 46 \text{ employees}$$

$$Q_3 = \frac{272 + 517}{2} = 394.5 \text{ employees}$$
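
The quartile values above are consistent with taking the median of the lower half and of the upper half of the ordered data. The sketch below implements that reading; the function name is hypothetical and the median-of-halves rule is an assumption inferred from the worked figures. Library routines such as statistics.quantiles interpolate differently and may return other values for the same data.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (assumed here)."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two observations")
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median position
    upper = ordered[-half:]    # values above the median position
    return median(lower), median(ordered), median(upper)

employees = [46, 517, 272, 50, 146, 34, 46, 631]
q1, q2, q3 = quartiles_by_halves(employees)
print(q1, q2, q3)
```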

The Mode
The mode (Mo) is the most common or typical
value of the variable. Since the only value to
occur more than once in our example is 46, the
modal number of employees is 46.

The mode provides an alternative measure of
location to the mean and median. Like the
median, it is less influenced by extreme values
than is the mean.
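
Finding the mode amounts to counting how often each value occurs. A minimal Python sketch, with illustrative variable names, is given below; it returns a list so that data with several equally common values would also be handled.

```python
from collections import Counter

employees = [46, 517, 272, 50, 146, 34, 46, 631]

counts = Counter(employees)        # value -> number of occurrences
highest = max(counts.values())     # largest frequency observed
modes = [value for value, count in counts.items() if count == highest]

print(modes, highest)              # prints: [46] 2
```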

1.2 Measures of dispersion
The measures of dispersion introduced here
include, among others, the range, variance,
standard deviation and coefficient of variation.

[Take note that for this first part, we are studying
measures of dispersion for UNGROUPED
data.]

The Range
The range of a set of sample observations is
simply the difference between the largest and
the smallest values. Referring to Figure 1, we find
that

$$\text{Range} = 631 - 34 = 597 \text{ employees}$$

In other words, the difference between the
largest and the smallest number of employees
for the eight selected building societies is nearly
600.
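
For completeness, the same calculation in Python (an illustrative sketch; the variable names are not from the notes):

```python
employees = [46, 517, 272, 50, 146, 34, 46, 631]

smallest, largest = min(employees), max(employees)
data_range = largest - smallest    # largest value minus smallest value

print(smallest, largest, data_range)   # prints: 34 631 597
```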

Interquartile range and quartile deviation
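The interquartile range (IQR) is the difference between the
1st and 3rd quartiles ($Q_1$ and $Q_3$ respectively):

$$IQR = Q_3 - Q_1$$

which, for the data in Example 1, becomes

$$IQR = 394.5 - 46 = 348.5 \text{ employees}$$

Quartile deviation (QD) is a related measure and simply
divides the interquartile range by 2:

$$QD = \frac{Q_3 - Q_1}{2}$$

which, for the data in Example 1, becomes

$$QD = \frac{348.5}{2} = 174.25 \text{ employees}$$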
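
Both measures follow directly from the quartiles. The short sketch below starts from the quartile values obtained earlier for Example 1 (46 and 394.5); the variable names are illustrative.

```python
# Quartiles for Example 1, as computed in the quartile section of these notes.
q1 = 46
q3 = 394.5

iqr = q3 - q1    # interquartile range: spread of the middle half of the data
qd = iqr / 2     # quartile deviation: half of the interquartile range

print(iqr, qd)   # prints: 348.5 174.25
```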

The Mean absolute Deviation
One logical way of preventing the
deviations about the mean from summing to zero is to
use the absolute value of each deviation (i.e.
simply disregard its sign). This is precisely the
approach used to compute the mean absolute
deviation (MD), which is defined as

$$MD = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$

where the vertical bars indicate absolute values.

For the number of employees, we calculate the
mean absolute deviation to be

$$MD = \frac{|46 - 217.75| + |517 - 217.75| + \dots + |631 - 217.75|}{8} = \frac{1533.5}{8} \approx 191.69 \text{ employees}$$


This indicates that the average difference
between the mean number of employees
and the actual number for individual
building societies is about 192.
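
The figure of about 192 can be reproduced with the sketch below. It assumes the divisor is n = 8, which is the choice that matches the quoted result, and it also shows why absolute values are needed: the signed deviations about the mean add up to zero.

```python
employees = [46, 517, 272, 50, 146, 34, 46, 631]

n = len(employees)
mean = sum(employees) / n

signed_sum = sum(x - mean for x in employees)     # deviations cancel out
abs_devs = [abs(x - mean) for x in employees]     # drop the sign of each deviation
md = sum(abs_devs) / n                            # mean absolute deviation (divisor n assumed)

print(signed_sum, sum(abs_devs), md)              # prints: 0.0 1533.5 191.6875
```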

The Variance
Another way of preventing deviations about the
mean from cancelling each other is to square
them, which is the approach adopted for the
variance (represented by the symbol $s^2$). As we
shall see, this is the most important measure of
dispersion.
By definition;

For the data of Example 1, the variance is
computed as below.


Fortunately, the variance can also be calculated using the
following much simpler computational formula, which
circumvents the task of calculating the deviations
about $\bar{x}$. This gives the formula;
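
The agreement between the deviation-based calculation and a computational shortcut can be checked numerically. The sketch below is illustrative only. The shortcut used, the sum of squares minus (sum of x) squared over n, is the standard identity for the sum of squared deviations and is assumed to be the one intended. The divisor of the variance formula is not visible in these notes, so both common conventions are computed: n - 1 (as in Python's statistics.variance) and n (statistics.pvariance).

```python
from statistics import pvariance, variance

employees = [46, 517, 272, 50, 146, 34, 46, 631]
n = len(employees)
mean = sum(employees) / n

# Definitional route: square each deviation about the mean, then add them up.
ss_definitional = sum((x - mean) ** 2 for x in employees)

# Computational route: only sum(x) and sum(x^2) are needed, no deviations.
ss_shortcut = sum(x * x for x in employees) - sum(employees) ** 2 / n

# ASSUMPTION: the divisor is not shown in the notes, so both conventions appear.
var_n_minus_1 = ss_shortcut / (n - 1)
var_n = ss_shortcut / n

print(ss_definitional, ss_shortcut)
print(var_n_minus_1, variance(employees))
print(var_n, pvariance(employees))
```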

The Standard Deviation
The standard deviation (s) is simply the square
root of the variance.
By definition;

Where the ranges of summation from 1 to n are
implicit. For the 8 selected building societies, $s$ is obtained as the square root of the variance for Example 1.
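
A numerical sketch of the standard deviation for the 8 building societies follows. As the divisor used for the variance is not visible in these notes, the sketch reports both conventions from Python's statistics module: stdev (divisor n - 1) and pstdev (divisor n). Which of the two the lecture intends is an assumption left open here.

```python
import math
from statistics import pstdev, stdev

employees = [46, 517, 272, 50, 146, 34, 46, 631]
n = len(employees)
mean = sum(employees) / n

# Sum of squared deviations about the mean.
ss = sum((x - mean) ** 2 for x in employees)

# The standard deviation is the square root of the variance.
s_n_minus_1 = math.sqrt(ss / (n - 1))   # ASSUMPTION: divisor n - 1
s_n = math.sqrt(ss / n)                 # ASSUMPTION: divisor n

print(round(s_n_minus_1, 2), round(stdev(employees), 2))
print(round(s_n, 2), round(pstdev(employees), 2))
```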

Measures of Central
Tendency and Dispersions

Part 2
Statistical Measures for
Grouped data
Constructing a Frequency Distribution
Using Grouped Quantitative Data
• Ideally, the number of classes in a frequency
distribution should be between 4 and 20;
– Some data sets, particularly those with continuous
data, require several values to be grouped
together in a single class.
– This grouping prevents having too many classes in
the frequency distribution, which can make it
difficult to detect patterns.

Number of Classes
• One method to determine the number of classes
in a frequency distribution is the rule;
$2^k > n$
where k = Number of classes
n = Number of data points
• Find the lowest value of k that satisfies the rule
• Suppose n = 50
$2^5 = 32 < 50$ (k = 5 is too small)
$2^6 = 64 > 50$ (k = 6 is a good choice)
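
The search for the lowest k can be written as a short loop. The function name below is hypothetical; the sketch simply increases k until 2 to the power k exceeds n.

```python
def number_of_classes(n):
    """Smallest k with 2**k > n (n = number of data points)."""
    if n < 1:
        raise ValueError("n must be a positive number of data points")
    k = 1
    while 2 ** k <= n:    # rule not yet satisfied, try the next k
        k += 1
    return k

print(number_of_classes(50))   # prints: 6
```

Under the same rule, a data set of 40 points would also give k = 6, since 32 < 40 < 64.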
Class Width
• Once k is known, the width of each class can
be found;
– The width is the range of numbers to put into each
class

$$\text{Estimated class width} = \frac{\text{Maximum data value} - \text{Minimum data value}}{k}$$

– Round this estimate to a useful whole number
that makes the frequency distribution more
readable
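
As an illustration, the estimate can be computed for the weekly earnings example given later in these notes, where the smallest value is 30, the largest is 170 and 6 classes are requested. The function name is hypothetical, and rounding up to the next multiple of 10 is only one possible choice of a "useful whole number", picked here because the data are recorded to the nearest K10.

```python
import math

def estimated_class_width(minimum, maximum, k):
    """(maximum data value - minimum data value) / k."""
    return (maximum - minimum) / k

# Values taken from the weekly earnings example (40 people, 6 classes).
width = estimated_class_width(30, 170, 6)

# ASSUMPTION: round up to the next multiple of 10 for readability.
rounded_width = math.ceil(width / 10) * 10

print(round(width, 2), rounded_width)   # prints: 23.33 30
```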
Class Width
• There is no one correct answer for the class
width
• The goal is to create a histogram to clearly and
usefully show the pattern in the data
• Often there is more than one acceptable way
to accomplish this

Presentation of grouped data
How to present data in a grouped frequency distribution.

We can look at this example to understand how to do so.

The amount of money earned weekly by 40 people working
part-time in a factory, correct to the nearest K10, is shown
below.

 80   90   70  110   90  160  110   80
140   30   90   50  100  110   60  100
 80   90  110   80  100   90  120   70
130  170   80  120  100  110   40  110
 50  100  110   90  100   70  110   80

(a) Form a frequency distribution having 6 classes for these data.
(b) Construct a histogram for the data.
Make sure the following is satisfied;

Class        Frequency
20 – 40       2
50 – 70       6
80 – 90      12
100 – 110    14
120 – 140     4
150 – 170     2
Total        40
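
The frequencies in this table can be verified by tallying the 40 earnings into the six classes. The sketch below is illustrative: the class limits are copied from the table, both limits are treated as inclusive, and the row of asterisks is only a rough text stand-in for the histogram asked for in part (b).

```python
# Weekly earnings (to the nearest K10) of the 40 part-time workers.
earnings = [
    80, 90, 70, 110, 90, 160, 110, 80,
    140, 30, 90, 50, 100, 110, 60, 100,
    80, 90, 110, 80, 100, 90, 120, 70,
    130, 170, 80, 120, 100, 110, 40, 110,
    50, 100, 110, 90, 100, 70, 110, 80,
]

# Class limits as listed in the table (both ends inclusive).
classes = [(20, 40), (50, 70), (80, 90), (100, 110), (120, 140), (150, 170)]

# Count how many observations fall inside each class.
frequencies = [
    sum(1 for x in earnings if low <= x <= high)
    for low, high in classes
]

for (low, high), f in zip(classes, frequencies):
    print(f"{low:>3} - {high:<3} {f:>2} {'*' * f}")
print("Total", sum(frequencies))
```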
