Measures of Central Tendency: Mean, Median & Mode




Numerical Data Properties & Measures

Central Tendency: Mean, Median, Mode, Midrange, Midhinge
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coeff. of Variation
Shape: Skew, Kurtosis
Objective of averaging
To get one single value that describes the characteristics of the
entire data.
To facilitate comparison.
Characteristics of a good average
It should be easy to understand
It should be simple to compute
It should be based on all the observations
It should be capable of further algebraic treatment.
It should have sampling stability.
It should not be unduly affected by the presence of
extreme values.
Central Tendency- Arithmetic Mean

Mean

Arithmetic Average or Mean

Geometric Mean

Harmonic Mean
Mean
 Measure of central tendency
 Most common measure
 Acts as ‘balance point’
 Affected by extreme values (‘outliers’)
 Formula (sample mean):

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Mean Example
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + X_3 + X_4 + X_5 + X_6}{6} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = \frac{49.8}{6} = 8.30$$
Arithmetic Mean
(continued)

 Affected by extreme values (outliers)

Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
Mean for Discrete Distribution

If a variable X takes values X1, X2, X3, ..., Xn with corresponding frequencies f1, f2, f3, ..., fn, then the mean is defined as

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i f_i}{f_1 + f_2 + \cdots + f_n} = \frac{X_1 f_1 + X_2 f_2 + \cdots + X_n f_n}{f_1 + f_2 + \cdots + f_n}$$
Example….
Calculate the arithmetic mean of the following data
Weight in kg (x)   No. of students (f)   fx
18                 4                     72
19                 6                     114
20                 5                     100
21                 3                     63
23                 4                     92
25                 3                     75
Total              25                    516
Example (cont.)

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i f_i}{f_1 + f_2 + \cdots + f_n} = \frac{516}{25} = 20.64$$

Average weight is 20.64 kg.
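The frequency-weighted mean from this example can be sketched in Python as follows. The helper name `frequency_mean` is hypothetical, and the sketch assumes the values and frequencies are given as two lists of equal length.

```python
# Mean of a discrete frequency distribution: sum(x * f) / sum(f).
def frequency_mean(values, freqs):
    total_fx = sum(x * f for x, f in zip(values, freqs))
    total_f = sum(freqs)
    return total_fx / total_f

# Weight (kg) and number of students from the example table.
weights = [18, 19, 20, 21, 23, 25]
students = [4, 6, 5, 3, 4, 3]

print(frequency_mean(weights, students))  # 516 / 25
```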
Mean for Continuous Distribution
A continuous distribution is treated like a discrete one by letting the midpoint of each class interval represent that class.
In the formula for the mean, X then stands for the class midpoints.
Example…
The following are the figures of profit earned by 1,400
companies during 2003-04. Calculate the average profit.
Profits (Rs. lakhs)   No. of companies (f)
200-400               500
400-600               300
600-800               280
800-1000              120
1000-1200             100
1200-1400             80
1400-1600             20
Example (cont.)

Profits (Rs. lakhs)   Midpoint (X)   No. of companies (f)   fX
200-400               300            500                    150000
400-600               500            300                    150000
600-800               700            280                    196000
800-1000              900            120                    108000
1000-1200             1100           100                    110000
1200-1400             1300           80                     104000
1400-1600             1500           20                     30000
Total                                1,400                  8,48,000
Example (cont.)

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i f_i}{f_1 + f_2 + \cdots + f_n} = \frac{8{,}48{,}000}{1{,}400} = 605.71$$

The average profit is Rs. 605.71 lakhs.
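A minimal Python sketch of the grouped-data calculation, assuming each class is represented by its midpoint as described above. The function name `grouped_mean` and the list-of-tuples layout are illustrative choices.

```python
# Mean of a grouped (continuous) distribution using class midpoints.
def grouped_mean(class_limits, freqs):
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    return sum(x * f for x, f in zip(midpoints, freqs)) / sum(freqs)

# Profit classes (Rs. lakhs) and number of companies from the example.
profit_classes = [(200, 400), (400, 600), (600, 800), (800, 1000),
                  (1000, 1200), (1200, 1400), (1400, 1600)]
companies = [500, 300, 280, 120, 100, 80, 20]

print(round(grouped_mean(profit_classes, companies), 2))
```

Because every observation in a class is replaced by the midpoint, the result is an approximation to the mean of the underlying raw data.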


Properties of Mean
1. The sum of the deviations of all the values of x from their mean is zero.
2. The product of the mean and the number of items gives the total of the items.
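Both properties follow directly from the definition of the sample mean. A short derivation, writing the n observations as X_1, ..., X_n and their mean as \bar{X}:

$$n\bar{X} = n \cdot \frac{\sum_{i=1}^{n} X_i}{n} = \sum_{i=1}^{n} X_i$$

$$\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0$$

For instance, for the data 1, 2, 3, 4, 5 with mean 3, the deviations are -2, -1, 0, 1, 2, which sum to zero, and 3 × 5 = 15 is the total.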
Geometric Mean

If a variable X takes values X1, X2, X3, ..., Xn, the geometric mean of X is defined as

$$\text{G.M.} = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n}$$
Example
If a new manager takes over a novelty firm and, through his efforts, sales double in the first year relative to the previous year, triple in the second year and quadruple in the third, what is the average rate of increase?
The G.M. of 2, 3 and 4 gives the average rate of increase: (2 × 3 × 4)^(1/3) = 24^(1/3) ≈ 2.88, i.e. on average sales multiplied by about 2.88 each year.
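The geometric mean of the three growth factors can be sketched in Python as below. The function name `geometric_mean` is chosen for illustration; it assumes all values are positive and uses `math.prod`, available from Python 3.8.

```python
import math

# Geometric mean: n-th root of the product of n positive values.
def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))

# Yearly growth factors from the example: doubled, tripled, quadrupled.
growth_factors = [2, 3, 4]
gm = geometric_mean(growth_factors)

print(round(gm, 2))
```

Applying the factor `gm` three years in a row gives the same overall growth as 2 × 3 × 4 = 24, which is why the geometric mean, rather than the arithmetic mean of 3, is the appropriate average here.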
Median
 Measure of central tendency
 Middle value in ordered sequence
 If odd n, middle value of sequence
 If even n, average of 2 middle values
 Not affected by extreme values
 Position of median in sequence
Finding the Median
 The location of the median:

$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$

 If the number of values is odd, the median is the middle number
 If the number of values is even, the median is the average of the two middle numbers
 Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data
Median Example
Odd-Sized Sample
Raw data: 24.1 22.6 21.5 23.7 22.6
Ordered: 21.5 22.6 22.6 23.7 24.1
Position: 1 2 3 4 5

$$\text{Positioning point} = \frac{N+1}{2} = \frac{5+1}{2} = 3.0$$

Median = 22.6
Median Example
Even-Sized Sample
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
Position: 1 2 3 4 5 6

$$\text{Positioning point} = \frac{n+1}{2} = \frac{6+1}{2} = 3.5$$

$$\text{Median} = \frac{7.7 + 8.9}{2} = 8.30$$
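The odd and even cases from the two median examples can be sketched in one small Python function. The name `median_of` is illustrative; the standard library's `statistics.median` implements the same rule.

```python
# Median: middle value of the ordered data, or the average of the two
# middle values when the number of observations is even.
def median_of(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # position (n + 1) / 2, zero-based index mid
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [24.1, 22.6, 21.5, 23.7, 22.6]
even_sample = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

print(median_of(odd_sample))
print(round(median_of(even_sample), 2))
```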
Median
 In an ordered list, the median is the “middle” number (50% above, 50% below)
 Not affected by extreme values

[Figure: two number lines from 0 to 10; Median = 3 in both panels]
Median of Discrete Distribution

In discrete distributions, the cumulative frequencies give the location of the median.
Example…..
Calculate the median weight of a group of children.

Weight in kg (x)   No. of students (f)   Less-than cum. freq.
18                 4                     4
19                 6                     10
20                 5                     15
21                 3                     18
23                 4                     22
25                 3                     25
Total              25
Example (cont.)

N = 25, so m = (N + 1)/2 = (25 + 1)/2 = 13.
The median is the value of the 13th item, i.e., the weight of the 13th child.
Look for the first cumulative frequency that is greater than or equal to m. Here 15 is the first cumulative frequency to reach 13, and the 11th to 15th children each weigh 20 kg (from the cumulative frequencies).
The corresponding weight is 20 kg.
Median = 20 kg
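The cumulative-frequency lookup used in this example can be sketched in Python. The function name `discrete_median` is hypothetical; the sketch assumes the values are listed in ascending order, and for an even total it averages the two middle items, which the slides do not spell out for this case.

```python
from itertools import accumulate

# Median of a discrete frequency distribution via cumulative frequencies.
def discrete_median(values, freqs):
    cum_freqs = list(accumulate(freqs))
    n = cum_freqs[-1]

    def value_at(position):
        # First value whose cumulative frequency reaches the position.
        for x, cum in zip(values, cum_freqs):
            if cum >= position:
                return x

    if n % 2 == 1:
        return value_at((n + 1) // 2)
    # Assumption: with an even total, average the two middle items.
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2

weights = [18, 19, 20, 21, 23, 25]
students = [4, 6, 5, 3, 4, 3]

print(discrete_median(weights, students))
```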
Median for continuous distribution

Find the class interval in which the median lies using the cumulative frequencies. The exact value of the median is then obtained by interpolation, using the following formula:

$$M = l_1 + \frac{m - c}{f}\,(l_2 - l_1)$$

where
l1 = lower limit of the median class
l2 = upper limit of the median class
c = cumulative frequency of the class interval immediately preceding the median class
f = frequency of the median class
m = n/2, M = median
Example
Calculate the median weight for the following group of persons.

Weight in kg   No. of persons (f)   Less-than cum. freq.
50-55          8                    8
55-60          10                   18
60-65          25                   43
65-70          35                   78
70-75          15                   93
75-80          7                    100
Total          100
Example (cont.)
N = 100, m = 100/2 = 50
78 is the first cumulative frequency greater than 50, so the median lies in the class interval 65-70.
Here l1 = 65, l2 = 70, c = 43, f = 35 and m = 50.

$$M = l_1 + \frac{m - c}{f}\,(l_2 - l_1) = 65 + \frac{50 - 43}{35}\,(70 - 65) = 65 + \frac{7}{35}\,(5) = 65 + 1 = 66$$

Median weight is 66 kg.
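The interpolation formula can be sketched in Python as follows. The function name `grouped_median` and the data layout are illustrative assumptions; the variable names `l1`, `l2`, `c`, `f` and `m` follow the symbols defined with the formula.

```python
# Median of a grouped distribution: M = l1 + (m - c) / f * (l2 - l1).
def grouped_median(class_limits, freqs):
    m = sum(freqs) / 2
    c = 0  # cumulative frequency of the classes before the current one
    for (l1, l2), f in zip(class_limits, freqs):
        if c + f >= m:  # first class whose cumulative frequency reaches m
            return l1 + (m - c) / f * (l2 - l1)
        c += f

weight_classes = [(50, 55), (55, 60), (60, 65), (65, 70), (70, 75), (75, 80)]
persons = [8, 10, 25, 35, 15, 7]

print(grouped_median(weight_classes, persons))
```

The interpolation assumes that observations are spread evenly within the median class.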
Mode
 Measure of central tendency
 Value that occurs most often
 Not affected by extreme values
 May be no mode or several modes
 May be used for numerical & categorical data
Mode Example
No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
One Mode
Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9
More Than 1 Mode
Raw Data: 21 28 28 41 43 43
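The three cases above can be reproduced with a short Python sketch. The function name `modes` is illustrative, and it adopts one common convention as an assumption: when no value repeats, the data set is reported as having no mode (an empty list). The input is assumed to be non-empty.

```python
from collections import Counter

# All values that occur most often; empty list if nothing repeats.
def modes(values):
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # assumption: every value occurring once means "no mode"
    return sorted(v for v, count in counts.items() if count == top)

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))  # no mode
print(modes([6.3, 4.9, 8.9, 6.3, 4.9, 4.9]))    # one mode
print(modes([21, 28, 28, 41, 43, 43]))          # more than one mode
```

Note that the standard library's `statistics.multimode` behaves differently in the first case: it returns every value when all are equally frequent.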
Example

Calculate the modal size of shoes.

Size of shoe:   5    6    7    8    9    10
No. of pairs:   48   52   56   50   47   48

The maximum frequency is 56, against size 7. Therefore the modal size is 7.
Mode in a continuous distribution
First determine the modal class. Then the value of the mode is calculated by interpolation:

$$Z = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)}\,(l_2 - l_1)$$

where
l1 = lower limit of the modal class
l2 = upper limit of the modal class
f1 = frequency of the modal class
f0 = frequency of the class interval immediately preceding the modal class
f2 = frequency of the class interval immediately succeeding the modal class
Example

Calculate modal life.

Life in hours   No. of bulbs (f)
1000-1100       40
1100-1200       80
1200-1300       100
1300-1400       60
1400-1500       60
1500-1600       50
Example (cont.)
Maximum frequency = 100 (class 1200-1300), so the modal class is 1200-1300.

$$Z = 1200 + \frac{100 - 80}{(100 - 80) + (100 - 60)}\,(1300 - 1200) = 1200 + \frac{20}{20 + 40}\,(100) = 1200 + \frac{100}{3} = 1200 + 33.33 = 1233.33$$

Modal life is 1233.33 hours.
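A minimal Python sketch of the mode interpolation, using the symbols `l1`, `l2`, `f0`, `f1` and `f2` from the formula. The function name `grouped_mode` is hypothetical, and two choices are assumptions rather than part of the slides: a missing neighbouring class is treated as having frequency 0, and ties for the maximum frequency are resolved by taking the first such class.

```python
# Mode of a grouped distribution:
# Z = l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * (l2 - l1)
def grouped_mode(class_limits, freqs):
    i = freqs.index(max(freqs))          # first class with maximum frequency
    l1, l2 = class_limits[i]
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0    # assumption: 0 if no preceding class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    return l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * (l2 - l1)

life_classes = [(1000, 1100), (1100, 1200), (1200, 1300),
                (1300, 1400), (1400, 1500), (1500, 1600)]
bulbs = [40, 80, 100, 60, 60, 50]

print(round(grouped_mode(life_classes, bulbs), 2))
```

The formula is undefined when the modal class and both neighbours have the same frequency, since the denominator is then zero; the sketch does not guard against that case.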
Review Example

 Five houses on a hill by the beach

House prices:
$2,000,000
$500,000
$300,000
$100,000
$100,000
Review Example:
Summary Statistics

House prices: $2,000,000; 500,000; 300,000; 100,000; 100,000 (sum 3,000,000)

 Mean: $3,000,000/5 = $600,000
 Median: middle value of ranked data = $300,000
 Mode: most frequent value = $100,000
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. (Chap 3-49)
Quartiles
The median divides the given distribution into two equal parts.
If the distribution is divided into four equal parts, the points of division are called quartiles: the first quartile, the second quartile and the third quartile.
The second quartile is the median.
The calculation of quartiles is similar to that of the median.
Quartile Measures
 Quartiles split the ranked data into 4 segments with an equal
number of values per segment

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
 The first quartile, Q1, is the value for which 25% of
the observations are smaller and 75% are larger
 Q2 is the same as the median (50% of the
observations are smaller and 50% are larger)
 Only 25% of the observations are greater than the
third quartile
Quartile Measures:
Locating Quartiles

Find a quartile by determining the value in the appropriate position in the ranked data, where

First quartile position: Q1 = (n+1)/4 ranked value
Second quartile position: Q2 = (n+1)/2 ranked value
Third quartile position: Q3 = 3(n+1)/4 ranked value

where n is the number of observed values


Quartile Measures:
Calculation Rules
 When calculating the ranked position, use the following rules:
 If the result is a whole number, then it is the ranked position to use.
 If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.), then average the two corresponding data values.
 If the result is not a whole number or a fractional half, then round the result to the nearest integer to find the ranked position.

Quartile Measures
Calculating the Quartiles: Example

Sample data in ordered array (n = 9): 11 12 13 16 16 17 18 21 22

Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = (12+13)/2 = 12.5
Q2 is in the (9+1)/2 = 5th position of the ranked data, so Q2 = median = 16
Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5

Q1 and Q3 are measures of non-central location.
Q2 = median is a measure of central tendency.
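The positioning and calculation rules for quartiles can be sketched in Python as below. The names `value_at_position` and `quartiles` are illustrative, and the sketch assumes at least two observations. Other textbooks and software packages use different quartile conventions, so results may differ slightly from library functions.

```python
# Value at a 1-based ranked position, following the three calculation rules.
def value_at_position(ordered, position):
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]                        # whole number: use it
    if fraction == 0.5:
        return (ordered[whole - 1] + ordered[whole]) / 2  # half: average the two
    return ordered[round(position) - 1]                  # else: nearest integer

# Quartile positions: (n + 1) / 4, (n + 1) / 2 and 3 (n + 1) / 4.
def quartiles(values):
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))

sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)
```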
Deciles & Percentiles
If we divide the series into ten equal parts, the points of
division are called deciles.
If we divide the series into 100 equal parts, the points of
division are called percentiles.
Computation of Quartiles, Deciles etc.
The procedure for computing quartiles, deciles etc. is the same as for the median.
Understand the terms mean,
median, mode
Measures of Central Tendency
 Mean … the average score

 Median … the value that lies in the middle after ranking all
the scores

 Mode … the most frequently occurring score


Which average?
It must be clearly understood that no single average can be regarded as best for all purposes.
Consider the nature of the data: are they badly skewed (avoid the mean), gappy around the middle (avoid the median), or unequal in class-interval width (avoid the mode)?
Is a composite average of all absolute or relative values needed (arithmetic mean or geometric mean)? Or is a middle value needed (median), or the most common value (mode)?
Note
To use one measure alone is like looking through a keyhole:
the part of the room you can see cannot give a full idea
of the whole room.
Moral of the story …..
A person had to cross a river from one bank to the other. He did not know the depth of the river, so he asked another man, who told him that the average depth of the water was 160 cm. The man was 175 cm tall and thought he could cross the river easily, since he would be above the water level the whole way. So he set out. At first the water was very shallow, but when he reached the middle it was 500 cm deep, and he lost his life. The man drowned because he wrongly believed that average depth means uniform depth throughout.

Measures of Central Tendency
A society consists of six members – a king whose annual
income is $1.2 billion and five serfs, each of whose annual
incomes is $1. The mean income of the members of the
society is then about $200 million, which is typical of none of
its members.
MEAN is not always a “real” data point
3 mathematicians go duck hunting:
#1 fired a shot 6 inches over the DUCK
#2 fired a shot 6 inches below the DUCK
#3, excited at this point, exclaimed, “WE GOT IT!”
Measures of Central Tendency
Mean … the most frequently used but is sensitive to
extreme scores
e.g. 1 2 3 4 5 6 7 8 9 10
Mean = 5.5 (median = 5.5)
e.g. 1 2 3 4 5 6 7 8 9 20
Mean = 6.5 (median = 5.5)
e.g. 1 2 3 4 5 6 7 8 9 100
Mean = 14.5 (median = 5.5)
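The three comparisons above can be reproduced with the standard library's `statistics` module. This is a small sketch; the helper name `summarize` is illustrative.

```python
from statistics import mean, median

# Return (mean, median) so the two measures can be compared side by side.
def summarize(data):
    return mean(data), median(data)

base = list(range(1, 10))  # 1, 2, ..., 9

# Replace the largest score with increasingly extreme values.
for last in (10, 20, 100):
    data = base + [last]
    print(last, summarize(data))
```

Only the largest score changes between runs, so the median stays fixed while the mean is pulled toward the extreme value.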
Comparison of mean, median &
mode
 Mean
 Used for inference as well as description; best estimator of the
parameter
 Based on all data in the distribution
 Generally preferred except for “bad” distribution. Most
commonly used statistic for central tendency.
When not to use the arithmetic mean:
1. In highly skewed distributions.
2. In distributions with open-end intervals.
3. When the distribution is unevenly spread, concentration being small or large at irregular points.
4. When an average rate of growth or change over a period of time is required.
5. When the observations form a geometric progression.
6. When averaging rates (e.g. speed, fluctuations in prices, etc.).
7. When there are very large and very small values.
Geometric mean---
The G.M is typically used in averaging index numbers, rates of
change, ratios, and other sets of data expressed in percentage
form. It is particularly important in economics and business
statistics in index number construction.
MEDIAN is a POSITIONAL measure
 1/2 of the data points above AND 1/2 of the data points below

REMEMBER: MEDIAN requires at least ORDINAL-level data
Measures of Central Tendency
Median

… is not sensitive to extreme scores

… use it when you are unable to use the mean because of


extreme scores
Measures of Central Tendency
Mode

… does not involve any calculation or ordering of data

… use it when you have categories (e.g. occupation)


Best Guess interpretations
 Mean – average of signed error will be zero.
 Mode – will be absolutely right with greatest frequency
 Median – smallest absolute error
Thinking Challenge

Salaries: $400,000; $70,000; $50,000; $30,000; $20,000

... employees cite low pay: most workers earn only $20,000.
... President claims average pay is $70,000!
