
Unit 2

Measures of Central Tendency

Dr. Pooja Kansra


Associate Professor
Mittal School of Business
https://www.mckinsey.com/business-functions/strategy-and-corporate-finance/our-insights/the-coronavirus-effect-on-global-economic-sentiment
Topics Covered
Central tendency, including partition values
Use of MS-Excel based examples to discuss
central tendency
Data introduction: max and min values
Measure of Central Tendency
A measure of central tendency provides a
convenient way of describing a set of scores with a
single number that summarizes the PERFORMANCE of
the group.
It is also defined as a single value that is used to
describe the “center” of the data.
There are three commonly used measures of central
tendency:
MEAN | MEDIAN | MODE
Mean = Arithmetic Mean
• It is the most commonly used measure of the center of data
• It is also referred to as the “arithmetic average”
• Computation of Sample Mean
• Computation of the Mean for Ungrouped Data
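For reference, the sample mean of ungrouped data is usually written as below. The symbols are the conventional ones and are an assumption here, since the slide's own formula is not shown: x_i is the i-th observation and n is the number of observations.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}
```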


• The average score of 15 students who took a
25-item mathematics quiz is 15.20.
• This means that students who scored below
15.2 performed below the class average in
the quiz.
• Students who scored higher than 15.2
performed better than the class as a
whole.
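A minimal sketch of this interpretation in Python. The 15 scores below are hypothetical, chosen only so that their mean is 15.2; the actual quiz scores are not given.

```python
from statistics import mean

# Hypothetical scores of 15 students on a 25-item quiz (not the real data).
scores = [10, 12, 13, 14, 14, 15, 15, 15, 16, 16, 17, 17, 18, 18, 18]

class_mean = mean(scores)  # sum of scores divided by number of students

# Split the class relative to the mean.
above_mean = [s for s in scores if s > class_mean]
below_mean = [s for s in scores if s < class_mean]

print(f"Mean score: {class_mean:.2f}")
print(f"Above the mean: {len(above_mean)} students")
print(f"Below the mean: {len(below_mean)} students")
```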
Advantages
Simple and unique: every data set has exactly one mean.
Its calculation is based on all the data.
It lends itself to further algebraic treatment.
Weighted AM
The arithmetic mean gives equal importance (or
weight) to each observation in the data set.
There are situations in which the individual
observations in the data set are not of equal
importance.
If values occur with different frequencies, then
computing the A.M. of the values (as opposed to the
A.M. of the observations) may not be truly
representative of the data set and thus may be
misleading.
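The difference between the two averages can be sketched as follows. The values and frequencies are hypothetical, and `weighted_mean` is an illustrative helper, not an Excel function.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical data: three distinct values and how often each occurs.
values = [10, 20, 30]
frequencies = [1, 2, 7]

simple_am = sum(values) / len(values)              # ignores the frequencies
weighted_am = weighted_mean(values, frequencies)   # uses them as weights

print(simple_am, weighted_am)
```

Here the simple A.M. of the three values is 20, while the weighted A.M. is 26, because the value 30 occurs far more often than the others.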
When Mean Won’t Work
When a distribution contains a few extreme
values (or is very skewed) i.e. outliers, the mean
will be pulled toward the extremes (displaced
toward the tail). In this case, the mean will not
provide a "central" value.
The mean cannot be calculated for qualitative
characteristics such as intelligence, honesty,
beauty, or loyalty.
The mean cannot be calculated for a data set that
has open-ended classes at either the high or low
end of the scale.
Median
The median is the middle observation when the
data are sorted from smallest to largest.
If the number of observations is odd, the median is
literally the middle observation.
If the number of observations is even, the median is
usually defined as the average of the two middle
observations.
In Excel, the median can be calculated with the
MEDIAN function.
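The odd and even cases described above can be sketched in Python as follows. The function name and the sample values are illustrative assumptions.

```python
def median(values):
    """Middle observation of the sorted data (average of the two middle ones if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the middle observation itself.
        return ordered[mid]
    # Even count: average of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([7, 3, 9])            # sorted: 3, 7, 9
even_case = median([31, 23, 29, 27])    # sorted: 23, 27, 29, 31

print(odd_case, even_case)
```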
Median
Middle value
Value that splits the dataset in half
To find the median, order your data from smallest
to largest, and then find the data point that has an
equal number of values above it and below it.
When there is an even number of values, you count
in to the two innermost values and then take their
average. In the example, the two innermost values are
27 and 29, and their average is 28. Consequently, 28
is the median of this dataset.
Outliers and skewed data have a smaller effect
on the median than on the mean.
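A small sketch of this effect, using hypothetical data with and without one extreme value:

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme value added.
data = [20, 30, 40, 50, 60]
data_with_outlier = data + [1000]

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(data_with_outlier), median(data_with_outlier)

print(f"Mean:   {mean_before} -> {mean_after}")
print(f"Median: {median_before} -> {median_after}")
```

The single extreme value moves the mean from 40 to 200, while the median only moves from 40 to 45.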
Let’s Poll
Identify the extreme values in the given data.
10, 1000, 20, 30, 40 ,50, 60
A. 10
B. 60
C. 1000
D. Cannot be determined
Continuous Series
Mode
• Mode: a measure of location identified by the
most frequently occurring value in a set of
data.
• The concept of mode is of great use to large
scale manufacturers of consumer items such
as ready-made garments, shoes, and so on.
In all such cases it is important to know the
size that fits most persons rather than the
‘mean’ size.
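As an illustration, the modal size can be found by counting how often each value occurs. The shoe sizes below are hypothetical.

```python
from collections import Counter

# Hypothetical shoe sizes sold in one day.
shoe_sizes = [7, 8, 8, 9, 8, 10, 9, 8]

counts = Counter(shoe_sizes)                      # size -> number of occurrences
modal_size, modal_count = counts.most_common(1)[0]

mean_size = sum(shoe_sizes) / len(shoe_sizes)

print(f"Modal size: {modal_size} (sold {modal_count} times)")
print(f"Mean size:  {mean_size}")
```

The mean size here is 8.375, which is not a size anyone can buy; the mode, 8, is the size a manufacturer would stock most heavily. Note that `most_common` returns only one value even when several values tie for the highest count.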
Let’s Poll
2. Which is the best measure of central tendency for
qualitative data?
a. Mean
b. Median
c. Mode
d. All of these
average man prefers . . . brand of trousers
average production of an item in a month
average service time at the service counter

The mode is a poor measure of central tendency
when the most frequently occurring values do not
appear close to the center of the data.
Let’s Poll
The following data show the sizes of T-shirts
demanded by residents of a locality:
L, XL, XXL, L, M, S, L, XL, XXL, L, M, S,
S, L
Which of the following is the modal size of the
T-shirt?
A. XXL
B. XL
C. L
D. S
Let’s find the mode
Let’s Poll
1. Which one is highly affected by extreme values?
a. Mean
b. Median
c. Mode
d. All of these
Example
Baseball Salaries 2011.xlsx
• Objective: To learn how salaries are distributed across all 2011 MLB players.
• Solution: Data set contains data on 843 Major League Baseball players in the 2011
season.
• Variables are player’s name, team, position, and salary.
• Create summary measures of central tendency of baseball salaries using Excel.
Relationship
Empirical relationship (approximate, for moderately skewed distributions):
Mean – Mode = 3 (Mean – Median)
Mode = 3 Median – 2 Mean
Mean > Median > Mode: Positively Skewed
Mean < Median < Mode: Negatively Skewed
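A short sketch of how the relationship can be used to estimate the mode and read off the direction of skew. The function names and the input numbers are illustrative assumptions, and the result is only an approximation.

```python
def estimate_mode(mean, median):
    """Empirical estimate: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean

def skew_direction(mean, median, mode):
    """Classify the skew from the ordering of the three measures."""
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    return "no clear direction"

# Hypothetical summary values.
mode_a = estimate_mode(mean=50, median=48)   # 3*48 - 2*50
mode_b = estimate_mode(mean=40, median=44)   # 3*44 - 2*40

direction_a = skew_direction(50, 48, mode_a)
direction_b = skew_direction(40, 44, mode_b)

print(mode_a, direction_a)
print(mode_b, direction_b)
```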
True or False
1. The mean of a data set remains unaffected if an
observation equal to mean is included in it.
2. It is possible to have data with three different values
for measures of central tendency.
3. The mode is always found at the highest point of a
graph of a frequency distribution.
4. The median is less affected than the mean by
extreme values of observations in a distribution.
Partition Values and role of Outliers
Quartiles: values that divide the data into four equal
parts. A quartile does not show the middle of a quarter;
it shows where that quarter ends.
Data must be in ascending order before the quartiles
are located.

[Figure: ordered data divided into four equal parts, marked at 25%, 50%, 75%, and 100%]
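A minimal sketch of computing quartiles in Python. The data are hypothetical, and the `"inclusive"` interpolation method is one of several conventions; other tools may use a different one and give slightly different values.

```python
from statistics import quantiles

# Hypothetical data set; quantiles() sorts it internally,
# but it is shown in ascending order for clarity.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print(f"Q1 (25%): {q1}")
print(f"Q2 (50%, the median): {q2}")
print(f"Q3 (75%): {q3}")
```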
Box & Whisker Plot Graphs
Histogram
Use a Pivot Table to build the frequency distribution
and histogram, and explain them vis-à-vis descriptive
statistics

https://www.excel-easy.com/examples/frequency-distribution.html
Skewness and Kurtosis
• Skewness is a measure of the asymmetry of an ideally
symmetric probability distribution.
• Skewness measures how much the distribution deviates
from the symmetry of the normal distribution (for a normal
distribution, MEAN=MEDIAN=MODE [for the time being])
Why is skewness important?
• It helps us to identify the direction of outliers.
• However, it will not tell us the number of outliers.
• Skewness should ideally be zero, but a value close
to zero is acceptable.
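A sketch of a sample skewness calculation in Python. It uses the adjusted sample formula n / ((n - 1)(n - 2)) × Σ((x - mean) / s)³, which is assumed here to correspond to what a spreadsheet skewness function reports; the function name and data are hypothetical.

```python
from math import sqrt

def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)**3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    s = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

# Hypothetical data sets.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 50]     # one large value in the right tail
left_tailed = [1, 47, 48, 49, 50]   # one small value in the left tail

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(f"{name}: {sample_skewness(data):.3f}")
```

The sign gives the direction of the tail (positive for a right tail, negative for a left tail), but not how many extreme values produce it.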

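Histogram drawn from Baseball file.

Formula: =SKEW(number1, number2, …)
[Figure: -ve skewed and +ve skewed distributions]

Formula: =KURT(…)
[Figure: distributions with K>3, K=3, and K<3]
Note: Excel's KURT returns excess kurtosis, so its result is compared with 0 rather than 3.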
Skewness Kurtosis
