
MODULE 3:

MEASURES OF CENTRALITY
AND VARIABILITY
OBJECTIVES
▷ The students must be able to:
1. Analyze word problems to determine which measure
of central tendency is best used in each
circumstance.
2. Apply the concepts of measures of centrality in raw
data.
3. Compare the uses of the measures of central
tendency.
4. Solve for the measures of variability.
Measures of Centrality
A measure of centrality is a value at the center or middle of a data set.

Mean
The mean (or arithmetic mean) of a set of data is the measure of center
found by adding all the data values and dividing the total by the number
of data values.
Important Properties of Mean:
▷ Sample means drawn from the same population tend to vary less than
other measures of center.
▷ The mean of a data set uses every data value.
▷ A disadvantage of the mean is that just one extreme value (outlier)
can change the value of the mean substantially.
FORMULA: $\bar{x} = \dfrac{\sum x}{n}$ (sample) or $\mu = \dfrac{\sum x}{N}$ (population)

Example for Mean
Find the mean of the first five pulse rates for males:
84, 74, 50, 60, 52 (all in beats per minute, or BPM).
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{84 + 74 + 50 + 60 + 52}{5} = \dfrac{320}{5} = 64.0$ BPM
Interpretation: The mean of the first five male pulse
rates is 64.0 BPM.
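
As a rough check, the calculation above can be reproduced in a few lines of Python. This is only a sketch: the helper name `mean` and the extra 200 BPM reading are illustrative assumptions, not part of the original data. The second call shows how one extreme value pulls the mean upward.

```python
# Pulse rates (BPM) from the example above.
pulse_rates = [84, 74, 50, 60, 52]


def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


x_bar = mean(pulse_rates)
print(f"mean = {x_bar:.1f} BPM")  # mean = 64.0 BPM

# Hypothetical outlier (200 BPM) added only to illustrate sensitivity.
with_outlier = pulse_rates + [200]
print(f"mean with outlier = {mean(with_outlier):.1f} BPM")  # 86.7 BPM
```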
Median
The median of a data set is a measure of center that is the middle value when the
original data values are arranged in order of increasing (or decreasing)
magnitude. (Notation: $\tilde{x}$)
Important Properties of the Median:
▷ The median does not change by large amounts when we include just a few
extreme values, so the median is a resistant measure of center.
▷ The median does not directly use every data value. (For example, if the largest
value is changed to a much larger value, the median does not change.)
▷ If the number of data values is odd, the median is the number located in the
exact middle of the sorted list.
▷ If the number of data values is even, the median is found by computing the
mean of the two middle numbers in the sorted list.
Example for Median
Find the median of the first five pulse rates for
males: 84, 74, 50, 60, 52 (all in beats per minute, or
BPM).
Sort the data values in ascending order:
50, 52, 60, 74, 84
Since the number of data values is odd (5), the value in the exact
middle of the sorted list is the median. Therefore, the median is 60.0
BPM.
Example for Median
Find the median of the first six pulse rates for males: 84,
74, 50, 60, 52, 62 (all in beats per minute, or BPM).
Sort the data values in ascending order:
50, 52, 60, 62, 74, 84
Since the number of data values is even (6), we compute
the mean of the two middle values to find the
median. Therefore, the median is $\dfrac{60 + 62}{2} = 61.0$ BPM.
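
The odd and even cases can be sketched in Python as follows. The function name `median` is an illustrative choice; the logic simply follows the two rules listed under the properties of the median.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single value in the exact middle.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([84, 74, 50, 60, 52]))      # 60
print(median([84, 74, 50, 60, 52, 62]))  # 61.0
```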

Mode
The mode is the value (or values) that occurs with the greatest frequency.
(Notation: $\hat{x}$)

Important Properties of the Mode:
▷ The mode can be found with qualitative data.
▷ A data set can have no mode, one mode or multiple modes.
▷ Bimodal – exactly two data values have the greatest frequency
▷ Unimodal – exactly one data value has the greatest frequency
▷ Multimodal – more than two data values have the greatest
frequency.

Example for Mode
Find the mode of these pulse rates (in BPM):
▷ 58, 58, 58, 58, 60, 60, 62, 64
Mode: 58 BPM
▷ 58, 58, 58, 60, 60, 60, 62, 64
Mode: 58 BPM and 60 BPM
▷ 58, 58, 62, 62, 64, 64
Mode: NONE
▷ 58, 58, 62, 62, 64, 64, 60
Mode: 58 BPM, 62 BPM and 64 BPM
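
The four cases above can be checked with a short Python sketch. The function name `modes` is hypothetical, and it assumes the convention used in the third example: when every value occurs equally often, the data set is reported as having no mode.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if all distinct values tie, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([58, 58, 58, 58, 60, 60, 62, 64]))  # [58]
print(modes([58, 58, 58, 60, 60, 60, 62, 64]))  # [58, 60]
print(modes([58, 58, 62, 62, 64, 64]))          # []
print(modes([58, 58, 62, 62, 64, 64, 60]))      # [58, 62, 64]
```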
Midrange
The midrange of a data set is the measure of center that is the
value midway between the maximum and minimum values in the
original data set.
$\text{Midrange} = \dfrac{\text{Maximum} + \text{Minimum}}{2}$
Important Properties of Midrange
▷ Because the midrange uses only the maximum and minimum
values, it is very sensitive to those extremes so the midrange is
not resistant.
▷ The midrange is very easy to compute, and it is sometimes used
incorrectly in place of the median.
Example for Midrange
Find the midrange of the first five pulse rates for
males: 84, 74, 50, 60, 52 (all in beats per minute, or
BPM).
$\text{Midrange} = \dfrac{\text{Maximum} + \text{Minimum}}{2} = \dfrac{84 + 50}{2} = 67.0$ BPM
The midrange is 67.0 BPM.
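
A minimal Python sketch of the same computation is shown below. The function name `midrange` is an illustrative assumption; only the largest and smallest values affect the result.

```python
def midrange(values):
    """Value midway between the maximum and the minimum."""
    return (max(values) + min(values)) / 2


print(midrange([84, 74, 50, 60, 52]))  # 67.0
```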

Weighted Mean
Weighted Mean: $\bar{x} = \dfrac{\sum (w \cdot x)}{\sum w}$
This formula tells us to first multiply each weight, w,
by the corresponding value, x, then add the
products, and finally divide that total by the
sum of the weights, $\sum w$.

Example for Weighted Mean
In her first semester of college, a student's final grades
for her courses were 1.50 (3 units), 1.75 (4 units), 2.25 (3
units), 1.25 (3 units) and 3.00 (1 unit). Compute her weighted
average.
$\bar{x} = \dfrac{\sum (w \cdot x)}{\sum w} = \dfrac{(1.50 \times 3) + (1.75 \times 4) + (2.25 \times 3) + (1.25 \times 3) + (3.00 \times 1)}{3 + 4 + 3 + 3 + 1} = \dfrac{25}{14} = 1.79$
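
The weighted average can be verified with a short Python sketch. The names `grades`, `units` and `weighted_mean` are illustrative; the units of each course act as the weights.

```python
# Final grades and their unit weights from the example above.
grades = [1.50, 1.75, 2.25, 1.25, 3.00]
units = [3, 4, 3, 3, 1]


def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    total = sum(w * x for x, w in zip(values, weights))
    return total / sum(weights)


result = weighted_mean(grades, units)
print(f"{result:.2f}")  # 1.79
```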
Measures of Centrality
Grouped Data
This is a value at the center or middle of a data set.

Mean of Grouped Data
Class Interval (Pulse Rate)   Frequency (f)   Class Midpoint/Class Mark (CM)   f × CM
40 – 54                       15              47                               705
55 – 69                       63              62                               3906
70 – 84                       62              77                               4774
85 – 99                       11              92                               1012
100 – 114                     2               107                              214
Total                         153                                              10611

Formula: $\bar{x} = \dfrac{\sum (f \times CM)}{\sum f} = \dfrac{10611}{153} = 69.35$ BPM
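
The table above can be reproduced with a small Python sketch. The tuple layout `(lower limit, upper limit, frequency)` and the name `grouped_mean` are assumptions made for illustration; each class midpoint is taken as the average of the class limits.

```python
# (lower limit, upper limit, frequency) for each pulse-rate class.
classes = [(40, 54, 15), (55, 69, 63), (70, 84, 62), (85, 99, 11), (100, 114, 2)]


def grouped_mean(classes):
    """Approximate mean: sum of f * class midpoint, divided by sum of f."""
    total_f = sum(f for _, _, f in classes)
    total_f_cm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_f_cm / total_f


print(f"{grouped_mean(classes):.2f} BPM")  # 69.35 BPM
```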

Median of Grouped Data
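Class Interval (Pulse Rate)   Frequency (f)   LCB     < CF
40 – 54                       15              39.5    15
55 – 69                       63              54.5    78
70 – 84                       62              69.5    140
85 – 99                       11              84.5    151
100 – 114                     2               99.5    153
Total                         153

Formula: $\tilde{x} = LCB_{med} + \left(\dfrac{\frac{n}{2} - {<}CF_{med-1}}{f_{med}}\right) i$

Here $LCB_{med}$ is the lower class boundary of the median class, ${<}CF_{med-1}$ is the less-than cumulative frequency of the class before the median class, $f_{med}$ is the frequency of the median class, and $i$ is the class width.
Since $\frac{n}{2} = \frac{153}{2} = 76.5$, the median class is 55 – 69, the first class whose cumulative frequency (78) reaches 76.5.

$\tilde{x} = 54.5 + \left(\dfrac{76.5 - 15}{63}\right)(15) = 69.14$ BPM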
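
The grouped-median formula can be sketched in Python as follows. The name `grouped_median` and the tuple layout are illustrative assumptions. The sketch also assumes integer class limits, so each lower class boundary is the lower limit minus 0.5 and the class width is `upper - lower + 1`.

```python
# (lower limit, upper limit, frequency) for each pulse-rate class.
classes = [(40, 54, 15), (55, 69, 63), (70, 84, 62), (85, 99, 11), (100, 114, 2)]


def grouped_median(classes):
    """Interpolated median for grouped data with integer class limits."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cum_before = 0  # less-than cumulative frequency before the current class
    for lower, upper, f in classes:
        if cum_before + f >= half:
            lcb = lower - 0.5          # lower class boundary
            width = upper - lower + 1  # class width i
            return lcb + (half - cum_before) / f * width
        cum_before += f


print(f"{grouped_median(classes):.2f} BPM")  # 69.14 BPM
```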
Mode of Grouped Data
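Class Interval (Pulse Rate)   Frequency (f)   LCB     < CF
40 – 54                       15              39.5    15
55 – 69                       63              54.5    78
70 – 84                       62              69.5    140
85 – 99                       11              84.5    151
100 – 114                     2               99.5    153
Total                         153

Formula: $\hat{x} = LCB_{mod} + \left(\dfrac{D_1}{D_1 + D_2}\right) i$

Here $LCB_{mod}$ is the lower class boundary of the modal class, $D_1$ is the difference between the frequency of the modal class and that of the class before it, $D_2$ is the difference between the frequency of the modal class and that of the class after it, and $i$ is the class width.
The modal class is 55 – 69 (f = 63), so $D_1 = 63 - 15 = 48$ and $D_2 = 63 - 62 = 1$.

$\hat{x} = 54.5 + \left(\dfrac{48}{49}\right)(15) = 69.19$ BPM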
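
A Python sketch of the grouped-mode formula is given below. The name `grouped_mode` and the tuple layout are illustrative assumptions. As an assumption of this sketch, a missing neighboring class is treated as having frequency 0, and a single modal class with integer class limits is expected.

```python
# (lower limit, upper limit, frequency) for each pulse-rate class.
classes = [(40, 54, 15), (55, 69, 63), (70, 84, 62), (85, 99, 11), (100, 114, 2)]


def grouped_mode(classes):
    """Interpolated mode for grouped data (single modal class assumed)."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))  # position of the modal class
    lower, upper, f_mod = classes[m]
    f_prev = freqs[m - 1] if m > 0 else 0
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0
    d1 = f_mod - f_prev          # D1: modal frequency minus previous class
    d2 = f_mod - f_next          # D2: modal frequency minus next class
    lcb = lower - 0.5            # lower class boundary
    width = upper - lower + 1    # class width i
    return lcb + d1 / (d1 + d2) * width


print(f"{grouped_mode(classes):.2f} BPM")  # 69.19 BPM
```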
Measures of Variability
Measures of variability describe how far apart or how close together the data
values are, both from each other and from the center of the data set.

Range
The range of a set of data values is the difference between the
maximum data value and the minimum data value.
$\text{Range} = (\text{maximum data value}) - (\text{minimum data value})$
Important Properties of the Range:
▷ The range uses only the maximum and the minimum data value,
so it is very sensitive to extreme values. The range is not
resistant.
▷ Because the range uses only the maximum and minimum
values, it does not take every value into account and therefore
does not truly reflect the variation among all of the data values.
Example of Range
Find the range of the first five pulse
rates: 84, 74, 50, 60, 52 (BPM).
$\text{Range} = \text{maximum} - \text{minimum} = 84 - 50 = 34.0$ BPM
The range is 34.0 BPM.

Standard Deviation
Population: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

Sample: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$

The standard deviation of a set of sample values, denoted by s, is a
measure of how much the data values deviate from the mean.
Notations: s – sample standard deviation; $\sigma$ – population standard
deviation; $\mu$ – population mean; $\bar{x}$ – sample mean; N – population
size; n – sample size; x – data values

Example of Standard Deviation
Find the standard deviation of the first five pulse rates: 84,
74, 50, 60, 52 (BPM).
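$x$      $\bar{x}$    $x - \bar{x}$    $(x - \bar{x})^2$
84       64           20               400
74       64           10               100
50       64           -14              196
60       64           -4               16
52       64           -12              144
Total                                  856

Because the five pulse rates are a sample, we divide by $n - 1 = 4$:
$s^2 = \dfrac{856}{4} = 214$
$s = \sqrt{214} = 14.6287$
The standard deviation is 14.63 BPM.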
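
Both forms of the sample formula can be checked against each other in Python. This is a sketch with illustrative function names (`sample_std`, `sample_std_shortcut`); it divides by n − 1, as the worked example does.

```python
import math

pulse_rates = [84, 74, 50, 60, 52]


def sample_std(values):
    """Sample standard deviation from squared deviations about the mean."""
    n = len(values)
    x_bar = sum(values) / n
    squared_devs = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_devs / (n - 1))


def sample_std_shortcut(values):
    """Equivalent shortcut form using the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


print(f"{sample_std(pulse_rates):.2f} BPM")           # 14.63 BPM
print(f"{sample_std_shortcut(pulse_rates):.2f} BPM")  # 14.63 BPM
```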
Range Rule of Thumb for Identifying Significant
Values
Significantly Low Values are 𝜇 − 2𝜎 or lower.
Significantly High Values are 𝜇 + 2𝜎 or higher.
Values not significant are between 𝜇 − 2𝜎 and 𝜇 + 2𝜎.
Example: With a mean of 69.6 and a standard deviation of 11.3, we
use the range rule of thumb to find the values that are significantly
low or significantly high as follows:
Significantly low values are 69.6 − 2 × 11.3 = 47.0 or lower.
Significantly high values are 69.6 + 2 × 11.3 = 92.2 or higher.
Values not significant: Between 47.0 and 92.2.
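
The rule can be sketched in Python as below. The function names `significance_limits` and `classify`, and the test readings of 45 and 70, are hypothetical and used only for illustration.

```python
def significance_limits(mean, sd):
    """Return (low, high) cutoffs two standard deviations from the mean."""
    return mean - 2 * sd, mean + 2 * sd


def classify(x, mean, sd):
    """Label a value using the range rule of thumb."""
    low, high = significance_limits(mean, sd)
    if x <= low:
        return "significantly low"
    if x >= high:
        return "significantly high"
    return "not significant"


low, high = significance_limits(69.6, 11.3)
print(f"{low:.1f} to {high:.1f}")  # 47.0 to 92.2
print(classify(45, 69.6, 11.3))    # significantly low
print(classify(70, 69.6, 11.3))    # not significant
```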

Coefficient of Variation (CV)
The coefficient of variation for a set of nonnegative sample
or population data, expressed as a percent, describes the
standard deviation relative to the mean, and is given by the
following:
Sample: $CV = \dfrac{s}{\bar{x}} \times 100\%$
Population: $CV = \dfrac{\sigma}{\mu} \times 100\%$

Example for Coefficient of Variation (CV)
For the male pulse rates, $\bar{x} = 69.6$ BPM and $s = 11.3$ BPM; for their
heights, $\bar{x} = 174.12$ cm and $s = 7.10$ cm. We want to compare variation
among pulse rates to variation among heights.
Here, we have different scales and units of measurement, so we use the
coefficients of variation:
Male Pulse Rate: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{11.3 \text{ BPM}}{69.6 \text{ BPM}} \times 100\% = 16.2\%$
Male Heights: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{7.10 \text{ cm}}{174.12 \text{ cm}} \times 100\% = 4.1\%$
We can see that male pulse rates (CV: 16.2%) vary more than male
heights (CV: 4.1%).
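
A short Python sketch of the comparison follows. The function name `coefficient_of_variation` is an illustrative choice. Because the units cancel, the two results can be compared directly.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation relative to the mean, as a percent."""
    return sd / mean * 100


cv_pulse = coefficient_of_variation(11.3, 69.6)
cv_height = coefficient_of_variation(7.10, 174.12)
print(f"pulse rates: {cv_pulse:.1f}%")  # pulse rates: 16.2%
print(f"heights: {cv_height:.1f}%")     # heights: 4.1%
```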

Z-score
A z-score (or standard score or standardized value) is the
number of standard deviations that a given value x is above
or below the mean. The z-score is calculated by using one of
the following:
Sample: $z = \dfrac{x - \bar{x}}{s}$
Population: $z = \dfrac{x - \mu}{\sigma}$

Example of Z-score
Which of the following two data values is more extreme
relative to the data set from which it came?
▷ The 4000 g weight of a newborn baby (among 400
weights with sample mean $\bar{x} = 3152.0$ g and a sample standard
deviation $s = 693.4$ g)
▷ The 99°F temperature of an adult (among 106 adults with
sample mean $\bar{x} = 98.20$°F and a sample standard
deviation $s = 0.62$°F)

Example of Z-score
Solution:
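▷ 4000 g birth weight:
$z = \dfrac{x - \bar{x}}{s} = \dfrac{4000 \text{ g} - 3152.0 \text{ g}}{693.4 \text{ g}} = 1.22$
▷ 99°F body temperature:
$z = \dfrac{x - \bar{x}}{s} = \dfrac{99 - 98.20}{0.62} = 1.29$
The body temperature has the larger z-score, so it is farther above its
mean in standard-deviation terms. It is the more extreme value, but not by much.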
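
The two z-scores can be recomputed with a minimal Python sketch. The function name `z_score` is an illustrative assumption.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / sd


z_weight = z_score(4000, 3152.0, 693.4)
z_temp = z_score(99, 98.20, 0.62)
print(f"birth weight: z = {z_weight:.2f}")      # birth weight: z = 1.22
print(f"body temperature: z = {z_temp:.2f}")    # body temperature: z = 1.29
```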
Percentile
Percentiles are measures of location, denoted
$P_1, P_2, \ldots, P_{99}$, which divide a set of data into 100
groups with about 1% of the values in each group.
Finding the Percentile of a Data Value
$\text{Percentile of value } x = \dfrac{\text{number of values less than } x}{\text{total number of values}} \times 100$

Example of Percentile
The following are the 40 cotinine measures (ng/mL)
of smokers. Find the percentile for the cotinine level
of 198 ng/mL.
0 1 1 3 17 32 35 44 48 86
87 103 112 121 123 130 131 149 164 167
173 173 198 208 210 222 227 234 245 250
253 265 266 277 284 289 290 313 477 491

Example of Percentile
Based on the sorted list, we see that there are 22
values less than 198 ng/mL, so:
$\text{Percentile of } 198 = \dfrac{22}{40} \times 100 = 55$
A cotinine level of 198 ng/mL is in the 55th
percentile. This can be interpreted as follows: 198 ng/mL
separates the lowest 55% of the values from the
highest 45% of the values. We have $P_{55} = 198$.
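
The count of values below 198 can be checked with a short Python sketch. The names `cotinine` and `percentile_of` are illustrative; the function applies the formula exactly as stated, counting only values strictly less than x.

```python
# The 40 sorted cotinine measures (ng/mL) from the example above.
cotinine = [
    0, 1, 1, 3, 17, 32, 35, 44, 48, 86,
    87, 103, 112, 121, 123, 130, 131, 149, 164, 167,
    173, 173, 198, 208, 210, 222, 227, 234, 245, 250,
    253, 265, 266, 277, 284, 289, 290, 313, 477, 491,
]


def percentile_of(values, x):
    """Percent of the values that are strictly less than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


print(percentile_of(cotinine, 198))  # 55.0
```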
Converting a Percentile to a Data Value
1. Sort the data. (Arrange the data from lowest to highest.)
2. Compute $L = \dfrac{k}{100} \cdot n$, where n is the number of values
and k is the percentile in question.
3. If L is a whole number, find $P_k$ by adding the Lth value
and the next value, then dividing the sum by 2.
4. If L is not a whole number, round it up to the next larger
whole number; $P_k$ is then the Lth value, counting from
the lowest.

Example of Percentile
The following are the 40 cotinine measures (ng/mL)
of smokers. Find the 33rd percentile.

0 1 1 3 17 32 35 44 48 86
87 103 112 121 123 130 131 149 164 167
173 173 198 208 210 222 227 234 245 250
253 265 266 277 284 289 290 313 477 491

Example of Percentile
$L = \dfrac{k}{100} \cdot n = \dfrac{33}{100} \cdot 40 = 13.2$
Since L = 13.2 is not a whole number, we round it up
to the next whole number, which is 14.
We then find the 14th value in the
sorted list, counting from the lowest.
Therefore, $P_{33} = 121$ ng/mL.

Example of Percentile
The following are the 40 cotinine measures (ng/mL)
of smokers. Find the 25th percentile.

0 1 1 3 17 32 35 44 48 86
87 103 112 121 123 130 131 149 164 167
173 173 198 208 210 222 227 234 245 250
253 265 266 277 284 289 290 313 477 491

Example of Percentile
$L = \dfrac{k}{100} \cdot n = \dfrac{25}{100} \cdot 40 = 10$
Since L = 10 is a whole number, we take the
mean of the 10th and 11th values in the sorted list,
counting from the lowest.
Therefore, $P_{25} = \dfrac{86 + 87}{2} = 86.5$ ng/mL.
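
The four-step procedure for converting a percentile to a data value can be sketched in Python as follows. The function name `percentile_value` is an illustrative assumption, and the sketch expects k between 1 and 99, matching the percentiles defined earlier.

```python
import math

# The 40 sorted cotinine measures (ng/mL) from the examples above.
cotinine = [
    0, 1, 1, 3, 17, 32, 35, 44, 48, 86,
    87, 103, 112, 121, 123, 130, 131, 149, 164, 167,
    173, 173, 198, 208, 210, 222, 227, 234, 245, 250,
    253, 265, 266, 277, 284, 289, 290, 313, 477, 491,
]


def percentile_value(values, k):
    """Data value at the k-th percentile (k from 1 to 99)."""
    ordered = sorted(values)             # step 1: sort
    locator = k * len(ordered) / 100     # step 2: L = (k / 100) * n
    if locator == int(locator):
        # Step 3: whole number, average the L-th and next values.
        i = int(locator)
        return (ordered[i - 1] + ordered[i]) / 2
    # Step 4: not whole, round up and take the L-th value.
    return ordered[math.ceil(locator) - 1]


print(percentile_value(cotinine, 33))  # 121
print(percentile_value(cotinine, 25))  # 86.5
```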

Quartiles
Quartiles are measures of location, denoted by
$Q_1$, $Q_2$, and $Q_3$, which divide a set of data into four
groups with about 25% of the values in each group.
𝑄1 = 𝑃25
𝑄2 = 𝑃50
𝑄3 = 𝑃75
IQR (Interquartile range) = 𝑄3 − 𝑄1
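
Since each quartile is a percentile, the quartiles and the IQR of the cotinine data can be computed with a small Python sketch. The helper `percentile_value` is a hypothetical name that follows the locator rule from the percentile section (average two values when L is whole, otherwise round up). Other software may use different quartile conventions and give slightly different results.

```python
import math

# The 40 sorted cotinine measures (ng/mL) used in the percentile examples.
cotinine = [
    0, 1, 1, 3, 17, 32, 35, 44, 48, 86,
    87, 103, 112, 121, 123, 130, 131, 149, 164, 167,
    173, 173, 198, 208, 210, 222, 227, 234, 245, 250,
    253, 265, 266, 277, 284, 289, 290, 313, 477, 491,
]


def percentile_value(values, k):
    """Data value at the k-th percentile, using the locator L = (k / 100) * n."""
    ordered = sorted(values)
    locator = k * len(ordered) / 100
    if locator == int(locator):
        i = int(locator)
        return (ordered[i - 1] + ordered[i]) / 2
    return ordered[math.ceil(locator) - 1]


q1 = percentile_value(cotinine, 25)  # Q1 = P25
q2 = percentile_value(cotinine, 50)  # Q2 = P50 (the median)
q3 = percentile_value(cotinine, 75)  # Q3 = P75
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 86.5 170.0 251.5 165.0
```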

Boxplot
A boxplot (or box-and-whisker diagram) is a graph of a
data set that consists of a line extending from the
minimum value to the maximum value, and a box
with lines drawn at the first quartile, $Q_1$, the median,
and the third quartile, $Q_3$.

Example of Boxplot
Make a boxplot of these data.
0 1 1 3 17 32 35 44 48 86
87 103 112 121 123 130 131 149 164 167
173 173 198 208 210 222 227 234 245 250
253 265 266 277 284 289 290 313 477 491

Minimum = 0; $Q_1 = 86.5$; $Q_2 = 170.0$; $Q_3 = 251.5$; Maximum = 491
Example of Boxplot
[Figure: boxplot of the cotinine data, drawn from Minimum = 0, $Q_1 = 86.5$, $Q_2 = 170.0$, $Q_3 = 251.5$, and Maximum = 491.]
