Topic III

TOPIC 3: Grouped Data

B.S. Global Studies


Universitat Pompeu Fabra
Lecturer: Jaume Borràs
Idea so far
We studied centrality and dispersion measures for data that was not grouped.

We now replicate the second part of Topic II with grouped data (intervals). Measures of shape are left out, since they follow the same logic.
Centrality Measures: Mean
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} n_i c_i = \sum_{i=1}^{k} f_i c_i$$

where:
• n_i is the absolute frequency of the ith interval;
• f_i = n_i / n is the relative frequency of the ith interval;
• c_i is the class mark (midpoint) of the ith interval;
• k is the number of intervals;
• n is the size of the sample
Example: mean
Example: time (in minutes) that 28 employees spend to get to work

Driving time   Number of employees (n_i)   Class mark (c_i)
[0 – 10)        3                            5
[10 – 20)      10                           15
[20 – 30)       7                           25
[30 – 40)       4                           35
[40 – 50)       2                           45
[50 – 60]       2                           55
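
The slide leaves the calculation to the reader. A minimal Python sketch of it follows, assuming the class mark is the midpoint of each interval; the names `intervals` and `grouped_mean` are illustrative and do not come from the slides.

```python
# Grouped data from the example: (lower bound, upper bound, absolute frequency n_i)
intervals = [(0, 10, 3), (10, 20, 10), (20, 30, 7), (30, 40, 4), (40, 50, 2), (50, 60, 2)]

def grouped_mean(intervals):
    """Mean of grouped data: sum of n_i * c_i divided by the sample size n."""
    n = sum(freq for _, _, freq in intervals)
    # The class mark c_i is taken as the midpoint (low + high) / 2
    weighted_sum = sum(freq * (low + high) / 2 for low, high, freq in intervals)
    return weighted_sum / n

mean_time = grouped_mean(intervals)
print(round(mean_time, 2))  # 24.29
```

By hand, the weighted sum is 15 + 150 + 175 + 140 + 90 + 110 = 680, so the mean is 680 / 28 ≈ 24.29 minutes.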
Centrality Measures: Median
Median

$$\text{Median} = Q_2 = L_M + \frac{\frac{n}{2} - N_{i-1}}{n_M}\, w_M$$

where:
• L_M is the lower bound of the median class (the interval containing the median);
• n is the size of the sample;
• N_{i-1} is the cumulative absolute frequency of the interval before the median class;
• n_M is the absolute frequency of the median class;
• w_M is the width of the median class (H_M − L_M, upper bound minus lower bound);
• the median class is the first interval whose cumulative absolute frequency N_i is at least n/2
Example: Median
Example: time (in minutes) that 28 employees spend to get to work

Driving time   n_i   N_i   f_i    F_i    c_i
[0 – 10)        3     3    0.11   0.11    5
[10 – 20)      10    13    0.36   0.46   15
[20 – 30)       7    20    0.25   0.71   25
[30 – 40)       4    24    0.14   0.86   35
[40 – 50)       2    26    0.07   0.93   45
[50 – 60]       2    28    0.07   1.00   55
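
The median can be computed from this table with a short Python sketch. It assumes the median class is the first interval whose cumulative frequency reaches n/2; `grouped_median` is an illustrative name, not something defined in the slides.

```python
# Grouped data from the example: (lower bound, upper bound, absolute frequency n_i)
intervals = [(0, 10, 3), (10, 20, 10), (20, 30, 7), (30, 40, 4), (40, 50, 2), (50, 60, 2)]

def grouped_median(intervals):
    """Median of grouped data by linear interpolation inside the median class."""
    n = sum(freq for _, _, freq in intervals)
    cumulative = 0  # N_{i-1}: cumulative frequency before the current interval
    for low, high, freq in intervals:
        if cumulative + freq >= n / 2:  # first interval with N_i >= n/2
            width = high - low
            return low + (n / 2 - cumulative) * width / freq
        cumulative += freq
    raise ValueError("no observations")

median_time = grouped_median(intervals)
print(round(median_time, 2))  # 21.43
```

Here n/2 = 14, the median class is [20 – 30) because N_i = 20 is the first cumulative frequency to reach 14, and the median is 20 + (14 − 13) / 7 × 10 ≈ 21.43 minutes.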
Dispersion Measures: Quartiles
Quartiles

$$Q_1 = L_{Q_1} + \frac{\frac{n}{4} - N_{i-1}}{n_{Q_1}}\, w_{Q_1} \qquad Q_3 = L_{Q_3} + \frac{\frac{3n}{4} - N_{i-1}}{n_{Q_3}}\, w_{Q_3}$$

where:
• L is the lower bound of the interval that contains the quartile;
• n is the size of the sample;
• N_{i-1} is the cumulative absolute frequency of the interval before the one that contains the quartile;
• n_Q is the absolute frequency of the interval that contains the quartile;
• w is the width of that interval;
• the interval that contains Q1 is the first one whose cumulative absolute frequency is at least n/4;
• the interval that contains Q3 is the first one whose cumulative absolute frequency is at least 3n/4
Example: Quartiles
Example: time (in minutes) that 28 employees spend to get to work

Driving time   n_i   N_i   f_i    F_i    c_i
[0 – 10)        3     3    0.11   0.11    5
[10 – 20)      10    13    0.36   0.46   15
[20 – 30)       7    20    0.25   0.71   25
[30 – 40)       4    24    0.14   0.86   35
[40 – 50)       2    26    0.07   0.93   45
[50 – 60]       2    28    0.07   1.00   55
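
Both quartiles use the same interpolation as the median, with n/4 and 3n/4 in place of n/2. The sketch below generalizes it to any proportion p; the function `grouped_quantile` and its parameter `p` are assumptions made for illustration and are not part of the slides.

```python
# Grouped data from the example: (lower bound, upper bound, absolute frequency n_i)
intervals = [(0, 10, 3), (10, 20, 10), (20, 30, 7), (30, 40, 4), (40, 50, 2), (50, 60, 2)]

def grouped_quantile(intervals, p):
    """Quantile of order p (0 < p < 1) by linear interpolation within its interval."""
    n = sum(freq for _, _, freq in intervals)
    target = p * n  # n/4 for Q1, 3n/4 for Q3
    cumulative = 0  # N_{i-1}: cumulative frequency before the current interval
    for low, high, freq in intervals:
        if cumulative + freq >= target:  # first interval with N_i >= target
            return low + (target - cumulative) * (high - low) / freq
        cumulative += freq
    raise ValueError("no observations")

q1 = grouped_quantile(intervals, 0.25)
q3 = grouped_quantile(intervals, 0.75)
print(q1, q3, q3 - q1)  # 14.0 32.5 18.5
```

For Q1 the target is 7, which falls in [10 – 20): 10 + (7 − 3) / 10 × 10 = 14. For Q3 the target is 21, which falls in [30 – 40): 30 + (21 − 20) / 4 × 10 = 32.5. The interquartile range is therefore 18.5 minutes.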
Dispersion Measures: Variance and Stdev
Variance

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{k} n_i c_i^2 - \frac{n}{n-1}\,\bar{x}^2$$

Standard deviation

$$S = \sqrt{S^2}$$

where:
• ni is the absolute frequency of the ith class;
• ci is the class mark of the ith class
Example: Variance and Stdev
Example: time (in minutes) that 28 employees spend to get to work

Driving time   Number of employees (n_i)   Class mark (c_i)
[0 – 10)        3                            5
[10 – 20)      10                           15
[20 – 30)       7                           25
[30 – 40)       4                           35
[40 – 50)       2                           45
[50 – 60]       2                           55
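
A minimal Python sketch of the variance and standard deviation for this table, using the shortcut formula with n − 1 in the denominator and assuming class marks are interval midpoints. The name `grouped_variance` is illustrative.

```python
import math

# Grouped data from the example: (lower bound, upper bound, absolute frequency n_i)
intervals = [(0, 10, 3), (10, 20, 10), (20, 30, 7), (30, 40, 4), (40, 50, 2), (50, 60, 2)]

def grouped_variance(intervals):
    """Sample variance of grouped data: (sum n_i c_i^2 - n * mean^2) / (n - 1)."""
    n = sum(freq for _, _, freq in intervals)
    marks = [((low + high) / 2, freq) for low, high, freq in intervals]
    mean = sum(freq * c for c, freq in marks) / n
    sum_of_squares = sum(freq * c ** 2 for c, freq in marks)
    return (sum_of_squares - n * mean ** 2) / (n - 1)

variance = grouped_variance(intervals)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 2))  # 192.06 13.86
```

By hand, the sum of n_i c_i² is 21700 and n x̄² = 680² / 28 ≈ 16514.29, so S² ≈ 5185.71 / 27 ≈ 192.06 and S ≈ 13.86 minutes.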
Data Transformation
Generic change of unit of measure

$$Y = \frac{X - a}{b}, \qquad b > 0$$

We can do two things with variable X:


1) Sum or subtract a constant (Origin change)
2) Multiply or divide by a constant (Scale change)
Mean
If we apply a linear transformation to a position measure, the position
measure will change exactly according to the transformation

If $\bar{x}$ is the mean of X, then the mean of Y is:

$$Y = \frac{X - a}{b} \;\Longrightarrow\; \bar{y} = \frac{\bar{x} - a}{b}$$
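
This property can be checked numerically on the driving-time data. In the sketch below the constants a = 20 and b = 10 are hypothetical values chosen only for illustration, and `weighted_mean` is an illustrative helper.

```python
import math

# Class marks c_i and absolute frequencies n_i from the driving-time example
class_marks = [5, 15, 25, 35, 45, 55]
frequencies = [3, 10, 7, 4, 2, 2]

def weighted_mean(values, weights):
    """Mean of grouped data: sum of n_i * value_i divided by n."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

# Hypothetical transformation constants, chosen only for illustration
a, b = 20, 10

x_mean = weighted_mean(class_marks, frequencies)
y_marks = [(c - a) / b for c in class_marks]  # transform every class mark
y_mean = weighted_mean(y_marks, frequencies)

# Transforming the data and then averaging matches transforming the mean directly
print(math.isclose(y_mean, (x_mean - a) / b))  # True
```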
Standard Deviation
If we apply a linear transformation to a spread measure, the spread
measure will be only affected by the change of scale, not by the change
of origin

If $s_X$ is the standard deviation of X, the standard deviation of Y is:

$$Y = \frac{X - a}{b} \;\Longrightarrow\; s_Y = \frac{s_X}{b}$$
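
A short numerical check of this rule on the driving-time data follows. The constants a = 20 and b = 10 are hypothetical, and `grouped_std` is an illustrative helper that uses the definitional form of the sample standard deviation (squared deviations from the mean, divided by n − 1), which is algebraically equivalent to the shortcut formula.

```python
import math

# Class marks c_i and absolute frequencies n_i from the driving-time example
class_marks = [5, 15, 25, 35, 45, 55]
frequencies = [3, 10, 7, 4, 2, 2]

def grouped_std(values, weights):
    """Sample standard deviation of grouped data (n - 1 in the denominator)."""
    n = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / n
    squared_dev = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return math.sqrt(squared_dev / (n - 1))

# Hypothetical transformation constants, chosen only for illustration
a, b = 20, 10

s_x = grouped_std(class_marks, frequencies)
s_shifted = grouped_std([c - a for c in class_marks], frequencies)    # origin change only
s_y = grouped_std([(c - a) / b for c in class_marks], frequencies)    # origin and scale change

print(math.isclose(s_shifted, s_x))  # True: subtracting a leaves the spread unchanged
print(math.isclose(s_y, s_x / b))    # True: dividing by b divides the spread by b
```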
