
Frequency Measures & Graphical Representation of Data


Discussion Topics
• Absolute and Relative Frequencies
• Empirical Cumulative Distribution Function (ECDF)
(ECDF for Ordinal Variables, ECDF for Continuous Variables)
• Graphical Representation of a Variable
(Bar Chart, Pie Chart, Histogram)

Absolute and Relative Frequencies

Frequency of Data on Discrete Variables (1)
Suppose there are 10 people in a queue. Each of them is coded as either “F”
(for female) or “M” (for male). The collected data may look like:
M, F, M, F, M, M, M, F, M, M.
We use $a_1$ to refer to the M category and $a_2$ to refer to the F category.
There are 7 values in category $a_1$, denoted as $n_1 = 7$, and 3 values in
category $a_2$, denoted as $n_2 = 3$.
The absolute frequency is the number of observations in a particular category.
Since $n_1 + n_2 = n = 10$, we can also calculate the relative frequencies of
$a_1$ and $a_2$ as
$$f_1 = f(a_1) = \frac{n_1}{n} = \frac{7}{10} = 0.7 = 70\%, \qquad f_2 = f(a_2) = \frac{n_2}{n} = \frac{3}{10} = 0.3 = 30\%,$$
respectively.
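The counts in this example can be reproduced with a few lines of Python. This is a minimal sketch using the queue data given above; the variable names (`queue`, `abs_freq`, `rel_freq`) are illustrative choices and not part of the original slides.

```python
from collections import Counter

# Queue data from the example: "M" = male, "F" = female
queue = ["M", "F", "M", "F", "M", "M", "M", "F", "M", "M"]

n = len(queue)             # total number of observations
abs_freq = Counter(queue)  # absolute frequencies n_j per category

# Relative frequencies f_j = n_j / n
rel_freq = {a: n_j / n for a, n_j in abs_freq.items()}

for category in sorted(abs_freq):
    print(category, abs_freq[category], rel_freq[category])
```

The relative frequencies always sum to 1, which is a quick sanity check on any frequency table.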
Frequency of Data on Discrete Variables (2)
Suppose there are $k$ categories denoted as $a_1, a_2, \ldots, a_k$,
with $n_j$ ($j = 1, 2, \ldots, k$) observations in category $a_j$.
The absolute frequency $n_j$ is defined as the number of
units in the $j$th category $a_j$.
The sum of absolute frequencies equals the total number
of units in the data: $\sum_{j=1}^{k} n_j = n$. The relative frequencies
of the $j$th class are defined as
$$f_j = f(a_j) = \frac{n_j}{n}, \quad j = 1, 2, \ldots, k.$$
Empirical Cumulative Distribution Function (ECDF)

Empirical Cumulative Distribution Function (ECDF)
Consider $n$ observations $x_1, x_2, \ldots, x_n$ of a variable $X$, which are arranged in ascending
order as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. The empirical cumulative distribution function $F(x)$
is defined as the cumulative relative frequencies of all values $a_j$ which are smaller
than, or equal to, $x$:
$$F(x) = \sum_{a_j \le x} f(a_j)$$
• $F(x)$ is a monotonically non-decreasing function,
• $0 \le F(x) \le 1$,
• $\lim_{x \to -\infty} F(x) = 0$,
• $\lim_{x \to +\infty} F(x) = 1$,
• $F(x)$ is right continuous.
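The definition can be turned into code almost literally. The following is a minimal sketch, assuming a small hypothetical sample; the function name `ecdf` and the data are illustrative only and do not come from the slides.

```python
from collections import Counter

def ecdf(data):
    """Return a function F with F(x) = sum of f(a_j) over all a_j <= x."""
    n = len(data)
    counts = Counter(data)  # absolute frequencies n_j of the distinct values a_j

    def F(x):
        # Add up the absolute frequencies of all values a_j <= x, then divide by n
        return sum(n_j for a_j, n_j in counts.items() if a_j <= x) / n

    return F

# Hypothetical sample, chosen only for illustration
sample = [1, 2, 2, 3, 3, 3, 4, 5, 5, 5]
F = ecdf(sample)

print(F(0), F(2), F(3.5), F(5))
```

Evaluating `F` below the smallest value gives 0, at or above the largest value gives 1, and between two observed values it stays constant, matching the properties listed above.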
ECDF for Ordinal Variables (1)

Consider a customer satisfaction survey from a car service company. The 200
customers who had a car service done within the last 30 days were asked to
respond regarding their overall level of satisfaction with the quality of the
car service on a scale from 1 to 5 based on the following options:
1 = not satisfied at all, 2 = unsatisfied, 3 = satisfied, 4 = very
satisfied, and 5 = perfectly satisfied.
ECDF for Ordinal Variables (2)

Obtain the relative frequencies from ECDF
Let $H(c \le x \le d)$ denote the relative frequency of
values $x$ with $c \le x \le d$. Then:
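Because $F(x)$ adds up the relative frequencies of all values up to and including $x$, its definition implies $H(c \le x \le d) = F(d) - F(c) + f(c)$: subtracting $F(c)$ also removes the observations equal to $c$, so $f(c)$ is added back. The sketch below checks this identity numerically. The sample and the helper names (`f`, `F`, `H`) are assumptions made for illustration, not data from the survey.

```python
# Hypothetical ordinal sample (e.g. ratings on a 1-5 scale), for illustration only
sample = [1, 2, 2, 3, 3, 3, 4, 5, 5, 5]
n = len(sample)

def f(a):
    """Relative frequency of the single value a."""
    return sample.count(a) / n

def F(x):
    """ECDF: relative frequency of values <= x."""
    return sum(1 for v in sample if v <= x) / n

def H(c, d):
    """Relative frequency of values in [c, d], counted directly."""
    return sum(1 for v in sample if c <= v <= d) / n

c, d = 2, 4
direct = H(c, d)               # counted directly from the data
via_ecdf = F(d) - F(c) + f(c)  # obtained from the ECDF

print(direct, via_ecdf)
```

The two printed values agree up to floating-point rounding.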

ECDF for Continuous Variables (1)
Assume:
• $k$ : number of groups (or intervals)
• $e_{j-1}$ : lower limit of the $j$th interval
• $e_j$ : upper limit of the $j$th interval
• $d_j = e_j - e_{j-1}$ : width of the $j$th interval
• $n_j$ : number of observations in the $j$th interval
The ECDF can then be defined as
$$F(x) = \begin{cases} 0, & x < e_0 \\ F(e_{j-1}) + \dfrac{f_j}{d_j}\,(x - e_{j-1}), & x \in [e_{j-1}, e_j) \\ 1, & x \ge e_k \end{cases}$$
with $F(e_0) = 0$.
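The piecewise definition amounts to linear interpolation within each interval. A minimal sketch follows, assuming hypothetical interval limits and counts; the function name `grouped_ecdf` and the numbers are illustrative and are not taken from the slides.

```python
def grouped_ecdf(edges, counts):
    """Build F(x) for grouped data with interval limits e_0 < e_1 < ... < e_k."""
    n = sum(counts)
    rel = [n_j / n for n_j in counts]  # relative frequencies f_j

    # cum[j] = F(e_j) for j = 0..k, starting from F(e_0) = 0
    cum = [0.0]
    for f_j in rel:
        cum.append(cum[-1] + f_j)

    def F(x):
        if x < edges[0]:
            return 0.0
        if x >= edges[-1]:
            return 1.0
        # Find the interval [e_{j-1}, e_j) containing x and interpolate linearly
        for j in range(1, len(edges)):
            if edges[j - 1] <= x < edges[j]:
                d_j = edges[j] - edges[j - 1]
                return cum[j - 1] + rel[j - 1] / d_j * (x - edges[j - 1])

    return F

# Hypothetical interval limits (in minutes) and counts, for illustration only
edges = [10, 15, 20, 25, 30]
counts = [2, 5, 8, 5]
F = grouped_ecdf(edges, counts)

print(F(5), F(15), F(17.5), F(30))
```

The interpolation assumes that observations are spread evenly within each interval, which is the assumption built into the formula above.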
ECDF for Pizza Delivery Time (1)
Consider the pizza delivery service. If the delivery times are grouped into intervals, i.e.
whether the delivery took less than 15 min, between 15 and 20 min, between 20 and 25 min, and
so on, then we can construct the ECDF by creating a table summarizing the data features.

ECDF for Pizza Delivery Time (2)

Graphical Representation of a Variable

Bar Charts for Pizza Deliveries per Branch

Pie Charts

Histogram for Pizza Delivery Time

Thank You!
Agus Sukmana and Erwinna Chendra
AMS 181501.04 STATISTIKA ELEMENTER
