Frequency Measures and Graphical Representation of Data
Absolute and Relative Frequencies
Frequency of Data on Discrete Variables (1)
Suppose there are 10 people in a queue. Each of them is either coded as “F”
(for female) or “M” (for male). The collected data may look like:
M, F, M, F, M, M, M, F, M, M.
We use $a_1$ to refer to the M category and $a_2$ to refer to the F category. We also
have 7 values in category $a_1$, denoted as $n_1 = 7$, and 3 values in category $a_2$,
denoted as $n_2 = 3$.
The absolute frequency is the number of observations in a particular category.
Note that $n_1 + n_2 = n = 10$. We can also calculate the relative frequencies of
$a_1$ and $a_2$ as $f_1 = f(a_1) = \frac{n_1}{n} = \frac{7}{10} = 0.7 = 70\%$ and
$f_2 = f(a_2) = \frac{n_2}{n} = \frac{3}{10} = 0.3 = 30\%$, respectively.
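The counting in this example can be reproduced with a few lines of Python. This is a minimal sketch using the queue data above; the variable names `queue`, `absolute`, and `relative` are illustrative choices, not part of the slides.

```python
from collections import Counter

# Queue data from the example above: M = male, F = female.
queue = ["M", "F", "M", "F", "M", "M", "M", "F", "M", "M"]

n = len(queue)             # total number of observations
absolute = Counter(queue)  # absolute frequencies n_j per category
relative = {a: n_j / n for a, n_j in absolute.items()}  # f_j = n_j / n

print(absolute["M"], absolute["F"])  # 7 3
print(relative["M"], relative["F"])  # 0.7 0.3
```

The relative frequencies always sum to 1 because the absolute frequencies sum to the number of observations.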
Frequency of Data on Discrete Variables (2)
Suppose there are $k$ categories denoted as $a_1, a_2, \ldots, a_k$
with $n_j$ ($j = 1, 2, \ldots, k$) observations in category $a_j$.
The absolute frequency $n_j$ is defined as the number of
units in the $j$th category $a_j$.
The sum of absolute frequencies equals the total number
of units in the data: $\sum_{j=1}^{k} n_j = n$. The relative frequency
of the $j$th category is defined as
$$f_j = f(a_j) = \frac{n_j}{n}, \quad j = 1, 2, \ldots, k$$
Empirical Cumulative Distribution Function (ECDF)
Consider $n$ observations $x_1, x_2, \ldots, x_n$ of a variable $X$, which are arranged in ascending
order as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. The empirical cumulative distribution function $F(x)$
is defined as the cumulative relative frequency of all values $a_j$ that are smaller
than, or equal to, $x$:
$$F(x) = \sum_{a_j \le x} f(a_j)$$
• $F(x)$ is a monotonically non-decreasing function,
• $0 \le F(x) \le 1$,
• $\lim_{x \to -\infty} F(x) = 0$,
• $\lim_{x \to +\infty} F(x) = 1$,
• $F(x)$ is right continuous.
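The definition and the listed properties can be checked on a small data set. The sketch below is one possible implementation, not the one used in the course; the helper name `make_ecdf` and the ratings are hypothetical and serve only as an illustration.

```python
from bisect import bisect_right

def make_ecdf(observations):
    """Return a function F where F(x) is the share of observations <= x."""
    ordered = sorted(observations)
    n = len(ordered)

    def F(x):
        # bisect_right gives the number of sorted values that are <= x.
        return bisect_right(ordered, x) / n

    return F

# Hypothetical ordinal ratings on a 1-5 scale (illustrative data only).
ratings = [1, 2, 2, 3, 3, 3, 4, 5, 5, 5]
F = make_ecdf(ratings)

print(F(0))    # 0.0  (below the smallest observation)
print(F(3))    # 0.6  (six of ten values are <= 3)
print(F(3.5))  # 0.6  (step function: constant between observed values)
print(F(5))    # 1.0  (at or above the largest observation)
```

Because `F` jumps exactly at an observed value and stays flat until the next one, it is a right-continuous step function.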
ECDF for Ordinal Variables (1)
Obtain the relative frequencies from ECDF
Suppose $H(c \le x \le d)$ denotes the relative frequency of
values $x$ with $c \le x \le d$. Then:
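One identity of this kind follows directly from the ECDF definition for discrete values: $H(c \le x \le d) = F(d) - F(c) + f(c)$, because $F(d) - F(c)$ covers the values with $c < x \le d$ and $f(c)$ adds back the value $c$ itself. The sketch below checks this identity against a direct count; the data and the helper names `F`, `f`, and `H_closed` are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right

# Hypothetical ordinal ratings on a 1-5 scale (illustrative data only).
data = sorted([1, 2, 2, 3, 3, 3, 4, 5, 5, 5])
n = len(data)

def F(x):
    """ECDF: share of observations <= x."""
    return bisect_right(data, x) / n

def f(x):
    """Relative frequency of the single value x."""
    return (bisect_right(data, x) - bisect_left(data, x)) / n

def H_closed(c, d):
    """Relative frequency of c <= x <= d, computed from the ECDF."""
    return F(d) - F(c) + f(c)

c, d = 2, 4
direct = sum(1 for x in data if c <= x <= d) / n  # count the values directly

print(math.isclose(H_closed(c, d), direct))  # True
```

The comparison uses `math.isclose` because subtracting floating-point fractions can introduce tiny rounding differences.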
ECDF for Continuous Variables (1)
Assume:
• $k$: number of groups (or intervals)
• $e_{j-1}$: lower limit of the $j$th interval
• $e_j$: upper limit of the $j$th interval
• $d_j = e_j - e_{j-1}$: width of the $j$th interval
• $n_j$: number of observations in the $j$th interval
The ECDF can then be defined as
$$F(x) = \begin{cases} 0, & x < e_0 \\ F(e_{j-1}) + \dfrac{f_j}{d_j}\,(x - e_{j-1}), & x \in [e_{j-1}, e_j) \\ 1, & x \ge e_k \end{cases}$$
with $F(e_0) = 0$.
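The piecewise definition amounts to linear interpolation between the cumulative relative frequencies at the interval limits. Here is a minimal sketch under that reading; the function name `grouped_ecdf` and the interval limits and counts are hypothetical and are not the pizza delivery data.

```python
def grouped_ecdf(edges, counts):
    """Piecewise-linear ECDF for data grouped into intervals.

    edges  : [e_0, e_1, ..., e_k], strictly increasing interval limits
    counts : [n_1, ..., n_k], number of observations per interval
    """
    n = sum(counts)
    # Cumulative values F(e_0) = 0, F(e_1), ..., F(e_k).
    cumulative = [0.0]
    for n_j in counts:
        cumulative.append(cumulative[-1] + n_j / n)

    def F(x):
        if x < edges[0]:
            return 0.0
        if x >= edges[-1]:
            return 1.0
        for j in range(1, len(edges)):
            if edges[j - 1] <= x < edges[j]:
                f_j = counts[j - 1] / n        # relative frequency of interval j
                d_j = edges[j] - edges[j - 1]  # width of interval j
                return cumulative[j - 1] + f_j / d_j * (x - edges[j - 1])

    return F

# Hypothetical grouped data: three intervals [0,10), [10,20), [20,30).
F = grouped_ecdf(edges=[0, 10, 20, 30], counts=[2, 5, 3])

print(F(-1))             # 0.0
print(round(F(15), 2))   # 0.45  (0.2 + 0.5/10 * 5)
print(F(30))             # 1.0
```

Within each interval the observations are treated as evenly spread, which is the assumption built into the linear term $\frac{f_j}{d_j}(x - e_{j-1})$.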
ECDF for Pizza Delivery Time (1)
Consider the pizza delivery service. If the delivery times are recorded only in a small number of
intervals, i.e. whether the delivery took less than 15 min, between 15 and 20 min, between 20 and 25 min, and
so on, then we can construct the ECDF by creating a table that summarizes the data.
ECDF for Pizza Delivery Time (2)
Graphical Representation of a Variable
Bar Charts for Pizza Deliveries per Branch
Pie Charts
Histogram for Pizza Delivery Time
Thank You!
Agus Sukmana and Erwinna Chendra
AMS 181501.04 STATISTIKA ELEMENTER