
Topic 1 - Exercises

Exercise 1 (2.14)
A café recently surveyed its coffee sales by customer age. The owner
recorded the first 30 customers who visited the café and chose coffee
as their drink. The customers' ages were recorded in a sample data set
as follows:

a. Construct a frequency distribution


b. Develop a histogram based on the frequency distribution in (a)
c. Add two more columns which show the relative frequency and
cumulative relative frequency for the frequency distribution in (a)
d. Determine what percentage of the café's customers who chose to
drink coffee are less than 29 years old.
Exercise 1 - Solution
a. Sort the customers' ages:

Find the number of classes k:

• Use the “2 to the k” rule: $2^k > n$
• Solution: $k > \log_2(n) \rightarrow k > \frac{\log(30)}{\log 2} = \frac{1.4771}{0.3010} = 4.907 \rightarrow k = 5$
• Verify: $2^5 = 32 > 30$

Find the class width W:
• $W = \frac{\max - \min}{k} = \frac{52 - 13}{5} = 7.8 \approx 8$ (round up)
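The two steps above can be sketched in Python as a quick check. The variable names are illustrative only; the values n = 30, min = 13 and max = 52 are the ones used on the slide.

```python
import math

n = 30                   # sample size from the exercise
x_min, x_max = 13, 52    # smallest and largest ages used on the slide

# "2 to the k" rule: smallest integer k with 2**k > n.
k = 1
while 2 ** k <= n:
    k += 1

# Class width, rounded up to a whole number of years.
width = math.ceil((x_max - x_min) / k)

print(k, width)
```

Rounding the width up (rather than to the nearest integer) ensures that k classes of that width cover the whole range from the minimum to the maximum.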
We have k = 5 classes, each with a width of 8.


The frequency distribution:

Age Frequency
13 – < 21 5
21 – < 29 8
29 – < 37 5
37 – < 45 6
45 – < 53 6
b. Histogram:
[Figure: histogram of customer age, using the five classes and frequencies from (a)]
c. Relative and cumulative frequencies

Age          Frequency   Relative frequency   Cumulative relative frequency
13 – < 21    5           0.1667               0.1667
21 – < 29    8           0.2667               0.4333
29 – < 37    5           0.1667               0.6000
37 – < 45    6           0.2000               0.8000
45 – < 53    6           0.2000               1.0000
Total        30          1.0000
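As a cross-check, the two extra columns can be recomputed from the class frequencies alone. This is a minimal sketch; the list names are illustrative, and the frequencies are the ones from part (a).

```python
from itertools import accumulate

classes = ["13 - <21", "21 - <29", "29 - <37", "37 - <45", "45 - <53"]
freq = [5, 8, 5, 6, 6]   # class frequencies from part (a)

n = sum(freq)
# Relative frequency: class count divided by the sample size.
rel = [f / n for f in freq]
# Cumulative relative frequency: running count divided by the sample size.
cum = [c / n for c in accumulate(freq)]

for label, f, r, c in zip(classes, freq, rel, cum):
    print(f"{label:10} {f:3d} {r:8.4f} {c:8.4f}")
```

Accumulating the counts before dividing avoids small floating-point drift in the last cumulative value.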
d. Determine what percentage of the café's customers who chose to
drink coffee are less than 29 years old.
• From the cumulative relative frequency distribution: 43.33% of the
café's customers who chose to drink coffee are less than 29 years old.
Exercise 2 (3.74)
Recently, a store manager tracked the time customers
spent in the store from the time they took a number until
they left. A sample of 16 customers was selected and the
following data (measured in minutes) were recorded:

15 14 16 14 14 14 13 8 12 9 7 17 10 15 16 16

a. Find and interpret the mean, standard deviation, mode, range,
median, and interquartile range
b. Are there any outliers?
c. Develop a box and whisker plot for these data
Exercise 2 - Solution
a.
Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{15 + 14 + \dots + 16}{16} = \frac{210}{16} = 13.125$

Sample standard deviation:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{141.75}{16 - 1}} = \sqrt{9.45} = 3.0741$

        x      x − x̄     (x − x̄)²
x_1     15     1.875     3.515625
        14     0.875     0.765625
        16     2.875     8.265625
        14     0.875     0.765625
        14     0.875     0.765625
        14     0.875     0.765625
        13    –0.125     0.015625
         8    –5.125    26.26563
        12    –1.125     1.265625
         9    –4.125    17.01563
         7    –6.125    37.51563
        17     3.875    15.01563
        10    –3.125     9.765625
        15     1.875     3.515625
        16     2.875     8.265625
x_16    16     2.875     8.265625
sum    210             141.75
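The mean and sample standard deviation can be checked with Python's standard `statistics` module. This is only a verification sketch; `times` is an illustrative name for the 16 recorded values.

```python
import statistics

# Times in minutes, in the order given in the exercise.
times = [15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16]

mean = statistics.mean(times)
# statistics.stdev is the sample standard deviation (divides by n - 1).
s = statistics.stdev(times)

print(mean, round(s, 4))
```

Note that `statistics.pstdev` would divide by n instead of n − 1 and therefore give a slightly smaller value than the one on the slide.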
Sample standard deviation, alternative formula (3.14 in the book):

$s = \sqrt{\frac{\sum_i x_i^2 - \frac{(\sum_i x_i)^2}{n}}{n - 1}} = \sqrt{\frac{2898 - \frac{(210)^2}{16}}{16 - 1}} = \sqrt{\frac{141.75}{15}} = \sqrt{9.45} = 3.0741$

        x      x²
x_1     15     225
        14     196
        16     256
        14     196
        14     196
        14     196
        13     169
         8      64
        12     144
         9      81
         7      49
        17     289
        10     100
        15     225
        16     256
x_16    16     256
sum    210    2898
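A short sketch of the alternative formula, which needs only the sum of the values and the sum of their squares. The variable names are illustrative.

```python
import math

times = [15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16]

n = len(times)
sum_x = sum(times)                    # sum of the values
sum_x2 = sum(x * x for x in times)    # sum of the squared values

# Shortcut formula: s = sqrt((sum(x^2) - (sum(x))^2 / n) / (n - 1))
s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(sum_x, sum_x2, round(s, 4))
```

Both formulas give the same result here; the shortcut form simply avoids computing each deviation from the mean.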
Sort the data in ascending order:
value:      7 8 9 10 12 13 14 14 14 14 15 15 16 16 16 17
position i: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Mode = 14 (most frequent value)

Range = max – min = 17 – 7 = 10

Median ($Q_2$ = 50th percentile):

• n = 16 is an even number, so the median is the average of the two
middle values: $\frac{14 + 14}{2} = 14$
• Using $i = \frac{p}{100} n = \frac{50}{100}(16) = 8$, an integer, so the median is the
average of the values at the 8th and 9th positions: $\frac{14 + 14}{2} = 14$
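The mode, range and median can be confirmed with a few lines of Python. This is a checking sketch using the standard `statistics` module; the variable names are illustrative.

```python
import statistics

times = [15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16]

mode = statistics.mode(times)            # most frequent value
data_range = max(times) - min(times)     # max minus min
# For an even n, statistics.median averages the two middle values.
median = statistics.median(times)

print(mode, data_range, median)
```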
value:      7 8 9 10 12 13 14 14 14 14 15 15 16 16 16 17
position i: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

First quartile ($Q_1$ = 25th percentile):

• $i = \frac{p}{100} n = \frac{25}{100}(16) = 4$, an integer, so $Q_1$ is the average of the
values at the 4th and 5th positions: $\frac{10 + 12}{2} = 11$

Third quartile ($Q_3$ = 75th percentile):

• $i = \frac{p}{100} n = \frac{75}{100}(16) = 12$, an integer, so $Q_3$ is the average of the
values at the 12th and 13th positions: $\frac{15 + 16}{2} = 15.5$
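The location rule used for the median and quartiles can be sketched as a small function. The function name `percentile` is hypothetical. The slides only show the case where i is an integer; the branch for a non-integer i (round up to the next position) is an assumption about the usual companion rule and is not taken from the slides.

```python
def percentile(sorted_data, p):
    """Percentile by the location rule i = (p / 100) * n.

    If i is an integer, average the values at positions i and i + 1
    (1-based), as on the slides. If i is not an integer, take the value
    at the next higher position (ASSUMPTION: not shown on the slides).
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    n = len(sorted_data)
    whole, remainder = divmod(p * n, 100)
    whole = int(whole)
    if remainder == 0:
        # Positions i and i + 1 are indices i - 1 and i in 0-based terms.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Non-integer i: rounding up gives position whole + 1, i.e. index whole.
    return sorted_data[whole]


times = sorted([15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16])
q1 = percentile(times, 25)
q2 = percentile(times, 50)
q3 = percentile(times, 75)
print(q1, q2, q3)
```

Library routines such as `statistics.quantiles` or `numpy.percentile` use interpolation rules by default, so they may return slightly different quartiles from the hand calculation above.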
The interquartile range: $IQR = Q_3 - Q_1 = 15.5 - 11 = 4.5$

b. Outliers?
• Lower limit: $Q_1 - 1.5 \times IQR = 11 - 1.5 \times 4.5 = 4.25$
• Upper limit: $Q_3 + 1.5 \times IQR = 15.5 + 1.5 \times 4.5 = 22.25$
• ⇒ No outliers, because no value in the data is less than the
lower limit (4.25) or greater than the upper limit (22.25)
• ⇒ In the context of the problem: there are no outliers, since
the time spent in the store by each customer is greater than 4.25
minutes and less than 22.25 minutes.
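The outlier check above can be sketched as follows. The quartile values are taken from the hand calculation rather than recomputed, and the variable names are illustrative.

```python
times = [15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16]

q1, q3 = 11, 15.5          # quartiles from the hand calculation
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Any value outside the two limits is flagged as an outlier.
outliers = [x for x in times if x < lower_limit or x > upper_limit]

print(iqr, lower_limit, upper_limit, outliers)
```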
c. Boxplot

[Figure: box and whisker plot of the time spent in the store, in minutes]
min = 7, $Q_1$ = 11, $Q_2$ = 14, $Q_3$ = 15.5, max = 17

Is the data distribution skewed?
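One informal way to explore this question is to compare the mean with the median and the lengths of the two halves of the box and the two whiskers. The sketch below uses the values computed earlier; the variable names are illustrative, and these comparisons are rough indicators rather than a formal test.

```python
import statistics

times = [15, 14, 16, 14, 14, 14, 13, 8, 12, 9, 7, 17, 10, 15, 16, 16]

mean = statistics.mean(times)
median = statistics.median(times)
q1, q3 = 11, 15.5                      # quartiles from the hand calculation

lower_box = median - q1                # Q2 - Q1
upper_box = q3 - median                # Q3 - Q2
lower_whisker = q1 - min(times)        # Q1 - min
upper_whisker = max(times) - q3        # max - Q3

print(mean, median)
print(lower_box, upper_box, lower_whisker, upper_whisker)
```

Here the mean lies below the median, and both the lower half of the box and the lower whisker are longer than their upper counterparts, which would point towards a left (negative) skew.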
