Summary of Lecture 3: Numerical Descriptive Measures


I. Measures of central location


1. Mean (average value)
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
- For frequency table:
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{n}$$
- For grouped data
$$\bar{x} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{n}$$
2. Median
- Place all the observations in order (ascending or descending).
- The observation that falls in the middle is the median. When the number of
observations is even, the median is the average of the two middle observations.
3. Mode: the observation (or observations) that occurs with the greatest
frequency.
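
The frequency-table and grouped-data formulas for the mean can be sketched in Python as follows. The tables here are hypothetical illustration data, not from the lecture, and the sketch assumes that m_i denotes the midpoint of class i, which the notes do not state explicitly.

```python
# Hypothetical frequency table (value -> frequency); not from the lecture.
freq_table = {1: 3, 2: 5, 3: 2}

def frequency_mean(table):
    """Mean from a frequency table: sum(f_i * x_i) / n, where n = sum(f_i)."""
    n = sum(table.values())
    return sum(f * x for x, f in table.items()) / n

# Hypothetical grouped data: (lower limit, upper limit, frequency) per class.
classes = [(0, 10, 4), (10, 20, 6), (20, 30, 2)]

def grouped_mean(classes):
    """Approximate mean for grouped data.

    Assumption: m_i is taken to be the class midpoint (lower + upper) / 2.
    """
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

print(frequency_mean(freq_table))
print(grouped_mean(classes))
```

The grouped-data result is only an approximation of the true mean, because each observation is replaced by the midpoint of its class.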

Example: A sample of 12 people was asked how much change they had in
their pockets and wallets. The responses (in cents) are:
52; 25; 15; 30; 104; 44; 60; 30; 33; 81; 40; 5
- The mean is
$$\bar{x} = \frac{52 + 25 + 15 + \cdots + 5}{12} = 43.25$$
- The median
+ Arrange the data in increasing order:
5; 15; 25; 30; 30; 33; 40; 44; 52; 60; 81; 104
+ There are 2 middle values, 33 and 40, so the median is
$$m = \frac{33 + 40}{2} = 36.5$$
- The mode is 30, since it occurs most frequently (2 times).
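
The three measures in this example can be checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are illustrative, and `multimode` requires Python 3.8 or later.

```python
import statistics

# Change (in cents) reported by the 12 people in the example.
change = [52, 25, 15, 30, 104, 44, 60, 30, 33, 81, 40, 5]

mean = statistics.mean(change)        # arithmetic mean
median = statistics.median(change)    # averages the two middle values when n is even
modes = statistics.multimode(change)  # list of all most frequent values

print(mean, median, modes)  # 43.25 36.5 [30]
```

`multimode` returns a list because a data set may have more than one mode, which matches the "observation (or observations)" wording in the definition.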

II. Measures of variability


1. Range
range = largest observation – smallest observation

2. Variance
$$s^2 = \frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right]$$

- For frequency table:
$$s^2 = \frac{f_1 (x_1 - \bar{x})^2 + \cdots + f_k (x_k - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2\right]$$

- For grouped data
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{k} f_i m_i^2 - n\bar{x}^2\right]$$
3. Standard deviation:
$$s = \sqrt{s^2}$$
4. Coefficient of variation:
$$CV = \frac{s}{\bar{x}}$$
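
The notes give the shortcut form of the variance without showing why it equals the definition. A short derivation, using only the definition of the mean (so that the sum of the x_i equals n times the mean), might look like this:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\,(n\bar{x}) + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 .
\end{align*}
```

Dividing both sides by n - 1 gives the two equivalent expressions for the sample variance.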
Example: A sample of 12 people was asked how much change they had in
their pockets and wallets. The responses (in cents) are:
52; 25; 15; 30; 104; 44; 60; 30; 33; 81; 40; 5
• The range = largest observation – smallest observation
= 104 – 5 = 99
• The variance is
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right] = \frac{1}{11}\left[52^2 + 25^2 + \cdots + 5^2 - 12 \times 43.25^2\right] = 775.8409$$

• The standard deviation is
$$s = \sqrt{s^2} = \sqrt{775.8409} = 27.85392$$
• The coefficient of variation is
$$CV = \frac{s}{\bar{x}} = \frac{27.85392}{43.25} \approx 0.64 = 64\%$$
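
The variability measures in this example can be reproduced with a few lines of Python. This sketch uses the shortcut formula from the notes and, as a sanity check, compares it with `statistics.variance`, which also divides by n - 1. Variable names are illustrative.

```python
import math
import statistics

# Change (in cents) reported by the 12 people in the example.
change = [52, 25, 15, 30, 104, 44, 60, 30, 33, 81, 40, 5]

n = len(change)
mean = sum(change) / n
data_range = max(change) - min(change)

# Shortcut form: (sum of squares - n * mean^2) / (n - 1).
variance = (sum(x * x for x in change) - n * mean ** 2) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean

# Cross-check against the standard library's sample variance.
assert math.isclose(variance, statistics.variance(change))

print(data_range, round(variance, 4), round(std_dev, 5), round(cv, 2))
```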
III. Measures of relative standing and box-plot
• Pth percentile: the value for which P% of the observations are less than
that value and (100–P)% are greater than that value.
To calculate the Pth percentile:
+ Determine the location of the Pth percentile:
$$L_P = (n + 1)\frac{P}{100}$$
+ Arrange the data in increasing order and find the value at position $L_P$.
If $L_P$ is not a whole number, interpolate between the two neighbouring
observations.
• The first or lower quartile, $Q_1$, is equal to the 25th percentile.
The second quartile, $Q_2$, is equal to the 50th percentile.
The third or upper quartile, $Q_3$, is equal to the 75th percentile.
The interquartile range is $IQR = Q_3 - Q_1$.
Example: A sample of 12 people was asked how much change they had in
their pockets and wallets. The responses (in cents) are:
52; 25; 15; 30; 104; 44; 60; 30; 33; 81; 40; 5
- Find the 60th percentile
+ The location of the 60th percentile:
$$L_{60} = (n + 1)\frac{P}{100} = 13 \times \frac{60}{100} = 7.8$$
+ Arrange the data in increasing order:
5; 15; 25; 30; 30; 33; 40; 44; 52; 60; 81; 104
+ The 60th percentile lies 0.8 of the way from the 7th value (40) to the
8th value (44): 40 + (44 - 40)*0.8 = 43.2
- Calculate the IQR
+ The location of the 25th percentile:
$$L_{25} = (n + 1)\frac{P}{100} = 13 \times \frac{25}{100} = 3.25$$

+ Arrange the data in increasing order:
5; 15; 25; 30; 30; 33; 40; 44; 52; 60; 81; 104
+ The first quartile lies 0.25 of the way from the 3rd value (25) to the
4th value (30): $Q_1$ = 25 + (30 - 25)*0.25 = 26.25
+ The location of the 75th percentile:
$$L_{75} = (n + 1)\frac{P}{100} = 13 \times \frac{75}{100} = 9.75$$
+ From the same ordered data, the third quartile lies 0.75 of the way from
the 9th value (52) to the 10th value (60): $Q_3$ = 52 + (60 - 52)*0.75 = 58
+ The interquartile range is $IQR = Q_3 - Q_1 = 58 - 26.25 = 31.75$
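
The location-and-interpolation procedure used in this example can be sketched as a small Python function. The function name `percentile` and the clamping of locations that fall outside the data are assumptions made for illustration; the notes do not cover that edge case.

```python
def percentile(data, p):
    """P-th percentile using the location L_P = (n + 1) * P / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100   # 1-based, possibly fractional position
    whole = int(location)          # position of the lower neighbour
    frac = location - whole        # fraction of the way to the upper neighbour
    # Assumption: locations outside 1..n are clamped to the extreme values.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + (upper - lower) * frac

# Change (in cents) reported by the 12 people in the example.
change = [52, 25, 15, 30, 104, 44, 60, 30, 33, 81, 40, 5]

p60 = percentile(change, 60)
q1 = percentile(change, 25)
q3 = percentile(change, 75)
iqr = q3 - q1

print(round(p60, 2), q1, q3, iqr)  # 43.2 26.25 58.0 31.75
```

Other percentile conventions exist and software packages differ in their defaults, so results from a library may not match this (n + 1) rule unless the corresponding method is selected. In Python's standard library, `statistics.quantiles(change, n=4, method="exclusive")` appears to follow the same rule for the quartiles.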
• Box-plot
+) 1.5IQR rule: An observation $x_0$ is an outlier if
$x_0 \notin [Q_1 - 1.5\,IQR;\; Q_3 + 1.5\,IQR]$.
- If the data have no outliers, the box-plot graphs five statistics: the
minimum and maximum observations and the first, second, and third
quartiles.
- If the data have at least one outlier, the box-plot graphs the same five
statistics together with the outliers and the two whiskers
$Q_1 - 1.5\,IQR$ and $Q_3 + 1.5\,IQR$.
Example: A sample of 12 people was asked how much change they had
in their pockets and wallets. The responses (in cents) are:
52; 25; 15; 30; 104; 44; 60; 30; 33; 81; 40; 5
Draw the box-plot.
- The whiskers are $[Q_1 - 1.5\,IQR;\; Q_3 + 1.5\,IQR] = [26.25 - 1.5 \times 31.75;\; 58 + 1.5 \times 31.75] = [-21.375;\; 105.625]$.
All observations lie inside the whiskers, so the data have no outliers.
The box-plot graphs five statistics: the minimum and maximum observations
and the first, second, and third quartiles: 5; 104; 26.25; 36.5; 58
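
The 1.5IQR rule can be applied programmatically as a check on this example. The sketch below takes the quartiles from the worked example as given; the helper name `iqr_limits` is hypothetical.

```python
def iqr_limits(q1, q3):
    """Return the limits (Q1 - 1.5*IQR, Q3 + 1.5*IQR) of the 1.5IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Change (in cents) reported by the 12 people in the example.
change = [52, 25, 15, 30, 104, 44, 60, 30, 33, 81, 40, 5]

# Quartiles computed earlier in the notes with the (n + 1) * P / 100 rule.
lower, upper = iqr_limits(26.25, 58)

# An observation is an outlier if it falls outside [lower, upper].
outliers = [x for x in change if x < lower or x > upper]

print(lower, upper, outliers)
```

For this data set the list of outliers is empty, which agrees with the conclusion that the box-plot shows only the five summary statistics.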
