Lec02 - Central Tendency (Student)
Lecture 2
Measures of Central Tendency and
Variation
Kirk Chan & Charmaine Lau
Summary measures: Mode, Geometric Mean, Variance, Standard Deviation, Coefficient of Variation, Skewness
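Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Arithmetic mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Median: midpoint of ranked values
Mode: most frequently observed value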
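The two mean formulas above can be sketched in Python as follows. This is only an illustration: the function names and the sample values are assumptions, not part of the lecture.

```python
import math


def arithmetic_mean(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product of n positive observations."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    # Averaging the logs is equivalent to taking the n-th root of the
    # product, and avoids overflow when the product gets large.
    return math.exp(sum(math.log(v) for v in values) / len(values))


sample = [2, 8]  # hypothetical data, chosen for easy arithmetic
print(arithmetic_mean(sample))  # 5.0
print(geometric_mean(sample))   # approximately 4
```

For positive data that are not all equal, the geometric mean is smaller than the arithmetic mean, as in this example.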
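Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
where n is the sample size and $X_i$ is the ith observation
Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
where N is the population size and $X_i$ is the ith observation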
Mean = $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
Mean = $\frac{1 + 2 + 3 + 4 + 20}{5} = \frac{30}{5} = 6$
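The two calculations above show how a single extreme value pulls the mean. A minimal Python check (the helper name `mean` is an illustrative choice):

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


print(mean([1, 2, 3, 4, 5]))   # 3.0
print(mean([1, 2, 3, 4, 20]))  # 6.0
```

Replacing 5 with 20 changes one observation out of five, yet the mean doubles.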
[Figure: two dot plots, on axes from 0 to 10 and from 0 to 20]
Median = 3 in both plots
Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
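The difference between the position and the value of the median can be made concrete with a short Python sketch. The function name is hypothetical, and reusing the two data sets from the mean example is an assumption about what the median figure shows.

```python
def median_by_position(values):
    """Return (position, median), where position = (n + 1) / 2 is 1-based."""
    if not values:
        raise ValueError("need at least one observation")
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2  # where the median sits, not the median itself
    lower = ranked[int(position) - 1]
    if position.is_integer():
        return position, lower
    # Even n: the position falls halfway between two ranked values.
    upper = ranked[int(position)]
    return position, (lower + upper) / 2


print(median_by_position([1, 2, 3, 4, 5]))   # (3.0, 3)
print(median_by_position([1, 2, 3, 4, 20]))  # (3.0, 3)
```

Unlike the mean, the median is the same for both data sets: the extreme value 20 does not move it.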
Mode
[Figure: dot plots on axes from 0 to 14 and from 0 to 6]
Mode = 9
Answer:
Q1 Q2 Q3
◼ The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
◼ The second quartile, Q2, is the same as the median (50%
are smaller, 50% are larger)
◼ Only 25% of the observations are greater than the third
quartile, Q3
As n = 9,
Q1 is at position (9+1)/4 = 2.5 of the ranked data, so
we use the value halfway between the 2nd and the 3rd
values, which yields Q1 = 12.5
Q1 and Q3 are measures of non-central location while Q2,
i.e. median, is a measure of central tendency
[Table: ranked data, with columns Position and Value]
Find Quartiles (n = 10):
Q1 is at position (10+1)/4 = 2.75, rounded to the 3rd ranked value, so Q1 = 13
Q2 is at position (10+1)/2 = 5.5, halfway between the 5th and 6th ranked values, so Q2 = median = 16.5
Q3 is at position 3(10+1)/4 = 8.25, rounded to the 8th ranked value, so Q3 = 21
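The position rule used in these quartile examples (take the ranked value at a whole-number position, average the two neighbours at a half position, otherwise round to the nearest position) can be sketched as below. The function names are hypothetical, and the two data sets are illustrative assumptions chosen only so that the results agree with the answers quoted above; the slides' own data are not reproduced here.

```python
import math


def value_at_position(ranked, position):
    """Value at a 1-based ranked position, using the rounding rule above."""
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return ranked[lower - 1]
    if fraction == 0.5:
        # Halfway between two ranked values: average them.
        return (ranked[lower - 1] + ranked[lower]) / 2
    # Any other fraction (.25 or .75): round to the nearest position.
    nearest = lower + 1 if fraction > 0.5 else lower
    return ranked[nearest - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) at positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    if len(values) < 3:
        raise ValueError("this sketch assumes at least three observations")
    ranked = sorted(values)
    n = len(ranked)
    return tuple(value_at_position(ranked, k * (n + 1) / 4) for k in (1, 2, 3))


# Hypothetical data sets (assumptions, not taken from the slides).
nine = [11, 12, 13, 16, 16, 17, 18, 21, 22]
ten = [11, 12, 13, 16, 16, 17, 18, 21, 22, 25]
print(quartiles(nine)[0])  # 12.5
print(quartiles(ten))      # (13, 16.5, 21)
```

Statistical packages use several different quartile conventions, so library functions may return slightly different values from this rule.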
Variation
◼ Measures of variation give information on the spread or variability of the data values
[Figure: two distributions with the same center but different variation, one with small variation and one with large variation]
Foundation Statistics – Measures of Central Tendency and Variation 17
Range
Simplest measure of variation
Difference between the largest and the smallest values in
a set of data:
Range = $X_{\text{largest}} - X_{\text{smallest}}$
Example:
[Figure: dot plot on an axis from 0 to 14]
Range = 14 - 1 = 13
[Figure: two dot plots, each on an axis from 7 to 12]
Range = 12 - 7 = 5 for both
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
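The effect of the single outlier on the range can be checked with a few lines of Python. The helper name `value_range` and the variable names are illustrative; the data are the two lists above.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


without_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                   2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

Only one of the 25 observations differs between the two lists, but the range grows from 4 to 119.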
[Figure: box plot divided into four segments, each holding 25% of the observations]
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Interquartile range = Q3 - Q1 = 57 - 30 = 27
[Figure: three box plots, each marked with Q1, Q2 and Q3]
– Negative skew: The left tail is longer; the mass of the distribution is
concentrated on the right of the figure. It has relatively few low
values. The distribution is said to be left-skewed.
– Positive skew: The right tail is longer; the mass of the distribution is
concentrated on the left of the figure. It has relatively few high
values. The distribution is said to be right-skewed.
[Figure: box plot with marked values 0, 2, 3, 5 and 27]
The data are right-skewed, as the plot depicts
Answer:
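Mean ($\bar{X}$)
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
where n is the sample size, $X_i$ is the ith observation and $\bar{X}$ is the sample mean
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
where N is the population size, $X_i$ is the ith observation and $\mu$ is the population mean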
Mean
2
lower result
Answer:
N = 5
Mean $\mu = \frac{1 + 2 + 3 + 4 + 5}{5} = 3.00$
$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = 2.00$
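Written out term by term for the population 1, 2, 3, 4, 5 with mean 3, the result of 2.00 follows from:

```latex
\sigma^2 = \frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5}
         = \frac{4 + 1 + 0 + 1 + 4}{5}
         = \frac{10}{5}
         = 2.00
```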
Example - Variance (cont’d)
What if the values are a sample (1, 2, 3, 4, 5), n = 5?
$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Sample variance $s^2 = 2.50$
Imaginary population: 1, 1, 1, 2, 3, 4, 5, 5, 5, 5
$\sigma^2 = 2.96$
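The three variance figures in this example can be reproduced with a minimal Python sketch. The function and variable names are illustrative assumptions; Python's standard library offers the same calculations as `statistics.pvariance` and `statistics.variance`.

```python
def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [1, 2, 3, 4, 5]
imaginary_population = [1, 1, 1, 2, 3, 4, 5, 5, 5, 5]

print(population_variance(data))                            # 2.0
print(sample_variance(data))                                # 2.5
print(round(population_variance(imaginary_population), 2))  # 2.96
```

Dividing by n - 1 instead of n gives the larger value 2.50 for the same five numbers, which is closer to the 2.96 of the imaginary population they might have been drawn from.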
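Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
where n is the sample size, $X_i$ is the ith observation and $\bar{X}$ is the sample mean
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
where N is the population size, $X_i$ is the ith observation and $\mu$ is the population mean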
$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$
$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$
$CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\%$
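As a rough illustration of the coefficient of variation, the sketch below applies the formula to the sample 1, 2, 3, 4, 5 used in the variance example. The function name is hypothetical; `statistics.mean` and `statistics.stdev` (sample standard deviation, n - 1 divisor) are from Python's standard library.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample standard deviation as a percentage of the sample mean."""
    mean = statistics.mean(sample)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(sample) / mean * 100


print(round(coefficient_of_variation([1, 2, 3, 4, 5]), 2))  # 52.7
```

Because the CV is a percentage of the mean, it has no unit, so it can compare the variability of data sets measured on different scales.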
Implications:
– Helps to know how a set of data clusters around its mean
– In any data set, at least $(1 - 1/k^2) \times 100\%$ of the observed values lie within k standard
deviations above or below the mean, for any k > 1 (Chebyshev's Rule)
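Chebyshev's rule is usually stated as a lower bound: for any k > 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean, whatever the shape of the distribution. A minimal sketch of that bound (the function name is an illustrative assumption):

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
# k = 2: at least 75.0%
# k = 3: at least 88.9%
```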
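Example:
Consider the lifetime of a certain brand of battery: µ = 100 hr, σ = 2 hr
[Figure: bell-shaped curve with about 68% of values within µ ± 1σ, 95% within µ ± 2σ and 99.7% within µ ± 3σ]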
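Assuming the 68%, 95% and 99.7% labels refer to one, two and three standard deviations around the mean of an approximately bell-shaped distribution (the usual empirical rule, taken here as an assumption about the figure), the battery lifetimes translate into intervals as sketched below. The function name is hypothetical.

```python
def empirical_rule_intervals(mu, sigma):
    """Map each approximate coverage to the interval mu ± k * sigma."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {pct: (mu - k * sigma, mu + k * sigma) for k, pct in coverage.items()}


for pct, (low, high) in empirical_rule_intervals(100, 2).items():
    print(f"about {pct} of batteries last between {low} and {high} hr")
# about 68% of batteries last between 98 and 102 hr
# about 95% of batteries last between 96 and 104 hr
# about 99.7% of batteries last between 94 and 106 hr
```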