Professional Documents
Culture Documents
Statistics - Measures of Central Tendency and Dispersion Class 2 Session 2
Statistics - Measures of Central Tendency and Dispersion Class 2 Session 2
Oscar BARRERA
oscardavid.barrerarodriguez@sciencespo.fr
The Mode
10 10 10 9 9 9 9 8 8 8 8 8 7 7 7 7 7 7 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5
54444444433333221
The Mode
10 10 10 9 9 9 9 8 8 8 8 8 7 7 7 7 7 7 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5
54444444433333221
The Mode
Limitations
The Mode
Limitations
10 10 9 9 9 9 9 9 8 8 7 7 6 6 5 5 4 4 4 4 4 4 3 3 2 2 1 1
10 9 8 7 6 5 4 3 2 1 0
The Mode
Limitations
10 10 10 10 10 10 10 9 8 7 6 5 4 3 2 1 0
The Mode
Limitations
The Median
The Median
How to find it?
10 10 9 7 7 6 5 4 3 2 2
N +1
Md =
2
Solving with N=11?
The Median
How to find it?
10 10 9 7 7 6 5 4 3 2 2
N +1
Md =
2
Solving with N=11?
11 + 1
Md = = 6th
2
This indicates the median is located at the sixth position in the
distribution.
The Median
How to find it?
11 + 1
Md = = 6th
2
This indicates the median is located at the sixth position in the
distribution.
I Order the data: rank-ordered from smallest to largest.
I Count off six positions starting with the smallest value.
The Median
How to find it?
20 19 18 16 15 14 12 11 11 11 10 9
N N +2
Md = ,
2 2
Solving with N=11?
Oscar BARRERA Statistics Intermediate
Central Tendency Variability
The Median
How to find it?
20 19 18 16 15 14 12 11 11 11 10 9 Use the
N N +2
Md = ,
2 2
Solving with N=12?
12 12 + 2
Md = , = Md = 6th, 7th
2 2
The Median
How to find it?
20 19 18 16 15 14 12 11 11 11 10 9 Use the
The Median
How to find it?
The Median
Limitations
The Median
Limitations
Quartiles
Quartiles
How to find it
Quartiles
How to find it
7 7 8 9 10 11 11 14 15
Quartiles
How to find it
Quartiles
How to find it
Q1 = 2.5 means the first quartile is the average of the values at the
second and third positions. (7 + 8)/2 = 7.5
Quartiles
How to find it
Quartiles
How to find it
6 7 7 8 9 10 10 11 11 14
Quartiles
How to find it
The Mean
Understanding Sigma
ΣX
Means to sum up all values that belong to variable X.
Example!!
The Mean
Understanding sigma
ΣX = 39, 000 + 43, 500 + 48, 000 + 52, 500 + 57, 000 = 240000
ΣY = 41, 000 + 46, 000 + 51, 000 + 56, 000 + 62, 000 = 256000
Oscar BARRERA Statistics Intermediate
Central Tendency Variability
Understanding sigma
IMPORTANT!!
Σ is a grouping symbol like a set of parentheses and everything to
the right of Σ must be completed before summing the resulting
values.
Example
Understanding Sigma
ΣX 2 = 100 + 81 + 64 + 36 + 16 + 9 = 306
(ΣX )2 = (10 + 9 + 8 + 6 + 4 + 3)2 = (40)2 = 1600
Σ(X −Y ) = (10−5)+(9−5)+(8−4)+(6−4)+(4−3)+(3−2) = 17
Σ(X −Y )2 = (10−5)2 +(9−5)2 +(8−4)2 +(6−4)2 +(4−3)2 +(3−2)2 = 63
ΣXY = 50 + 45 + 32 + 24 + 12 + 6 = 169
The mean
Is the arithmetic average of all the scores in a distribution.
I The mean is the most-often used measure of central tendency
1 It evenly balances a distribution so both the large and small
values are equally represented
2 Takes into account all individual values.
To estimate it.. in two steps
1. add together all the scores in a distribution ΣX .
2. divide that sum by the total number of scores in the
distribution.
Sample
ΣX
X =M=
n
Population
ΣX
µ=
N
Oscar BARRERA Statistics Intermediate
Central Tendency Variability
The Mean
Example
2 2 3 4 5 6 7 7 9 10 10
The Mean
Example
2 2 3 4 5 6 7 7 9 10 10
Solution
ΣX 65
X =M= = = 5, 909
n 11
The median = 6. The difference between the mean and the median
reflects the mean’s taking into account the individual values.
Mean vs Median
Just to emphasize the fact that the mean takes into account all
values in a set of data.. Recall this example
Mean
IMPORTANT:
The mean is defined as the mathematical center of a
distribution.(i.e., differences between scores and the mean). The
sum of such different IS zero for any distribution. Perfect
balancing point
The Mean
Limitations
Box Plots
Box Plots
Box Plots
Box Plots
Examples: two classes who took the same final exam
Box Plots
7 + 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7 = 70/10 = 7
or
6 + 6 + 6 + 6 + 6 + 8 + 8 + 8 + 8 + 8 = 70/10 = 7
or
1 + 1 + 5 + 5 + 9 + 9 + 10 + 10 + 10 + 10 = 70/10 = 7
Oscar BARRERA Statistics Intermediate
Central Tendency Variability
measures of variability
The range
The range
Sum of squares
Is the sum of the squared deviation scores from the mean and
measures the summed or total variation in a set of data.
SS = Σ(X − X )2
Sum of squares
Estimate the SS for the sample:
Variance
Issues with SS
I Sum of squares measures the total variation among scores in a
distribution;
I sum of squares does not measure average variability
I We want a measure of variability that takes into account both
the variation of the scores and number of scores in a
distribution
Sample Variance S 2
I is the average sum of the squared deviation scores from a
mean.
I Measures the average variability among scores in a distribution.
Σ(X − X )2 SS
S2 = =
n n
Oscar BARRERA Statistics Intermediate
Central Tendency Variability
Variance
Σ(X − X )2 SS
S2 = =
n n
In our examples?
Set I
64
S2 = = 6, 4
10
Set II
4
S2 =
= 0, 4
10
Larger measurements of variance indicate greater variability.
Variance
Issues
The one issue with sample variance is it is in squared units, not the
original unit of measurement.
I If the original measurements in Sets 1 and 2 were of length in
centimeters, when the values were squared the measurement
units are now cm2
What is the solution to this issue?
Standard Deviation
Once sample variance has been calculated, calculating the sample
standard deviation (s) is as simple as taking the square root of the
variance.
s
√
r
Σ(X − X )2 SS
S= = = S2
n n
So, in our exercise? Set I
p
S= 6, 4 = 2, 53
Set II p
S= 0, 4 = 0, 63
The standard deviation measures the average deviation between a
score and the mean of a distribution.
For example, in Set I (Set II), a score is expected to deviate from
the mean by 2.530 (0,63).
Oscar BARRERA Statistics Intermediate
Central Tendency Variability