3) S1 Representation and Summary of Data - Dispersion
• The last chapter was based on
calculating averages from sets of data
[Diagram: a scale marking the Lowest Value, Q1, Q2, Q3 and the Highest Value]
For continuous data, use interpolation (as with the median in Chapter 2).
Range = 15 – 1 = 14

Value   Frequency   Cumulative Frequency
37      29          49
38      34          83
39      12          95

$$ \frac{3n}{4} = \frac{285}{4} = 71.25 \ \text{(72nd term)}, \qquad Q_3 = 38 $$

$$ \text{IQR} = Q_3 - Q_1 = 38 - 37 = 1 $$
Range and Quartiles
The length of time spent on the internet each evening by a group of students is shown in the table below. Calculate the inter-quartile range.

Time (mins)   No. Students   Cumulative Frequency
30-31         2              2
32-33         25             27
34-36         30             57
37-39         13             70

We use the interpolation method:
$$ Q_1: \ \frac{n}{4} = \frac{70}{4} = 17.5\text{th term} $$

The 17.5th term lies in the 32-33 group, whose class boundaries are 31.5 and 33.5.

$$ Q_1 = \text{LB} + \left( \frac{\tfrac{1}{4}n - \text{Fb4G}}{\text{GF}} \right) \times \text{CW} $$

Here LB is the lower class boundary, Fb4G is the cumulative frequency before the group, GF is the group frequency and CW is the class width.

$$ Q_1 = 31.5 + \left( \frac{17.5 - 2}{25} \right) \times 2 = 32.74 $$
$$ Q_3: \ \frac{3n}{4} = \frac{210}{4} = 52.5\text{th term} $$

The 52.5th term lies in the 34-36 group, whose class boundaries are 33.5 and 36.5.

$$ Q_3 = \text{LB} + \left( \frac{\tfrac{3}{4}n - \text{Fb4G}}{\text{GF}} \right) \times \text{CW} $$

$$ Q_3 = 33.5 + \left( \frac{52.5 - 27}{30} \right) \times 3 = 36.05 $$
$$ Q_1 = 32.74, \qquad Q_3 = 36.05 $$

$$ \text{IQR} = Q_3 - Q_1 = 36.05 - 32.74 = 3.31 \text{ mins} $$
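The interpolation working for the quartiles can be sketched in Python. This is a minimal sketch rather than part of the original slides: the function name `interpolate` and the tuple layout are hypothetical, and the outer class boundaries (29.5 and 39.5) assume times recorded to the nearest minute, in line with the 31.5, 33.5 and 36.5 boundaries used in the working.

```python
# Each class: (lower boundary, upper boundary, frequency).
# Assumption: times are recorded to the nearest minute, so 30-31 spans 29.5 to 31.5.
classes = [(29.5, 31.5, 2), (31.5, 33.5, 25), (33.5, 36.5, 30), (36.5, 39.5, 13)]

def interpolate(classes, position):
    """Estimate the value of the term at `position` by linear interpolation."""
    before = 0  # cumulative frequency before the current group (Fb4G)
    for lower, upper, freq in classes:
        if position <= before + freq:
            # LB + ((position - Fb4G) / GF) * CW
            return lower + (position - before) / freq * (upper - lower)
        before += freq
    raise ValueError("position exceeds the total frequency")

n = sum(freq for _, _, freq in classes)   # total number of students
q1 = interpolate(classes, n / 4)          # position n/4
q3 = interpolate(classes, 3 * n / 4)      # position 3n/4
iqr = q3 - q1
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
```

Storing both boundaries for each class means the class width never has to be entered separately, which avoids mistakes with unequal widths such as the 34-36 group.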
Percentiles
A percentile is similar to a quartile. The 70th percentile of a set of data is the value that has 70% of the data before it. It is normally written $P_{70}$. The 62nd percentile is the value that has 62% of the data before it, $P_{62}$.

To calculate $P_x$, you find the value of the $\frac{xn}{100}$th term.

$$ \text{For the 31st percentile: } \frac{31n}{100}\text{th term} \qquad \text{For the 90th percentile: } \frac{90n}{100}\text{th term} $$
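Percentiles
The height, in cm, of 70 eighteen-year-old boys was measured and the data put into the table below. Calculate the 90th percentile, the 10th percentile and the 10% to 90% inter-percentile range.

Height     Students   Cumulative Frequency
150-160    4          4
160-170    21         25
170-180    32         57
180-190    9          66
190-200    4          70

$$ P_{90}: \ \frac{90n}{100} = \frac{6300}{100} = 63\text{rd term} $$

$$ P_{90} = \text{LB} + \left( \frac{\tfrac{90n}{100} - \text{Fb4G}}{\text{GF}} \right) \times \text{CW} = 180 + \left( \frac{63 - 57}{9} \right) \times 10 = 186.67 \text{ (2dp)} $$

$$ P_{10}: \ \frac{10n}{100} = \frac{700}{100} = 7\text{th term} $$

$$ P_{10} = \text{LB} + \left( \frac{\tfrac{10n}{100} - \text{Fb4G}}{\text{GF}} \right) \times \text{CW} = 160 + \left( \frac{7 - 4}{21} \right) \times 10 = 161.43 \text{ (2dp)} $$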
$$ \text{10\% to 90\% inter-percentile range} = P_{90} - P_{10} = 186.67 - 161.43 = 25.24 \text{ cm} $$
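The percentile working for the height data can be checked with a short Python sketch. The function name `percentile` and the data layout are hypothetical, chosen only for illustration. It applies the same interpolation rule, with the term position taken as pn/100.

```python
# Height classes in cm: (lower boundary, upper boundary, frequency).
heights = [(150, 160, 4), (160, 170, 21), (170, 180, 32), (180, 190, 9), (190, 200, 4)]

def percentile(classes, p):
    """Estimate the p-th percentile of grouped data by linear interpolation."""
    n = sum(freq for _, _, freq in classes)
    position = p * n / 100   # the (pn/100)th term
    before = 0               # cumulative frequency before the group
    for lower, upper, freq in classes:
        if position <= before + freq:
            return lower + (position - before) / freq * (upper - lower)
        before += freq
    raise ValueError("p must lie between 0 and 100")

p10 = percentile(heights, 10)
p90 = percentile(heights, 90)
print(f"P10 = {p10:.2f}, P90 = {p90:.2f}, range = {p90 - p10:.2f} cm")
```

Calling `percentile(heights, 25)` and `percentile(heights, 75)` would give the quartiles of the same table, since quartiles are the 25th and 75th percentiles.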
Variance and Standard Deviation
Variance and Standard Deviation are measures of how far the data is spread from the mean. If the mean is $\bar{x}$ and an observation is $x$, then the observation's dispersion from the mean is $x - \bar{x}$.

The variance is therefore given by:

$$ \text{Variance } (\sigma^2) = \frac{\sum (x - \bar{x})^2}{n} $$

The numerator is the sum of the squared dispersions from the mean (squaring removes any negative values) and $n$ is the number of observations.

However, a formula which is more commonly used, especially with larger sets of data, is:

$$ \text{Variance } (\sigma^2) = \frac{\sum x^2}{n} - \left( \frac{\sum x}{n} \right)^2 $$

The first term is the mean of the squares and the second term is the square of the mean.
Important point:
The Standard Deviation tells you the range from the mean which contains around 68% of the data (if the data is normally distributed).
[Diagram: distribution with mean 150, and 140 and 160 marked either side; 68% of the students are within one Standard Deviation of the mean]
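x       x²
3       9
4       16
6       36
2       4
8       64
8       64
5       25
Total   36      218

$$ \sum x = 36, \qquad \sum x^2 = 218, \qquad n = 7 $$

$$ \sigma^2 = \frac{218}{7} - \left( \frac{36}{7} \right)^2 = 31.14 - 26.45 = 4.69 $$

$$ \text{Standard Deviation: } \sigma = \sqrt{4.69} = 2.17 \text{ (2dp)} $$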
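As a check on the worked example, the sketch below computes the variance of the seven values both from the definition and from the "mean of the squares minus the square of the mean" formula. The variable names are illustrative. The standard library's `statistics.pvariance` is used only as an independent check.

```python
import math
import statistics

data = [3, 4, 6, 2, 8, 8, 5]
n = len(data)
mean = sum(data) / n

# Definition: mean of the squared dispersions from the mean.
var_definition = sum((x - mean) ** 2 for x in data) / n

# Shortcut: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in data) / n - (sum(data) / n) ** 2

sd = math.sqrt(var_shortcut)
print(round(var_definition, 2), round(var_shortcut, 2), round(sd, 2))

# Both forms agree with the library's population variance (divisor n, not n - 1).
assert math.isclose(var_definition, var_shortcut)
assert math.isclose(var_shortcut, statistics.pvariance(data))
```

Note that `statistics.variance` (without the `p`) divides by n - 1 instead, so it would not match the formula used in this chapter.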
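Variance and Standard Deviation from a Table
As with the averages from Chapter 2, you need to be able to calculate the Variance and Standard Deviation from a frequency table, grouped or ungrouped.

$$ \frac{\sum x^2}{n} - \left( \frac{\sum x}{n} \right)^2 \quad \text{becomes} \quad \text{Variance } (\sigma^2) = \frac{\sum fx^2}{\sum f} - \left( \frac{\sum fx}{\sum f} \right)^2 $$

Here $\sum f$ is the sum of the frequencies.

The difference reflects the fact that each value of x will appear many times, rather than just once or a few times.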
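Variance and Standard Deviation from a Table
Calculate the Variance and Standard Deviation of a set of data with the following values already calculated.

$$ \sum fx = 224, \qquad \sum fx^2 = 8731, \qquad \sum f = 25 $$

$$ \sigma^2 = \frac{\sum fx^2}{\sum f} - \left( \frac{\sum fx}{\sum f} \right)^2 = \frac{8731}{25} - \left( \frac{224}{25} \right)^2 = 268.9584 $$

$$ \text{Standard Deviation: } \sigma = \sqrt{268.9584} = 16.40 \text{ (2dp)} $$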
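The frequency-table formula needs only the three totals, which the following sketch makes explicit. The function name `variance_from_totals` is hypothetical. The inputs are the values given in the example.

```python
import math

def variance_from_totals(sum_f, sum_fx, sum_fx2):
    """Variance from table totals: (sum of fx^2 / sum of f) - (sum of fx / sum of f)^2."""
    mean = sum_fx / sum_f
    return sum_fx2 / sum_f - mean ** 2

variance = variance_from_totals(sum_f=25, sum_fx=224, sum_fx2=8731)
sd = math.sqrt(variance)
print(f"variance = {variance:.4f}, standard deviation = {sd:.2f}")
```

For a grouped table, the same function applies once each class has been replaced by its midpoint when forming the fx and fx² columns.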
Group   f    Midpoint (x)   fx             fx²
0-5     4    2.5            10 (4 x 2.5)   25 (4 x 2.5²)
...

$$ \sum fx = 247.5, \qquad \sum fx^2 = 3018.75 $$
As with averages, coding can be used to make data easier to work with.
However, there is something extra to remember…
If you have a set of data with a range of 15, and reduce every number
by 2, what will happen to the range?
Nothing!
Range measures the spread of data, and if all the numbers are 2 less,
the spread will not have changed
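Code: y = x – 100
Coded data: 50, 60, 70, 80, 90

y       y²
50      2500
60      3600
70      4900
80      6400
90      8100
Total   350     25500

$$ \sum y = 350, \qquad \sum y^2 = 25500, \qquad n = 5 $$

$$ \sigma_y^2 = \frac{\sum y^2}{n} - \left( \frac{\sum y}{n} \right)^2 = \frac{25500}{5} - \left( \frac{350}{5} \right)^2 = 200 $$

$$ \sigma_y = \sqrt{200} = 14.14 \text{ (2dp)} $$

We do not need to undo the coding, as we only subtracted!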
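Coding
Use the following code to calculate the Standard Deviation of this set of data:
150, 160, 170, 180, 190

Code: y = x / 10
Coded data: 15, 16, 17, 18, 19

y       y²
15      225
16      256
17      289
18      324
19      361
Total   85      1455

$$ \sum y = 85, \qquad \sum y^2 = 1455, \qquad n = 5 $$

$$ \sigma_y^2 = \frac{\sum y^2}{n} - \left( \frac{\sum y}{n} \right)^2 = \frac{1455}{5} - \left( \frac{85}{5} \right)^2 = 291 - 289 = 2 $$

$$ \sigma_y = \sqrt{2} = 1.41 \text{ (2dp)}, \qquad \sigma_x = 10 \times \sqrt{2} = 14.14 \text{ (2dp)} $$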
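Code: y = (x – 100) / 10
Coded data: 5, 6, 7, 8, 9

y       y²
5       25
6       36
7       49
8       64
9       81
Total   35      255

$$ \sum y = 35, \qquad \sum y^2 = 255, \qquad n = 5 $$

$$ \sigma_y^2 = \frac{\sum y^2}{n} - \left( \frac{\sum y}{n} \right)^2 = \frac{255}{5} - \left( \frac{35}{5} \right)^2 = 2 $$

$$ \sigma_y = \sqrt{2} = 1.41 \text{ (2dp)} $$

We only need to undo the divide by 10:

$$ \sigma_x = 10 \times \sqrt{2} = 14.14 \text{ (2dp)} $$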
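The effect of coding on the Standard Deviation can be confirmed numerically. The sketch below uses the data 150, 160, 170, 180, 190 from the examples. The helper name `population_sd` is hypothetical. It shows that subtracting a constant leaves the Standard Deviation unchanged, while dividing by 10 must be undone by multiplying the result by 10.

```python
import math

def population_sd(values):
    """Standard deviation via (mean of the squares) - (square of the mean)."""
    n = len(values)
    return math.sqrt(sum(v * v for v in values) / n - (sum(values) / n) ** 2)

x = [150, 160, 170, 180, 190]
shifted = [v - 100 for v in x]          # code y = x - 100
scaled = [(v - 100) / 10 for v in x]    # code y = (x - 100) / 10

sd_x = population_sd(x)
sd_shifted = population_sd(shifted)        # no undoing needed
sd_unscaled = population_sd(scaled) * 10   # undo only the divide by 10

print(round(sd_x, 2), round(sd_shifted, 2), round(sd_unscaled, 2))
```

All three printed values should agree, because adding or subtracting moves every value by the same amount and so does not change the spread.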
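Coding
Use the code below to calculate the Standard Deviation of this table of data.

Code (as implied by the y column): y = (x – 7.5) / 5

Group   f         Midpoint (x)   y     fy           fy²
0-5     4         2.5            -1    -4           4
5-10    12        7.5            0     0            0
10-15   6         12.5           1     6            6
...
Total   26 (Σf)                        11.5 (Σfy)   34.25 (Σfy²)

$$ \sigma_y^2 = \frac{\sum fy^2}{\sum f} - \left( \frac{\sum fy}{\sum f} \right)^2 = \frac{34.25}{26} - \left( \frac{11.5}{26} \right)^2 = 1.12 \text{ (2dp)} $$

$$ \sigma_x = 5 \times \sqrt{\sigma_y^2} = 5.30 \text{ (2dp)} $$

Rounding the variance to 1.12 before taking the square root gives 5.29 instead, so keep the unrounded value in the calculator until the final step.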
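The totals from the coded table (Σf = 26, Σfy = 11.5, Σfy² = 34.25) can be run through the same formula. In this sketch the scale factor of 5 is an assumption inferred from the y column, where midpoints 2.5, 7.5 and 12.5 map to -1, 0 and 1. It also shows how rounding the variance before taking the square root shifts the final answer in the second decimal place.

```python
import math

sum_f, sum_fy, sum_fy2 = 26, 11.5, 34.25   # totals from the coded table
scale = 5                                   # assumed code: y = (x - 7.5) / 5

var_y = sum_fy2 / sum_f - (sum_fy / sum_f) ** 2

# Undo only the division; the subtraction of 7.5 does not affect the spread.
sd_x = scale * math.sqrt(var_y)

# Same calculation, but with the variance rounded to 2dp first.
sd_x_rounded_early = scale * math.sqrt(round(var_y, 2))

print(round(var_y, 2), round(sd_x, 2), round(sd_x_rounded_early, 2))
```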
Summary
• We have now finished chapter 3