Standard Deviation, Standardization and Outliers
Try this worksheet after you have completed section 2.6. You may find these
techniques useful in your project.
EXAMPLE 1
Follow the steps below to calculate the standard deviation of this set of data:
12 10 8 12 11 10 12 8 12 10
Step 1 The standard deviation is also called the ‘root mean squared deviation’.
To calculate the standard deviation, first calculate the mean of the set of data.
Step 2 To measure the deviation from the mean, subtract the mean from each
number in the list.
Step 3 Square each of the numbers you are left with after step 2.
Step 4 Calculate the mean of this new set of squared numbers.
Step 5 Find the square root of your answer to step 4.
Answer
Step 1 The mean, $\bar{x}$, is
$$\bar{x} = \frac{12 + 10 + 8 + 12 + 11 + 10 + 12 + 8 + 12 + 10}{10} = 105 \div 10 = 10.5$$
You can set out the data values and calculations in a table like this.

              Step 2             Step 3
Number, x     $(x - \bar{x})$    $(x - \bar{x})^2$
12             1.5               2.25
10            −0.5               0.25
8             −2.5               6.25
12             1.5               2.25
11             0.5               0.25
10            −0.5               0.25
12             1.5               2.25
8             −2.5               6.25
12             1.5               2.25
10            −0.5               0.25
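The worked answer in this copy stops at step 3. As a check on the remaining steps, the short Python sketch below carries the same data through all five steps. The variable names are illustrative only and are not part of the worksheet.

```python
import math

data = [12, 10, 8, 12, 11, 10, 12, 8, 12, 10]

# Step 1: mean of the data
mean = sum(data) / len(data)

# Step 2: deviation of each value from the mean
deviations = [x - mean for x in data]

# Step 3: square each deviation
squared = [d ** 2 for d in deviations]

# Step 4: mean of the squared deviations
mean_squared = sum(squared) / len(squared)

# Step 5: square root of the answer to step 4
standard_deviation = math.sqrt(mean_squared)

print(mean, mean_squared, standard_deviation)  # 10.5 2.25 1.5
```

The squared deviations add up to 22.5, so step 4 gives 22.5 ÷ 10 = 2.25 and step 5 gives a standard deviation of 1.5.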
© Oxford University Press 2012: this may be reproduced for class use solely for the purchaser’s institute Extension worksheet 1
EXTENSION MATERIAL
Exercise 1
1 Calculate the standard deviation of this data set:
21 15 12 15 12 18 12 15 18 12
2 Which set of numbers is more spread out: the set in the example or the set in
question 1?
Step 3 Some of these numbers will be positive and others negative, so square each
number in the list so that you have only positive numbers.
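The list of squared deviations $(x - \bar{x})^2$ is
$$(x_1 - \bar{x})^2,\ (x_2 - \bar{x})^2,\ (x_3 - \bar{x})^2,\ \ldots,\ (x_n - \bar{x})^2$$
Step 4 Find the sum of this new list. This is written as
$$\sum_{k=1}^{n} (x_k - \bar{x})^2$$
Then find the mean, by dividing the sum by the number of data points in the list (n).
This is written as
$$\frac{\sum_{k=1}^{n} (x_k - \bar{x})^2}{n}$$
Step 5 Take the square root.
$$\text{Standard deviation} = \sqrt{\frac{\sum_{k=1}^{n} (x_k - \bar{x})^2}{n}}$$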
Use this method to calculate the standard deviation of the data sets above.
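If you want to check your answers by computer, the formula can be written as one small function. This is a minimal sketch in Python: the function name `standard_deviation` is made up for illustration, and `statistics.pstdev` from the Python standard library is used only as a cross-check.

```python
import math
import statistics


def standard_deviation(values):
    """Root mean squared deviation: sqrt( sum((x_k - mean)^2) / n )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


example_1 = [12, 10, 8, 12, 11, 10, 12, 8, 12, 10]

print(standard_deviation(example_1))   # 1.5
print(statistics.pstdev(example_1))    # library value for the same divide-by-n formula
```

Note that `statistics.stdev` (without the `p`) divides by n − 1 instead of n, so it gives a slightly larger value. That is a different quantity from the one defined in this worksheet.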
Exercise 2
1 Calculate the mean and the standard deviation of each set of data.
a 5 6 7 8 9 10 11
b 65 66 67 68 69 70 71
c 50 60 70 80 90 100 110
d 7.5 7.6 7.7 7.8 7.9 8.0 8.1
2 Do you notice anything interesting about the way in which the means and the
standard deviations change?
Answer
Using the shorter method:
x        f     x × f    $x^2$    $f \times x^2$
1        6     6        1        6
2        2     4        4        8
3        0     0        9        0
4        5     20       16       80
5        6     30       25       150
6        4     24       36       144
7        9     63       49       441
Total    32    147               829

$\bar{x} = 147 \div 32 = 4.59375$
$\frac{\sum f x_k^2}{n} = 829 \div 32 = 25.90625$
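The last line of this answer is missing from this copy. The sketch below assumes that the ‘shorter method’ finishes by subtracting the square of the mean from the mean of the squares and then taking the square root; treat that final step as an assumption rather than as the worksheet's own wording. The variable names are illustrative.

```python
import math

values = [1, 2, 3, 4, 5, 6, 7]
freqs = [6, 2, 0, 5, 6, 4, 9]

n = sum(freqs)                                             # total frequency, 32
sum_fx = sum(f * x for x, f in zip(values, freqs))         # 147
sum_fx2 = sum(f * x ** 2 for x, f in zip(values, freqs))   # 829

mean = sum_fx / n               # 4.59375
mean_of_squares = sum_fx2 / n   # 25.90625

# Assumed final step: variance = mean of the squares - (mean)^2
variance = mean_of_squares - mean ** 2
sd = math.sqrt(variance)

print(round(mean, 3), round(sd, 3))  # 4.594 2.192
```

The same frequencies are used in Exercise 3 question 1a with every x value increased by 10, and the answer given there has the same standard deviation, 2.192.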
Exercise 3
1 Calculate the mean and the standard deviation of each set of data.
a   x   11   12   13   14   15   16   17
    f    6    2    0    5    6    4    9
b   x    1    2    3    4    5    6    7
    f    2    5   10   14   20   34   29
c   x    1    2    3    4    5    6    7
    f   16   12   52   41   34   25   44
$\sum (x - \bar{x})^2 = 150$     $\sum x = 567$     $n = 15$
$\sum (x - \bar{x})^2 = 80$      $\sum x = 167$     $n = 24$
$\sum (x - \bar{x})^2 = 85$      $\sum x = 125$     $n = 12$
$\sum (x - \bar{x})^2 = 32$      $\sum x = 652$     $n = 52$
$\sum x = 567$     $\sum x^2 = 22570$     $n = 24$
$\sum x = 125$     $\sum x^2 = 2003$      $n = 15$
$\sum x = 85$      $\sum x^2 = 998$       $n = 30$
$\sum x = 445$     $\sum x^2 = 6250$      $n = 54$
$\sum x = 250$     $\sum x^2 = 2257$      $n = 29$
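The instructions that go with the summary totals above are missing from this copy. Assuming each set of totals is meant to be turned into a mean and a standard deviation, the sketch below shows one way to do it in Python. Both function names are hypothetical, and the demonstration uses the totals of the Example 1 data rather than the exercise values.

```python
import math


def sd_from_squared_deviations(sum_sq_dev, n):
    """Standard deviation when the total of (x - mean)^2 is already known."""
    return math.sqrt(sum_sq_dev / n)


def mean_and_sd_from_sums(sum_x, sum_x2, n):
    """Mean and standard deviation from the totals of x and x^2."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2   # mean of the squares minus square of the mean
    return mean, math.sqrt(variance)


# Totals for the Example 1 data 12 10 8 12 11 10 12 8 12 10:
# sum of x = 105, sum of x^2 = 1125, sum of (x - mean)^2 = 22.5, n = 10
print(sd_from_squared_deviations(22.5, 10))   # 1.5
print(mean_and_sd_from_sums(105, 1125, 10))   # (10.5, 1.5)
```

Both routes give the same standard deviation for the same data, which is a useful check when only totals are available.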
Standardizing results
Statistics analyzes data in order to reach a conclusion. For example, you can analyze
examination results to compare a student’s performance in different exams.
Here are Bruce’s exam results.
Subject Mark
Mathematics 50
English 75
Physics 70
Biology 65
History 45
Drama 90
Music 85
Subject        Mark (x)    Mean mark ($\bar{x}$)    Standard deviation (sd)    Standardized score $z = \frac{x - \bar{x}}{sd}$
Mathematics    50          60                       20                         −0.5
English        75          50                       15                         1.67
Physics        70          65                       10                         0.5
Biology        65          70                       5                          −1
History        45          50                       15                         −0.33
Drama          90          95                       2                          −2.5
Music          85          72.5                     12.5                       1
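This shows that, compared with the mean marks, Bruce did best in English (z = 1.67), and then Music (z = 1). His worst result was for Drama (z = −2.5), even though that was his highest mark.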
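The standardized scores can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the function name `standardized_score` are chosen for illustration.

```python
# (mark, mean mark, standard deviation) for each of Bruce's subjects
results = {
    "Mathematics": (50, 60, 20),
    "English": (75, 50, 15),
    "Physics": (70, 65, 10),
    "Biology": (65, 70, 5),
    "History": (45, 50, 15),
    "Drama": (90, 95, 2),
    "Music": (85, 72.5, 12.5),
}


def standardized_score(mark, mean, sd):
    """z = (x - mean) / sd: how many standard deviations the mark is from the mean."""
    return (mark - mean) / sd


z_scores = {
    subject: round(standardized_score(*row), 2)
    for subject, row in results.items()
}

best = max(z_scores, key=z_scores.get)
worst = min(z_scores, key=z_scores.get)

print(z_scores)
print(best, worst)  # English Drama
```

A positive score means the mark is above the mean for that subject and a negative score means it is below.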
Exercise 4
1 The table shows Fred’s exam marks, the mean mark for the year group and the
standard deviation.
Which subjects did Fred do best and worst in, compared to the rest of his year?
Subject        Mark    Mean mark    Standard deviation
Mathematics    75      60           20
English        75      50           15
Physics        75      65           18
Biology        85      70           12
Art            40      45           20
History        45      55           15
Drama          90      95           2
Music          85      72.5         12.5
2 Jenny wrote eight essays for her GCSE English. Her marks out of 20 were
12, 15, 13, 17, 10, 9, 15, 13
a Calculate Jenny’s mean mark.
b Calculate the standard deviation of Jenny’s marks.
Paula’s marks in the same eight essays were 19, 8, 4, 16, 12, 18, 5, 6.
c Calculate Paula’s mean mark.
d Calculate Paula’s standard deviation.
e Briefly compare the two sets of marks.
3 The masses of coffee, in grams, in ten jars of ‘Fine Blend’ labeled 200 g are
218, 222, 206, 212, 220, 200, 196, 222, 194, 212
a Work out the mean mass of coffee per jar.
b Work out the standard deviation for the mass of the coffee.
4 The class results for paper 1 and paper 2 in a mock exam were
(74, 59) (66, 54) (54, 56) (34, 22) (45, 63) (78, 71)
(90, 85) (49, 42) (72, 59) (45, 39) (54, 48) (34, 42)
(77, 63) (78, 45) (81, 85) (49, 37)
a Find the mean mark and standard deviation for each paper.
b A candidate scored 65 on paper 1 and 60 on paper 2. Which was his
better performance?
Outliers
An outlier is a value that is much smaller or much larger than the other values.
Normally an outlier is
smaller than ‘the lower quartile – 1.5 × the interquartile range’
or larger than ‘the upper quartile + 1.5 × the interquartile range’
EXAMPLE 3
Fifty children do a jigsaw puzzle. The times, correct to the nearest minute, for
completing the puzzle are
Time (t min) Frequency
4 4
5 12
6 18
7 9
8 3
9 2
11 1
Answer
Using the GDC, the lower quartile = 5, the median = 6, the upper quartile = 7.
The lowest value is 4 and the highest is 11.
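The box and whisker graph is:
[Box and whisker graph on an axis from 0 to 12: the whiskers end at 4 and 11, the box runs from 5 to 7, and the median line is at 6.]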
IQR = 7 – 5 = 2
5 – 1.5 × 2 = 5 – 3 = 2. The lowest value, 4, is not smaller than 2, so there are no outliers at the low end.
7 + 1.5 × 2 = 7 + 3 = 10. The highest value, 11, is larger than 10, so 11 is an outlier.
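The outlier test in this example can be sketched in Python as follows. The quartiles are taken from the GDC values quoted above rather than recalculated, because different quartile conventions can give different results. The function name `outlier_limits` is illustrative.

```python
def outlier_limits(lower_quartile, upper_quartile):
    """Return the limits Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr


# Distinct times (in minutes) that appear in the frequency table
times = [4, 5, 6, 7, 8, 9, 11]

# Quartiles as read from the GDC in the worked answer
low, high = outlier_limits(5, 7)

outliers = [t for t in times if t < low or t > high]

print(low, high, outliers)  # 2.0 10.0 [11]
```

Any value below the lower limit or above the upper limit is flagged, so here only 11 is reported.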
Exercise 5
1 The temperatures in °C recorded each day at noon in April in Rotterdam were
8 9 7 6 8 10 11 12 9 12
13 8 10 11 13 9 13 14 11 9
12 10 9 12 11 15 12 15 23 26
a Draw a box and whisker graph to represent this information.
b Test for outliers.
3 The number of words in each sentence in the first chapter of a book are
Exercise 2
1 a Mean = (5 + 6 + 7 + 8 + 9 + 10 + 11) ÷ 7 = 56 ÷ 7 = 8
$\sum (x - \bar{x})^2 = 9 + 4 + 1 + 0 + 1 + 4 + 9 = 28$
$\sum (x - \bar{x})^2 \div 7 = 4$
sd $= \sqrt{4} = 2$
b 68, 2 c 80, 20 d 7.8, 0.2
2
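The answer to question 2 is blank in this copy. One way to explore the pattern is to build sets b and c from set a, as in the Python sketch below. The function name `mean_and_sd` is illustrative, and the divide-by-n standard deviation from this worksheet is assumed.

```python
import math


def mean_and_sd(values):
    """Mean and divide-by-n standard deviation of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return mean, sd


set_a = [5, 6, 7, 8, 9, 10, 11]
set_b = [x + 60 for x in set_a]   # 65 ... 71: every value increased by 60
set_c = [x * 10 for x in set_a]   # 50 ... 110: every value multiplied by 10

print(mean_and_sd(set_a))  # (8.0, 2.0)
print(mean_and_sd(set_b))  # (68.0, 2.0)
print(mean_and_sd(set_c))  # (80.0, 20.0)
```

Adding a constant to every value moves the mean by that constant and leaves the standard deviation unchanged. Multiplying every value by a positive constant multiplies both the mean and the standard deviation by it. Set d is set a divided by 10 with 7 added, which matches its answers of 7.8 and 0.2.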
Exercise 3
1 a 14.594, 2.192 b 5.307, 1.540 c 4.411, 1.813
Exercise 4
1 z-scores = 0.75, 1.67, 0.56, 1.25, –0.25, –0.67, –2.5, 1
2 a 13 b 2.5
c 11 d 5.68
3 a 210.2 g b 10.1 g
Exercise 5
1 a [Graph not reproduced: x-axis from 0 to 30, f-axis from 0 to 10]
2 a [Graph not reproduced: x-axis from 0 to 20, f-axis from 0 to 10]
3 a [Graph not reproduced: x-axis from 0 to 30, f-axis from 0 to 10]