Further Mathematics 2019: Unit 3 & 4: Examples Answered

Further Mathematics: Univariate Data Ringwood SC 2019

DATA ANALYSIS (CORE) – UNIVARIATE DATA
CHAPTER 1: Displaying and Describing Data Distributions
Exercise 1A: Classifying Data
Variables
In statistics, we call quantities about which we record information variables.
Types of data: numerical and categorical
DATA
  NUMERICAL DATA: the variable is a number that comes from measuring or counting.
    CONTINUOUS: we ask “how much?”
    DISCRETE: we ask “how many?”
  CATEGORICAL DATA: the variable is a word, number or symbol arising from a person or object belonging to a category.
    NOMINAL: data cannot be logically ordered.
    ORDINAL: data can be logically ordered.
Numerical data arises when the information recorded about some variable is a number that
comes from measuring or counting some quantity. Numerical data also comes in two types,
discrete and continuous. In discrete data we count and ask the question “How many?”. In
continuous data we measure and ask the question “How much?”
eg. Numerical discrete – number of people living in your house.

number of hours you watch TV
eg. Numerical Continuous – height, weight, length
Categorical data arises when the information recorded about a variable is a word, number or
symbol arising from classifying a person or object as belonging to a particular category.
(NB: numbers which are categorical cannot have arithmetic procedures eg averages, addition,
subtraction, multiplication, division etc applied to them)
eg. Place you came in a race: first, second, third

Favourite ice cream
Type of pet
Example A1
Classify each of the following as categorical or numerical data. If the data is numerical, further
classify the data as discrete or continuous. If it’s categorical, classify the data as ordinal or
nominal.
DATA TYPE
TIME SPENT IN SHOWER Numerical, Continuous
SALARY Numerical, discrete

WEIGHT (1 = underweight, 2 = normal weight, Categorical, ordinal
3 = overweight)
AGE GROUP (teens, twenties or thirties) Categorical, ordinal
EYE COLOUR Categorical, nominal
WEIGHT (kilograms) Numerical, continuous
POST CODE Categorical, nominal
Exercise 1A: 1,2,3,4,5,6
Exercise 1B: Displaying and Describing Categorical and Numerical Data
Frequency Tables
A frequency table is a listing of the values a variable takes in a data set, along with how often
(frequently) each value occurs. It can be used for both numerical and categorical data.
Frequency can be recorded as a:
● Count: number of times a value occurs, or
● Percent: percentage of times a value occurs.
The tables are set as follows:
Category | Frequency (count) | Frequency (percentage)
Percentage frequency is calculated as follows:
$$\text{Percentage frequency} = \frac{\text{Frequency}}{\text{Total number}} \times 100\%$$
Example B1 (Numerical Data)
The family sizes of 11 preschool children are as follows:
3 3 4 4 5 3 2 4 3 5 3
Display the data in the form of a frequency table. Round percentages to one decimal place.
FAMILY SIZE   FREQUENCY   FREQUENCY PERCENTAGE
2             1           1/11 × 100 = 9.1%
3             5           5/11 × 100 = 45.5%
4             3           3/11 × 100 = 27.3%
5             2           2/11 × 100 = 18.2%
TOTAL         11
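The same table can be checked with a few lines of code. This is a minimal sketch in Python, not part of the original notes; the function name `frequency_table` is made up for illustration. It counts each family size and applies the percentage frequency formula, rounding to one decimal place.

```python
from collections import Counter

family_sizes = [3, 3, 4, 4, 5, 3, 2, 4, 3, 5, 3]

def frequency_table(values):
    """Return rows of (value, count, percentage rounded to 1 d.p.)."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], round(counts[v] / total * 100, 1))
            for v in sorted(counts)]

table = frequency_table(family_sizes)
for value, count, percent in table:
    print(f"{value:>11} {count:>9} {percent:>9.1f}%")

# The value with the greatest frequency (the mode)
modal_value = max(table, key=lambda row: row[1])[0]
print("Mode:", modal_value)
```

The last two lines pick out the row with the greatest frequency, which here is a family size of 3.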
The Mode
In a frequency table, the mode is defined to be the data value (or range of data values) that
occurs most often; that is, it has the greatest frequency.
It is also called the modal class or category.
Barcharts
The barchart is a display for representing the information contained in a frequency table
containing categorical data in graphical form.
Constructing a barchart from a frequency table:

● Frequency (or % frequency) on vertical axis
● Variable on horizontal axis
● Height of bar gives frequency (or % frequency)
● Bars are drawn with gaps between them to show each value is a separate category.
Note: Bar charts have a gap between the vertical axis and the first column, and gaps between
the columns. Frequency is always on the vertical axis.
Example B2 (Categorical Data)

10 voters are asked which party they chose on election day:
Greens, Labor, Labor, Labor, Liberal, Labor, Liberal, Labor, Labor, Liberal
a) Display the data in the form of a frequency table.
PARTY FREQUENCY
Liberal 3
Greens 1
Labor 6
TOTAL 10
b) Display the data from your frequency table in a bar chart.
Stacked or Segmented Barcharts

In a stacked or segmented barchart, the bars are stacked one on another to give a single bar
with several components.
The vertical axis measures either the frequency or percentage frequency.
Example B3
According to the segmented bar chart below, what percentage of days was Melbourne’s climate
recorded as moderate?
About 65%
Exercise 1B: 2, 4, 6
Exercise 1C: Displaying and describing the distributions of numerical
variables
Grouped Frequency Distribution
Sometimes the only way we can display data is by grouping into categories, for instance ages.
Listing all possible ages would be tedious, therefore we might use the intervals: 15–19, 20–24,
25–29, etc.
Grouping is also used for continuous data such as heights and weights. We generally try to group
our data into between 5 and 10 class intervals.
Discrete data     Continuous data
eg 15–19          eg 165–
   20–24             170–
   25–29             175–
Example C1
A local gym recorded the ages of thirty people who were in a cycle class. The results are as
follows:
32 18 20 22 24 52 45 36 28 27
39 25 24 19 51 20 22 26 25 30
19 18 30 32 17 28 25 19 31 28
Form a grouped frequency table with class intervals of 5

Lowest number = 17
Highest number = 52
Age      Tally        Frequency   Midpoint
15–19    |||| |       6           17
20–24    |||| |       6           22
25–29    |||| |||     8           27
30–34    ||||         5           32
35–39    ||           2           37
40–44                 0           42
45–49    |            1           47
50–54    ||           2           52
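The grouping in Example C1 can be reproduced in code. The sketch below is in Python and is not from the original notes; `grouped_frequencies` is a hypothetical helper that assumes discrete data, a starting value of 15 and a class width of 5, as in the table above.

```python
ages = [32, 18, 20, 22, 24, 52, 45, 36, 28, 27,
        39, 25, 24, 19, 51, 20, 22, 26, 25, 30,
        19, 18, 30, 32, 17, 28, 25, 19, 31, 28]

def grouped_frequencies(values, start, width):
    """Count discrete values in classes start..start+width-1, and so on."""
    n_classes = (max(values) - start) // width + 1
    rows = []
    for k in range(n_classes):
        low = start + k * width
        high = low + width - 1
        freq = sum(low <= v <= high for v in values)
        midpoint = (low + high) / 2
        rows.append((low, high, freq, midpoint))
    return rows

rows = grouped_frequencies(ages, start=15, width=5)
for low, high, freq, midpoint in rows:
    print(f"{low}-{high}: frequency {freq}, midpoint {midpoint:g}")
```

Each row holds the class limits, the frequency and the midpoint, so the frequencies should add to the 30 people in the class.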
Histograms
A histogram is a way of representing the information contained in a frequency table containing
numerical data in graphical form. Note: In a bar chart, the ‘bars’ are separated by a space and
they are used for categorical data. Histograms have ‘bars’ which touch and they are used for
numerical data. Frequency is shown on the vertical axis of both.
Note: When you are working with grouped data the first number in the class interval is written at the
start of the bar. Alternatively, if you’re working with discrete (non-grouped) data, the number is written in
the middle of the bar.
Use the information in the frequency table to construct a histogram to display the distribution.
Describing histograms
“SOCS” = Shape, Outliers, Centre, Spread
In describing histograms, we discuss
1. Shape – see below – ignore the outlier when you are determining the shape!
2. ‘Outliers’ (values that are considerably higher or lower than the bulk of the data)
3. Centre (median is the $\frac{n+1}{2}$th value) – need to know how to find these in histograms
4. Spread (range : highest score – lowest score)
**ALL of these features must be discussed when describing a histogram distribution
Shape
Symmetric Distribution
FEATURES:
● Single peaked
● Tails off relatively evenly either side
Positively Skewed
FEATURES:
● Tails off to the right
● Mean > Median
Negatively Skewed
FEATURES:
● Tails off to the left
● Mean < Median
Bimodal Distribution
FEATURES:
● Double peaked
Outliers
Outliers are values that stand out from the main body of data.
[Figure: histogram showing an outlier (vertical axis: Frequency)]
Centre
The Centre (using the median) divides the histogram into two equal areas.
[Figure: two histograms with the median marked (vertical axis: Frequency)]
Median location: if “n” is the number of data values, the median is the $\frac{n+1}{2}$th value
Spread
Spread (using the range) indicates how tight or loose data values in a distribution are clustered.
[Figure: two histograms (vertical axis: Frequency)]
NOTE (left histogram):
● Data loosely clustered
NOTE (right histogram):
● Data closely clustered
Plotting Histograms on the calculator
Step 1: In the Statistics screen, enter all data values into list1
Step 2: Tap [icon] (alternatively, tap “SetGraph → Setting”)
Step 3: Make sure “Draw” is set to “On”
Select Type: Histogram
XList: list1
Freq: 1
Tap: “Set”
Step 4: Tap [icon]
Step 5: “HStart” is the value you want your first interval to begin from. Typically a “round” number that is
equal to or below the lowest value in your data. “HStep” is the size of each interval (usually 5 or 10,
depending on the spread of the data). Select these values then tap “OK”.
Step 6: Tap [icon] (alternatively, tap “Analysis → Trace”) then scroll left and right using [keys] to show
values on the histogram.
Writing a Report: Describing Histograms

For the distribution of (variable name) the data is (shape description) with (no outliers OR
outlier(s) at …).
The centre of the distribution, as measured by the median, lies in the interval _________ , and the
spread of the distribution, is ______________ as measured by the range.
Example C2
This histogram shows the distribution of life expectancy for 183 different countries.
Complete the report below for this histogram.
For the distribution of life expectancy for these 183 countries, the data is negatively skewed with
no apparent outliers.
The centre of distribution, as measured by the median, is between 65 and 70 years and the spread
of the distribution, is approximately 35 as measured by the range.
Exercise 1C: 1, 2, 3, 5, 7, 8, 9
Exercise 1D: Using a log scale to display data

Significant figures
When rounding to a given number of significant figures, we are rounding to the digits in the
number that are regarded as “significant”.
The rules for significant figures are:

● All non-zero digits are significant.
● Leading zeros can be ignored (they are placeholders and are not significant) – for
integers & decimal numbers.
● Zeros included between other digits are significant.
● Trailing zeros after decimal digits are significant.
● Trailing zeros for integers are not significant (unless specified otherwise).
Example D1
How many significant figures are there in each of these numbers?
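a) 0.003 561 → 4 significant figures
b) 70.036 → 5 significant figures
c) 5.320 → 4 significant figures
d) 5320 → 3 significant figures
e) 450 000 → 2 significant figures
f) 78 000.0 → 6 significant figures
g) 78 000 → 2 significant figures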
Rounding with Significant figures

As when rounding to a given number of decimal places, when rounding to a given number of
significant figures consider the digit after the specified number of figures.
If it is 5 or above, round the final digit up; if it is 4 or below, keep the final digit as is.
For example:
5067.37 rounded to 2 significant figures is 5100
3199.01 rounded to 4 significant figures is 3199
0.004931 rounded to 3 significant figures is 0.004 93
1020004 rounded to 2 significant figures is 1 000 000
32 rounded to 4 significant figures is 32.00
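The rounding rule can be sketched in code as a check on the worked examples. This is an illustrative Python sketch, not part of the original notes; `round_sig` is a made-up name. It uses the `decimal` module with `ROUND_HALF_UP` so that a following digit of 5 or above rounds up, and takes the number as a string so no digits are lost.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value, figures):
    """Round a number (given as a string) to `figures` significant figures."""
    d = Decimal(value)
    if d == 0:
        return d
    # adjusted() is the power of ten of the leading digit, e.g. 3 for 5067.37
    exponent = d.adjusted() - figures + 1
    step = Decimal(1).scaleb(exponent)   # e.g. 1E+2 to round to the hundreds
    return d.quantize(step, rounding=ROUND_HALF_UP)

for text, figures in [("5067.37", 2), ("3199.01", 4), ("0.004931", 3),
                      ("1020004", 2), ("32", 4)]:
    # format(..., "f") prints 5100 rather than 5.1E+3
    print(text, "to", figures, "s.f. is", format(round_sig(text, figures), "f"))
```

Python's built-in `round` was avoided here because it rounds exact halves to the nearest even digit, which does not match the rule stated above.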
Worksheet: Rounding
For extra practice: http://studymaths.co.uk/workout.php?workoutID=62
(Redo until you get 10/10)
Using a log10 scale

Sometimes a data set will contain data points that vary so much in size that plotting them using a
traditional scale becomes very difficult.
A way to overcome this is to write the numbers in logarithmic (log) form. The log of a number is
the power of 10 which creates this number.
$\log_{10}(10) = \log_{10}(10^1) = 1$
$\log_{10}(100) = \log_{10}(10^2) = 2$
$\log_{10}(1000) = \log_{10}(10^3) = 3$
Worked example
The histogram below displays the body weights (in kg) of a number of animal species. Because
the animals represented in this dataset have weights ranging from around 1kg to 90 tonnes (a
dinosaur), most of the data are bunched up at one end of the scale and much detail is missing.
The distribution of weights is highly positively skewed, with an outlier.
However, when a log scale is used, their weights are much more evenly spread along the scale.
The distribution is now approximately symmetric, with no outliers, and the histogram is
considerably more informative. We can now see that the percentage of animals with weights
between 10 and 100kg is similar to the percentage of animals with weights between 100 and
1000 kg.
Converting values using log base 10
To convert a “real” value into a log scale value:

log10(actual value) = log scale value
To convert a log scale value back into a “real” value:

10 ^ (log scale value) = actual value
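The two conversions can be written as a pair of small functions. This is a minimal Python sketch, not from the original notes; the function names are made up for illustration, and the example values are the powers of ten listed earlier.

```python
import math

def to_log_scale(actual_value):
    """log10(actual value) = log scale value"""
    return math.log10(actual_value)

def from_log_scale(log_value):
    """10 ^ (log scale value) = actual value"""
    return 10 ** log_value

for actual in (10, 100, 1000):
    log_value = to_log_scale(actual)
    print(actual, "->", log_value, "->", from_log_scale(log_value))
```

Converting a value to the log scale and back again should return the original value, apart from tiny floating-point differences.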
Using CAS for logs

To convert a single data value:
Actual value to log scale value (use [key]):
Log scale value to actual value:
To convert an entire set of data

Use the Statistics screen as shown:
Example D2
The Richter scale is a log base 10 scale used to measure the size of earthquakes.
a) An earthquake is recorded with a raw value of 75,000. What is this value on the Richter scale,
correct to 3 significant figures?
$\log_{10}(75000)$ = 4.875061… ≈ 4.88 (to 3 s.f.)
b) What is the raw value, correct to 4 significant figures, of an earthquake which is recorded at
6.3 on the Richter scale?
$10^{6.3}$ = 1995262.315… ≈ 1 995 000 (to 4 s.f.)
c) Show that an earthquake that measures 3.0 on the Richter scale is ten times stronger than an
earthquake that measures 2.0 on the Richter scale.
$10^2 = 100$
$10^3 = 1000$
$\frac{1000}{100} = 10$, therefore 10 times stronger.
Example D3
a) positive skew
b) approximately symmetric
Exercise 1D: 1aceg, 2, 3, 4
EXAM QUESTIONS
$\log_{10}(10) = 1$
There is 1 country above 1.0 on the log scale histogram.
1 ÷ 58 × 100 = 1.72…% ≈ 2%
B is correct
$\log_{10}(1) = 0$
There are 9 + 1 = 10 values above 0 on the log scale histogram.
E is correct
Chapter 1 Review: MCQ 1-17, EXT 1-4
CHAPTER 2: Summarising Numerical Data
Exercise 2A: Dot Plots and Stem plots

Dot Plots
A dot plot is a type of graph which consists of a number line with each data point marked by a
dot. It is suitable for small sets of data only. It can be interpreted in the same way as a histogram.
Example A1
The dot plot below shows the number of hours students in a class spend on homework each
week. What is the median number of homework hours per week?
Stem and Leaf Plots (Stem plots)

A stem plot is an alternative to a histogram. It is useful for displaying relatively small sets of data
(less than 50 values) and has the advantage of retaining all the original data values. Like a
histogram the stem plot gives information about the shape, outliers, centre and spread of the
distribution. Remember to include a key.
Note: always check the key!
Example A2
Prepare an ordered stem and leaf plot for the following set of scores.
12 45 67 45 34 54 87 86 80 40 23 48 69 71
**Hint – you can use your CAS to order the data (Stats – List 1 – Edit – Sort ascending).
Key: 1|2 = 12
Stem Leaf
1 2
2 3
3 4
4 0 5 5 8
5 4
6 7 9
7 1
8 0 6 7
Note: The histogram report template (SOCS) can also be used for stem and leaf plots
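An ordered stem plot like the one in Example A2 can also be built in code. The following Python sketch is an illustration only (it is not part of the original notes, and `stem_plot` is a made-up name); it assumes two-digit whole numbers, with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

scores = [12, 45, 67, 45, 34, 54, 87, 86, 80, 40, 23, 48, 69, 71]

def stem_plot(values):
    """Map each stem (tens digit) to its ordered leaves (units digits)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

plot = stem_plot(scores)
print("Key: 1|2 = 12")
for stem in range(min(plot), max(plot) + 1):
    # A stem with no data is still printed, with an empty leaf
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Sorting the data first means the leaves on each stem come out in order, which is what makes the plot an ordered stem plot.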
Split Stems
Some stem plots are too bunched (there are too many leaves on one stem) and it is therefore
necessary to split the stems. The stem is usually split into halves or fifths.
Eg. The stem 2 (representing 20) can be split into
halves: 2 (20–24), 2 (25–29)
or fifths: 2 (20–21), 2 (22–23), 2 (24–25), 2 (26–27), 2 (28–29)
Example A3
Construct a single stem, and a stem split into fifths for the following data.
1.5 0.2 1.2 1.3 0.9 1.8 1.9 1.7 0.7 1.6
1.2 1.0 1.6 1.4 1.1 1.5 1.6 1.5 1.7
Key: 1|2 = 1.2
Single stem:
Stem Leaf
0 2 7 9
1 0 1 2 2 3 4 5 5 5 6 6 6 7 7 8 9
Stem split into fifths:
Stem Leaf
0
0 2
0
0 7
0 9
1 0 1
1 2 2 3
1 4 5 5 5
1 6 6 6 7 7
1 8 9
WHEN TO USE WHICH GRAPH:
Graph                Type of Data       Limit on data size
Bar chart            Categorical data
Histogram            Numerical data     Best for medium to large data sets (> 40 values)
Stem and leaf plot   Numerical data     Best for small to medium data sets (< 50 values)
Dot plot             Numerical data     Suitable only for small data sets (< 20 values)
Exercise 2A: 2, 3, 4, 5
Exercise 2B: The median, range and interquartile range (IQR)

Summary Statistics
The 2 most common measures used to summarise a distribution are measures of centre &
spread.
Calculating the Median

The median is a measure of centre. The median is located by listing all the data values in
numerical order and then finding the value that divides the distribution into two equal parts. For
small data sets, once the data is ordered the median can be easily located by eye.
For example, to calculate the median of the data set:
3 5 1 4 8
we first write out the data values in numerical order:
1 3 4 5 8
and then locate the midpoint of the data set
1 3 4 5 8
^
median = 4
For an odd number of data values, the median will be one of the data values as above. For an even
number of data values, the median does not coincide with an actual data value. For example, to
locate the median of the data set:
5 3 4 8
we firstly write out the data values in numerical order and then locate the midpoint of the data
set. In this case the midpoint lies halfway between 4 and 5, that is, at 4.5:
3 4 5 8
^
median = 4.5
For n data values, the MEDIAN is located at the $\frac{n+1}{2}$th position.
By definition, half the data (50%) lies above the median and half below.
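The position rule can be turned into a short function. This is a sketch in Python rather than anything from the original notes, and the function name `median` is simply chosen for illustration. It orders the data, finds position (n + 1) ÷ 2, and averages the two neighbouring values when that position falls halfway between them.

```python
def median(values):
    """Locate the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2                 # 1-based position in the ordered list
    if position.is_integer():              # odd n: the median is a data value
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]     # value just before the midpoint
    upper = ordered[int(position)]         # value just after the midpoint
    return (lower + upper) / 2             # even n: halfway between the two

print(median([3, 5, 1, 4, 8]))
print(median([5, 3, 4, 8]))
```

The two calls use the data sets from the worked examples above, so the results can be compared with the medians found by hand.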
Example B1
Find the median of the following data set:
1 8 7 6 5 4 2 2 3 6
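10 scores, ordered: 1, 2, 2, 3, 4, 5, 6, 6, 7, 8
The median is at position (10 + 1) ÷ 2 = 5.5, halfway between the 5th value (4) and the 6th value (5) → median = 4.5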
The Range
The range is a measure of spread.
RANGE = Highest value – Lowest value
= $x_{max} - x_{min}$
Example B2
What is the range for the data in Example B1?
Range = 8 – 1 = 7
Example B3
Calculate the median and range from the stem and leaf plot below:
27 scores, median score is the 14th → median = 3.1 km²
range = 8.4 – 1.5 = 6.9 km²
The Interquartile Range (IQR)

The Interquartile range is a measure of spread. Just as the median is the point that divides a
distribution in half, quartiles are the points that divide a distribution into quarters.
We use the symbols, Q1, Q2 and Q3 to represent the quartiles.
Q1 – (the lower quartile) – The median of the lower half of your data
Q2 – the median
Q3 – (the upper quartile) – The median of the upper half of your data
Note – the median is not included in the lower or upper halves when performing
calculations for lower and upper quartiles.
Interquartile Range (IQR) = Q3 – Q1
The interquartile range is a measure of spread of the distribution that describes the range of the
middle 50% of observations.
The IQR is not affected by the presence of outliers. For this reason it is often a more useful
measure of spread than the range.
Example B4
For each of the following sets of data, find:
i. the lower quartile
ii. the upper quartile
iii. the interquartile range
a) 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8
Q1 = 3
Q3 = 6
IQR = 6 – 3 = 3
b) 2, 3, 3, 3, 4, | 5, 6, 6, 6, 7
Q1 = 3
Q3 = 6
IQR = 6 – 3 = 3
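The quartile method used in these notes (split the ordered data into halves, leaving the median out when there is an odd number of values) can be sketched in Python. This is an illustration, not part of the original notes, and the function names are made up. Calculators and software sometimes use slightly different quartile rules, so results from other tools may not always match this method.

```python
def median_of(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3); needs at least two data values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]      # values below the median position
    upper_half = ordered[-half:]     # values above the median position
    return median_of(lower_half), median_of(ordered), median_of(upper_half)

for data in ([1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8],
             [2, 3, 3, 3, 4, 5, 6, 6, 6, 7]):
    q1, q2, q3 = quartiles(data)
    print("Q1 =", q1, " median =", q2, " Q3 =", q3, " IQR =", q3 - q1)
```

With an odd number of values the middle value is skipped by both slices, which matches the note that the median is not included in either half.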
Summary statistics on the calculator

This can be used to determine the mean, standard deviation, mode, median, minimum, maximum, upper
and lower quartiles.
Step 1: From the menu, go to the Statistics screen
Step 2: Enter all data values into list 1
Step 3: Tap “Calc → One-Variable”
Step 4: Make sure that Xlist is set to “list1” and tap OK.
Example B5
Calculate the IQR of the data displayed in the stem and leaf plot below.
STEM LEAF Key: 15 | 4 = 154

15 4 8 8
16 1 3 3 6 8
17 0 0 1 4 7 9 9|9
18 1 2 3 3 5 7 8 8 9
19 2 7 8
20 0 2
IQR = 188 – 168 = 20
Exercise 2B : 1, 3, 4, 5, 6, 8
Holiday Homework Worksheet: Univariate Data Exam Questions
Exercise 2C: The Five Number Summary and the Box plot
The “five number summary” is:

[Minimum value, Q1, Median, Q3, Maximum value]
The Box Plot

The box plot is a graphical representation of a five number summary (numerical data).
A box plot is a very compact way of clearly displaying the location, spread (median, IQR, range)
and general shape of a distribution.
When constructing a box plot:

● the box represents the middle 50% of scores
● the median is shown by a vertical line drawn within the box
● lines (called whiskers) are extended out from the lower and upper ends of the box to the
smallest and largest data values.
● Each section of the boxplot contains one quarter, 25%, of the data.
Outliers can be shown as a dot or a cross. If there is an outlier, the “whisker” only extends to the
lowest/highest value that is not an outlier.
Relating a box plot to the shape of a distribution

We can describe the shape of box plots using the same terms that we used for histograms.
Box plots with outliers

To determine whether a value is an outlier, we need to calculate a “lower fence” and an “upper
fence”. Any data value that lies outside these fences is considered an outlier.
Lower fence = Q1 – 1.5 × IQR
Upper fence = Q3 + 1.5 × IQR
On a boxplot, outliers are marked with a dot or a cross.
Example C1
a) Calculate the 5 number summary and then construct a box plot for the following data. Use and
label an appropriate scale.
36 35 34 32 37 35 38 32 35 37
ordered : 32, 32, 34, 35, 35, 35, 36, 37, 37, 38
min = 32, Q1 = 34, med = 35, Q3 = 37, max = 38
b) describe the box plot in terms of shape.

Approximately symmetric (although this one is not clear!)
c) calculate the upper and lower fences, and hence state whether there are any outliers.
IQR = 37 – 34 = 3
lower fence = 34 – 1.5 × 3 = 29.5
upper fence = 37 + 1.5 × 3 = 41.5
Since all values are between 29.5 and 41.5, there are no outliers.
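The working in Example C1 can be checked with a short program. The Python sketch below is not from the original notes; the function names are invented for illustration, and the quartiles are found by the halves method described earlier (median left out of each half when the number of values is odd).

```python
def median_of(ordered):
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median_of(ordered[:half])
    q3 = median_of(ordered[-half:])
    return ordered[0], q1, median_of(ordered), q3, ordered[-1]

def fences(values):
    """Return (lower fence, upper fence) using 1.5 x IQR."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(values):
    lower, upper = fences(values)
    return [v for v in values if v < lower or v > upper]

data = [36, 35, 34, 32, 37, 35, 38, 32, 35, 37]
print(five_number_summary(data))
print(fences(data))
print(outliers(data))
```

An empty list from `outliers` means every value lies between the two fences.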
USING CAS to draw boxplots
Step 1: In the Statistics screen, enter all data values into list1
Step 2: Tap [icon] (alternatively, tap “SetGraph → Setting”)
Step 3: Make sure “Draw” is set to “On”
Select Type: “MedBox”
XList: list1
Freq: 1
Show Outliers (always make sure this box is ticked!)
Tap “Set”
Step 4: Tap [icon]
Step 5: Tap [icon] (alternatively, tap “Analysis → Trace”) then scroll left and right using [keys] to show
values on the box plot
DESCRIBING THE DISTRIBUTION OF A BOXPLOT

When describing box plots we are typically asked to refer to shape, centre, and spread, as well as
the presence of any outliers.
“SOCS” = Shape, Outliers, Centre (median), Spread (range & IQR)
Report template
The distribution is approximately symmetric/positively skewed/negatively skewed

with outlier(s) at _____ (or “with no outliers”).
The centre of the distribution, as measured by the median, is ________.
The spread of the distribution, as measured by the IQR, is ______ .
Comparing box plots

We can also be asked to make comparisons between multiple box plots. These are called
“parallel” box plots when drawn on the same scale (these are explored further in Bivariate Data)
Example C2
Using the box plots shown, answer the following questions:
a) In which month was the temperature generally higher?
May
b) Compare the distributions for July and May in terms of centre and spread.
The centre, as measured by the median, is higher for May (approximately 14.5°C) than July
(approximately 9°C).
The spread, as measured by the interquartile range, was greater in May (approximately 6.5°C)
than July (approximately 3°C).
Exercise 2C: 1, 2, 3a, 4, 6, 7, 8, 9

Exercise 2D: 1
Exercise 2E: 1, 2, 3
Exercise 2F: Describing the centre and spread of symmetric distributions

The Mean (average)
The mean is also known as the average. It is found by adding together each value, then dividing
by the number of values.
$$\text{mean } (\bar{x}) = \frac{\text{sum of values}}{\text{total number of values}}$$
It is important to note that the mean is affected by outliers.
Mean on the CAS

1. Statistics → Enter data into list1
2. Tap “Calc → One variable → OK”
Note that x̄ is the mean
Example F1
Calculate the mean and median of the following sets of data.
a) 2 6 8 10 14 15 15
mean = (2+6+8+10+14+15+15) ÷ 7 = 10
median = 10
b) 2 6 8 10 14 15 50
mean = (2+6+8+10+14+15+50) ÷ 7 = 15
median = 10
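Example F1 shows how a single large value changes the mean but not the median. The short Python sketch below (an illustration using the standard `statistics` module, not part of the original notes) repeats the two calculations side by side.

```python
import statistics

set_a = [2, 6, 8, 10, 14, 15, 15]
set_b = [2, 6, 8, 10, 14, 15, 50]   # same data with 15 replaced by 50

for label, data in (("a", set_a), ("b", set_b)):
    print(label,
          "mean =", statistics.mean(data),
          "median =", statistics.median(data))
```

Only the largest value differs between the two sets, so any change in the output comes from that one value.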
Mean of Ungrouped and Grouped Data

Sometimes we may be required to calculate the mean from data that is grouped (e.g. in a
frequency table, histogram or barchart)
Example F2
Use the “mid point” method to find the mean from each of the following frequency tables:
a) Complete the table below assuming that the data variable is discrete, then calculate the mean
from the table.
class mid – freq. xf

interval point Mean = (479 + 838.5 + 1592.5 + 2157) ÷ (2 + 3 + 5 + 6)
= 5067 ÷ 16
220 – 259 239.5 2 479 = 316.6875
260 – 299 279.5 3 838.5
300 – 339 318.5 5 1592.5
340 – 379 359.5 6 2157
b) Complete the table below assuming that the data variable is continuous, then calculate the
mean from the table.
class mid – freq. xf

interval point Mean = (240 + 1960 + 1600 + 1440) ÷ (1 + 7 + 5 + 4)
220 – 240 1 240 = 5240 ÷ 17
260 – 280 7 1960 = 308.235 (if rounded to 3 d.p.)
300 – 320 5 1600
340 – 360 4 1440
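The midpoint method is the same calculation each time: multiply each midpoint by its frequency, add the results, and divide by the total frequency. A minimal Python sketch follows (not from the original notes; `grouped_mean` is a made-up name), using the midpoints and frequencies from part b.

```python
def grouped_mean(midpoints, frequencies):
    """Mean from grouped data: sum of (midpoint x frequency) / total frequency."""
    total_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    return total_xf / sum(frequencies)

midpoints = [240, 280, 320, 360]
frequencies = [1, 7, 5, 4]

mean_b = grouped_mean(midpoints, frequencies)
print(round(mean_b, 3))
```

Because every value in a class is treated as if it sat at the midpoint, the result is an estimate of the mean rather than the exact mean of the raw data.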
Choosing between the mean and the median

The mean and median are both measures of the centre of the distribution.
If the distribution is:
● Symmetric and there are no outliers, either mean or median can be used to indicate
the centre of the distribution.
● Skewed and/or there are outliers, it is more appropriate to use the median to indicate
the centre of the distribution.
Exercise 2F-1: 1, 3, 4, 5, 7, 8
Worksheet: Mean from grouped data
The Standard Deviation, Sx

● The standard deviation, s, measures the spread of data values around the mean.
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
● The variance is the square of the standard deviation, s², and is also a measure of spread
e.g. Variance = 9, SD = √9 = 3
● If the standard deviation is small compared to the mean, eg mean = 100 and s.d. = 3, then
there is a small spread of data.
● If the distribution is highly skewed, or if there are outliers, then the IQR is a better
measure of spread than the standard deviation.
Finding the standard deviation on the CAS (this is the only way you need to know!)
1. Statistics → Enter data into list1
2. Tap “Calc → One variable → OK”
Note that sx is the standard deviation
Example F3
Use your calculator to find the standard deviation of the following data set, correct to three
significant figures:
76, 75, 79, 69, 80, 74, 83, 66
Sx = 5.6505373… ≈ 5.65
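Although the calculator is the expected method, the formula for s can be followed step by step in code. The Python sketch below is an illustration only (not part of the original notes; `sample_sd` is a made-up name). It applies the formula with n – 1 in the denominator and compares the result with `statistics.stdev` from the Python standard library, which uses the same n – 1 formula.

```python
import math
import statistics

data = [76, 75, 79, 69, 80, 74, 83, 66]

def sample_sd(values):
    """Standard deviation with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

s_manual = sample_sd(data)
s_library = statistics.stdev(data)
print(round(s_manual, 2), round(s_library, 2))
```

Both values should agree with the calculator answer for Example F3 when rounded to three significant figures.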
Exercise 2F-2: 1, 2, 3, 4, 6
2G: The normal distribution and the 68-95-99.7% Rule
As a rough guide, about 95% of the data in a bell-shaped distribution lies within 2 standard deviations of the mean.
For any NORMAL (bell-shaped) distribution, approximately:
● 68% of the observations lie within one standard deviation of the mean
● 95% of the observations lie within two standard deviations of the mean
● 99.7% of the observations lie within three standard deviations of the mean
[Figure: normal curve marked at mean – 1SD and mean + 1SD, mean – 2SD and mean + 2SD, mean – 3SD and mean + 3SD]
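The three percentages in the rule are rounded values, and they can be checked numerically. This is a minimal Python sketch (not part of the original notes) using `NormalDist` from the standard `statistics` module, available in Python 3.8 and later; `share_within` is a made-up helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k) * 100:.1f}%")
```

The same proportions apply to any normal distribution, whatever its mean and standard deviation, because the rule only depends on how many standard deviations a value is from the mean.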
Example G1
The number of matches in a box is not always the same. When a sample of boxes was
studied it was found that the number of matches in a box approximated a normal (bell-shaped)
distribution with a mean number of matches of 50 and a standard deviation of 2.
What percentage of boxes would be expected to have more than 48 matches?
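48 is 1 s.d. below the mean.
68% of the distribution lies within 1 s.d. of the mean, so 32% lies outside it, with 16% in each tail.
Therefore 16% of the distribution is below 48, which means 84% is above 48.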
Example G2
VCE study scores are normally distributed with a mean of 30 and standard deviation of 7.
a) What score would be needed to be in the top 16% of the state?

16% of the distribution is more than 1 s.d. above the mean → therefore 37 or more is needed to be in the top 16%.
b) What percentage of students are expected to score between 23 and 44?

23 is 1 s.d. below, 44 is 2 s.d.s above → 34% between 23 and 30, 47.5% between 30 and 44
34% + 47.5% = 81.5%, therefore 81.5% expected to score between 23 and 44.
c) In a class of 25 students, how many would be expected to score between 30 and 37? Answer to
the nearest whole number
34% of scores are between 30 and 37 → 34% of 25 = 8.5 → therefore 9 students.
Example G3
IQ scores are normally distributed. Given that 95% of IQ scores lie between 70 and 130, find the
mean and standard deviation of IQ scores.
95% of normal distribution lies within 2 s.d. either side of the mean.
So 70 is 2 s.d. below mean, 130 is 2 s.d. above mean.
Therefore, mean = 100 (midpoint of 70 and 130) and s.d. = 15 (30 ÷ 2)
Mean = 100
Standard deviation = 15
Exercise 2G : 1, 2, 3, 5
Exercise 2H: Standard Scores

These are also known as ‘z-scores’, and allow comparisons of distributions with different means
and standard deviations.
The standard score is:
$$\text{standard score} = \frac{\text{data value} - \text{mean}}{\text{standard deviation}}$$
or
$$z = \frac{x - \bar{x}}{s}$$
Standard scores can be positive or negative:

● Positive z-scores indicate the data is above the mean
● Negative z-scores indicate the data is below the mean
● A zero z-score indicates that the data is equal to the mean
*z – score = 1. Score is 1 SD above the mean.
Example H1
In an IQ test, the mean IQ is 100 and the standard deviation is 15. Dale’s test results give
an IQ of 130. Calculate this as a z-score. Interpret this information.
$$z = \frac{130 - 100}{15} = 2$$
Dale’s score is 2 standard deviations above the mean (which puts him in the top 2.5% of the population)
You may also be asked to use the standard score formula to determine an “actual” value.
In these cases, “Action → Advanced → Solve” can be used on the calculator.
Example H2
The lengths of ants in a colony are normally distributed with a mean of 4.8 mm and standard
deviation of 1.2 mm.
An ant with a standardized length z = –0.5 corresponds to what actual length?
$-0.5 = \frac{x - 4.8}{1.2}$ → solve → x = 4.2 mm
Using standard scores to compare performance

Standard scores can be used to compare performance in two different distributions with
different means and standard deviations.
Example H3
A student obtained the following marks in two exams:
Subject Mark Mean Std Dev
Psychology 75 65 10
Statistics 70 60 5
In which subject did she do better? Show your calculations.

Psychology standard score: $z = \frac{75 - 65}{10} = 1$
Statistics standard score: $z = \frac{70 - 60}{5} = 2$
Using standardized scores, this student performed better on the Statistics test.
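The standard score formula, and its rearrangement to recover an actual value, can be written as two small functions. This is a Python sketch for illustration (not from the original notes; the function names are made up), using the numbers from Examples H2 and H3.

```python
def z_score(value, mean, sd):
    """Standard score: how many standard deviations a value is from the mean."""
    return (value - mean) / sd

def actual_value(z, mean, sd):
    """Rearranged formula: x = mean + z x standard deviation."""
    return mean + z * sd

psychology_z = z_score(75, mean=65, sd=10)
statistics_z = z_score(70, mean=60, sd=5)
print("Psychology:", psychology_z, " Statistics:", statistics_z)

ant_length = actual_value(-0.5, mean=4.8, sd=1.2)
print("Ant length (mm):", round(ant_length, 1))
```

The subject with the larger z-score is the one in which the student did better relative to the rest of the group.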
Exercise 2H : 1ace, 2ace, 3, 4
2I : Populations and Samples
● A population, in statistics, is a group of people (or objects) to whom you can apply any
conclusions or generalisations that you reach in your investigation.
● A sample, in statistics, is a smaller group of people (or objects) who have been chosen
from the population and are involved in the investigation.
● A simple random sample (SRS) is a random selection from the population such that every
member of that population has an equal chance of being chosen in the sample and the
choice of one member does not affect the choice of another member (using your CAS).
Simple Random Sample (SRS) on your CAS:
To generate a list of random values on the calculator:
Step 1: In the Main screen, tap “Keyboard → [icon] → Catalog”
Step 2: Tap on the letter “R” and then tap on “randList(” twice
Step 3: Input values in the following format:

randList (number of selections, minimum value, maximum value)
For example, to select 3 random numbers between 1 and 20, type: randList(3,1,20)
NOTE: If you are given a list of data and have to find a random sample, you need to number your
data 1, 2, 3, etc. The numbers given on your calculator indicate the POSITIONS of the real
values of the random sample in the data; they are NOT THE REAL VALUES OF DATA.
Example I2
The following data represents the ages of 20 people in an aerobics class. Find a random sample of
8 people.
42 17 18 36 19 22 25 21 20 38 33 30 16 19 25 25 26 25 22 17
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
a) Assign a number from 1 to 20 to each data value.
b) Use your calculator to select a simple random sample of 8 people from this class. Write
down the ages of the 8 people in the sample.
randList(8, 1, 20) →
for example: { 7, 3, 18, 15, 12, 6, 9, 10 } → Ages: 25, 18, 25, 25, 30, 22, 20, 38
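The idea of sampling positions and then reading off the data values can be sketched in code. The Python example below is an illustration and not part of the original notes; `simple_random_sample` is a made-up name. It uses `random.sample`, which never picks the same position twice; whether the calculator's randList can repeat a number is not stated in these notes, so that difference is an assumption to keep in mind.

```python
import random

ages = [42, 17, 18, 36, 19, 22, 25, 21, 20, 38,
        33, 30, 16, 19, 25, 25, 26, 25, 22, 17]

def simple_random_sample(data, size, seed=None):
    """Pick `size` distinct positions (numbered from 1) and return them with their values."""
    rng = random.Random(seed)
    positions = rng.sample(range(1, len(data) + 1), size)
    values = [data[p - 1] for p in positions]   # position 1 is the first value
    return positions, values

positions, sample = simple_random_sample(ages, 8)
print("Positions:", positions)
print("Ages:", sample)
```

Each run gives a different sample unless a `seed` is supplied, in which case the same positions are chosen every time.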
Exercise 2I: 1, 2, 3 (from online book, or on Compass)
Chapter 2 Review: MCQ 1-29, EXT 1-5
