Quat 6221 WB

IIE Module Guide QUAT6221/d/p/w; BSTA6212

QUANTITATIVE TECHNIQUES
WORKBOOK 2023

This manual enjoys copyright under the Berne Convention. In terms of the Copyright
Act, no 98 of 1978, no part of this manual may be reproduced or transmitted in any
form or by any means, electronic or mechanical, including photocopying, recording or
by any other information storage and retrieval system without permission in writing
from the proprietor.

The Independent Institute of Education (Pty) Ltd is registered with the Department
of Higher Education and Training as a private higher education institution under the
Higher Education Act, 1997 (reg. no. 2007/HE07/002). Company registration number:
1987/004754/07.

© The Independent Institute of Education (Pty) Ltd 2023 Page 1 of 148


DID YOU KNOW?

Student Portal

The full-service Student Portal provides you with access to your academic
administrative information, including:
• an online calendar,
• timetable,
• academic results,
• module content,
• financial account, and so much more!

Module Guides or Module Manuals

When you log into the Student Portal, the ‘Module Information’ page displays the
‘Module Purpose’ and ‘Textbook Information’, including the online ‘Module Guides’ or
‘Module Manuals’ and assignments for each module for which you are registered.

Supplementary Materials

For certain modules, electronic supplementary material is available to you via the
‘Supplementary Module Material’ link.

Module Discussion Forum

The ‘Module Discussion Forum’ may be used by your lecturer to discuss any topics
with you related to any supplementary materials and activities such as ICE, etc.

To view, print and annotate these related PDF documents, download Adobe
Reader at the following link:
www.adobe.com/products/reader.html

IIE Library Online Databases

The following Library Online Databases are available to you. Please contact your
librarian if you are unable to access any of these. Use the same username and
password as for the Student Portal.

Library Website           This library website gives access to various online
                          resources and study support guides.
                          [Link]

LibraryConnect (OPAC)     The Online Public Access Catalogue. Here you will be
                          able to search for books that are available in all
                          the IIE campus libraries.
                          [Link]

EBSCOhost                 This database contains full text online articles.
                          [Link]

EBSCO eBook Collection    This database contains full text online eBooks.
                          [Link]

SABINET                   This database will provide you with books available
                          in other libraries across South Africa.
                          [Link]

DOAJ                      DOAJ is an online directory that indexes and provides
                          access to high quality, open access, peer-reviewed
                          journals.
                          [Link]

DOAB                      Directory of open access books.
                          [Link]

IIESPACE                  The IIE open access research repository.
                          [Link]

Emerald                   Emerald Insight.
                          [Link]

HeinOnline                Law database.
                          [Link]

JutaStat                  Law database.
                          [Link]

Table of Contents
Using this Workbook
Learning Unit 1: Introduction to Statistics
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 2: Index Numbers
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 3: Descriptive Statistics
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 4: Linear Regression and Correlation Analysis
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 5: Basic Probability
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 6: Probability Distributions
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 7: Introduction to Sampling Distributions
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
Learning Unit 8: Hypothesis Testing
    1. Activities
    2. Revision Exercises
    3. Solutions to Exercises
Learning Unit 9: Chi-Square Tests
    1. Activities
    2. Revision Exercises
    3. Solutions to Activities and Revision Exercises
FORMULAE SHEET: Quantitative Techniques (QUAT6221 and BSTA6212)
Bibliography

Using this Workbook


This workbook has been developed to support your use of the prescribed material for
this module. Various activities and revision questions are designed to help you to
engage with the subject matter as well as to help you prepare for your assessments.

Learning Unit 1: Introduction to Statistics


Material used for this Learning Unit:

• Prescribed Textbook, Chapter One.

1. Activities
1.1 Activity 1
Purpose:

The purpose of this activity is to distinguish between contrasting statistical
concepts.

Task:

1. Identify the population and the sample in each of the following situations:

1.1 50 smokers were selected at random to determine the effectiveness of a
televised anti-smoking campaign on smokers.

1.2 A very popular radio station selected 200 people at random to determine
listeners’ attitudes towards certain programmes broadcast during the day.

1.3 In a recent survey, 3 000 South Africans were asked if they read the
newspaper daily; 600 said ‘Yes’.

2. The purpose of the following is to assist students in identifying primary
and/or secondary data:

2.1 Will information obtained to determine the water levels of the major
storage dams in South Africa be primary or secondary data? Where can this
data be obtained from?

2.2 Describe how you would go about determining the number of filling
stations in your area, using both a secondary source and a primary source.

2.3 Identify a problem where you can make use of secondary data.

2.4 Identify a problem where you can make use of primary data.

3. For each of the following situations, would you recommend taking a sample
or performing a census? Explain your reasoning in each instance:

3.1 A jeweller just received a delivery of shock-resistant watches, and
wishes to determine (approximately) the greatest height from which the
watches can be dropped onto a concrete surface without breaking the crystal.

3.2 Tiger Mills wishes to determine the age, gender and income
characteristics of people who consume Cheerio breakfast cereal.

3.3 The producers of the Early Bird show want to find out what percentage of
television (TV) viewers recognise a photo of the host, Joseph Khumalo.

3.4 A researcher wishes to determine whether companies that manufacture
nuclear submarines would be interested in a new technique for purifying air
when such craft are submerged.

4. In the following examples, determine whether we are dealing with a
parameter or a statistic:

4.1 When surveying the political choices of voters, a sample of voters is
selected from the population of all eligible voters. Based on the results
observed for the sample statistics, the analyst then makes inferences
regarding the political choices that are likely to exist in the population
of voters.

4.2 A recent survey of a sample of graduates reported that the average
starting salary for a graduate is less than R30 000 per year.

4.3 Starting salaries for 270 graduates increased by 5% from the previous
year.

4.4 In a random check of a sample of retail stores, the local health
inspector found that 24% of the stores were not storing fish at the proper
temperature.

4.5 In 2004, all major league soccer teams spent a total of R1 968 088 on
players’ salaries. Decide whether the numerical value is from a population
or a sample, and then specify whether it is a parameter or a statistic.

5. Indicate whether the corresponding sets of observations would be
quantitative or qualitative. If quantitative, distinguish between discrete
and continuous.

Observation             Quantitative  Qualitative   Discrete      Continuous
Ethnic group
Age
ID number
Net worth (rand)
Favourite sport
Temperature
Home language
Cooking time for pasta
Speed of an aeroplane
Gender

1.2 Activity 2
Purpose:

The purpose of this activity is to identify the measurement scale associated
with different variables.

Task:

Specify the measurement scale for each of the following:

1. Whether you are a South African (SA) citizen.

2. The amount you paid to fill up your petrol tank.

3. The time it took you to get to the university this morning.

4. The size of your take-away coffee.

5. Your belt size.

6. Your student number.

7. The occupation of 200 shoppers at a supermarket.

8. The daily temperature measured inside the supermarket.

9. The amount spent by every shopper at a supermarket.

10. Rating a new product as good, average or poor.

11. The recording of the first three digits of the shoppers’ cell
phone numbers.

Commentary related to this activity:

When describing data, only certain descriptive measures can be used with each
measurement scale of data.
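
As a rough illustration of this commentary, the sketch below pairs each measurement scale with a measure of central location that is commonly regarded as appropriate for it: the mode for nominal data, the median for ordinal data, and the mean for interval and ratio data. All data values and variable names are hypothetical and are not taken from the prescribed textbook.

```python
from statistics import mean, median, mode

# Hypothetical observations, one list per measurement scale.
occupations = ["clerk", "nurse", "clerk", "driver"]  # nominal: categories only
ratings = [1, 2, 2, 3, 3]            # ordinal: 1 = poor, 2 = average, 3 = good
temperatures_c = [21.0, 22.0, 23.0]  # interval: differences are meaningful
amounts_spent = [120.0, 80.0, 100.0]  # ratio: true zero, ratios are meaningful

# One commonly used measure of central location per scale.
summary = {
    "nominal (mode)": mode(occupations),
    "ordinal (median)": median(ratings),
    "interval (mean)": mean(temperatures_c),
    "ratio (mean)": mean(amounts_spent),
}

for label, value in summary.items():
    print(label, value)
```

A mean of the occupation labels cannot be computed at all, which is the practical sense in which the scale limits the measures available.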

1.3 Activity 3
Purpose:

The purpose of this activity is to analyse a case study in terms of the basic
statistical concepts.

Task:

The South African government is concerned about the high illiteracy rates
amongst adults in South Africa. They wish to estimate the true proportion of
adults (over 18 years of age) in South Africa who are illiterate (that is,
who cannot read or write in at least one language). A random sample of
10 000 adults was interviewed, and 1 107 of them were found to be illiterate.

1. The sample space is __________________?

2. The parameter of interest is __________________?

3. The statistic is __________________?

4. The sample size is __________________?

5. The population size is __________________?

6. The variable is __________________?

7. The measurement scale of the variable is ___________?

8. The type of data is __________________?

Commentary related to this activity:

Competency in this activity means that you have developed a good
understanding of some important statistical concepts.

2. Revision Exercises
2.1 Revision Exercise 1
1. Classify the following sets of data as qualitative or quantitative. If
classified as quantitative, is it discrete or continuous?

Data set                                      Quantitative  Qualitative   Discrete      Continuous
The weight of each member of a soccer team
Religious affiliation
Marks obtained in 1st test
The colours you can identify in a rainbow
Telephone numbers in a telephone directory
The number of sit-ups you can do
The daily temperature at 12h00
Number of traffic fatalities
Time required to complete a crossword puzzle
The ages of the students in your study group

2. What is the measurement scale for each of the given variables?

• Religious affiliation
• The number of sit-ups you can do
• The daily temperature at 12h00
• Number of traffic fatalities
• Time required to complete a crossword puzzle
• The ages of the students in your study group
• The colours you can identify in a rainbow
• Telephone numbers in a telephone directory
• Shoe sizes
• The three major professional tennis tournaments listed: Australian Open;
  Wimbledon; US Open
• The amount of weight lost in the past month by a person following a strict
  diet
• The classification of Boeing aircraft as 727, 737 or 747, according to size

3. A recent study on the average amount of time spent watching TV per day by
a group of students yielded the following results:

Academic Status    Number of Students    Average Time (Hours) Spent Watching TV
1st years          30                    6.4
2nd years          20                    4.5
3rd years          10                    2.8

Choose the correct word in each of the following statements:

• The 60 students involved in the study constitute a sample/population of
  students.
• The figure 6.4 is a parameter/statistic.
• The estimate of the average amount of time spent watching TV per day by
  all 1st year students involves a descriptive/inferential technique.
• The variable of interest is the number of students/average time watching
  TV.
• The average times calculated for the three groups of students are of a
  discrete/continuous nature.
• The academic status of the students can be classified as
  quantitative/qualitative data.

2.2 Revision Exercise 2


1. The collection of all possible individuals, objects or measurements is
known as:

1.1 A sample;
1.2 A population;
1.3 An inference;
1.4 A statistic.

2. Techniques used to organise, summarise, and present the data that has
been collected are known as:

2.1 Populations;
2.2 Samples;
2.3 Inferential statistics;
2.4 Descriptive statistics.

3. Techniques used to estimate a population parameter, based on a sample,
are known as:

3.1 Populations;
3.2 Samples;
3.3 Inferential statistics;
3.4 Descriptive statistics.

4. In a random sample, each item in the population has:

4.1 A chance of being selected;
4.2 The same chance of being selected;
4.3 A 50% chance of being selected;
4.4 No chance of being selected.

5. The sample mean is an example of a _____________.

5.1 Sample statistic;
5.2 Population parameter;
5.3 Measurement scale;
5.4 Discrete variable.

6. Primary data is collected by:

6.1 Primary school children;
6.2 People collecting it for the first time;
6.3 The actual people who will be using it;
6.4 Mainly experienced people.

7. Secondary data are obtained from:

7.1 Secondary school children;
7.2 Existing sources;
7.3 The actual people who will be using it;
7.4 Mainly experienced people.

8. When conducting a survey, you collect data by:

8.1 Sampling;
8.2 Using a secondary source;
8.3 Asking questions;
8.4 Using the random number table.

9. To estimate the percentage of defects in a recent manufacturing batch,
the quality control manager of Intel Computers selects every 8th chip that
comes off the assembly line, starting with the 3rd, until he obtains a
sample of 100. The method he follows to obtain the sample is known as the
______________.

9.1 Simple random sampling method;
9.2 Systematic random sampling method;
9.3 Stratified random sampling method;
9.4 Snowball sampling method.

10. Once every hour, a random sample of 12 light bulbs is selected from an
assembly line delivering this type of light bulb. The number of bulbs in
each sample that will not light is divided by 12 to obtain the defective
proportion. What is the variable?

10.1 Type of light bulbs;
10.2 Sample of 12 light bulbs;
10.3 Proportion of defective light bulbs;
10.4 Assembly line.

11. 49, 34, and 46 students are selected from the 1st year, 2nd year and 3rd
year classes, consisting respectively of 490, 340 and 460 students. The type
of sampling is:

11.1 Stratified random sampling;
11.2 Cluster random sampling;
11.3 Systematic random sampling;
11.4 Simple random sampling.

12. To avoid working late, a quality control analyst simply inspects the
first 100 items produced that day. This type of sampling is known as:

12.1 Cluster sampling;
12.2 Systematic sampling;
12.3 Judgmental sampling;
12.4 Convenience sampling.

2.3 Revision Exercise 3


Complete the following crossword.

3. Solutions to Activities and Revision Exercises
3.1 Activity 1

Questions and model solutions:

1. Identify the population and the sample in each of the following situations:

1.1 50 smokers were selected at random to determine what the effectiveness
of a televised anti-smoking campaign is on smokers.
Model solution: The sample is the 50 smokers selected at random. The
population consists of all smokers.

1.2 A very popular radio station selected 200 people at random to determine
listeners’ attitudes towards certain programmes broadcast during the day.
Model solution: The sample is the 200 listeners selected at random from the
population of all listeners of that radio station.

1.3 In a recent survey, 3 000 South Africans were asked if they read the
newspaper daily; 600 said yes.
Model solution: The sample is the 3 000 South Africans that were questioned.
The population is all South Africans.

2. The purpose of the following is to assist students in identifying primary
and/or secondary data:

2.1 Will information obtained to determine the water levels of the major
storage dams in South Africa be primary or secondary data? Where can this
data be obtained from?
Model solution: Secondary data. It can be obtained from the Department of
Water Affairs, which collected the information for another purpose.

2.2 Describe how you would go about determining the number of filling
stations in your area using both a secondary source and a primary source.
Model solution: Secondary source: The local municipality will have a list of
filling stations in your area. Primary data: You will walk or drive every
street in your area and record where the filling stations are located.

2.3 Identify a problem where you can make use of secondary data.
Model solution: Discuss options in class.

2.4 Identify a problem where you can make use of primary data.
Model solution: Discuss options in class.

3. For each of the following situations, would you recommend taking a sample
or performing a census? Explain your reasoning in each instance:

3.1 A jeweller just received a delivery of shock-resistant watches, and
wishes to determine (approximately) the greatest height that the watches can
be dropped from onto a concrete surface without breaking the crystal.
Model solution: A sample, because of the destructive nature of the
experiment.

3.2 Tiger Mills wishes to determine the age, gender and income
characteristics of people who consume Cheerio breakfast cereal.
Model solution: A sample, because of the widely spread population.

3.3 The producers of the Early Bird show want to find out what percentage of
television (TV) viewers recognise a photo of host, Joseph Khumalo.
Model solution: A sample, because of the widely spread population.

3.4 A researcher wishes to determine whether companies that manufacture
nuclear submarines would be interested in a new technique for purifying air
when such craft are submerged.
Model solution: A census (the whole population), because the number of
companies that manufacture nuclear submarines will be very small.

4. In the following examples, determine whether we are dealing with a
parameter or a statistic:

4.1 When surveying the political choices of voters, a sample of voters is
selected from the population of all eligible voters. Based on the results
observed for the sample statistics, the analyst then makes inferences
regarding the political choices that are likely to exist in the population
of voters.
Model solution: The sample statistic is the proportion of the political
choices of the sample. The inference that is made on the political choices
for the population is the parameter.

4.2 A recent survey of a sample of graduates reported that the average
starting salary for a graduate is less than R30 000 per year.
Model solution: The estimated starting salary for all graduates is a
parameter.

4.3 Starting salaries for 270 graduates increased by 5% from the previous
year.
Model solution: Sample statistic.

4.4 In a random check of a sample of retail stores, the local health
inspector found that 24% of the stores were not storing fish at the proper
temperature.
Model solution: Sample statistic.

4.5 In 2004, all major league soccer teams spent a total of R1 968 088 on
players’ salaries. Decide whether the numerical value is from a population
or a sample, and then specify whether it is a parameter or a statistic.
Model solution: This is a population parameter, as it covers all teams in
the league.

5. Indicate whether the corresponding sets of observations would be
quantitative or qualitative. If quantitative, distinguish between discrete
and continuous.

Observation             Quantitative  Qualitative   Discrete      Continuous
Ethnic group                          X
Age                     X                                         X
ID number                             X
Net worth (rand)        X                                         X
Favourite sport                       X
Temperature             X                                         X
Home language                         X
Cooking time for pasta  X                                         X
Speed of an aeroplane   X                                         X
Gender                                X

3.2 Activity 2

Specify the measurement scale for each of the following:

Questions:                                                        Model Solutions:
1. Whether you are a South African (SA) citizen.                  Nominal
2. The amount you paid to fill up your petrol tank.               Ratio
3. The time it took you to get to the university this morning.    Ratio
4. The size of your take-away coffee.                             Ordinal
5. Your belt size.                                                Ratio
6. Your student number.                                           Nominal
7. The occupation of 200 shoppers at a supermarket.               Nominal
8. The daily temperature measured inside the supermarket.         Interval
9. The amount spent by every shopper at a supermarket.            Ratio
10. Rating a new product as good, average or poor.                Ordinal
11. The recording of the first three digits of the shoppers’      Nominal
    cell phone numbers.

3.3 Activity 3

The South African government is concerned about the high illiteracy rates
amongst adults in South Africa. They wish to estimate the true proportion of
adults (over 18 years of age) in South Africa who are illiterate (that is,
that cannot read or write in at least one language). A random sample of
10 000 adults were interviewed, and 1 107 of them were found to be
illiterate.

Questions:                                           Model Solutions:
1. The sample space is ______?                       The 10 000 adults in the sample.
2. The parameter of interest is ______?              The proportion of all adults in
                                                     South Africa that are illiterate.
3. The statistic is ______?                          The proportion 1 107/10 000.
4. The sample size is ______?                        10 000 adults
5. The population size is ______?                    Unknown
6. The variable is ______?                           Illiteracy rate
7. The measurement scale of the variable is ______?  Ratio scaled
8. The type of data is ______?                       Quantitative and discrete
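
The statistic in question 3 is a simple ratio, and a minimal sketch of the arithmetic is given here. The figures come from the case study; the variable names are chosen for illustration only.

```python
# Figures from Activity 3: 1 107 illiterate adults in a random sample of 10 000.
sample_size = 10_000
illiterate_in_sample = 1_107

# The statistic is the sample proportion. It is used to estimate the unknown
# population proportion, which is the parameter of interest.
sample_proportion = illiterate_in_sample / sample_size

print(f"Sample proportion: {sample_proportion:.4f}")
print(f"As a percentage: {sample_proportion * 100:.2f}%")
```

The population proportion itself stays unknown; the calculation only describes the 10 000 adults who were interviewed.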

3.4 Revision Exercise 1


1.

Data set                                      Quantitative  Qualitative   Discrete      Continuous
The weight of each member of the soccer team  X                                         X
Religious affiliation                                       X
Marks obtained in 1st test                    X                           X
The colours you can identify in a rainbow                   X
Telephone numbers in a telephone directory                  X
The number of sit-ups you do                  X                           X
The daily temperature at 12h00                X                                         X
Number of traffic fatalities                  X                           X
Time required to complete a crossword puzzle  X                                         X
The ages of the students in your study group  X                                         X

2.

Religious affiliation                               Nominal
The number of sit-ups you can do                    Ratio
The daily temperature at 12h00                      Interval
Number of traffic fatalities                        Ratio
Time required to complete a crossword puzzle        Ratio
The ages of the students in your study group        Ratio
The colours you can identify in a rainbow           Nominal
Telephone numbers in a telephone directory          Nominal
Shoe sizes                                          Ordinal
The three major professional tennis tournaments     Nominal
listed: Australian Open; Wimbledon; US Open
The amount of weight lost in the past month by a    Ratio
person following a strict diet
The classification of Boeing aircraft as 727, 737   Ordinal
or 747, according to size

3.

• The 60 students involved in the study constitute a sample/population of
  students. Answer: Sample
• The figure 6.4 is known as a parameter/statistic. Answer: Statistic
• The estimate of the average amount of time spent watching TV per day by
  all 1st year students involves a descriptive/inferential technique.
  Answer: Inferential
• The variable of interest is the number of students/average time watching
  TV. Answer: Average time
• The average times calculated for the three groups of students are of a
  discrete/continuous nature. Answer: Continuous
• The academic status of the students can be classified as
  quantitative/qualitative data. Answer: Qualitative

3.5 Revision Exercise 2


1. 1.2 A population.

2. 2.4 Descriptive statistics.

3. 3.3 Inferential statistics.

4. 4.2 The same chance of being selected.

5. 5.1 Sample statistic.

6. 6.2 People collecting it for the first time.

7. 7.2 Existing sources.

8. 8.3 Asking questions.

9. 9.2 Systematic random sampling method.

10. 10.3 Proportion of defective light bulbs.

11. 11.1 Stratified random sampling.

12. 12.4 Convenience sampling.
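
The selection rules behind answers 9 and 11 can be sketched in a few lines. This is an illustration under stated assumptions rather than a prescribed method: the function names are hypothetical, and the random draw within each year group is assumed to be simple random sampling without replacement, which the question does not state.

```python
import random


def systematic_sample(start, step, size):
    """Positions chosen when taking every `step`-th item, beginning with
    item number `start`, until `size` items have been selected."""
    return [start + step * i for i in range(size)]


def proportional_allocation(stratum_sizes, fraction):
    """Number of units to draw from each stratum at one common fraction."""
    return {name: round(n * fraction) for name, n in stratum_sizes.items()}


# Question 9: every 8th chip, starting with the 3rd, until 100 are chosen.
chips = systematic_sample(start=3, step=8, size=100)

# Question 11: the same 10% is drawn from each year group (stratum).
year_groups = {"1st year": 490, "2nd year": 340, "3rd year": 460}
allocation = proportional_allocation(year_groups, fraction=0.10)

# Assumed here: within each stratum, students (numbered from 1) are drawn
# by simple random sampling without replacement.
rng = random.Random(0)
stratified = {
    name: rng.sample(range(1, year_groups[name] + 1), k)
    for name, k in allocation.items()
}

print(chips[:5])
print(allocation)
```

The fixed interval is what makes question 9 systematic, and the separate draw from every year group is what makes question 11 stratified.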

3.6 Revision Exercise 3

Learning Unit 2: Index Numbers


Material used for this Learning Unit:

• Prescribed Textbook, Chapter Fourteen. (Please note: all references and
  exercises related to Excel can be done for enrichment purposes but will not
  be assessed.)

1 Activities
Consult the web page: http://www.statssa.gov.za/ for detailed
information on economic indicators.

Activity 1
Purpose:

The purpose is to use summary values to measure how prices and quantities
change over time periods. The value of an item in the current period is
expressed as a ratio of the value in a base period. A step-by-step approach
will be used in this activity to explain all concepts.
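
The ratio described in this purpose statement can be sketched as a one-line calculation. The function name `simple_index` is hypothetical; the potato price of R3 in both years is taken from the table in the Task that follows.

```python
def simple_index(current_value, base_value):
    """Current-period value expressed as a percentage of the base-period value."""
    return current_value / base_value * 100


# Potatoes cost R3 per kg in both 2013 (base period) and 2014 (current period).
potato_price_index = simple_index(current_value=3, base_value=3)

print(potato_price_index)  # 100.0, meaning no change from the base period
```

An index above 100 indicates an increase relative to the base period, and an index below 100 indicates a decrease.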

Task:

The table below shows the prices (R) and quantities (kg) of
food items bought during 2013 and 2014.

                2013         2014
Item          P0    Q0     P1    Q1     P1Q0     P0Q0     P0Q1     P1Q1
Rice           7    80      6    70      480      560      490      420
Meat          30    50     35    60    1 750    1 500    1 800    2 100
Potatoes       3   100      3   100      300      300      300      300
Total         40   230     44   230    2 530    2 360    2 590    2 820

Step 1:

The base period is given as the year 2013.

Step 2:

Label 2013 price and quantity columns as P0 and Q0, and for
2014 label them as P1 and Q1.

1. With 2013 as the base period, calculate the 2014 simple price and
quantity indices for rice.

2. Which food item shows the largest change in quantity over the two-year
period?

3. Using 2013 as the base period, calculate the unweighted composite price
and quantity indices for 2014.

4. Using 2013 as the base period, calculate the change in price and quantity
for 2014 using the Laspeyres indices.

5. Using 2013 as the base period, calculate the change in quantity and price
for 2014 using the Paasche indices.

Commentary Related to Activity Design:

Although an index is calculated as a percentage, the percentage symbol (%)
is dropped, only being used in interpretations. The Laspeyres price index
measures price changes over time, keeping quantities constant at their
base-period levels. This permits price changes to be monitored without the
confounding effect of simultaneous quantity changes. When calculating the
Paasche price index, current period quantities are used to reflect more
recent consumption patterns.
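
The difference between the two weighting schemes can be sketched with the Activity 1 data. This is a minimal illustration, assuming the usual forms: Laspeyres price index = ΣP1Q0 / ΣP0Q0 × 100 and Paasche price index = ΣP1Q1 / ΣP0Q1 × 100, which match the column headings of the Activity 1 table. The function and variable names are hypothetical.

```python
# Prices (R) and quantities (kg) from the Activity 1 table; 2013 is the base.
# Each entry is (P0, Q0, P1, Q1).
basket = {
    "Rice": (7, 80, 6, 70),
    "Meat": (30, 50, 35, 60),
    "Potatoes": (3, 100, 3, 100),
}


def laspeyres_price_index(items):
    """Weight the prices of both periods by base-period quantities (Q0)."""
    numerator = sum(p1 * q0 for p0, q0, p1, q1 in items.values())
    denominator = sum(p0 * q0 for p0, q0, p1, q1 in items.values())
    return numerator / denominator * 100


def paasche_price_index(items):
    """Weight the prices of both periods by current-period quantities (Q1)."""
    numerator = sum(p1 * q1 for p0, q0, p1, q1 in items.values())
    denominator = sum(p0 * q1 for p0, q0, p1, q1 in items.values())
    return numerator / denominator * 100


print(round(laspeyres_price_index(basket), 1))
print(round(paasche_price_index(basket), 1))
```

The numerators and denominators equal the column totals in the table (2 530 and 2 360 for Laspeyres, 2 820 and 2 590 for Paasche), so the only difference between the two indices is which period's quantities serve as weights.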

Activity 2
The following table shows the 2011 and 2012 prices and
registrations of four makes of cars by a dealer in
Johannesburg. The number of motor vehicles per make of car
registered in 2011 and 2012 were used as respective weights.

              Prices (R’000)    Number of registrations
Make          2011    2012      2011     2012
              P0      P1        Q0       Q1       P0Q1       P1Q0
Audi          170     228          57       68     11 560     12 996
BMW           226     286         526      492    111 192    150 436
Toyota        102     136       1 703    1 803    183 906    231 608
Volkswagen     77      99       1 343    1 229     94 633    132 957
Total         575     749       3 629    3 592    401 291    527 997


Answer the next three questions.

1. With 2011 as the base period, calculate the simple
quantity index for BMW for 2012.

2. With 2011 as the base period, calculate the unweighted
composite price index for 2012.

3. With 2011 as the base period, calculate the change in
price for 2012 using the Laspeyres and Paasche methods.


2 Revision Exercises
Revision Exercise 1
The prices and quantities of bread were compared over a two-year
period. Use the data in the table below to answer the
questions that follow.

               Prices            Quantities
Commodity      2010     2011     2010    2011    P1Q0      P0Q0
White bread    12.50    15.00    8       2       120.00    100
Brown bread    16.00    22.50    5       7       112.50     80
Rye bread      27.50    20.00    2       5        40.00     55
Total          56.00    57.50                    272.50    235

1. Which type of bread shows the smallest relative change
in price over the two years?

2. Calculate the unweighted composite quantity index for
2011 for all the commodities.

3. Calculate the Laspeyres price index for 2011.

4. Calculate the Paasche price index for 2011. Compare
the two weighted indexes you have computed.

Revision Exercise 3
The table below shows the prices and annual consumption of
the raw materials used in Gauteng Breweries in 2010 and
2011.

                 Prices          Unit Quantities
Raw materials    2010    2011    2010      2011
Malt              49      46     10 874    15 116
Hops             512     724        732       696
Sugar             46      51      1 865     2 486
Wheat flour       31      27        873     1 093

1. Which raw material shows the largest relative change in
price over the two years?

2. Which raw material shows the largest relative change in
quantity over the two years?

3. Calculate the unweighted composite price and quantity
indexes for 2011 and interpret your answers.


4. Calculate the Laspeyres price index for 2011.

5. Calculate the Paasche price index for 2011.

6. Compare the two weighted indexes you have computed.


3. Solutions to Activities and Revision Exercises
3.1 Activity 1
Purpose:

The purpose is to use summary values to measure how prices
and quantities change over time periods. The value of an item
in the current period is expressed as a ratio of the value in a
base period. A step-by-step approach will be used in this
activity to explain all concepts.

Task:

The table below shows the prices (R) and quantities (kg) of
food items bought during 2013 and 2014.

            2013         2014
            P0    Q0     P1    Q1     P1Q0     P0Q0     P0Q1     P1Q1
Rice         7    80      6    70      480      560      490      420
Meat        30    50     35    60    1 750    1 500    1 800    2 100
Potatoes     3   100      3   100      300      300      300      300
Total       40   230     44   230    2 530    2 360    2 590    2 820

Step 1:

The base period is given as the year 2013.

Step 2:

Label 2013 price and quantity columns as P0 and Q0, and for
2014 label them as P1 and Q1.


Questions and Model Solutions:

1. With 2013 as the base period, calculate the 2014 simple price
and quantity index for rice.

Model solution:

The simple price index formula is $I_p = \frac{p_1}{p_0} \times 100$.
Insert the rice prices in the formula:

$I_p(\text{rice}) = \frac{6}{7} \times 100 = 85.71$

There was a decrease of 100% - 85.71% = 14.29% in the price of
rice from 2013 to 2014.

The simple quantity index formula is $I_q = \frac{q_1}{q_0} \times 100$.
Insert the rice quantities in the formula:

$I_q(\text{rice}) = \frac{70}{80} \times 100 = 87.50$

There was a decrease of 100% - 87.50% = 12.50% in the quantity of
rice from 2013 to 2014.

2. Which food item shows the largest change in quantity over
the two-year period?

Model solution:

Calculate a simple quantity index, $I_q = \frac{q_1}{q_0} \times 100$,
for all three products and compare the answers.

$I_q(\text{rice}) = \frac{70}{80} \times 100 = 87.50$
There was a decrease of 12.5% in quantity over the period.

$I_q(\text{meat}) = \frac{60}{50} \times 100 = 120$
There was an increase of 20% in quantity over the period.

$I_q(\text{potatoes}) = \frac{100}{100} \times 100 = 100$
There is no change in quantity consumed over the period.

Rice shows a decrease of 12.5% and meat an increase of 20%. The
largest change is therefore in the quantity of meat consumed.
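
The comparison of simple quantity indices can also be checked with a few lines of Python. This is a minimal sketch, not part of the prescribed method: the function name `simple_index` and the dictionary layout are illustrative assumptions, and the quantities are taken from the Activity 1 table.

```python
def simple_index(current: float, base: float) -> float:
    """Simple index: current-period value relative to the base period, times 100."""
    return current / base * 100


# Quantities (kg) from the Activity 1 table as (2013 base, 2014 current).
quantities = {"Rice": (80, 70), "Meat": (50, 60), "Potatoes": (100, 100)}

indices = {item: simple_index(q1, q0) for item, (q0, q1) in quantities.items()}
# The percentage change is the index minus 100.
changes = {item: index - 100 for item, index in indices.items()}
# The largest change is judged by absolute size, whether up or down.
largest = max(changes, key=lambda item: abs(changes[item]))

for item, index in indices.items():
    print(f"{item}: index = {index:.2f}, change = {changes[item]:+.2f}%")
print("Largest change in quantity:", largest)
```

Comparing absolute changes matters here because a decrease and an increase are both "changes"; the sign only tells you the direction.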


3. Using 2013 as the base period, calculate the unweighted
composite price and quantity indexes for 2014.

Model solution:

Step 1: Sum the prices of all the items in the given period ($p_1$).
Step 2: Sum the prices of all the items in the base period ($p_0$).
Step 3: Substitute the totals into the formula, and interpret the
results.

$I_p = \frac{\sum p_1}{\sum p_0} \times 100 = \frac{44}{40} \times 100 = 110$

There was an average increase of 10% in price for the three
commodities over the time period.

$I_q = \frac{\sum q_1}{\sum q_0} \times 100 = \frac{230}{230} \times 100 = 100$

There was no change in the average quantity consumed of the three
commodities over the time period.
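
The three steps for the unweighted composite index can be sketched in Python as follows. The function name `unweighted_composite_index` is a hypothetical helper, and the lists simply restate the Activity 1 prices and quantities in the order rice, meat, potatoes.

```python
def unweighted_composite_index(current_values, base_values):
    """Sum of current-period values over sum of base-period values, times 100."""
    return sum(current_values) / sum(base_values) * 100


# Activity 1 data in the order rice, meat, potatoes.
p0 = [7, 30, 3]      # 2013 prices (R)
p1 = [6, 35, 3]      # 2014 prices (R)
q0 = [80, 50, 100]   # 2013 quantities (kg)
q1 = [70, 60, 100]   # 2014 quantities (kg)

price_index = unweighted_composite_index(p1, p0)
quantity_index = unweighted_composite_index(q1, q0)

print(f"Unweighted composite price index: {price_index:.2f}")
print(f"Unweighted composite quantity index: {quantity_index:.2f}")
```

Because no weights are used, an expensive item such as meat dominates the price total regardless of how much of it is bought, which is the main limitation of this index.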
4. Using 2013 as the base period, calculate the change in price
and the change in quantity for 2014 using the Laspeyres methods.

Model solution:

$I_p(L) = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{2\,530}{2\,360} \times 100 = 107.20$

There was an average increase in price of 7.2% between 2013 and
2014, holding quantities constant at 2013 values.

$I_q(L) = \frac{\sum p_0 q_1}{\sum p_0 q_0} \times 100 = \frac{2\,590}{2\,360} \times 100 = 109.75$

There was an average increase in quantity of 9.75% between 2013
and 2014, holding prices constant at 2013 values.


5. Using 2013 as the base period, calculate the change in quantity
and the change in price using the Paasche methods.

Model solution:

$I_p(P) = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{2\,820}{2\,590} \times 100 = 108.88$

There was an average increase in price of 8.88% between 2013 and
2014, holding quantities constant at 2014 values.

$I_q(P) = \frac{\sum p_1 q_1}{\sum p_1 q_0} \times 100 = \frac{2\,820}{2\,530} \times 100 = 111.46$

There was an average increase in quantity of 11.46% between 2013
and 2014, holding prices constant at 2014 values.
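
The four weighted indices for Activity 1 can be recomputed from the raw prices and quantities with a short Python sketch. The function names (`weighted_sum`, `laspeyres_price` and so on) are illustrative assumptions, not names used in the prescribed textbook.

```python
def weighted_sum(prices, quantities):
    """Sum of price x quantity over all items."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres_price(p0, p1, q0):
    # Price change with quantities held at base-period values.
    return weighted_sum(p1, q0) / weighted_sum(p0, q0) * 100


def laspeyres_quantity(p0, q0, q1):
    # Quantity change with prices held at base-period values.
    return weighted_sum(p0, q1) / weighted_sum(p0, q0) * 100


def paasche_price(p0, p1, q1):
    # Price change with quantities held at current-period values.
    return weighted_sum(p1, q1) / weighted_sum(p0, q1) * 100


def paasche_quantity(p1, q0, q1):
    # Quantity change with prices held at current-period values.
    return weighted_sum(p1, q1) / weighted_sum(p1, q0) * 100


# Activity 1 data in the order rice, meat, potatoes.
p0, q0 = [7, 30, 3], [80, 50, 100]   # 2013 (base period)
p1, q1 = [6, 35, 3], [70, 60, 100]   # 2014 (current period)

print(f"Laspeyres price index:    {laspeyres_price(p0, p1, q0):.2f}")
print(f"Laspeyres quantity index: {laspeyres_quantity(p0, q0, q1):.2f}")
print(f"Paasche price index:      {paasche_price(p0, p1, q1):.2f}")
print(f"Paasche quantity index:   {paasche_quantity(p1, q0, q1):.2f}")
```

Subtracting 100 from each printed index gives the percentage change to quote in the interpretation.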

3.2 Activity 2
The following table shows the 2011 and 2012 prices and
registrations of four makes of cars by a dealer in
Johannesburg. The number of motor vehicles per make of car
registered in 2011 and 2012 were used as respective weights.

              Prices (R’000)    Number of registrations
Make          2011    2012      2011     2012
              P0      P1        Q0       Q1       P0Q1       P1Q0       P0Q0       P1Q1
Audi          170     228          57       68     11 560     12 996      9 690     15 504
BMW           226     286         526      492    111 192    150 436    118 876    140 712
Toyota        102     136       1 703    1 803    183 906    231 608    173 706    245 208
Volkswagen     77      99       1 343    1 229     94 633    132 957    103 411    121 671
Total         575     749       3 629    3 592    401 291    527 997    405 683    523 095


Questions and Model Solutions:

1. With 2011 as the base period, calculate the simple quantity
index for BMW for 2012.

Model solution:

$I_q = \frac{q_1}{q_0} \times 100 = \frac{492}{526} \times 100 = 93.54$

There was a decrease of 6.46% in the quantity of BMWs registered.

2. With 2011 as the base period, calculate the unweighted
composite price index for 2012.

Model solution:

$I_p = \frac{\sum p_1}{\sum p_0} \times 100 = \frac{749}{575} \times 100 = 130.26$

There was a 30.26% increase in the average price of the vehicles
registered between 2011 and 2012.

3. With 2011 as the base period, calculate the change in price for
2012 using the Laspeyres and Paasche methods.

Model solution:

$I_p(L) = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{527\,997}{405\,683} \times 100 = 130.15$

There was an average increase in price of 30.15% between 2011 and
2012, holding quantities constant at 2011 values.

$I_p(P) = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{523\,095}{401\,291} \times 100 = 130.35$

There was an average increase in price of 30.35% between 2011 and
2012, holding quantities constant at 2012 values.

3.3 Revision Exercise 2


1. White: 20% increase
Brown: 40.6% increase
Rye: 27.27% decrease

The smallest relative change is in the price of white bread.

2. $I_q = \frac{\sum q_1}{\sum q_0} \times 100 = \frac{14}{15} \times 100 = 93.33$

The index indicates an average decrease of 6.67% in quantities.

3. $I_p(L) = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{272.50}{235} \times 100 = 115.96$

An average increase of 15.96% in price, holding quantities
constant at 2010 values.


4. $I_p(P) = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{287.5}{274.5} \times 100 = 104.74$

An average increase of 4.74% in price, holding quantities
constant at 2011 values.

3.4 Revision Exercise 3


1. Malt: Decrease of 6.12%
Hops: Increase of 41.41%
Sugar: Increase of 10.87%
Wheat: Decrease of 12.9%

Therefore, the raw material that had the largest relative
change in price over the two years was Hops.

2. Malt: Increase of 39.01%
Hops: Decrease of 4.92%
Sugar: Increase of 33.3%
Wheat: Increase of 25.2%

Therefore, the raw material that had the largest relative
change in quantity over the two years was Malt.

3. $I_p = \frac{\sum p_1}{\sum p_0} \times 100 = \frac{848}{638} \times 100 = 132.92$

Between 2010 and 2011, the price of the raw materials
increased by an average of 32.92%.

$I_q = \frac{\sum q_1}{\sum q_0} \times 100 = \frac{19\,391}{14\,344} \times 100 = 135.19$

Between 2010 and 2011, the quantity of the raw materials
increased by an average of 35.19%.

4. $I_p(L) = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{1\,148\,858}{1\,020\,463} \times 100 = 112.58$

Between 2010 and 2011, the price of the raw materials
increased by an average of 12.58%, holding quantities
constant at 2010 values.


5. $I_p(P) = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{1\,355\,537}{1\,245\,275} \times 100 = 108.85$

Between 2010 and 2011, the price of the raw materials
increased by an average of 8.85%, holding quantities
constant at 2011 values.

6. Holding quantities constant at current values results in
the average change in prices being lower for the
Paasche index when compared to the Laspeyres index.
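
Because the Revision Exercise 3 table gives only prices and quantities, the price-times-quantity totals used in the weighted indices have to be built first. The sketch below shows one way to do that in Python; the dictionary name `materials` and the tuple order (p0, p1, q0, q1) are assumptions made for this illustration.

```python
# (p0, p1, q0, q1) per raw material, from the Revision Exercise 3 table:
# 2010 price, 2011 price, 2010 quantity, 2011 quantity.
materials = {
    "Malt": (49, 46, 10874, 15116),
    "Hops": (512, 724, 732, 696),
    "Sugar": (46, 51, 1865, 2486),
    "Wheat flour": (31, 27, 873, 1093),
}

# Build the four weighted totals that the Laspeyres and Paasche formulas need.
sum_p0q0 = sum(p0 * q0 for p0, p1, q0, q1 in materials.values())
sum_p1q0 = sum(p1 * q0 for p0, p1, q0, q1 in materials.values())
sum_p0q1 = sum(p0 * q1 for p0, p1, q0, q1 in materials.values())
sum_p1q1 = sum(p1 * q1 for p0, p1, q0, q1 in materials.values())

laspeyres = sum_p1q0 / sum_p0q0 * 100   # base-period (2010) quantities as weights
paasche = sum_p1q1 / sum_p0q1 * 100     # current-period (2011) quantities as weights

print(f"Laspeyres price index: {laspeyres:.2f}")
print(f"Paasche price index:   {paasche:.2f}")
print("Paasche is lower than Laspeyres:", paasche < laspeyres)
```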


Learning Unit 3: Descriptive Statistics


Material used for this Learning Unit:

• Prescribed Textbook, Chapter Two and Chapter Three. (Please note, all
references and exercises related to Excel can be done for enrichment
purposes but will not be assessed.)

1. Activities
1.1 Activity 1
Purpose:

The purpose of this activity is to do a complete analysis of a
case study involving ungrouped data.

Task:

The following data represents the number of daily and Sunday
newspapers currently published in the nine provinces of SA.

6 18 7 9 24 12 6 19 11

1. Calculate the mean, median and modal number of
newspapers per province. Interpret each answer.

2. By comparing the mean, median and modal statistics,
what can you conclude about the shape of the
distribution?

3. Determine the range, standard deviation, variance and
coefficient of variation for the number of newspapers per
province.

4. Construct a five-number summary table and a box-and-whisker
plot. Comment on the skewness of the distribution, and the
presence or absence of outliers.

5. Determine the inter-quartile range and quartile deviation.
Interpret your answer.

6. Determine the middle 70% range.


7. What is the minimum number of newspapers a province
must publish to fall within the 15% of provinces that
publish the most newspapers?

Commentary Related to Activity Design:

This activity is designed to serve as a check to see if you can
perform all the possible descriptive numerical summaries for
ungrouped data. The most important part of the analysis is
whether you can interpret all your results.

1.2 Activity 2
Purpose:

Demonstrate your ability to choose an appropriate graph for
the given data and follow the correct steps in constructing the
graph.

Task:

Illustrate the following data by means of a bar chart. Analysis
of costs (R’00) over four years for temporary workers hired to
clean new building sites.

Year    Safety Equipment    Transport per Month    Lunch
2009    120                 10                     20
2010    140                 20                     20
2011    100                 40                     30
2012    110                 30                     50

Commentary Related to Activity Design:

It is possible to apply different bar graphs to portray the data. It
all depends on what information you want from the graph. For
example, do you want to compare total costs per year? Do you
want to compare the different components?


1.3 Activity 3
Purpose:

The purpose of this activity is to convert the frequency
table into other formats in order to obtain certain
information from it. It also introduces you to basic decision
making.

Task:

The following percentage frequency distribution resulted from a
study of how late a sample of employees of an organisation
arrived for work during a specific month:

Minutes Late    % of Employees
1 - < 6         1
6 - < 11        3
11 - < 16       4
16 - < 21       6
21 - < 26       7
26 - < 31       9
31 - < 36       11
36 - < 41       19
41 - < 46       32
46 - < 51       8

1. Determine the class midpoints for each interval, and
explain the meaning of the midpoints in two of the
intervals.

2. Determine the cumulative less-than percentage
frequencies for each interval.

3. If the study covered a sample of 3 000 employees, how
many of them fell into each class interval?

4. Do you think the organisation should be concerned about
the results obtained from this study? Give reasons for
your answer.

Commentary Related to Activity Design:

This activity introduces you to interpreting data and making
decisions based on the results obtained. If you
experience any problems in working with the percentage
column, refer back to Learning Unit 1.


1.4 Activity 4
Purpose:

Determine if it is possible to get the same information from a
frequency distribution graph and from a frequency table.

Task:

The following graph shows the duration of a sample of
international phone calls using a prepaid calling card.

[Figure: Duration of International Calls (cumulative frequency
graph). x-axis: Duration (minutes); y-axis: Number of Calls.
Plotted points: (0, 0), (8, 25), (16, 40), (24, 50), (32, 59),
(40, 63), (48, 64).]

1. What is the sample size?

2. How many intervals are used in the construction of the
graph?

3. What is the most frequently occurring time of calls?

4. Half of the calls are shorter than how many minutes?

5. How many of the calls are longer than 32 minutes?

6. How many of the calls are 10 minutes or less?


1.5 Activity 5

Purpose:

The purpose of this activity is to perform a complete analysis of
a case study given with grouped data.

Task:

The following data set represents the distribution of annual
salaries (R’000) of 50 males who all perform similar jobs in a
particular industry.

Salary (R’000)    Frequency    Midpoint (x)
20 -< 22           1           21
22 -< 24           3           23
24 -< 26          14           25
26 -< 28          20           27
28 -< 30          10           29
30 -< 32           2           31
Total             50

1. Calculate the mean, median and modal annual salary for
the males in this industry. Explain each answer.

2. By comparing the mean, median and modal statistics,
what can you conclude about the shape of the
distribution?

3. Determine the range, standard deviation, variance and
coefficient of variation for the salaries for the males in the
industry.

4. Determine the inter-quartile range and interpret your
answer.

5. Determine the middle 90% range.

6. Which value falls at the 60th percentile? Explain this
answer.

7. Plot the ogive, and read the value of the median from the
graph.


Commentary Related to Activity Design:

You were required to use three different ways to
determine skewness. Are the three answers the same?
Remember that if you deal with grouped data, your answers
are approximations, as the original data is not available.


2. Revision Exercises
2.1 Revision Exercise 1
1. A small company pays each of its five cleaners R22 000,
two clerks R70 000 each and the manager R270 000.
How many employees earn less than the mean salary?

2. Given a negatively skewed distribution with a median of
11 and a mode of 18, which of the following is a possible
value for the mean and why?

28 19 10 12

3. For the data 2, 19, 29, 19, 100, 9, 90; which of the mean,
median or mode would be changed if the 2 was changed
to 29?

4. The mean of a set of five data points is 20. You have
three of the data points: 5, 15, 30. The remaining two
numbers are also the mode of the distribution. What are
the two numbers?

5. For the last 10 days in January this year, the Gautrain
from Pretoria arrived late in Johannesburg by the
following number of minutes. (A negative number means
that the train was early by the number of minutes.)

-2 6 4 10 -4 12 2 -1 3 1

Determine the range and the mean of the data set.

6. A sample of students has a mean of 3.2 members in
their family. The modal number of family members is two
and the median number is 2.1. Based on this
information, what will the shape of the distribution
probably be?


7. Earthquake intensities are measured using a device
called a seismograph which is designed to be most
sensitive for earthquakes with intensities between 4.0
and 9.0 on the open-ended Richter scale.
Measurements of 18 recent earthquakes gave the
following readings:

4.5 L 5.5 H 8.7 8.9 6.0 H 5.2
L 4 5 5 6 8 8 H H

L indicates that the earthquake had an intensity of below
4.0 and H indicates that the earthquake had intensity
above 9.0. Fifty percent of the earthquake intensities
were more than what value?

8. The number of rejects from 50 samples of the same size
is as follows:

Number of Rejects in Sample    Number of Samples (Frequency of Rejects)
0                               5
1                              10
2                              10
3                              20
4                               5

8.1 The arithmetic mean number of rejects per sample
is _______?

8.2 Half of the samples have less than _______?

8.3 The modal number of rejects is _________?

9. A financial analyst’s sample of six companies’ book
values (in R’000) were:

25 7 22 33 18 15

If the sample mean is R20 000, what is the sample
standard deviation? (Round your answer to the closest
Rand.)


10. A company has two regional head offices, in Cape Town
and Johannesburg. Workers in Johannesburg claim that
their salaries are more variable than the workers in Cape
Town. To test their claim, the following data was
collected for a random sample of 100 workers in each
office.

                      Cape Town    Johannesburg
Mean Salary           R27 000      R25 000
Standard Deviation    R2 000       R2 100

Are the salaries in Johannesburg more variable? Use the
coefficient of variation to determine your answer.

11. Random samples of small townhouse selling prices are
obtained from ABSA Bank and First National Bank. The
results followed normal distributions and are summarised
below:

                      ABSA        First National
Sample Size           50          80
Mean House Price      R150 000    R160 000
Standard Deviation    R20 000     R25 000

Which financial institution’s reported prices can be
considered more uniform? Use the coefficient of variation
to prove your answer.

12. In a data set of 120 observations, how many
observations lie between the 50th percentile and the 60th
percentile?

2.2 Revision Exercise 2


1. A random sample of 40 smokers is classified according
to age. The youngest person recorded was 10 years and
the oldest 70 years. The median smoking age was 35
years, 25% of the smoking people were older than 45
and 25% were younger than 30. Draw a box-and-whisker
plot to summarise the data and determine:

1.1 The shape of the age distribution;

1.2 The interquartile range;

1.3 50% of the smokers were older than what age?


1.4 Smokers younger than what age will be considered
outliers?

1.5 The oldest 25% of the smokers are older than what
age?

1.6 Are there any outliers present?

2. The interquartile range for a data set is 90 – 66. Decide
which of the following data values would be classified as
an outlier: 50; 140; 100.

3. The distances travelled each week (in kilometres) by 12
randomly selected sales representatives of an insurance
company are as follows:

100 110 190 200 290 320
380 400 410 580 700 980

Determine the five-number summary table.

4. The weights of a sample of female police officers are
summarised in the following boxplot:

4.1 What proportion of the female police officers weigh
between 70kg and 90kg?

4.2 What proportion of the female police officers weigh
more than 90kg?

4.3 Determine the median weight of the female police
officers.

4.4 Determine the range of the weights.

4.5 What values can the mean possibly be? Why will
this be a possibility?

4.6 Values in the original data set that can be
considered ‘outliers’ are smaller than or larger than
what weights?


2.3 Revision Exercise 3


The following table shows the income (in R’0 000) of a
training facility for disabled persons:

Year    Income from the National Lottery    Subsidy    Total
2009    1.4                                 2.0        3.4
2010    1.8                                 2.7        4.5
2011    2.0                                 0.6        2.6
2012    2.0                                 1.7        3.7

Construct a multiple bar graph to portray the data. What are
the main features of the data that you can observe in your
graph?

2.4 Revision Exercise 4


The following data was obtained from a questionnaire
completed by a sample of 25 people about how they get news:
(N=newspaper, T=television, R=radio, M=magazine)

N T N R N T N R N
R T M R M M N M
M N R T R R T M

Summarise the results in a frequency table and construct a pie
chart. Interpret your results.

2.5 Revision Exercise 5


A commercial farmer keeps record of the rainfall figures on his
farm. Over the last 50 months, the following readings (in ml)
were recorded:

55.8 60.9 39.1 40.0 71.4 77.1 37.0 35.5 31.7 65.2
45.9 59.1 91.3 56.0 36.7 52.6 49.5 65.8 44.6 62.3
83.2 58.2 69.3 42.3 71.7 47.3 48.0 69.8 33.8 61.2
75.3 94.6 61.8 64.9 60.6 61.5 56.3 78.8 27.1 76.0
60.7 47.2 30.0 39.8 87.1 69.0 74.5 68.2 65.0 66.3

Summarise this data in the form of a grouped frequency
distribution table, consisting of the classes 27-<37, 37-<47 and
so on, and then answer the following questions.


1. Which interval occurs most frequently?

2. Construct a histogram for the data. What is the
frequency of the interval with the lowest rainfall?

3. Construct a polygon. What can you conclude about the
shape of the polygon? What is the reason for the
particular shape?

4. Construct a less than ogive. Half of the months show a
rainfall of less than how many millilitres?

2.6 Revision Exercise 6

The following histogram illustrates information obtained from a
street vendors’ association meeting in the Johannesburg CBD
in December of last year.

[Figure: Histogram, December profits (R'000). x-axis: Profit
(R'000) with classes 2-<5, 5-<8, 8-<11, 11-<14, 14-<17; y-axis:
Number of street vendors. Bar-height labels shown: 14, 11, 10,
3, 2.]

1. What is summarised in this histogram?

2. How many street vendors were sampled?

3. Approximately how much profit is made by those
vendors who earn the highest profit?

4. The 17 vendors who made the least profit earned less
than…?

5. What proportion of the vendors earned between R11 000
and less than R14 000?


6. What can you conclude about the shape of the
distribution? Explain what it means.

2.7 Revision Exercise 7

1. Each prospective employee who applies for a job at a
certain bank is given a test. The length of time it took a
sample of 42 applicants to complete the test is given in
the following frequency distribution table:

Length of Time (minutes)    Number of Applicants
1 -< 4                       4
4 -< 7                       8
7 -< 10                     14
10 -< 13                     9
13 -< 16                     5
16 -< 19                     2

1.1 What was the mean time that it took the applicants
in the sample to complete the test?

1.2 Determine the median time required by the
applicants in the sample to complete the test.

1.3 Determine the most frequent time interval that it
took applicants in the sample to complete the test.


3 Solutions to Activities and Revision Exercises


3.1 Activity 1

Questions and Model Solutions:

The following data represents the number of daily and Sunday newspapers published
in nine provinces during 2010.

6 18 7 9 24 12 6 19 11

1. Calculate the mean, median and modal number of newspapers.
Explain each answer.

Model solution:

• If you take the nine provinces into account, the average number
  of newspapers published per province is 12.44:
  $\bar{x} = \frac{112}{9} = 12.44$
• The median position is $\frac{9 + 1}{2} = 5$ (Note: The numbers
  must be in numerical order). The median is therefore 11. Half of
  the provinces publish 11 different newspapers or less, the other
  half publish 11 newspapers or more;
• The mode = 6. More of the provinces publish six different
  newspapers than any other number.

2. By comparing the mean, median and modal statistics, what can
you conclude about the shape of the distribution?

Model solution:

Mean = 12.44; Median = 11; Mode = 6

[Figure: Distribution of Daily and Sunday Newspapers 2010, with
mode < median < mean marked.]

The distribution is positively skewed
because mode < median < mean.
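
The three measures of central tendency for the newspaper data can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and the skewness rule in the last lines is simply the mode/median/mean comparison used in the model solution.

```python
import statistics

# Number of daily and Sunday newspapers per province (Activity 1 data).
newspapers = [6, 18, 7, 9, 24, 12, 6, 19, 11]

mean = statistics.mean(newspapers)       # sum of values divided by n
median = statistics.median(newspapers)   # middle value of the sorted data
mode = statistics.mode(newspapers)       # most frequently occurring value

print(f"Mean = {mean:.2f}, Median = {median}, Mode = {mode}")

# Compare the three statistics to judge the direction of skewness.
if mode < median < mean:
    shape = "positively skewed"
elif mean < median < mode:
    shape = "negatively skewed"
else:
    shape = "no clear direction from this comparison"
print("Shape:", shape)
```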


3. Determine the range, standard deviation, variance and
coefficient of variation for the set of newspapers.

Model solution:

• Range = Largest number - smallest number = 24 - 6 = 18
• Standard deviation:
  $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{334.22}{8}} = \sqrt{41.78} = 6.46$ newspapers
• Variance: $s^2 = \frac{334.22}{8} = 41.78$ (in squared units)
• Coefficient of variation = $\frac{6.46}{12.44} \times 100 = 51.93\%$

4. Construct a five-number summary table and a box-and-whisker
plot. Comment on the skewness of the distribution and the presence
of outliers.

Model solution:

• S (smallest value) = 6;
• Q1 = 6.5;
• Median = 11;
• Q3 = 18.5;
• L (largest value) = 24.

• This distribution is positively skewed:
  $S_p = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(12.44 - 11)}{6.46} = 0.67$
• Lower limit = 6.5 - 1.5(18.5 - 6.5) = -11.5;
• Upper limit = 18.5 + 1.5(18.5 - 6.5) = 36.5;
• The smallest value and the largest value fall within the limits,
  therefore no outliers are present.

5. Determine the inter-quartile range and quartile deviation.
Interpret your answer.

Model solution:

• The interquartile range = Q3 - Q1 = 18.5 - 6.5 = 12
• The quartile deviation = $\frac{12}{2} = 6$;
• The average deviation around the median is 6 newspapers.


6. Determine the middle 70% range.

Model solution:

Position P85 = 8.5
P85 = 21.5.
(85% of the provinces publish 21.5 newspapers or less).
Position P15 = 1.5
P15 = 6.
(15% of the provinces publish 6 newspapers or less).

Middle 70% range = P85 - P15
= 21.5 - 6
= 15.5

7. What is the minimum number of newspapers a province must
publish to fall within the top 15% of the provinces?

Model solution:

That will be more than P85 = 21.5, i.e. the province must publish
21.5 newspapers or more.
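
The spread, quartile and percentile results for the newspaper data can be reproduced with the sketch below. It assumes the positional rule used in the model solutions, where the p-th percentile sits at position p(n + 1)/100 in the sorted data and values between positions are interpolated linearly; the helper name `percentile` is hypothetical. Small differences from the model answers (for example in the coefficient of variation) can arise because the code does not round intermediate values.

```python
import statistics

newspapers = sorted([6, 18, 7, 9, 24, 12, 6, 19, 11])


def percentile(sorted_values, p):
    """Value at position p(n + 1)/100 (counting from 1), interpolating linearly."""
    n = len(sorted_values)
    position = p * (n + 1) / 100
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part used for interpolation
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


mean = statistics.mean(newspapers)
s = statistics.stdev(newspapers)          # sample standard deviation (n - 1)
variance = statistics.variance(newspapers)
cv = s / mean * 100                       # coefficient of variation, in %

q1, median, q3 = (percentile(newspapers, p) for p in (25, 50, 75))
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr              # outlier fences
upper_limit = q3 + 1.5 * iqr
pearson_skew = 3 * (mean - median) / s

p15, p85 = percentile(newspapers, 15), percentile(newspapers, 85)

print(f"s = {s:.2f}, variance = {variance:.2f}, CV = {cv:.2f}%")
print(f"Five-number summary: {newspapers[0]}, {q1}, {median}, {q3}, {newspapers[-1]}")
print(f"IQR = {iqr}, quartile deviation = {iqr / 2}")
print(f"Outlier limits: {lower_limit} to {upper_limit}")
print(f"Pearson skewness = {pearson_skew:.2f}")
print(f"P15 = {p15}, P85 = {p85}, middle 70% range = {p85 - p15}")
```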

3.2 Activity 2

Questions and Model Solutions:

Illustrate the following data by means of a bar chart. Analysis of
costs (R’00) over four years for temporary workers hired to clean
new building sites.

Year    Safety Equipment    Transport per Month    Lunch
2009    120                 10                     20
2010    140                 20                     20
2011    100                 40                     30
2012    110                 30                     50

Model solution:

These graphs can be easily constructed using the graph facility in
MS Word. This is compound information and both a multiple and
stacked bar chart will be suitable.

[Figure: Multiple Bar Chart. x-axis: year (2009 to 2012); y-axis:
R'00; series: Safety equipment, Transport, Lunch.]

[Figure: Stacked Bar Chart. x-axis: year (2009 to 2012); y-axis:
R'00; series: Safety equipment, Transport, Lunch.]

3.3 Activity 3
The following percentage frequency distribution resulted from a
study of how late a sample of employees of an organisation
arrived for work during a specific month:

Minutes Late    % of Employees
1 - < 6         1
6 - < 11        3
11 - < 16       4
16 - < 21       6
21 - < 26       7
26 - < 31       9
31 - < 36       11
36 - < 41       19
41 - < 46       32
46 - < 51       8


Minutes Late    % of Employees    x       Cum f (<)    f
1 - < 6           1                3.5      1            30
6 - < 11          3                8.5      4            90
11 - < 16         4               13.5      8           120
16 - < 21         6               18.5     14           180
21 - < 26         7               23.5     21           210
26 - < 31         9               28.5     30           270
31 - < 36        11               33.5     41           330
36 - < 41        19               38.5     60           570
41 - < 46        32               43.5     92           960
46 - < 51         8               48.5    100           240
Total           100                                   3 000

Questions and Model Solutions:

1. Determine the class midpoints for each interval and explain the
meaning of the midpoints in two of the intervals.

Model solution:

• Interval 2: The value that represents this interval is 8.5; 3% of
  the employees are approximately 8.5 minutes late;
• Interval 7: The value that represents this interval is 33.5; 11%
  of the employees are approximately 33.5 minutes late.

2. Construct the cumulative frequencies for each interval.

Model solution: See above table.

3. If the study covers a sample of 3 000 employees, how many of
them fall into each class interval?

Model solution: See above table.

4. Do you think the organisation should be concerned about the
results obtained from this study? Give reasons for your answer.

Model solution:

Yes, the company should be concerned because 70% of the employees
are half an hour late or more.
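
The midpoint, cumulative percentage and frequency columns of the worked table can be derived as in the following Python sketch. The list names are illustrative, and the class width of 5 minutes is read off the intervals in the table.

```python
from itertools import accumulate

# Lower class limits (minutes late) and percentage of employees per class.
lower_limits = [1, 6, 11, 16, 21, 26, 31, 36, 41, 46]
percentages = [1, 3, 4, 6, 7, 9, 11, 19, 32, 8]
class_width = 5
sample_size = 3000

# Midpoint = lower limit plus half the class width.
midpoints = [low + class_width / 2 for low in lower_limits]
# Cumulative less-than percentages: running total of the percentage column.
cumulative = list(accumulate(percentages))
# Number of employees per class = percentage of the sample size.
frequencies = [pct * sample_size // 100 for pct in percentages]

for low, x, pct, cum, f in zip(lower_limits, midpoints, percentages, cumulative, frequencies):
    print(f"{low:>2} - < {low + class_width:<2}  x = {x:<5} % = {pct:<3} cum % = {cum:<4} f = {f}")

# Share of employees who are 31 minutes or more late.
late_31_plus = 100 - cumulative[lower_limits.index(26)]
print(f"{late_31_plus}% of employees are 31 minutes or more late")
```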


3.4 Activity 4

Questions and Model Solutions:

The following graph shows the duration of a sample of
international phone calls using a prepaid calling card.

[Figure: Duration of International Calls (cumulative frequency
graph). x-axis: Duration (minutes); y-axis: Number of Calls.
Plotted points: (0, 0), (8, 25), (16, 40), (24, 50), (32, 59),
(40, 63), (48, 64).]

1. What is the sample size?

Model solution: The last point is the sample size = 64.

2. How many intervals are used in the construction of the graph?

Model solution: Count the number of upper boundaries = 6.

3. What is the most frequently occurring time of the calls?

Model solution: Between 0 and 8 minutes.

4. Half of the calls are shorter than how many minutes?

Model solution: Half of the number of calls is 32. If you read 32
from the y-axis to the graph and drop down to the x-axis, half of
the calls are shorter than approximately 11 minutes.

5. How many of the calls are longer than 32 minutes?

Model solution: 59 calls are shorter than 32 minutes. There are 64
calls in the sample; therefore, five calls were longer than 32
minutes.

6. How many of the calls are 10 minutes or less?

Model solution: Approximately 29 calls.
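
Reading values off a cumulative frequency graph amounts to linear interpolation between the plotted points. The sketch below illustrates this in Python; the list `ogive` is an assumption that the plotted points are (0, 0), (8, 25), (16, 40), (24, 50), (32, 59), (40, 63) and (48, 64), as read from the graph, and the function names are hypothetical. A reading taken by eye from the graph will only agree approximately with the interpolated value.

```python
# (duration in minutes, cumulative number of calls), assumed from the graph.
ogive = [(0, 0), (8, 25), (16, 40), (24, 50), (32, 59), (40, 63), (48, 64)]


def calls_up_to(minutes):
    """Cumulative number of calls lasting `minutes` or less (read up from the x-axis)."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if x0 <= minutes <= x1:
            return y0 + (minutes - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("duration outside the range of the graph")


def duration_for(cumulative_calls):
    """Duration below which `cumulative_calls` calls fall (read across from the y-axis)."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if y0 <= cumulative_calls <= y1:
            return x0 + (cumulative_calls - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("count outside the range of the graph")


sample_size = ogive[-1][1]
print("Sample size:", sample_size)
print("Number of intervals:", len(ogive) - 1)
print(f"Half of the calls are shorter than about {duration_for(sample_size / 2):.1f} minutes")
print("Calls longer than 32 minutes:", sample_size - calls_up_to(32))
print(f"Calls of 10 minutes or less: about {calls_up_to(10):.0f}")
```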


3.5 Activity 5
The following data set represents the distribution of annual
salaries (R’000) of 50 males who all perform similar jobs in a
particular industry.

Salary      Frequency (f)    x     fx       Cum f (<)    fx²
20 -< 22     1               21      21      1             441
22 -< 24     3               23      69      4           1 587
24 -< 26    14               25     350     18           8 750
26 -< 28    20               27     540     38          14 580
28 -< 30    10               29     290     48           8 410
30 -< 32     2               31      62     50           1 922
Total       50                    1 332                 35 690

Questions: Model Solutions:


1. Calculate the mean, median and modal annual salary for the males in this industry. Explain each answer.

• Mean: x̅ = ∑fx / n = 1332 / 50 = 26.64, i.e. the mean salary of the 50 males in the sample is approximately R26 640.
• The median value is at position n/2 = 25. This value is in the interval 26-<28:
  Median = 26 + 2(25 − 18) / 20 = 26.7
  This means that half of the males in the sample are paid approximately R26 700 or less.
• The mode will be in the interval with the highest frequency, i.e. 26-<28:
  Mode = 26 + 2(20 − 14) / [2(20) − 14 − 10] = 26.75
  This means that the most frequently occurring salary in the sample was approximately R26 750.
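
The three grouped-data calculations can be checked with a short script. This is a minimal sketch in Python that applies the same interpolation formulas as the model solution to the frequency table above; the variable names are illustrative and not part of the module material.

```python
# Class intervals from the salary table: (lower boundary, upper boundary, frequency).
classes = [(20, 22, 1), (22, 24, 3), (24, 26, 14), (26, 28, 20), (28, 30, 10), (30, 32, 2)]

n = sum(f for _, _, f in classes)

# Mean: each class is represented by its midpoint.
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

# Median: find the class containing position n/2, then interpolate within it.
position = n / 2
cum_before = 0
for lo, hi, f in classes:
    if cum_before + f >= position:
        median = lo + (hi - lo) * (position - cum_before) / f
        break
    cum_before += f

# Mode: interpolate within the class that has the highest frequency.
m = max(range(len(classes)), key=lambda i: classes[i][2])
lo, hi, f_m = classes[m]
f_prev = classes[m - 1][2] if m > 0 else 0
f_next = classes[m + 1][2] if m < len(classes) - 1 else 0
mode = lo + (hi - lo) * (f_m - f_prev) / (2 * f_m - f_prev - f_next)

print(round(mean, 2), round(median, 2), round(mode, 2))  # 26.64 26.7 26.75
```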
2. By comparing the mean, median x̅ = 26.64; Median = 26.7; Mode = 26.75
and modal statistics, what can The distribution is slightly negatively
you conclude about the shape of skewed, because mode > median > mean.
the distribution. These statistics are very close together; so,
it may be argued that the distribution is
approximately symmetrical.


Questions: Model Solutions:


3. Determine the range, standard deviation, variance and coefficient of variation for the salaries for the males in the industry.

• Range = Upper boundary of last interval – lower boundary of first interval
        = 32 – 20
        = 12 (approximately)
• Standard deviation:
  s = √{[∑fx² − (1/n)(∑fx)²] / (n − 1)} = √{[35690 − (1/50)(1332)²] / (50 − 1)} = 2.05 ≈ R2 050
  (Students can use a calculator’s pre-programmed function to calculate this answer. The formula from the formula sheet would give the same answer.)
• Variance (s²) = 2.05² = 4.20 (in squared units of R’000).
• CV = (2.05 / 26.64) × 100 = 7.70%
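
As a check on the dispersion measures, here is a minimal Python sketch that uses the same formula and the class midpoints and frequencies from the salary table. Variable names are illustrative.

```python
import math

# Class midpoints (x) and frequencies (f) from the salary table, in R'000.
data = [(21, 1), (23, 3), (25, 14), (27, 20), (29, 10), (31, 2)]

n = sum(f for _, f in data)
sum_fx = sum(f * x for x, f in data)        # 1332
sum_fx2 = sum(f * x ** 2 for x, f in data)  # 35690

variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / (sum_fx / n) * 100

print(round(std_dev, 2))   # 2.05
print(round(variance, 2))  # 4.19
print(round(cv, 2))        # 7.69
```

The script carries unrounded values through, so the variance and CV come out as 4.19 and 7.69%. The hand calculation rounds s to 2.05 first, which gives 4.20 and 7.70%.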
4. Determine the inter-quartile range and interpret your answer.

Inter-quartile range = Q3 – Q1 = 27.95 – 25.21 = 2.74. This means that the middle 50% of values fall between 25.21 and 27.95.

5. Determine the middle 90% range.

Middle 90% range = P95 – P5
P95 = 28 + 2(47.5 − 38) / 10 = 29.90
P5 = 22 + 2(2.5 − 1) / 3 = 23
Thus, the middle 90% range is 29.90 – 23 = 6.90

6. Which value falls at the 60th percentile? Explain this answer.

P60 = 26 + 2(30 − 18) / 20 = 27.20
60% of the males in the sample earn R27 200 or less.
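
The quartiles and percentiles in questions 4 to 6 all use the same interpolation step, so they can be sketched with one helper. The function name `percentile` is illustrative; the method assumes values are spread evenly within each class interval, as in the model solution.

```python
# Class intervals from the salary table: (lower boundary, upper boundary, frequency).
classes = [(20, 22, 1), (22, 24, 3), (24, 26, 14), (26, 28, 20), (28, 30, 10), (30, 32, 2)]
n = sum(f for _, _, f in classes)


def percentile(p):
    """Return the value at the p-th percentile (0 < p <= 100) of the grouped data."""
    position = p * n / 100
    cum_before = 0
    for lo, hi, f in classes:
        if cum_before + f >= position:
            return lo + (hi - lo) * (position - cum_before) / f
        cum_before += f
    raise ValueError("p must be between 0 and 100")


q1, q3 = percentile(25), percentile(75)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))  # 25.21 27.95 2.74
print(round(percentile(95) - percentile(5), 2))       # 6.9
print(round(percentile(60), 2))                       # 27.2
```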
7. Plot the histogram, and comment on the skewness of the distribution.

[Figure: Histogram of Annual Salaries.]

The distribution is negatively skewed, but relatively close to being symmetrical.


Questions: Model Solutions:


8. Plot the ogive, and read the value of the median from the graph.

[Figure: Ogive for Annual Salaries.]

The median is just lower than R27 000.

3.6 Revision Exercise 1


1. x̅ = [(5 × 22 000) + (2 × 70 000) + 270 000] / 8
= R65 000

The five cleaners earn less than R65 000.

2. If a distribution is negatively skewed, the mean must be smaller than the median. The answer is therefore 10.

3. The mean will increase, the median will remain unchanged, and there will be two modes: 19 and 29.

4. Total = 5 × 20
= 100

5 + 15 + 30 = 50
∴ 100 – 50 = 50

The remaining two data points must be the same in order to be the value that occurs most often. Therefore, the mode must be 25.

5. Range = 12 – (-4)
= 16

x̅ = 3.1 minutes (Use calculator)

6. Positively skewed, because mode < median < mean.


7. 6 on the Richter scale. This is the median value.

8.
8.1 110 / 50 = 2.2 rejects

8.2 2.5 rejects

8.3 3

9. s = √[∑(x − x̅)² / (n − 1)] = √[396 / (6 − 1)] = 8.899 ≈ R8 899

10. CV (CT) = (2000 / 27000) × 100 = 7.41%
CV (JHB) = (2100 / 25000) × 100 = 8.4%

Therefore, salaries are more variable in Johannesburg, because the CV is higher.

11. CV (ABSA) = (20000 / 150000) × 100 = 13.33%
CV (First National) = (25000 / 160000) × 100 = 15.63%
ABSA’s reported prices are more uniform.

12. P50 is observation number 60.5 and P60 is observation number 72.6. The difference between the positions is therefore 12 observations.

3.7 Revision Exercise 2


1.
1.1 The distribution is positively skewed.
1.2 Inter-quartile range is (45 – 30) = 15
1.3 50% of the smokers are older than 35 years.
1.4 Smokers younger than 7.5 years.
1.5 The oldest 25% are older than 45 years.
1.6 The lower limit is: 30 – 1.5(45 – 30) = 7.5
The upper limit is: 45 + 1.5(45 – 30) = 67.5

There are no outliers in the lower part of the data


set, as the smallest value in the data set (10) is
greater than the lower limit. However, we know
that there is at least one value in the data set
greater than the upper limit (the maximum value of
70). Therefore, all values in the data set greater
than the upper limit are outliers.


2. Lower limit = 66 – 1.5(90 – 66) = 30
Upper limit = 90 + 1.5(90 – 66) = 126

∴ 140 is an outlier.

3. S = 100
Q1 pos. =3.25
Q1 value = 190 + 0.25(10) =192.5
Median = 350
Q3 pos = 9.75
Q3 value = 410 + 0.75(170) = 537.5
L = 980

4.
4.1 50% because that is the interquartile range.
4.2 25%, because 90kg is Q3.
4.3 85kg.
4.4 100 – 50 = 50kg.
4.5 The mean must be smaller than 85 because the
distribution is negatively skewed.
4.6 Lower limit = 70 – 1.5(20) = 40kg. Weights lower
than 40kg will be considered outliers.
Upper limit = 90 + 1.5(20) = 120kg. Weights
greater than 120kg will be considered outliers.
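
The lower and upper limits used in questions 1.6, 2 and 4.6 all follow the same rule (Q1 – 1.5 × IQR and Q3 + 1.5 × IQR). A minimal Python sketch of that rule is given below; the function names are illustrative.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) using the 1.5 x inter-quartile range rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall outside the limits."""
    lower, upper = outlier_limits(q1, q3)
    return [v for v in values if v < lower or v > upper]


print(outlier_limits(30, 45))            # question 1.6: (7.5, 67.5)
print(find_outliers([10, 70], 30, 45))   # only the maximum is an outlier: [70]
print(outlier_limits(66, 90))            # question 2: (30.0, 126.0)
print(outlier_limits(70, 90))            # question 4.6: (40.0, 120.0)
```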

3.8 Revision Exercise 3


[Figure: Multiple Bar Graph; x-axis: 2009, 2010, 2011, 2012; y-axis: R0'000, 0 to 3; legend: National Lottery, Subsidy.]


3.9 Revision Exercise 4

Category   Tally      f
N          //// //    7
T          ////       5
R          //// //    7
M          //// /     6
Total                 25

[Figure: pie chart of frequency by category: N = 7, T = 5, R = 7, M = 6.]

3.10 Revision Exercise 5

Rainfall (mm)   Tally             No. of Months (f)   %f    Cum f (<)
27 –< 37        //// /            6                   12    6
37 –< 47        //// //           7                   14    13
47 –< 57        //// ///          8                   16    21
57 –< 67        //// //// ////    14                  28    35
67 –< 77        //// ////         9                   18    44
77 –< 87        ///               3                   6     47
87 –< 97        ///               3                   6     50
Total                             50                  100

1. The most common amount of rainfall is between 57-<67 mm. This is the modal class interval.


2. The interval is 27 -< 37, and the reading is 6.

3. The shape is approximately symmetrical, i.e. the data


are relatively evenly distributed around the centre.

4. Half of 50 is 25. Draw a straight line from 25 to the line of the graph, and then drop a straight line down to the x-axis. Read the answer from there. Half of the months show a rainfall of less than approximately 60 millimetres.


3.11 Revision Exercise 6


1. Profits made during December by a sample of street
vendors.

2. 3 + 14 + 11 + 10 + 2 = 40

3. The ones that earned the most will fall in the last interval:
therefore, they earned between R14 000 and less than
R17 000.

4. Less than R8 000.

5. Ten vendors made between R11 000 and less than R14 000. Therefore, the answer is: 10 / 40 = 25%.

6. The distribution is positively skewed, as most of the


values are concentrated in the left-hand side of the
distribution.


3.12 Revision Exercise 7


1.1 Mean = 384 / 42 = 9.14 minutes

1.2 Median = 7 + 3(21 − 12) / 14 = 8.93 minutes. Half of the applicants took 8.93 minutes or less to write the test.

1.3 The most frequent time interval that it took


applicants to complete the test was between 7 to
less than 10 minutes (this is the modal class
interval).


Learning Unit 4: Linear Regression and Correlation Analysis
Material used for this Learning Unit:

• Prescribed Textbook, Chapter Twelve (Please note, all references and


exercises related to Excel can be done for enrichment purposes but will
not be assessed.).

1. Statistics on the Sharp EL-531WH Calculator
(This is just one example of a calculator. If you have a different
model, please go to the Internet and search for assistance on
that particular calculator.)

[Figure: Sharp EL-531WH calculator, with the arrow keys labelled.]

Your calculator has two different MODES. One is for normal


calculations, while the other is for statistical calculations.

1. Select Statistical Mode by pressing MODE followed by 1


2. Select 1 again (LINE) for Linear Regression.
3. <STAT 1> will appear on display.


4. Clear the statistical memory before you start with the data input.

Select 2ndF MODE. This CA function clears all MODE memory.

5. Enter the x-value of the first data point and select STO.

Enter the y-value of the first data point and select M+.

Note: When in linear regression mode, the 2ndF key is not required, as the key now assumes the (x, y) function shown below the key.

6. <DATA SET = 1> will appear on the screen.

7. Continue in the same way until all the data values have
been entered.

Example:

X value Y value
3 86
4 92
5 95
4 83
2 78
3 82

Calculator steps:
Select 2ndF MODE to clear memory.

3 STO 86 M+
4 STO 92 M+
5 STO 95 M+
4 STO 83 M+
2 STO 78 M+
3 STO 82 M+

8. Once all the values have been entered, clear the screen
by selecting <On/C>.


9. If you have made a mistake when entering the values, delete the incorrect data point by scrolling up or down to the data point of concern using the arrows at the top middle of the calculator:

• Select 2ndF M+ to delete the incorrect x- and y-value. Enter the correct x- and y-values.
• It does not matter when this data point was entered.

10. Retrieving Totals

All of the totals previously calculated manually can now be obtained as follows:

n: RCL 0
∑y: RCL 2
∑x: RCL ·
∑xy: RCL ·
∑x²: RCL ±
∑y²: RCL 3

For our example: n = 6, ∑y = 516, ∑x = 21, ∑xy = 1835, ∑x² = 79, ∑y² = 44582

11. Regression and correlation output:

11.1 To obtain Pearson’s correlation coefficient r, select


RCL ÷
11.2 To obtain a (the y-intercept of the regression line),
select RCL (
11.3 To obtain b (the slope of the regression line), select
RCL )

For our example: r = 0,86, a = 67,55 b = 5,27
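
The calculator output for this example can be cross-checked by computing the totals and applying the usual formulas for r, a and b. The sketch below uses plain Python; the variable names are illustrative and the data are the six (x, y) pairs from the example.

```python
import math

# Example data entered into the calculator above.
x = [3, 4, 5, 4, 2, 3]
y = [86, 92, 95, 83, 78, 82]

# Totals, matching what the RCL keys return.
n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a ** 2 for a in x)
sum_y2 = sum(b ** 2 for b in y)
print(n, sum_y, sum_x, sum_xy, sum_x2, sum_y2)  # 6 516 21 1835 79 44582

# Pearson's r, slope b and y-intercept a from the totals.
numerator = n * sum_xy - sum_x * sum_y
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
b = numerator / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n
print(round(r, 2), round(a, 2), round(b, 2))  # 0.86 67.55 5.27
```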

12. After completing the regression question, it is important


to go back to NORMAL mode before doing any other
calculations.


Statistics on the Casio FX82MS Calculator
1. Set the calculator to Statistical Mode by selecting MODE
followed by 3 for the REG (regression) option. Select 1
again (LINE) for Linear Regression.

<REG> will appear on display.

2. Clear the statistical memory before you start with the


data input.

SHIFT CLR 1 =

Entering x and y Data Values

1. Enter the x-value of the first data point and select , (the comma key).

2. Enter the y-value of the first data point and select M+.

3. It should say n = 1 on the screen.

4. Enter the x-value of the second data point and select ,
Enter the y-value of the second data point and select M+.

5. n = 2 will now appear on the screen.

6. Continue in the same way until all of the data values


have been entered.

7. Once all the values have been entered, clear the screen
by selecting AC.


[Figure: Casio FX82MS calculator, with the SHIFT key, the MODE and CLR keys, the , (comma) key and the M+ data key labelled.]

8. If you have made a mistake while entering the values,


delete the incorrect data point by scrolling up or down to
the data value of concern, using the arrows located on
the large round button.

9. Delete the incorrect value, by selecting SHIFT M+. This


will delete the applicable x- and y-value.


10. Enter the correct value. Once you have finished editing
the data, select AC to exit the data edit mode.

Example:

X value Y value
3 86
4 92
5 95
4 83
2 78
3 82

Calculator steps:

SHIFT CLR 1 = to clear memory.

3 , 86 M+
4 , 92 M+
5 , 95 M+
4 , 83 M+
2 , 78 M+
3 , 82 M+

11. All of the totals required to calculate r, b and a, which


were previously calculated manually in class, can now be
obtained from the [S-SUM] menu as follows:

n: SHIFT 1 3
∑x: SHIFT 1 2
∑x²: SHIFT 1 1
∑y: SHIFT 1 SCROLL RIGHT WITH ARROW 2
∑xy: SHIFT 1 SCROLL RIGHT WITH ARROW 3

12. Retrieving r, a and b:

To obtain Pearson’s correlation coefficient r, select


SHIFT 2 < SCROLL RIGHT WITH ARROW TWICE 3 =
To obtain a - the y-intercept value, select
SHIFT 2 < SCROLL RIGHT WITH ARROW TWICE 1 =
To obtain b – the slope, select
SHIFT 2 < SCROLL RIGHT WITH ARROW TWICE 2 =


13. Before attempting a new question, remember to first


clear the data set that is currently in the calculator’s
memory.

14. After completing the regression question, go back to


normal calculation MODE.

2 Activities
2.1 Activity 1
Purpose:

This activity is designed to test all aspects of regression and


correlation analysis as well as interpretations.

Task:

A travel agency is interested in knowing how airline fares are


related to the length of the flight (in kilometres). The agency
hypothesised that the longer the flight, the more the airfare
would be. The following sample data was collected:

Kilometres (x) 2375 1400 1250 2325 985 2025


Airfare (R) (y) 1330 810 750 1266 621 1110

1. Identify the dependent and independent variables.

2. Plot a scatter diagram of the data.

3. With reference to the diagram, do you think that the


travel agency’s hypothesis is correct? Why or why not?

4. Calculate the correlation coefficient, and interpret the


answer.

5. Determine the coefficient of determination and interpret


the answer.

6. If you are going to travel 2 400 kilometres, what would


you expect the airfare to be?

Commentary Related to Activity Design:

Observational data allows us to state that two variables might


be related, but one cannot claim causation.


2.2 Activity 2

The maximum daily temperature (in °C) and coffee sales (in
Rands) for a coffee shop for eight randomly selected days
were recorded:

°C 16 19.5 25.5 30 32.5 36 39 40.5
Sales (R) 262 248 197 200 133 139 114 112

1. Draw a scatter diagram of coffee sales and daily


temperature. Comment on the results.

2. Calculate the correlation coefficient and interpret the


answer.

3. Determine the coefficient of determination and interpret


the answer.

4. If the forecast temperature is 18°C, what amount of coffee in Rands would you expect to sell?

5. What amount of coffee in Rands would you expect to sell


at the y-intercept point?

6. Interpret the slope of the distribution of coffee sold.

Commentary Related to Activity Design:

Observational data allow us to state that two variables might


be related, but you cannot claim causation. The slope of a
distribution is also known as the marginal value.


3 Revision Exercises
3.1 Revision Exercise 1
Bulk advertising is often used to influence the response of
buyers. For example, a product is advertised at ‘2 for R20’ to
convince people that they are getting a bargain. To test this
theory, a Fruit and Veg store advertises an item for equal
periods of time at five bulk rates, and records the quantities
sold:

Number of items in bulk sale Quantity sold


1 35
2 50
3 45
4 75
5 62

1. Plot a scatter diagram, to determine whether a


relationship exists.

2. Calculate and interpret the correlation coefficient.

3. Calculate and interpret the coefficient of determination.

4. Determine the regression equation, and interpret the b0


and b1 values in terms of the number of items and
quantities sold.

5. Estimate the quantity that would be sold if six items are


advertised in bulk at a certain price. Can you rely on this
estimate? Give a reason for your answer.


3.2 Revision Exercise 2


The following table shows the number of weeks that six people
have been employed at a car manufacturing inspection station,
and the number of cars each of them checked between 08h00
and 12h00 on a given day.

Weeks employed Cars checked


5 16
1 15
7 19
9 23
2 14
12 21

The following totals are also provided:

∑x = 36   ∑y = 108   ∑xy = 715   ∑x² = 304   ∑y² = 2008

Answer the next four questions:

1. Name the independent and dependent variables.

2. Prove that the number of weeks employed is a good


measure to use to predict the number of cars checked,
making use of the coefficient of determination.

3. Estimate the number of cars a person employed for six


weeks will check between 8h00 and 12h00 on any given
day.

4. Calculate the slope of the regression line.


3.3 Revision Exercise 3


Below is a scatter diagram illustrating number of bottles of cool
drink purchased, and number of family members. The sample
was selected from customers who visited the Do-Little
Supermarket during a specific day.
[Figure: scatter diagram of number of bottles of cool drink (y-axis, 0 to 6) against family size (x-axis, 0 to 8).]

1. What was the sample size?

2. Do you expect a positive or negative relationship


between the two variables?

3. What is the smallest value reported for family size?

4. What is the largest value reported for number of bottles


of cool drink?


4. Solutions to Activities and Revision Exercises


4.1 Activity 1
Purpose:

This activity is designed to test all aspects of regression and


correlation analysis as well as interpretations.

Task:

A travel agency is interested in knowing how airline fares are


related to the length of the flight (in kilometres). The agency
hypothesised that the longer the flight, the more the airfare
would be. The following sample data were collected:

Kilometres (x) 2375 1400 1250 2325 985 2025


Airfare (R) (y) 1330 810 750 1266 621 1110

Questions: Model Solutions:


1. Identify the dependent and Airfare depends on the distance the
independent variables. aeroplane flies and therefore, airfare is the
Y-variable and kilometres the X-variable.
2. Plot a scatter diagram of the data.

[Figure: scatter diagram of AIRFARE (y-axis, 0 to 1400) against KILOMETRES (x-axis, 0 to 2500).]

3. With reference to the diagram, The agency’s hypothesis was correct. The
do you think that the travel scatter plot shows a positive relationship,
agency’s hypothesis is correct? indicating that the longer the flight, the more
Why or why not? expensive the airfare.


4. Calculate the correlation coefficient and interpret the


answer.

Distance (X)   Airfare (Y)   XY          X²          Y²
2375           1330          3158750     5640625     1768900
1400           810           1134000     1960000     656100
1250           750           937500      1562500     562500
2325           1266          2943450     5405625     1602756
985            621           611685      970225      385641
2025           1110          2247750     4100625     1232100
Total: 10360   5887          11033135    19639600    6207997

x̅ = 10360 / 6 = 1726.67        y̅ = 5887 / 6 = 981.17

r = [n∑xy − ∑x∑y] / √{[n∑x² − (∑x)²] × [n∑y² − (∑y)²]}

= [6(11033135) − 10360(5887)] / √{[6(19639600) − (10360)²] × [6(6207997) − (5887)²]}

= 0.998

The correlation is positive and very strong.

5. Determine the coefficient of determination, and interpret


the answer.

r² × 100 = 99.67% (calculated from the unrounded value of r)

99.67% of the variation in airfare can be explained by the


linear regression function.

6. If you are going to travel 2 400 kilometres, what would you expect the airfare to be?

b1 = [n∑xy − ∑x∑y] / [n∑x² − (∑x)²]

= [6(11033135) − 10360(5887)] / [6(19639600) − (10360)²] = 0.4958

b0 = [∑y − b1∑x] / n = [5887 − 0.4958(10360)] / 6 = 125.09

∴ ŷ = 125.09 + 0.4958x, 985 ≤ x ≤ 2375


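For x = 2400, ŷ = 125.09 + 0.4958(2400) = 1315.01

Thus, it is estimated that a flight of 2400 kilometres will cost R1315.01. Note that x = 2400 lies slightly outside the sampled range (985 ≤ x ≤ 2375), so this estimate involves a small amount of extrapolation.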

4.2 Activity 2

The maximum daily temperature (in °C) and coffee sales (in
Rands) for a coffee shop for eight randomly selected days
were recorded:

°C 16 19.5 25.5 30 32.5 36 39 40.5
Sales (R) 262 248 197 200 133 139 114 112

Questions: Model Solutions:


1. Draw a scatter diagram of coffee sales and daily temperature. Comment on the results.

[Figure: scatter diagram of Coffee sales (y-axis, 0 to 300) against Temperature (x-axis, 0 to 60).]

The scatter plot shows a negative relationship between temperature and coffee sales - the higher the temperature, the lower the coffee sales.

2. Calculate the correlation coefficient, and interpret the answer.

r = -0.97
The correlation is strong and negative.

3. Determine the coefficient of determination, and interpret the answer.

r² × 100 = (-0.97)² × 100 = 94%, i.e. 94% of the variation in coffee sales can be explained by the linear regression function.

4. If the forecast temperature is 18°C, what amount of coffee in Rands would you expect to sell?

ŷ = 368.49 - 6.46x, 16 ≤ x ≤ 40.5
ŷ = 368.49 - 6.46(18) = R252.21

5. What amount of coffee in Rands would you expect to sell at the y-intercept point?

R368.49

6. Interpret the slope of the distribution of coffee sold.

For every 1°C increase in temperature, there is an average decrease of R6.46 in the amount of coffee sold.


4.3 Revision Exercise 1


1. [Figure: scatter diagram of Quantity sold (y-axis, 0 to 80) against Number of items (x-axis, 0 to 6).]

There is a positive relationship: the more items there are


in a bulk sale, the greater are the quantities sold.

2. r = 0.81
There is a strong, positive linear correlation between bulk
rate and sales.

3. r² × 100 = (0.81)² × 100 = 65%. Therefore, 65% of the variation in sales can be explained by the regression function.

4. 𝑦̂= 29.7 +7.9x


If there are zero items in the bulk rate, sales will be 29.7
items.

For every additional item in the bulk rate, sales increase


on average by 7.9 items.

5. ŷ= 29.7 +7.9(6) = 77.1 items

One cannot completely rely on this estimate, as the


value of x = 6 lies outside the range of x values used to
create the regression function, i.e. such an estimate
involves extrapolation.
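
The extrapolation warning in question 5 can be made explicit in code. The following Python sketch fits the regression line to the bulk-sale data and flags any estimate whose x-value falls outside the observed range; the function `predict` and its return format are illustrative assumptions, not part of the module material.

```python
# Bulk-sale data from Revision Exercise 1.
items = [1, 2, 3, 4, 5]
sold = [35, 50, 45, 75, 62]

n = len(items)
sum_x, sum_y = sum(items), sum(sold)
sum_xy = sum(x * y for x, y in zip(items, sold))
sum_x2 = sum(x ** 2 for x in items)

b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
b0 = (sum_y - b1 * sum_x) / n                                  # y-intercept


def predict(x):
    """Return (estimate, is_extrapolation) for a given number of items."""
    outside = x < min(items) or x > max(items)
    return b0 + b1 * x, outside


estimate, extrapolated = predict(6)
print(round(b0, 1), round(b1, 1))        # 29.7 7.9
print(round(estimate, 1), extrapolated)  # 77.1 True
```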

4.4 Revision Exercise 2


1. Independent variable: Weeks employed
Dependent variable: Cars checked

2. r = 0.89. This indicates a strong, positive linear correlation between number of weeks employed and number of cars checked. r² × 100 = (0.89)² × 100 = 79.21%. Therefore, 79.21% of the variation in the number of cars checked can be explained by the linear regression model. Both measures indicate that the number of weeks of employment is a good measure to use when estimating the number of cars checked.

3. If x = 6:

ŷ= 13.44 +0.76(6) =18 cars

4. For every additional week employed, a person can, on


average, check 0.76 more cars.

4.5 Revision Exercise 3


1. n = 13

2. Positive relationship.

3. One

4. Five


Learning Unit 5: Basic Probability


Material used for this learning unit:

Prescribed Textbook, Chapter Four. (Please note, all references and


exercises related to Excel can be done for enrichment purposes but will
not be assessed.)

1 Activities
Activity 1
Purpose:

The purpose of this activity is to use the special rule of addition


to calculate probabilities.

Task:

1. The probabilities are 0.05, 0.14, 0.17, 0.33, 0.20 and 0.11
that students will respectively rate a new sandwich filling in
the tuck shop as very poor, poor, fair, good, very good or
excellent.

Assuming that the ratings are mutually exclusive, what is


the probability that a new filling will be rated:

1.1 Very poor or poor.


1.2 Good, very good or excellent.

2. A fair dice is thrown once. Find the probability that the


outcome is:

2.1 Bigger than three;


2.2 An odd number.

3. A card is chosen at random from an ordinary pack. Find the


probability that it is:

3.1 A red card;


3.2 An honours card (A, K, Q, or J);
3.3 A face card (K, Q, J).


4. Andile keeps records of the lifetime, in days, of a particular


type of battery that she uses in her portable radio.

Lifetime of Battery Number of Batteries


Less than 5 days 6
5 and less than 10 days 12
10 and less than 20 days 34
More than 20 days 4

What is the probability that a battery selected at random


will last:

4.1 Less than 10 days?


4.2 Five or more days?

5. The probability that a car owner in a certain income bracket


will drive a Ford is 0.34, while the probability that he will
drive a Toyota is 0.08. Calculate the probability that such a
person will:

5.1 Not drive a Ford?


5.2 Drive a Ford or a Toyota?
5.3 Drive neither a Ford nor a Toyota?

Commentary Related to Activity Design:

You must make sure that you can identify why you must use
this rule: There is only one outcome, and the events are
mutually exclusive. When you have completed all of the
activities, you will be required to do revision exercises where
the rule to be used is not specified.

Activity 2
Purpose:

These activities are all based on the general rule of addition.

Task:

1. Suppose you have a 25% chance of getting a job offer


from company A; a 40% chance of getting a job offer
from company B; and a 15% chance of getting a job offer
from both companies. What is the probability that you will
get a job offer from either company? What is the
probability that you will get a job offer from one of the two
companies, but not from both?


2. Common sources of caffeine are coffee and tea.


Suppose that 55% of students drink coffee, 25% drink
tea, and 15% drink both coffee and tea. What is the
probability that a randomly selected student will drink tea
or coffee?

3. If you draw a card from a standard deck of 52 cards,


what is the probability that the card will be a:

3.1 Seven or a black?


3.2 Face or a red?
3.3 Ace or a Heart?

4. In a certain lottery, the probability of drawing a number


divisible by two is ½, divisible by three is 1/3 and divisible
by six is 1/6. What is the probability of drawing a number
that is divisible by either two or three?

5. In a large city, two free newspapers providing


community-based news are distributed - the News Time,
and the Get It. The circulation departments report that
52% of households receive News Time, while 35%
receive Get It. 16% of all households receive both
newspapers. What proportion of households received at
least one of the newspapers? Note: At least one means
one or two or both newspapers.

Commentary Related to Activity Design:

You must make sure that you can identify when you must use
this rule: There is only one outcome, and the events are not
mutually exclusive. When you have completed all of the
activities, you will be required to do revision exercises where
the rule to be used is not specified.
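
The difference between the special rule (Activity 1) and the general rule (Activity 2) of addition can be sketched in a few lines of Python. The two examples are hypothetical and are not taken from the tasks above; the function names are illustrative.

```python
from fractions import Fraction


def p_or_exclusive(*probabilities):
    """Special rule of addition: P(A or B or ...) for mutually exclusive events."""
    return sum(probabilities)


def p_or_general(p_a, p_b, p_a_and_b):
    """General rule of addition: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


# Hypothetical example 1: rolling a 1 or a 2 with a fair dice (mutually exclusive).
print(p_or_exclusive(Fraction(1, 6), Fraction(1, 6)))  # 1/3

# Hypothetical example 2: drawing a king or a spade from a standard deck.
# The king of spades belongs to both events, so it is subtracted once.
print(p_or_general(Fraction(4, 52), Fraction(13, 52), Fraction(1, 52)))  # 4/13
```

Using `Fraction` keeps the answers exact, which makes them easy to compare with hand calculations.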

Activity 3
Purpose:

These activities are all based on the special rule of


multiplication.

Task:

1. If you roll a dice twice, what is the probability that you will
obtain a six both times?


2. Peter carries car insurance on both his car and his wife’s
car. During any year, the probability is 0.01 of an
insurance claim on his car, with a probability of 0.06 of
there being a claim on his wife’s car. During one year,
what is the probability that both Peter and his wife will
have insurance claims on their cars?

3. The probability of any athlete finishing a 1 000m race is


0.80. Two athletes from your team participate in this
race. What is the probability that:

3.1 Both will finish?


3.2 Neither of the two will finish?

4. A fair coin is tossed three times. What is the probability


that the sequence of results will be heads, tails, and
heads?

5. According to a study conducted by an optometrist, half of


the clients who need vision correction are patients who
require bifocal lenses. For a randomly selected group of
three people who require vision correction, what is the
probability that all three will require bifocals?

Commentary Related to Activity Design:

You must make sure that you can identify when you must use
this rule: There are two or more outcomes and the events are
independent. When you have completed all of the activities,
you will be required to do revision exercises where the rule to
be used is not specified.

Note:

In the addition rule, P(A and B) denotes the probability that A


and B both occur in the same observation. In the multiplication
rule, P(A and B) can also denote the probability that event A
occurs on one trial followed by event B on another trial.


Activity 4
Purpose:

These activities are all based on the general rule of


multiplication:

P(A and B) = P(A) × P(B|A)


Task:

1. In a retirement facility, 65% of the residents are smokers.


Research has indicated that 15% of the smokers have
some form of lung cancer. What is the probability of a
resident having cancer, given that the resident is a
smoker?

2. A clothing manufacturer places high standards on


producing quality garments. Their quality control process
has a record of accepting 3% of all defective garments.
An analysis of their records also shows that only 2% of
good garments tend to be rejected. In the production of
all garments, 10% tend to be defective (or reject) items.
For the next garment inspected by a quality controller,
what is the probability that if a garment is defective, it is
accepted? What is the probability that the garment is
rejected if it is non-defective? What is the probability of
selecting a non-defective garment?

3. In a large city, two free newspapers with community


news are distributed, the News Time and the Get It. The
circulation departments report that 52% of the
households have received News Time and 35% Get It.
16% of all households received both. What is the
probability that if you receive News Time, you will receive
Get It this month?

4. A survey amongst miners revealed that 22% are heavy


smokers and 32% are heavy drinkers. 18 per cent of the
miners also admitted to being both heavy smokers and
heavy drinkers. Based on these results, what is the
probability that if you select a heavy smoker, the person
will be a heavy drinker?


Commentary Related to Activity Design:

You must make sure that you can identify when you must use
this rule: There are two or more outcomes, and the events are
dependent. When you have completed all of the activities, you
will be required to do revision exercises where the rule to be
used is not specified.

Activity 5
Purpose:

These activities are based on counting rules, and how they can
be used to calculate probabilities.

Task:

1. In a Sushi restaurant, you can choose between five


items in column 1 of the menu, six items in column two,
and four items in column three. How many possible
selections can you make if you choose one item from
each column?

2. A combination for a lock consists of three numbers from


one to 30. What is the probability that you could correctly
guess the combination?

3. In deciding how to spend her time on holiday, Jovi can


select from a list of four different city tours, five different
movies and five different restaurants.

3.1 How many different options are available to her if


she elects to take one tour, watch one movie and
eat at one restaurant?
3.2 Assuming that she wants to see any three of the
five movies, but she does not want to see any
movie twice, and she does not care in which order
she sees them, then how many options are
available to her?
3.3 Assuming that she only has enough money to go
on two tours, and that the sequence in which she
does the tours is important, how many different
options are available to her?


Commentary Related to Activity Design:

You must be sure that you can identify when you must use
these rules. The only way to do this is to study the relevant
section.


2 Revision Exercises
Revision Exercise 1
1. Which of the following numbers cannot be a probability?

1.1 -0.001;
1.2 0.4;
1.3 5/4;
1.4 0;
1.5 1.

2. Suppose that out of 200 students, 60 have Statistics as


subject, 40 have Accounting, and 25 have both subjects:

2.1 How many students have only Statistics as a


subject? Calculate the probability of having only
Statistics as a subject;
2.2 How many have only Accounting? What is the
probability of a student taking only Accounting?
2.3 How many have Accounting or Statistics or both?
Calculate the probability of having Accounting or
Statistics or both;
2.4 How many have neither Statistics nor Accounting
as subjects? What is the probability of having
neither of the two subjects?

3. Josh prepares and submits tax returns for individuals.


Over the years, he has found that SARS selects 12% of
the tax returns that he prepares and submits for an audit.
On one particular day, he signed on two new clients.
What is the probability that SARS will audit at least one
of these clients?

4. A study on child grants in Gauteng has established that


80% of families have children. 60% of families have
children under 15 years of age, while 30% of the families
have children 15 years and older. What is the probability
of a randomly selected family having children both under
15 years of age and 15 years and older?


5. A researcher asked 90 customers of a chain store about
their purchases during the past month. Twenty said they
purchased books, 45 said that they purchased
stationery, while 15 said that they purchased both books and
stationery. What is the probability that a customer
purchased nothing?

6. Twelve men and 14 women have applied for the same


office job. You are interviewing the applicants. If you
select two of the applicants at random for an interview:

6.1 What is the probability that both of them will be


women?
6.2 What is the probability that one will be a male and
the other a female?

7. At the end of a training program, learners have to pass


an exam to obtain their learner’s licence. The probability
of passing the exam at the first attempt is 0.75. Those
who fail may write again. The probability of passing the
second attempt is 0.6. No further attempts are allowed.
Draw a tree diagram to show all possible outcomes.
What is the probability that a learner will fail to obtain a
licence?

8. There were 500 spectators present at a soccer game in


Soweto. 175 fans support Kaizer Chiefs, and 25 support
Orlando Pirates. Find the probability of selecting a
spectator that does not support either of the two teams.

9. The blood types of a group of 300 people are recorded


as follows:

• Type A: 70;
• Type B: 90;
• Type O: 105;
• Type AB: 35.

If a person from this group is selected at random, what is


the probability that this person has either type O or type
A blood?


10. There are three different colours of Easter eggs in a box.


Of the 20 eggs in the box, one fifth of the eggs are
yellow, 50% are blue and six are red. Half of each colour
is striped. If you choose an egg that is yellow or striped,
you win. What are your chances of winning?

11. There are 23 clerks amongst the 32 employees of a local


bank. Fifteen of the 23 clerks have typing skills and 13
have filing skills. The telephone rings and an employee
answers it.

11.1 What is the probability that it will be a clerk with


typing skills?
11.2 What is the probability that it will not be a clerk?

12. A letter has an express delivery stamp on it. The


probability of delivery the next working day is 0.86. What
is the probability that the letter is undelivered the next
working day?

13. Emma has a box of marbles. The marbles are red, green
and blue. The probability that she picks a green marble
is 0.6 and the probability that she picks a red marble is
0.25. What is the probability that she picks a blue
marble? How many of the 20 marbles in the box are red
ones?

Revision Exercise 2
1. In a sample of 250 students travelling to campus, 190
use taxis. Of the taxi commuters, 130 travel more than
15km. Of those who do not use taxis, 40 travel less than
15km. Compile a contingency table and answer the
following questions.

1.1 What is the probability that a student from the


sample travels more than 15km to campus?
1.2 What is the probability that a student travelling by
taxi travels less than 15km?
1.3 What is the probability that a student does not
travel by taxi or travels more than 15km?


2. A survey is done of customers at a coffee shop to


determine whether there is any difference in the type of
drinks ordered based on the gender of the customers.

Coffee Tea Cold drink Total


Male 80 76 34 190
Female 55 39 66 160
Total 135 115 100 350

2.1 What is the probability that a customer in the


survey is not male?
2.2 What is the probability that a customer orders
coffee or is female?
2.3 What is the probability that a male orders a cold
drink?


3 Solutions to Activities and Revision


Exercises
Activity 1
Task:

1. (You cannot rate a filling as poor and excellent at the same


time, therefore we can assume the events are mutually
exclusive.)

1.1 P(very poor or poor) = 0.05 + 0.14 = 0.19.


1.2 P(good or very good or excellent) = 0.33 + 0.20 +
0.11 = 0.64.
2.1 P(>3) = 3/6 = 1/2

2.2 P(1 or 3 or 5) = 3/6 = 1/2

3.1 P(red) = 26/52 = 1/2

3.2 P(honours) = 16/52 = 4/13

3.3 P(a face card) = 12/52 = 3/13

4.1 P(<10) = 12/56 + 6/56 = 18/56 = 9/28

4.2 P( 5) = 12/56 + 34/56 + 4/56 = 50/56 = 25/28

5.1 P(not Ford) = 1 – P(Ford)
= 1 – 0.34
= 0.66

5.2 P(F or T) = P(F) + P(T)
= 0.34 + 0.08
= 0.42

5.3 P(neither F nor T) = 1 – 0.42
= 0.58


Activity 2
1. P(A or B) = 0.25 + 0.40 - 0.15
= 0.50

P(A or B) but not both = 0.25 + 0.40


= 0.65

2 P(C or T) = 0.55 + 0.25 – 0.15


= 0.65

3.1 P(7 or Black) = 4/52 + 26/52 – 2/52 = 28/52 = 7/13

3.2 P(Face or red) = 12/52 + 26/52 – 6/52 = 32/52 = 8/13

3.3 P(Ace or Heart) = 4/52 + 13/52 – 1/52 = 16/52 = 4/13

4. P(2 or 3) = 1/2 + 1/3 – 1/6 = 4/6 = 2/3

5. P(N or G) = 0.52 + 0.35 – 0.16


= 0.71

Activity 3
1. P(6 and 6) = 1/6 × 1/6 = 1/36

2. P(P and W) = 0.01 × 0.06
= 0.0006

3.1 P(F and F) = 0.8 × 0.8
= 0.64

3.2 P(F̅ and F̅) = 0.2 × 0.2 = 0.04

4. P(H and T and H) = 1/2 × 1/2 × 1/2 = 1/8

5. P(B and B and B) = 1/2 × 1/2 × 1/2 = 1/8


Activity 4
Task:

1. P(LC/S) = P(LC and S) / P(S) = 0.15 / 0.65 = 0.23

2. P(A/D) = P(A and D) / P(D) = 0.03 / 0.10 = 0.30

3. P(GI/NT) = P(GI and NT) / P(NT) = 0.16 / 0.52 = 0.31

4. P(HD/HS) = P(HD and HS) / P(HS) = 0.18 / 0.22 = 0.82
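
Each of the Activity 4 answers rearranges the general rule of multiplication into P(B/A) = P(A and B) / P(A). The following is a minimal Python sketch of that arithmetic; the function name `conditional` is purely illustrative and is not part of the module material.

```python
def conditional(p_joint: float, p_condition: float) -> float:
    """Return P(B/A) = P(A and B) / P(A), the rearranged multiplication rule."""
    if not 0 < p_condition <= 1:
        raise ValueError("P(A) must be greater than 0 and at most 1")
    return p_joint / p_condition


# Question 3: P(GI and NT) = 0.16 and P(NT) = 0.52
p_gi_given_nt = conditional(0.16, 0.52)

# Question 4: P(HD and HS) = 0.18 and P(HS) = 0.22
p_hd_given_hs = conditional(0.18, 0.22)

print(round(p_gi_given_nt, 2), round(p_hd_given_hs, 2))
```

Rounded to two decimals, the results agree with the answers 0.31 and 0.82 given for questions 3 and 4.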

Activity 5
Task:

1. 5 × 6 × 4 = 120

2. Total number of possible combinations = 30 × 30 × 30
= 27 000

P(Guess combination) = 1/27 000

3.1 4 × 5 × 5 = 100

3.2 5C3 = 10

3.3 4P2 = 12
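
The three counting rules used in Activity 5 (multiplication rule, combinations and permutations) can be checked with Python's standard `math` module. This is only a sketch: the variable names are illustrative, and `math.comb` and `math.perm` assume Python 3.8 or later.

```python
import math

# Multiplication rule: one item from each of the three menu columns
sushi_selections = 5 * 6 * 4

# Lock: three numbers, each from 1 to 30, repeats allowed (as the solution assumes)
lock_combinations = 30 ** 3
p_guess = 1 / lock_combinations

# Combinations: any three of five movies, order does not matter
movie_options = math.comb(5, 3)

# Permutations: two of four tours, order matters
tour_options = math.perm(4, 2)

print(sushi_selections, lock_combinations, movie_options, tour_options)
```

The printed values are 120, 27000, 10 and 12, matching the answers above.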


Revision Exercise 1
1. -0.001 and 5/4

2.1 60 – 25 = 35 only Statistics
P(only S) = 35/200 = 7/40

2.2 40 – 25 = 15
P(A only) = 15/200 = 3/40

2.3 A or S = 60 + 40 – 25 = 75
P(A or S) = 75/200 = 0.375

2.4 200 – 75 = 125
P(neither A nor S) = 125/200 = 5/8

3. P(A and B Audited) = 0.12 × 0.12 = 0.0144

P(A or B or Both Audited)


= P(A Audited) + P(B Audited) - P(A and B Audited)
= 0.12 + 0.12 – 0.0144 = 0.2256

4. P(Children Under 15 or Children 15 and Older)


= P(Children Under 15) + P(Children 15 and Older) -
P(Children Under 15 and Children 15 and Older)

Therefore:

P(Children Under 15 and Children 15 and Older) =


[P(Children Under 15) + P(Children 15 and Older)] -
P(Children Under 15 or Children 15 and Older)

= (0.60 + 0.30) - 0.80


= 0.10

5. P(B or S) = P(B) + P(S) – P(B and S)
= 20/90 + 45/90 – 15/90 = 50/90

P(neither B nor S) = 1 – 50/90 = 40/90 = 0.4444

6.1 P(W and W) = 14/26 × 13/25 = 182/650

6.2 P(M and F) or P(F and M) = (12/26 × 14/25) + (14/26 × 12/25) = 168/325


7.

Begin
  P (0.75): P = 0.75
  F (0.25)
    P (0.6): FP = 0.25 × 0.6 = 0.15
    F (0.4): FF = 0.25 × 0.4 = 0.1

Therefore, P(F and F) = 0.1

8. 500 – 175 – 25 = 300
P(Not support either team) = 300/500

9. Events are mutually exclusive.

P(O or A) = 105/300 + 70/300 = 175/300 = 7/12

10. Yellow = 0.2


Blue = 0.5
Red = 0.3
Striped = 0.5

P(Y or S) = P(Y) + P(S) – P(Y and S)


= 0.2 + 0.5 – 0.1
= 0.6

11.1 P(T/C) = P(T and C) / P(C) ∴ P(T and C) = P(T/C) × P(C)
= 15/23 × 23/32 = 15/32

11.2 P(not C) = 1 – 23/32 = 9/32

12. P(not delivered) = 1 – 0.86


= 0.14

13. There are 0.25 × 20 = 5 red marbles

P(Blue) = 1 – (0.6 + 0.25) = 1 – 0.85
= 0.15


Revision Exercise 2
1.

           Taxis   Other   Total
< 15 km      60      40     100
> 15 km     130      20     150
Total       190      60     250

1.1 P(>15) = 130/250 + 20/250 = 150/250 = 3/5

1.2 P(T and <15) = 60/250 = 6/25

1.3 P(T̅ or >15) = 60/250 + 150/250 – 20/250 = 190/250 = 19/25

2.

          Coffee   Tea   Cold drink   Total
Men         80      76       34        190
Women       55      39       66        160
Total      135     115      100        350

2.1 P(M̅) = 160/350 = 16/35

2.2 P(C or F) = 135/350 + 160/350 – 55/350 = 240/350 = 24/35

2.3 P(M and Cd) = 34/350 = 17/175
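
The coffee-shop answers can be reproduced directly from the contingency table. The sketch below stores the table as a nested dictionary (an illustrative layout, not something the module prescribes) and uses `fractions.Fraction` so that results appear as simplified fractions.

```python
from fractions import Fraction

# Counts from the coffee-shop survey: rows are gender, columns are drink type
table = {
    "Male": {"Coffee": 80, "Tea": 76, "Cold drink": 34},
    "Female": {"Coffee": 55, "Tea": 39, "Cold drink": 66},
}

total = sum(sum(row.values()) for row in table.values())
female_total = sum(table["Female"].values())
coffee_total = sum(row["Coffee"] for row in table.values())

# 2.1 Complement: not male means female
p_not_male = Fraction(female_total, total)

# 2.2 Addition rule: P(C) + P(F) - P(C and F)
p_coffee_or_female = Fraction(
    coffee_total + female_total - table["Female"]["Coffee"], total
)

# 2.3 Joint probability: male and cold drink
p_male_and_cold = Fraction(table["Male"]["Cold drink"], total)

print(p_not_male, p_coffee_or_female, p_male_and_cold)
```

The three fractions printed are 16/35, 24/35 and 17/175.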


Learning Unit 6: Probability Distributions


Material used for this learning unit:

• Prescribed Textbook, Chapter Five. (Please note, all references and


exercises related to Excel can be done for enrichment purposes but will
not be assessed.)

1 Activities
1.1. Activity 1
Purpose:

Use the binomial distribution formula to calculate probabilities.


All possible variations in the questioning are dealt with.

Task:

A low-cost airline has a problem with the number of its


reservations that are “no-shows”. Although every seat of its 10-
seater aeroplane is always reserved in advance, the records
show that 25% of passengers with reservations do not show
up (i.e. their seats remain empty).

For a randomly selected flight, what is the probability that:

1. Three seats will remain empty. P(x = 3)

2. Two seats will remain empty. P(x = 2)

3. Either two or three seats will remain empty. P(x = 3) +
P(x = 2)

4. At least two seats will remain empty.
P(x ≥ 2) = P(x = 2, 3, 4, 5, 6, 7, 8, 9, 10)

[Hint: The short-cut rule can be used here: P(x ≥ 2) = 1 –
P(x ≤ 1)]

5. Less than three seats will remain empty. P(x < 3)

6. Exactly five passengers will show up for the flight. P(x =


5). Note that the success outcome has now changed to
“will show”, i.e. p = 0.75

Commentary Related to Activity Design:

This activity will assist you in recognising and calculating


binomial probability questions.


1.2 Activity 2
Purpose:

Use of the Poisson distribution formula to calculate


probabilities. All possible variations in questioning will be dealt
with.

Task:

A call centre receives an average of three calls per minute.


Calculate the probability that the centre will receive:

1. Exactly four calls in a given minute.

2. More than one call in a given minute.

3. Two calls in two minutes (Note the change in the time


unit. The average must be adjusted accordingly).

4. Less than three calls in 20 seconds.

5. Two calls in 30 seconds.

Commentary Related to Activity Design:

This activity prepares you to recognise and calculate


probabilities following the Poisson method.

1.3 Activity 3
Purpose:

Use of the normal distribution formula to calculate probabilities.


All possible variations in question type will be dealt with. Use a
sketch to indicate the required probability each time.


Task:

Weekly sales of Denny’s soup cans at the local grocery store


are approximately normally distributed with a mean of 2 450
cans and a standard deviation of 400 cans. The store manager
wants to find the following probabilities:

1. During a randomly selected week, what is the probability


that the sales will be between 2 000 and 3 000 cans?

2. During a randomly selected week, what is the probability


that the sales will be more than 3 000 cans?

3. During a randomly selected week, what is the probability


that the sales will be more than 2 000 cans?

4. During a randomly selected week, what is the probability


that the sales will be less than 2 000 cans?

5. What will be the highest sales in the 10% of weeks with


the lowest sales?

6. What will be the lowest sales in the 15% of weeks with


the highest sales?

Commentary Related to Activity Design:

This activity prepares you to recognise and calculate a


probability following the normal distribution. The last two
questions require using the normal distribution from the inside
to the outside. The unknown is not the probability, but the x-
variable.


2. Revision Exercises
2.2 Revision Exercise 1
1. Let the random variable x represent the number of
bicycles sold per day by Jones Bicycle Shop. The
probability distribution for this activity is shown below:

Answer the following four questions:

x 0 1 2 3 4
P(x) 0.2 0.1 0.3 ?? 0.1

If the above table covers all possible outcomes:

1.1 Find the probability that the shop sells three


bicycles in a given day.
1.2 Find the probability that the shop sells one or two
bicycles on a given day.
1.3 What is the probability that the shop sells at least
one bicycle on a given day?

2. A Cape Town courier service promises that 80% of


Johannesburg-bound parcel deliveries will reach their
destinations within 12 hours. What is the probability that,
of seven parcels sent at random times by a particular
client in Cape Town:

2.1 Only one is delivered late?


2.2 Only one is not delivered late?

3. Privacy is a concern for many users of the Internet. One


survey showed that 71% of Internet users are concerned
about the confidentiality of their email. Based on this
information, what is the probability that for a random
sample of 12 Internet users, four are concerned about
the privacy of their emails?

4. According to an article appearing in a Sunday


newspaper, about 70% of all South African households
have a cellular phone. Suppose you are conducting a
survey on customer satisfaction regarding cellular
phones. If you called seven households selected at
random, calculate the following probabilities:


4.1 The probability that more than zero households


have a cellular phone;
4.2 The probability that at most one household has a
cellular phone;
4.3 The probability that all seven households have
cellular phones.

2.3 Revision Exercise 2


1. The surface of a car is approximately 15m². Records at a
certain factory show that there is an average of 0.2 paint
blemishes per square meter of painted surface. What is
the probability that a new car will have less than three
paint blemishes in total?

2. In a book, two misprints occur on average per 100


pages. Determine the following probabilities for a book of
400 pages:

2.1 The probability of finding five or six misprints;


2.2 The probability of finding at least one misprint.

3. The average number of traffic accidents per seven-day


week on a certain section of highway is equal to 14.
Based on this, what is the probability that there will be
four accidents on the Monday of a specific week?

4. Jim is a real estate agent who sells large commercial


buildings. Because his commission is so large on a
single sale, he does not need to sell many buildings to
make a good living. History shows that Jim has a record
of selling an average of six large commercial buildings in
180 days. In a 60-day period, what is the probability that
Jim will make no sales?

2.4 Revision Exercise 3


1. A regional radio station recently conducted a survey
amongst a large sample of car commuters, to find out
how long they listen to their car radios whilst driving to
work in the mornings. This information is used to attract
product sponsors to the programs. Assume that
“listening time” found from the survey is normally
distributed with a mean of 20 minutes, and a standard
deviation of four minutes.


1.1 What is the probability that a randomly selected


commuter will listen to the radio for between 14
and 20 minutes whilst travelling to work in the
mornings?
1.2 What is the probability that a randomly selected
commuter will listen to the radio for more than 25
minutes whilst travelling to work in the mornings?
1.3 What is the minimum listening time of the 15% of
commuters that spend the longest time listening to
the radio in the mornings?

2. Trucks, which arrive at the Johannesburg Fresh Produce


Market, carry a mean weight of 3.2 tons with a standard
deviation of 0.4 tons. What is:

2.1 The probability that, if the weights follow a normal


distribution, the next arriving truck is loaded with
3.8 or more tons of produce?
2.2 The probability that, if the weights follow a normal
distribution, the next arriving truck is loaded with
between three and four tons of produce?
2.3 The probability that, if the weights follow a normal
distribution, the next arriving truck is loaded with
between 3.8 and four tons of produce?
2.4 The weight below which the lightest 9% of all
trucks are loaded?

3. The waiters in a restaurant receive an average tip of R20


with a standard deviation of R5 per table. The tips are
normally distributed, and a waiter feels that he provided
excellent service if the tip is more than R25. Calculate
the probability that:

3.1 A waiter provides excellent service to a table;


3.2 A waiter does not provide excellent service to a
table;
3.3 A waiter receives a tip of between R10 and R12
from a table;
3.4 What is the highest value of the lowest 18% of
tips?

4. Assume that the time it takes to process each customer


through a fast service queue (10 items or less) in a
supermarket is normally distributed, with a mean of 70
seconds and a standard deviation of 15 seconds. The
supermarket promises that if customers are not
processed through the fast service queue in under 90
seconds, they will receive a free gift. From a random


sample of 240 customers processed through the fast


service queue in a given day, how many free gifts did the
store have to provide?


3. Solutions to Activities and Revision Exercises


3.2 Activity 1

1. P(x = 3) = 10C3 × 0.25^3 × 0.75^(10 – 3)
= 0.2503

2. P(x = 2) = 10C2 × 0.25^2 × 0.75^(10 – 2)
= 0.2816

3. P(x = 3) + P(x = 2) = 0.2503 + 0.2816
= 0.5319

4. P(x ≥ 2) = 1 – P(x ≤ 1) = 1 – (0.0563 + 0.1877) = 0.7560

P(x = 0) = 10C0 × 0.25^0 × 0.75^(10 – 0)
= 0.0563

P(x = 1) = 10C1 × 0.25^1 × 0.75^(10 – 1)
= 0.1877

5. P(x < 3) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.0563 + 0.1877 + 0.2816
= 0.5256

6. P(x = 5) = 10C5 × 0.75^5 × 0.25^(10 – 5)
= 0.0584
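
The binomial formula used throughout this activity, P(x) = nCx × p^x × (1 – p)^(n – x), is short enough to check by computer. The sketch below is illustrative only; `binom_pmf` is a hypothetical helper name, and Excel or any other software is not required for assessment.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n = 10          # seats on the aeroplane
p_empty = 0.25  # success outcome: a seat remains empty

p_three_empty = binom_pmf(3, n, p_empty)

# Short-cut rule: P(x >= 2) = 1 - P(x <= 1)
p_at_least_two = 1 - (binom_pmf(0, n, p_empty) + binom_pmf(1, n, p_empty))

# Question 6: the success outcome changes to "will show", so p = 0.75
p_five_show = binom_pmf(5, n, 0.75)

print(round(p_three_empty, 4), round(p_at_least_two, 4), round(p_five_show, 4))
```

Rounded to four decimals, these agree with the answers to questions 1, 4 and 6 (0.2503, 0.7560 and 0.0584).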

3.3 Activity 2
1. P(x = 4) = (e^(–3) × 3^4) / 4! = 0.1680

2. P(x > 1) = 1 – P(x ≤ 1) = 1 – (0.0498 + 0.1494) = 0.8008

P(x = 0) = (3^0 × e^(–3)) / 0!
= 0.0498

P(x = 1) = (3^1 × e^(–3)) / 1!
= 0.1494

3. With λ = 3 × 2 = 6 calls per two minutes:
P(x = 2) = (e^(–6) × 6^2) / 2! = 0.0446

4. With λ = 3 × 20/60 = 1 call per 20 seconds:
P(x < 3) = P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.3679 + 0.3679 + 0.1839
= 0.9197


P(x = 0) = (1^0 × e^(–1)) / 0!
= 0.3679

P(x = 1) = (1^1 × e^(–1)) / 1!
= 0.3679

P(x = 2) = (1^2 × e^(–1)) / 2!
= 0.1839

5. With λ = 3 × 30/60 = 1.5 calls per 30 seconds:
P(x = 2) = (e^(–1.5) × 1.5^2) / 2!
= 0.2510
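
The key step in these Poisson questions is rescaling the average to the time unit in the question before applying P(x) = (e^(–λ) × λ^x) / x!. A minimal sketch, assuming a hypothetical helper `poisson_pmf` and the call-centre rate of three calls per minute:

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of exactly x occurrences when the average is lam."""
    return exp(-lam) * lam**x / factorial(x)


calls_per_minute = 3

# Question 1: four calls in one minute (average stays at 3)
p_four_in_minute = poisson_pmf(4, calls_per_minute)

# Question 3: two calls in two minutes (average becomes 3 x 2 = 6)
p_two_in_two_minutes = poisson_pmf(2, calls_per_minute * 2)

# Question 4: fewer than three calls in 20 seconds (average becomes 3 x 20/60 = 1)
lam_20s = calls_per_minute * 20 / 60
p_less_than_three = sum(poisson_pmf(x, lam_20s) for x in range(3))

# Question 5: two calls in 30 seconds (average becomes 3 x 30/60 = 1.5)
p_two_in_30s = poisson_pmf(2, calls_per_minute * 30 / 60)

print(round(p_four_in_minute, 4), round(p_two_in_two_minutes, 4),
      round(p_less_than_three, 4), round(p_two_in_30s, 4))
```

Rounded to four decimals, the results match the answers 0.1680, 0.0446, 0.9197 and 0.2510.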

3.4 Activity 3
1. z = (2000 – 2450) / 400 = –1.13

z = (3000 – 2450) / 400 = 1.38

P(–1.13 < z < 1.38) = P(–1.13 < z < 0) + P(0 < z < 1.38)
= 0.3708 + 0.4162
= 0.7870

2. z = (3000 – 2450) / 400 = 1.38

P(z > 1.38) = 0.5 – 0.4162
= 0.0838

3. z = (2000 – 2450) / 400 = –1.13

P(z > –1.13) = 0.3708 + 0.5
= 0.8708

4. z = (2000 – 2450) / 400 = –1.13

P(z < –1.13) = 0.5 – P(0 < z < 1.13)
= 0.5 – 0.3708
= 0.1292

5. The 10% of weeks with the lowest sales will have a z
value of –1.28 or less. Therefore:

x = zσ + μ = (–1.28 × 400) + 2450 = 1938 cans


6. The 15% of weeks with the highest sales will have a z
value of 1.04 or more. Therefore:

x = zσ + μ = (1.04 × 400) + 2450 = 2866 cans
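
The same normal-distribution questions can be checked without z-tables using `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is a sketch for verification only; because the software does not round z to two decimals, its results differ slightly from the table-based answers above.

```python
from statistics import NormalDist

# Weekly sales of soup cans: mean 2 450, standard deviation 400
sales = NormalDist(mu=2450, sigma=400)

# Question 1: P(2000 < x < 3000), i.e. area between the two sales values
p_between = sales.cdf(3000) - sales.cdf(2000)

# Question 2: P(x > 3000), the upper tail
p_above_3000 = 1 - sales.cdf(3000)

# Question 5: working from the inside out, the x-value that cuts off the lowest 10%
x_lowest_10 = sales.inv_cdf(0.10)

# Question 6: the x-value that cuts off the highest 15% (85% of the area lies below it)
x_highest_15 = sales.inv_cdf(0.85)

print(round(p_between, 3), round(p_above_3000, 3))
print(round(x_lowest_10), round(x_highest_15))
```

Questions 5 and 6 use `inv_cdf` because the unknown is the x-value rather than the probability; the cut-offs come out within a can or two of the 1 938 and 2 866 cans obtained from the tables.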

3.5 Revision Exercise 1


1.1 1 – 0.70 = 0.30

1.2 P(x = 1) + P(x = 2) = 0.1 + 0.3 = 0.4

1.3 P(x ≥ 1) = 1 – P(x = 0)
= 1 – 0.2
= 0.8

2.1 P(x = 1) = 7C1 × 0.20^1 × 0.80^(7 – 1) = 0.3670

2.2 P(x = 1) = 7C1 × 0.80^1 × 0.20^(7 – 1) = 0.0004

3. P(x = 4) = 12C4 × 0.71^4 × 0.29^(12 – 4) = 0.0063

4.1 P(x > 0) = 1 – P(x = 0)
= 1 – 0.0002
= 0.9998

P(x = 0) = 7C0 × 0.70^0 × 0.30^(7 – 0) = 0.0002

4.2 P(x ≤ 1) = P(x = 0) + P(x = 1)
= 0.0002 + 0.0036
= 0.0038

P(x = 1) = 7C1 × 0.70^1 × 0.30^(7 – 1) = 0.0036

4.3 P(x = 7) = 7C7 × 0.70^7 × 0.30^(7 – 7) = 0.0824

3.6 Revision Exercise 2


1. With λ = 0.2 × 15 = 3 blemishes per car:

P(x < 3) = P(x = 0) + P(x = 1) + P(x = 2)
= 0.0498 + 0.1494 + 0.2240
= 0.4232

P(x = 0) = (3^0 × e^(–3)) / 0!
= 0.0498


P(x = 1) = (3^1 × e^(–3)) / 1!
= 0.1494

P(x = 2) = (3^2 × e^(–3)) / 2!
= 0.2240

2.1 With λ = 2 × 4 = 8 misprints per 400 pages:

P(5 ≤ x ≤ 6) = P(x = 5) + P(x = 6)
= 0.0916 + 0.1221
= 0.2137

P(x = 5) = (8^5 × e^(–8)) / 5!
= 0.0916

P(x = 6) = (8^6 × e^(–8)) / 6!
= 0.1221

2.2 P(x ≥ 1) = 1 – P(x = 0)
= 1 – 0.0003
= 0.9997

P(x = 0) = (8^0 × e^(–8)) / 0!
= 0.0003

3. λ = 14 / 7 = 2 per day

P(x = 4) = (2^4 × e^(–2)) / 4!
= 0.0902

4. λ = 6 × 60/180 = 2 per 60 days

P(x = 0) = (2^0 × e^(–2)) / 0!
= 0.1353


3.7 Revision Exercise 3


1.1 P(14 < x < 20) = P(–1.50 < z < 0)
= P(0 < z < 1.50)
= 0.4332

z = (14 – 20) / 4 = –1.50

1.2 P(x > 25) = P(z > 1.25)
= 0.5 – P(0 < z < 1.25)
= 0.5 – 0.3944
= 0.1056

z = (25 – 20) / 4 = 1.25

1.3 z = 1.04
x = zσ + μ = (1.04 × 4) + 20 = 24.16 minutes

2.1 P(x > 3.8) = P(z > 1.50)
= 0.5 – 0.4332
= 0.0668

z = (3.8 – 3.2) / 0.4 = 1.50

2.2 P(3 < x < 4) = P(–0.50 < z < 2.00)
= P(0 < z < 0.50) + P(0 < z < 2.00)
= 0.1915 + 0.4772
= 0.6687

z = (3 – 3.2) / 0.4 = –0.50

z = (4 – 3.2) / 0.4 = 2.00

2.3 P(3.8 < x < 4) = P(1.50 < z < 2.00)
= P(0 < z < 2.00) – P(0 < z < 1.50)
= 0.4772 – 0.4332
= 0.0440

2.4 z = –1.34
∴ x = zσ + μ = (–1.34 × 0.4) + 3.2 = 2.664 tons

3.1 P(x > 25) = P(z > 1.00)
= 0.5 – 0.3413
= 0.1587


z = (25 – 20) / 5 = 1.00

3.2 P(x < 25) = 1 – 0.1587
= 0.8413

3.3 P(10 < x < 12) = P(–2.00 < z < –1.60)
= P(1.60 < z < 2.00)
= P(0 < z < 2.00) – P(0 < z < 1.60)
= 0.4772 – 0.4452
= 0.0320

z = (10 – 20) / 5 = –2.00

z = (12 – 20) / 5 = –1.60

3.4 z = –0.92
∴ x = zσ + μ = (–0.92 × 5) + 20 = R15.40

4. P(x > 90) = P(z > 1.33)
= 0.5 – P(0 < z < 1.33)
= 0.5 – 0.4082
= 0.0918

Thus 0.0918 or 9.18% of customers will get a free gift.

9.18% of 240 = 22.032. Thus approximately 22 free gifts
will be given out.

z = (90 – 70) / 15 = 1.33
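
Question 4 combines a normal probability with an expected count: the proportion of customers who wait longer than 90 seconds, multiplied by the sample size. A brief sketch using `statistics.NormalDist` (the variable names are illustrative, and the unrounded z gives a marginally different probability from the table value of 0.0918):

```python
from statistics import NormalDist

# Processing time in the fast service queue: mean 70 s, standard deviation 15 s
service_time = NormalDist(mu=70, sigma=15)

# Probability that a customer is NOT processed in under 90 seconds
p_gift = 1 - service_time.cdf(90)

# Expected number of free gifts among 240 customers
expected_gifts = 240 * p_gift

print(round(p_gift, 3), round(expected_gifts))
```

The expected count still rounds to approximately 22 free gifts.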


Learning Unit 7: Introduction to Sampling


Distributions
Material used for this learning unit:

• Prescribed Textbook, Chapter Six. (Please note, all references and


exercises related to Excel can be done for enrichment purposes but
will not be assessed.)

1 Activities
Activity 1
Purpose:

The purpose of this activity is to determine the probability of a


sampling distribution of a proportion.

Task:

In a certain neighbourhood, it is known that 12% of school


leavers are unemployed. If a random sample of 150 school
leavers is selected, what is the probability that the sample
contains:

1. At most 10% unemployed?

2. Less than 15% unemployed?

Commentary Related to Activity Design:

Sampling distributions of the sample proportion follow the


normal z-distribution.

Activity 2
Purpose:

The purpose of this activity is to determine probabilities for a


sampling distribution of a sample mean.


Task:

Suppose cars arrive at the Burger Box roadhouse at an


average rate of 20 cars per hour, with a population standard
deviation of five cars. A random sample of 40 one-hour time
periods is selected, and it is found that an average of 22.1 cars
arrived per hour.

1. Calculate the sampling error.

2. Determine the mean and standard error of the sampling
distribution of the sample mean.

3. What is the probability that a random sample of 40 one-hour
periods results in a mean of at least 22.1 cars?

Commentary Related to Activity Design:

In cases where the population mean is not known or cannot be


calculated, it is not possible to obtain the sampling error. In the
following learning units, the Central Limit Theorem lays the
foundation for constructing interval estimates for µ, and for
testing hypotheses.


2 Revision Exercises
Revision Exercise 1
1. According to statistics released by the Department of
Health, 15% of all South Africans have hearing
problems. In a random sample of 120 South Africans,
what is the probability of at least 18% having hearing
problems?

2. Suppose that 25% of all South Africans in a given


income and lifestyle category are interested in buying a
Porsche. A random sample of 100 South Africans in this
category is selected. What is the probability that at least
20% of those in the sample will express an interest in
buying a Porsche?

3. A well-known medical aid company claims that one


person in seven will be hospitalised this year. Suppose
you keep track of a random sample of 180 people during
the year. Assuming the medical aid company’s claim is
accurate, what is the probability that fewer than 10% of
the people in the sample will be hospitalised this year?

4. Thirty-eight per cent of all shoppers at the food outlet of


a store are holders of the store’s rewards card. If a
random sample of 100 shoppers is taken, what is the
probability that at least 30 of them are holders of the
rewards card?

Revision Exercise 2
1. Assume that the value of the day-to-day claims received
by a medical aid scheme is normally distributed with a
mean of R400 and a population standard deviation of
R80. A random sample of 120 claims is selected, and the
claim values are recorded. What is the probability that
the sample mean value of the claims does not differ from
the actual population mean value by more than R12 in
either direction?

2. A study was conducted on the reaction time of long


distance truck drivers, after a long trip of non-stop driving
from Durban to Johannesburg. Assume reaction times
are normally distributed, with a mean of 1.5 seconds and
a standard deviation of 0.3 seconds. In a random sample


of 70 drivers, what is the probability that their mean


reaction time will exceed 1.6 seconds?

3. Assume that the mean number of hours worked per


week for persons with home-based businesses is 23
hours, with a standard deviation of 10 hours. For a
sample of 100 home-based businesses, what is the
probability that a person will on average work less than
20 hours?

4. Suppose that the trade receivable accounts balances of


an advertising company have a mean of R9 250 and a
standard deviation of R2072. What is the probability that
a sample of 81 trade receivable accounts will have a
mean of more than R9 130?

5. An importer of Indian herbs and spices claims that the


average weight of packets of Saffron is 20 grams.
Packets are actually filled to an average weight of 19.5
grams with a standard deviation of 1.8 grams. Selecting
a random sample of 36 packets, what is the probability
that the sample average is less than 20 grams?

6. Study the various types of probability and non-probability


sampling methods discussed at the beginning of Chapter
Six of the prescribed textbook. It is important that you
understand the advantages and disadvantages of these
sampling methods, and the implications that each type of
sampling method may have on your ability to apply
inferential statistics procedures.


3 Solutions to Activities and Revision Exercises


Activity 1
1. P(p ≤ 0.10) = P(z < –0.75)
= 0.5 – P(0 < z < 0.75)
= 0.5 – 0.2734
= 0.2266

z = (p – π) / √(π(1 – π)/n) = (0.10 – 0.12) / √(0.12(1 – 0.12)/150)
= –0.75

2. P(p < 0.15) = P(z < 1.13)
= 0.5 + P(0 < z < 1.13)
= 0.5 + 0.3708
= 0.8708

z = (0.15 – 0.12) / √(0.12(1 – 0.12)/150) = 1.13
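
The z-statistic for a sample proportion divides the difference p – π by the standard error √(π(1 – π)/n). The sketch below recomputes both Activity 1 answers; `proportion_z` is a hypothetical helper name, and the probabilities come from the unrounded z, so they differ slightly from the table-based values.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(p: float, pi: float, n: int) -> float:
    """z-value of a sample proportion p, given population proportion pi and sample size n."""
    standard_error = sqrt(pi * (1 - pi) / n)
    return (p - pi) / standard_error


pi, n = 0.12, 150  # 12% of school leavers unemployed; sample of 150

z_10 = proportion_z(0.10, pi, n)
z_15 = proportion_z(0.15, pi, n)

standard_normal = NormalDist()  # mean 0, standard deviation 1
p_at_most_10 = standard_normal.cdf(z_10)
p_less_than_15 = standard_normal.cdf(z_15)

print(round(z_10, 2), round(z_15, 2))
print(round(p_at_most_10, 3), round(p_less_than_15, 3))
```

The z-values round to –0.75 and 1.13, as in the solution.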

Activity 2
1. Sampling Error = x̄ – μ = 22.1 – 20 = 2.1

2. μ_x̄ = μ_x = 20

σ_x̄ = σ_x / √n = 5 / √40 = 0.79

3. P(x̄ ≥ 22.1) = P(z > 2.66)
= 0.5 – P(0 < z < 2.66)
= 0.5 – 0.49609
= 0.00391

z = (22.1 – 20) / (5 / √40) = 2.66
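
The three steps of Activity 2 (sampling error, standard error, then the tail probability) follow one after the other, as this short sketch shows. The variable names are illustrative, and the probability is computed from the unrounded z.

```python
from math import sqrt
from statistics import NormalDist

mu = 20       # population mean: cars per hour
sigma = 5     # population standard deviation
n = 40        # number of one-hour periods sampled
x_bar = 22.1  # observed sample mean

# Step 1: sampling error is the sample mean minus the population mean
sampling_error = x_bar - mu

# Step 2: standard error of the sampling distribution of the mean
standard_error = sigma / sqrt(n)

# Step 3: z-value and upper-tail probability P(sample mean >= 22.1)
z = sampling_error / standard_error
p_at_least = 1 - NormalDist().cdf(z)

print(round(sampling_error, 1), round(standard_error, 2), round(z, 2))
print(round(p_at_least, 4))
```

The standard error and z round to 0.79 and 2.66, and the tail probability is close to the 0.00391 obtained from the table.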


Revision Exercise 1
1. P(p ≥ 0.18) = P(z > 0.92)
= 0.5 – P(0 < z < 0.92)
= 0.5 – 0.3212
= 0.1788

z = (p – π) / √(π(1 – π)/n) = (0.18 – 0.15) / √(0.15(1 – 0.15)/120)
= 0.92

2. P(p ≥ 0.20) = P(z > –1.15)
= 0.5 + P(0 < z < 1.15)
= 0.5 + 0.3749
= 0.8749

z = (0.20 – 0.25) / √(0.25(1 – 0.25)/100) = –1.15

3. P(p < 0.10) = P(z < –1.64)
= 0.5 – P(0 < z < 1.64)
= 0.5 – 0.4495
= 0.0505

z = (0.10 – 1/7) / √((1/7)(1 – 1/7)/180) = –1.64

4. P(p ≥ 0.30) = P(z > –1.65)
= 0.5 + P(0 < z < 1.65)
= 0.5 + 0.4505
= 0.9505

z = (0.30 – 0.38) / √(0.38(1 – 0.38)/100) = –1.65


Revision Exercise 2
1. P(388 < x̄ < 412) = P(-1.64 < z < 1.64)
= 2 × P(0 < z < 1.64)
= 2(0.4495)
= 0.8990

$z = \frac{388 - 400}{80/\sqrt{120}} = -1.64$

$z = \frac{412 - 400}{80/\sqrt{120}} = 1.64$

2. P(x̄ > 1.6) = P(z > 2.79)
= 0.5 – P(0 < z < 2.79)
= 0.5 – 0.49736
= 0.00264

$z = \frac{1.6 - 1.5}{0.3/\sqrt{70}} = 2.79$

3. P(x̄ < 20) = P(z < -3.00)
= 0.5 – P(0 < z < 3.00)
= 0.5 – 0.49865
= 0.00135

$z = \frac{20 - 23}{10/\sqrt{100}} = -3.00$

4. P(x̄ > 9130) = P(z > -0.52)
= 0.5 + P(0 < z < 0.52)
= 0.5 + 0.1985
= 0.6985

$z = \frac{9130 - 9250}{2072/\sqrt{81}} = -0.52$

5. P(x̄ < 20) = P(z < 1.67)
= 0.5 + P(0 < z < 1.67)
= 0.5 + 0.4525
= 0.9525

$z = \frac{20 - 19.5}{1.8/\sqrt{36}} = 1.67$


Learning Unit 8: Hypothesis Testing


Material used for this learning unit:

• Prescribed Textbook, Chapter Eight. (Please note, all references
and exercises related to Excel can be done for enrichment purposes
but will not be assessed.)
• Please also note that the following sections need not be covered:
o The p-value approach to hypothesis testing;
o Hypothesis test for σ².

1 Activities
1.1 Activity 1
Purpose:

Follow the procedure for performing a one-tailed test regarding
the population mean, where the population standard deviation,
σ, is known. The sample size, n, is large.

Task:

According to a study carried out to develop new tax laws on
company benefits, it is claimed that, on average, people who
claim travel allowances travel more than 20 200 kilometres per
year.

To test this claim, we obtain a random sample of 45
employees who claimed travel allowances. The mean number
of kilometres driven by the 45 employees was 22 100. If the
standard deviation for the population is assumed to be 4 100
kilometres, does the sample provide sufficient evidence that
the average is higher than 20 200 at a 1% level of
significance?

Commentary Related to Activity Design:

The steps to follow are the same for all hypothesis procedures.
The only differences that can occur are the choice of H0 and
H1, the table used (z or t) to determine the critical value(s), and
the equation used to obtain the sample statistic.


1.2 Activity 2
Purpose:

Follow the procedure for an upper-tailed hypothesis test
regarding a single population proportion. The sample size, n, is
large.

Task:
In a survey, consumers were asked if they had ever used
online shopping. 72 of the 165 respondents answered “yes”.

At the 5% level of significance, test the claim that at most 50%
of consumers have used online shopping.

Commentary Related to Activity Design:

The steps to follow are the same for all hypothesis procedures.
The only differences that can occur are the choice of H0 and
H1, the table used (z or t) to determine the critical value(s), and
the equation used to obtain the sample statistic.

1.3 Activity 3
Purpose:

Follow the procedure for a two-tailed hypothesis test regarding
a single population mean for a small sample.

Task:

The refreshment shop in a theatre complex recorded the
amount spent (in Rands) by 12 randomly selected patrons.
The mean amount spent was R55 with a standard deviation of
R22. Does this sample data provide evidence that the mean
amount spent by all patrons is R50? Use a 1% level of
significance.

Commentary Related to Activity Design:

The steps to follow are the same for all hypothesis procedures.
The only differences that can occur are the choice of H0 and
H1, the table used (z or t) to determine the critical value(s), and
the equation used to obtain the sample statistic.


2 Revision Exercises
2.1 Revision Exercise 1
The Dairy Ice cream factory uses a filling machine for its two
litre cartons. There is some variation in the actual amount that
goes into a carton. The machine can go out of setting and
inject a mean quantity less than or more than two litres. To
monitor the filling process, the production manager selects a
simple random sample of 46 filled cartons, and measures the
contents. The average amount per carton was found to be 2.15
litres, with a standard deviation of 0.30 litres. Test whether the
machine is still in setting, using a 5% level of significance.

2.2 Revision Exercise 2


A health and safety inspector randomly sampled 200 reports
on industrial accidents, and found that 53 were due to untidy
working conditions. A factory manager claims that less than
20% of accidents are the result of untidy working conditions.
Perform a hypothesis test at a 10% level of significance in
order to determine whether the manager’s claim is true.

2.3 Revision Exercise 3


A national airline claims that 96% of its flights depart on time.
You have booked a flight and are worried that your flight will
not depart on time. You therefore decide to test the airline’s
claim. You record the departure information for 80 randomly
selected flights, and discover that five departed late. Test the
airline’s claim at the 1% level of significance.


3 Solutions to Activities and Revision Exercises
3.1 Activity 1
Step 1: State the null and alternative hypotheses

The claim made is that people drive on average more than
20 200 km. We make this claim the H1 hypothesis:

H0: μ ≤ 20 200
H1: μ > 20 200

(The test is upper-tailed).

Step 2: Areas Of Acceptance And Rejection, And Decision Rule (α = 1%)

H1 indicates a one-tailed test to the right. The population
standard deviation is known. Therefore, we assume the normal
z-distribution, with α = 0.01. We will reject H0 if the z-test
statistic is > 2.33.

(Figure: standard normal curve with a 1% rejection region to the right of z = 2.33.)

Step 3: Compute Test Statistic

$z_{stat} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{22100 - 20200}{4100/\sqrt{45}} = 3.11$

Step 4: Compare The Test Statistic With The Decision Rule

As 3.11 > 2.33, we reject H0.

Step 5: Conclusion:

We reject H0 at α = 0.01, and conclude that the claim that the
average distance travelled is more than 20 200 kilometres is
probably true.
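Steps 3 and 4 of this test can be sketched in a few lines of Python. The function name `z_stat_mean` is illustrative, and the critical value 2.33 is simply typed in from the z-table rather than computed.

```python
import math


def z_stat_mean(x_bar, mu0, sigma, n):
    """z test statistic for a single mean when the population sigma is known."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))


z_stat = z_stat_mean(x_bar=22100, mu0=20200, sigma=4100, n=45)
z_crit = 2.33  # upper-tail critical value for alpha = 0.01, read from the z-table

# Upper-tailed decision rule: reject H0 only if the statistic exceeds the critical value.
reject_h0 = z_stat > z_crit
print(round(z_stat, 2), "reject H0" if reject_h0 else "do not reject H0")
```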


3.2 Activity 2
Step 1: Hypotheses

H0: π ≤ 0.50
H1: π > 0.50

Step 2: Areas Of Acceptance And Rejection, And Decision Rule

For α = 0.05, the critical value is 1.645. We will therefore
accept H0 if the z test statistic is ≤ 1.645. Otherwise, we will
reject H0.

Step 3: Sample Statistic

From the sample, n = 165 and x = 72. Therefore, p = 72/165 ≈ 0.44.

$z = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.44 - 0.50}{\sqrt{\frac{0.50 \times 0.50}{165}}} = -1.54$

Step 4: Compare Sample Statistic To Decision Rule

As z = -1.54 < 1.645, we do not reject H0.

Step 5: Conclusion

We do not reject H0 at α = 0.05, and conclude that the claim
that at most 50% of consumers have used online shopping is
probably true.
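The sketch below recomputes the test statistic twice: once with the sample proportion rounded to 0.44, as in the solution above, and once with the unrounded value 72/165. The function name `z_stat_proportion` is an illustrative choice. The two statistics differ in the second decimal, but both fall below 1.645, so the decision is the same.

```python
import math


def z_stat_proportion(p, pi0, n):
    """z test statistic for a single proportion under H0: pi = pi0."""
    return (p - pi0) / math.sqrt(pi0 * (1.0 - pi0) / n)


n, x, pi0 = 165, 72, 0.50
z_crit = 1.645  # upper-tail critical value for alpha = 0.05, read from the z-table

z_rounded_p = z_stat_proportion(round(x / n, 2), pi0, n)  # p = 0.44, as in the solution
z_exact_p = z_stat_proportion(x / n, pi0, n)              # p = 72/165, unrounded

for z in (z_rounded_p, z_exact_p):
    print(round(z, 2), "reject H0" if z > z_crit else "do not reject H0")
```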

3.3 Activity 3
Step 1: Hypotheses

H0: μ = 50
H1: μ ≠ 50

Step 2: Areas Of Acceptance And Rejection, And Decision Rule

As the population standard deviation is unknown, and the
sample size is small, we will use the t-distribution.

The degrees of freedom = n – 1 = 12 – 1 = 11. As the test is
two-tailed, for α = 0.01 the critical value is therefore t = 3.106.
Thus, we will accept H0 if:

−3.106 ≤ t-stat ≤ 3.106

Otherwise we will reject H0.

Step 3: Compute Sample Statistic

$t\text{-stat} = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{55 - 50}{22/\sqrt{12}} = 0.79$

Step 4: Compare Sample Statistic With Decision Rule

As t-stat = 0.79 is less than 3.106 and greater than -3.106, we
do not reject H0.

Step 5: Conclusion

We do not reject H0 at α = 0.01, and conclude that the mean
amount spent by all patrons is probably R50.
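A two-tailed decision rule compares the absolute value of the statistic with the critical value. The following sketch illustrates this for Activity 3; the helper names are hypothetical, and the critical value 3.106 is taken from the t-table (11 degrees of freedom) rather than computed.

```python
import math


def t_stat_mean(x_bar, mu0, s, n):
    """t test statistic for a single mean when only the sample standard deviation s is known."""
    return (x_bar - mu0) / (s / math.sqrt(n))


def two_tailed_decision(stat, crit):
    """Reject H0 when the statistic lies outside the interval [-crit, +crit]."""
    return "reject H0" if abs(stat) > crit else "do not reject H0"


t_stat = t_stat_mean(x_bar=55, mu0=50, s=22, n=12)
t_crit = 3.106  # two-tailed critical value for alpha = 0.01 and df = 11, from the t-table

decision = two_tailed_decision(t_stat, t_crit)
print(round(t_stat, 2), decision)
```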

3.4 Revision Exercise 1


Step 1: Hypotheses

H0: µ = 2.00
H1: µ ≠ 2.00

Step 2: Areas of Acceptance And Rejection, And Decision Rule

Although we do not have the population standard deviation,
the sample is large, and therefore we use the z-distribution.

For a two-tailed test with α = 0.05, the critical value is 1.96.
Thus, we will accept H0 if:

−1.96 ≤ z-stat ≤ 1.96

Otherwise, we will reject H0.


Step 3: Compute Sample Statistic

$z\text{-stat} = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{2.15 - 2.00}{0.30/\sqrt{46}} = 3.39$

Step 4: Compare Sample Statistic with Decision Rule

As z-stat > 1.96, we do not accept H0.

Step 5: Conclusion

We do not accept H0 at α = 0.05, and conclude that the
machine is probably not still in setting.

3.5 Revision Exercise 2


Step 1: Hypotheses

H0: π ≥ 0.20
H1: π < 0.20

Step 2: Areas of Acceptance And Rejection, And Decision Rule

As the sample size is large, we will apply the z-distribution.
The test is a lower-tailed test. Therefore, with α = 0.10, we will
accept H0 if:

z-stat ≥ −1.28

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

From the sample, n = 200 and x = 53. Therefore p = 0.265.

$z\text{-stat} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.265 - 0.20}{\sqrt{\frac{0.20 \times 0.80}{200}}} = 2.30$

Step 4: Compare Sample Statistic with Decision Rule

As z-stat > -1.28, we do not reject H0.


Step 5: Conclusion

We do not reject H0 at α = 0.10, and conclude that the
manager’s claim is probably false.

3.6 Revision Exercise 3


Step 1: Hypotheses

H0: π = 0.96
H1: π ≠ 0.96

Step 2: Areas of Acceptance And Rejection, And Decision Rule

As we have a large sample, we will use the z-distribution. For a
two-tailed test with α = 0.01, the critical value is 2.58.
Therefore, we will accept H0 if:

−2.58 ≤ z-stat ≤ 2.58

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

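From the sample, n = 80 and, as five flights departed late,
x = 80 – 5 = 75 flights departed on time. Therefore, p = 0.9375.

$z\text{-stat} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.9375 - 0.96}{\sqrt{\frac{0.96 \times 0.04}{80}}} = -1.03$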

Step 4: Compare Sample Statistic with Decision Rule

As z-stat is greater than -2.58 and less than 2.58, we do not
reject H0.

Step 5: Conclusion

We do not reject H0 at α = 0.01, and conclude that the airline’s
claim is probably true.


Learning Unit 9: Chi-Square Tests


Material used for this learning unit:

• Prescribed Textbook, Chapter Ten. (Please note that the
“Theoretical Distribution” topic should be covered as part of the
course content, but will not be examined.)
• Application of the “Rule of Five” will not be required for examination
purposes.

1 Activities
Activity 1
Purpose:

Learn to apply the chi-square goodness-of-fit test, via a
step-by-step approach, in order to test whether data from an
experiment fits a specified distribution.

Task:

A bank’s marketing division wishes to establish if there is equal
usage of the various banking facilities available to customers
through internet banking. A survey of 500 randomly selected
clients who use the bank’s internet facilities was carried out.
Clients were asked to identify the particular banking facility that
they use most often. The responses are summarised in the
following table:

Facility              Number of clients (fo)
Investment advice     68
Funds transfer        110
Statement requests    105
Regular payments      100
Ad hoc payments       117
TOTAL                 500

Can the bank’s marketing manager conclude that there is
equal usage of the various Internet banking facilities? Test at a
10% level of significance.


Activity 2
Purpose:

Learn to utilise the chi-square goodness-of-fit test in order to
determine whether data from an experiment fit a specified
distribution.

Task:

A restaurant manager wants to book his waiters’ shifts for the
coming week, according to his belief that the distribution of
customer arrivals during the week is as follows:

Day          Expected % of customers
Monday       5
Tuesday      10
Wednesday    15
Thursday     15
Friday       25
Saturday     30
TOTAL        100

A week was randomly chosen, and the number of customers
who arrived was recorded as follows:

• Monday, 31;
• Tuesday, 18;
• Wednesday, 36;
• Thursday, 23;
• Friday, 47;
• Saturday, 60.

Use this sample to test the manager’s expected distribution,
using a 5% level of significance.

Commentary Related to Activity Design:

It is important to recognise the type of distribution before
attempting the question.

The expected distribution is given as a %. Observed or
sample data can never be changed, therefore the expected
percentages must be adjusted to “number of customers” by
using the sample total. Why? Because the fo and fe column
totals must be the same!
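The conversion described in this commentary can be sketched as follows. The dictionary names are illustrative; the percentages and observed counts are those given in the task. Each expected frequency is the expected percentage applied to the observed sample total, so the two columns necessarily share the same total.

```python
# Observed customer counts (fo) for the randomly chosen week.
observed = {
    "Monday": 31, "Tuesday": 18, "Wednesday": 36,
    "Thursday": 23, "Friday": 47, "Saturday": 60,
}

# The manager's expected distribution, in percent.
expected_pct = {
    "Monday": 5, "Tuesday": 10, "Wednesday": 15,
    "Thursday": 15, "Friday": 25, "Saturday": 30,
}

sample_total = sum(observed.values())

# Convert each percentage into an expected frequency (fe) using the sample total.
expected = {day: pct / 100 * sample_total for day, pct in expected_pct.items()}

for day in observed:
    print(day, observed[day], round(expected[day], 2))
print("Totals:", sample_total, round(sum(expected.values()), 2))
```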


Activity 3
Purpose:

The purpose of this activity is to show the step-by-step
procedure for conducting a test for independence of
association.

Task:

A random sample of adults was selected from each of four
ethnic groups in Cape Town. Respondents were asked to
specify their primary source of news. The results are as
follows:

               Ethnic Group
Source         A     B     C     D     TOTAL
TV             30    20    25    20    95
Radio          25    25    20    20    90
Newspaper      10    10    5     30    55
TOTAL          65    55    50    70    240

Is there a relationship between ethnic group and the source of
news at a 5% level of significance?

Commentary Related to Activity Design:

It is important to recognise the type of distribution before
attempting the question. In this type of distribution, actual
counts are compared with what is expected if the null
hypothesis (variables are independent) is true.

Activity 4
Purpose:

The purpose of this activity is to show the step-by-step
procedure to conduct a test for the equality of proportions.

Task:

A politician wishes to gauge the level of satisfaction amongst
South Africans with regard to the provision of municipal
services. He randomly samples 150 people, 18 years and
older, from each of four randomly selected provinces. The
question asked was: “Are you satisfied or dissatisfied with the
municipal services in your area?” The following table
summarises the results of the survey.

                Province
Satisfaction    A      B      C      D      TOTAL
Satisfied       75     86     91     85     337
Dissatisfied    75     64     59     65     263
TOTAL           150    150    150    150    600

Test whether the proportion of South Africans who are satisfied
with the services supplied to them is equal for each province at
α = 0.10.

Commentary Related to Activity Design:

It is important to recognise the type of distribution before you
attempt the question.


2 Revision Exercises
Revision Exercise 1
In a small Lotto (using digits 0 to 9 only), the number of times
digits 0 to 9 turned up in a run of 300 selections are
summarised in the table below.

Digit 0 1 2 3 4 5 6 7 8 9
Frequency 24 22 25 32 32 35 37 26 32 35

Test the claim that the Lotto is fair, i.e. that the distribution is
uniform, at a 10% level of significance.

Revision Exercise 2
The manager of the human resources department of a certain
bank analysed the qualification profile of a random sample of
134 managers. The information he obtained is summarised in
the following table:

                 Management level
Qualification    Section head    Department head    Division head    TOTAL
Matric           28              14                 8                50
Diploma          20              24                 6                50
Degree           10              10                 14               34
TOTAL            58              48                 28               134

Determine if qualification and management level are
independent at a 5% level of significance.

Revision Exercise 3
A researcher wishes to investigate whether the proportion of
smokers within different age groups is the same. The question
asked was: “Have you smoked at least one cigarette per day in
the past week?” The results of the survey are as follows.

                 Age (years)
                 18 – 29    30 – 49    50 – 64    65 & older    Total
Smoked           23         20         22         12
Did not smoke    57         60         58         68
TOTAL

Is there evidence to indicate that the proportion of
smokers is different for each age group? Test at a 5% level of
significance.


Revision Exercise 4
According to past studies, it was found that 40% of consumers
buy full cream milk, and that 2% milk, low fat milk, skimmed
milk and soya milk each attract 15% of consumers. To assess
whether the demand for each type of milk has changed, a
store manager of a hyper store surveys 200 customers
regarding the type of milk they purchase. The results were as
follows.

Type of milk    Full cream    2%    Low fat    Skimmed    Soya
Demand          70            50    20         20         40

Determine whether the demand for each type of milk has
changed significantly at a 10% level of significance.


3 Solutions to Activities and Revision Exercises


Activity 1
Step 1: Hypotheses

H0: There is equal usage of the internet facilities


H1: H0 is not true

Step 2: Areas Of Acceptance And Rejection, And Decision Rule

No population parameters are being estimated in this test, so m = 0. There are five
different categories of internet facility under consideration. Therefore, df = k – m – 1 =
5 – 0 – 1 = 4.

For α = 0.10, the critical value for χ² = 7.779. We will therefore accept H0 if:

χ² ≤ 7.779

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Facility              Number of clients (fo)    fe     (fo – fe)²/fe
Investment advice     68                        100    10.24
Funds transfer        110                       100    1.00
Statement requests    105                       100    0.25
Regular payments      100                       100    0.00
Ad hoc payments       117                       100    2.89
TOTAL                 500                       500    14.38

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 14.38$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat > 7.779, we do not accept H0.

Step 5: Conclusion

We do not accept H0 at α = 0.10, and conclude that there is probably not equal usage
of the various internet banking facilities.
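The goodness-of-fit calculation above can be sketched in Python as follows. The function name `chi_square_stat` is an illustrative choice, and the critical value 7.779 is typed in from the chi-square table rather than computed.

```python
def chi_square_stat(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [68, 110, 105, 100, 117]  # fo for the five internet banking facilities

# Under H0 (equal usage) every category has the same expected frequency.
expected = [sum(observed) / len(observed)] * len(observed)

stat = chi_square_stat(observed, expected)
df = len(observed) - 0 - 1  # k - m - 1 with m = 0 estimated parameters
chi_crit = 7.779            # critical value for alpha = 0.10 and df = 4, from the table

reject_h0 = stat > chi_crit
print(round(stat, 2), df, "reject H0" if reject_h0 else "do not reject H0")
```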


Activity 2
Step 1: Hypotheses

H0: The manager’s belief regarding the distribution of customer arrivals is correct.
H1: H0 is not true

Step 2: Areas of Acceptance And Rejection, And The Decision Rule

No population parameters are being estimated in this test, so m = 0. There are six
different categories of day under consideration. Therefore, df = k – m – 1 = 6 – 0 – 1
= 5.

For α = 0.05, the critical value for χ² = 11.070. We will therefore accept H0 if:

χ² ≤ 11.070

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Day          Expected % of customers    fe       fo     (fo – fe)²/fe
Monday       5                          10.75    31     38.15
Tuesday      10                         21.50    18     0.57
Wednesday    15                         32.25    36     0.44
Thursday     15                         32.25    23     2.65
Friday       25                         53.75    47     0.85
Saturday     30                         64.50    60     0.31
TOTAL        100                        215      215    42.97

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 42.97$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat > 11.070, we do not accept H0.

Step 5: Conclusion

We do not accept H0 at α = 0.05, and conclude that the manager’s belief regarding
the distribution of customer arrivals is probably incorrect.


Activity 3
Step 1: Hypotheses

H0: There is no relationship between ethnic group and the source of news.
H1: H0 is not true

Step 2: Areas of Acceptance And Rejection, And Decision Rule

There are three rows of news source and four columns of ethnic group. Therefore, df
= (r – 1)(c – 1) = (3 - 1)(4 – 1) = 6.

For α = 0.05, the critical value for χ² = 12.592. We will therefore accept H0 if:

χ² ≤ 12.592

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Ethnic Group    Source       fo     fe       (fo – fe)²/fe
A               TV           30     25.73    0.71
                Radio        25     24.38    0.02
                Newspaper    10     14.90    1.61
B               TV           20     21.77    0.14
                Radio        25     20.63    0.93
                Newspaper    10     12.60    0.54
C               TV           25     19.79    1.37
                Radio        20     18.75    0.08
                Newspaper    5      11.46    3.64
D               TV           20     27.71    2.15
                Radio        20     26.25    1.49
                Newspaper    30     16.04    12.15
TOTAL                        240    240      24.83

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 24.83$

Step 4: Compare Sample Statistic And Decision Rule

As χ²-stat > 12.592, we do not accept H0.


Step 5: Conclusion

We reject H0 at α = 0.05, and conclude that there is probably a relationship between
ethnic group and the source of news.
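For a test of independence, each expected frequency is (row total × column total) / grand total. The sketch below applies this to the Activity 3 table; the variable names are illustrative and the critical value is taken from the chi-square table. Because the script does not round the expected frequencies to two decimals, its statistic differs slightly from 24.83, without affecting the decision.

```python
# Observed counts: rows are news sources, columns are ethnic groups A to D.
observed = [
    [30, 20, 25, 20],  # TV
    [25, 25, 20, 20],  # Radio
    [10, 10, 5, 30],   # Newspaper
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell under H0 (independence).
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

stat = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
chi_crit = 12.592  # critical value for alpha = 0.05 and df = 6, from the table

print(round(stat, 2), df, "reject H0" if stat > chi_crit else "do not reject H0")
```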

Activity 4
Step 1: Hypotheses

H0: The proportion of South Africans who are satisfied with the services supplied to
them is equal for each province.
H1: H0 is not true

Step 2: Areas of Acceptance and Rejection, and Decision Rule

There are two rows of satisfaction level and four columns of province. Therefore,
df = (r – 1)(c – 1) = (2 - 1)(4 – 1) = 3.

For α = 0.10, the critical value for χ² = 6.251. We will therefore accept H0 if:

χ² ≤ 6.251

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Province    Satisfaction    fo     fe       (fo – fe)²/fe
A           Satisfied       75     84.25    1.02
            Dissatisfied    75     65.75    1.30
B           Satisfied       86     84.25    0.04
            Dissatisfied    64     65.75    0.05
C           Satisfied       91     84.25    0.54
            Dissatisfied    59     65.75    0.69
D           Satisfied       85     84.25    0.01
            Dissatisfied    65     65.75    0.01
TOTAL                       600    600      3.66

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 3.66$

Step 4: Compare Sample Statistic And Decision Rule

As χ²-stat < 6.251, we do not reject H0.


Step 5: Conclusion

We do not reject H0 at α = 0.10, and conclude that the proportion of South Africans
who are satisfied with the services supplied to them is probably equal for each
province.

Revision Exercise 1

Digit 0 1 2 3 4 5 6 7 8 9
Frequency 24 22 25 32 32 35 37 26 32 35

Test the claim that the Lotto is fair, i.e. that the distribution is uniform, at a 10%
level of significance.

Step 1: Hypotheses

H0: The lotto is fair (i.e. each number has an equal chance of turning up).
H1: H0 is not true

Step 2: Areas of Acceptance and Rejection, and Decision Rule

No population parameters are being estimated in this test, so m = 0. There are ten
different number categories under consideration. Therefore, df = k – m – 1 = 10 – 0
– 1 = 9.

For α = 0.10, the critical value for χ² = 14.684. We will therefore accept H0 if:

χ² ≤ 14.684

Otherwise, we will reject H0.


Step 3: Compute Sample Statistic

Digit    fo     fe     (fo – fe)²/fe
0        24     30     1.20
1        22     30     2.13
2        25     30     0.83
3        32     30     0.13
4        32     30     0.13
5        35     30     0.83
6        37     30     1.63
7        26     30     0.53
8        32     30     0.13
9        35     30     0.83
TOTAL    300    300    8.37

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 8.37$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat < 14.684, we do not reject H0.

Step 5: Conclusion

We do not reject H0 at α = 0.10, and conclude that the lotto is probably fair (i.e. there
is an equal chance that each number will turn up).

Revision Exercise 2
Step 1: Hypotheses

H0: Qualification and management level are independent (i.e. there is no relationship
between qualification and management level).
H1: H0 is not true.

Step 2: Areas of Acceptance and Rejection, and Decision Rule

There are three rows of qualification and three columns of management level.
Therefore, df = (r – 1)(c – 1) = (3 - 1)(3 – 1) = 4.


For α = 0.05, the critical value for χ² = 9.488. We will therefore accept H0 if:

χ² ≤ 9.488

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Qualification    Management Level    fo     fe       (fo – fe)²/fe
Matric           Section             28     21.64    1.87
                 Department          14     17.91    0.85
                 Division            8      10.45    0.57
Diploma          Section             20     21.64    0.12
                 Department          24     17.91    2.07
                 Division            6      10.45    1.89
Degree           Section             10     14.72    1.51
                 Department          10     12.18    0.39
                 Division            14     7.10     6.71
TOTAL                                134    134      15.98

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 15.98$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat > 9.488, we do not accept H0.

Step 5: Conclusion

We do not accept H0 at α = 0.05, and conclude that qualification and management
level are probably dependent (i.e. there is probably a relationship between
qualification and management level).

Revision Exercise 3
Step 1: Hypotheses

H0: The proportion of smokers is the same for each age group.
H1: H0 is not true


Step 2: Areas of Acceptance and Rejection, and Decision Rule

There are two rows of smoking status and four columns of age group. Therefore, df =
(r – 1)(c – 1) = (2 - 1)(4 – 1) = 3.

For α = 0.05, the critical value for χ² = 7.815. We will therefore accept H0 if:

χ² ≤ 7.815

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

                 Age (years)
                 18 – 29    30 – 49    50 – 64    65 and older    Total
Smoked           23         20         22         12              77
Did not smoke    57         60         58         68              243
Total            80         80         80         80              320

Age             Smoking status    fo     fe       (fo – fe)²/fe
18 – 29         Smoked            23     19.25    0.73
                Did not smoke     57     60.75    0.23
30 – 49         Smoked            20     19.25    0.03
                Did not smoke     60     60.75    0.01
50 – 64         Smoked            22     19.25    0.39
                Did not smoke     58     60.75    0.12
65 and older    Smoked            12     19.25    2.73
                Did not smoke     68     60.75    0.87
Total                             320    320      5.11

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 5.11$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat < 7.815, we do not reject H0.

Step 5: Conclusion

We do not reject H0 at α = 0.05, and conclude that the proportion of smokers is
probably the same for each group.


Revision Exercise 4
Step 1: Hypotheses

H0: The demand for each type of milk has not changed.
H1: H0 is not true

Step 2: Areas of Acceptance and Rejection, and The Decision Rule

No population parameters are being estimated in this test, so m = 0. There are five
different categories of milk under consideration. Therefore, df = k – m – 1 = 5 – 0 – 1
= 4.

For α = 0.10, the critical value for χ² = 7.779. We will therefore accept H0 if:

χ² ≤ 7.779

Otherwise, we will reject H0.

Step 3: Compute Sample Statistic

Milk          Expected %    fe     fo     (fo – fe)²/fe
Full Cream    40            80     70     1.25
2%            15            30     50     13.33
Low Fat       15            30     20     3.33
Skimmed       15            30     20     3.33
Soya          15            30     40     3.33
Total         100           200    200    24.57

$\chi^2\text{-stat} = \sum \frac{(f_o - f_e)^2}{f_e} = 24.57$

Step 4: Compare Sample Statistic and Decision Rule

As χ²-stat > 7.779, we do not accept H0.

Step 5: Conclusion

We do not accept H0 at α = 0.10, and conclude that the demand for each type of milk
has probably changed significantly.


FORMULAE SHEET: Quantitative Techniques (QUAT6221 and BSTA6212)
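1. Measures Of Central Tendency

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{n}$

$M_e = O_{me} + \frac{c\left[\frac{n}{2} - f(<)\right]}{f_{me}}$

$M_o = O_{mo} + \frac{c(f_m - f_{m-1})}{2f_m - f_{m-1} - f_{m+1}}$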

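2. Measures Of Dispersion

$R = x_{max} - x_{min}$

R = upper limit of highest class – lower limit of lowest class

$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1}$

$s^2 = \frac{\sum f x^2 - n\bar{x}^2}{n - 1}$

$s = \sqrt{s^2}$

$CV = \frac{s}{\bar{x}} \times 100\%$

Interquartile range = $Q_3 - Q_1$

Quartile Deviation = $\frac{Q_3 - Q_1}{2}$

$Sk_p = \frac{n \sum (x_i - \bar{x})^3}{(n - 1)(n - 2)s^3}$

$Sk_p = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$

$Q_1 = O_{q1} + \frac{c\left[\frac{n}{4} - f(<)\right]}{f_{q1}}$

$Q_3 = O_{q3} + \frac{c\left[\frac{3n}{4} - f(<)\right]}{f_{q3}}$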

$P = l_p + \frac{(U_p - l_p)\left(\frac{p}{100}n - f(<)\right)}{f_p}$

Where:

lp = lower class boundary of the interval containing the pth percentile.
Up = upper class boundary of the interval containing the pth percentile.
f(<) = cumulative frequency of the class interval before the interval containing the pth
percentile.
fp = frequency of the class interval containing the pth percentile.


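3. Probability Distributions

$P(r) = {}^{n}C_{r} \times p^{r} \times q^{n-r}$, where $q = 1 - p$; $\mu = np$ and $\sigma = \sqrt{npq}$

$P(x) = \frac{e^{-a} a^{x}}{x!}$ or $P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$; $\mu = \lambda$ and $\sigma = \sqrt{\lambda}$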
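As an illustration of the binomial and Poisson formulae, the sketch below evaluates both for hypothetical parameter values (n = 10, p = 0.2 and λ = 3 are assumptions chosen for the example, not values from this workbook). The function names are illustrative.

```python
import math


def binomial_pmf(r, n, p):
    """P(r) = nCr * p^r * q^(n - r), with q = 1 - p."""
    q = 1.0 - p
    return math.comb(n, r) * p**r * q ** (n - r)


def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Hypothetical parameters, for illustration only.
n, p = 10, 0.2
lam = 3.0

binom_p2 = binomial_pmf(2, n, p)
binom_mean = n * p
binom_sd = math.sqrt(n * p * (1.0 - p))

pois_p2 = poisson_pmf(2, lam)
pois_sd = math.sqrt(lam)

print(round(binom_p2, 4), binom_mean, round(binom_sd, 4))
print(round(pois_p2, 4), lam, round(pois_sd, 4))
```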

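4. Linear Regression And Correlation

$\hat{y} = b_0 + b_1 x$ or $\hat{y} = a + bx$

$b_1 = b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$

$b_0 = a = \frac{\sum y - b_1 \sum x}{n}$

$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}}$

$r^2 = (r)^2 \times 100$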
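The regression and correlation formulae can be applied directly to the required sums. The following sketch uses a small hypothetical data set (the x and y values are assumptions made up for the example) and mirrors the formulae term by term.

```python
import math

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Slope and intercept of the least-squares line y-hat = b0 + b1 * x.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b0 = (sum_y - b1 * sum_x) / n

# Correlation coefficient and coefficient of determination (as a percentage).
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)
r_squared_pct = r**2 * 100

print(round(b0, 2), round(b1, 2), round(r, 4), round(r_squared_pct, 1))
```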

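5. Hypothesis Testing

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

$z \text{ or } t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

$z = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$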

6. Index Numbers

Price relative = $\frac{p_1}{p_0} \times 100\%$

Laspeyres price index = $\frac{\sum (p_1 q_0)}{\sum (p_0 q_0)} \times 100\%$

Paasche price index = $\frac{\sum (p_1 q_1)}{\sum (p_0 q_1)} \times 100\%$
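The two price indices differ only in the quantities used as weights: Laspeyres uses base-period quantities (q0), Paasche uses current-period quantities (q1). A minimal sketch with hypothetical prices and quantities for two items (all four lists are assumptions made up for the example):

```python
# Hypothetical base-period (0) and current-period (1) prices and quantities.
p0 = [10, 20]
p1 = [12, 25]
q0 = [5, 3]
q1 = [6, 2]


def weighted_sum(prices, quantities):
    """Sum of price * quantity over all items in the basket."""
    return sum(p * q for p, q in zip(prices, quantities))


price_relatives = [a / b * 100 for a, b in zip(p1, p0)]
laspeyres = weighted_sum(p1, q0) / weighted_sum(p0, q0) * 100  # base-period weights
paasche = weighted_sum(p1, q1) / weighted_sum(p0, q1) * 100    # current-period weights

print([round(v, 1) for v in price_relatives])
print(round(laspeyres, 2), round(paasche, 2))
```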


TABLE 3
The Chi-Squared distribution (χ²)
This table gives the values of χ²(df)(α)
df 0,995 0,99 0,975 0,95 0,9 0,1 0,05 0,025 0,01 0,005
1 --- --- 0,001 0,004 0,016 2,706 3,841 5,024 6,635 7,879
2 0,01 0,02 0,051 0,103 0,211 4,605 5,991 7,378 9,21 10,597
3 0,072 0,115 0,216 0,352 0,584 6,251 7,815 9,348 11,345 12,838
4 0,207 0,297 0,484 0,711 1,064 7,779 9,488 11,143 13,277 14,86
5 0,412 0,554 0,831 1,145 1,61 9,236 11,07 12,833 15,086 16,75
6 0,676 0,872 1,237 1,635 2,204 10,645 12,592 14,449 16,812 18,548
7 0,989 1,239 1,69 2,167 2,833 12,017 14,067 16,013 18,475 20,278
8 1,344 1,646 2,18 2,733 3,49 13,362 15,507 17,535 20,09 21,955
9 1,735 2,088 2,7 3,325 4,168 14,684 16,919 19,023 21,666 23,589
10 2,156 2,558 3,247 3,94 4,865 15,987 18,307 20,483 23,209 25,188
11 2,603 3,053 3,816 4,575 5,578 17,275 19,675 21,92 24,725 26,757
12 3,074 3,571 4,404 5,226 6,304 18,549 21,026 23,337 26,217 28,3
13 3,565 4,107 5,009 5,892 7,042 19,812 22,362 24,736 27,688 29,819
14 4,075 4,66 5,629 6,571 7,79 21,064 23,685 26,119 29,141 31,319
15 4,601 5,229 6,262 7,261 8,547 22,307 24,996 27,488 30,578 32,801
16 5,142 5,812 6,908 7,962 9,312 23,542 26,296 28,845 32 34,267
17 5,697 6,408 7,564 8,672 10,085 24,769 27,587 30,191 33,409 35,718
18 6,265 7,015 8,231 9,39 10,865 25,989 28,869 31,526 34,805 37,156
19 6,844 7,633 8,907 10,117 11,651 27,204 30,144 32,852 36,191 38,582
20 7,434 8,26 9,591 10,851 12,443 28,412 31,41 34,17 37,566 39,997
21 8,034 8,897 10,283 11,591 13,24 29,615 32,671 35,479 38,932 41,401
22 8,643 9,542 10,982 12,338 14,041 30,813 33,924 36,781 40,289 42,796
23 9,26 10,196 11,689 13,091 14,848 32,007 35,172 38,076 41,638 44,181
24 9,886 10,856 12,401 13,848 15,659 33,196 36,415 39,364 42,98 45,559
25 10,52 11,524 13,12 14,611 16,473 34,382 37,652 40,646 44,314 46,928
26 11,16 12,198 13,844 15,379 17,292 35,563 38,885 41,923 45,642 48,29
27 11,808 12,879 14,573 16,151 18,114 36,741 40,113 43,195 46,963 49,645
28 12,461 13,565 15,308 16,928 18,939 37,916 41,337 44,461 48,278 50,993
29 13,121 14,256 16,047 17,708 19,768 39,087 42,557 45,722 49,588 52,336
30 13,787 14,953 16,791 18,493 20,599 40,256 43,773 46,979 50,892 53,672
40 20,707 22,164 24,433 26,509 29,051 51,805 55,758 59,342 63,691 66,766


Bibliography
Johnson RR, Kuby PJ. 2011. Elementary Statistics. 11th ed. Duxbury.

Lombaard C, Van der Merwe L, Kele T, Mouton S. 2010. Elementary Statistics
for Business and Economics. 1st ed. Heinemann: Pearson Publishing.

Sullivan M. 2010. Fundamentals of Statistics. 3rd ed. Pearson Education.

Triola MF. 2009. Elementary Statistics. 11th ed. Pearson Education.

Wegner T. 2020. Applied Business Statistics: Methods and Excel-Based
Applications. 5th ed. Juta.

Willemse I. 2009. Statistical Methods and Calculation Skills. 3rd ed. Juta.
