MT (P) 1252 Statistics I Module


KYAMBOGO UNIVERSITY

MT (P) 1252: STATISTICS I

BY: MR. BEN SOROWEN

Monday 31st January, 2022


1 Course Description
This course is divided into the following three major topics:

(i) Frequency distributions.

(ii) Statistical averages.

(iii) Measures of dispersion.

2 Course Objectives
The course aims at enabling students to:

(i) Make and draw frequency distributions.

(ii) Calculate statistical averages.

(iii) Know how to determine statistical measures of dispersion.

3 Learning outcomes
By the end of the course learners should be able to;

(i) Draw frequency distributions

(ii) Find statistical averages

(iii) Determine statistical measures of dispersion

4 Detailed description
4.1 Frequency distribution
Raw data, grouped frequency distributions, class intervals, class boundaries, width of a class
interval, discrete and continuous variables, histograms, histograms for grouped distributions,
open-ended classes, discrete distributions, frequency polygons, frequency curves.
Types of frequency curve; cumulative frequency distributions, percentage Ogives.
Continuous distributions, the normal distribution and applications.
Normal approximation to Binomial distribution.

4.2 Statistical averages


Arithmetic mean; mean of a frequency distribution; coded method of computing the mean.
Weighted arithmetic mean.
Geometric mean of a frequency distribution.
Median of a frequency distribution.
Mode of a frequency distribution.
Relation between mean, median and mode.
Quartiles.

4.3 Measures of dispersion


Range, mean deviation, mean deviation of a frequency distribution, quartile deviation,
standard deviation of a frequency distribution.
Effect of adding a constant amount to each value, and of multiplying each value by the same
amount, on the variance and standardized scores.
Conversion to an arbitrary scale; skewness.

5 Introduction to statistics and data collection


Statistics refers to the collection, presentation, analysis, and utilization of numerical data to
make inferences and reach decisions in the face of uncertainty in economics, business, and
other social and physical sciences.
Statistics is taught in schools for three key reasons: it is useful for daily life, has an instrumental
role in other disciplines, and is important in developing critical reasoning. At primary level
statistics is reduced to frequency counts and bar graphs, with rules for calculating mean and
range.
Statistics is subdivided into descriptive and inferential.

5.1 Descriptive statistics


Descriptive statistics is concerned with summarizing and describing a body of data.
Descriptive statistics summarizes a body of data with one or two pieces of information that
characterize the whole data. It also refers to the presentation of a body of data in the form
of tables, charts, graphs, and other forms of graphic display.
The purpose of descriptive statistics is to facilitate the presentation and interpretation of data.
Most of the statistical presentations appearing in newspapers and magazines are descriptive
in nature.

5.2 Inferential statistics


Inferential statistics (both estimation and hypothesis testing) refers to drawing generalizations
about the properties of the whole (called a population) from the specific (a sample drawn
from the population). Inferential statistics thus involves inductive reasoning.
Inferential statistics therefore involves analysing a given characteristic in the sample to
determine the general characteristic within the population. For example, if we want to find
the amount of complex sugars in soft drinks, we can take a sample of soft drinks and
analyse it for the amount of complex sugars. Depending on what we find in the sample,
we can infer or make decisions concerning the amount of complex sugars in the entire soft
drinks industry.

Definition 5.1. Population is the totality of items or things under consideration.

Definition 5.2. Sample is a portion of the population that is selected for analysis. For example, to
analyse the quality of bread produced by the Kyambogo bakery in the month of December,
we may select a sample of only 100 loaves for analysis from the entire amount produced in
December.

Definition 5.3. Parameter is a summary measure that is computed to describe a characteristic
of the entire population, for example the protein content of a certain food
product, the lifespan of bread, etc.

Definition 5.4. Statistic is a summary measure that is computed to describe a characteristic
of interest from only a sample of the population.

Notes:

1. The major aspect of inferential statistics is the process of using sample statistics to draw
conclusions about population parameters. The need for inferential methods arises from the
need for sampling: as the population becomes larger, it becomes costly and time consuming
to obtain information from the entire population, so decisions concerning the population
characteristics have to be based on information obtained from a sample of the population.

2. Probability theory provides the link by ascertaining the likelihood that the results from
the sample reflect the results from the entire population.

6 Data Collection
Data are the facts and figures that are collected, analyzed, and summarized for presentation
and interpretation. Data may be classified as either quantitative or qualitative.
We discuss the types of data below.

1. Quantitative data: Measures either how much or how many of something.
These represent numerical values and can be numerically computed.

2. Qualitative data: Provides labels, or names, for categories of like items.
Qualitative data represent characteristics or attributes; they depict descriptions
that may be observed but cannot be computed.

For example, suppose that a particular study is interested in characteristics such as age,
gender, marital status, and annual income for a sample of 100 individuals. These characteristics
would be called the variables of the study, and data values for each of the variables
would be associated with each individual.

3. Primary data: Data collected for the first time.

4. Secondary data: Data that is sourced by someone other than the user.

5. Discrete data: These are data that can take only specific values.
6. Continuous data: These are data that can take any value within a given range.
Example 6.1. Age and annual income are quantitative variables; the corresponding data
values indicate how many years and how much money for each individual.
Example 6.2. Gender and marital status are qualitative variables. The labels male and
female provide the qualitative data for gender, and the labels single, married, divorced, and
widowed indicate marital status.
Sample survey methods are used to collect data from observational studies, and experimental
design methods are used to collect data from experimental studies.
The area of descriptive statistics is concerned primarily with methods of presenting and in-
terpreting data using graphs, tables, and numerical summaries.
Whenever statisticians use data from a sample (i.e., a subset of the population) to make
statements about a population, they are performing statistical inference.
Estimation and hypothesis testing are procedures used to make statistical inferences. Fields
such as health care, biology, chemistry, physics, education, engineering, business, and eco-
nomics make extensive use of statistical inference.
There are four main reasons as to why we need to collect data:
(i) To provide the necessary input to a research study
(ii) To measure the performance in an ongoing service or production process.
(iii) To assist in formulating alternative courses of action in the decision making process.
(iv) To satisfy curiosity.
For statistical analysis to be useful in the decision making process, the input data must be
appropriate; hence proper data collection is extremely important. If the data has flaws or is
biased, no statistical method is likely to compensate for such deficiencies; hence it is important
to collect the right data using the right method.
There are several methods which are used to obtain data:
(1) By using published sources of data (secondary data), e.g. journals, magazines,
newspapers, bar codes, etc.
(2) By designing an experiment: For such experiments, there must be strict controls over
the treatments given to the participants e.g., in an experiment to test the effectiveness of
a herbal drug, the researcher would determine which participants in the study area would
use the drug and those who would not use the drug (control).
(3) Conducting a survey: No control is normally exercised over the behaviour of the people
being surveyed. They are asked questions about their interests, beliefs, attitudes and
other characteristics, and then the responses are edited, coded and
tabulated for analysis.
(4) Observational study: Here the researcher observes the area of interest directly and, in
most cases, in its natural setting. This provides information which cannot be
obtained by the more structured methods of data collection such as experiments and
surveys.

7 Obtaining data through survey research


The researcher develops a tool that asks questions dealing with a variety of characteristics
called random variables; the outcomes of these random variables may differ from
one response to another.
There are basically two types of data, i.e., categorical and numeric data.
Categorical data are categorical responses: a question such as "Do you eat matoke?" is
categorical, and the responses are either yes or no. On the other hand, numeric data are numeric
responses: for "How many sweets do you eat in a day?" the responses will be numeric in nature, e.g.
0, 1, 2, . . .
Numeric random variables may be discrete or continuous, yielding discrete
and continuous data respectively. For example, consider the question "How many loaves of bread
do you buy in a day from Kyambogo University Bakery?"
Since the number of loaves would be 1, 2, 3, . . ., which are whole numbers, the responses are
discrete. Responses to a question such as "What is the length of a tree seedling in a nursery bed?" are
continuous, since they may include decimal values, e.g. 2.8 cm, 5.6 cm, etc.
In order to obtain data, an appropriate sampling technique must be employed. In what
follows, we shall discuss some of the sampling techniques.
Simple random sampling:
In this case, each individual is chosen entirely by chance and each member of the population
has an equal chance or probability to be selected.
Advantages

i. It is the most straightforward method of probability sampling.

Disadvantages

i. You may not select enough individuals with your characteristic of interest, especially if
the characteristic is uncommon.

ii. It may be difficult to define a complete sampling frame.

Systematic sampling:
Individuals are selected at regular intervals from the sampling frame. The intervals are chosen
to ensure an adequate sample size. If you need a sample of size n from a population of size x, then you
should select every (x/n)th individual for the sample. For example, if you wanted a sample size of 100
from a population of 1000, select every (1000/100)th = 10th member of the sampling frame.
Advantages

i. It is more convenient than the simple random sampling technique.

ii. It is easy to administer.

Disadvantages

i. It may lead to bias.
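
The selection rule for systematic sampling can be sketched in Python. This is only an illustration and not part of the course text: the function name `systematic_sample`, the random starting point and the list of member IDs are assumptions made for the example.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k-th member of the sampling frame, where k = len(frame) // n.

    Starting at a random position within the first interval is an assumption
    of this sketch; the text only specifies the interval x/n.
    """
    k = len(frame) // n                        # sampling interval x/n
    start = random.Random(seed).randrange(k)   # random start in 0..k-1
    return frame[start::k][:n]

population = list(range(1, 1001))              # hypothetical member IDs 1..1000
sample = systematic_sample(population, 100, seed=1)
print(len(sample), sample[1] - sample[0])      # sample size and spacing
```

With a population of 1000 and a sample of 100, the interval is 10, so consecutive selected members are 10 positions apart.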


Stratified sampling:
The population is first divided into subgroups (or strata) whose members all share a similar characteristic.
It is used when we might reasonably expect the measurement of interest to vary between the
different subgroups and want to ensure representation from all the subgroups. For example, in a study
of stroke outcomes, we may stratify the population by sex to ensure equal representation of
males and females.
The study sample is then obtained by taking equal sample sizes from each stratum.
Advantages

i. It improves the accuracy and representation of the results by reducing sampling bias.

Disadvantages

i. It requires knowledge of the appropriate characteristics of the sampling frame.


ii. It can also be difficult to decide which characteristic to stratify by.

Clustered sampling:
In a clustered sample, subgroups of the population are used as a sampling unit rather than
individuals. The population is divided into subgroups known as clusters which are randomly
selected to be included in the study.
In single stage cluster sampling, all members of the cluster are then included in the study.
In a two-stage cluster sampling, a selection of individuals from each cluster is then randomly
selected for inclusion.
Advantages

i. It is more efficient than simple random sampling especially where a study takes place over
a wide geographical area.

Disadvantages

i. Increased risk of bias, if the chosen clusters are not representative of the population,
resulting in an increased sampling error.

8 Presentation of Data
The key objective of statistics is to collect and organize data. One of the basics of data orga-
nization comes from presentation of data in a recognizable form so that it can be interpreted
easily. You can organize data in the form of tables or you can present it pictorially.
Pictorial representation of data takes the form of bar charts, pie charts, histograms or fre-
quency polygons.
The benefit of this is that data in the visual form is easy to understand in one glance.
Data can basically be organized in two ways;

(i) Ordered array


(ii) Frequency table

8.1 Ordered array


An ordered array is a table where the data is arranged in a specific order which is either
ascending or descending so that it can be easily understood.

Example 8.1. Consider the following data:

5 4 8 7 3 6 5 9 6 10 7 8 6 4 2

Present the above data in an ordered array.

SOLUTION. An ordered array of the above data is given below:

2 3 4 4 5 5 6 6 6 7 7 8 8 9 10

When data is arranged in an ordered array, the major features can be evaluated quickly,
since it becomes easier to identify extremes, typical values and where the values are
concentrated.

8.2 Frequency Tables


As the data becomes bigger, it is necessary to condense the data into appropriate summary
tables. This may be done by arranging the data into class groupings. The table containing
the data together with the corresponding frequencies is called a frequency table and may be
either for grouped or un-grouped data.
Frequency tables simplify the process of data analysis and interpretation.
When data is to be grouped, attention should be given to the following: selecting the appropriate
number of classes, obtaining a suitable class interval/width, and establishing the boundaries of each
class to avoid overlapping.

(a) Selecting the number of classes:


The number of class groupings depends on the number of observations in the data. A larger
number of observations requires a larger number of class groupings. Generally, the
frequency distribution table should have at least 5 classes but not more than 15 classes.

(b) The class interval/width:

It is desirable that the groups in the frequency distribution table are of equal width, with the
width/class interval given by

\[ \text{Class width} = \frac{\text{Range}}{\text{number of desired classes}} \]

For convenience, we use a whole number close to this value.

(c) The class boundaries:


When constructing a frequency distribution table, it is necessary to establish clearly
defined boundaries for each group so that the data from either the ordered array or raw form
can be properly tabled without overlapping. We can use groupings such as 10 − <20,
20 − <30, 30 − <40, etc., for continuous data or 5 − 9, 10 − 14, 15 − 19, 20 − 24 for
discrete data.
The mid-value for each class, which is half way between the boundaries of the class, is
representative of the data within the class, e.g. for the class 10 − 20,

\[ \frac{10 + 20}{2} = 15. \]

NOTE. Before making a frequency distribution table for raw data, we use tallies to group the
data before obtaining the frequencies.

Example 8.2. Present the data in Example 8.1 in a frequency distribution table.

SOLUTION. Below is the frequency table for the data in Example 8.1.

Number   Tally   Frequency (f)

2        |       1
3        |       1
4        ||      2
5        ||      2
6        |||     3
7        ||      2
8        ||      2
9        |       1
10       |       1
                 Σf = 15
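
As an illustration, the ordered array and the ungrouped frequency table above can be reproduced with a few lines of Python. This is a sketch that uses the standard library's `Counter`; the variable names are chosen for the example only.

```python
from collections import Counter

data = [5, 4, 8, 7, 3, 6, 5, 9, 6, 10, 7, 8, 6, 4, 2]

ordered = sorted(data)     # ordered array (ascending)
freq = Counter(data)       # maps each value to its frequency

print(ordered)
for value in sorted(freq):
    # one tally stroke per observation, then the frequency
    print(f"{value:>6}  {'|' * freq[value]:<5} {freq[value]:>3}")
print("total =", sum(freq.values()))
```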

9 Relative frequency distributions and percentage distributions
To simplify the analysis, it is desirable to form either the relative frequency distribution
table or the percentage distribution table, depending on whether we prefer proportions or
percentages.
The relative frequency and percentage frequency of a class are obtained from

\[ \text{Relative frequency} = \frac{\text{frequency of class}}{\text{total frequency}} \]

\[ \text{Percentage frequency} = \frac{\text{frequency of class}}{\text{total frequency}} \times 100\%. \]
Example 9.1. The following are the masses of fish from a day’s catch on Lake Bunyonyi:

15, 20, 7, 20, 35, 31, 43, 7, 28, 7, 49, 5, 28, 19, 20, 32, 7, 10, 43, 50, 45, 27, 21, 32, 43, 46, 37, 18, 12, 21

Draw a frequency distribution table showing relative frequency and percentage frequency.

SOLUTION.
\[ \text{Range} = 50 - 5 = 45 \]

For ten groups,

\[ \text{Class width} = \frac{45}{10} = 4.5 \approx 5 \]

Class    Tally    Frequency (f)   Relative frequency   Percentage frequency

5-9      |||||    5               0.167                16.7
10-14    ||       2               0.067                6.7
15-19    |||      3               0.100                10.0
20-24    |||||    5               0.167                16.7
25-29    |||      3               0.100                10.0
30-34    |||      3               0.100                10.0
35-39    ||       2               0.067                6.7
40-44    |||      3               0.100                10.0
45-49    |||      3               0.100                10.0
50-54    |        1               0.033                3.3
                  Σf = 30
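
The grouping in Example 9.1 can be checked with a short Python sketch. The class width of 5 and the starting limit of 5 are taken from the solution above; the variable names and the use of integer division to find the class index are choices made for this illustration.

```python
masses = [15, 20, 7, 20, 35, 31, 43, 7, 28, 7, 49, 5, 28, 19, 20,
          32, 7, 10, 43, 50, 45, 27, 21, 32, 43, 46, 37, 18, 12, 21]

width = 5        # 45 / 10 = 4.5, rounded to 5
start = 5        # lower limit of the first class
n_classes = 10

counts = [0] * n_classes
for m in masses:
    counts[(m - start) // width] += 1   # index of the class containing m

total = len(masses)
table = []
for i, f in enumerate(counts):
    lower = start + i * width
    upper = lower + width - 1
    # class label, frequency, relative frequency, percentage frequency
    table.append((f"{lower}-{upper}", f, round(f / total, 3), round(100 * f / total, 1)))

for row in table:
    print(*row, sep="\t")
```

The relative frequencies sum to 1 and the percentage frequencies to 100, up to rounding.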


PROBLEM 9.1 (Group activity). In groups of five, discuss the following questions

1. What is the purpose and function of


(a) The field of study of statistics?
(b) Descriptive statistics?
(c) Inferential statistics?
2. (a) Are descriptive or inferential statistics more important today?
(b) What is the importance of a representative sample in statistical inference?
3. (a) To which field of study is statistical analysis important?
(b) What are the most important functions of descriptive statistics?
(c) What is the most important function of inferential statistics?
4. Identify five methods of sampling, discuss how each is done and give its advantages and
disadvantages.

10 Frequency distributions
It is often useful to organize or arrange a body of data into a frequency distribution. This
breaks up the data into groups or classes and shows the number of observations in each class.
A relative frequency distribution is obtained by dividing the number of observations in each
class by the total number of observations in the data as a whole. The sum of the relative
frequencies equals 1.
Example 10.1. Forty students in a class sat for a mathematics quiz marked out of 50. The
grades the students scored are given below.

17 25 36 12 28 17 16 37 13 39 13 35 26 37 29 28 22 34 27 39
10 14 15 15 24 36 17 44 18 42 14 16 17 48 13 46 17 29 10 35

Construct a frequency distribution table showing class intervals, class midpoints, frequencies
and cumulative frequencies for the grades, using a class interval of 5 starting with 10-14.
SOLUTION. Consider the frequency distribution table below,
where cf stands for cumulative frequency.

Class    Tally          Frequency (f)   Midpoint (x)   Relative frequency (F)   cf

10-14    ||||| |||      8               12             0.2                      8
15-19    ||||| |||||    10              17             0.25                     18
20-24    ||             2               22             0.05                     20
25-29    ||||| ||       7               27             0.175                    27
30-34    |              1               32             0.025                    28
35-39    ||||| |||      8               37             0.2                      36
40-44    ||             2               42             0.05                     38
45-49    ||             2               47             0.05                     40
                        Σf = 40                        ΣF = 1

11 Histogram
A histogram is a bar graph of a frequency distribution, where classes are measured along the
horizontal axis and frequencies along the vertical axis.
Example 11.1. Forty students in a class sat for a mathematics quiz marked out of 50. The
grades the students scored are given below.

17 25 36 12 28 17 16 37 13 39 13 35 26 37 29 28 22 34 27 39
10 14 15 15 24 36 17 44 18 42 14 16 17 48 13 46 17 29 10 35

Present the data in the form of a histogram.
SOLUTION. Consider the frequency distribution table below.

Class    Tally          Frequency (f)   Class boundary

10-14    ||||| |||      8               9.5-14.5
15-19    ||||| |||||    10              14.5-19.5
20-24    ||             2               19.5-24.5
25-29    ||||| ||       7               24.5-29.5
30-34    |              1               29.5-34.5
35-39    ||||| |||      8               34.5-39.5
40-44    ||             2               39.5-44.5
45-49    ||             2               44.5-49.5
                        Σf = 40
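
The midpoints, cumulative frequencies and class boundaries in the two tables can be derived from the class limits and frequencies alone. The following Python sketch shows one way to do so. It assumes, as in the tables, that the boundary lies half way between the upper limit of one class and the lower limit of the next.

```python
from itertools import accumulate

classes = [(10, 14), (15, 19), (20, 24), (25, 29),
           (30, 34), (35, 39), (40, 44), (45, 49)]
freqs = [8, 10, 2, 7, 1, 8, 2, 2]

gap = classes[1][0] - classes[0][1]      # 15 - 14 = 1 between adjacent classes
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in classes]
midpoints = [(lo + hi) / 2 for lo, hi in classes]
cum_freqs = list(accumulate(freqs))      # running total of the frequencies

for (lo, hi), f, x, cf, (lb, ub) in zip(classes, freqs, midpoints, cum_freqs, boundaries):
    print(f"{lo}-{hi}  f={f:<3} x={x:<5} cf={cf:<3} boundary={lb}-{ub}")
```

The bars of the histogram are drawn between the class boundaries, so that adjacent bars touch.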

PROBLEM 11.1 (Activity). Present the data given in the table below in a histogram:

Marks obtained 10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89
Number of students 4 10 16 22 26 18 8 2


Figure 1: Histogram

12 Frequency polygon
A frequency polygon is a line graph of a frequency distribution obtained by joining the
points that plot the frequency of each class at the class midpoint.
It is a graphical representation used to depict the shape of the data and the trends that
the data set follows. It is usually drawn with the help of a histogram but
can be drawn without one as well.

12.1 Steps to draw a frequency polygon


(i) Mark the class intervals for each class on the horizontal axis. We will plot the frequency
on the vertical axis.

(ii) Calculate the class mark (or midpoint) for each class interval. The formula for the class
mark is:

\[ \text{Class mark} = \frac{\text{Upper limit} + \text{Lower limit}}{2} \]

(iii) Mark all the class marks on the horizontal axis. It is also known as the mid-value of
every class.

(iv) Corresponding to each class mark, plot the frequency as given to you. The height always
depicts the frequency. Make sure that the frequency is plotted against the class mark
and not the upper or lower limit of any class.

(v) Join all the plotted points using a line segment. The curve obtained will be kinked.

(vi) This resulting curve is called the frequency polygon.

Example 12.1. One hundred college students sat for a mid semester examination marked out
of 100%, and Table 1 below shows the results they scored.

Test score       50-59   60-69   70-79   80-89   90-99

Frequency (f)    5       10      30      40      15

Table 1: Scores obtained by college students in a mid semester examination

Construct a frequency polygon using the data given in Table 1 above.

SOLUTION. We first need to calculate the class mark and class boundary from the test
scores given.

Test score   Frequency (f)   Class mark (x)   Class boundary

50-59        5               54.5             49.5-59.5
60-69        10              64.5             59.5-69.5
70-79        30              74.5             69.5-79.5
80-89        40              84.5             79.5-89.5
90-99        15              94.5             89.5-99.5
             Σf = 100

Figure 2: Frequency polygon

PROBLEM 12.1 (Activity). Make a frequency polygon and histogram using the given data:

Marks obtained 10-19 20-29 30-39 40-49 50-59 60-69


Number of students 5 12 15 22 14 4


13 Cumulative frequency distribution curve or ogive


A cumulative frequency distribution shows, for each class, the total number of observations
in all classes up to and including that class. When plotted, this gives a distribution curve, or
ogive.

Example 13.1. One hundred college students sat for a mid semester examination marked out
of 100%, and Table 2 below shows the results they scored.

Test score       50-59   60-69   70-79   80-89   90-99

Frequency (f)    5       10      30      40      15

Table 2: Scores obtained by college students in a mid semester examination

Construct a cumulative frequency curve (ogive) using the data given in Table 2 above.

SOLUTION. We first need to calculate the cumulative frequency from the frequencies given.

Class    Frequency (f)   Cumulative frequency   Class boundary

50-59    5               5                      49.5-59.5
60-69    10              15                     59.5-69.5
70-79    30              45                     69.5-79.5
80-89    40              85                     79.5-89.5
90-99    15              100                    89.5-99.5
         Σf = 100

Figure 3: Cumulative frequency curve
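
The points through which the ogive is drawn can be listed with a short Python sketch. Each cumulative frequency is paired with the upper boundary of its class. Starting the curve at the lower boundary of the first class with a cumulative frequency of 0 is an assumption of this sketch, since the notes do not state where the curve begins.

```python
from itertools import accumulate

upper_boundaries = [59.5, 69.5, 79.5, 89.5, 99.5]
freqs = [5, 10, 30, 40, 15]

# Assumed starting point: lower boundary of the first class, cf = 0.
xs = [49.5] + upper_boundaries
ys = [0] + list(accumulate(freqs))

ogive_points = list(zip(xs, ys))
print(ogive_points)
```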


PROBLEM 13.1 (Activity). Attempt the following questions

1. The marks obtained by 100 college students in an examination are given below

Exam marks 0-4 5-9 10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49
Frequency (f) 2 5 6 8 10 25 20 18 4 2

Construct a cumulative frequency curve (ogive).

2. Construct histogram, frequency polygon and frequency curve from the following data:

Marks obtained 0-9 10-19 20-29 30-39 40-49 50-59 60-69 70-79
Number of students 10 16 20 20 22 15 8 5

14 Statistical averages
14.1 Introduction
Measures of central tendency enable us to make a statistical summary of a large body of
organized data by condensing it into a single value. One such measure of central tendency
in statistics is the arithmetic mean.
For example, when reading a newspaper in the morning, you may have noticed the daily
temperature report. The temperature varies throughout the day, yet a single temperature is
used to indicate the condition for the entire day. Similarly, when you get your scorecard in
exams, your performance is judged not on the percentage in each subject but on the
aggregate percentage.
Indicating a single value for a large amount of data makes it
easier to study and analyze the data and to deduce important information from it.
Let us discuss the arithmetic mean and examples in detail.

14.2 Arithmetic mean


The most common measure of central tendency is the arithmetic mean. In layman’s terms,
the mean of data indicates an average of the given collection of data. It is equal to the sum
of all the values in the group of data divided by the total number of values.
For n values in a set of data, namely $x_1, x_2, x_3, \ldots, x_n$, the mean of the data is given as:

\[ \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} \]

It can also be denoted as:

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

For calculating the mean when the frequencies of the observations are given, such that
$x_1, x_2, x_3, \ldots, x_n$ are the recorded observations and $f_1, f_2, f_3, \ldots, f_n$ are the
respective frequencies of the observations, then

\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_n x_n}{f_1 + f_2 + f_3 + \cdots + f_n} \]

This can be expressed briefly as:

\[ \bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}. \]

The above method of calculating the arithmetic mean is used when the data is ungrouped in
nature.

Example 14.1. Avocados picked from a tree were weighed and found to have the following
masses:
400g, 200g, 350g, 320g, 200g
Find the mean mass of the avocados collected on this day.

SOLUTION.
\[
\begin{aligned}
\bar{x} &= \frac{\sum x_i}{n} \\
        &= \frac{400 + 200 + 350 + 320 + 200}{5} \\
        &= \frac{1470}{5} \\
        &= 294\,\text{g}.
\end{aligned}
\]
Example 14.2. The following are the masses of sweets in Joze's bag.

Mass     10   15   18   21   25
Number   2    4    9    5    6

Find the mean mass.

SOLUTION. Consider the frequency distribution table below.

Mass (x)   Number (f)   fx

10         2            20
15         4            60
18         9            162
21         5            105
25         6            150
           Σf = 26      Σfx = 497

\[
\begin{aligned}
\bar{x} &= \frac{\sum f x}{\sum f} \\
        &= \frac{497}{26} \\
        &= 19.115.
\end{aligned}
\]
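
Both calculations can be reproduced with a minimal Python sketch. The helper names `mean` and `frequency_mean` are chosen for this illustration and are not part of the notes.

```python
def mean(values):
    """Arithmetic mean of raw data: sum of the values divided by their number."""
    return sum(values) / len(values)

def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

avocados = [400, 200, 350, 320, 200]
avocado_mean = mean(avocados)                            # Example 14.1

masses = [10, 15, 18, 21, 25]
counts = [2, 4, 9, 5, 6]
sweets_mean = round(frequency_mean(masses, counts), 3)   # Example 14.2

print(avocado_mean, sweets_mean)
```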

Properties:

(1) The sum of the deviations about the mean is zero (0):

\[ \sum_{i=1}^{n} (x_i - \bar{x}) = 0. \]

E.g. for $\bar{x} = 294$,

\[
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x}) &= (400 - 294) + (200 - 294) + (350 - 294) + (320 - 294) + (200 - 294) \\
                               &= 0.
\end{aligned}
\]

(2) The sum of the squares of the deviations about the mean is always a minimum, i.e. as
compared to the sum of squared deviations about other values of central tendency:

\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \text{minimum value} \]

\[
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= (400 - 294)^2 + (200 - 294)^2 + (350 - 294)^2 + (320 - 294)^2 + (200 - 294)^2 \\
                                 &= 11{,}236 + 8{,}836 + 3{,}136 + 676 + 8{,}836 \\
                                 &= 32{,}720.
\end{aligned}
\]
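
As a quick check of both properties, the following Python sketch recomputes the sums for the avocado data. It also compares the sum of squared deviations about the mean with the sum about the median (320). That comparison is an addition made for illustration and does not appear in the notes.

```python
from statistics import mean, median

data = [400, 200, 350, 320, 200]

def sum_sq_dev(values, centre):
    """Sum of squared deviations of the values about a chosen centre."""
    return sum((x - centre) ** 2 for x in values)

m = mean(data)                                    # 294
dev_sum = sum(x - m for x in data)                # property (1)
ss_mean = sum_sq_dev(data, m)                     # property (2)
ss_median = sum_sq_dev(data, median(data))        # about the median, for comparison

print(dev_sum, ss_mean, ss_median)
```

The sum about the median is larger than the sum about the mean, which agrees with property (2).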
Example 14.3. To obtain Grade A, Ben must achieve an average of at least 70 in five tests.
If his average mark for the first four tests is 86, what is the lowest mark he can get in his fifth test
and still obtain Grade A?
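
The notes give no solution for this example. One possible working is sketched below, assuming that 86 is the average of the first four tests and writing $x$ (a symbol introduced here) for the mark in the fifth test:

```latex
\[
\frac{4(86) + x}{5} \ge 70
\;\Longrightarrow\; 344 + x \ge 350
\;\Longrightarrow\; x \ge 6 .
\]
```

Under these assumptions the lowest mark Ben can get in the fifth test is 6.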

14.3 Median
This is the middle value in an ordered sequence of data. If there are no ties, half of the
observations will be smaller than the median, and the other half will be bigger. Unlike the
mean, the median is not affected by extreme observations in the data set. Hence, where
extreme observations exist, it is better to use the median instead of the mean.
To calculate the median of raw data, we first arrange the data in an ordered array so that
we obtain the value in the middle. The median is in the $\left(\frac{n+1}{2}\right)^{\text{th}}$
position. Where the number of observations is even, the median is the average of the two
middle values.
Example 14.4. Find the median given

(i) 400, 200, 350, 320, 200

(ii) 12, 18, 10, 5, 21, 16

(iii) 5, 8, 4, 4, 2, 9

SOLUTION. (i) Arranging the data in ascending order, we obtain 200, 200, 320, 350, 400.
Since n = 5, the median is in position

\[ \left(\frac{n+1}{2}\right)^{\text{th}} = \left(\frac{5+1}{2}\right)^{\text{th}} = \left(\frac{6}{2}\right)^{\text{th}} = 3^{\text{rd}} \]

Thus median = 320.

(ii) Arranging the data in ascending order, we obtain 5, 10, 12, 16, 18, 21. Since n is even, we
take the average of the two middle values:

\[ \text{Median} = \frac{12 + 16}{2} = \frac{28}{2} = 14 \]

(iii) Arranging the data in ascending order, we obtain 2, 4, 4, 5, 8, 9. Since n is even, we take
the average of the two middle values:

\[ \text{Median} = \frac{4 + 5}{2} = \frac{9}{2} = 4.5 \]
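
The rule used in this example can be written as a small Python function. This is a sketch for illustration, and the function name `median_of` is an assumption. Python's standard `statistics.median` gives the same results.

```python
def median_of(values):
    """Median of raw data: middle value, or mean of the two middle values."""
    ordered = sorted(values)            # ordered array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # the ((n + 1) / 2)-th value
    return (ordered[mid - 1] + ordered[mid]) / 2

medians = [
    median_of([400, 200, 350, 320, 200]),
    median_of([12, 18, 10, 5, 21, 16]),
    median_of([5, 8, 4, 4, 2, 9]),
]
print(medians)
```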

14.4 Mode
The mode is another measure of central tendency; it is the value which appears
most frequently. The mode is also not affected by extreme values. However, it is used only for
descriptive purposes because it is more variable from sample to sample than other
measures of central tendency.
NOTE. Some data may have more than one mode.

Example 14.5.
16, 12, 18, 3, 12, 18, 9

The modes are 12 and 18. Such data is called bimodal.
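
A data set with more than one mode can be handled by returning every value that attains the highest frequency. The sketch below is an illustration, and the function name `modes` is an assumption.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

result = modes([16, 12, 18, 3, 12, 18, 9])
print(result)
```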
NOTE. The ogive is drawn using the cumulative frequency on the y-axis and the upper
class boundaries on the x-axis. The points are then joined with a freehand curve. It can be
used to estimate the median, quartiles, deciles and percentiles.

Example 14.6. Construct an ogive using the following set of data.

Class    Frequency

10-19    2
20-29    8
30-39    3
40-49    6
50-59    2

Use the curve to estimate the following:

(i) Median

(ii) First quartile

(iii) 60th percentile

(iv) 8th decile.

SOLUTION. Consider the frequency distribution table below.

Class    Frequency   Cumulative frequency   Class boundary

10-19    2           2                      9.5-19.5
20-29    8           10                     19.5-29.5
30-39    3           13                     29.5-39.5
40-49    6           19                     39.5-49.5
50-59    2           21                     49.5-59.5
         Σf = 21

(i) Median position $= \left(\frac{N}{2}\right)^{\text{th}} = \left(\frac{21}{2}\right)^{\text{th}} = 10.5^{\text{th}}$ position.
Thus from Figure 4, Median = 31.5.

(ii) 1st quartile position $= \frac{1}{4} \times 21 = 5.25$.
Thus from Figure 4, 1st quartile = 21.5.

(iii) 60th percentile position $= \frac{60}{100} \times 21 = 12.6$.
Thus from Figure 4, 60th percentile = 41.5.

(iv) 8th decile position $= \frac{8}{10} \times 21 = 16.8$.
Thus from Figure 4, 8th decile = 45.5.

14.5 Skewness
Skewness shows how the data is distributed, i.e. whether it is symmetric or not. If the
data is not symmetric, it is said to be skewed or asymmetric.
For data which is symmetric, the mean, mode and median are equal, and the data is said
to have zero skew, as is the case for normally distributed data.
If the mean is bigger than the median, the data is said to be positively skewed or right skewed.
If the median is bigger than the mean, the data is said to be negatively skewed or left skewed.

Figure 4: Ogive curve

Figure 5: Normally distributed data

Figure 6: Right skewed data

Figure 7: Left skewed data

Example 14.7. The following are marks obtained by 34 primary six students in a mid term
mathematics examination marked out of 40.

Class       5-9   10-14   15-19   20-24   25-29   30-34

Frequency   2     14      8       3       5       2

Find

(i) Mode

(ii) Mean

(iii) Median

(iv) What kind of skewness does the data exhibit?

(v) Draw the skewness curve.


SOLUTION. Consider the frequency distribution table below

Class f Cf x fx Class boundary

5-9 2 2 7 14 4.5-9.5
10-14 14 16 12 168 9.5-14.5
15-19 8 24 17 136 14.5-19.5
20-24 3 27 22 66 19.5-24.5
25-29 5 32 27 135 24.5-29.5
30-34 2 34 32 64 29.5-34.5
P P
f = 34 f x = 583

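(i)
\[
\begin{aligned}
\text{Mode} &= L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times c \\
&= 9.5 + \left(\frac{14 - 2}{(14 - 2) + (14 - 8)}\right) \times 5 \\
&= 9.5 + \frac{12}{18} \times 5 \\
&= 12.83.
\end{aligned}
\]

(ii)
\[
\begin{aligned}
\text{Mean } \bar{x} &= \frac{\sum fx}{\sum f} \\
&= \frac{583}{34} \\
&= 17.147.
\end{aligned}
\]

(iii)
\[
\begin{aligned}
\text{Median} &= L + \left(\frac{\frac{N}{2} - f_b}{f}\right) \times c \\
&= 14.5 + \left(\frac{17 - 16}{8}\right) \times 5 \\
&= 14.5 + 0.625 \\
&= 15.125.
\end{aligned}
\]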

(iv) Since the mean is bigger than the median, the data is said to be positively skewed or right
skewed.

(v) The skewness curve is shown in Figure 8.

PROBLEM 14.1 (Activity). Attempt the following questions


Figure 8: Figure showing a right skewed data

1. The times, to the nearest minute, taken by a group of 120 students to write a particular
essay were recorded and are grouped in the table below.

Time (minutes)        40-44    45-49    50-54    55-59    60-64
Number of students    8        22       34       30       26

Construct the cumulative frequency table for this distribution and draw the cumulative
frequency curve.
Use your curve to estimate

(a) the interquartile range of the times.


(b) the percentage of these students who spent over 62 minutes writing the essay.

2. The table below shows the frequency distribution of the masses of 52 women students at a
college. Measurements have been recorded to the nearest kilogram.

Mass (kg) 40-44 45-49 50-54 55-59 60-64 65-69 70-74


Frequency 3 2 7 18 18 3 1

(a) Construct the cumulative frequency table for this distribution and draw the cumulative
frequency curve.
(b) How many students weighed less than 57kg?
(c) How many students weighed more than 61kg?
(d) 20% of the students were heavier than x kg.
Find the value of x.
(e) Estimate the median.
(f) Estimate the interquartile range.

15 Measures of central tendency


A data set consisting of the observations for some variable is referred to as raw data or
ungrouped data. Data presented in the form of a frequency distribution are called grouped
data. The measures of central tendency discussed in this chapter will be described for both
grouped and ungrouped data since both forms of data occur frequently.


There are many different measures of central tendency. The three most widely used measures
of central tendency are the mean, median, and mode. These measures are defined for both
samples and populations.

15.1 Mean, median and mode for discrete data


15.1.1 Mean

The mean for a sample consisting of n observations is


\[ \bar{x} = \frac{\sum x}{n} \]
and the mean for a population consisting of N observations is
\[ \mu = \frac{\sum x}{N} \]
Example 15.1. At a certain study center of Kyambogo University, students sat an online
mathematics quiz marked out of 100%. Below are the test scores of 30 randomly chosen
students.

25 46 34 45 37 36 40 30 29 37 44 56 50 47 23
40 30 27 38 47 58 22 29 56 40 46 38 19 49 50

Find the mean score

SOLUTION.
\[ \bar{x} = \frac{\sum x}{n} = \frac{1168}{30} = 38.9. \]

15.1.2 Median

The median of a set of data is a value that divides the bottom 50% of the data from the top
50% of the data.
To find the median of a data set, first arrange the data in increasing order.
If the number of observations is odd, the median is the number in the middle of the ordered
list.
If the number of observations is even, the median is the mean of the two values closest to the
middle of the ordered list.
There is no widely used symbol to represent the median.

Example 15.2. At a certain study center of Kyambogo University, students sat an online
mathematics quiz marked out of 100%. The test scores of 30 randomly chosen students are
given below.
Find the median score.


25 46 34 45 37 36 40 30 29 37 44 56 50 47 23
40 30 27 38 47 58 22 29 56 40 46 38 19 49 50

SOLUTION. To find the median, first arrange the data in increasing order:

19 22 23 25 27 29 29 30 30 34 36 37 37 38 38
40 40 40 44 45 46 46 47 47 49 50 50 56 56 58

There are 30 observations, so the two values closest to the middle are the 15th and 16th,
namely 38 and 40. The median is the mean of these two values.
Thus,
\[ \text{Median} = \frac{38 + 40}{2} = \frac{78}{2} = 39. \]

15.1.3 Mode

The mode is the value in a data set that occurs the most often. If no such value exists, we
say that the data set has no mode. If two such values exist, we say the data set is bimodal.
If three such values exist, we say the data set is trimodal. There is no symbol that is used to
represent the mode.

Example 15.3. At a certain study center of Kyambogo University, students sat an online
mathematics quiz marked out of 100%. Below are the test scores of 30 randomly chosen
students.

25 46 34 45 37 36 40 30 29 37 44 56 50 47 23
40 30 27 38 47 58 22 29 56 40 46 38 19 49 50

Find the mode

SOLUTION. When the data are examined, it is seen that 40 occurs three times, and that no
other value occurs that often. The mode is equal to 40.
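
The three averages for the 30 quiz scores can also be checked by machine. The following is a minimal sketch, assuming Python 3.8 or later and its standard `statistics` module; the variable names are illustrative only and are not part of the course notes.

```python
from statistics import mean, median, multimode

# Test scores from Examples 15.1 to 15.3
scores = [
    25, 46, 34, 45, 37, 36, 40, 30, 29, 37, 44, 56, 50, 47, 23,
    40, 30, 27, 38, 47, 58, 22, 29, 56, 40, 46, 38, 19, 49, 50,
]

mean_score = mean(scores)        # sum of the scores divided by n
median_score = median(scores)    # mean of the two middle values when n is even
modes = multimode(scores)        # list of the most frequent value(s)

print(round(mean_score, 1), median_score, modes)
```

`multimode` returns a list, so a bimodal or trimodal data set would show two or three values rather than one.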

15.1.4 Geometric interpretation

For a large data set, as the number of classes is increased (and the width of the classes is
decreased), the histogram becomes a smooth curve. Oftentimes, the smooth curve assumes a
shape like that shown in Figure 9.
In this case, the data set is said to have a bell-shaped distribution or a mound-shaped distri-
bution. For such a distribution, the mean, median, and mode are equal and they are located
at the center of the curve.
For a data set having a skewed to the right distribution, the mode is usually less than the
median which is usually less than the mean. For a data set having a skewed to the left
distribution, the mean is usually less than the median which is usually less than the mode.

Figure 9: Bell-shaped distribution

Example 15.4. Find the mean, median, and mode for the following three data sets and
confirm the statements in the paragraph above.

Data set 1: 10, 12, 15, 15, 18, 20


Data set 2: 2, 4, 6, 15, 15, 18
Data set 3: 12, 15, 15, 24, 26, 28

SOLUTION. Table 3 gives the shape of the distribution, the mean, the median, and the mode
for the three data sets.

Data set    Mean                                            Median    Mode    Distribution shape

1           $\bar{x} = \frac{10+12+15+15+18+20}{6} = 15$    15        15      Bell-shaped
2           $\bar{x} = \frac{2+4+6+15+15+18}{6} = 10$       10.5      15      Left-skewed
3           $\bar{x} = \frac{12+15+15+24+26+28}{6} = 20$    19.5      15      Right-skewed

Table 3: Shape of distribution for three data sets
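
The comparison between mean and median used to label each data set can be sketched in a few lines of Python. This is only an illustration of the rule of thumb stated in this section; the function name `shape` is hypothetical, and comparing mean with median is a rough guide rather than a formal test of skewness.

```python
from statistics import mean, median

def shape(data):
    """Label a data set by comparing its mean with its median."""
    m, med = mean(data), median(data)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "symmetric"

data_sets = {
    1: [10, 12, 15, 15, 18, 20],
    2: [2, 4, 6, 15, 15, 18],
    3: [12, 15, 15, 24, 26, 28],
}
shapes = {k: shape(v) for k, v in data_sets.items()}
print(shapes)
```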


PROBLEM 15.1 (Activity). Attempt the following questions

1. Cherop takes four tests and scores the following marks

65, 72, 58, 77

(a) What are his median and mean scores?


(b) If he scores 70 in his next test, does his mean score increase or decrease?
Find his new mean score
(c) Which has increased most, his mean score or his median score?

2. The children in a class state how many children there are in their family.
The numbers they state are given below

1 2 1 3 2 1 2 4 2 2 1 3 1 2
2 2 1 1 7 3 1 2 1 2 2 1 2 3

(a) Find the mean, median and mode for this data
(b) Which is the most sensible average to use in this case?


16 Measures of dispersion
In addition to measures of central tendency, it is desirable to have numerical values to describe
the spread or dispersion of a data set. Measures that describe the spread of a data set are
called measures of dispersion.

16.1 Range, variance, and standard deviation for ungrouped data


16.1.1 Range

The range for a data set is equal to the maximum value in the data set minus the minimum
value in the data set. It is clear that the range is reflective of the spread in the data set since
the difference between the largest and the smallest value is directly related to the spread in
the data.

16.1.2 Variance

The variance and the standard deviation of a data set measure the spread of the data about
the mean of the data set.
The variance of a sample of size n is represented by $s^2$ and is given by
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
and the variance of a population of size N is represented by $\sigma^2$ and is given by
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]

16.1.3 Standard deviation

The square root of the variance is called the standard deviation and the standard deviation
is measured in the same units as the variable.
The sample standard deviation is
\[ s = \sqrt{s^2} \]
and the population standard deviation is
\[ \sigma = \sqrt{\sigma^2} \]

The shortcut formulas for computing sample and population variances are
\[ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} \]
and
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} \]
respectively.

Example 16.1. The heights of class 5 students of a certain primary school, measured in
centimeters, are 100, 102, 118, 124, 126. Find the standard deviation.

SOLUTION. Consider the table of values below

x      $x^2$
100    10000
102    10404
118    13924
124    15376
126    15876
$\sum x = 570$    $\sum x^2 = 65580$

Table 4: Values of x and $x^2$

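\[
\begin{aligned}
s &= \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \\
&= \sqrt{\frac{65580 - \frac{(570)^2}{5}}{5 - 1}} = \sqrt{\frac{65580 - 64980}{4}} = \sqrt{\frac{600}{4}} \\
&= \sqrt{150} \\
&= 12.25\ \text{cm}
\end{aligned}
\]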
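
The shortcut formula used in this example can be checked numerically. The sketch below, which assumes only Python's standard library, recomputes the sums by hand and then compares the result with `statistics.stdev`, which also uses the n − 1 divisor.

```python
from math import sqrt
from statistics import stdev

heights = [100, 102, 118, 124, 126]  # heights in cm from Example 16.1
n = len(heights)

sum_x = sum(heights)                   # sum of x
sum_x2 = sum(x * x for x in heights)   # sum of x squared

# Shortcut formula for the sample variance
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = sqrt(s2)

print(s2, round(s, 2), round(stdev(heights), 2))
```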

PROBLEM 16.1 (activity). Attempt the following questions

1. Eight people work in a shop. They are paid hourly rates of

£2, £15, £5, £4, £3, £4, £3, £3.

(a) Find

(i) the mean (ii) the median (iii) the mode

(b) Which average would you use if you wanted to claim that the staff were

(i) well paid (ii) badly paid

(c) What is the range?


2. A farmer buys 10 packets of seeds from each of two different companies. Each packet
contains 20 seeds and he records the number of plants which grow from each packet.

Company A 20 5 20 20 20 6 20 20 20 8
Company B 17 18 15 16 18 18 17 15 17 18

(a) Find the mean, median, mode, variance and standard deviation for each company’s
seeds
(b) Which company does the mode suggest is best?
(c) Which company does the mean suggest is best?
(d) Find the range for each company’s seeds.

17 Measures of central tendency and dispersion for grouped data
Statistical data are often given in grouped form, i.e., in the form of a frequency distribution,
and the raw data corresponding to the grouped data are not available or may be difficult to
obtain.
Articles that appear in newspapers and professional journals often do not give the raw data,
but give the results in grouped form.
For grouped data, the mid value (class mark) of each class is taken as representative of the
values within that class.

17.1 Mean
The mean for grouped data is given by
\[ \bar{x} = \frac{\sum fx}{n} \]
where x represents the class marks, f represents the class frequencies, and $n = \sum f$.

Example 17.1. The frequency distributions of seed yield of 50 groundnut plants are given
below.

Seed yield in g (x) 3 4 5 6 7


Frequency (f ) 4 6 15 15 10

Table 5: Frequency distribution table showing seed yield for 50 groundnut plants

Find the mean weight

SOLUTION. Consider frequency distribution table below

Seed yield in g (x) f fx


3 4 12
4 6 24
5 15 75
6 15 90
7 10 70
$\sum f = 50$    $\sum fx = 271$

Table 6: Frequency distribution table

Thus
\[ \text{Mean weight} = \frac{\sum fx}{n} = \frac{271}{50} = 5.42\ \text{g} \]
Example 17.2. The following table gives the frequency distribution of marks scored by 80
students at a certain Primary Teachers College in a statistics test marked out of 30.

Marks 10-12 13-15 16-18 19-21 22-24 25-27 28-30


f 4 12 20 14 16 9 5

Table 7: Shows marks scored by 80 students in a statistics test marked out of 30.

Calculate the mean score.

SOLUTION. Consider frequency distribution table below

Class f x fx
10-12 4 11 44
13-15 12 14 168
16-18 20 17 340
19-21 14 20 280
22-24 16 23 368
25-27 9 26 234
28-30 5 29 145
$\sum f = 80$    $\sum fx = 1579$

Table 8: Frequency distribution table

Thus
\[
\begin{aligned}
\text{Mean score} &= \frac{\sum fx}{\sum f} \\
&= \frac{1579}{80} \\
&= 19.7375 \\
&\approx 19.7.
\end{aligned}
\]
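
The grouped-mean calculation follows the same pattern for any frequency table: find each class mark, multiply by the frequency, and divide the total by n. A minimal Python sketch for the table in Example 17.2 is given below; the variable names are illustrative.

```python
# Class limits and frequencies from Example 17.2
classes = [(10, 12), (13, 15), (16, 18), (19, 21), (22, 24), (25, 27), (28, 30)]
freqs = [4, 12, 20, 14, 16, 9, 5]

# Class mark = midpoint of the lower and upper class limits
marks = [(lo + hi) / 2 for lo, hi in classes]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, marks))
grouped_mean = sum_fx / n

print(n, sum_fx, grouped_mean)
```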

17.2 Median
The median for grouped data is found by locating the value that divides the data into two
equal parts. In finding the median for grouped data, it is assumed that the data in each class
are uniformly spread across the class.
\[ \text{Median} = L_1 + \left(\frac{\frac{N}{2} - f_b}{f}\right) \times C, \]
where $L_1$ is the lower boundary of the median class, N is the total frequency, $f_b$ is the
cumulative frequency before the median class, f is the frequency of the median class, and C is
the class width or interval.

17.3 Mode
The modal class is defined to be the class with the maximum frequency. A rough estimate of
the mode for grouped data is the class mark of the modal class; a more refined estimate
interpolates within the modal class:
\[ \text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times C, \]
where $L_1$ is the lower boundary of the modal class, C is the class interval/width, $\Delta_1$ is the
difference between the modal frequency and the frequency of the class before the modal class
and $\Delta_2$ is the difference between the modal frequency and the frequency of the class after the
modal class.
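
The two interpolation formulas for the median and the mode can be written as short functions. The sketch below is an illustration only: the function names are hypothetical, equal class widths are assumed, and the data are the class boundaries and frequencies of Example 14.7, so the results can be compared with that worked solution.

```python
def grouped_median(lower_bounds, freqs, width):
    """Interpolated median: L1 + ((N/2 - fb) / f) * C."""
    n = sum(freqs)
    cum = 0  # cumulative frequency before the current class (fb)
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= n / 2:
            return lower + (n / 2 - cum) / f * width
        cum += f

def grouped_mode(lower_bounds, freqs, width):
    """Interpolated mode: L1 + (d1 / (d1 + d2)) * C."""
    i = freqs.index(max(freqs))  # first class with the highest frequency
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = freqs[i] - before, freqs[i] - after
    return lower_bounds[i] + d1 / (d1 + d2) * width

lower_bounds = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
freqs = [2, 14, 8, 3, 5, 2]

median_est = grouped_median(lower_bounds, freqs, 5)
mode_est = grouped_mode(lower_bounds, freqs, 5)
print(median_est, round(mode_est, 2))
```

If two classes tie for the highest frequency, `grouped_mode` uses the first one; a bimodal table would need to be handled separately.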

17.4 Variance and standard deviation


The variance for grouped data is given by
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}. \]
Alternatively, we can write
\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} \]
and the standard deviation is given by
\[ s = \sqrt{\text{Variance}} \]

Example 17.3. The following are marks scored by 24 students in a mathematics examination

20, 19, 23, 28, 42, 51, 44, 48, 76, 58, 64, 90, 87, 59, 36, 32, 45, 83, 15, 76, 66, 53, 57, 91

(i) Starting with 10 − 19 and using equal groups of interval, 10, make a frequency table

(ii) Using the table, calculate the mean

(iii) Mode and median

(iv) Variance and hence standard deviation.

SOLUTION. (i) Frequency table

Class    f    x       fx       Cf    x − x̄      (x − x̄)²    f(x − x̄)²

10-19    2    14.5    29.0     2     -37.917    1,437.67    2,875.35
20-29    3    24.5    73.5     5     -27.917    779.34      2,338.02
30-39    2    34.5    69.0     7     -17.917    321.01      642.01
40-49    4    44.5    178.0    11    -7.917     62.67       250.69
50-59    5    54.5    272.5    16    2.083      4.34        21.70
60-69    2    64.5    129.0    18    12.083     146.01      292.01
70-79    2    74.5    149.0    20    22.083     487.67      975.35
80-89    2    84.5    169.0    22    32.083     1,029.34    2,058.68
90-99    2    94.5    189.0    24    42.083     1,771.01    3,542.01

$\sum f = 24$    $\sum fx = 1,258$    $\sum f(x - \bar{x})^2 \approx 12,995.83$

(ii)
\[
\begin{aligned}
\text{Mean } \bar{x} &= \frac{\sum fx}{\sum f} \\
&= \frac{1,258}{24} \\
&= 52.417
\end{aligned}
\]

(iii) Modal class is 50 − 59, thus
\[
\begin{aligned}
\text{Mode} &= L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times C \\
&= 49.5 + \left(\frac{1}{1 + 3}\right) \times 10 \\
&= 52
\end{aligned}
\]
The median class is also 50 − 59, thus
\[
\begin{aligned}
\text{Median} &= L_1 + \left(\frac{\frac{N}{2} - f_b}{f}\right) \times C \\
&= 49.5 + \left(\frac{\frac{24}{2} - 11}{5}\right) \times 10 \\
&= 51.5
\end{aligned}
\]

(iv)
\[
\begin{aligned}
\text{Variance} &= \frac{\sum f(x - \bar{x})^2}{n - 1} \\
&= \frac{12,995.83}{24 - 1} \\
&= \frac{12,995.83}{23} \\
&= 565.04.
\end{aligned}
\]
\[
\begin{aligned}
\text{Standard deviation} &= \sqrt{\text{Variance}} \\
&= \sqrt{565.04} \\
&= 23.77.
\end{aligned}
\]
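
Part (i) of this example, sorting raw marks into classes of width 10 starting at 10-19, can be sketched in Python as follows. This is an illustration rather than the method used in the notes; it assumes integer marks and uses `collections.Counter` from the standard library.

```python
from collections import Counter

marks = [20, 19, 23, 28, 42, 51, 44, 48, 76, 58, 64, 90,
         87, 59, 36, 32, 45, 83, 15, 76, 66, 53, 57, 91]

width = 10   # class width
start = 10   # lower limit of the first class (10-19)

# Map each mark to the lower limit of its class, then count per class
counts = Counter(start + ((m - start) // width) * width for m in marks)

table = [(f"{lo}-{lo + width - 1}", counts[lo]) for lo in sorted(counts)]
for label, f in table:
    print(label, f)
```

Classes with no marks would not appear in `table`; here every class from 10-19 to 90-99 contains at least one mark.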
Example 17.4. The frequency distributions of seed yield of 50 groundnut plants are given
below.

Seed yield in g (x) 3 4 5 6 7


Frequency (f ) 4 6 15 15 10

Table 9: Frequency distribution table showing seed yield for 50 groundnut plants

Find the

(i) variance

(ii) standard deviation

SOLUTION. Consider frequency distribution table below

Seed yield in g (x)    f     fx    $fx^2$

3                      4     12    36
4                      6     24    96
5                      15    75    375
6                      15    90    540
7                      10    70    490
$\sum f = 50$    $\sum fx = 271$    $\sum fx^2 = 1,537$

Table 10: Frequency distribution table

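(i) Variance
\[
\begin{aligned}
s^2 &= \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} \\
&= \frac{1,537 - \frac{(271)^2}{50}}{50 - 1} \\
&= \frac{1,537 - 1,468.82}{49} \\
&= \frac{68.18}{49} \\
&= 1.391
\end{aligned}
\]

(ii) Standard deviation
\[
\begin{aligned}
s &= \sqrt{1.391} \\
&= 1.1794
\end{aligned}
\]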
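
The shortcut formula for the grouped variance lends itself to a quick numerical check. The sketch below uses the seed-yield table of this example; it is a minimal illustration in plain Python, with variable names chosen here for readability.

```python
from math import sqrt

x = [3, 4, 5, 6, 7]          # seed yield in g
f = [4, 6, 15, 15, 10]       # number of plants

n = sum(f)
sum_fx = sum(fi * xi for fi, xi in zip(f, x))
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))

# Shortcut formula with the n - 1 divisor (sample variance)
sample_var = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
sample_sd = sqrt(sample_var)

print(n, sum_fx, sum_fx2, round(sample_var, 3), round(sample_sd, 2))
```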
NOTE. Take note of the following;

(i) Population variance is given by
\[ \sigma^2 = \frac{\sum f(x - \bar{x})^2}{n} \]

(ii) Population standard deviation is given by
\[ \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n}} \]

(iii) Sample variance is given by
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{n\sigma^2}{n - 1} \]

(iv) Sample standard deviation is given by
\[ s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{n\sigma^2}{n - 1}} \]

(v) Coefficient of variation is given by
\[ \frac{s}{\bar{x}} \times 100\%. \]

The smaller the coefficient of variation, the less spread out the data are relative to the mean.
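
As a small illustration of the coefficient of variation, the sketch below applies it to the five heights of Example 16.1. The function name `coefficient_of_variation` is hypothetical; it simply divides the sample standard deviation by the mean and expresses the result as a percentage.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

heights = [100, 102, 118, 124, 126]  # heights in cm from Example 16.1
cv = coefficient_of_variation(heights)
print(round(cv, 2))
```

Because the result is a percentage with no units, it can be used to compare the spread of data sets measured on different scales.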

Example 17.5. The following are masses of sample food items from a store

Mass 10-19 20-29 30-39 40-49 50-59


Frequency 2 8 3 6 2

(i) Draw a histogram and use it to estimate the mode.

(ii) Draw a percentage histogram and superimpose a frequency polygon.

(iii) Calculate mean and variance.

(iv) Find coefficient of variation.

SOLUTION. Consider the frequency distribution table below

Class    f    %f      x       fx       x − x̄     (x − x̄)²    f(x − x̄)²    Class boundary

10-19    2    9.5     14.5    29.0     -19.05    362.90      725.8        9.5-19.5
20-29    8    38.1    24.5    196.0    -9.05     81.90       655.2        19.5-29.5
30-39    3    14.3    34.5    103.5    0.95      0.90        2.7          29.5-39.5
40-49    6    28.6    44.5    267.0    10.95     119.90      719.4        39.5-49.5
50-59    2    9.5     54.5    109.0    20.95     438.90      877.8        49.5-59.5
$\sum f = 21$    $\sum fx = 704.5$    $\sum f(x - \bar{x})^2 = 2,980.9$

(i) Using a histogram to obtain the mode

(ii) Percentage histogram together with a frequency polygon

(iii)
\[
\begin{aligned}
\text{Mean } \bar{x} &= \frac{\sum fx}{\sum f} \\
&= \frac{704.5}{21} \\
&= 33.55
\end{aligned}
\]

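\[
\begin{aligned}
\text{Variance of sample } s^2 &= \frac{\sum f(x - \bar{x})^2}{n - 1} \\
&= \frac{2,980.9}{21 - 1} \\
&= 149.045
\end{aligned}
\]
\[
\begin{aligned}
\text{Standard deviation of sample } s &= \sqrt{\text{Variance}} \\
&= \sqrt{149.045} \\
&= 12.21.
\end{aligned}
\]

(iv)
\[
\begin{aligned}
\text{Coefficient of variation} &= \frac{s}{\bar{x}} \times 100\% \\
&= \frac{12.21}{33.55} \times 100\% = 36.39\% \\
&\approx 36.4\%.
\end{aligned}
\]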

PROBLEM 17.1 (Activity ). Attempt the following questions

1. There are twenty pupils in class A and twenty pupils in class B. All the pupils in class A
were given an I.Q test. Their scores on the test are given below

100 104 106 107 109 110 113 114 116 117
118 119 119 121 124 125 127 127 130 134

(a) Construct a frequency distribution table for above data starting with class interval of
100-104.
(b) Calculate the mean score and the standard deviation for pupils in class A.
(c) Class B takes the same I.Q test. They obtain a mean of 110 and a standard deviation of
21. Compare the data for class A and class B.

2. The following are the scores in a test for a set of 15 students

5 4 8 7 3 6 5 9 6 10 7 8 6 4 2

(a) (i) Calculate the mean score


(ii) Calculate the standard deviation of the scores
(b) A set of 10 different students took the same test. Their scores are listed below

5 6 6 7 7 4 7 8 3 7

After making any necessary calculations for the second set, compare the two sets of
scores. Your answer should be understandable to someone who does not study statistics.


3. 40 boys sat a test which was marked out of 50. Their marks were

28 42 35 17 49 12 48 38 24 27 23 24 30 42 44 13 48 33 26 17
18 12 45 27 17 16 28 33 34 38 27 46 25 20 40 169 43 38 25 21

(a) Starting with a class interval of 10-14, construct a frequency distribution table for the
above data.
(b) Calculate
(i) the mean of the marks
(ii) the standard deviation of the marks
(c) 40 girls sat the same test. Their marks had a mean of 30 and a standard deviation of
6.5. Compare the performances of the boys and girls.

18 Distributions
Probability distributions are divided into two categories:

(i) Discrete probability distributions. These include the Binomial, Poisson and Hypergeometric
distributions.
(ii) Continuous probability distributions. These include the normal, uniform and exponential
distributions.

18.1 Random Variables


Suppose that to each point of a sample space we assign a number, we then have a function
defined on the sample space. This function is called a random variable. In general, a random
variable has some specified physical, geometrical, or other significance.
A random variable associates a numerical value with each outcome of an experiment. A
random variable is defined mathematically as a real-valued function defined on a sample
space, and is represented as a letter such as X or Y .
Example 18.1. Suppose that a coin is tossed twice so that the sample space is S = {HH, HT, TH, TT}.
Let X represent the number of heads that can come up. With each sample point we can
associate a number for X as shown in Table 11 below

Sample point HH HT TH TT
X 2 1 1 0

Table 11: Sample space for a coin tossed twice

Thus, for example, in the case of HH (i.e., 2 heads), X = 2 while for TH (i.e., 1 head), X = 1.
It follows that X is a random variable.
It should be noted that many other random variables could be defined on this sample space, for
example, the square of the number of heads or the number of heads minus the number of tails.

A random variable that takes on a finite or countably infinite number of values is called a
discrete random variable, while one which takes on an uncountably infinite number of values
is called a continuous random variable.

18.2 Discrete probability distribution


A discrete random variable is a variable that can assume only a countable number of values.
Many possible outcomes:

• number of complaints per day


• number of TV’s in a household
• number of rings before the phone is answered

Only two possible outcomes:

• gender: male or female


• defective: yes or no
Example 18.2. An experiment consists of observing 100 individuals who get a flu shot and
counting the number X who have a reaction. The variable X may assume 101 different values
from 0 to 100. Another experiment consists of counting the number of individuals W who get
a flu shot until an individual gets a flu shot and has a reaction. The variable W may assume
the values 1, 2, 3, . . . The variable W can assume a countably infinite number of values.

18.3 Continuous probability distribution


A continuous random variable is a variable that can assume any value on a continuum (can
assume an uncountable number of values).
A random variable is a continuous random variable if it is capable of assuming all the values
in an interval or in several intervals. Because of the limited accuracy of measuring devices,
no random variables are truly continuous.
Example 18.3. The following random variables are considered continuous random variables:
survival time of cancer patients, the time between release from prison and conviction for
another crime, the daily milk yield of a cow, weight loss during a dietary routine, and the
household incomes for single-parent households in a sociological study.

Further examples include

• thickness of an item
• time required to complete a task
• temperature of a solution
• height, in inches

These can potentially take on any value, depending only on the ability to measure accurately.
In this unit, we shall study Binomial distributions and normal distributions.

19 Binomial Distribution
Binomial distribution deals with the possible numbers of successes when there are n trials,
each of which may be a success (with probability p) or a failure (with probability q); p and q
are fixed positive numbers and
p + q = 1.
This distribution is denoted by B(n, p).
For B(n, p), the probability of r successes in n trials is found by the same argument as before.
Each success has probability p and each failure has probability q, so the probability of r
successes and (n − r) failures in a particular order is $p^r q^{n-r}$. The positions in the sequence of
n trials which the successes occupy can be chosen in ${}^nC_r$ ways. Therefore
\[ P(X = r) = {}^nC_r\, p^r q^{n-r} \quad \text{for } 0 \le r \le n. \]
This can also be written as
\[ P_r = \binom{n}{r} p^r (1 - p)^{n-r} \]
The successive probabilities for X = 0, 1, 2, . . . , n are the terms of the binomial expansion of $(p + q)^n$.
To state that X has the binomial distribution B(n, p) you can use the abbreviation
$X \sim B(n, p)$, where the symbol ∼ means 'has the distribution'.
It is often the case that you use a theoretical distribution, such as the binomial, to describe
a random variable that occurs in real life. This process is called modeling and it enables you
to carry out relevant calculations.
A binomial experiment has the following characteristics.

(i) There is a fixed number of trials, n, e.g. a coin is tossed 5 times or 8 questions are
attempted.

(ii) There are only two possible outcomes for each trial, e.g. success and failure, correct
and wrong, head and tail. The letter p denotes the probability of a success on one
trial, and q denotes the probability of a failure on one trial, so p + q = 1.

(iii) The probability of a success is the same for each trial, e.g. the probability of getting a
head is 0.5 on every toss of a fair coin.

(iv) The trials are independent, i.e. the outcome of each trial does not depend on the outcome
of any previous trial.

For n trials,

• the probability of getting x successes is
\[ P(x) = \binom{n}{x} p^x q^{n-x}, \]
where n = number of trials, p = probability of success on each trial and q = 1 − p =
probability of failure.

• If X ∼ B(n, p) then the expectation or mean of X is µ = np.
This means that if the probability of success in each single trial is p, then the expected
number of successes in n independent trials is np.

• Variance, $Var(X) = \sigma^2 = npq = np(1 - p)$.

• Standard deviation, $\sigma = \sqrt{npq} = \sqrt{np(1 - p)}$.
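
The binomial formulas above translate directly into code. The sketch below is a minimal illustration, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` and the choice n = 5, p = 0.5 (five tosses of a fair coin) are illustrative assumptions.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): nCx * p^x * q^(n - x)."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 5, 0.5  # e.g. five tosses of a fair coin
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = n * p
variance = n * p * (1 - p)
sd = sqrt(variance)

print(probs)            # probabilities for x = 0, 1, ..., 5
print(sum(probs))       # the probabilities add up to 1
print(mean, variance, round(sd, 3))
```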

Example 19.1 (When to apply binomial distribution). Examples below show when to apply
a binomial distribution

• A manufacturing plant labels items as either defective or acceptable

• A firm bidding for a contract will either get the contract or not

• A marketing research firm receives survey responses of “yes I will buy” or “no I will not”

• New job applicants either accept the offer or reject it

Example 19.2. Extensive research has shown that 1 person out of every 4 is allergic to a
particular grass seed. A group of 20 university students volunteer to try out a new treatment.

(a) What is the expectation of the number of allergic people in the group?

(b) What is the probability that

(i) exactly two


(ii) no more than two of the group are allergic?

(c) How large a sample would be needed for the probability of it containing at least one
allergic person to be greater than 99.9%?

(d) What assumptions have you made in your answer?

SOLUTION. We are given that n = 20, and p = 0.25

(a) Expectation = np = 20 × 0.25 = 5 people

(b) X ∼ B(20, 0.25)

(i) $P(X = 2) = {}^{20}C_2 (0.75)^{18} (0.25)^2 = 0.067$
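
The written solution stops at part (b)(i). As a hedged sketch of how parts (b)(ii) and (c) could be approached numerically (this is an illustration, not the author's worked solution), the code below sums the probabilities for 0, 1 and 2 allergic people, and then searches for the smallest sample size m for which the probability of at least one allergic person, $1 - 0.75^m$, exceeds 0.999. The helper `binomial_pmf` is a hypothetical name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 20, 0.25

# (b)(ii) "no more than two" means X = 0, 1 or 2
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(3))

# (c) smallest sample size m with P(at least one allergic) = 1 - 0.75^m > 0.999
m = 1
while 1 - (1 - p) ** m <= 0.999:
    m += 1

print(round(p_at_most_two, 4), m)
```

Both calculations rely on the binomial assumptions, in particular that each person is allergic independently with the same probability 0.25, which is the point of part (d).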

Example 19.3. A fair coin is tossed 5 times, find the probability of getting

(i) 2 heads

(ii) At least 4 heads

(iii) At least 1 head

(iv) More than 2 heads.

SOLUTION. P (H) = 0.5 = p thus P (T ) = q = 1 − 0.5 = 0.5


(i)
\[
\begin{aligned}
P(H = 2) &= \binom{5}{2} (0.5)^2 (0.5)^3 \\
&= \binom{5}{2} \times 0.03125 \\
&= 10 \times 0.03125 \\
&= 0.3125
\end{aligned}
\]

(ii)
\[
\begin{aligned}
P(x \ge 4) &= P(x = 4) + P(x = 5) \\
&= \binom{5}{4} (0.5)^4 (0.5)^1 + \binom{5}{5} (0.5)^5 (0.5)^0 \\
&= 0.15625 + 0.03125 \\
&= 0.1875
\end{aligned}
\]

(iii) At least 1 head is the complement of getting no heads, so
\[
\begin{aligned}
P(x \ge 1) &= 1 - P(x = 0) \\
&= 1 - \binom{5}{0} (0.5)^0 (0.5)^5 \\
&= 1 - 0.03125 \\
&= 0.96875
\end{aligned}
\]

(iv)
\[
\begin{aligned}
P(x > 2) &= P(x = 3) + P(x = 4) + P(x = 5) \\
&= \binom{5}{3} (0.5)^3 (0.5)^2 + \binom{5}{4} (0.5)^4 (0.5)^1 + \binom{5}{5} (0.5)^5 (0.5)^0 \\
&= 0.3125 + 0.15625 + 0.03125 \\
&= 0.5
\end{aligned}
\]
Example 19.4. A basket has 8 good fish and 12 rotting fish. A fish is picked at random
and then put back before making the next pick. If the picking is done 10 times. Find the
probability that;

(a) only 3 of the picking are good

(b) more than 7 of the pickings are good

(c) find the expected number of good fish being picked

(d) find the variance of good fish being picked

(e) find the standard deviation of good fish being picked

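SOLUTION. $P(\text{good}) = p = \frac{8}{20} = 0.4$, n = 10, q = 1 − 0.4 = 0.6

(a) x = 3
\[
\begin{aligned}
P(x = 3) &= \binom{10}{3} (0.4)^3 (0.6)^7 \\
&= 0.2150
\end{aligned}
\]

(b)
\[
\begin{aligned}
P(x > 7) &= P(x = 8) + P(x = 9) + P(x = 10) \\
&= \binom{10}{8} (0.4)^8 (0.6)^2 + \binom{10}{9} (0.4)^9 (0.6)^1 + \binom{10}{10} (0.4)^{10} (0.6)^0 \\
&= 0.0106 + 0.0016 + 0.0001 \\
&= 0.0123
\end{aligned}
\]

(c)
\[
\begin{aligned}
E(x) &= np \\
&= 10 \times 0.4 \\
&= 4
\end{aligned}
\]

(d)
\[
\begin{aligned}
Var(x) &= npq \\
&= 10 \times 0.4 \times 0.6 \\
&= 2.4
\end{aligned}
\]

(e) Standard deviation $= \sqrt{2.4} = 1.549$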

Example 19.5. It is estimated that 42% of women ages 45 to 54 are overweight. If 20 females
between 45 and 54 are randomly selected, what is the probability that one-half of them are
overweight?

SOLUTION. Let X represent the number of women in the 20 who are overweight.
Then, X has a binomial distribution with n = 20 and p = 0.42, q = 1 − 0.42 = 0.58.
The probability P (X = 10) is given as follows:
 
\[
\begin{aligned}
P(X = 10) &= \binom{20}{10} (0.42)^{10} (0.58)^{10} \\
&= \frac{20!}{10!\,10!} (0.42)^{10} (0.58)^{10} \\
&= 0.1359.
\end{aligned}
\]

Example 19.6. Seventy-five percent of employed women say their income is essential to
support their family. Let X be the number in a sample of 200 employed women who will say
their income is essential to support their family. What is the mean and standard deviation of
X?


SOLUTION. X is a binomial random variable with n = 200 and p = 0.75, q = 0.25.


The mean is
µ = np = 200 × 0.75 = 150
and the standard deviation is
\[ \sigma = \sqrt{npq} = \sqrt{200 \times 0.75 \times 0.25} = \sqrt{37.5} = 6.12. \]

PROBLEM 19.1 (Exercise ). Attempt the following questions

1. Thirty percent of the trees in a national forest are infested with a parasite. Fifty trees are
randomly selected from this forest and X is defined to equal the number of trees in the 50
sampled that are infested with the parasite. The infestation is uniformly spread throughout
the forest. Identify the values for n, p, and q.

2. There is a fault in a machine making microchips, with the result that only 80% of those it
produces work. A random sample of eight microchips made by this machine is taken. What
is the probability that exactly six of them work?

3. About 32% of students participate in a community volunteer program outside of school. If


30 students are selected randomly, find the probability that at most 14 of them participate
in a community volunteer program out of school.


20 Normal Distribution
20.1 Introduction
The normal distribution is the most commonly used of all probability distributions in
statistical analysis. Many distributions actually found in nature and industry are approximately
normal. Some examples are the IQs (intelligence quotients), weights, and heights of a large
number of people and the variations in dimensions of a large number of parts produced by a
machine.
The normal distribution can often be used to approximate other distributions, such as the
binomial and the Poisson distributions, and provides the basis for statistical inference.
In this section, you will learn how to

(i) Standardize a normal variable and use standard normal tables.

(ii) Use the normal distribution as a model to solve problems.

(iii) Use the normal distribution as an approximation to the binomial distribution.

20.2 Normal distribution curve


The normal variable X is continuous. Its probability density function f(x) depends on its
mean µ and standard deviation σ, where
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty \]

To describe the distribution, we write

X ∼ N (µ, σ 2 ).

The normal distribution curve has the following features

(i) It is bell-shaped and symmetric in appearance

(ii) The mean, mode and median are equal for normal distribution

(iii) The total area under the curve equals 1.

(iv) It is asymptotic to the x-axis, i.e. it approaches the x-axis but never touches it.

(v) The maximum value of f(x) is $\frac{1}{\sigma\sqrt{2\pi}}$, attained at x = µ.
Figure 10: Normal distribution curve
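
Two of the features listed above, the symmetry of the curve and its maximum value at the mean, can be checked numerically. The sketch below is a minimal illustration in Python; the function name `normal_pdf` and the values µ = 0, σ = 1 are assumptions made for the example.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

mu, sigma = 0.0, 1.0  # illustrative values: the standard normal

peak = normal_pdf(mu, mu, sigma)              # value at the mean
left = normal_pdf(mu - 1.5, mu, sigma)        # 1.5 units below the mean
right = normal_pdf(mu + 1.5, mu, sigma)       # 1.5 units above the mean

print(round(peak, 4), left == right)
```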


20.3 Finding probabilities


The probability that X lies between a and b is written

P (a < X < b).

To find this probability, you need to find the area under the normal curve between a and b.

Figure 11: Normal distribution curve

One way of finding areas is to integrate, but since the normal function is complicated and
very difficult to integrate, tables are used instead (see Appendix B).
Example 20.1. Find

(i) P(z > 2.15)

(ii) P(z < 1.72)

(iii) P(1.5 < z ≤ 2.62)

(iv) P(z ≤ −1.28)

(v) P(z > −2.94)

(vi) P(−2.31 ≤ z ≤ −1.28)

(vii) P(−1.52 ≤ z ≤ 1.84)

SOLUTION. (i)
\[
\begin{aligned}
P(z > 2.15) &= 0.5 - P(0 < z < 2.15) \\
&= 0.5 - 0.4842 \\
&= 0.0158.
\end{aligned}
\]

(ii)
\[
\begin{aligned}
P(z < 1.72) &= 0.5 + P(0 < z < 1.72) \\
&= 0.5 + 0.4573 \\
&= 0.9573.
\end{aligned}
\]

(iii)
\[
\begin{aligned}
P(1.5 < z \le 2.62) &= P(0 < z < 2.62) - P(0 < z < 1.5) \\
&= 0.4956 - 0.4332 \\
&= 0.0624.
\end{aligned}
\]

(iv)
\[
\begin{aligned}
P(z \le -1.28) &= 0.5 - P(0 < z < 1.28) \\
&= 0.5 - 0.3997 \\
&= 0.1003.
\end{aligned}
\]

(v)
\[
\begin{aligned}
P(z > -2.94) &= 0.5 + P(0 < z < 2.94) \\
&= 0.5 + 0.4984 \\
&= 0.9984
\end{aligned}
\]

(vi) By symmetry,
\[
\begin{aligned}
P(-2.31 \le z \le -1.28) &= P(0 < z < 2.31) - P(0 < z < 1.28) \\
&= 0.4896 - 0.3997 \\
&= 0.0899
\end{aligned}
\]

(vii)
\[
\begin{aligned}
P(-1.52 \le z \le 1.84) &= P(0 < z < 1.52) + P(0 < z < 1.84) \\
&= 0.4357 + 0.4671 \\
&= 0.9028
\end{aligned}
\]
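
The table look-ups in this solution can be cross-checked in software. The sketch below assumes Python 3.8 or later, whose standard `statistics.NormalDist` class provides the cumulative probability P(Z ≤ z) through its `cdf` method; the part labels follow the numbering used in the solution. Small differences in the fourth decimal place can arise because the printed tables are rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

answers = {
    "i":   1 - Z.cdf(2.15),               # P(z > 2.15)
    "ii":  Z.cdf(1.72),                   # P(z < 1.72)
    "iii": Z.cdf(2.62) - Z.cdf(1.5),      # P(1.5 < z <= 2.62)
    "iv":  Z.cdf(-1.28),                  # P(z <= -1.28)
    "v":   1 - Z.cdf(-2.94),              # P(z > -2.94)
    "vi":  Z.cdf(-1.28) - Z.cdf(-2.31),   # P(-2.31 <= z <= -1.28)
    "vii": Z.cdf(1.84) - Z.cdf(-1.52),    # P(-1.52 <= z <= 1.84)
}
for part, prob in answers.items():
    print(part, round(prob, 4))
```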


20.4 Standardizing Normal distribution


The standard normal distribution has µ = 0 and σ = 1. Any normal distribution (defined
by particular values of µ and σ) can be transformed into the standard normal distribution by
expressing deviations from µ in standard deviation units.
We can often find areas (probabilities) by converting X values into corresponding z values,
using
\[ z = \frac{x - \mu}{\sigma} \]
where z is positive when x > µ and negative when x < µ. The corresponding probabilities are
then read from the z tables.
Example 20.2. The grades on the midterm examination for college students are normally
distributed with a mean of 78% and a standard deviation of 8. Three students are randomly
picked from class and are found to have the following grades

(i) 52% (ii) 90% (iii) 78%

Find the corresponding standardized value (z value) for each.


SOLUTION. µ = 78, σ = 8

(i) x = 52: $z = \frac{x - \mu}{\sigma} = \frac{52 - 78}{8} = -3.25$

(ii) x = 90: $z = \frac{x - \mu}{\sigma} = \frac{90 - 78}{8} = 1.5$

(iii) x = 78: $z = \frac{x - \mu}{\sigma} = \frac{78 - 78}{8} = 0$
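The standardization step can also be written as a small function. This is a minimal sketch; the name `z_score` is hypothetical and chosen only for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

mu, sigma = 78, 8                 # mean and standard deviation from Example 20.2
for grade in (52, 90, 78):
    print(grade, z_score(grade, mu, sigma))
```

A grade below the mean gives a negative z value, a grade above the mean gives a positive one, and a grade equal to the mean gives zero.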

The table lists only positive values of z. Because the normal curve is symmetric about z = 0, the
area between −a and 0 equals the area between 0 and a, so probabilities for negative values of z
are read from the same table.
Example 20.3. The mass of a loaf of bread produced by Ntake Bakery is normally distributed with
a mean of 500g and a standard deviation of 15g. Find the probability that a loaf of Ntake
Bakery bread picked at random has a mass

(i) less than 470g

(ii) more than 510g

(iii) between 485g and 510g

(iv) between 480g and 490g

SOLUTION. We standardize each value and then use the table in Appendix B to read off the
corresponding probabilities.
(i)
$$p(x < 470) = p\left(z < \frac{470 - 500}{15}\right) = p(z < -2)$$
$$p(z < -2) = 0.5 - p(0 < z < 2) = 0.5 - 0.4772 = 0.0228.$$

(ii)
$$p(x > 510) = p\left(z > \frac{510 - 500}{15}\right) = p(z > 0.67)$$
$$p(z > 0.67) = 0.5 - p(0 < z < 0.67) = 0.5 - 0.2486 = 0.2514.$$

(iii)
$$p(485 \le x \le 510) = p\left(\frac{485 - 500}{15} \le z \le \frac{510 - 500}{15}\right) = p(-1 \le z \le 0.67)$$
$$p(-1 \le z \le 0.67) = p(0 < z < 1) + p(0 < z < 0.67) = 0.3413 + 0.2486 = 0.5899.$$

(iv)
$$p(480 \le x \le 490) = p\left(\frac{480 - 500}{15} \le z \le \frac{490 - 500}{15}\right) = p(-1.33 \le z \le -0.67)$$
$$p(-1.33 \le z \le -0.67) = p(0 < z < 1.33) - p(0 < z < 0.67) = 0.4082 - 0.2486 = 0.1596.$$
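The four parts of Example 20.3 can be checked in code. The sketch below is an illustration only, assuming Python 3.8 or later; the helper `phi_rounded` is a hypothetical name, and it rounds z to two decimal places to mimic reading a printed table.

```python
from statistics import NormalDist

Z = NormalDist()        # standard normal distribution
MU, SIGMA = 500, 15     # mean and standard deviation of the mass, in grams

def phi_rounded(x: float) -> float:
    """P(X < x), with z rounded to two decimals as when reading a printed table."""
    z = round((x - MU) / SIGMA, 2)
    return Z.cdf(z)

p_i = phi_rounded(470)                       # (i)   P(X < 470)
p_ii = 1 - phi_rounded(510)                  # (ii)  P(X > 510)
p_iii = phi_rounded(510) - phi_rounded(485)  # (iii) P(485 <= X <= 510)
p_iv = phi_rounded(490) - phi_rounded(480)   # (iv)  P(480 <= X <= 490)

# Part (ii) again, this time without rounding z to two decimals
p_ii_unrounded = 1 - NormalDist(MU, SIGMA).cdf(510)

for label, p in [("(i)", p_i), ("(ii)", p_ii), ("(iii)", p_iii), ("(iv)", p_iv)]:
    print(f"{label} {p:.4f}")
print(f"(ii) without rounding z: {p_ii_unrounded:.4f}")
```

Rounding z = 0.666... up to 0.67 makes the tabulated answer to part (ii) slightly smaller than the unrounded value, which is why answers from software and from tables can differ in the third decimal place.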
Example 20.4. The lifetime of light bulbs is known to be normally distributed with µ = 100h
and σ = 12h. What is the probability that a bulb picked at random will have a lifetime between
115 and 133 burning hours?

SOLUTION. Let X be the burning time of a bulb, measured in hours. We are asked to
find P (115 < X < 133) given µ = 100h and σ = 12h. Letting X1 = 115h and X2 = 133h,
we standardize to get
$$z_1 = \frac{X_1 - \mu}{\sigma} = \frac{115 - 100}{12} = \frac{15}{12} = 1.25,$$
$$z_2 = \frac{X_2 - \mu}{\sigma} = \frac{133 - 100}{12} = \frac{33}{12} = 2.75.$$
To obtain the required probability, we use the standard normal table to find the shaded area
between z1 = 1.25 and z2 = 2.75, as shown in Figure 12 below.

Figure 12: Normal curve

Looking up z1 = 1.25 in Appendix B, we get 0.3944. This is the area from z = 0 to
z1 = 1.25.
Looking up z2 = 2.75 in Appendix B, we get 0.4970. This is the area from z = 0 to
z2 = 2.75.
Subtracting 0.3944 from 0.4970, we get

0.4970 − 0.3944 = 0.1026,

or 10.26%, for the shaded area that gives P (115 < X < 133).
The probability that a bulb picked at random will have a lifetime between 115 and 133 burning
hours is therefore 0.1026, or 10.26%.


PROBLEM 20.1 (Activity). Attempt the following questions

1. The mean weight of a large group of people is 80kg and the standard deviation is 6kg. If
the weights are normally distributed, find the probability that a person picked at random
from the group will weigh

(a) between 71kg and 80kg (b) above 95kg (c) below 68kg

2. If 20% of the students entering college drop out before receiving their diplomas, find the
probability that out of 20 students picked at random from the very large number of students
entering college, fewer than 3 drop out.

3. If 90% of the bulbs produced in a plant are acceptable, what is the probability that out of 10
bulbs picked at random from the very large output of the plant, 8 are acceptable?

4. In a bid to fill various positions within Uganda Revenue Authority (URA) that were
advertised in April 2021, an online assessment was conducted from 3rd to 5th January 2022.
The assessment was conducted through an independent service provider, "Test Gorilla",
a Netherlands-based company. A total of 30,471 applicants attempted the online assessment,
while 11,946 did not attempt it. The pass mark was 40%.
Assume that the results obtained by the applicants who sat the online assessment are
normally distributed with mean µ = 56% and standard deviation σ = 16. What is the
probability that an applicant picked at random scored

(i) between 46% and 80% (iii) between 20% and 40%
(ii) between 40% and 100% (iv) below 40%

5. Heights of college women have a distribution that can be approximated by a normal curve
with a mean of 65 inches and a standard deviation equal to 3 inches. About what proportion
of college women are between 65 and 67 inches tall?


21 Normal approximation
The normal distribution can be used as an approximation to other distributions:

(a) Binomial approximation: The normal distribution can be used as an approximation
to the binomial distribution under certain circumstances, namely:

(i) if X ∼ B(n, p) and n is large, and

(ii) p is close to 1/2,

then X is approximately N (np, npq), where q = 1 − p.

(b) Poisson approximation: The normal distribution can also be used to approximate the
Poisson distribution for large values of the mean of the Poisson distribution.

In this section, we shall look at how the normal distribution can be used as an approximation to
the binomial distribution.

21.1 Normal distribution approximating Binomial distribution


Binomial probabilities for a small value of n (say, 20) can be read from statistical
tables.
To calculate the probabilities for large values of n, you would have to use the binomial formula,
which can be very laborious.
Using the normal approximation to the binomial distribution simplifies the process. To use
the normal approximation to the binomial distribution, first take a simple random sample
from a population.
You must meet the conditions for a binomial distribution:

(i) there are a certain number n of independent trials.

(ii) the outcomes of any trial are success or failure.

(iii) each trial has the same probability of a success p.

Recall that if X is the binomial random variable, then X ∼ B(n, p). The shape of the binomial
distribution needs to be similar to the shape of the normal distribution. To ensure this, the
quantities np and nq must both be greater than five (np > 5 and nq > 5); the approximation
is better if they are both greater than or equal to 10.
Then the binomial can be approximated by the normal distribution with mean µ = np and
standard deviation $\sigma = \sqrt{npq}$.
Remember that q = 1 − p. Because the binomial distribution is discrete and the normal distribution
is continuous, each whole number x is treated as the interval from x − 0.5 to x + 0.5. Use x − 0.5
when x is to be included at the lower end of a range and x + 0.5 when x is to be included at the
upper end; for example, P (X ≥ 10) becomes the area above 9.5, and P (X ≤ 10) becomes the area
below 10.5.
The number 0.5 is called the continuity correction factor.
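The continuity correction rules can be collected into a few helper functions. The following is a minimal sketch, assuming Python 3.8 or later; all function names are hypothetical, and the values n = 100 and p = 0.5 in the usage lines are illustrative and not taken from the text.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx(n: int, p: float) -> NormalDist:
    """Normal distribution with the same mean and standard deviation as B(n, p)."""
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq should both exceed 5 for a reasonable approximation")
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q))

def p_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k), approximated by the normal area above k - 0.5."""
    return 1 - normal_approx(n, p).cdf(k - 0.5)

def p_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k), approximated by the normal area below k + 0.5."""
    return normal_approx(n, p).cdf(k + 0.5)

def p_exactly(k: int, n: int, p: float) -> float:
    """P(X = k), approximated by the normal area between k - 0.5 and k + 0.5."""
    Y = normal_approx(n, p)
    return Y.cdf(k + 0.5) - Y.cdf(k - 0.5)

# Illustrative usage with n = 100 and p = 0.5
print(p_at_least(55, 100, 0.5))
print(p_at_most(45, 100, 0.5))
print(p_exactly(50, 100, 0.5))
```

For strict inequalities the bound shifts by one before the correction is applied: P (X > k) is the same event as P (X ≥ k + 1), so it would be computed as `p_at_least(k + 1, n, p)`.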


Example 21.1. Experience indicates that 30% of the people entering a store make a purchase.
Using
(a) the binomial distribution and
(b) the normal approximation to the binomial,
find the probability that out of 30 people entering the store, 10 or more will make a purchase.
SOLUTION. Let X be the number of people, out of the 30 entering the store, who make a purchase.
We are required to compute P (X ≥ 10).
(a) Here n = 30, p = 0.3, q = 1 − 0.3 = 0.7. From the binomial distribution table in
Appendix A, keeping only the terms that are non-zero to four decimal places, we obtain
P (X ≥ 10) = P (10) + P (11) + · · · + P (30)
= 0.1416 + 0.1103 + 0.0749 + 0.0444 + 0.0231 + 0.0106 + 0.0042 + 0.0015
+ 0.0005 + 0.0001
= 0.4112.

(b) We have that
$$\mu = np = 30 \times 0.3 = 9 \text{ persons},$$
and
$$\sigma = \sqrt{np(1 - p)} = \sqrt{npq} = \sqrt{30 \times 0.3 \times 0.7} = \sqrt{6.3} \approx 2.50998 \approx 2.51.$$
Since np = 9 > 5 and nq = 21 > 5, we can approximate the binomial probability with the normal.
However, the number of people is a discrete variable.
In order to use the normal distribution, we must treat the number of people as if it were
a continuous variable and find P (X > 9.5), thus
$$z = \frac{x - \mu}{\sigma} = \frac{9.5 - 9}{2.51} = \frac{0.5}{2.51} \approx 0.1992 \approx 0.20.$$
From the normal table in Appendix B, for z = 0.20 we get 0.0793. This means that 0.0793
of the area under the standard normal curve lies between z = 0 and z = 0.20.

Figure 13: Normal curve

Therefore
P (X > 9.5) = 0.5 − 0.0793
= 0.4207.
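Both parts of Example 21.1 can be compared directly in code. The sketch below is an illustration only, assuming Python 3.8 or later (for `math.comb` and `statistics.NormalDist`); it evaluates the binomial formula instead of reading Appendix A, and it does not round z before finding the normal area.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 30, 0.3
q = 1 - p

# (a) exact binomial tail: P(X >= 10) is the sum of C(n, k) p^k q^(n-k) for k = 10..30
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(10, n + 1))

# (b) normal approximation with continuity correction: area above 9.5
mu, sigma = n * p, sqrt(n * p * q)
approx = 1 - NormalDist(mu, sigma).cdf(9.5)

print(f"exact binomial       = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

The unrounded approximation comes out slightly above 0.4207, because the worked solution rounds z up to 0.20 before reading the table. Either way the approximation is within about 0.01 of the binomial answer 0.4112.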

PROBLEM 21.1. Attempt the following questions

1. Use the normal approximation to the binomial with n = 30 and p = 0.5 to find the
probability P (X = 18).

2. Use the normal approximation to the binomial with n = 10 and p = 0.5 to find the
probability P (X > 7).

3. According to recent surveys, 53% of households have personal computers. If a random
sample of 175 households is selected, what is the probability that more than 75 but fewer
than 110 have a personal computer?

4. Past experience indicates that 60% of the students entering college get their degrees. Using

(i) the binomial distribution and


(ii) the normal approximation to the binomial,

find the probability that out of 30 students picked at random from the entering class, more
than 20 will receive their degrees.

5. Suppose X is a binomial random variable with n = 600 trials and probability of success
p = 0.35. Find each of the following probabilities using the normal approximation to the
binomial distribution with a continuity correction.

(i) P (X > 220) (iii) P (190 < X < 200)

(ii) P (X ≤ 198) (iv) P (X = 212)

6. Assume that the probability of a college student having a car on campus is 0.30. A random
sample of 12 students is taken. What is the probability that at least 4 will have a car on
campus?

(i) Work the problem as a binomial.


(ii) Approximate the probability using the standard normal.
(iii) Is the approximation reasonable? Explain clearly.

7. According to the Nation’s Report Card, also known as the National Assessment of
Educational Progress (NAEP), only 25% of senior two students are proficient in mathematics.
Suppose 200 senior two students from Ugandan schools are selected at random. Answer
each problem using the normal approximation to the binomial distribution.

(i) Find the approximate probability that at least 55 students are proficient in
mathematics.
(ii) Find the approximate probability that between 60 and 65 (inclusive) students are
proficient in mathematics.
(iii) Suppose the NAEP test results for each student are used to find that 36 (of the 200)
students are proficient in mathematics. Is there any evidence to suggest that fewer
than 25% of senior two students are proficient in mathematics?
Justify your answer.


8. (a) (i) Write down two conditions for X ∼ B(n, p) to be approximated by a normal
distribution Y ∼ N (µ, σ²).
(ii) Write down the mean and variance of this normal approximation in terms of n
and p.
(b) A factory manufactures 2000 DVDs every day. It is known that 3% of DVDs are faulty.
Using a normal approximation, estimate the probability that at least 40 faulty DVDs
are produced in one day.

9. In a certain college, 20% of students own a touch screen laptop. A random sample of
n students is chosen from the college. Using a normal approximation, the probability that
more than 55 of these n students own a touch screen laptop is 0.0401 correct to 3 significant
figures. Find the value of n.


Appendices
A Binomial Distribution table

Example A.1.
P (X = 3, n = 5, p = 0.30) = 0.1323
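Entries of a binomial table such as the one in Example A.1 follow from the binomial formula. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p), i.e. C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# The entry quoted in Example A.1
print(round(binomial_pmf(3, 5, 0.30), 4))   # 0.1323

# The full set of entries for n = 5, p = 0.30
row = [binomial_pmf(x, 5, 0.30) for x in range(6)]
for x, prob in enumerate(row):
    print(x, f"{prob:.4f}")
```

The six probabilities for x = 0, 1, ..., 5 sum to 1, which is a quick check on any row read from the table.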


B Standard Normal Distribution table
