Statistics For Finance

Oxfo Business & Technology College
Module for the Course
Statistics for Finance

Course Code = BUMA - 223
Credit Hour = 3
Wisdom at the source of Blue Nile

October, 2016
Shire
TABLE OF CONTENTS
BLOCK 1: INTRODUCTION TO STATISTICS
UNIT 1: Introduction to Statistics
UNIT 2: Collection of Data
BLOCK 2: CLASSIFICATION AND PRESENTATION OF STATISTICAL DATA
UNIT 3: Classification of Data
UNIT 4: Graphical Presentation of Data
BLOCK 3: MEASURES OF CENTRAL TENDENCY
UNIT 5: Definition and Purpose of Averages
UNIT 6: Mathematical Measures of Central Tendency
UNIT 7: Positional Measures of Central Tendency
UNIT 8: Relationship between Mean, Median, and Mode
BLOCK 4: MEASURES OF DISPERSION (VARIATION)
UNIT 9: Meaning and Types of Dispersion
UNIT 10: Range and Quartile Deviation
UNIT 11: Mean Deviation
BLOCK 5: ELEMENTARY PROBABILITY THEORY
UNIT 12: Counting Methods
UNIT 13: Probability
BLOCK 6: INTRODUCTION TO PROBABILITY DISTRIBUTIONS
UNIT 14: Discrete Probability Distribution
UNIT 15: Continuous Probability Distribution
BLOCK 7: SAMPLING AND SAMPLING DISTRIBUTION OF THE MEAN
UNIT 16: Concepts and Reasons for Sampling
UNIT 17: Types of Sampling Techniques
UNIT 18: Sampling Distribution of the Sample Mean
BLOCK 8: ESTIMATION AND TEST OF HYPOTHESIS
UNIT 19: Estimation
UNIT 20: Test of Hypothesis
BLOCK 9: CORRELATION AND REGRESSION
UNIT 21: Correlation
UNIT 22: Regression
APPENDIXES
BLOCK 1: INTRODUCTION
UNIT 1: INTRODUCTION TO STATISTICS
UNIT 2: COLLECTION OF DATA
INTRODUCTION
When did the practice of Statistics start? It is not possible to tell the exact point
at which Statistics started. Although it is not done consciously, the practice of
Statistics is one of the things human beings perform in their day-to-day activities. All the
estimations, forecasts, comparisons, averaging, and so on are part of Statistics. Statistics
is something that is part of our lives. We encounter it in all walks of life, but most of the
time we don’t recognize it.
It was around the 16th century that Statistics came to be considered a formal discipline. After
passing through continuous stages of development, Statistics has now reached a more
advanced point than ever in its history.
Nevertheless, statistical results are not ends in themselves. They are used to develop and
strengthen ideas or theories that we have in other fields like economics, social and natural
science, business, medicine, etc. Robert W. Burgess, as quoted by C.B. Gupta, put the aim
of Statistics as:
“The fundamental gospel of Statistics is to push back the domain of ignorance, prejudice,
rule of thumb, arbitrary or premature decisions, traditions and dogmatism, and to increase
the domain in which decisions are made and principles are formulated on the basis of
analyzed facts.”
BLOCK 2: CLASSIFICATION AND PRESENTATION
OF STATISTICAL DATA
UNIT 3: CLASSIFICATION OF DATA

UNIT 4: GRAPHICAL PRESENTATION OF DATA
INTRODUCTION
In the preceding block, you have learnt about the collection of data. When conducting a
statistical study, you must gather data for the particular variable under study.
In this block, you will learn about classification and presentation of data. The first part
deals with ‘classification of data’ and the following unit deals with ‘presentation of data’.
The purpose of this block is to explain how to organize data by constructing ‘frequency
distribution’ and how to present the data by constructing graphs and charts. The graphs
and charts illustrated in this block include histograms, frequency polygons, ogives, pie
charts, bar charts, time series graphs, and pictographs (pictograms).
BLOCK 3: MEASURES OF CENTRAL TENDENCY
UNIT 5: DEFINITION AND PURPOSE OF AVERAGES

UNIT 6: MATHEMATICAL MEASURES OF CENTRAL TENDENCY
UNIT 7: POSITIONAL MEASURES OF CENTRAL TENDENCY
UNIT 8: RELATIONSHIP BETWEEN MEAN, MEDIAN, AND MODE
INTRODUCTION
Visual presentation of data discloses some characteristic features of a mass of data.
Further summarization of the data is essential to show the relationship between
variables and to correlate one variable with another. To describe the characteristic
features of the entire mass of data with a single value, the most obvious measure that
helps to make quicker and better decisions is the measure of central tendency, also called
an average. An average gives a bird's eye view of a huge mass of data that is not
easily intelligible, since it is a numerical value that serves as a central point about which
the other values in a series are dispersed.
BLOCK 4: MEASURES OF DISPERSION
(VARIATION)
UNIT 9: RANGE AND QUARTILE DEVIATION
UNIT 10: MEAN DEVIATION
UNIT 11: STANDARD DEVIATION
INTRODUCTION
Although averages serve the purpose of describing the characteristics of a distribution,
they cannot give a comprehensive idea and as such hide many important facts about the
distribution. One of the important facts concealed (hidden) by the measures of central
tendency is the variability of the values. Averages fail to explain the extent of
deviation from the average value in the distribution. In the absence of such information,
the averages of different distributions cannot be meaningfully compared.
Comparison can be made only in cases where all the values are the same as the average
or where there are no significant deviations of the values from the average. However, this
situation is very rare, especially in the case of data pertaining to various socio-economic
problems. Hence, an average by itself is not sufficient unless it is accompanied by
information describing the range of the values and their deviations from the central value.
Some other values are required for describing the characteristics of a distribution and
making comparisons with other distributions. As Simpson and Kafka rightly pointed out,
“An average doesn’t tell the full story. It is hardly fully representative of a mass unless
we know the manner in which the individual items scatter around it. A further description
of the series is necessary if we are to gauge how representative the average is.” To support
and supplement measures of central tendency, other measures such as dispersion and skewness
are devised. While the measures of dispersion explain the degree of variation of the
individual items from the central value, skewness measures the degree of symmetry or
asymmetry of the distribution.
BLOCK 5: ELEMENTARY PROBABILITY THEORY
UNIT 12: COUNTING METHODS

UNIT 13: PROBABILITY
INTRODUCTION
The theory of probability is a far-reaching branch of statistics, which helps to obtain
numerical information about the occurrence or non-occurrence of an event. Probability is a
measure of the likelihood that an event will happen in the future, and it plays a major
role in our daily life, where we are continually faced with decisions made under uncertainty.
The use of information from descriptive data requires knowledge of probability as one of
the major tools for drawing valid generalizations and making wise and reasonable decisions
about the large population from which the sample data are taken.
Dear students, this block consists of two units. The first, on counting methods,
gives you a clear idea of how to determine the possible number of elements in an
event as well as in the sample space of an experiment, while the second unit gives you the
techniques for determining the probability of an event. You also need to realize that, in
ordinary language, probability, chance, likelihood, and odds are used interchangeably.
BLOCK 6: INTRODUCTION TO
PROBABILITY DISTRIBUTIONS
UNIT 14: DISCRETE PROBABILITY DISTRIBUTION

UNIT 15: CONTINUOUS PROBABILITY DISTRIBUTION
INTRODUCTION
In the preceding block, you have learnt about probability, one of the most important tools
in statistics and which is important in decision making as it provides a mechanism for
measuring, expressing and analyzing the uncertainties associated with future events.
You also saw that the probability of an event could be computed either by summing the
probabilities of the experimental outcomes (sample points) comprising the event or by
using the relationships established by the addition, conditional probability and
multiplication laws of probability.
In this block, you will learn what a ‘probability distribution’ is. The first part deals with
discrete probability distributions, and the following unit gives an introduction to
continuous probability distributions, focusing on the normal probability distribution.
BLOCK 7: SAMPLING AND SAMPLING DISTRIBUTION OF THE
MEAN
UNIT 16: CONCEPTS AND REASONS FOR SAMPLING

UNIT 17: TYPES OF SAMPLING TECHNIQUES
UNIT 18: SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
BLOCK 9: CORRELATION AND REGRESSION
UNIT 21 CORRELATION
UNIT 22 REGRESSION
INTRODUCTION
In this block, we shall look at methods that investigate whether two quantitative
variables are related. In practical life, we come across different sets of data that deal
with more than one variable, which are interrelated and interdependent. For example, an
instructor wonders whether the Mathematics mark and the Statistics mark are related: does
a good performance in one subject go with a good performance in the other?
She/he decides that this can most easily be discovered by plotting the marks for all the
students on a sheet of graph paper. In addition, she/he tries to express the relationship
mathematically. The mathematical methods used are Karl Pearson’s coefficient of
correlation and Spearman’s rank coefficient of correlation.
BLOCK 2: CLASSIFICATION AND PRESENTATION OF STATISTICAL DATA
UNIT 3: CLASSIFICATION OF DATA
CONTENTS:
3.0. Aims and Objectives

3.1 Introduction
3.2. Definition of Classification of Data
3.3. Types of Classification
3.4. Frequency Distribution
3.5. Common Terminologies in a Grouped Frequency Distribution
3.6. Rules for Forming a Grouped Frequency Distribution
3.7. Cumulative Frequency Distribution (CFD)
3.8. Relative Frequency Distribution (RFD)
3.9. Summary
3.10. Answers to Check Your Progress (CYP) Questions
3.11. Model Examination Questions
3.12. Glossary
3.13. References
3.0. AIMS AND OBJECTIVES
The aim of this unit is to discuss the various types of classification of data collected
for a statistical study, and then to organize these data into a frequency distribution.
At the end of this unit, you will be able to:
• Define ‘classification of data’
• Identify the types of ‘classification of data’
• Define what a ‘Frequency Distribution (FD)’ is
• Organize raw data into a ‘Frequency Distribution (FD)’
• Define ‘presentation of data’
• Represent the FD in a histogram, frequency polygon or cumulative frequency curve (ogive)
• Present data using such common diagrammatic techniques as bar charts, pie charts, and
  pictograms (pictographs).
3.1. INTRODUCTION
After collecting relevant information (data) for the purpose of statistical investigation, the
next important task is the classification and presentation of these data. It is difficult to
grasp the meaning of any considerable volume of numerical data unless their mass is
somehow reduced to relatively few convenient classes or categories and presented with the
help of some kind of visual aid.
This section discusses classification of data. Presentation of data using graphs and charts
will be seen in the next unit.
3.2. DEFINITION OF CLASSIFICATION OF DATA
Classification is the process of arranging things in groups or classes according to
their resemblance.
Purposes of Classification:
• To eliminate unnecessary detail
• To bring out clearly points of similarity and dissimilarity
• To enable one to form mental pictures of objects or measurements
• To enable one to make comparisons and draw inferences
3.3. TYPES OF CLASSIFICATION
1. Geographical Classification: Data are arranged according to places such as continents,
regions, and countries.
Example 1.
Region    Common Language Spoken
1         Tigrigna
2         Afar
3         Amharic
4         Oromifa
2. Chronological Classification: Data are arranged according to time, such as year or
month.
Example 2.
Year (in EC)    Population (in millions)
1974            30
1986            52
1991            60
3. Qualitative Classification: Data are arranged according to attributes such as color,
religion, marital status, sex, educational background, etc.
Example 3. Employees in Factory X
Educated              Uneducated
Female    Male        Female    Male
4. Quantitative Classification: In this type of classification, the statistical data are
classified according to some quantitative variable. The variable may be either
discrete or continuous.
Example 4.
Mr. X    Height (X) in cm
A        160
B        182
C        175
D        178
Note: There are two kinds of variables that can take numerical values: discrete variables
and continuous variables.
A. Discrete Variables are variables that are associated with enumeration or counting.
Example
Number of students in a class
Number of children in a family, etc.
B. Continuous Variables are variables associated with measurement.
Example
Weights of 10 students
Heights of 12 persons
Distance covered by a car between two stations, etc.
3.4. FREQUENCY DISTRIBUTION
When the raw data have been collected, they should be put into an ordered array, in
ascending or descending order, so that they can be looked at more objectively. These
data must then be organized into an “FD”, which simply lists the values or classes with their
corresponding frequencies in tabular form. Here, frequency refers to the number of
times a certain value occurs in the data.
The tabular representation of the values of a variable together with their corresponding
frequencies is called a Frequency Distribution (FD).
Definition:
A frequency distribution is the organization of raw data in table form, using classes and
frequencies.
Frequency distributions are of two kinds:
A. Ungrouped Frequency Distribution (UFD)
Shows a distribution where the values of a variable are linked with their respective
frequencies.
Example 7. Consider the number of children in 15 families.
1 0 3 2 0
2 4 1 3 1
4 1 2 2 3
Construct an ungrouped FD for the above data.
Solution:
No. of Children (Values)    No. of Families (Tallies)    Frequency
0                           //                           2
1                           ////                         4
2                           ////                         4
3                           ///                          3
4                           //                           2
Total                                                    15
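The tallying in Example 7 can also be checked by machine. The following is a minimal sketch in Python, assuming the 15 observations are typed in as a list; the function name ungrouped_fd is only illustrative and is not part of the module.

```python
from collections import Counter

# Number of children in the 15 families of Example 7
children = [1, 0, 3, 2, 0,
            2, 4, 1, 3, 1,
            4, 1, 2, 2, 3]

def ungrouped_fd(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)          # counts how many times each value occurs
    return sorted(counts.items())

for value, freq in ungrouped_fd(children):
    print(f"{value:>5} {freq:>9}")
print(f"Total {len(children):>9}")
```

The frequencies must add up to the number of observations (15 here), which is a quick check on any hand-made tally.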
CYP 1
Consider the following scores in a statistics test obtained by 20 students in a given class.
10, 4, 4, 7, 5, 7, 7, 8, 5, 7, 8, 5, 10, 8, 7, 5, 7, 8, 7, 4
Prepare an ungrouped FD
B. Grouped Frequency Distribution (GFD)
If the mass of the data is very large, it is necessary to condense the data into an
appropriate number of classes or groups of values of a variable and indicate the number
of observed values that fall into each class. Therefore, a GFD is a frequency
distribution where the values of a variable are grouped into classes and each class is
paired with the number of observations in it.
Example *
Values (xi)       1 – 25    26 – 50    51 – 75    76 – 100
Frequency (fi)    3         10         18         6
3.5. COMMON TERMINOLOGIES IN A GFD
i. Class: a group of values of a variable between two specified numbers called the lower
class limit (LCL) and the upper class limit (UCL).
In Example *, the GFD contains four classes: 1 – 25, 26 – 50, 51 – 75, and 76 – 100.
LCL1 = 1, UCL1 = 25       LCL3 = 51, UCL3 = 75
LCL2 = 26, UCL2 = 50      LCL4 = 76, UCL4 = 100
ii. Class Frequency (or simply Frequency): refers to the number of observations
corresponding to a class.
In Example *, the class frequencies of the 1st, 2nd, 3rd, and 4th classes are respectively
3, 10, 18 and 6.
iii. Class Boundaries: are boundaries obtained by subtracting half of the unit of
measurement (u) from the lower limit or by adding ½(u) to the upper limit of a class,
i.e. UCBi = UCLi + ½(u)
     LCBi = LCLi – ½(u)
where UCBi = upper class boundary and
      LCBi = lower class boundary of the ith class.
Remark: The unit of measurement (u) is the gap between any two successive classes, i.e.
u = lower limit of a class – upper limit of the preceding class.
In Example *, consider the 2nd class, 26 – 50. Since u = 26 – 25 = 1,
LCL2 = 26                      UCL2 = 50
LCB2 = 26 – ½(1) = 25.5        UCB2 = 50 + ½(1) = 50.5
iv. Class Width (size of a class or class interval): the difference between the upper
and lower class limits or the difference between the upper and lower class boundaries of
any class.
Remarks:
1. If both the LCL and UCL are included in a class, it is called an inclusive class. For
inclusive classes,
Class width (cw) = UCBi – LCBi
2. If the LCL is included and the UCL is not included in a class, it is called an exclusive
class. For exclusive classes,
cw = UCLi – LCLi
To be consistent, we use inclusive classes.
v. Class Mark (cm): the midpoint (center) of a class.
cmi = (UCBi + LCBi) / 2
Note: the difference between any two successive class marks is equal to the width of
a class.
vi. Range (R): the difference between the largest (L) and the smallest (S)
values in a data set.
R = L – S
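The definitions of class boundaries, class width and class mark can be tried out on the classes of Example *. The sketch below is only an illustration in Python and assumes inclusive classes; the function name class_summary is hypothetical.

```python
def class_summary(lcl, ucl, u=1):
    """Boundaries, width and mark of an inclusive class lcl - ucl.

    u is the unit of measurement (the gap between two successive classes).
    """
    lcb = lcl - u / 2          # lower class boundary: LCB = LCL - (1/2)u
    ucb = ucl + u / 2          # upper class boundary: UCB = UCL + (1/2)u
    width = ucb - lcb          # class width for inclusive classes
    mark = (ucb + lcb) / 2     # class mark (midpoint)
    return lcb, ucb, width, mark

# First and second classes of Example *: 1 - 25 and 26 - 50
print(class_summary(1, 25))    # (0.5, 25.5, 25.0, 13.0)
print(class_summary(26, 50))   # (25.5, 50.5, 25.0, 38.0)
```

Note that the two class marks, 13 and 38, differ by 25, which is the class width, as stated in the note under Class Mark.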
CYP 2 Consider the following GFD.
Class      Frequency (f)
5 – 9      2
10 – 14    6
15 – 19    12
20 – 24    7
25 – 29    3
Total      30
a. What is the class frequency of the 3rd class?
b. How many observations (items) fall into the last class?
c. Find i. the LCL and UCL of the fourth class
        ii. the UCB and LCB of the third class
        iii. the class interval (class width) of the fifth class
        iv. the class mark (midpoint) of the second class
3.6. RULES FOR FORMING A GROUPED FREQUENCY DISTRIBUTION
To construct a GFD, the following points should be considered:
1) The classes should be clearly defined. That is, each observation should fall into
one and only one class.
2) The number of classes should be neither too large nor too small. Normally,
5 to 20 classes are recommended.
3) All the classes should be of the same width. An approximate suitable class width
can be obtained as:
cw = Range / Number of classes, i.e. cw = R/n = (L – S)/n
Example 8. Let R/n = 6.8263.
If all the observations are whole numbers, cw = 7
If all the observations are to one decimal place, cw = 6.8
If all the observations are to two decimal places, cw = 6.83, etc.
Note that a suitable number of classes can be obtained by using the formula
n = 1 + 3.322 log N (logarithm to base 10), rounded up or down to the nearest whole
number, where N is the total number of observations.
Remark: Unequal class intervals create problems in graphing and in computing some
statistical measures.
4) Determine the class limits.
i. Determine the lower class limit of the first class (LCL1), then
LCL2 = LCL1 + cw, LCL3 = LCL2 + cw, …, LCLi+1 = LCLi + cw
ii. Determine the upper class limit of the first class (UCL1), i.e.
UCL1 = LCL1 + cw – u, where u = the unit of measurement, then
UCL2 = UCL1 + cw, UCL3 = UCL2 + cw, …, UCLi+1 = UCLi + cw
5) Complete the GFD with the respective class frequencies.
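Rules 2 to 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names sturges_classes and class_limits are hypothetical, the logarithm is taken to base 10, and the values N = 30, LCL1 = 16 and cw = 10 are sample inputs.

```python
import math

def sturges_classes(N):
    """Suggested number of classes, n = 1 + 3.322 log10(N), rounded to a whole number."""
    return round(1 + 3.322 * math.log10(N))

def class_limits(lcl1, cw, n, u=1):
    """Return the (LCL, UCL) pairs of n inclusive classes of width cw."""
    limits = []
    for i in range(n):
        lcl = lcl1 + i * cw        # LCL(i+1) = LCL(i) + cw
        ucl = lcl + cw - u         # UCL = LCL + cw - u
        limits.append((lcl, ucl))
    return limits

n = sturges_classes(30)            # 1 + 3.322 * 1.4771 = 5.9, rounded to 6
print(n)
print(class_limits(lcl1=16, cw=10, n=n))
```

Because each upper limit is one unit below the next lower limit, every whole-number observation falls into one and only one class, as rule 1 requires.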
Example 9. The number of customers on 30 consecutive days in a supermarket was
listed as follows:
20 48 65 25 48 49
35 25 72 42 22 58
53 42 23 57 65 37
18 65 37 16 39 42
49 68 69 63 29 67
a. Construct a GFD with a suitable number of classes.
b. Complete the distribution obtained in (a) with class boundaries and class marks.
Solution:
i. Range = Largest value – Smallest value
         = 72 – 16 = 56
ii. N = 30 (total number of observations)
    Number of classes, n = 1 + 3.322 log 30
                         = 1 + 3.322 (1.4771)
                         = 5.9
    Hence a suitable number of classes is n = 6.
iii. Class width, cw = Range / n = 56/6 = 9.33
    For the sake of convenience, take cw to be 10. (It is also possible to choose
    cw to be 9, but then six classes starting at 16 would end at 69, so a seventh
    class would be needed to include the largest value, 72.)
iv. Take the lower limit of the 1st class (LCL1) to be 16 and u = 1, i.e.
    LCL1 = 16                            UCL1 = LCL1 + cw – u = 16 + 10 – 1 = 25
    LCL2 = LCL1 + cw = 16 + 10 = 26      UCL2 = UCL1 + cw = 25 + 10 = 35
    LCL3 = LCL2 + cw = 26 + 10 = 36      UCL3 = UCL2 + cw = 35 + 10 = 45
    and so on for the remaining classes.
Therefore, the GFD would be:
a)
Class (xi)    Frequency (fi)
16 – 25       7
26 – 35       2
36 – 45       6
46 – 55       5
56 – 65       6
66 – 75       4
b)
Class (xi)    Frequency (fi)    CBi            cmi
16 – 25       7                 15.5 – 25.5    20.5
26 – 35       2                 25.5 – 35.5    30.5
36 – 45       6                 35.5 – 45.5    40.5
46 – 55       5                 45.5 – 55.5    50.5
56 – 65       6                 55.5 – 65.5    60.5
66 – 75       4                 65.5 – 75.5    70.5
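The class frequencies of Example 9 can be verified by counting the observations that fall within each pair of class limits. The following Python sketch assumes inclusive classes starting at 16 with cw = 10 and u = 1, as in the solution; the function name grouped_fd is illustrative only.

```python
# Daily number of customers from Example 9
customers = [20, 48, 65, 25, 48, 49,
             35, 25, 72, 42, 22, 58,
             53, 42, 23, 57, 65, 37,
             18, 65, 37, 16, 39, 42,
             49, 68, 69, 63, 29, 67]

def grouped_fd(values, lcl1, cw, n, u=1):
    """Return (LCL, UCL, frequency, class mark) for n inclusive classes."""
    table = []
    for i in range(n):
        lcl = lcl1 + i * cw
        ucl = lcl + cw - u
        freq = sum(lcl <= v <= ucl for v in values)   # observations inside the class
        mark = (lcl + ucl) / 2                        # same as (LCB + UCB) / 2
        table.append((lcl, ucl, freq, mark))
    return table

for lcl, ucl, freq, mark in grouped_fd(customers, lcl1=16, cw=10, n=6):
    print(f"{lcl} - {ucl}: f = {freq}, cm = {mark}")
```

The six frequencies should add up to N = 30; if they do not, some observation lies outside the chosen classes.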
CYP 3
Construct a grouped frequency distribution for the following ages of 50 persons with 6
classes.
37 40 69 35 36 70 72 62 36 72
65 64 47 59 55 42 45 50 46 65
54 63 51 50 61 60 58 58 56 58
55 45 49 51 50 56 44 60 70 44
52 43 55 46 42 62 57 48 60 55
3.7. CUMULATIVE FREQUENCY DISTRIBUTION (CFD)
A cumulative frequency distribution shows the number of observations lying above or below
specified values in a distribution. A CFD is of two types.
a. ‘Less Than’ Cumulative Frequency Distribution (<CFD): shows the number
of cases lying below the upper class boundary of each class.
b. ‘More Than’ Cumulative Frequency Distribution (>CFD): shows the
number of cases lying above the lower class boundary of each class.
Remark: A frequency distribution does not tell us directly the number of units above
or below specified values of the classes; this can be determined from a ‘Cumulative
Frequency Distribution’.
Example 11. Consider the following frequency distribution.
Class (xi)    Frequency (fi)    Less than Cumulative    More than Cumulative
                                Frequency (<cfi)        Frequency (>cfi)
3 – 6         4                 4                       30
7 – 10        7                 11                      26
11 – 14       10                21                      19
15 – 18       6                 27                      9
19 – 22       3                 30                      3
This means that, from the ‘less than’ cumulative frequency distribution, there are 4
observations less than 6.5, 11 observations below 10.5, etc., and from the ‘more than’
cumulative frequency distribution, 30 observations are above 2.5, 26 above 6.5, etc.
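Both cumulative columns are running totals of the class frequencies, taken from the first class downwards for ‘less than’ and from the last class upwards for ‘more than’. A minimal Python sketch, assuming the frequencies 4, 7, 10, 6, 3 of Example 11:

```python
from itertools import accumulate

freqs = [4, 7, 10, 6, 3]     # class frequencies of Example 11

# 'Less than' cumulative frequencies: running total from the first class
less_than = list(accumulate(freqs))

# 'More than' cumulative frequencies: running total from the last class
more_than = list(accumulate(freqs[::-1]))[::-1]

print(less_than)   # [4, 11, 21, 27, 30]
print(more_than)   # [30, 26, 19, 9, 3]
```

The last ‘less than’ value and the first ‘more than’ value both equal the total number of observations, 30.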
3.8. RELATIVE FREQUENCY DISTRIBUTION (RFD)
It enables the researcher to know the proportion or percentage of cases in each class.
Relative frequencies can be obtained by dividing the frequency of each class by the total
frequency. A relative frequency can be converted into a percentage frequency by multiplying
it by 100%, i.e.
Rfi = fi / n
where Rfi is the relative frequency of the ith class,
      fi is the frequency of the ith class, and
      n is the total number of observations.
Note: Pfi = Rfi × 100%
where Pfi is the percentage frequency of the ith class.
Example 14: The relative and percentage frequency distributions of the frequency
distribution in Example 11 are:
xi         fi    Rfi      % freq. (Pfi)
3 – 6      4     4/30     4/30 × 100
7 – 10     7     7/30     7/30 × 100
11 – 14    10    10/30    10/30 × 100
15 – 18    6     6/30     6/30 × 100
19 – 22    3     3/30     3/30 × 100
Total      30    1        100%
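The fractions in the table can be evaluated numerically. The Python sketch below assumes the same five class frequencies and rounds the percentages to two decimal places; the rounding is a choice made here for display, not part of the definition.

```python
freqs = [4, 7, 10, 6, 3]                          # class frequencies
n = sum(freqs)                                    # total number of observations

relative = [f / n for f in freqs]                 # Rfi = fi / n
percent = [round(100 * r, 2) for r in relative]   # Pfi = Rfi x 100%

print(n)         # 30
print(percent)   # [13.33, 23.33, 33.33, 20.0, 10.0]
```

The relative frequencies always add up to 1; the rounded percentages may add up to slightly less or more than 100 because of rounding.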
3.9 SUMMARY
This unit discussed the definitions of classification of data and of a frequency distribution.
In order to describe situations, draw conclusions or make inferences about random
events, one must organize the data in some meaningful way. The most convenient
method of organizing data is to construct a frequency distribution.
A frequency distribution was therefore seen as a distribution showing the
correspondence of values or classes with their respective frequencies.
3.10 ANSWERS TO CHECK YOUR PROGRESS (CYP) QUESTIONS
CYP 1
Value(xi) Frequency(fi)
4 3
5 4
7 7
8 4
10 2
CYP 2
a) 12
b) 3
c) i) LCL4 = 20 and UCL4 = 24
ii) Since u = 10 – 9 = 1 (the gap between any two consecutive classes),
LCB3 = LCL3 – ½(u) = 15 – ½(1) = 14.5
UCB3 = UCL3 + ½(u) = 19 + ½(1) = 19.5
iii) class interval = class width = cw = UCB5 – LCB5 = 29.5 – 24.5 = 5
iv) class mark (cm2) = (UCB2 + LCB2) / 2
                     = (14.5 + 9.5) / 2
                     = 24/2
                     = 12
CYP 3
i) Largest Value (L) = 72 and Smallest Value (S) = 35
Range (R) = L – S = 72 – 35
          = 37
ii) N = 50 (total number of observations)
iii) Select the number of classes desired (usually between 5 and 20);
in this case, n = 6 is given.
iv) class width (cw) = Range / number of classes = R/n
i.e. cw = 37/6 = 6.1666… = 7 (round the answer up to the next whole number)
v) Select a starting point as the lower class limit (this is usually the smallest
score, i.e. LCL1 = 35). Add the class width (cw = 7) to that score to get the
lower limit of the next class. Keep adding until there are 6 classes, as
shown:
35
42
49
56
63
70
vi) Subtract one unit from the lower limit of the second class to get the upper
limit of the first class; then add the class width (cw) to each upper limit
to get all the upper limits, i.e. UCL1 = LCL2 – 1 = 42 – 1 = 41. So the
first class is 35-41.
vii) Tally the data (count the number of observations falling into the respective
classes) and write the numerical values of the tallies in the frequency column.
Therefore, the frequency distribution would be:
Class Limits    Tally                Frequency (fi)
35-41           /////                5
42-48           ///// ///// /        11
49-55           ///// ///// //       12
56-62           ///// ///// ///      13
63-69           /////                5
70-76           ////                 4
3.11. MODEL EXAMINATION QUESTIONS
Direction: Answer each of the following questions.
1. Determine whether each statement is true or false.
a) A frequency distribution is the organization of raw data, in table form, that lists
values or classes with their corresponding frequencies.
b) The mid point of a class is found by adding the upper and lower limits and
dividing by
c) If the gap between any two successive classes is one and the limits of a class are
10-19, then the width of the class is 9.
d) If the limits of a class in a frequency distribution are 26-30, then the boundaries
are 25.5-30.5.
e) When data is first collected, it is called raw data.
f) A frequency distribution should contain between 50 and 100 classes.

g) It is not important to keep the width of each class the same in a frequency
distribution.
2. Classify each variable as discrete or continuous.
a) Number of cartons of milk manufactured each day.

b) Temperatures of airplane interiors at a given airport.
c) Lifetimes of transistors in a stereo set.
d) Weights of newborn calves.
3. 100 employees were surveyed in a factory to find out their ages. The result was
obtained as follows.
32 21 28 31 35 46 48 49 49 48
36 37 22 31 28 34 20 45 44 48
38 33 33 23 28 29 33 26 36 30
43 42 32 36 24 27 27 32 45 45
39 39 38 32 33 25 30 28 37 36
42 43 38 40 35 34 20 30 36 32
40 38 38 40 46 36 35 21 31 35
41 42 39 40 46 44 32 37 22 27
41 39 40 38 44 45 48 36 32 23
40 41 40 44 49 49 49 49 37 33
Construct a Grouped Frequency Distribution (GFD) with five classes for the above data.
3.12. GLOSSARY
Raw Data: Data collected in original form.
Frequency: The number of values in a specific class of the distribution, or the number of
times a value occurs in the distribution.
Cumulative Frequencies: The total frequency of all values up to and including
the upper boundary of the class interval under consideration.
Frequency Distribution: A table showing classes or values with their corresponding
frequencies.
Class: A group of data values considered as one item in a frequency distribution.
Range: The difference between the largest and the smallest values in a set of data.
Class Interval: The difference between the class limits (boundaries) of a class.
Class Limits: The lower and upper limits of the different classes in a frequency distribution.
Class Boundaries: Boundaries obtained by subtracting half of the unit of measurement
from the lower class limit and adding it to the upper class limit.
3.13. REFERENCES
• Allen G. Bluman, Elementary Statistics: A Step By Step Approach
• Anderson, Sweeney, Williams, Statistics for Business and Economics, Fifth Edition, 1986
• Douglas A. Lind, Robert D. Mason, Basic Statistics for Business & Economics, Second Edition
• Richard I. Levin, Statistics for Management, Third Edition, 1984
• Stephen A. Book, Essentials of Statistics, 1978
UNIT 4: PRESENTATION OF DATA
CONTENTS:
4.0. Aims and Objectives
4.1. Introduction
4.2. Histogram
4.3. Frequency Polygon
4.4. Cumulative Frequency Curve (Ogive)
4.5. Line Graph
4.6. Vertical Line Graph
4.7. Bar Chart (Bar Diagram)
4.8. Types of Bar Charts
4.9. Pie Chart
4.10. Pictograph (Pictogram)
4.11. Summary
4.12. Answer to Check Your Progress Questions (CYP)
4.13. Model Examination Questions
4.14. Glossary
4.15. References
4.0. AIMS AND OBJECTIVES
The aim of this unit is to study how to construct and present data using different types of
graphs, charts, and diagrams that facilitate comparisons and, in general, give a good
overall picture of the data.
At the end of this unit, you will be able to:
• Define ‘presentation of data’
• Identify different types of ‘graphs’ and ‘charts’
• Identify the types of ‘bar charts’
• Construct a ‘histogram’, ‘frequency polygon’, ‘ogives’, and other graphs and ‘charts’.
4.1. INTRODUCTION
This unit deals with organizing a set of raw data into a Frequency
Distribution (FD) and describing the distribution graphically in a histogram, a frequency
polygon, and a cumulative frequency curve (ogive). Other types of numerical
information will be summarized and presented in the form of a bar chart, pie chart or
pictogram.
Definition:
Presentation is a statistical procedure of arranging and putting data in the form of
tables, graphs, charts and/or diagrams.
4.2. HISTOGRAM
After you complete a frequency distribution, your next step will be to construct a
“picture” of these data values using a histogram. A histogram is a graph consisting of a
series of adjacent rectangles whose bases are equal to the class widths of the
corresponding classes and whose heights are proportional to the corresponding class
frequencies. Here, class boundaries are marked along the horizontal axis (x – axis) and
the class frequencies along the vertical axis (y – axis) according to a suitable scale. It
describes the shape of the data. You can use it to answer quickly such questions as:
Are the data symmetric? Where do most of the data values lie?
Example 1. Consider the following GFD and construct a histogram.
Class      Frequency
3 – 6      4
7 – 10     7
11 – 14    10
15 – 18    6
19 – 22    3
Total      30
Solution:
[Figure: Histogram for the above distribution. Horizontal axis: class boundaries (CBi)
at 2.5, 6.5, 10.5, 14.5, 18.5 and 22.5; vertical axis: class frequency (fi).]
CYP 1 Construct a histogram for the following distribution.
Class      Frequency
5 – 10     4
10 – 15    7
15 – 20    9
20 – 25    12
25 – 30    6
30 – 35    5
4.3. FREQUENCY POLYGON
A frequency polygon is a line graph of a frequency distribution. Although a histogram does
demonstrate the shape of the data, the shape can perhaps be more clearly illustrated by using
a frequency polygon. Here, you merely connect the centers of the tops of the histogram bars
(located at the class midpoints) with a series of straight lines. The resulting figure is a
frequency polygon. The class marks are plotted along the x – axis and the class frequencies
along the y – axis. Empty classes are included at each end so that the curve is anchored
to the x – axis.
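The points to be joined in a frequency polygon can be listed before any drawing is done. The Python sketch below uses the GFD of Example 1 (classes 3 – 6 to 19 – 22) as sample input and assumes classes of equal width; the function name polygon_points is hypothetical.

```python
def polygon_points(classes, freqs):
    """Return (class mark, frequency) points, with an empty class added at each end."""
    width = classes[1][0] - classes[0][0]              # distance between successive lower limits
    marks = [(lcl + ucl) / 2 for lcl, ucl in classes]  # class marks (midpoints)
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(freqs) + [0]                       # the empty classes have frequency 0
    return list(zip(xs, ys))

classes = [(3, 6), (7, 10), (11, 14), (15, 18), (19, 22)]   # GFD of Example 1
freqs = [4, 7, 10, 6, 3]

for x, y in polygon_points(classes, freqs):
    print(x, y)
```

The first and last points have frequency 0, which is what anchors the polygon to the x – axis.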
Example 2. Construct a frequency polygon for the frequency distribution given in
Example 9.
Solution:
[Figure: A frequency polygon for the distribution in Example 9. Horizontal axis: class
marks (cmi); vertical axis: frequency (fi).]
CYP 2 construct a frequency polygon for the frequency distribution given under CYP 1
4.4 CUMULATIVE FREQUENCY CURVE, (OGIVE)
An ogive is the graphic representation of a cumulative frequency distribution. Ogives are of
two kinds: the ‘less than’ ogive (< ogive) and the ‘more than’ ogive (> ogive).
A) ‘Less than’ ogive: here, upper class boundaries are plotted against the ‘less than’
cumulative frequencies of the respective classes and the points are joined by straight lines.
Example 3. Draw a ‘less than’ ogive for the frequency distribution in Example 11.
Solution:
[Figure: A ‘less than’ ogive for the frequency distribution above. Horizontal axis: upper
class boundaries (UCBi) at 6.5, 10.5, 14.5, 18.5 and 22.5; vertical axis: ‘less than’
cumulative frequency (<cfi).]
B) ‘More than’ ogive: here, lower class boundaries are plotted against the ‘more
than’ cumulative frequencies of their respective classes, and adjacent points are joined by
straight lines.
Example 4. Draw a ‘More than’ ogive for the frequency distribution in Example 11
Solution:
[Figure: A ‘more than’ ogive for the above frequency distribution. Vertical axis: ‘more than’
cumulative frequency (>Cfi), scaled from 0 to 40; horizontal axis: lower class boundaries (LCBi)
2.5, 6.5, 10.5, 14.5, 18.5.]
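The cumulative frequencies behind the two ogives can be sketched in a few lines of Python. This is only an illustration: it assumes the class boundaries and frequencies of Example 1 of this unit (the axis values of the two ogives above match that distribution), and the variable names are hypothetical.

```python
# Class boundaries and frequencies, assumed to be those of Example 1.
boundaries = [2.5, 6.5, 10.5, 14.5, 18.5, 22.5]
freqs = [4, 7, 10, 6, 3]

# 'Less than' cumulative frequencies: running total starting from the first class.
less_than = []
total = 0
for f in freqs:
    total += f
    less_than.append(total)

# 'More than' cumulative frequencies: running total starting from the last class.
more_than = []
total = 0
for f in reversed(freqs):
    total += f
    more_than.append(total)
more_than.reverse()

# Points to plot: (upper class boundary, <Cf) and (lower class boundary, >Cf).
less_than_points = list(zip(boundaries[1:], less_than))
more_than_points = list(zip(boundaries[:-1], more_than))
print(less_than_points)
print(more_than_points)
```

Joining the first list of points gives the ‘less than’ ogive, and joining the second gives the ‘more than’ ogive.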
4.5. LINE GRAPH
It represents the relationship between time (on the x-axis) and the values of a variable (on the
y-axis). The values are recorded with respect to the time of occurrence.
Example 5. Draw a line graph for the following time series.
Year 1986 1987 1988 1989 1991

Values 20 10 30 15 1
Solution:
[Figure: A line graph showing the above time series. Vertical axis: values, scaled from 0 to 35;
horizontal axis: year, 1986 to 1991.]
4.6. VERTICAL LINE GRAPH:
A vertical line graph is a graphical representation of discrete data (or characteristics
expressed with whole numbers) with respect to their frequencies. Vertical solid lines are used
to indicate the frequencies.
Example 6. Draw a vertical line graph for the following data
Family               A   B   C   D   E
Number of children   3   2   7   6   4
Solution:
[Figure: Vertical line graph showing the number of children in families A, B, C, D and E.
Vertical axis (Y): number of children, 1 to 7; horizontal axis (X): family.]
4.7. BAR CHART (BAR DIAGRAM):
Histograms, frequency polygons and ogives are used for data having an interval or ratio level
of measurement. Other ways of presenting statistical data, suitable for particular
kinds of situations, are bar charts, pie charts and pictographs.
A bar chart is a series of equally spaced bars of uniform width, where the height (length) of
a bar represents the amount (magnitude) of frequency corresponding to a category.
Bars may be drawn horizontally or vertically. Vertical bar graphs are preferred as they
allow comparison with other bars.
4.8. TYPES OF BAR CHARTS
A. Simple Bar Chart:
It represents a single set of data (variable) classified in different categories. Single bars
are drawn with the respective frequencies.
Example 18: The revenue (in millions of Birr) of company X from 1980 to 1982 is given
below.

Year    Revenue
1980    50
1981    150
1982    200
Solution:
[Figure: A simple bar chart showing revenues of company X from 1980 to 1982. Vertical axis:
revenue, scaled from 0 to 250; horizontal axis: year.]
B. Multiple Bar Chart:
Here two or more bars are grouped, with the corresponding frequencies, to represent two or
more interrelated sets of data in each category. The bars of related variables are kept adjacent
to each other for every set of values. These charts can be used if the overall total is not
required. Each bar is shaded or colored separately and a key is given to distinguish
them.
Example 19: The following table shows the production of wheat and maize in hundreds
of quintals.

Year    Maize    Wheat
1980    40       80
1981    20       60
1982    60       100
Solution:
[Figure: Multiple bar chart titled "The number of quintals (in thousands) of wheat and maize
production". Vertical axis: number of quintals, scaled from 0 to 100; horizontal axis: year,
1980 to 1982; key: maize, wheat.]
C. Subdivided Bar Chart:
It is used to present data by subdividing a single bar in proportion to the component
frequencies. Each portion of the bar is then shaded or colored and a key is given to
distinguish them.
Example 20: The number of quintals of wheat and maize (in millions of quintals)
produced by country X in the indicated years.

Year    Wheat    Maize
1980    150      150
1981    300      200
1982    350      100
Solution:
[Figure: Subdivided bar chart titled "The number of quintals of wheat and maize produced by
country X". Vertical axis: number of quintals, scaled from 0 to 600; horizontal axis: year,
1980 to 1982; key: maize, wheat.]
D. Percentage Bar Chart:
It is a subdivided bar chart where percentages are used in each classification rather than the
actual frequencies.
Example 21: Construct a percentage bar chart for the data in Example 20.
Solution:

Year    % of Wheat Production    % of Maize Production
1980    150/300 × 100 = 50       150/300 × 100 = 50
1981    300/500 × 100 = 60       200/500 × 100 = 40
1982    350/450 × 100 ≈ 78       100/450 × 100 ≈ 22
[Figure: Percentage bar chart titled "Percentage of wheat and maize production from 1980-1982".
Vertical axis: percentage produced, 0% to 100%; horizontal axis: year; key: wheat, maize.]
4.9. PIE CHART
A pie chart is a circle divided into sectors whose areas are proportional to the values of
the components they represent. It shows the components in terms of percentages, not in
absolute magnitude. The angle formed at the center by each sector has to be proportional
to the value represented.
Example 22: The monthly expenditure of a certain family is given below.

Items           Expenditure    % Proportion             Degrees
Clothing        100            100/1000 × 100 = 10      100/1000 × 360° = 36°
Food            350            350/1000 × 100 = 35      350/1000 × 360° = 126°
House Rent      250            250/1000 × 100 = 25      250/1000 × 360° = 90°
Miscellaneous   300            300/1000 × 100 = 30      300/1000 × 360° = 108°
Total           1000           100%                     360°
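The percentage and angle columns of the table can be reproduced with a short Python sketch. It only restates the arithmetic of Example 22; the dictionary and variable names are illustrative assumptions.

```python
# Monthly expenditure from Example 22.
expenditure = {
    "Clothing": 100,
    "Food": 350,
    "House Rent": 250,
    "Miscellaneous": 300,
}
total = sum(expenditure.values())

# For each item: share of the total as a percentage and as a sector angle in degrees.
sectors = {}
for item, amount in expenditure.items():
    proportion = amount / total
    sectors[item] = (proportion * 100, proportion * 360)

for item, (percent, degrees) in sectors.items():
    print(f"{item}: {percent:.0f}% -> {degrees:.0f} degrees")
```

The percentages add up to 100 and the angles add up to 360, which is a quick check that no component has been left out.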
Solution: The pie chart for the above expenditure is as follows.
[Figure: Pie chart with four sectors: Food (350), Miscellaneous (300), House rent (250) and
Clothing (100).]
4.10. PICTOGRAPH (PICTOGRAM)
A pictograph is a graph that uses symbols or pictures to represent data.
Example 23: In comparing the population of a country from 1990 to 1992, we simply
draw pictures of people, where each picture may represent 1,000,000 people.
[Figure: Pictograph with one row of pictures for each of the years 1990, 1991 and 1992.
Key: one picture = 1,000,000 people.]
4.11. SUMMARY
This unit discussed how to present organized data. Once a frequency distribution is
constructed, representing the data with graphs is a simple task. The most
commonly used graphs in research statistics are the histogram, the frequency polygon and the
ogive. Other graphs and diagrams, such as bar charts, pie charts and pictograms, can also
be used. Some of these graphs are seen frequently in newspapers, magazines and
various statistical reports.
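4.12. ANSWERS TO CHECK YOUR PROGRESS (CYP) QUESTIONS
CYP 1
[Figure: Histogram for the CYP 1 distribution. Vertical axis (y): frequency; horizontal axis (x):
class boundaries (CBi) 5, 10, 15, 20, 25, 30, 35.]
CYP 2
[Figure: Frequency polygon for the CYP 1 distribution. Vertical axis (y): frequency;
horizontal axis (x): class marks (cmi) 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5.]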
4.13. MODEL EXAMINATION QUESTION
1. Determine whether each statement is true or false.
a. The ogive uses cumulative frequencies.
b. A histogram can be drawn by using vertical or horizontal bars.
c. In the construction of a frequency polygon, the class limits are used for
the x-axis.
d. Data collected over a period of time can be graphed by using a pie
chart.
e. When the data is represented graphically by symbols or pictures, the
graph is called a frequency curve.
2. Construct a histogram, a frequency polygon, and both ogives to represent the data
shown below.

Class Boundaries (CBi)    Frequency (fi)
5.5 - 10.5                1
10.5 - 15.5               2
15.5 - 20.5               3
20.5 - 25.5               5
25.5 - 30.5               4
30.5 - 35.5               3
35.5 - 40.5               2
4.14. GLOSSARY
Histogram: Refers to a statistical graph which represents, by the height of a rectangular
column, the number of times that each class of result occurs in a sample or experiment.
Frequency Polygon: Refers to the graph obtained when the midpoints of the tops of the
rectangles in a histogram having equal class intervals are connected by line segments.
Frequency Curve: Refers to a smooth frequency polygon for data that can take a
continuous set of values.
Ogives: Are cumulative frequency curves.
Bar Chart: Refers to a graph made up of bars whose lengths are proportional to
quantities in a set of data.
Pie Chart: Refers to a diagram wherein proportions are shown as sectors of a circle.
Pictogram: Refers to a diagram that shows statistical data in a pictorial form.
4.15 REFERENCES
Refer to the list of books in Unit 1.
BLOCK 3: MEASURES OF CENTRAL TENDENCY

UNIT 8: RELATIONSHIP BETWEEN MEAN, MEDIAN, AND MODE
INTRODUCTION
Visual presentation of data discloses some characteristic features of a mass of data.
Further summarization is essential to show the relationship between
variables and to correlate one variable with another. To describe the characteristic
features of an entire mass of data with a single value, the most obvious measure, and one that
helps in making quicker and better decisions, is the measure of central tendency, also called
the average. An average gives a bird's-eye view of a huge mass of data that is not
easily intelligible, since it is a numerical value that serves as a central point about which
the other values in a series are dispersed.
CONTENTS:
5.1 Definition
5.2 Purpose of Average
5.3 Requisites of a good average
5.4 Glossary
5.5 References
5.1 DEFINITION
Statistics provides tools to reduce each group of values to a single summary figure
representing that group. These representative values are called averages (the measures
of central tendency). In other words, they are measures which condense a huge, unwieldy
set of numerical data into a single value. This value always lies between the
minimum and maximum values; that is, it tends to be somewhere at the center. In
general, the measures of central tendency are divided into two groups:
1. Mathematical Measures of Central Tendency
2. Positional Measures of Central Tendency
5.2 PURPOSE OF AVERAGES

The main objectives of calculating an average are:
1. It provides one single value. An average reduces a complex mass of data to a
single representative value, which makes it possible to grasp the salient features of the data.
2. It facilitates comparison. Two sets of huge raw data can be compared
by working out their averages.
3. It facilitates statistical inference. An inference which is derived from the
values calculated from a sample is called a statistical inference. It is helpful in
knowing the parameters of the population.
4. It aids in decision-making. Management is often interested in knowing the
normal output of a plant, representative sales, the overall productivity index, the price
index, etc., and takes decisions on the future course of action. All these are
connotations of an average.
To sum up, averages are very useful in:
i) Describing the distribution in a concise manner
ii) Comparative study of different distributions
iii) Computing various other statistical measures such as dispersion, skewness
and other basic characteristics of a mass of data.
5.3 REQUISITES OF A GOOD AVERAGE

An average should be
• Rigorously (rigidly) defined.
• Easy to understand and calculate.
• Based on all values of the given data.
• Suitable for further mathematical treatment.
• Affected as little as possible by fluctuations of sampling.
• Not affected much by extreme observations. The extreme values should not
pull up or pull down the value of the average.
5.4 GLOSSARY
Fluctuation - Moving up and down or being irregular (of price, level, etc.)
Extreme values - Refers to the largest or smallest values borne by the
members of a set. The expression also signifies values neighboring the end
values.
Inference - Drawing a conclusion from facts or by reasoning.
Parameter - Refers to a characteristic or determining feature.
5.5 REFERENCES
• Business Statistics, C.R. REEDY, M.Com., Ph.D., 1994
• Business Statistics [A textbook for B.Com. Students of Indian Universities],
R.H. DHARESHWAR, M.Sc., M.Phil., 1999
CONTENTS:
6.0 Aims and Objectives
6.1 Introduction
6.2 Summation Notation and Its Properties
6.3 Arithmetic Mean (AM)
6.4 Geometric Mean (GM)
6.5 Harmonic Mean (HM)
6.6 Advantages and Application Areas of the Three Means
6.7 Summary
6.8 Model Examination Questions
6.9 Answers to Check Your Progress Questions
6.10 Glossary
6.11 References
6.0 AIMS AND OBJECTIVES

• Apply properties of summation notation to solve problems which involve the
summation operator.
• Understand the definition and properties of AM, GM and HM.
• Compute the three means for ungrouped and grouped data.
• Identify the advantages and application areas of the three means.
6.1 INTRODUCTION
The definition of an average should not leave anything to the discretion of the person who
calculates it. Averages should be computable with sufficient ease and rapidity and
should not involve undue mathematical complexity. The most popular and widely
used measure for representing the entire data by one value is the arithmetic mean.
6.2 SUMMATION NOTATION AND ITS PROPERTIES
The summation operator, Σ, implies that the values that follow it are to be summed or added
together. In the expression

$$\sum_{i=m}^{n} x_i$$

n is the upper limit, m is the lower limit, and $x_i$ is the i-th value of the variable x.

Example: $\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$
Properties:
1. The summation of sums or differences

$$\sum_{i=1}^{n}(x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i, \qquad \sum_{i=1}^{n}(x_i - y_i) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i$$

Example: Suppose x1 = 1, x2 = 3, x3 = 4, y1 = 2, y2 = 5, y3 = 3.
Then $\sum_{i=1}^{3}(x_i + y_i) = \sum_{i=1}^{3} x_i + \sum_{i=1}^{3} y_i$
(x1 + y1) + (x2 + y2) + (x3 + y3) = (x1 + x2 + x3) + (y1 + y2 + y3)
(1 + 2) + (3 + 5) + (4 + 3) = (1 + 3 + 4) + (2 + 5 + 3)
3 + 8 + 7 = 8 + 10
18 = 18
$\sum_{i=1}^{3}(x_i - y_i) = \sum_{i=1}^{3} x_i - \sum_{i=1}^{3} y_i$ …… left for the student
2. Multiplication by a constant

$$\sum_{i=1}^{n} k x_i = k \sum_{i=1}^{n} x_i$$

Example: Suppose k = 7 and x1 = 2, x2 = 5, x3 = 4.
Then $\sum_{i=1}^{3} k x_i = k \sum_{i=1}^{3} x_i$
kx1 + kx2 + kx3 = k(x1 + x2 + x3)
7(2) + 7(5) + 7(4) = 7(2 + 5 + 4)
14 + 35 + 28 = 7(11)
77 = 77
3. Summation of a constant
Case 1: If the lower limit is equal to 1, $\sum_{i=1}^{n} k = nk$
Example: Suppose k = 6 and the upper limit = 4.
Then $\sum_{i=1}^{4} 6 = 4(6)$
6 + 6 + 6 + 6 = 24
Case 2: If the lower limit m is greater than 1,

$$\sum_{i=m}^{n} k = (n - m + 1)k \quad \text{for } m < n$$

Example: Suppose k = 8, upper limit = 6 and lower limit = 4. Then
$\sum_{i=4}^{6} 8 = (6 - 4 + 1)8$
8 + 8 + 8 = 3(8)
24 = 24
4. Sum of summations

$$\sum_{i=1}^{k} x_i + \sum_{i=k+1}^{n} x_i = \sum_{i=1}^{n} x_i \quad \text{for any } k < n$$

Example: Suppose x1 = 2, x2 = 4, x3 = 6, x4 = 3, x5 = 2, x6 = 4 and let k = 3. Then
$\sum_{i=1}^{3} x_i + \sum_{i=4}^{6} x_i = \sum_{i=1}^{6} x_i$
(x1 + x2 + x3) + (x4 + x5 + x6) = x1 + x2 + x3 + x4 + x5 + x6
(2 + 4 + 6) + (3 + 2 + 4) = (2 + 4 + 6 + 3 + 2 + 4)
21 = 21
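The four properties can also be checked numerically. The following Python sketch simply re-runs the worked examples above with the built-in sum function; the variable names are illustrative.

```python
# Property 1: the sum of sums equals the sum of the separate sums.
x = [1, 3, 4]
y = [2, 5, 3]
prop1_left = sum(a + b for a, b in zip(x, y))
prop1_right = sum(x) + sum(y)

# Property 2: a constant factor can be taken outside the summation.
k = 7
x2 = [2, 5, 4]
prop2_left = sum(k * v for v in x2)
prop2_right = k * sum(x2)

# Property 3: summing a constant from i = m to i = n gives (n - m + 1) * k.
k3, m, n = 8, 4, 6
prop3_left = sum(k3 for i in range(m, n + 1))
prop3_right = (n - m + 1) * k3

# Property 4: a summation can be split at any index k4 < n.
x4 = [2, 4, 6, 3, 2, 4]
k4 = 3
prop4_left = sum(x4[:k4]) + sum(x4[k4:])
prop4_right = sum(x4)

print(prop1_left, prop2_left, prop3_left, prop4_left)
```

Each left-hand value agrees with the corresponding right-hand value, as in the hand calculations.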
CYP 1 Let $\sum_{i=3}^{6} x_i = 10$, $\sum_{i=3}^{6} x_i^2 = 148$, x1 = 3, x2 = 2.
Find
i. $\sum_{i=1}^{6} x_i$
ii. $\sum_{i=1}^{6} x_i^2$
iii. $\sum_{i=1}^{6} x_i (x_i - 2)$
iv. $\sum_{i=1}^{6} (2x_i - 3)^2$
v. $\sum_{i=1}^{2} (i x_i + 4)$
6.3 ARITHMETIC MEAN (AM)
6.3.0 Definition
The arithmetic mean is the sum of the values in a group divided by the number of items
in that group. Let x1, x2, …, xn be n values of a variable x; then their arithmetic mean is
defined by:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum x}{n}$$

where Σx = sum of all observations
n = total number of observations
6.3.1 Computation of Arithmetic Mean for Ungrouped and Grouped Data
For ungrouped data:
Direct method: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
Short cut method: $\bar{x} = A + \dfrac{\sum d}{n}$
where n = number of items, A = assumed mean, and Σd = sum of the deviations $d_i = x_i - A$.
Example: Find the arithmetic mean for the following data by
i. the direct method ii. the short cut method
23.4 15.6 22.1 20.0 26.7 31.4 18.9 22.3
Solution:
i. $\sum_{i=1}^{8} x_i = 180.4$, n = 8, so $\bar{x} = \dfrac{180.4}{8} = 22.55$
ii. Let A = 22; then di: 1.4, -6.4, 0.1, -2, 4.7, 9.4, -3.1, 0.3
$\sum_{i=1}^{8} d_i = 4.4$, n = 8, so $\bar{x} = A + \dfrac{\sum d_i}{n} = 22 + \dfrac{4.4}{8}$
= 22 + 0.55 = 22.55
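Both methods for ungrouped data can be sketched in Python as follows. The data and the assumed mean A = 22 come from the example above; the variable names are illustrative.

```python
data = [23.4, 15.6, 22.1, 20.0, 26.7, 31.4, 18.9, 22.3]
n = len(data)

# Direct method: sum of the observations divided by their number.
direct_mean = sum(data) / n

# Short cut method: assumed mean plus the average deviation from it.
A = 22
deviations = [x - A for x in data]
shortcut_mean = A + sum(deviations) / n

print(round(direct_mean, 2), round(shortcut_mean, 2))
```

Any value can be used as the assumed mean A; choosing one close to the data only keeps the deviations small when working by hand.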
For grouped data:

For discrete series:
Direct method: $\bar{x} = \dfrac{\sum_{i=1}^{n} f_i x_i}{n} = \dfrac{\sum fx}{n}$
Short cut method: $\bar{x} = A + \dfrac{\sum fd}{n}$
where f = frequency, d = deviation of items from the assumed mean (xi - A),
A = assumed mean, n = number of observations (n = Σf)
Example: The marks of 50 students in a class test are given below. Calculate the arithmetic
mean by i. the direct method ii. the short cut method.

Marks             20   30   40   50   60   70
No. of Students   8    10   16   8    5    3
Solution:

Marks (xi)   fi    fx      d = x - 40 (A = 40)   fd
20           8     160     -20                   -160
30           10    300     -10                   -100
40           16    640     0                     0
50           8     400     10                    80
60           5     300     20                    100
70           3     210     30                    90
Total        50    2010                          10

i. $\bar{x} = \dfrac{\sum fx}{n} = \dfrac{2010}{50} = 40.20$
ii. $\bar{x} = A + \dfrac{\sum fd}{n} = 40 + \dfrac{10}{50} = 40.20$
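For a discrete frequency distribution the same two methods weight each value by its frequency. A minimal Python sketch using the marks of the 50 students (variable names are illustrative):

```python
marks = [20, 30, 40, 50, 60, 70]
freqs = [8, 10, 16, 8, 5, 3]
n = sum(freqs)  # total number of observations

# Direct method: sum of f*x divided by n.
direct_mean = sum(f * x for x, f in zip(marks, freqs)) / n

# Short cut method: assumed mean A plus sum of f*d divided by n, with d = x - A.
A = 40
shortcut_mean = A + sum(f * (x - A) for x, f in zip(marks, freqs)) / n

print(direct_mean, shortcut_mean)
```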
For continuous series:
Direct method: $\bar{x} = \dfrac{\sum f \, cm_i}{n}$
Step deviation method: $\bar{x} = A + \left(\dfrac{\sum f d'}{n}\right) c$
where f = frequency, n = number of observations,
cmi = class mark, A = assumed mean,
d = deviation of class marks from the assumed mean (cmi - A),
d' = d/c, c = class width
Example: In a survey, the number of persons at different ages is found as follows:

Age in Years     5 - 15   15 - 25   25 - 35   35 - 45   45 - 55   55 - 65
No. of Persons   8        10        14        20        16        12
Solution:

Classes   f    cmi   fcm    d = cm - A (A = 30)   fd     d' = d/c   fd'
5 - 15    8    10    80     -20                   -160   -2         -16
15 - 25   10   20    200    -10                   -100   -1         -10
25 - 35   14   30    420    0                     0      0          0
35 - 45   20   40    800    10                    200    1          20
45 - 55   16   50    800    20                    320    2          32
55 - 65   12   60    720    30                    360    3          36
Total     80         3020                         620               62

i. $\bar{x} = \dfrac{\sum f \, cm}{n} = \dfrac{3020}{80} = 37.75$
ii. $\bar{x} = A + \dfrac{\sum fd}{n} = 30 + \dfrac{620}{80} = 30 + 7.75 = 37.75$
iii. $\bar{x} = A + \left(\dfrac{\sum f d'}{n}\right) c = 30 + \left(\dfrac{62}{80}\right) 10 = 30 + 0.775 \times 10 = 37.75$
CYP 2 Find the arithmetic mean of the following data.
i. 53 54 52 32 30 60 47 46 35 29
ii.
Height (in inches)   64   65   66   67   68   69   70   71   72   73
No. of Students      4    9    12   18   20   12   10   9    4    2
iii.
Classes       10 - 15   15 - 20   20 - 25   25 - 30   30 - 35
Frequencies   5         6         7         7         5
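The step deviation method for a continuous series can be sketched in Python using the age survey example (classes 5 - 15 to 55 - 65, A = 30, c = 10). The variable names are illustrative assumptions.

```python
classes = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55), (55, 65)]
freqs = [8, 10, 14, 20, 16, 12]
n = sum(freqs)

# Class mark = midpoint of each class.
class_marks = [(low + high) / 2 for low, high in classes]

# Direct method: sum of f * class mark divided by n.
direct_mean = sum(f * cm for cm, f in zip(class_marks, freqs)) / n

# Step deviation method: d' = (cm - A) / c, then mean = A + (sum of f*d' / n) * c.
A = 30
c = 10
step_sum = sum(f * (cm - A) / c for cm, f in zip(class_marks, freqs))
step_mean = A + (step_sum / n) * c

print(direct_mean, step_mean)
```

Dividing the deviations by the class width keeps the intermediate numbers small, which is why the method is convenient for hand calculation.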
6.3.2 Properties of the Arithmetic Mean

The mathematical properties of the arithmetic mean are discussed below:
1) The product of the arithmetic mean ($\bar{x}$) and the number of observations (N) on
which the mean is based is equal to the sum of all given values, i.e. $\bar{x} N = \sum_{i=1}^{n} x_i$
2) The algebraic sum of the deviations of the given values from the arithmetic mean
is equal to zero. Mathematically,
$\sum (x_i - \bar{x}) = 0$ … for ungrouped data
$\sum f_i (x_i - \bar{x}) = 0$ … for grouped data
Because of this property, the arithmetic mean may be characterized as a point of
‘balance’.
3) The sum of squares of deviations is minimum when the deviations are taken from the
arithmetic mean, i.e.
$\sum (x_i - \bar{x})^2 < \sum (x_i - A)^2$ … for ungrouped data
$\sum f_i (x_i - \bar{x})^2 < \sum f_i (x_i - A)^2$ … for grouped data
where A is any value different from the mean.
4) Suppose the mean of the values x1, x2, …, xn is $\bar{x}_0$. Then
i. if a constant k is added to each xi, then the new mean $\bar{x}_n$ will be $\bar{x}_0 + k$.
Proof: The arithmetic mean of x1 + k, x2 + k, …, xn + k is

$$A.M = \frac{\sum_{i=1}^{n} (x_i + k)}{n} = \frac{(x_1 + k) + (x_2 + k) + \cdots + (x_n + k)}{n}$$

$$A.M = \frac{(x_1 + x_2 + \cdots + x_n) + (k + k + \cdots + k)}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} + \frac{nk}{n}$$

$$A.M = \bar{x}_0 + k$$

ii. if each value is multiplied by a constant k, then the new mean will be $k\bar{x}_0$.
Proof: The A.M of kx1, kx2, …, kxn is

$$A.M = \frac{\sum_{i=1}^{n} k x_i}{n} = \frac{kx_1 + kx_2 + \cdots + kx_n}{n} = \frac{k(x_1 + x_2 + \cdots + x_n)}{n} = k\bar{x}_0$$
Example: Given the data 12, 10, 8, 6, 16, 7, 11. If each item is multiplied by 4/5 and 8 is
added, what will be the new mean?
$\bar{x}_0 = \dfrac{\sum_{i=1}^{7} x_i}{7} = \dfrac{70}{7} = 10$
New mean: $\bar{x}_n = \dfrac{4}{5}\bar{x}_0 + 8 = \dfrac{4}{5}(10) + 8 = 16$
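Property 4 can be checked numerically by transforming every item and recomputing the mean. The sketch below uses the data of the example above (multiply by 4/5, then add 8) and exact fractions to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

data = [12, 10, 8, 6, 16, 7, 11]
k = Fraction(4, 5)  # multiplier applied to every item
c = 8               # constant added to every item

old_mean = Fraction(sum(data), len(data))

# Transform each item, then take the mean of the transformed items.
transformed = [k * x + c for x in data]
new_mean = sum(transformed) / len(transformed)

# Property 4 says the new mean equals k * old_mean + c.
predicted_mean = k * old_mean + c
print(old_mean, new_mean, predicted_mean)
```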
CYP 3 Given the data 3, 8, 9, 4, 7, 5, 10, 11, 6. If each item is multiplied by 2 and 6 is
added, then
i. The new mean will be _______________
ii. $\sum (x_i - \bar{x}) =$ __________________
6.3.3 Combined (Pooled) Arithmetic Mean
Let $\bar{x}_1$ and $\bar{x}_2$ be the AM of two groups having N1 and N2 observations respectively. If
we combine the two groups into a single group, then the arithmetic mean of the
combined group will be

$$\bar{x}_c = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2}$$

For n groups,

$$\bar{x}_c = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2 + \cdots + N_n \bar{x}_n}{N_1 + N_2 + \cdots + N_n} = \frac{\sum_{i=1}^{n} N_i \bar{x}_i}{\sum_{i=1}^{n} N_i}$$
Example: The mean heights of 25 males and 20 females are 161.0cm and 155.6cm respectively. What
will be the combined mean height?
$\bar{x}_M$ = 161.0cm, $\bar{x}_F$ = 155.6cm, NM = 25, NF = 20

$$\bar{x}_c = \frac{\bar{x}_M N_M + \bar{x}_F N_F}{N_M + N_F} = \frac{161.0(25) + 155.6(20)}{25 + 20} = \frac{7137}{45} = 158.60\text{cm}$$
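The combined mean formula generalizes to any number of groups. A small Python sketch follows; the function name combined_mean is a hypothetical helper, not something defined in this module.

```python
def combined_mean(means, sizes):
    """Pooled mean of several groups, weighting each group mean by its size."""
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)

# Heights example: 25 males with mean 161.0cm and 20 females with mean 155.6cm.
height_mean = combined_mean([161.0, 155.6], [25, 20])
print(round(height_mean, 2))
```

Note that the combined mean lies closer to the mean of the larger group, which is why it differs from the simple average of the two group means.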
CYP 4 In a factory, 120 workers get an average wage of birr 30 a day, 160 workers get
Birr 50 a day, 80 workers get Birr 60 a day and 40 workers get birr 80 a day.
Find
i. the average of averages.
ii. the general average.
6.3.4 Weighted Arithmetic Mean

An item or value may be relatively more important or less important than other items.
This relative importance is technically known as weight. In cases where the relative
importance of the different items is not the same, we compute the weighted arithmetic mean.
If w1, w2, …, wn are the weights attached to the values x1, x2, …, xn respectively, then the
weighted AM is defined as

$$\bar{x}_w = \frac{x_1 w_1 + x_2 w_2 + \cdots + x_n w_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wx}{\sum w}$$
Example: An auto ride costs Birr 5 for the first km, Birr 4 for each of the next 3 kms and Birr 9
for each of the subsequent kms. Find the average cost per km for 10 kms.

Rate (Birr) x   Distance (km) w   xw
5.00            1                 5.00
4.00            3                 12.00
9.00            6                 54.00
Total           10                71.00

$$\bar{x}_w = \frac{\sum xw}{\sum w} = \frac{71.00}{10} = 7.10 \text{ Birr}$$
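The weighted arithmetic mean can be sketched in Python as below, using the auto ride example. The function name weighted_mean is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum of w*x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

rates = [5.00, 4.00, 9.00]  # Birr per km
distances = [1, 3, 6]       # km travelled at each rate (the weights)

average_cost = weighted_mean(rates, distances)
simple_average = sum(rates) / len(rates)
print(round(average_cost, 2), round(simple_average, 2))
```

The simple average of the three rates ignores how many kilometres were charged at each rate, so it does not give the true cost per km.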
CYP 5 Given data

Designation              Monthly Salary (in Birr) x   Strength of Cadre w
Class I Officer          1500                         10
Class II Officer         800                          20
Subordinate Staff        500                          70
Clerical Staff           250                          100
Lower Staff              100                          150
Calculate i. the simple arithmetic mean.
ii. the weighted arithmetic mean.
6.3.5 Correcting the Arithmetic Mean
Formula:

$$\text{Correct Mean} = \frac{\text{Wrong Sum} - \text{Wrong Entry} + \text{Correct Entry}}{\text{Total Number of Observations}}$$

Examples:
1. The average mark of 100 students was found to be 40, but later it was discovered that a
score of 33 was misread as 83. Find the correct average corresponding to the correct
sum.
$\bar{x} = 40$, N = 100, so $\sum x_i = \bar{x} N = 40(100) = 4000$ … wrong sum
Wrong Entry = 83
Correct Entry = 33
$\text{Correct Mean} = \dfrac{4000 - 83 + 33}{100} = \dfrac{3950}{100} = 39.5$
2. The average age of a class having 35 pupils is 14 years. When the age of the class
teacher is added to the sum of the ages of the pupils, the average rises by 0.5 year.
What must be the age of the teacher?
$\bar{x} = 14$, N = 35, so $\sum x_i = 14(35) = 490$ … sum of the ages of the pupils
$\bar{x} = 14.5$, N = 36, so $\sum x_i = 14.5(36) = 522$ … sum of the ages of the pupils and the teacher
Therefore, the age of the teacher is 522 - 490 = 32 years.
3. Goals scored by a football team in successive matches are 5, 2, 4, 3, 6, 0, 4 and 6.
What is the number of goals the team must score in the next match in order that the
average comes to 4 goals per match?
Total goal scored in 8 matches = 5 + 2 + 4 + 3 + 6 + 0 + 4 + 6 = 30
Total goal scored in 9 matches = x .N = 4  9 = 36
Hence the goals required in the 9th match to bring the average 4 = 36 – 30 = 6
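All three examples rest on the identity sum = mean × N. A minimal Python sketch of the correction formula and of the "required next value" calculation follows; both function names are hypothetical.

```python
def corrected_mean(wrong_mean, n, wrong_entry, correct_entry):
    """Rebuild the wrong sum from the mean, swap the bad entry, and re-average."""
    wrong_sum = wrong_mean * n
    return (wrong_sum - wrong_entry + correct_entry) / n

def value_needed(values, target_mean):
    """Next value required so that the mean of all values equals target_mean."""
    return target_mean * (len(values) + 1) - sum(values)

# Example 1: mean 40 over 100 students, with 33 misread as 83.
marks_mean = corrected_mean(40, 100, 83, 33)

# Example 3: goals so far, and the target average of 4 after the next match.
goals_needed = value_needed([5, 2, 4, 3, 6, 0, 4, 6], 4)

print(marks_mean, goals_needed)
```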
CYP6 The mean of 200 items is 50. Later on it is discovered that two items were
wrongly taken as 92 and 8 instead of 192 and 88. Find out the correct mean.
CYP7 The average rainfall for a week, excluding Sundays, was 10cm. Due to heavy
rainfall on Sunday, the average rainfall for the week rose to 15cm. How much
rain fall was there on Sunday?
6.4 GEOMETRIC MEAN (GM)

6.4.0 DEFINITION
The geometric mean is defined as the nth root of the product of n items or values of a series.
If there are two items, we take the square root; if there are three items, the cube root; and so
on.
Symbolically, let x1, x2, …, xn be the n values of a variable x; then their G.M is defined
as

$$G.M = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$

If the number of observations is three or more, the computation of the nth root is
very tedious. To simplify the computation, logarithms are used. In terms of logarithms,

$$\log G.M = \log \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \log (x_1 \cdot x_2 \cdots x_n)^{1/n} = \frac{1}{n} \log (x_1 \cdot x_2 \cdots x_n)$$

$$= \frac{1}{n} (\log x_1 + \log x_2 + \cdots + \log x_n) = \frac{1}{n} \sum_{i=1}^{n} \log x_i$$

Taking the antilog of both sides,

$$G.M = \text{Antilog}\left(\frac{1}{n} \sum_{i=1}^{n} \log x_i\right)$$
6.4.1 Computation of GM for Ungrouped and Grouped Data

For ungrouped data: $G.M = \text{Antilog}\left(\dfrac{1}{n} \sum_{i=1}^{n} \log x_i\right)$

For grouped data: $G.M = \text{Antilog}\left(\dfrac{1}{n} \sum f_i \log x_i\right)$, where $n = \sum f_i$

Example: Compute the GM of the following data.

i.   x: 23 27 54 35 50
ii.  x: 10 16 22 28 34
     f: 5 4 3 6 2
iii. Classes: 30 - 40   40 - 50   50 - 60   60 - 70
     fi:      5         8         4         3
Solution:
i.
x:       23       27       54       35       50
Log x:   1.3617   1.4314   1.7324   1.5441   1.6990
$\sum_{i=1}^{5} \log x_i = 7.7686$, n = 5
G.M = Antilog(7.7686/5) = Antilog(1.5537) ≈ 35.8
ii.
x:            10   16       22       28       34
f:            5    4        3        6        2
Log x:        1    1.2041   1.3424   1.4472   1.5315
fi log xi:    5    4.8164   4.0272   8.6832   3.0630
$\sum f_i \log x_i = 25.5898$, n = 20
G.M = Antilog(25.5898/20) = Antilog(1.2795) ≈ 19.0
iii.
Classes:       30 - 40   40 - 50   50 - 60   60 - 70
fi:            5         8         4         3
CMi:           35        45        55        65
Log CMi:       1.5441    1.6532    1.7404    1.8129
fi Log CMi:    7.7205    13.2256   6.9616    5.4387
$\sum f_i \log CM_i = 33.3464$, n = 20
G.M = Antilog(33.3464/20) = Antilog(1.6673) ≈ 46.5
CYP8 Calculate the GM for the following data.
i.   x: 8 40 175 1209 2000
ii.  x: 2 3 4 5 6
     f: 5 7 8 3 2
iii. Classes: 0 - 10   10 - 20   20 - 30   30 - 40   40 - 50   50 - 60
     fi:      2        5         6         18        13        6
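The logarithm method for the geometric mean can be sketched in Python with math.log10. The function below is an illustrative helper (its name and optional freqs argument are assumptions); it handles ungrouped data when no frequencies are given and grouped data otherwise, using class marks for continuous series.

```python
import math

def geometric_mean(values, freqs=None):
    """GM = antilog of (sum of f * log10(x)) / n, with n = sum of frequencies."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    n = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / n)

# Data of the three worked GM examples in section 6.4.1.
gm_ungrouped = geometric_mean([23, 27, 54, 35, 50])
gm_discrete = geometric_mean([10, 16, 22, 28, 34], [5, 4, 3, 6, 2])
gm_continuous = geometric_mean([35, 45, 55, 65], [5, 8, 4, 3])  # class marks

print(round(gm_ungrouped, 2), round(gm_discrete, 2), round(gm_continuous, 2))
```

Because the computer carries more decimal places than a four-figure log table, the printed values may differ slightly in the last digit from hand calculations.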
6.5 HARMONIC MEAN (H.M)

Definition:
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the given
observations.
Symbolically, if x1, x2, …, xn are n values of a variable x, then their HM is given by:

$$H.M = \frac{1}{\dfrac{1}{n}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}\right)} = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$

6.5.1 Computation of Harmonic Mean for Ungrouped and Grouped Data
For ungrouped data: $H.M = \dfrac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$
For grouped data: $H.M = \dfrac{n}{\sum \dfrac{f_i}{x_i}}$, where $n = \sum f_i$
Example: Find the harmonic mean of the following data.

i. x: 20 30
ii.
x   2   3   4   5   6
f   5   7   8   3   2
iii.
Classes   20 - 24   25 - 29   30 - 34   35 - 39   40 - 44   45 - 49   50 - 54
fi        11        18        32        37        21        47        13
Solution:
i. $H.M = \dfrac{2}{\dfrac{1}{20} + \dfrac{1}{30}} = \dfrac{2}{\dfrac{5}{60}} = \dfrac{120}{5} = 24$
ii.
x     2     3      4   5     6      Total
f     5     7      8   3     2      25
f/x   2.5   2.33   2   0.6   0.33   7.76
$\sum \dfrac{f_i}{x_i} = 7.76$, so $H.M = \dfrac{25}{7.76} = 3.22$
iii.
Classes   20 - 24   25 - 29   30 - 34   35 - 39   40 - 44   45 - 49   50 - 54   Total
fi        11        18        32        37        21        47        13        179
Cmi       22        27        32        37        42        47        52
fi/Cmi    0.5       0.67      1         1         0.5       1         0.25      4.92
$\sum \dfrac{f_i}{CM_i} = 4.92$, so $H.M = \dfrac{179}{4.92} = 36.38$
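The harmonic mean formulas translate directly into a short Python function. The helper below is an illustrative sketch (the name and the optional freqs argument are assumptions), applied to the data of the three examples above.

```python
def harmonic_mean(values, freqs=None):
    """HM = n / sum of (f / x), with n = sum of frequencies."""
    if freqs is None:
        freqs = [1] * len(values)  # ungrouped data: every value counted once
    n = sum(freqs)
    return n / sum(f / x for x, f in zip(values, freqs))

hm_ungrouped = harmonic_mean([20, 30])
hm_discrete = harmonic_mean([2, 3, 4, 5, 6], [5, 7, 8, 3, 2])
hm_continuous = harmonic_mean(
    [22, 27, 32, 37, 42, 47, 52],   # class marks
    [11, 18, 32, 37, 21, 47, 13],
)

print(round(hm_ungrouped, 2), round(hm_discrete, 2), round(hm_continuous, 2))
```

The code keeps the ratios f/x unrounded, so the last result can differ in the first decimal place from a hand calculation that rounds each ratio to two decimals.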
CYP9 Find the H.M of the following data.

i. x: 10 25 12 8 5
ii.
Marks             40   50   60   70
No. of Students   20   30   50   10
iii.
Classes   10 - 20   20 - 30   30 - 40   40 - 50   50 - 60   60 - 70
fi        4         6         10        12        5         3
6.6 ADVANTAGES AND APPLICATION AREAS OF THE THREE MEANS
6.6.1 Advantages:
All three means are
i. rigidly defined.
ii. based on all the observations.
iii. suitable for further mathematical treatment.
AM
iv. is easy to calculate and understand.
v. is least affected by fluctuations of sampling compared to other averages.
GM
iv. gives greater weightage to smaller values and smaller weightage to larger
values.
v. is the proper average for measuring relative change (such as the percentage increase
in population or in sales over a period of time).
HM
iv. is not affected very much by fluctuations of sampling.
v. is particularly useful in averaging speeds and special types of rates and ratios
where a time factor is involved.
vi. gives greater weightage to smaller values, since the reciprocals of the values
are involved.
6.6.2 Application Problems
1. Prove that
i. AM = GM = HM if all the values in a series are equal.
ii. HM < GM < AM if the values in a series are different.
Solution:
i. Suppose there are two items x and y in the series. Then

$$AM = \frac{x + y}{2}, \quad GM = \sqrt{xy}, \quad HM = \frac{2}{\dfrac{1}{x} + \dfrac{1}{y}}$$

If x = y = 7, then

$$AM = \frac{7 + 7}{2} = 7, \quad GM = \sqrt{7 \times 7} = 7, \quad HM = \frac{2}{\dfrac{1}{7} + \dfrac{1}{7}} = \frac{2 \times 7}{2} = 7$$

Therefore, AM = GM = HM
ii. Suppose there are two positive items x and y in the series.
Then $AM = \dfrac{x + y}{2}$, $GM = \sqrt{xy}$
If x ≠ y, then $\sqrt{x} - \sqrt{y} \neq 0$, so
$(\sqrt{x} - \sqrt{y})^2 > 0$
$x + y - 2\sqrt{xy} > 0$
$x + y > 2\sqrt{xy}$
$\dfrac{x + y}{2} > \sqrt{xy}$
AM > GM
Now consider $\dfrac{x + y}{2} > \sqrt{xy}$, which is proved above. Multiplying both sides by
$\dfrac{2\sqrt{xy}}{x + y}$, we get

$$\sqrt{xy} > \frac{2xy}{x + y} = \frac{2}{\dfrac{x + y}{xy}} = \frac{2}{\dfrac{1}{y} + \dfrac{1}{x}}$$

so GM > HM.
Therefore, HM < GM < AM
Note: We can have the following relationship between the three means.

$$AM \cdot HM = \frac{x + y}{2} \cdot \frac{2xy}{x + y} = xy$$

Since $GM = \sqrt{xy}$, taking the square root of AM · HM gives

$$GM = \sqrt{xy} = \sqrt{\frac{x + y}{2} \cdot \frac{2xy}{x + y}} = \sqrt{AM \cdot HM}$$

Therefore, $GM = \sqrt{AM \cdot HM}$ if there are only two positive observations in the series.
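The ordering HM < GM < AM and the two-value relationship GM² = AM · HM can be checked numerically. The sketch below uses the illustrative pair x = 4, y = 9, which is an assumption chosen only because its square roots are whole numbers.

```python
import math

x, y = 4, 9  # any two different positive values would do

am = (x + y) / 2
gm = math.sqrt(x * y)
hm = 2 / (1 / x + 1 / y)

# For two positive values the product AM * HM equals xy, i.e. GM squared.
product = am * hm
print(am, gm, round(hm, 4), round(product, 4))
```

Setting x equal to y makes all three means coincide, which is case i of the proof.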
2. The price of a commodity increased by 5% from 1979 to 1980, by 9% from 1980 to
1981 and by 73% from 1981 to 1982. The average increase from 1979 to 1982 is
quoted as 25.6% and not 29%. Verify.
Solution:

Year    Price at the end of the year, taking the preceding year as 100% (X)   Log X
1980    100 + 5 = 105                                                         2.0212
1981    100 + 9 = 109                                                         2.0374
1982    100 + 73 = 173                                                        2.2380
Total                                                                         6.2966

AM = (5 + 9 + 73)/3 = 87/3 = 29
GM = Antilog[1/3(6.2966)] = Antilog(2.0989) = 125.6
Therefore, the average rise in price is 125.6 - 100 = 25.6%
Verification:

Year    Rise    Actual price    Price at 25.6% growth    Price at 29% growth
1979            100             100                      100
1980    5%      105             125.6                    129
1981    9%      114.45          157.75                   166.41
1982    73%     198             198                      214.67

Thus GM is the best average to give us the true rise in price.
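The verification can be repeated in Python: compounding the price at the geometric-mean rate reproduces the actual final price, while compounding at the arithmetic-mean rate overshoots it. The variable names below are illustrative.

```python
import math

rises = [0.05, 0.09, 0.73]           # yearly price rises
factors = [1 + r for r in rises]     # growth factors 1.05, 1.09, 1.73

actual_final = 100 * math.prod(factors)

# Average growth rates: geometric mean of the factors vs. arithmetic mean of the rises.
gm_rate = math.prod(factors) ** (1 / len(factors)) - 1
am_rate = sum(rises) / len(rises)

final_with_gm = 100 * (1 + gm_rate) ** 3
final_with_am = 100 * (1 + am_rate) ** 3

print(round(actual_final, 2), round(final_with_gm, 2), round(final_with_am, 2))
print(round(gm_rate * 100, 1), round(am_rate * 100, 1))
```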
3. The world population has increased from 5 billion to 6 billion within 12 years. Calculate
the average increment per year.
Solution:
The average annual increase is computed by applying the formula

$$P_n = P_0 (1 + r)^n \quad \text{or} \quad r = \sqrt[n]{\frac{P_n}{P_0}} - 1$$

where Pn = the amount at the end of the period
Po = the amount at the beginning of the period
n = time (years)
r = rate of change
Pn = 6, Po = 5, n = 12, r = ?
$r = \sqrt[12]{6/5} - 1 \approx 1.015 - 1 = 0.015$
The average increment per annum ≈ 1.5%
Therefore, GM is used in determining the average percentage change in an amount.
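The compound growth formula Pn = Po(1 + r)^n can be solved for r in one line of Python. The function name average_growth_rate is a hypothetical helper used only for this sketch.

```python
def average_growth_rate(start, end, periods):
    """Solve end = start * (1 + r) ** periods for the per-period rate r."""
    return (end / start) ** (1 / periods) - 1

# World population example: from 5 billion to 6 billion in 12 years.
r = average_growth_rate(5, 6, 12)
print(f"{r:.4f}")

# Check: compounding the starting amount at rate r for 12 periods returns the end amount.
rebuilt_end = 5 * (1 + r) ** 12
```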
4. A machine depreciates by 40% in the first year, by 25% in the second year and by
10% per annum for the next three years, each percentage being calculated on the
diminishing value. What is the average percentage of depreciation for the entire
period?
Solution:

Depreciation (%)    Value after depreciation (%) = X    Log X
40%                 60%                                 1.7782
25%                 75%                                 1.8751
10%                 90%                                 1.9542
10%                 90%                                 1.9542
10%                 90%                                 1.9542
Total                                                   9.5159

GM = Antilog [1/5(9.5159)]
= Antilog (1.9032)
≈ 80
The average rate of depreciation per annum is 100 - 80 = 20%
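The depreciation problem can be checked in Python by taking the geometric mean of the yearly retention factors (the fraction of value kept each year). This is a sketch with illustrative variable names.

```python
import math

depreciation = [0.40, 0.25, 0.10, 0.10, 0.10]  # yearly depreciation rates
retained = [1 - d for d in depreciation]        # fraction of value kept each year

# Geometric mean of the retention factors over the five years.
mean_retained = math.prod(retained) ** (1 / len(retained))
average_depreciation = 1 - mean_retained

# Value of a machine worth 100 after five years, computed both ways.
value_actual = 100 * math.prod(retained)
value_from_average = 100 * mean_retained ** 5

print(round(mean_retained * 100, 2), round(average_depreciation * 100, 2))
```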
5. The weighted GM of the 5 numbers 10, 15, 25, 12 and 20 is 17.15. If the weights of the
first four numbers are 2, 3, 5 and 2 respectively, find the weight of the fifth
number.
Solution:

X       W       Log X     (Log X) · W
10      2       1.0000    2.0000
15      3       1.1761    3.5283
25      5       1.3979    6.9895
12      2       1.0792    2.1584
20      x       1.3010    1.3010x
Total   12 + x            14.6762 + 1.3010x

$\log 17.15 = \dfrac{14.6762 + 1.3010x}{12 + x}$
$1.2343 = \dfrac{14.6762 + 1.3010x}{12 + x}$
14.8116 + 1.2343x = 14.6762 + 1.3010x
-0.0667x = -0.1354
x = 2.03
The missing weight is 2.
6. A cyclist pedals from his house to his college at a speed of 8 kmph and back from the
college to his home at 12 kmph. Find the average speed.
Solution:
Let the distance between the house and the college be x kms. Then the distance from
house to college is covered in x/8 hrs and from college to house in x/12 hrs.
The total distance, 2x (house to college and back), is covered in (x/8 + x/12) hrs.

$$\text{Average Speed} = \frac{\text{Total distance traveled}}{\text{Time taken}} = \frac{2x}{x/8 + x/12} = \frac{2x}{5x/24} = \frac{48x}{5x} = 9.60 \text{ kmph}$$
7. Mr. Raga traveled a distance of 900 km by train at an average speed of 60 kmph, 200
km by boat at a speed of 20 kmph, 1000 km by plane at a speed of 800 kmph and finally 4
km by taxi at a speed of 25 kmph. What is the average speed for the entire distance?
Solution:

Speed (kmph) X    Distance (km) W    W/X
60                900                15.00
20                200                10.00
800               1000               1.25
25                4                  0.16
Total             2104               26.41

$$\text{Weighted HM} = \frac{\sum W}{\sum (W/X)} = \frac{2104}{26.41} = 79.67 \text{ kmph}$$
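Average speed is total distance divided by total time, which is exactly the weighted harmonic mean of the speeds with the distances as weights. A short Python sketch follows; the function name average_speed is an illustrative assumption.

```python
def average_speed(speeds, distances):
    """Weighted harmonic mean of speeds: total distance / total travel time."""
    total_time = sum(d / s for s, d in zip(speeds, distances))
    return sum(distances) / total_time

# Problem 7: train, boat, plane and taxi legs.
trip_speed = average_speed([60, 20, 800, 25], [900, 200, 1000, 4])

# Problem 6: equal distances out and back, so any common distance can be used.
cyclist_speed = average_speed([8, 12], [1, 1])

print(round(trip_speed, 2), round(cyclist_speed, 2))
```

When the distances are equal, as for the cyclist, the result reduces to the simple harmonic mean of the speeds.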
CYP 10 If the arithmetic mean and the geometric mean of two items is 12.5 and 10
respectively, then
i. find the HM of the two items.
ii. find the value of the two items.
CYP 11 A motorist travels at a uniform speed of 20 kmph, 60 kmph and 30 kmph from
A to B, B to C and C to D respectively. Find the average speed.
CYP 12 In a factory, a unit of work is completed by A in 5 minutes, by B in 7 minutes,
by C in 4 minutes, by D in 8 minutes and by E in 6 minutes.
i. What is their average rate of work?
ii. What is the average number of units of work completed per minute?
iii. At this rate, how many units of work will they complete in six hours a
day?
CYP 13 Find the average rate of increase in Population which in the first decade had
increased by 20%, in the next by 30% and in the third by 40%.
6.7. SUMMARY
The arithmetic mean is the average most widely used in practice, in all areas, because its
value is representative of all the items of the variable.
The geometric mean is widely used in averaging ratios and percentages and in computing
average rates of increase or decrease.
The harmonic mean is useful in comparing the values of a variable with a constant quantity of
another variable, e.g. time, rate, distance covered, quantities purchased or sold per unit,
etc.
6.8. MODEL EXAMINATION QUESTIONS

1. Given data:
Income ('000) Below 10 10-30 30-60 60-100 Above 100

# of Students 5 10 15 8 2
Calculate the mean income.
2. Calculate the number of shops corresponding to class interval 30 - 40 of the
following distribution if the mean profit is 28.
Profit per Shop 0-10 10-20 20-30 30-40 40-50 50-60

# of Shops 12 18 27 f4 17 6
3. Find the class intervals if the AM of the following distribution is 30.1 and
assumed mean is 31.5.
Step deviations (d') -3 -2 -1 0 1 2

Frequency (fi) 5 10 25 30 20 10
4. The mean weight of 150 students of a class is 60kgs. The mean weight of boys is
70kgs and that of girls is 55kgs. Find the number of boys and girls in the class.
5. The price of a commodity increased by 20% in 1989, decreased by 12% in 1990
and increased by 15% in 1991. Calculate the average annual change in price.
6. If the price of a commodity triples in a period of 6 years, what is the average
percentage increase per year?
7. A train runs the first 40 kms at a speed of 60 kmph, the next 60 kms at a speed of
80 kmph and the last 80 kms at a speed of 100 kmph. What is the average speed
of the train for the whole journey?
8. If the GM of two positive observations is 2/3 of their AM and the sum of the two
observations is 18, then
i. their HM is ____________
ii. the two observations are _________ and _________
6.9. ANSWERS TO CHECK YOUR PROGRESS QUESTIONS
CYP 1 i. 15 ii. 161 iii. 128 iv. 518 v. 15
CYP 2 i. 43.8 ii. 68 iii. 22.67
CYP 3 i. 20 ii. 0
CYP 4 i. 55 ii. 49
CYP 5 i. Br 630 iii. Br 302.86
CYP 6 50.9
CYP 7 45cm.
CYP 8 i. 168.4 ii. 3.347 iii. 32.20
CYP 9 i. 9.12 ii. 52.98 iii. 33.44
CYP 10 i. 8 ii. 5 and 20
CYP 11 30 kmph.
CYP 12 i. 5.65 minutes. ii. 0.177 units of work / minute. iii. 63.72 units of
work.
CYP 13 i. 29.7%
6.10. GLOSSARY
Assumed mean - Refers to an estimated or approximate value for the arithmetic mean
or average which is used to simplify its calculation. The nearer it is to
the mean, the smaller are the numbers involved.
Class Interval - The range of interval between the highest and lowest values allowed
in a particular class.
Depreciate - Make or become less in value (being diminished in value).
Deviation - Refers to the difference between a value of a variable and the mean of
its distribution.
Rate - Refers to standard of reckoning, obtained by bringing two numbers or
amounts into relationship like a period of time and a number of people,
Currencies, Tax, etc.
Time Series - Refers to a set of values of a variable recorded over a period of time.
6.11. REFERENCES
 Business Statistics, C.R. REEDY. M Com Ph. D., 1994

 Practical Business Statistics. T .K. Nag pal, P.S. Narayana. , 1988
 Business Statistics, Dr. J. S. CHANDAN, Prof. Jadjit Singh, Ph D. (USA), 1996
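UNIT 7: POSITIONAL MEASURES OF CENTRAL TENDENCY
CONTENTS:
7.1 Introduction
7.2 Mode
7.3 Median
7.4 Quartiles, Deciles and Percentiles
7.5 Summary
7.6 Model Examination Questions
7.7 Answers to Check Your Progress Questions
7.8 Glossary
7.9 References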
At the end of this unit, the students will be able to

• Define and state the importance of mode, median, quartiles, deciles and
percentiles.
• Calculate the modal value, median, quartiles, deciles and percentiles of a
certain distribution and interpret them.
7.1 INTRODUCTION
The mode and the median are called positional measures of central tendency. The term
position refers to the place of a value in the series. Values that divide a series into a number
of equal parts are called partition values. Besides the median, which divides a series into
two equal parts, the quartiles, deciles and percentiles are important partition values.
7.2 MODE
The value that occurs most frequently in a series of observations is called the mode. For
ungrouped data it can therefore be identified simply by inspecting the observations.
For grouped data, it is the value that has the greatest frequency density in its immediate
neighborhood.
Importance:
1. Mode can be used as a central location for qualitative as well as quantitative data,
like the median. For example, if impressions of beauty are rated ‘very beautiful’,
‘beautiful’ and ‘not beautiful’, and ‘beautiful’ is the rating given most often, then the
modal value is ‘beautiful’.
2. Like the median, and unlike the mean, the mode is not affected by extreme values.
3. Mode can be used when one or more of the classes are open-ended.
Computation of Mode for Ungrouped and Grouped Data

For ungrouped data: Mode ($\hat{x}$) = the value in the data set that occurs most often.
For grouped data: Discrete Series: Mode ($\hat{x}$) = the value of the variable corresponding
to the maximum frequency.
Continuous Series: The class corresponding to the maximum frequency is called the
modal class. The value of the mode is obtained by the following
interpolation formula.
\[ \text{Mode } (\hat{x}) = l + \left[ \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \right] \times c \]
or
\[ \text{Mode } (\hat{x}) = l + \left[ \frac{\Delta_1}{\Delta_1 + \Delta_2} \right] \times c \]
where l – LCB (lower class boundary) of the modal class
f1 – maximum frequency (frequency of the modal class)
f0 – frequency preceding f1
f2 – frequency succeeding f1
c – magnitude of the class
∆1 = f1 – f0 and ∆2 = f1 – f2
Example: Find the value of mode for the following data

i. 25, 15, 23, 40, 27, 25, 23, 25, 20, 19, 22, 24, 25
ii.
x 10 20 30 40 50 60
f 4 9 16 25 22 15
iii.
Classes 0-9 10 - 19 20 - 29 30 - 39 40 - 49 50 - 59 60 - 69 70 - 79
fi 328 350 720 664 598 524 378 244
Solution:
i. Mode = value which occurs most often
Mode = 25
ii. Mode = Value of the variable with maximum frequency
Mode = 40
iii. Modal class = 19.5 - 29.5 (the class 20 - 29, which has the maximum frequency)
l = 19.5, f0 = 350, f1 = 720, f2 = 664, c = 10
\[ \text{Mode } (\hat{x}) = l + \left[ \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \right] \times c \]
\[ = 19.5 + \left[ \frac{720 - 350}{(720 - 350) + (720 - 664)} \right] \times 10 = 19.5 + \frac{3700}{426} = 28.1854 \]
CYP 14 Find the modal value of the following data.

i. 27, 33, 42, 25, 23, 27, 25, 33, 27, 28, 16, 18
ii.
Height (in inches) 58 60 61 62 63 65 68 70
No. of Persons 4 6 10 8 20 24 9 5
iii.
Classes 0 - 400 400 - 800 800 - 1200 1200 - 1600 1600 - 2000
fi 4 12 40 41 27
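The interpolation formula for the mode of a continuous series, as used in this section, can be sketched in Python. The function name `grouped_mode` is an illustrative assumption, and so is the choice to treat a missing neighbouring class as having frequency 0 when the modal class is the first or the last one. The data are those of example iii (classes 0-9, 10-19, ..., converted to class boundaries).

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode of a continuous series: l + d1 / (d1 + d2) * c."""
    k = frequencies.index(max(frequencies))   # first class with the maximum frequency
    f1 = frequencies[k]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f0 = frequencies[k - 1] if k > 0 else 0
    f2 = frequencies[k + 1] if k + 1 < len(frequencies) else 0
    d1, d2 = f1 - f0, f1 - f2
    if d1 + d2 == 0:
        raise ValueError("formula is undefined when f0 = f1 = f2")
    return lower_boundaries[k] + d1 / (d1 + d2) * width


boundaries = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [328, 350, 720, 664, 598, 524, 378, 244]

print(round(grouped_mode(boundaries, freqs, 10), 4))  # 28.1854
```

The sketch only handles a single modal class; for a bi-modal series the module uses the empirical relation between mean, median and mode instead.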
7.3. MEDIAN
The median is that value of the variable, which divides the group in to two equal parts,
one part comprising all the values greater and the other all the values less than median.
Or median can be defined as the middle value of a set of data values when they are
arranged in ascending or descending order.
Importance:
• In dealing with qualitative data that can be ranked, the median is a more suitable average.
• The median is recommended if the distribution has unequal classes, since it is simpler
to compute than the mean.
• The median is especially useful in case of open-ended classes, since it is a positional
average and not a calculated one.
• The magnitudes of extreme deviations do not influence the median.
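Computation of Median for Ungrouped and Grouped Data
For ungrouped data:
First, rearrange the values in order of magnitude.
Then apply the following formula, where N is the number of observations.
If N is odd:
\[ \text{Median } (\tilde{x}) = \text{value of the } \left( \frac{N+1}{2} \right)^{\text{th}} \text{ item} = x_{\frac{N+1}{2}} \]
If N is even:
\[ \text{Median } (\tilde{x}) = \frac{1}{2} \left[ \text{value of the } \left( \frac{N}{2} \right)^{\text{th}} \text{ item} + \text{value of the } \left( \frac{N}{2} + 1 \right)^{\text{th}} \text{ item} \right] = \frac{1}{2} \left[ x_{\frac{N}{2}} + x_{\frac{N}{2}+1} \right] \]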
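For grouped data:

Discrete Series:
1. Compute the < cfi (less-than cumulative frequency) column.
2. Search the < cfi column for the value of $\frac{N+1}{2}$; if it is not available, consider the
value just greater than it. The value of the variable corresponding to this cumulative
frequency is the median.
Continuous Series:
1. Compute the < cfi column.
2. Search the < cfi column for the value of $\frac{N}{2}$; if it is not available, consider the value
just greater than it. The class corresponding to this cumulative frequency is the median class.
3. Then the following interpolation formula is used to calculate the median.
\[ \text{Median } (\tilde{x}) = l + \frac{c}{f} \left( \frac{N}{2} - c.f \right) \]
where l - LCB of the median class
c - class interval of the median class
f - frequency of the median class
c.f - cumulative frequency just less than $\frac{N}{2}$, i.e. that of the class preceding the median class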
Example: Find the median of the following data.
i. a) 27, 33, 42, 25, 23, 25, 33, 28, 27, 16, 18, 12
b) 8, 5, 2, 6, 15, 10, 25
ii.
x 4 6 8 10 12 14 16
f 2 4 5 3 2 4 1
iii.
x 50 - 60 60 - 70 70 - 80 80 - 90 90 - 100 100 - 110

fi 20 21 50 40 53 16
Solution:
i. a. Rearranging: 12 16 18 23 25 25 27 27 28 33 33 42
N = 12 … even
\[ \tilde{x} = \frac{1}{2} \left[ x_{\frac{N}{2}} + x_{\frac{N}{2}+1} \right] = \frac{1}{2} (x_6 + x_7) = \frac{1}{2} (25 + 27) = 26 \]
b. Rearranging: 2 5 6 8 10 15 25
N = 7 … odd
\[ \tilde{x} = x_{\frac{N+1}{2}} = x_4 = 8 \]
ii.
x 4 6 8 10 12 14 16
f 2 4 5 3 2 4 1
<cfi 2 6 11 14 16 20 21
N = 21
Median = the value of the $\left( \frac{N+1}{2} \right)^{\text{th}}$ item
= the value of the $\left( \frac{21+1}{2} \right)^{\text{th}}$ item
= the value of the 11th item
= 8
iii.
x 50 - 60 60 - 70 70 - 80 80 - 90 90 - 100 100 - 110
fi 20 21 50 40 53 16
<cfi 20 41 91 131 184 200
Median class = class containing the $\left( \frac{N}{2} \right)^{\text{th}}$ item = 100th item, i.e. 80 - 90
l = 80, c = 10, f = 40, c.f = 91
\[ \text{Median } (\tilde{x}) = l + \frac{c}{f} \left( \frac{N}{2} - c.f \right) = 80 + \frac{10}{40} (100 - 91) = 80 + \frac{9}{4} = 82.25 \]
CYP 15 Find the median of the following data.

i. a. 20 15 21 13 22 24 22 25 26 27 25
b. 120 125 112 137 129 127
ii.
x 28 30 32 34 36 38 40 42
f 14 15 16 24 16 10 6 4
iii.
x 30 - 34 35 - 39 40 - 44 45 - 49 50 - 54 55 - 59
fi 5 10 15 20 6 4
Determining the value of median Graphically:
Draw both the more than and less than ogives on the same graph. From the point of
intersection of these two curves, draw a perpendicular line to the x – axis. The foot of the
perpendicular line is the value of the median.
Examples: Find the median of the following frequency distribution graphically.
Classes 0 - 20 20 - 40 40 - 60 60 - 80 80 - 100
fi 15 25 30 14 16
Solution:
Classes 0 - 20 20 - 40 40 - 60 60 - 80 80 - 100
fi 15 25 30 14 16
<cfi 15 40 70 84 100
>cfi 100 85 60 30 16
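[Figure: The < & > Ogives. The less-than (<cfi) and more-than (>cfi) cumulative frequency curves, with cumulative frequency (CFi) on the vertical axis and class boundaries (CBi) on the horizontal axis.]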
The perpendicular line drawn from the intersection point meets the x-axis approximately
at 46. Therefore, the Median of the distribution is 46.
7.4 QUARTILES, DECILES AND PERCENTILES

Definitions:
Quartiles: are the three values, which divide the given data in to four equal parts. They
are denoted by Q1, Q2 and Q3.
Q1 - The lower or first quartile. It covers 25% of the distribution.
Q2 - The middle or second quartile. It covers 50% of the distribution.
Q3 - The upper or third quartile. It covers 75% of the distribution.
Deciles: are the nine values, which divide the series into ten equal parts. They are
denoted by D1, D2, … , D9.
D1 covers 10% of the distribution, D2 covers 20%, and so on up to D9, which covers 90%.
Percentiles: are the 99 values, which divide the series in to 100 equal parts. They are
denoted by P1, P2 , … , P99.
Note that: i. Q1 = P25 Q2 = D5 = P50 = median Q3 = P75
ii. D1 = P10, D2 = P20, D3 = P30, … , D9 = P90.
Importance:
The quartiles are more widely used in Economics and Business while the deciles and
percentiles are important in Psychology and Educational Statistics concerning grades,
rates, ranks, etc. The working principle for computing the partition value is basically the
same as that of computing the median.
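Computation of Quartiles, Deciles and Percentiles for Ungrouped and Grouped Data
For ungrouped data and discrete series:
First, for ungrouped data, rearrange the values in order of magnitude; for a discrete
series, compute the <cfi column. Then apply the following formulas.
\[ Q_i = \text{value of the } \left( \frac{i(N+1)}{4} \right)^{\text{th}} \text{ item}, \quad i = 1, 2, 3 \]
\[ D_i = \text{value of the } \left( \frac{i(N+1)}{10} \right)^{\text{th}} \text{ item}, \quad i = 1, 2, \ldots, 9 \]
\[ P_i = \text{value of the } \left( \frac{i(N+1)}{100} \right)^{\text{th}} \text{ item}, \quad i = 1, 2, \ldots, 99 \]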
For continuous series:
1. Compute the <cfi column.
2. Determine the quartile, decile or percentile class.
3. Apply the following interpolation formula.
\[ Q_i = l + \frac{c}{f} \left( \frac{iN}{4} - c.f \right) \]
\[ D_i = l + \frac{c}{f} \left( \frac{iN}{10} - c.f \right) \]
\[ P_i = l + \frac{c}{f} \left( \frac{iN}{100} - c.f \right) \]
Example: For the data given below, compute the value of Quartiles, D3, D7, P15 and P88
and interpret.
Marks Below 10 10 - 20 20 - 40 40 - 60 60 - 80 Above 80

No. of Students 10 15 25 30 14 6
<cfi 10 25 50 80 94 100
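Solution: N = 100
Q1 – size of the $\left( \frac{N}{4} \right)^{\text{th}}$ item = 25th item; quartile class 10 – 20
l = 10, c = 10, f = 15, c.f = 10
\[ Q_1 = l + \frac{c}{f} \left( \frac{N}{4} - c.f \right) = 10 + \frac{10}{15} (25 - 10) = 20 \]
The marks of 25% of the students are below 20.
Q2 – size of the $\left( \frac{2N}{4} \right)^{\text{th}}$ item = 50th item; quartile class 20 – 40
l = 20, c = 20, f = 25, c.f = 25
\[ Q_2 = l + \frac{c}{f} \left( \frac{2N}{4} - c.f \right) = 20 + \frac{20}{25} (50 - 25) = 40 \]
The marks of half of the students are below 40.
Q3 – size of the $\left( \frac{3N}{4} \right)^{\text{th}}$ item = 75th item; quartile class 40 – 60
l = 40, c = 20, f = 30, c.f = 50
\[ Q_3 = l + \frac{c}{f} \left( \frac{3N}{4} - c.f \right) = 40 + \frac{20}{30} (75 - 50) = 56.67 \]
The marks of three-fourths of the students are below 56.67.
D3 – size of the $\left( \frac{3N}{10} \right)^{\text{th}}$ item = 30th item; decile class 20 – 40
l = 20, c = 20, f = 25, c.f = 25
\[ D_3 = l + \frac{c}{f} \left( \frac{3N}{10} - c.f \right) = 20 + \frac{20}{25} (30 - 25) = 24 \]
The marks of 30% of the students are below 24.
D7 – size of the $\left( \frac{7N}{10} \right)^{\text{th}}$ item = 70th item; decile class 40 – 60
l = 40, c = 20, f = 30, c.f = 50
\[ D_7 = l + \frac{c}{f} \left( \frac{7N}{10} - c.f \right) = 40 + \frac{20}{30} (70 - 50) = 53.33 \]
The marks of 70% of the students are below 53.33.
P15 – size of the $\left( \frac{15N}{100} \right)^{\text{th}}$ item = 15th item; percentile class 10 – 20
l = 10, c = 10, f = 15, c.f = 10
\[ P_{15} = l + \frac{c}{f} \left( \frac{15N}{100} - c.f \right) = 10 + \frac{10}{15} (15 - 10) = 13.33 \]
The marks of 15% of the students are below 13.33.
P88 – size of the $\left( \frac{88N}{100} \right)^{\text{th}}$ item = 88th item; percentile class 60 – 80
l = 60, c = 20, f = 14, c.f = 80
\[ P_{88} = l + \frac{c}{f} \left( \frac{88N}{100} - c.f \right) = 60 + \frac{20}{14} (88 - 80) = 71.43 \]
The marks of 88% of the students are below 71.43.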
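Because quartiles, deciles and percentiles share one interpolation formula, a single helper can compute all of them. The following Python sketch is illustrative only: the name `partition_value` and its parameters are assumptions, and the bounds chosen for the two open-ended classes (lower bound 0 for "Below 10", width 20 for "Above 80") are placeholders that do not enter any of the values computed here.

```python
from itertools import accumulate


def partition_value(lowers, widths, freqs, i, parts):
    """i-th partition value of a continuous series: l + (c / f) * (i*N/parts - c.f).

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = i * n / parts
    cum = list(accumulate(freqs))              # the < cf column
    # Partition class: first class whose cumulative frequency reaches the target.
    k = next(j for j, cf in enumerate(cum) if cf >= target)
    cf_before = cum[k - 1] if k > 0 else 0
    return lowers[k] + widths[k] * (target - cf_before) / freqs[k]


lowers = [0, 10, 20, 40, 60, 80]       # 0 is a placeholder for the open class "Below 10"
widths = [10, 10, 20, 20, 20, 20]      # last width is a placeholder for "Above 80"
freqs = [10, 15, 25, 30, 14, 6]

for name, i, parts in [("Q1", 1, 4), ("Q2", 2, 4), ("Q3", 3, 4),
                       ("D3", 3, 10), ("D7", 7, 10),
                       ("P15", 15, 100), ("P88", 88, 100)]:
    print(name, round(partition_value(lowers, widths, freqs, i, parts), 2))
```

With these data Q3 falls in the class 40 - 60, whose preceding cumulative frequency is 50, so the script evaluates it as 40 + (20/30)(75 - 50) = 56.67.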
CYP 16 Compute the value of Quartiles, D4, P69 and interpret for the data given below.
i. 46 35 28 52 54 43 35 49 46 50 41
ii.
Daily Wages 40 45 50 55 60 65 70
No. of Workers 9 22 26 18 13 8 5
iii.
Rent in Birr    150-250 250-350 350-450 450-550 550-650 650-750 750-850 850-950
No. of Houses   8 10 15 25 40 20 15 7
7.5 SUMMARY
The arithmetic mean and the median satisfy the conditions of definition and stability. The
median has a distinct merit over the mean in that it is easier to calculate. The mode can be
located just by inspection, but if every value occurs the same number of times the mode is a
useless measure. The median, quartiles, deciles and percentiles are closely related: all
are partition values computed by the same working principle.
7.6 MODEL EXAMINATION QUESTIONS

1. In a class of 15 students, three failed in a test. The marks of 12 students who
passed were 9, 6, 7, 8, 8, 9, 6, 8, 7, 5, 4 and 7. What was the median of all the 15
students?
2. Calculate the mode and median for the distribution of the weights in kgs of 150
people from the data given below.
Weight in Kgs 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90
No. of People 18 37 45 27 15 8
3. For the data given below, find the missing frequencies if the median is 37 and the mode
is 43 million birr.
Fund raised in millions of birr   0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60
No. of NGO’s                      3 F2 16 20 F2 16
4. The following distribution shows the marks of 60 students in Economics.

Calculate Q3, D5, P57 and the median.
Marks 31 - 39 41 - 49 51 - 59 61 - 69 71 - 79 81 - 89 91 - 99
No. of Students 12 10 12 9 6 7 4
5. For the following data Q1 is found to be 41. Find the missing frequency.
Classes 30 - 34 35 - 39 40 - 44 45 - 49 50 - 54 55 - 59
fi 8 10 f3 20 12 25
7.7 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS
CYP14 i. 27 ii. 65 iii. 1226.6667
CYP15 i. a. 22 b. 126 ii. 34 iii. 44.5
CYP16 i. Q1 = 35 Q2 = 46 Q3 = 50 D4 = 43 P69 = 50
ii.Q1 = 45 Q2 = 50 Q3 = 60 D4 = 50 P69 = 55
iii.Q1 = 458 Q2 = 580 Q3 = 685 D4 = 542 P69 = 646.5
7.8 GLOSSARY
Interpolation - Refers to the process of calculating an unknown value of a

variable that is between two or more known values in the series.
Open-ended Classes - A class that allows either the upper or lower end of a
quantitative classification scheme to be limitless or
indeterminate.
Percentage - Refers to the proportion or rate per hundred parts. Symbol is %.
To write a number as a percentage, all that is needed is to
multiply it by 100.
Stability - Refers to quality of being firm or not likely to move or change.
7.9 REFERENCES
 Practical Business Statistics. T .K. Nag pal, P.S. Narayana. , 1988

 Statistics for Business and Economics. J. S. CHAN DAN., 1998
 Business Statistics [A Text Book for B. Com. Students of Indian
Universities] R.H DHARESHWAR, M.Sc. M. Phil., 1999
UNIT 8: RELATIONSHIP BETWEEN MEAN, MEDIAN AND MODE
CONTENTS:
8.1 Introduction
8.2 Symmetric and Moderately Skewed Distribution
8.3 Summary
8.4 Model Exam Questions
8.5 Answers to Check Your Progress Questions
8.6 Glossary
8.7 References

At the end of this unit, you will be able to
• Describe the empirical relationship existing between mean, median and mode.
• Determine the properties of symmetric distribution.
• Determine the skewness of a certain distribution.
8.1 INTRODUCTION
For a moderately symmetric (that is, only moderately skewed) distribution, the median lies
between the mean and the mode. An approximate relationship among these averages is:
Mean – Mode = 3 (Mean – Median) or
Mean – Median = 1/3 (Mean – Mode).
From this empirical relationship, we can see that the median is closer to the mean than the
mode is. If the maximum frequency is repeated or if the grouping gives two modal classes,
then the distribution is called a bi-modal distribution. In such a situation, the mode is obtained by:
Mean – Mode = 3 (Mean – Median) or
Mode = 3 Median – 2 Mean
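Example: Find the value of mode for the following distribution.
Wages            0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80
No. of Persons   10 40 20 0 10 40 16 14
Solution: The maximum frequency, 40, occurs twice, so the distribution is bi-modal and the
mode is found from the empirical relationship.
\[ \bar{x} = \frac{\sum f_i x_i}{N} = \frac{5890}{150} = 39.2667 \]
\[ \tilde{x} = l + \frac{c}{f} \left( \frac{N}{2} - c.f \right) = 40 + \frac{10}{10} (75 - 70) = 45 \]
Then
\[ \hat{x} = 3\tilde{x} - 2\bar{x} = 3(45) - 2(39.2667) = 135 - 78.5334 = 56.4666 \]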
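The empirical relation Mode = 3 Median – 2 Mean used in this example can be sketched in Python as follows. The helper names `grouped_mean`, `grouped_median` and `empirical_mode` are illustrative assumptions; the data are the wage classes and frequencies of the example, with class mid-points taken as lower limit plus half the class width.

```python
from itertools import accumulate


def grouped_mean(lowers, freqs, width):
    """Mean of a continuous series using class mid-points."""
    return sum(f * (l + width / 2) for l, f in zip(lowers, freqs)) / sum(freqs)


def grouped_median(lowers, freqs, width):
    """Median of a continuous series: l + (c / f) * (N/2 - c.f)."""
    n = sum(freqs)
    cum = list(accumulate(freqs))
    k = next(j for j, cf in enumerate(cum) if cf >= n / 2)
    cf_before = cum[k - 1] if k > 0 else 0
    return lowers[k] + width * (n / 2 - cf_before) / freqs[k]


def empirical_mode(mean, median):
    """Empirical relation: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


lowers = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]

mean = grouped_mean(lowers, freqs, 10)
med = grouped_median(lowers, freqs, 10)
print(round(mean, 4), med, round(empirical_mode(mean, med), 2))
```

The script carries the mean at full precision, so the last decimal of the mode may differ slightly from a hand calculation that first rounds the mean to four places.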
CYP 17 Calculate the mode using the empirical relationship of mean and median for the
following distribution.
Classes 130-134 135-139 140-144 145-149 150-154 155-159 160-164

fi 5 15 28 24 17 10 1
8.2 SYMMETRIC AND MODERATELY SKEWED DISTRIBUTION
A distribution is said to be symmetrical when the values of the variables, equidistant from
the mean, have equal frequencies.
Consider the following frequency distribution
Classes 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90
fi 12 18 25 36 25 18 12
In this distribution, the frequencies on either side of the central frequency are mirror
images of each other. Such a distribution is called a symmetric frequency
distribution. If we calculate the mean, median and mode for this distribution, we
find that $\bar{x} = \tilde{x} = \hat{x} = 55$.
Properties of Symmetric Distribution:

1. Mean = Median = Mode
2. The quartiles are equidistant i.e. Q3 – Q2 = Q2 – Q1
3. If we draw a frequency curve to the given frequency distribution, we will get a
symmetric curve.
Skewness is the study of the concentration of frequencies in a frequency distribution.
Skewness means lack of symmetry. If a distribution is not symmetric, then there is a
higher concentration of frequencies either in the upper half or in the lower half of the
distribution. This type of distribution is called a skewed frequency distribution. The
frequency curve of a skewed FD has a longer tail on one side.
There are two types of skewness
1. Positively skewed
2. Negatively skewed
[Figure: Frequency curves of a symmetric, a positively skewed and a negatively skewed distribution, each with the median ($\tilde{x}$) marked.]
Symmetric               Positively Skewed       Negatively Skewed
Mean = Median = Mode    Mean > Median > Mode    Mean < Median < Mode
Q2 – Q1 = Q3 – Q2       Q2 – Q1 < Q3 – Q2       Q2 – Q1 > Q3 – Q2
Example: Test the skewness of the following frequency distribution.

Solution:
Classes 59.5-62.5 62.5-65.5 65.5-68.5 68.5-71.5 71.5-74.5
fi 7 18 35 34 6
CMi 61 64 67 70 73
< Cfi 7 25 60 94 100
fi CMi 427 1152 2345 2380 438
\[ \bar{x} = \frac{\sum f_i x_i}{N} = \frac{6742}{100} = 67.42 \]
\[ \tilde{x} = l + \frac{c}{f} \left( \frac{N}{2} - c.f \right) = 65.5 + \frac{3}{35} (50 - 25) = 65.5 + 2.1 = 67.6 \]
\[ \hat{x} = l + \left[ \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \right] \times c = 65.5 + \left[ \frac{35 - 18}{(35 - 18) + (35 - 34)} \right] \times 3 = 65.5 + \frac{51}{18} = 68.3 \]
Since $\bar{x} < \tilde{x} < \hat{x}$, the distribution is negatively skewed.
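The skewness test in this example compares the three averages. A minimal Python sketch of that comparison is given below; the names `grouped_measures` and `skewness_type` are illustrative assumptions, and the helper assumes equal class widths and a single modal class.

```python
from itertools import accumulate


def grouped_measures(lowers, freqs, width):
    """Return (mean, median, mode) of a continuous series with equal class widths."""
    n = sum(freqs)
    mean = sum(f * (l + width / 2) for l, f in zip(lowers, freqs)) / n
    cum = list(accumulate(freqs))
    k = next(j for j, cf in enumerate(cum) if cf >= n / 2)      # median class
    cf_before = cum[k - 1] if k > 0 else 0
    median = lowers[k] + width * (n / 2 - cf_before) / freqs[k]
    m = freqs.index(max(freqs))                                  # modal class
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0
    f2 = freqs[m + 1] if m + 1 < len(freqs) else 0
    mode = lowers[m] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width
    return mean, median, mode


def skewness_type(mean, median, mode):
    """Classify a distribution by the ordering of its mean, median and mode."""
    if mean > median > mode:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    if mean == median == mode:
        return "symmetric"
    return "not determined by this rule"


mean, med, mode = grouped_measures([59.5, 62.5, 65.5, 68.5, 71.5],
                                   [7, 18, 35, 34, 6], 3)
print(round(mean, 2), round(med, 2), round(mode, 2), skewness_type(mean, med, mode))
```

The script keeps more decimals than the hand calculation, so it reports the median and mode as 67.64 and 68.33 rather than 67.6 and 68.3; the ordering and the conclusion are the same.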
CYP 18 Test the skewness of the following distribution.
Marks 31-39 41-49 51-59 61-69 71-79 81-89 91-99

No. of Students 12 10 9 12 6 7 4
8.3 SUMMARY
Skewness discloses how the manner in which the observations are distributed in a
particular distribution differs from that of a normal distribution.
The relation between the three averages, $\hat{x} = 3\tilde{x} - 2\bar{x}$, is applied if the distribution is
bi-modal or moderately symmetric.
Skewness is zero if Mean = Median = Mode, that is, if the distribution is symmetric.
8.4 MODEL EXAMINATION QUESTIONS
1. For a certain moderately symmetric distribution, if the mode is equal to 51 and the
median ($\tilde{x}$) is 55, then find
a. the mean of the distribution.
b. the type of skewness of the distribution.
2. For a certain symmetric distribution the first and the last deciles are 200 and 360
respectively. What is the modal value of the distribution?
3. Test the skewness of the following distribution.
Classes 3-10 11-18 19-26 27-34 35-42
fi 1 2 3 6 4
8.5 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS
CYP 17 Mode = 144.06
CYP 18 $\bar{x} = 59.5$, $\tilde{x} = 58.89$, $\hat{x} = 57.67$
Since $\hat{x} < \tilde{x} < \bar{x}$, the distribution is positively skewed.
8.6 GLOSSARY
Bi - modal - Refers to a distribution of data points in which two values occur more
frequently than the rest of the values in the data set.
Empirical - Derived from or relating to experiment and observation, rather than
theory.
Skewness - A form of asymmetry in a frequency distribution.
Symmetric FD -A frequency distribution in which the distribution of frequencies is
identical on both sides of the mode. The Mean, Median and Mode
coincide.
8.7 REFERENCES
 Business Statistics, C.R. REDDY. M. Com. PH.D, 1994

 Business Statistics. Dr. J. S. Chan Dan, Prof. Jagjit Singh, PhD. (USA). ,1996.
 Statistics for Business and Economics, J. S. CHAN DAN, 1998.
UNIT 13. PROBABILITY
CONTENTS
13.1 Introduction
13.2 Definition of Probability
13.3 Properties of Probability
13.4 Multiplication Rule of Probability
13.5 Conditional Probability
13.6 Addition Rule of Probability
13.7 Summary
13.8 Answer to check Your Progress questions
13.9 Model Exams
13.10 Glossary
13.11 References
13.0 AIMS AND OBJECTIVES
At the end of this unit, the student will be able to:

• define what is meant by probability as a measure of uncertainty
• determine the probability of an event of an experiment of various forms
• use the known probabilities of events in understanding and interpreting various
related phenomena
• make inferential investigations, applying standard statistical theory to evaluate the
accuracy and reliability of estimates, or to evaluate the level of
confidence used in a hypothesis test, where decisions based on a sample of
observations about the population from which the sample is taken may be erroneous.
At this level, therefore, you are required at least to determine the
probability of an event of an experiment by applying one or another of the
counting methods discussed in Unit 12.
13.1 INTRODUCTION
In the previous unit you saw methods of counting for finding the number of
elements in an event as well as in the sample space of an experiment, which is the basis
for determining the probability of an event. There are two approaches to the definition of
probability, but we will work mainly with one of them, namely
the classical definition of probability. You will see different techniques for determining
the probability of an event, but in every case you need to apply the counting methods
discussed in the previous unit.
Before we define the probability of an event of an experiment, let us introduce the
following notation.
Notation: The probability of an event E is written as P(E), read as "the probability of event E".
It is always expressed by a number between 0 and 1 inclusive, i.e. for any event E, we
must have
\[ 0 \le P(E) \le 1. \]
Note that the probability P(E) is a numerical measure of how likely event E is to occur.
13.2 DEFINITION OF PROBABILITY
13.2.1 Classical Definition of Probability:
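Suppose that in an experiment there are n equally likely outcomes and that an event E can
happen in m of these. Then
\[ P(E) = \frac{m}{n} = \frac{n(E)}{n(U)} \]
where n(E) is the number of outcomes in E and n(U) is the number of outcomes in the sample space U.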
Example 1: In rolling a regular die, what is the probability of getting an even number on
the upper face?
Solution: When a regular die is rolled, the number that faces up can be any one of the
six equally likely outcomes 1, 2, 3, 4, 5 or 6, and three of these are even.
Hence n(U) = 6 and n(E) = 3, where E = {2, 4, 6} and U = {1, 2, 3, 4, 5, 6}.
\[ P(E) = \frac{3}{6} = \frac{1}{2} \]
Example 2: In rolling a pair of regular dice, what is the probability of scoring a sum of
a) 8 b) 9 c) 10 d) 11 e) 12
Solution: n(U) = 36
a) E1 = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}, then P(E1) = 5/36
b) E2 = {(3, 6), (4, 5), (5, 4), (6, 3)}, then P(E2) = 4/36
c) E3 = {(4, 6), (5, 5), (6, 4)}, then P(E3) = 3/36
d) E4 = {(5, 6), (6, 5)}, then P(E4) = 2/36
e) E5 = {(6, 6)}, then P(E5) = 1/36
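The classical definition can be applied mechanically by listing the sample space and counting the favourable outcomes. The Python sketch below does this for the two-dice example; the function name `prob_of_sum` is an illustrative assumption. `Fraction` reduces results to lowest terms, so 4/36 is shown as 1/9.

```python
from fractions import Fraction
from itertools import product


def prob_of_sum(target, faces=6):
    """P(sum of two fair dice == target) by direct enumeration."""
    sample_space = list(product(range(1, faces + 1), repeat=2))   # 36 ordered pairs
    favourable = [pair for pair in sample_space if sum(pair) == target]
    return Fraction(len(favourable), len(sample_space))


for s in (8, 9, 10, 11, 12):
    print(s, prob_of_sum(s))
```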
Example 3: Five cards bearing the numerals 1, 3, 5, 7 and 9 are placed in a box and two are
withdrawn at random. What is the probability that the sum of the numbers shown on the
cards drawn is
a) 4 b) 8 c) 16 d) an even number e) an odd number
Solution: U = {(1, 3), (1, 5), (1, 7), (1, 9), (3, 5), (3, 7), (3, 9), (5, 7), (5, 9), (7, 9)}
n(U) = 10
a) P(E1) = 1/10 b) P(E2) = 2/10 c) P(E3) = 1/10
d) P(E4) = 1 e) P(E5) = 0
where E1 = {(1, 3)}, E2 = {(1, 7), (3, 5)}, E3 = {(7, 9)}, E4 = U and E5 = { }.
13.2.2 Relative Frequency Definition of Probability
(Empirical Approximation of Probability)
If an experiment is performed n times and an event E occurs in m of them, then
the ratio $\frac{m}{n}$ is called the observed relative frequency of the event for those n repeated
experiments. As the number of trials of the experiment increases, the observed relative
frequency $\frac{m}{n}$ of event E approaches the probability of event E. Hence:
Relative Frequency Definition of Probability: When an experiment is performed a large
number of times, the probability of an event E is approximately
\[ P(E) \approx \frac{\text{Number of times E occurred}}{\text{Number of times the experiment was repeated}} \]
Example 4: In an experiment of tossing a fair coin, if 1000 tosses of the coin result in 523
heads, then the observed relative frequency of heads is $\frac{523}{1000} = 0.523$. If another 1000
tosses result in 489 heads, then the observed relative frequency of heads is $\frac{489}{1000} = 0.489$.
Then the observed relative frequency of heads in the total of 2000 tosses is
\[ \frac{523 + 489}{2000} = \frac{1012}{2000} = 0.506. \]
According to the statistical definition, continuing in this manner, the observed relative
frequency of heads gets closer and closer to the number called the probability of a head in
a single toss of the coin, which is 0.5.
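The tendency of the observed relative frequency to settle near 0.5 can be illustrated by simulation. The Python sketch below is only an illustration under stated assumptions: the function name `relative_frequency_of_heads`, the use of a pseudo-random generator in place of a real coin, and the seed are all choices made for this example.

```python
import random


def relative_frequency_of_heads(tosses, seed=0):
    """Simulate fair-coin tosses and return the observed relative frequency of heads."""
    rng = random.Random(seed)                  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


# The pooled figure from the example: (523 + 489) / 2000
print((523 + 489) / 2000)                      # 0.506

# Simulated relative frequencies for an increasing number of tosses
for n in (100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Individual runs fluctuate, especially for small n, but the values for large n should lie close to 0.5.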
Example 5: How many five-digit numerals can be written using the digits 1, 3, 5, 7 and 9
if no digit is repeated in a numeral? If each numeral is equally likely to be chosen,
what is the probability that the number chosen
a) is odd b) is even c) has units digit 9
d) is greater than 50,000 e) is less than 40,000
Solution: The number of five-digit numerals that can be written is P(5, 5) = 5! = 120.
a) 1, since all the digits used to write the numerals are odd, the units digit is certainly
odd and thus all the five-digit numerals are odd.
b) 0, since the units digit can never be even, no numeral can be even;
thus it is an impossible event.
c) $\frac{1}{5}$, since the number of five-digit numerals whose units digit is 9 is 4! = 24 and
the probability of this event is $\frac{24}{120} = \frac{1}{5}$.
d) $\frac{3}{5}$, since for the number to be greater than 50,000 the ten-thousands digit has to be
selected only from 5, 7 or 9, and thereafter any of the digits 1, 3, 5, 7 or 9
which was not already selected can be selected once. Hence there are 3 × 4 × 3
× 2 × 1 = 72 different numbers greater than 50,000. So the probability of this
event is $\frac{72}{120} = \frac{3}{5}$.
e) $\frac{2}{5}$, since the ten-thousands digit must be 1 or 3, there are 2 × 4 × 3 × 2 × 1 = 48
different five-digit numerals less than 40,000. So the probability of this event
is $\frac{48}{120} = \frac{2}{5}$.
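The counts in this example are small enough to verify by listing every numeral. The Python sketch below enumerates all arrangements of the digits 1, 3, 5, 7 and 9 with `itertools.permutations`; the helper name `prob` is an illustrative assumption.

```python
from fractions import Fraction
from itertools import permutations

# All five-digit numerals formed from 1, 3, 5, 7, 9 with no digit repeated
numbers = [int("".join(p)) for p in permutations("13579")]


def prob(condition):
    """Classical probability: favourable numerals / all numerals."""
    return Fraction(sum(1 for n in numbers if condition(n)), len(numbers))


print(len(numbers))                    # 120
print(prob(lambda n: n % 2 == 1))      # 1
print(prob(lambda n: n % 2 == 0))      # 0
print(prob(lambda n: n % 10 == 9))     # 1/5
print(prob(lambda n: n > 50_000))      # 3/5
print(prob(lambda n: n < 40_000))      # 2/5
```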
Example 6: From a jar containing 4 white, 3 red and 2 black balls, all identical except in
color, three balls are drawn at random. How many different outcomes are there? What
is the probability that an outcome consists of
a. 3 white balls b. 3 red balls c. 2 white and 1 red balls
d. 2 red and 1 white balls e. 1 white and 2 black balls f. 2 red and 1 black
g. 1 red and 2 black h. one from each color i. 2 white and 1 black balls
Solution: There are 9 balls in total. Hence the number of possible outcomes of drawing
3 balls randomly is c(9, 3) = 84. Thus
a. $P(3W) = \frac{c(4,3)}{c(9,3)} = \frac{4}{84} = \frac{1}{21}$
b. $P(3R) = \frac{c(3,3)}{c(9,3)} = \frac{1}{84}$
c. $P(2W, 1R) = \frac{c(4,2) \times c(3,1)}{c(9,3)} = \frac{6 \times 3}{84} = \frac{18}{84} = \frac{3}{14}$
d. $P(1W, 2R) = \frac{c(4,1) \times c(3,2)}{c(9,3)} = \frac{4 \times 3}{84} = \frac{1}{7}$
e. $P(1W, 2B) = \frac{c(4,1) \times c(2,2)}{c(9,3)} = \frac{4}{84} = \frac{1}{21}$
f. $P(2R, 1B) = \frac{c(3,2) \times c(2,1)}{c(9,3)} = \frac{6}{84} = \frac{1}{14}$
g. $P(1R, 2B) = \frac{c(3,1) \times c(2,2)}{c(9,3)} = \frac{3}{84} = \frac{1}{28}$
h. $P(1W, 1R, 1B) = \frac{c(4,1) \times c(3,1) \times c(2,1)}{c(9,3)} = \frac{4 \times 3 \times 2}{84} = \frac{2}{7}$
i. $P(2W, 1B) = \frac{c(4,2) \times c(2,1)}{c(9,3)} = \frac{6 \times 2}{84} = \frac{12}{84} = \frac{1}{7}$
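Each part of this example has the same shape: a product of combinations for the wanted colors divided by c(9, 3). A minimal Python sketch using the standard-library `math.comb` follows; the names `BALLS` and `draw_probability` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

BALLS = {"W": 4, "R": 3, "B": 2}     # contents of the jar: white, red, black


def draw_probability(**wanted):
    """P(drawing exactly the given number of balls of each named color)."""
    total_ways = comb(sum(BALLS.values()), sum(wanted.values()))
    favourable = 1
    for color, k in wanted.items():
        favourable *= comb(BALLS[color], k)
    return Fraction(favourable, total_ways)


print(draw_probability(W=3))              # 1/21
print(draw_probability(W=2, R=1))         # 3/14
print(draw_probability(W=1, B=2))         # 1/21
print(draw_probability(W=1, R=1, B=1))    # 2/7
```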
CYP 1 A committee consisting of 5 persons is to be chosen randomly from a group of 6
men and 4 women. What is the probability that exactly 2 of the members of the
committee are women?
CYP 2 If 3 light bulbs are chosen at random from 10 bulbs of which 3 are defective, then
what is the probability that
a. none of them is defective b. all are defective
c. exactly one is defective d. exactly two are defective
e. at least two are non-defective.
CYP 3 If a committee of 3 persons is to be randomly chosen from a group of 4 men
and 2 women, what is the probability that exactly one of the members of the committee is a
woman?
CYP 4 Suppose a two-letter word consisting of one vowel and one consonant is written from
the letters of the word “GONDAR”. Whether or not it has a meaning, what is the
probability that a randomly chosen word is either “DO” or “GO”?
CYP 5 A three-digit whole number is written using the digits 1, 2, 3, …, 9. If a digit is
used at most once in a whole number, then what is the probability that a randomly chosen
number is divisible by 2?
13.3 PROPERTIES OF PROBABILITY OF EVENT
Definition:
In an experiment, an event that is certain to occur is called a sure event, and an event that
is certain not to occur is called an impossible event.
Note: In an experiment any event E is either a sure event, an impossible event or somewhere
in between. Therefore the probability of any event E can be expressed as $0 \le P(E) \le 1$,
where the probability of a sure event is 1 and the probability of an impossible event is 0,
i.e. $P(S) = 1$,
$P(\emptyset) = 0$ and $0 < P(E) < 1$ for any event E such that $E \ne S$ and $E \ne \emptyset$. The sum of the
probabilities of all the sample points is 1. Here S is the sample space of the experiment.
Example 1: In rolling a fair die on a flat surface, the event of getting a “7” on the upper
face is an impossible event and its probability is 0, while the event of getting a number
between 0 and 7 on the upper face is a sure event and its probability is 1. The probability
of an event E which is a proper subset of the sample space is between 0 and 1, provided
that $E \ne \emptyset$.
Definition:
In an experiment two or more events are said to be mutually exclusive events iff they
cannot occur simultaneously.
Note: In an experiment mutually exclusive events are pairwise disjoint, and their union is a
subset of the sample space of the experiment.
Example: In rolling a fair die, the event E1 of getting a prime number and the event E2
of getting a composite number on the upper face are two mutually exclusive events, since E1 =
{2, 3, 5} and E2 = {4, 6} cannot occur simultaneously.
Definition:
In an experiment two events are said to be complementary iff they are disjoint and their
union is the sample space.
Rule of complementary events: If $E$ and $E'$ are two complementary events of an
experiment, then $P(E) + P(E') = 1$.
Example 3: In rolling a regular die, what is the probability that the face that appears up
does not show a composite number?
Solution: U = {1, 2, 3, 4, 5, 6}. Let E = {4, 6}; then $E'$ = {1, 2, 3, 5}.
\[ P(E) = \frac{2}{6} \quad \text{and} \quad P(E') = 1 - P(E) = 1 - \frac{2}{6} = \frac{4}{6} \]
Example 4: In tossing a fair 5-cent coin three times, what is the probability of achieving
at least one head in the three tosses?
Solution: U = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. Let E be the event
consisting of no head, i.e. E = {TTT}; then $E'$ is the event consisting of at least one head.
\[ P(E) = \frac{1}{8} \quad \text{and} \quad P(E') = 1 - P(E) = 1 - \frac{1}{8} = \frac{7}{8} \]
Example 5: Suppose a family plans to have four children. What is the probability that not
all the children have the same sex, if it is equally likely for a son or a daughter to be born?
Solution: n(U) = 16. Let E be the event that the children are all sons or all daughters, i.e.
E = {SSSS, DDDD}; then
\[ P(E) = \frac{2}{16} \quad \text{and} \quad P(E') = 1 - P(E) = 1 - \frac{2}{16} = \frac{14}{16} = \frac{7}{8} \]
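The rule of complementary events is convenient whenever the "opposite" event is easier to count. The Python sketch below applies it to the coin and family examples by listing the sample space; the function name `prob_complement` is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product


def prob_complement(sample_space, event):
    """Return P(E') = 1 - P(E), where `event` tests whether an outcome is in E."""
    p_event = Fraction(sum(1 for o in sample_space if event(o)), len(sample_space))
    return 1 - p_event


# Three tosses of a fair coin: E = no head, E' = at least one head
tosses = list(product("HT", repeat=3))
print(prob_complement(tosses, lambda o: "H" not in o))        # 7/8

# Four children: E = all the same sex, E' = not all the same sex
children = list(product("SD", repeat=4))
print(prob_complement(children, lambda o: len(set(o)) == 1))  # 7/8
```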
Definition:
Two events are said to be independent if the occurrence of one does not affect the
probability of the occurrence of the other. Several events are similarly independent if the
occurrence of any one of them does not affect the probabilities of the occurrence of the others. If
two events are not independent, then they are said to be dependent. Similarly, if several
events are not independent, then they are said to be dependent.
Example 6: In rolling a pair of fair dice, let E1 be the event that a prime number
appears on the upper face of the first die and E2 be the event that a composite
number appears on the upper face of the second die. Since the occurrence of E1
does not affect the probability of the occurrence of E2, E1 and E2 are said to be
independent events.
Example 7: Suppose a box contains 10 balls, all identical except in color, of which 6
are white and 4 are black. If one ball is drawn randomly and is found to be
white, and a second ball is then drawn randomly without replacement, the probability that the
second ball is white is $\frac{5}{9}$ and that it is black is $\frac{4}{9}$. But the probability that the first ball is
white was $\frac{6}{10}$ and that it is black was $\frac{4}{10}$. Hence the two events are dependent events, since the
occurrence of one affects the probability of the occurrence of the other.
Note: If the balls were drawn with replacement, the two events would be independent
since the probabilities of a second event to occur would not be affected by the occurrence
of the first.
Example 8: If 3 light bulbs are chosen at random from a dozen bulbs of which 4 are defective, what is the probability that
a) none is defective b) all are defective
c) 1 is defective and 2 are non-defective d) 2 are defective and 1 is non-defective
Solution: There are C(12, 3) = 220 ways of choosing 3 bulbs from 12.
a) $\frac{C(8,3)}{220} = \frac{56}{220} = \frac{14}{55}$
b) $\frac{C(4,3)}{220} = \frac{4}{220} = \frac{1}{55}$
c) $\frac{C(4,1) \times C(8,2)}{220} = \frac{4 \times 28}{220} = \frac{28}{55}$
d) $\frac{C(4,2) \times C(8,1)}{220} = \frac{6 \times 8}{220} = \frac{12}{55}$
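The four answers in Example 8 follow one counting pattern, which can be sketched in a few lines of code. This is an illustrative sketch only; the function name `p_defective` and its default arguments are assumptions chosen to match the example.

```python
from fractions import Fraction
from math import comb

def p_defective(k, drawn=3, defective=4, good=8):
    """P(exactly k defective) when `drawn` bulbs are chosen without replacement."""
    total = comb(defective + good, drawn)            # C(12, 3) = 220
    favourable = comb(defective, k) * comb(good, drawn - k)
    return Fraction(favourable, total)

results = {k: p_defective(k) for k in range(4)}
for k, p in results.items():
    print(k, p)
# 0 14/55
# 1 28/55
# 2 12/55
# 3 1/55
```

Because the four cases cover every possibility, their probabilities add up to 1, which is a quick check on the arithmetic.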
Example 9: Suppose that from a box containing 7 white and 3 black balls we draw 2 balls one after the other without replacement. What is the probability of drawing 1 white and 1 black ball?
Solution: The probability of drawing a white ball 1st and a black ball 2nd is $\frac{7}{10} \times \frac{3}{9} = \frac{21}{90}$.
The probability of drawing a black ball 1st and a white ball 2nd is $\frac{3}{10} \times \frac{7}{9} = \frac{21}{90}$. Hence the total probability of drawing 1 white and 1 black is
$$\frac{21}{90} + \frac{21}{90} = \frac{42}{90} = \frac{7}{15} \quad \text{or} \quad P(1w, 1b) = \frac{C(7,1) \times C(3,1)}{C(10,2)} = \frac{7 \times 3}{45} = \frac{7}{15}$$
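Both routes in Example 9 can be confirmed by listing every ordered pair of draws. The sketch below is an illustration under the assumption that each ordered pair of distinct balls is equally likely; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

balls = ["W"] * 7 + ["B"] * 3

# Ordered draws of two distinct balls; indices keep same-coloured balls apart.
draws = list(permutations(range(len(balls)), 2))
mixed = [d for d in draws if balls[d[0]] != balls[d[1]]]

p_mixed = Fraction(len(mixed), len(draws))
print(len(draws), len(mixed), p_mixed)   # 90 42 7/15
```

The 90 ordered pairs correspond to the denominator 10 × 9 in the solution, and the 42 mixed pairs to the two cases of 21 each.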
CYP 6 Suppose 30 men and 20 women are attending a conference and if 3 participants
are randomly selected to report on the discussion, find the probability that at least one is
a woman.
CYP 7 Suppose a test consists of 10 true-false questions. An unprepared student answers each question by guessing at random. What is the probability that he gives
a) all correct answers b) no correct answer c) 5 correct answers d) at least one correct answer
CYP 8 Among the 12 nominees for the board of directors of a farm cooperative, there are
8 men and 4 women. In how many ways can the members select any two of the
nominees as directors? What is the probability that the selection consists of
a) both men b) both women c) one man and one woman
13.4 MULTIPLICATION RULE OF PROBABILITY
In an experiment, the probability that two independent events E1 and E2 both occur is given by
$$P(E_1 \cap E_2) = P(E_1) \times P(E_2)$$
In general, the probability that n independent events E1, E2, …, En all happen is given by
$$P(E_1 \cap E_2 \cap \dots \cap E_n) = P(E_1) \times P(E_2) \times \dots \times P(E_n)$$
Example 1: Suppose a die is thrown twice. What is the probability that the 1st throw is less than 3 and the 2nd throw is less than 4?
Solution: Let E1 be the event that the 1st throw is less than 3 and E2 the event that the 2nd throw is less than 4. Then
$$P(E_1 \cap E_2) = P(E_1) \cdot P(E_2) = \frac{2}{6} \cdot \frac{3}{6} = \frac{1}{6}$$
Example 2: Suppose one box contains 5 black and 3 white balls and a second box contains 4 black and 6 white balls. If one ball is drawn from each box, what is the probability that
a) both are black b) both are white c) 1 is white and 1 is black
Solution: a) Let $E_1$ be the event of drawing a black ball from the 1st box and $E_2$ the event of drawing a black ball from the 2nd box. Then $E_1$ and $E_2$ are independent.
$$P(E_1 \cap E_2) = P(E_1) \cdot P(E_2) = \frac{5}{8} \cdot \frac{4}{10} = \frac{1}{4}$$
b) $\bar{E}_1$ is then the event of drawing a white ball from the 1st box and $\bar{E}_2$ the event of drawing a white ball from the 2nd box. $\bar{E}_1$ and $\bar{E}_2$ are also independent events.
$$P(\bar{E}_1 \cap \bar{E}_2) = P(\bar{E}_1) \cdot P(\bar{E}_2) = \frac{3}{8} \cdot \frac{6}{10} = \frac{9}{40}$$
c) We get 1 white and 1 black if we draw either black from the 1st box and white from the 2nd box, or white from the 1st box and black from the 2nd box. These two cases are mutually exclusive. Thus
$$P[(E_1 \cap \bar{E}_2) \cup (\bar{E}_1 \cap E_2)] = P(E_1) \cdot P(\bar{E}_2) + P(\bar{E}_1) \cdot P(E_2) = \frac{5}{8} \cdot \frac{6}{10} + \frac{3}{8} \cdot \frac{4}{10} = \frac{21}{40}$$
or
$$P(1w, 1b) = 1 - [P(E_1 \cap E_2) + P(\bar{E}_1 \cap \bar{E}_2)] = 1 - \left(\frac{10}{40} + \frac{9}{40}\right) = \frac{21}{40}$$
or
$$P(1w, 1b) = \frac{C(5,1) \times C(6,1) + C(3,1) \times C(4,1)}{C(8,1) \times C(10,1)} = \frac{30 + 12}{80} = \frac{21}{40}$$
Example 3: What is the probability of getting two consecutive kings if two cards are drawn at random from a deck of 52 playing cards and
a) the 1st card is replaced before the 2nd card is drawn
b) the 1st card is not replaced before the 2nd card is drawn
Solution: a) There are 4 kings among the 52 cards. Thus the probability that both the 1st and the 2nd card drawn are kings is $\frac{4}{52} \cdot \frac{4}{52} = \frac{1}{169}$ (the two events are independent).
b) If the 1st card drawn is a king and is not replaced, then only 3 kings remain among the other 51 cards, so the probability that both the 1st and the 2nd card are kings is $\frac{4}{52} \cdot \frac{3}{51} = \frac{1}{221}$.
Example 4: If A and B are events such that P(A) = 0.7, P(B) = 0.4 and $P(A \cap B) = 0.2$, are A and B independent events?
Solution: Since P(A) · P(B) = 0.7 × 0.4 = 0.28 and $P(A \cap B) = 0.2$, we have $P(A \cap B) \ne P(A) \cdot P(B)$; therefore A and B are not independent events.
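The independence test used in Example 4, and the two card calculations in Example 3, can be written out as a short sketch. The function name `are_independent` is hypothetical, and exact fractions are used so that the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a * p_b == p_a_and_b

# Example 4: P(A) = 0.7, P(B) = 0.4, P(A and B) = 0.2
example_4 = are_independent(Fraction("0.7"), Fraction("0.4"), Fraction("0.2"))
print(example_4)                      # False, because 0.7 * 0.4 = 0.28

# Example 3: two kings with and without replacement
p_with = Fraction(4, 52) * Fraction(4, 52)
p_without = Fraction(4, 52) * Fraction(3, 51)
print(p_with, p_without)              # 1/169 1/221
```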
CYP 9 If P(A) = 0.8, P(B) = 0.25 and $P(A \cap B) = 0.2$, are A and B independent events?
CYP 10 Find the probability that a "6" turns up exactly once in two tosses of a fair die.
CYP 11 Two cards are drawn from a well-shuffled deck of 52 playing cards. Find the probability that both are picture cards
a) if the cards are drawn with replacement
b) if the cards are drawn without replacement
CYP 12 Find the probability of three consecutive 2's turning up in rolling a fair die three times.
13.5 CONDITIONAL PROBABILITY
When two events are dependent, conditional probability is used to express the probability of one event given that the other has occurred.
Definition:
If A and B are two dependent events, then the probability of event B occurring given that event A has occurred, denoted by P(B/A) and read as "the probability of B given A", is called the conditional probability of B given that A has occurred. It is given by
$$P(B/A) = \frac{P(B \cap A)}{P(A)}, \quad \text{provided } P(A) \ne 0$$
Note: If A and B are independent events, then P(B/A) must equal P(B), since the occurrence of A does not affect P(B). Hence
$P(A \cap B) = P(A) \cdot P(B)$ if A and B are independent events, and
$P(A \cap B) = P(A) \cdot P(B/A) = P(B) \cdot P(A/B)$ if A and B are dependent events.
Example 1: Suppose there are 30 applicants for a job in a certain organization, cross-classified by sex and color as follows.
          Black   White
Male        12      8
Female       4      6
Assume that each applicant is equally likely to be chosen for the job. What is the probability that the applicant chosen is
a) black b) white c) male d) female
e) male and black f) female and black g) male and white h) female and white
Solution: Let B stand for the set of black applicants, W for the set of white applicants, M for the set of male applicants and F for the set of female applicants.
a) $P(B) = \frac{12 + 4}{30} = \frac{8}{15}$ b) $P(W) = \frac{8 + 6}{30} = \frac{7}{15}$
c) $P(M) = \frac{12 + 8}{30} = \frac{2}{3}$ d) $P(F) = \frac{4 + 6}{30} = \frac{1}{3}$
e) $P(M \cap B) = \frac{12}{30}$ f) $P(F \cap B) = \frac{4}{30}$
g) $P(M \cap W) = \frac{8}{30}$ h) $P(F \cap W) = \frac{6}{30}$
Example 2: In Example 1 above, find the probability that the applicant chosen is
a) male given that a black applicant is chosen b) male given that a white applicant is chosen
c) female given that a black applicant is chosen d) female given that a white applicant is chosen
Solution:
a) $P(M/B) = \frac{P(M \cap B)}{P(B)} = \frac{12}{30} \div \frac{8}{15} = \frac{12}{16} = \frac{3}{4}$
b) $P(M/W) = \frac{P(M \cap W)}{P(W)} = \frac{8}{30} \cdot \frac{15}{7} = \frac{4}{7}$
c) $P(F/B) = \frac{P(F \cap B)}{P(B)} = \frac{4}{30} \cdot \frac{15}{8} = \frac{1}{4}$
d) $P(F/W) = \frac{P(F \cap W)}{P(W)} = \frac{6}{30} \cdot \frac{15}{7} = \frac{3}{7}$
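The conditional probabilities in Example 2 can be read directly off the cross-classification table. The sketch below is illustrative only; the dictionary layout and the function names are assumptions, not part of the module.

```python
from fractions import Fraction

# Applicant counts from Example 1, keyed by (sex, color).
counts = {
    ("male", "black"): 12, ("male", "white"): 8,
    ("female", "black"): 4, ("female", "white"): 6,
}
total = sum(counts.values())          # 30 applicants

def p_joint(sex, color):
    """P(sex and color), e.g. P(M and B) = 12/30."""
    return Fraction(counts[(sex, color)], total)

def p_color(color):
    """Marginal probability of a color, e.g. P(B) = 16/30."""
    return sum(p_joint(sex, color) for sex in ("male", "female"))

def p_sex_given_color(sex, color):
    """Conditional probability P(sex / color) = P(sex and color) / P(color)."""
    return p_joint(sex, color) / p_color(color)

print(p_sex_given_color("male", "black"))     # 3/4
print(p_sex_given_color("male", "white"))     # 4/7
print(p_sex_given_color("female", "black"))   # 1/4
print(p_sex_given_color("female", "white"))   # 3/7
```

Within each color the conditional probabilities of male and female add up to 1, because conditioning on a color restricts the sample space to that column of the table.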
CYP 13
Suppose there are 80 employees in a company, classified by academic background and experience as shown below. If an employee is randomly selected to be chairperson of the employees' association, find the probability that the selected person has
a) experience below 10 years given that he (she) is a graduate
b) experience below 10 years given that he (she) is not a graduate
c) experience of 10 years or above given that he (she) is a graduate
d) experience of 10 years or above given that he (she) is not a graduate
Experience             Academic Background
                       Graduate   Non-Graduate
Below 10 years            8           26
10 years and above       22           24
13.6 ADDITION RULE OF PROBABILITY
In an experiment, the probability of one or the other of the events A or B happening is given by
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
If A and B are mutually exclusive, then $P(A \cup B) = P(A) + P(B)$, since $A \cap B = \varnothing$ and $P(A \cap B) = P(\varnothing) = 0$.
In general, if the probabilities of n mutually exclusive events E1, E2, …, En happening are P1, P2, …, Pn respectively, then the probability of one or the other of the n mutually exclusive events occurring is given by
$$P(E_1 \cup E_2 \cup \dots \cup E_n) = P(E_1) + P(E_2) + \dots + P(E_n) = P_1 + P_2 + \dots + P_n$$
Example 1: In throwing a pair of dice, what is the probability of getting a sum between 6 and 10, i.e. a sum of 7, 8 or 9?
Solution: Let E1 be the event of getting a sum of 7; then E1 = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}, hence $P(E_1) = \frac{6}{36}$. Let E2 be the event of getting a sum of 8; then E2 = {(2,6), (3,5), (4,4), (5,3), (6,2)}, hence $P(E_2) = \frac{5}{36}$. Let E3 be the event of getting a sum of 9; then E3 = {(3,6), (4,5), (5,4), (6,3)}, hence $P(E_3) = \frac{4}{36}$. Since E1, E2 and E3 are mutually exclusive events,
$$P(E_1 \cup E_2 \cup E_3) = P(E_1) + P(E_2) + P(E_3) = \frac{6}{36} + \frac{5}{36} + \frac{4}{36} = \frac{15}{36} = \frac{5}{12}$$
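The addition rule for mutually exclusive events in Example 1 can be checked by listing all 36 outcomes of the two dice. This is a minimal sketch; the helper name `p_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))      # 36 equally likely outcomes

def p_sum(target):
    """Probability that the two dice add up to `target`."""
    return Fraction(sum(1 for a, b in rolls if a + b == target), len(rolls))

# Sums of 7, 8 and 9 are mutually exclusive, so their probabilities add.
p_between = p_sum(7) + p_sum(8) + p_sum(9)

# Direct count of outcomes with a sum strictly between 6 and 10.
direct = Fraction(sum(1 for a, b in rolls if 6 < a + b < 10), len(rolls))

print(p_between, direct)   # 5/12 5/12
```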
Example 2: Nine cards bearing the numerals 1, 2, 3, …, 9 are placed in a box and one card is drawn at random. What is the probability that the card drawn bears either an odd number or a multiple of 3?
Solution: Let E1 be the event that an odd number is drawn, i.e. E1 = {1, 3, 5, 7, 9}, and E2 the event that a multiple of 3 is drawn, i.e. E2 = {3, 6, 9}. Then $E_1 \cap E_2$ = {3, 9} and
$$P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) = \frac{5}{9} + \frac{3}{9} - \frac{2}{9} = \frac{6}{9} = \frac{2}{3}$$
Example 3: Find the probability of drawing a black card or a king at random from a deck of 52 cards.
Solution: Let E1 be the event of drawing a black card; then n(E1) = 26. Let E2 be the event of drawing a king; then n(E2) = 4, of which 2 are black.
$$P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) = \frac{26}{52} + \frac{4}{52} - \frac{2}{52} = \frac{28}{52} = \frac{7}{13}$$
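When events overlap, as in Example 3, the intersection must be subtracted once. The sketch below builds a deck as a set of (rank, suit) pairs and compares a direct count of the union with the addition rule; the rank and suit labels are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = set(product(ranks, suits))                  # 52 cards

black = {card for card in deck if card[1] in ("spades", "clubs")}
kings = {card for card in deck if card[0] == "K"}

def p(event):
    """Classical probability of an event (a subset of the deck)."""
    return Fraction(len(event), len(deck))

direct = p(black | kings)                          # count the union directly
by_rule = p(black) + p(kings) - p(black & kings)   # addition rule

print(direct, by_rule)   # 7/13 7/13
```

Adding P(black) and P(king) without the correction would count the two black kings twice and give 30/52 instead of 28/52.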
CYP 14 Find the probability of scoring a sum of 9 or 10 in one toss of a pair of fair dice.
CYP 15 Use the addition rule and the rule of complementary events to find a formula for the probability of getting neither event A nor event B.
13.7 SUMMARY
1. If an experiment is performed a large number of times and an event E occurs in some of them, then
$$P(E) \approx \frac{\text{Number of times E occurred}}{\text{Number of times the experiment was repeated}}$$
[Relative Frequency Definition of Probability]
2. Suppose an experiment has n equally likely outcomes and an event E can happen in m of these; then $P(E) = \frac{m}{n}$. [Classical Definition of Probability]
3. Events A and B are mutually exclusive if they cannot occur simultaneously; hence $A \cap B = \varnothing$.
a. Two events A and B are complementary if they are mutually exclusive and their union gives the sample space of the experiment, i.e. $A = \bar{B}$ or $B = \bar{A}$; therefore P(A) + P(B) = 1.
b. Two events A and B are independent if the occurrence of one does not affect the probability of the occurrence of the other. Several events are independent if the occurrence of any one does not affect the probabilities of the occurrence of the others. If A and B are not independent then they are said to be dependent.
i. If A and B are independent, P(A and B) = P(A) · P(B).
ii. If A and B are dependent, P(A and B) = P(A) · P(B/A) = P(B) · P(A/B).
c. The probability of one or the other of two mutually exclusive events A and B happening is given by $P(A \cup B) = P(A) + P(B)$.
The probability of one or the other of any two events A and B happening is given by $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
d. The probability of one or the other of k mutually exclusive events A1, A2, …, Ak happening is given by $P(A_1 \cup A_2 \cup \dots \cup A_k) = P(A_1) + P(A_2) + \dots + P(A_k)$.
e. The probability of any event E is given by the sum of the probabilities of the individual outcomes comprising event E.
f. The probability of k independent events A1, A2, …, Ak all happening is given by $P(A_1 \cap A_2 \cap \dots \cap A_k) = P(A_1) \cdot P(A_2) \cdot \ldots \cdot P(A_k)$.
g. If A and B are dependent events, then the conditional probability of event A given that event B has occurred, written P(A/B), is given by
$$P(A/B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) \ne 0.$$
If events A and B are independent, P(A/B) = P(A) and P(B/A) = P(B).
CYP 1 The sample space consists of C(10, 5) = 252 members. If two are women, the remaining three must be men. The event consists of C(6, 3) × C(4, 2) = 20 × 6 = 120 members.
Therefore the probability that exactly two of the members of the committee are women is $\frac{120}{252} = \frac{10}{21}$.
CYP 2 The sample space consists of C(10, 3) = 120 members.
a. $\frac{C(7,3)}{120} = \frac{35}{120}$
b. $\frac{C(3,3)}{120} = \frac{1}{120}$
c. $\frac{C(3,1) \times C(7,2)}{120} = \frac{63}{120}$
d. $\frac{C(3,2) \times C(7,1)}{120} = \frac{21}{120}$
e. $\frac{C(7,1) \times C(3,2) + C(3,3)}{120} = \frac{22}{120}$
CYP 3 $\frac{C(2,1) \times C(4,2)}{C(6,3)} = \frac{2 \times 6}{20} = \frac{12}{20}$
CYP 4 The sample space consists of 16 two-letter words: 8 are vowel-consonant pairs and 8 are consonant-vowel pairs. Then P(DO or GO) = $\frac{2}{16} = \frac{1}{8}$.
CYP 5 The sample space consists of 9 × 8 × 7 = 504 members and the event consists of 7 × 8 × 4 = 224 members. Therefore the probability is $\frac{224}{504} = \frac{4}{9}$.
CYP 6 The probability that at least one is a woman
= 1 - P(no woman) = $1 - \frac{C(30,3)}{C(50,3)} = 1 - \frac{4060}{19600} = \frac{111}{140}$
CYP 7 He can mark his paper in $2^{10}$ = 1024 different ways.
a. $\frac{1}{2^{10}}$ b. $\frac{1}{2^{10}}$ c. $\frac{C(10,5)}{2^{10}} = \frac{252}{2^{10}}$ d. 1 - P(no correct) = $1 - \frac{1}{2^{10}} = \frac{1023}{1024}$
CYP 8 In C(12, 2) = 66 ways.
a. $\frac{C(8,2)}{66} = \frac{28}{66}$ b. $\frac{C(4,2)}{66} = \frac{6}{66}$ c. $\frac{C(8,1) \times C(4,1)}{66} = \frac{32}{66}$
CYP 9 Since P(A) · P(B) = 0.8 × 0.25 = 0.2 = $P(A \cap B)$, A and B are independent events.
CYP 10 The sample space consists of 36 outcomes, in 10 of which a 6 turns up exactly once. The probability is $\frac{10}{36}$.
              1st
2nd      1      2      3      4      5      6
 1     (1,1)  (2,1)  (3,1)  (4,1)  (5,1)  (6,1)
 2     (1,2)  (2,2)  (3,2)  (4,2)  (5,2)  (6,2)
 3     (1,3)    .      .      .      .      .
 4     (1,4)    .      .      .      .      .
 5     (1,5)    .      .      .      .      .
 6     (1,6)  (2,6)  (3,6)  (4,6)  (5,6)  (6,6)
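CYP 11 a. $\frac{12 \times 12}{52 \times 52} = \frac{144}{2704} = \frac{9}{169}$ b. $\frac{12 \times 11}{52 \times 51} = \frac{33}{663} = \frac{11}{221}$
CYP 12 The tosses are independent events. The probability that a "2" turns up in one toss is $\frac{1}{6}$, so the probability that a "2" turns up in three consecutive tosses is $\frac{1}{6} \times \frac{1}{6} \times \frac{1}{6} = \frac{1}{6^3} = \frac{1}{216}$.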
CYP 13 Let A be the set of employees with experience of 10 years or above,
B be the set of employees with experience below 10 years,
G be the set of employees who are graduates,
N be the set of employees who are not graduates.
a) $P(B/G) = \frac{P(B \cap G)}{P(G)} = \frac{8/80}{30/80} = \frac{8}{30} = \frac{4}{15}$
b) $P(B/N) = \frac{P(B \cap N)}{P(N)} = \frac{26/80}{50/80} = \frac{26}{50} = \frac{13}{25}$
c) $P(A/G) = \frac{P(A \cap G)}{P(G)} = \frac{22/80}{30/80} = \frac{22}{30} = \frac{11}{15}$
d) $P(A/N) = \frac{P(A \cap N)}{P(N)} = \frac{24/80}{50/80} = \frac{24}{50} = \frac{12}{25}$
CYP 14 n(S) = 36, n(E) = 7, E = {(3,6), (4,5), (5,4), (6,3), (4,6), (5,5), (6,4)}
$P(E) = \frac{7}{36}$
              1st
2nd      1      2      3      4      5      6
 1     (1,1)  (2,1)  (3,1)  (4,1)  (5,1)  (6,1)
 2     (1,2)  (2,2)  (3,2)  (4,2)  (5,2)  (6,2)
 3     (1,3)    .      .      .      .      .
 4     (1,4)    .      .      .      .      .
 5     (1,5)    .      .      .      .      .
 6     (1,6)  (2,6)  (3,6)  (4,6)  (5,6)  (6,6)
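CYP 15 $P(\overline{A \cup B}) = 1 - P(A \cup B) = 1 - [P(A) + P(B) - P(A \cap B)] = 1 - P(A) - P(B) + P(A \cap B)$. If A and B are mutually exclusive, this reduces to 1 - P(A) - P(B).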
1) A natural number is written using the digits 1, 2, 3, 4, 5, with each digit used at most once. What is the probability that a randomly chosen number of this kind is
a) < 2000 b) between 200 and 3000 c) > 3000
2) In the decimal system of notation, how many three-digit numerals can be written? What is the probability that a randomly chosen one is
a) an odd number b) an even number
c) greater than 300 d) less than 700
3) Suppose a test consists of 5 multiple-choice questions, each permitting 4 answers. An unprepared student guesses at each one. What is the probability that he gives the correct answer to all of them?
4) From a class of 12 boys and 18 girls, three students are randomly selected to represent their class. What is the probability that
a) all are boys b) all are girls
c) at least one is a girl d) at least one is a boy
5) A die is thrown and, at the same time, a card is drawn from a well-shuffled deck of 52 playing cards. What is the probability that the outcome consists of
a) an even number on the die and a diamond
b) a 2 on the die and a picture card
c) a prime number on the die and a king
d) a composite number on the die and a red card
6) From a lot consisting of 5 defective and 20 non-defective items,
a) if two items are selected at random, what is the probability that both are non-defective?
b) if three items are drawn at random, what is the probability that all are non-defective?
7) If 7 men and 5 women have applied for a job and 4 applicants are randomly selected from this group, find the probability that
a) all 4 are women b) 2 are men c) at least one is a woman d) at least one is a man
8) A box contains 12 fuses of which 3 are defective. Two fuses are randomly selected, one after the other, without replacement. Find the probability that
a) both are defective b) both are non-defective c) one is defective and one is non-defective
9) From a lot consisting of 100 items of which 10 are defective, three items are chosen randomly without replacement. What is the probability that
a) all are defective b) all are non-defective c) one is defective d) two are defective
10) One box contains 5 black and 4 white balls; a second box contains 4 black and 3 white balls. If one ball is drawn from each box, what is the probability that
a) both are white b) both are black c) the two balls have different colors
11) Three balls are drawn successively from a box containing 5 green, 4 yellow and 3 red balls. What is the probability that they are drawn in the order green, yellow, red if each ball is
a) replaced b) not replaced before the next draw
12) There are 50 applicants for a job in a company. Some are college graduates and some are not; some have at least 5 years of experience and some have less than 5 years of experience, with the exact breakdown given below. The order in which the applicants are interviewed by the manager is random. Let G be the event that the first applicant interviewed is a college graduate and E the event that the first applicant interviewed has at least 5 years of experience. Determine each of the following probabilities.
                                College Graduate   Not College Graduate
At Least 5 Years Experience           10                    4
Less Than 5 Years Experience          21                   15
a) P (G) b)P (E) c) P (E) d) P ( G  E ) e)P (G/E)

f) P (E/G) g) P ( G  E) h) P (G / E)
13) Find the probability of getting a "red card" or a "card with 6" if one card is drawn randomly from a well-shuffled deck of 52 playing cards.
14) A day of the week is randomly selected. What is the probability that it is neither Tuesday nor Thursday?
15) A box contains 9 cards, each bearing exactly one of the numbers 1, 2, 3, …, 9. If 3 cards are drawn one after the other without replacement, what is the probability that the cards drawn are numbered odd-even-odd or even-odd-even?
16) If P(A) = 0.3 and P(B) = 0.6, what is known about P(A or B) if A and B are
a) mutually exclusive events b) not mutually exclusive events
13.10 GLOSSARY
1. Symbols:
A, B, E, A1, A2, …, Ak = events
$\bar{A}$ = complement of event A
P = probability
S = sample space
Ai = A sub i (the ith event)
$A \cup B$ = either event A or event B
$A \cap B$ = event A and event B
2. Notation:
P(E) = probability of event E
P(Ai) = probability of event Ai
P(A/B) = probability of event A happening given that event B has occurred
CYP = Check Your Progress Questions
MODE = Model Examination
$P(A \cup B)$ = the probability of either event A or event B happening
$P(A \cap B)$ = the probability of event A and event B both happening
P(2W) = the probability of getting two whites
P(2R, W) = the probability of getting two red and one white
13.11 REFERENCES
• Mario F. Triola, Elementary Statistics
• Freund & Williams, Elementary Business Statistics
• Murray R. Spiegel, Statistics (Schaum's Outline Series)
UNIT 14: DISCRETE PROBABILITY DISTRIBUTION
CONTENTS
14.1 Introduction
14.2 Random Variable
14.3 Definition of Probability Distribution
14.4 Types of Probability Distributions
14.5 Expected Value and Variance of a Probability Distribution
14.6 The Binomial Probability Distribution
14.7 The Poisson Probability Distribution
14.8 Summary
14.10 Check Your Progress Questions (CYP)
14.11 Glossary
14.12 References
14.0 AIMS AND OBJECTIVES
The aim of this unit is to study the concept of a random variable and then to discuss the most commonly used discrete probability distributions, the Binomial and Poisson probability distributions.
After studying this unit, you should be able to:
• define 'random variable'
• define 'probability distribution'
• identify the types of probability distributions
• calculate the mean and variance of a discrete probability distribution
• identify a distribution as Binomial or Poisson
• solve problems involving the Binomial and Poisson distributions.
14.1 INTRODUCTION
In block 3, you learnt how to construct a frequency polygon for a given frequency distribution. There seemed to be no way of telling in advance what the polygon would look like or what the mean and the standard deviation would be. It is therefore useful to study the general behavior of such distributions so that we can draw conclusions that are useful for decision-making.
A probability distribution can be thought of as a theoretical distribution: it describes how the outcomes of an experiment are expected to vary.
This section focuses on the definitions of random variable and probability distribution. Then, you will deal with the two most common discrete probability distributions.
14.2 RANDOM VARIABLE
In block 6, we defined the concept of 'experiment' and its associated outcomes. A random variable provides a means of assigning numerical values to experimental outcomes. The definition of a random variable is as follows:
Definition:
A random variable is a variable whose values are determined by chance. Or,
A random variable is a numerical description of the outcome of an experiment.
Notation: Random variables are usually denoted by capital letters such as X, Y, Z, etc.
Example 1: Consider the experiment of tossing a fair coin once.
The sample space is S = {H, T}, where H denotes the outcome 'Head' and T denotes the outcome 'Tail'. So there are two possible outcomes, H or T.
Now let the random variable X represent the number of heads obtained; then X can take the value 0 (a tail) or 1 (a head).
Example 2: Suppose a single fair die is rolled once.
The sample space of this experiment consists of six possible outcomes,
S = {1, 2, 3, 4, 5, 6}
Let the random variable Y denote the number shown when 'a number greater than 2 occurs'. Then Y can assume the values 3, 4, 5 or 6.
Example 3: Consider the experiment of rolling two fair dice once simultaneously.
If the random variable T denotes the sum of the numbers on the two dice when 'the sum is greater than 10', then T can take the values 11 or 12, which arise from the pairs (5, 6), (6, 5) or (6, 6).
CYP1
Let two fair coins be tossed once simultaneously. If the random variable X denotes 'a tail appears', what are the possible values of the random variable X?
14.2.1 Types of Random Variables
As stated above, a random variable provides a means of associating a numerical value with each possible experimental outcome.
Depending upon the numerical values it can assume, a random variable can be classified into two major divisions.
A) Discrete Random Variable: a random variable that may assume either a finite number of values or an infinite sequence of values (e.g. 1, 2, 3, …). In general, a discrete random variable takes whole number values, which can be counted or enumerated.
Example: The number of students enrolled in a diploma program at Unity University College, the number of defective batteries observed in a quality assessment, and the number of customers who visit a shop during one day of operation are all examples of discrete random variables.
B) Continuous Random Variable: a random variable that may take on all values in a certain interval or collection of intervals. A continuous random variable, as the name implies, can assume every value between any two of its values.
Example: Weight, time, temperature, etc. are examples of continuous random variables.
Remark: One way to determine whether a random variable is discrete or continuous is to think of the values of the random variable as points on a line segment. If the entire line segment between any two of these points also represents values the random variable may assume, then the random variable is continuous.
CYP2
Decide whether each of the following random variables is discrete or continuous. Write your answer in the space provided.
1. The weight of a shipment of goods.
_______________________________________________
2. The number of indigenous birds seen each day in the Awash National Park.
________________________________________________
3. The amount of time taken to cover the distance between two stations in a city.
________________________________________________
14.3 DEFINITION OF PROBABILITY DISTRIBUTION
The probability distribution for a random variable describes how the probabilities are distributed over the values of the random variable. For a discrete random variable X, the probability function is denoted by P(X). The probability function provides the probability for each value of the random variable.
A probability distribution may in general be defined as follows:
Definition:
A probability distribution is a correspondence that assigns probabilities to the values of a random variable.
Example 1: Construct a probability distribution for the number of heads in tossing two fair coins simultaneously once.
Solution: The sample space of the experiment is
S = {HH, HT, TH, TT}
Let the random variable X denote the number of heads. We then use the probability function P(X) to assign a probability to each value of X. The probability distribution is given below:
Outcome, X           0     1     2
Probability, P(X)    ¼     ½     ¼
The probability distribution shows that the probability that the random variable can
assume the value 0 is ¼, the value 1 is ½ and the value 2 is ¼. Note that the sum of these
probabilities is 1.
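The table for the two-coin experiment can be built mechanically: list the sample space, apply the random variable to each outcome, and divide the counts by the number of outcomes. The following sketch is illustrative and assumes the four outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))          # HH, HT, TH, TT

# The random variable X maps each outcome to its number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}
for x, p in distribution.items():
    print(x, p)
# 0 1/4
# 1 1/2
# 2 1/4
```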
Example 2: The number of mistakes a typist made during ten days of assessment is shown in the following table.
No of mistakes    2    3    4    5
No of days        1    4    3    2
a) Construct a probability distribution for the number of mistakes she committed.
b) Represent graphically the probability distribution in part (a).
Solution:
a) In constructing the probability distribution, the random variable is the number of mistakes the typist committed in a day. Let X denote this random variable. The probability of each value of X is the number of days on which it occurred divided by the total number of days.
The probability distribution is shown below:
No of mistakes, X     2      3      4      5
Probability, P(X)    1/10   4/10   3/10   2/10
b) A probability distribution can be displayed on the coordinate plane. The values of the random variable X are shown on the horizontal axis (x-axis) and the probabilities that X assumes these values are shown on the vertical axis (y-axis). For the probability distribution in this example, the number of mistakes the typist committed, X, is labeled on the x-axis and the corresponding probability, P(X), on the y-axis.
[Figure: graph of the probability distribution, with the number of mistakes on the x-axis and the probability P(X), marked from 0.1 to 0.4, on the y-axis.]
14.4 TYPES OF PROBABILITY DISTRIBUTION
A probability distribution can be classified as discrete or continuous according to whether its random variable is discrete or continuous.
This section discusses discrete probability distributions. Continuous probability distributions will be covered in the next unit.
In constructing the probability distribution of a discrete random variable, the following two conditions must be satisfied.
Properties (Required Conditions) for a Discrete Probability Distribution
1. The sum of the probabilities of all the events in the sample space must equal 1, i.e. $\sum P(x) = 1$.
2. The probability of each event in the sample space must be between 0 and 1 inclusive, i.e. $0 \le P(x) \le 1$.
For instance, in the above example, these two conditions are satisfied since
$\sum P(X) = P(2) + P(3) + P(4) + P(5) = 0.1 + 0.4 + 0.3 + 0.2 = 1$ and
each of these probabilities is greater than or equal to 0 and less than or equal to 1.
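The two required conditions lend themselves to a simple check. The sketch below is an illustration only: the function name `is_valid_distribution` is hypothetical, and the second table is a made-up counterexample rather than data from the module.

```python
from fractions import Fraction

def is_valid_distribution(dist):
    """Check the two required conditions for a discrete probability distribution."""
    probs = list(dist.values())
    each_in_range = all(0 <= p <= 1 for p in probs)   # 0 <= P(x) <= 1
    sums_to_one = sum(probs) == 1                     # sum of P(x) equals 1
    return each_in_range and sums_to_one

# Typist example: number of mistakes -> probability.
typist = {2: Fraction(1, 10), 3: Fraction(4, 10), 4: Fraction(3, 10), 5: Fraction(2, 10)}

# Hypothetical counterexample: the probabilities add up to 5/4.
not_a_distribution = {0: Fraction(1, 2), 1: Fraction(3, 4)}

print(is_valid_distribution(typist))               # True
print(is_valid_distribution(not_a_distribution))   # False
```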
For some discrete random variables, the probability distribution can be given as a formula that yields f(x) for every possible value of x.
Example 3: Suppose a probability distribution is given by the formula
f(x) = x/5 for x = 0, 2, 3
Construct the corresponding probability distribution.
Solution:
The outcome x assumes the values 0, 2 and 3.
Outcome, x           0     2     3
Probability, f(x)   0/5   2/5   3/5
CYP3
1. Construct a probability distribution for the number of tails in tossing three fair
coins once.
2. Assign a probability function, which can generalize all the outcomes in tossing a
fair coin once.
14.5 EXPECTED VALUE AND VARIANCE OF A PROBABILITY DISTRIBUTION
14.5.1. Expected Value
The expected value, or mean, of a random variable is a measure of the central location of the random variable. It is denoted by E(x) or $\mu$. The mathematical expression for the expected value of a discrete random variable x is as follows:
Expected value of a discrete random variable:
$$E(x) = \mu = x_1 P(x_1) + x_2 P(x_2) + \dots + x_n P(x_n) = \sum_{i=1}^{n} x_i P(x_i)$$
where $x_1, x_2, \dots, x_n$ are the outcomes and $P(x_1), P(x_2), \dots, P(x_n)$ are the corresponding probabilities.
The above formula shows that, in order to compute the expected value of a discrete random variable, we must multiply each value of the random variable by the corresponding probability P(x) and then add the resulting products.
14.5.2 Variance
While the expected value provides the mean value of the random variable, we often need a measure of dispersion, or variability, for the random variable, just as we needed the variance in block 5 to summarize the dispersion in a data set. The mathematical expression for the variance of a discrete random variable is as follows:
Variance of a discrete probability distribution, $\sigma^2$:
$$\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 P(x_i) = \sum_{i=1}^{n} x_i^2 P(x_i) - \mu^2$$
and the standard deviation is $\sigma = \sqrt{\sigma^2}$.
Example 1: If three fair coins are tossed, find the expected number of heads that will
occur and obtain the variance.
Solution:
Begin by constructing the probability distribution for the number of heads in tossing the three coins:
No of heads, x       0     1     2     3
Probability, P(x)   1/8   3/8   3/8   1/8
Then,
$$E(x) = \sum_{i=1}^{4} x_i P(x_i) = x_1 P(x_1) + x_2 P(x_2) + x_3 P(x_3) + x_4 P(x_4)$$
= 0 · 1/8 + 1 · 3/8 + 2 · 3/8 + 3 · 1/8
= 0 + 3/8 + 6/8 + 3/8 = 12/8 = 3/2 = 1.5
The theoretical mean $\mu$ = 1.5 implies that if the experiment is repeated a large number of times, then on average 1.5 heads occur per experiment.
$$\sigma^2 = \sum_{i=1}^{4} (x_i - \mu)^2 P(x_i)$$
= (0 - 1.5)² · 1/8 + (1 - 1.5)² · 3/8 + (2 - 1.5)² · 3/8 + (3 - 1.5)² · 1/8
= 2.25 · 1/8 + 0.25 · 3/8 + 0.25 · 3/8 + 2.25 · 1/8
$\sigma^2$ = 0.75
Example 2: One thousand tickets are sold at $1 each for a color television valued at $350. What is the expected value if a person purchases one ticket?
Solution:
When a person purchases one ticket, there are two possibilities: to lose $1 or to gain $349.
Gain, x     $349       -$1
P(x)       1/1000    999/1000
Hence,
E(x) = $349 · 1/1000 + (-$1) · 999/1000 = -$0.65
Or,
E(x) = expected winnings - $1 = $350 · 1/1000 - $1 = -$0.65
i.e. the average loss is $0.65 for each of the 1000 ticket holders.
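The two worked examples in this section (the number of heads in three tosses, and the raffle ticket) can be recomputed from the definitions of E(x) and the variance. This is a minimal sketch; the function names `expected_value` and `variance` are hypothetical, and each distribution is written as a dictionary from value to probability.

```python
from fractions import Fraction

def expected_value(dist):
    """E(x) = sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance = sum of (x - mu)^2 * P(x) over all values x."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Number of heads in three tosses of a fair coin.
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Net gain in dollars from one raffle ticket.
raffle = {349: Fraction(1, 1000), -1: Fraction(999, 1000)}

print(expected_value(heads), variance(heads))   # 3/2 3/4
print(expected_value(raffle))                   # -13/20
```

Here -13/20 is the exact form of -0.65 dollars. The variance can also be obtained from the shortcut form, the sum of x² · P(x) minus the square of the mean, which gives the same 3/4 for the coin example.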
CYP4
Five balls numbered 0, 2, 4, 6 and 8 are placed in a bag. After the balls are mixed, one is
selected, its number is noted, and then it is replaced. If this experiment is repeated many
times,
a) Find the expected value.

b) Find the variance and the standard deviation.
14.6 THE BINOMIAL PROBABILITY DISTRIBUTION
The Binomial Probability Distribution is a discrete probability distribution that has many
applications. It is associated with a multi-step experiment that we call the Binomial
experiment, which is a probability experiment satisfying the following four requirements.
Properties of the Binomial Experiment

a) Each trial can have only two outcomes or outcomes that can be reduced
to two outcomes (success or failure).
b) There must be a fixed number of trials.
c) The outcomes of each trial must be independent.
d) The probability of a success must remain the same for each trial.
Definition:
A probability distribution showing the outcomes of a Binomial experiment along with the
corresponding probabilities is termed as a Binomial Probability Distribution.
In a Binomial experiment, the probability of exactly x successes in n trials is given by:
$$P(x) = \frac{n!}{(n - x)! \, x!} \, p^x \, q^{n - x}$$
Where x is the number of successes
P(x) is the probability of exactly x successes
n is the number of trials
p is the numerical probability of success on a single trial
q is the numerical probability of failure on a single trial
Note: q = 1 - p and 0 ≤ x ≤ n
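The binomial formula can be sketched in a few lines of code. The function name `binomial_pmf` and the values n = 5, p = 0.5 are assumptions chosen only for illustration; `math.comb(n, x)` computes n! / ((n - x)! x!).

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    q = 1 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q ** (n - x)


# Hypothetical case: exactly 2 successes in 5 trials with p = 0.5.
print(binomial_pmf(2, 5, 0.5))  # 0.3125
```

Summing `binomial_pmf(x, n, p)` over x = 0, 1, ..., n should give 1, which is a quick check that the probabilities form a valid distribution.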
Example 1: Consider the experiment of tossing a coin three times. Show that it is a
binomial experiment and find the probability of getting exactly two heads.
Solution:
This is a binomial experiment since

i) There are only two outcomes, head and tail.
ii) The number of trials is fixed (three)
iii) The probability of success, getting a head, does not change from trial to trial.
iv) The trials or tosses are independent, since the outcome of any trial is not
affected by the outcome of any other trial.
Now, to find the probability of getting two heads, let p denote the probability of getting
a head on a single toss.
Then p = 1/2, q = 1 - 1/2 = 1/2
n = 3, x = 2
$$P(x) = \frac{n!}{(n - x)! \, x!} \, p^x \, q^{n - x}$$
$$P(2) = \frac{3!}{(3 - 2)! \, 2!} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^{3 - 2} = \frac{3!}{1! \, 2!} \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{3}{8} = 0.375$$
Example 2: A new drug is effective 60% of the time. What is the probability that in a
random sample of 4 patients, it will be effective on two of them?
Solution:
This is a Binomial experiment, since the four requirements of a Binomial experiment are
satisfied. Define ‘effective’ as ‘success’ and ‘not effective’ as ‘failure’. Then,
p = 0.6, q = 1 - 0.6 = 0.4, n = 4, x = 2
Required: P(2) = ?
$$P(2) = \frac{4!}{(4 - 2)! \, 2!} \, (0.6)^2 \, (0.4)^2 = 6 \times 0.0576 = 0.3456$$
Hence, the drug will be effective on two of a random sample of 4 patients with a
probability of 0.3456 (or 34.56%).
Remark: (Using the Binomial tables)

It is tedious to calculate probabilities using the binomial formula
when n is a large number. For such cases, you may use the binomial probability
distribution table that is given at the end of this block.
Let’s demonstrate how to read the table with an illustrative example. Consider the
experiment of tossing a fair coin 4 times. What is the probability of getting three heads?
Clearly, this is a Binomial experiment where
n = 4, p = 1/2 = 0.5 and x = 3
Under the column ‘n’ choose the number 4, proceed horizontally to the row for
x = 3, then read the entry in the column for p = 0.5, which is 0.25.
CYP5
A survey found that 30% of teenage consumers receive their spending money from part-
time jobs. If five teenagers are selected at random, find the probability that at least three
of them will have part-time jobs.
Mean and Variance of a Probability Distribution
Definition
The mean, variance and standard deviation of a variable that has the Binomial
distribution are found as:
Mean μ = n·p
Variance σ² = n·p·q
Standard deviation σ = √(n·p·q)
Example1: A coin is tossed four times. Find the mean, variance and SD of the number of
heads that will be obtained.
Solution:
Here n = 4, p = 1/2, and q = 1/2
μ = n · p = 4 · 1/2 = 2
σ² = n · p · q = 4 · 1/2 · 1/2 = 1
σ = √σ² = √1 = 1
Example 2: A die is rolled 240 times. Find the mean, variance and standard deviation for
the number of 3’s that will be rolled.
Solution:
n = 240, p = 1/6, q = 5/6
μ = n · p = 240(1/6) = 40
σ² = n · p · q = (240)(1/6)(5/6) ≈ 33.33
σ = √33.33 ≈ 5.77
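The three binomial summary formulas can be wrapped in one small helper. This is a minimal sketch; the name `binomial_summary` is an assumption introduced here, and the inputs are those of the die example (n = 240, p = 1/6).

```python
from math import sqrt


def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a Binomial(n, p) variable."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, sqrt(variance)


mean, variance, sd = binomial_summary(240, 1 / 6)
print(round(mean, 2), round(variance, 2), round(sd, 2))  # 40.0 33.33 5.77
```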
CYP6
Calculate the mean and variance of the number of `Head’ that will appear if a fair coin is
tossed 1000 times.
14.7 THE POISSON PROBABILITY DISTRIBUTION
A discrete probability distribution that is useful when n is large and p is small, and when
independent events occur over a period of time, is called the Poisson probability
distribution.
The Poisson probability distribution assumes the following two conditions:
i) The probability of an occurrence is the same for any two intervals of equal length.
ii) The occurrence or non-occurrence in any interval is independent of the
occurrence or non-occurrence in any other interval.
The Poisson probability function
$$P(x; \lambda) = \frac{e^{-\lambda} \, \lambda^x}{x!}$$
Where P(x; λ) is the probability of x occurrences in an interval of time, volume,
area etc. for a variable, λ denotes the mean number of occurrences and e ≈ 2.7183
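The Poisson probability function translates directly into code. The sketch below assumes a mean of λ = 5 occurrences per interval purely for illustration; the function name `poisson_pmf` is likewise an assumption.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean number is lam."""
    return exp(-lam) * lam**x / factorial(x)


# Exactly 3 occurrences when the mean is 5.
print(round(poisson_pmf(3, 5), 4))  # 0.1404

# Fewer than 2 occurrences means 0 or 1 occurrence.
fewer_than_two = poisson_pmf(0, 5) + poisson_pmf(1, 5)
print(round(fewer_than_two, 4))  # 0.0404
```

Probabilities such as "at least k occurrences" are usually easier to obtain as 1 minus the sum of the probabilities for 0, 1, ..., k - 1.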
Example 1: While investigating the safety of a dangerous intersection, past police records
indicate a mean of five accidents per month. Assuming the number of accidents follows a
Poisson probability distribution, find the probability that in any month there are
a) Exactly 3 accidents.
b) Fewer than 2 accidents.
Solution: By assumption the given distribution is a Poisson probability distribution.
Given that λ = 5,
$$P(x) = \frac{\lambda^x \, e^{-\lambda}}{x!}$$
a) x = 3
$$P(3) = \frac{5^3 \, (2.7183)^{-5}}{3!} = \frac{125 \times 0.00674}{6} = 0.1404$$
b) Fewer than 2 accidents means 0 or 1 accident during any month.
$$P(0) + P(1) = \frac{5^0 \, (2.7183)^{-5}}{0!} + \frac{5^1 \, (2.7183)^{-5}}{1!} = 0.0067 + 0.0337 = 0.0404$$
Remark: Although the above probabilities were determined by evaluating the
probability function, it is often easier to refer to the table for the Poisson probability
distribution. This table provides probabilities for specific values of x and λ. We
have included the table at the end of this block.
For instance, in Example 1a, λ = 5 and x = 3. In the first column of the table choose
x = 3 and match it with the column for λ = 5; the intersection of this row and column gives
the required probability, which is 0.1404.
Example 2: If there are 200 typographical errors randomly distributed in a 500-page

manuscript, find the probability that a given page contains exactly 3 errors.
Solution: First of all, find the mean number of errors per page:
$$\lambda = \frac{200}{500} = 0.4$$
Or, 0.4 error per page.
Since x = 3,
$$P(x; \lambda) = \frac{e^{-\lambda} \, \lambda^x}{x!} = \frac{(2.7183)^{-0.4} \, (0.4)^3}{3!} = 0.00715$$
Thus, there is less than a 1% probability that a given page contains exactly 3 errors.
CYP7
A sales firm receives, on average, 3 calls per hour on its toll-free number. For any
given hour, find the probability that the firm receives
a) At most 3 calls
b) At least 3 calls
c) 5 or more calls.
14.8 SUMMARY
This unit defined a random variable as a variable that assigns a numerical value to each
outcome of a random (probability) experiment. A random variable can be discrete or
continuous, depending on the values it assumes.
A probability distribution was seen as a table or function showing the values of
a random variable together with their respective probabilities. The two widely used
discrete probability distributions are the Binomial and Poisson probability distributions,
and each of these distributions has its own properties.
CYP1
The sample space is S = {HH, HT, TH, TT}. There are three possibilities; no tail occurs,
one tail occurs or two tails occur.
Hence, the random variable can take the value 0, 1 or 2.
CYP2
1. Continuous random variable
2. Discrete random variable
3. Continuous random variable
CYP3
The sample space constitutes 8 possibilities
S={HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let the random variable Y denote the number of tails that appear. The possible cases are
shown below
No tail (Y = 0) One tail (Y = 1) Two tails (Y = 2) Three tails (Y = 3)
HHH HHT, HTH, THH HTT, THT, TTH TTT
The probability distribution is shown below.
No of tails, Y 0 1 2 3
Probability P (Y) 1/ 8 3/ 8 3/ 8 1/ 8
CYP4
Let X be the number on each ball. The probability distribution is as follows:
No on ball, X 0 2 4 6 8
Probability, P(X) 1/ 5 1/ 5 1/ 5 1/ 5 1/ 5
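a) $$\mu = E(x) = \sum_{i=1}^{5} x_i \, P(x_i) = 0 \cdot \tfrac{1}{5} + 2 \cdot \tfrac{1}{5} + 4 \cdot \tfrac{1}{5} + 6 \cdot \tfrac{1}{5} + 8 \cdot \tfrac{1}{5} = \tfrac{20}{5} = 4$$
b) $$\sigma^2 = \sum_{i=1}^{5} x_i^2 \, P(x_i) - \mu^2 = \left(0^2 \cdot \tfrac{1}{5} + 2^2 \cdot \tfrac{1}{5} + 4^2 \cdot \tfrac{1}{5} + 6^2 \cdot \tfrac{1}{5} + 8^2 \cdot \tfrac{1}{5}\right) - 4^2 = 24 - 16 = 8$$
and σ = √8 ≈ 2.83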
CYP5
Given p = 0.3, q = 1 – 0.3 = 0.7, n = 5 and x = 3, 4, 5
P(at least 3) = P(3) + P(4) + P(5)
$$= \frac{5!}{(5 - 3)! \, 3!} (0.3)^3 (0.7)^2 + \frac{5!}{(5 - 4)! \, 4!} (0.3)^4 (0.7)^1 + \frac{5!}{(5 - 5)! \, 5!} (0.3)^5 (0.7)^0$$
= 0.1323 + 0.02835 + 0.00243
= 0.16308
So the probability that at least three of them will have part-time jobs is 0.16308.
CYP6
The experiment is Binomial with p = ½, q = 1 - ½ = ½ and n = 1000
μ = n · p = 1000 · ½ = 500
σ² = n · p · q = 1000 · ½ · ½ = 250
CYP7
Here λ = 3, and P(x, 3) denotes the Poisson probability of x calls in an hour.
a) At most 3 calls means 0, 1, 2 or 3 calls:
P(0, 3) + P(1, 3) + P(2, 3) + P(3, 3)
= 0.0498 + 0.1494 + 0.2240 + 0.2240
= 0.6472
b) At least 3 calls means 3 or more calls, which is the complement of 0, 1 or 2 calls:
P(0, 3) + P(1, 3) + P(2, 3)
= 0.0498 + 0.1494 + 0.2240 = 0.4232
and 1 – 0.4232 = 0.5768
c) 5 or more calls is the complement of 0, 1, 2, 3 or 4 calls:
P(0, 3) + P(1, 3) + P(2, 3) + P(3, 3) + P(4, 3)
= 0.0498 + 0.1494 + 0.2240 + 0.2240 + 0.1680
= 0.8152
and 1 – 0.8152 = 0.1848

1. Consider the experiment of tossing a coin four times
a. List the experimental outcomes
b. Define a random variable that represents the number of heads occurring on
the tosses.
c. Show what value the random variable would assume for each of the
experimental outcomes.
d. Is this random variable discrete or continuous? Why?
e. Complete the probability distribution for the number of heads that appear
in the experiment.
2. The only information available to you regarding the probability distribution of a set
of outcomes is the following list of frequencies:
X 0 1 2 3 4 5
FREQUENCY 18 48 180 252 72 30
a) Construct a possible probability distribution for the set of outcomes.

b) Find the expected value of the outcome.
c) Find the variance and the standard deviation of the distribution.
d) Draw a graph for the distribution.
3. A psychologist has determined that the number of hours required to obtain the trust
of a new patient is 1, 2 or 3. Let x be a random variable indicating the time in
hours required to gain the patient’s trust. The following probability function has
been proposed.
f(x) = x/6 for x = 1, 2 or 3
a. Is this a valid probability function? Explain.
b. What is the probability that it takes exactly 2 hours to gain the patient’s trust?
c. What is the probability that it takes at least 2 hours to gain the patient’s trust?
4. For a Binomial distribution with n=6 and p=0.3, find the following probabilities
a) P(r = 5) b) P(r > 4) c) P(r < 2) d) P(r  3)
5. Find the mean and standard deviation of the Binomial distribution with
a) n = 12, p = 0.25
b) n = 25, p = 0.4
c) n = 2,250, p = 0.95
6. When a new machine is functioning properly, only 3% of the items produced are
defective. Assume that we will randomly select two parts produced on the machine and
that we are interested in the number of defective parts found.
a) Describe the conditions under which this situation would be a Binomial
experiment.
b) Draw a tree diagram showing this as a two-trial experiment.
c) How many experimental outcomes result in exactly one defect being found?
d) Compute the probabilities associated with finding no defects, exactly 1 defect,
2 defects.
7. At a particular university, it has been found that 20% of the students withdraw
without completing the introductory statistics course. Assume that 20 students have
registered for the course this time.
a) What is the probability that 2 or fewer will withdraw?
b) What is the probability that exactly 4 withdraw?
c) What is the probability that more than 3 will withdraw?
d) What is the expected number of withdrawals?
8. Of the next-day express mailings handled by a postal service, 85% are actually
received by the addressee 1 day after the mailing. What is the expected value and
variance for the number of 1-day deliveries in a group of 250 express mailings?
9. Consider a Poisson probability distribution with mean 3.
a) Show that the probability function is equivalent to $P(x) = \frac{3^x \, e^{-3}}{x!}$
b) Find p (x = 2) c) Find p (x  2)
10. Airline passengers arrive randomly and independently at the passenger-screening
facility at a major international airport. The mean arrival rate is 10 passengers per minute.
a) What is the probability of no arrivals in a 1-minute period?
b) What is the probability that 3 or fewer passengers arrive in a 1-minute
period?
c) What is the probability of at least one arrival in a 1-minute period?
14.11 GLOSSARY
Random Variable: A numerical description of the outcome of an experiment.
Discrete Random Variable: A random variable that may assume only a finite or infinite
sequence of values.
Continuous Random Variable: A random variable that may assume all values in an
interval or collection of intervals.
Probability Distribution: A description of how the probabilities are distributed over
the values the random variable can take on.
Expected Value: A measure of the mean, or central location, value of a random
variable.
Variance: A measure of dispersion (or variability) of a random variable.
Binomial Probability Distribution: A probability distribution showing the probability
of x successes in n trials of a Binomial experiment.
Poisson Probability Distribution: A probability distribution showing the probability of
x occurrences of an event over a specified interval of time or space.
14.12 REFERENCES
• Anderson, Sweeney, Williams, Statistics for Business and Economics, Fifth edition, 1986
• Richard I. Levin, Statistics for Management, Third edition, 1984
• Stephen A. Book, Essentials of Statistics, 1978
UNIT 15: CONTINUOUS PROBABILITY DISTRIBUTION
CONTENTS
15.1 Introduction
15.2 The Normal Probability Distribution
15.3 Area Under the Normal Curve
15.4 Applications of the Normal Distribution
15.5 Summary
15.6 Answers to Check Your Progress Questions (CYP)
15.8 Glossary
15.9 References
The aim of this unit is to enable you to understand the normal probability distribution
and apply it to solve problems.
At the end of this unit you will be able to:
• Define ‘normal probability distribution’
• Find areas under the normal curve
• Apply the normal probability distribution to solve problems.
15.1 INTRODUCTION
So far, we have been concerned with discrete probability distributions. In this unit, we
shall turn to cases in which the variable can take on any value within a given range and in
which the probability distribution is continuous.
A common continuous probability distribution is the normal probability distribution.

Several mathematicians were instrumental in its development; among them is the
eighteenth-century mathematician and astronomer Karl Gauss. In honor of his work, the
normal probability distribution is often called the Gaussian distribution.
There are two basic reasons why the normal distribution occupies such a prominent place
in statistics. First, it has some properties that make it applicable to a great many situations
in which it is necessary to make inferences by taking samples. Second, the normal
distribution comes close to fitting the actual observed frequency distributions of many
phenomena, including human characteristics (weights, heights and IQs).
15.2 THE NORMAL PROBABILITY DISTRIBUTION
Many continuous variables, such as height and weight, have distributions that are
bell-shaped; these are called approximately normally distributed variables. They give rise
to the most important probability distribution used to describe a continuous random
variable, the normal probability distribution.
The normal probability distribution is a continuous, symmetric, bell-shaped
distribution of a variable.
The form or shape of the normal probability distribution is shown below.
The shape and position of the normal distribution curve depend on two parameters, the
mean and the standard deviation. Each normally distributed variable will have its own
normal distribution curve.
Properties of the normal probability distribution
• The normal distribution curve is bell-shaped.
• The mean, median and mode are equal and are located at the center of
the distribution.
• The curve is unimodal.
• The curve is symmetrical about the mean.
• The curve is continuous and never touches the x-axis.
• The total area under the normal distribution curve is equal to 1, or 100%.
• The area under the normal curve that lies within one SD of the mean is
approximately 0.68 or 68%, within two SD’s about 0.95 or 95% and
within three SD’s about 0.997 or 99.7%.
[Figure: normal curve with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ and μ + 3σ. Area labels on each side of the mean: 34.13%, 13.59%, 2.28%. About 68%, about 95% and about 99.7% of the area lie within one, two and three SD’s of the mean.]
8. The mathematical equation of the normal probability distribution is defined by the
probability density function
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
Where μ = mean, σ = standard deviation, π ≈ 3.14159 and e ≈ 2.7183
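The density function and the area properties listed above can be explored numerically. The sketch below is an illustration only: the names `normal_pdf` and `area_under_curve` are assumptions, and the area is approximated by a simple midpoint rule rather than read from a table.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


def area_under_curve(low, high, mu, sigma, steps=10_000):
    """Approximate the area between low and high with the midpoint rule."""
    width = (high - low) / steps
    return width * sum(
        normal_pdf(low + (i + 0.5) * width, mu, sigma) for i in range(steps)
    )


print(round(normal_pdf(0, 0, 1), 4))            # 0.3989, the peak of the standard normal curve
print(round(area_under_curve(-1, 1, 0, 1), 4))  # 0.6827, within one SD of the mean
print(round(area_under_curve(-2, 2, 0, 1), 4))  # 0.9545, within two SD's of the mean
```

The two areas agree with the "about 68%" and "about 95%" rules stated in the list of properties.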
The standard normal probability distribution
A random variable that has a normal distribution with a mean of 0 and a standard
deviation of 1 is said to have a standard normal probability distribution.
Recall that the standard score (z-score) of a value is the number of standard
deviations that value is from the mean.
All normally distributed variables can be transformed into the standard normal
distributed variable by using the formula for the standard score:
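$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$$
Or,
$$z = \frac{X - \mu}{\sigma}$$
[Figure: the standard normal curve, with the horizontal axis marked at z = -3, -2, -1, 0, 1, 2, 3.]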
CYP1
1. Write the two parameters that determine the shape and position of the normal
curve.
2. What is the total area under the normal curve?
3. Determine the area of the normal curve within the range μ - σ and μ + 3σ
4. Find the z-score of the value 20 if the entire distribution has a mean of 10 and
the standard deviation is 3.
15.3 AREA UNDER THE NORMAL CURVE
As with other continuous random variables, probability calculations with any normal
probability distribution are made by computing areas under the graph of the probability
density function. Thus, to find the probability that a normal random variable lies within
any specific interval, we must compute the area under the normal curve over that
interval.
For the standard normal probability distribution, areas under the normal curve have been
computed and are available in tables that can be used in computing probabilities. The
normal probability distribution table is available at the end of this block.
For the solution of problems using the normal distribution, the following steps are used.
1. Draw a picture
2. Transform the given value to z-value
3. Shade the area desired
4. Read the area from the standard normal distribution table.
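Steps 2 and 4 above can be sketched in code, with the table lookup replaced by Python's standard-library `statistics.NormalDist`. The helper names `z_score` and `probability_between`, and the variable with mean 50 and standard deviation 10, are hypothetical and used only for illustration.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Step 2: transform a value x into its standard score."""
    return (x - mu) / sigma


def probability_between(low, high, mu, sigma):
    """Steps 2 and 4: area under the normal curve between low and high."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_low = z_score(low, mu, sigma)
    z_high = z_score(high, mu, sigma)
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


# Hypothetical variable with mean 50 and standard deviation 10.
print(z_score(70, 50, 10))                            # 2.0
print(round(probability_between(50, 70, 50, 10), 4))  # 0.4772
```

The value 0.4772 is the same area a standard normal table gives between z = 0 and z = 2.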
Example 1: Find the area under the normal curve between z=0 and z=2.34
Solution:
[Figure: the standard normal curve with the area between z = 0 and z = 2.34 shaded.]
From the table, the intersection of the row z = 2.3 with the column 0.04 gives 0.4904 or
49.04%, which is the required area.
Example 2: Find the area under the normal distribution curve between z = -1.93 and
z = 2.35
Solution: For clarity, draw the normal curve and locate the two z-scores.
[Figure: the standard normal curve with the area between z = -1.93 and z = 2.35 shaded.]
The total area (the shaded region) is the area between -1.93 and 0 plus the
area between 0 and 2.35.
Hence, from the normal distribution table
Area = 0.4732 + 0.4906 = 0.9638 or 96.38%. Note that it is equivalent to say that the
probability of the z-value lying between z = -1.93 and z = 2.35 is 96.38%. This can also
be written as:
P(-1.93 < z < 2.35) = 0.9638
Example 3: Find the probability that the z-value of a normally distributed variable lies
to the left of 1.65
Solution
The probability that the z-value lies to the left of 1.65 is equivalent to
the area under the standard normal curve to the left of 1.65.
[Figure: the standard normal curve with the area to the left of z = 1.65 shaded.]
Hence, total area = area to the left of 0 plus area between 0 and 1.65
P(z < 1.65) = 0.5000 + 0.4505 = 0.9505 or 95.05%
which is the required probability.
Example 4: Find P(z > 1.91)
Solution
P(z > 1.91) = area to the right of 0 minus area between 0 and 1.91.
[Figure: the standard normal curve with the area to the right of z = 1.91 shaded.]
i.e.
P(z > 1.91) = P(z > 0) - P(0 < z < 1.91)
= 0.5000 - 0.4719
= 0.0281 or 2.81%
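The four kinds of area found in the examples above (between 0 and z, between two z-values, to the left of z, to the right of z) can be cross-checked without a table. This sketch uses the standard-library `statistics.NormalDist`; the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_0_to_z = Z.cdf(2.34) - Z.cdf(0)       # between z = 0 and z = 2.34
area_between = Z.cdf(2.35) - Z.cdf(-1.93)  # between z = -1.93 and z = 2.35
area_left = Z.cdf(1.65)                    # to the left of z = 1.65
area_right = 1 - Z.cdf(1.91)               # to the right of z = 1.91

print(round(area_0_to_z, 4))   # 0.4904
print(round(area_between, 4))  # 0.9638
print(round(area_left, 4))     # 0.9505
print(round(area_right, 4))    # 0.0281
```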
CYP2
If a random variable x has a normal distribution with a mean of 5.6 and a standard deviation
of 1.4, find
a) P(5 < x < 6) b) P(x < 7) c) P(x > 8.4)
15.4 APPLICATIONS OF THE NORMAL DISTRIBUTION
The area under the normal curve is used to solve practical application problems such as
finding probabilities or percentages of values. In order to solve such problems you need
only transform the values of the variable into the z values and read the standard normal
distribution table.
Example 1: The scores for an IQ test are normally distributed with a mean of 100 and a
standard deviation of 15. Find the percentage of IQ scores that will fall below 112.
Solution
Step 1: Draw a figure and represent the area.
[Figure: normal curve with mean 100, shaded to the left of x = 112 (z = 0.8).]
Step 2: Find the z-value corresponding to an IQ score of 112.
$$z = \frac{x - \mu}{\sigma} = \frac{112 - 100}{15} = 0.8$$
Step 3: From the table,
P(z < 0.8) = P(z < 0) + P(0 < z < 0.8) = 0.5000 + 0.2881 = 0.7881
Hence, 78.81% of the IQ scores fall below 112.
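Example 2: The monthly salaries of 2000 workers are normally distributed with a mean
of birr 550 and a standard deviation of birr 80. Find the number of workers whose
monthly salaries are
a) Between birr 600 and 700
b) Less than birr 700.
Solution: The z-values corresponding to 600 and 700 are
$$z = \frac{600 - 550}{80} = 0.625 \quad \text{and} \quad z = \frac{700 - 550}{80} = 1.875$$
[Figure: normal curve with x = 550, 600, 700 corresponding to z = 0, 0.625, 1.875.]
Rounding to the two decimal places used in the table gives z ≈ 0.63 and z ≈ 1.88.
a) P(600 < x < 700) = P(0 < z < 1.88) - P(0 < z < 0.63) = 0.4699 - 0.2357 = 0.2342
Hence, 23.42% × 2000 = 468.4
Approximately 468 of the workers earn a monthly salary between birr 600 and 700.
b) P(x < 700) = P(z < 0) + P(0 < z < 1.88) = 0.5000 + 0.4699 = 0.9699
Hence, 96.99% × 2000 = 1939.8
Approximately 1940 of the workers earn a monthly salary less than birr 700.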
Example 3: A college desires to accept only the top 10% of all graduating seniors on
the basis of the results of a national placement test. The test has a mean of 500 and a
standard deviation of 100. Find the cut-off score for the exam.
Solution:
[Figure: normal curve with mean 500 (z = 0) and the upper 10% of the area shaded to the right of the cut-off score x.]
We solve the problem backward.
We need to determine the point on the axis that cuts off the upper 10% of the area.
Let it be denoted by x, with corresponding z-value z. The area between 0 and z is
P(0 < Z < z) = 0.5000 – 0.1000 = 0.4000
From the table, the z-value that corresponds to the area 0.4000 is approximately 1.28.
$$1.28 = \frac{x - 500}{100} \;\Rightarrow\; x = 628$$
Hence the score 628 should be used as the cut-off score. Any student scoring below 628
should not be admitted.
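Working backward from an area to a score can also be sketched in code. The example below relies on `NormalDist.inv_cdf` from Python's standard library, which returns the value below which a given fraction of the distribution lies; the function name `cutoff_score` is an assumption introduced here.

```python
from statistics import NormalDist


def cutoff_score(mu, sigma, top_fraction):
    """Score that separates the top `top_fraction` of a normal population."""
    # The top 10% starts where 90% of the area lies to the left.
    return NormalDist(mu, sigma).inv_cdf(1 - top_fraction)


z = NormalDist().inv_cdf(0.90)
print(round(z, 2))                            # 1.28
print(round(cutoff_score(500, 100, 0.10)))    # 628
```

The small difference between the exact z-value and the table value 1.28 disappears once the score is rounded to a whole number.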
CYP3
A standardized test has a mean of 50 and a standard deviation of 10. The scores are
normally distributed. If the test is administered to 800 students, approximately how many
will score between 48 and 62?
T – DISTRIBUTION
The variation between the sample mean and the population mean is given by
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}, \quad \text{where } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
In large samples the population standard deviation can be approximated by the
sample standard deviation S, so that $\sigma_{\bar{X}} \approx \frac{S}{\sqrt{n}}$. This relationship is not valid for small
samples because of wide fluctuations in the values of the sample standard deviation (S).
Based upon this variation, Gosset came up with different sets of critical scores, called
t-scores. (Gosset wrote articles under the pen name ‘Student’; hence the distribution of
t-scores is known as Student’s t-distribution.) These t-scores are to be used in place of
Z-scores. The larger the sample size, the closer will be the value of the t-score to the
value of the Z-score.
The t-distribution is useful not only when the sample size is small but also when the
population standard deviation is not known. A small sample must come from a normal or
near normal distribution in order for a t-test to be used. The t-scores should not be used if
the small samples came from a population which is distributed in a non-normal pattern.
Properties of the t-distribution

1- Similar to Z-distribution, it is a continuous distribution.
2- Similar to Z-distribution, it is symmetrical and bell shaped.
3- Unlike Z-distribution, it is not just one distribution, but a family of distributions,
so that every time the size of the sample changes, a new t-distribution is
generated. For example, a t-distribution for a sample size of 20 is different from a
t-distribution for a sample size of 25 and so on.
4- The t-curve is lower at the mean than the Z-curve, meaning it is more spread out
at the center and it is higher at the tail ends. However, as the sample size
increases, the t-distribution approaches the Z-distribution in shape and
characteristics.
The shape of t-distribution in comparison to Z-distribution is shown as below.

Because the t-distribution has a greater spread than the Z-distribution, the critical t-scores
are numerically larger than the Z-scores for a given level of significance, and the smaller
the size of the sample, the larger the critical t-score value. This is quite logical,
since a small sample must supply more conclusive evidence, that is, a wider range of
values, before we can reject a null hypothesis. Since the t-distribution is a family of curves,
one for each sample size, we must identify the particular distribution by its degrees of
freedom. This enables us to find the value of the t-score for a given level of significance
and given degrees of freedom, which are determined by the sample size n. The number of
degrees of freedom (df) is taken as (n – 1).
Example: -
Find the value of the t-score from the table of the t-distribution, if the level of
significance (α), taken as the area in one tail, is 0.025 and the degrees of freedom (df) are 9:
Answer:
t-score = 2.262
CHI SQUARE (χ²) TEST:
In most statistical tests, our decisions are based on the assumption that the population is
normally distributed. But when this assumption about the population cannot be made, it is
necessary to use the chi-square (χ²) test. This test is suited to the nominal or ordinal
scale of measurement. The nominal scale of measurement deals with data which can
only be classified into categories such as male and female, or freshmen, juniors and
seniors and so on. There is no particular order for these groupings, and they are mutually
exclusive, so that an item in one category is not included in another category. The ordinal
scale of measurement assigns different ranks to these categories: one category may be
superior in standing and another may be good or fair, and so on. The χ² test is used for
analyzing qualitative variables such as opinions of persons, religious affiliations, smoking
habits, etc. It deals with judgments about proportions of two or more
populations.
Properties of Chi square distribution

1- It involves squared observations and hence it is always positive or greater than or
equal to zero.
2- The distribution is not symmetrical. It is skewed to the right, so that its skewness
is positive. However, as the number of degrees of freedom increases, chi-square
approaches a symmetrical distribution.
3- Similar to the t-distribution, there is a family of chi-square distributions. There is a
particular distribution for each number of degrees of freedom.
The number of degrees of freedom of the χ²-distribution is determined by the number of
categories in which the various attributes of the sample are placed, so that if there are k
categories, then the number of degrees of freedom (df) is (k – 1). For
categories of two or more independent samples (arranged in a contingency table), the df
is (k – 1)(r – 1), where r is the number of rows and k the number of columns. For
example, if a sample of 100 students were categorized as freshmen, sophomores, juniors
and seniors, then there are four categories and k is 4,
so that the degrees of freedom are df = k – 1 = 3.

The following illustration shows the family of χ² curves with varying degrees of
freedom, and it can be seen that as the number of degrees of freedom increases, the χ²
distribution approaches the normal curve.
The χ² test is used to test whether there is a significant difference between the observed
number of responses in each category and the expected number of responses for each
category under the assumptions of the null hypothesis. In other words, the objective is to find
how well the distribution of observed frequencies (fo) fits the distribution of expected
frequencies (fe). Hence this test is also called the goodness-of-fit test.
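The comparison of observed and expected frequencies can be sketched as follows. The text above does not state the test statistic; the sketch assumes the usual goodness-of-fit statistic, the sum of (fo - fe)² / fe over all categories. The function name and the frequencies for 100 students in four class years are hypothetical, chosen only to illustrate the calculation.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories (assumed standard statistic)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical data: 100 students in four categories, expected to be equally split.
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1 categories
print(statistic, degrees_of_freedom)  # 2.0 3
```

The computed statistic would then be compared with the critical χ² value read from the table for the chosen level of significance and df = 3; a statistic below the critical value gives no reason to reject the null hypothesis.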
Example:
Find the critical value of χ² from the table of the χ²-distribution if the level of
significance α is 0.05 and the degrees of freedom are 2.
Answer: χ² = 5.991
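The table value for 2 degrees of freedom can be checked by hand, because in that special case the area to the right of a value c under the χ² curve is e^(-c/2), so the critical value is -2 ln α. The sketch below relies on this fact, which holds only for df = 2; other degrees of freedom still require the table or statistical software. The function name is an assumption.

```python
from math import exp, log


def chi_square_critical_df2(alpha):
    """Critical value c with P(chi-square > c) = alpha, valid for df = 2 only."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return -2 * log(alpha)


critical = chi_square_critical_df2(0.05)
print(round(critical, 3))            # 5.991
print(round(exp(-critical / 2), 2))  # 0.05, the area to the right of the critical value
```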
15.5 SUMMARY
This unit extended the discussion of probability distributions to the case of continuous
random variables. With a continuous probability distribution, we associate a probability
density function; the area under its graph over an interval gives the probability that the
random variable x assumes a value in that interval. We have also seen that areas under the
standard normal curve can be used to solve practical problems involving any normally
distributed variable, once its values are transformed into z-values.
CYP1
1. The mean and the standard deviation
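2. 1 or 100%
3. Recall that about 68.26% of the values lie between μ - σ and μ + σ, 13.59% between
μ + σ and μ + 2σ, and about 2.14% between μ + 2σ and μ + 3σ (the 2.28% shown in the
figure is the whole area beyond μ + 2σ).
Hence, 68.26 + 13.59 + 2.14 = 83.99%, or about 84%, of the values lie between μ - σ and
μ + 3σ, which is equivalently the area between them.
4. Given x = 20, μ = 10, σ = 3
$$Z = \frac{\text{value} - \text{mean}}{\text{SD}} = \frac{x - \mu}{\sigma} = \frac{20 - 10}{3} = \frac{10}{3} \approx 3.33$$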
CYP2
Given μ = 5.6, σ = 1.4
a) The Z-scores corresponding to 5 and 6 are
$$Z = \frac{5 - 5.6}{1.4} = -0.43 \quad \text{and} \quad Z = \frac{6 - 5.6}{1.4} = 0.29$$
p(5 < x < 6) = p(-0.43 < Z < 0.29)
= p(-0.43 < Z < 0) + p(0 < Z < 0.29)
= 0.1664 + 0.1141
= 0.2805 or 28.05%
b) The Z-score for x = 7 is
$$Z = \frac{7 - 5.6}{1.4} = 1$$
p(x < 7) = p(Z < 1)
= p(Z < 0) + p(0 < Z < 1)
= 0.5000 + 0.3413
= 0.8413 or 84.13%
c) The Z-score for x = 8.4 is
$$Z = \frac{8.4 - 5.6}{1.4} = 2$$
p(x > 8.4) = p(Z > 2)
= p(Z > 0) - p(0 < Z < 2)
= 0.5000 – 0.4772
= 0.0228 or 2.28%
CYP3
The Z-scores for 48 and 62 are
$$Z = \frac{48 - 50}{10} = -0.2 \quad \text{and} \quad Z = \frac{62 - 50}{10} = 1.2$$
p(48 < x < 62) = p(-0.2 < Z < 1.2)
= p(-0.2 < Z < 0) + p(0 < Z < 1.2)
= 0.0793 + 0.3849
= 0.4642 or 46.42%
Hence, 46.42% × 800 ≈ 371 students score between 48 and 62.
15.7 CHECK YOUR PROGRESS QUESTIONS
Answer each of the following questions

1. Given that Z is the standard normal random variable, compute the following
probabilities.
a) P (-1.98 < Z < 0.49)
b) P (Z > 2.30)
c) P (0.52 < Z < 1.22)
d) P (Z < 1.29)
2. Given that Z is the standard normal random variable, find z for each situation
a) The area between 0 and Z is 0.4750
b) The area between 0 and Z is 0.2291
c) The area to the right of Z is 0.1314
d) The area to the left of Z is 0.6700
3. The demand for a new product is assumed to be normally distributed with μ = 200
and σ = 40. Letting x be the number of units demanded, find the following:
a) P (180 < x < 220)
b) P (x > 250)
c) P (225 < x < 250)
4. The test scores from a college admissions test are normally distributed, with a
mean of 450 and a standard deviation of 100.
a) What percentage of the people taking the test score between 400 and 500?
b) Suppose that someone receives a score of 630.What percentage of the
people taking the test score better?
c) If a particular university will not admit anyone scoring below 480, what
percentage of the persons taking the test would be acceptable to the
university?
5. Lamps used in residential area street lighting are constructed to have a mean
lifetime of 400 days with a standard deviation of 30 days. Furthermore, their
lifetimes are normally distributed. What percentage of such lamps last
a) Longer than 1 year (365days)?
b) Between 375 and 425 days?
c) Longer than 480 days?
15.8 GLOSSARY
Continuous probability distribution: A probability distribution that gives the probability
that the random variable will assume a value in any given interval.
Probability density function: The function that defines the probability distribution of a
continuous random variable.
Normal probability distribution: A continuous probability distribution whose
probability density function is bell shaped and determined by the mean μ and the
standard deviation σ.
Standard normal probability distribution: A normal distribution with a mean of 0 and
a standard deviation of 1.
15.9 REFERENCES
- The references used in unit 28
- Statistics for Business and Economics, J.S. Chandan, Medgar Evers College
(City University of New York), 1998.
- Business Statistics, C.R. Reddy, M.Com., Ph.D., 1994.
UNIT 16: CONCEPTS AND REASONS FOR SAMPLING
CONTENTS
16.1 Introduction
16.2 Definition of Sample and Census Survey
16.3 Advantages and Disadvantages of Sample Survey
16.4 Summary
16.5 Glossary
16.6 Reference Books
16.0 AIM AND OBJECTIVES
At the end of this unit students should be able to:-

1- define sample and census survey
2- describe reasons of sampling
3- state advantages and disadvantages of sample survey
16.1 INTRODUCTION
Inferential statistics is a systematic method of drawing sound conclusions about a
population by examining a few representative units, termed a sample. The
process of selecting samples is called sampling. Generalizing the results of sample data
to the population, which is one of the characteristic features of research, needs a scientific
approach to searching for facts. Therefore, sampling must be scientific.
16.2 DEFINITION OF SAMPLE AND CENSUS SURVEY
A subset of the population selected for the study is known as a sample. The group from
which the samples are selected is called the universe or population.
Sample survey: a procedure which enables one to draw inferences about the
population by observing or measuring a few items.
Census survey: a method of inquiry which enables one to draw inferences by
observing each item constituting the population.
Objectives of sampling are:

1- To obtain maximum information about the characteristics of the population with
less time, energy and expenditure
2- To obtain the best possible values of parameter.
Sampling refers to the method of selecting a sample from the universe. A proper
procedure is to be adopted for evaluating the sample plan in order to select representative
units of the universe. Sampling occupies a key role in the study and has acquired the
status of a technical job.
The number of units in the sample is called the sample size. The sample size should be
neither too small nor too large, but optimum. An optimum size fulfills the
needs of efficiency, representativeness, reliability and validity.
The size of sample for a study is determined on the basis of the following factors
i- the size of the population
ii- the availability of resources
iii- the degree of accuracy
iv- the homogeneity or heterogeneity of the population
v- the nature of the study
vi- the method of sampling technique adopted
vii- the nature of respondents
16.3 ADVANTAGES AND DISADVANTAGES OF SAMPLE SURVEY
If the sample is drawn using a scientific approach, the adopted sample design is good and the
sample size is adequate, then the sample method has some merits over the census method.
These are:
1- Sampling saves time and money.
2- It is more convenient, as it involves fewer personnel.
3- It is useful when the population is infinitely large.
4- It can be more accurately supervised and data can be carefully selected.
5- It is useful where inspecting the quality of a unit destroys or uses up the unit, so that
we have to resort to sampling, such as testing the quality of bulbs, tubes, strength of
stencils, testing explosives, etc.
Sampling method has its limitations and problems, which are:
1- It would give unreliable data if not designed and executed carefully. Samples are
like medicines. They can be harmful if taken carelessly or without knowledge of
their effect.
2- The service of skilled, trained, qualified personnel for supervision; and
sophisticated equipment and statistical techniques are required. In the absence of
these, it may not be reliable.
3- Sample survey is not useful when information is needed about each and every unit
of the population.
16.4 SUMMARY
In the field of statistical analysis, it is often not possible to take the entire population into
consideration due to time, cost and other constraints. Therefore, random samples are
taken from the population, analyzed properly, and used to reach generalizations that
are valid for the entire population. A small sample properly selected may be a true
representative of the universe, while a large sample poorly chosen may be unreliable. So
the selection of a sample should be done in such a manner that every item in the universe
has an equal chance of inclusion in the sample. Thus a good sample possesses two
characteristics, which are:
i) Representativeness of the universe
ii) Adequacy in size
16.5 GLOSSARY
Inference – refers to drawing conclusions from facts or by reasoning.

Parameter – characteristic or determining feature.
Respondent – interviewee; a person who is interviewed and gives information.
Survey – a study or investigation of population, usually human beings or economic,
social or political institutions.
16.6 REFERENCE BOOKS
- Business Statistics, Dr. J.S. Chandan, Prof. Jagjit Singh, K.K. Khanna, 1995,
Reprint 1996.
- Business Statistics, Theory and Practice, C.R. Reddy, M.Com., Ph.D., 1994.
UNIT 17: TYPES OF SAMPLING TECHNIQUES
CONTENTS
17.1 Introduction
17.2 Types of Sampling Techniques
17.3 Summary
17.4 Answers to Self-Assessment Questions
17.5 Model Exam Questions
17.6 Glossary
At the end of this unit students will be able to: -

1- describe sampling technique and types of sampling techniques
2- differentiate random from non-random sampling method
3- apply given formulas to solve questions based on the sample survey
17.1 INTRODUCTION
Statistical methods are especially appropriate for handling data (information), which are
subject to variations, and for which we can observe only a fraction of the totality of
observations, which may exist. Under this situation, techniques must be devised by which
we can make inferences about the nature of the totality of the universe from the particular
observation we have.
17.2 TYPES OF SAMPLING TECHNIQUES
Sampling technique refers to the method of selecting a sample from the universe
(population). It occupies a key role in a study and has acquired the status of being a
technical job. The right type of sampling technique is of paramount importance in the
execution of a sample survey in accordance with the objectives and the scope of the
inquiry. The sampling methods may broadly be classified as:
1- Probability sampling (simple, stratified, & systematic)

2- Non-probability sampling (judgment, convenient, quota, incidental, purposive,
self-selected, etc.)
3- Mixed sampling (cluster sampling)
17.2.1 RANDOM (PROBABILITY) SAMPLING
Random sampling method is a method of selection of a sample such that each item within
the population has equal chance of being selected.
In this method, there is no place for investigator’s bias in sample selection since it
depends on probability. It provides more accurate estimates in the sense of greater
precision.
I. Simple Random Sampling Method (SRSM): involves a very simple method of
drawing a sample from a given population. The selection of samples is random in
character.
The oldest method adopted in simple random sampling is the lottery system.
Suppose the population size is 100 and the sample size is 10, i.e. N = 100 and n = 10. A hundred
chits would be prepared bearing the serial numbers of the units in the universe. These chits
would be put together and shuffled thoroughly, and then ten would be drawn one by one.
The sampling units corresponding to the numbers on the selected chits will form a random
sample. This method gives a sample which is quite independent of the nature of the
universe. This method is commonly in practice even at present.
The other most practical and inexpensive method is the method of “Random Number
Tables” (RNT). If the universe size N is at most 9, one-digit random numbers from 0 to 9
are read from the table. If N is at most 99, two-digit numbers from 00 to 99 are read; if N is
at most 999, three-digit numbers from 000 to 999, and so on.
Then, select any number K from the RNT. If K ≤ N, the Kth unit is selected for the sample;
if K > N, divide K by N and take the remainder R, so that the Rth unit is selected (when the
remainder is 0, the Nth unit is selected). This process continues till n units are selected.
Example: From 40 big enterprises in Addis Ababa, we want to study the case of only 5 of
them. Let 12, 59, 67, 81 and 97 be the numbers selected from the RNT. Which
items of the population are selected for the sample?
Solution: N = 40, n = 5, K = 12, 59, 67, 81, 97.
Since N = 40 is a two-digit number, the RNT will be from 00 to 99.
K = 12 < N ⇒ The 12th item is selected for the sample.
K = 59 > N ⇒ 59 ÷ 40 = 1 + 19/40 ⇒ The 19th item is selected.
K = 67 > N ⇒ 67 ÷ 40 = 1 + 27/40 ⇒ The 27th item is selected.
K = 81 > N ⇒ 81 ÷ 40 = 2 + 1/40 ⇒ The 1st item is selected.
K = 97 > N ⇒ 97 ÷ 40 = 2 + 17/40 ⇒ The 17th item is selected.
The 1st, 12th, 17th, 19th and 27th items are selected for the study.
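The remainder rule used in this example can be sketched in a few lines of Python. This is only an illustration: the function name `select_units` is hypothetical, the random numbers are assumed to be at least 1, and a remainder of 0 is read as the Nth unit, which is the convention followed in the answer to SAQ 1 of this unit. Repeated selections are not filtered out.

```python
def select_units(random_numbers, population_size):
    """Map numbers K read from a random number table to unit numbers 1..N.

    If K <= N the Kth unit is taken; otherwise K is divided by N and the
    remainder gives the unit. A remainder of 0 is read as the Nth unit.
    """
    selected = []
    for k in random_numbers:
        if k <= population_size:
            unit = k
        else:
            # k % N is the remainder; "or" substitutes N when it is 0
            unit = k % population_size or population_size
        selected.append(unit)
    return selected

# The enterprises example: N = 40 and K = 12, 59, 67, 81, 97
print(select_units([12, 59, 67, 81, 97], 40))  # [12, 19, 27, 1, 17]
```

Sorting the returned list gives the 1st, 12th, 17th, 19th and 27th items, as in the worked solution.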
II. Stratified Random Sampling Method (STRSM)
Under this method, the whole population is divided into a number of homogeneous
groups or strata. From each of these strata, a random sample is selected. Thus,
stratified random sampling means selecting a number of random samples, one from each
stratum of the universe. It is used when each group has small variation within itself but
wide variation between the groups.
The sample may be either proportionate or disproportionate. Suppose the universe is
divided into two groups consisting of 100 and 160 units respectively, and the sample from
each group is 10% of that group. Then a sample of size 10 + 16 = 26 is drawn
in proportion to the number of items in each group. But in disproportionate stratified
random sampling, samples are taken from each stratum regardless of the number of units
in that stratum. Thus in the above example, an equal number of units, say 12 from each
stratum, may be drawn, in which case the total number of items in the sample is 24.
- The number of sample items which must be selected from the ith stratum is denoted by ni
and is given by
$n_i = \frac{n N_i}{N}$
where n – sample size
N – population size
Ni – size of the ith stratum
Example: In Unity University College a survey is to be conducted on 120 students’
tendency towards Mathematics. The total number of students in each field is as indicated
below. Give the sample size of each field of study.
Ni   Field of study   No. of students
N1   Accounting       3000
N2   Business         2000
N3   Law              1500
N4   Marketing        2500
N5   Architecture     1000
Solution: n = 120
N1 = 3000, N2 = 2000, N3 = 1500, N4 = 2500, N5 = 1000
N = N1 + N2 + N3 + N4 + N5 = 10,000
Then n/N = 120/10,000 = 12/1000 = 0.012, and ni = (n/N) × Ni:
n1 = (12/1000) × 3000 = 36
n2 = (12/1000) × 2000 = 24
n3 = (12/1000) × 1500 = 18
n4 = (12/1000) × 2500 = 30
n5 = (12/1000) × 1000 = 12
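The proportional allocation in this example follows directly from ni = nNi/N and can be sketched as below. The function name `proportional_allocation` is an illustrative choice, and rounding to the nearest whole unit is an assumption added here; when the products are not whole numbers, the rounded sizes may not add up exactly to n.

```python
def proportional_allocation(sample_size, stratum_sizes):
    """Sample size for each stratum under proportional allocation.

    Implements n_i = n * N_i / N, where N is the sum of the stratum sizes.
    Results are rounded to whole units (an assumption of this sketch).
    """
    population_size = sum(stratum_sizes)
    return [round(sample_size * n_i / population_size) for n_i in stratum_sizes]

# Accounting, Business, Law, Marketing, Architecture with n = 120
print(proportional_allocation(120, [3000, 2000, 1500, 2500, 1000]))
# [36, 24, 18, 30, 12]
```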
III. Systematic Random Sampling Method (SYRSM)
In this method, a random starting point is selected from the list representing the universe
and the remaining units are automatically selected in a definite sequence at an equal
spacing from one another. This method is recommended if the sample units are arranged
in systematic order such as chronological, geographical, alphabetical, etc. and also if the
sample units in the universe are uniquely identified. Systematic sampling is also called
sampling by regular intervals or sampling by fixed intervals.
- To get a systematic sample of size n from a population of size N, draw a random
number i from 1 to K, where K = N/n, and then select the items i, i + K, i + 2K, i + 3K, …
In general, the sample consists of the (i + wK)th items of the population, where
0 ≤ w ≤ n – 1.
Or we can have an alternative method,
Ai = A1 + (i – 1) K, where A1 – the random starting point or the first sample item,
Ai – the ith item in the sample.
Example 1: - From the files of 24 cases of the federal high court, only 4 cases are
to be examined. The fifth file was selected randomly. Indicate the remaining three
elements of the sample.
Solution: - N = 24 , n = 4 , A1 = 5
K = N/n = 24/4 = 6
Then A2 = A1 + (2 – 1) K
= A1 + K = 5 + 6 = 11. The 11th file is the second element.
A3 = A1 + (3 – 1) K
= A1 + 2K = 5 + 2 (6) = 17. The 17th file is the third element.
A4 = A1 + (4 – 1) K
= A1 + 3K = 5 + 3 (6) = 23. The 23rd file is the fourth element.
Example 2: - If the 4th and 12th elements of a systematic sample are 70 and 126 (in the
population) respectively, then which item of the population is the first element of this
systematic sample?
Solution: - A4 = 70 , A12 = 126 , A1 = ?
A4 = 70 = A1 + 3K
A12 = 126 = A1 + 11K
Subtracting the first of these two simultaneous equations from the second,
56 = 8K ⇒ K = 7
Then A4 = A1 + 3K
70 = A1 + 3(7)
⇒ A1 = 70 – 21 = 49
The 49th item of the population is the random starting point for the systematic sample.
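Both systematic-sampling examples use the relation Ai = A1 + (i – 1)K. A minimal sketch follows; the function names are hypothetical, and it assumes that N is an exact multiple of n and that the two known elements differ by a whole number of intervals.

```python
def systematic_sample(population_size, sample_size, start):
    """Items A1, A1 + K, A1 + 2K, ... with interval K = N / n."""
    interval = population_size // sample_size
    return [start + w * interval for w in range(sample_size)]

def start_and_interval(i, a_i, j, a_j):
    """Recover (A1, K) from the ith and jth sample elements.

    Uses A_j - A_i = (j - i) * K and then A1 = A_i - (i - 1) * K.
    """
    interval = (a_j - a_i) // (j - i)
    start = a_i - (i - 1) * interval
    return start, interval

print(systematic_sample(24, 4, 5))        # Example 1: [5, 11, 17, 23]
print(start_and_interval(4, 70, 12, 126)) # Example 2: (49, 7)
```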
17.2.2 NON-RANDOM (NON-PROBABILITY) SAMPLING METHOD
In this method, the chance of including any elementary unit of the population in the
sample cannot be determined. It is simple to adopt and no complicated procedure is
needed to draw a sample.
There are many non-random sampling techniques, some of which are judgment,
convenient and quota sampling.
Judgment Sampling: - Samples are selected deliberately by the investigator, relying on
good perception and an appropriate strategy. The selection reflects a personal
view, so it is satisfactory only with regard to one’s own research needs. For example, if a
sample of 10 students is to be singled out from a class of 50 for analyzing the habits of
students, the investigator would select ten students who, in his opinion, are representative
of the class.
Convenient Sampling: - Elements of the sample are selected by taking those elements of
the population which are readily available or convenient for the investigator.
Example: Asking people from one area.
Quota sampling: - In this technique, a quota is set up according to given criteria, but the
sample within the prescribed quota is selected by the personal judgment of the investigator. It is
suitable in market and public opinion surveys where stratification is very difficult.
However, its representativeness suffers, as the interviewer may select samples
convenient for him with regard to location and sample unit.
It is a combination of the judgment and stratified sampling methods, so it enjoys the merits
of both.
Example: - If we ask about Canada Dry for a prescribed quota of 20 households, 15
students and 10 children, then this method is quota sampling.
17.2.3 MIXED SAMPLING METHOD (CLUSTER SAMPLING)
Groups of items (clusters), homogeneous in character, are formed on a location or class
basis. Here a sample of clusters is selected, and next, within each selected cluster, subgroups
are identified for inclusion in the sample. It is also known as area sampling, as the selection of
units is made on the basis of place. The clusters may or may not be equal in size. The
smaller the size of the cluster, the greater will be the accuracy. It is economical and much
easier.
Example: - Suppose a survey is conducted about students’ capacity in auditing. From 10

colleges in Addis Ababa if college X is selected and from 50 classes of college X if 6
classes are selected randomly and considered for the study, then this technique is cluster
sampling.
SAQ 1 A researcher used a random number table ranging from 000 to 999 and
selected 85, 199, 350, 740 and 960 randomly. If the total number of
observations is 120, which items should be included in the sample?
SAQ 2 A stratified sample is going to be selected from four fields of study in
Arat Kilo University. The number of students in Mathematics, Statistics,
Biology and Chemistry is 200, 360, 400 and 480 respectively. If the ratio
of population size to that of sample size is 40, how large a sample must be
taken from each of the four fields of study?
SAQ 3 If the 3rd and 5th items of a systematic sample are 21 and 37 (in the
population) and if there are 8 items in the sample, then
a – Give the remaining items in the sample.
b – Find the total number of items in the population.
17.3 SUMMARY
Statisticians prefer a sample survey to a census survey because it can attain the required
accuracy, errors can be controlled effectively, follow-up in case of non-response is easy,
it is efficient when statistical results are needed urgently, and it is economical as it covers only
representative units of the universe. This is highly significant in carrying out surveys in
developing countries with budding economies, which cannot afford a census survey due to lack
of finance.
17.4 ANSWERS TO SELF ASSESSMENT QUESTIONS
SAQ 1 N = 120 , K = 85, 199, 350, 740, 960
K = 85 < 120 ⇒ The 85th item is selected.
K = 199 > 120 ⇒ 199 ÷ 120 = 1 + 79/120 ⇒ The 79th item is selected.
K = 350 > 120 ⇒ 350 ÷ 120 = 2 + 110/120 ⇒ The 110th item is selected.
K = 740 > 120 ⇒ 740 ÷ 120 = 6 + 20/120 ⇒ The 20th item is selected.
K = 960 > 120 ⇒ 960 ÷ 120 = 8 with remainder 0 ⇒ The 120th item is selected.
∴ The 20th, 79th, 85th, 110th and 120th items are included in the sample.
SAQ 2
N1 = 200 (Mathematics), N2 = 360 (Statistics), N3 = 400 (Biology), N4 = 480 (Chemistry)
N/n = 40 ⇒ n/N = 1/40, ni = ?
n1 = nN1/N = (n/N) × N1 = (1/40) × 200 = 5
n2 = nN2/N = (n/N) × N2 = (1/40) × 360 = 9
n3 = nN3/N = (n/N) × N3 = (1/40) × 400 = 10
n4 = nN4/N = (n/N) × N4 = (1/40) × 480 = 12
And the total number of students taken for the study is 36.
SAQ 3
A3 = 21, A5 = 37, n = 8
A3 = A1 + (3 – 1) K = A1 + 2K
A5 = A1 + (5 – 1) K = A1 + 4K
A5 – A3 = (A1 + 4K) – (A1 + 2K)
37 – 21 = A1 – A1 + 4K – 2K
16 = 2K
K = 8
a) A3 = 21 = A1 + 2 (8) ⇒ A1 = 21 – 16 = 5
A2 = A1 + K = 5 + 8 = 13
A4 = A1 + 3K = 5 + 3 (8) = 29
A6 = A1 + 5K = 5 + 5 (8) = 45
A7 = A1 + 6K = 5 + 6 (8) = 53
A8 = A1 + 7K = 5 + 7 (8) = 61
b) N/n = K ⇒ N = nK
= 8 (8)
= 64
∴ There are 64 items in the population.
17.5 MODEL EXAM QUESTIONS
1) In a systematic random sample, the 10th and 15th sample elements correspond to the
indices (serial numbers) 68 and 103 respectively. Find the index of the 5th element of the
systematic sample.
2) Discuss the difference between random and non-random sampling techniques.
3) Classify each of the following samples as random, systematic, stratified or cluster.
a. Every fifth teenager entering an amusement park is asked to select his or her
favorite ride.
b. All police officers of a small town are interviewed to determine whether they feel
the crime rate has changed over the past year.
4) Unity University College has registered 12,000 students for the last four years. The
college administration would like to know the number of students who have
participated in co-curricular activities. For the purpose of the study, the administrator
collected the names of 400 students from the files by taking proportional number of
students from each of the years (batches) for interview.
Based on the above information, find
a. The variable of interest
b. The source of data (primary or secondary)
c. The population
d. The sample
e. The sampling technique used
5) From 48 small-scale factories of shoes, a researcher wants to study the case of 5 of

them. If he randomly selected 9, 32, 48, 73 & 98 from the RNT, which items should
be included in the sample?
6) A personnel manager selected 20 workers for interview from the master list of 320
workers. He randomly selected the 4th worker.
a – What type of sampling method did he prefer?
b – What is the population size?
c – What is the sample size?
d – Find the interval or constant of coding.
e – From the master list, which workers are going to be the 5th, 12th and
19th interviewees?
7) In a certain systematic sample, the sum of the 5th and 6th items is 60 and the 3rd item is
one third of the 8th item.
a ) Find the interval.
b) Find the first item in the sample.
c) If total number of items is 42, what will be the total number of items selected
for the sample?
8) A study was conducted on four weredas in Addis Ababa. The number of people in
each wereda is given below. If the ratio of sample size to population size is 1:10
for Wereda 2, then find
a) The total number of people taken for the sample.
b) The sample size of each wereda.
Wereda          1        2        3       4
No. of people   10,000   11,000   9,000   12,000
17.6 GLOSSARY
Cluster: refers to number of things of the same kind or homogeneous, found closely
together.
Estimate: forming judgment about or approximate calculation of size, cost, etc.
Probability: refers to a mechanism, which measures and analyzes the chance of
occurrence of an uncertain event.
Proportionate: corresponding in degree, amount or Ratio.
Random: means each and every unit in the population will have an equal chance of
being selected.
Sampling Units: Sampling unit is the unit in terms of which the enumerator collects the
data.
Sampling: The process of taking sample and making inference to the population.
Sampling Frame: The listing of all units in the population under study.
Sampling Error: The difference between the results obtained from a sample study and
the results that would have been obtained from an equal complete coverage.
Non-Sampling Error: Errors that can arise even in census or complete enumeration.
They mainly arise at the stage of acquiring, recording and processing of data.
Parameters: Values obtained from a population and used to describe or summarize
population characteristics.
Statistics: Values obtained from samples and used to describe sample characteristic
(behavior).
17.7 REFERENCE BOOKS
- Business Statistics, Dr. J.S. Chandan, Prof. Jagjit Singh, K.K. Khanna, 1995, Reprint
1996.
- Business Statistics (A text book for B. Com. Students of Indian Universities), R.H.
Dhareshwar, M.Sc., M.Phil., 1999.
- Business Statistics (Practical), T.K. Nagpal, P.S. Narayana, 1988.
BLOCK 6: CO-ORDINATE GEOMETRY
UNIT 22 DISTANCE FORMULA AND MID-POINT OF A LINE SEGMENT

UNIT 23 EQUATIONS OF A STRAIGHT LINE
UNIT 24 PERPENDICULAR AND PARALLEL LINES
UNIT 25 SYSTEM OF LINEAR EQUATIONS
The development of the Cartesian Coordinate System represented a very important
advance in mathematics. It was through the use of this system that Rene Descartes (1596-
1650), a French philosopher-mathematician, was able to transform geometric problems
into algebraic problems that could be solved almost entirely by algebra. The joining of algebra
and geometry is known as Coordinate Geometry.
From your high school mathematics, you know that a coordinate pair (a,b) exists for every
point in the x-y coordinate plane. Since every point on each axis has a real number
associated with it, each point located in the plane can be associated with a unique
ordered pair of real numbers.
In this block we develop some of the basic tools used in coordinate geometry and apply
these tools to write different forms of the equation of a line and to solve systems of linear
equations graphically.
UNIT 22 DISTANCE FORMULA AND MID-POINT OF A LINE SEGMENT
CONTENTS
22.1 Introduction
22.2 Length of a horizontal & a vertical line segment
22.3 Distance Formula
22.4 Summary
22.5 Answer to self-assessment questions (SAQ)
22.6 Model Examination questions
22.7 References

At the end of this unit the student should be able to:
- find the distance between pairs of points in a coordinate plane
- apply distance formula
- find the mid-point of a line segment
22.1 INTRODUCTION
In this unit you will learn how to find the distance between points in a coordinate plane by
considering different cases, and how to find the mid-point of a line segment using the
coordinates of its end points. Recall that the distance between two points is the length of the
segment that connects them.
22.2 LENGTHS OF A HORIZONTAL AND VERTICAL LINE SEGMENTS
Definition:
A line is said to be horizontal iff it is parallel to the x-axis.
For a horizontal line segment PQ with P(x1, y) and Q(x2, y),
PQ = |x2 – x1| is the length of the line segment.
[Figure: horizontal segment from P(x1, y) to Q(x2, y), drawn parallel to the x-axis]
Note: 1. $\overline{PQ}$ denotes the line segment and is read as "line segment PQ".
2. PQ denotes the length of the line segment $\overline{PQ}$.
Definition:
A line is said to be vertical iff it is parallel to the y-axis.
For a vertical line segment PR with P(x, y1) and R(x, y2),
PR = |y2 – y1| is the length of $\overline{PR}$.
[Figure: vertical segment from P(x, y1) to R(x, y2), drawn parallel to the y-axis]
Example
From the graph given below, find a) P1P2 b) P3P4
[Figure: points P1(-3, 1), P2(4, 1), P3(1, 3) and P4(1, -2) plotted in the coordinate plane]
Solution:
$\overline{P_1P_2}$ is a horizontal line segment because the y-coordinates of P1 and P2 are
equal, and $\overline{P_3P_4}$ is a vertical line segment because both P3 and P4 have the same
x-coordinate.
Therefore, a) P1P2 = |x2 – x1| = |4 – (-3)| = |7| = 7
b) P3P4 = |y2 – y1| = |-2 – 3| = |-5| = 5
SAQ1 For each of the following pairs of points, find PQ (the length of $\overline{PQ}$, or the
distance between the points P and Q).
a) P (3,4), Q (3,1)
b) P (4,3), Q (1,3)
22.3 DISTANCE FORMULA
So far we have seen distance formulas for horizontal and vertical line segments. The
basic tool in coordinate geometry is the distance between any two points, which is easily
derived using the Pythagorean theorem.
Let P1 (x1, y1) and P2 (x2, y2) be two points in a rectangular coordinate system, and let
Q (x2, y1) be the third vertex of the right triangle P1QP2.
[Figure: right triangle with vertices P1(x1, y1), Q(x2, y1) and P2(x2, y2); the legs have lengths |x2 – x1| and |y2 – y1|]
We can see that:
$(P_1P_2)^2 = |x_2 - x_1|^2 + |y_2 - y_1|^2$
$\Rightarrow P_1P_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
Remark: $|x_2 - x_1|^2 = |x_1 - x_2|^2 = (x_2 - x_1)^2 = (x_1 - x_2)^2$
Theorem: The distance between two points P1 and P2, denoted d(P1, P2), is given by
$d(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$ or $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
where P1 (x1, y1), P2 (x2, y2) and d is the distance between P1 and P2.
Note: The formula given above can be used to find the distance between any two points
in a coordinate plane.
Example: Find the distance between the points (-3,6) and (0,2).
Solution
Let P1 (x1, y1) = (-3,6) and P2 (x2, y2) = (0,2).
Then $d(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
$= \sqrt{(0 - (-3))^2 + (2 - 6)^2}$
$= \sqrt{9 + 16}$
$= \sqrt{25}$
= 5
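As a quick numerical check of the distance formula, the sketch below uses `math.hypot` from Python's standard library, which returns the square root of the sum of squares of its arguments. The function name `distance` and the representation of points as (x, y) tuples are illustrative choices.

```python
import math

def distance(p, q):
    """Distance between points p = (x1, y1) and q = (x2, y2).

    Computes sqrt((x2 - x1)^2 + (y2 - y1)^2).
    """
    return math.hypot(q[0] - p[0], q[1] - p[1])

print(distance((-3, 6), (0, 2)))  # 5.0
```

Because the differences are squared, the order of the two points does not matter, which matches the remark that (x2 – x1)² = (x1 – x2)².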
SAQ2 Find the distance between the points a) (5,-2) and (-6,-4)
b) (0,0) and (5,6)
MIDPOINT OF A LINE SEGMENT
The mid-point formula
If P (x1,y1) and Q (x2,y2) are any two points in a coordinate plane, then
the mid-point M of $\overline{PQ}$ has the coordinates $\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right)$.
[Figure: segment from P(x1, y1) to Q(x2, y2) with its mid-point M marked]
Example1: Find the coordinates of the mid-point of AB for A (7,1) , B (-3,5).
Solution: A (7, 1) , B (-3, 5). Letting (x1, y1) = (7, 1) and (x2, y2) = (-3, 5),
M (x, y) = ((x1 + x2)/2, (y1 + y2)/2)
= ((7 + (-3))/2, (1 + 5)/2)
= (2, 3)
Therefore the mid-point of AB has the coordinates (2,3).
Example2: M is the mid-point of CD. Find the coordinates of D for C (-5,4) and M (-2,1).
Solution: Let C have coordinates (x1, y1) = (-5, 4) and D have coordinates (x2, y2).
M = ((x1 + x2)/2, (y1 + y2)/2) = (-2, 1)
⇒ (x1 + x2)/2 = -2 and (y1 + y2)/2 = 1
⇒ (-5 + x2)/2 = -2 and (4 + y2)/2 = 1
⇒ -5 + x2 = -4 and 4 + y2 = 2
⇒ x2 = 1 and y2 = -2
Therefore, D has coordinates (1,-2).
Example3:
ΔABC has vertices A (-4,-3) , B (4,-1) and C (-2,3). Find the length of the median CM,
where M is the mid-point of AB.
Solution:
To get the length of the median CM, it requires both the mid-point formula and the
distance formula.
[Figure: triangle ABC with A(-4, -3), B(4, -1), C(-2, 3) and M the mid-point of AB]
First find the coordinates of M, using the mid-point formula.
Let M(x, y) be the mid-point of AB, where
x = (-4 + 4)/2 = 0 and y = (-3 + (-1))/2 = -2
Therefore M(x, y) = (0, -2).
Then find d(C, M), using the distance formula with (x1, y1) = C (-2, 3) and (x2, y2) = M (0, -2):
$d(C, M) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
$= \sqrt{(0 - (-2))^2 + (-2 - 3)^2}$
$= \sqrt{4 + 25}$
$= \sqrt{29}$
Therefore, the length of the median CM is √29 units.
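The three mid-point examples can be reproduced with a short sketch. The helper names `midpoint`, `other_endpoint` and `median_length` are hypothetical; `other_endpoint` simply rearranges the mid-point formula to x2 = 2x – x1 and y2 = 2y – y1.

```python
import math

def midpoint(p, q):
    """Mid-point of the segment joining p = (x1, y1) and q = (x2, y2)."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def other_endpoint(c, m):
    """End point D of segment CD, given end point c and mid-point m."""
    return (2 * m[0] - c[0], 2 * m[1] - c[1])

def median_length(a, b, c):
    """Length of the median from vertex c to the mid-point of side ab."""
    m = midpoint(a, b)
    return math.hypot(m[0] - c[0], m[1] - c[1])

print(midpoint((7, 1), (-3, 5)))                 # Example 1: (2.0, 3.0)
print(other_endpoint((-5, 4), (-2, 1)))          # Example 2: (1, -2)
print(median_length((-4, -3), (4, -1), (-2, 3))) # Example 3: square root of 29
```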
SAQ3 Find the coordinates of the mid-point of AB
a) A (5,0) , B(-4,1) b) A (-3,1) , B (8,-5)
SAQ4 Show that AC and BD have the same mid-point for
A (-3,-5) , B (2,-3) , C (3,5) , and D (-2,3)
22.4 SUMMARY
In this unit, we have discussed the distance and mid-point formulas. The distance
between any two points can be computed by using the distance formula. The mid-point is
a unique point on the line segment that is equidistant from the two end points. The
concepts of the mid-point and distance formulas will be applied in different sections of
coordinate geometry.
22.5 ANSWER FOR SELF ASSESSMENT QUESTION
SAQ1: a) PQ = 3 b) PQ = 3
SAQ2: a) √125 = 5√5 b) √61
SAQ3: a) (½, ½)
b) (5/2, -2)
SAQ4: The mid-point of AC is (0, 0). The mid-point of BD is (0, 0).
∴ They have the same mid-point.
22.6 MODEL EXAMINATION
1. For the given pair of points, find PQ

a) P (6, -3) , Q (4,-5)
b) P (2, -1) , Q (2, 5)
c) P (1, 9) , Q (-1, 7)
d) P (2, 3) , Q (3, -1)
2. If Q (3, x) is 5 units from P (-2,-1), find all possible values of x.
3. Find the coordinates of the mid-point of CD, if:
a) C (3,8) , D (-5,2)
b) C (3,7) , D (-3,-7)
c) C (-2,6) , D (5,-5)
4. M is the mid-point of AB. Find the coordinates of B, if:
a) M (-3,-1) , A (7,5)
b) M (5,0) , A (-8,2)
c) M (-6,1) , A (10,-3)
5. In ΔABC, M is the mid-point of AB. Show that AM = MB = MC for A (7,1) , B (1,-7)
and C (1,1).
22.7 REFERENCES
1. Fundamental of Pre- University Mathematics

Yismaw Alemu (Ph.D)
2. Mathematics:- An introductory Course
UNIT 23: EQUATIONS OF A STRAIGHT LINE
CONTENTS
23.0: Aims & Objectives
23.1: Introduction
23.2: Two-points form of equation of a line
23.3: Point-slope form of equation of a line
23.4: Slope-intercept form of equation of a line
23.5: Intercepts form of equation of a line
23.6: General form of equation of a line
23.7: Summary
23.8: Answer to SAQ
23.9: Model examination
23.10 References
23.0 AIMS & OBJECTIVES
The aim of this unit is to make you well accustomed to forming the different forms of
the equation of a line and to generalizing them to a more compact form.
At the end of this lesson students should be able to
- draw a line, given the coordinates of one point and the slope of the line, and then write its
equation
- write an equation of a line in standard form, given the coordinates of two points on the
line.
23.1 INTRODUCTION
In this unit we investigate some standard equations whose graphs are straight lines. The
concept of the slope of a line is the key point here: it helps us to relate the points of a straight
line, which results in the equation of the given line in different forms.
23.2 TWO - POINTS FORM OF EQUATION OF A LINE
Slope of a line
If we take two points P1 (x1,y1) and P2 (x2,y2) on a line, then the ratio of
the change in y to the change in x as we move from point P1 to P2 is
called the slope of the line, i.e. the slope of a line is the measure of the
“steepness” of the line.
Definition
If a line passes through two distinct points P1 (x1,y1) and P2 (x2,y2), then
its slope, usually denoted by m, is given by the formula
$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\text{vertical change}}{\text{horizontal change}}, \quad x_1 \neq x_2$
[Figure: line through P1(x1, y1) and P2(x2, y2), with horizontal change x2 – x1 and vertical change y2 – y1 meeting at (x2, y1)]
Note: The slope of a line can be computed by considering any two
points of the line and the result is always constant.
If a line passes through P1 (x1,y1) and P2 (x2,y2), then the equation of the
line is given by the formula
$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1} = m$, where (x, y) is any point on the line other than P1 and P2.
Hence the equation $\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}$ is the two-points form of the equation of a straight line.
Example: Find the slope and equation of the line that passes through the points A(3,4) and
B (-5,6).
Solution: A (3, 4) , B (-5, 6). Let (x1, y1) = (3, 4) and (x2, y2) = (-5, 6).
Slope = m = (y2 – y1)/(x2 – x1) = (6 – 4)/(-5 – 3) = 2/(-8) = -1/4
The equation of the line is given by
(y – y1)/(x – x1) = (y2 – y1)/(x2 – x1)
(y – 4)/(x – 3) = (6 – 4)/(-5 – 3)
(y – 4)/(x – 3) = 2/(-8)
-8 (y – 4) = 2 (x – 3)
-8y + 32 = 2x – 6
-8y = 2x – 38
y = -¼x + 19/4
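The two-points form can be checked with exact fractions. The sketch below is illustrative only: the function name `line_through` is hypothetical, and it returns the slope m and the y-intercept b of y = mx + b for a non-vertical line. For A(3,4) and B(-5,6) it gives m = -1/4 and b = 19/4, which can be verified by substituting both points into y = mx + b.

```python
from fractions import Fraction

def line_through(p, q):
    """Slope m and y-intercept b of the line through p and q.

    Uses m = (y2 - y1) / (x2 - x1) and b = y1 - m * x1, so the line
    is y = m*x + b. Vertical lines (x1 == x2) have no slope.
    """
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: slope is undefined")
    m = Fraction(y2 - y1, x2 - x1)
    b = y1 - m * x1
    return m, b

m, b = line_through((3, 4), (-5, 6))
print(m, b)  # -1/4 19/4
```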
SAQ1 Find the equation of the line that passes through the points
a) (0,1) and (6,-2)
b) (-2,-3) and (2,-6)
23.3 POINT-SLOPE FORM OF EQUATION OF A LINE
An equation of a line that passes through a point P1 (x1,y1) with slope m is
y – y1 = m (x – x1).
Note that P (x,y) is any (variable) point on the line other than P1, and P1 (x1,y1) is
fixed.
[Figure: line through the fixed point P1(x1, y1) and a variable point P(x, y)]
Example
Write the equation of a line that passes through (2,3) having slope 2/3.
Solution: P (2,3), m = 2/3. Letting (x1, y1) = (2, 3),
y – y1 = m (x – x1)
y – 3 = 2/3 (x – 2)
3y – 9 = 2x – 4
3y = 2x + 5
SAQ2
Write the equation of a line that passes through (1,-1) having a slope of -½.
23.4 SLOPE-INTERCEPT FORM OF EQUATION OF A LINE
If a line has slope m and y intercept b, then an equation of the line is given by y=mx+b
Example 1: The equation of a line is –2x – 6y = 12. Find the slope and the y-intercept of the line.
Solution: First write the given equation in slope-intercept form.
-6y = 2x + 12
y = -1/3 x – 2
Therefore, m = -1/3 and the y-intercept is –2.
Note: The y-intercept of a line is the y-coordinate of the point where the line intersects the y-axis.
Example 2: Write the equation of the line that passes through the points (3,0) and (5,-4) in slope-intercept form.
Solution: First compute the slope of the line.

$m = \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{-4 - 0}{5 - 3} = \dfrac{-4}{2} = -2$

The point (3,0) lies on the x-axis, so it is the x-intercept. To find the y-intercept b, substitute (3,0) into y = mx + b:
0 = -2(3) + b, which gives b = 6.
Therefore, the equation of the line with slope m = -2 and y-intercept b = 6 is
y = -2x + 6
Check: at x = 5, y = -2(5) + 6 = -4.
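Reading the slope and y-intercept off an equation of the form ax + by = c can be sketched as follows. This is an illustrative helper only; the name `slope_intercept` and the (a, b, c) argument order are assumptions made for this sketch.

```python
from fractions import Fraction

def slope_intercept(a, b, c):
    """For the line a*x + b*y = c with b != 0, return (m, y_intercept)."""
    if b == 0:
        raise ValueError("b == 0: vertical line, cannot be written as y = mx + b")
    # Solving for y gives y = (-a/b) x + (c/b).
    return Fraction(-a, b), Fraction(c, b)

# The line -2x - 6y = 12 from Example 1.
print(slope_intercept(-2, -6, 12))  # (Fraction(-1, 3), Fraction(-2, 1))
```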
SAQ3
Write the equation of the line with slope m and Y-intercept b.
a) m = -3, b = 4 b) m = 7, b = -2 c) m = 0 , b = -1/2
23.5 TWO INTERCEPTS FORM OF EQUATION OF A LINE
If a line ℓ has x-intercept (a,0) and y-intercept (0,b), where a and b are both non-zero, then the equation of line ℓ is given by:

$\dfrac{x}{a} + \dfrac{y}{b} = 1, \quad a, b \neq 0$

Example: The x-intercept and the y-intercept of the line ℓ are 2 and 3 respectively. Write the equation of the line in intercept form.
Solution:
x-intercept = a = 2 and y-intercept = b = 3, and the intercept form is $\dfrac{x}{a} + \dfrac{y}{b} = 1$.
Therefore, $\dfrac{x}{2} + \dfrac{y}{3} = 1$ is the equation of the line.
SAQ4
Indicate the slope, x-intercept and y-intercept, and write the equation in the intercepts form.
a) y = -3/5x + 4 b) 4x – 3y = 24
23.6 GENERAL FORM OF EQUATION OF A LINE
Theorem: The general form of the equation of a line is ax + by + c = 0, where a, b, c ∈ R and a and b are not both 0.
Example: Write y + 3 = -1/2 (x-5) in standard form (or general form).
Solution: Multiply both sides by -2:
-2 (y + 3) = x-5
⇒ -2y – 6 = x-5
⇒ -2y – x = 1
⇒ -2y – x – 1 = 0 or 2y + x + 1 = 0
SAQ5
Write 3(y-1) = 2x + 4 in standard form (general form)
23.7 SUMMARY
The equation of a line is a first-degree equation that is satisfied by the coordinates (x, y) of every point on the line. The graph of any equation that can be written in the form Ax + By + C = 0, where A, B, C ∈ R with A and B not both zero, is a line. An equation of the line through the point (x1,y1) with slope m is y-y1 = m(x-x1). An equation of the line with slope m and y-intercept b is y = mx + b.
23.8 ANSWER FOR SELF-ASSESSMENT QUESTIONS
SAQ1 a) 2y = 2 – x b) 4y = -3 (x + 6)
SAQ2 2y = - (x + 1)
SAQ3 a) y = -3x + 4 b) y = 7x – 2 c) y = -1/2
SAQ4 a) m = -3/5, b = 4, x-intercept = 20/3. The equation: $\dfrac{x}{20/3} + \dfrac{y}{4} = 1$
b) m = 4/3, b = -8, x-intercept = 6. The equation: $\dfrac{x}{6} - \dfrac{y}{8} = 1$
SAQ5 3y – 2x – 7 = 0 or 2x – 3y + 7 = 0
23.9 MODEL EXAMINATION QUESTIONS
1. For each line whose equation is given, find the slope and the coordinates of any one point on the line.
a) y – 6 = 2 (x-5) b) y – 7/3 = (x + ¾) c) y = -x
2. Write each equation in the form y = mx + b
a) 3y = 6x + 12 b) –4y = 2x – 3 c) 5x + y = -2
d) y – 8 = 1/3 (x + 12)
3. Write the equations given in question (2) in the general form.
4. Graph each line and indicate the slopes and intercepts
a) y/3 – x/4 = 1 b) y = 2/3 x – 3
23.10 REFERENCES

Yismaw Alemu (Ph.D)
UNIT 24 PERPENDICULAR AND PARALLEL LINES
CONTENTS
24.1 Introduction
24.2 Parallel and Perpendicular lines
24.3 Summary
24.4 Answer to SAQ
24.5 Model Examination
24.6 References
The aim of this unit is to let you differentiate parallel and perpendicular lines.
At the end of this unit the student should be able to:

-determine from their slopes if lines are parallel or perpendicular.
-find the slope of a line parallel or perpendicular to a given line.
24.1 INTRODUCTION
From geometry, we know that two vertical lines are parallel to each other and that a horizontal line and a vertical line are perpendicular to each other. In this unit you will see a technique for deciding when two non-vertical lines are parallel or perpendicular to each other.
24.2 PARALLEL AND PERPENDICULAR LINES
Theorem: Given two non-vertical lines ℓ1 and ℓ2 with slopes m1 and m2, respectively, then
ℓ1 // ℓ2 if and only if m1 = m2
ℓ1 ⊥ ℓ2 if and only if m1 · m2 = -1
where // means "parallel to" and ⊥ means "perpendicular to".
Example: Given a line ℓ: 2x –y = 2 and the point P (1,2), find an equation of the line ℓ1
through P that is a) Parallel to ℓ b) perpendicular to ℓ
Solution: First, find the slope of ℓ by writing 2x – y = 2 in slope-intercept form y = mx + b:
y = 2x – 2
m = 2
a) The slope of the line ℓ1 parallel to ℓ is the same as that of ℓ. Hence, slope of ℓ1 = 2 = slope of ℓ.
For ℓ1, m = 2 and P (1,2) is on ℓ1, so let (x1, y1) = (1, 2). Hence
ℓ1: y-y1 = m (x-x1)
ℓ1: y – 2 = 2(x – 1)
ℓ1: y = 2x – 2 + 2
ℓ1: y = 2x
b) ℓ ⊥ ℓ1
(Slope of ℓ) · (Slope of ℓ1) = -1
Slope of ℓ1 = -1/(slope of ℓ) = -1/2. Let m1 be the slope of ℓ1.
Therefore m1 = -1/2, and P (1,2) is on ℓ1. Take (x1, y1) = (1, 2). Hence the equation of ℓ1 is given by:
y-y1 = m1 (x-x1)
⇒ y – 2 = -1/2 (x – 1)
⇒ y – 2 = -x/2 + ½
⇒ y = -x/2 + 5/2
⇒ 2y = -x + 5
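The two parts of this example follow one pattern: pick the slope (m for the parallel line, -1/m for the perpendicular line), then apply the point-slope form. A minimal sketch, with hypothetical function names and assuming the given line is neither vertical nor horizontal:

```python
from fractions import Fraction

def through_point(m, point):
    """Return (m, b) for the line y = m*x + b with slope m through point."""
    x1, y1 = point
    return m, y1 - m * x1            # from y - y1 = m (x - x1)

def parallel_and_perpendicular(m, point):
    """Lines through point that are parallel / perpendicular to a line of slope m."""
    m = Fraction(m)
    if m == 0:
        # The perpendicular to a horizontal line is vertical and has no slope.
        raise ValueError("m == 0: the perpendicular line is vertical")
    return through_point(m, point), through_point(-1 / m, point)

par, perp = parallel_and_perpendicular(2, (1, 2))
print(par)   # (Fraction(2, 1), Fraction(0, 1))    i.e. y = 2x
print(perp)  # (Fraction(-1, 2), Fraction(5, 2))   i.e. y = -x/2 + 5/2
```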
SAQ1 Given a line L with equation 4x + 2y = 3 and the point P (2, -3), find an
equation of a line through P that is
a) Parallel to L b) Perpendicular to L
Write the final answers in the slope-intercept form, i.e. y = mx + b
24.3 SUMMARY
One of the fundamental relationships that exists between straight lines is the relationship
of being parallel to or perpendicular to each other. In this unit we have described the
relation by the help of their slope.
24.4 ANSWER TO SELF-ASSESSMENT QUESTIONS
SAQ1: a) y = -2x + 1
b) y = X/2 - 4
24.5 MODEL EXAMINATION
1. Give the slope of a line parallel to the line with the given equation. Then give the slope of a line perpendicular to the line with the given equation.
a) y = 3x – 1 b) y = -2x + 4 c) y = x + 3
2. Use slopes to show that the segments PR and SQ are perpendicular (PR ⊥ SQ), if
P (2,-1) , Q (5,3) , R (1,6) , S (-2,2)
3. Write an equation in slope-intercept form of the line passing through the given
point and parallel to the line whose equation is given by
a) (5,-2) ; y = -6x + 1 b) (0,6) ; 2x + 4y = 10
4. Write an equation in slope-intercept form of the line passing through the given
point and perpendicular to the line whose equation is given
a) (-3,1) ; y = 2/3 x – 4 b) (-4,-5) ; 3x + 2y = -7
24.6 REFERENCES

Yismaw Alemu (Ph.D)
UNIT 25: SYSTEM OF LINEAR EQUATIONS
CONTENTS

25.1 Introduction
25.2 System of Linear Equation
25.3 Summary
25.4 Answer to SAQ
25.5 Model Examination Question
25.6 References
25.0 AIMS AND OBJECTIVES
The aim of this unit is to show you the different cases that can arise when solving two linear equations in two variables (unknowns).
At the end of this unit, the student should be able to:

- solve systems of linear equations in two variables both algebraically and graphically.
25.1 INTRODUCTION
In this unit we review how systems of linear equations are solved algebraically and geometrically. We focus on systems of linear equations in two variables (simultaneous equations).
25.2 SYSTEMS OF LINEAR EQUATIONS
In this unit we are interested in solving linear systems of the type

ax + by = h
cx + dy = k

that is, systems of two linear equations in two variables, where x and y are variables and a, b, c, d, h, k ∈ R.
The pair (x0, y0) is a solution of this system if each equation is satisfied by the pair.
Possible solutions to a linear system

The linear system
ax + by = h
cx + dy = k
must have:
1. exactly one solution
or
2. no solution
or
3. infinitely many solutions
There are no other possibilities.
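The three cases can be told apart without solving the system: the two lines cross once, are parallel, or coincide. The sketch below is one way to test this, assuming each equation is a genuine line (its x and y coefficients are not both zero); the function name is hypothetical.

```python
def classify_system(a, b, h, c, d, k):
    """Classify the system  a*x + b*y = h,  c*x + d*y = k."""
    if a * d - b * c != 0:
        # The lines have different slopes, so they meet in exactly one point.
        return "exactly one solution"
    # Same slope: the lines coincide only if the constants are in the same ratio.
    if a * k - c * h == 0 and b * k - d * h == 0:
        return "infinitely many solutions"
    return "no solution"

print(classify_system(1, 1, 7, 2, 3, 18))     # exactly one solution
print(classify_system(3, -6, -9, -2, 4, 6))   # infinitely many solutions
print(classify_system(2, -3, -2, -4, 6, 7))   # no solution
```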
Solution by Graphing
We first graph both equations in the same rectangular coordinate system. The coordinates of any point that the graphs have in common must be a solution to the system, since they satisfy both equations.
Example: Solve the system
x + y = 7
2x + 3y = 18
[Figure: graphs of x + y = 7 and 2x + 3y = 18, intersecting at the point (3,4).]
Therefore, T.S = {(3,4)}, that is, x = 3 and y = 4.
SAQ1 Solve by graphing and check:
x – y = 3
x + 2y = -3
Solution by substitution
We solve the system in the next example using the substitution method.
Example: Solve by substitution:

2x – 3y = 7
3x – y = 7
Solution:
Step 1: Solve either equation for one variable in terms of the other.
3x – y = 7
-y = -3x + 7
y = 3x – 7
Step 2: Substitute the expression obtained in Step 1 into the other equation in the system and solve the resulting linear equation in one variable.
2x – 3y = 7
2x – 3(3x – 7) = 7
2x – 9x + 21 = 7
-7x = -14
x = 2
Step 3: Substitute the value of the variable determined in Step 2 into either of the two equations. Since 3x – y = 7 gives y = 3x – 7, substituting x = 2 we get
y = 3(2) – 7
y = -1
Therefore (2, -1) is the solution of the system.
SAQ2
Solve by substitution,
3x – 4y = 18
2x + y = 1
Solution Using Elimination by addition
In this method, we multiply the equations by appropriate numbers so that, when we add the two equations, one of the two variables is eliminated and we get a linear equation in one variable. We solve for that variable and substitute the result into either of the two equations to solve for the second variable.
Example: Solve using elimination by addition:

3x – 2y = 8
2x + 5y = -1
Solution:
Multiply the 1st equation by 5 and the 2nd equation by 2, and add the resulting equations.
15x – 10y = 40
4x + 10y = -2
19x = 38
x = 2
Substitute x = 2 back into either of the original equations. Let us take the second equation.
2(2) + 5y = -1
5y = -1 – 4
y = -1
T.S = {(2,-1)}
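The elimination steps can be sketched in code. This is only an illustration under stated assumptions: each equation is passed as a hypothetical (x-coefficient, y-coefficient, constant) triple, and the system is assumed to have exactly one solution.

```python
from fractions import Fraction

def solve_by_elimination(eq1, eq2):
    """Solve a*x + b*y = h and c*x + d*y = k by eliminating y."""
    a, b, h = eq1
    c, d, k = eq2
    # Multiply eq1 by d and eq2 by -b, then add: the y terms cancel,
    # leaving (a*d - c*b) x = h*d - k*b.
    x_coeff = a * d - c * b
    rhs = h * d - k * b
    if x_coeff == 0:
        raise ValueError("the system does not have exactly one solution")
    x = Fraction(rhs, x_coeff)
    # Back-substitute into an equation whose y coefficient is not zero.
    y = (h - a * x) / b if b != 0 else (k - c * x) / d
    return x, y

print(solve_by_elimination((3, -2, 8), (2, 5, -1)))  # (Fraction(2, 1), Fraction(-1, 1))
```

For the worked example this multiplies the first equation by 5 and the second by 2, which matches the hand computation.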
SAQ3
Solve using elimination by addition:
6x + 3y = 3
5x + 4y = 7
25.3 SUMMARY
So far we have discussed different methods of solving systems of linear equations in two variables. The techniques we discussed can also be applied to systems of n linear equations in n variables, but this may be time consuming. We therefore need another technique, which is introduced in the next block.
25.4 ANSWER TO SELF-ASSESSMENT QUESTIONS
SAQ1 T.S = {(1, -2)}
[Figure: graphs of x – y = 3 and x + 2y = -3, intersecting at the point (1, -2).]
SAQ2 T.S = {(2, -3)}
SAQ3 T.S = {(-1, 3)}
25.5 MODEL EXAMINATION QUESTIONS
1. Solve the following by graphing
a) x + y = 7
   x – y = 3
b) 3x – y = 7
   2x – 3y = 1
2. Solve by substitution
a) 2x – y = 3
   x + 2y = 14
b) 2x + y = 6
   x – y = -3
3. Solve using elimination
a) 3x – 6y = -9
   -2x + 4y = 6
b) 2x – 3y = -2
   -4x + 6y = 7
25.6 REFERENCES

Yismaw Alemu (Ph.D)
BLOCK 7 MATRICES AND DETERMINANTS
UNIT 26: MATRICES

UNIT 27: DETERMINANTS
This block is about matrices and determinants. The concepts of matrices and determinants are useful for solving different problems that can be reduced to systems of linear equations.
The first unit is about matrices (singular: matrix). A matrix is a rectangular array of numbers. The second unit is about determinants. In that unit we associate with each square matrix a real number, called the determinant of the matrix.
As you may remember, we discussed methods of solving systems of linear equations (or simultaneous equations) in Block 6. In this block we will discuss new methods of solving systems of linear equations with the help of matrices and determinants.
UNIT 26 MATRICES
CONTENTS:
26.0 Aims and objectives

26.1 Introduction
26.2 Definition of Matrix
26.3 Types of Matrix
26.4 Operation of Matrix
26.4.1 Addition and Subtraction of Matrices
26.4.2 Multiplication of Matrices
26.5 Summary
26.6 Answers To Check Your Progress Questions (CYP)

26.0 AIMS AND OBJECTIVES
This unit aims at introducing matrices and operations on matrices, as well as transposes of matrices. Therefore, after completing this unit you should be able to:
- define different terminologies related to matrices
- find sums, differences, products and scalar products involving matrices
- determine the transpose of a given matrix
26.1 INTRODUCTION
In this unit we discuss matrices. We define and study some algebraic operations on matrices, including addition, subtraction, multiplication, and transposition.
Matrices are a relatively new concept in mathematics. They were not devised until 1857, when the British mathematician Arthur Cayley (1821 – 1895) began to use them in the study of systems of linear equations and linear transformations.
26.2 DEFINITION OF MATRIX
A matrix (plural: matrices) is an array (or arrangement) of numbers (or variables) in a rectangular form such that each number has a definite position allotted to it. These numbers (or variables) are called the elements of the matrix.
A capital letter is generally used to name a matrix, and lower case letters with double subscripts generally denote its entries. A matrix A can be written as:
A = (aij)m×n, where the notation aij indicates the entry in row i and column j.
In general, we can write a matrix as:

$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$

Here, matrix A has m rows and n columns. It is a matrix of dimension (or order) m × n, read as "A is an m by n matrix". When we write the order of a matrix, the number of rows is written first.
Example: Consider the matrix $A = \begin{pmatrix} 1 & 2 & 3 \\ 6 & 5 & 8 \end{pmatrix}$. A is a 2 × 3 matrix, and here
a11 = 1 , a12 = 2 , a13 = 3 , a21 = 6 , a22 = 5 , a23 = 8
( 1 2 3 ) is the first row and
( 6 5 8 ) is the second row
$\begin{pmatrix} 1 \\ 6 \end{pmatrix}$ is the first column.
CYP1 For matrix A in the above example the 2nd and 3rd columns are _______ and _____ respectively.
Remark:
A matrix, by definition, is simply an arrangement of numbers and has no numerical value.
26.3 TYPES OF MATRICES
i) Row Matrix: A matrix which has exactly one row is called a row matrix.
Example: (5 9 6 2) is a row matrix, but $\begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \end{pmatrix}$ is not a row matrix (why?).
ii) Column Matrix: A matrix which has exactly one column is called a column matrix.
Example: $\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ 2 \end{pmatrix}$ are examples of column matrices.
iii) Square Matrix: A matrix whose number of rows equals its number of columns is called a square matrix.
Example: $(2)$, $\begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ are examples of square matrices.
Note: A matrix of order m × n is said to be square if and only if m = n.
Example: Consider the following matrix.

$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 8 & 0 & 2 & 1 \end{pmatrix}$

Matrix A is an m × n = 3 × 4 matrix. Since 4 ≠ 3, A is not a square matrix.
iv) Null or Zero Matrix: A matrix each of whose elements is zero is called a null matrix or zero matrix.
Example: $(0)$, $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $(0 \;\; 0)$ and $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ are examples of null or zero matrices.
v) Diagonal Matrix: A square matrix in which every element (entry) off the main diagonal is zero is called a diagonal matrix. (The main diagonal of a square matrix runs from the upper left to the lower right.)
Example: $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{pmatrix}$ are examples of diagonal matrices.
Note:
1. The diagonal elements in a diagonal matrix may also be zero.
2. A = (aij) is a diagonal matrix if and only if
i. A is a square matrix
ii. aij = 0 for all i ≠ j
Example: $A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is a diagonal matrix. Clearly A is a square matrix, having 3 rows and 3 columns, and
a12 = 0 , a13 = 0 , a21 = 0 , a23 = 0 , a31 = 0 , a32 = 0
CYP2 Let $A = \begin{pmatrix} 0 & 2 \\ 3 & 0 \end{pmatrix}$. Is A a diagonal matrix? (Why?)
[Hint: A square matrix is a diagonal matrix only if aij = 0 for all i ≠ j]
vi) Scalar Matrix: A diagonal matrix whose diagonal elements are all equal is called a scalar matrix.
Example: $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}$ are examples of scalar matrices.
vii) Identity Matrix: A diagonal matrix whose diagonal elements are all equal to 1 (unity) is called an identity matrix (or unit matrix), and it is denoted by I.
Example: $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ are examples of identity matrices.
Note: A matrix A = (aij) is said to be an identity matrix if and only if
1) A is a diagonal matrix
2) aij = 1 for all i = j.
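The definitions of square, diagonal, scalar and identity matrices build on one another, which the following sketch makes explicit. It assumes a matrix is stored as a list of rows; the function names are hypothetical.

```python
def is_square(A):
    """True if the number of rows equals the number of columns."""
    return all(len(row) == len(A) for row in A)

def is_diagonal(A):
    """Square, and a_ij = 0 for all i != j."""
    n = len(A)
    return is_square(A) and all(
        A[i][j] == 0 for i in range(n) for j in range(n) if i != j
    )

def is_scalar(A):
    """Diagonal, and all diagonal elements are equal."""
    return is_diagonal(A) and all(A[i][i] == A[0][0] for i in range(len(A)))

def is_identity(A):
    """Diagonal, and all diagonal elements are equal to 1."""
    return is_diagonal(A) and all(A[i][i] == 1 for i in range(len(A)))

zero3 = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(is_diagonal(zero3), is_scalar(zero3), is_identity(zero3))  # True True False
print(is_diagonal([[0, 2], [3, 0]]))                             # False
```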
viii) Triangular Matrix: A square matrix whose elements aij = 0 whenever i < j is called a lower triangular matrix.
Similarly, a square matrix whose elements aij = 0 whenever i > j is called an upper triangular matrix.
Example: $\begin{pmatrix} 2 & 0 \\ 3 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 6 & 8 & 9 \end{pmatrix}$ are lower triangular matrices, and
$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 5 & 4 & 6 \\ 0 & 0 & 8 & 2 \\ 0 & 0 & 0 & 7 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$ are upper triangular matrices.
CYP3 1) Give the dimension of each matrix.

A) $\begin{pmatrix} 5 & 2 & 3 \\ 2 & 1 & 6 \end{pmatrix}$ B) $\begin{pmatrix} 4 & 1 \\ 6 & 9 \\ 2 & 3 \end{pmatrix}$ C) $(6)$
D) $\begin{pmatrix} 6 \\ 3 \\ 0 \\ 2 \end{pmatrix}$ E) $(2 \;\; 3 \;\; 6 \;\; -1)$ F) $\begin{pmatrix} a & b \\ e & f \end{pmatrix}$
G) $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$ H) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$
2) List the matrices in (1) above that can be described as follows.
i. A row matrix iv. A zero matrix
ii. A square matrix v. A diagonal matrix
iii. A column matrix vi. An identity matrix
Definition:
Two matrices are equal if and only if they have the same dimensions and the elements in all corresponding positions are equal.
Example 1. $\begin{pmatrix} 2 & 3 & 5/2 \\ 3 & 3/3 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 3 & 5/2 \\ 3 & 1 & 1 \end{pmatrix}$, but $\begin{pmatrix} 4 & 9 \\ 9 & 2 \end{pmatrix} \neq \begin{pmatrix} 4 & 6 \\ 9 & 2 \end{pmatrix}$
Example 2. Find the value of each variable if
$\begin{pmatrix} x + 3 & -1 \\ 2 & 6 \end{pmatrix} = \begin{pmatrix} 4 & -1 \\ 2 & 3y \end{pmatrix}$
Solution: Since the matrices are equal, elements in corresponding positions are equal.
x + 3 = 4 and 3y = 6
Therefore, x = 1 and y = 2
Note: A zero matrix of order (dimension) m × n can be denoted by Om×n. For example,
$O_{2 \times 3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$
CYP4 Find the value of each variable
A) $O_{2 \times 2} = \begin{pmatrix} x & y \\ z & 0 \end{pmatrix}$ B) $O_{2 \times 3} = \begin{pmatrix} x + 2 & 3y & 10 - z \\ 0 & 0 & 0 \end{pmatrix}$
C) $\begin{pmatrix} 4 & x & -3 & y \\ 19 & -1 & 5 & 0 \end{pmatrix} = \begin{pmatrix} z & -1 & -3 & 6 \\ 19 & -1 & 5 & 0 \end{pmatrix}$ D) $\begin{pmatrix} 3x - y \\ x + 3y \end{pmatrix} = \begin{pmatrix} 4 \\ -2 \end{pmatrix}$
26.4 OPERATIONS OF MATRICES
26.4.1 Addition and Subtraction of Matrices

Addition and subtraction of matrices are defined if and only if the matrices have the same dimensions.
Definition:
Given two m × n matrices A = (aij)m×n and B = (bij)m×n, their sum is
A + B = (aij + bij)m×n
and their difference is
A – B = (aij – bij)m×n
Example 1: Find A + B for each of the following.

A) $A = \begin{pmatrix} -5 & 0 \\ 4 & 1/2 \end{pmatrix}$, $B = \begin{pmatrix} 6 & -3 \\ 2 & 3 \end{pmatrix}$
B) $A = \begin{pmatrix} 1 & 3 \\ -1 & 5 \\ 6 & 0 \end{pmatrix}$, $B = \begin{pmatrix} -1 & -2 \\ 1 & -2 \\ -3 & 1 \end{pmatrix}$
Solution: Each pair of matrices has the same order. Add the corresponding entries.
A) $A + B = \begin{pmatrix} -5 + 6 & 0 + (-3) \\ 4 + 2 & 1/2 + 3 \end{pmatrix} = \begin{pmatrix} 1 & -3 \\ 6 & 7/2 \end{pmatrix}$
B) $A + B = \begin{pmatrix} 1 + (-1) & 3 + (-2) \\ -1 + 1 & 5 + (-2) \\ 6 + (-3) & 0 + 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 3 \\ 3 & 1 \end{pmatrix}$

Example 2: Find C – D for each of the following, if possible.

A) $C = \begin{pmatrix} 1 & 2 \\ -2 & 0 \\ -3 & -1 \end{pmatrix}$, $D = \begin{pmatrix} 1 & -1 \\ 1 & 3 \\ 2 & 3 \end{pmatrix}$
B) $C = \begin{pmatrix} 5 & 6 \\ 3 & 4 \end{pmatrix}$, $D = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$
Solution:
A) We subtract corresponding entries (elements).
$C - D = \begin{pmatrix} 1 - 1 & 2 - (-1) \\ -2 - 1 & 0 - 3 \\ -3 - 2 & -1 - 3 \end{pmatrix} = \begin{pmatrix} 0 & 3 \\ -3 & -3 \\ -5 & -4 \end{pmatrix}$
B) The matrices do not have the same order, so we cannot subtract.
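Entry-by-entry addition and subtraction, including the same-order requirement, can be sketched as follows. Matrices are assumed to be stored as lists of rows, and the helper names are hypothetical.

```python
def order(A):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(A), len(A[0])

def add(A, B):
    """A + B, defined only when A and B have the same order."""
    if order(A) != order(B):
        raise ValueError("matrices must have the same order")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def subtract(A, B):
    """A - B, defined only when A and B have the same order."""
    if order(A) != order(B):
        raise ValueError("matrices must have the same order")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

A = [[1, 3], [-1, 5], [6, 0]]
B = [[-1, -2], [1, -2], [-3, 1]]
print(add(A, B))  # [[0, 1], [0, 3], [3, 1]]
```

Calling `subtract` on a 2 × 2 matrix and a 2 × 1 matrix raises an error, which mirrors case B of Example 2.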
The properties of addition of matrices are summarized below.
Properties of Addition of Matrices

Let A, B, and C be m × n matrices. Let Om×n be the m × n zero matrix.
1. Closure property: A + B is an m × n matrix
2. Commutative property: A + B = B + A
3. Associative property: (A + B) + C = A + (B + C)
4. Identity property: A + Om×n = A = Om×n + A
(i.e. Om×n is the identity element with respect to addition (+))
5. Inverse property: A + (-A) = -A + A = Om×n
(i.e. the inverse of A with respect to addition is –A)
26.4.2 Multiplication of Matrices
The product AB of two matrices A and B is defined only when the number of columns of A is the same as the number of rows of B.
Definition:
Dot Product
The dot product of a 1 × n row matrix and an n × 1 column matrix is a real number given by:

$(a_1 \;\; a_2 \;\; \cdots \;\; a_n) \cdot \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

Remark: The dot between the two matrices is important. If the dot is omitted, the multiplication is of another type, which we consider later.
Example: $(3 \;\; 2 \;\; 1) \cdot \begin{pmatrix} 5 \\ 1 \\ 4 \end{pmatrix}$
= 3 × 5 + 2 × 1 + 1 × 4
= 15 + 2 + 4
= 21
Definition:
Matrix Product (multiplication)
The product AB of two matrices A and B is defined only when the number of columns in A is equal to the number of rows in B. If A is an m × p matrix and B is a p × n matrix, then the matrix product of A and B, denoted by AB (with no dot), is an m × n matrix whose element in the ith row and jth column is the dot product of the ith row matrix of A and the jth column matrix of B.
Example: For $A = \begin{pmatrix} 3 & 1 & -1 \\ 2 & 0 & 3 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 6 \\ 3 & -5 \\ -2 & 4 \end{pmatrix}$ and $C = \begin{pmatrix} 4 & -6 \\ 1 & 2 \end{pmatrix}$
Find each of the following
A) AB B) BA C) BC D) AC
Solution: A) A is a 2 × 3 matrix and B is a 3 × 2 matrix, so AB will be a 2 × 2 matrix.

$AB = \begin{pmatrix} 3 \cdot 1 + 1 \cdot 3 + (-1)(-2) & 3 \cdot 6 + 1 \cdot (-5) + (-1) \cdot 4 \\ 2 \cdot 1 + 0 \cdot 3 + 3 \cdot (-2) & 2 \cdot 6 + 0 \cdot (-5) + 3 \cdot 4 \end{pmatrix} = \begin{pmatrix} 3 + 3 + 2 & 18 - 5 - 4 \\ 2 + 0 - 6 & 12 + 0 + 12 \end{pmatrix} = \begin{pmatrix} 8 & 9 \\ -4 & 24 \end{pmatrix}$

B) B is a 3 × 2 matrix and A is a 2 × 3 matrix, so BA will be a 3 × 3 matrix.

$BA = \begin{pmatrix} 1 \cdot 3 + 6 \cdot 2 & 1 \cdot 1 + 6 \cdot 0 & 1 \cdot (-1) + 6 \cdot 3 \\ 3 \cdot 3 + (-5) \cdot 2 & 3 \cdot 1 + (-5) \cdot 0 & 3 \cdot (-1) + (-5) \cdot 3 \\ -2 \cdot 3 + 4 \cdot 2 & -2 \cdot 1 + 4 \cdot 0 & -2 \cdot (-1) + 4 \cdot 3 \end{pmatrix} = \begin{pmatrix} 15 & 1 & 17 \\ -1 & 3 & -18 \\ 2 & -2 & 14 \end{pmatrix}$

C) B is a 3 × 2 matrix and C is a 2 × 2 matrix, so BC will be a 3 × 2 matrix.

$BC = \begin{pmatrix} 1 \cdot 4 + 6 \cdot 1 & 1 \cdot (-6) + 6 \cdot 2 \\ 3 \cdot 4 + (-5) \cdot 1 & 3 \cdot (-6) + (-5) \cdot 2 \\ -2 \cdot 4 + 4 \cdot 1 & -2 \cdot (-6) + 4 \cdot 2 \end{pmatrix} = \begin{pmatrix} 10 & 6 \\ 7 & -28 \\ -4 & 20 \end{pmatrix}$

D) The product AC is not defined because the number of columns of A, which is 3, is not equal to the number of rows of C, which is 2.
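The row-by-column rule can be sketched directly from the definition: entry (i, j) of the product is the dot product of row i of the first matrix with column j of the second. The function name `matmul` and the list-of-rows storage are assumptions of this sketch.

```python
def matmul(A, B):
    """Matrix product AB; requires (columns of A) == (rows of B)."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    # zip(*B) iterates over the columns of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[3, 1, -1], [2, 0, 3]]
B = [[1, 6], [3, -5], [-2, 4]]
C = [[4, -6], [1, 2]]

print(matmul(A, B))  # [[8, 9], [-4, 24]]
print(matmul(B, C))  # [[10, 6], [7, -28], [-4, 20]]
```

Calling `matmul(A, C)` raises an error, because A has 3 columns while C has only 2 rows.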
Note
1. If A is a square matrix, then A can be multiplied by itself,
i.e. A · A = A² (called a power of the matrix).
2. The scalar product of a number k and a matrix A is the matrix denoted by kA, obtained by multiplying each entry of A by the number k. The number k is called a scalar.
Example: Find A² and kA if
$A = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix}$ and k = 3
Solution: $A^2 = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 15 & 16 \end{pmatrix}$
$kA = 3 \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 9 & 12 \end{pmatrix}$
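CYP5 Give the dimension of the product of the matrices.

A) $(3 \;\; 2 \;\; 1) \begin{pmatrix} 9 \\ 6 \\ 2 \end{pmatrix}$ B) $\begin{pmatrix} 6 & 2 & 8 \\ 1 & 0 & 3 \end{pmatrix} \begin{pmatrix} 9 & 0 \\ 3 & 0 \\ 1 & 4 \end{pmatrix}$
CYP6 Multiply
A) $(3 \;\; 1) \begin{pmatrix} 4 \\ 6 \end{pmatrix}$ B) $(0 \;\; 1 \;\; 0) \begin{pmatrix} 5 \\ 6 \\ 7 \end{pmatrix}$ C) $\begin{pmatrix} 0 & 8 \\ 3 & 1 \\ -1 & 5 \end{pmatrix} \begin{pmatrix} 3 & 1 & -2 \\ 0 & 8 & -5 \end{pmatrix}$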
Properties of matrix multiplication

Let A, B and C be n × n matrices. Let In×n be the identity matrix and On×n be the zero matrix.
1. Associative property: (AB)C = A(BC)
2. Distributive property: A(B + C) = AB + AC
(B + C)A = BA + CA
3. Identity property: In×n · A = A · In×n = A
4. Multiplicative property of On×n: On×n · A = A · On×n = On×n
Note: Matrix multiplication is not in general commutative.
Example: Let $A = \begin{pmatrix} 0 & 3 \\ 1 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 0 \\ 1 & 4 \end{pmatrix}$
Then $AB = \begin{pmatrix} 0 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 12 \\ 1 & -4 \end{pmatrix}$
But $BA = \begin{pmatrix} 2 & 0 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 0 & 3 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 6 \\ 4 & -1 \end{pmatrix}$
Therefore, AB ≠ BA
Definition:
Let A be a matrix. The matrix obtained from A by interchanging its rows and corresponding columns is called the transpose of A, and is denoted by At or A′.
Example: Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$. Then the transpose of A is given by
$A^t = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$. It is obtained simply by interchanging the rows and corresponding columns.
Note:
1. (At)t = A
2. (A + B)t = At + Bt
3. (AB)t = Bt At
CYP7 Let $A = \begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 5 \\ 6 & 7 \end{pmatrix}$
Verify (At)t = A , (A + B)t = At + Bt , (AB)t = Bt At
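The transpose and two of its properties can be checked on a small case. In this sketch the matrix A is the one from the example above, while B is a hypothetical 3 × 2 matrix chosen only so that the product AB is defined; the helper names are also assumptions.

```python
def transpose(A):
    """Interchange rows and columns: row i of A becomes column i of the result."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product AB (columns of A must equal rows of B)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]   # hypothetical matrix, for illustration only

print(transpose(A))            # [[1, 4], [2, 5], [3, 6]]
# Property 1: transposing twice gives back the original matrix.
assert transpose(transpose(A)) == A
# Property 3: the transpose of a product reverses the order of the factors.
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```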
26.5 SUMMARY
In this unit we have seen that two matrices are equal if and only if they have the same
dimension (or orders) and all corresponding elements are equal. Matrices having the
same dimensions can be added (or subtracted) by adding (or subtracting) corresponding
elements.
In matrix algebra, a real number is called a scalar. The scalar product of a real number k
and a matrix A is the matrix kA.
The product of an m × n matrix A and an n × p matrix B is an m × p matrix AB.
26.6 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS (CYP)

CYP1 $\begin{pmatrix} 2 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ 8 \end{pmatrix}$ respectively
CYP2 No, since a12 = 2 ≠ 0, and also a21 = 3 ≠ 0
CYP3 1) A) 2 × 3 B) 3 × 2 C) 1 × 1 D) 4 × 1
E) 1 × 4 F) 2 × 2 G) 4 × 4 H) 4 × 4
2) i) C and E ii) C , F , G and H iii) C and D
iv) G v) C , G and H vi) H
CYP4 A) x = 0 , y = 0 , z = 0 B) x = -2 , y = 0 , z = 10
C) x = -1 , y = 6 , z = 4 D) x = 1 , y = -1
CYP5 A) 1 × 1 B) 2 × 2
CYP6 A) ( 18 ) B) ( 6 ) C) $\begin{pmatrix} 0 & 64 & -40 \\ 9 & 11 & -11 \\ -3 & 39 & -23 \end{pmatrix}$
CYP7 $A^t = \begin{pmatrix} 2 & 4 \\ 3 & 5 \end{pmatrix}$, $(A^t)^t = \begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix} = A$
$A + B = \begin{pmatrix} 3 & 8 \\ 10 & 12 \end{pmatrix}$, $(A + B)^t = \begin{pmatrix} 3 & 10 \\ 8 & 12 \end{pmatrix}$
$A^t + B^t = \begin{pmatrix} 3 & 10 \\ 8 & 12 \end{pmatrix} = (A + B)^t$, since $A^t = \begin{pmatrix} 2 & 4 \\ 3 & 5 \end{pmatrix}$ and $B^t = \begin{pmatrix} 1 & 6 \\ 5 & 7 \end{pmatrix}$
$(AB)^t = \begin{pmatrix} 20 & 34 \\ 31 & 55 \end{pmatrix} = B^t A^t$
Instruction: write the short and precise answers on the space provided.
 3 2
 3 2 5    4 2 0
Let A =   , B =  5 6  , C =   , k = 5 , and = 3
 4 6 1  0 1  1  3 5 
 
Then
A) A + C = ______________________ B) A – C = _________________________
C) At + B = ______________________ D) (At + B) t = ______________________
E) k(A + C) = ____________________ F) ( K + )A = ______________________
G) AB = _________________________ H) (AB) t = ________________________
I) BtAt = _________________________
26.8 REFERENCES
- Understanding Pre- College Mathematics, Mesaye Demessie, 2001

- Business Mathematics Quzi Zmerndin, V.K Khanna, S.K Bhambri,1980
UNIT 27 DETERMINANTS
CONTENTS:

27.1 Introduction
27.2 Definition
27.3 Cramer’s Rule
27.4 Inverse of a Matrix
27.5 Solution of System of Equations Using the Inverse Matrix (optional)
27.6 Summary
27.7 Answers to Check Your Progress Questions (CYP)
27.9 References

The aim of this unit is to introduce you to new methods of solving systems of equations (simultaneous equations) using matrices and determinants. Hence, by the time you complete this unit you will be able to:
- find the determinant of any n × n square matrix
- solve systems of equations using Cramer’s rule
- determine the inverse of a given matrix
- solve systems of equations using inverses of matrices.
27.1 INTRODUCTION
In the previous block (Block 6), we saw how to solve systems of linear equations that involve two or more variables. This unit is concerned with a new term: "determinant". A determinant assigns a real number to a square matrix. We will use determinants to solve systems of linear equations. Moreover, we will use determinants to find the inverse of a matrix, which in turn is used to solve systems of linear equations.
27.2 DEFINITION
With every square matrix we associate a number called its determinant. The number of elements in any row or column is called the order of the determinant.
Determinant of a 2 × 2 matrix

Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then the determinant of A, denoted by det A, is defined as follows:

$\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$

Since the determinant of A has 2 rows, det A is of order 2.
Note: The determinant of a matrix is usually displayed in the same form as the matrix, but with vertical bars rather than brackets enclosing the elements.
Example: Let $A = \begin{pmatrix} 6 & 4 \\ 2 & 3 \end{pmatrix}$. Evaluate det A.
Solution: $\det A = \begin{vmatrix} 6 & 4 \\ 2 & 3 \end{vmatrix} = 6 \times 3 - 2 \times 4 = 10$
Definition:
Determinant of a 3 × 3 matrix

Let $A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix}$. Then det A is defined as follows:

$\det A = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - a_3 b_2 c_1 - a_2 b_1 c_3 - a_1 b_3 c_2$

A convenient method for finding the six terms needed to evaluate a 3 × 3 determinant is shown below.
1. Copy the first two columns of the matrix, in order, to the right of the third column.

$\begin{matrix} a_1 & b_1 & c_1 & a_1 & b_1 \\ a_2 & b_2 & c_2 & a_2 & b_2 \\ a_3 & b_3 & c_3 & a_3 & b_3 \end{matrix}$

2. Starting from each element in the first row of the original matrix, multiply the three elements on the diagonal running downward from left to right. These products are the first three terms of the determinant:
$a_1 b_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3$
3. Starting from each element in the last row of the original matrix, multiply the three elements on the diagonal running upward from left to right. The opposites of these products are the last three terms of the determinant:
$-a_3 b_2 c_1 - b_3 c_2 a_1 - c_3 a_2 b_1$
Therefore, det A is given by:

$\det A = a_1 b_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 - a_3 b_2 c_1 - b_3 c_2 a_1 - c_3 a_2 b_1$
$= a_1 b_2 c_3 + a_3 b_1 c_2 + a_2 b_3 c_1 - a_3 b_2 c_1 - a_1 b_3 c_2 - a_2 b_1 c_3$

Remark: This method works only for 3 × 3 determinants.
Example: Evaluate $\det A = \begin{vmatrix} 3 & 5 & 0 \\ 2 & 4 & -1 \\ 1 & 6 & 3 \end{vmatrix}$
Solution: Copy the first two columns to the right of the third column:

$\begin{matrix} 3 & 5 & 0 & 3 & 5 \\ 2 & 4 & -1 & 2 & 4 \\ 1 & 6 & 3 & 1 & 6 \end{matrix}$

det A = 3 × 4 × 3 + 5 × (-1) × 1 + 0 × 2 × 6 – 1 × 4 × 0 – 6 × (-1) × 3 – 3 × 2 × 5
= 36 – 5 + 0 – 0 + 18 – 30 = 19
Answer: det A = 19
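The diagonal method for a 3 × 3 determinant translates into a few lines of code. This is a minimal sketch with a hypothetical function name; it applies to 3 × 3 matrices only.

```python
def det3_sarrus(M):
    """Determinant of a 3x3 matrix by the diagonal (copy-two-columns) method."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = M
    # Products along the three downward diagonals (added).
    down = a1 * b2 * c3 + b1 * c2 * a3 + c1 * a2 * b3
    # Products along the three upward diagonals (subtracted).
    up = a3 * b2 * c1 + b3 * c2 * a1 + c3 * a2 * b1
    return down - up

print(det3_sarrus([[3, 5, 0], [2, 4, -1], [1, 6, 3]]))  # 19
```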
CYP1 Evaluate the determinants of the following matrices

 2  5 3  2 9 4
 3 1  2 15     
A)   B)   C)  0 4 1 D)  2 1 1 
 4 3 0 3    3 2 6  4 0 5
   
Definition:
For a square matrix A = (aij), the minor Mij of an element aij is the determinant of the matrix formed by deleting the ith row and the jth column of A.
Example: For the matrix $A = (a_{ij}) = \begin{pmatrix} 3 & 2 & 5 \\ -1 & 4 & 9 \\ 6 & 0 & 7 \end{pmatrix}$, find each of the following
A) M12 B) M23 C) M33
Solution:
A) Delete the first row and the second column and find the determinant of the 2 × 2 matrix formed by the remaining elements.
$M_{12} = \begin{vmatrix} -1 & 9 \\ 6 & 7 \end{vmatrix}$
= -1 × 7 – 6 × 9
= -7 – 54
= -61
B) Delete the second row and the third column and find the determinant of the 2 × 2 matrix formed by the remaining elements.
$M_{23} = \begin{vmatrix} 3 & 2 \\ 6 & 0 \end{vmatrix}$
= 3 × 0 – 6 × 2
= -12
C) Delete the third row and the third column.
$M_{33} = \begin{vmatrix} 3 & 2 \\ -1 & 4 \end{vmatrix}$
= 3 × 4 – (-1 × 2)
= 12 + 2
= 14
Definition:
For a square matrix A = (aij), the cofactor Aij of an element aij is given by:
$A_{ij} = (-1)^{i+j} M_{ij}$, where Mij is the minor of aij
Example: For the matrix $A = (a_{ij}) = \begin{pmatrix} 3 & 2 & 5 \\ -1 & 4 & 9 \\ 6 & 0 & 7 \end{pmatrix}$, find each of the following.
A) A11 B) A23 C) A31
Solution: A) $M_{11} = \begin{vmatrix} 4 & 9 \\ 0 & 7 \end{vmatrix}$ = 4 × 7 – 0 × 9 = 28
Then $A_{11} = (-1)^{1+1} M_{11} = (-1)^2 (28) = 28$
B) $M_{23} = \begin{vmatrix} 3 & 2 \\ 6 & 0 \end{vmatrix}$ = 3 × 0 – 6 × 2 = -12
Then $A_{23} = (-1)^{2+3} M_{23} = (-1)^5 (-12) = (-1)(-12) = 12$
C) $M_{31} = \begin{vmatrix} 2 & 5 \\ 4 & 9 \end{vmatrix}$ = 2 × 9 – 4 × 5 = -2
Then $A_{31} = (-1)^{3+1} M_{31} = (-1)^4 (-2) = -2$
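Minors and cofactors of a 3 × 3 matrix can be sketched as below. Rows and columns are numbered from 1, as in the text; the function names are hypothetical, and the sketch handles 3 × 3 matrices only because the minor is evaluated as a 2 × 2 determinant.

```python
def det2(M):
    """Determinant of a 2x2 matrix: ad - bc."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def minor(A, i, j):
    """Minor M_ij of a 3x3 matrix: delete row i and column j (1-based)."""
    sub = [
        [value for col, value in enumerate(row, start=1) if col != j]
        for r, row in enumerate(A, start=1)
        if r != i
    ]
    return det2(sub)

def cofactor(A, i, j):
    """Cofactor A_ij = (-1)^(i+j) * M_ij."""
    return (-1) ** (i + j) * minor(A, i, j)

A = [[3, 2, 5], [-1, 4, 9], [6, 0, 7]]
print(minor(A, 1, 2), minor(A, 2, 3), minor(A, 3, 3))           # -61 -12 14
print(cofactor(A, 1, 1), cofactor(A, 2, 3), cofactor(A, 3, 1))  # 28 12 -2
```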
 1 0 0  2
 
 4 1 0 0 
CYP2 For the matrix A =  , Find:
5 6 7 8 
 
 2  3 1 0 
 
A) M41 and M44 B) M12 C) A41and A44 D) A12
Definition:
Determinant of Any Square Matrix

For any square matrix A of order n × n (n > 1), we define the determinant of A, denoted by |A| or det A, as follows. Choose any row or column. Multiply each element in that row or column by its cofactor and add the results.
Example 1: Evaluate det A (or |A|)

$A = \begin{pmatrix} -8 & 0 & 6 \\ 4 & -6 & 7 \\ -1 & -3 & 5 \end{pmatrix}$

Solution: Let us choose the third row.
det A = (-1) A31 + (-3) A32 + 5 A33
(Note: A31 is the cofactor of the element -1, A32 is the cofactor of the element -3, and A33 is the cofactor of the element 5.)

$\det A = (-1)(-1)^{3+1} \begin{vmatrix} 0 & 6 \\ -6 & 7 \end{vmatrix} + (-3)(-1)^{3+2} \begin{vmatrix} -8 & 6 \\ 4 & 7 \end{vmatrix} + 5(-1)^{3+3} \begin{vmatrix} -8 & 0 \\ 4 & -6 \end{vmatrix}$

= (-1)(1)(0 × 7 – (-6) × 6) + (-3)(-1)(-8 × 7 – 4 × 6) + 5(1)(-8 × (-6) – 4 × 0)
= (-1)(36) + (3)(-80) + 5(48)
= -36 – 240 + 240
= -36
Example 2: Evaluate det A if

$A = \begin{pmatrix} -1 & 6 & -1 \\ -3 & -5 & 3 \\ 0 & 4 & 2 \end{pmatrix}$

Solution: Let us choose the second column.
Then, det A = 6 A12 + (-5) A22 + 4 A32

$= 6(-1)^{1+2} \begin{vmatrix} -3 & 3 \\ 0 & 2 \end{vmatrix} + (-5)(-1)^{2+2} \begin{vmatrix} -1 & -1 \\ 0 & 2 \end{vmatrix} + 4(-1)^{3+2} \begin{vmatrix} -1 & -1 \\ -3 & 3 \end{vmatrix}$

= -6 (-3 × 2 – 0 × 3) + (-5)(-1 × 2 – 0 × (-1)) + (-4)((-1) × 3 – (-3) × (-1))
= (-6)(-6) + (-5)(-2) + (-4)(-6)
= 36 + 10 + 24
= 70
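Because any row or column gives the same value, a general routine can always expand along the first row. The sketch below does this recursively; the function name `det` is an assumption, and the method is practical only for small matrices.

```python
def det(A):
    """Determinant of a square matrix by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]          # a 1x1 determinant is the element itself
    total = 0
    for j in range(n):
        # Delete row 1 and column j+1 to get the matrix of the minor.
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        # (-1)**j equals (-1)^(1 + (j+1)), the cofactor sign for 1-based indices.
        total += (-1) ** j * A[0][j] * det(sub)
    return total

print(det([[-8, 0, 6], [4, -6, 7], [-1, -3, 5]]))   # -36
print(det([[-1, 6, -1], [-3, -5, 3], [0, 4, 2]]))   # 70
```

Both values agree with the hand expansions above, even though those used the third row and the second column respectively.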
Remark:
The determinant of a 1 × 1 matrix is simply the element of the matrix.
CYP3 Evaluate det A if

i)
\[
A = \begin{pmatrix} 1 & 0 & 0 & -2 \\ 4 & 1 & 0 & 0 \\ 5 & 6 & 7 & 8 \\ -2 & -3 & -1 & 0 \end{pmatrix}
\]

ii) A = ( 5 )
27.3 CRAMER’S RULE

Determinants may be used to solve systems of linear equations. The procedure for solving
systems of linear equations using Cramer’s rule is given below.
Cramer’s Rule for 2 × 2 systems.

The solution of the system of equations
\[
\begin{aligned}
a_1 x + b_1 y &= c_1 \\
a_2 x + b_2 y &= c_2
\end{aligned}
\]
is given by
\[
x = \frac{D_x}{D}, \qquad y = \frac{D_y}{D},
\]
where
\[
D = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}
\]
is the determinant of the coefficient matrix,
\[
D_x = \begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}, \qquad
D_y = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}, \qquad \text{and } D \neq 0.
\]
Note that the denominator D contains the coefficients of x and y, in the same position as
in the original equations. For x, the numerator is obtained by replacing the x-coefficients
in D (the a’s) by the c’s. For y, the numerator is obtained by replacing the y-coefficients in
D (the b’s) by the c’s.
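Example: Solve using Cramer’s rule
2x – y = 5
x – 2y = 1
Solution: We have
\[
D = \begin{vmatrix} 2 & -1 \\ 1 & -2 \end{vmatrix} = -4 - (-1) = -3 \neq 0
\]
\[
x = \frac{D_x}{D} = \frac{\begin{vmatrix} 5 & -1 \\ 1 & -2 \end{vmatrix}}{-3} = \frac{-10 - (-1)}{-3} = \frac{-9}{-3} = 3
\]
\[
y = \frac{D_y}{D} = \frac{\begin{vmatrix} 2 & 5 \\ 1 & 1 \end{vmatrix}}{-3} = \frac{2 - 5}{-3} = \frac{-3}{-3} = 1
\]
Hence, the solution is (3 , 1) and the solution set is {(3 , 1)}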
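The 2 × 2 rule is short enough to write out in full. The following Python sketch is illustrative only: the function name `cramer_2x2` and its argument order are assumptions, and exact fractions are used so that answers such as -10/41 are not rounded.

```python
from fractions import Fraction


def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by Cramer's rule."""
    d = a1 * b2 - a2 * b1          # D: determinant of the coefficient matrix
    if d == 0:
        raise ValueError("D = 0, so Cramer's rule cannot be used")
    dx = c1 * b2 - c2 * b1         # D_x: x-coefficients replaced by the constants
    dy = a1 * c2 - a2 * c1         # D_y: y-coefficients replaced by the constants
    return Fraction(dx, d), Fraction(dy, d)


# The worked example: 2x - y = 5, x - 2y = 1
x, y = cramer_2x2(2, -1, 5, 1, -2, 1)
print(x, y)
```

For the worked example this returns x = 3 and y = 1, matching the solution above.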
CYP4 Solve using Cramer’s rule:

A) 2x + 5y = 7
   3x – 2y = 1

B) 3x + 4y = -2
   5x – 7y = 1
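Cramer’s rule for 3 × 3 systems.

The solution of the system of equations
\[
\begin{aligned}
a_1 x + b_1 y + c_1 z &= d_1 \\
a_2 x + b_2 y + c_2 z &= d_2 \\
a_3 x + b_3 y + c_3 z &= d_3
\end{aligned}
\]
is given by
\[
x = \frac{D_x}{D}, \qquad y = \frac{D_y}{D}, \qquad z = \frac{D_z}{D},
\]
where
\[
D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}, \qquad
D_x = \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix},
\]
\[
D_y = \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}, \qquad
D_z = \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}, \qquad \text{and } D \neq 0.
\]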
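Example: Solve using Cramer’s rule:

A) x – 3y + 7z = 13
   x + y + z = 1
   x – 2y + 3z = 4

B) x + y + z = 9
   2x + 5y + 7z = 52
   2x + y – z = 0

Solution: A) We have
\[
D = \begin{vmatrix} 1 & -3 & 7 \\ 1 & 1 & 1 \\ 1 & -2 & 3 \end{vmatrix} = -10, \qquad
D_x = \begin{vmatrix} 13 & -3 & 7 \\ 1 & 1 & 1 \\ 4 & -2 & 3 \end{vmatrix} = 20,
\]
\[
D_y = \begin{vmatrix} 1 & 13 & 7 \\ 1 & 1 & 1 \\ 1 & 4 & 3 \end{vmatrix} = -6, \qquad
D_z = \begin{vmatrix} 1 & -3 & 13 \\ 1 & 1 & 1 \\ 1 & -2 & 4 \end{vmatrix} = -24.
\]
Then
\[
x = \frac{D_x}{D} = \frac{20}{-10} = -2, \qquad
y = \frac{D_y}{D} = \frac{-6}{-10} = \frac{3}{5}, \qquad
z = \frac{D_z}{D} = \frac{-24}{-10} = \frac{12}{5}.
\]
Therefore, the solution is $(-2, \tfrac{3}{5}, \tfrac{12}{5})$ and the solution set is $\{(-2, \tfrac{3}{5}, \tfrac{12}{5})\}$.

B) We have
\[
D = \begin{vmatrix} 1 & 1 & 1 \\ 2 & 5 & 7 \\ 2 & 1 & -1 \end{vmatrix} = -4 \neq 0, \qquad
D_x = \begin{vmatrix} 9 & 1 & 1 \\ 52 & 5 & 7 \\ 0 & 1 & -1 \end{vmatrix} = -4,
\]
\[
D_y = \begin{vmatrix} 1 & 9 & 1 \\ 2 & 52 & 7 \\ 2 & 0 & -1 \end{vmatrix} = -12, \qquad
D_z = \begin{vmatrix} 1 & 1 & 9 \\ 2 & 5 & 52 \\ 2 & 1 & 0 \end{vmatrix} = -20.
\]
Then
\[
x = \frac{D_x}{D} = \frac{-4}{-4} = 1, \qquad
y = \frac{D_y}{D} = \frac{-12}{-4} = 3, \qquad
z = \frac{D_z}{D} = \frac{-20}{-4} = 5.
\]
Hence, the solution is ( 1 , 3 , 5 ), and the solution set is {( 1 , 3 , 5 )}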
Note: 1) When D = 0, Cramer’s rule cannot be used. If D = 0 and Dx, Dy, and Dz are all 0,
the system is dependent. If D = 0 and one of Dx, Dy, or Dz is not 0, then the
system is inconsistent. These are also true for a 2 × 2 system.
2) Cramer’s rule can be extended to n × n systems of equations.
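As the note says, the rule extends to n × n systems: for each variable, replace the corresponding column of the coefficient matrix by the constants and divide the resulting determinant by D. The Python sketch below shows this under stated assumptions: the names `det` and `cramer` are illustrative, the determinant is computed by first-row cofactor expansion, and the case D = 0 is simply reported rather than classified as dependent or inconsistent.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum((-1) ** j * matrix[0][j]
               * det([row[:j] + row[j + 1:] for row in matrix[1:]])
               for j in range(len(matrix)))


def cramer(coeffs, consts):
    """Solve an n x n linear system by Cramer's rule.

    coeffs: n x n list of lists of coefficients; consts: list of n constants.
    """
    d = det(coeffs)
    if d == 0:
        raise ValueError("D = 0, so Cramer's rule cannot be used")
    solution = []
    for k in range(len(coeffs)):
        # Replace column k of the coefficient matrix by the constants.
        replaced = [row[:k] + [consts[i]] + row[k + 1:]
                    for i, row in enumerate(coeffs)]
        solution.append(Fraction(det(replaced), d))
    return solution


# Worked examples A) and B) from the text.
sol_a = cramer([[1, -3, 7], [1, 1, 1], [1, -2, 3]], [13, 1, 4])
sol_b = cramer([[1, 1, 1], [2, 5, 7], [2, 1, -1]], [9, 52, 0])
print(sol_a, sol_b)
```

Cofactor expansion becomes slow for large n, so this approach is mainly of interest for the small systems treated in this unit.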
CYP5 Solve the following system of equations using Cramer’s rule.
x – 3y – 2z = 9
3x + 2y + 6z = 20
4x – y + 3z = 25
27.4 INVERSE OF A MATRIX
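Definition:
The cofactor matrix is defined to be the matrix obtained by replacing every number $a_{ij}$ of the
given matrix A by its cofactor in the determinant of A.

Example 1: Let
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 1 \\ 2 & 4 & 0 \end{pmatrix};
\]
then the cofactor matrix of A is given by:
\[
\begin{pmatrix}
\begin{vmatrix} 5 & 1 \\ 4 & 0 \end{vmatrix} & -\begin{vmatrix} 4 & 1 \\ 2 & 0 \end{vmatrix} & \begin{vmatrix} 4 & 5 \\ 2 & 4 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 2 & 3 \\ 4 & 0 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 2 & 0 \end{vmatrix} & -\begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} \\[2ex]
\begin{vmatrix} 2 & 3 \\ 5 & 1 \end{vmatrix} & -\begin{vmatrix} 1 & 3 \\ 4 & 1 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix}
\end{pmatrix}
=
\begin{pmatrix} -4 & 2 & 6 \\ 12 & -6 & 0 \\ -13 & 11 & -3 \end{pmatrix}
\]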
Definition:
Let A be a matrix and let C be its cofactor matrix. Then the transpose $C^t$ of C is called
the adjoint of A (written, in short, Adj A).

Example 2: For the above example (Example 1),
\[
C = \begin{pmatrix} -4 & 2 & 6 \\ 12 & -6 & 0 \\ -13 & 11 & -3 \end{pmatrix}
\quad \text{and} \quad
C^t = \begin{pmatrix} -4 & 12 & -13 \\ 2 & -6 & 11 \\ 6 & 0 & -3 \end{pmatrix}.
\]
Therefore,
\[
\operatorname{Adj} A = C^t = \begin{pmatrix} -4 & 12 & -13 \\ 2 & -6 & 11 \\ 6 & 0 & -3 \end{pmatrix}.
\]
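The inverse of a matrix A, denoted by $A^{-1}$, is given by the formula
\[
A^{-1} = \frac{1}{|A|} \operatorname{Adj} A, \quad \text{where } |A| \neq 0 \text{ and } |A| \text{ is the determinant of the matrix } A.
\]

Example 3: Find the inverse of the matrix
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 1 \\ 2 & 4 & 0 \end{pmatrix}.
\]
Solution: We notice that A is a square matrix.
\[
\det A = 1 \begin{vmatrix} 5 & 1 \\ 4 & 0 \end{vmatrix} - 2 \begin{vmatrix} 4 & 1 \\ 2 & 0 \end{vmatrix} + 3 \begin{vmatrix} 4 & 5 \\ 2 & 4 \end{vmatrix}
= -4 + 4 + 3 \times 6 = 18 \neq 0,
\]
hence A is invertible.
We have already calculated the adjoint of A (in Example 2 above); thus we have
\[
A^{-1} = \frac{1}{|A|} \operatorname{Adj} A
= \frac{1}{18} \begin{pmatrix} -4 & 12 & -13 \\ 2 & -6 & 11 \\ 6 & 0 & -3 \end{pmatrix}
= \begin{pmatrix} -\frac{4}{18} & \frac{12}{18} & -\frac{13}{18} \\[1ex] \frac{2}{18} & -\frac{6}{18} & \frac{11}{18} \\[1ex] \frac{6}{18} & 0 & -\frac{3}{18} \end{pmatrix}
= \begin{pmatrix} -\frac{2}{9} & \frac{2}{3} & -\frac{13}{18} \\[1ex] \frac{1}{9} & -\frac{1}{3} & \frac{11}{18} \\[1ex] \frac{1}{3} & 0 & -\frac{1}{6} \end{pmatrix}
\]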
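Example 4: Find the inverse of the matrix
\[
A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ 3 & 2 & 2 \end{pmatrix}.
\]
Solution:
\[
|A| = 1 \begin{vmatrix} 3 & 2 \\ 2 & 2 \end{vmatrix} - 2 \begin{vmatrix} 2 & 2 \\ 3 & 2 \end{vmatrix} + 1 \begin{vmatrix} 2 & 3 \\ 3 & 2 \end{vmatrix}
= 2 + 4 - 5 = 1 \neq 0
\]
The cofactor matrix is
\[
C = \begin{pmatrix}
\begin{vmatrix} 3 & 2 \\ 2 & 2 \end{vmatrix} & -\begin{vmatrix} 2 & 2 \\ 3 & 2 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 3 & 2 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 2 & 1 \\ 2 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 3 & 2 \end{vmatrix} & -\begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix} \\[2ex]
\begin{vmatrix} 2 & 1 \\ 3 & 2 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 2 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ 2 & 3 \end{vmatrix}
\end{pmatrix}
= \begin{pmatrix} 2 & 2 & -5 \\ -2 & -1 & 4 \\ 1 & 0 & -1 \end{pmatrix}
\]
\[
\operatorname{Adj} A = C^t = \begin{pmatrix} 2 & -2 & 1 \\ 2 & -1 & 0 \\ -5 & 4 & -1 \end{pmatrix}
\]
Hence,
\[
A^{-1} = \frac{1}{|A|} \operatorname{Adj} A = \frac{1}{1} \begin{pmatrix} 2 & -2 & 1 \\ 2 & -1 & 0 \\ -5 & 4 & -1 \end{pmatrix}
= \begin{pmatrix} 2 & -2 & 1 \\ 2 & -1 & 0 \\ -5 & 4 & -1 \end{pmatrix}
\]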
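The three steps used in Examples 3 and 4 (cofactor matrix, transpose to get the adjoint, divide by the determinant) can be sketched in Python as follows. The function names are illustrative assumptions rather than anything defined in the module; the code assumes a square matrix of order at least 2 and uses exact fractions so the entries match the hand-computed ones.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cofactor_matrix(m):
    """Replace every entry by its cofactor (order n >= 2 assumed)."""
    n = len(m)
    return [[(-1) ** (i + j)
             * det([row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i])
             for j in range(n)]
            for i in range(n)]


def adjoint(m):
    """Adjoint = transpose of the cofactor matrix."""
    return [list(col) for col in zip(*cofactor_matrix(m))]


def inverse(m):
    """Inverse = (1 / det) * adjoint, defined only when det != 0."""
    d = det(m)
    if d == 0:
        raise ValueError("determinant is 0, so the matrix has no inverse")
    return [[Fraction(v, d) for v in row] for row in adjoint(m)]


A3 = [[1, 2, 3], [4, 5, 1], [2, 4, 0]]   # Example 3
A4 = [[1, 2, 1], [2, 3, 2], [3, 2, 2]]   # Example 4
for row in inverse(A3):
    print([str(v) for v in row])
```

For Example 3 the first printed row is -2/9, 2/3, -13/18, in agreement with the result above.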
CYP6 Find the inverse of the following matrices:

\[
\text{A) } \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix}
\qquad
\text{B) } \begin{pmatrix} 9 & 7 & 3 \\ 5 & -1 & 4 \\ 6 & 8 & 2 \end{pmatrix}
\]
27.5 SOLUTION OF SYSTEM OF EQUATIONS USING THE INVERSE MATRICES (OPTIONAL)
Matrix Equation: We can write a matrix equation equivalent to a system of equations.
Example: Write a matrix equation equivalent to the following system of equations.
4x + 2y – z = 3
9x + z = 5
4x + 5y – 2z = 1
Solution: We write the coefficients on the left in a matrix. We then write the product of
that matrix and the column matrix containing the variables, and set the result equal to the
column matrix containing the constants on the right.
\[
\begin{pmatrix} 4 & 2 & -1 \\ 9 & 0 & 1 \\ 4 & 5 & -2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}
\]
If we let
\[
A = \begin{pmatrix} 4 & 2 & -1 \\ 9 & 0 & 1 \\ 4 & 5 & -2 \end{pmatrix}, \quad
X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \quad \text{and} \quad
B = \begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix},
\]
we can write this matrix equation as AX = B. Systems of linear equations can be solved
using a matrix equation of the form AX = B.
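Consider the matrix equation
\[
AX = B.
\]
Then, if $A^{-1}$ exists, we have
\[
\begin{aligned}
A^{-1}AX &= A^{-1}B \\
\Rightarrow \quad IX &= A^{-1}B \qquad (A^{-1}A = I, \text{ the identity matrix}) \\
\Rightarrow \quad X &= A^{-1}B \qquad (\text{property of the identity})
\end{aligned}
\]
Suppose we have the following system of equations.
x – 5y = 2
2x + y = 4
We can write this system as follows:
\[
\begin{pmatrix} 1 & -5 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}
\]
That is, AX = B, where
\[
A = \begin{pmatrix} 1 & -5 \\ 2 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \text{and} \quad B = \begin{pmatrix} 2 \\ 4 \end{pmatrix}.
\]
If $|A| \neq 0$, so that $A^{-1}$ exists, the system has the unique solution $X = A^{-1}B$.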
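Example 1: Use the inverse matrix to solve the following system.

x – y = 2
x + y = 4
Solution: Write the matrix equation AX = B:
\[
\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}
\]
\[
A^{-1} = \frac{1}{|A|} \operatorname{adj} A
= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\[1ex] -\frac{1}{2} & \frac{1}{2} \end{pmatrix}
\quad \text{(see inverse matrix)}
\]
Therefore, AX = B implies $X = A^{-1}B$:
\[
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\[1ex] -\frac{1}{2} & \frac{1}{2} \end{pmatrix} \begin{pmatrix} 2 \\ 4 \end{pmatrix}
= \begin{pmatrix} \frac{1}{2} \times 2 + \frac{1}{2} \times 4 \\[1ex] -\frac{1}{2} \times 2 + \frac{1}{2} \times 4 \end{pmatrix}
= \begin{pmatrix} 3 \\ 1 \end{pmatrix}
\]
So x = 3 and y = 1.
Hence, the solution is (3 , 1)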
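Example 2: Solve using the inverse matrix.

x – 3y + 7z = 13
x + y + z = 1
x – 2y + 3z = 4
Solution: Write the matrix representation of the system.
\[
\begin{pmatrix} 1 & -3 & 7 \\ 1 & 1 & 1 \\ 1 & -2 & 3 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} 13 \\ 1 \\ 4 \end{pmatrix}
\]
That is, AX = B, where
\[
A = \begin{pmatrix} 1 & -3 & 7 \\ 1 & 1 & 1 \\ 1 & -2 & 3 \end{pmatrix}, \quad
X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \quad \text{and} \quad
B = \begin{pmatrix} 13 \\ 1 \\ 4 \end{pmatrix}.
\]
We have $A^{-1} = \frac{1}{|A|} \operatorname{adj} A$. The cofactor matrix of A is
\[
C = \begin{pmatrix}
\begin{vmatrix} 1 & 1 \\ -2 & 3 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 1 & 3 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 1 & -2 \end{vmatrix} \\[2ex]
-\begin{vmatrix} -3 & 7 \\ -2 & 3 \end{vmatrix} & \begin{vmatrix} 1 & 7 \\ 1 & 3 \end{vmatrix} & -\begin{vmatrix} 1 & -3 \\ 1 & -2 \end{vmatrix} \\[2ex]
\begin{vmatrix} -3 & 7 \\ 1 & 1 \end{vmatrix} & -\begin{vmatrix} 1 & 7 \\ 1 & 1 \end{vmatrix} & \begin{vmatrix} 1 & -3 \\ 1 & 1 \end{vmatrix}
\end{pmatrix}
= \begin{pmatrix} 5 & -2 & -3 \\ -5 & -4 & -1 \\ -10 & 6 & 4 \end{pmatrix}
\]
\[
\operatorname{Adj} A = C^t = \begin{pmatrix} 5 & -5 & -10 \\ -2 & -4 & 6 \\ -3 & -1 & 4 \end{pmatrix}
\]
With $|A| = -10$,
\[
A^{-1} = \frac{1}{|A|} \operatorname{adj} A
= \frac{1}{-10} \begin{pmatrix} 5 & -5 & -10 \\ -2 & -4 & 6 \\ -3 & -1 & 4 \end{pmatrix}
= \begin{pmatrix} -\frac{1}{2} & \frac{1}{2} & 1 \\[1ex] \frac{1}{5} & \frac{2}{5} & -\frac{3}{5} \\[1ex] \frac{3}{10} & \frac{1}{10} & -\frac{2}{5} \end{pmatrix}
\]
Then $X = A^{-1}B$:
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} -\frac{1}{2} & \frac{1}{2} & 1 \\[1ex] \frac{1}{5} & \frac{2}{5} & -\frac{3}{5} \\[1ex] \frac{3}{10} & \frac{1}{10} & -\frac{2}{5} \end{pmatrix}
\begin{pmatrix} 13 \\ 1 \\ 4 \end{pmatrix}
= \begin{pmatrix} -\frac{1}{2} \times 13 + \frac{1}{2} \times 1 + 1 \times 4 \\[1ex] \frac{1}{5} \times 13 + \frac{2}{5} \times 1 - \frac{3}{5} \times 4 \\[1ex] \frac{3}{10} \times 13 + \frac{1}{10} \times 1 - \frac{2}{5} \times 4 \end{pmatrix}
= \begin{pmatrix} -2 \\[1ex] \frac{3}{5} \\[1ex] \frac{12}{5} \end{pmatrix}
\]
So $x = -2$, $y = \frac{3}{5}$, and $z = \frac{12}{5}$.
Therefore the solution of the given system is $(-2, \frac{3}{5}, \frac{12}{5})$ and the solution set is
$\{(-2, \frac{3}{5}, \frac{12}{5})\}$.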
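A quick way to check a result obtained with X = A⁻¹B is to multiply the inverse by B and then substitute the answer back into AX = B. The short Python sketch below does this for Example 2; the helper name `mat_vec` is an illustrative assumption, and the inverse is typed in from the worked solution rather than recomputed.

```python
from fractions import Fraction as F


def mat_vec(m, v):
    """Product of a matrix (list of rows) and a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


A = [[1, -3, 7], [1, 1, 1], [1, -2, 3]]
B = [13, 1, 4]

# Inverse of A as obtained in Example 2 (adjoint divided by |A| = -10).
A_inv = [[F(-1, 2), F(1, 2), F(1)],
         [F(1, 5), F(2, 5), F(-3, 5)],
         [F(3, 10), F(1, 10), F(-2, 5)]]

X = mat_vec(A_inv, B)          # X = A^-1 B
print([str(v) for v in X])     # the solution as exact fractions
print(mat_vec(A, X) == B)      # substituting back should reproduce B
```

The substitution step is worth keeping in hand calculations too, since an arithmetic slip in any cofactor shows up immediately as a mismatch with B.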
CYP7 Use inverse matrix to solve:
A) x + 2y + 3z = 11
2x + 4y + 5z = 21
3x + 5y + 6z = 27
B) x + 2y + z = 8
2x + 3y + 2z = 14
3x + 2y + 2z = 13
C) 3x – y + z = 2
-15x + 6y – 5z = 5
5x – 2y + 2z = 3
27.6 SUMMARY
• For each n × n matrix A there is a real number called the determinant of A.
• If A is a matrix with det A = |A| ≠ 0, then the inverse of A is given by
\[
A^{-1} = \frac{1}{|A|} \operatorname{adj} A,
\]
where Adj A = $C^t$ and C is called the cofactor matrix of A.
• The solution of a system of n linear equations in n variables is given by
\[
x = \frac{D_x}{D}, \quad y = \frac{D_y}{D}, \quad z = \frac{D_z}{D}, \ \dots
\]
where D is the determinant of the matrix of coefficients of the variables (D ≠ 0) and
Dx, Dy, Dz, … are derived from D by replacing the coefficients of x, y, z, …
respectively, by the constants. This method is called Cramer’s Rule.
27.7 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS (CYP)
CYP1 A) 5 B) 6 C) 95 D) –60
CYP2 A) M41 = -14 , M44 = 7 B) M12 = 32
C) A41 = 14 , A44 = 7 D) A12 = -32
CYP3 i) det A = 110 ii) det A = 5
CYP4 A) S.S. = {(1, 1)}
B) S.S. = {(-10/41, -13/41)}
CYP5 (x , y , z) = (2 , -5 , 4)
CYP6
\[
\text{A) } \begin{pmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}
\qquad
\text{B) } \begin{pmatrix} \frac{17}{35} & -\frac{1}{7} & -\frac{31}{70} \\[1ex] -\frac{1}{5} & 0 & \frac{3}{10} \\[1ex] -\frac{23}{35} & \frac{3}{7} & \frac{22}{35} \end{pmatrix}
\]
CYP7 A) (x , y , z) = ( 2 , 3 , 1) B) (x , y , z) = ( 1 , 2 , 3)
C) (x , y , z) = ( 1 , 15 , 14)
Section A. Short Answer

Instruction: Write a short and precise answer in the space provided.
1. If
\[
\begin{vmatrix} 1 & 3 & 4 \\ 5 & 15 & 10 \\ 1 & x & 2 \end{vmatrix} = 80,
\]
then x = __________________
2. Given the matrix
\[
A = \begin{pmatrix} 3 & 0 & 4 \\ 2 & -1 & 3 \\ 4 & 1 & 0 \end{pmatrix}
\]
Then A) M12 = ________________ C) A23 = ____________________
B) M32 = ________________ D) A33 = _____________________
Section B: Work Out Questions

Instruction: For each of the following questions show your work clearly.
1. Given the matrix
\[
A = \begin{pmatrix} 3 & 0 & 4 \\ 2 & -1 & 3 \\ 4 & 1 & 0 \end{pmatrix}
\]
Then, A) Find det A.
B) Determine the cofactor matrix of A.
C) Find the adjoint of A.
D) Find the inverse of A.
2. Given the following system of equations
x + 2y + 3z = 14
3x + y + 2z = 11
2x + 3y + z = 1
Then, A) solve using Cramer’s Rule B) solve using the inverse matrix
27.9 REFERENCES
- Understanding Pre-College Mathematics, Mesaye Demessie, 2001
- Business Mathematics, Quazi Zamerudin, V.K Khanna, S.K Bhambri, 1980
UNIT 21: CORRELATION
Contents
21.1 Introduction
21.2 Definition of Correlation
21.3 Types of Correlation
21.4 Scatter Diagram
21.5 Degree of Correlation
21.6 Measuring Simple Linear Correlation
21.7 Measuring Rank Correlation
21.8 Summary
21.11 Glossary
21.12 References
The aim of this unit is to explain the nature, scope and methods of studying correlation.
We introduce two measures relating to correlation.
After going through this unit, the student should be able to:
• Define correlation.
• Distinguish between simple and multiple, positive and negative, and linear and
non-linear correlation.
• Recognize the degree of correlation between variables.
• Explain the method of studying correlation by the scatter diagram method.
• Compute correlation by applying the formula of Karl Pearson.
• Calculate the coefficient of correlation by the rank correlation method.
21.1 INTRODUCTION
Most of the methods we have developed so far deal with one variable only. Their scope
was strictly confined to the values of a single variable. For example, measures of central
tendency, variation and skewness all describe the values of a single variable. Those
statistical measures are important for comparison and analysis, but they are not useful
in looking for a quantitative relationship between variables. However, often several
different characteristics are measured on each member of a sample, and it may be of
great interest to ask whether the variables are interrelated.
21.2 DEFINITION OF CORRELATION
A businessperson may want to know whether the volume of sales for a given month is
related to the amount the firm spends on advertising during that month. Educators, on the
other hand, are interested in determining whether the number of hours a student studies is
related to the student’s score in a particular exam. Medical researchers, in their
professional field, are interested in questions such as ‘Is caffeine related to heart
damage?’ These are only a few of the many issues that can be addressed by using the
technique of correlation analysis.
Correlation is a statistical method used to determine whether a relationship between
variables exists.
Correlation can be defined as an analysis that deals with the association between two or
more variables.
Correlation is an analysis which attempts to determine the degree of relationship
between variables.
21.3 TYPES OF CORRELATION
On the basis of the nature of the relation between the variables, correlation may be
categorized as follows:
A) Simple and Multiple Correlation
B) Positive and Negative Correlation
C) Linear and Non-Linear Correlation
A) Simple and Multiple Correlation
In a simple relationship, there are only two variables under study.
Example: 21.3.0
A manager may wish to see whether the number of years the sales people have been
working for the company has anything to do with the amount of sales of the
representatives. The only two variables are: years of experience and amount of sales.
In multiple relationships, many variables are under study.
Example: 21.3.1
An educator may wish to investigate the relationship between a student’s success in the
University and factors such as the number of hours spent studying, the student’s IQ,
and the background of the student. This type of study involves several variables:
success, hours spent, IQ and background.
B) Positive and Negative Correlation

Simple correlation can also be positive or negative.
A positive correlation exists when both variables increase or decrease at the same time.
Example 21.3.2
A firm’s sales volume and advertisement are related, and the relationship is positive,
since the more the firm advertises, the higher its sales volume generally is.
In a negative correlation, as one variable increases the other variable decreases, and vice
versa.
Example: 21.3.3
If one compares the strength of people over 60 years of age, one will find that as age
increases, the strength generally decreases.
C) Linear and Non-Linear Correlation
Linear and non-linear correlation are distinguished by the amount of change in the
values of the variables.
If the amount of change in one variable bears a constant ratio to the amount of change
in the other variable, the correlation is known as linear correlation.
Example: 21.1.3.4
The variable X: 5, 10, 15, 20, 25, 30
The variable Y: 20, 30, 40, 50, 60, 70
Here every increase of 5 in X is accompanied by an increase of 10 in Y.
If the amount of change in one variable does not bear a constant ratio to the amount of
change in the other variable, the correlation is said to be non-linear.
Example: 21.1.3.5
The variable X: 10, 12, 30, 70, 74
The variable Y: 40, 46, 50, 75, 90
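One way to make the distinction concrete is to compare, for consecutive observations, the change in Y with the change in X. If this ratio stays the same throughout, the points lie on a straight line. The Python sketch below applies this reading to the two examples; the function names are illustrative assumptions, and the test only suits exact data like these, not noisy sample data.

```python
from fractions import Fraction


def change_ratios(xs, ys):
    """Ratio (change in Y) / (change in X) between consecutive observations."""
    return [Fraction(ys[k + 1] - ys[k], xs[k + 1] - xs[k])
            for k in range(len(xs) - 1)]


def is_exactly_linear(xs, ys):
    """True when every consecutive change ratio is the same."""
    return len(set(change_ratios(xs, ys))) == 1


linear_x, linear_y = [5, 10, 15, 20, 25, 30], [20, 30, 40, 50, 60, 70]
curved_x, curved_y = [10, 12, 30, 70, 74], [40, 46, 50, 75, 90]

print([str(r) for r in change_ratios(linear_x, linear_y)])  # constant ratio
print([str(r) for r in change_ratios(curved_x, curved_y)])  # ratio varies
```

For the first data set every ratio is 2; for the second the ratios are 3, 2/9, 5/8 and 15/4, so the relationship is not linear.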
CYP1
Explain simple linear correlation
21.4 SCATTER DIAGRAM
In simple correlation, the researcher collects data on two variables to see whether a
relationship exists between the variables.
Example 21.1.4.0
If a researcher wishes to see whether there is a relationship between the number of hours
studied by students and their test scores on an exam, he or she must select a random
sample of students, determine the hours each studied, and obtain their grades on the
exam.
Table 21.1.4.0
Student        Hours Studied (X)    Grade out of 100 (Y)
Alemu          5                    82
Challa         1                    60
Daniel         4                    87
Hailemichel    1                    68
Fantaye        3                    74
• The two variables in this investigation are: hours studied (X) and grade
obtained out of 100 (Y).
• These two variables are called the independent variable and the dependent
variable.
The independent variable is the variable that can be controlled or manipulated.
Hence, in Example 21.1.4.0 the variable “the number of hours studied” is the
independent variable. It is denoted as the X-variable.
The dependent variable is the variable that cannot be controlled or manipulated.
Thus, the grade the students received on the exam, in Example 21.1.4.0, is the dependent
variable, designated as the Y-variable.
Remark: The reason for the distinction between the variables is that one assumes that the
grade the student earns (Y) depends on the number of hours the student studied (X).
A scatter diagram is a graph of the independent and dependent variables in correlation
analysis.
Since the scatter diagram is an aid for understanding the correlation techniques, after the
plot is drawn it should be analyzed to determine which type of relationship, if any, exists.
The dots plotted on the graph are read as follows:
• Closeness of the dots to one another along a line shows a high degree of correlation.
• If the points on the diagram rise from the lower left hand corner to the upper right
hand corner, the correlation is said to be positive.
• Correlation is said to be negative if the points indicate a decreasing tendency from
the upper left hand corner to the lower right hand corner.
• If all the points lie on a straight line in a positive correlation, the correlation is
said to be perfectly positive, that is, r = +1.
• If the correlation is negative and all the points fall on a straight line, it is said to be
perfectly negative, that is, r = -1.
• If the plotted dots lie in a haphazard manner, correlation is said to be absent, i.e.
no correlation.
The following is a diagrammatical illustration:
[Figure 21.1.4.0: scatter-diagram panels, each with axes X and Y. (a) Perfect positive
correlation (r = +1); (b) Perfect negative correlation (r = -1); (c) High degree of positive
correlation (r is near to +1); Low degree of positive correlation (r is close to 0 from the
positive side); Low degree of negative correlation (r is close to 0 from the negative side);
No correlation (r = 0); High degree of negative correlation (r is close to –1)]
Example 21.1.4.1
Construct a scatter diagram for the data obtained in Example 21.1.4.0.
Solution: Step 1: Draw and label the X and Y axes.
Step 2: Plot each ordered pair (X, Y) on the graph.
[Figure 21.1.4.1: scatter diagram of Grade Obtained (Y axis, scale 20 to 100) against
Hours Studied (X axis, scale 1 to 7)]
Figure 21.1.4.1 suggests a positive relationship (correlation), since as a student’s
study hours increase, the grade scored by the student tends to increase as well (it
resembles Figure 21.1.4.0(c)).
CYP2
A) Construct a scatter diagram for the data obtained in a study on the number of absences
and the mark scored of seven randomly selected students from a statistics class.
Table 21.1.4.1
Student    Number of Absences (X)    Mark Scored (Y)
A          6                         82
B          2                         86
C          15                        43
D          9                         74
E          12                        58
F          1                         90
G          8                         78
B) What type of correlation do you observe from the scatter diagram?
21.5 DEGREE OF CORRELATION
The extent of the relationship between the variables is measured by a statistic known as
the correlation coefficient. By its formula, the correlation coefficient always lies
between –1 and +1. The algebraic sign (+) indicates a positive relationship between the
variables, and the sign (-) denotes a negative relationship. If no linear relationship
exists between the variables under study, the correlation coefficient will be zero. The
values +1 and –1 denote perfect positive and perfect negative correlation respectively.
If the correlation coefficient is close to +1, we may say there is a high degree of
positive correlation. On the other hand, if the correlation coefficient is near to –1, we can
say there is a high degree of negative correlation.
We say the relationship is perfectly positive if an increase or decrease in one variable is
accompanied by the same amount of increase or decrease in the other variable.
Perfectly negative implies that an increase or decrease in one variable is accompanied by
the same amount of change in the other variable in the reverse direction.
The correlation coefficient computed from the sample data measures the strength and
direction of a relationship between two variables. The symbol for the sample correlation
coefficient is r.
CYP3
If the value of correlation coefficient is 0, then the given variables are

A) Positively related
B) Negatively related
C) Perfectly correlated
D) Not interdependent
E) None
21.6 MEASURING SIMPLE LINEAR CORRELATION
We use a measure called the correlation coefficient to determine the strength of the
relationship between two variables. There are several ways to compute the value of the
correlation coefficient. One simple method is to use the formula shown below.
Formula for the correlation coefficient r
\[
r = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}
{\sqrt{\left[N \sum X^2 - \left(\sum X\right)^2\right]\left[N \sum Y^2 - \left(\sum Y\right)^2\right]}}
\]
where N is the number of data pairs.
Formula 21.1.6.0 is called Karl Pearson’s coefficient of correlation.
Where ΣX = total of the X series
ΣY = total of the Y series
ΣX² = sum of the squares of the X series
ΣY² = sum of the squares of the Y series
ΣXY = sum of the products of the X and Y series
N = number of pairs observed
Example 21.1.6.0
A manager wishes to find out whether there is a relationship between the number of
radio advertisements aired per week and the amount of sales (in hundreds of Birr) of a
product. The data for the sample are given below.
Number of Advertisements (X)    2    5    8    8    10    12
Sales (Y)                       2    4    7    6    9     10
To know the relationship, we have to compute the value of the correlation coefficient for
the sample data.
Solution:
Step 1: Make a table with the columns shown below and enter the X and Y values.
Number of Advertisements (X)    Sales (Y)    XY    X²    Y²
2                               2
5                               4
8                               7
8                               6
10                              9
12                              10
Step 2: Find the product of the X and Y values and place the products in the column
labeled XY.
Step 3: Square the X values and place them in the column labeled X².
Step 4: Square the Y values and place them in the column labeled Y².
Step 5: Find the sum of each column. The completed table is given below.
Number of Advertisements (X)    Sales (Y)    XY           X²           Y²
2                               2            4            4            4
5                               4            20           25           16
8                               7            56           64           49
8                               6            48           64           36
10                              9            90           100          81
12                              10           120          144          100
ΣX = 45                         ΣY = 38      ΣXY = 338    ΣX² = 401    ΣY² = 286
Here the number of data pairs is N = 6.

Step 6: Substitute in Formula 21.1.6.0 and solve for r.
\[
r = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}
{\sqrt{\left[N \sum X^2 - \left(\sum X\right)^2\right]\left[N \sum Y^2 - \left(\sum Y\right)^2\right]}}
= \frac{6(338) - (45)(38)}{\sqrt{\left[6(401) - (45)^2\right]\left[6(286) - (38)^2\right]}}
= \frac{318}{\sqrt{(381)(272)}} \approx 0.988
\]
The correlation coefficient indicates a strong positive correlation.
Therefore, the relationship between the number of radio advertisements aired per week
and the amount of sales (in hundreds of Birr) is strong and positive.
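The six steps of this example map directly onto a few lines of code. The following Python sketch computes Karl Pearson's coefficient from the column totals exactly as in the formula; the function name `pearson_r` is an illustrative assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from paired observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # column XY
    sum_x2 = sum(x * x for x in xs)               # column X^2
    sum_y2 = sum(y * y for y in ys)               # column Y^2
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


adverts = [2, 5, 8, 8, 10, 12]   # X: radio advertisements per week
sales = [2, 4, 7, 6, 9, 10]      # Y: sales
r = pearson_r(adverts, sales)
print(round(r, 3))
```

With these data the numerator is 318 and the denominator is the square root of 381 × 272, so r evaluates to about 0.988.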
CYP 4
A) Calculate the correlation coefficient for the data obtained in a study on the number of
hours a person exercises each week and the amount of milk (in liter) each person
consumes per week.
The data follow.
Subject    Hours of Exercise (X)    Amount of Milk (Y)
A          3                        48
B          0                        8
C          2                        32
D          5                        64
E          8                        10
F          5                        32
G          10                       56
H          2                        72
I          1                        48
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________
B) What type of correlation is it?

________________________________________________________________________
________________________________________________________________________
____________
21.7 MEASURING RANK CORRELATION
We have so far assumed that X and Y can both be measured on a continuous scale and that
they are jointly normally distributed. However, neither of these assumptions may be
safe to make. The rank correlation coefficient is used when the data are not
normal or when the shape of the distribution is not known. It is especially useful for
variables such as beauty or leadership ability and, more generally, for items that cannot
be expressed in quantitative terms but can be ranked. Spearman studied such variables
and developed a method of finding the correlation between two of them. As a result,
this method is called Spearman’s rank correlation coefficient. The formula is presented
next.
Formula for the rank correlation coefficient rs
\[
r_s = 1 - \frac{6 \sum d^2}{n\left(n^2 - 1\right)}
\]
where d = difference between corresponding ranks
n = number of pairs observed
Formula 21.1.7.0 (Spearman’s rank correlation coefficient)
Ranks are given to all the items in the distribution in increasing or decreasing order of
their magnitude. One way of ranking is to assign the first rank to the highest value in
the distribution, the second rank to the next highest value, and so on.
Example 21.1.7.0
A University wishes to establish whether there is a connection between academic and
sporting achievements. Eight pupils are selected and ranked. Calculate the Spearman’s
rank correlation coefficient.
Pupils           A    B    C    D    E    F    G    H
Academic Rank    2    3    8    5    6    1    4    7
Sporting Rank    1    8    4    6    7    2    5    3
Solution:
Step 1: Make a table as shown below and find Σd².
Academic Rank    Sporting Rank    d (deviation)    d²
2                1                1                1
3                8                -5               25
8                4                4                16
5                6                -1               1
6                7                -1               1
1                2                -1               1
4                5                -1               1
7                3                4                16
                                                   Σd² = 62
Step 2: Substitute in Formula 21.1.7.0 and solve for rs, with n = 8.
\[
r_s = 1 - \frac{6 \sum d^2}{n\left(n^2 - 1\right)} = 1 - \frac{6(62)}{8\left(8^2 - 1\right)} = 1 - \frac{372}{504} \approx 0.26
\]
The connection between academic and sporting achievement is a weak positive correlation.
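The same calculation can be sketched in Python. The function below assumes that both inputs are already ranks with no ties, as in this example; the name `spearman_rs` is an illustrative assumption.

```python
def spearman_rs(ranks_x, ranks_y):
    """Spearman's rank correlation coefficient from two lists of ranks.

    Assumes the ranks are given directly and contain no ties.
    """
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))  # sum of d^2
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


academic = [2, 3, 8, 5, 6, 1, 4, 7]
sporting = [1, 8, 4, 6, 7, 2, 5, 3]
rs = spearman_rs(academic, sporting)
print(round(rs, 2))
```

Here the sum of squared rank differences is 62, so the result is 1 - 372/504, which rounds to 0.26. When the ranks of the two series agree exactly, every d is 0 and the function returns 1.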
CYP5
According to rank correlation coefficient, all the calculations are based on the original
value of the observations rather than the ranks assigned to them.
A) (Say true or false) _____________________.
B) If your answer is false, why?
________________________________________________________________________
________________________________________________________________________
____________
CYP6
Find the Spearman’s rank correlation coefficient from the following data in respect of
marks scored by 10 students in final exam out of 100 in English and Mathematics.
Students           A     B     C     D     E     F     G     H     I     J
Mark in English    50    60    65    30    40    35    70    75    80    45
Mark in Math       45    55    60    40    45    60    58    62    72    76
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________
21.8 SUMMARY
Many relationships between variables exist in practical life.
Correlation analysis is a statistical technique that helps to find out the direction
and strength (degree) of the relationship between two or more variables. On the basis of
the nature of the relationship, correlation may be classified into three categories,
namely
A) Positive and Negative Correlation
B) Simple and Multiple Correlation
C) Linear and Non-Linear Correlation
Correlation is measured with the help of the coefficient of correlation, which always lies
between -1 and +1. The values +1 and –1 indicate perfect positive and perfect negative
correlation respectively, whereas zero (0) implies the absence of correlation between two
variables. The closer the value of the correlation coefficient is to +1 or –1, the stronger
the relationship between the variables.
The study of correlation is applicable in every field where decisions are required. It is
extremely useful to government, business and consumers for knowing the behavior of
various inter-related and interdependent variables.
CYP1 A relationship between two variables that is linear is called simple linear
correlation.
CYP2 A) [Scatter diagram: Mark Scored (Y axis, scale 30 to 120) against Number of
absences (X axis, scale 2 to 16)]
B) The correlation (relationship) is negative and strong.
CYP3 D
CYP4 A) r = 0.067
B) Weak positive correlation
CYP5 A) False
B) In the rank correlation coefficient, the calculations are based on ranks
rather than the original values of the observations.
CYP6 rs = 0.515
Part I: Short Answers

1. Define correlation precisely.
2. The extent or strength of relationship between variables can be determined by
_____________________________.
3. Connections between two variables are an everyday occurrence. For instance,
the kilometres driven in a car and the petrol bought for it.
From the example given above,
i. The two variables are _____________________ and _______________.
ii. The dependent variable is _______________________________.
iii. The independent variable is ______________________________.
4. The relationship of variables can be classified as _____________ or __________,
__________ or ___________________, and ____________ or __________,
based on the nature of their correlation.
5. ______________________ is a method which shows diagrammatically whether
or not there is association between variables.
Part II: Multiple Choice
The following table shows the connection between the number of hours devoted by five
sample students to study a course Quantitative Methods I and their marks on this course.
Sample Students       S1    S2    S3    S4    S5
Hours of Study (X)    3     2     5     4     1
Exam Marks (Y)        6     7     4     5     8
Answer the following questions (1-3) based on the information given above.
1. The two variables are
a. Quantitative Methods I and Quantitative Methods II
b. Mathematics and Statistics
c. Marks obtained in Quantitative Methods I and the hours of study
d. Not given
2. The Karl Pearson’s coefficient of correlation is equal to
a. +0.89
b. +1
c. –1
d. –0.98
e. –0.89
3. The relationship between the two given variables is
a. Strong positive
b. Perfect positive
c. Perfect negative
d. Strong negative
4. The table given next displays the ranks of six sample students in their English and
Mathematics test out of 10.
Students               A    B    C    D    E    F
Rank in English (X)    3    5    6    2    1    4
Rank in Math (Y)       2    5    3    1    4    6
Then, the value of Spearman’s rank correlation coefficient is
a. 0.68
b. -1
c. -0.32
d. +0.32
e. none
5. From question #4, the type of correlation between English and Mathematics is
____________________.
(NB rs is interpreted as r)
a. Perfect negative
b. We cannot determine
c. Weak positive
d. Weak negative
e. None
6. Assume Σd² = 0, where d is the difference between the corresponding ranks; then
the Spearman’s correlation coefficient is ______.
a. We cannot determine because n is not given.
b. –1
c. 1
d. 0.98
Part III: Workout Questions:
1. Consider the following relationship between TV advertisement of lollipop and its total
sale.
Advertisement Cost (in hundreds of Birr)    Total Sales (in thousands of Birr)
2                                           10
3                                           15
5                                           12
8                                           17
10                                          18
12                                          20
Required:
a. Construct a scatter diagram on the usual co-ordinate plane (X-Y axes).
b. Find the Karl Pearson’s correlation coefficient, r.
c. Determine the relationship between TV advertisement of lollipop and its sale.
2. Given: Σx = 42, Σy = 7.8, n = 6, Σx² = 364, Σy² = 10.68, Σxy = 48.6
Required:
a. Compute the Karl Pearson’s coefficient of correlation.
b. What type of correlation is it?
3. ABC Trading Pvt. Ltd. Co. wishes to determine the relationship between sales
experience and sales volume. A random sample of 11 sales employees is selected and
their years of experience (X) and current annual sales (Y) are collected. The distribution
is given next.
X    1    4    6    7    11    10    3     3    5    9    8
Y    3    5    8    1    10    13    13    7    2    4    6
Determine:
a) The Pearson’s coefficient of correlation (r) between experience and sales.
b) The Spearman’s coefficient of correlation (rs) between experience and sales.
(NB r may or may not be equal to rs)
21.11 GLOSSARY
Correlation – refers to the connection or association between two or more related

variables.
Variable – refers to any factor which may vary.
Dependent Variable – is a variable whose value is influenced or is to be forecasted.
Independent Variable – is a variable which influences or used for prediction.
Linear Correlation – the amount of change or rate of change is equal between the values
of variables.
Non- Linear correlation – the amount of change or rate of change is unequal between
the values of variables.
Simple correlation – is the computation of association between two variables.
Multiple correlations – is the calculation of association more than two variables.
Positive correlation – the values of two variables change in the same direction.
Negative correlation – the values of two variables change in the opposite direction.
Scatter diagram – is the pictorial presentation of bi variety data to deal correlation
Rank correlation – is a method of calculating correlation between two series of ranks.
21.12 REFERENCES
• G. M. Clarke and D. Cooke; A Basic Course in Statistics, 3rd edition.
• Vick F. Sharo, 1979; Statistics for Social Science
• Richard P. Runyon, 1982; Business Statistics
UNIT 22: REGRESSION
CONTENTS

22.1 Introduction
22.2 Definition of Regression Analysis
22.3 Simple Linear Regression Analysis
22.4 Least Square Method
22.5 Regression Equations
22.6 Prediction
22.7 Correlation Analysis Vs Regression Analysis
22.8 Rank Correlation Coefficient
22.9 Check Your Progress Model Answer
22.10 Summary
22.12 Glossary
22.13 References
The aim of this unit is to explain the meaning, importance and computational process of
regression.
At the end of this unit, the student should be able to:
• define the term ‘Regression Analysis’
• list out application areas of regression analysis
• draw the two linear regression lines
• explain the regression equations
• solve problems on regression
• distinguish between correlation and regression
22.1 INTRODUCTION
The term “regression” was originally employed by Galton in 1877 for indicating certain
relationships in the theory of heredity but it has come to imply the statistical method
developed to investigate those relationships.
There are problems in business and industry where two or more variables show a mutual
relationship and are hence capable of simultaneous analysis. One of the variables is of
primary interest, which may not be measured directly. On certain other occasions the
variable of primary interest can be measured but direct measurement of this variable is
very expensive. In these problems we first try to build up a suitable functional
relationship between the primary variable and one or more auxiliary variables. On the
basis of this functional relationship we attempt to predict the value of the primary
variable for given values of the auxiliary variables. Several illustrations can be given.
A) A marketing research unit may be interested in knowing whether increased
advertising will result in increased sales. The variable, which is of primary
interest, in this case is sales. The auxiliary variable considered in this case is
advertising expenditure. If on the basis of past record it is possible to build up a
suitable functional relationship between advertising expenditure and sales, it can
be used to predict sales for a given increase in advertising expenditure.
B) A manufacturer of farm tools wishes to study whether the volume of sales can be
predicted from the corresponding farm income. On the basis of the data on sales
and farm income a prediction model can be obtained which may be used to
predict sales from the farm income.
22.2 DEFINITION OF REGRESSION ANALYSIS
After having established the fact that two variables are closely related we may be
interested in estimating (predicting) the values of one variable given the value of another.
Regression analysis reveals the average relationship between two variables, and this makes
estimation or prediction possible.
Regression analysis attempts to establish the nature of the relationship between variables,
that is, to study the functional relationship between the variables and thereby provide a
mechanism for prediction or forecasting.
Regression analysis is a mathematical measure of the average relationship between two
or more variables in terms of the original units of the data. In short, regression analysis is
a statistical analysis designed to determine the extent to which one factor changes with
changes in another factor or other factors.
22.3 SIMPLE LINEAR REGRESSION
Suppose $(x_i, y_i)$, i = 1, 2, …, n, are n pairs of observations on two variables X and Y.
A graph showing these pairs on the two co-ordinate axes is known as a scatter
diagram, i.e. a diagram obtained by plotting the n ordered pairs $(x_i, y_i)$, i = 1, …, n,
on the X – Y plane.
The shape of the scatter diagram can be taken as a broad guideline about the nature of the
functional relationship between X and Y. If the points of the scatter diagram are broadly
linear (they seem to fall on a line), then we are justified in using a linear function of the
single variable X to predict the value of Y.
If a linear relationship is assumed and only one auxiliary or independent variable is used
to predict the primary or dependent variable, we have what is called simple linear
regression analysis. It is called simple as it involves only one regressor (independent)
variable, and it is said to be linear because of the observed linear relationship.
Suppose n pairs of observations $(x_i, y_i)$, i = 1, 2, …, n, are available. If a linear
relationship is assumed, one can use the linear regression model:
Y = a + bx ……(22.2.1)
where the intercept a and the slope b are unknown constants.
In simple linear regression models, it is convenient to regard the regressor (independent)
variable, in the above case x, as being under the control of the experimenter. It is also
desirable to assume that ‘x’ is measured with minimum error and that ‘y’, the response
variable, depends upon the value of ‘x’ and is therefore a random variable.
22.4 LEAST SQUARE METHOD
The linear regression model given in equation (22.2.1) has two parameters a and b. We
shall develop a procedure for estimating these parameters. Suppose n pairs of
observations (xi, yi) (i = 1, 2, …n) are given. The scatter diagram of these observations is
shown in Fig 22.2.1. Consider the line drawn through the points in Fig 22.2.1
Let the equation of this line be given by,
y = a + bx (22.2.2)
For a given $x_i$, the y value on the line is $a + bx_i$. The corresponding observed y-value,
by definition, is $y_i$. Hence the difference $e_i = y_i - (a + bx_i) = y_i - a - bx_i$ is, in a
sense, the measure of deviation of the observed $y_i$ value from the line.
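[Fig. 22.2.1: scatter diagram of the points (x1, y1), (x2, y2), (x3, y3), …, (xn, yn) around the
line y = a + bx, with the vertical deviations e1, e2, e3, …, en of the points from the line]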
If we square the $e_i$ values to eliminate the effect of positive and negative deviations and
add them up, then the quantity
$$R = \sum_{i=1}^{n} e_i^2 \qquad (22.2.3)$$
is called the sum of the squares of errors in the prediction, where $e_i = y_i - \hat{y}_i$ is the
difference between the observed value of y ($y_i$) and the expected value of y ($\hat{y}_i$).
The smaller the value of R, the better the line represents the observations.
Using this approach, we estimate the values of the unknown constants a and b so that the
quantity R (the sum of the squares of errors) is minimized. The estimates of a and b
obtained by minimizing $R = \sum_i e_i^2$ are called the Least Square Estimates. The least square
estimates have the property that the sum of squares of the deviations of the observations
from the line is minimum. The process of estimating the parameters ‘a’ and ‘b’ is
sometimes known as fitting the linear regression to the observed data.
The line y = a + bx where ‘a’ and ‘b’ have been estimated by the least square method is
sometimes referred to as the line of best fit.
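The least square idea can be checked numerically. The sketch below is only an illustration: the five data pairs are hypothetical, and the function names `sum_sq_errors` and `least_squares` are assumptions introduced here. It uses the closed-form estimates derived in section 22.5, then confirms that shifting the intercept or slope away from the fitted values never lowers R, and that the deviations about the fitted line add up to zero.

```python
# Hypothetical data, chosen only to illustrate the least square principle.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_sq_errors(a, b):
    """R = sum of e_i^2, where e_i = y_i - (a + b*x_i)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares(xs, ys):
    """Return the least square estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
        sum(x * x for x in xs) - n * x_bar ** 2
    )
    a = y_bar - b * x_bar
    return a, b


a_hat, b_hat = least_squares(xs, ys)
r_min = sum_sq_errors(a_hat, b_hat)

# R for a small grid of lines around the fitted one; none should fall below r_min.
nearby = [
    sum_sq_errors(a_hat + da, b_hat + db)
    for da in (-0.5, 0.0, 0.5)
    for db in (-0.2, 0.0, 0.2)
]

# Deviations above and below the fitted line cancel out.
residual_sum = sum(y - (a_hat + b_hat * x) for x, y in zip(xs, ys))

print(f"a = {a_hat:.2f}, b = {b_hat:.2f}, R = {r_min:.2f}")
print("smallest R among nearby lines:", round(min(nearby), 2))
print("sum of deviations:", round(residual_sum, 10))
```

A grid check of this kind does not prove minimality; it only shows the behaviour that the calculus argument guarantees.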
Regression lines are drawn on the assumption of ‘Least Squares’. According to this
assumption, the sum of squares of the deviations of the observed values of y from the
fitted line is the minimum possible, i.e. $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ is minimum. Further, the
sum of deviations above the line is equal to the sum of deviations below the line, i.e.
$\sum_i (y_i - \hat{y}_i) = 0$.

22.5 REGRESSION EQUATIONS
The algebraic expressions of regression lines are called regression equations. In the case
of two variables say X and Y, there are two regression lines (equations)
i. The regression equation of Y on X and
ii. The regression equation of X on Y
22.5.1 Regression Equation of Y on X
Symbolically, the regression equation of Y on X is expressed as
$$Y_c = a + bx \qquad (22.2.4)$$
where $Y_c$ is the most probable (computed) value of Y and ‘a’ and ‘b’ are constants.
While ‘a’ denotes the level of the fitted line, ‘b’ denotes the slope of the line (i.e. the
change in the ‘Y’ variable per unit change in the ‘X’ variable).
Given Y = a + bx, the sum of the squares of errors is given by
$$\sum_i e_i^2 = \sum_i (y_i - a - bx_i)^2 \qquad (*)$$
(*) can be minimized by the principle of maxima and minima, that is, by differentiating
(*) partially with respect to ‘a’ and ‘b’ and setting the derivatives equal to zero, which
yields the normal equations:
$$\sum_i y_i = na + b\sum_i x_i$$
$$\sum_i x_i y_i = a\sum_i x_i + b\sum_i x_i^2 \qquad (22.2.5)$$
Here $\sum_i x_i$, $\sum_i y_i$, $\sum_i x_i y_i$ and $\sum_i x_i^2$ indicate the totals that are obtained from the
original values of the x and y series, and n denotes the number of pairs observed for the
purpose of regression analysis.
Solving the normal equations simultaneously for ‘a’ and ‘b’ we obtain:
$$b = \frac{n\sum_i x_i y_i - \left(\sum_i x_i\right)\left(\sum_i y_i\right)}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2} \qquad (22.2.6)$$
or
$$b = \frac{\frac{1}{n}\sum_i x_i y_i - \bar{x}\,\bar{y}}{\frac{1}{n}\sum_i x_i^2 - \bar{x}^2} \qquad (22.2.7)$$
or
$$b = \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2} \qquad (22.2.8)$$
or
$$b = r \cdot \frac{Sd_y}{Sd_x} \qquad (22.2.9)$$
where r is the correlation coefficient between the x and y variables, $Sd_y$ is the standard
deviation of the y variable, and $Sd_x$ is the standard deviation of the x variable, and
$$a = \bar{y} - b\,\bar{x} \qquad (22.2.10)$$
Substituting these values in (22.2.4), we have the desired prediction formula (equation)
of y on x as:
$$Y = a + bx = \bar{y} + r \cdot \frac{Sd_y}{Sd_x}\,(x - \bar{x}) = \underbrace{\bar{y} - r \cdot \frac{Sd_y}{Sd_x}\,\bar{x}}_{a} + \underbrace{r \cdot \frac{Sd_y}{Sd_x}}_{b}\,x \qquad (22.2.11)$$
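As a rough check on the algebra, the sketch below solves the two normal equations directly (by Cramer's rule) and compares the resulting slope with the form b = r·Sd_y/Sd_x. The four data pairs are hypothetical, and the function names are assumptions made for this illustration only.

```python
import math


def solve_normal_equations(xs, ys):
    """Solve  sum(y) = n*a + b*sum(x)  and  sum(xy) = a*sum(x) + b*sum(x^2)  for (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the 2x2 system
    b = (n * sxy - sx * sy) / det    # same expression as (22.2.6)
    a = (sy * sxx - sx * sxy) / det
    return a, b


def slope_from_correlation(xs, ys):
    """Compute b as r * Sd_y / Sd_x, the form given in (22.2.9)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    return r * sd_y / sd_x


# Hypothetical observations used only for this check.
xs = [2, 4, 6, 8]
ys = [3, 7, 5, 10]

a, b = solve_normal_equations(xs, ys)
b_alt = slope_from_correlation(xs, ys)
y_bar, x_bar = sum(ys) / len(ys), sum(xs) / len(xs)

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"b from r*Sd_y/Sd_x = {b_alt:.4f}")
print(f"y_bar - b*x_bar = {y_bar - b * x_bar:.4f}")
```

The three printed slopes and intercepts agree, which is what the equivalence of (22.2.6), (22.2.9) and (22.2.10) predicts.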
Example 1: Below are data obtained from 5 families indicating family size and mean
monthly expenditure.
Family                           1   2   3   4   5
Family Size (X)                  4   3   6   5   2
Expenditure (in hundreds) (Y)    5   2   8   3   4
Fit a linear regression line for family’s expenditure on family size.
Solution:
• Let the family size be denoted by the X variable
• Let the mean monthly expenditure be denoted by the Y variable
• We are requested to find the regression equation of Y on X
Let Y = a + bx
By the least square estimate, the unknown constants a and b are given by:
$$b = \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2} \quad \text{and} \quad a = \bar{y} - b\,\bar{x}$$
$x_i$      $y_i$      $x_i^2$    $x_i y_i$
4        5        16       20
3        2         9        6
6        8        36       48
5        3        25       15
2        4         4        8
$\sum x_i = 20$   $\sum y_i = 22$   $\sum x_i^2 = 90$   $\sum x_i y_i = 97$
$\bar{x} = 20/5 = 4$,  $\bar{y} = 22/5 = 4.4$
Then
$$b = \frac{97 - 5 \times 4 \times 4.4}{90 - 5 \times (4)^2} = \frac{97 - 88}{90 - 80} = \frac{9}{10} = 0.9$$
Then
$$a = \bar{y} - b\,\bar{x} = 4.4 - 0.9 \times 4 = 4.4 - 3.6 = 0.8$$
Thus the regression equation is given by:
Y = a + bx
Y = 0.8 + 0.9x, the regression equation of Y on X.
In the table, the family sizes are written in the first column and the expenditures in the
second column. The third column, $x_i^2$, is the square of each entry of the first column,
and the fourth column is the product of the paired entries of the first and second columns.
$\sum x_i$ means the sum of the first column and $\bar{x}$ means the arithmetic mean of the
X variable, i.e. $\bar{x} = \frac{1}{n}\sum x_i$, where n is the number of paired observations. In our case
n = 5. $\sum y_i$ means the sum of the second column, $\sum x_i^2$ the sum of the third
column and $\sum x_i y_i$ the sum of the fourth column.
$\bar{y} = \frac{1}{n}\sum y_i$ is the arithmetic mean of the Y variable.
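The column-by-column procedure of Example 1 can be mirrored in a few lines of code. This is a minimal sketch; the variable names (`rows`, `sum_x2` and so on) are assumptions chosen for readability and do not come from the module.

```python
family_size = [4, 3, 6, 5, 2]   # X, first column of the table
expenditure = [5, 2, 8, 3, 4]   # Y, in hundreds, second column

# Each row holds (x, y, x^2, x*y), matching the four columns of the table.
rows = [(x, y, x * x, x * y) for x, y in zip(family_size, expenditure)]

n = len(rows)
sum_x = sum(row[0] for row in rows)
sum_y = sum(row[1] for row in rows)
sum_x2 = sum(row[2] for row in rows)
sum_xy = sum(row[3] for row in rows)

x_bar = sum_x / n
y_bar = sum_y / n

# Formula (22.2.8) for the slope and (22.2.10) for the intercept.
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar

print("totals:", sum_x, sum_y, sum_x2, sum_xy)
print(f"Y = {a:.1f} + {b:.1f}x")
```

The totals and the fitted line reproduce the hand calculation in the example.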
Example 2:
The data below show the demand for a certain commodity and its price.
Demand (in units)    10   12   16   14   15
Price (in birr)      40   48   52   46   50
An increase in demand can be taken as a cause of an increase in the price.
Fit a linear regression line for price on demand.
Solution: Let demand be denoted by X and price be denoted by Y.
$x_i$      $y_i$      $x_i^2$    $x_i y_i$
10       40       100      400
12       48       144      576
16       52       256      832
14       46       196      644
15       50       225      750
$\sum x_i = 67$   $\sum y_i = 236$   $\sum x_i^2 = 921$   $\sum x_i y_i = 3202$
$\bar{x} = 67/5 = 13.4$,  $\bar{y} = 236/5 = 47.2$
The least square estimates of a and b are given by:
$$b = \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2} = \frac{3202 - 5 \times 13.4 \times 47.2}{921 - 5 \times (13.4)^2} = \frac{3202 - 3162.4}{921 - 897.8} = \frac{39.6}{23.2} \approx 1.707$$
and
$$a = \bar{y} - b\,\bar{x} = 47.2 - 1.707 \times 13.4 \approx 24.33$$
Thus the regression equation of Y on X is given by Y = a + bx:
Y = 24.33 + 1.707x
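Because the product n·x̄·ȳ is easy to slip on by hand (5 × 13.4 × 47.2 = 3162.4), a short script is a convenient cross-check for Example 2. The sketch below recomputes the numerator and denominator of the slope formula and then uses the fitted line for one illustrative prediction; the demand level of 13 units is a hypothetical input, not part of the example.

```python
demand = [10, 12, 16, 14, 15]   # X, in units
price = [40, 48, 52, 46, 50]    # Y, in birr

n = len(demand)
x_bar = sum(demand) / n
y_bar = sum(price) / n
sum_xy = sum(x * y for x, y in zip(demand, price))
sum_x2 = sum(x * x for x in demand)

numerator = sum_xy - n * x_bar * y_bar      # sum(xy) - n * x_bar * y_bar
denominator = sum_x2 - n * x_bar ** 2       # sum(x^2) - n * x_bar^2
b = numerator / denominator
a = y_bar - b * x_bar

# Hypothetical use of the fitted line: expected price at a demand of 13 units.
price_at_13 = a + b * 13

print(f"numerator = {numerator:.1f}, denominator = {denominator:.1f}")
print(f"Y = {a:.2f} + {b:.3f}x")
print(f"predicted price at demand 13: {price_at_13:.2f}")
```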
22.5.2 Regression of X on Y
Symbolically, the regression equation of X on Y is expressed as
$$X = a + by \qquad (22.2.12)$$
where X is the most probable (computed) value of X and ‘a’ and ‘b’ are constants. ‘a’
denotes the level of the fitted line (i.e. the distance of the line directly above or below the
origin), while ‘b’ denotes the slope of the line (i.e. the change in the X variable per unit
change in the Y variable). The values of ‘a’ and ‘b’ are obtained through the following two
normal equations:
$$\sum_i x_i = na + b\sum_i y_i$$
$$\sum_i x_i y_i = a\sum_i y_i + b\sum_i y_i^2 \qquad (22.2.13)$$
Solving for a and b simultaneously, we get
$$b = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i y_i^2 - \left(\sum_i y_i\right)^2} \qquad (22.2.14)$$
$$= \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i y_i^2 - n\,\bar{y}^2} \qquad (22.2.15)$$
$$= r \cdot \frac{Sd_x}{Sd_y} \qquad (22.2.16)$$
and
$$a = \bar{x} - b\,\bar{y} \qquad (22.2.17)$$
Using 22.2.14 and 22.2.17 in 22.2.12, we get the regression equation of X on Y.
Example: Given the data below, fit the regression line of X on Y.
X    2   3   7   6   4
Y    6   3   8   1   5
Solution:
$x_i$      $y_i$      $y_i^2$    $x_i y_i$
2        6        36       12
3        3         9        9
7        8        64       56
6        1         1        6
4        5        25       20
$\sum x_i = 22$   $\sum y_i = 23$   $\sum y_i^2 = 135$   $\sum x_i y_i = 103$
$\bar{x} = 22/5 = 4.4$,  $\bar{y} = 23/5 = 4.6$
In the regression of X on Y we use the linear model X = a + by, where a and b are estimated by
$$b = b_{xy} = \frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i y_i^2 - n\,\bar{y}^2} = \frac{103 - 5 \times 4.4 \times 4.6}{135 - 5 \times (4.6)^2} = \frac{103 - 101.2}{135 - 105.8} = \frac{1.8}{29.2} \approx 0.062$$
and
$$a = \bar{x} - b\,\bar{y} = 4.4 - 0.0616 \times 4.6 \approx 4.12$$
Then the regression equation of X on Y is given by
X = 4.12 + 0.062y
In the regression equation of Y on X, Y = a + bx, b is called the regression coefficient of Y
on X and is usually denoted by $b_{yx}$. In the regression equation of X on Y, X = a + by,
b is called the regression coefficient of X on Y and is usually denoted by $b_{xy}$.
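The two regression coefficients come from the same formula with the roles of the variables exchanged. The sketch below, offered only as an illustration, applies one helper to the X on Y example above in both directions; the helper name `fit_line` is an assumption introduced here.

```python
def fit_line(u, v):
    """Least square line v = a + b*u; returns (a, b)."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    b = (sum(p * q for p, q in zip(u, v)) - n * u_bar * v_bar) / (
        sum(p * p for p in u) - n * u_bar ** 2
    )
    return v_bar - b * u_bar, b


X = [2, 3, 7, 6, 4]
Y = [6, 3, 8, 1, 5]

# Regression of X on Y: Y plays the role of the independent variable.
a_xy, b_xy = fit_line(Y, X)
# Regression of Y on X: X plays the role of the independent variable.
a_yx, b_yx = fit_line(X, Y)

print(f"X on Y: X = {a_xy:.2f} + {b_xy:.3f}y")
print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.3f}x")
```

The two slopes differ, which is why the direction of the regression has to be stated whenever a regression coefficient is reported.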
22.6 PREDICTION
The term prediction usually implies the estimation of a future value of a variate either by
the projection of a trend regression line or by the use of a probabilistic model. Unlike
estimation, which is mainly used to mean the determination of an approximate parameter
value from a sample, prediction is mainly used in association with linear, multivariate or
bivariate regression models. As it can be used as a means of predicting the future value
of Y, the regressor X is sometimes also termed a predictor.
From the two linear regression equations; Y on X and X on Y, given the value of the
independent variable, one can predict the value of the dependent variable.
Example: A manufacturer of wool products employed an appraiser who estimates
the shrinkage of different varieties of wool. The estimates of the appraiser along with the
actual shrinkage of brands of wool are given here.
Brand No.   Appraiser’s estimate of shrinkage   Actual shrinkage reported by customers
            (per one thousand units)            (per one thousand units)
1           65                                  68
2           72                                  74
3           60                                  63
4           67                                  62
5           75                                  80
6           52                                  50
7           54                                  53
8           69                                  66
9           58                                  61
a) Fit the two linear regression lines to the given data.
b) Predict the actual shrinkage if the estimate of the appraiser is 70.
c) Predict the appraiser’s estimate of shrinkage if the actual shrinkage reported is 78.
Solution:
Let the estimated shrinkage be denoted by the X variable and the actual shrinkage be
denoted by the Y variable.
a) To regress Y on X and X on Y, let us use the following table for the computation
of the unknown constants ‘a’ and ‘b’.
X      Y      X2      XY      Y2
65     68     4225    4420    4624
72     74     5184    5328    5476
60     63     3600    3780    3969
67     62     4489    4154    3844
75     80     5625    6000    6400
52     50     2704    2600    2500
54     53     2916    2862    2809
69     66     4761    4554    4356
58     61     3364    3538    3721
Total: $\sum X = 572$, $\sum Y = 577$, $\sum X^2 = 36868$, $\sum XY = 37236$, $\sum Y^2 = 37699$
$\bar{x} = 572/9 = 63.56$,  $\bar{x}^2 = (63.56)^2 = 4039.9$
$\bar{y} = 577/9 = 64.11$,  $\bar{y}^2 = (64.11)^2 = 4110.1$
$\bar{x}\,\bar{y} = 63.56 \times 64.11 = 4074.8$
n = 9
The regression equation of Y on X is given by Y = a + bx, where (using the means rounded
to two decimal places)
$$b = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2} = \frac{37236 - 9 \times 63.56 \times 64.11}{36868 - 9 \times (63.56)^2} = 1.1048$$
Then
$$a = \bar{y} - b\,\bar{x} = 64.11 - 1.1048 \times 63.56 = -6.11$$
Thus the estimated regression line of Y on X is given by
Y = -6.11 + 1.1048x
This equation will help us to predict the actual shrinkage given the appraiser’s estimate of
shrinkage.
The regression equation of X on Y is given by X = a + by, where
$$b = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum y^2 - n\,\bar{y}^2} = \frac{37236 - 9 \times 63.56 \times 64.11}{37699 - 9 \times (64.11)^2} = 0.79$$
and
$$a = \bar{x} - b\,\bar{y} = 63.56 - 0.79 \times 64.11 = 12.9$$
Thus the estimated regression line of X on Y is given by
X = 12.9 + 0.79y
This equation will help us to predict the appraiser’s estimate of shrinkage given the actual
shrinkage.
b) Given x = 70, to predict y we use the regression equation of Y on X, y = -6.11 +
1.1048x, and substitute 70 in place of x, getting
y = -6.11 + 1.1048 (70)
y = 71.23
i.e. if the appraiser’s estimate of shrinkage is 70, then the expected (approximate) actual
shrinkage is about 71.23.
c) Given the actual shrinkage, to predict the appraiser’s estimate of shrinkage we use the
regression equation of X on Y formulated in ‘a’ above, i.e. x = 12.9 + 0.79y:
x = 12.9 + 0.79 (78) = 74.52
• The two regression lines cross each other at the point $(\bar{x}, \bar{y})$.
• The correlation coefficient can be calculated from the regression coefficients by the
relation
$$r^2 = b_{yx} \cdot b_{xy} \qquad (22.2.18)$$
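The wool example can be re-run at full precision as a cross-check. The sketch below is an illustration only, and the helper name `fit_line` is an assumption. Note that the hand calculation rounds the means to two decimal places before multiplying by n, and the slope is sensitive to that rounding: with unrounded means the coefficients come out near 1.098 and 0.798 rather than 1.1048 and 0.79, while the two predictions change only slightly.

```python
def fit_line(u, v):
    """Least square line v = a + b*u; returns (a, b)."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    b = (sum(p * q for p, q in zip(u, v)) - n * u_bar * v_bar) / (
        sum(p * p for p in u) - n * u_bar ** 2
    )
    return v_bar - b * u_bar, b


estimate = [65, 72, 60, 67, 75, 52, 54, 69, 58]   # X: appraiser's estimate
actual = [68, 74, 63, 62, 80, 50, 53, 66, 61]     # Y: actual shrinkage

a_yx, b_yx = fit_line(estimate, actual)   # Y on X
a_xy, b_xy = fit_line(actual, estimate)   # X on Y

y_at_70 = a_yx + b_yx * 70   # part (b): predict Y when X = 70
x_at_78 = a_xy + b_xy * 78   # part (c): predict X when Y = 78

x_bar = sum(estimate) / len(estimate)
y_bar = sum(actual) / len(actual)

# Relation (22.2.18): the product of the two regression coefficients equals r^2.
r_squared = b_yx * b_xy

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.4f}x")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.4f}y")
print(f"predicted Y at X = 70: {y_at_70:.2f}")
print(f"predicted X at Y = 78: {x_at_78:.2f}")
print(f"r^2 = {r_squared:.4f}")
# Both lines pass through the point of means.
print(abs(a_yx + b_yx * x_bar - y_bar) < 1e-9, abs(a_xy + b_xy * y_bar - x_bar) < 1e-9)
```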
CYP 1
Given bxy = 0.45 and byx = 1.44, find out coefficient of determination
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
________________________
22.7 CORRELATION ANALYSIS Vs REGRESSION ANALYSIS
1. Correlation literally means the relationship between two or more variables which
vary in sympathy, so that the movement in one tends to be accompanied by a
corresponding movement in the other(s). On the other hand, regression literally means
stepping back or returning to the average value, and is a mathematical measure
expressing the average relationship between the two variables.
2. The correlation coefficient $r_{xy}$ between two variables X and Y is a measure of the
direction and degree of the linear relationship between the two variables, which is
mutual. It is symmetric, i.e. $r_{xy} = r_{yx}$, and it is immaterial which of X or Y is the
dependent variable and which is the independent variable. Regression analysis, by
contrast, aims at establishing the functional relationship between the two variables under
study and then using this relationship to predict or estimate the value of the
dependent variable for any given value of the independent variable. It also reflects
the nature of the variables, i.e. which is dependent and which is independent.
Regression coefficients are not symmetric in X and Y, i.e. in general $b_{xy} \neq b_{yx}$.
3. Correlation need not imply a cause and effect relationship between the variables
under study. Regression analysis, however, is set up around an assumed cause and effect
relationship between the variables: the variable corresponding to the cause is taken as the
independent variable and the variable corresponding to the effect is taken as the dependent
variable.
4. The correlation coefficient $r_{xy}$ is a relative measure of the linear relationship between
X and Y and is independent of the units of measurement. It is a pure number lying
between –1 and 1. On the other hand, the regression coefficients $b_{xy}$ and $b_{yx}$ are
absolute measures representing the change in the value of the variable y (x) for a
unit change in the value of the variable x (y). Once the functional form of the
regression curve is known, by substituting the value of the independent variable we
can obtain the value of the dependent variable, and this value will be in the unit of
measurement of that variable.
5. There may be nonsense correlation between two variables, which is due to pure
chance and has no practical relevance, e.g. the correlation between the size of shoe
and the intelligence of a group of individuals. There is no such thing as nonsense
regression.
6. Correlation analysis is confined only to the study of the linear relationship between the
variables and, therefore, has limited applications. Regression analysis has much
wider applications as it studies linear as well as non-linear relationships between the
variables.
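Points 2 and 4 above can be illustrated numerically. The sketch below is a hedged illustration, not part of the original comparison: it borrows the advertisement and sales figures from Exercise 1 of Unit 21, and the function names `pearson_r` and `slope` are assumptions. It re-expresses the advertisement cost in Birr instead of hundreds of Birr and shows that r is unaffected while the regression coefficient changes with the unit.

```python
import math


def pearson_r(u, v):
    """Product-moment correlation coefficient between u and v."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((p - u_bar) * (q - v_bar) for p, q in zip(u, v))
    s_uu = sum((p - u_bar) ** 2 for p in u)
    s_vv = sum((q - v_bar) ** 2 for q in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def slope(u, v):
    """Regression coefficient of v on u."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((p - u_bar) * (q - v_bar) for p, q in zip(u, v))
    s_uu = sum((p - u_bar) ** 2 for p in u)
    return s_uv / s_uu


ad_cost = [2, 3, 5, 8, 10, 12]        # in hundreds of Birr
sales = [10, 15, 12, 17, 18, 20]      # in thousands of Birr
ad_cost_birr = [100 * x for x in ad_cost]   # same costs expressed in Birr

# r is symmetric and does not depend on the unit of measurement.
print(round(pearson_r(ad_cost, sales), 4), round(pearson_r(sales, ad_cost), 4))
print(round(pearson_r(ad_cost_birr, sales), 4))

# The regression coefficients differ by direction and change with the unit.
print(round(slope(ad_cost, sales), 4), round(slope(sales, ad_cost), 4))
print(round(slope(ad_cost_birr, sales), 6))
```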
22.8 RANK CORRELATION COEFFICIENT
Correlation analysis helps us to determine the strength of the linear relationship between
the two variables X and Y, in other words, how strongly these two variables are
correlated. Karl Pearson, in 1896, developed an index or coefficient of this association for
cases where the relationship is a linear one, that is, where the trend of the relationship can be
described by a straight line.
In rank correlation, the degree of association between the two variables is based on the
ranks of the observations and not on their numerical values. The number indicating the
position of a given value in the ranking is termed its Rank. Ranks are given to the values of
the variable in either ascending or descending order.
The rank correlation coefficient (Spearman’s coefficient) is designated by r and is given by
$$r = 1 - \frac{6\sum_{i=1}^{n} D_i^2}{n^3 - n}$$
where $\sum D_i^2$ is the sum of squares of the differences of pairs of ranks and
n is the number of pairs of observations.
Characteristics of r:
1 - The value of r ranges between –1 and +1. If there is no relationship between the two
variables, then its value is 0. If the relationship is perfect, or if all the points on the
scatter diagram fall on a straight line, then the value of r is +1 or –1, depending on the
direction of the line. Other values of r show an intermediate degree of relationship
between the two variables.
2 - The sign of the coefficient can be positive or negative.
It is positive when the slope of the line is positive and it is negative when the slope of the
line is negative.
Hence, the coefficient of correlation r can be defined as a measure of the strength of the
linear relationship between the two variables X and Y.
Example:
Find the rank correlation coefficient of the data given below:
Beauty      1.50   1.75   1.60   1.70   1.95   1.90   1.88
Behavior    1.10   1.50   1.20   1.25   1.60   1.44   1.30
Solution:
Beauty (X)   Behavior (Y)   Rank of Beauty   Rank of Behavior   Difference of ranks Di   Di2
1.50         1.10           7                7                   0                       0
1.75         1.50           4                2                   2                       4
1.60         1.20           6                6                   0                       0
1.70         1.25           5                5                   0                       0
1.95         1.60           1                1                   0                       0
1.90         1.44           2                3                  -1                       1
1.88         1.30           3                4                  -1                       1
$\sum D_i^2 = 6$
$$r = 1 - \frac{6\sum D_i^2}{n^3 - n} = 1 - \frac{6 \times 6}{7^3 - 7} = 1 - \frac{36}{336} = 1 - 0.1071 = 0.8929$$
The relationship between the two variables is strong.
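The ranking and the formula can be reproduced with a short script. This is a minimal sketch under one stated assumption: all values within a series are distinct, so no tied ranks arise. The helper name `ranks_descending` is introduced here for illustration.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


beauty = [1.50, 1.75, 1.60, 1.70, 1.95, 1.90, 1.88]
behavior = [1.10, 1.50, 1.20, 1.25, 1.60, 1.44, 1.30]

rank_x = ranks_descending(beauty)
rank_y = ranks_descending(behavior)

# Squared differences of the paired ranks.
d_squared = [(rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y)]

n = len(beauty)
r_s = 1 - 6 * sum(d_squared) / (n ** 3 - n)

print("ranks of beauty:  ", rank_x)
print("ranks of behavior:", rank_y)
print("sum of D^2:", sum(d_squared))
print(f"rank correlation: {r_s:.4f}")
```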
In general, for a high degree of correlation, which leads to better estimates and
prediction, the coefficient of correlation r must have a high value.
Note: if two or more observations have the same value, they are given the same rank. In
this case the rank(s) immediately following the tied rank are skipped.
CYP 2
Calculate the coefficient of correlation for the data given below (heights of fathers and
sons):
Father (X): 63 65 66 67 67 68
Son (Y): 66 68 65 67 69 70
22.9 CHECK YOUR PROGRESS MODEL ANSWER
1. r2 = bxy . byx
= 0.45 × 1.44
= 0.648
2. r = 0.357; there is a weak relationship between the two variables.
22.10 SUMMARY
Regression analysis is a statistical technique with the help of which the values of an
unknown variable are estimated on the basis of the known values of another variable.
Regression analysis helps to establish the functional relationship between variables,
which is done with the help of regression lines. The algebraic expressions of regression
lines are known as regression equations.
A regression coefficient indicates the degree and the direction of change in the dependent
variable in response to a unit change in the independent variable. Since there are two
regression equations for two variables, there are two regression coefficients: one for
the regression equation of x on y and the other for the regression equation of y on x. The
regression coefficient of x on y indicates the degree and direction of change in the ‘x’
variable in response to a unit change in the ‘y’ variable, and the regression coefficient of y on
x indicates the degree and direction of change in the ‘y’ variable in response to a unit change
in the ‘x’ variable. The square root of the product of the two regression coefficients, taken
with their common sign, is equal to the coefficient of correlation between the variables.
Short Examination Questions

1. What is meant by regression?
2. What are regression lines?
3. Why should there be two regression lines?
4. Distinguish between ‘correlation and regression’
5. What are the regression coefficients?
6. Write the normal equations used to compute the regression coefficients.
Exercise
1) Calculate the two regression coefficients from the following data
X 30 40 75 60 50 42 70 72
Y 40 25 35 40 65 52 60 35
(Ans: bxy = 0.048 , byx = 0.032)
2) Calculate the two regression coefficients and correlation coefficient from the following
data.
X 6.9 8.5 5.8 8.6 9.6 8.0 9.7
Y 2.9 3.8 6.5 2.3 5.5 3.5 3.2
(Ans: bxy = -0.32 , byx = -0.36, r = -0.34)
3) You are given that $\sum x = 190$, $\sum y = 85$, $\sum xy = 575$, $\sum x^2 = 15600$ and $\sum y^2 = 7100$ for
ten paired observations. Calculate the two regression equations.
(Ans: X on Y : X = 0.613y + 17.16; Y on X : Y = 0.0867X + 83.35)
4) Given below is a data observed on two variables X and Y
X 4 2 6 8 10 5 7
Y 7 10 8 9 5 6 4
a) Fit a regression equation of Y on X and X on Y.
b) Predict Y if X = 14
c) Predict X if Y = 15
Ans: a) X on Y: X = 10.47 – 0.64y
Y on X: Y = 9.58 – 0.43x
b) Given X = 14, Y = 9.58 – 0.43 × 14 = 3.56
c) Given Y = 15, X = 10.47 – 0.64 (15) = 0.87
22.12 GLOSSARY
1. Regression Analysis: it is a mathematical measure of the average relationship
between two or more variables in terms of the original units of data.
2. Regression Coefficient: it shows the degree and direction of change in the
dependent variable in response to a unit change in the independent variable.
3. Regression Line: it is a device used for estimating the value of one variable from
the value of the other. It consists of a line through the points, drawn in such a
manner as to represent the average relationship between the two variables.
4. Simple Regression: this studies regression between two variables only.
22.13 REFERENCES
• Gupta S.P., “Statistical Method”, Sultan Chand & Company, New Delhi.
• Gupta S.C., “Fundamental of Statistics”, Himalaya Pub. House, Bombay.
• Simpson and Kafka, “Basic Statistics”, Oxford and I.B.H. Publishing Company,
Calcutta.