
STATISTICS FOR ECONOMISTS

A TEACHING MATERIAL FOR DISTANCE STUDENTS MAJORING IN ECONOMICS

Module I

Prepared By:
Bedru Babulo
Yesuf Mohammednur

Department of Economics
Faculty of Business and Economics
Mekelle University

2005
Mekelle
1.0 INTRODUCTION
Welcome to the world of statistics!
Statistics is one of the most important and useful subjects taught in
business and economics schools. The overall objective of this chapter
is to acquaint you with the introductory concepts of statistics and to
introduce you to its different aspects.

Learning Objectives
When you have completed this chapter, you will be able to:

Define statistics and its branches
Explain the basic concepts in statistics
Illustrate some of its applications

At present, hardly any successful social or natural scientist can
function without some knowledge of statistics. Nowadays, statistics has
developed to such an extent that it has become indispensable in all
aspects of our activities. We frequently see or hear the following kinds
of statements:
 The unemployment rate has dropped to 11%
 The GNP of Sub-Saharan African countries is rising at 1.7%
per annum
 The average coffee price in Addis Ababa has reached 10 Birr/kg
 And so on

All the aforementioned statements are some of the statistics (numerical
facts) we encounter. Now let’s turn our discussion towards defining
statistics and the different aspects (branches) of statistics.

1.1 Meaning of Statistics and Its Branches

The word statistics comes from the Italian word statista (meaning
“statesman”). The term was first used by Gottfried Achenwall (1719–1772),
a professor at Marburg and Göttingen. Dr. E. A. W. Zimmermann introduced
the word statistics into England, and Sir John Sinclair popularized its
use in his Statistical Account of Scotland (1791–1799). Long before the
18th century, however, people had been recording and using data.

Different individuals have defined the term ‘statistics’ differently.
However, the meanings of statistics can be divided into two major
categories. These are:

• The plural sense (statistical data/numerical facts)
• The singular sense (statistical method, or statistics as a field of study)

The plural sense: Statistics in its plural sense refers to numerical facts,
figures or statistical data. Example: population statistics, production
statistics, etc.

Not all numerical data, however, qualify as statistics. Numerical data
must possess the following characteristics in order to be called
statistics.

i. Statistics are numerically expressed, but not every set of numerical
data is statistics; the data should be well defined.
ii. Statistics are aggregates of facts; they are general and are not
concerned with individual items in a specific manner.
iii. Statistics are affected to a marked extent by a multiplicity of causes.
iv. Statistics are enumerated or estimated according to reasonable
standards of accuracy.
v. Statistics are collected in a systematic manner.
vi. Statistics are collected for a predetermined purpose.
vii. Statistics should be placed in relation to each other in time and
space. E.g. “the economy grows by 1.9%” (where and when?)

The Singular Sense: It is in its singular sense that modern statistics
qualifies as a science. Statistics is a branch of applied mathematics
concerned with the development and application of methods and techniques
for collecting, organizing, analyzing and interpreting quantitative data
so as to reach sound decisions. Alternatively, statistics as a field of
study has been defined as the art and science of collecting, organizing,
analyzing and interpreting data. According to this definition there are
four steps in a numerical investigation.

i. Data Collection – the process of obtaining measurements, counts or
other facts by experimentation or observation. (It is the first stage
in a statistical analysis.)
ii. Data Organization and Presentation – the process of editing,
classifying, condensing and presenting data using tables, graphs,
charts, etc.
iii. Data Analysis – the process of extracting relevant information
from summarized or organized data, without drawing conclusions about
the facts.
iv. Data Interpretation – the process of generalizing and drawing valid
conclusions from the data analysis.

Nowadays, there are various statistical software packages that have made
data organization and analysis very much easier (e.g. SPSS, SAS, STATA).

In general, statistics is a field of study concerned with data
collection, organization and presentation, analysis and interpretation.
In business and economics particularly, a major reason for studying
statistics is to give managers and decision makers a better
understanding of the business and economic environment and thus enable
them to make more informed and better decisions.

1.2 Descriptive Statistics Versus Inferential Statistics

Basically, the field of statistics has two broad subdivisions:
descriptive statistics and inferential (analytical) statistics.

Descriptive statistics – is concerned with the collection, processing,
summarizing and describing of the important features of the data,
without going beyond them (i.e. without any attempt to infer from the
data).

For example, at regular intervals the Central Statistical Authority
(CSA) or the Ethiopian Economic Association (EEA) gathers basic data
concerning the number, age distribution, and occupational and
educational composition of the Ethiopian people. Since the amount of raw
data gathered by the CSA is immense, it is necessary to condense this
information to make it useful. So the data will be summarized and may be
presented using tables, graphs or charts.

Table 1.1: Total Population of Ethiopia by Sex, Region, Urban & Rural: July 1, 2001
(in thousands)

                                     Urban                   Rural                    Total
Region                          Male  Female Total     Male   Female Total      Male   Female Total
Tigray                          321   330    651       1547   1599   3146       1868   1929   3797
Affar                           58    45     103       638    502    1140       696    547    1243
Amara                           884   875    1759      7496   7493   14989      8380   8368   16748
Oromiya                         1391  1391   2782      10101  10140  20241      11492  11531  23023
Somali (1)                      317   269    586       1735   1476   3211       2052   1745   3797
Benishangul-Gumuz               25    25     50        253    248    501        278    273    551
Southern Nations,
Nationalities & Peoples         501   507    1008      5911   5984   11895      6412   6491   12903
Gambella                        19    18     37        91     88     179        110    106    216
Harari                          51    50     101       33     32     65         84     82     166
Addis Ababa                     1237  1333   2570      0      0      0          1237   1333   2570
Dire Dawa Provisional
Administration                  120   119    239       46     45     91         166    164    330
Total                           4924  4962   9886      27851  27607  55458      32775  32569  65344
(1) Backward projected population.
Source: EEPRI Database

[Figure: Bar chart of the total population of Ethiopia, plotting urban,
rural and total population size (male, female and total, in thousands)
for each region in Table 1.1.]

Source: Chart based on Table 1.1 by the course team.

Inferential (Analytical) statistics – is concerned with the process of
using data obtained from a sample to make estimates or test hypotheses
about the characteristics of a population. It consists of a host of
techniques that help decision makers arrive at rational decisions under
uncertainty.

Ideally, data are sought for a large group of elements (individuals,
households, products, etc.). But due to time, cost and other
considerations, data are often collected from only a small portion of
the group. Thus, economists, managers and other decision makers draw
conclusions, make estimates and test hypotheses about the
characteristics of the population from the data for this small portion
of the group. This process is referred to as statistical inference. For
instance, the EEA may want to know the annual income of households (or
individuals) in Ethiopia. In this case, the EEA should collect data on
the income level of households in Ethiopia. This is, however, too costly
and time consuming. Hence, the EEA may collect data from a
representative sample of households and, based on this sample, estimate
(or make inferences about) the annual income of all households.

Whenever statisticians use a sample to make an inference about a
characteristic of a population, they provide a statement of the quality,
or precision, associated with the inference. For the EEA example, the
statistician (economist) might state that the estimate of the average
income of an individual in Ethiopia is $100 per annum with a precision
of ±$11 at a 95% confidence level (a statement of quality).
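Such a statement of precision comes from a simple computation on the sample; a Python sketch (the sample incomes below are invented purely for illustration, and the normal critical value 1.96 is used for the 95% level):

```python
import math

# Hypothetical sample of annual incomes (in dollars), invented for illustration.
sample = [80, 95, 110, 120, 90, 105, 100, 115, 85, 100]

n = len(sample)
mean = sum(sample) / n
# Sample standard deviation (divide by n - 1).
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
# Approximate 95% margin of error using the normal critical value 1.96.
margin = 1.96 * s / math.sqrt(n)

print(f"Estimated mean income: {mean:.1f} +/- {margin:.1f} at 95% confidence")
```

The margin of error shrinks as the sample size grows, which is why a larger sample buys a more precise inference.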

1.3 Statistics and Economics

Statistical data and methods of statistical analysis render reliable
assistance in the proper understanding of economic problems, economic
policy formulation, economic planning and economic forecasting.

For instance, an economist may be asked to forecast inflation rates for
some future period of time. In such a situation, the economist uses
statistical information on indicators like the producer price index
(PPI), the unemployment rate, and manufacturing capacity utilization.
Often these statistical indicators are entered into computerized
forecasting models that predict inflation rates.

Economic problems almost always involve quantities that can be expressed
numerically: wages, prices, outputs (of manufacturing, mining,
agriculture), etc. These numerical magnitudes are the outcomes of a
multiplicity of causes and are subject to variation from time to time,
between places, or among particular cases. Accordingly, the study of
economic problems is especially suited to statistical treatment. A
statistical approach to an economic problem not only leads to its
correct description but also indicates the lines along which the
analysis is to be directed. Generally, statistics is indispensable for
economic policy formulation, planning and forecasting.

Apart from this, the development of economic theory has also been
facilitated by the use of statistics. The complexity of modern economic
organizations has rendered purely deductive reasoning inadequate and
difficult. Statistics is now being used increasingly not only to develop
new economic concepts but also to test old ones.

Indeed, the increasing importance of statistics in the study of economic
problems has resulted in a new branch of study called Econometrics (a
subject to be discussed next year, in the junior syllabus).

So far, we have discussed the meaning of statistics in its plural as
well as its singular sense (as numerical facts and as a field of study),
the subdivisions of statistics (descriptive and inferential statistics)
and the use (application) of statistics in economics. Next, we introduce
some basic concepts (or terminologies) that are important for
understanding the theory and practice of statistics. Lastly, we offer a
caution about using statistics, as it can be misused.

1.4 Some Basic Concepts (Terminologies) in Statistics
In this section, some of the key statistical concepts that underlie the
theory of statistics are discussed.
Data- are the facts and figures that are collected, analyzed, and
summarized for presentation and interpretation.
Data set – all the data collected in a particular study.
Discrete Data – refers to data obtained by counting. It always takes
whole-number values.
Continuous Data – refers to data gathered by measuring and can
include decimal numbers.
Qualitative Data – is data that provide labels or names for a
characteristic of an element. Qualitative data may be numeric or non-
numeric.
Quantitative Data- is data that indicate how much or how many of
something. Quantitative data are always numeric.
Elements – are the entities on which data are collected or an individual
member in the data (or population).
Variable – refers to a characteristic of interest for the elements.
Qualitative variable – is a variable with qualitative data.
Quantitative Variable – is a variable with quantitative data.
Cross-sectional Data – is data collected at the same or approximately
the same point in time.
Time Series Data – refers to data collected at several successive periods
of time.
Observation (cases) – is the set of measurements obtained for a single
element.
Population – is the totality of elements of interest in a particular study.

Sample – is a subset of the population.
• Sampling- is the process of selecting a small number of items
or parts of a larger population to make conclusions about the
population.
• Census/complete Enumeration– is an investigation of all the
individual elements making up the population.
Population Element – refers to an individual member of the population.
Target population – is the specific complete group relevant to the study
or research project.
Population Parameters – are variables in a population or measured
characteristics of the population.
E.g. population mean (µ), population standard deviation (σ), etc. They
are represented (symbolized) by Greek letters.

Sample Statistics – are variables in a sample, or measures computed from
sample data. E.g. sample mean (x̄), sample standard deviation (s), etc.
They are symbolized by lowercase English letters.
Sampling Frame – is the list of elements from which a sample may be
drawn, also called Working population.
Proportion – is the percentage of population elements that successfully
meet some criteria.
Frequency Distribution – a tabular summary of a set of data showing the
frequency (or number) of items in each of several non-overlapping
classes.
Percentage Distribution (percent frequency distribution) – a tabular
summary of a set of data showing the percentage of the total number of
items in each of several non-overlapping classes.

Probability Distribution – a description of how probabilities are
distributed over the values the random variable can assume.

1.5 Limitations of Statistics


Indeed, statistics is of great significance in every branch of knowledge
as long as it is properly applied; but it is not without limitations.
Some of the limitations of statistics are:

 It cannot be applied to all kinds of phenomena, since it deals only
with subjects that can be measured quantitatively and expressed
numerically.
 It deals only with aggregates of facts and does not give any
importance to individual items.
 Statistical analysis is often based on sampling, which may not be
accurate.
 It can be used to establish wrong conclusions if misused.

Statistics may be used to mislead or deceive. Most of the time, people
probably feel that statistics (numerical information) is somehow more
“correct” than non-numerical information, but that is not always the
case. Benjamin Disraeli once made the statement: “there are three kinds
of lies: lies, damned lies, and statistics.” So, we should neither trust
all statistics nor distrust all statistics. The point is well stated in
the book Statistics, A New Approach by W. A. Wallis and H. V. Roberts:
“he who trusts statistics indiscriminately will often be duped
unnecessarily. But he who distrusts statistics indiscriminately will
often be ignorant unnecessarily.” (Cited in Bowen and Starr, 1987)

This concludes our discussion of the first chapter, an introduction to
statistics. In the coming chapter we look at the introductory concepts
of probability theory.

Review Exercises

2.0 Probability Theory: An Introduction

Learning Objectives
When you have completed this chapter, you will be able to:
Define probability
Describe the classical, the empirical, and the subjective approaches to
probability.
Understand the terms employed in the concept of probability
Calculate probabilities applying the rules of addition and the rules of
multiplication under conditions of statistical dependence and independence
Calculate a probability using Bayes’ theorem

2.1 Introduction

In our daily lives we are faced with many decision-making situations
that involve uncertainty. Perhaps you have asked yourself to analyze one
of the following situations:
- What is the chance that I will score “A” in statistics?
- What is the likelihood that our weekend picnic will be successful?
- Etc.

In such situations we use the concept of probability in our daily life
without detailed or exact knowledge of the concept; in other words, we
use it intuitively.

Professionally, much of statistical theory and practice rests on the
concept of probability, since conclusions concerning populations are
drawn from samples, and this is subject to a certain amount of
uncertainty. Besides, you may be asked one of the following:
As a business economist:
- What is the chance that sales or quantity demanded (Qd) will increase
if the price of the product is reduced?
As a project analyst:
- How likely is it that the project will be completed on time?

The subject matter most useful in dealing effectively with such
uncertainties comes under the heading of probability. Probability can be
thought of as a numerical measure of the chance or likelihood that a
particular event will occur. Before we treat the definition of
probability in detail, let us become familiar with some of the basic
concepts (terms) in probability.

2.2 Some Basic Concepts in Probability

Experiment – is a process of observation or measurement, planned in
advance, whose outcome is uncertain.
Example: Taking the ‘Statistics’ course: the outcomes are pass, fail and
drop. Flipping a coin: the outcomes are head or tail.
Sample Space - is the set of all possible outcomes that may occur as a
result of a particular experiment.

E.g. S = {pass, fail, drop} – sample space for the statistics-course experiment
S = {Head, Tail} – sample space for flipping/tossing a coin

Sample Point [Event] – any particular experimental outcome; an event is
a subset of the sample space.
Events may be:
Simple Event – a subset of the sample space that has exactly one sample
point. It is also called an elementary or fundamental event.
Compound Event – a subset of the sample space that has two or more
sample points.
Complement Event – the complement of event A is denoted by A’. A’ is the
event containing all the points in the sample space that are not in A.

E.g. rolling a die: S = {1,2,3,4,5,6}
Event A = {1,3,5,}
Complement event of A, A’ = {2,4,6}.
Impossible Event – an event that cannot occur; it contains none of the
points of the sample space.
E.g. rolling a die: S = {1,2,3,4,5,6}; an outcome such as 7 or 0 can
never occur.
Independent Events – two events are said to be independent when the
happening of one event does not affect the happening of the other.
E.g. two successive rolls of a die.
Dependent Events – two events are said to be dependent when the
occurrence or non-occurrence of one event affects the happening of the
other. E.g. drawing two cards from a deck without replacement.
Mutually Exclusive Events – events are said to be mutually exclusive if
one and only one of them can take place at a time.
Collectively Exhaustive Events/Lists – when a set of events for an
experiment includes every possible outcome, the set is said to be
collectively exhaustive.
E.g. flipping a fair coin twice: S = {HH, HT, TH, TT}
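These set-based definitions can be tried out directly with Python sets; a small sketch using the die and coin examples above:

```python
# Sample space for rolling a die.
S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5}          # compound event: an odd number
A_complement = S - A   # complement event A' = {2, 4, 6}

# Sample space for flipping a fair coin twice.
S2 = {"HH", "HT", "TH", "TT"}
at_least_one_head = {s for s in S2 if "H" in s}

# The events {exactly one head}, {two heads}, {no head} together
# include every possible outcome, so they are collectively exhaustive.
exhaustive = ({"HT", "TH"} | {"HH"} | {"TT"}) == S2

print(A_complement, at_least_one_head, exhaustive)
```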
Having looked at the basic concepts, we now move on to formal
definitions of probability.

2.3 Conceptual Approach to Probability

In fact, experts disagree about the concept of probability, and there
are various conceptual approaches to defining it. The most common are
discussed below:

1. Classical Approach
2. Relative Frequency Approach
3. Subjective Approach
4. Axiomatic Approach

i. Classical Approach (or A Priori Definition of Probability)

This approach is based on the assumption that each of the possible
outcomes is mutually exclusive and equally likely. Algebraically:

Probability of an event = (Number of outcomes where the event occurs) /
(Total number of possible outcomes)

Equally likely means that each outcome of an experiment has the same
chance of happening as any other.

Shortcoming of the Classical Approach

The classical method was originally developed in the analysis of
gambling problems, where the assumption of equally likely and mutually
exclusive outcomes is often reasonable. In many economic and business
problems, however, this assumption is not valid. Hence, we look for
alternative methods of assigning probability.

Example: Tossing a coin: S = {H, T}, P (H) = ½ = 0.5
Rolling a die: P (a given face, say a six) = 1/6
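The classical formula is simply a ratio of counts; a minimal Python sketch of the coin and die examples:

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(event) = favourable outcomes / total outcomes, assuming all
    outcomes are mutually exclusive and equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

coin = {"H", "T"}
die = {1, 2, 3, 4, 5, 6}

print(classical_probability({"H"}, coin))      # 1/2
print(classical_probability({6}, die))         # 1/6
print(classical_probability({1, 3, 5}, die))   # 1/2
```

Using exact fractions rather than floats keeps the counting nature of the classical definition visible.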

ii. Relative Frequency Approach

According to this approach, the probability of an event is the
proportion of times that the event occurs over the long run when the
experiment is repeated many times under uniform conditions.

In the 1800s, British statisticians, interested in a theoretical
foundation for calculating the risk of loss in life insurance and
commercial insurance, began defining probabilities from statistical data
collected on births and deaths. Today, this approach is called the
relative frequency of occurrence.

It defines probability as either:


(1) The observed relative frequency of an event in a very large
number of trials, or
(2) The proportion of times that an event occurs in the long run
when conditions are stable.
Example: Suppose that an insurance company knows from past actuarial
data that, of all males 40 years old, about 60 out of every 100,000 will
die within a one-year period. Using this method, the company estimates
the probability of death for that age group as 60/100,000 = 0.0006.

Exercise: Suppose that 400 of 50,000 fire-insured houses had a fire. A
fire insurance company would like to know the probability of fire for
fire-insured houses. Calculate this probability.

Solution: P (f) = 400/50,000 = 0.008
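Both the actuarial example and the fire-insurance exercise apply the same computation; a sketch:

```python
def relative_frequency(occurrences, trials):
    """Estimate P(event) as the observed proportion over many trials."""
    return occurrences / trials

# The two insurance examples from the text:
p_death = relative_frequency(60, 100_000)   # deaths among 40-year-old males
p_fire = relative_frequency(400, 50_000)    # fires among insured houses

print(p_death, p_fire)
```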

This approach is sometimes referred to as objective probability, since
experiments must be conducted, or recorded data must exist, in order to
compute the probability.

Shortcomings of the Relative Frequency Approach

o People often use it without evaluating a sufficient number of
outcomes.
o This approach seems to imply that probability can play no part in
situations that occur only once.

Thus, another method is required to compute such probabilities.

iii. Subjective (Personal) Approach

Subjective probability can be defined as the probability assigned to an
event by an individual, based on whatever evidence is available. This
evidence may be in the form of the relative frequency of past
occurrences, or it may be just an educated guess.

o Subjective probability assignments are frequently found when events
occur only once or at most a very few times.
o In fact, most high-level social and managerial decisions are concerned
with specific, unique situations rather than with a long series of
identical situations; decision makers at this level make considerable
use of subjective probabilities.

This approach is used when outcomes are not mutually exclusive and there
are no objective data.

Generally, although there are several approaches to probability, we can
use whichever of the aforementioned approaches suits the problem under
consideration.

2.4 Basic Axioms and Theorems of Probability

1. Given a sample space, S, of a random experiment, the probability of
the entire sample space is 1, i.e. P (S) = 1

2. The probability of an event ranges from 0 to 1:
0 ≤ P (A) ≤ 1
Where: A is any event in a random experiment
P (A) is the probability of A

3. If two events A and B are mutually exclusive (disjoint events), then
the probability of either A or B is:
P (A or B) = P (AUB) = P (A) + P (B) … Addition rule for mutually
exclusive events

4. If two events A and B are not mutually exclusive, then the
probability of the occurrence of either event A or B (or both) is:
P (A or B) = P (AUB) = P (A) + P (B) – P (A n B)


5. If A is an event from a sample space, S, and A’ is its complement,
then:
P (A) + P (A’) = 1
P (AUA’) = P (S) = 1

6. If two events A and B are independent, the probability of both events
occurring simultaneously is the product of the individual probabilities.
(Independent events are not necessarily mutually exclusive.)
P (A and B) = P (A n B) = P (A). P (B) … Multiplication rule for
independent events

7. If two events are dependent on each other, the probability of both
occurring simultaneously is given by the probability of one event
multiplied by the probability of the other, given that the first event
has occurred:
P (A n B) = P (A). P (B/A) … Multiplication rule for dependent events
P (A n B) = P (B). P (A/B)
Examples:
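As an illustration of rules 3 to 6, here is a small Python sketch using a single roll of a fair die (the event labels A, B and C are our own choices for the sketch):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for rolling a die

def p(event):
    # Classical probability over the equally likely sample space S.
    return Fraction(len(event), len(S))

A = {1, 2}       # "one or two"
B = {5, 6}       # "five or six"
C = {2, 4, 6}    # "even"

# Rule 3: A and B are mutually exclusive, so P(A U B) = P(A) + P(B).
assert p(A | B) == p(A) + p(B)

# Rule 4: A and C are not mutually exclusive (they share the outcome 2).
assert p(A | C) == p(A) + p(C) - p(A & C)

# Rule 5: an event and its complement have probabilities summing to 1.
assert p(A) + p(S - A) == 1

print(p(A | B), p(A | C))
```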

2.5 Probabilities Under Conditions of Statistical Independence
When two events happen, the outcome of the first event may or may not
have an effect on the outcome of the second event. That is, the events
may be either dependent or independent. In this section, we examine
events that are statistically independent.
Definition: Statistical independence is the case in which the occurrence
of one event has no effect on the probability of the occurrence of any
other event.

There are 3 types of probabilities under statistical independence:


1. Marginal Probability
2. Joint Probability
3. Conditional Probability

1) Marginal Probabilities Under Statistical Independence

A marginal or unconditional probability is the simple probability of the
occurrence of an event.

Example: In a fair coin toss, P (H) = 0.5, that is, the probability of heads
equal 0.5, and the probability of tails equal 0.5. This is true for every toss,
no matter how many tosses have been made or what their outcomes have
been. Every toss stands alone and is in no way connected with any other
toss. Thus, the outcome of each toss of a fair coin is an event that is
statistically independent of the outcomes of every other toss of the coin.

2) Joint Probabilities Under Statistical Independence
The probability of two or more independent events occurring together or
in succession is the product of their marginal probabilities. Mathematically,
this is stated as:
P (A and B) = P (A n B) = P (A). P (B)
Where: P (A n B) = probability of events A and B occurring together or in
succession, this is known as joint probability.
P (A) – Marginal probability of event A occurring
P (B) - Marginal probability of event B occurring

For more than two events: P (A n B n C) = P (A). P (B). P (C)

Example: In terms of the fair coin example, the probability of heads on
two successive tosses is the probability of heads on the first toss
(which we shall call H1) times the probability of heads on the second
toss (H2). The events are statistically independent, because the
probability of heads on any toss is 0.5 regardless of earlier tosses,
and P (H1 n H2) = 0.5 x 0.5 = 0.25. Thus the probability of heads on two
successive tosses is 0.25.

Exercises: 1. What is the probability of getting tails, heads, and
tails, in that order, on three successive tosses of a fair coin?
Solution: P (T1 H2 T3) = P (T1). P (H2). P (T3)
= 0.5 x 0.5 x 0.5 = 0.125
You can also check this using a tree diagram.
2. What is the probability of at least one tail on three tosses?

Solution: “At least one tail” means one, two or three tails. There is
only one outcome in which no tails occur, namely H1H2H3. Therefore, we
can simply subtract to find the answer:
P (at least one tail in 3 tosses) = 1 – P (all heads)
= 1 – P (H1H2H3)
= 1 – 0.125 = 0.875
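The answer can be checked by enumerating all eight equally likely outcomes; a sketch in Python:

```python
from itertools import product

# All 2^3 = 8 equally likely outcomes of three fair-coin tosses.
outcomes = list(product("HT", repeat=3))

# Count outcomes containing at least one tail, and the single TaHT order.
p_at_least_one_tail = sum("T" in o for o in outcomes) / len(outcomes)
p_tht = outcomes.count(("T", "H", "T")) / len(outcomes)

print(p_at_least_one_tail, p_tht)   # 0.875 0.125
```

Only H1H2H3 lacks a tail, so 7 of the 8 outcomes qualify, matching 1 − 0.125.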

3) Conditional Probabilities Under Statistical Independence

Thus far, we have considered two types of probabilities: marginal (or
unconditional) probability and joint probability. Symbolically, marginal
probability is P (A) and joint probability is P (AB). Besides these two,
there is another type of probability, known as conditional probability.

Conditional probability is the probability that a second event (let’s say B)


will occur if a first event (let’s say A) has already happened.
Symbolically: P (B/A) read as probability of B given that event A has
occurred.

- For statistically independent events, the conditional probability of
event B, given that event A has occurred, is simply the probability of
event B:
P (B/A) = P (B)
- Thus, statistical independence can be defined symbolically as the
condition in which P (B/A) = P (B).

24
Example: What is the probability that the second toss of a fair coin
will result in heads, given that heads resulted on the first toss?
Solution: In this case the two events are independent.
Symbolically, the question is written as P (H2/H1).
Using conditional probability under statistical independence,
P (H2/H1) = P (H2) = 0.5
Check Yourself
1. What is the probability that a couple’s second child will be
a) A boy, given that their first child was a girl?
b) A girl, given that their first child was a girl?
Solution:
a) P (b/g) = P (b) = 0.5, since the events are statistically
independent
b) P (g/g) = P (g) = 0.5, since the events are statistically
independent
2. The four floodgates of a small hydroelectric dam fail and are
repaired independently of each other. From experience, it is known
that each floodgate is out of order 4 percent of the time.
a) If floodgate 1 is out of order what is the probability that
floodgates 2 and 3 are out of order?
b) During a tour of the dam, you are told that the chances of
all four floodgates being out of order are less than 1 in
5,000,000. Is this statement true?
Solution:
a) P (2 and3) = P (2 n 3) = P (2). P (3)
= 0.04 x 0.04 = 0.0016

b) P (1n2n3n4) = P (1). P (2). P (3). P (4)
= 0.04 x 0.04 x 0.04 x 0.04
= 0.00000256
Since 0.00000256 is greater than 1/5,000,000 = 0.0000002, the statement
is false.

2.6 Probabilities Under Conditions of Statistical Dependence

Definition: Statistical dependence exists when the probability of some
event is dependent upon, or affected by, the occurrence of some other
event.

As with statistically independent events, there are three probability
types for statistically dependent events.

1) Conditional Probabilities Under Statistical Dependence

If the occurrence or non-occurrence of an event depends on the
occurrence or non-occurrence of another event, the conditional
probability of the event, given the other event, can be computed as
follows:

Conditional probability of A given B = (Joint probability of A and B) /
(Probability of B)

Symbolically: P (A/B) = P (A n B) / P (B)
P (B/A) = P (B n A) / P (A)

Example:

2) Joint Probabilities Under Statistical Dependence

Given the formula for computing conditional probability under
statistical dependence, a minor arithmetic rearrangement yields the
formula for joint probabilities under statistical dependence:

Symbolically: P (A n B) = P (A). P (B/A)
= P (B). P (A/B)
Example:

3) Marginal Probabilities Under Statistical Dependence

Marginal probabilities under statistical dependence are computed by
summing up the probabilities of all the joint events in which the simple
event occurs. For example, if events B and C together cover every
possibility:

Symbolically: P (A) = P (AB) + P (AC)

Where: P (A) represents the probability of event A
P (AB) refers to the probability of the joint occurrence of events A and B
P (AC) refers to the probability of the joint occurrence of events A and C

Example:
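The three probability types can be tied together with a small joint-probability table; a Python sketch (the probabilities below anticipate the two-car survey exercise that follows, with the remaining cells filled in by subtraction):

```python
# Hypothetical joint probabilities for two dependent events:
# A = "owns two cars", B = "annual income > $35,000".
p_joint = {
    ("A", "B"): 0.45,
    ("A", "not B"): 0.07,
    ("not A", "B"): 0.15,
    ("not A", "not B"): 0.33,
}

# Marginal probability: sum the joint probabilities involving the event.
p_A = p_joint[("A", "B")] + p_joint[("A", "not B")]    # P(A)
p_B = p_joint[("A", "B")] + p_joint[("not A", "B")]    # P(B)

# Conditional probability: joint probability divided by the marginal.
p_A_given_B = p_joint[("A", "B")] / p_B                # P(A/B)

print(round(p_A, 2), round(p_B, 2), round(p_A_given_B, 2))
```

Note that P(A/B) = 0.75 differs from P(A) = 0.52, which is exactly what statistical dependence means.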

Check yourself

1. According to a survey in developed countries, the probability that a
family owns two cars, if its annual income is greater than $35,000, is
0.75. Of the households surveyed, 60 percent had incomes over $35,000
and 52 percent had two cars. What is the probability that a family has
two cars and an income over $35,000 a year?
Solution:
Given: P (2 Cars / >$35,000I) = 0.75
P (>$35,000I) = 0.6, P (2 Cars) = 0.52
Required: P (2 Cars and >$35,000I) = ?
P (2C n >$35,000I) = P (>$35,000I). P (2C / >$35,000I)
= 0.6 x 0.75 = 0.45
2. Two events, A and B, are statistically dependent. If P (A) = 0.39,
P (B) = 0.21, and P (A or B) = 0.47, find the probability that
i. Neither A nor B will occur
ii. Both A and B will occur
iii. B will occur, given that A has occurred
iv. A will occur, given that B has occurred

Solution:
Given: P (A) = 0.39    P (B) = 0.21    P (A or B) = P (A U B) = 0.47

i.  P (neither A nor B) = 1 – P (A U B) = 1 – 0.47 = 0.53
ii. P (A n B) = [P (A) + P (B)] – P (A U B) = [0.39 + 0.21] – 0.47 = 0.60 – 0.47 = 0.13
iii. P (B/A) = P (A n B) / P (A) = 0.13 / 0.39 = 0.333
iv. P (A/B) = P (A n B) / P (B) = 0.13 / 0.21 = 0.619

2.6 BAYES' THEOREM (Revising Prior Probabilities and Estimating Posterior Ones)
In our discussion of conditional probability, we indicated that revising
probabilities when new information is obtained is an important phase of
probability analysis. Often, we begin our analysis with initial or prior
probability estimates for specific events of interest. Then, from sources
such as a sample, a special report, or some other means, we obtain some
additional information about the events. Given this new information, we update the prior probability values by calculating revised probabilities, referred to as posterior probabilities. Bayes' theorem provides a means for making these probability calculations. The steps in this probability revision process are shown in the figure below.

Prior Probabilities → New Information → Application of Bayes' Theorem → Posterior Probabilities

Figure 2.1: Revising prior probabilities and estimating posterior probabilities

Example: An application Of Bayes’ Theorem

Consider a manufacturing firm that receives shipments of parts from two


different suppliers. Let A1 denote the event that a part is from supplier 1
and A2 denote the event that a part is from supplier 2. Currently, 65% of
the parts purchased by the company are from supplier 1 and the
remaining 35% are from supplier 2. Hence, if a part is selected at
random, we would assign the prior probabilities P (A1) = .65 and P (A2) =
.35.

The quality of the purchased parts varies with the source of supply.
Historical data suggest that the quality ratings of the two suppliers are as
shown in the table below.

Table 2.1: Historical Quality Levels of Two Suppliers

                Percentage Good Parts    Percentage Bad Parts
Supplier 1              98                        2
Supplier 2              95                        5

If we let G denote the event that a part is good, and B denote the event
that a part is bad, the information in table 2.1 provides the following
conditional probability values.

P (G/A1) = 0.98 P (B/A1) = 0.02


P (G/A2) = 0.95 P (B/A2) = 0.05

Based on the above information we can compute the joint probabilities that a part is good and comes from supplier 1, good and from supplier 2, bad and from supplier 1, and bad and from supplier 2.

P (A1 n G) = P (A1) P (G/A1) = 0.65 x 0.98 = 0.6370
P (A1 n B) = P (A1) P (B/A1) = 0.65 x 0.02 = 0.0130
P (A2 n G) = P (A2) P (G/A2) = 0.35 x 0.95 = 0.3325
P (A2 n B) = P (A2) P (B/A2) = 0.35 x 0.05 = 0.0175

Suppose now that the parts from the two suppliers are used in the firm’s
manufacturing process and that a machine breaks down because it
attempts to process a bad part. Given the information that the part is bad, what is the probability that it came from supplier 1 and what is the
probability that it came from supplier 2?

With the prior probabilities and the joint probabilities, Bayes' theorem can be used to answer these questions. Letting B denote the event that the part is bad, we are looking for the posterior probabilities P (A1/B) and P (A2/B). From the laws of conditional and marginal probability, we know that:

P (A1/B) = P (A1 n B) / P (B)
P (A1 n B) = P (A1) . P (B/A1)  and  P (A2 n B) = P (A2) . P (B/A2)
P (B) = P (A1 n B) + P (A2 n B) = P (A1) P (B/A1) + P (A2) P (B/A2)

Substituting the above equations, we obtain Bayes’ theorem for the case
of two events.

P (A1/B) = P (A1 n B) / [P (A1 n B) + P (A2 n B)]
         = P (A1) P (B/A1) / [P (A1) P (B/A1) + P (A2) P (B/A2)]

P (A2/B) = P (A2 n B) / [P (A1 n B) + P (A2 n B)]
         = P (A2) P (B/A2) / [P (A1) P (B/A1) + P (A2) P (B/A2)]

Using the above formula:

P (A1/B) = (0.65 x 0.02) / [(0.65 x 0.02) + (0.35 x 0.05)] = 0.0130 / 0.0305 = 0.4262
P (A2/B) = (0.35 x 0.05) / [(0.65 x 0.02) + (0.35 x 0.05)] = 0.0175 / 0.0305 = 0.5738

Note that in this application we started with a probability of .65 that a part
selected at random was from supplier 1. However, given information that
the part is bad, the probability that the part is from supplier 1 drops to
.4262. In fact, if the part is bad, there is a better than 50-50 chance that
the part came from supplier 2; that is, P (A2/B) = .5738.
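The supplier computation above can be reproduced in a few lines of Python; this sketch applies Bayes' theorem directly to the prior probabilities and bad-part rates given in the example:

```python
# Prior probabilities and bad-part rates from the supplier example
priors = {"A1": 0.65, "A2": 0.35}
p_bad_given = {"A1": 0.02, "A2": 0.05}

# Total probability that a randomly chosen part is bad: P(B)
p_bad = sum(priors[s] * p_bad_given[s] for s in priors)  # 0.0305

# Posterior probability of each supplier given a bad part (Bayes' theorem)
posterior = {s: priors[s] * p_bad_given[s] / p_bad for s in priors}
# posterior["A1"] is about 0.4262, posterior["A2"] about 0.5738
```

The same dictionary pattern extends directly to the n-event case discussed next.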

Bayes’ theorem is applicable when the events for which we want to
compute posterior probabilities are mutually exclusive and their union is
the entire sample space. Bayes’ theorem can be extended to the case
where there are n mutually exclusive events A1, A2,…, An whose union is
the entire sample space. In such case, Bayes’ theorem for computing
posterior probability P (Ai/B) can be written symbolically as:
P (Ai/B) = P (Ai) P (B/Ai) / [P (A1) P (B/A1) + P (A2) P (B/A2) + ... + P (An) P (B/An)]

Bayes' theorem calculations can also be carried out using a tabular approach or a tree diagram.

Check yourself
Once in the night, a speeding taxi struck a man as he crossed the street.
An eyewitness has testified that she thought the taxi (which did not stop)
was blue. The man sued the Blue cab company for his medical expenses.
The city where the accident occurred has only two taxi companies: Blue
cab and Green cab. Green cab has 85 percent of the taxis’ in the city. At
the trial, the man’s lawyer shows that the eyewitness is 80 percent reliable
in identifying the color of taxis. That is, she was able to identify correctly
the color of taxis 80 percent of the time, under conditions like those of the
night accident. The lawyer concludes that it is extremely likely that a Blue cab hit the man. Do you agree? Why or why not?

Solution:
Given: B = the taxi was Blue    G = the taxi was Green
E = the eyewitness thought the taxi was blue
P (E/B) = 0.8    P (E/G) = 0.2
P (B) = 0.15     P (G) = 0.85
Required: P (B/E) = ?

P (B/E) = P (E/B) P (B) / [P (E/B) P (B) + P (E/G) P (G)]
        = (0.8 x 0.15) / [(0.8 x 0.15) + (0.2 x 0.85)] = 0.12 / 0.29 = 0.41

Since P (B/E) is only about 0.41, it is more likely that the taxi was Green, so the lawyer's conclusion does not hold.

Review Exercises

3.0 THEORETICAL PROBABILITY
DISTRIBUTION

Learning Objectives
When you have completed this chapter, you will be able to:
Define the terms probability distribution and random variable
Distinguish between a discrete and continuous probability distribution
Calculate the mean, variance, and standard deviation of a discrete
probability distribution
Describe the characteristics and compute probabilities using the binomial
probability distribution
Describe the characteristics and compute probabilities using the Poisson
probability distribution
Describe the characteristics and compute probabilities using the hyper
geometric probability distribution
Describe the characteristics and compute probabilities using the uniform,
normal and exponential probability distributions
Describe how to approximate different probability distributions and the
conditions necessary for approximating a probability distribution by other.

3.1 Random Variables and Probability Distribution

Random Variable: a random variable is a numerical description of the outcome of an experiment. The particular numerical value of the random variable depends on the outcome of the experiment, i.e., the value of the random variable is not known until the experiment's outcome is observed.
- Random variables can be classified as either Discrete or Continuous depending on the numerical values they assume.

i. Discrete Random Variables: a random variable that may assume either a finite number of values or an infinite sequence of values such as 0, 1, 2, … is referred to as a discrete random variable.

Examples:
Experiment                                Random variable               Possible values
Take a 20 multiple-choice question        Number of questions           0, 1, 2, 3, …, 20
examination                               answered correctly
Operate a restaurant for one day          Number of customers           0, 1, 2, 3, …
Sell an automobile                        Gender of customer            0 = if female; 1 = if male

Although many experiments have outcomes that are naturally described by


numerical values, others do not.

ii. Continuous Random Variables: - are variables that assume any


numerical value in an interval or collection of intervals. Experimental
outcomes that are based on measurement scales such as time,
weight, distance and temperature can be described by continuous
random variables.

Examples:
Experiment                                Random variable (X)                 Possible values
Operate a bank                            Time between customer               X ≥ 0
                                          arrivals in minutes
Work on a project to construct            Percentage of project               0% ≤ X ≤ 100%
a new library                             complete after six months

Note: One way to determine whether a random variable is discrete or


continuous is to think of the values of the random variable as points on a
line segment. Choose two points representing values of the random
variable. If the entire line segment between the two points also
represents possible values for the random variable, the random variable is
continuous.

Probability Distribution: - the probability distribution for a random


variable describes how probabilities are distributed over the values of the
random variable.

i. Discrete Probability Distribution: if X is a discrete random variable assuming values X1, X2, …, Xn with associated probabilities P (X1), P (X2), …, P (Xn), then the set of pairs

X1   P (X1)
X2   P (X2)
.    .
.    .
Xn   P (Xn)

is called the probability distribution of X.

Conditions for a discrete probability function: P (X) ≥ 0 and ∑P (X) = 1

Examples:
Suppose that a fair die is thrown once. The outcomes are the number of dots. Let X be a random variable that represents the number of dots of the die and P (X) the probability:

Random variable (Xi)          Probability P (Xi)
Number of dots
1 ……………………..             1/6
2 ……………………..             1/6
3 ……………………..             1/6
4 ……………………..             1/6
5 ……………………..             1/6
6 ……………………..             1/6
                              ∑P (Xi) = 1

(Graphically, the distribution is a bar chart of P (Xi) against Xi, with each of the six bars at the same height, 1/6.)

The above example illustrates the uniform discrete probability distribution, where each value of the random variable has the same probability of being observed.

However, it is not always the case that each value of the random variable has an equal chance of occurrence; there are cases where each value of the random variable has a different probability of being observed, referred to as a non-uniform discrete probability distribution. A good example of this is tossing a coin twice, three times, etc. Tossing a coin three times, what is the probability of observing 0, 1, 2 or 3 heads out of the 8 possible outcomes?

Sample space (S) = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Random variable Probability
Number of Heads (Xi) P (Xi)

0…………………….. 1/8
1…………………….. 3/8
2…………………….. 3/8
3…………………….. 1/8

P (0 heads) = (number of outcomes where the event occurs) / (total possible outcomes) = 1/8

P (1 head) = (number of outcomes where the event occurs) / (total possible outcomes) = 3/8
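The eight-outcome distribution above can be checked by brute-force enumeration; this is a sketch in Python:

```python
from itertools import product

# Enumerate the 8 equally likely outcomes of tossing a fair coin three times
outcomes = list(product("HT", repeat=3))

# Build the probability distribution of X = number of heads
dist = {}
for outcome in outcomes:
    x = outcome.count("H")
    dist[x] = dist.get(x, 0) + 1 / len(outcomes)
# dist maps 0 -> 1/8, 1 -> 3/8, 2 -> 3/8, 3 -> 1/8
```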

Cumulative Probability Distribution: if X1, X2, X3, …, Xn are different values of X given in increasing order, then the cumulative probability of the first K values is given by:

P (X ≤ Xk) = P (X1) + P (X2) + … + P (Xk) = ∑ P (Xi), summed from i = 1 to K

Example: Taking the last example, what is the probability that heads occur at least twice?

P (at least 2 heads) = P (2) + P (3) = 3/8 + 1/8 = 4/8 = 1/2

ii. Continuous Probability Distribution (Probability Density Function – PDF)
Assume that a random variable X varies continuously from x1 to xk; we define probabilities over the PDF f (x) by the integral:

P (x1 ≤ X ≤ xk) = ∫ f (x) dx, integrated from x1 to xk

Cumulative probability distribution: for a continuous random variable with PDF f (x), the probability that X will assume any value less than or equal to xk is given by:

F (xk) = P (X ≤ xk) = ∫ f (x) dx, integrated from −∞ to xk

Properties of the cumulative probability function F

1. 0 ≤ F (x) ≤ 1
2. If x1 < x2, then F (x1) ≤ F (x2)
3. If x1 < x2, then P (x1 < X ≤ x2) = F (x2) – F (x1)
4. lim F (x) = 0 as x → −∞
5. lim F (x) = 1 as x → ∞

Example:
Consider a continuous random variable X that can assume values between 2 and 6 with equal probability. The probability density function is f (x) = ¼. What is the probability that X will be smaller than or equal to 5?
Solution:
P (2 ≤ X ≤ 5) = ∫ ¼ dx, integrated from 2 to 5, = ¼ (5) – ¼ (2) = ¾

A second example: f (x) = Kx² for 0 ≤ x ≤ 2, and 0 elsewhere. Find the value of K for which f (x) is a valid PDF.

Solution: P (0 ≤ X ≤ 2) = ∫ Kx² dx, integrated from 0 to 2, = 1
        = K [x³/3] evaluated from 0 to 2
        = K [⅓(2³) – ⅓(0³)] = K (8/3) = 1, so K = 3/8
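The value K = 3/8 can be checked numerically; this sketch integrates f(x) = (3/8)x² over [0, 2] with a simple midpoint rule and confirms the total area is 1:

```python
# Check that K = 3/8 makes f(x) = K x^2 a valid PDF on [0, 2]
def f(x, k=3 / 8):
    return k * x ** 2

# Midpoint-rule numerical integration over [0, 2]
n = 10_000
width = 2 / n
area = sum(f((i + 0.5) * width) * width for i in range(n))
# area is approximately 1
```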

3.2 Mathematical Expectation

The expected value, or mean (or mathematical expectation); of a random


variable of the is the measure of the central location for the random
variable.

Expected Value of a Random Variable: the expected value of a discrete random variable X, denoted by E (X), is the weighted mean of the possible values that the random variable can assume, where the weight attached to each value is the probability that the random variable will assume this value. In other words:

E (X) = ∑ xi P (xi) …… for a discrete random variable
E (X) = ∫ x f (x) dx …… for a continuous random variable

Rules of Expected Values

1. If k is a constant, then E (k) = k
2. If a and b are constants, then E (aX + b) = aE (X) + b (expected value of a linear function).
3. The mathematical expectation of the sum of two or more random variables is equal to the sum of the expectations of the individual random variables, i.e., E (X + Y + Z) = E (X) + E (Y) + E (Z)
4. If X and Y are independent random variables, then E (XY) = E (X) . E (Y); in general E (XY) ≠ E (X) . E (Y) for dependent random variables.
5. The expected value of the ratio of two random variables is not equal to the ratio of their expected values, i.e., E (X/Y) ≠ E (X) / E (Y)

Examples:
1. A real-estate agent sells 0,1, or 2 houses each working week
with respective probabilities 0.5, 0.3, and 0.2. Compute the
expected value of the number of houses sold per week?

Solution: E (X) = ∑ Pi Xi = (0.5 x 0) + (0.3 x 1) + (0.2 x 2) = 0.7

2. Suppose you roll a fair die. Calculate the expected value of the outcome.

E (X) = ∑ Xi Pi = (1 x 1/6) + (2 x 1/6) + (3 x 1/6) + (4 x 1/6) + (5 x 1/6) + (6 x 1/6) = 3.5
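Both expected-value examples can be verified with a small helper function; this is a sketch in Python:

```python
# Expected value of a discrete random variable: probability-weighted mean
def expected_value(values, probs):
    return sum(v * p for v, p in zip(values, probs))

# Real-estate example: 0, 1 or 2 houses sold with probabilities 0.5, 0.3, 0.2
houses = expected_value([0, 1, 2], [0.5, 0.3, 0.2])  # 0.7

# Fair-die example: faces 1..6, each with probability 1/6
die = expected_value(range(1, 7), [1 / 6] * 6)  # 3.5
```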

3.3 Variance and Standard Deviation of a Random Variable

The variance measures how individual values are spread, dispersed or distributed around the mean or expected value.

The variance of a random variable X, denoted by σ²(x), is the expected value of the squared deviations of the random variable from its expected value:

σ²(x) = E[(xi – E (x))²] = E[(xi – µx)²] = ∑ (xi – E (x))² P (xi)

Where: xi is the random variable and P (xi) is the probability of its equaling xi.

Or: σ²(x) = E (x²) – [E (x)]²

Standard Deviation: σ(x) = √Var (x) = √(∑ (xi – E (x))² P (xi)) = √(E (x²) – [E (x)]²)

Properties of Variance
1. The variance of a constant is zero.
2. If X and Y are two independent random variables, then Var (X + Y) = Var (X) + Var (Y)
3. If b is a constant, then Var (X + b) = Var (X)
4. If a is a constant, then Var (aX) = a² Var (X)
5. If X and Y are independent random variables and a and b are constants, then Var (aX + bY) = a² Var (X) + b² Var (Y)
Example:
1. Mr. Tujar buys a stock whose return (including both dividends and
change in price of stock) depends on whether the nation’s GNP is
rising, constant, or falling. If the GNP is rising, the return is 20
percent (i.e., 20 cents per Birr); if it is constant, the return is 5
percent; and if it is falling, the return is -10 percent. If he believes
that it is equally likely that the GNP will rise, remain constant, or fall.
What is the expected value of the return from this stock? And what
are the variance and standard deviation of this stock’s return.
Solution: If it is equally likely that the GNP will rise, remain constant or fall,
the probability of each of these outcomes must equal 1/3. Thus, the
expected value is:
E (X) = ∑ Pi Xi = 20 (1/3) + 5 (1/3) + (–10)(1/3) = 5 percent

The variance is calculated as follows:

σ²(x) = ∑ (xi – E (x))² P (xi)
      = (20 – 5)² (1/3) + (5 – 5)² (1/3) + (–10 – 5)² (1/3)
      = 225/3 + 0/3 + 225/3 = 450/3 = 150

S.D. σ(x) = √Var (x) = √150 = 12.25 percent.
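The stock-return calculation above can be reproduced with a short variance helper; this is a sketch in Python:

```python
# Variance of a discrete random variable: E[(x - E(x))^2]
def expected_value(values, probs):
    return sum(v * p for v, p in zip(values, probs))

def variance(values, probs):
    mu = expected_value(values, probs)
    return sum((v - mu) ** 2 * p for v, p in zip(values, probs))

# Stock-return example: returns of 20, 5 and -10 percent, each with probability 1/3
returns = [20, 5, -10]
probs = [1 / 3, 1 / 3, 1 / 3]
var = variance(returns, probs)  # 150
sd = var ** 0.5                 # about 12.25
```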

3.4 CHEBYSHEV’S INEQUALITY

Once we know the standard deviation of a random variable, we can make some interesting statements about the extent of the dispersion or variability among the values that the random variable can assume. In particular, we can apply the following theorem developed by the 19th century Russian mathematician P. Chebyshev.

Chebyshev's inequality: for any random variable, the probability that the random variable will assume a value within K standard deviations of the random variable's expected value is at least 1 – 1/K². In other words, this theorem tells us that the probability that a random variable will assume a value more than K standard deviations from the random variable's expected value is at most 1/K².

Example: You are given the expected value and standard deviation of the profits to be made from a particular business venture. The expected value is 400,000 Birr and the standard deviation is 100,000 Birr. What is the probability that the profits from this venture will be below zero or above 800,000 Birr? And what is the probability that the profit will be between 200,000 Birr and 600,000 Birr?
Solution:
In this case, if you knew the probability distribution of profits you could figure out this probability exactly; but since it is not given, use Chebyshev's inequality to determine the bounds the probability can possibly take:

o The probability of the profits below 0 and above 800,000 Birr is the
same as the probability that the profits will assume a value more
than 4 standard deviation from the profits expected value. Thus,
the maximum amount of this probability can be 1/K2, which is equal
to 1/42 = 1/16.
o The probability that the profit will be between 200,000 and 600,000 Birr is the same as the probability that the profit will assume a value within 2 standard deviations of the profit's expected value. Thus, the probability is at least 1 – 1/2² = 1 – ¼ = ¾.
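The two Chebyshev bounds from the business-venture example can be computed directly; this is a sketch in Python:

```python
# Chebyshev bounds for the business-venture example
mu = 400_000     # expected profit in Birr
sigma = 100_000  # standard deviation in Birr

# Profits below 0 or above 800,000 Birr lie more than k = 4
# standard deviations from the mean
k_outer = (800_000 - mu) / sigma  # 4.0
p_outside_max = 1 / k_outer ** 2  # at most 1/16 = 0.0625

# Profits within 2 standard deviations of the mean (200,000 to 600,000 Birr)
k_inner = 2
p_inside_min = 1 - 1 / k_inner ** 2  # at least 3/4 = 0.75
```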

Covariance (Cov.)

The expected value and variance are commonly used summary measures of a univariate PDF. But once we go beyond the univariate PDF, in addition to the mean and variance of each variable, some additional characteristics of multivariate PDFs, such as covariance and correlation, need to be considered.

Let X and Y be two random variables with means E (X) = µx and E (Y) = µy. The covariance between the two variables is defined as:

Cov (X, Y) = E [(x – µx)(y – µy)] = E (XY) – µx µy

For a frequency distribution: Cov (X, Y) = ∑ (xi – x̄)(yi – ȳ) / N

Note that the covariance can assume negative values.

3.5 Discrete Probability Distributions

The basic discrete probability distributions are:

1. The Binomial probability distribution
2. The Poisson probability distribution
3. The Hypergeometric probability distribution

3.5.1 The Binomial Probability Distribution

This distribution is one of the most widely used probability distributions of a discrete random variable. It describes discrete, not continuous, data resulting from an experiment known as a Bernoulli process (or experiment). This distribution was first developed by the 17th century Swiss mathematician Jacob Bernoulli.

Properties of Binomial Experiment


1. The experiment consists of a series of n-identical trials.

2. In each trial there are only two possible outcomes. We refer to
one outcome as ‘success’ and the other as ‘failure’.
3. The probability of a success on one trial is denoted by P and does not change from one trial to another. The probability of a failure, denoted by q, which is equal to 1 – P, also does not change from trial to trial. (Stationarity assumption)
4. Statistically, the trials are independent.

If properties 2,3 and 4 are present we say that a Bernoulli process


generates trials.
If property 1 is present in addition to the three, we say that we have
a binomial experiment.

To illustrate, we can use the outcomes of a fixed number of tosses of a fair


coin as an example of a Bernoulli process.

1. We can toss a coin many times.


2. Each toss has only two possible outcomes: head or tail
3. The probability of the outcome of any toss remains unchanged
overtime. With P (H) remains 0.5 for each toss regardless of the
number of times the coin is tossed.
4. The outcome of one toss does not affect the outcome of any
other toss.

In a binomial experiment our interest is in the number of successes


occurring in the n trials. If we let X denote the number of successes in n
trials, we see that X can assume the values of 0,1,2,3…n. Since the

number of values is finite, X is a discrete random variable. The probability
distribution associated with this random variable is binomial probability
distribution.

Let r denote the number of successes in n trials; then r is a random variable that can assume the values 0, 1, 2, 3, …, n. If the probability of the happening of an event is P, the probability that it will happen in exactly r out of n occasions is given by:

P (r) = (n choose r) P^r (1 – P)^(n–r)
      = nCr P^r q^(n–r)
      = [n! / (r!(n – r)!)] P^r q^(n–r)

Where: P = probability of success


q = 1-p = probability of failure
n = number of trials undertaken
r = number of successes desired.

Examples:
Suppose a company produces toothpaste. Historically, eight-tenths of the
toothpaste tubes were correctly filled (successes). What is the probability
of getting exactly three of six tubes (half a carton) correctly filled?
Solution:
Given: P = 0.8    r = 3
       q = 0.2    n = 6; then, using the binomial formula:

Probability of r successes in n trials = [n! / (r!(n – r)!)] P^r q^(n–r)

P (3) = [6! / (3!(6 – 3)!)] (0.8)³ (0.2)³
      = [(6 x 5 x 4 x 3!) / (3 x 2 x 1 (3!))] (0.512)(0.008)
      = 20 (0.512)(0.008)
      = 0.08192

Interpretation: the probability of getting exactly 3 tubes out of six that are correctly filled is 0.08192.
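The binomial formula above translates directly into code; this sketch reproduces the toothpaste calculation:

```python
from math import comb

# Binomial probability of exactly r successes in n trials
def binomial_pmf(r, n, p):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Toothpaste example: n = 6 tubes, p = 0.8 correctly filled, r = 3
prob = binomial_pmf(3, 6, 0.8)  # 0.08192
```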

Binomial tables are available: of course, we could have solved the above
problem by using binomial probability tables. In order to use the table,
take n, p and r values and look for the probability value that corresponds
to n, p and r.

Graphic Illustration of the Binomial Distribution

To this point, we have dealt with the binomial distribution in terms of the
binomial formula and table, but the binomial, like any other distribution,
can be expressed graphically as well. You should understand that there is

not just one binomial distribution. Rather, there is a different distribution
for each different pair of n, p values.

Figure 1: Family of binomial probability distributions with constant n = 5 and various p and q (p = 0.1, 0.3, 0.5, 0.7 and 0.9, with q = 1 – p). Each panel plots the probability of r successes against r = 0, 1, …, 5.

Figure 2: Family of binomial distributions with constant p = 0.4 and various n (n = 5, n = 10 and n = 30). Each panel plots the probability of r successes against r.

General Appearance of Binomial Distributions

From Figure 1, with n constant and various p and q, we can make the following generalizations:
 When p is small, the binomial distribution is skewed to the right
 When p = 0.5, the binomial distribution is symmetrical
 When p is larger than 0.5, the binomial distribution is skewed to the left

From Figure 2, with p constant and various n, we can make the following generalization: as n increases, the vertical lines not only become more numerous but also tend to bunch up together to form a bell shape.

Applications of Binomial Distribution

The binomial distribution is applied extensively to sampling problems. In these applications it is customary to refer to the size of the sample rather than the number of trials.1
Example: A tyre wholesaler has 500 super-brand tyres in stock, and 50 tyres with slightly damaged steel belting are randomly mixed into the stock. A retailer buys 10 tyres. What is the probability that the retailer receives 8 undamaged tyres?
Solution:
n = 10    P = 450/500 = 0.9
r = 8     q = 50/500 = 0.1

P (8) = [10! / (8!(10 – 8)!)] (0.9)⁸ (0.1)²
      = [(10 x 9 x 8!) / (8!(2 x 1))] (0.9)⁸ (0.1)²
      = 45 (0.9)⁸ (0.1)²
      = 0.194

The binomial formula gives the probability of exactly r successes. In many real-world problems, however, we are interested in cumulative probabilities, such as the probability of at most three successes or of more than 5 successes out of a specified n. Suppose n = 10 and p = 0.4; then the probability of at most three successes would be: P (0) + P (1) + P (2) + P (3)

1 To calculate P for a population, use the formula P = R/N, where R is the number of successes in the population and N is the size of the population. The binomial requirements may be relaxed when n is small compared with N (if n < 0.05 N).

Similarly, the probability of more than 5 successes would be: P (6) + P (7) + P (8) + P (9) + P (10).

Therefore, the cumulative (cum.) probability of r is:

Cum. P (r) = P (0) + P (1) + P (2) + … + P (r) = ∑ P (i), summed from i = 0 to r …… 1

P (at most r) = cum. P (r) ………………………….. 2
P (less than r) = cum. P (r – 1) ……………………. 3
P (more than r) = 1 – cum. P (r) …………………… 4
P (exactly r) = cum. P (r) – cum. P (r – 1) ………… 5

Illustration: Consider the following cumulative binomial probabilities for n = 10 and p = 0.4.

Number of successes (r)    Cum. Prob. ∑P (r)
0 ……………               0.00605
1 ……………               0.04636
2 ……………               0.16729
3 ……………               0.38228
4 ……………               0.63310
5 ……………               0.83376
6 ……………               0.94524
7 ……………               0.98771
8 ……………               0.99833
9 ……………               0.99990
10 ……………              1.00000

1. What is the probability of fewer than five successes? Ans. 0.63310
2. What is the probability of more than five successes? Ans. 0.16624
3. What is the probability of at least four successes? Ans. 0.61772
4. What is the probability of at most three successes? Ans. 0.38228
5. What is the probability of exactly six successes? Ans. 0.11148
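These cumulative probabilities can be computed directly rather than read from the table; this is a sketch in Python:

```python
from math import comb

def binomial_pmf(r, n, p):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def binomial_cdf(r, n, p):
    # Cumulative probability of at most r successes
    return sum(binomial_pmf(i, n, p) for i in range(r + 1))

n, p = 10, 0.4
fewer_than_5 = binomial_cdf(4, n, p)                       # about 0.63310
more_than_5 = 1 - binomial_cdf(5, n, p)                    # about 0.16624
at_least_4 = 1 - binomial_cdf(3, n, p)                     # about 0.61772
exactly_6 = binomial_cdf(6, n, p) - binomial_cdf(5, n, p)  # about 0.11148
```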

Mean (Expected Value) and Variance of a Binomial Distribution

Mean (expected value) of a binomial distribution: µ = nP
Variance of a binomial distribution: σ² = nPq = nP (1 – P)
Standard deviation: σ = √(nPq)

Where: n = number of trials, µ = mean, P = probability of success, q = 1 – P = probability of failure, σ² = variance, σ = standard deviation.

Example:
Take the case of a packaging machine that produces 20 percent defective packages. If we take a random sample of 10 packages, compute the mean (expected value) and the standard deviation of the binomial distribution of that process.

Solution: µ = nP = 10 x 0.2 = 2    σ = √(nPq) = √(10 x 0.2 x 0.8) = √1.6 = 1.265
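The packaging-machine figures follow directly from the formulas; a minimal sketch:

```python
# Mean, variance and standard deviation of a binomial distribution
n, p = 10, 0.2
q = 1 - p

mean = n * p          # 2.0
variance = n * p * q  # 1.6
sd = variance ** 0.5  # about 1.265
```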

3.5.2 The Poisson Probability Distribution

The Poisson distribution is named for its originator, Simeon Denis Poisson (1781 – 1840), a Frenchman who developed the distribution from studies during the latter part of his lifetime.

It is useful when dealing with the number of occurrences of an event over
a specified interval of time or space.

Properties of the Poisson Distribution (Conditions Leading to a Poisson Distribution)
1. The probability of an occurrence of the event is the same for any
two intervals of equal length.
2. The occurrence or non-occurrence of the event in any interval is independent of the occurrence or non-occurrence in any other interval.

The Poisson probability function is given by:²

P (x) = µ^x e^(−µ) / x!
Where: P(x) = Probability of x occurrence in an interval

µ = The expected value or the average number of occurrences in

an interval
e = a constant equal to 2.71828 …

Examples of Poisson distributions include the distribution of telephone calls going through a switchboard system, the demand of patients for service at a health institution, the arrivals of trucks and cars at a tollbooth, the number of accidents at an intersection, etc.
Example:

² In some texts P (x) = λ^x e^(−λ) / x!, where λ is the average number of occurrences in an interval.

1) A certain restaurant has a reputation for good food. The restaurant
management boasts that on a Saturday night, groups of customers arrive
at a rate of 15 groups every half an hour, on average.
a) What is the probability that 5 minutes will pass with no groups of
customers arriving?
b) What is the probability that 8 groups of customers will arrive in 10
minutes?
Solutions:
a) Given: 15 groups arrive in 30 minutes, so on average 2.5 groups arrive in 5 minutes; µ = 2.5.

P (x) = µ^x e^(−µ) / x!
P (0) = (2.5)^0 e^(−2.5) / 0! = e^(−2.5) = 0.0821 ♣

b) 15 groups arrive in 30 minutes, so on average 5 groups arrive in 10 minutes; µ = 5.

P (8) = 5^8 e^(−5) / 8! = 0.0653 ♣


Fortunately, the answers obtained using hand calculations can also be found by looking them up in the Poisson probabilities table, without the tedious work.

Just as with the binomial distribution, the Poisson distribution also
involves cumulative probabilities.

Example: Calls at a telephone switchboard follow a Poisson process and


occur at an average rate of six per 10 minutes. The operator leaves for a
5-minute coffee break.
a) What is the value of µ? Answer: 3
b) What is the probability that exactly two calls come in (and so go unanswered) while the operator is away? Answer: 0.2240
c) What is the probability that more than three calls go unanswered? Answer: 0.3528
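The switchboard answers can be verified by evaluating the Poisson formula directly; this is a sketch in Python:

```python
from math import exp, factorial

# Poisson probability of x occurrences when the mean is mu
def poisson_pmf(x, mu):
    return mu ** x * exp(-mu) / factorial(x)

# Six calls per 10 minutes -> mu = 3 for a 5-minute break
mu = 3
p_two = poisson_pmf(2, mu)                                     # about 0.2240
p_more_than_3 = 1 - sum(poisson_pmf(x, mu) for x in range(4))  # about 0.3528
```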

Poisson Distribution as an Approximation of the Binomial


Distribution

Sometimes, if we wish to avoid the tedious job of calculating binomial distributions, we can use the Poisson instead. The Poisson distribution can be a reasonable approximation of the binomial, but only under certain conditions. These conditions occur when n is large and P is small, that is, when the number of trials is large and the binomial probability of success is small.

The rule most often used by statisticians is that the Poisson is a good approximation of the binomial when n is greater than or equal to 20 and P is less than or equal to 0.05. In cases that meet these conditions, we can substitute the mean of the binomial distribution (nP) in place of the mean of the Poisson distribution (µ), so that the formula becomes:

P (x) = (nP)^x e^(−nP) / x! …… Poisson formula for approximating the binomial

In approximating the binomial distribution using the Poisson distribution, there is always a trade-off between a bit of accuracy and an easier calculation.
Example:
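Since no example is filled in here, a hypothetical case that meets the rule of thumb (n = 100 ≥ 20, p = 0.02 ≤ 0.05, values assumed for illustration) shows how close the approximation is:

```python
from math import comb, exp, factorial

# Hypothetical case meeting the rule of thumb: n >= 20 and p <= 0.05
n, p, x = 100, 0.02, 3

# Exact binomial probability of x successes
exact = comb(n, x) * p ** x * (1 - p) ** (n - x)

# Poisson approximation with mean np = 2
approx = (n * p) ** x * exp(-n * p) / factorial(x)
# exact and approx agree to about two decimal places
```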

3.5.3 Hypergeometric Probability Distribution

This distribution is closely related to the binomial probability distribution, but in the hypergeometric probability distribution the trials are not independent. Thus, the probability of success changes from trial to trial. The objective is to choose a random sample of n items out of a population of N under the condition that once an item has been selected, it is not returned to the population (sampling without replacement).

Earlier we noted that the binomial formula could be applied in two-outcome sampling situations where the sample size n was not more than 5 percent of the population size N. When n is greater than 5 percent of N, the hypergeometric formula should be used.

Properties of the Hypergeometric Probability Distribution (Conditions for the Hypergeometric Probability Distribution)

1. The result of each draw can be classified into two categories.
2. The probability of success in each draw changes.

The hypergeometric probability formula:

P (r) = [(R choose r) x (N – R choose n – r)] / (N choose n)

Where: N = population size
R = number of successes in the population
n = sample size
r = number of successes in the sample.
Examples:
1. A population consists of 10 items, four of which are classified as defective. What is the probability that a random sample of size 3 will contain two defective items?
Solution: N = 10    R = 4
          n = 3     r = 2
P (2) = [(4 choose 2)(10 – 4 choose 3 – 2)] / (10 choose 3) = (6 x 6) / 120 = 36/120 = 0.30

2. Suppose that there are 15 identical tires in stock and 5 are slightly
damaged. What is the probability that a customer who buys 4 tires
will obtain 2 damaged tires?

Solution: N = 15, R = 5, n = 4, r = 2

P(2) = C(5, 2) · C(10, 2) / C(15, 4) = (10 × 45)/1365 = 450/1365 ≈ 0.3297

Note: Hypergeometric probabilities are tedious to compute by hand. When the
sample is a small fraction of the population (n ≤ 0.05N), the binomial
formula gives a good approximation to hypergeometric results. If, in
addition, n ≥ 20 and p ≤ 0.05, the Poisson formula may be used to
approximate them instead.
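Both worked examples above can be verified numerically. A minimal Python sketch (the helper name `hypergeom_pmf` is our own, not a library function):

```python
from math import comb

def hypergeom_pmf(r, N, R, n):
    """P(r successes in a sample of size n drawn without replacement
    from a population of N items containing R successes)."""
    return comb(R, r) * comb(N - R, n - r) / comb(N, n)

# Example 1: N = 10 items, R = 4 defective, sample n = 3, r = 2 defective
print(round(hypergeom_pmf(2, N=10, R=4, n=3), 4))   # 0.3

# Example 2: N = 15 tires, R = 5 damaged, customer buys n = 4, r = 2 damaged
print(round(hypergeom_pmf(2, N=15, R=5, n=4), 4))   # 0.3297
```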

3.6 CONTINUOUS PROBABILITY DISTRIBUTION

So far in this chapter we have been concerned with discrete probability
distributions. In this section, we turn to the case in which the random
variable can take any value within a given range and in which the
probability distribution is continuous.

In the discussion of discrete probability distributions, we introduced the


concept of a probability function ƒ (x). Recall that this function provides
the probability that the random variable x assumes some specific value. In
the continuous case, the counterpart of the probability function is the
probability density function, also denoted by ƒ (x). For a continuous
random variable, the probability density function provides the value of the
function at any particular value of x; it does not directly provide the
probability of the random variable assuming some specific value.

However, the area under the graph of ƒ(x) over a given interval does provide
the probability that the random variable will assume a value in that interval.

There are several continuous probability distributions used in statistical


work. In this course we treat the following, which are the most common.
1. The Uniform probability distribution
2. The Normal probability distribution
3. The Exponential probability distribution

3.6.1 The Uniform probability Distribution

A continuous probability distribution where the probability that the random


variable will assume a value in any interval is the same for each interval of
equal length is called a uniform probability distribution.

Example: Suppose that a random variable x represents the total flight


time of an airplane traveling from Mekelle to Addis. Further, assume that
flight time can be any value in the interval 60 to 80 minutes. Let's suppose
that sufficient actual data are available to conclude that the probability of
a flight time between 60 and 61 minutes is the same as the probability of a
flight time within any other 1-minute interval up to and including 80
minutes. With every one-minute interval being equally likely, the random
variable x is said to have a uniform probability distribution:

ƒ(x) = 1/20 for 60 ≤ x ≤ 80
ƒ(x) = 0 elsewhere

[Figure 3.x: Uniform probability density for flight time (in minutes). The
density has constant height ƒ(x) = 1/20 over the interval from 60 to 80, so
the total area is (1/20)(20) = 1.]
In general, the uniform probability density function for a random variable
x which can take a value from a to b can be represented as follows:

ƒ(x) = 1/(b − a) for a ≤ x ≤ b
ƒ(x) = 0 elsewhere

The graph of the PDF provides the height or value of the function at any
particular value of x. Unlike a discrete probability function, the PDF for a
continuous random variable does not itself represent a probability; it only
gives the height of the function at a particular value of x.

Area as a measure of probability


Consider the area under the graph in the interval from 60 to 70 from the
above example.

[Figure 3.x: The shaded area under the uniform density between x = 60 and
x = 70.]

The probability that the plane's flight time is between 60 and 70 minutes is
equal to the shaded area (a rectangle):

Area = (1/20)(10) = 0.5
Once the PDF has been identified, the probability that x takes on a value
between some lower value x1 and some higher value x2 can be obtained by
computing the area under the graph of ƒ(x) over the interval from x1 to x2.

Symbolically: P(x1 ≤ x ≤ x2) = (x2 − x1) / Range


Example: P(68 ≤ x ≤ 76) = ? Given that ƒ(x) = 1/20 for 60 ≤ x ≤ 80, and 0
elsewhere.

Solution: P(68 ≤ x ≤ 76) = (76 − 68)/20 = 8/20 = 0.4

Note: Area of a rectangle = base × height.
Expected Value (Mean) and Variance of the Uniform Probability
Distribution.

 Mean = E(x) = (a + b)/2, where a = minimum value of x and b = maximum value of x
 Variance = Var(x) = (b − a)²/12
 Range = b − a
 Height = 1/(b − a)
 Area = 1

Examples:
1. The random variable X is supposed to be uniformly distributed
between 10 and 20.
a) Find P (x ≤ 15)?
b) Find P (12 ≤ x ≤ 18)?
c) Compute E (x) and Var. (x)?

2. The mean of a uniformly distributed random variable is 10 and the
range is 1.8.
a) What are the smallest and the largest values of the
distribution?
b) What is the probability that the random variable can take
values between 9 and 10.5?

Solutions:
1. (a) f(x) = 1/(b − a) for a ≤ x ≤ b, 0 elsewhere; here f(x) = 1/(20 − 10)
= 1/10 for 10 ≤ x ≤ 20, 0 elsewhere.

P(10 ≤ x ≤ 15) = (15 − 10)/10 = 5/10 = 0.5

(b) P(12 ≤ x ≤ 18) = (18 − 12)/10 = 6/10 = 0.6

(c) E(x) = (a + b)/2 = (10 + 20)/2 = 30/2 = 15
Var(x) = (b − a)²/12 = (20 − 10)²/12 = 100/12 = 25/3
2. Given: E(x) = (a + b)/2 = 10 and Range = b − a = 1.8

a) a + b = 20 ………………………(1)
−a + b = 1.8 ……………………(2)

Adding equations (1) and (2): 2b = 21.8, so b = 10.9 and a = 20 − 10.9 = 9.1

b) f(x) = 1/(10.9 − 9.1) = 1/1.8 for 9.1 ≤ x ≤ 10.9, 0 elsewhere

Since f(x) = 0 below 9.1, P(9 ≤ x ≤ 10.5) = (10.5 − 9.1)/1.8 = 1.4/1.8 ≈ 0.7778
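The two exercises can be checked with a short Python sketch (`uniform_prob` is a hypothetical helper written for this illustration; it clips the requested interval to the support [a, b] before dividing by the range):

```python
def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for X uniform on [a, b]:
    length of the overlap with [a, b], divided by (b - a)."""
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0) / (b - a)

# Exercise 1: X uniform on [10, 20]
a, b = 10, 20
print(uniform_prob(10, 15, a, b))          # 0.5
print(uniform_prob(12, 18, a, b))          # 0.6
print((a + b) / 2, (b - a) ** 2 / 12)      # E(x) = 15.0, Var(x) ≈ 8.333

# Exercise 2: mean 10, range 1.8 gives a = 9.1, b = 10.9; clipping at 9.1
# yields P(9 <= X <= 10.5) = 1.4/1.8
print(round(uniform_prob(9, 10.5, 9.1, 10.9), 4))   # 0.7778
```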

3.6.2 The Normal Probability Distribution

The most useful theoretical distribution for continuous random variables is
the normal distribution. Several mathematicians were instrumental in its
development, among them the 18th-century mathematician and astronomer Karl
Gauss. In honor of his work, the normal probability distribution is often
called the Gaussian distribution.

Importance of the Normal Distribution


There are two basic reasons why the normal distribution occupies such a
prominent place in statistics.

First, it has some properties that make it widely applicable in various


situations in which it is necessary to make inferences by taking samples.

Second, it comes close to fitting the actual observed frequency
distributions of many phenomena: human characteristics (weights, heights,
and IQs), outputs from physical processes (such as dimensions and yields),
and other measures of interest to economists and business professionals in
both the public and private sectors.
E.g. Per capita income in developing countries, air pollution in a
community, etc.

The Normal Curve


The form or shape of the normal probability distribution is illustrated by
the bell-shaped curve.

[Figure 3.x: The bell-shaped normal curve. The mean, median, and mode all
coincide at µ.]

The probability density function for a normally distributed probability


distribution is as follows:

f(x) = (1/(σ√(2π))) · e^(−(1/2)·((x − µ)/σ)²)

Where: x = the variable, µ = mean, σ = standard deviation,
e = 2.71828…, π = 3.14159…

Parameters of the Normal Probability Distribution


The parameters or characteristics of a normal distribution are its mean (µ)
and its standard deviation (σ). There is no single normal curve, but rather
a family of normal curves. A particular normal distribution is specified by
its mean and standard deviation.

Characteristics (Properties) of the Normal Probability Distribution

1. The normal curve is bell-shaped and symmetrical about its mean (µ).
If the curve were folded along its vertical axis, the two halves would
coincide. The tails of the curve extend to infinity in both directions
and theoretically never touch the horizontal axis.

2. The highest point on the normal curve occurs at the mean, which is
also the median and the mode of the distribution. The height of the
curve declines as we move in either direction away from the mean.

3. The standard deviation determines the width of the curve.


Therefore, larger values of standard deviation result in wider and
flatter curves that show more dispersion in the data.
4. The area under the curve is equal to 1.

5. Areas under the curve give probabilities for the normally distributed
variables. The area under the normal curve is distributed as follows:
i) µ ± σ covers 68.27% of the area (34.14% on each side of the mean)
ii) µ ± 1.96σ covers 95% (47.5% on each side)
iii) µ ± 2σ covers 95.45%
iv) µ ± 3σ covers 99.73%

Standard Normal Probability Distribution


The equation of the normal curve depends on mean (µ) and standard
deviation ( σ ) and for different values of µ and σ we will obtain different
curves. This would necessitate separate tables for a normal curve areas
for each pair of µ and σ . However, we will be able to determine normal
curve areas regardless of µ and σ by tabulating only the area under the
normal curve having µ = 0 and σ = 1. Such a normal curve is known as
the Standard Normal Curve.
X ~ N (µ, σ ). ………………… X-scale
Z ~ N (0, 1) ………………… Z scale
In order to transform the X-scale into the Z-scale, we use the following
formula:

Z = (Xᵢ − µ)/σ
The Z – value tells us how far away and in what direction X is from its
mean in terms of standard deviation.

The PDF for the standard normal probability distribution is:

f(Z) = (1/√(2π)) · e^(−(1/2)Z²)

In any problem in which we want to determine the area under the normal
curve for given µ and σ, we change the Xs into Zs and then use the standard
normal table. The table provides areas for intervals starting from Z = 0 and
ending at a positive value of Z. Since the normal distribution is
symmetrical, it is not necessary to tabulate probabilities for negative
values of Z.

Examples:
1. Find the area under the normal curve for Z = ±1.54.
Solution: P(0 ≤ Z ≤ 1.54) = 0.4382 (from the table)
P(−1.54 ≤ Z ≤ 0) = 0.4382
P(−1.54 ≤ Z ≤ 1.54) = 0.4382 + 0.4382 = 0.8764

2. Find the area to the right of Z = 0.25.
Solution: P(Z ≥ 0.25) = 0.5 − 0.0987 = 0.4013

3. Find the area to the left of Z = 1.96.
Solution: P(Z ≤ 1.96) = 0.5 + 0.475 = 0.975

4. Find the area between Z = 0.6 and Z = 1.8.
Solution:
P(0 ≤ Z ≤ 1.8) = 0.4641
P(0 ≤ Z ≤ 0.6) = 0.2257
P(0.6 ≤ Z ≤ 1.8) = 0.4641 − 0.2257 = 0.2384

5. Find the area between Z = −0.4 and Z = 0.6. (Try this yourself.)

General rule: If both Z values are on the same side of the mean, the area
between them is obtained by subtracting the two table values. If they are on
opposite sides of the mean, the area between them is obtained by summing the
two values.
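The table lookups used in these examples can be reproduced with Python's error function. This is a sketch; printed tables round to four decimals, so the last digit can occasionally differ by one:

```python
from math import erf, sqrt

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z,
    i.e. the value a standard normal table reports."""
    return 0.5 * erf(abs(z) / sqrt(2))

print(round(area_0_to_z(1.54), 4))                    # 0.4382
print(round(2 * area_0_to_z(1.54), 4))                # P(-1.54 <= Z <= 1.54) = 0.8764
print(round(0.5 - area_0_to_z(0.25), 4))              # P(Z >= 0.25) = 0.4013
print(round(area_0_to_z(1.8) - area_0_to_z(0.6), 4))  # same side: subtract, ≈ 0.2383
```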

Application Example:

1. The incomes of a group of 1000 persons were found to be normally
distributed with a mean of 750 Birr per month and a standard
deviation of 50 Birr. Show that about 95% of this group had income
exceeding 668 Birr and only 5% had income exceeding 832 Birr.
Solution:
Given: µ = 750 Birr, σ = 50 Birr, X1 = 668 Birr, X2 = 832 Birr

Z1 = (X1 − µ)/σ = (668 − 750)/50 = −82/50 = −1.64
P(Z ≥ −1.64) = 0.5 + 0.4495 = 0.9495
∴ About 95% of the group had income exceeding 668 Birr.

Z2 = (X2 − µ)/σ = (832 − 750)/50 = 82/50 = 1.64
P(Z ≥ 1.64) = 0.5 − 0.4495 = 0.0505
∴ Only about 5% had income exceeding 832 Birr.
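The income example can be checked directly; a sketch using a normal CDF built from the error function:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 750, 50   # monthly income in Birr
print(round(1 - normal_cdf(668, mu, sigma), 4))   # P(income > 668) ≈ 0.9495
print(round(1 - normal_cdf(832, mu, sigma), 4))   # P(income > 832) ≈ 0.0505
```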

Check yourself

1. 15,000 students appeared for an examination. The mean mark was 49
and the standard deviation of marks was 6. Assume that the marks are
normally distributed.

a) What proportion of students scored more than 55


marks? Answer: 15.87%
b) If grade ‘A’ is given to students scoring more than 70
marks, what proportion of students will receive grade
‘A’? Answer: about 0.0002 (i.e. 0.02%)

2. The aptitude test scores of job applicants are normally
distributed with a mean of 140 and a standard deviation of 20.
a) What is the probability that a score will be in the
interval 100 and 180? Answer 0.9544.

b) If 500 applicants take the test, how many would you


expect to score 145 or below? Answer 0.5987 x 500 =
299

c) What proportion of scores is from 110 to 120?
Answer: 0.4332 − 0.3413 = 0.0919, i.e. 9.19%
d) What percent of the scores exceed 183? Answer
1.58%

Inverse Use of the Standard Normal probability Table


This means to find the value of Z, which corresponds to a given probability
in the table.
Example
1. Find Z such that the table area equals 0.4864: Z = 2.21
2. Find Z such that the table area equals 0.4922: Z = 2.42

Given a probability, we can find the value of Z, and then change Z into an
X-value using the formula Z = (X − µ)/σ, i.e.:

Zσ = X − µ
X = Zσ + µ

Example: Given µ = 100 and σ = 10, what is the value of x for which
the left-tail area is 0.05?

Solution: The Z value for which the table area is 0.45 is 1.64; since the
area lies in the left tail, Z = −1.64.
X = Zσ + µ = −1.64(10) + 100 = −16.4 + 100 = 83.6
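Python's standard library can perform this inverse lookup directly (`statistics.NormalDist`, available from Python 3.8). Note that it uses the unrounded z = −1.645, so the result differs slightly from the table answer 83.6:

```python
from statistics import NormalDist

# X ~ N(mu = 100, sigma = 10); find x with a left-tail area of 0.05
x = NormalDist(mu=100, sigma=10).inv_cdf(0.05)
print(round(x, 2))   # 83.55 (the table lookup z = -1.64 gives 83.6)
```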

Conditions for the application of Z-statistic


The Z-transformation is applicable only under the following conditions:
1. The population variance (σ²) is known; or
2. The population variance is unknown, but the sample size is large
(n ≥ 30).
If neither of these conditions is satisfied, we cannot use the Z-statistic.
Normal Approximation of Binomial Probabilities
Earlier in this chapter we presented the binomial probability distribution:
a discrete probability distribution for the number of successes in a sample
of size n, where probability questions pertain to the probability of x
successes in n trials.

When the number of trials becomes large (say 50, 100, 500, 1000, etc.), hand
or calculator computation of the binomial formula becomes tiresome and
time consuming, though not impossible. Besides, the binomial tables in
most statistics books (in the appendix) do not include values of n greater
than 20. Hence, when we encounter a binomial probability distribution
problem with a large number of trials, we may want to approximate the
binomial probability distribution. In this section you will learn how to make
the approximation and how large n should be for a close approximation.
Conditions where the Normal Approximation can be used:
In cases where the number of trials is greater than 20, np ≥ 5, and nq ≥ 5,
the normal probability distribution provides a simpler way to approximate
binomial probabilities.

Computing the Approximation:

When the normal approximation to the binomial is used, we set
µ = np and σ = √(npq) in the definition of the normal curve.

In order to approximate a binomial probability with the normal, the


following steps are used.
1. Find the x values by adding 0.5 to and subtracting 0.5 from the binomial
value.
2. Transform the x values in to Z values
3. As for the case of normal distribution, compute the probability from
a standard normal table.

Example: Suppose that a particular company has a history of making


errors in 10% of its invoices. A sample of 100 invoices has been taken,
and we want to compute the probability that 12 invoices contain errors.
That is, we want to find the binomial probability of 12 successes in 100
trials.

Solution: In this case it is difficult, albeit not impossible, to use the
binomial formula. Hence, we approximate it by another distribution. Since
the conditions for approximating the binomial probability by the normal
(np ≥ 5 and nq ≥ 5) are satisfied, we compute the probability of 12
successes in 100 trials using the normal approximation. The following steps
are followed:

1. Finding the values of x by adding 0.5 and subtracting 0.5. The 0.5
we add and subtract is called a continuity correction factor. It is

introduced since continuous distribution is being used to
approximate a discrete distribution.

x1 = 12 + 0.5 = 12.5 and x2 = 12 – 0.5 = 11.5

Thus, P (x = 12) for the binomial distribution is approximated by P


(11.5 ≤ x ≤ 12.5) for the continuous distribution.

2. Transform the x values into Z values.

µ = np = 100 × 0.1 = 10
σ = √(npq) = √(100 × 0.1 × 0.9) = √9 = 3

For x1 = 11.5: Z1 = (x1 − µ)/σ = (11.5 − 10)/3 = 0.5
For x2 = 12.5: Z2 = (12.5 − 10)/3 = 0.83

3. Find the probabilities from the standard normal table for the Z values,
and finally compute P(11.5 ≤ x ≤ 12.5):

P(0 to Z1) = 0.1915 and P(0 to Z2) = 0.2967

Therefore, P(11.5 ≤ x ≤ 12.5) = 0.2967 − 0.1915 = 0.1052

- The normal approximation to the probability of 12 successes in 100


trials is 0.1052.
- Given the parameters, the normal distribution is shown in the figure
below.

[Figure 3.x: Normal approximation to a binomial probability distribution
with n = 100 and p = 0.1 (µ = 10, σ = 3), showing the probability of 12
errors as the area 0.1052 between 11.5 and 12.5.]
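The steps above can be sketched in Python, with the exact binomial value included for comparison. (The text's 0.1052 comes from table values at the rounded Z = 0.83, so the computed approximation differs slightly in the last decimals.)

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p, x = 100, 0.1, 12
mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 10, sigma = 3

# Continuity correction: P(X = 12) is approximated by P(11.5 <= X <= 12.5)
approx = normal_cdf(x + 0.5, mu, sigma) - normal_cdf(x - 0.5, mu, sigma)
exact = comb(n, x) * p**x * (1 - p)**(n - x)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```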

3.6.3 Exponential Probability Distribution


A continuous probability distribution that is useful in dealing with the
time it takes to complete a certain task is the exponential probability
distribution. This distribution enables us to measure the length of time
between certain events; for this reason, the exponential distribution is
sometimes referred to as the waiting-time distribution.

The PDF is given by:

f(x) = (1/µ) · e^(−x/µ), where µ = mean and e = 2.718…

Letting λ = 1/µ, f(x) = λe^(−λx)

Example:

Assume that the time it takes to get a taxi follows an exponential
probability distribution. If the mean time to get a taxi is 5 minutes, then
the appropriate PDF is:

f(x) = λe^(−λx) with λ = 1/5, i.e. f(x) = (1/5)e^(−x/5)

The exponential probability distribution is represented by a curve that
starts at its maximum λ at x = 0 and declines steadily toward zero as x
increases.
To calculate probabilities for an exponential random variable we need to be
able to find areas under the exponential distribution. Suppose we want to
find the area A to the right of some number a, as shown in the following
figure.

[Figure: the area A under f(x) to the right of the point a.]


Area (A) = ∫ₐ^∞ λe^(−λx) dx

Writing the integral as a limit and using d/dx[−e^(−λx)] = λe^(−λx):

Area (A) = lim_{b→∞} ∫ₐᵇ λe^(−λx) dx = lim_{b→∞} [−e^(−λx)]ₐᵇ
         = lim_{b→∞} (−e^(−λb) + e^(−λa))
         = 0 + e^(−λa)

Area (A) = e^(−λa) = Probability (x ≥ a)

Since Area (A) + Area (B) = 1:

Area (B) = 1 − Area (A) = 1 − e^(−λa) = Probability (x < a)
Example:
Given the PDF f(x) = (1/5)e^(−x/5):

(a) What is the probability that it takes 6 minutes or less to get a taxi?
Solution: P(x ≤ 6) = 1 − e^(−λa) = 1 − e^(−6/5) ≈ 1 − 0.3012 = 0.6988

(b) What is the probability that it takes more than 6 minutes?
Solution: P(x > 6) = e^(−λa) = e^(−6/5) ≈ 0.3012
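Both parts can be checked numerically with a short sketch (`exponential_cdf` is our own helper, not a library call):

```python
from math import exp

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential waiting time with the given mean."""
    return 1 - exp(-x / mean)

mean = 5  # mean minutes to get a taxi
print(round(exponential_cdf(6, mean), 4))       # P(x <= 6) = 1 - e^(-1.2) ≈ 0.6988
print(round(1 - exponential_cdf(6, mean), 4))   # P(x > 6) = e^(-1.2) ≈ 0.3012
```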

Review Exercises:

Behavioral Objectives

A. You should be able to define the following key concepts in this chapter:

experiment, outcome, event, sample space, mutually exclusive events,
exhaustive events, frequency definition of probability, subjective
definition of probability, addition rule, conditional probability,
multiplication rule, independent events, Bayes' theorem, prior probability,
state of nature, *combination, *permutation

B. Make sure that you can do each of the following:

1. Present the frequency and subjective concepts of probability.


2. State, prove, and apply the addition rule for probabilities.
3. State and apply the concepts of conditional probability and
independence.
4. State, prove, and apply the multiplication rule for probabilities.
5. Use Bayes' theorem to calculate the probabilities that various
hypotheses are true, given that a particular event occurs.
6. Discuss the assumptions underlying Bayes' theorem and the
controversies concerning its usefulness.
*7. Calculate the number of ways we can choose one of each of m kinds
of items.
*8. Compute the number of permutations of x items that one can select
from n items.
*9. Calculate the number of combinations of x items that one can select
from n items.

Breakfast Food: A Case Study

Experts in marketing have devoted considerable study during the past 20
years to the way in which the probability that a consumer will purchase a
given brand of a product depends on what brands he or she has
purchased in the past. As a very simple illustration, suppose that it has
been determined that a consumer, if he or she purchases breakfast food,
has a 20 percent chance of purchasing a particular brand of breakfast food
if he or she has purchased this brand once before, and a 10 percent
chance of purchasing it if he or she has never purchased it before.

(a) Suppose that this consumer had not tried this brand at the beginning
of April, and that he or she purchased breakfast food once in April and
once in May. What is the probability that he or she did not purchase
this brand either time?

Multiple-Choice Questions

1. Suppose that P(A) = 0.5, P(B) = 0.2, and P(A and B) = 0.2. Which of the
following is true?

(a) A and B are mutually exclusive and statistically independent events.


(b) A and B are mutually exclusive but not statistically independent
events.
(c) A and B are statistically independent but not mutually exclusive
events.
(d) A and B are neither statistically independent nor mutually exclusive
events.
(e) None of the above.

5. The probability that the Jones Company will go bankrupt in 1984 is 0.1.
The probability that it will lose money in 1984 is 0.2. The probability that
it will both go bankrupt and lose money in 1984 is 0.1. The probability
that it will either go bankrupt or lose money (or both) in 1984 equals

(a) less than 0.1.


(b) 0.1
(c) 0.2
(d) more than 0.2.
(e) none of the above.

6. In the preceding question, whether the Jones Company will go bankrupt
in 1984

(a) is not statistically independent of whether it loses money in 1984.
(b) is statistically independent of whether it loses money in 1984.
(c) cannot be represented by a probability.
(d) cannot be represented by a subjective probability.
(e) cannot be analyzed by statistical methods.

8. If P(A) = 0.3 and P(B) = 0.6, what is P(not A and not B) if A and B
are statistically independent?

A. You should be able to define the following key concepts in this chapter:

random variable, probability distribution, expected value of a random
variable, variance of a random variable, discrete random variable,
continuous random variable, Bernoulli process, Bernoulli trials, binomial
distribution, acceptance sampling, acceptance number, rejection number,
acceptable quality level, Chebyshev's inequality

1. The random variable X has the following probability distribution:

Value of X    Probability
    0             .10
    1             .20
    2             .40
    3             .20
    4             .10

A. You should be able to define the following key concepts in this chapter:

Z value
Poisson distribution
Normal distribution
Standard normal distribution

1. The scores on a particular psychological test are normally distributed


with mean equal to 100 and standard deviation equal to 20. The
probability that a score will exceed 130 equals:

(a) .4332.
(b) .0668.
(c) .3413.
(d) .1587.
(e) none of the above.

2. A manufacturer of pipe knows that the pipe lengths it produces vary in
diameter and that the diameters are normally distributed. The mean
diameter is 1 inch, and the probability that a length of pipe will have a
diameter exceeding 1.1 inches is .1587. The standard deviation of the
diameters must therefore be

(a) 1 inches.
(b) .1 inches.
(c) 2 inches.
(d) .2 inches.
(e) none of the above.

3. The pipe manufacturer in the previous question wants to know what the
probability is that a diameter will exceed 1.2 inches. You are hired as a
consultant. Your answer should be

(a) .05.
(b) .10.
(c) .0228
(d) .0793.
(e) none of the above.
4. The probability that the value of a standard normal variable is less than
1.0 equals

(a) .0228.
(b) .0287.
(c) .6915.
(d) .8413.
(e) .0919.

5. The probability that the value of a standard normal variable exceeds 1.9
equals

(a) .0228.

(b) .0287.

(c) .6915.

(d) .8413.

(e) .0919.

9. An insurance company finds that .003 percent of the population dies of
a certain disease each year. The company has insured 100,000 people
against death from this disease.

a) What is the probability that the firm must pay off in three or more cases
next year? (Use the Poisson distribution.)

b) What is the expected number of persons insured by this company


who will die of the disease next year? What is the most likely number
of persons who will die of the diseases next year?

14. An experiment with three outcomes has been repeated 50 times, and it
was learned that E1 occurred 20 times, E2 occurred 13 times, and E3
occurred 17 times. Assign probabilities to the outcomes. What method
did you use?

38. The survey of subscribers to Forbes showed that 45.8% rented a car
during the past 12 months for business reasons, 54% rented a car
during the past 12 months for personal reasons, and 30% rented a car
during the past 12 months for both business and personal reasons
(Forbes 1993 Subscriber Study).

a) What is the probability that a subscriber rented a car during


the past 12 months for business or personal reasons?
b) What is the probability that a subscriber did not rent a car
during the past 12 months for either business or personal
reasons?

43. Assume that we have two events, A and B, that are mutually exclusive.
Assume further that we know P(A) = .30 and P(B) = .40.

a. What is P(A ∩ B)?
b. What is P(A | B)?
c. A student in statistics argues that the concepts of mutually exclusive
events and independent events are really the same, and that if
events are mutually exclusive they must be independent. Do you
agree with this statement? Use the probability information in this
problem to justify your answer.
d. What general conclusion would you make about mutually exclusive
and independent events given the results of this problem?

3. Three students have interviews scheduled for summer employment


at the Brookwood Institute. In each case the result of the
interview will be that a position is either offered or not offered.
Experimental outcomes are defined in terms of the results of the
three interviews.
a. List the experimental outcomes.
b. Define a random variable that represents the number of
offers made. Is this a discrete or continuous
random variable?
c. Show the value of the random variable for each of the
experimental outcomes.
39. Suppose a salesperson makes a sale on 20% of customer contacts. A
normal work week will enable the salesperson to contact 25 customers.
What is the expected number of sales for the week? What is the

variance for the number of sales for the week? What is the standard
deviation for the number of sales for the week?

44. Phone calls arrive at the rate of 48 per hour at the reservation desk
for Regional Airways.

a. Find the probability of receiving three calls in a five-minute interval of


time.
b. Find the probability of receiving exactly 10 calls in 15 minutes.
c. Suppose no calls are currently on hold. If the agent takes five
minutes to complete the current call, how many callers do you expect to be
waiting by that time? What is the probability that none will be waiting?
d. If no calls are currently being processed, what is the probability that
the agent can take three minutes for personal time without being interrupted?

56. Axline Computers manufactures personal computers at two plants, one
in Texas and the other in Hawaii. There are 40 employees at the Texas
plant and 20 in Hawaii. A random sample of 10 employees is to be asked
to fill out a benefits questionnaire.

a. What is the probability that none are at the plant in Hawaii?


b. What is the probability that one is at the plant in Hawaii?
c. What is the probability that two or more are at the plant in Hawaii?
d. What is the probability that nine are at the plant in Texas?

17. The demand for a new product is assumed to be normally
distributed with µ = 200 and σ = 40. Letting x be the number of units

demanded, find the following probabilities.
a) P(180 ≤ x ≤ 220)
b) P ( x ≥ 250)
c) P(x ≤ 100)
d) P(225 ≤ x ≤ 250)

25. Team Marketing Report, a sports-business newsletter, estimates that


the average total cost for a family of four to attend a 1994 major league
baseball game was $95.80 (The Wall Street Journal, April 5, 1994).
Assume that a normal distribution applies and that the standard
deviation is $10.00.
a. What is the probability that the cost will exceed $100.00?
b. What is the probability that a family of four will spend $75.00 or
less?
c. What is the probability that the cost will be between $85.00 and
$100.00?

28. A Consumer Reports survey listed Saturn, Infiniti, and Lexus


automobile dealers as the top three in customer service (Consumer
Reports, April 1994). Saturn ranked number one, with only 4% of the
Saturn customers citing some form of dissatisfaction with the dealer.
Answer the following questions about a group of 250 Saturn customers.

a. What is the probability that 12 or fewer customers will have some


form of dissatisfaction with the dealer?
b. What is the probability that five or more customers will have some
form of dissatisfaction with the dealer?
c. What is the probability that eight customers will have some form of
dissatisfaction with the dealer?

29. The true unemployment rate is 7% (Business Week, November 7,
1994). Assume that 100 employable people are selected randomly.
a. What is the expected number who are unemployed?
b. What is the variance and standard deviation of the number who are
unemployed?
c. What is the probability that exactly nine are unemployed?
d. What is the probability that at least five are unemployed?

35. The average life of a television set is 12 years (Money, April 1994).
Product life is assumed to follow an exponential probability distribution;
assume that this is the case for the life of a television set.
a. What is the probability that the lifetime will be six years or less?
b. What is the probability that the lifetime will be 15 years or more?
c. What is the probability that the lifetime will be between five and 10
years?

