Chapter One - Probability Distribution

CHAPTER ONE

RANDOM VARIABLES AND PROBABILITY DISTRIBUTION

1.1 Introduction
In this chapter we discuss probability distributions of random variables, which are customarily
used to model problems in fields such as business, finance and economics, and in everyday
life. To follow this chapter, you need some basic knowledge of probability
fundamentals.

1.2 Definition of key Terms


1.2.1 A variable
It is an attribute that can take on different values or characteristics from one individual or object
to another. For example, the number of items sold per day varies from one day to another, so
“number of items sold” is a variable.
1.2.2 Random Variable (R.V)
This is a variable whose numerical value is determined by the outcome of a random or statistical
experiment. In other words, it is a variable that is subject to randomness and can take on
different values, which distinguishes it from an algebraic variable. Random variables are also
commonly used in econometrics to study relationships among variables. A random experiment,
in turn, is an experiment that has at least two possible outcomes, with no prior knowledge of
which one will occur; that is, an experiment whose outcome cannot be
predicted in advance. Most statistical experiments are described in words (e.g. tossing a fair
coin, or asking a sample of 10 people whether they have ever been to the UK);
however, their outcomes are usually described more
meaningfully in terms of numbers. Likewise, in real life most sample data are expressed in
numbers rather than in words. In short, a random variable assigns a numerical value to each
outcome of a random experiment.

Traditionally random variables are denoted by capital letters, X, Y, Z or X1, X2, X3 etc.

1.2.2.1 Types Of Random Variable

There are two types of random variable (R.V): discrete and continuous random
variables. A discrete random variable takes on only a finite or countably infinite number of
values, typically integers. Examples of discrete random variables are the number of cars passing
through a roadblock, the number of heads obtained when tossing one or more fair coins, the number
of defective items in a sample, and the number of deaths from COVID-19 in 2020. A continuous
random variable, on the other hand, is a random variable that can take on any real value in a
given interval. Examples of continuous R.Vs are height, weight, rainfall and temperature.

General examples of random variables in real life are the unemployment rate, consumer
price index, number of sales made in a week, yearly profit of a company, share prices, return on
investments, money supply, GDP, wages, cash flows and interest rates.

1.3 Probability Distribution Of Random Variable


Definition: A probability distribution is a graph, table or function that links each outcome
of a random experiment with its probability of occurrence. In other words, a probability
distribution is a function that gives the probability of each value of a random
variable; it lists the values taken by the random variable together with their associated
probabilities. There are two types of probability distribution of a random variable (R.V):
discrete and continuous probability distributions. The following chart shows the branching
of probability distributions.

Figure 1: Branches of probability distribution


1.3.1 Discrete Probability Distributions
If a random variable is discrete, its probability distribution is called a discrete
probability distribution. With a discrete probability distribution, each possible value of the
discrete random variable can be associated with a non-zero probability. Thus, a discrete
probability distribution can always be presented in tabular form. The discrete probability
distribution is commonly known as the Probability Mass Function (PMF). See Figure 1.
1.3.1.1 Probability Mass Function (PMF)

If X is a discrete random variable, the function denoted by 𝑓(𝑥) = 𝑃(𝑋 = 𝑥) for each 𝑥 within the
range of X is called Probability Mass Function of X. To capture clearly the meaning of the
probability distribution of a discrete random variable consider Example 1.1.

Example 1.1
Consider an experiment of tossing two fair coins simultaneously. Find the probability distribution
of the total number of heads obtained.

Solution:
The following are the steps for building a probability distribution:

1. List all possible events
2. Calculate the probability of each event
3. Present these probabilities in a suitable table or diagram

The list of all possible events can be obtained easily by using a tree diagram as shown
below:

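[Tree diagram: from Start, the first toss branches into H and T, and each branch splits again into H and T for the second toss]
Figure 2: Tree Diagram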
From a tree diagram as shown in Figure 2 the list of all possible outcomes of an experiment (i.e
sample space, S) is as shown below:

𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}


From the sample space described above, the random variable (i.e. the number of heads) takes on
three different values, 0, 1 or 2, depending on whether no head, one head or two heads were
obtained in the experiment of tossing two fair coins. That is:

𝐻𝐻 → 2 heads
𝐻𝑇 → 1 head
𝑇𝐻 → 1 head
𝑇𝑇 → 0 heads
The probability of an event can be obtained by employing the classical definition of probability, that is:

\[ P(E) = \frac{n(E)}{n(S)} \]
Let 𝑋 be the number of observed heads. The probabilities of the number of heads showing up are
as indicated below and the probability distribution is shown in Table 1.
\[ f(0) = P(X = 0) = \frac{n(0)}{n(S)} = \frac{1}{4} \]
\[ f(1) = P(X = 1) = \frac{n(1)}{n(S)} = \frac{2}{4} = \frac{1}{2} \]
\[ f(2) = P(X = 2) = \frac{n(2)}{n(S)} = \frac{1}{4} \]

Table 1: Probability Distribution
Number of heads (𝑋)    𝑓(𝑥) = 𝑃(𝑋 = 𝑥)
0                      1/4
1                      1/2
2                      1/4
Total                  1.00
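
The three steps above (list the events, compute each probability, tabulate) can also be sketched in a few lines of Python. This is only an illustration: the function name `heads_pmf` is an assumption made for this example and does not come from the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_coins):
    """Return {number of heads: probability} for n_coins fair coins."""
    # Step 1: list the sample space S, e.g. ('H', 'H'), ('H', 'T'), ...
    outcomes = list(product("HT", repeat=n_coins))
    # Step 2: count how many outcomes give each number of heads, n(E)
    counts = Counter(outcome.count("H") for outcome in outcomes)
    # Step 3: divide by n(S) to obtain P(E) = n(E) / n(S)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


pmf_two_coins = heads_pmf(2)
for x, p in pmf_two_coins.items():
    print(x, p)
```

Using `Fraction` keeps the probabilities exact, so the output can be compared directly with the entries of Table 1.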

Properties of PMF
1. 𝑓(𝑥) ≥ 0 for each 𝑥 ∈ 𝑋
2. 0 ≤ 𝑓(𝑥) ≤ 1
3. ∑ 𝑓(𝑥) = 1
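
As a quick check of these properties, the sketch below tests whether a table of probabilities is a valid PMF. The helper name `is_valid_pmf` and the tolerance are assumptions chosen for illustration.

```python
import math


def is_valid_pmf(pmf, tol=1e-9):
    """Check the PMF properties for a dict {value: probability}."""
    probs = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probs)                 # properties 1 and 2
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)   # property 3
    return in_range and sums_to_one


print(is_valid_pmf({0: 0.25, 1: 0.5, 2: 0.25}))  # the two-coin distribution
print(is_valid_pmf({0: 0.5, 1: 0.6}))            # probabilities sum to 1.1
```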

Example 1.2
An employment rate of a certain country A (𝑅𝐴 ) in percentage is assumed to be a discrete random
variable whose probability distribution is as shown below:
𝑅𝐴 -12 -10 -6 0 4 8 10 12
Prob. 0.10 0.15 0.10 0.15 0.1 0.15 0.1 0.15

Find
a) 𝑃(𝑅𝐴 ≥ 0)
b) 𝑃(𝑅𝐴 ≥ −10)

Solution:
a) 𝑃(𝑅𝐴 ≥ 0) = 𝑃(𝑅𝐴 = 0) + 𝑃(𝑅𝐴 = 4) + 𝑃(𝑅𝐴 = 8) + 𝑃(𝑅𝐴 = 10) + 𝑃(𝑅𝐴 = 12)
= 0.15 + 0.1 + 0.15 + 0.1 + 0.15
= 0.65
b) 𝑃(𝑅𝐴 ≥ −10) = 1 − 𝑃(𝑅𝐴 < −10)
= 1 − 𝑃(𝑅𝐴 = −12)
= 1 − 0.10
= 0.9
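
The same probabilities can be read off the table programmatically. In this sketch the dictionary `rate_pmf` simply copies the table of Example 1.2, and the helper `prob_at_least` is a hypothetical name used for illustration.

```python
# Probability distribution of R_A copied from the table in Example 1.2
rate_pmf = {-12: 0.10, -10: 0.15, -6: 0.10, 0: 0.15,
            4: 0.10, 8: 0.15, 10: 0.10, 12: 0.15}


def prob_at_least(pmf, threshold):
    """P(X >= threshold): add the probabilities of all qualifying values."""
    return sum(p for x, p in pmf.items() if x >= threshold)


p_a = prob_at_least(rate_pmf, 0)
p_b = prob_at_least(rate_pmf, -10)
print(round(p_a, 2), round(p_b, 2))
```

Part (b) is computed here by direct summation rather than by the complement rule; both routes give the same value.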

1.4 Cumulative Mass Function (CMF)


Definition: Cumulative Mass Function, 𝐹(𝑥) associated with Probability Mass Function, 𝑓(𝑥) of
a discrete random variable 𝑋 is defined as follows:
𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥)

Where 𝑃(𝑋 ≤ 𝑥) means the probability that the random variable 𝑋 takes a value less than or equal
to a specific given value 𝑥. For example, 𝑃(𝑋 ≤ 2) means the probability that the
random variable 𝑋 takes a value less than or equal to 2.

Example 1.3

Find Probability Mass Function (PMF) and Cumulative Mass Function (CMF) of a total number
of heads obtained by tossing a fair coin three times.

Solution:
The following tree diagram is used to obtain the sample space 𝑆

[Tree diagram: from Start, each of the three tosses branches into H and T, giving eight paths]
Therefore 𝑺 = {𝐻𝐻𝐻, 𝐻𝐻𝑇, 𝐻𝑇𝐻, 𝐻𝑇𝑇, 𝑇𝐻𝐻, 𝑇𝐻𝑇, 𝑇𝑇𝐻, 𝑇𝑇𝑇 }
By letting 𝑋 = number of heads shown, we find that 𝑋 can take the values 0, 1, 2 or 3, and hence
the corresponding PMF is obtained as indicated below:
\[ f(0) = P(X = 0) = \frac{1}{8} \]
\[ f(1) = P(X = 1) = \frac{3}{8} \]
\[ f(2) = P(X = 2) = \frac{3}{8} \]
\[ f(3) = P(X = 3) = \frac{1}{8} \]
In tabular form:
𝑥 0 1 2 3
𝑓(𝑥) 1/8 3/8 3/8 1/8

It follows that the Cumulative Mass Function (CMF) is obtained as indicated here:
From:
𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥)
\[ F(0) = P(X \le 0) = P(X = 0) = \frac{1}{8} \]
\[ F(1) = P(X \le 1) = P(X = 0) + P(X = 1) = \frac{4}{8} \]
\[ F(2) = P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{7}{8} \]
\[ F(3) = P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 1 \]
However, for plotting, the CMF must be specified over every range of 𝑥:

\[
F(x) =
\begin{cases}
0 & \text{for } x < 0 \\
\frac{1}{8} & \text{for } 0 \le x < 1 \\
\frac{4}{8} & \text{for } 1 \le x < 2 \\
\frac{7}{8} & \text{for } 2 \le x < 3 \\
1 & \text{for } 3 \le x
\end{cases}
\]
With reference to the previous example, it can be observed that a CMF is merely an
accumulation of the PMF over the values of 𝑋 less than or equal to a given 𝑥. That is,

\[ F(x) = \sum_{t \le x} f(t) \]

Note: The results for both PMF and CMF can also be summarized in the following table:
Number of heads    PMF      Range of 𝒙     CMF
𝑿 = 𝒙              𝒇(𝒙)     for 𝑭(𝒙)       𝑭(𝒙)
0                  1/8      0 ≤ 𝑥 < 1      1/8
1                  3/8      1 ≤ 𝑥 < 2      4/8
2                  3/8      2 ≤ 𝑥 < 3      7/8
3                  1/8      3 ≤ 𝑥          1
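
The accumulation that turns a PMF into a CMF can be sketched as a running sum. The names `pmf`, `cmf` and `F` below are chosen for this illustration only; the probabilities are those of Example 1.3.

```python
from fractions import Fraction
from itertools import accumulate

# PMF of the number of heads in three tosses (Example 1.3)
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Running totals of the PMF give the CMF at the points 0, 1, 2, 3
cmf = dict(zip(pmf.keys(), accumulate(pmf.values())))


def F(x):
    """Step-function CMF: P(X <= x) for any real x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


for x, total in cmf.items():
    print(x, total)
print(F(-1), F(1.5), F(10))
```

The function `F` accepts any real number, which mirrors the fact that the CMF is defined on every interval between the jump points, not only at the values 𝑋 can take.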

1.5 Characteristics of Probability Distribution


Although a probability distribution shows the values taken by a random variable and their
corresponding probabilities, in most cases a researcher might be interested in deducing some
summary characteristics from such probability distribution. These summary characteristics
include among others; the expected value (population mean), variance, covariance, correlations,
etc.

1.5.1 Expected Value of a Discrete Probability Distribution


The average value of a random variable is called the expected value of the random variable, and
it is denoted by 𝐸(𝑋).
Definition:
Let 𝑋 be a discrete random variable with probability mass function 𝑓(𝑥) = 𝑃(𝑋 = 𝑥). Then the
expected value 𝐸(𝑋) is given by
\[ E(X) = \sum x P(X = x) = \sum x f(x) = \mu \]
In other words, if a discrete random variable 𝑋 has possible values 𝑥1, 𝑥2, 𝑥3, ⋯, 𝑥𝑛 with
corresponding probabilities 𝑃(𝑋 = 𝑥1), 𝑃(𝑋 = 𝑥2), 𝑃(𝑋 = 𝑥3), ⋯, 𝑃(𝑋 = 𝑥𝑛), then the expected
value is obtained by multiplying each value the random variable takes by the corresponding
probability of occurrence and adding up the products, i.e.

\[ E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + x_3 P(X = x_3) + \cdots + x_n P(X = x_n) \]


Here Σ denotes summation, whose properties are as indicated below:
Properties of summation notation
1. If 𝑘 is a constant, then
\[ \sum_{i=1}^{n} k = nk \]

2. If 𝑘 is a constant, then
\[ \sum_{i=1}^{n} k X_i = k \sum_{i=1}^{n} X_i \]

3. If both 𝑎 and 𝑏 are constants, then
\[ \sum_{i=1}^{n} (a + b X_i) = na + b \sum_{i=1}^{n} X_i \]

Properties of Expected Value


1. The expected value of a constant is equal to the same constant. Hence if 𝑘 is a constant,
𝐸(𝑘) = 𝑘
2. The expected value of the sum of two random variables is equal to sum of expected value
of the two random variables. That is for the random variables X and Y.
𝐸(𝑋 + 𝑌) = 𝐸(𝑋) + 𝐸(𝑌)

3. Also

𝐸(𝑋𝑌) ≠ 𝐸(𝑋)𝐸(𝑌)

That is, generally, the expected value of the product of two random variables is not equal
to product of the expected values of those random variables. However, there is an
exception to the rule, if X and Y are independent then
𝐸(𝑋𝑌) = 𝐸(𝑋)𝐸(𝑌)
4. If 𝑘 is a constant, then
𝐸(𝑘𝑋) = 𝑘𝐸(𝑋)
That is to say, the expected value of a constant times a random variable X, is equal to the
constant times the expected value of the R.V
5. If 𝑎 and 𝑏 are constants, then
𝐸(𝑎𝑋 + 𝑏) = 𝐸(𝑎𝑋) + 𝐸(𝑏)
= 𝑎𝐸(𝑋) + 𝑏
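
Property 5 can be checked numerically on a small distribution. The sketch below uses the two-coin PMF from Example 1.1; the constants `a = 3` and `b = 2` and the helper name `expected_value` are arbitrary choices made for illustration.

```python
def expected_value(pmf):
    """E(X) = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())


pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # number of heads in two tosses
a, b = 3, 2                        # arbitrary constants for the check

# Distribution of aX + b: same probabilities, transformed values
transformed = {a * x + b: p for x, p in pmf.items()}

lhs = expected_value(transformed)      # E(aX + b)
rhs = a * expected_value(pmf) + b      # aE(X) + b
print(lhs, rhs)
```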
1.5.2 Variance of a Discrete Probability Distribution
1.5.2.1 Variance and standard deviation

Variance indicates how individual values are spread, dispersed or distributed around the mean
value. The statistical concept of variance is also a useful measure of risk of any kind. Generally,
if X is a discrete random variable, then its variance is given by:
\[ Var(X) = E\left[(X - E(X))^2\right] = E(X^2) - (E(X))^2 = E(X^2) - \mu^2 = \sigma^2 \]
where \( E(X^2) = \sum x^2 f(x) \).
The standard deviation of X is therefore given by
\[ SD(X) = \sqrt{Var(X)} = \sigma \]
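
The step from the definition of variance to the computational form can be filled in using the properties of expected value listed earlier (linearity, and the fact that 𝜇 = 𝐸(𝑋) is a constant). A short derivation, written out here as a supplement:

```latex
\begin{align*}
Var(X) &= E\left[(X - \mu)^2\right] \\
       &= E\left[X^2 - 2\mu X + \mu^2\right] \\
       &= E(X^2) - 2\mu E(X) + \mu^2 \\
       &= E(X^2) - 2\mu^2 + \mu^2 \\
       &= E(X^2) - \mu^2
\end{align*}
```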
Example 1.4
A company estimates the net profit for a new product to be launched, with the corresponding
probabilities under different market conditions, as follows:

Market Condition               Good   Fair   Poor
Net Profit (in million Tsh.)   30     10     -3
Probability, 𝑷(𝑿 = 𝒙)          0.15   0.25   0.60

Required:
a) Calculate the expected value of the net profit for the Company
b) What is standard deviation of the net profit

Solution:
a) We know that
\[ E(X) = \sum_{i=1}^{n} x_i P(X = x_i) \]
Market Condition   Net Profit in million Tsh (𝒙)   Probability 𝒇(𝒙)   𝒙𝒇(𝒙)
Good               30                              0.15               4.5
Fair               10                              0.25               2.5
Poor               -3                              0.6                -1.8
Total                                                                 5.2

\[ E(X) = \sum_{i=1}^{n} x_i P(X = x_i) = 30 \times 0.15 + 10 \times 0.25 + (-3) \times 0.6 = 5.2 \]

Therefore the expected value of the net profit for the company under all three given market
conditions is 5.2 million Tsh.

b) The variance is given by

\[ Var(X) = \sum_{i=1}^{n} x_i^2 f(x_i) - \left[\sum_{i=1}^{n} x_i f(x_i)\right]^2 \]

Market Condition   Net Profit in million Tsh (𝑥)   Probability 𝑓(𝑥)   𝑥𝑓(𝑥)   𝑥²𝑓(𝑥)
Good               30                              0.15               4.5     135
Fair               10                              0.25               2.5     25
Poor               -3                              0.6                -1.8    5.4
Total                                                                 5.2     165.4

\[ Var(X) = 165.4 - 5.2^2 = 138.36 \]


The standard deviation is given by:

\[ \sigma_X = \sqrt{Var(X)} = \sqrt{138.36} \approx 11.76 \]

Therefore the standard deviation of the net profit for the company under all three given market
conditions is 11.76 million Tsh. This tells us how much the net profit typically deviates from the
expected value of 5.2 million Tsh. Thus, although the expected net profit is about 5.2
million Tsh, the actual profit may lie above or below this value by about 11.76 million Tsh. You may
calculate a confidence interval to estimate the interval in which the expected net profit will fall.
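
The calculation in Example 1.4 can be reproduced with a small helper. This is a sketch only; the function name `summarize` and the dictionary `profit_pmf` are illustrative names, and the numbers are those given in the example.

```python
import math


def summarize(pmf):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())                # E(X)
    second_moment = sum(x ** 2 * p for x, p in pmf.items())  # E(X^2)
    variance = second_moment - mean ** 2                     # E(X^2) - mu^2
    return mean, variance, math.sqrt(variance)


# Net profit (million Tsh) and probabilities from Example 1.4
profit_pmf = {30: 0.15, 10: 0.25, -3: 0.60}
mean, variance, sd = summarize(profit_pmf)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```

The same helper applies unchanged to any distribution given as a table of values and probabilities.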
Example 1.5

The return on a certain investment B (𝑅𝐵), in percentage, is a discrete random variable whose
probability distribution is as shown below:
𝑅𝐵      -10    -6    0     4    8     12
Prob.   0.15   0.2   0.15  0.1  0.15  0.25

Find the expected value and the standard deviation of 𝑅𝐵.


Solution
Let 𝑋 be the return on the investment. To simplify the calculation, we can summarise the sums
as indicated in the table below:

𝑥       𝑓(𝑥)   𝑥𝑓(𝑥)   𝑥²𝑓(𝑥)
-10     0.15   -1.5    15
-6      0.2    -1.2    7.2
0       0.15   0       0
4       0.1    0.4     1.6
8       0.15   1.2     9.6
12      0.25   3       36
TOTAL          1.9     69.4
Hence:

\[ E(X) = \sum x f(x) = 1.9 \]
\[ Var(X) = E(X^2) - (E(X))^2 = \sum x^2 f(x) - \left(\sum x f(x)\right)^2 = 69.4 - 1.9^2 = 65.79 \]
\[ SD(X) = \sqrt{Var(X)} = \sqrt{65.79} \approx 8.11 \]
Example 1.6

The monthly incomes of workers in a certain sector, in millions of TShs, with their associated
probabilities are as indicated in the following probability distribution:

Income        1.4    3.5   2.0    0.9   3.0
Probability   0.25   0.2   0.15   0.1   0.3

Find the expected income and the standard deviation of income for all workers.


Solution:
Let 𝑋 be the monthly income of a worker. To simplify the calculation, we can summarise the
sums as indicated in the table below:

𝑥       𝑓(𝑥)   𝑥𝑓(𝑥)   𝑥²𝑓(𝑥)
1.4     0.25   0.35    0.49
3.5     0.2    0.7     2.45
2       0.15   0.3     0.6
0.9     0.1    0.09    0.081
3       0.3    0.9     2.7
TOTAL          2.34    6.321
Hence:

\[ E(X) = \sum x f(x) = 2.34 \]
\[ Var(X) = E(X^2) - (E(X))^2 = \sum x^2 f(x) - \left(\sum x f(x)\right)^2 = 6.321 - 2.34^2 = 0.8454 \]
\[ SD(X) = \sqrt{Var(X)} = \sqrt{0.8454} \approx 0.9195 \]

Properties of variance

1. The variance of a constant is zero. That is to say, if 𝑎 is a constant, then
𝑉𝑎𝑟(𝑎) = 0

2. If X and Y are two independent random variables, then
𝑉𝑎𝑟(𝑋 + 𝑌) = 𝑉𝑎𝑟(𝑋) + 𝑉𝑎𝑟(𝑌) and
𝑉𝑎𝑟(𝑋 − 𝑌) = 𝑉𝑎𝑟(𝑋) + 𝑉𝑎𝑟(𝑌)

3. If 𝑏 is a constant, then
𝑉𝑎𝑟(𝑏 + 𝑋) = 𝑉𝑎𝑟(𝑋)

4. If 𝑎 is a constant, then
𝑉𝑎𝑟(𝑎𝑋) = 𝑎²𝑉𝑎𝑟(𝑋)

5. If X and Y are independent random variables and 𝑎 and 𝑏 are constants, then
𝑉𝑎𝑟(𝑎𝑋 + 𝑏𝑌) = 𝑎²𝑉𝑎𝑟(𝑋) + 𝑏²𝑉𝑎𝑟(𝑌)
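
Properties 2 and 5 can be verified on a small example by building the distribution of a combination of two independent variables. Everything named below (`variance`, `combine`, the second variable `pmf_y` for a single fair coin, and the constants `a` and `b`) is a hypothetical choice made for this sketch.

```python
from fractions import Fraction
from itertools import product


def variance(pmf):
    """Var(X) = E(X^2) - (E(X))^2 for a dict {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2


def combine(pmf_x, pmf_y, func):
    """PMF of func(X, Y) when X and Y are independent."""
    result = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        value = func(x, y)
        # Independence: P(X = x and Y = y) = P(X = x) * P(Y = y)
        result[value] = result.get(value, 0) + px * py
    return result


pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}  # heads in two tosses
pmf_y = {0: Fraction(1, 2), 1: Fraction(1, 2)}                     # heads in one toss
a, b = 3, -2

var_sum = variance(combine(pmf_x, pmf_y, lambda x, y: x + y))
var_diff = variance(combine(pmf_x, pmf_y, lambda x, y: x - y))
var_lin = variance(combine(pmf_x, pmf_y, lambda x, y: a * x + b * y))
print(var_sum, var_diff, var_lin)
```

Note that the variance of the difference equals the variance of the sum, as property 2 states; variances add even when the variables are subtracted.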

1.12.1 Binomial Distribution

1.12.1.1 Definition of key terms

1.12.1.2 Binomial experiment

A binomial experiment is a statistical experiment which consists of 𝑛 repeated trials. Each trial can
result in just two possible outcomes. We call one of these outcomes a success and the other a
failure. The probability of success, denoted by p, is expected to be constant on every trial.

Examples of binomial experiments

• An experiment of tossing a fair coin
• An experiment of selecting a defective item from a given sample
• Asking 10 people if they have ever been to South Africa

1.12.1.3 Binomial Random Variable

This is the number of successes, denoted by X, in n repeated trials of a binomial
experiment.

Binomial Probability Distribution


To understand binomial distributions and binomial probability, it helps to understand binomial
experiments and some associated notation. A binomial experiment (also known as a
Bernoulli process), as described in the previous section, is a random
experiment that has the following properties:
i) The experiment consists of a finite number n of repeated trials.
ii) Each trial can result in just two mutually exclusive possible outcomes. We call one of these
outcomes a success and the other a failure.
iii) The probability of success, denoted by p, is the same (constant) on every trial.
iv) The trials are independent; that is, the outcome of one trial does not affect the outcome
of other trials.

Consider the following random experiment. You flip a coin 10 times and count the number of
times the coin lands on tails. This is a binomial experiment because:

• The experiment consists of repeated trials. We flip a coin 10 times.
• Each trial can result in just two possible outcomes: heads or tails.
• The probability of success is constant, i.e. 0.5 on every trial.
• The trials are independent; that is, getting tails on one trial does not affect whether we get
tails on other trials.

Notation: Suppose that we have:

• x = the number of successes that result from the binomial experiment.
• n = the number of trials in the binomial experiment.
• p = the probability of success on an individual trial.
• q = the probability of failure on an individual trial (this is equal to 1 - p).
• b(x; n, p) = binomial probability: the probability that an n-trial binomial experiment
results in exactly x successes, when the probability of success on an individual trial is p.
Then the model for specifying the probability of obtaining exactly x successes in a given number
of trials n is given by

\[ b(x; n, p) = P(X = x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, 2, \ldots, n \]

Note:
\[ \binom{n}{x} = {}^{n}C_x = \frac{n!}{x!\,(n-x)!} \]
1.12.2 Mean, Variance, and Standard Deviation for a Binomial Random Variable
The expected value (mean) of a binomial random variable is given by 𝐸(𝑋) = 𝑛𝑝, and the standard
deviation is given by 𝑆𝐷(𝑋) = √(𝑛𝑝(1 − 𝑝)). That is,
Mean: 𝜇 = 𝑛𝑝

Variance: 𝜎² = 𝑛𝑝𝑞

Standard Deviation: 𝜎 = √(𝑛𝑝𝑞)
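
The binomial model b(x; n, p) and the formulas for its mean and variance can be checked numerically. The sketch below uses the 10-flip coin experiment described earlier (n = 10, p = 0.5); the function name `binomial_pmf` is an illustrative choice, and `math.comb` from the Python standard library supplies the binomial coefficient.

```python
from math import comb


def binomial_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


n, p = 10, 0.5   # ten flips of a fair coin, success = tails
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# Mean and variance computed directly from the distribution
mean = sum(x * prob for x, prob in pmf.items())
variance = sum(x ** 2 * prob for x, prob in pmf.items()) - mean ** 2

print(pmf[4])                       # P(exactly 4 tails)
print(mean, n * p)                  # compare with np
print(variance, n * p * (1 - p))    # compare with npq
```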

Example 1.10
Given that the expected value of a binomial distribution is 40 and the standard deviation is 6.

Required:
Calculate n, p and q.

Solution:
From
\[ \mu = np = 40 \quad (1) \]
and
\[ \sigma^2 = npq = 6^2 = 36 \quad (2) \]
Substituting (1) into (2), we have
\[ 40q = 36 \;\Rightarrow\; q = 0.9 \]
Since p + q = 1, p = 0.1, and from (1), n = 40 / 0.1 = 400.
Thus p = 0.1, q = 0.9, and n = 400
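
The same algebra can be written as a small function that recovers the parameters from a given mean and standard deviation. The name `binomial_parameters` is hypothetical, and the sketch assumes the inputs really come from a binomial distribution (so that the variance is smaller than the mean).

```python
def binomial_parameters(mean, sd):
    """Solve mean = np and sd^2 = npq for (n, p, q)."""
    variance = sd ** 2
    q = variance / mean   # npq / np = q
    p = 1 - q             # p + q = 1
    n = mean / p          # from mean = np
    return n, p, q


n, p, q = binomial_parameters(40, 6)
print(round(n), round(p, 2), round(q, 2))
```

The results are rounded before printing because floating-point arithmetic leaves tiny errors in `p` and `n`.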
Example 1.11
Assume that on average one telephone number out of 15 is busy.

Required:
What is the probability that, if six randomly selected telephone numbers are dialled,
a) Not more than three will be busy?
b) At least three of them will be busy?

Solution:
\[ p = \frac{1}{15}, \quad q = \frac{14}{15}, \quad n = 6 \]
then
a) \[ P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.9997 \]
(use the binomial distribution formula)
b) \[ P(X \ge 3) = 1 - P(X < 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)] = 0.0051 \]
(use the same approach as in part a)
Example 1.12

The probability that a student is accepted to a prestigious college is 0.3. If 5 students from the
same school apply, what is the probability that at most 2 are accepted?

Data Given:

𝑝 = 0.3, 𝑛 = 5
Find 𝑃(𝑋 ≤ 2)
𝑃(𝑋 ≤ 2) = 𝑃(𝑋 = 0) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 2)
From
\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^x (1-p)^{n-x} \]
\[ P(X = 0) = \frac{5!}{(5-0)!\,0!} \times 0.3^0 \times 0.7^5 = \frac{5!}{5!\,0!} \times 0.3^0 \times 0.7^5 = 0.16807 \]
\[ P(X = 1) = \frac{5!}{(5-1)!\,1!} \times 0.3^1 \times 0.7^4 = \frac{5!}{4!\,1!} \times 0.3^1 \times 0.7^4 = 0.36015 \]
\[ P(X = 2) = \frac{5!}{(5-2)!\,2!} \times 0.3^2 \times 0.7^3 = \frac{5!}{3!\,2!} \times 0.3^2 \times 0.7^3 = 0.3087 \]

Therefore:

\[ P(X \le 2) = 0.16807 + 0.36015 + 0.3087 = 0.83692 \]
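
Cumulative binomial probabilities such as those in Examples 1.11 and 1.12 are sums of individual binomial terms, which is straightforward to express in code. The function names `binomial_pmf` and `binomial_cdf` are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): add the terms P(X = 0), ..., P(X = x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Example 1.12: n = 5, p = 0.3, at most 2 accepted
at_most_two = binomial_cdf(2, 5, 0.3)

# Example 1.11: n = 6, p = 1/15
not_more_than_three = binomial_cdf(3, 6, 1 / 15)
at_least_three = 1 - binomial_cdf(2, 6, 1 / 15)   # complement of P(X <= 2)

print(round(at_most_two, 5))
print(round(not_more_than_three, 4), round(at_least_three, 4))
```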
1.12.2 Poisson Probability Distribution
A Poisson experiment is a random experiment that has the following properties:
• The occurrence of events is independent
• The average number of successes (𝜆) that occurs in a specified region is known.
• There is a possibility of infinite number of occurrences in a specified region
• The probability that a success will occur is proportional to the size of the region.

Note: the specified region could take many forms. For instance, it could be a length, an area, a
volume, a period of time, etc.
1.12.2.1 Application of Poisson Distribution
• The number of deaths from COVID-19 worldwide in 2020
• The number of birth defects and genetic mutations
• The number of car accidents in Dar es Salaam city
• The number of typing errors on a page
• The spread of an endangered animal in Sub-Saharan Africa
• The number of failures of a machine in one month

Notation
The following notation is helpful when we talk about the Poisson distribution.
• 𝑒: A constant approximately equal to 2.71828. (𝑒 is the base of the natural
logarithm system.)
• λ: The mean number of successes that occur in a specified region.
• x: The actual number of successes that occur in a specified region.
• P(x, λ): The Poisson probability that exactly x successes occur in a Poisson
experiment, when the mean number of successes is λ.

The probability distribution of the Poisson random variable X, representing the number of
outcomes occurring in a given time interval or a specified region of space, is:

\[ P(x, \lambda) = P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots \]

where λ represents the average number of outcomes occurring in the specified time or region.
Furthermore, if X has a Poisson distribution, then 𝐸(𝑋) = 𝜆 and 𝑆𝐷(𝑋) = √𝜆

Example 1.13
The average number of days a school is closed due to snow during winter in a certain city in the USA
is 4. Calculate the probability that the schools in this city will close for 6 days during a winter.

Solution:

Given λ = 4 and x = 6, using the Poisson distribution:

\[ P(X = 6) = \frac{e^{-4}\, 4^{6}}{6!} = 0.1042 \]
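
The Poisson formula translates directly into code. In this sketch the function name `poisson_pmf` and the parameter name `lam` (for λ) are illustrative choices; the numbers are those of Example 1.13.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam ** x / factorial(x)


# Example 1.13: mean of 4 closure days, probability of exactly 6
p_six_days = poisson_pmf(6, 4)
print(round(p_six_days, 4))
```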
6!
Note: When the value of 𝑛 in a binomial distribution is very large and the value of probability
𝑝 is very small, the binomial distribution can be approximated by a Poisson distribution. More
specifically, if 𝑛 > 20, 𝑛𝑝 ≤ 10, and 𝑝 ≤ 0.01 then the Poisson is a good approximation.

See the following example

Example 1.14
Suppose that on average, 1 person in every 1000 is an alcoholic. Find the probability that a random
sample of 8000 people will yield fewer than 7 alcoholics.
Solution:
Let 𝑋 represent the number of alcoholic persons in the sample.

\[ p = \frac{1}{1000} = 0.001, \quad n = 8000 \]
Since p is very small and n is very large, then
𝜆 = 𝑛𝑝 = 0.001 × 8000 = 8
Now,

\[ P(X < 7) = P(X = 0) + P(X = 1) + P(X = 2) + \cdots + P(X = 6) \]
\[ = \frac{e^{-8}\, 8^{0}}{0!} + \frac{e^{-8}\, 8^{1}}{1!} + \frac{e^{-8}\, 8^{2}}{2!} + \cdots + \frac{e^{-8}\, 8^{6}}{6!} = 0.3134 \]
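
To see how good the Poisson approximation is in Example 1.14, the approximate answer can be compared with the exact binomial sum. This is a sketch for illustration; the variable names are assumptions, and the exact binomial value is computed here rather than quoted from the notes.

```python
from math import comb, exp, factorial

n, p = 8000, 0.001
lam = n * p   # lambda = np = 8

# Poisson approximation: P(X < 7) = sum of Poisson terms for k = 0..6
poisson_tail = sum(exp(-lam) * lam ** k / factorial(k) for k in range(7))

# Exact binomial probability for the same event
binomial_tail = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(7))

print(round(poisson_tail, 4))
print(round(binomial_tail, 4))
print(abs(poisson_tail - binomial_tail))
```

The two values differ only slightly, which is what the rule of thumb for large 𝑛 and small 𝑝 leads one to expect.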
Example 1.15

The number of customers attended to at CRDB bank follows a Poisson distribution with a mean of
10 customers per hour. Find the probability that in any given hour

a) Exactly 6 customers will be attended to
b) No customer will be attended to
c) At least 2 customers will be attended to

Solution:
Data Given
𝜆 = 10
From:
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \]
a) \[ P(X = 6) = \frac{e^{-10} \times 10^{6}}{6!} = 0.063055 \]

b) \[ P(X = 0) = \frac{e^{-10} \times 10^{0}}{0!} = 0.0000454 \]

c) \[ P(X \ge 2) = 1 - P(X < 2) = 1 - \left[\frac{e^{-10} \times 10^{0}}{0!} + \frac{e^{-10} \times 10^{1}}{1!}\right] = 1 - [0.0000454 + 0.000454] = 0.9995 \]

1.13 Continuous Probability Distributions


If a random variable is a continuous variable, its probability distribution is called a continuous
probability distribution. There are different types of continuous distribution; the normal
distribution is perhaps the most widely used continuous distribution and is the one discussed in
these lecture notes.
A continuous probability distribution differs from a discrete probability distribution in several
ways.
• The probability that a continuous random variable will assume a particular value is zero.
• As a result, a continuous probability distribution cannot be expressed in tabular form.
• Instead, an equation or formula 𝑓(𝑥) is used to describe a continuous probability
distribution.

Most often, the equation used to describe a continuous probability distribution is called a
probability density function (PDF). Sometimes, it is referred to as a density function. For
a continuous probability distribution, the density function has the following properties:

1. \( f(x) \ge 0 \)
The density is never negative. Unlike a probability, a density value may exceed 1 at some points,
provided the total area under the curve equals 1. Since the continuous random variable is defined
over a continuous range of values (called the domain of the variable), the graph of the density
function is typically also continuous over that range.

2. \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \)
The area bounded by the curve of the density function and the x-axis is equal to 1, when
computed over the domain of the variable.

3. \( A = P(a \le X \le b) = \int_{a}^{b} f(x)\,dx \)
The probability that a random variable assumes a value between a and b is equal to the
area under the density function bounded by a and b.
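
The area interpretation of a density can be illustrated numerically. The density used below, f(x) = x/2 on the interval [0, 2], is a hypothetical example chosen only because its areas are easy to verify by hand; it does not appear elsewhere in these notes. The integral is approximated with a simple midpoint rule.

```python
def density(x):
    """Hypothetical density: f(x) = x / 2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


total_area = integrate(density, 0, 2)        # property 2: should be 1
prob_between = integrate(density, 0.5, 1.5)  # property 3: P(0.5 <= X <= 1.5)
print(round(total_area, 6), round(prob_between, 6))
```

By hand, the second area is (1.5² − 0.5²)/4 = 0.5, which the numerical result should match closely.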

1.14 Normal Distribution
The normal distribution is perhaps the single most important probability distribution involving
a continuous random variable. By definition, a continuous random variable X has a normal
distribution if its probability density function (PDF) is given by:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty \]

If X has a normal distribution, then 𝐸(𝑋) = 𝜇 and 𝑉𝑎𝑟(𝑋) = 𝜎²

The common notation for a normal random variable is 𝑋 ~ 𝑁(𝜇, 𝜎²), where ~ means "distributed
as", N stands for the normal distribution, and the quantities enclosed in brackets are the
parameters of the distribution: the population mean or expected value 𝜇 and the variance 𝜎².
1.14.1 Properties of normal distribution
1) It is bell-shaped ranging from negative to positive infinity
2) It is symmetrical around its mean value 𝜇 (see figure 1)
3) The area under the curve is unity

Figure 1: Normal Distribution Curve

1.14.2 Standard Normal distribution


The PDF of a normal random variable is somewhat complicated to use. Therefore, in order to find
probabilities for a normal random variable, say X, we normally use the so-called standard normal
variable Z, defined as indicated below.
\[ Z = \frac{X - \mu}{\sigma} \]
where Z has zero mean and unit variance. The common notation for a standard
normal random variable is as indicated below:
𝑍~𝑁(0,1)

Therefore, a normally distributed random variable with a given mean and variance can be
converted to a standard normal variable (also known as a normal deviate), which greatly simplifies
the task of computing probabilities.
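
Standardization can also be carried out in code. The sketch below evaluates the standard normal cumulative probability through the error function `math.erf` from the Python standard library, using the identity Φ(z) = ½[1 + erf(z/√2)]; the function names are illustrative assumptions.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2): standardize, then use the Z distribution."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)


upper_tail = 1 - standard_normal_cdf(3.0)   # P(Z > 3)
print(standard_normal_cdf(0.0))
print(round(upper_tail, 4))
```

Values obtained this way may differ in the last digit from table-based answers, because printed tables round z to two decimal places.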

1.14.3 The Standard Normal Distribution Table


A standard normal distribution table shows the probability associated with a
particular z-score. The worked examples below use the form of the table that gives the area
between 0 and z, that is 𝑃(0 ≤ 𝑍 ≤ 𝑧). Table rows show the whole number and tenths place of the
z-score. Table columns show the hundredths place.

Example 1.15
Find the probability that a 𝑍-score will be greater than 3.00 from the standard normal Table.
Solution:
Required to find 𝑃(𝑍 > 3.00)
𝑃(𝑍 > 3.00) = 0.5 − 𝑃(0 ≤ 𝑍 ≤ 3.00)
= 0.5 − 0.4987
= 0.0013

Example 1.16
It is given that the daily sale of bread in a bakery follows the normal distribution with a mean of
70 loaves and a variance of 9, i.e. 𝑋~𝑁(70, 9). What is the probability that on any given day the sale
of bread is greater than 75 loaves?
Solution:
Data Given:
𝜇 = 70, 𝜎² = 9, 𝜎 = 3
Let 𝑋 represent the daily sale of bread in the bakery, then:

\begin{align*}
P(X > 75) &= P\left(\frac{X - \mu}{\sigma} > \frac{75 - 70}{3}\right) \\
          &= P(Z > 1.67) \\
          &= 0.5 - P(0 \le Z \le 1.67) \\
          &= 0.5 - 0.4525 \\
          &= 0.0475
\end{align*}

Example 1.17
An investor is considering purchasing a stock whose monthly return is approximately normally
distributed with an expected return of 0.01 and a standard deviation of 0.02. Use the standard
normal distribution table to find the probability that the stock return is positive.
Data Given:
𝜇 = 0.01, 𝜎 = 0.02
Let 𝑋 represent the stock return, then required to find:
\begin{align*}
P(X \ge 0) &= P\left(\frac{X - \mu}{\sigma} \ge \frac{0 - 0.01}{0.02}\right) \\
           &= P(Z \ge -0.5) \\
           &= P(Z \le 0.5) \quad \text{(by symmetry of the normal curve)} \\
           &= 0.5 + P(0 \le Z \le 0.5) \\
           &= 0.5 + 0.1915 \\
           &= 0.6915
\end{align*}

Example 1.18
The incomes, in thousands of dollars, of a given company are normally distributed with a mean of
20 and a standard deviation of 5. Find the probability that a selected income will be
a) More than twenty-five thousand dollars
b) Anywhere between eighteen and twenty-four thousand dollars
Data Given:
𝜇 = 20, 𝜎 = 5
Let 𝑋 represent the income of the company, then required to find:

a)
\begin{align*}
P(X > 25) &= P\left(\frac{X - \mu}{\sigma} > \frac{25 - 20}{5}\right) \\
          &= P(Z > 1) \\
          &= 0.5 - P(0 \le Z \le 1) \\
          &= 0.5 - 0.3413 \\
          &= 0.1587
\end{align*}
b)
\begin{align*}
P(18 \le X \le 24) &= P\left(\frac{18 - 20}{5} \le \frac{X - \mu}{\sigma} \le \frac{24 - 20}{5}\right) \\
                   &= P(-0.4 \le Z \le 0.8) \\
                   &= P(0 \le Z \le 0.4) + P(0 \le Z \le 0.8) \\
                   &= 0.1554 + 0.2881 \\
                   &= 0.4435
\end{align*}
Example 1.19
Applicants for a certain job are given an aptitude test. Past experience shows that scores from the
test are normally distributed with a mean of 60 points and a standard deviation of 12 points. What
percentage of candidates would be expected to pass the test, if a minimum score of 75 is required?
Data Given:
𝜇 = 60, 𝜎 = 12
Let 𝑋 represent the score of a candidate; required to find 𝑃(𝑋 ≥ 75)
Solution:
\begin{align*}
P(X \ge 75) &= P\left(\frac{X - \mu}{\sigma} \ge \frac{75 - 60}{12}\right) \\
            &= P(Z \ge 1.25) \\
            &= 0.5 - P(0 \le Z \le 1.25) \\
            &= 0.5 - 0.3944 \\
            &= 0.1056
\end{align*}
Conclusion: About 10.56% of candidates would be expected to pass the test

Example 1.20

For a group of 1800 employees of a manufacturing company, IQ is approximately normally
distributed with mean 110 and standard deviation 12. It is known from experience that for a
particular job only persons with IQs of at least 95 are intelligent enough to do it, but those with
IQs greater than 120 soon become bored and unhappy with it. On the basis of IQ alone, how many
of the 1800 employees would you expect to be suitable for the work?
Data Given:
𝑁 = 1800, 𝜇 = 110, 𝜎 = 12

Let 𝑋 represent the IQ of employees. Required to find 𝑃(95 ≤ 𝑋 ≤ 120)

Solution:
\begin{align*}
P(95 \le X \le 120) &= P\left(\frac{95 - 110}{12} \le \frac{X - \mu}{\sigma} \le \frac{120 - 110}{12}\right) \\
                    &= P(-1.25 \le Z \le 0.833) \\
                    &= P(0 \le Z \le 1.25) + P(0 \le Z \le 0.833) \\
                    &= 0.3944 + 0.2967 \\
                    &= 0.6911 \approx 69.11\%
\end{align*}
Then:
Find 69.11% of the total employees:
= 0.6911 × 1800
≈ 1244
Conclusion: The results indicate that about 1244 employees would be suitable for the work based
on IQ alone
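
The interval probability and the expected head count in Example 1.20 can be cross-checked without a table. This sketch again evaluates the normal cumulative probability through `math.erf`; the variable names are illustrative. Because the code does not round z to two decimals, its result is slightly above the table-based figure.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


mu, sigma, employees = 110, 12, 1800

# P(95 <= X <= 120) as a difference of two cumulative probabilities
prob_suitable = normal_cdf(120, mu, sigma) - normal_cdf(95, mu, sigma)
expected_suitable = prob_suitable * employees

print(round(prob_suitable, 4))
print(round(expected_suitable))
```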
