


2.1 Mutually Exclusive Events
Mutually exclusive events are events that cannot occur at the
same time. For example, when a coin is tossed the result will be
either head or tail, but we cannot get both results. Such
events are also called disjoint events since they do not happen
at the same time.

If A and B are mutually exclusive events, the probability that
either occurs is written P(A or B) or P (A ∪ B). The probability
that both occur is zero:

Probability of Disjoint (or) Mutually Exclusive Events = P (A and B) = 0
In probability, the specific addition rule is valid when two
events are mutually exclusive. It states that the probability of
either event occurring is the sum of the probabilities of each
event occurring.

If A and B are mutually exclusive events, then the probability of
event A or event B occurring, P (A ∪ B), is given by P(A) + P(B),
i.e.,

P (A or B) = P(A) + P(B)

P (A ∪ B) = P(A) + P(B)

If the events A and B are not mutually exclusive, the
probability of getting A or B that is P (A ∪ B) formula is given
as follows:
P (A ∪ B) = P(A) + P(B) – P (A and B)
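
The two addition rules can be checked by counting outcomes on a fair die. The sketch below is only an illustration: the helper name prob and the chosen events are assumptions, not part of the notes, and it assumes all outcomes are equally likely.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

die = set(range(1, 7))                     # sample space of one fair die
two, five = {2}, {5}                       # mutually exclusive events
even, at_least_4 = {2, 4, 6}, {4, 5, 6}    # overlapping events

# Mutually exclusive: P(A and B) = 0, so P(A or B) = P(A) + P(B)
p_two_or_five = prob(two | five, die)
specific_rule = prob(two, die) + prob(five, die)

# Not mutually exclusive: subtract the overlap P(A and B)
p_even_or_high = prob(even | at_least_4, die)
general_rule = (prob(even, die) + prob(at_least_4, die)
                - prob(even & at_least_4, die))

print(p_two_or_five, specific_rule)    # both 1/3
print(p_even_or_high, general_rule)    # both 2/3
```

Using exact fractions rather than floats keeps the comparison between the direct count and the formula exact.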

Examples of mutually exclusive events

Some of the examples of the mutually exclusive events are:

MME 8201 Industrial Modeling & Algorithms Dr Kizito Paul Mubiru

1. When tossing a coin, the events of getting a head and getting
a tail are mutually exclusive, because the probability of
getting a head and a tail simultaneously is 0.

2. In a six-sided die, the events “2” and “5” are mutually
exclusive. We cannot get both 2 and 5 at the same time when we
throw one die.

2.2 Collectively Exhaustive Events

In probability, a set of events is collectively exhaustive if they
cover all of the probability space: the probability that at least
one of them happens is 100%. If a set of statements is collectively
exhaustive, we know at least one of them is true.
These types of events or statements may or may not be mutually
exclusive. This is because knowing that they cover all possibilities
doesn’t tell us anything about whether or not they are redundant or
whether two or more events may happen at the same time.

If you are rolling a six-sided die, the set of events {1, 2, 3,
4, 5, 6} is collectively exhaustive. Any roll must be
represented by one of the set.

Sometimes a small change can make a set that is not collectively

exhaustive into one that is.

Another way to describe collectively exhaustive events is that
their union must cover all the events within the entire sample
space.

For example, events A and B are said to be collectively exhaustive
if A ∪ B = S, where S is the sample space. Compare this to the
concept of a set of mutually exclusive events.

In such a set no more than one event can occur at a given time.
The set of all possible die rolls is both collectively exhaustive
and mutually exclusive. The outcomes 1 and 6 are mutually exclusive
but not collectively exhaustive.

The outcomes "even" and "not-6" are collectively exhaustive but
not mutually exclusive. In some forms of mutual exclusion only one
event can ever occur, whether collectively exhaustive or not.
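
The die examples above can be verified with sets. This is a minimal sketch; the function names collectively_exhaustive and mutually_exclusive are illustrative assumptions.

```python
from itertools import combinations

def collectively_exhaustive(events, sample_space):
    """True if the union of the events covers the whole sample space."""
    return set().union(*events) == sample_space

def mutually_exclusive(events):
    """True if no two events share an outcome."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))

S = {1, 2, 3, 4, 5, 6}                           # one roll of a die
all_rolls = [{k} for k in S]                     # {1}, {2}, ..., {6}
one_and_six = [{1}, {6}]
even_and_not_six = [{2, 4, 6}, {1, 2, 3, 4, 5}]

for name, events in [("all rolls", all_rolls),
                     ("1 and 6", one_and_six),
                     ("even and not-6", even_and_not_six)]:
    print(name,
          "| exhaustive:", collectively_exhaustive(events, S),
          "| exclusive:", mutually_exclusive(events))
```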

2.3 Statistically Dependent Events

Dependent events in probability are events for which the
occurrence of one affects the probability of occurrence of the
other.

Suppose a bag has 3 red and 6 green balls. Two balls are drawn
from the bag one after the other.

Let A be the event of drawing a red ball in the first draw and B be
the event of drawing a green ball in the second draw.

If the ball drawn in the first draw is not replaced in the bag,
then A and B are dependent events because P(B) is decreased or
increased according to whether the first draw results in a red or
a green ball (6/8 after a red ball, 5/8 after a green ball).
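
As a small sketch of the bag example (3 red, 6 green), the function below computes the chance of green on the second draw given the colour of the first. The function name and its arguments are assumptions made for illustration.

```python
from fractions import Fraction

def p_green_second(red, green, first):
    """P(green on draw 2 | colour of draw 1), drawing without replacement."""
    if first == "red":
        red -= 1        # one red ball has left the bag
    else:
        green -= 1      # one green ball has left the bag
    return Fraction(green, red + green)

after_red = p_green_second(3, 6, "red")        # 6 green among 8 balls
after_green = p_green_second(3, 6, "green")    # 5 green among 8 balls
with_replacement = Fraction(6, 9)              # bag restored, draws independent

# P(A and B) = P(red first) * P(green second | red first)
p_a_and_b = Fraction(3, 9) * after_red

print(after_red, after_green, with_replacement, p_a_and_b)
```

The two conditional values differ from each other and from the with-replacement value, which is what makes A and B dependent.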

Probability is the chance of some event happening. The term
“event” means one or more outcomes of an experiment. The total
events are all the outcomes which may possibly occur in the
experiment asked about in the question. The events of interest
are known as favorable events.

In probability, an event is defined to be a set of possible
outcomes of an experiment, that is, a subset of the sample space.

(i) Obtaining a head in a toss of a coin may be called an event.
(ii) Getting a 6 on a roll of a die is said to be an event.
(iii) Getting a sum of 9 on the roll of a pair of dice is an event.

An event whose chance of happening is 100% is called a sure
event. The probability of such an event is 1. In a sure event, one
is certain to get the desired outcome.
On the other hand, when there is no chance of an event happening,
the probability of such an event is zero. This is said to be an
impossible event.

On the basis of their nature, events are classified into three
types, which are as follows:
A) Independent Events
B) Dependent Events
C) Mutually-Exclusive Events

Dependent Events - Definition

Dependent events are those which depend upon what happened
before. These events are affected by outcomes that have already
occurred; i.e., two or more events that depend on one another are
known as dependent events. If the outcome of one event changes,
the probability of the other is likely to change as well.


Example 1
Sharon has to select two students from a class of 23 girls and
25 boys. What is the probability that both students chosen are
boys?
Solution: Total number of students = 23 + 25 = 48
Probability of choosing the first boy, say Boy 1 = 25/48
Probability of choosing second boy, say Boy 2 = 24/47
P(Boy 1 and Boy 2) = P(Boy 1) x P(Boy 2|Boy 1)
= 25/48 x 24/47
= 600/2256 = 25/94 ≈ 0.266
Example 2
In a survey it was found that 10 out of 13 people walk to the
office. 3 persons are selected randomly from these 13, without
replacement. What is the probability that all three walk to the
office?
The probability that all three walk to the office = 10/13 x 9/12
x 8/11 = 720/1716 = 60/143 ≈ 0.42
Example 3
A bag contains 6 red, 5 blue, and 4 yellow balls. 2 balls are
drawn one after the other, without replacement. Find the
following:
a] P (red, then blue)
b] P (blue, then blue)
Solution:
a] There are six red balls and a total of fifteen balls.
P (red) = 6 / 15
The first draw affects the probability of the second draw.
Number of blue balls = 5
Total number of balls left = 14
P (drawing blue after red) = 5 / 14
P (red, then blue) = (6/15) x (5/14) = 30/210 = 1/7
b] The probability of drawing a blue ball = 5 / 15
The first draw affects the probability of the second draw.
Now there are 4 blue balls left and a total of 14 balls left.
P (drawing a blue ball after a blue ball) = 4 / 14
P (blue, then blue) = (5/15) x (4/14) = 20/210 = 2/21
Example 4
A wallet contains 4 bills of 5 dollars, 5 bills of 10 dollars
and 3 bills of 20 dollars. 2 bills are chosen randomly without
replacement. Find the P (drawing a 5 dollar bill followed by a 5
dollar bill).

Number of 5 dollar bills = 4

Total number of bills = 12

P (drawing a 5 dollar bill) = 4 / 12

The first draw affects the probability of the second draw.

Number of 5 dollar bills left = 3

A total of 11 bills are left.

P (drawing a 5 dollar bill after a 5 dollar bill) = 3 / 11

P (drawing a 5 dollar bill followed by a 5 dollar bill)

= P (drawing a 5 dollar bill) * P (drawing a 5 dollar bill

after a 5 dollar bill)

= (4/12) x (3/11)
= 1/11
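
The worked examples all multiply conditional probabilities for draws without replacement. A hedged sketch of that pattern follows; the helper p_sequence and the category labels are assumptions introduced for illustration.

```python
from fractions import Fraction

def p_sequence(counts, sequence):
    """Probability of drawing the categories in `sequence`, in order,
    without replacement, from a population described by `counts`."""
    remaining = dict(counts)          # copy so the caller's dict is untouched
    total = sum(remaining.values())
    p = Fraction(1)
    for item in sequence:
        p *= Fraction(remaining[item], total)   # conditional probability of this draw
        remaining[item] -= 1                    # the drawn item is not put back
        total -= 1
    return p

# Example 1: two boys from 23 girls and 25 boys
two_boys = p_sequence({"girl": 23, "boy": 25}, ["boy", "boy"])

# Example 3: balls drawn from 6 red, 5 blue, 4 yellow
bag = {"red": 6, "blue": 5, "yellow": 4}
red_then_blue = p_sequence(bag, ["red", "blue"])
blue_then_blue = p_sequence(bag, ["blue", "blue"])

# Example 4: two 5 dollar bills from 4 fives, 5 tens, 3 twenties
two_fives = p_sequence({"five": 4, "ten": 5, "twenty": 3}, ["five", "five"])

print(two_boys, red_then_blue, blue_then_blue, two_fives)
```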

2.4 Random Variables

A random variable is a variable whose value is unknown or a
function that assigns values to each of an experiment's outcomes.

Random variables are often used in econometric or regression
analysis to determine statistical relationships among variables.
Random variables are often designated by letters and can be

classified as discrete, which are variables that have specific
values, or continuous, which are variables that can have any
values within a continuous range.

In probability and statistics, random variables are used to

quantify outcomes of a random occurrence, and therefore, can take
on many values. Random variables are required to be measurable
and are typically real numbers. For example, the letter X may be
designated to represent the sum of the resulting numbers after
three dice are rolled. In this case, X could be 3 (1 + 1 + 1), 18
(6 + 6 + 6), or somewhere between 3 and 18, since the highest
number of a die is 6 and the lowest number is 1.

Example of a Random Variable

A typical example of a random variable is the outcome of a coin

toss. Consider a probability distribution in which the outcomes
of a random event are not equally likely to happen. If random
variable, Y, is the number of heads we get from tossing two coins,
then Y could be 0, 1, or 2. This means that we could have no
heads, one head, or both heads on a two-coin toss.

However, the two coins land in four different ways: TT, HT, TH,
and HH. Therefore, the P(Y=0) = 1/4 since we have one chance of
getting no heads (i.e., two tails [TT] when the coins are tossed).
Similarly, the probability of getting two heads (HH) is also 1/4.
Notice that getting one head has a likelihood of occurring twice:
in HT and TH. In this case, P (Y=1) = 2/4 = 1/2.
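
The distribution of Y can be reproduced by listing the four equally likely outcomes. This is only a sketch; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two fair coins
outcomes = list(product("HT", repeat=2))        # HH, HT, TH, TT

# Y = number of heads in each outcome
head_counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {y: Fraction(c, len(outcomes))
                for y, c in sorted(head_counts.items())}

for y, p in distribution.items():
    print(f"P(Y={y}) = {p}")
```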


2.5 Probability Distributions

A probability distribution is a statistical function that
describes all the possible values and likelihoods that a random
variable can take within a given range. Where a value is likely to
fall within that range depends on a number of factors. These
factors include the distribution's mean (average), standard
deviation, skewness, and kurtosis.
For instance, if X is used to denote the outcome of a coin toss

("the experiment"), then the probability distribution of X would
take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5
for X = tails (assuming that the coin is fair).

Examples of random phenomena include the weather conditions at

some future date, the height of a randomly selected person, the
fraction of male students in a school, the results of
a survey to be conducted, etc.

Example of a Probability Distribution

As a simple example of a probability distribution, let us look
at the number observed when rolling two standard six-sided dice.
Each die has a 1/6 probability of rolling any single number, one
through six, but the sum of two dice forms a triangular
distribution that peaks at seven. Seven is the most common outcome
(1+6, 6+1, 5+2, 2+5, 3+4, 4+3), with probability 6/36. Two and
twelve, on the other hand, are far less likely (1+1 and 6+6), each
with probability 1/36.
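
The distribution of the sum of two dice can be tabulated by enumerating all 36 rolls. The following is a minimal sketch with illustrative variable names, assuming fair dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs
rolls = list(product(range(1, 7), repeat=2))

# Count how many pairs give each sum, then convert counts to probabilities
sum_counts = Counter(a + b for a, b in rolls)
dice_sum_pmf = {s: Fraction(c, len(rolls)) for s, c in sorted(sum_counts.items())}

for s, p in dice_sum_pmf.items():
    print(f"P(sum={s:2d}) = {p}  {'#' * sum_counts[s]}")
```

The row of # marks gives a rough text version of the triangular shape.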

2.5.1 Binomial Distribution

A binomial distribution can be thought of as simply the
probability of a SUCCESS or FAILURE outcome in an experiment or
survey that is repeated multiple times. The binomial is a type
of distribution that has two possible outcomes (the prefix “bi”
means two, or twice).

For example, a coin toss has only two possible outcomes, heads
or tails, and taking a test could have two possible outcomes,
pass or fail.

• The first variable in the binomial formula, n, stands for the
number of times the experiment runs.

• The second variable, p, represents the probability of one
specific outcome.

For example, let’s suppose you wanted to know the probability of
getting a 1 on a die roll. The probability of rolling a one on any
throw is 1/6. Roll twenty times and you have a binomial
distribution of (n=20, p=1/6). SUCCESS would be “roll a one” and
FAILURE would be “roll anything else.”

If the outcome in question was the probability of the die landing

on an even number, the binomial distribution would then become
(n=20, p=1/2). That’s because your probability of throwing an even
number is one half.

Binomial distributions must also meet the following three
criteria:

1. The number of observations or trials is fixed. In other
words, you can only figure out the probability of something
happening if you do it a certain number of times. This is
common sense: if you toss a coin once, your probability of
getting a tails is 50%. If you toss a coin 20 times, your
probability of getting at least one tails is very, very close
to 100%.
2. Each observation or trial is independent. In other words,
none of your trials have an effect on the probability of the
next trial.
3. The probability of success (tails, heads, fail or pass)
is exactly the same from one trial to another.

A coin is tossed 10 times. What is the probability of getting
exactly 6 heads?
x = 6, n = 10, p = 0.5, q = 0.5
P(6) = [10! / ((10 − 6)! 6!)] x (0.5^6 x 0.5^4) = 210 x 0.5^10
= 105/512 = 0.2051
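
The coin example uses the binomial formula P(x) = C(n, x) p^x q^(n − x) with q = 1 − p. A short sketch follows; the function name binomial_pmf is an assumption, and math.comb requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exactly 6 heads in 10 tosses of a fair coin
p_six_heads = binomial_pmf(6, 10, 0.5)

# At least one tails in 20 tosses = 1 - P(no tails at all)
p_at_least_one_tail = 1 - binomial_pmf(0, 20, 0.5)

# Exactly three ones in 20 rolls of a fair die (n=20, p=1/6)
p_three_ones = binomial_pmf(3, 20, 1 / 6)

print(round(p_six_heads, 4))        # 0.2051
print(p_at_least_one_tail)
print(p_three_ones)
```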

2.5.2 Normal Distribution

Normal distribution, also known as the Gaussian distribution,

is a probability distribution that is symmetric about the mean,
showing that data near the mean are more frequent in occurrence
than data far from the mean. In graph form, a normal distribution
appears as a bell curve.

A random variable with a Gaussian distribution is said to

be normally distributed, and is called a normal deviate.

Normal distributions are important in statistics and are often
used in the natural and social sciences to represent real-valued
random variables whose distributions are not known. Their
importance is partly due to the central limit theorem. It states
that, under some conditions, the average of many samples
(observations) of a random variable with finite mean and variance
is itself a random variable—whose distribution converges to a
normal distribution as the number of samples increases. Therefore,
physical quantities that are expected to be the sum of many
independent processes, such as measurement errors, often have
distributions that are nearly normal.
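
The central limit theorem can be illustrated with a small simulation. The sketch below is an assumption-laden example, not part of the notes: it averages uniform random numbers on [0, 1), which are far from normal, and checks that the averages cluster around 0.5 with the spread the theorem predicts. The sample sizes and the seed are arbitrary choices.

```python
import random
import statistics
from math import sqrt

def sample_means(n_per_sample, n_samples, seed=0):
    """Return the means of `n_samples` samples, each of `n_per_sample` uniform draws."""
    rng = random.Random(seed)         # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.random() for _ in range(n_per_sample)])
        for _ in range(n_samples)
    ]

n = 50
means = sample_means(n_per_sample=n, n_samples=2000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# A uniform(0, 1) variable has variance 1/12, so the mean of n draws
# has standard deviation sqrt(1 / (12 n)).
theoretical_sd = sqrt(1 / 12 / n)

# For a normal distribution about 68% of values lie within one
# standard deviation of the mean.
share_within_one_sd = sum(abs(m - 0.5) <= theoretical_sd for m in means) / len(means)

print(mean_of_means, sd_of_means, theoretical_sd, share_within_one_sd)
```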


2.5.3 Exponential Distribution
In probability theory and statistics, the exponential distribution
is the probability distribution of the time between events in a
Poisson point process, i.e., a process in which events occur
continuously and independently at a constant average rate.

The exponential distribution is often concerned with the amount of

time until some specific event occurs. For example, the amount of
time (beginning now) until an earthquake occurs has an exponential
distribution. Other examples include the length of time, in
minutes, of long distance business telephone calls, and the amount
of time, in months, a car battery lasts. It can be shown, too,
that the value of the change that you have in your pocket or purse
approximately follows an exponential distribution.


Values for an exponential random variable occur in the following
way. There are fewer large values and more small values. For
example, marketing studies have shown that the amount of money
customers spend in one trip to the supermarket follows an
exponential distribution. There are more people who spend small
amounts of money and fewer people who spend large amounts of
money.

Exponential distributions are commonly used in calculations of
product reliability, or the length of time a product lasts.

The probability density function is

f(x) = (1/μ) e^(−x/μ), for x ≥ 0,

where μ is the historical average waiting time. The distribution
has a mean and standard deviation both equal to μ.

An alternative form of the exponential distribution formula

recognizes what is often called the decay factor. The decay factor
simply measures how rapidly the probability of an event declines
as the random variable X increases. When the notation using the
decay parameter m is used, the probability density function is
presented as:

f(x) = m e^(−mx)
where m = 1/μ

In order to calculate probabilities for specific probability
density functions, the cumulative distribution function is used.
The cumulative distribution function (cdf) is simply the integral
of the pdf and is:

F(x) = P(X ≤ x) = 1 − e^(−mx), for x ≥ 0

Let X = amount of time (in minutes) a postal clerk spends with
his or her customer. The time is known to have an exponential
distribution with the average amount of time equal to four
minutes.
X is a continuous random variable since time is measured. It is
given that μ = 4 minutes. To do any calculations, you must
know m, the decay parameter.
m = 1/μ. Therefore, m = 1/4 = 0.25.
The standard deviation, σ, is the same as the mean: μ = σ = 4.
The distribution notation is X ~ Exp(m).
Therefore, X ~ Exp(0.25).

The probability density function is f(x) = m e^(−mx). The number
e = 2.71828182846… It is a number that is used often in
mathematics. Scientific calculators have the key “e^x.” If you
enter one for x, the calculator will display the value e.
The curve is:

f(x) = 0.25 e^(−0.25x), where x is at least zero and m = 0.25.

For example, f(5) = 0.25 e^(−(0.25)(5)) = 0.072. This is the value
of the density at x = 5, that is, when the postal clerk spends five
minutes with a customer.

[Figure: graph of f(x) = 0.25 e^(−0.25x) for x ≥ 0]


Notice the graph is a declining curve. When x = 0,
f(x) = 0.25 e^((−0.25)(0)) = (0.25)(1) = 0.25 = m. The maximum
value on the y-axis is m.
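
The postal clerk calculations can be reproduced numerically. This is a minimal sketch: the function names exp_pdf and exp_cdf are assumptions, and the cdf used is the standard result F(x) = 1 − e^(−mx) for x ≥ 0.

```python
from math import exp

def exp_pdf(x, m):
    """Density f(x) = m e^(-m x) for x >= 0, and 0 otherwise."""
    return m * exp(-m * x) if x >= 0 else 0.0

def exp_cdf(x, m):
    """Cumulative probability P(X <= x) = 1 - e^(-m x) for x >= 0."""
    return 1 - exp(-m * x) if x >= 0 else 0.0

mu = 4              # average time with a customer, in minutes
m = 1 / mu          # decay parameter, 0.25

density_at_5 = exp_pdf(5, m)            # height of the curve at x = 5
density_at_0 = exp_pdf(0, m)            # maximum of the curve, equal to m
p_under_4_minutes = exp_cdf(4, m)       # P(X <= 4) = 1 - e^(-1)

print(round(density_at_5, 3))           # 0.072
print(density_at_0)                     # 0.25
print(round(p_under_4_minutes, 4))      # 0.6321
```

Note that fewer than half of the customers take longer than the mean time, since the distribution is skewed toward small values.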
2.5.4 Poisson Distribution
In statistics, a Poisson distribution is a probability
distribution that is used to show how many times an event is likely
to occur over a specified period. In other words, it is a count
distribution. Poisson distributions are often used to understand
independent events that occur at a constant rate within a given
interval of time.

The Poisson distribution is a discrete function, meaning that the

variable can only take specific values in a (potentially infinite)
list. Put differently, the variable cannot take all values in any
continuous range. For the Poisson distribution the variable can
only take the values 0, 1, 2, 3, etc., with no fractions or
decimals.

A Poisson distribution can be used to estimate how likely it is
that something will happen "X" number of times. For example, if
the average number of people who buy Coke from a fast-food chain
on a Friday night at a single restaurant location is 200, a
Poisson distribution can answer questions such as:

"What is the probability that more than 300 people will buy
Coke?" The application of the Poisson distribution thereby
enables managers to introduce optimal scheduling systems that
would not work with, say, a normal distribution.

The Poisson Distribution formula is:

P(x; μ) = (e^(−μ)) (μ^x) / x!
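
A hedged sketch of the formula in code, where μ is the average number of events per interval and x is the observed count. The function names are assumptions. The probability is computed in log space because μ^x and x! overflow ordinary floats for values such as μ = 200 and x = 300.

```python
from math import exp, lgamma, log

def poisson_pmf(x, mu):
    """P(X = x) = e^(-mu) mu^x / x!, evaluated via logarithms."""
    # lgamma(x + 1) equals log(x!)
    return exp(-mu + x * log(mu) - lgamma(x + 1))

def poisson_tail(x, mu):
    """P(X > x), computed as 1 minus the sum of P(X = k) for k = 0..x."""
    return 1.0 - sum(poisson_pmf(k, mu) for k in range(x + 1))

# Small check against values that are easy to compute by hand (mu = 2)
p_zero = poisson_pmf(0, 2)          # e^(-2)
p_more_than_two = poisson_tail(2, 2)    # 1 - 5 e^(-2)

# The fast-food question: average of 200 buyers, more than 300 buyers
p_more_than_300 = poisson_tail(300, 200)

print(p_zero, p_more_than_two, p_more_than_300)
```

For the fast-food case the result is extremely small, so in practice the subtraction from 1 limits its precision to a few significant digits.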

2.5.5 Bayes Theorem

Bayes' theorem is a mathematical formula for determining
conditional probability.

Conditional probability is the likelihood of an outcome occurring,
based on a previous outcome occurring. Bayes' theorem provides a
way to revise existing predictions or theories (update
probabilities) given new or additional evidence. In finance,
Bayes' theorem can be used to rate the risk of lending money to
potential borrowers. Bayes' theorem is also called Bayes' Rule or
Bayes' Law and is the foundation of the field of Bayesian
statistics.

Formula for Bayes' Theorem

Let A and B be events. Then

P(A|B) = P(B|A) x P(A) / P(B)

where:

P(A|B) – the probability of event A occurring, given event
B has occurred

P(B|A) – the probability of event B occurring, given event
A has occurred

P(A) – the probability of event A

P(B) – the probability of event B

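Note that P(A) and P(B) are the unconditional probabilities of
events A and B, i.e., the probability of each event without regard
to the other. The theorem is useful precisely when A and B are
dependent; if they were independent, P(A|B) would simply equal P(A).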


Imagine you are a financial analyst at an investment bank.
According to your research of publicly-traded companies, 60% of
the companies that increased their share price by more than 5% in
the last three years replaced their CEOs during the period.

At the same time, only 35% of the companies that did not increase
their share price by more than 5% in the same period replaced their
CEOs. Knowing that the probability that a company's stock price
grows by more than 5% is 4%, find the probability that the shares
of a company that replaces its CEO will increase by more than 5%.



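P(A) = the probability that the stock price increases by more
than 5% = 0.04

P(B) = the probability that the CEO is replaced

P(A|B) = the probability that the stock price increases by more
than 5% given that the CEO has been replaced

P(B|A) = the probability of the CEO replacement given the
stock price has increased by more than 5% = 0.60

P(B|not A) = the probability of the CEO replacement given the
stock price has not increased by more than 5% = 0.35

Using Bayes’ theorem, we can find the required probability:

P(A|B) = P(B|A) x P(A) / [P(B|A) x P(A) + P(B|not A) x P(not A)]
= (0.60 x 0.04) / (0.60 x 0.04 + 0.35 x 0.96)
= 0.024 / 0.36
≈ 0.0667

Thus, the probability that the shares of a company that replaces
its CEO will grow by more than 5% is 6.67%.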
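
The same calculation can be sketched in code. The function name bayes_posterior and its parameter names are assumptions; P(B) is expanded with the law of total probability over A and not A.

```python
def bayes_posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from P(B|A), P(A) and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# A = share price rises by more than 5%, B = CEO is replaced
posterior = bayes_posterior(p_b_given_a=0.60, p_a=0.04, p_b_given_not_a=0.35)

print(f"{posterior:.2%}")    # 6.67%
```

Replacing the CEO raises the estimated probability of a rise above 5% from the prior of 4% to about 6.67%, which shows that the two events are dependent.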



