Probability and Stochastic Processes: Reza Pulungan

Probability and Stochastic
Processes
Reza Pulungan
Department of Computer Science and Electronics

Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Yogyakarta
September 22, 2020

Independence
Definition
Suppose we toss two identical coins in different places in a
room.
Intuitively, we expect that the two coins do not influence
one another.
The mathematical concept that expresses this lack of influence
is independence.
Definition
Events A and B are independent if Pr(B) = 0 or:
Pr(A | B) = Pr(A).
In other words, A and B are independent if knowing that B
happens does not change the probability that event A happens.
Reza Pulungan Probability and Stochastic Processes 3
Be Careful!
Students usually think that disjoint events must be
independent.
The opposite is true!
If A ∩ B = ∅, then knowing that A happens means we know
that B cannot happen.
Disjoint events are never independent; unless one of them
occurs with probability 0.
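A short derivation of this claim, using only the definition above together with the usual formula Pr(A | B) = Pr(A ∩ B) / Pr(B) for Pr(B) > 0 (assumed known from the earlier material on conditional probability):

```latex
\[
A \cap B = \emptyset,\ \Pr(B) > 0
\;\Longrightarrow\;
\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{0}{\Pr(B)} = 0 .
\]
\[
\text{Independence requires } \Pr(A \mid B) = \Pr(A), \text{ hence } \Pr(A) = 0 .
\]
```

So two disjoint events can only be independent when at least one of them has probability 0.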

Alternative Formulation
Theorem
Events A and B are independent if and only if:
Pr(A ∩ B) = Pr(A) · Pr(B).
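The product form is convenient to check mechanically on a small sample space. The following is a minimal Python sketch, assuming two fair coins with equally likely outcomes; the names `pr`, `independent`, `first_head`, `second_head` and `both_tail` are illustrative and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins; every outcome is assumed equally likely.
OUTCOMES = list(product("HT", repeat=2))

def pr(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(OUTCOMES))

def independent(a, b):
    """Product test: Pr(A ∩ B) == Pr(A) · Pr(B)."""
    return pr(a & b) == pr(a) * pr(b)

first_head = {o for o in OUTCOMES if o[0] == "H"}
second_head = {o for o in OUTCOMES if o[1] == "H"}
both_tail = {("T", "T")}  # disjoint from first_head

print(independent(first_head, second_head))  # True
print(independent(first_head, both_tail))    # False: disjoint, both with Pr > 0
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.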

Independence as an Assumption
Generally, independence is something we assume in

modelling a phenomenon.
For instance: let A be the event where the result of tossing
the first coin is head, and B be the event where the result of
tossing the second one is head. If we assume that A and B
are independent, then:
\[ \Pr(A \cap B) = \Pr(A) \cdot \Pr(B) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}. \]

Independence as an Assumption
Of course, there are many cases where assuming
independence has no justification.
For instance: let C be the event where tomorrow is cloudy,
and R be the event where tomorrow is raining. Let
Pr(C) = 1/5 and Pr(R) = 1/10. If both events are
independent, then the probability that tomorrow is cloudy
and raining is:
\[ \Pr(C \cap R) = \Pr(C) \cdot \Pr(R) = \frac{1}{5} \cdot \frac{1}{10} = \frac{1}{50}. \]
This is of course wrong. The events are not independent.

Typically, every rainy day is also a cloudy day. In that case, the
probability that tomorrow is cloudy and raining is Pr(R) = 1/10.
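A worked check of why independence fails here, under the simplifying assumption stated above that every rainy day is cloudy (written R ⊆ C; this set notation is added for the sketch):

```latex
\[
R \subseteq C
\;\Longrightarrow\; C \cap R = R
\;\Longrightarrow\; \Pr(C \cap R) = \Pr(R) = \frac{1}{10},
\]
\[
\Pr(C \mid R) = \frac{\Pr(C \cap R)}{\Pr(R)} = 1 \neq \frac{1}{5} = \Pr(C).
\]
```

Knowing that it rains changes the probability of clouds from 1/5 to 1, so the definition of independence is violated.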

Mutual Independence
We have defined independence for 2 events. What if there
are n events?
Events E1 , · · · , En are called mutually independent if and
only if the probability of any event Ei is not influenced by
other events.
Definition
A set of events {E1 , · · · , En } is mutually independent if
∀i ∈ {1, · · · , n} and ∀S ⊆ {1, · · · , n} − {i}:
\[ \Pr\Bigl(\bigcap_{j \in S} E_j\Bigr) = 0, \quad \text{or} \quad \Pr(E_i) = \Pr\Bigl(E_i \Bigm| \bigcap_{j \in S} E_j\Bigr). \]

In other words, regardless of which other events occur, the
probability of event Ei does not change.
Example: if we toss 100 coins at different times, it makes sense
to assume that those tosses are mutually independent,
because the probability that the i-th coin produces head is
equal to 1/2, regardless of the results of the other coins.
Alternative Formulation
Theorem (Mutual Independence)

A set of events {E1 , · · · , En } is mutually independent if and only
if ∀S ⊆ {1, · · · , n}:
\[ \Pr\Bigl(\bigcap_{j \in S} E_j\Bigr) = \prod_{j \in S} \Pr(E_j). \]
Example: if we toss n fair coins, those tosses are mutually
independent if and only if for all m ∈ {1, · · · , n} and any subset of
m coins, the probability that every coin in that subset
produces head is 2^(−m).
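The coin example can be verified by brute force for a small n. The sketch below assumes n = 4 fair coins with all 2^n outcomes equally likely (the value of n and the helper name `pr_all_heads` are illustrative choices) and checks the 2^(−m) condition for every non-empty subset of coins.

```python
from fractions import Fraction
from itertools import combinations, product

n = 4  # illustrative number of coins (an assumption for this sketch)
outcomes = list(product("HT", repeat=n))  # 2**n equally likely outcomes

def pr_all_heads(coins):
    """Probability that every coin whose index is in `coins` shows head."""
    favourable = [o for o in outcomes if all(o[i] == "H" for i in coins)]
    return Fraction(len(favourable), len(outcomes))

# Product rule for every non-empty subset of m coins: Pr = 2**(-m).
product_rule_holds = all(
    pr_all_heads(subset) == Fraction(1, 2**m)
    for m in range(1, n + 1)
    for subset in combinations(range(n), m)
)
print(product_rule_holds)  # True
```

The number of subsets checked is 2^n − 1, which is why this direct approach is only practical for small n.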

Be Careful in Making Assumptions
Suppose in a DNA testing someone is found to have 5 markers,
and that:
Marker A occurs once in 100 people,
Marker B occurs once in 50 people,
Marker C occurs once in 40 people,
Marker D occurs once in 5 people,
Marker E occurs once in 170 people.
If the occurrences of those markers are assumed to be mutually
independent, then the probability that someone has the five
markers simultaneously is:
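\[
\begin{aligned}
\Pr(A \cap B \cap C \cap D \cap E) &= \Pr(A) \cdot \Pr(B) \cdot \Pr(C) \cdot \Pr(D) \cdot \Pr(E) \\
&= \frac{1}{100} \cdot \frac{1}{50} \cdot \frac{1}{40} \cdot \frac{1}{5} \cdot \frac{1}{170} = \frac{1}{170{,}000{,}000}.
\end{aligned}
\]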

Pairwise Independence
The definition of mutual independence looks highly
complex.
The number of subsets we must inspect is large and grows
very fast.
The following example illustrates this: suppose we toss 3
fair and mutually independent coins and then define the
following three events:
1. A1: the event where the results of the first and second coins are
the same,
2. A2: the event where the results of the second and third coins
are the same, and
3. A3: the event where the results of the third and first coins are
the same.
The question is: are events A1 , A2 , A3 mutually
independent?

The sample space of the experiment is:
{HHH , HHT , HTH , HTT , THH , THT , TTH , TTT }.
Each outcome has probability (1/2)^3 = 1/8 based on the
assumption that the coins are mutually independent.
The probability of each event is:
\[ \Pr(A_1) = \Pr(\{HHH, HHT, TTH, TTT\}) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2}, \]
and Pr(A2) = Pr(A3) = 1/2.

To determine whether A1, A2, A3 are mutually independent,
we can use the Mutual Independence Theorem.
We have seen that Pr(A1) = Pr(A2) = Pr(A3) = 1/2.
How about the intersection of two of the events?
\[ \Pr(A_1 \cap A_2) = \Pr(\{HHH, TTT\}) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = \Pr(A_1) \cdot \Pr(A_2). \]
We can also compute Pr(A1 ∩ A3) and Pr(A2 ∩ A3) in a
similar fashion.

However, we still have one more condition, namely:
\[ \Pr(A_1 \cap A_2 \cap A_3) = \Pr(\{HHH, TTT\}) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4} \neq \frac{1}{8} = \Pr(A_1) \cdot \Pr(A_2) \cdot \Pr(A_3). \]
Therefore, the three events are not mutually independent,
even though any two events among them are mutually
independent.
Independence for any two events is called pairwise
independence.

This can be generalized:

Definition (k -way Independence)
A set of events {A1 , A2 , · · · } is called k -way independent if and
only if any subset containing k elements is mutually
independent. The set is called pairwise independent if and only
if it is 2-way independent.
The previous example is not mutually independent, but pairwise
independent, because any 2 events are mutually independent.
Pairwise independence is a much weaker condition compared
to mutual independence.
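The k-way definition can be turned into a direct check on the three-coin example. This is a sketch under the example's assumptions (3 fair coins, 8 equally likely outcomes); the function names `product_rule_holds`, `mutually_independent` and `k_way_independent` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

OUTCOMES = list(product("HT", repeat=3))  # 8 equally likely outcomes

def pr(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(OUTCOMES))

def product_rule_holds(events):
    """Pr of the intersection equals the product of the probabilities."""
    joint = set(OUTCOMES).intersection(*events)
    return pr(joint) == prod(pr(e) for e in events)

def mutually_independent(events):
    """Product rule for every subset with at least two events."""
    return all(
        product_rule_holds(sub)
        for size in range(2, len(events) + 1)
        for sub in combinations(events, size)
    )

def k_way_independent(events, k):
    """Every k-element subset of the events is mutually independent."""
    return all(mutually_independent(sub) for sub in combinations(events, k))

a1 = {o for o in OUTCOMES if o[0] == o[1]}  # first and second coins agree
a2 = {o for o in OUTCOMES if o[1] == o[2]}  # second and third coins agree
a3 = {o for o in OUTCOMES if o[2] == o[0]}  # third and first coins agree
events = [a1, a2, a3]

print(k_way_independent(events, 2))  # True: pairwise independent
print(k_way_independent(events, 3))  # False: not mutually independent
```

Subsets of size 0 and 1 are skipped because the product rule holds for them trivially.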

Be Careful in Making Assumptions
Suppose the previous markers are only pairwise
independent. Then:
\[ \Pr(A \cap B \cap C \cap D \cap E) \leq \Pr(A \cap E) = \Pr(A) \cdot \Pr(E) = \frac{1}{100} \cdot \frac{1}{170} = \frac{1}{17{,}000}. \]
What if there is no independence at all? For instance:

Anyone with marker E also has marker A,
Anyone with marker A also has marker B,
Anyone with marker B also has marker C, and
Anyone with marker C also has marker D.
In this case, the probability that someone has the five markers
simultaneously is as large as:
\[ \Pr(E) = \frac{1}{170}. \]
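The three scenarios for the DNA markers can be compared side by side. The sketch below uses the marker frequencies given in the example; the variable names are illustrative, and the pairwise figure is only an upper bound, not an exact probability.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

# Marker frequencies from the example.
freq = {
    "A": Fraction(1, 100),
    "B": Fraction(1, 50),
    "C": Fraction(1, 40),
    "D": Fraction(1, 5),
    "E": Fraction(1, 170),
}

# Mutual independence: multiply all five frequencies.
mutual = prod(freq.values())

# Pairwise independence only: the joint probability is at most the
# probability of any pair, so the tightest bound uses the two rarest markers.
pairwise_bound = min(p * q for p, q in combinations(freq.values(), 2))

# Nested markers (E implies A implies B implies C implies D):
# the intersection is just the rarest event.
nested = min(freq.values())

print(mutual)          # 1/170000000
print(pairwise_bound)  # 1/17000
print(nested)          # 1/170
```

The answers differ by six orders of magnitude, which is the point of the warning: the conclusion depends heavily on which independence assumption is justified.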

Birthday Paradox
Assume that there are 100 students in a probability course.

What is the probability that at least 2 of them have the
same birthday?
Because there are 365 days in a year and there are only
100 students, you might think that the probability that any 2
of them have the same birthday is around 1/3.
This is a mistake! The probability is more than
0.999999692.
Why is the probability so large?
The answer is because of a phenomenon called the
birthday paradox.

Birthday Paradox
Before we analyze this problem, we first make some
assumptions:
For each student, any birthday has the same probability.
This means, someone’s birthday is determined by a
random process.
Birthdays are mutually independent.
To generalize, instead of talking about 100 students and 365 days,
suppose there are m students and N days in a year.
We solve this problem by using the four-step method.
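Before the exact analysis, the two assumptions (uniform birthdays, mutual independence) are already enough to estimate the answer by simulation. The following is a minimal Monte Carlo sketch; the function name, the number of trials and the fixed seed are illustrative choices and not part of the lecture.

```python
import random

def estimate_shared_birthday(m, n_days=365, trials=10_000, seed=0):
    """Estimate Pr(at least two of m students share a birthday).

    Assumes uniform, mutually independent birthdays over n_days days.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(n_days) for _ in range(m)]
        if len(set(birthdays)) < m:  # some birthday occurs twice
            hits += 1
    return hits / trials

print(estimate_shared_birthday(100))
print(estimate_shared_birthday(23))
```

The estimates are approximate and vary with the seed and the number of trials; the four-step method gives the exact value.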

Birthday Paradox
Step 1: Determining the sample space
In this case, the outcome of an experiment is a sequence
(b1 , b2 , · · · , bm ) where bi is the birthday of student i.
Therefore, the sample space is:
S = {(b1 , b2 , · · · , bm ) | bi ∈ {1, · · · , N }}.

Birthday Paradox
Step 2: Defining the events of interest
Our goal is to define event A, where some students have
the same birthday.
However, defining the event in this way is quite complex.
We can use a general and important technique: we define
event Ā, namely the complement of A.
Event Ā represents that all students have different (distinct)
birthdays:
Ā = {(b1 , b2 , · · · , bm ) | all bi are distinct}.
If we can calculate Pr(Ā), then Pr(A) can also be obtained
easily, because Pr(A) + Pr(Ā) = 1.

Birthday Paradox
Step 3: Assigning the outcome probabilities
We wish to determine the probability that the m students
have birthdays with combination (b1 , b2 , · · · , bm ).
There are N possible birthdays and each (based on our
assumption) has the same probability for each student.
Therefore, the probability that student i was born on day bi
is 1/N.
Because we assume that birthdays are mutually
independent, we can simply multiply those
probabilities. Hence, the probability of the outcome
(b1 , b2 , · · · , bm ) is (1/N)^m.
Therefore each outcome in the sample space has the
same probability (uniform), namely (1/N)^m.

Birthday Paradox
Step 4: Calculating the probability of the event
We would like to calculate the probability of the event:

Ā = {(b1 , b2 , · · · , bm ) | all bi are distinct}.
This set is very large. There are N ways to select b1, N − 1
ways to select b2, and so on. Then, based on the
generalized product rule (for m ≤ N):
\[ |\bar{A}| = \frac{N!}{(N - m)!} = N \cdot (N - 1) \cdot (N - 2) \cdots (N - m + 1). \]
Since the sample space is uniform, we can conclude:
\[ \Pr(\bar{A}) = \frac{|\bar{A}|}{N^m} = \frac{N!}{N^m \, (N - m)!}. \]
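The counting argument can be cross-checked numerically. The sketch below evaluates the factorial formula exactly and compares it with a direct enumeration of the sample space for a small case; the function names and the small test values (m = 3, N = 5) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def pr_all_distinct(m, n_days):
    """Exact Pr(all m birthdays distinct) = N! / (N^m (N - m)!)."""
    if m > n_days:
        return Fraction(0)  # more students than days: a repeat is certain
    return Fraction(factorial(n_days), n_days**m * factorial(n_days - m))

def pr_all_distinct_brute(m, n_days):
    """Count distinct-birthday outcomes directly (small cases only)."""
    outcomes = list(product(range(n_days), repeat=m))
    distinct = sum(1 for o in outcomes if len(set(o)) == m)
    return Fraction(distinct, len(outcomes))

print(pr_all_distinct(3, 5), pr_all_distinct_brute(3, 5))  # 12/25 12/25
print(float(1 - pr_all_distinct(100, 365)))  # Pr(A) for the 100 students
```

Python integers have arbitrary precision, so the factorials for N = 365 are computed exactly before the final conversion to a float.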

Birthday Paradox
However, we would like to have a closed-form solution,
without factorials. We can use the factorial approximation
method (Stirling) that we have studied in Discrete Maths.
We approximate N! and (N − m)! by:
\[ \sqrt{2\pi N} \left(\frac{N}{e}\right)^{N} \quad \text{and} \quad \sqrt{2\pi (N - m)} \left(\frac{N - m}{e}\right)^{N - m}. \]

Birthday Paradox
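Hence, we can compute the approximation:
\[
\begin{aligned}
\Pr(\bar{A}) &\approx \frac{\sqrt{2\pi N} \left(\frac{N}{e}\right)^{N}}{N^m \sqrt{2\pi (N - m)} \left(\frac{N - m}{e}\right)^{N - m}} \\
&= e^{\left(N - m + \frac{1}{2}\right) \ln\left(\frac{N}{N - m}\right) - m}.
\end{aligned}
\]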
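To see how good the approximation is, the closed form can be evaluated next to the exact product N(N − 1)···(N − m + 1)/N^m. This is a sketch in floating point; the function names are illustrative, and the closed form is only defined for m < N.

```python
from math import exp, log

def pr_all_distinct_stirling(m, n_days):
    """Closed form exp((N - m + 1/2) ln(N / (N - m)) - m), valid for m < N."""
    n = n_days
    return exp((n - m + 0.5) * log(n / (n - m)) - m)

def pr_all_distinct_exact(m, n_days):
    """Exact value as the running product of (N - i) / N for i = 0..m-1."""
    p = 1.0
    for i in range(m):
        p *= (n_days - i) / n_days
    return p

for m in (23, 100):
    print(m, pr_all_distinct_stirling(m, 365), pr_all_distinct_exact(m, 365))
```

For these values the two results agree to several significant digits, so the closed form is adequate for the conclusions drawn from it.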

Birthday Paradox
Setting m = 100 and N = 365, the probability that
the 100 students have distinct birthdays (Pr(Ā)) is around
3.07 · 10^(−7).
By setting m = 23 and N = 365, the probability that 23
students have distinct birthdays (Pr(Ā)) is around 0.49.
Hence, with only 23 students, the probability that at least
two students have the same birthday is more than a half.
This is why this phenomenon is called the birthday paradox.
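The threshold of 23 can be recovered by searching for the smallest class size whose probability of a shared birthday exceeds one half. A minimal sketch, assuming uniform and mutually independent birthdays; the function name and the `threshold` parameter are illustrative.

```python
def smallest_class_size(n_days=365, threshold=0.5):
    """Smallest m with Pr(at least two of m students share a birthday) > threshold."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must satisfy 0 <= threshold < 1")
    p_distinct = 1.0  # Pr(all distinct) for m = 0 students
    m = 0
    while 1.0 - p_distinct <= threshold:
        # Adding student m + 1: they must avoid the m birthdays already taken.
        p_distinct *= (n_days - m) / n_days
        m += 1
    return m

print(smallest_class_size())  # 23
```

The loop always terminates because the probability of distinct birthdays reaches 0 once there are more students than days.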
