Probability & Statistics

Anh Tuan Tran (Ph.D.) & Thinh Tien Nguyen (Ph.D.)
First principle (Rule of product):

• Suppose an experiment has m outcomes, and another experiment has n outcomes.
• Then the two experiments jointly have $m \cdot n$ outcomes.


Rolling a die and flipping a coin together have a total of $6 \cdot 2 = 12$ different outcomes.
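
The rule of product for the die-and-coin example can be checked by listing every joint outcome. This is a minimal sketch; the names `die`, `coin`, and `joint_outcomes` are illustrative and do not come from the slides.

```python
from itertools import product

die = range(1, 7)    # six faces of the die
coin = ["H", "T"]    # two sides of the coin

# Every joint outcome is one (die face, coin side) pair.
joint_outcomes = list(product(die, coin))
print(len(joint_outcomes))  # 12
```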

• Let $H := \{h_1, h_2, \dots, h_n\}$ be a set of n different objects.
• The permutations of H are the different orders in which one can write all of the elements of H.
• There are $n! := 1 \cdot 2 \cdots n$ of them.
• We set $0! := 1$.

• The results of a horse race with horses $H := \{A, B, C, D, E, F, G\}$ are permutations of H.
• A possible outcome is (E, G, A, C, B, D, F) (E is the winner, G is second, etc.).
• There are $7! = 5040$ possible outcomes.

Permutations with repetitions

• Let $H := \{h_1, \dots, h_1, \dots, h_r, \dots, h_r\}$ be a collection of r different types of repeated objects: $n_1$ copies of $h_1$, …, $n_r$ copies of $h_r$.
• The permutations with repetitions of H are the different orders in which one can write all of the elements of H.
• There are
\[ \binom{n}{n_1\, n_2\, \dots\, n_r} := \frac{n!}{n_1!\, n_2! \cdots n_r!} \]
of them, where $n = n_1 + \dots + n_r$ is the total number of objects. This formula is also known as the multinomial coefficient.

We can make
\[ \binom{11}{5\, 2\, 2\, 1\, 1} = \frac{11!}{5!\, 2!\, 2!\, 1!\, 1!} = 83160 \]
different words out of the letters A, B, R, A, C, A, D, A, B, R, A.
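
The count of words from these letters can be reproduced directly from the letter frequencies. The helper `multinomial` below is a hypothetical name introduced only for this sketch.

```python
from collections import Counter
from math import factorial

def multinomial(counts):
    """Number of distinct orderings of a multiset with the given type counts."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)   # each partial quotient is still an integer
    return total

letter_counts = Counter("ABRACADABRA")          # A:5, B:2, R:2, C:1, D:1
n_words = multinomial(list(letter_counts.values()))
print(n_words)  # 83160
```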

• Let $H := \{h_1, h_2, \dots, h_n\}$ be a set of n different objects.
• The k-permutations of H are the different ways in which one can pick and write k of the elements of H in order.
• There are
\[ \frac{n!}{(n-k)!} \]
of these k-permutations.

• The first three places of a horse race with horses $H := \{A, B, C, D, E, F, G\}$ form a 3-permutation of H.
• A possible outcome is (E, G, A) (E is the winner, G is second, A is third).
• There are
\[ \frac{7!}{(7-3)!} = 210 \]
possible outcomes for the first three places.
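
As a quick check of the podium count, one can compare the formula with a brute-force listing of ordered triples. This sketch assumes Python 3.8 or later for `math.perm`; the variable names are illustrative.

```python
from itertools import permutations
from math import factorial, perm

horses = "ABCDEFG"

by_formula = factorial(7) // factorial(7 - 3)   # n! / (n - k)!
podiums = list(permutations(horses, 3))         # all ordered triples of distinct horses

print(by_formula, perm(7, 3), len(podiums))  # 210 210 210
```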
k-permutations with repetitions

• Let $H := \{h_1, \dots, h_2, \dots, h_r, \dots\}$ be a collection of r different types of repeated objects, each in infinite supply.
• The k-permutations with repetitions of H are the different ordered sequences of length k that one can write using the elements of H.
• There are $r^k$ such sequences.

There are $26^3 = 17576$ possible 3-letter words (k = 3) using the r = 26 letters of the English alphabet.
• Let $H := \{h_1, h_2, \dots, h_n\}$ be a set of n different objects.
• The k-combinations of H are the different ways in which one can pick k of the elements of H without regard to order.
• There are
\[ \binom{n}{k} := \frac{n!}{k!\,(n-k)!} \]
of these k-combinations.
• This formula is also known as the binomial coefficient (“n choose k”).


There are
\[ \binom{30}{5} = \frac{30!}{5!\,(30-5)!} = 142506 \]
possible ways to form a committee of 5 students out of a class of 30 students.
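
The committee count can be verified with the standard library. A minimal sketch, assuming Python 3.8 or later for `math.comb`:

```python
from math import comb, factorial

committees = comb(30, 5)                                       # "30 choose 5"
by_formula = factorial(30) // (factorial(5) * factorial(25))   # n! / (k! (n - k)!)
print(committees, by_formula)  # 142506 142506
```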

In a similar way, there are
\[ \binom{n}{k_1\, k_2\, \dots\, k_r} := \frac{n!}{k_1!\, k_2! \cdots k_r!} \]
ways to form unordered groups of sizes $k_1, k_2, \dots, k_r$ out of $n = k_1 + k_2 + \dots + k_r$ objects.

Thus, the multinomial coefficient generalizes the binomial coefficient.
The Binomial coefficient

• Recall the definition, for non-negative integers n and k,
\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \binom{n}{n-k} \]
for all $0 \le k \le n$.
• We extend this by $\binom{n}{k} := 0$ in all other cases. (It is possible to define these coefficients for any real n, but we won’t need that.)
Theorem (Pascal’s Identity):

For any integers $n \ge 1$ and $0 \le k \le n$,
\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}. \]
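
Pascal's Identity can be spot-checked numerically for small n. The wrapper `binom` is a hypothetical helper that returns 0 outside the range $0 \le k \le n$, which is the convention the identity needs at k = 0 and k = n; `math.comb` alone raises an error for negative k.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

# Check the identity for every 1 <= n <= 20 and 0 <= k <= n.
pascal_holds = all(
    binom(n, k) == binom(n - 1, k - 1) + binom(n - 1, k)
    for n in range(1, 21)
    for k in range(0, n + 1)
)
print(pascal_holds)  # True
```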
The Binomial Theorem

Theorem (Newton’s Binomial Theorem):

For any real numbers x, y and integer $n \ge 1$,
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \]
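The Multinomial Theorem

For real numbers $x_1, x_2, \dots, x_r$ and integer $n \ge 1$,
\[ (x_1 + x_2 + \dots + x_r)^n = \sum_{\substack{k_1 + k_2 + \dots + k_r = n \\ 0 \le k_1, k_2, \dots, k_r \le n}} \binom{n}{k_1\, k_2\, \dots\, k_r}\, x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r}. \]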
2. Elementary probability
Sample space

• Experiment: Is it going to rain today?
• Sample space: $\Omega = \{\text{Yes}, \text{No}\}$.
• An event: $E = \{\text{Yes}\}$ (it rains).
• Cardinality: $|\Omega| = 2$, $|E| = 1$.

• Experiment: Finishing order of a race of 7 horses, $H := \{A, B, C, D, E, F, G\}$.
• Sample space: $\Omega = \{\text{permutations of } H\}$.
• An event: $E = \{B \text{ wins}\}$ (permutations starting with B).
• Another event: $F = \{G \text{ wins, } D \text{ is second}\}$ (permutations starting with G, D).
• Cardinality: $|\Omega| = 7!$, $|E| = 6!$, $|F| = 5!$.

• The union $E \cup F$ of E and F is the event “E or F”; it occurs if E or F occurs.
• The intersection $E \cap F$ of E and F is the event “E and F”; it occurs if both E and F occur.
• The union $\bigcup_i E_i$ is the event that at least one of the $E_i$ occurs.
• The intersection $\bigcap_i E_i$ is the event that each of the $E_i$ occurs.

• Experiment: Flipping 2 coins.
• Sample space: $\Omega = \{\text{ordered pairs of the 2 coins}\} = \{HH, HT, TH, TT\}$.
• An event: $E = \{\text{the 2 coins come up different}\} = \{HT, TH\}$.
• Another event: $F = \{\text{both flips come up heads}\} = \{HH\}$.
• Cardinality: $|\Omega| = 4$, $|E| = 2$, $|F| = 1$.
• $E \cup F = \{HT, TH, HH\} = \{\text{at least 1 H}\}$, so $|E \cup F| = 3$.

• Experiment: Rolling 2 dice.
• Sample space: $\Omega = \{\text{ordered pairs of the 2 outcomes}\} = \{(i, j) : i, j = 1, 2, \dots, 6\}$.
• An event: $E = \{\text{the sum of the rolls is 4}\} = \{(1,3), (2,2), (3,1)\}$.
• Another event: $F = \{\text{the 2 rolls are the same}\} = \{(i, i) : i = 1, 2, \dots, 6\}$.
• Cardinality: $|\Omega| = 36$, $|E| = 3$, $|F| = 6$.
• $E \cap F = \{(2,2)\}$, so $|E \cap F| = 1$.
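
Events in the two-dice experiment are plain sets of ordered pairs, so they can be built and intersected directly. A minimal sketch; the names `omega`, `E`, and `F` mirror the slide, everything else is illustrative.

```python
from itertools import product

omega = set(product(range(1, 7), repeat=2))      # all ordered pairs (i, j)
E = {(i, j) for (i, j) in omega if i + j == 4}   # the sum of the rolls is 4
F = {(i, j) for (i, j) in omega if i == j}       # the 2 rolls are the same

print(len(omega), len(E), len(F))  # 36 3 6
print(E & F)                       # {(2, 2)}
```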

• If $E \cap F = \emptyset$, then E and F are mutually exclusive events.
• If $E_1, E_2, \dots$ satisfy $E_i \cap E_j = \emptyset$ for $i \ne j$, then $E_1, E_2, \dots$ are “mutually” exclusive events.

Mutually exclusive events cannot happen at the same time.

Simple properties of events

If $E \subseteq F$, the occurrence of E implies that of F.

The experiment of rolling a die.
\[ E = \{\text{rolling 1 on the die}\} \subseteq \{\text{rolling a number on the die}\} = F. \]

The complement of an event E is $E^C := \Omega - E$. This is the event that E does not occur.

$E$ and $E^C$ are mutually exclusive.


The experiment of rolling a die.
\[ \Omega = \{\text{rolling 1, 2, 3, 4, 5, or 6 on the die}\}, \]
\[ E = \{\text{rolling 1 or 5 on the die}\}, \]
\[ E^C = \Omega - E = \{\text{rolling 2, 3, 4, or 6 on the die}\}. \]

• Commutativity:
1. $E \cup F = F \cup E$.
2. $E \cap F = F \cap E$.

• Associativity:
1. $(E \cup F) \cup G = E \cup (F \cup G)$.
2. $(E \cap F) \cap G = E \cap (F \cap G)$.

• Distributivity:
1. $(E \cup F) \cap G = (E \cap G) \cup (F \cap G)$.
2. $(E \cap F) \cup G = (E \cup G) \cap (F \cup G)$.

• De Morgan’s Law:
1. $(E \cup F)^C = E^C \cap F^C$.
2. $(E \cap F)^C = E^C \cup F^C$.

• De Morgan’s Law (generalization):
1. $\big(\bigcup_i E_i\big)^C = \bigcap_i E_i^C$.
2. $\big(\bigcap_i E_i\big)^C = \bigcup_i E_i^C$.


The probability P on a sample space Ω assigns numbers to events of Ω in such a way that:

1. $P(\emptyset) = 0 \le P(E) \le 1 = P(\Omega)$ for any event E.
2. For countably many mutually exclusive events $E_1, E_2, \dots$,
\[ P\Big(\bigcup_i E_i\Big) = \sum_i P(E_i). \]
Equally likely outcomes

We are interested in a probability P on a sample space Ω such that $P(\{\omega\})$ is the same for all ω ∈ Ω. We then also call each ω an elementary event.

The experiment of tossing a “fair” coin.
\[ \Omega = \{\text{Head}, \text{Tail}\}, \qquad P(\{\omega = \text{Head}\}) = P(\{\omega = \text{Tail}\}) = \frac{1}{2}. \]

For a finite sample space Ω, we define
\[ P(E) := \frac{|E|}{|\Omega|}, \]
where $|\cdot|$ is the counting measure. Then P is a probability on Ω.

The experiment of rolling a die.
\[ \Omega = \{\text{rolling 1, 2, 3, 4, 5, or 6 on the die}\}, \qquad E = \{\text{rolling 1 or 5 on the die}\}. \]
The probability of E:
\[ P(E) = \frac{|E|}{|\Omega|} = \frac{|\{1, 5\}|}{|\{1, 2, 3, 4, 5, 6\}|} = \frac{2}{6} = \frac{1}{3}. \]

• Experiment: Finishing order of a race of 7 horses, $H := \{A, B, C, D, E, F, G\}$.
• Sample space: $\Omega = \{\text{permutations of } H\}$.
• An event: $E = \{B \text{ wins}\}$ (permutations starting with B).
• Cardinality: $|\Omega| = 7!$, $|E| = 6!$.
• Probability of E:
\[ P(E) = \frac{|E|}{|\Omega|} = \frac{6!}{7!} = \frac{1}{7}. \]
Simple properties of probability

1. $P(E^C) = 1 - P(E)$.
2. $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.
3. For $E \subseteq F$, $P(F - E) = P(F) - P(E)$. Moreover, $P(E) \le P(F)$.

Inclusion-exclusion principle:
\[ P(E_1 \cup \dots \cup E_n) = \sum_{1 \le i \le n} P(E_i) - \sum_{1 \le i_1 < i_2 \le n} P(E_{i_1} \cap E_{i_2}) + \sum_{1 \le i_1 < i_2 < i_3 \le n} P(E_{i_1} \cap E_{i_2} \cap E_{i_3}) - \dots + (-1)^{n-1} P(E_1 \cap \dots \cap E_n). \]

In a class of 50 students, there are 25 students good at English, 20 students good at Mathematics, and 10 students good at both subjects. Pick a student at random. What is the probability that the student is good at Mathematics or English?
\[ E = \{\text{the student is good at Mathematics}\}, \qquad F = \{\text{the student is good at English}\}. \]
\[ P(E \cup F) = P(E) + P(F) - P(E \cap F) = \frac{20}{50} + \frac{25}{50} - \frac{10}{50} = 0.7. \]
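
The classroom computation is a two-event case of inclusion-exclusion and can be reproduced with exact fractions. A minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

class_size = 50
p_math = Fraction(20, class_size)      # P(E)
p_english = Fraction(25, class_size)   # P(F)
p_both = Fraction(10, class_size)      # P(E and F)

p_either = p_math + p_english - p_both
print(p_either, float(p_either))  # 7/10 0.7
```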
3. Conditional probability
Conditional probability

The experiment of rolling 2 dice.
\[ E = \{\text{the sum of the rolls is 7}\}. \]
• What is P(E)?
• If we know that we roll a 2 on the first die, what is the probability of E?

Partial information can change the probability of the event.


Let F be an event with $P(F) > 0$.

The conditional probability of an event E, given F, is defined as
\[ P(E \mid F) := \frac{P(E \cap F)}{P(F)} = \frac{|E \cap F|}{|F|}, \]
where the last equality holds for equally likely outcomes.

The experiment of rolling 2 dice.
\[ E = \{\text{the sum of the rolls is 7}\}, \qquad F = \{\text{the first die gives 2}\}. \]
\[ P(E \mid F) = \frac{|\{(2,5)\}|}{|\{(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)\}|} = \frac{1}{6}. \]

The experiment of rolling 2 dice.
\[ E = \{\text{the sum of the rolls is 7}\}, \qquad F = \{\text{the first die gives 2}\}. \]
\[ P(F \mid E) = \frac{|\{(2,5)\}|}{|\{(1,6), (6,1), (2,5), (5,2), (3,4), (4,3)\}|} = \frac{1}{6}. \]
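
Both conditional probabilities in the two-dice example can be obtained by counting, since all 36 outcomes are equally likely. The helper `cond_prob` is a hypothetical name used only in this sketch.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))
E = {w for w in omega if w[0] + w[1] == 7}   # the sum of the rolls is 7
F = {w for w in omega if w[0] == 2}          # the first die gives 2

def cond_prob(A, B):
    """P(A | B) = |A and B| / |B| under equally likely outcomes."""
    return Fraction(len(A & B), len(B))

print(cond_prob(E, F), cond_prob(F, E))  # 1/6 1/6
```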
Simple properties of conditional probability

Let F be an event with $P(F) > 0$. Then

1. $P(\emptyset \mid F) = 0 \le P(E \mid F) \le 1 = P(\Omega \mid F)$ for any event E.
2. For countably many mutually exclusive events $E_1, E_2, \dots$,
\[ P\Big(\bigcup_i E_i \,\Big|\, F\Big) = \sum_i P(E_i \mid F). \]

Let F be an event with $P(F) > 0$. Then

1. $P(E^C \mid F) = 1 - P(E \mid F)$.
2. $P(E \cup G \mid F) = P(E \mid F) + P(G \mid F) - P(E \cap G \mid F)$.
3. For $E \subseteq G$, $P(G - E \mid F) = P(G \mid F) - P(E \mid F)$. Moreover, $P(E \mid F) \le P(G \mid F)$.

Multiplication rule:

For events $E_1, E_2, \dots, E_n$,
\[ P(E_1 \cap E_2 \cap \dots \cap E_n) = P(E_1)\, P(E_2 \mid E_1)\, P(E_3 \mid E_1 \cap E_2) \cdots P(E_n \mid E_1 \cap E_2 \cap \dots \cap E_{n-1}). \]

An urn contains 6 red and 5 blue balls. We draw three balls at random, at once (that is, without replacement). What is the chance of drawing one red and two blue balls?

\[ R_i = \{\text{the } i\text{th ball is red}\}, \qquad B_i = \{\text{the } i\text{th ball is blue}\}, \qquad E = \{\text{1 red and 2 blue balls}\}. \]
\[ P(E) = P(R_1 \cap B_2 \cap B_3) + P(B_1 \cap R_2 \cap B_3) + P(B_1 \cap B_2 \cap R_3). \]
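
• $P(R_1 \cap B_2 \cap B_3) = P(R_1)\, P(B_2 \mid R_1)\, P(B_3 \mid R_1 \cap B_2) = \frac{6}{11} \cdot \frac{5}{10} \cdot \frac{4}{9} = \frac{4}{33}$.
• $P(B_1 \cap R_2 \cap B_3) = P(B_1)\, P(R_2 \mid B_1)\, P(B_3 \mid B_1 \cap R_2) = \frac{5}{11} \cdot \frac{6}{10} \cdot \frac{4}{9} = \frac{4}{33}$.
• $P(B_1 \cap B_2 \cap R_3) = P(B_1)\, P(B_2 \mid B_1)\, P(R_3 \mid B_1 \cap B_2) = \frac{5}{11} \cdot \frac{4}{10} \cdot \frac{6}{9} = \frac{4}{33}$.
• $P(E) = \frac{4}{33} \cdot 3 = \frac{4}{11}$.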
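
The urn result can be cross-checked by enumerating all unordered draws of three distinct balls and comparing with the multiplication rule. This is a sketch under the assumption that every set of three balls is equally likely; the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

balls = ["R"] * 6 + ["B"] * 5                      # 6 red, 5 blue
draws = list(combinations(range(len(balls)), 3))   # unordered draws of 3 distinct balls

favourable = [d for d in draws if sorted(balls[i] for i in d) == ["B", "B", "R"]]
p_by_counting = Fraction(len(favourable), len(draws))

# Multiplication rule: three orderings, each with probability 6/11 * 5/10 * 4/9.
p_by_rule = 3 * Fraction(6, 11) * Fraction(5, 10) * Fraction(4, 9)
print(p_by_counting, p_by_rule)  # 4/11 4/11
```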

Law of total probability:

For events E and F,
\[ P(E) = P(F)\, P(E \mid F) + P(F^C)\, P(E \mid F^C). \]

According to an insurance company,

• 30% of the population are accident-prone; they will have an accident in any given year with probability 0.4;
• the remaining 70% of the population will have an accident in any given year with probability 0.2.

Accepting this model, what is the probability that a new customer will have an accident in a specific year?

\[ F = \{\text{the customer is accident-prone}\}, \qquad E = \{\text{the customer has an accident in a specific year}\}. \]
\[ P(E) = P(F)\, P(E \mid F) + P(F^C)\, P(E \mid F^C) = 0.3 \cdot 0.4 + 0.7 \cdot 0.2 = 0.26. \]
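
The insurance computation is a direct application of the law of total probability. A minimal sketch using the model's numbers; the variable names are illustrative.

```python
p_prone = 0.3                 # P(F)
p_acc_given_prone = 0.4       # P(E | F)
p_acc_given_not_prone = 0.2   # P(E | F^C)

# Weight each group's accident probability by the size of the group.
p_accident = p_prone * p_acc_given_prone + (1 - p_prone) * p_acc_given_not_prone
print(round(p_accident, 2))  # 0.26
```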

Law of total probability (generalization):

Let $\{F_1, F_2, \dots, F_n\}$ be a partition of Ω, i.e., the $F_i$ are mutually exclusive and their union is Ω. Then
\[ P(E) = P(F_1)\, P(E \mid F_1) + \dots + P(F_n)\, P(E \mid F_n). \]
Bayes’ Theorem

For any events E and F,
\[ P(F \mid E) = \frac{P(F)\, P(E \mid F)}{P(F)\, P(E \mid F) + P(F^C)\, P(E \mid F^C)}. \]

Theorem (Generalization):

For any event E and a partition $F_1, F_2, \dots, F_n$ of Ω,
\[ P(F_i \mid E) = \frac{P(F_i)\, P(E \mid F_i)}{P(F_1)\, P(E \mid F_1) + \dots + P(F_n)\, P(E \mid F_n)}. \]

According to an insurance company,

• 30% of the population are accident-prone; they will have an accident in any given year with probability 0.4;
• the remaining 70% of the population will have an accident in any given year with probability 0.2.

Accepting this model, what is the probability that a new customer is accident-prone if we know that the customer did have an accident?
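
\[ F = \{\text{the customer is accident-prone}\}, \qquad E = \{\text{the customer has an accident in a specific year}\}. \]
\[ P(F \mid E) = \frac{P(F)\, P(E \mid F)}{P(F)\, P(E \mid F) + P(F^C)\, P(E \mid F^C)} = \frac{0.3 \cdot 0.4}{0.3 \cdot 0.4 + 0.7 \cdot 0.2} \approx 0.46. \]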
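
The posterior for the insurance example can be computed for both groups at once, which also shows that the two posteriors sum to 1. A minimal sketch; the dictionary keys `"prone"` and `"not_prone"` are illustrative labels, not from the slides.

```python
prior = {"prone": 0.3, "not_prone": 0.7}        # P(F), P(F^C)
likelihood = {"prone": 0.4, "not_prone": 0.2}   # P(E | F), P(E | F^C)

# Denominator of Bayes' Theorem: P(E) by the law of total probability.
evidence = sum(prior[g] * likelihood[g] for g in prior)

# Posterior probability of each group given an accident.
posterior = {g: prior[g] * likelihood[g] / evidence for g in prior}
print(round(posterior["prone"], 2))  # 0.46
```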

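The experiment of rolling a die twice. Let
\[ E = \{\text{the first roll gives 3}\}, \qquad F = \{\text{the second roll gives 5}\}. \]
If we know that the second roll gives 5, then
\[ P(E \mid F) = \frac{|E \cap F|}{|F|} = \frac{|\{(3,5)\}|}{|\{(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)\}|} = \frac{1}{6}. \]
Without this information,
\[ P(E) = \frac{|E|}{|\Omega|} = \frac{|\{(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)\}|}{|\{(1,1), (1,2), \dots, (6,5), (6,6)\}|} = \frac{6}{36} = \frac{1}{6}. \]

The occurrence of F does not change the chance of E.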



Two events E and F are independent if $P(E \mid F) = P(E)$ and $P(F \mid E) = P(F)$. This implies
\[ P(E \cap F) = P(E)\, P(F). \]
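Example (Dependent events):

The experiment of rolling a die twice. Let
\[ E = \{\text{the sum of the rolls is 8}\}, \qquad F = \{\text{the second roll gives 5}\}. \]
If we know that the second roll gives 5, then
\[ P(E \mid F) = \frac{|E \cap F|}{|F|} = \frac{|\{(3,5)\}|}{|\{(1,5), (2,5), (3,5), (4,5), (5,5), (6,5)\}|} = \frac{1}{6}. \]
Without this information,
\[ P(E) = \frac{|E|}{|\Omega|} = \frac{|\{(2,6), (6,2), (3,5), (5,3), (4,4)\}|}{|\{(1,1), (1,2), \dots, (6,5), (6,6)\}|} = \frac{5}{36}. \]

The occurrence of F changes the chance of E.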



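Rolling 2 dice. Let
\[ E = \{\text{the sum of the rolls is 7}\}, \qquad F = \{\text{the first die shows 3}\}, \qquad G = \{\text{the second die shows 4}\}. \]
\[ P(E \cap F) = P(E \cap G) = P(F \cap G) = \frac{1}{36}. \]
\[ E = \{(1,6), (6,1), (2,5), (5,2), (3,4), (4,3)\}, \]
\[ F = \{(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)\}, \]
\[ G = \{(1,4), (2,4), (3,4), (4,4), (5,4), (6,4)\}. \]
\[ P(E) = P(F) = P(G) = \frac{6}{36} = \frac{1}{6}. \]
\[ P(E \cap F) = P(E)\, P(F), \qquad P(E \cap G) = P(E)\, P(G), \qquad P(F \cap G) = P(F)\, P(G). \]
\[ P(E \mid F \cap G) = \frac{|E \cap F \cap G|}{|F \cap G|} = \frac{|\{(3,4)\}|}{|\{(3,4)\}|} = 1 \ne \frac{1}{6} = P(E). \]

These events are actually dependent, although they are “mutually” (that is, pairwise) independent.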
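
The gap between independence of each pair and independence of all three events can be confirmed by enumeration. A minimal sketch; `prob` is a hypothetical helper for the equally likely case.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))
E = {w for w in omega if sum(w) == 7}   # the sum of the rolls is 7
F = {w for w in omega if w[0] == 3}     # the first die shows 3
G = {w for w in omega if w[1] == 4}     # the second die shows 4

def prob(A):
    """P(A) = |A| / |omega| under equally likely outcomes."""
    return Fraction(len(A), len(omega))

# Product rule for each pair, and for the triple.
pairwise = all(prob(A & B) == prob(A) * prob(B) for A, B in [(E, F), (E, G), (F, G)])
triple = prob(E & F & G) == prob(E) * prob(F) * prob(G)
print(pairwise, triple)  # True False
```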


The events E, F, and G are independent if

1. $P(E \cap F) = P(E)\, P(F)$,
2. $P(E \cap G) = P(E)\, P(G)$,
3. $P(F \cap G) = P(F)\, P(G)$,
4. $P(E \cap F \cap G) = P(E)\, P(F)\, P(G)$.

The definition of independence can be generalized to more than 3 events, …

Consider an experiment with probability of success p, for $0 \le p \le 1$. Perform that experiment n times independently. What is the probability of the event that there are k successes, for $0 \le k \le n$?
\[ \binom{n}{k} p^k (1-p)^{n-k} \]
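
The formula for k successes in n independent trials translates directly into code. In this sketch the function name `binomial_pmf` and the numbers in the usage line (10 fair coin flips, exactly 3 heads) are hypothetical and chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical numbers: 10 fair coin flips, exactly 3 heads.
print(binomial_pmf(3, 10, 0.5))  # 0.1171875
```

Summing `binomial_pmf(k, n, p)` over k = 0, …, n gives 1, in line with the Binomial Theorem applied to $(p + (1 - p))^n$.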
Conditional independence

According to an insurance company,

• 30% of the population are accident-prone; they will have an accident in any given year with probability 0.4;
• the remaining 70% of the population will have an accident in any given year with probability 0.2.

Accepting this model, what is the probability that a new customer will have an accident in the next year if we know that the customer did have an accident this year?
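
\[ F = \{\text{the customer is accident-prone}\}, \quad E = \{\text{the customer has an accident this year}\}, \quad G = \{\text{the customer will have an accident next year}\}. \]
\[ P(G \mid E) = \frac{P(G \cap E)}{P(E)} = \frac{P(G \cap E \cap F)}{P(E)} + \frac{P(G \cap E \cap F^C)}{P(E)} \]
\[ = \frac{P(G \cap E \cap F)}{P(E \cap F)} \cdot \frac{P(E \cap F)}{P(E)} + \frac{P(G \cap E \cap F^C)}{P(E \cap F^C)} \cdot \frac{P(E \cap F^C)}{P(E)} \]
\[ = P(G \mid E \cap F)\, P(F \mid E) + P(G \mid E \cap F^C)\, P(F^C \mid E). \]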
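
• $P(G \mid E \cap F)\, P(F \mid E) \approx 0.4 \cdot 0.46 = 0.184$.
• $P(G \mid E \cap F^C)\, P(F^C \mid E) \approx 0.2 \cdot 0.54 = 0.108$.

\[ P(G \mid E) \approx 0.29. \]

• $P(G) = P(G \mid F)\, P(F) + P(G \mid F^C)\, P(F^C) = 0.4 \cdot 0.3 + 0.2 \cdot 0.7 = 0.26$.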
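
The comparison between P(G | E) and P(G) can be reproduced end to end. This sketch makes one assumption explicit: within each group, accidents in different years are independent, so that $P(G \mid E \cap F) = P(G \mid F)$ and $P(G \mid E \cap F^C) = P(G \mid F^C)$. The variable names are illustrative.

```python
p_prone = 0.3
p_acc = {"prone": 0.4, "not_prone": 0.2}   # yearly accident probability per group

# P(E): accident this year, by the law of total probability.
p_E = p_prone * p_acc["prone"] + (1 - p_prone) * p_acc["not_prone"]

# Posterior group probabilities given an accident this year (Bayes' Theorem).
p_prone_given_E = p_prone * p_acc["prone"] / p_E
p_not_prone_given_E = 1 - p_prone_given_E

# Assumption: given the group, this year's and next year's accidents are independent.
p_G_given_E = p_acc["prone"] * p_prone_given_E + p_acc["not_prone"] * p_not_prone_given_E
print(round(p_G_given_E, 2), round(p_E, 2))  # 0.29 0.26
```

By the same model P(G) equals P(E), so an accident this year raises the estimated chance of an accident next year from 0.26 to about 0.29.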


