STAT2372 Topic1 2020

STAT2372 Probability


Topic 1
Elementary Notions

A Short History of Probability
• Ancient Egypt and Greece (games of chance)
• Gerolamo Cardano (1501–1576), Liber de ludo aleae (“Book on
Games of Chance”), Blaise Pascal (1623–1662) and Pierre de
Fermat (1601–1665), correspondence
• Jacob Bernoulli (1654–1705), Ars Conjectandi, 1713 (“The Art of
Conjecturing”), Abraham de Moivre (1667–1754), Leonhard Euler
(1707–1783), Joseph-Louis Lagrange (1736–1813), Pierre-Simon
Laplace (1749–1827), Carl Friedrich Gauss (1777–1855), Siméon
Denis Poisson (1781–1840)
• Andrey Kolmogorov (1903–1987), Grundbegriffe der
Wahrscheinlichkeitsrechnung, 1933 (translation “Foundations of
the Theory of Probability”, 1950).

Sample Spaces and Events
• Consider an experiment whose outcome is not predictable with
certainty, but for which all possible outcomes are known. We call
the set of all possible outcomes the sample space of the
experiment, and usually denote it by S or Ω. Its elements are
listed in braces, {}. For example, if the experiment consists of flipping
one coin, the outcomes are H and T, where H denotes ‘head’ and
T denotes ‘tail’, and the sample space is
Ω = {H, T } .
If the experiment consists of flipping two coins, the sample space is
Ω = {(H, H) , (H, T ) , (T, H) , (T, T )} .
Each ordered pair represents one outcome of the experiment. Thus
(H, T ) means that a head turned up on the first flip and a tail on
the second.
• Any subset E of the sample space Ω is known as an event. For
example, if E1 is the event that a head turns up on the first coin,
E1 = {(H, H) , (H, T )} .
If E2 is the event that at least one tail appears, then

E2 = {(H, T ) , (T, H) , (T, T )} .

• Two events A and B are said to be mutually exclusive if
A ∩ B = ∅ (the empty set), i.e. if A and B have no outcomes
in common.
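
The two-coin sample space and the events above can be written out as Python sets. This is a minimal sketch: the event E3 and the helper `mutually_exclusive` are illustrative names that do not appear in the notes.

```python
from itertools import product

# Sample space for flipping two coins: all ordered pairs of H and T.
omega = set(product("HT", repeat=2))

# E1: a head turns up on the first coin.
E1 = {outcome for outcome in omega if outcome[0] == "H"}
# E2: at least one tail appears.
E2 = {outcome for outcome in omega if "T" in outcome}
# E3: both coins show heads (hypothetical event, added for illustration).
E3 = {("H", "H")}

def mutually_exclusive(a, b):
    """Two events are mutually exclusive when their intersection is empty."""
    return a.isdisjoint(b)

print(sorted(E1))                  # [('H', 'H'), ('H', 'T')]
print(mutually_exclusive(E1, E2))  # False: both contain (H, T)
print(mutually_exclusive(E2, E3))  # True: no outcome in common
```

Representing events as sets means that intersection, union and complement map directly onto the set operators `&`, `|` and `-`.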

The Axioms of Probability

A function P whose domain is the set of all possible events and

which satisfies the three conditions (axioms)
1. 0 ≤ P (E) ≤ 1;
2. P (Ω) = 1;
3. For any sequence of mutually exclusive events E1 , E2 , . . . ,
P (E1 or E2 or . . .) = P (E1 ) + P (E2 ) + . . . ,
i.e.
$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$,
is called a probability function.
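
As a rough check of the axioms, the sketch below assumes equally likely outcomes on the two-coin sample space, so that P(E) = |E| / |Ω|. That assignment is an assumption made for illustration, and so are the names `prob` and `all_events`. On a finite sample space, axiom 3 can only be tested in its finite form, here as additivity over every pair of disjoint events.

```python
from fractions import Fraction
from itertools import chain, combinations, product

omega = frozenset(product("HT", repeat=2))

def prob(event):
    """Equally likely outcomes (an assumption): P(E) = |E| / |Omega|."""
    return Fraction(len(event), len(omega))

def all_events(space):
    """Every subset of a finite sample space is an event."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

events = all_events(omega)

# Axiom 1: every probability lies between 0 and 1.
axiom1 = all(0 <= prob(e) <= 1 for e in events)
# Axiom 2: the whole sample space has probability 1.
axiom2 = prob(omega) == 1
# Axiom 3 (finite form): additivity over pairs of disjoint events.
axiom3 = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events
    for b in events
    if a.isdisjoint(b)
)

print(len(events), axiom1, axiom2, axiom3)  # 16 True True True
```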

Probability Notation and Properties

If A and B are any two events within the sample space, i.e. subsets
of Ω, then
• the symbol ∩ denotes intersection and A ∩ B, also written AB, is
the event that both A and B occur.
• Aᶜ or Ā is the complement of A (read as “not A”), and so
P (A) + P (Ā) = P (Ω) = 1;
• the symbol ∪ denotes union and A ∪ B is the event that either A
or B or both A and B have occurred.
We have, using axiom 3 above,

P (A ∪ B) = P (A) + P (B) − P (AB) .

• If A and B are mutually exclusive, then

P (A ∪ B) = P (A) + P (B) ;

• Important: Events A and B are said to be independent, if

P (AB) = P (A) P (B) ;

• An event F is a sub-event of the event E if E contains F , i.e.
F ⊂ E. Then P (F ) ≤ P (E) .
For example, when tossing two coins, let E be the event that
at least one head appears and F be the event that the first coin
is a head. Then

E = {(H, T ) , (T, H) , (H, H)}

F = {(H, T ) , (H, H)} .

Clearly, F is a subset of E , and P (F ) ≤ P (E) no matter how
the numeric probabilities are assigned to the events.
• Note that the domain of the function P is the set of all events.
Thus if we write P (E) , we are implying that E is an event.
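
The properties in this section can be checked numerically on the two-coin example. The sketch assumes equally likely outcomes, which the notes do not require, and it adds a hypothetical event G (second coin is a head) so that the union and independence rules have two events to work with.

```python
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))

def prob(event):
    """Equally likely outcomes (an assumption): P(E) = |E| / |Omega|."""
    return Fraction(len(event), len(omega))

E = {o for o in omega if "H" in o}     # at least one head
F = {o for o in omega if o[0] == "H"}  # first coin is a head
G = {o for o in omega if o[1] == "H"}  # second coin is a head (hypothetical)

# Complement rule: P(F) + P(not F) = 1.
complement_rule = prob(F) + prob(omega - F) == 1
# Union rule: P(F or G) = P(F) + P(G) - P(FG).
union_rule = prob(F | G) == prob(F) + prob(G) - prob(F & G)
# Sub-event: F is contained in E, so P(F) <= P(E).
sub_event = F <= E and prob(F) <= prob(E)
# Independence: P(FG) = P(F) P(G).
independent = prob(F & G) == prob(F) * prob(G)

print(prob(E), prob(F))  # 3/4 1/2
print(complement_rule, union_rule, sub_event, independent)
```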

Independence: Pairwise and Mutual

• Events A and B are said to be independent, if

P (AB) = P (A) P (B) ;

• More precisely, events A and B are said to be pairwise
independent if they satisfy the above condition.
• What about three or more events?
• Three events A, B and C are said to be mutually independent if
– P (ABC) = P (A)P (B)P (C), and
– every pair of events is pairwise independent, i.e.
∗ P (AB) = P (A)P (B)
∗ P (AC) = P (A)P (C)
∗ P (BC) = P (B)P (C)
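
The two definitions can be turned into small checking functions. The sketch below uses three coin flips with equally likely outcomes as a hypothetical example. The function `mutually_independent` tests the product rule on every sub-collection of two or more events, which for three events is exactly the four conditions listed above.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Three coin flips, equally likely outcomes (an illustrative assumption).
omega = set(product("HT", repeat=3))

def prob(event):
    return Fraction(len(event), len(omega))

def pairwise_independent(events):
    """P(XY) = P(X) P(Y) for every pair of events."""
    return all(
        prob(a & b) == prob(a) * prob(b) for a, b in combinations(events, 2)
    )

def mutually_independent(events):
    """Product rule for every sub-collection of two or more events."""
    for r in range(2, len(events) + 1):
        for group in combinations(events, r):
            joint = set.intersection(*group)
            if prob(joint) != prod(prob(e) for e in group):
                return False
    return True

A = {o for o in omega if o[0] == "H"}  # first flip is a head
B = {o for o in omega if o[1] == "H"}  # second flip is a head
C = {o for o in omega if o[2] == "H"}  # third flip is a head

print(pairwise_independent([A, B, C]))  # True
print(mutually_independent([A, B, C]))  # True
```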

Conditional Probability
• If A and B are any two events, then the conditional probability
of A given B is defined by
$P(A \mid B) = \frac{P(AB)}{P(B)}$
provided P (B) > 0. It is undefined if P (B) = 0.
• This makes sense if you draw a Venn diagram, and think about
redefining probabilities when you know that only the outcomes in
B are possible.
• Let us extend this to three events. Suppose we toss two ordinary
six-sided dice, (i.e. the numbers on the faces are 1,2,3,4,5 and 6)
and the tosses are “independent”.
• Let A be the event that the number on the first die is odd, B the
event that the number on the second die is odd and C the event
that the sum of the numbers is odd.
• Clearly, events A and B are independent, as the tosses are independent.
• However, events A and C would seem to be dependent, as the
result of the first toss is used to determine the sum. How do we
investigate this?
• First, we obtain the probabilities by counting, assuming equally
likely outcomes. The sample spaces for the outcomes for each of
the two dice are
Ω1 = Ω2 = {1, 2, 3, 4, 5, 6} .
• Assuming the two dice are perfectly constructed, the outcomes
are equally likely and therefore each has probability 1/6.
• Thus
A = {1, 3, 5} and P (A) = 3/6 = 1/2;
B = {1, 3, 5} and P (B) = 3/6 = 1/2.
Now C = {(1, 2) , (1, 4) , (1, 6) , (2, 1) , (2, 3) , (2, 5) , . . . , (6, 5)} .
The number of outcomes comprising C is 18, the total number of
outcomes is 6 × 6 = 36 and therefore
P (C) = 18/36 = 1/2.

• Now,
P (C|A) = P (sum is odd | first is odd)
= P (first is even and second is odd | first is odd)
+ P (first is odd and second is even | first is odd)
= P (first is odd and second is even | first is odd)
= P (second is even | first is odd) ,
since the first term is zero.
• Since the two tosses are independent, the conditioning can be
dropped. Thus
P (C|A) = P (second is even)
= P (B̄)
= 1 − P (B)
and therefore
P (C|A) = 1/2 = P (C) .
• Hence the events A and C are independent!
• We can also show that B and C are independent.
• Hence, pairwise independence has been shown for the events A, B
and C.
• However, these events are not mutually independent, since
P (C|AB) = 0
and so
P (ABC) = 0 ≠ P (A) P (B) P (C) = 1/8,
i.e. if A and B occur, then C cannot.
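
The dice example can be verified by brute-force enumeration of the 36 equally likely outcomes. This is a sketch only: `prob` and `cond_prob` are illustrative helper names, and A, B and C are represented as subsets of the 36 ordered pairs rather than of a single die's sample space.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two six-sided dice, assumed equally likely.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(AB) / P(B), defined only when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) = 0, so P(A|B) is undefined")
    return prob(a & b) / prob(b)

A = {o for o in omega if o[0] % 2 == 1}    # first die is odd
B = {o for o in omega if o[1] % 2 == 1}    # second die is odd
C = {o for o in omega if sum(o) % 2 == 1}  # sum is odd

print(prob(A), prob(B), prob(C))  # 1/2 1/2 1/2
print(cond_prob(C, A))            # 1/2, equal to P(C)
print(cond_prob(C, A & B))        # 0
print(prob(A & B & C))            # 0, whereas P(A)P(B)P(C) = 1/8
```

The last two lines show why pairwise independence is not enough: each pair satisfies the product rule, but the triple does not.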

The Law of Total Probability
• Suppose events E1 , E2 , . . . , En are mutually exclusive and
exhaustive of the sample space Ω (i.e. $\bigcup_{i=1}^{n} E_i = \Omega$). Then, for any
event A, we can write
$A = \bigcup_{i=1}^{n} (A E_i)$.

• Because the Ei are mutually exclusive events, the AEi are also
mutually exclusive.

• Thus,
$P(A) = \sum_{i=1}^{n} P(A E_i) = \sum_{i=1}^{n} P(A \mid E_i) P(E_i)$.

• The last step uses the conditional probability rule. Since
$P(A \mid B) = \frac{P(AB)}{P(B)}$,
we have
P (AB) = P (A|B) P (B)
or, equivalently, P (AB) = P (B|A) P (A) .
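
The Law of Total Probability can be checked on a small case. The sketch below assumes two fair dice and partitions the sample space by the value of the first die. The event A (sum of at least 10) is a hypothetical choice, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

# Two fair dice with equally likely outcomes (an illustrative assumption).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(AB) / P(B); assumes P(B) > 0."""
    return prob(a & b) / prob(b)

# Partition E_1, ..., E_6: the first die shows i.
partition = [{o for o in omega if o[0] == i} for i in range(1, 7)]
# A: the sum of the two dice is at least 10 (hypothetical event).
A = {o for o in omega if sum(o) >= 10}

# The E_i must be exhaustive and mutually exclusive.
exhaustive = set().union(*partition) == omega
exclusive = all(
    partition[i].isdisjoint(partition[j])
    for i in range(6)
    for j in range(i + 1, 6)
)

# Sum of P(A|E_i) P(E_i) over the partition.
total = sum(cond_prob(A, E) * prob(E) for E in partition)
print(exhaustive, exclusive)  # True True
print(total, prob(A))         # 1/6 1/6
```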

This can be visualised in the following Venn Diagram:

Bayes’ Rule (or Theorem)
• Bayes’ rule, named after Thomas Bayes (1701–1761), can be
derived from the Law of Total Probability and the conditional
probability rule above.
• If {Ei ; i = 1, . . . n} are mutually exclusive and exhaustive, and A
is any event, then for j = 1, . . . , n,
$P(E_j \mid A) = \frac{P(E_j A)}{P(A)} = \frac{P(A \mid E_j) P(E_j)}{\sum_{i=1}^{n} P(A \mid E_i) P(E_i)}$.

• Note that the summation index in the denominator is i, not j; i is
a dummy index. Using the same variable in the numerator and the
denominator, especially when one of them is a summation index, is
sloppy notation and will be avoided.
• Example: A company buys its tyres from four suppliers: A (20%),
B (30%), C (45%), D (5%). 10% of A’s tyres are faulty, 8% of
B’s, 20% of C’s and 2% of D’s. If a tyre is selected at random
and found to be faulty, what is the probability that it came from
supplier C?
• Let F be the event that the tyre is faulty, and Ei be the event
that the tyre came from supplier i, (i = 1, 2, 3, 4), corresponding
to suppliers A, B, C and D respectively.
• Then
P (E1 ) = 0.20 P (F |E1 ) = 0.1
P (E2 ) = 0.30 P (F |E2 ) = 0.08
P (E3 ) = 0.45 P (F |E3 ) = 0.20
P (E4 ) = 0.05 P (F |E4 ) = 0.02

• Thus

$P(E_3 \mid F) = \frac{P(F \mid E_3) P(E_3)}{\sum_{i=1}^{4} P(F \mid E_i) P(E_i)}$
$= \frac{(0.20)(0.45)}{(0.1)(0.2) + (0.08)(0.3) + (0.2)(0.45) + (0.02)(0.05)}$
$= \frac{0.09}{0.135} \simeq 0.6667$

• Note that the denominator, P (F ) = 0.135, is a “weighted average”
of the conditional probabilities P (F |Ei ), with weights P (Ei ).
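
The tyre example can be reproduced with exact fractions. In this sketch the dictionary names `prior` and `fault_rate` are illustrative. The denominator P(F) comes from the Law of Total Probability, and Bayes' rule then gives the posterior probability for every supplier, not only C.

```python
from fractions import Fraction

# P(E_i): share of tyres bought from each supplier.
prior = {
    "A": Fraction(20, 100),
    "B": Fraction(30, 100),
    "C": Fraction(45, 100),
    "D": Fraction(5, 100),
}
# P(F|E_i): proportion of each supplier's tyres that are faulty.
fault_rate = {
    "A": Fraction(10, 100),
    "B": Fraction(8, 100),
    "C": Fraction(20, 100),
    "D": Fraction(2, 100),
}

# Law of Total Probability: P(F) is the sum of P(F|E_i) P(E_i).
p_faulty = sum(fault_rate[s] * prior[s] for s in prior)

# Bayes' rule: P(E_j|F) = P(F|E_j) P(E_j) / P(F).
posterior = {s: fault_rate[s] * prior[s] / p_faulty for s in prior}

print(p_faulty)                         # 27/200, i.e. 0.135
print(posterior["C"])                   # 2/3
print(round(float(posterior["C"]), 4))  # 0.6667
```

Because the suppliers are mutually exclusive and exhaustive, the four posterior probabilities sum to 1.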

