
CL202: Introduction to Data Analysis

MB, SCP

Mani Bhushan, Sachin Patwardhan


Department of Chemical Engineering,
Indian Institute of Technology Bombay
Mumbai, India- 400076
mbhushan,sachinp@iitb.ac.in

Acknowledgements: Santosh Noronha

Spring 2015

MB, SCP (IIT Bombay) CL202 Spring 2015 1 / 35


Probability: Introduction

Statement: 60% probability of finding oil in a region


Frequency interpretation of probability: probability = proportion of
repeated experiments that result in that outcome. It is a property of the
outcome. Used by scientists and engineers.
Subjective (or personal) interpretation of probability: a statement about the
belief of the person making it. Used by economists, philosophers, etc.



Sample Space

Experiment: outcome ω not known in advance but set of all possible outcomes is
known.
Sample space S of an experiment: set of all possible outcomes
Examples:
1) Coin toss: S={H,T}
2) Dice throw: S={1,2,3,4,5,6}
3) Race involving 7 horses: S={all orderings of (1,2,3,4,5,6,7)} where
outcome (2,3,1,6,5,4,7) means for instance that horse numbered 2 finishes
first followed by horse numbered 3, etc.
4) S for the value of a concentration in a kinetics experiment is [0, ∞).



Events

Any subset E of sample space S is known as an event.


If outcome of experiment is contained in E, then we say E has occurred.
Example: horse race: event E={all outcomes in S starting with 3} i.e. E is
the event that number 3 horse wins the race.



Algebra of Events

Events are sets, so several set operations apply.


E ∪ F : union of two events is set of all outcomes that are either in E or in F.
EF or E ∩ F : intersection of two events (set of outcomes present in both
events).
E^C : complement of event E, i.e. set of all outcomes in S that are not in E.
Mutually exclusive events: E and F are mutually exclusive if E ∩ F = φ, where φ
is the null event (event consisting of no outcomes).
For events E, F: if all outcomes in E are also in F, then E ⊂ F, so whenever E
occurs, F also occurs.



Algebra of Events (cont.)

Venn diagrams as a useful tool


Laws:
- Commutative: E ∪ F = F ∪ E; EF = FE.
- Associative: (E ∪ F) ∪ G = E ∪ (F ∪ G); (EF)G = E(FG).
- Distributive: (E ∪ F)G = EG ∪ FG; EF ∪ G = (E ∪ G)(F ∪ G).
- DeMorgan's laws:
  (E ∪ F)^C = E^C F^C
  (EF)^C = E^C ∪ F^C
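
Since events are sets, the laws above can be spot-checked with Python's built-in set type. This is only a minimal sketch for one die throw: the events E, F and G are hypothetical choices made for illustration, and a check on one example is not a proof.

```python
# Sample space for a single die throw, as in the earlier example.
S = {1, 2, 3, 4, 5, 6}

# Hypothetical events chosen only for illustration.
E = {2, 4, 6}   # even number
F = {4, 5, 6}   # number greater than 3
G = {1, 2}      # number less than 3


def complement(event, sample_space=S):
    """Return the outcomes of the sample space that are not in the event."""
    return sample_space - event


# DeMorgan's laws (| is union, & is intersection).
demorgan_union = complement(E | F) == complement(E) & complement(F)
demorgan_intersection = complement(E & F) == complement(E) | complement(F)

# Distributive laws.
distributive_1 = (E | F) & G == (E & G) | (F & G)
distributive_2 = (E & F) | G == (E | G) & (F | G)

print(demorgan_union, demorgan_intersection, distributive_1, distributive_2)
```

Changing E, F and G to other subsets of S should leave all four checks true.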



Axioms of Probability

Empirically, let n(E) be number of times event E occurs in N repeated
experiments. Then,

$$P(E) = \lim_{N \to \infty} \frac{n(E)}{N} = \text{constant}$$

Mathematically, for each event E of an experiment with sample space S, there is a
number denoted P(E) (called probability of event E) consistent with the following
three axioms:
1. 0 ≤ P(E) ≤ 1
2. P(S) = 1
3. For any sequence of mutually exclusive events E1, E2, ..., En (i.e. Ei ∩ Ej = φ
when i ≠ j),

$$P\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i), \quad n = 1, 2, \ldots, \infty$$



Additional Propositions Concerning Probability

Using the above axioms, the following can be shown:


P(E^C) = 1 − P(E)
P(E ∪ F) = P(E) + P(F) − P(EF)
Example: 60% of Indians like cricket, 20% like football and 10% like both. What
percentage of Indians like neither cricket nor football?
Soln: E: event that a randomly chosen Indian likes cricket, F: event that a
randomly chosen Indian likes football.

P(E ∪ F) = P(E) + P(F) − P(E ∩ F) = 0.6 + 0.2 − 0.1 = 0.7

i.e. 70% like at least one of the two. Thus 30% like neither.
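
The arithmetic of this example can be reproduced in a few lines. This is a sketch only; the function name prob_union is a made-up helper, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Figures quoted in the example.
p_cricket = Fraction(60, 100)
p_football = Fraction(20, 100)
p_both = Fraction(10, 100)


def prob_union(p_e, p_f, p_ef):
    """P(E ∪ F) = P(E) + P(F) − P(EF)."""
    return p_e + p_f - p_ef


p_at_least_one = prob_union(p_cricket, p_football, p_both)
# Complement rule: P(neither) = 1 − P(at least one).
p_neither = 1 - p_at_least_one

print(p_at_least_one, p_neither)  # 7/10 3/10
```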



Odds of an event

Odds of an event A defined as:

$$\frac{P(A)}{P(A^C)} = \frac{P(A)}{1 - P(A)}$$

Odds tell how much more likely it is that A occurs than that it does not occur.
Example: If P(A) = 3/4 then odds are 3 i.e. it is 3 times as likely that A occurs
as it is that it does not.
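
A small sketch of the conversion between probability and odds. The function names are hypothetical, and the inverse formula P(A) = odds / (1 + odds) is not on the slide; it follows from rearranging the definition.

```python
from fractions import Fraction


def odds(p):
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("odds are only finite for 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(o):
    """Invert the definition: p = o / (1 + o)."""
    return o / (1 + o)


odds_of_a = odds(Fraction(3, 4))
print(odds_of_a)                         # 3
print(probability_from_odds(odds_of_a))  # 3/4
```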



Sample Spaces Having Equally Likely Outcomes

For some experiments, each point in sample space is equally likely to occur,
i.e. if

$$S = \{\omega_1, \omega_2, \ldots, \omega_N\}$$

and

$$P(\{\omega_1\}) = P(\{\omega_2\}) = \ldots = P(\{\omega_N\}) = p \text{ (say)},$$

then

$$P(S) = \sum_{i=1}^{N} P(\{\omega_i\}) = Np = 1 \implies p = \frac{1}{N}$$

Thus, for any event E

$$P(E) = \frac{\text{number of points in } E}{N}$$

Example: Probability of getting an even number from dice throw = 3/6 = 0.5.
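
The counting formula can be sketched directly: list the sample space, count the points in the event, divide by N. The helper name prob_equally_likely is an assumption made for this illustration.

```python
from fractions import Fraction


def prob_equally_likely(event, sample_space):
    """P(E) = (number of points in E) / N when all N outcomes are equally likely."""
    outcomes = list(sample_space)
    favourable = [w for w in outcomes if event(w)]
    return Fraction(len(favourable), len(outcomes))


die = range(1, 7)  # S = {1, 2, 3, 4, 5, 6}
p_even = prob_equally_likely(lambda w: w % 2 == 0, die)
print(p_even)  # 1/2
```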



Basic Principle of Counting
To compute probabilities, it is often necessary to count the number of different
ways that a given event can occur.
Basic principle of counting:
- Suppose two experiments are to be performed.
- If Experiment 1 can result in any one of m possible outcomes, and if,
- for each outcome of Experiment 1, there are n possible outcomes of
Experiment 2,
- then together there are mn possible outcomes of the two experiments.
Example: Two balls are randomly drawn from a bowl containing 6 white and 5
black balls. What is the probability that one of the drawn balls is white and the
other black?
Soln: E = event that one ball is white and the other black.
Number of ordered ways that E can occur = 6 × 5 + 5 × 6 (white then black, or
black then white), out of 11 × 10 ordered draws. Thus,

$$P(E) = \frac{6 \times 5 + 5 \times 6}{11 \times 10} = \frac{6}{11}$$
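
The count in this example can be confirmed by brute force: enumerate every ordered draw of two distinct balls and count those with one ball of each colour. This is a sketch; the labels "W" and "B" and the variable names are assumptions.

```python
from fractions import Fraction
from itertools import permutations

# 6 white and 5 black balls; each ball is identified by its index.
balls = ["W"] * 6 + ["B"] * 5

# All ordered draws of two different balls: 11 × 10 of them.
ordered_draws = list(permutations(range(len(balls)), 2))

# Draws in which the two balls differ in colour.
favourable = [(i, j) for i, j in ordered_draws if balls[i] != balls[j]]

p_one_of_each = Fraction(len(favourable), len(ordered_draws))
print(len(favourable), len(ordered_draws), p_one_of_each)  # 60 110 6/11
```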

Generalized basic principle of counting: extension to more than 2 experiments


Generalized Basic Principle of Counting

Illustration:
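Determine the number of different ways n distinct objects can be arranged in a
linear order. The 1st experiment (choosing the first object) has n outcomes, the
2nd has n − 1, ..., the rth has n − r + 1, ..., and the nth has 1:

$$n \times (n-1) \times \ldots \times (n-r+1) \times \ldots \times 1 = n!$$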
Each arrangement is known as a permutation.
Example: Mr. X has 10 books: 4 maths, 3 chemistry, 2 history and 1 English.
How many arrangements of books on a bookshelf are possible such that all books
dealing with the same subject are together on the shelf?
Soln:

$$\underbrace{4!}_{\text{maths}} \; \underbrace{3!}_{\text{chem}} \; \underbrace{2!}_{\text{history}} \; \underbrace{1!}_{\text{english}} \; \underbrace{4!}_{\text{order of subjects}} = 6912$$
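
A sketch of the same count in code. The function grouped_arrangements applies the formula; brute_force is a hypothetical cross-check that enumerates all shelves and is only practical for a small shelf (here 5 books rather than 10).

```python
from itertools import permutations
from math import factorial, prod


def grouped_arrangements(group_sizes):
    """Arrangements of distinct books in which each subject stays together."""
    within_groups = prod(factorial(k) for k in group_sizes)
    order_of_groups = factorial(len(group_sizes))
    return within_groups * order_of_groups


def brute_force(group_sizes):
    """Count the same arrangements by enumerating every shelf (small cases only)."""
    books = [(s, c) for s, size in enumerate(group_sizes) for c in range(size)]
    count = 0
    for shelf in permutations(books):
        subjects = [s for s, _ in shelf]
        # Each subject forms one block exactly when the subject changes
        # (number of subjects - 1) times along the shelf.
        changes = sum(1 for a, b in zip(subjects, subjects[1:]) if a != b)
        if changes == len(group_sizes) - 1:
            count += 1
    return count


print(grouped_arrangements([4, 3, 2, 1]))                      # 6912
print(grouped_arrangements([2, 2, 1]), brute_force([2, 2, 1]))  # 24 24
```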



Combinations

Define: For r ≤ n,

$$^{n}C_{r} = \binom{n}{r} = \frac{n!}{(n-r)!\,r!}$$

is the number of combinations of n objects taken r at a time, or number of
different groups of r objects that could be formed from a set of n objects.
Note:

$$\binom{n}{r} = \frac{n(n-1)(n-2)\ldots(n-r+1)}{r!}$$

where numerator is the number of different ways that a group of r items could be
selected from n items when the order of selection is considered relevant,
denominator denotes that each group of r items is counted r! times.



Combination: Example

Example: A committee of size 5 to be selected from a group of 6 men and 9
women. If the selection is made randomly, what is the probability that the
committee consists of 3 men and 2 women?
Soln.:

$$\frac{\binom{6}{3}\binom{9}{2}}{\binom{15}{5}} = \frac{240}{1001}$$

Note: Randomly selected means that each of the $\binom{15}{5}$ possible combinations is
equally likely.
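
The committee probability can be evaluated with the standard-library function math.comb. A minimal sketch; committee_probability and its parameter names are hypothetical.

```python
from fractions import Fraction
from math import comb


def committee_probability(men, women, pick_men, pick_women):
    """Probability that a random committee has the given numbers of men and women."""
    favourable = comb(men, pick_men) * comb(women, pick_women)
    total = comb(men + women, pick_men + pick_women)
    return Fraction(favourable, total)


p_committee = committee_probability(6, 9, 3, 2)
print(comb(6, 3), comb(9, 2), comb(15, 5))  # 20 36 3003
print(p_committee)                          # 240/1001
```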



Summary so far

Algebra of events
Axioms of probability
Principles of counting
Permutations (order is relevant): $^{n}P_{r} = \frac{n!}{(n-r)!}$
Combinations (order not considered): $^{n}C_{r} = \binom{n}{r} = \frac{n!}{(n-r)!\,r!}$



Conditional Probability

P{A|B} indicates the conditional probability of event A happening, given
event B.
Consider a Venn diagram of two overlapping sets A and B.

$$P\{A|B\} = \frac{P\{A \cap B\}}{P\{B\}}$$

Special case: A=B

$$P\{B|B\} = \frac{P\{B \cap B\}}{P\{B\}} = \frac{P\{B\}}{P\{B\}} = 1$$

This assumes that P{B} is not zero.



Dice throw example

Roll a pair of dice


F=event that first die lands on side 3.
E=event that sum of dice equals 8.
Sample space S = {(1, 1), (1, 2), ..., (6, 6)} (36 outcomes)
$$P(E|F) = \frac{P\{(3,5)\}}{P\{(3,1),(3,2),\ldots,(3,6)\}} = \frac{1/36}{6/36} = \frac{1}{6}$$
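
The same conditional probability can be obtained by enumerating the 36 outcomes and applying the definition P(E|F) = P(EF)/P(F). A sketch only; prob, cond_prob and the event functions are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a true/false function of the outcome."""
    return Fraction(sum(1 for w in sample_space if event(w)), len(sample_space))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("conditioning event has zero probability")
    return prob(lambda w: event(w) and given(w)) / p_given


def first_is_3(w):
    return w[0] == 3


def sum_is_8(w):
    return w[0] + w[1] == 8


print(cond_prob(sum_is_8, first_is_3))  # 1/6
print(prob(sum_is_8))                   # 5/36
```

The unconditional P(E) = 5/36 differs from P(E|F) = 1/6, so knowing F changes the probability of E.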



Son example

The organization that Jones works for is organizing a father-son dinner for those
employees having at least one son.
If Jones is known to have two children, what is the conditional probability
that they are both boys given that he is invited to the dinner?
Soln.
- Sample space S = {(b, b), (b, g), (g, b), (g, g)}. Assume each outcome is
equally likely.
- B: event that both children are boys, A: event that at least one of them is a
boy.
- To find: P(B | A).

$$P(B|A) = \frac{P(BA)}{P(A)} = \frac{P(\{(b,b)\})}{P(\{(b,b),(b,g),(g,b)\})} = \frac{1/4}{3/4} = \frac{1}{3}$$

- Note: Answer is not 1/2.
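
A short enumeration makes the 1/3 concrete and shows where 1/2 would come from. This is a sketch under the slide's equally-likely assumption; reading the first coordinate as "the older child" is an extra assumption added here for the contrast.

```python
from fractions import Fraction
from itertools import product

# (first child, second child), each 'b' or 'g', all four equally likely.
families = list(product("bg", repeat=2))

# Condition on "at least one boy" (Jones is invited).
at_least_one_boy = [f for f in families if "b" in f]
both_boys = [f for f in at_least_one_boy if f == ("b", "b")]
p_both_given_invited = Fraction(len(both_boys), len(at_least_one_boy))

# A different condition: the first (say, older) child is a boy.
first_is_boy = [f for f in families if f[0] == "b"]
both_given_first = [f for f in first_is_boy if f == ("b", "b")]
p_both_given_first_boy = Fraction(len(both_given_first), len(first_is_boy))

print(p_both_given_invited, p_both_given_first_boy)  # 1/3 1/2
```

The value 1/2 answers a different question (a specific child is known to be a boy), which is why it is not the answer here.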



Conditional probability rearrangement

Rearranging the conditional probability equation gives the product rule:


if A and B are two events with P{B} ≠ 0, then

P{A ∩ B} = P{B} · P{A|B}



The general product rule

For example,
A = {person is IITB student},
B = {person lives in Mumbai},
C = {person wears specs}.
Then
P{A ∩ B ∩ C } = P{A | (B ∩ C )}P{B ∩ C } = P{A | B ∩ C }P{B | C }P{C }.
The general product rule: If A1, A2, ..., An are n events, then

$$P\{A_1 \cap A_2 \cap \ldots \cap A_n\} = P\{A_1\}\,P\{A_2|A_1\}\,P\{A_3|A_1 \cap A_2\} \ldots P\{A_n|A_1 \cap A_2 \cap \ldots \cap A_{n-1}\}$$



Conditional Probability Identities

If all probabilities are conditioned on the same event, then

P{A ∪ B | C } = P{A | C } + P{B | C } − P{A ∩ B | C }

This is the conditional inclusion-exclusion rule.


For any event C with P{C} ≠ 0, conditioning on C defines a new probability measure

$$P_C\{A\} := P\{A|C\}$$



Conditional Probability Identities (cont.)

Be careful about the events before and after the condition bar:

P{A | B ∪ C } = P{A | B} + P{A | C }

is false in general!
For example, if B and C are separate, non-overlapping subsets of A (each with
nonzero probability), then

P{A | B} = P{A | C} = 1

(why??) and
P{A | B ∪ C} = 1
whereas
P{A | B} + P{A | C} = 1 + 1 = 2



Useful result for computing probability using conditional
probabilities

For any events E and F with sample space S

E = E ∩ S = E ∩ (F ∪ F^C) = EF ∪ EF^C

Then, since EF and EF^C are mutually exclusive,

P(E) = P(EF) + P(EF^C)
     = P(E | F)P(F) + P(E | F^C)P(F^C)
     = P(E | F)P(F) + P(E | F^C)(1 − P(F))

Probability of event E is a weighted average of the conditional probability of
E given that F has occurred and the conditional probability of E given that
F has not occurred: each conditional probability is weighted by the probability
of the event it is conditioned on.
Generalization: Law of total probability

If the sample space is a disjoint union of B0, B1, ..., BN, then for an event A
(using the sum rule)

$$P\{A\} = \sum_{i=0}^{N} P\{A \cap B_i\} = \sum_{i=0}^{N} P\{A | B_i\}\,P\{B_i\}$$



Example: Insurance Policy

Two types of people:


(i) Accident prone: will have an accident within a year with probability 0.4
(ii) Non-accident prone: will have an accident within a year with probability 0.2
Given: 30% population is accident prone.
Question: what is the probability that a new policy holder will have an accident
within a year of purchasing a policy?
A1 : event that a new policy holder will have an accident within a year of
purchasing a policy.
A: event that policy holder is accident prone.
A1 = A1 A ∪ A1 A^C

P(A1) = P(A1 | A)P(A) + P(A1 | A^C)P(A^C)
      = 0.4 × 0.3 + 0.2 × (1 − 0.3) = 0.26
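
The law of total probability is a weighted sum, which the following sketch applies to the insurance figures. The function total_probability is a hypothetical helper; it assumes the conditioning events form a partition, so their probabilities must sum to 1.

```python
from fractions import Fraction


def total_probability(conditionals, priors):
    """P(A) = sum over i of P(A | B_i) P(B_i), for a partition B_0, ..., B_N."""
    if sum(priors) != 1:
        raise ValueError("priors of a partition must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))


p_prone = Fraction(3, 10)  # 30% of the population is accident prone
p_accident = total_probability(
    conditionals=[Fraction(4, 10), Fraction(2, 10)],  # P(A1 | A), P(A1 | A^C)
    priors=[p_prone, 1 - p_prone],                    # P(A), P(A^C)
)
print(p_accident, float(p_accident))  # 13/50 0.26
```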



Bayes’ Theorem

Bayes’ theorem is about conditional probability and inference:

P(A and B) = P(A ∩ B) = P(B) P(A|B) = P(A) P(B|A)

This tells us how to update our probabilities once some information becomes
known.
This is often rearranged to the forms

$$P(A|B) = \frac{P(A)\,P(B|A)}{P(B)}$$

or,

$$\log P(A|B) = \log P(A) + \log P(B|A) - \log P(B)$$



Bayes' Theorem (cont.)

$$P(A|B) = \frac{P(A)\,P(B|A)}{P(B)}$$

P(A) is the prior probability.


P(A|B) is called the posterior probability with respect to B
Bayes’ theorem is extremely important because it captures new information
(i.e. learning, inference) and describes how to update the prior degree of
belief in A, P(A), when a new piece of information B becomes available.



Example: Coin Tosses

Two coin example: One coin = fair, Second coin has P{heads} = 1;
If a coin is randomly chosen and flipped, and heads obtained, what is the
probability that the fair coin was flipped?
A = fair coin was flipped, and B = heads was obtained.

$$P\{A|B\} = \frac{P\{A \cap B\}}{P\{B\}} = \frac{P\{B|A\}\,P\{A\}}{P\{B\}}$$

Also P(B) = P(B | A)P(A) + P(B | A^C)P(A^C) = (1/2)(1/2) + 1(1/2) = 3/4

Thus,

$$P(A|B) = \frac{(1/2)(1/2)}{1/4 + 1/2} = \frac{1}{3}$$
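
The two steps of this example (total probability for P(B), then Bayes' theorem) can be wrapped in one small function. A sketch only; posterior and its argument names are hypothetical.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_given_complement):
    """P(A | B) from P(A), P(B | A) and P(B | A^C) via Bayes' theorem."""
    # Law of total probability for the denominator P(B).
    evidence = likelihood * prior + likelihood_given_complement * (1 - prior)
    return likelihood * prior / evidence


# A = fair coin was chosen, B = heads was obtained.
p_fair_given_heads = posterior(
    prior=Fraction(1, 2),                     # P(A): coin picked at random
    likelihood=Fraction(1, 2),                # P(B | A): fair coin shows heads
    likelihood_given_complement=Fraction(1),  # P(B | A^C): second coin always heads
)
print(p_fair_given_heads)  # 1/3
```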



Bayes’ Theorem: Another Example (Kit Test)

10% of people are HIV+ .


A new diagnostic kit is available.
For an HIV+ patient there is a 10% chance that the kit says he is HIV−
(false negative).
For an HIV− patient, there is a 30% chance of being found HIV+
(false positive).



Bayes’ Theorem: Kit Test Example (cont.)
A random person is chosen. What is the Pr{person tested as HIV+ is really
HIV+}?
Event of interest:
- A = person is HIV+;
- B = test was +ve.
- To find: P{A|B}.
- Given: P(A) = 0.1, P(B^C|A) = 0.1, P(B|A^C) = 0.3.
- Can infer: P(A^C) = 0.9, P(B|A) = 0.9.

$$P\{A|B\} = \frac{P\{A \cap B\}}{P\{B\}} = \frac{P\{B|A\}P\{A\}}{P\{B|A\}P\{A\} + P\{B|A^C\}P\{A^C\}} = \frac{0.09}{0.09 + 0.27} = \frac{1}{4}$$

- So when a person tests HIV+, there is only a 25% chance that he is actually HIV+.


You could toss a coin and do better!
Not using the kit at all & calling everyone healthy [or use a different kit
which always gives negative results] ⇒ accurate 90% of the time!
Most people are healthy and the test makes too many false predictions.
May still be useful: correct prediction of HIV+ patients is 90%.
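
The numbers on this slide can be reproduced as follows. This is a sketch using the figures given; the variable names are assumptions, and the kit's overall accuracy (72%) is not stated on the slide but follows from the same inputs.

```python
from fractions import Fraction

p_pos = Fraction(1, 10)             # P(A): person is HIV+
p_false_negative = Fraction(1, 10)  # P(B^C | A)
p_false_positive = Fraction(3, 10)  # P(B | A^C)

sensitivity = 1 - p_false_negative  # P(B | A) = 0.9

# P(B) by the law of total probability.
p_test_pos = sensitivity * p_pos + p_false_positive * (1 - p_pos)

# Bayes' theorem: P(A | B).
p_pos_given_test_pos = sensitivity * p_pos / p_test_pos

# Fraction of people the kit classifies correctly (true +ve plus true -ve).
kit_accuracy = sensitivity * p_pos + (1 - p_false_positive) * (1 - p_pos)

# Fraction classified correctly by always answering "negative".
always_negative_accuracy = 1 - p_pos

print(p_pos_given_test_pos)                    # 1/4
print(kit_accuracy, always_negative_accuracy)  # 18/25 9/10
```

Calling everyone healthy is right 90% of the time against 72% for the kit, but it never detects an HIV+ patient, whereas the kit detects 90% of them.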
Independent Events

In general P(E | F) ≠ P(E).
Event E is said to be independent of event F when P(E | F) = P(E). This
implies P(EF) = P(E)P(F).
Whenever E is independent of F, so is F of E.
Events that are not independent are said to be dependent.



Independent Events: Example

A card is selected randomly from a deck of 52 cards.
A: event that selected card is an ace
H: event that it is a heart

P(AH) = 1/52 and P(A) = 4/52, P(H) = 13/52

Events are independent, since P(A)P(H) = (4/52)(13/52) = 1/52 = P(AH)
(1 of the four aces is a heart, and 13 of the 52 cards (again 1 in 4) are hearts)
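
Independence of A and H can be checked by enumerating the deck and comparing P(AH) with P(A)P(H). A sketch only; the rank and suit labels and helper names are assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely cards


def prob(event):
    """Probability of an event given as a true/false function of the card."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_ace(card):
    return card[0] == "A"


def is_heart(card):
    return card[1] == "hearts"


p_ace = prob(is_ace)
p_heart = prob(is_heart)
p_ace_and_heart = prob(lambda card: is_ace(card) and is_heart(card))

independent = p_ace_and_heart == p_ace * p_heart
print(p_ace, p_heart, p_ace_and_heart, independent)  # 1/13 1/4 1/52 True
```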



Generalization to more than 2 events

Three events E, F, G are said to be independent if

P(EFG) = P(E)P(F)P(G)
P(EF) = P(E)P(F)
P(EG) = P(E)P(G)
P(FG) = P(F)P(G)

Check example 3.8c from book

In general: events E1, E2, ..., En are independent if for every subset
E1', E2', ..., Er', r ≤ n:

P(E1' E2' ... Er') = P(E1')P(E2')...P(Er')
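
All four conditions are needed: the pairwise conditions alone do not imply the triple one. The sketch below uses a standard two-coin illustration (not necessarily the book's example); the events and helper names are assumptions chosen for this check.

```python
from fractions import Fraction
from itertools import combinations, product

# Two tosses of a fair coin: four equally likely outcomes.
sample_space = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event given as a true/false function of the outcome."""
    return Fraction(sum(1 for w in sample_space if event(w)), len(sample_space))


def first_heads(w):
    return w[0] == "H"


def second_heads(w):
    return w[1] == "H"


def exactly_one_head(w):
    return (w[0] == "H") != (w[1] == "H")


events = [first_heads, second_heads, exactly_one_head]


def product_rule_holds(subset):
    """Check P(intersection of subset) == product of the individual probabilities."""
    joint = prob(lambda w: all(e(w) for e in subset))
    expected = Fraction(1)
    for e in subset:
        expected *= prob(e)
    return joint == expected


pairwise_independent = all(product_rule_holds(pair) for pair in combinations(events, 2))
triple_condition = product_rule_holds(events)

print(pairwise_independent, triple_condition)  # True False
```

Here every pair satisfies the product rule, yet all three events cannot occur together, so P(EFG) = 0 while P(E)P(F)P(G) = 1/8 and the three events are not independent.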



Chapter 3 Completed
Read it!

Next lecture: chapter 4


Random Variables and Expectations



THANK YOU
