Accelerator 2 - Probability

Data Analysis

PAN Baoqian, Kris


ISOM 5510

Accelerator 2: Probability
During the COVID-19 pandemic, can wearing a face mask reduce the chance of infection?

What’s the chance of winning if you bet on horse No. 8?

What’s the chance that your favorite Olympic athlete will win the gold medal?

What are the odds and chances of having a boy or a girl?

How likely is it that the price of Google will go up by more than 10% by next month? Or drop by more than 5%?
Source: https://finance.yahoo.com/

How risky is it to offer a loan to a certain company?

Goals for this topic

 Understand how to incorporate uncertainty into our decision-making by knowing what probability is and the basic probability rules.

Big and Small
 Big and small, also known as Sic bo, is a casino game.
 Gameplay involves betting that a certain condition (e.g.
that all three dice will roll the same) will be satisfied by
a roll of the three dice.
 Uncertainty in the outcome of rolling the dice.
 What is the chance of winning a particular bet?
The Concept of Probability

 An experiment is the observation of some activity or the act of taking some measurement.
 The experiment is rolling a die.
 The outcome is the particular result of an experiment.
 The possible outcomes of rolling a die are the numbers 1, 2, 3, 4, 5, and 6.
The Concept of Probability continued

 An event is the collection of one or more outcomes of an experiment.
 A fair die is rolled once:
 the occurrence of the number 6.
 the occurrence of an even number, i.e., 2, 4, 6.
 A sample space of an experiment is the set of all possible experimental outcomes.
 The sample space of rolling a die is {1, 2, 3, 4, 5, 6}
The Concept of Probability continued

 Probability is a measure of the likelihood, or a number that measures chances, that an event in the future will happen.
 If E is an experimental outcome, then P(E) denotes the probability that E will occur and:
Conditions
1. 0 ≤ P(E) ≤ 1 such that:
 If E can never occur, then P(E) = 0
 If E is certain to occur, then P(E) = 1
2. The probabilities of all the experimental outcomes must sum to 1.
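
The definitions so far can be tried out in a few lines of Python. This is only a sketch: the names `sample_space`, `event_six`, `event_even` and `probability` are illustrative, and the probabilities assume that "fair" means all six outcomes are equally likely.

```python
from fractions import Fraction

# Sample space for one roll of a die: the set of all possible outcomes.
sample_space = {1, 2, 3, 4, 5, 6}

# Events are collections (subsets) of outcomes.
event_six = {6}                                        # the number 6 occurs
event_even = {x for x in sample_space if x % 2 == 0}   # an even number occurs


def probability(event, space):
    """P(event), assuming every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))


print(probability(event_six, sample_space))    # 1/6
print(probability(event_even, sample_space))   # 1/2

# Condition 1: every probability lies between 0 and 1.
# Condition 2: the outcome probabilities sum to 1.
outcome_probs = [probability({x}, sample_space) for x in sample_space]
print(all(0 <= p <= 1 for p in outcome_probs), sum(outcome_probs) == 1)
```

Exact fractions are used instead of floats so that the check "sums to 1" is not affected by rounding.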

Assigning Probabilities to
Experimental Outcomes
 Assigning probability is the process of
assigning a numerical value to the
likelihood of an event.
 Three methods:
 Long-run relative frequency
 Classical Method
 Subjective

Long-Run Relative Frequency Method
 Example - Loan Application: A bank is considering a loan application from a company.
 In the past, this company had received 20 loans of similar size and paid back on time 19 times, late once.
 If the bank offers the loan, how likely do you think the company will pay back on time? 19/20 = 0.95 = 95%
 Let E be an outcome of an experiment. If the experiment is performed many times, P(E) is defined as the relative frequency of E.

 Law of Large Numbers (LLN): The
relative frequency of an outcome
converges to the probability of the
outcome, as the experiment is repeated
over and over.
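
A quick simulation can make the Law of Large Numbers visible. The sketch below is an illustration only: it assumes a fair six-sided die, uses Python's pseudo-random generator with a fixed seed, and the function name `relative_frequency` is made up for this example.

```python
import random


def relative_frequency(n_rolls, target=6, seed=0):
    """Fraction of `n_rolls` simulated fair-die rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls


# The relative frequency should settle near 1/6 (about 0.167) as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few rolls the relative frequency can be far from 1/6; with many rolls it tends to stay close to it.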

Classical Method
 All the experimental outcomes are equally likely to occur
 Example: the probability of getting a head from tossing a “fair” coin is ½.
 Exercise: Pew Research reports that of 10,190 randomly generated working phone numbers, the initial results of the calls were as follows:
the probability of a contact is
1) Classical method: 1/6
2) Relative frequency method: 7400 / 10,190 = 0.7262
Subjective Probability
 Using experience, intuitive judgment, or expertise to assess a probability
 Subjective probability may differ from person to person
 A media development team assigns a 60% probability of success to its new ad campaign.
 The chief media officer of the company is less optimistic and assigns a 40% probability of success to the same campaign
Don’t Lie with Statistics
 A psychiatrist once reported that practically everybody is neurotic.

In-class Exercise: Big & Small
 With 3 fair dice, a player can bet on
 “Any Triple” (win if all three dice roll the same)
 “Big” (win if the total score is between 11 and 17 (inclusive), with the exception of a triple)
 “Small” (win if the total score is between 3 and 10 (inclusive), with the exception of a triple)
 ..
 Pick a bet and calculate your chance of winning.
Union and Intersection of Events
 The union of A and B consists of outcomes that belong to either A or B or both
 Written as A ∪ B (A or B)
 The intersection of A and B consists of outcomes that belong to both A and B
 Written as A ∩ B (A and B)

Venn diagrams are graphs for depicting the relationships among events
Mutually Exclusive Events
 Events are mutually exclusive (or disjoint) if the occurrence of any one event means that none of the others can occur at the same time.
 P(A∩B) = 0
 Example: Head vs Tail
 Example: Delayed vs On Time

Collectively Exhaustive Events
 Events are collectively exhaustive if at least one of the events must occur when an experiment is conducted.
 Example: a car repair is either covered by the warranty (A) or not (B).
 Exercise: Are the above event A and event B mutually exclusive?
 Events are said to be collectively exhaustive if the probability of their union is 1, e.g. P(A ∪ B) = 1 for two events A and B.
 Example: a HK PARKnSHOP supermarket customer can pay by credit card (A), EPS (B), cash (C), e-commerce platforms and payment service (D), or Octopus Card (E).
Application of Mutually Exclusive and
Collectively Exhaustive to Market Sizing
 Companies need to size the market for a product or service as part of deciding whether entering the market will be a worthwhile investment
 Example: Ways to break down the HK cell phone market

MECE Break Down
By customer’s age: 0-18, 19-22, 23-34, 35+
By customer’s sex: female, male
By type of customer: Consumer, Business or Government
Correctly estimate the population in each bucket and the size of the revenue from a typical customer in each bucket

Non-MECE Break Down
Age 35+
Female
Business
Overlapping groups, double-count or triple-count customers and therefore overestimate the size of the market.
Independent Events
 Events are independent if the occurrence of one event does not affect the occurrence of another.
 P(A and B) = P(A)P(B)
 Example:
 The decisions of two managers are independent if the choices made by one don’t influence the choices made by the other.
 Exercise: Which of the following events are independent?
 Winning the lottery and running out of milk.
 Robbing a bank and going to jail.
Contingency Table
 A table that shows counts of cases on one
categorical variable contingent on the
value of another (for every combination of
both variables)
 Example – Amazon (Continued)

Contingency Table (Counts) for Amazon.com

Converting Counts to Probabilities
 The following contingency table summarizes n = 17,619 visits to amazon.com during the fall season.
Contingency Table (Counts) for Amazon.com

 Q1: What’s the probability that the next visitor to Amazon from one of these hosts will make a purchase?
 Q2: What’s the probability that the next visitor to Amazon from one of these hosts will be from MSN and make a purchase?
 Q3: What’s the probability that the next visitor to Amazon from MSN will make a purchase?
Converting Counts to Probabilities
Contingency Table (Counts) for Amazon.com
 To answer these questions, we can use the long-run relative frequency method
 Assume the next visitor to Amazon.com behaves like a random choice from the 17,619 cases in the contingency table
 Divide each count by 17,619 to get fractions (probabilities)
Probabilities for Amazon.com

Marginal Probability
 Marginal probability is the probability of
observing an outcome with a single attribute,
regardless of its other attributes
 Displayed in the margins of a contingency table
 For Amazon.com, there are 5 marginal probabilities,
e.g., Answer to Q1: What’s the probability that the next visitor to
Amazon from one of these hosts will make a purchase?
P(YES) = 0.016 + 0.000 + 0.013 = 0.029

Joint Probability
 A joint probability measures the likelihood that
two or more events will happen concurrently.
 Displayed in cells of a contingency table
 For Amazon.com, there are 6 joint probabilities;
e.g., Answer to Q2: What’s the probability that the next visitor to
Amazon from one of these hosts will be from MSN and make a purchase?
P(MSN and Yes) = 0.016

Conditional Probability
 Conditional Probability: The probability of some event A, given that the event B has occurred.
 Conditional probability is denoted P(A|B), and is read "the probability of A, given B".
 Further, P(A|B) = P(A∩B) / P(B)
 P(B) ≠ 0
 For Amazon.com, Answer to Q3: What’s the probability that the next visitor to Amazon from MSN will make a purchase?
P(Yes | MSN)
= P(Yes and MSN) / P(MSN)
= 0.016 / 0.412
= 0.039
Exercise - Venture capital investment
 The following table contains information about the size of investments required by 300 applicants for venture capital from three different sectors:

                 Size of Investment
Sector        Small   Medium   Large   Total
Electronic       20       20      10      50
Biotech          10       90      50     150
Internet          0       10      90     100
Total            30      120     150     300

 What is the probability an application is for a large investment given that it is from the electronic sector?
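
One way to work this exercise is to turn the counts into joint and marginal probabilities and then apply P(A|B) = P(A∩B) / P(B). The sketch below uses the counts from the table; the dictionary layout and function names are illustrative choices, not part of the exercise.

```python
# Counts from the venture capital table (rows: sector, columns: size).
counts = {
    "Electronic": {"Small": 20, "Medium": 20, "Large": 10},
    "Biotech": {"Small": 10, "Medium": 90, "Large": 50},
    "Internet": {"Small": 0, "Medium": 10, "Large": 90},
}
total = sum(sum(row.values()) for row in counts.values())  # 300 applicants


def joint(sector, size):
    """Joint probability P(sector and size): one cell divided by the total."""
    return counts[sector][size] / total


def marginal_sector(sector):
    """Marginal probability P(sector): one row total divided by the total."""
    return sum(counts[sector].values()) / total


def conditional_size_given_sector(size, sector):
    """Conditional probability P(size | sector) = P(sector and size) / P(sector)."""
    return joint(sector, size) / marginal_sector(sector)


p_large_given_electronic = conditional_size_given_sector("Large", "Electronic")
print(round(p_large_given_electronic, 3))  # 0.2
```

The same answer comes directly from the Electronic row: 10 large applications out of 50.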

Basic Probability Rules
 1. P(S) = 1, where S is the sample space
 2. Complement rule: P(A) = 1 – P(Ac)
 Some denote Ac as Ā
 Example: The Wall Street Journal reports that about 33% of all new small businesses fail within the first 2 years. The probability that a new small business will survive is:
P(Survival) = 1 – P(Failure) = 1 – 0.33 = 0.67 or 67%.
Note: for more probability rules, please refer to the Appendix.
Take away from this topic
 Basic probability concept
 Experiment, outcome, event, sample space
 Frameworks for assigning probability
 Long-run relative frequency
 In the long run
 Classical Method
 For equally likely outcomes
 Subjective
 Assessment based on experience, expertise or intuition
 Relationship between events
 Union, intersection
 Independent, mutually exclusive, collectively exhaustive
 Contingency table
That’s all, folks!
Adios~

Appendix

Example
 A fund manager who runs several unit trusts successfully in the UK is considering selling funds in France in order to expand his market. He could achieve a market presence in France:
 by buying an existing French fund manager
 by recruiting a local sales force
 by sending a UK salesman over to France
 by appointing a French company to act as an agent
 There is uncertainty in each of these options
 The best the fund manager can do is estimate the probabilities of the possible outcomes and use these to assist in his decision making.

 Exercise: In the game Big and Small, are the events “Big”, “Small” and “Any Triple”
 Mutually exclusive?
 Collectively exhaustive? (Hint: Sample space for the total score of rolling 3 dice: 3, 4, 5, …, 17, 18)

Answer: Yes to both. When 3 dice are rolled, the total score can only be
3 (all 1s, in Any Triple),
4-10 (Small, or all 2s and all 3s in Any Triple),
11-17 (Big, or all 4s and all 5s in Any Triple),
18 (all 6s, in Any Triple).
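
The answer can be checked by brute force, and the same enumeration gives the chance of winning each bet. The sketch below lists all 216 equally likely rolls of three fair dice and uses the bet definitions from the Big & Small exercise; the function name `classify` and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product


def classify(roll):
    """Return the set of bets (among the three) that a roll of 3 dice wins."""
    total = sum(roll)
    is_triple = len(set(roll)) == 1
    wins = set()
    if is_triple:
        wins.add("Any Triple")
    if 11 <= total <= 17 and not is_triple:
        wins.add("Big")
    if 3 <= total <= 10 and not is_triple:
        wins.add("Small")
    return wins


rolls = list(product(range(1, 7), repeat=3))  # 6 * 6 * 6 = 216 outcomes
results = [classify(r) for r in rolls]

# Mutually exclusive: no roll wins more than one of the three bets.
mutually_exclusive = all(len(w) <= 1 for w in results)
# Collectively exhaustive: every roll wins at least one of the three bets.
collectively_exhaustive = all(len(w) >= 1 for w in results)

# Classical method: favourable outcomes divided by all outcomes.
prob = {
    bet: Fraction(sum(bet in w for w in results), len(rolls))
    for bet in ("Any Triple", "Big", "Small")
}
print(mutually_exclusive, collectively_exhaustive)
print(prob)
```

Under these definitions the counts are 6 rolls for Any Triple and 105 each for Big and Small, so the three probabilities add up to 216/216 = 1.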
Basic Probability Rules
 1. P(S) = 1
 2. Complement rule: P(A) = 1 – P(Ac)
 Some denote Ac as Ā
 Example: The Wall Street Journal reports that about 33% of all new small businesses fail within the first 2 years. The probability that a new small business will survive is:
P(Survival) = 1 – P(Failure) = 1 – 0.33 = 0.67 or 67%.
 3. Addition Rule:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
 Geometrically, the area of A ∩ B (the shaded brown part) is counted twice if we add up the areas of A and B, so we need to subtract it from the sum
Basic Probability Rules (Continued)

 4. Law of total probability:
P(A) = P(A ∩ B) + P(A ∩ Bc)
 5. Multiplication Rule:
P(A∩B) = P(A) • P(B|A) = P(B) • P(A|B)
Special Rule of Independence:
If A and B are independent events, then P(A∩B) = P(A) • P(B)
An Application of the Special Rule of
Independence: Customer Service
 A major producer and marketer of consumer products assessed the service it provides by surveying the attitudes of its customers regarding 10 different aspects of customer service: order filled correctly, billing amount on invoice correct, delivery made on time, and so forth.
 Survey results showed that only 59 percent of the survey participants indicated that they were satisfied with all 10 aspects of the company’s service.
 Upon investigation, each of the 10 departments responsible for the aspects of service considered in the study insisted that it satisfied its customers 95 percent of the time.
 Company executives were confused and felt that there was a substantial discrepancy between the survey results and the claims of the departments providing the services.
 However, a company statistician pointed out that there was no discrepancy.
 Why?
 Consider randomly selecting a customer from among the survey participants, and define 10 events (corresponding to the 10 aspects of service studied):
A1 = the customer is satisfied with aspect 1.
A2 = the customer is satisfied with aspect 2.
...
A10 = the customer is satisfied with aspect 10.
Also, define the event
S = the customer is satisfied with all 10 aspects of customer service.
 Because 10 different departments are responsible for the 10 aspects of service being studied, it is reasonable to assume that all 10 aspects of service are independent of each other.
 If, as the departments claim, each department satisfies its customers 95 percent of the time, then the probability that the customer is satisfied with all 10 aspects is
P(S) = P(A1 ∩ A2 ∩ … ∩ A10) = P(A1) P(A2) … P(A10) = (0.95)^10 ≈ 0.5987
 This result is almost identical to the 59 percent satisfaction rate reported by the survey participants.
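
The arithmetic behind the statistician's point is a single product. The short sketch below assumes, as the slide does, that the 10 aspects are independent and that each has a 95 percent satisfaction rate; the variable names are illustrative.

```python
p_each = 0.95    # each department's claimed satisfaction rate
n_aspects = 10   # number of service aspects surveyed

# Special rule of independence: the probability that all 10 events occur
# is the product of the 10 individual probabilities.
p_all = p_each ** n_aspects
print(round(p_all, 4))  # 0.5987
```

If the aspects were not independent, this product would no longer be the right calculation.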
Exercise
Some of the managers at a company have an MBA degree. Of managers at the level of director or higher, 60% have an MBA. Among other managers of lower rank, 35% have an MBA. For this company, 15% of managers have a position at the level of director or higher. T/F?
Given: P(MBA | ‘>=director’) = 0.6, P(MBA | ‘<director’) = 0.35, P(‘>=director’) = 0.15
 1. 85% of managers have lower rank.
P(‘<director’) [Hint: Complement rule] = 1 – 0.15 = 0.85
 2. The chance that a manager has an MBA and is at the level of director or higher is 9%.
P(MBA ∩ ‘>=director’) [Hint: Multiplication Rule]
= P(MBA | ‘>=director’) P(‘>=director’) = 0.6 × 0.15 = 0.09
 3. 39% of the managers have an MBA.
P(MBA) [Hint: Law of total probability]
= P(MBA ∩ ‘>=director’) + P(MBA ∩ ‘<director’) = 0.09 + P(MBA | ‘<director’) P(‘<director’) = 0.09 + 0.35 × 0.85 = 0.3875
 4. The chance that a manager has an MBA or is at the level of director or higher is 45%.
P(MBA ∪ ‘>=director’) [Hint: Addition Rule]
= P(MBA) + P(‘>=director’) – P(MBA ∩ ‘>=director’) = 0.3875 + 0.15 – 0.09 = 0.4475
 5. If you meet an MBA from this firm, there is a 23% chance that this person is a director (or higher).
P(‘>=director’ | MBA) [Hint: definition of conditional probability]
= P(MBA ∩ ‘>=director’) / P(MBA) = 0.09/0.3875 = 0.23
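
The five statements chain the rules together, so they can be reproduced step by step. The sketch below starts from the three percentages given in the exercise; the variable names are illustrative.

```python
# Given in the exercise.
p_mba_given_dir = 0.60     # P(MBA | '>=director')
p_mba_given_lower = 0.35   # P(MBA | '<director')
p_dir = 0.15               # P('>=director')

# 1. Complement rule.
p_lower = 1 - p_dir
# 2. Multiplication rule.
p_mba_and_dir = p_mba_given_dir * p_dir
# 3. Law of total probability.
p_mba = p_mba_and_dir + p_mba_given_lower * p_lower
# 4. Addition rule.
p_mba_or_dir = p_mba + p_dir - p_mba_and_dir
# 5. Definition of conditional probability.
p_dir_given_mba = p_mba_and_dir / p_mba

for name, value in [
    ("P('<director')", p_lower),
    ("P(MBA and '>=director')", p_mba_and_dir),
    ("P(MBA)", p_mba),
    ("P(MBA or '>=director')", p_mba_or_dir),
    ("P('>=director' | MBA)", p_dir_given_mba),
]:
    print(name, round(value, 4))
```

Statements 3, 4 and 5 are true only after rounding (0.3875, 0.4475 and about 0.2323).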
Probability Tree (Tree Diagram)
 Tree diagrams may represent a series of independent events or conditional probabilities
 The probability of each branch is written on the branch
 The outcome is written at the end of the branch
 Example: tree diagram for the toss of a coin: there are two "branches" (Heads and Tails)

Example – Supply of Bicycle Parts
 A bicycle company gets 60% of its supply of a particular part from manufacturer A and the remainder from manufacturer Z.
 The quality of the parts delivered is given below:

Manufacturer   % Good Parts   % Bad Parts
A                        97             3
Z                        93             7

 Probability Tree

Manufacturer    Good or bad part       Outcome   Probability
A, P(A)=0.6     Good, P(G|A)=0.97      1         P(A∩G)=P(A)*P(G|A)=0.582
A, P(A)=0.6     Bad, P(B|A)=0.03       2         P(A∩B)=P(A)*P(B|A)=0.018
Z, P(Z)=0.4     Good, P(G|Z)=0.93      3         P(Z∩G)=P(Z)*P(G|Z)=0.372
Z, P(Z)=0.4     Bad, P(B|Z)=0.07       4         P(Z∩B)=P(Z)*P(B|Z)=0.028

 What is the probability of receiving a bad part?
P(B)=P(A∩B)+P(Z∩B)= 0.018 + 0.028 = 0.046
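
The tree can also be written as a small computation: multiply along each branch (multiplication rule), then add the branches that end in a bad part (law of total probability). The sketch below uses the figures from the example; the dictionary names are illustrative.

```python
# First-level branches: which manufacturer supplied the part.
p_manufacturer = {"A": 0.6, "Z": 0.4}
# Second-level branches: quality given the manufacturer.
p_quality_given = {
    "A": {"Good": 0.97, "Bad": 0.03},
    "Z": {"Good": 0.93, "Bad": 0.07},
}

# Multiply along each path of the tree to get the joint probabilities.
joint = {
    (m, q): p_manufacturer[m] * p_quality_given[m][q]
    for m in p_manufacturer
    for q in ("Good", "Bad")
}

# Add up every path that ends in a bad part.
p_bad = sum(p for (m, q), p in joint.items() if q == "Bad")

for path, p in joint.items():
    print(path, round(p, 3))
print("P(Bad) =", round(p_bad, 3))  # 0.046
```

The four end-branch probabilities sum to 1, which is a useful check on any tree diagram.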
Exercise - Goat vs Car
 In the TV game Goat vs Car, one of three closed doors hides a car and the other two hide goats. The player gets the prize (a goat or a car) behind the closed door he picks.

Door I   Door II   Door III
Goat     Goat      Car

 The game proceeds as follows:
a. Player picks one door, say door II
b. Game host shows a goat behind another door, say door I
c. There are two closed doors left: doors II & III, and the player is asked if he would like to switch his choice (i.e., door II) to the other closed door.
 Question: Is it better to switch (S) or not to switch (NS)?


 Tree diagram: outcomes represented by end branches

Solution: Compare the frequency of winning (W) if switch, and the frequency of winning (W) if not switch.
 Tree diagram for TV game:

Door selected originally   Action   Outcome
Goat                       S        Car (W)
Goat                       NS       Goat (L)
Goat                       S        Car (W)
Goat                       NS       Goat (L)
Car                        S        Goat (L)
Car                        NS       Car (W)

P(W|S) = proportion of winning cases among switch cases
= (# of times S & W)/(# of times S)
= 2/3
P(W|NS) = proportion of winning cases among non-switch cases
= (# of times NS & W)/(# of times NS)
= 1/3
Simulation: http://www.curiouser.co.uk/monty/montygame.htm
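
The two frequencies can also be estimated by simulation. The sketch below is one possible setup, not the linked simulator: it assumes the car and the first pick are both uniformly random, and that the host opens a goat door at random when two are available. The function names `play` and `win_rate` are illustrative.

```python
import random


def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, n_games=100_000, seed=0):
    """Relative frequency of winning over `n_games` simulated rounds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(play(switch, rng) for _ in range(n_games)) / n_games


print("switch:   ", win_rate(True))    # should be close to 2/3
print("no switch:", win_rate(False))   # should be close to 1/3
```

This is the long-run relative frequency method applied to the game: the estimates should settle near 2/3 and 1/3 as the number of rounds grows.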
Example – Filtering junk mail
 Past data indicates the following probabilities:
P(Nigerian general | Junk mail) = 0.20
P(Nigerian general | Not Junk mail) = 0.001
P(Junk mail) = 0.50
 Is there a way to help workers filter out junk mail from important email messages if the term “Nigerian general” appears?
Bayes Rule (Bayes Theorem)
 Bayes Rule (Bayes Theorem):
P(B|A) = P(A|B) P(B) / P(A)
= P(A|B) P(B) / [P(A|B) P(B) + P(A|Bc) P(Bc)]

Example – Filtering junk mail (Continued)
P(Nigerian general | Junk mail) = 0.20
P(Nigerian general | Not Junk mail) = 0.001
P(Junk mail) = 0.50
 Method - Bayes Rule:
P(Junk mail | Nigerian general)
= P(Nigerian general | Junk mail) P(Junk mail)
/ [P(Nigerian general | Junk mail) P(Junk mail) +
P(Nigerian general | Not junk mail) P(Not junk mail)]
= 0.20×0.5 / (0.2×0.5 + 0.001×0.5) ≈ 0.995
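
The same calculation can be wrapped in a small helper so that other priors or word frequencies can be tried. This is a sketch: the function name `bayes_posterior` and its parameter names are illustrative, and it covers only the two-case form of Bayes Rule (B and its complement).

```python
def bayes_posterior(p_a_given_b, p_a_given_not_b, p_b):
    """Return P(B|A) from P(A|B), P(A|Bc) and the prior P(B)."""
    numerator = p_a_given_b * p_b
    # Law of total probability: P(A) = P(A|B) P(B) + P(A|Bc) P(Bc).
    p_a = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / p_a


# A = the term "Nigerian general" appears, B = the message is junk mail.
p_junk_given_term = bayes_posterior(
    p_a_given_b=0.20, p_a_given_not_b=0.001, p_b=0.50
)
print(round(p_junk_given_term, 3))  # 0.995
```

A message containing the term is therefore very likely to be junk, even though only 20% of junk messages contain it.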
Exercise – HIV Testing
 A test is 98% effective at detecting HIV (test positive)
 However, the test has a “false positive” rate of 1%
 0.5% of the US population has HIV.
 What is the chance you have HIV when your test result is positive? P(D|T) = ?
Hint: let
T = test result is positive for HIV with this test
D = actually have HIV
“false positive”: the test result is positive, but the person does not actually have HIV.
“false positive” rate: the proportion of people who test positive among the people who do not have HIV
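
One way to work the exercise is to apply Bayes Rule with the hint's notation. The sketch below reads "98% effective at detecting HIV" as P(T|D) = 0.98, which is an interpretation of the wording, and the variable names are illustrative.

```python
p_t_given_d = 0.98       # P(T|D): test is positive given HIV (assumed reading)
p_t_given_not_d = 0.01   # P(T|Dc): the "false positive" rate
p_d = 0.005              # P(D): share of the population with HIV

# Law of total probability: overall chance of a positive test.
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Bayes Rule: chance of having HIV given a positive test.
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_t, 5))          # 0.01485
print(round(p_d_given_t, 3))  # 0.33
```

Under these assumptions only about one positive result in three comes from someone who has HIV, because people without HIV are so much more numerous.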
 Basic probability rules (Summary)
 P(S)=1
 Complement rule: P(A) = 1 – P(Ac)
 Addition Rule:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
 Multiplication Rule:
P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B)
 Law of total probability:
P(A) = P(A ∩ B) + P(A ∩ Bc)
 Bayes Rule (Bayes Theorem):
P(B|A) = P(A|B) P(B) / P(A)
= P(A|B) P(B) / [P(A|B) P(B) + P(A|Bc) P(Bc)]
