
Probability

Probability
Businesses want to be able to quantify the
uncertainty of future events.
For example, what are the chances that next
month’s revenue will exceed last year’s average?

How can we increase the chance of positive future
events and decrease the chance of negative future
events?
The study of probability helps us understand and
quantify the uncertainty surrounding the future.
Probability
 Definitions
The probability of an event is a number that
measures the relative likelihood that the event will
occur.
The probability of event A [denoted P(A)], must lie
within the interval from 0 to 1:
0 ≤ P(A) ≤ 1

If P(A) = 0, then the event cannot occur.
If P(A) = 1, then the event is certain to occur.
Probability
 Definitions
In a discrete sample space, the probabilities of all
simple events must sum to unity:
P(S) = P(E1) + P(E2) + … + P(En) = 1

For example, if the following percentages of purchases
were made by each payment method:
credit card: 32%    P(credit card) = .32
debit card: 20%     P(debit card) = .20
cash: 35%           P(cash) = .35
check: 13%          P(check) = .13
Sum = 100%          Sum = 1.0
Random Experiments
 Sample Space
A random experiment is an observational process
whose results cannot be known in advance.
The set of all outcomes (S) is the sample space for
the experiment.
A sample space with a countable number of
outcomes is discrete; one with an uncountable
number of outcomes is continuous.
Random Experiments
 Sample Space
For example, when CitiBank makes a consumer
loan, the sample space is:
S = {default, no default}

The sample space describing a Life Style
customer’s payment method is:
S = {cash, debit card, credit card, check, other}
Random Experiments
 Sample Space
For a single roll of a die, the sample space is:
S = {1, 2, 3, 4, 5, 6}
When two dice are rolled, the sample space is
the following pairs:
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
     (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
     (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
     (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
     (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
     (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

Random Experiments
 Sample Space
Consider the sample space to describe a randomly
chosen United Airlines employee by
2 genders,
21 job classifications,
6 home bases (major hubs) and
4 education levels
There are 2 x 21 x 6 x 4 = 1008 possible outcomes.
It would be impractical to enumerate this sample
space.
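The count above follows from the multiplication rule for combining categories. As a minimal sketch (the category labels are placeholders, since the slide only gives the number of levels in each category), the outcomes can be generated and counted by machine even though listing them by hand is impractical:

```python
from itertools import product

# Number of levels per category, taken from the slide.
# The integer labels are placeholders; the actual category names are not given.
genders = range(2)
job_classifications = range(21)
home_bases = range(6)
education_levels = range(4)

# Each outcome is one (gender, job, base, education) combination.
sample_space = list(product(genders, job_classifications,
                            home_bases, education_levels))
n_outcomes = len(sample_space)
print(n_outcomes)  # 1008
```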
Random Experiments
 Sample Space
If the outcome is a continuous measurement, the
sample space can be described by a rule.
For example, the sample space for the length of a
randomly chosen cell phone call would be
S = {all X such that X > 0}

or written as S = {X | X > 0}

The sample space to describe a randomly chosen
student’s CGPA would be
S = {X | 0.00 ≤ X ≤ 10.00}
Random Experiments
 Events
An event is any subset of outcomes in the sample
space.
A simple event or elementary event, is a single
outcome.
A discrete sample space S consists of all the
simple events (Ei):
S = {E1, E2, …, En}
Random Experiments
 Events
Consider the random experiment of tossing a
balanced coin.
What is the sample space?
S = {H, T}

What are the chances of observing an H or a T?
These two elementary events are equally likely.
When you buy a lottery ticket, the sample space
S = {win, lose} has only two events.
Are these two events equally likely to occur?
Random Experiments
 Events
A compound event consists of two or more simple
events.
For example, in a sample space of 6 simple
events, we could define the compound events
A = {E1, E2}
B = {E3, E5, E6}

These are
displayed in a
Venn diagram:
Random Experiments
 Events
Many different compound events could be defined.
For example, the compound event
A = “rolling a seven” on a roll of two
dice consists of 6 simple events:

A = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}


Random Experiments
 Mutually Exclusive outcomes
Events A and B are mutually exclusive (or disjoint)
if their intersection is the null set (∅) that contains
no elements.

If A ∩ B = ∅, then P(A ∩ B) = 0
Random Experiments
 Collectively Exhaustive outcomes
Events are collectively exhaustive if their union is
the entire sample space S.
Two mutually exclusive, collectively exhaustive
events are dichotomous (or binary) events.

For example, a car repair is either
covered by the warranty (A) or not
(B).

P(S) = P(A) + P(B) = 1
[Diagram: sample space split into “Warranty” and “No Warranty”]
Random Experiments
 Collectively Exhaustive outcomes
More than two mutually exclusive, collectively
exhaustive events are polytomous events.
For example, a Wal-Mart customer can pay by credit card (A), debit card
(B), cash (C) or check (D).

P(S) = P(A) + P(B) + P(C) + P(D) = 1
[Diagram: sample space split into “Credit Card”, “Debit Card”, “Cash” and “Check”]
ABL: Exercises 4.1
a) A customer at Cross-word can use Visa (V), Master (M) or American
Express (A) and may buy books (B), electronic item (E) or others (O).
Enumerate the elementary events in the sample space describing a
customer's purchase. Would each elementary event be equally likely?
Explain.
b) A die is thrown (1,2,3,4,5,6) and a coin is tossed (H,T) simultaneously.
Enumerate all compound events at each trial. Are they equally likely?
Explain.
c) A survey reports that CEO salaries in Indian Corporates range
anywhere from 45 to 400 lakhs of rupees. Describe the sample space.
Approaches for assigning
Probability
Three approaches to probability:

Approach      Example
Empirical     There is a 2 percent chance of twins in a randomly chosen birth.
Classical     There is a 50% probability of heads on a coin flip.
Subjective    There is a 15% chance that England will join the EU by 2020.
Approaches to Probability
 Empirical Approach
Use the empirical or relative frequency approach to
assign probabilities by counting the frequency (fi) of
observed outcomes defined on the experimental
sample space.

For example, to estimate the default rate on
student loans:

P(a student defaults) = f/n = (number of defaults) / (number of loans)
Approaches to Probability
 Empirical Approach

Necessary when there is no prior knowledge of
events.

As the number of observations (n), or the
number of times the experiment is performed, increases, the
estimate will become more accurate.
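A small simulation can illustrate why more observations help. This is only a sketch: the "true" probability of 0.02 (borrowed from the twins example), the seed and the sample sizes are all assumptions chosen for illustration, and in practice the true value is unknown.

```python
import random

def relative_frequency(n, p_true=0.02, seed=1):
    """Estimate P(event) as f/n from n simulated trials.

    p_true is an assumed probability used only to generate the data.
    """
    rng = random.Random(seed)
    f = sum(rng.random() < p_true for _ in range(n))  # frequency of the event
    return f / n

# The estimate f/n tends to settle near p_true as n grows.
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n))
```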
Approaches to Probability
 Classical Approach
In this approach, we envision the entire sample
space as a collection of equally likely outcomes.
Instead of performing the experiment, we can use
deduction to determine P(A).
a priori refers to the process of assigning
probabilities before the event is observed.
a priori probabilities are based on logic, not
experience.
Approaches to Probability
 Classical Approach
For example, the two dice experiment has 36
equally likely simple events. P(7) is
P(A) = (number of outcomes with 7 dots) / (number of outcomes in sample space) = 6/36 = 0.1667
The probability is obtained a priori using the classical
approach as shown in this Venn diagram for 2 dice:
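The classical calculation can be checked by enumerating the sample space directly. A minimal sketch, assuming two fair six-sided dice so that all 36 ordered pairs are equally likely:

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

# Compound event A: the two dice sum to seven.
event_seven = [(a, b) for a, b in sample_space if a + b == 7]

# Classical probability: favourable outcomes / total outcomes.
p_seven = Fraction(len(event_seven), len(sample_space))
print(len(sample_space), len(event_seven), p_seven, round(float(p_seven), 4))
```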
Approaches to Probability
 Subjective Approach
A subjective probability reflects someone’s
personal belief about the likelihood of an event.
Used when there is no repeatable random
experiment.
 For example,
- What is the probability that a new
product method will show a return on
investment of at least 10 percent?
- What is the probability that the price of RIL
stock will rise within the next 7 days?
Approaches to Probability
 Subjective Approach
These probabilities rely on personal judgment or
expert opinion.

Judgment is based on experience with similar
events and knowledge of the underlying causal
processes.
ABL: Exercises 4.2
What kind of probability estimation are these? How would they have
been derived?
a) There is a 20% chance that a new stock offered in an IPO will reach
or exceed its target price on the first day.
b) There is a 50% chance that Flipkart and Myntra will merge.
c) Commercial rocket launches have a 95% success rate.
d) The probability of rolling three sevens in a row with dice is .0046.
Rules of Probability
 Complement of an Event
The complement of an event A is denoted by
A′ and consists of everything in the sample space
S except event A.
Rules of Probability
 Complement of an Event
Since A and A′ together comprise the entire
sample space,
P(A) + P(A′ ) = 1
The probability of A′ is found by
P(A′ ) = 1 – P(A)
For example, The Wall Street Journal reports that
about 33% of all new small businesses fail within
the first 2 years. The probability that a new small
business will survive is:
P(survival) = 1 – P(failure) = 1 – .33 = .67 or 67%
Rules of Probability
 Union of Two Events
The union of two events consists of all outcomes in
the sample space S that are contained either in
event A or in event B or both
(denoted A ∪ B or “A or B”).

∪ may be read as “or” since one or the other
or both events may occur.
Rules of Probability
 Union of Two Events
For example, randomly choose a card from a deck
of 52 playing cards.
If Q is the event that we draw a
queen and R is the event that we
draw a red card, what is Q ∪ R?
It is the possibility of drawing
either a queen (4 ways)
or a red card (26 ways)
or both (2 ways).
Rules of Probability
 Intersection of Two Events
The intersection of two events A and B
(denoted A ∩ B or “A and B”) is the event
consisting of all outcomes in the sample space S
that are contained in both event A and event B.
∩ may be read as “and” since both
events occur. This is a joint probability.
Rules of Probability
 Intersection of Two Events
For example, randomly choose a card from a deck
of 52 playing cards.

If Q is the event that we draw a
queen and R is the event that we
draw a red card, what is
Q ∩ R?
It is the possibility of getting
both a queen and a red card
(2 ways).
Rules of Probability
 General Law of Addition
The general law of addition states that the
probability of the union of two events A and B is:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
When you add P(A) and P(B) together, you count
P(A and B) twice.
So, you have to subtract P(A ∩ B) to avoid
overstating the probability.
[Venn diagram: overlapping events A and B, with the overlap labeled “A and B”]
Rules of Probability
 General Law of Addition
For the card example:
P(Q) = 4/52 (4 queens in a deck)
P(R) = 26/52 (26 red cards in a deck)
P(Q ∩ R) = 2/52 (2 red queens in a deck)
P(Q ∪ R) = P(Q) + P(R) – P(Q ∩ R)
= 4/52 + 26/52 – 2/52
= 28/52 = .5385 or 53.85%
[Venn diagram: Q (4/52) and R (26/52) overlapping in “Q and R” (2/52)]
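The card example can be verified by building the deck and counting. A minimal sketch, assuming a standard 52-card deck with hearts and diamonds as the red suits; the helper name `prob` is introduced here for illustration:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

Q = {card for card in deck if card[0] == "Q"}                     # queens
R = {card for card in deck if card[1] in ("hearts", "diamonds")}  # red cards

def prob(event):
    """Classical probability of an event (a set of cards)."""
    return Fraction(len(event), len(deck))

# General law of addition versus counting the union directly.
p_union_by_law = prob(Q) + prob(R) - prob(Q & R)
p_union_by_count = prob(Q | R)
print(p_union_by_law, p_union_by_count, round(float(p_union_by_law), 4))
```

Both routes give 28/52, which confirms that the overlap of 2 red queens must be subtracted once.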
Rules of Probability
 Conditional Probability
The probability of event A given that event B has
occurred.
Denoted P(A | B).
The vertical line “ | ” is read as “given.”

P(A | B) = P(A ∩ B) / P(B), for P(B) > 0
Rules of Probability
 Conditional Probability
Consider the logic of this formula by looking at the
Venn diagram.
P(A | B) = P(A ∩ B) / P(B)
The sample space is restricted to B, an event
that has occurred.
A ∩ B is the part of B that is also in A.
The ratio of the relative size of A ∩ B to B is
P(A | B).
Rules of Probability
 Example: High School Dropouts
Of the population aged 16 – 21 and not in college:
Unemployed 13.5%
High school dropouts 29.05%
Unemployed high school dropouts 5.32%

What is the conditional probability that a member
of this population is unemployed, given that the
person is a high school dropout?
Rules of Probability
 Example: High School Dropouts
First define
U = the event that the person is unemployed
D = the event that the person is a high school dropout

P(U) = .1350    P(D) = .2905    P(U ∩ D) = .0532

P(U | D) = P(U ∩ D) / P(D) = .0532 / .2905 = .1831 or 18.31%
P(U | D) = .1831 > P(U) = .1350
Therefore, being a high school dropout is related to
being unemployed.
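The same arithmetic can be wrapped in a small helper. A minimal sketch: the function name `conditional` is hypothetical, and the inputs are the three probabilities quoted in the example.

```python
def conditional(p_joint, p_given):
    """Return P(A | B) = P(A and B) / P(B); undefined when P(B) is 0."""
    if p_given <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_joint / p_given

p_u = 0.1350        # P(U): unemployed
p_d = 0.2905        # P(D): high school dropout
p_u_and_d = 0.0532  # P(U and D)

p_u_given_d = conditional(p_u_and_d, p_d)
print(round(p_u_given_d, 4))  # 0.1831
print(p_u_given_d > p_u)      # True: dropouts are more likely to be unemployed
```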
Independent Events
Event A is independent of event B if the conditional
probability P(A | B) is the same as the marginal
probability P(A).
To check for independence, apply this test:
If P(A | B) = P(A) then event A is independent of B.

Another way to check for independence:
If P(A ∩ B) = P(A)P(B) then event A is independent of event B
since

P(A | B) = P(A ∩ B) / P(B) = P(A)P(B) / P(B) = P(A)
Independent Events
 Example: Television Ads
Out of a target audience of 2,000,000, ad A
reaches 500,000 viewers, B reaches 300,000
viewers and both ads reach 100,000 viewers.
P(A) = 500,000 / 2,000,000 = .25    P(B) = 300,000 / 2,000,000 = .15
P(A ∩ B) = 100,000 / 2,000,000 = .05
What is P(A | B)?
P(A | B) = P(A ∩ B) / P(B) = .05 / .15 = .3333 or 33%
Independent Events
 Example: Television Ads
So, P(ad A) = .25
P(ad B) = .15
P(A ∩ B) = .05
P(A | B) = .3333
Are events A and B independent?
P(A | B) = .3333 ≠ P(A) = .25
P(A)P(B) = (.25)(.15) = .0375 ≠ P(A ∩ B) = .05
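Both independence tests can be run on the audience counts. A minimal sketch that uses exact fractions so the equality checks are not affected by floating-point rounding; the variable names are chosen here for illustration:

```python
from fractions import Fraction

audience = 2_000_000
p_a = Fraction(500_000, audience)   # P(A)  = 1/4
p_b = Fraction(300_000, audience)   # P(B)  = 3/20
p_ab = Fraction(100_000, audience)  # P(A and B) = 1/20

p_a_given_b = p_ab / p_b            # P(A | B) = 1/3

# Test 1: is P(A | B) equal to P(A)?
independent_by_conditional = (p_a_given_b == p_a)
# Test 2: is P(A and B) equal to P(A)P(B)?
independent_by_product = (p_ab == p_a * p_b)

print(p_a_given_b, p_a * p_b, independent_by_conditional, independent_by_product)
```

The two tests always agree when P(B) > 0, and here both report that the ads are not independent.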
Independent Events
 Dependent Events
When P(A) ≠ P(A | B), then events A and B are
dependent.
For dependent events, knowing that event B has
occurred will affect the probability that event A will
occur.
Statistical dependence does not prove causality.
For example, knowing a person’s age would affect
the probability that the individual uses text
messaging but causation would have to be proven
in other ways.
Independent Events
 Multiplication Law for Independent Events
The probability of n independent events occurring
simultaneously is:
P(A1 ∩ A2 ∩ ... ∩ An) = P(A1) P(A2) ... P(An)
if the events are independent
To illustrate system reliability, suppose a Web site
has 2 independent file servers. Each server has
99% reliability. What is the total system reliability?
Let,
F1 be the event that server 1 fails
F2 be the event that server 2 fails
Independent Events
 Multiplication Law for Independent Events
Applying the rule of independence:
P(F1 ∩ F2) = P(F1) P(F2) = (.01)(.01) = .0001

So, the probability that both servers are down is .0001.
The probability that at least one server is “up” is:
1 - .0001 = .9999 or 99.99%
Independent Events
 Example: Space Shuttle
Redundancy can increase system reliability even
when individual component reliability is low.
The NASA space shuttle has three independent flight
computers (triple redundancy).
Each has an unacceptable .03 chance of failure
(3 failures in 100 missions).

Let Fj = event that computer j fails.


Independent Events
 Example: Space Shuttle
What is the probability that all three flight
computers will fail?
P(all 3 fail) = P(F1 ∩ F2 ∩ F3)
= P(F1) P(F2) P(F3)   (presuming that failures are independent)
= (0.03)(0.03)(0.03)
= 0.000027
or 27 in 1,000,000 missions.
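The server and shuttle calculations share one pattern: multiply the individual failure probabilities, then take the complement. A minimal sketch, valid only under the independence assumption stated above; the function names are hypothetical:

```python
import math

def p_all_fail(failure_probs):
    """P(every component fails), assuming failures are independent."""
    return math.prod(failure_probs)

def p_at_least_one_works(failure_probs):
    """Complement rule: 1 - P(all components fail)."""
    return 1 - p_all_fail(failure_probs)

servers = [0.01, 0.01]          # two file servers, each 99% reliable
computers = [0.03, 0.03, 0.03]  # three flight computers

print(round(p_all_fail(servers), 6))            # 0.0001
print(round(p_at_least_one_works(servers), 6))  # 0.9999
print(round(p_all_fail(computers), 6))          # 2.7e-05
```

If failures share a common cause (for example, the same power supply), the product rule no longer applies and the true system reliability would be lower.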
ABL: Exercises 4.3
a) Given P(A) = .40, P(B) = .50 and P(A ∩ B) = .05. Find (a) P(A ∪ B), (b) P(A |
B) and (c) P(B | A). Sketch a Venn diagram.
b) Let S be the event that a randomly chosen male aged 18-24 is a smoker.
Let C be the event that a randomly chosen male is an Asian. Given P(S)
= .246, P(C) = .830, and P(S ∩ C) = .232, find each probability and express
the event in words. (a) P(S’), (b) P(S ∪ C), (c) P(S | C) and (d) P(S | C’).
c) Which pairs of events are independent?
1. P(A) = .60, P(B) = .40, P(A ∩ B) = .24
2. P(A) = .90, P(B) = .20, P(A ∩ B) = .18
3. P(A) = .50, P(B) = .70, P(A ∩ B) = .25
ABL: Exercises 4.3 Continued…
d) Rahul sets two alarm clocks to be sure he arises for his Monday
8:00 A.M. accounting exam. There is a 75% chance that either clock
will wake up Rahul. (a) What is the probability that Rahul will
oversleep? (b) If Rahul had three clocks, would he have a 99 percent
chance of waking up?
Contingency Tables
 What is a Contingency Table?

A contingency table is a cross-tabulation of
frequencies into rows and columns.

                       Variable 1
                       Col 1   Col 2   Col 3
Variable 2   Row 1
             Row 2             Cell
             Row 3
             Row 4

A contingency table is like a frequency distribution
for two variables.
Contingency Tables
 Example: Salary Gains and MBA Tuition
Consider the following cross-tabulation table for n
= 67 top-tier MBA programs:
Contingency Tables
 Example: Salary Gains and MBA Tuition
Are large salary gains more likely to accrue to
graduates of high-tuition MBA programs?
The frequencies indicate that MBA graduates of
high-tuition schools do tend to have large salary
gains.
Also, most of the top-tier schools charge high
tuition.
More precise interpretations of this data can be
made using the concepts of probability.
Contingency Tables
 Marginal Probabilities
The marginal probability of a single event is found
by dividing a row or column total by the total
sample size.
For example, find the marginal probability of a
medium salary gain (P(S2)).
P(S2) = 33/67 = .4925
Conclude that about 49% of salary gains at the
top-tier schools were between $50,000 and
$100,000 (medium gain).
Contingency Tables
 Marginal Probabilities
Find the marginal probability of a low tuition P(T1).

P(T1) = 16/67 = .2388

There is a 24% chance that a top-tier school’s
MBA tuition is under $40,000.
Contingency Tables
 Joint Probabilities
A joint probability represents the intersection of two
events in a cross-tabulation table.
Consider the joint event that the school has
low tuition and large salary gains
(denoted as P(T1 ∩ S3)).
Contingency Tables
 Joint Probabilities
So, using the cross-tabulation table,

P(T1 ∩ S3) = 1/67 = .0149

There is less than a 2% chance that a top-tier
school has both low tuition and large salary gains.
Contingency Tables
 Conditional Probabilities
Found by restricting ourselves to a single row or
column (the condition).
For example, knowing that a school’s MBA tuition
is high (T3), we would restrict ourselves to the third
row of the table.
Contingency Tables
 Conditional Probabilities
Find the probability that the salary gains are small
(S1) given that the MBA tuition is high (T3).
P(S1 | T3) = 5/32 = .1563

What does this mean?


Contingency Tables
 Independence
To check for independent events in a contingency
table, compare the conditional to the marginal
probabilities.
For example, if large salary gains (S3) were
independent of low tuition (T1), then
P(S3 | T1) = P(S3).
Conditional: P(S3 | T1) = 1/16 = .0625
Marginal: P(S3) = 17/67 = .2537
What do you conclude about events S3 and T1?
Contingency Tables
 Relative Frequencies
Calculate the relative frequencies below for each
cell of the cross-tabulation table to facilitate
probability calculations.

Symbolic notation for relative frequencies:


Contingency Tables
 Relative Frequencies
Here are the resulting probabilities (relative
frequencies). For example,
P(T1 and S1) = 5/67 P(T2 and S2) = 11/67 P(T3 and S3) = 15/67

P(S1) = 17/67 P(T2) = 19/67


Contingency Tables
 Relative Frequencies
The nine joint probabilities sum to 1.0000 since
these are all the possible intersections.
Summing across a row or down a column gives
marginal probabilities for the respective row or
column.
Contingency Tables
 Example: Payment Method and Purchase Quantity
A small grocery store would like to know if the
number of items purchased by a customer is
independent of the type of payment method the
customer chooses to use.
Why would this information be useful to the store
manager?
The manager collected a random sample of 368
customer transactions.
Contingency Tables
 Example: Payment Method and Purchase Quantity
Here is the contingency table of frequencies:
Contingency Tables
 Example: Payment Method and Purchase Quantity

Calculate the marginal probability that a customer
will use cash to make the payment.
Let C be the event cash.

P(C) = 126/368 = .3424

Now, is this probability the same if we condition on
number of items purchased?
Contingency Tables
 Example: Payment Method and Purchase Quantity
P(C | 1-5) = 30/88 = .3409
P(C | 6-10) = 46/135 = .3407
P(C | 10-20) = 31/89 = .3483
P(C | 20+) = 19/56 = .3393

P(C) = .3424, so what do you conclude about
independence?
Based on this, the manager might decide to offer a
cash-only lane that is not restricted to the number
of items purchased.
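The comparison of conditional and marginal probabilities can be scripted. A minimal sketch using only the cash counts and row totals quoted above (the full contingency table is not reproduced here); the 0.01 closeness threshold is an arbitrary assumption for illustration, not a formal statistical test:

```python
# Cash transactions and total transactions per purchase-quantity group,
# as quoted on the slides.
cash = {"1-5": 30, "6-10": 46, "10-20": 31, "20+": 19}
totals = {"1-5": 88, "6-10": 135, "10-20": 89, "20+": 56}

n = sum(totals.values())            # 368 transactions
p_cash = sum(cash.values()) / n     # marginal P(C) = 126/368

# Conditional P(C | group) restricts attention to one row of the table.
p_cash_given = {group: cash[group] / totals[group] for group in cash}

for group, p in p_cash_given.items():
    print(group, round(p, 4))
print("marginal", round(p_cash, 4))

# Largest gap between any conditional probability and the marginal.
max_gap = max(abs(p - p_cash) for p in p_cash_given.values())
roughly_independent = max_gap < 0.01  # assumed threshold
print(round(max_gap, 4), roughly_independent)
```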
Contingency Tables
 How Do We Get a Contingency Table?
Contingency tables require careful organization
and are created from raw data.

Consider the data of salary gain and
tuition for n = 67
top-tier MBA schools.
Contingency Tables
 How Do We Get a Contingency Table?
The data should be coded so that the values can
be placed into the contingency table.
Once coded, tabulate the frequency in each cell of
the contingency table using MINITAB’s
Stat | Tables | Cross Tabulation
ABL: Exercises 4.4
This contingency table describes 200 business students. Find each
probability and interpret it in words.

Gender         Accounting (A)   Economics (E)   Statistics (S)   Row Total
Female (F)     44               30              24               98
Male (M)       56               30              16               102
Column Total   100              60              40               200

(a) P(A), (b) P(M), (c) P(A ∩ M), (d) P(F ∩ S), (e) P(A | M), (f) P(A | F), (g)
P(F | S), (h) P(E ∪ F).
Is major independent of Gender?
ABL: Exercises 4.4 Continued…
• The contingency table shows average yield (rows) and average
duration (columns) for 38 bond funds.

Yield          Short (D1)   Intermediate (D2)   Long (D3)   Row Total
Small (Y1)     8            2                   0           10
Medium (Y2)    1            6                   6           13
High (Y3)      2            4                   9           15
Column Total   11           12                  15          38

Tabulate all marginal, joint and conditional probabilities and test the
independence of yield and duration.
Tree Diagrams
 What is a Tree?
A tree diagram or decision tree helps you visualize
all possible outcomes.
Start with a contingency table.
For example, this table gives expense ratios by
fund type for 21 bond funds and 23 stock funds.
Tree Diagrams
 What is a Tree?
To label the tree, first calculate conditional
probabilities by dividing each cell frequency by its
column total.
For example, P(L | B) = 11/21 = .5238
Here is the table of conditional probabilities
Tree Diagrams
 What is a Tree?
The tree diagram shows all events along with their
marginal, conditional and joint probabilities.
To calculate joint probabilities, use
P(A ∩ B) = P(A | B)P(B) = P(B | A)P(A)

The joint probability of each terminal event on the
tree can be obtained by multiplying the probabilities
along its branch.
For example, P(B ∩ L) = P(L | B)P(B)
= (.5238)(.4773) = .2500
Tree Diagrams
 Tree Diagram for Fund Type and Expense Ratios
ABL: Exercises 4.5
Of grocery shoppers who have a shopping cart, 70% pay by credit/debit
card (event C1), 20% pay cash (event C2) and 10% pay by cheque (event
C3). Of shoppers without shopping cart, 50% pay by credit/debit card
(event C1), 40% pay by cash (event C2) and 10% pay by cheque (event
C3). On a Sunday morning, 80% of the shoppers take a shopping cart
(event S1) and 20% do not (event S2).
a) Sketch a tree based on this data
b) calculate all joint probabilities and verify that the joint probabilities
sum to 1.
ABL: Exercises 4.5 Continued…
A study showed that 60% of Wall Street Journal subscribers watch
CNBC every day. Of these, 70% watch it outside the home. Only 20% of
those who don’t watch CNBC every day watch it outside the home. Let
D be the event “watches CNBC daily” and O be the event “watches
CNBC outside the home.”
a) Sketch a tree based on this data.
b) Calculate all joint probabilities.
c) Verify that all joint probabilities sum to 1.
Bayes’s Theorem
Thomas Bayes (1702-1761) provided a method
(called Bayes’s Theorem) of revising probabilities
to reflect new information.

The prior (marginal) probability of an event B is
revised after event A has been considered to yield
a posterior (conditional) probability.

Bayes’s formula is: P(B | A) = P(A | B)P(B) / P(A)
Bayes’s Theorem
Bayes’s formula begins as:
P(B | A) = P(A | B)P(B) / P(A)

In some situations P(A) is not given. Therefore,
the most useful and common form of Bayes’s
Theorem is:
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B')P(B')]
Bayes’s Theorem
 How Bayes’s Theorem Works
Consider an over-the-counter pregnancy testing kit
and its “track record” of determining pregnancies.
If a woman is actually pregnant, what is the test’s
“track record”?
Positive 96% of time; negative (False Negative) 4% of time
If a woman is not pregnant, what is the test’s “track
record”?
Positive (False Positive) 1% of time; negative 99% of time
Bayes’s Theorem
 How Bayes’s Theorem Works
Suppose that 60% of the women who purchase the
kit are actually pregnant.

Intuitively, if 1,000 women use this test, the results
should look like this.
Bayes’s Theorem
 How Bayes’s Theorem Works
Of the 580 women who test positive, 576 will
actually be pregnant.

So, the desired probability is:
P(Pregnant | Positive Test) = 576/580 = .9931
Bayes’s Theorem
 How Bayes’s Theorem Works
Now use Bayes’s Theorem to formally derive the
result P(Pregnant | Positive) = .9931:
First define
A = positive test     B = pregnant
A' = negative test    B' = not pregnant

                    Positive (A)   Negative (A')   Prior
Pregnant (B)        0.96           0.04            0.60
Not Pregnant (B')   0.01           0.99            0.40

From the contingency table, we know that:
P(A | B) = .96     P(A | B') = .01     P(B) = .60
And the complement of each event is:
P(A' | B) = .04    P(A' | B') = .99    P(B') = .40
Bayes’s Theorem
 How Bayes’s Theorem Works
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B')P(B')]
= (.96)(.60) / [(.96)(.60) + (.01)(.40)]
= .576 / (.576 + .004) = .576 / .580 = .9931

So, there is a 99.31% chance that a woman is
pregnant, given that the test is positive.
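The two-event form of Bayes’s Theorem is short enough to write as a function. A minimal sketch: the function and parameter names are hypothetical, and the inputs are the test’s track record and the 60% prior from the example.

```python
def bayes_posterior(p_a_given_b, p_a_given_not_b, p_b):
    """Return P(B | A) for a dichotomous event B.

    p_a_given_b     : P(A | B),  e.g. positive test given pregnant
    p_a_given_not_b : P(A | B'), e.g. false positive rate
    p_b             : prior P(B)
    """
    numerator = p_a_given_b * p_b
    # Denominator is P(A), expanded over B and its complement B'.
    denominator = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / denominator

posterior = bayes_posterior(p_a_given_b=0.96, p_a_given_not_b=0.01, p_b=0.60)
print(round(posterior, 4))  # 0.9931
```

The same function applies to any screening-test problem once the false positive rate, the true positive rate and the prior are identified.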
Bayes’s Theorem
 How Bayes’s Theorem Works
Bayes’s Theorem shows us how to revise our prior
probability of pregnancy to get the posterior
probability after the results of the pregnancy test
are known.
Prior (before the test): P(B) = .60
Posterior (after positive test result): P(B | A) = .9931

Bayes’s Theorem is useful when a direct
calculation of a conditional probability is not
permitted due to lack of information.
Bayes’s Theorem
 How Bayes’s Theorem Works
A tree diagram helps visualize the situation.
Bayes’s Theorem
 How Bayes’s Theorem Works
The 2 branches showing a positive test (A) comprise a reduced sample space,
B ∩ A and B' ∩ A,
so add their probabilities to obtain the denominator
of the fraction whose numerator is P(B ∩ A).
Bayes’s Theorem
 General Form of Bayes’s Theorem
A generalization of Bayes’s Theorem allows event
B to be polytomous (B1, B2, … Bn) rather than
dichotomous (B and B').
P(Bi | A) = P(A | Bi)P(Bi) / [P(A | B1)P(B1) + P(A | B2)P(B2) + ... + P(A | Bn)P(Bn)]
Bayes’s Theorem
 Example: Hospital Trauma Centers
Based on historical data, the percent of cases at 3
hospital trauma centers and the probability of a
case resulting in a malpractice suit are as follows:

Let event A = a malpractice suit is filed
Bi = patient was treated at trauma center i
Bayes’s Theorem
 Example: Hospital Trauma Centers
Applying the general form of Bayes’ Theorem, find
P(B1 | A).
P(B1 | A) = P(A | B1)P(B1) / [P(A | B1)P(B1) + P(A | B2)P(B2) + P(A | B3)P(B3)]
= (0.001)(0.50) / [(0.001)(0.50) + (0.005)(0.30) + (0.008)(0.20)]
= 0.0005 / (0.0005 + 0.0015 + 0.0016) = 0.0005 / 0.0036 = 0.1389
Bayes’s Theorem
 Example: Hospital Trauma Centers
Conclude that the probability that the malpractice
suit was filed in hospital 1 is .1389 or 13.89%.
All the posterior probabilities for each hospital can
be calculated and then compared:
Bayes’s Theorem
 Example: Hospital Trauma Centers
Intuitively, imagine there were 10,000 patients and
calculate the frequencies:
Hospital   Malpractice Suit Filed   No Malpractice Suit Filed   Total
1          5 = 5,000 x .001         4,995 = 5,000 - 5           5,000 = 10,000 x .5
2          15 = 3,000 x .005        2,985 = 3,000 - 15          3,000 = 10,000 x .3
3          16 = 2,000 x .008        1,984 = 2,000 - 16          2,000 = 10,000 x .2
Total      36                       9,964                       10,000
Bayes’s Theorem
 Example: Hospital Trauma Centers
Now, use these frequencies to find the probabilities
needed for Bayes’ Theorem.
For example,
Hospital   Malpractice Suit Filed   No Malpractice Suit Filed   Total
1          P(B1|A)=5/36=.1389       P(B1|A')=.5013              P(B1)=.5
2          P(B2|A)=15/36=.4167      P(B2|A')=.2996              P(B2)=.3
3          P(B3|A)=16/36=.4444      P(B3|A')=.1991              P(B3)=.2
Total      P(A)=36/10000=.0036      P(A')=.9964                 1.0000
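The general form of Bayes’s Theorem can be applied to all three hospitals at once. A minimal sketch using exact fractions; the priors and suit rates are the values used in the worked calculation, and the variable names are chosen here for illustration:

```python
from fractions import Fraction

# Priors P(Bi): share of cases handled by each trauma center.
priors = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
# Likelihoods P(A | Bi): probability a case at center i leads to a suit.
likelihoods = [Fraction(1, 1000), Fraction(5, 1000), Fraction(8, 1000)]

# Joint probabilities P(A and Bi) = P(A | Bi) P(Bi).
joints = [lik * prior for lik, prior in zip(likelihoods, priors)]

# P(A) is the sum of the joint probabilities (the Bayes denominator).
p_a = sum(joints)

# Posteriors P(Bi | A) = P(A and Bi) / P(A).
posteriors = [joint / p_a for joint in joints]

print(float(p_a))                                # 0.0036
print([round(float(p), 4) for p in posteriors])  # [0.1389, 0.4167, 0.4444]
```

The posteriors match the 5/36, 15/36 and 16/36 obtained from the 10,000-patient frequency table, and they sum to 1 because the hospitals are mutually exclusive and collectively exhaustive.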
Bayes’s Theorem
 Example: Hospital Trauma Centers
Consider the following visual description of the
problem:
Bayes’s Theorem
 Example: Hospital Trauma Centers
The initial sample space consists of 3 mutually
exclusive and collectively exhaustive events
(hospitals B1, B2, B3).
Bayes’s Theorem
 Example: Hospital Trauma Centers
As indicated by their relative areas, B1 is 50% of
the sample space, B2 is 30% and B3 is 20%.

Bayes’s Theorem
 Example: Hospital Trauma Centers
But, given that a malpractice case has been filed
(event A), then the relevant sample space is
reduced to the yellow area of event A.
The revised probabilities are the relative areas
within event A.
[Diagram: event A divided into regions P(B1 | A), P(B2 | A) and P(B3 | A)]
ABL: Exercises 4.5
• A drug test for athletes has a 5% false positive rate and a 10% false
negative rate. Of the athletes tested, 4% have actually been using the
prohibited drug. If an athlete tests positive, what is the probability
that the athlete has actually been using the prohibited drug?

• An airport gamma ray luggage scanner coupled with a neural net
artificial intelligence program can detect a weapon in suitcases with a
false positive rate of 2% and a false negative rate of 2%. Assume a
.001 probability that a suitcase contains a weapon. If a suitcase
triggers the alarm, what is the probability that it contains a weapon?
ABL: Exercises 4.5
• Bayes Theorem is super confusing so I found two examples from an old
business stats class I took that might help you guys see it. One
thing I suggest for anyone confused with stats principles is to NOT
ask Heng Ji to do an example 'involving linguistics'. If you want to
learn about probabilities the best examples always involve drug/
disease tests, decks of cards and dice. Once you master the concept
then start applying to linguistics because linguistics is already
confusing enough without good old Bayes, Bernoulli, Poisson and
crew.....

A drug test for athletes has a 5% false positive rate and a 10% false
negative rate. Of the athletes tested, 4% have actually been using
the prohibited drug. If an athlete tests positive, what is the
probability that the athlete has actually been using the prohibited
drug?

We are looking for P(drug|positive test)


Let A = using the drug.
P(A) is .04. Since you can either use the drug or not, its complement
is P(A') = .96.
Let T = positive test.
P(T|A') = probability of positive test given no drug = .05
P(T'|A') = probability of negative test given no drug = .95
P(T'|A) = probability of negative test given drug = .10
P(T|A) = probability of positive test given drug = .90.

The probability of a positive test is P(false positive) * P(no
drug) + P(true positive) * P(drug):
.05*.96 (5% chance of false positive times 96 percent chance of no
drug) + .9 * .04 (90 percent chance of accurate test * probability of
using drug).
So P(T) = .9*.04 + .05*.96 = .084.
Remember we are looking for P(A|T).
Bayes theorem says that P(A|T) = ( P(T|A) * P(A) ) / P(T). We now
have all these variables.

(.9 * .04) / .084 = .4286 = probability of drug use given a positive
test = P(A | T)

This one comes in handy when convincing your employer you're not on
drugs! "Yeah I got a positive drug test but there is only a 42.9 %
chance I'm actually on drugs!"   hahahaha.
ABL: Exercises 4.5
• An airport gamma ray luggage scanner coupled with a neural net
artificial intelligence program can detect a weapon in suitcases with
a false positive rate of 2% and a false negative rate of 2%. Assume
a .001 probability of a suitcase containing a weapon. If a suitcase
triggers an alarm, what is the probability that the suitcase contains
a weapon?

Let's let W represent a weapon and A represent alarm.


So let's find P(W | A).  To find it we'll have to do (P( A | W) *
P( W) )/ P(A)

P(W) = .001  P(W') = .999


P(A | W') = false positive = .02 = alarm given no weapon
P(A' | W') = .98 = no alarm given no weapon
P(A' | W) = false negative = .02 = no alarm given weapon
P(A | W) = .98 = alarm given weapon.

We have P(A | W) and P(W) so we only need P(A) to solve. The
probability of the alarm going off is the probability of a false alarm
times the probability of no weapon + the probability of a real alarm
times the probability of a weapon.

P(A) = P(A|W') * P(W') + P(A|W) * P(W) = (.02*.999) + (.98*.001)
= .02096

So: P(W | A) = ( P(A|W) * P(W) ) / P(A) = .98*.001 / .02096 = .04676
So there is a 4.7% chance of having a weapon given that the alarm
goes off.

The biggest secret here is A) realizing that the probability of the
alarm going off given you have a weapon and the probability of having
a weapon given that the alarm goes off are completely different.
(Statistics books love to quote that they asked a bunch of doctors if
those two are the same and 75% got it wrong because of its seeming
counter-intuitiveness) and B) writing out the information you're given
in the problem, the information that can be derived and seeing how you
can use Bayes to solve it.
