
Week 13

This week:
• Tuesday: Bayesian methods
• Thursday: No class (Thanksgiving)

Next week:
• Tuesday: No class (work on assignment)
• Thursday: Review session, help with assignment
• Assignment 3 due Thursday November 30
Schools of statistical thought

Frequentist
- P(D|H), the probability of the data given the hypothesis
- Data are an estimate of some “truth” about the world

Bayesian
- P(H|D), the probability of the hypothesis given the data
- Data are “true”, what is the probability of the model?
Data analysis in a Bayesian framework is related to likelihood
methods.

With likelihood, the data are considered an estimate of some
truth, and we vary the parameter to find the value for which
the probability of the data is highest.

Bayesian methods treat the parameter (or hypothesis) as
some estimate of the truth, by considering it a random
variable, and seeking the parameter value having the highest
probability, given the data.

Calculating a posterior probability requires that we have a
prior probability for the parameter value, or hypothesis.

To talk about Bayesian methods, we need to remember what
likelihood is, and therefore what probability is…

So, what is probability?

A frequentist would say:

The probability of an event is the proportion of time that the event
would occur if we repeated a random trial over and over again
under the same conditions.

A probability distribution is a list of all mutually exclusive outcomes
of a random trial and their probabilities of occurrence.

Probability statements that make sense under a frequentist
definition…

• If we toss a fair coin, what is the probability of 10 heads in a row?

• If we assign treatments randomly to subjects, what is the
probability that the observed difference between treatment
means is what we’d get if the null hypothesis were true?

• What is the probability of a result at least as extreme as that
observed if the null hypothesis is true?

• Under a process of genetic drift in a finite population, what is the
probability of fixation of a rare allele?
Here, sampling error is the source of uncertainty.
Probability statements that do not make sense under a
frequentist definition…
Why?
• What is the probability that Iran is building nuclear weapons?
Either Iran is or isn’t – there is no random trial
• What is the probability that hippos are the sister group to the
whales?
Either they are or they’re not – there is no random trial
• What is the probability that polar bears will be extinct in the wild
in 40 years?
Difficult to cast as a frequency of occurrence

Here, there is no random trial, so no sampling error

The information is the source of uncertainty, not sampling error!


An alternative (Bayesian) definition of probability

Probability is the measure of a degree of belief associated with the
occurrence of an event.

A probability distribution is a list of all mutually exclusive events
and the degree of belief associated with their occurrence.

Bayesian statistics applies the mathematics of probability to model
uncertainty as a subjective degree of belief.
Bayesian methods are increasingly used in ecology and evolution

“Ecologists should be aware that Bayesian methods constitute a radically different way of
doing science. Bayesian statistics is not just another tool to be added into the ecologists’
repertoire of statistical methods. Instead, Bayesians categorically reject various tenets of
statistics and the scientific method that are currently widely accepted in ecology and other
sciences.” B. Dennis, 1996, Ecology

“Bayesian statistics is like handing guns to children” J. Clark


Bayes Theorem (H = hypothesis, D = data)

P(H|D) = P(D|H) P(H) / P(D)

where P(H|D) is the ‘posterior’ probability, P(D|H) is the likelihood,
P(H) is the ‘prior’ probability, and P(D) is the probability of the data.
Bayes Theorem: example

Down Syndrome (DS) occurs in about 1 in 1000 pregnancies. We use
a test to know the probability that a particular baby has it. The test
is low risk, but also low accuracy.

[Tree diagram: each fetus either has DS or not, and the test result is
either positive or negative, giving four outcomes whose probabilities
are filled in on the next slide.]
Bayes Theorem: example
• Conditional probability is the probability of an event occurring given that a
condition is met
• The probability of a positive test result is 0.60, given that a fetus has DS.
The probability of a positive result is 0.05, given that a fetus does not have DS.

Fetus has DS?      Test result   Probability
Yes (0.001)        +ve (0.60)    0.0006
Yes (0.001)        −ve (0.40)    0.0004
No (0.999)         +ve (0.05)    0.04995
No (0.999)         −ve (0.95)    0.94905
                   Total:        1.00000
Bayes Theorem: example
• What is the probability that a fetus has DS if the test is positive?

P(H|D) = P(DS|positive) = P(positive|DS) P(DS) / P(positive)

Fetus has DS?      Test result   Probability
Yes (0.001)        +ve (0.60)    0.0006
Yes (0.001)        −ve (0.40)    0.0004
No (0.999)         +ve (0.05)    0.04995
No (0.999)         −ve (0.95)    0.94905
                   Total:        1.00000

P(H|D) = P(DS|positive) = (0.60 × 0.001) / (0.0006 + 0.04995) = 0.012

• Just 1.2%
Bayes Theorem (H = hypothesis, D = data)

The ‘prior’ probability of having DS is 0.001. The ‘posterior’
probability of having DS, given a positive test, is:

P(H|D) = P(DS|positive) = (0.60 × 0.001) / (0.0006 + 0.04995) = 0.012
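The arithmetic above can be checked directly; a minimal Python sketch using the slide’s numbers:

```python
# Down Syndrome example from the slides:
# prior P(DS) = 0.001, P(+|DS) = 0.60, P(+|no DS) = 0.05.
p_ds = 0.001            # prior probability a fetus has DS
p_pos_given_ds = 0.60   # P(positive | DS)
p_pos_given_no = 0.05   # P(positive | no DS)

# Law of total probability: P(+) = P(+|DS) P(DS) + P(+|no DS) P(no DS)
p_pos = p_pos_given_ds * p_ds + p_pos_given_no * (1 - p_ds)

# Bayes theorem: P(DS | +) = P(+|DS) P(DS) / P(+)
posterior = p_pos_given_ds * p_ds / p_pos
print(round(posterior, 3))  # 0.012
```

Despite the positive test, the posterior stays low because the 0.001 prior is so small relative to the false-positive rate.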
Bayesian inference with data

[Tree diagram: hypothesis H1 has ‘prior’ probability p and hypothesis
H2 has prior 1 – p; under each hypothesis, either the data collected
or some other data could have been observed.]

The ‘posterior’ probability of H1 is:

P(H1|D) = P(D|H1) P(H1) / [ P(D|H1) P(H1) + P(D|H2) P(H2) ]
The dangers of Bayes Theorem
• The prior probability is subjective and can have a large influence
Example: Forensic evidence.
What is the probability of guilt given a positive DNA match?

Defendant guilty?    DNA match?         Probability
Yes (prior p)        Yes (1)            p
Yes (prior p)        No (0)             0
No (prior 1 – p)     Yes (10⁻⁶)         (1 – p)10⁻⁶
No (prior 1 – p)     No (1 – 10⁻⁶)      (1 – p)(1 – 10⁻⁶)

P(H|D) = P(guilt|match) = 1 · p / [ 1 · p + 10⁻⁶ (1 – p) ]
The dangers of Bayes Theorem

P(H|D) = P(guilt|match) = 1 · p / [ 1 · p + 10⁻⁶ (1 – p) ]

If p = 10⁻⁶, then P(guilt|match) = 0.5

If p = 0.5, then P(guilt|match) ≈ 0.999999

So… is the defendant guilty or not guilty??
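The sensitivity to the prior can be sketched with a few lines of Python (10⁻⁶ is the slide’s random-match probability for an innocent defendant):

```python
# Posterior probability of guilt given a DNA match, per the slide:
# P(guilt | match) = 1 * p / (1 * p + 1e-6 * (1 - p))
def p_guilt_given_match(p, false_match_rate=1e-6):
    return p / (p + false_match_rate * (1 - p))

print(p_guilt_given_match(1e-6))  # about 0.5: a match barely helps with a tiny prior
print(p_guilt_given_match(0.5))   # about 0.999999: near-certain with an even prior
```

The same DNA evidence yields radically different posteriors depending entirely on the subjective prior probability of guilt.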


The dangers of Bayes Theorem: example
Study of the sex ratio of a communal-living bee (Paxton and
Tengo, 1996, J. Insect Behav.)

What is the proportion of males in the reproductive adults
emerging from colonies?

To begin, we need to come up with a prior probability distribution
for the proportion.

A “non-informative” prior (sometimes called a flat prior) is like an
expression of ignorance (which can be a good thing!)

An “informative” prior captures previous information based on
theory or previous data.
The dangers of Bayes Theorem: example

Case 1: A “non-informative” prior (sometimes called a flat prior)


The dangers of Bayes Theorem: example
Case 2: An “informative” prior based on sex-ratio theory (that
predicts a 50:50 ratio)
The dangers of Bayes Theorem: example

Case 3: An “informative” prior based on different sex-ratio theory


(that predicts female biased sex ratios)
The dangers of Bayes Theorem: example

[Figure: posterior distributions under the three priors, compared with
the maximum likelihood estimate p = 0.39. Posterior estimates:
p = 0.39 (Case 1, flat prior), p = 0.40 (Case 2, 50:50 prior),
p = 0.36 (Case 3, female-biased prior).]
The dangers of Bayes Theorem: example
With lots of data, the choice of prior has little effect
The dangers of Bayes Theorem: example

• The estimated proportion based on the Bayesian posterior
probability distribution depends on the prior probability
distribution

• A source of controversy is that the prior is partly subjective.
Different researchers may use different priors and get different
results

• Non-informative priors may get around the subjectivity, but
they could also be considered a form of subjectivity (the same
way we assume normality in a frequentist framework)

• Non-informative priors prevent us from incorporating prior
information, which is regarded as one of the strengths of the
Bayesian approach
Bayesian estimates of uncertainty
95% credible interval

Frequentist:

95% Confidence Interval (which is based on the likelihood) means:
in many random samples taken from the same population, the
true population mean will fall somewhere within 95% of the
confidence intervals.

Bayesian:

95% Credible Interval (which is based on Bayes Theorem) means:
there is a 95% chance that the population mean lies within that
interval.
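A minimal sketch of computing a 95% credible interval, assuming a Beta posterior for a proportion (the Beta(40, 62) parameters are hypothetical, chosen to centre near p = 0.39) and using simulation from the standard library:

```python
import random

# Hypothetical Beta(40, 62) posterior for a proportion.
# The central 95% of the posterior is the 95% credible interval.
random.seed(1)
draws = sorted(random.betavariate(40, 62) for _ in range(100_000))
lo, hi = draws[2_500], draws[97_500]  # 2.5th and 97.5th percentiles
print(round(lo, 2), round(hi, 2))
```

Unlike a confidence interval, this interval supports the direct statement: there is a 95% posterior probability that the proportion lies between `lo` and `hi`.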
Bayesian model selection

BIC = Bayesian Information Criterion
DIC = Deviance Information Criterion

Derived from a very different theory, but yields a formula similar to
that for AIC:

AIC = −2 ln L(model|data) + 2k

BIC = −2 ln L(model|data) + k ln(n)

DIC = −2 ln L(model|data) + k

k = number of parameters estimated in the model
n = sample size
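The AIC and BIC formulas above translate directly into code (the log-likelihoods, k, and n below are made-up numbers for illustration):

```python
import math

# Information criteria from a model's maximized log-likelihood lnL,
# number of estimated parameters k, and sample size n.
def aic(lnL, k):
    return -2 * lnL + 2 * k

def bic(lnL, k, n):
    return -2 * lnL + k * math.log(n)

# Hypothetical fits: model 2 fits better but uses more parameters.
print(aic(-120.0, 3), bic(-120.0, 3, 50))   # simpler model
print(aic(-117.5, 5), bic(-117.5, 5, 50))   # more complex model
```

With these numbers AIC slightly favours the complex model while BIC favours the simple one: BIC’s ln(n) penalty exceeds AIC’s 2 whenever n > e² ≈ 7.4.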
Bayesian inference is different from what you usually do
The approach requires a prior probability.
The prior probability represents the investigator’s strength of belief
about the hypothesis, or the parameter’s value.

The influence of the prior declines with more data.

The posterior probability expresses how the investigator’s beliefs
have been ‘updated’ by the data.

Bayes theorem is used to estimate parameters and test hypotheses
using the posterior distribution.

The interpretation of interval estimates differs from the frequentist
definition.
