
Bayesian estimation and hypothesis testing

Andrew Thangaraj

IIT Madras



Section 1

Bayesian estimation



Parameter estimation

X1, . . . , Xn ∼ iid X, parameter θ

Two schools of thought for design of estimators

Frequentist: treat θ as an unknown constant
- Method of moments
- Maximum likelihood

Bayesian: treat θ as a random variable with a known distribution
- Bayesian estimation


Example 1: Bernoulli(p)

X1, . . . , Xn ∼ iid Bernoulli(p)

Suppose that p ∼ Uniform{0.25, 0.75}
- Assume p is chosen first at random according to the above distribution
- Once p is chosen, the samples are drawn according to Bernoulli(p)

Samples: 1, 0, 1, 1, 0
- Notation: S = (X1 = 1, X2 = 0, X3 = 1, X4 = 1, X5 = 0)
- Estimate using Bayes’ rule
  - P(p = 0.25|S) = P(S|p = 0.25)P(p = 0.25)/P(S) = 0.25^3 × 0.75^2 × 0.5/P(S) = 0.25
  - P(p = 0.75|S) = P(S|p = 0.75)P(p = 0.75)/P(S) = 0.75^3 × 0.25^2 × 0.5/P(S) = 0.75
  - P(S) = 0.25^3 × 0.75^2 × 0.5 + 0.75^3 × 0.25^2 × 0.5 = 0.25^2 × 0.75^2 × 0.5
- Estimator 1 (posterior mode): since P(p = 0.75|S) > P(p = 0.25|S), we could estimate p̂ = 0.75
- Estimator 2 (posterior mean): p̂ = 0.25 P(p = 0.25|S) + 0.75 P(p = 0.75|S) = 0.625
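
The calculation above is mechanical, so it is easy to script. The following is a minimal Python sketch (ours, not part of the slides; the helper name bernoulli_posterior is our own) that reproduces these numbers for any finite prior on p:

def bernoulli_posterior(samples, prior):
    # prior: dict mapping a candidate value of p to its prior probability
    n, w = len(samples), sum(samples)           # w = number of 1s
    joint = {p: p**w * (1 - p)**(n - w) * q     # P(S|p) * P(p)
             for p, q in prior.items()}
    total = sum(joint.values())                 # P(S), by total probability
    return {p: j / total for p, j in joint.items()}

post = bernoulli_posterior([1, 0, 1, 1, 0], {0.25: 0.5, 0.75: 0.5})
print(post)                                     # approx {0.25: 0.25, 0.75: 0.75}
print(max(post, key=post.get))                  # posterior mode: 0.75
print(sum(p * q for p, q in post.items()))      # posterior mean: 0.625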


Example 2: Bernoulli(p)

X1, . . . , Xn ∼ iid Bernoulli(p)

Suppose that p = 0.25 with probability 0.9 and p = 0.75 with probability 0.1

Samples: 1, 0, 1, 1, 0
- Notation: S = (X1 = 1, X2 = 0, X3 = 1, X4 = 1, X5 = 0)
- Estimate using Bayes’ rule
  - P(p = 0.25|S) = P(S|p = 0.25)P(p = 0.25)/P(S) = 0.25^3 × 0.75^2 × 0.9/P(S) = 0.75
  - P(p = 0.75|S) = 0.75^3 × 0.25^2 × 0.1/P(S) = 0.25
  - P(S) = 0.25^3 × 0.75^2 × 0.9 + 0.75^3 × 0.25^2 × 0.1 = 0.25^2 × 0.75^2 × 0.3
- Estimator 1 (posterior mode): since P(p = 0.25|S) > P(p = 0.75|S), we estimate p̂ = 0.25
- Estimator 2 (posterior mean): p̂ = 0.25 P(p = 0.25|S) + 0.75 P(p = 0.75|S) = 0.375
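
Re-running the bernoulli_posterior sketch from Example 1 with this prior shows the effect of the prior on the estimate:

post = bernoulli_posterior([1, 0, 1, 1, 0], {0.25: 0.9, 0.75: 0.1})
print(post)                                     # approx {0.25: 0.75, 0.75: 0.25}
print(max(post, key=post.get))                  # posterior mode: 0.25
print(sum(p * q for p, q in post.items()))      # posterior mean: 0.375

The same data now yields p̂ = 0.25 under the mode rule: a strong prior toward 0.25 outweighs five samples.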


Bayesian estimation

X1, . . . , Xn ∼ iid X, parameter Θ

Prior distribution of Θ: Θ ∼ fΘ(θ)

Samples: x1, . . . , xn. Notation: S = (X1 = x1, . . . , Xn = xn)

Bayes’ rule: posterior ∝ likelihood × prior

P(Θ = θ|S) = P(S|Θ = θ) fΘ(θ)/P(S)

Estimation using the posterior distribution
- Posterior mode: θ̂ = arg max_θ P(S|Θ = θ) fΘ(θ)
- Posterior mean: θ̂ = E[Θ|S], the mean of the posterior distribution
  - (Θ|S) may be a known distribution, and its mean may reduce to a simple formula in some cases


Meaning of prior distribution

Prior distribution
- Captures what we might know about the parameter
- This could come from a scientific model or expert opinion

Posterior ∝ Likelihood of samples × Prior
- Intuitively understood as incorporating the data into the prior
- Useful in modeling

What if we do not know anything?
- Choose a flat prior, uniform over the entire range

There are long-running debates between frequentists and Bayesians
- Search for “frequentist vs Bayesian”


Section 2

Choice of prior and examples



How to pick a prior?

Flat, uninformative
- Nearly flat over the interval in which the parameter takes values
- This usually reduces to something close to maximum likelihood

Conjugate priors
- Pick a prior so that the posterior is in the same class as the prior
- Examples
  - Prior: Normal and Posterior: Normal
  - Prior: Beta and Posterior: Beta

Informative priors
- This needs some justification from the domain of the problem
- Parameterize the prior so that its flatness can be controlled


Bernoulli(p) samples with uniform prior

X1, . . . , Xn ∼ iid Bernoulli(p)

Prior p ∼ Uniform[0, 1], continuous distribution

Samples: x1, . . . , xn

Posterior: p|(X1 = x1, . . . , Xn = xn) is continuous
- Posterior density ∝ P(X1 = x1, . . . , Xn = xn|p) fp(p)
- Posterior density ∝ p^w (1 − p)^(n−w), 0 ≤ p ≤ 1
  - w = x1 + · · · + xn: number of 1s in the samples

Posterior density: Beta(w + 1, n − w + 1)
- Posterior mean = (w + 1)/((w + 1) + (n − w + 1)) = (w + 1)/(n + 2) = (x1 + · · · + xn + 1)/(n + 2)

p̂ = (X1 + · · · + Xn + 1)/(n + 2)
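
As a quick numerical check, the posterior can be handled with scipy (a sketch, assuming scipy is available as in a Colab sheet; the data below is illustrative):

from scipy.stats import beta

samples = [1, 0, 1, 1, 0]                  # illustrative data
n, w = len(samples), sum(samples)
posterior = beta(w + 1, n - w + 1)         # Beta(w+1, n-w+1), as derived above
print(posterior.mean())                    # (w+1)/(n+2) = 4/7, approx 0.571
print(posterior.interval(0.95))            # a 95% credible interval for p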


Bernoulli(p) samples with beta prior

X1, . . . , Xn ∼ iid Bernoulli(p)

Prior p ∼ Beta(α, β), continuous distribution
- fp(p) ∝ p^(α−1) (1 − p)^(β−1), 0 ≤ p ≤ 1

Samples: x1, . . . , xn

Posterior: p|(X1 = x1, . . . , Xn = xn) is continuous
- Posterior density ∝ P(X1 = x1, . . . , Xn = xn|p) fp(p)
- Posterior density ∝ p^(w+α−1) (1 − p)^(n−w+β−1), 0 ≤ p ≤ 1
  - w = x1 + · · · + xn: number of 1s in the samples

Posterior density: Beta(w + α, n − w + β)
- Posterior mean = (w + α)/((w + α) + (n − w + β)) = (w + α)/(n + α + β) = (x1 + · · · + xn + α)/(n + α + β)

p̂ = (X1 + · · · + Xn + α)/(n + α + β)
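
The conjugate update itself is one line. Here is a sketch (our helper, following the derivation above):

def beta_bernoulli_update(samples, alpha, beta_):
    n, w = len(samples), sum(samples)
    a_post, b_post = alpha + w, beta_ + n - w           # Beta(w+alpha, n-w+beta)
    return a_post, b_post, a_post / (a_post + b_post)   # posterior mean

print(beta_bernoulli_update([1, 0, 1, 1, 0], alpha=1, beta_=1))
# (4, 3, 0.571...): Beta(1, 1) = Uniform[0, 1], so this matches the previous slide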


Observations for Beta prior

Prior: Beta(α, β)
- α, β ≥ 0
- PDF ∝ p^(α−1) (1 − p)^(β−1), 0 < p < 1
- How to pick α, β?

α = β = 1: Uniform[0, 1]
- Flat prior
- Estimate close to, but not equal to, the maximum likelihood estimate

α = β = 0
- Estimate coincides with the maximum likelihood estimate

α = β
- Symmetric prior

α, β may depend on n, the number of samples
- α = β = √n/2 is an interesting choice
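
A small numeric illustration (ours) of these choices on illustrative data; α = β = 0 recovers the MLE w/n, while larger α = β shrinks the estimate toward 1/2:

import math

samples = [1, 0, 1, 1, 0]                  # illustrative data
n, w = len(samples), sum(samples)
for a in [0, 1, math.sqrt(n) / 2]:         # MLE, uniform, sqrt(n)/2 priors
    print(a, (w + a) / (n + 2 * a))        # posterior mean with alpha = beta = a
# a = 0 gives 0.600, a = 1 gives 0.571, a = sqrt(5)/2 gives about 0.569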


Normal samples with unknown mean and known variance

X1, . . . , Xn ∼ iid Normal(M, σ^2)

Prior M ∼ Normal(µ0, σ0^2), continuous distribution
- fM(µ) = (1/(√(2π) σ0)) exp(−(µ − µ0)^2 / (2σ0^2))

Samples: x1, . . . , xn. Sample mean: x̄ = (x1 + · · · + xn)/n

Posterior: M|(X1 = x1, . . . , Xn = xn) is continuous
- Posterior density ∝ f(X1 = x1, . . . , Xn = xn|M = µ) fM(µ)
- Posterior density ∝ exp(−((x1 − µ)^2 + · · · + (xn − µ)^2) / (2σ^2)) · exp(−(µ − µ0)^2 / (2σ0^2))

Posterior density: Normal
- Posterior mean = x̄ · nσ0^2/(nσ0^2 + σ^2) + µ0 · σ^2/(nσ0^2 + σ^2)

µ̂ = ((X1 + · · · + Xn)/n) · nσ0^2/(nσ0^2 + σ^2) + µ0 · σ^2/(nσ0^2 + σ^2)
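
A sketch (ours) of this Normal-Normal update; the posterior-variance line is standard for this conjugate pair but is not derived on the slide:

def normal_known_var_update(samples, sigma2, mu0, sigma0_2):
    n = len(samples)
    xbar = sum(samples) / n
    w = n * sigma0_2 / (n * sigma0_2 + sigma2)       # weight on the data
    post_mean = xbar * w + mu0 * (1 - w)             # the formula above
    post_var = sigma2 * sigma0_2 / (n * sigma0_2 + sigma2)
    return post_mean, post_var

print(normal_known_var_update([9.8, 10.4, 10.1], sigma2=1.0, mu0=10.0, sigma0_2=4.0))
# illustrative numbers: the posterior mean sits between xbar and mu0
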
Observations for Normal prior

Prior: Normal(µ0, σ0^2)
- How to pick µ0 and σ0?

Estimate is a combination of data and prior
- Prior is “updated” using data to get the posterior

If n is very large, µ̂ → sample mean
- Data dominates the estimate
- Prior plays no significant role

If n is small, the prior contributes significantly to the estimate
- Prior needs some justification when n is small

If the variance of the prior is large compared to the variance of the samples, the prior tends to be flat or uninformative
- The choice of the prior variance is important
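
A quick simulation (ours, reusing normal_known_var_update from the previous sketch) showing the prior washing out as n grows:

import random

random.seed(0)
mu_true, sigma = 5.0, 1.0
mu0, sigma0_2 = 0.0, 1.0                   # a deliberately wrong prior mean
for n in [1, 10, 100, 10000]:
    xs = [random.gauss(mu_true, sigma) for _ in range(n)]
    print(n, normal_known_var_update(xs, sigma**2, mu0, sigma0_2)[0])
# the posterior mean moves from near mu0 toward the sample mean as n grows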


Section 3

Problems: Finding estimators



Problem 1
Suppose X is a discrete random variable taking values {0, 1, 2, 3} with
respective probabilities {2θ/3, θ/3, 2(1 − θ)/3, (1 − θ)/3}, where 0 ≤ θ ≤ 1
is a parameter. Consider the estimation of θ from samples
2, 2, 0, 3, 1, 3, 2, 1, 2, 3.
Find the method of moments and maximum likelihood estimates.
Using a Uniform[0, 1] prior, find the posterior distribution and mean.



Working
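
One way to check the pen-and-paper answers numerically (a sketch, ours; the grid approach is generic and not the intended derivation):

import math

samples = [2, 2, 0, 3, 1, 3, 2, 1, 2, 3]

def pmf(k, theta):  # P(X = k) from the problem statement
    return (2*theta/3, theta/3, 2*(1 - theta)/3, (1 - theta)/3)[k]

# method of moments: E[X] = (7 - 6*theta)/3 for this pmf, so theta = (7 - 3*xbar)/6
xbar = sum(samples) / len(samples)
print((7 - 3 * xbar) / 6)                                   # approx 0.217

# MLE and posterior mean (Uniform[0, 1] prior) on a grid over theta
grid = [i / 10000 for i in range(1, 10000)]
like = [math.prod(pmf(x, t) for x in samples) for t in grid]
print(grid[max(range(len(grid)), key=lambda i: like[i])])   # MLE, approx 0.3
print(sum(t * l for t, l in zip(grid, like)) / sum(like))   # posterior mean, approx 1/3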



Problem 2
Consider n iid samples from a Geometric(p) distribution.
Find the method of moments estimate.
Find the MLE.
Using a Uniform[0, 1] prior, find the posterior distribution and mean.



Working
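
A sketch (ours) for Geometric(p) supported on {1, 2, . . .}: the likelihood is p^n (1 − p)^(Σxi − n), so a Uniform[0, 1] prior gives a Beta posterior:

def geometric_estimates(samples):
    n, s = len(samples), sum(samples)
    p_hat = n / s                        # both MoM and MLE give 1/(sample mean)
    a, b = n + 1, s - n + 1              # posterior Beta(n+1, s-n+1)
    return p_hat, (a, b), a / (a + b)    # estimate, posterior params, posterior mean

print(geometric_estimates([3, 1, 2, 1, 5]))   # illustrative samples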



Problem 3
Consider n iid samples from a Poisson(λ) distribution.
Find the method of moments estimate.
Find the MLE.
Using a Gamma[α, β] prior, find the posterior distribution and mean.



Working
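
A sketch (ours), assuming the shape-rate parameterization of Gamma(α, β), under which the Poisson posterior is again Gamma:

def poisson_gamma_update(samples, alpha, beta_):
    n, s = len(samples), sum(samples)
    lam_hat = s / n                            # MoM and MLE: the sample mean
    a_post, b_post = alpha + s, beta_ + n      # conjugate update
    return lam_hat, (a_post, b_post), a_post / b_post   # posterior mean

print(poisson_gamma_update([2, 0, 3, 1, 2], alpha=2.0, beta_=1.0))  # illustrative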



Section 4

Problems: Fitting distributions



Problem 1
Fit a Poisson distribution to the following frequency data on number of
vehicles (n) making a right turn at an intersection in a 3-minute interval.
Find an approximate 95% confidence interval for the mean using a normal
approximation for the sampling distribution.

n Frequency n Frequency
0 14 7 14
1 30 8 10
2 36 9 6
3 68 10 4
4 43 11 1
5 43 12 1
6 30 13+ 0
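
A sketch (ours) of one way to do this: the Poisson MLE is the sample mean, and a normal approximation gives the interval x̄ ± 1.96 √(x̄/n), using Var(X) = λ for a Poisson:

import math

freq = {0: 14, 1: 30, 2: 36, 3: 68, 4: 43, 5: 43, 6: 30,
        7: 14, 8: 10, 9: 6, 10: 4, 11: 1, 12: 1}
n = sum(freq.values())                            # 300 observations
xbar = sum(k * c for k, c in freq.items()) / n    # lambda-hat, approx 3.89
half = 1.96 * math.sqrt(xbar / n)
print(xbar, (xbar - half, xbar + half))           # approx (3.67, 4.12)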



Problem 2
Fit a Geometric distribution to the following frequency data on number of
hops (n) between flights of birds. Find an approximate 95% confidence
interval.

n Frequency n Frequency
1 48 7 4
2 31 8 2
3 20 9 1
4 9 10 1
5 6 11 2
6 5 12 1



Working
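
A sketch (ours): fit p̂ = 1/x̄ and build a normal-approximation CI for the mean, then map its endpoints through p = 1/mean (one common construction, not the only one):

import math

freq = {1: 48, 2: 31, 3: 20, 4: 9, 5: 6, 6: 5,
        7: 4, 8: 2, 9: 1, 10: 1, 11: 2, 12: 1}
n = sum(freq.values())                             # 130 observations
xbar = sum(k * c for k, c in freq.items()) / n     # approx 2.79
var = sum(c * (k - xbar) ** 2 for k, c in freq.items()) / (n - 1)
half = 1.96 * math.sqrt(var / n)                   # CI half-width for the mean
print(1 / xbar, (1 / (xbar + half), 1 / (xbar - half)))   # p-hat approx 0.358, and CI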



Problem 3
Data from a genetic experiment and the expected distribution in terms of an
unknown parameter θ are given in the following table.

Type  Frequency  Theory
1     1997       0.25(2 + θ)
2     906        0.25(1 − θ)
3     904        0.25(1 − θ)
4     32         0.25θ

Find the ML estimate for θ.



Working
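
A numerical sketch (ours): maximize the multinomial log-likelihood over a grid of θ values:

import math

counts = [1997, 906, 904, 32]

def loglike(theta):
    probs = [0.25 * (2 + theta), 0.25 * (1 - theta),
             0.25 * (1 - theta), 0.25 * theta]
    return sum(c * math.log(p) for c, p in zip(counts, probs))

grid = [i / 100000 for i in range(1, 100000)]
print(max(grid, key=loglike))    # MLE, approx 0.0357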



Section 5

Problems: Model and estimation



Problem 1
To find the size of an animal population, 100 animals are captured and tagged.
Some time later, another 50 animals are captured, and 20 of them are found to
be tagged. How will you estimate the population size? What are your
assumptions?



Working
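
A sketch (ours), assuming a closed population and that both captures are uniformly random, so the recapture count is hypergeometric in the unknown population size N (scipy assumed available):

from scipy.stats import hypergeom

tagged, catch, recap = 100, 50, 20
candidates = range(130, 2001)    # N is at least 100 tagged + 30 untagged seen
mle_N = max(candidates, key=lambda N: hypergeom.pmf(recap, N, tagged, catch))
print(mle_N)   # close to 100 * 50 / 20 = 250, the Lincoln-Petersen estimate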



Problem 2
Suppose that, out of 10 items produced by a new machine, none was found to be
defective. How will you estimate the fraction of defective items produced by
the new machine? From data collected on other similar machines, the average
fraction of defective items was found to be 10%, and the actual fraction was
between 5% and 15% in 95% of the cases.



Working
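
A sketch (ours) of one modeling route: read the history of similar machines as a Beta prior by moment matching (mean 0.10; "between 5% and 15% in 95% of cases" read as roughly two standard deviations, i.e., sd about 0.025), then update with 0 defectives in 10:

mu, sd = 0.10, 0.025
nu = mu * (1 - mu) / sd**2 - 1            # alpha + beta, from the Beta variance
alpha, beta_ = mu * nu, (1 - mu) * nu     # approx 14.3 and 128.7
print((alpha + 0) / (alpha + beta_ + 10))   # posterior mean, approx 0.093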



Problem 3
Colab sheet

