
MS2, 25 February 2022
Instructors: A. Hadji
TAs: S. Rice

Practical 3
Gibbs sampling

Aim: Gibbs sampling in practice. Application to model selection in a linear regression setting.
Prerequisites: Knowledge from IMS; Lectures 1, 2 & 3 of MS2.

In this practical, we would like to perform Bayesian linear regression to explain the birth weight of babies. Download the data:
> install.packages("MASS")
> require(MASS)
> data(birthwt)
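A quick look at the data frame may help before starting (the column names correspond to the variables listed below):
> str(birthwt)
> head(birthwt)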
There are 7 explanatory variables:
1. lwt: mother’s weight before pregnancy
2. race: mother’s race (1 = white, 2 = black, 3 = other); categorical variable
3. smoke: smoking status during pregnancy
4. ptl: number of previous premature labours
5. ht: history of hypertension
6. ui: presence of uterine irritability
7. ftv: number of physician visits during the first trimester
The output variable is bwt, the birth weight in grams. There is an additional output variable: low, which indicates whether the birth weight was less than 2500 grams. Our linear model is

$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_8 x_{8i} + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)$$

1. Transform the categorical variable race so that it can be used in linear regression.
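One possible sketch of this transformation (the variable names are illustrative): recode race as a factor and let model.matrix build the corresponding dummy variables.
> birthwt$race <- factor(birthwt$race, labels = c("white", "black", "other"))
> X <- model.matrix(~ lwt + race + smoke + ptl + ht + ui + ftv, data = birthwt)
> head(X)   # intercept column plus 8 regressors; race is coded by two dummy variables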

2. Frequentist analysis - Give the frequentist estimates of $\beta = (\beta_0, \dots, \beta_8)$ and $\sigma^2$. You can use the R function lm.
> summary(lm(y ~ X))
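For instance (a sketch, assuming y and X are built as in the previous sketch, with the intercept already included in X):
> y <- birthwt$bwt
> fit <- lm(y ~ X - 1)              # X already contains the intercept column
> summary(fit)$coefficients         # frequentist estimates of beta
> summary(fit)$sigma^2              # frequentist estimate of sigma^2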

3. Bayesian inference - Consider the following prior:

$$\beta \mid \sigma^2, X \sim \mathcal{N}\left(\tilde{\beta},\ g\sigma^2 (X^T X)^{-1}\right), \qquad P(\sigma^2) \propto 1/\sigma^2$$

(a) Take $\tilde{\beta} = 0$. Calculate the posterior distribution of $(\beta, \sigma^2)$.


(b) For $g = 0.1, 1, 10, 100, 1000$, give $E[\sigma^2 \mid y, X]$ and $E[\beta_0 \mid y, X]$. What can you conclude about the impact of the prior on the posterior?
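For part (b), a possible sketch, assuming the posterior obtained in (a) takes the standard Zellner g-prior form with $\tilde{\beta} = 0$: posterior mean of $\beta$ equal to $\frac{g}{g+1}\hat{\beta}$, and $\sigma^2 \mid y, X$ inverse-gamma with shape $n/2$ and scale $\frac{1}{2}\left(y^T y - \frac{g}{g+1} y^T X (X^T X)^{-1} X^T y\right)$.
n <- length(y)
XtX <- crossprod(X)                       # X'X
Xty <- crossprod(X, y)                    # X'y
beta_hat <- solve(XtX, Xty)               # OLS estimate
for (g in c(0.1, 1, 10, 100, 1000)) {
  A <- sum(y^2) - g / (g + 1) * drop(t(beta_hat) %*% Xty)
  E_sigma2 <- (A / 2) / (n / 2 - 1)       # mean of the assumed inverse-gamma posterior
  E_beta0  <- g / (g + 1) * beta_hat[1]   # posterior mean of the intercept beta_0
  cat("g =", g, " E[sigma^2 | y, X] =", E_sigma2, " E[beta_0 | y, X] =", E_beta0, "\n")
}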
(c) We would like to test the hypothesis $H_0: \beta_7 = \beta_8 = 0$.
i. Give the explicit form of the marginal likelihood m(y|X).
ii. Deduce the expression of the corresponding Bayes factor.
iii. Compute this quantity given our data and conclude, using Jeffreys' scale of evidence.
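For part iii, a sketch of the computation, assuming the marginal likelihood found in part i takes the usual closed form under this prior with $\tilde{\beta} = 0$, namely $m(y \mid X) = \Gamma(n/2)\, \pi^{-n/2}\, (g+1)^{-q/2} \left(y^T y - \frac{g}{g+1} y^T X (X^T X)^{-1} X^T y\right)^{-n/2}$ with $q$ the number of columns of X; the value of g and the column indices of $\beta_7$ and $\beta_8$ are illustrative.
log_marginal <- function(y, X, g) {       # log m(y | X) under the assumed closed form
  n <- length(y); q <- ncol(X)
  yhat <- X %*% solve(crossprod(X), crossprod(X, y))     # fitted values X beta_hat
  A <- sum(y^2) - g / (g + 1) * sum(y * yhat)
  lgamma(n / 2) - (n / 2) * log(pi) - (q / 2) * log(g + 1) - (n / 2) * log(A)
}
g <- 100                                  # illustrative choice of g
X0 <- X[, -c(8, 9)]                       # drop the columns of beta_7 and beta_8 (indices depend on how X was built)
logBF <- log_marginal(y, X, g) - log_marginal(y, X0, g)  # log Bayes factor of the full model against H0
exp(logBF)                                # to be read on Jeffreys' scale of evidence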
4. Model choice - In this question, we restrict ourselves to the first 3 explanatory variables:
> X1 = X[, 1:4]

We would like to know which variables to include in our model, and assume that the intercept is necessarily included. We have $2^3 = 8$ possible models. To each model, we associate the variable $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, where $\gamma_i = 1$ if $x_i$ is included in the model, and $\gamma_i = 0$ otherwise. Let $X_\gamma$ be the submatrix of $X$ where we only keep the columns $i$ such that $\gamma_i = 1$.
Using the marginal likelihood formula of question 3(c)i, compute the marginal likelihood $m(y \mid X_\gamma)$ of each model. Deduce the most likely model a posteriori.
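A possible sketch, reusing the log_marginal helper from the previous sketch (a uniform prior over the 8 models is assumed, so the model with the largest marginal likelihood is the most likely a posteriori):
gammas <- expand.grid(g1 = 0:1, g2 = 0:1, g3 = 0:1)   # the 2^3 = 8 possible models
g <- 100                                              # illustrative choice of g
logm <- apply(gammas, 1, function(gam) {
  keep <- c(TRUE, gam == 1)                           # always keep the intercept column of X1
  log_marginal(y, X1[, keep, drop = FALSE], g)
})
cbind(gammas, logm)                                   # marginal likelihoods of the 8 models (log scale)
gammas[which.max(logm), ]                             # most likely model a posteriori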

5. Model choice using Gibbs sampling - We now consider all 8 explanatory variables. We thus need to choose between $2^8 = 256$ models. We will use a Gibbs sampler on $\gamma = (\gamma_i)_{i=1}^{8}$, where

$$\gamma_i^{(k+1)} \mid \gamma_{-i}^{*} \sim \mathrm{Bernoulli}\left( \frac{\exp\{m(y \mid X_{\gamma^*}, \gamma_i = 1)\}}{\exp\{m(y \mid X_{\gamma^*}, \gamma_i = 0)\} + \exp\{m(y \mid X_{\gamma^*}, \gamma_i = 1)\}} \right),$$

$$\gamma^{*} = (\gamma_1^{(k+1)}, \dots, \gamma_{i-1}^{(k+1)}, \gamma_i, \gamma_{i+1}^{(k)}, \dots, \gamma_8^{(k)}).$$

In other words, at iteration $(k+1)$, we choose whether to keep variable $i$ depending on the values of $(\gamma_j^{(k+1)})_{j<i}$ and $(\gamma_j^{(k)})_{j>i}$, with the probability given in the Bernoulli distribution above. We propose to use the following algorithm:

• Initialization - For $i \in \{1, \dots, 8\}$: generate $\gamma_i^{(0)} \sim \mathrm{Bernoulli}(1/2)$.
• At iteration $k$:
  – For $i \in \{1, \dots, 8\}$: generate $\gamma_i^{(k)} \mid \gamma_{-i}^{*}$.

(a) Implement this algorithm.
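A minimal sketch of this sampler, assuming the log_marginal helper from question 3, a design matrix X whose first column is the intercept followed by the 8 explanatory variables, an illustrative value of g, and reading the exp{m(·)} weights as marginal likelihoods computed on the log scale:
n_iter <- 10000
p <- 8                                            # number of explanatory variables
g <- 100                                          # illustrative choice of g
gamma_chain <- matrix(NA, nrow = n_iter, ncol = p)
gamma <- rbinom(p, 1, 1/2)                        # initialization: gamma_i^(0) ~ Bernoulli(1/2)
for (k in 1:n_iter) {
  for (i in 1:p) {
    gam1 <- gamma; gam1[i] <- 1                   # model with variable i included
    gam0 <- gamma; gam0[i] <- 0                   # model with variable i excluded
    lm1 <- log_marginal(y, X[, c(TRUE, gam1 == 1), drop = FALSE], g)
    lm0 <- log_marginal(y, X[, c(TRUE, gam0 == 1), drop = FALSE], g)
    gamma[i] <- rbinom(1, 1, 1 / (1 + exp(lm0 - lm1)))   # Bernoulli draw, computed on the log scale
  }
  gamma_chain[k, ] <- gamma                       # store gamma^(k)
}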


(b) Run the algorithm for 10,000 iterations. Plot the trajectory of $(\gamma_i^{(k)})_k$, the histogram of the $\gamma_i^{(k)}$ over the last 9,000 iterations, and the autocorrelations (use the R function acf).
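For instance, for the first component (a sketch; the same commands can be repeated for i = 2, ..., 8):
op <- par(mfrow = c(1, 3))
plot(gamma_chain[, 1], type = "s", main = "Trajectory of gamma_1")    # trajectory of the chain
hist(gamma_chain[1001:10000, 1], main = "gamma_1, last 9000 iterations")
acf(gamma_chain[, 1], main = "Autocorrelation of gamma_1")            # autocorrelations
par(op)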
(c) Give a Monte Carlo estimate of the posterior mean of γ.
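The empirical means of the stored draws give such an estimate, for instance over the last 9,000 iterations:
> colMeans(gamma_chain[1001:10000, ])   # estimated posterior inclusion probabilities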
(d) Which explanatory variables would you keep in the model?
