Hadji
25 February 2022 TAs: S. Rice
Practical 3
Gibbs sampling
Aim: Gibbs sampling in practice. Application to model selection in a linear regression setting.
Prerequisite: Knowledge from IMS. Lectures 1, 2 & 3 of MS2.
In this practical, we would like to perform Bayesian linear regression to explain the weight at
birth of babies. Download the data:
> install.packages("MASS")
> require(MASS)
> data(birthwt)
There are 7 explanatory variables:
1 lwt mother’s weight before pregnancy
2 race mother’s race (1=white, 2=black, 3=other); categorical variable
3 smoke smoking status during pregnancy
4 ptl number of previous premature labours
5 ht history of hypertension
6 ui presence of uterine irritability
7 ftv number of physician visits during the first trimester
The output variable is bwt, the birth weight in grams. There is an additional output variable:
low, which indicates whether the birth weight was less than 2500 grams. Our linear model is
y = Xβ + ε with ε ∼ N(0, σ²Iₙ), where X contains an intercept column and the explanatory
variables (race coded as two dummy variables).
1. Transform the categorical variable race so that it can be used in linear regression.
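One way to carry out question 1 is to recode race as a factor and let R build the dummy columns (a sketch; the design matrix X below is our own naming):

```r
library(MASS)
data(birthwt)

# Recode race (1 = white, 2 = black, 3 = other) as a factor;
# model.matrix then creates two dummy columns (race2, race3),
# with race = 1 (white) as the reference level.
birthwt$race <- factor(birthwt$race)
X <- model.matrix(~ lwt + race + smoke + ptl + ht + ui + ftv,
                  data = birthwt)[, -1]  # drop the intercept column
head(X)
```

With race expanded into two dummies, X has 8 columns, matching β = (β0, ..., β8) once the intercept is added back by lm.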
2. Frequentist analysis - Give the frequentist estimates of β = (β0 , ..., β8 ) and σ 2 . You can
use the R function lm.
> summary(lm(y ~ X))
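With X built as in question 1, the frequentist estimates of β and σ² can be read off the lm fit (a sketch):

```r
y <- birthwt$bwt

fit <- lm(y ~ X)
summary(fit)

beta_hat   <- coef(fit)              # (beta_0, ..., beta_8)
sigma2_hat <- summary(fit)$sigma^2   # usual unbiased estimate of sigma^2
```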
4. We would like to know which variables to include in our model, and assume that the
intercept is necessarily included. We have 2³ = 8 possible models. To each model, we
associate the variable γ = (γ₁, γ₂, γ₃), where γᵢ = 1 if xᵢ is included in the model, and
γᵢ = 0 otherwise. Let X_γ be the submatrix of X where we only keep the columns i such
that γᵢ = 1.
Using the marginal probability formula of question 3(c)i, compute for each model its
marginal likelihood m(y|X_γ). Deduce the most likely model a posteriori.
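The enumeration can be sketched as follows; marg_loglik() is a placeholder for the marginal likelihood formula of question 3(c)i (not reproduced here), and X here holds only the 3 candidate columns:

```r
# Enumerate the 2^3 = 8 models.
gammas <- expand.grid(g1 = 0:1, g2 = 0:1, g3 = 0:1)

ml <- apply(gammas, 1, function(g) {
  Xg <- X[, which(g == 1), drop = FALSE]  # submatrix X_gamma
  marg_loglik(y, Xg)                      # m(y | X_gamma), from question 3(c)i
})

gammas[which.max(ml), ]  # most likely model a posteriori
```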
5. Model choice using Gibbs sampling - We now consider all 8 explanatory variables.
We thus need to choose between 2⁸ = 256 models. We will use a Gibbs sampler on γ = (γᵢ)ᵢ₌₁⁸,
where

γᵢ⁽ᵏ⁺¹⁾ | γ₋ᵢ ∼ Bernoulli( exp{m(y | X_{γ*}, γᵢ = 1)} / [ exp{m(y | X_{γ*}, γᵢ = 0)} + exp{m(y | X_{γ*}, γᵢ = 1)} ] ),

with γ* = (γ₁⁽ᵏ⁺¹⁾, ..., γᵢ₋₁⁽ᵏ⁺¹⁾, γᵢ, γᵢ₊₁⁽ᵏ⁾, ..., γ₈⁽ᵏ⁾).
In other words, at iteration (k + 1), we choose to keep variable i with the probability
given in the Bernoulli, which depends on the values of (γⱼ⁽ᵏ⁺¹⁾)ⱼ₌₁,...,ᵢ₋₁ and (γⱼ⁽ᵏ⁾)ⱼ₌ᵢ₊₁,...,₈. We
propose to use the following algorithm:
• Initialization - For i ∈ {1, ..., 8}: generate γᵢ⁽⁰⁾ ∼ Bernoulli(1/2)
• At iteration k:
  – For i ∈ {1, ..., 8}: generate γᵢ⁽ᵏ⁾ | γ₋ᵢ from the Bernoulli distribution above
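The algorithm above can be sketched as follows. As before, marg_loglik() stands for the log marginal likelihood m(y | X_γ) of question 3(c)i, and K (the number of iterations) is our own parameter; an empty X_γ corresponds to the intercept-only model:

```r
gibbs_model_choice <- function(y, X, K = 5000) {
  p <- ncol(X)                        # here p = 8
  gamma <- rbinom(p, 1, 1/2)          # initialization: gamma_i^(0) ~ Bernoulli(1/2)
  draws <- matrix(NA, K, p)

  for (k in 1:K) {
    for (i in 1:p) {
      g1 <- gamma; g1[i] <- 1         # gamma* with gamma_i = 1
      g0 <- gamma; g0[i] <- 0         # gamma* with gamma_i = 0
      m1 <- marg_loglik(y, X[, g1 == 1, drop = FALSE])
      m0 <- marg_loglik(y, X[, g0 == 1, drop = FALSE])
      # exp(m1) / (exp(m0) + exp(m1)), computed stably on the log scale
      prob <- 1 / (1 + exp(m0 - m1))
      gamma[i] <- rbinom(1, 1, prob)
    }
    draws[k, ] <- gamma
  }
  draws
}

# Posterior inclusion frequency of each variable:
# colMeans(gibbs_model_choice(y, X))
```

Working with m0 - m1 on the log scale avoids overflow when the marginal log-likelihoods are large in magnitude.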