2019 Ex3

MAS472/6004: Assessed Exercises 3

Richard Wilkinson
8 April 2019

This is the third set of assessed exercises for MAS6004/472.


• It is worth 5% of the module mark.
• The deadline for submission is 12pm (noon) on Tuesday 7 May.
• All work should be submitted through MOLE.
• Requests for extensions will require a medical note.
• An integer mark will be awarded out of 5 for each piece of coursework.
• Solutions will be provided.
• Please use R Markdown (http://rmarkdown.rstudio.com/authoring_quick_tour.html) to produce your
solutions (PDF, HTML or Word is acceptable as a submission format). This will ensure that your work is
reproducible, help you to avoid errors in your coding, and ensure that I can see all the steps you took in
producing your solution.
• Please use a filename of the form StudentNumber_Name.pdf. So if your student number is 12345 and
your name is Aloysius, you would name your file 12345_Aloysius.pdf.
Present your work in exam format: you must include all your working and present your solutions clearly, but
otherwise, no marks will be awarded for presentation or commentary.
Your submitted solutions must be entirely your own work: do not work with anyone else on your exercises.

Exercise 3

In this set of exercises we will use importance sampling to do Bayesian inference for the parameters in a
normal linear model. Consider a model of the form

$$y = a + bx + \epsilon, \quad \text{where } \epsilon \sim N(0, \sigma^2).$$

This model has three parameters, $\theta = (a, b, \sigma^2)$. We will use the prior distributions
$$a, b \sim N(0, 10^2) \quad \text{and} \quad \sigma^2 \sim \text{Exp}(0.001),$$
where Exp(0.001) denotes an exponential distribution with mean 1000. All three parameters are considered
to be independent a priori. The log-likelihood for this problem is

$$\log L(\theta) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y_i - (a + b x_i)\right)^2$$
where n is the number of observations.
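
The log-likelihood and priors above translate fairly directly into R. The following is a minimal sketch, not a model solution: the function names `log_lik` and `log_prior` and the parameter ordering `theta = c(a, b, sigma2)` are illustrative assumptions, and returning `-Inf` for a non-positive variance is a design choice rather than something the exercise specifies.

```r
# Log-likelihood of the model y = a + b*x + eps, eps ~ N(0, sigma2).
# theta = c(a, b, sigma2) (assumed ordering); x, y are numeric vectors.
log_lik <- function(theta, x, y) {
  a <- theta[1]; b <- theta[2]; sigma2 <- theta[3]
  if (sigma2 <= 0) return(-Inf)  # the variance must be positive
  n <- length(y)
  -n / 2 * log(2 * pi * sigma2) - sum((y - (a + b * x))^2) / (2 * sigma2)
}

# Log-prior density: a, b ~ N(0, 10^2) and sigma2 ~ Exp(rate = 0.001),
# all independent, so the log densities add.
log_prior <- function(theta) {
  dnorm(theta[1], mean = 0, sd = 10, log = TRUE) +
    dnorm(theta[2], mean = 0, sd = 10, log = TRUE) +
    dexp(theta[3], rate = 0.001, log = TRUE)
}

# Evaluate both at theta = (-5, 8, 300) on the hills data, if MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  hills <- MASS::hills
  theta0 <- c(-5, 8, 300)
  print(round(log_lik(theta0, hills$dist, hills$time), 1))
  print(round(log_prior(theta0), 1))
}
```

Passing `x` and `y` as arguments, rather than reading `hills` inside the function, keeps the functions reusable on other data and makes them easy to check against `sum(dnorm(..., log = TRUE))`.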
1. We will use the hills dataset from the MASS package and let y = time and x = dist. Write a function
in R to calculate the log-likelihood for any given value of $a$, $b$ and $\sigma^2$, and another which calculates
the log-prior density for any parameter value. As a check, you should find that the log-likelihood at
$\theta = (-5, 8, 300)$ is $-154.5$, and the log-prior is $-14.1$.
Note that you can fit a linear model to this data set using maximum likelihood as follows:
library(MASS)
fit <- lm(time~dist, data = hills)
n <- dim(hills)[1]
sigma2 <- deviance(fit)/n

2. Find the maximum a posteriori (MAP) estimators of $a$, $b$ and $\sigma^2$, i.e., the values that maximize the
posterior (you can use numerical optimization if you wish).
3. Estimate the posterior mean of the three parameters using importance sampling. Use the prior
distribution as the proposal density, and use $N = 10^5$ particles.
4. Use the Laplace approximation to the posterior to build a multivariate Gaussian proposal density, i.e.,
use a multivariate Gaussian proposal distribution with the MAP estimate as the mean, and construct
the covariance matrix using the Hessian matrix (please report the value of the Hessian matrix you
use). Use this in an importance sampling scheme to again estimate the posterior mean (use $N = 10^5$ as
before).
5. Is this a better choice of importance distribution g than the prior distribution?
6. Calculate a 95% credibility interval for each of $a$, $b$ and $\sigma^2$, i.e., an interval for which the posterior
probability that the parameter lies within it is 0.95. Plot the marginals of the posterior distribution for $a$, $b$ and $\sigma^2$.
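
The importance sampling schemes in parts 3 to 5 can be sketched as follows. This is one possible approach under stated assumptions, not a model solution: all function names (`log_post`, `snis`, `is_prior`, `is_laplace`) are hypothetical. The sketch also makes several choices the exercise does not prescribe: self-normalised weights, Nelder-Mead optimisation started from the maximum likelihood fit, a Cholesky factor for sampling, and the effective sample size (ESS) as a rough diagnostic for comparing proposals.

```r
# Unnormalised log-posterior, row-wise for an N x 3 matrix theta = (a, b, sigma2).
log_post <- function(theta, x, y) {
  a <- theta[, 1]; b <- theta[, 2]; s2 <- theta[, 3]
  out <- rep(-Inf, nrow(theta))   # zero posterior density when sigma2 <= 0
  ok <- which(s2 > 0)
  rss <- vapply(ok, function(i) sum((y - a[i] - b[i] * x)^2), numeric(1))
  out[ok] <- -length(y) / 2 * log(2 * pi * s2[ok]) - rss / (2 * s2[ok]) +
    dnorm(a[ok], 0, 10, log = TRUE) + dnorm(b[ok], 0, 10, log = TRUE) +
    dexp(s2[ok], rate = 0.001, log = TRUE)
  out
}

# Self-normalised importance sampling: theta holds the proposal draws and
# log_g their log proposal densities.
snis <- function(theta, log_g, x, y) {
  log_w <- log_post(theta, x, y) - log_g
  w <- exp(log_w - max(log_w))    # subtract the max for numerical stability
  w <- w / sum(w)
  list(mean = colSums(w * theta), ess = 1 / sum(w^2))
}

# Proposal 1: the prior.
is_prior <- function(x, y, N = 1e5) {
  theta <- cbind(a = rnorm(N, 0, 10), b = rnorm(N, 0, 10),
                 sigma2 = rexp(N, rate = 0.001))
  log_g <- dnorm(theta[, 1], 0, 10, log = TRUE) +
    dnorm(theta[, 2], 0, 10, log = TRUE) +
    dexp(theta[, 3], rate = 0.001, log = TRUE)
  snis(theta, log_g, x, y)
}

# Proposal 2: Gaussian centred at the MAP, with covariance equal to the inverse
# Hessian of the negative log-posterior (Laplace approximation).
is_laplace <- function(x, y, N = 1e5) {
  fit <- lm(y ~ x)
  start <- unname(c(coef(fit), deviance(fit) / length(y)))  # ML estimates
  opt <- optim(start, function(th) -log_post(matrix(th, nrow = 1), x, y),
               hessian = TRUE, control = list(maxit = 5000, reltol = 1e-12))
  H <- (opt$hessian + t(opt$hessian)) / 2   # symmetrise the numerical Hessian
  R <- chol(solve(H))                       # t(R) %*% R is the covariance
  z <- matrix(rnorm(N * 3), N, 3)
  theta <- sweep(z %*% R, 2, opt$par, "+")
  log_g <- -1.5 * log(2 * pi) - sum(log(diag(R))) - rowSums(z^2) / 2
  c(snis(theta, log_g, x, y), list(map = opt$par, hessian = H))
}

# Run both schemes on the hills data, if MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  set.seed(1)
  hills <- MASS::hills
  res_prior <- is_prior(hills$dist, hills$time)
  res_laplace <- is_laplace(hills$dist, hills$time)
  print(res_prior$mean); print(res_prior$ess)
  print(res_laplace$mean); print(res_laplace$ess)
  print(res_laplace$hessian)
}
```

Gaussian draws with $\sigma^2 \le 0$ receive zero weight, which is valid because the posterior has no mass there. A larger ESS suggests the proposal matches the posterior more closely, which is one way to approach part 5.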
