Professional Documents
Culture Documents
MAS473 Exercises3
MAS473 Exercises3
MAS473 Exercises3
Exercises 3
The solutions to these exercises are available now on MOLE. If you would like a ‘formal’ check
on your progress, you can choose to ignore the solutions and hand in your work for marking.
Alternatively, you can first attempt the exercises, then check them against the solutions, and
then ask for feedback on anything you are still unsure of (I suggest you annotate your work). If
you have done everything correctly, I probably won’t have much to say (other than well done!).
Distance learners should email their work to me. Everyone else should hand in their work at
the lecture. Please do not submit work via MOLE or the Hicks Reception. See the MOLE
discussion board for the deadline. Please write your name (rather than registration number) on
your work.
Note that these questions are also typical exam questions, and will serve as a past exam
paper for the missing data part of the module.
1. In the lecture notes we considered exponential random variables which were right-censored
(i.e., values bigger than some threshold c were missing). Here, we will consider left-censored
exponential random variables.
Suppose the sequence of random variables X1 , . . . , Xn are iid with an exponential distri-
bution with mean λ1 , i.e., they have pdf
(
λ exp(−λx) for x ≥ 0
f (x; λ) =
0 otherwise.
Suppose further that Xi is missing if Xi < ci for some set of thresholds c1 , . . . , cn . Finally,
suppose that we are told that X1 , . . . , Xr are observed, but Xr+1 , . . . , Xn are missing. The
data can thus be summarized by the thresholds, the observed values of x, and the pattern
of missingness.
so that
E(X) = E(X|X < ci )P(X < ci ) + E(X|X ≥ ci )P(X ≥ ci )
2
as {X < ci }0 = {X ≥ ci }.
Hence, or otherwise, show that
1 − (1 + λci )e−λci
Li := E(Xi |Xi ≤ ci ) = .
λ(1 − e−λci )
2. The mice package contains the dataset mammalsleep. The data are from Allison and
Cicchetti (1976), and records information on the interrelationship between sleep, ecological,
and constitutional variables for 62 mammal species. The dataset contains missing values
on five variables. The covariates are
i) Use the commands md.pattern and md.pairs to describe the pattern of missingness
in the data.
ii) Use the mice package to create 10 imputed datasets.
iii) For each of your imputed datasets, fit a linear model to predict ts from brw and bw.
iv) Using the pool command, combine these estimates to give an expected value and a
standard error for the coefficient of bw.
v) Repeat this calculation by coding up the pooling estimates by yourself (i.e. using
Rubin’s formulas).