
STA3002: Generalized Linear Models Spring 2023

Homework #5
This homework covers weeks 7 and 8. It is due at midnight on March 29th, 2023. Late submissions
will automatically receive a grade of “0”.

1 Computation Questions
(Question 1): (15 points) Let $Y_i$ be the number of successes in $n_i$ trials with

$$Y_i \sim \mathrm{Bin}(n_i, \pi_i)$$

where the probabilities $\pi_i$ have a Beta distribution

$$\pi_i \sim \mathrm{Beta}(\alpha, \beta).$$

The probability density function for the beta distribution is

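$$f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$$

for $x \in [0, 1]$, $\alpha > 0$ and $\beta > 0$, where the beta function $B(\alpha, \beta)$ is the normalizing
constant ensuring that $\int_0^1 f(x; \alpha, \beta)\,dx = 1$. Let $\theta = \alpha/(\alpha + \beta)$, and hence show that
(a) $E[\pi] = \theta$ and $\mathrm{var}(\pi) = \dfrac{\theta(1-\theta)}{\alpha+\beta+1}$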

(b) $E[Y_i] = n_i\theta$
(c) $\mathrm{var}(Y_i) = n_i\theta(1-\theta)[1 + (n_i - 1)\phi]$, where $\phi = 1/(\alpha+\beta+1)$, so that $\mathrm{var}(Y_i)$ is
larger than the binomial variance (unless $n_i = 1$ or $\phi = 0$).
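A numerical sanity check can help catch algebra slips before writing up the derivation. The sketch below is not a proof: it sums the beta-binomial probability function directly for one set of illustrative parameter values and compares the result with the formulas in (b) and (c). The parameter values and the function names are hypothetical, and the code assumes ϕ = 1/(α + β + 1), the value that makes the formula in (c) hold.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(y, n, alpha, beta):
    """P(Y = y) when Y | pi ~ Bin(n, pi) and pi ~ Beta(alpha, beta)."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(y + 1)
                  - math.lgamma(n - y + 1))
    log_p = log_choose + log_beta(y + alpha, n - y + beta) - log_beta(alpha, beta)
    return math.exp(log_p)


def beta_binomial_moments(n, alpha, beta):
    """Mean and variance obtained by summing over y = 0, ..., n."""
    probs = [beta_binomial_pmf(y, n, alpha, beta) for y in range(n + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# Illustrative (hypothetical) parameter values, not taken from the homework.
n, alpha, beta = 10, 2.0, 3.0
theta = alpha / (alpha + beta)
phi = 1.0 / (alpha + beta + 1.0)  # assumed definition of phi

mean, var = beta_binomial_moments(n, alpha, beta)
formula_mean = n * theta
formula_var = n * theta * (1 - theta) * (1 + (n - 1) * phi)

print("mean:", mean, "formula:", formula_mean)
print("variance:", var, "formula:", formula_var)
```

Agreement for one parameter setting does not establish the result in general, so the check complements the derivation rather than replacing it.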
(Question 2): (15 points) Sometimes, count data explicitly omit zero counts. Examples include
the numbers of days patients spend in hospital (only patients who actually stay overnight in
hospital are considered, and so the smallest possible count is one); the number of people per
car using a rural road (the driver at least must be in the car); and a survey of the number of
people living in each household (to respond, the households must have at least one person).
Using a Poisson distribution is inadequate, as the zero counts will be modelled as true zero
counts.
In these situations, the zero-truncated Poisson distribution may be suitable, with probability
function
$$P(y; \lambda) = \frac{e^{-\lambda}\lambda^{y}}{\{1 - \exp(-\lambda)\}\, y!}$$
where y = 1, 2, ... and λ > 0.
(a) Show that the truncated Poisson distribution is an exponential dispersion model (edm) by identifying θ and κ(θ).
(b) Show that µ = E[y] = λ/{1 − exp(−λ)}, and that µ > 1.
(c) Find the variance function for the truncated Poisson distribution.
(d) Plot the truncated Poisson distribution and the Poisson distribution for λ = 2, and
compare.
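For part (d), the two probability functions can be tabulated before plotting. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the infinite sum for the mean is truncated at y = 59, which is assumed to be far enough into the tail for λ = 2. It also checks numerically that the truncated probabilities sum to one and that the mean matches the expression in (b).

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability function for y = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zt_poisson_pmf(y, lam):
    """Zero-truncated Poisson probability function for y = 1, 2, ..."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


lam = 2.0
ys = list(range(0, 60))  # truncation point is an assumption
zt_probs = [zt_poisson_pmf(y, lam) for y in ys]

zt_total = sum(zt_probs)
zt_mean = sum(y * p for y, p in zip(ys, zt_probs))
formula_mean = lam / (1.0 - math.exp(-lam))

# Side-by-side table of the first few probabilities.
print(" y   Poisson   truncated")
for y in range(0, 9):
    print(f"{y:2d}   {poisson_pmf(y, lam):.4f}    {zt_poisson_pmf(y, lam):.4f}")

print("sum of truncated probabilities:", zt_total)
print("numerical mean:", zt_mean, "formula:", formula_mean)
```

The two columns can be passed to any plotting tool. Each truncated probability for y ≥ 1 is the Poisson probability scaled up by the same factor 1/{1 − exp(−λ)}, which is what the comparison should show.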

2 Programming Questions
1. (25 points) The Independent newspaper tabulated the gender of all candidates running for
election in the 1992 British general election (dataset: belection).
(a) Plot the proportion of female candidates against the Party, and comment.
(b) Plot the proportion of female candidates against the Region, and comment.
(c) Find a suitable binomial glm, ensuring a diagnostic analysis.
(d) Is overdispersion evident?
(e) Interpret the fitted model.
(f) Estimate and interpret the odds of a female candidate running for the Conservative and
Labour parties. Then compute the odds ratio of the Conservative party fielding a female
candidate to the odds of the Labour party fielding a female candidate.
(g) Determine if the saddlepoint approximation is likely to be suitable for these data.
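The odds and odds ratio in part (f) can be illustrated with a small arithmetic sketch. The counts below are HYPOTHETICAL and are not taken from the belection dataset; the party labels "A" and "B" and the function name `odds` are likewise placeholders. In the homework, the odds would come from the fitted binomial glm rather than from raw counts.

```python
import math


def odds(successes, total):
    """Odds of success, p / (1 - p), from a count of successes out of total."""
    p = successes / total
    return p / (1.0 - p)


# HYPOTHETICAL counts of female candidates out of all candidates.
female_a, total_a = 30, 200   # placeholder "party A"
female_b, total_b = 60, 240   # placeholder "party B"

odds_a = odds(female_a, total_a)
odds_b = odds(female_b, total_b)
odds_ratio = odds_a / odds_b

# With a logit link and a two-level factor as the only predictor, the
# coefficient of the factor equals the log of this odds ratio.
log_odds_ratio = math.log(odds_ratio)

print("odds A:", odds_a)
print("odds B:", odds_b)
print("odds ratio A to B:", odds_ratio)
print("log odds ratio:", log_odds_ratio)
```

An odds ratio below one means the odds of a female candidate are lower for the first group than for the second. When the model includes other predictors such as Region, the corresponding coefficient is an odds ratio adjusted for those predictors.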
2. (20 points) In a study of depressed women, women were classified into groups (dataset:
dwomen) based on their depression level (Depression), whether a severe life event had
occurred in the last year (SLE), and if they had three children under 14 at home (Children).
Model these counts using a Poisson glm, and summarize the data if possible. (Hint: consider
loglinear model for contingency tables, and explain the dependence among three factors.)
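As a starting point for the hint, the simplest loglinear model for a three-way table is mutual independence, which contains only the three main effects. Its fitted values have a closed form (product of the three one-way margins divided by N²), so no iterative fitting is needed. The sketch below uses a HYPOTHETICAL 2×2×2 table, not the dwomen data, and assumes each factor has two levels; models with interaction terms generally need glm software.

```python
import math

# HYPOTHETICAL counts indexed by (depression, sle, children), each coded 0/1.
counts = {
    (0, 0, 0): 40, (0, 0, 1): 20,
    (0, 1, 0): 25, (0, 1, 1): 15,
    (1, 0, 0): 10, (1, 0, 1): 8,
    (1, 1, 0): 18, (1, 1, 1): 14,
}
total = sum(counts.values())


def margin(axis, level):
    """One-way marginal total for the given factor (axis) and level."""
    return sum(v for cell, v in counts.items() if cell[axis] == level)


# Fitted values under mutual independence (main effects only).
fitted = {
    cell: margin(0, cell[0]) * margin(1, cell[1]) * margin(2, cell[2]) / total ** 2
    for cell in counts
}

# Residual deviance of the Poisson loglinear model. The (y - mu) terms
# cancel because the fitted values sum to the observed total.
deviance = 2.0 * sum(
    y * math.log(y / fitted[cell]) for cell, y in counts.items() if y > 0
)
# 8 cells minus 4 parameters (intercept plus three main effects).
df = len(counts) - (1 + 3)

print("residual deviance:", deviance, "on", df, "df")
```

A residual deviance that is large relative to its degrees of freedom suggests the factors are not mutually independent, which motivates adding two-factor interaction terms and comparing the nested models.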

3. (25 points) The number of deaths for 1969–1973 (1969–1972 for Belgium) due to cervical
cancer is tabulated (Table 10.14; data set: cervical) by age group for four different countries.
(a) Plot the data, and discuss any prominent features.
(b) Explain why an offset is useful when fitting a glm to the data.
(c) Fit a Poisson glm with Age and Country as explanatory variables. Produce the plot of
residuals against fitted values, and evaluate the model.
(d) Fit the corresponding quasi-Poisson model. Produce the plot of residuals against fitted
values, and evaluate the model.
(e) Fit the corresponding negative binomial glm. Produce the plot of residuals against fitted
values, and evaluate the model.
(f) Which model seems appropriate, if any?
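The role of the offset in part (b) and the dispersion estimate behind the quasi-Poisson model in part (d) can be sketched with an intercept-only model, which has a closed-form fit. The data below are HYPOTHETICAL (they are not the cervical data), the variable names are placeholders, and the model deliberately omits Age and Country, so this illustrates the mechanics only.

```python
import math

# HYPOTHETICAL deaths and exposure (e.g. woman-years at risk) for four groups.
deaths = [5, 20, 60, 110]
exposure = [1000.0, 2000.0, 3000.0, 4000.0]

# Intercept-only Poisson model with offset log(exposure):
#   log(mu_i) = log(exposure_i) + beta0
# The maximum likelihood estimate has the closed form below.
beta0 = math.log(sum(deaths) / sum(exposure))
fitted = [t * math.exp(beta0) for t in exposure]

# Pearson statistic and the dispersion estimate used by quasi-Poisson:
#   phi_hat = X^2 / (n - p), with p = 1 parameter here.
pearson = sum((y - m) ** 2 / m for y, m in zip(deaths, fitted))
df = len(deaths) - 1
dispersion = pearson / df

print("estimated common rate:", math.exp(beta0))
print("fitted counts:", fitted)
print("Pearson statistic:", pearson, "on", df, "df")
print("estimated dispersion:", dispersion)
```

The offset makes the model describe a rate per unit of exposure, so groups with different populations at risk are comparable. A dispersion estimate well above one for the full model with Age and Country would indicate overdispersion relative to the Poisson assumption.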
