SC 517 - Nonparametric & Categorical Data Analysis
Assignment #2

Due Date: On or before 13th May 2022 at 11.59 pm. Answer All Questions.

1. The data in the following table come from a study that compares intra-muscular magnesium
injections with placebo for the treatment of chronic fatigue syndrome.

                      Magnesium   Placebo   Total

Felt better                   7         2       9
Did not feel better           4         9      13
Total                        11        11      22

Of the 11 patients who had the intra-muscular magnesium injections, 7 felt better, whereas
of the 11 on placebo, only 2 felt better.

(i) Test whether there is a significant difference between the two treatments with respect to
feeling better or not at the 5% significance level. (Show all your work.)
(ii) Using R, test whether intra-muscular magnesium is better than placebo for the treatment
of chronic fatigue syndrome at the 5% significance level. (Include the R code you used
and show the output.)
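As a cross-check on the hand calculation in part (i) and the R output in part (ii), the two usual tests for a 2x2 table can be sketched in plain Python. This is only an illustrative sketch, not the required R solution: the function names `chi_square_2x2` and `fisher_greater` are made up for this example, and the choice between the Pearson chi-square test (with or without Yates' correction) and Fisher's exact test is an assumption left to the reader. Note that the smallest expected count here is 9 x 11 / 22 = 4.5.

```python
from math import comb, erfc, sqrt

# 2x2 table from Question 1
# rows: felt better / did not feel better; columns: magnesium / placebo
a, b = 7, 2
c, d = 4, 9


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square statistic and p-value (1 df) for a 2x2 table."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction shrinks |ad - bc| by n/2
        diff = max(diff - n / 2, 0.0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact p-value: P(top-left cell >= a), margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / total


stat, p_chi = chi_square_2x2(a, b, c, d)
stat_yates, p_yates = chi_square_2x2(a, b, c, d, yates=True)
p_fisher = fisher_greater(a, b, c, d)

print(f"Pearson chi-square: {stat:.3f}, p = {p_chi:.4f}")
print(f"With Yates' correction: {stat_yates:.3f}, p = {p_yates:.4f}")
print(f"Fisher exact (one-sided, magnesium better): p = {p_fisher:.4f}")
```

For the table above, the uncorrected statistic is about 4.70 and the one-sided Fisher p-value is about 0.040; these figures should agree with the R output up to rounding.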

2. The following table shows results from a matched case-control study. A study of effects on
birth weight matched each case in which the child was underweight with a control in which the
child had normal weight. The mothers, who were matched according to their age, were asked
whether they were smokers.

Test, at the 5% significance level, whether the proportion of smokers among mothers of
low-birth-weight children equals the proportion of smokers among mothers of
normal-birth-weight children.
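For matched pairs such as these, one common choice is McNemar's test, which uses only the discordant pairs (pairs in which exactly one of the two mothers smoked). The table for this question is not reproduced in this copy, so the sketch below uses clearly HYPOTHETICAL counts, and the function name `mcnemar` is an assumption made for illustration.

```python
from math import comb, erfc, sqrt


def mcnemar(b, c):
    """McNemar's test from the two discordant-pair counts b and c.

    Returns the chi-square statistic (1 df, no continuity correction),
    its asymptotic p-value, and an exact two-sided binomial p-value.
    """
    n = b + c
    if n == 0:
        raise ValueError("McNemar's test needs at least one discordant pair")
    stat = (b - c) ** 2 / n
    p_chi = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    # Under H0 each discordant pair falls in either cell with probability 1/2
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    p_exact = min(1.0, 2 * tail)
    return stat, p_chi, p_exact


# HYPOTHETICAL discordant counts, NOT taken from the assignment's table:
# b = pairs where only the case mother smoked,
# c = pairs where only the control mother smoked.
b_example, c_example = 8, 3
stat, p_chi, p_exact = mcnemar(b_example, c_example)
print(f"McNemar statistic = {stat:.3f}, p = {p_chi:.4f}, exact p = {p_exact:.4f}")
```

Replace the placeholder counts with the discordant cells of the actual table before drawing any conclusion.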

3. A study investigated characteristics associated with y = whether a cancer patient achieved
remission (1 = yes, 0 = no). An important explanatory variable was a labeling index (LI =
percentage of “labeled” cells) that measures proliferative activity of cells after a patient receives
an injection of tritiated thymidine. The data and R output for a logistic regression model are
shown below.

(i) Explain why a logistic regression model is suitable to model these data.
(ii) State the model in terms of the unknown population parameters and the explanatory
variable being considered.
(iii) State the form of the estimated model in logit terms.
(iv) Compute the median effective level.
(v) When LI increases by 1, show that the estimated odds of remission multiply by 1.16.
(vi) Summarize the LI effect by how P̂ (Y = 1) changes over the range or interquartile range
of LI values.
(vii) Test whether LI has a significant effect on achieving remission at the 5% level of
significance, and state your conclusions clearly.
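The quantities asked for in parts (iv) to (vi) all follow from the fitted logit, logit[P(Y = 1)] = alpha + beta * LI. The sketch below shows the arithmetic under stated assumptions: the R output is not reproduced in this copy, so `alpha_hat` is a HYPOTHETICAL placeholder, and `beta_hat` is back-calculated from the rounded multiplier 1.16 quoted in part (v), so it is only approximate. The function names are invented for this example.

```python
from math import exp, log


def remission_probability(li, alpha, beta):
    """Estimated P(Y = 1) at a given LI under the logistic model."""
    return 1.0 / (1.0 + exp(-(alpha + beta * li)))


def median_effective_level(alpha, beta):
    """LI value at which the estimated probability equals 0.5."""
    return -alpha / beta


def odds_multiplier(beta, delta=1.0):
    """Factor by which the estimated odds change when LI rises by delta."""
    return exp(beta * delta)


# ASSUMPTION: slope implied by the rounded figure 1.16 in part (v);
# read the actual estimate from the R output.
beta_hat = log(1.16)
# HYPOTHETICAL placeholder intercept, NOT taken from the R output.
alpha_hat = -4.0

el50 = median_effective_level(alpha_hat, beta_hat)
print(f"Odds multiplier per unit LI: {odds_multiplier(beta_hat):.2f}")
print(f"Median effective level (placeholder intercept): {el50:.1f}")
print(f"P(Y = 1) at that LI: {remission_probability(el50, alpha_hat, beta_hat):.2f}")
```

For part (vi), evaluating `remission_probability` at the lower and upper quartiles (or the minimum and maximum) of LI and taking the difference summarizes the effect over that range.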

4. Sudden death is an important, lethal cardiovascular endpoint. Most previous studies of risk
factors for sudden death have focused on men. Looking at this issue for women is important
as well. For this purpose, data were used from the Framingham Heart Study. Several potential
risk factors, such as age, blood pressure and cigarette smoking are of interest and need to be
controlled for simultaneously. Therefore a multiple logistic regression was fitted to these data
as shown below. The response is 2-year incidence of sudden death in females without prior
coronary heart disease.

(i) Assess the statistical significance of the individual risk factors and explain the practical
implications of your findings.
(ii) Give brief interpretations of the age and vital capacity coefficients.
(iii) Provide a 95% confidence interval for the odds ratio for the variable ‘age’.
(iv) Predict the probability of sudden death for a 50-year-old woman with a systolic blood
pressure of 120 mmHg, a relative weight of 100%, a cholesterol level of 250 mg/100mL, a
glucose level of 100 mg/100mL, a hematocrit of 40%, and a vital capacity of 450 centiliters
who smokes 10 cigarettes per day. (Note that these numbers are near average for a healthy
woman except for the cholesterol level which is high, and of course the number of cigarettes
