Harare Institute of Technology


HARARE INSTITUTE OF TECHNOLOGY SST216

Biostatistics
Pharmacy, Biotechnology, Food Processing Technology
November 2017
Time : 3 hours

Instructions to Candidates

1 This question paper consists of six (6) printed pages including the cover page.

2 A formula sheet should accompany this question paper; if it is not provided, ask for it.

3 Answer ALL questions in section A and any THREE questions in section B.


SECTION A (40 marks)


Candidates may attempt ALL questions being careful to number them A1 to A4.

A1. The following data pertain to weights (in kg) of malaria patients at a clinic in Harare.

44 45.2 62.5 30 75 102.5 48.2 73.5
85 66 58.2 61.5 77 53.4 49 118

(a) Calculate the median, the first quartile and the third quartile. [1,2,2]
(b) Construct a box plot of the weights. [4]
(c) Comment on whether or not there are any outliers. [1]

A2. (a) Suppose that a new screening test is proposed for the detection of fracture. The
prevalence of fracture in the general population is 10%. The test has been investigated
in subjects with fracture and was found to give a positive result in 80% of such
cases (sensitivity). When given to subjects without fracture, the test yielded a
positive result in 5% of cases. What proportion of subjects with a positive test
will, when followed up, actually be found to have a fracture? [5]
(b) Treatment Y causes a toxic reaction in 20% of persons to whom it is administered.
If 5 persons on treatment Y are selected at random, calculate the probability that:
(i) exactly 2 of them will have a toxic reaction. [2]
(ii) at most 2 of them will have a toxic reaction. [3]
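The arithmetic behind A2 can be checked with a short sketch. It assumes Bayes' theorem for part (a) and a binomial model with n = 5 and p = 0.20 for part (b); the function names are illustrative and not part of the paper.

```python
from math import comb


def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' theorem."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * false_positive_rate
    return true_positive / (true_positive + false_positive)


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# A2(a): prevalence 10%, sensitivity 80%, positive rate among non-cases 5%
ppv = positive_predictive_value(0.10, 0.80, 0.05)

# A2(b): 5 persons, each with a 20% chance of a toxic reaction
p_exactly_2 = binomial_pmf(2, 5, 0.20)
p_at_most_2 = sum(binomial_pmf(k, 5, 0.20) for k in range(3))

print(f"PPV = {ppv:.4f}")
print(f"P(X = 2) = {p_exactly_2:.4f}, P(X <= 2) = {p_at_most_2:.5f}")
```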

A3. (a) A hospital researcher wishes to estimate the mean weight of full-term newborns.
A random sample of 20 full-term newborns had a mean birth weight of 2.7 kg with
a variance of 0.5 kg². Compute a 95% confidence interval for the mean weight of
all full-term newborns. [3]
(b) A study is conducted over 10 days to investigate long term complication rates in
patients treated with two different therapies. The data are shown below:

           Complications
Therapy    No     Yes
1          911    89
2          873    127

(i) Compute a 95% confidence interval for the difference in the proportions of
patients with complications. [5]
(ii) Based on (b)(i) above, is there a significant difference in proportions? [2]
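For A3(b), one possible check is the large-sample (Wald) interval for a difference in proportions. This is a sketch under the assumption that the normal approximation is the intended method; the function name and the choice of therapy 1 minus therapy 2 are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Wald confidence interval for p1 - p2 (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Complications: 89 of 1000 on therapy 1, 127 of 1000 on therapy 2
lower, upper = diff_proportion_ci(89, 911 + 89, 127, 873 + 127)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")

# If the interval excludes 0, the difference is significant at the 5% level.
excludes_zero = not (lower <= 0 <= upper)
```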


A4. A small study is conducted involving 15 infants to investigate the association between
gestational age at birth, measured in weeks and birth weight (in kg).
Gestational age  34.7  36    29.3  40.1  35.7  42.4  40.3  37.3  40.9
Birth weight     1.9   2.03  1.44  2.84  3.09  3.83  3.26  2.69  3.29
Gestational age  38.3  38.5  41.4  39.7  39.7  41.1
Birth weight     2.92  3.43  3.66  3.69  3.35  3.26
(a) Calculate the correlation coefficient between gestational age and birth weight. [4]
(b) Test the hypothesis H0 : ρ = 0 versus H1 : ρ ≠ 0 at the 5% level of significance,
where ρ is the population correlation coefficient between gestational age and birth
weight. [6]
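A minimal sketch for A4, assuming the Pearson product-moment coefficient and the usual test statistic t = r√(n − 2)/√(1 − r²) on n − 2 degrees of freedom. The variable and function names are illustrative; the critical value still has to be read from a t table.

```python
from math import sqrt

gestational_age = [34.7, 36, 29.3, 40.1, 35.7, 42.4, 40.3, 37.3, 40.9,
                   38.3, 38.5, 41.4, 39.7, 39.7, 41.1]
birth_weight = [1.9, 2.03, 1.44, 2.84, 3.09, 3.83, 3.26, 2.69, 3.29,
                2.92, 3.43, 3.66, 3.69, 3.35, 3.26]


def pearson_r(x, y):
    """Sample correlation coefficient from centred sums of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


n = len(gestational_age)
r = pearson_r(gestational_age, birth_weight)

# Test statistic for H0: rho = 0, compared with t on n - 2 = 13 df
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)

print(f"r = {r:.4f}, t = {t_stat:.3f} on {n - 2} df")
```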

SECTION B (60 marks)


Candidates may attempt THREE questions being careful to number them B5 to B10.

B5. The following data pertain to 11 randomly selected people’s maximum heart rate (Y)
and their age in years (X).
Age 30 38 41 38 29 39 46 41 42 24 49
Rate 186 183 171 177 191 177 174 176 171 196 168
(a) Plot Y versus X and comment on the relationship between X and Y. [3]
(b) Fit a simple linear regression model of the form Yi = β0 + β1 xi + εi . [4]
(c) Interpret β1 in the context of this problem. [2]
(d) Construct an analysis of variance table and test for the significance of regression
at the 5% level of significance. [7]
(e) Calculate R² and comment on it. [2]
(f) Predict the maximum heart rate of a person who is 43 years old. [2]
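The least-squares fit asked for in B5 can be sketched as follows, assuming the ordinary estimates b1 = Sxy/Sxx and b0 = ȳ − b1·x̄. The names `b0`, `b1`, `sse` and `r_squared` are illustrative only.

```python
age = [30, 38, 41, 38, 29, 39, 46, 41, 42, 24, 49]
rate = [186, 183, 171, 177, 191, 177, 174, 176, 171, 196, 168]

n = len(age)
mean_x, mean_y = sum(age) / n, sum(rate) / n

# Centred sums of squares and cross-products
sxx = sum((x - mean_x) ** 2 for x in age)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(age, rate))
syy = sum((y - mean_y) ** 2 for y in rate)

b1 = sxy / sxx              # estimated slope
b0 = mean_y - b1 * mean_x   # estimated intercept

# Residual sum of squares and coefficient of determination
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(age, rate))
r_squared = 1 - sse / syy

predicted_at_43 = b0 + b1 * 43

print(f"fitted line: rate = {b0:.3f} + ({b1:.4f}) * age")
print(f"R^2 = {r_squared:.4f}, prediction at age 43 = {predicted_at_43:.2f}")
```

The same sums of squares (regression = Syy − SSE on 1 df, residual = SSE on n − 2 df) fill in the analysis of variance table for part (d).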

B6. Calcium is an important mineral that regulates the heart, is important for blood
clotting and for building healthy bones. A study is designed to test whether there is a
difference in mean daily calcium intake in adults with normal bone density, adults with
osteopenia (a low bone density which may lead to osteoporosis) and adults with
osteoporosis. Adults 60 years of age with normal bone density, osteopenia and osteoporosis
are selected at random from hospital records and their daily calcium intake is measured
based on reported food intake and supplements.
Normal bone density   Osteopenia   Osteoporosis
1200                  100          680
1000                  1100         400
980                   700          550
900                   800          420
750                   500          200
800                   700          250


(a) Is there a statistically significant difference in mean calcium intake in patients
with normal bone density as compared with osteopenia and osteoporosis? Use
α = 0.05. [10]
(b) Use Fisher’s LSD to compare the mean calcium intake of patients with normal bone
density with that of patients with osteopenia. [5]
(c) Use the Scheffé procedure to compare the mean calcium intake of patients with
osteopenia with that of patients with osteoporosis. [5]

B7. (a) Genetic counselors work with pregnant women (usually women at high risk of
fetal abnormalities or those who might not be at high risk but screen positive for
abnormalities based on standard screening tests) and hypothesize that about half
of all abnormalities are trisomy 21, one third are trisomy 18 and the remainder
are trisomy 13. A sample of 200 women who delivered babies with abnormalities
was included in the study.

                  trisomy 21   trisomy 18   trisomy 13
number of women   107          70           23

Based on the data, is the assumption regarding the distribution of abnormalities
appropriate? [8]
(b) Consider the study in (a) above. Suppose we wish to test if there is a difference in
distribution of abnormalities among clinical centres. Based on the following data,
is there evidence of a significant relationship between the types of abnormalities
and clinical centre? Use α = 0.05. [12]

            Type of abnormality
            trisomy 21   trisomy 18   trisomy 13   Total
Centre A    107          70           23           200
Centre B    65           62           23           150
Centre C    60           32           8            100
Total       232          164          54           450
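Both parts of B7 rest on the Pearson chi-square statistic Σ(O − E)²/E. The sketch below assumes a goodness-of-fit test for (a) with hypothesised proportions 1/2, 1/3 and 1/6 (the remainder), and a test of independence for (b) with expected counts row total × column total / grand total. Function names are illustrative; critical values come from a chi-square table.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic for matched lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# (a) Goodness of fit: 200 women, hypothesised split 1/2, 1/3, remainder 1/6
observed_a = [107, 70, 23]
proportions = [1 / 2, 1 / 3, 1 - 1 / 2 - 1 / 3]
expected_a = [sum(observed_a) * p for p in proportions]
chi2_a = chi_square(observed_a, expected_a)   # df = 3 - 1 = 2

# (b) Independence: centres (rows) by type of abnormality (columns)
table = [[107, 70, 23],
         [65, 62, 23],
         [60, 32, 8]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

observed_b = [count for row in table for count in row]
expected_b = [r * c / grand_total for r in row_totals for c in col_totals]
chi2_b = chi_square(observed_b, expected_b)   # df = (3 - 1)(3 - 1) = 4

print(f"(a) chi-square = {chi2_a:.3f} on 2 df")
print(f"(b) chi-square = {chi2_b:.3f} on 4 df")
```

At α = 0.05 the tabulated critical values are 5.991 for 2 df and 9.488 for 4 df. Part (a) does not state a significance level, so 0.05 there is an assumption.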

B8. (a) According to tables provided by the National Centre for Health Statistics in Vital
Statistics of the United States, a person of age 20 years has about an 80% chance
of being alive at the age of 65 years. Suppose that 500 people of age 20 years are
selected at random. Find the probability that:
(i) Exactly 390 of them will be alive at age 65. [2]
(ii) Between 375 and 425 of them, inclusive, will be alive at age 65. [4]
(b) The probability of contracting HIV in a single sexual encounter is 1 in 500. Assume
each encounter is independent and carries the same risk. What is the probability
of contracting HIV on the 5th encounter? [4]


(c) Suppose that births in a hospital occur randomly at an average rate of 1.5 births
per hour.
(i) What is the probability of observing at most 2 births per hour? [3]
(ii) What is the probability of observing five births in a given two-hour period? [3]
(iii) What is the probability that at least two hours will elapse between births? [4]
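Parts (b) and (c) of B8 can be checked numerically. The sketch assumes that (b) means the first infection occurs on the 5th encounter (a geometric model), that births follow a Poisson process, and that the waiting time between births is therefore exponential. Function names are illustrative.

```python
from math import exp, factorial


def geometric_pmf(k, p):
    """P(first success on trial k) with success probability p."""
    return (1 - p) ** (k - 1) * p


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


rate_per_hour = 1.5

# (b) first infection on the 5th encounter, risk 1 in 500 per encounter
p_fifth = geometric_pmf(5, 1 / 500)

# (c)(i) at most 2 births in one hour
p_at_most_2 = sum(poisson_pmf(k, rate_per_hour) for k in range(3))

# (c)(ii) exactly 5 births in two hours: the mean scales to 1.5 * 2 = 3
p_five_in_two_hours = poisson_pmf(5, rate_per_hour * 2)

# (c)(iii) gap of at least 2 hours = no births in a 2-hour window
p_gap_at_least_2h = exp(-rate_per_hour * 2)

print(f"(b) {p_fifth:.6f}")
print(f"(c)(i) {p_at_most_2:.4f}  (ii) {p_five_in_two_hours:.4f}  "
      f"(iii) {p_gap_at_least_2h:.4f}")
```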

B9. The following data pertain to length of stay (in days) in a hospital by TB patients.

Number of days   1-10   11-20   21-30   31-40   41-50   51-60
Frequency        4      3       10      7       6       5

Calculate

(a) The mean length of stay in hospital by TB patients. [2]


(b) The median length of stay. [4]
(c) The mode. [4]
(d) Variance and standard deviation. [5,1]
(e) Mean absolute deviation. [4]

B10. A soft drink bottler is analysing the vending machine service routes in his destination
system. He is interested in predicting the amount of time required by the route driver
to service the vending machines in an outlet. This service activity includes stocking
the machine with beverage products and minor maintenance or housekeeping. The
industrial engineer responsible for the study has suggested that the two most important
variables affecting the delivery time (y) are the number of cases of product stocked (x1 )
and distance walked by the route driver (x2 ). The following data was recorded:

y 16.68 11.5 12.03 14.88 13.75 18.11 8.00 17.83 79.24 21.5 40.33 21
x1 7 3 3 4 6 7 2 7 30 5 16 10
x2 500 220 340 80 150 330 110 210 1460 605 688 215

y 13.5 19.75 24 29 15.36 19 9.5 35.1 17.9 52.32 18.75 19.83 10.75
x1 4 6 9 10 6 7 3 17 10 26 9 8 4
x2 255 462 448 776 200 132 36 770 140 810 450 635 150

Assume a regression model of the form yi = β0 + β1 x1i + β2 x2i + εi .

(a) Write down the commands necessary to fit the above model in R. [5]
(b) After running your model in R, the following output was obtained:


Call:
lm(formula = y ~ x1 + x2, data = data1)

Residuals:
    Min      1Q  Median      3Q     Max
-5.8346 -0.8747  0.4535  1.1912  7.2886

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.35126    1.07550   2.186 0.039727 *
x1           1.59279    0.16962   9.390 3.75e-09 ***
x2           0.01494    0.00360   4.151 0.000417 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.202 on 22 degrees of freedom
Multiple R-squared:  0.961, Adjusted R-squared:  0.9575
F-statistic: 271.1 on 2 and 22 DF,  p-value: 3.163e-16

Interpret the above output, concentrating on: the fitted equation, significance of
parameter estimates and the interpretation of R². [6]
(c) Estimate the delivery time (y) if the number of cases stocked is 25 and distance
is 780. [2]
(d) Based on the above output, test for the significance of regression. [4]
(e) Based on the following residual plots, comment on the adequacy of the regression
model used. [3]
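For B10(c) and (d), the numbers printed in the R output are enough for a quick check. The sketch below simply plugs the reported coefficients into the fitted equation and rebuilds the overall F statistic from R² via F = (R²/k) / ((1 − R²)/(n − k − 1)); the variable names are illustrative and the inputs are the rounded values from the output.

```python
# Coefficient estimates as printed in the R output
intercept, coef_x1, coef_x2 = 2.35126, 1.59279, 0.01494


def predict_delivery_time(cases, distance):
    """Fitted value y-hat = b0 + b1 * x1 + b2 * x2."""
    return intercept + coef_x1 * cases + coef_x2 * distance


# (c) 25 cases stocked, distance walked 780
y_hat = predict_delivery_time(25, 780)

# (d) overall F statistic rebuilt from R-squared
r_squared = 0.961          # Multiple R-squared from the output
k = 2                      # number of predictors
residual_df = 22           # n - k - 1 with n = 25 observations
f_stat = (r_squared / k) / ((1 - r_squared) / residual_df)

print(f"predicted delivery time = {y_hat:.2f}")
print(f"F = {f_stat:.1f} on {k} and {residual_df} df")
```

Because R² is rounded to three decimals, the rebuilt F agrees with the reported 271.1 only approximately.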

END OF QUESTION PAPER
