Harare Institute of Technology
Biostatistics
Pharmacy, Biotechnology, Food Processing Technology
November 2017
Time: 3 hours
Instructions to Candidates
1. This question paper consists of six (6) printed pages, including the cover page.
2. A formula sheet should accompany this question paper; if it is not provided, ask for it.
SST216
A1. The following data pertain to weights (in kg) of malaria patients at a clinic in Harare.
(a) Calculate the median, the first quartile and the third quartile. [1,2,2]
(b) Construct a box plot of the weights. [4]
(c) Comment on whether or not there are any outliers. [1]
A2. (a) Suppose that a new screening test is proposed for the detection of fracture. The
prevalence of fracture in the general population is 10%. The test has been investigated
in subjects with fracture and was found to give a positive result in 80% of such
cases (sensitivity). When given to subjects without fracture, the test yielded a
positive result in 5% of cases. What proportion of subjects with a positive test will,
when followed up, actually be found to have a fracture? [5]
(b) Treatment Y causes a toxic reaction in 20% of persons to whom it is administered.
If 5 persons on treatment Y are selected at random, calculate the probability that:
(i) exactly 2 of them will have a toxic reaction. [2]
(ii) at most 2 of them will have a toxic reaction. [3]
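A numerical check for A2 could be sketched as follows. Part (a) applies Bayes' theorem to the stated prevalence, sensitivity and false-positive rate; part (b) treats the number of toxic reactions as Binomial(5, 0.2). The variable names and the helper `binom_pmf` are illustrative only and are not part of the paper.

```python
from math import comb

# A2(a): proportion of positive tests that are true fractures (Bayes' theorem).
prevalence = 0.10           # P(fracture)
sensitivity = 0.80          # P(positive | fracture)
false_positive_rate = 0.05  # P(positive | no fracture)

# Total probability of a positive test, then the posterior P(fracture | positive).
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
ppv = sensitivity * prevalence / p_positive


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper for this sketch."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# A2(b): X = number of toxic reactions among 5 patients, X ~ Binomial(5, 0.2).
p_exactly_two = binom_pmf(2, 5, 0.2)
p_at_most_two = sum(binom_pmf(k, 5, 0.2) for k in range(3))

print(f"P(fracture | positive) = {ppv:.4f}")
print(f"P(X = 2)  = {p_exactly_two:.4f}")
print(f"P(X <= 2) = {p_at_most_two:.5f}")
```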
A3. (a) A hospital researcher wishes to estimate the mean weight of full-term newborns.
A random sample of 20 full-term newborns had a mean birth weight of 2.7 kg with
a variance of 0.5 kg². Compute a 95% confidence interval for the mean weight of
all full-term newborns. [3]
(b) A study is conducted over 10 days to investigate long-term complication rates in
patients treated with two different therapies. The data are shown below:

             Complications
Therapy      No      Yes
1            911     89
2            873     127

(i) Compute a 95% confidence interval for the difference in the proportions of
patients with complications. [5]
(ii) Based on (b)(i) above, is there a significant difference in proportions? [2]
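The two intervals in A3 could be checked with a short script such as the one below. It is a sketch under two assumptions that the paper does not state: the t critical value 2.093 (19 degrees of freedom) is hard-coded from standard tables, as the formula sheet would supply it, and part (b) uses the usual large-sample (Wald) interval for a difference in proportions.

```python
from math import sqrt
from statistics import NormalDist

# A3(a): t-interval for the mean, with n = 20, mean 2.7 and variance 0.5.
n, sample_mean, sample_var = 20, 2.7, 0.5
t_crit = 2.093  # assumption: t(0.975, df = 19) taken from standard t tables
margin = t_crit * sqrt(sample_var / n)
ci_mean = (sample_mean - margin, sample_mean + margin)

# A3(b)(i): large-sample interval for p1 - p2 (proportion with complications).
n1, n2 = 911 + 89, 873 + 127
p1, p2 = 89 / n1, 127 / n2
z = NormalDist().inv_cdf(0.975)  # about 1.96
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
ci_diff = (diff - z * se, diff + z * se)

# A3(b)(ii): the difference is significant at 5% if the interval excludes zero.
excludes_zero = not (ci_diff[0] <= 0 <= ci_diff[1])

print(f"95% CI for the mean: ({ci_mean[0]:.3f}, {ci_mean[1]:.3f})")
print(f"95% CI for p1 - p2:  ({ci_diff[0]:.4f}, {ci_diff[1]:.4f})")
print("Interval excludes zero:", excludes_zero)
```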
A4. A small study is conducted involving 15 infants to investigate the association between
gestational age at birth (in weeks) and birth weight (in kg).

Gestational age  34.7  36    29.3  40.1  35.7  42.4  40.3  37.3  40.9
Birth weight     1.9   2.03  1.44  2.84  3.09  3.83  3.26  2.69  3.29

Gestational age  38.3  38.5  41.4  39.7  39.7  41.1
Birth weight     2.92  3.43  3.66  3.69  3.35  3.26

(a) Calculate the correlation coefficient between gestational age and birth weight. [4]
(b) Test the hypothesis H0 : ρ = 0 versus H1 : ρ ≠ 0 at the 5% level of significance,
where ρ is the population correlation coefficient between gestational age and birth
weight. [6]
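The calculation in A4 could be sketched as below: the sample correlation coefficient from the sums of squares, followed by the statistic t = r√(n − 2)/√(1 − r²) on n − 2 degrees of freedom. The critical value 2.160 for 13 degrees of freedom is an assumption hard-coded from standard t tables, and the variable names are illustrative.

```python
from math import sqrt

gestational_age = [34.7, 36, 29.3, 40.1, 35.7, 42.4, 40.3, 37.3, 40.9,
                   38.3, 38.5, 41.4, 39.7, 39.7, 41.1]
birth_weight = [1.9, 2.03, 1.44, 2.84, 3.09, 3.83, 3.26, 2.69, 3.29,
                2.92, 3.43, 3.66, 3.69, 3.35, 3.26]

n = len(gestational_age)
mean_x = sum(gestational_age) / n
mean_y = sum(birth_weight) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - mean_x) ** 2 for x in gestational_age)
syy = sum((y - mean_y) ** 2 for y in birth_weight)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(gestational_age, birth_weight))

# (a) Pearson correlation coefficient.
r = sxy / sqrt(sxx * syy)

# (b) Test of H0: rho = 0 against a two-sided alternative.
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)
t_crit = 2.160  # assumption: t(0.975, df = 13) taken from standard t tables
reject_h0 = abs(t_stat) > t_crit

print(f"r = {r:.3f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```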
B5. The following data pertain to 11 randomly selected people’s maximum heart rate (Y)
and their age in years (X).
Age 30 38 41 38 29 39 46 41 42 24 49
Rate 186 183 171 177 191 177 174 176 171 196 168
(a) Plot Y versus X and comment on the relationship between X and Y. [3]
(b) Fit a simple linear regression model of the form Yi = β0 + β1 xi + εi . [4]
(c) Interpret β1 in the context of this problem. [2]
(d) Construct an analysis of variance table and test for the significance of regression
at the 5% level of significance. [7]
(e) Calculate R² and comment on it. [2]
(f) Predict the maximum heart rate of a person who is 43 years old. [2]
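Parts (b), (d), (e) and (f) of B5 could be checked with the least-squares sketch below, which computes the slope, intercept, ANOVA sums of squares, R² and the prediction at age 43 directly from the data. The variable names are illustrative, and the F statistic would still need to be compared with the tabulated F(1, 9) critical value from the formula sheet.

```python
age = [30, 38, 41, 38, 29, 39, 46, 41, 42, 24, 49]
rate = [186, 183, 171, 177, 191, 177, 174, 176, 171, 196, 168]

n = len(age)
mean_x = sum(age) / n
mean_y = sum(rate) / n

sxx = sum((x - mean_x) ** 2 for x in age)
syy = sum((y - mean_y) ** 2 for y in rate)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(age, rate))

# (b) Least-squares estimates of the slope and intercept.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# (d) ANOVA decomposition: total = regression + residual.
ss_regression = slope * sxy
ss_residual = syy - ss_regression
f_stat = ss_regression / (ss_residual / (n - 2))  # 1 and n - 2 degrees of freedom

# (e) Coefficient of determination.
r_squared = ss_regression / syy

# (f) Predicted maximum heart rate at age 43.
predicted_at_43 = intercept + slope * 43

print(f"fitted line: rate = {intercept:.2f} + ({slope:.3f}) * age")
print(f"F = {f_stat:.2f}, R^2 = {r_squared:.3f}")
print(f"predicted rate at age 43 = {predicted_at_43:.1f}")
```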
B6. Calcium is an important mineral that regulates the heart, is important for blood
clotting and for building healthy bones. A study is designed to test whether there is a
difference in mean daily calcium intake in adults with normal bone density, adults with
osteopenia (a low bone density which may lead to osteoporosis) and adults with
osteoporosis. Adults 60 years of age with normal bone density, osteopenia and osteoporosis
are selected at random from hospital records and their daily calcium intake is measured
based on reported food intake and supplements.

Normal bone density   Osteopenia   Osteoporosis
1200                  100          680
1000                  1100         400
980                   700          550
900                   800          420
750                   500          200
800                   700          250
B7. (a) Genetic counselors work with pregnant women (usually women at high risk of
fetal abnormalities or those who might not be at high risk but screen positive for
abnormalities based on standard screening tests) and hypothesize that about half
of all abnormalities are trisomy 21, one third are trisomy 18 and the remainder
are trisomy 13. A sample of 200 women who deliver babies with abnormalities
were included in the study.
[12]
B8. (a) According to tables provided by the National Centre for Health Statistics in Vital
Statistics of the United States, a person of age 20 years has about an 80% chance
of being alive at the age of 65 years. Suppose that 500 people of age 20 years are
selected at random. Find the probability that:
(i) Exactly 390 of them will be alive at age 65. [2]
(ii) Between 375 and 425 of them, inclusive, will be alive at age 65. [4]
(b) The probability of contracting HIV in a single sexual encounter is 1 in 500. Assume
each encounter is independent and carries the same risk. What is the probability
of contracting HIV on the 5th encounter? [4]
(c) Suppose that births in a hospital occur randomly at an average rate of 1.5 births
per hour.
(i) What is the probability of observing at most 2 births per hour? [3]
(ii) What is the probability of observing five births in a given two-hour period? [3]
(iii) What is the probability that at least two hours will elapse between births? [4]
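The probabilities in B8 could be checked numerically as sketched below. Part (a) gives both the exact binomial values and the normal approximation with continuity correction. Part (b) assumes that "on the 5th encounter" means the first infection occurs at the fifth encounter (a geometric model). Part (c) uses the Poisson model and, for (iii), the fact that a gap of at least two hours means no births in a two-hour window. The helper names are illustrative only.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper for this sketch."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam). Hypothetical helper for this sketch."""
    return exp(-lam) * lam**k / factorial(k)


# (a) X = number alive at 65 out of 500, X ~ Binomial(500, 0.8).
n, p = 500, 0.8
p_exactly_390 = binom_pmf(390, n, p)
p_375_to_425 = sum(binom_pmf(k, n, p) for k in range(375, 426))

# Normal approximation with continuity correction, as done by hand with tables.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
p_375_to_425_normal = approx.cdf(425.5) - approx.cdf(374.5)

# (b) Assumption: first infection on the 5th encounter, so 4 escapes then 1 infection.
risk = 1 / 500
p_fifth_encounter = (1 - risk) ** 4 * risk

# (c) Births at 1.5 per hour, so the rate over two hours is 3.
p_at_most_2_per_hour = sum(poisson_pmf(k, 1.5) for k in range(3))
p_five_in_two_hours = poisson_pmf(5, 3.0)
p_gap_at_least_two_hours = exp(-1.5 * 2)  # no births in a two-hour window

print(f"(a)(i)   {p_exactly_390:.4f}")
print(f"(a)(ii)  exact {p_375_to_425:.4f}, normal approx {p_375_to_425_normal:.4f}")
print(f"(b)      {p_fifth_encounter:.6f}")
print(f"(c)(i)   {p_at_most_2_per_hour:.4f}")
print(f"(c)(ii)  {p_five_in_two_hours:.4f}")
print(f"(c)(iii) {p_gap_at_least_two_hours:.4f}")
```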
B9. The following data pertain to the length of stay (in days) in a hospital by TB patients.

Number of days  1-10  11-20  21-30  31-40  41-50  51-60
Frequency       4     3      10     7      6      5

Calculate
B10. A soft drink bottler is analysing the vending machine service routes in his distribution
system. He is interested in predicting the amount of time required by the route driver
to service the vending machines in an outlet. This service activity includes stocking
the machine with beverage products and minor maintenance or housekeeping. The
industrial engineer responsible for the study has suggested that the two most important
variables affecting the delivery time (y) are the number of cases of product stocked (x1 )
and distance walked by the route driver (x2 ). The following data was recorded:
y 16.68 11.5 12.03 14.88 13.75 18.11 8.00 17.83 79.24 21.5 40.33 21
x1 7 3 3 4 6 7 2 7 30 5 16 10
x2 500 220 340 80 150 330 110 210 1460 605 688 215
y 13.5 19.75 24 29 15.36 19 9.5 35.1 17.9 52.32 18.75 19.83 10.75
x1 4 6 9 10 6 7 3 17 10 26 9 8 4
x2 255 462 448 776 200 132 36 770 140 810 450 635 150
(a) Write down the commands necessary to fit the above model in R. [5]
(b) After running your model in R, the following output was obtained:
Call:
lm(formula = y ~ x1 + x2, data = data1)

Residuals:
    Min      1Q  Median      3Q     Max
-5.8346 -0.8747  0.4535  1.1912  7.2886

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.35126    1.07550   2.186 0.039727 *
x1           1.59279    0.16962   9.390 3.75e-09 ***
x2           0.01494    0.00360   4.151 0.000417 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
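When reading the coefficient table, it may help to recall that each t value is the estimate divided by its standard error. The sketch below recomputes the ratios from the printed figures; the dictionary layout is illustrative, and small differences in the last digit are expected because the printed estimates and standard errors are themselves rounded.

```python
# (estimate, standard error, reported t value) copied from the output above.
coefficients = {
    "(Intercept)": (2.35126, 1.07550, 2.186),
    "x1": (1.59279, 0.16962, 9.390),
    "x2": (0.01494, 0.00360, 4.151),
}

# t value = estimate / standard error for each term in the model.
t_values = {name: est / se for name, (est, se, _) in coefficients.items()}

for name, (_, _, reported) in coefficients.items():
    print(f"{name:<12} recomputed t = {t_values[name]:.3f}, reported t = {reported}")
```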