DEDAN KIMATHI UNIVERSITY OF TECHNOLOGY

University Examinations 2015/2016


THIRD YEAR SPECIAL/SUPPLEMENTARY EXAMINATION FOR THE DEGREE OF
BACHELOR OF SCIENCE IN ACTUARIAL SCIENCE

STA 2311 Statistical Programming II

DATE: 17TH MARCH 2016 TIME: 8.30 AM − 10.30 AM


Instructions: Answer QUESTION ONE and any other TWO QUESTIONS.
QUESTION ONE (30 MARKS) (COMPULSORY)
(a) Monte Carlo methods depend crucially on the ability to generate pseudo-random
numbers in the interval (0, 1). Make a short list of what you consider to be the most
important properties of a good pseudo-random number generator. (4 marks)

(b) Distinguish between a matrix and a data frame as used in R, giving the function
used to create each and its main arguments. (4 marks)

(b) Create a vector named vec containing the numbers 22, 11, 76, 10, and 56. Then ask R
to print the third entry in vec. Throw out the entry 11 from vec, and place the result
in a vector named vec1. (3 marks)

(c) You decide to go on vacation to either Nairobi, Kisumu, or Mombasa. The cost of a
7-day vacation to each spot is 10,000, 12,000, and 15,000 dollars respectively. The travel
times to each spot are 13, 21, and 10 hours respectively. Write the R code to create a
data frame named vacation that will output the following:

destination  cost  travel.time
Nairobi       600           13
Kisumu       1500           21
Mombasa      2500           10

(4 marks)

(d) Show that the equation $x^2 + 1 + x = 3$ has a root in the interval (1, 1.4).

Use the bisection method to obtain an estimate of the root with maximum possible
error 0.025.
Determine how many additional iterations of the bisection process would be required to
reduce the maximum possible error to less than 0.005. (6 marks)

(e) You have ordered 10 bags of cement, which are supposed to weigh 94 kg each. The
average weight of the 10 bags is 93.5 kg. Assuming that the 10 weights can be viewed as
a realization of a random sample from a normal distribution with unknown parameters,
construct a 95% confidence interval for the expected weight of a bag. The sample
standard deviation of the 10 weights is 0.75. Write the R code to construct the confidence
interval. (5 marks)

(f) Using Monte Carlo integration, approximate the integral

$$\int_0^{\pi/4} \log\left(1 + \tan^2(x)\right)\,dx$$

with n = 1,000,000. (4 marks)

QUESTION TWO (20 Marks)

(a) Write R code to generate n = 100 values from a normal distribution with $\mu = 50$
and $\sigma^2 = 15$. Assuming a univariate normal distribution, write an R program that will
carry out maximum likelihood estimation for the mean and variance. Use the
Newton-Raphson optimization algorithm with analytic first and second partial derivatives.
(8 marks)

(b) A Cauchy density function takes the form

$$f(x) = \frac{1}{\pi\{1 + (x - \theta)^2\}}, \quad x \in \mathbb{R},$$

where θ is a parameter.

(i) Write R code to generate 50 random numbers from a Cauchy distribution with
θ = 1. (1 mark)
(ii) Treat the data you get from step (i) as sample observations from a Cauchy
distribution with an unknown parameter θ. Obtain the log-likelihood function for θ and
write an R function that evaluates it.
(5 marks)
(iii) Using the bisection method, write an R function to find the maximum likelihood
estimator of θ.
(6 marks)

QUESTION THREE (20 Marks)

(a) Let X <- matrix(c(1,2,3,1,4,9), ncol=2). Calculate the matrix $H = X(X^T X)^{-1} X^T$,
where X is as defined above. (4 marks)

(b) Given the model $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$ and the following data:

y    5  20  27  38  53  57  62  66
x1   3   5   6   7   9  10  12  12
x2   0  -1   0   1  -1   0  -1   2

(i) Write down the predictor matrix X and the response vector y. (2 marks)
(ii) Compute $X^T X$, $(X^T X)^{-1}$ and $X^T y$. (8 marks)
(iii) Estimate $\beta_0$, $\beta_1$ and $\beta_2$ by OLS using matrix notation. (3 marks)

(c) Write a program in R that solves the following LP problem:

Minimize: $z = 2x_1 + x_2 + 13x_3 - 5x_4$

Subject to:
$-x_1 + x_2 + x_3 - x_4 = 15$
$6x_1 + 5x_2 - 7x_3 + 2x_4 \geq 18$
$10x_1 - 8x_2 + 2x_3 + 4x_4 \leq 25$
$x_1, x_2, x_3, x_4 \geq 0$

(3 marks)

QUESTION FOUR (20 Marks)


(a) Let $X_i$ be a sequence of i.i.d. random variables. Define the sample mean
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and the sample variance
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$. Then, for $c > 0$, the interval

$$\left(\bar{X} - \frac{cS}{\sqrt{n}},\ \bar{X} + \frac{cS}{\sqrt{n}}\right)$$

is a confidence interval for the mean $\mu = E(X_i)$.

(i) For n = 10 and $X_i \sim N(7, 4)$, use Monte Carlo integration to estimate the
confidence level of this confidence interval for c = 2.262. (4 marks)

(ii) Similarly, estimate the corresponding confidence level for $X_i \sim \mathrm{Exp}(1)$. What do
you observe? Comment on your result. (3 marks)

(b) The recovery time (in days) is measured for 10 patients taking a new drug and for 10
patients taking a placebo. We wish to test the hypothesis that the mean recovery time
for patients taking the drug is less than for those taking a placebo. The data are:

With drug: 15, 10, 13, 7, 9, 8, 21, 9, 14, 8

Placebo: 15, 14, 12, 8, 14, 7, 16, 10, 15, 12

For our test, we will assume that the two population means are equal.

(i) Input the data into R. (3 marks)
(ii) Write the R code to carry out this analysis. (4 marks)

(c) Trying to encourage people to stop driving to campus, the university claims that on
average it takes people 30 minutes to find a parking space on campus and that the
standard deviation is minutes. You, however, don’t think it takes so long to find a spot.
In fact, based on the last 5 times you drove to campus, you found the mean time to find
a parking spot to be 20 minutes. Assuming that the time it takes to find a parking spot
is normal, perform a hypothesis test with level of significance of 0.10 to see if your claim
is correct. (6 marks)
QUESTION FIVE (20 Marks)
(a) An insurance company has four types of policies, which we will label A, B, C, and D.

– They have a total of 245 921 policies.


– The annual income from each policy is $10 for type A, $30 for type B, $50 for type
C, and $100 for type D.
– The total annual income for all policies is $7 304 620.
– The claims on these policies arise at different rates. The expected number of
type A claims is 0.1 claims per year, type B 0.15 claims per year, type C 0.03 claims
per year, and type D 0.5 claims per year.
– The total expected number of claims for the company is 34 390.48 per year.
– The expected size of the claims is different for each policy type. For type A, it is
$50, for type B it is $180, for type C it is $1500, and for type D it is $250.
– The expected total claim amount is $6 864 693. This is the sum over all policies
of the expected size of claim times the expected number of claims in a year.

Use R to answer the following questions:

(i) Find the total number of each type of policy. (4 marks)


(ii) Find the total income and total expected claim size for each type of policy.
(4 marks)
(iii) Assuming that claims arise in a Poisson process, and each claim amount follows a
Gamma distribution with a shape parameter of 2 and the means listed above, use
simulation to estimate the following:
(a) The variance in the total claim amount. (4 marks)
(b) The probability that the total claim amount will exceed the total annual
income from these policies. (4 marks)

Write a function to do these calculations, and do it once for the overall company income
and claims, and once for each of the four types of policy. (4 marks)
