
CS2B MOCK 1 ACTUATORS EDUCATIONAL INSTITUTE

CS2B
MOCK 1

RISK MODELLING AND SURVIVAL ANALYSIS
MOCK EXAM FOR 2023
CA PRAVEEN PATWARI 1 JAI SHREE RAM

CS2: MOCK EXAM 1 — PAPER B QUESTIONS


1. This question uses the following data:

TimeSeriesData.csv

(i) (a) Import the data and convert it to a time series object.

(b) Plot the time series as a line graph.

(c) Comment on whether there is any seasonality present in the data. [5]

(ii) Determine how many times the data should be differenced before fitting a model by examining
relevant sample ACFs for the first 30 lags. [7]

(iii) Fit an MA(1) model to your differenced data, writing down the equation of your fitted model. [2]

(iv) (a) Plot the residuals and the ACF of the residuals for the fitted MA(1) model.

(b) Comment on your graphs, comparing them to the theoretical behavior of the residuals if the
model is a good fit. [7]

(v) Fit an MA(3) model to your differenced data, stating the parameters of your fitted model. [2]

(vi) (a) Plot the residuals and the ACF of the residuals for the fitted MA(3) model.

(b) Comment on your graphs, comparing them to the graphs from part (iv)(b). [7]

(vii) Suggest which may be the more appropriate model. [1]

[Total 31]
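One possible workflow for parts (ii) to (iv) is sketched below in R. The contents of TimeSeriesData.csv are not shown here, so the sketch uses a simulated stand-in series; the object names (x, d1, fit_ma1 and so on) and the choice of 10 lags for the Ljung-Box test are illustrative assumptions, not part of the question.

```r
# Hypothetical stand-in for TimeSeriesData.csv (its contents are not shown here):
# a random walk whose increments follow an MA(1) process.
set.seed(1)
increments <- arima.sim(model = list(ma = 0.6), n = 200)
x <- ts(cumsum(increments))

# Sample ACFs for the first 30 lags of the raw and once-differenced series.
# Dropping plot = FALSE draws the correlograms instead of only storing the values.
d1 <- diff(x)
acf_raw   <- acf(x,  lag.max = 30, plot = FALSE)
acf_diff1 <- acf(d1, lag.max = 30, plot = FALSE)

# Fit an MA(1) model to the differenced series.
# R's arima() uses the convention X_t = mu + e_t + theta * e_(t-1).
fit_ma1 <- arima(d1, order = c(0, 0, 1))
print(coef(fit_ma1))

# Residual checks: if the fit is good, the residuals should resemble white noise.
res <- residuals(fit_ma1)
lb  <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb$p.value)
```

With the real data, the simulated series would be replaced by the imported file, and plot(res) and acf(res) would give the graphs asked for in part (iv)(a).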

2. The mortality of a population has been found to follow Makeham’s law of mortality, $\mu_x = A + Bc^x$,
with parameter values A = 0.0025, B = 0.00004 and c = 1.11 for ages $50 \le x \le 100$, and to have a
limiting age of 100.

(i) (a) Construct a function with one input, x, that returns the value of $\mu_x$.


(b) Calculate $\mu_{50}$ and $\mu_{51}$ using your function. [4]
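A minimal sketch of such a function in R, using the parameter values given above. The names A, B, c_par and mu are illustrative choices (c_par is used rather than c to avoid confusion with R's built-in c() function).

```r
# Makeham parameters from the question
A     <- 0.0025
B     <- 0.00004
c_par <- 1.11

# Force of mortality under Makeham's law: mu_x = A + B * c^x
mu <- function(x) {
  A + B * c_par^x
}

print(mu(50))
print(mu(51))
```

Because the function is vectorised, mu(50:100) returns the whole range of values needed for plotting.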

(ii) (a) Plot graphs of $\mu_x$ and $\ln \mu_x$.

(b) Comment on the shapes of your graphs. [8]

If mortality follows Makeham's law of mortality, then survival probabilities are given by:

${}_tp_x = s^t \, g^{c^x (c^t - 1)}$

where $g = \exp\left(-\frac{B}{\log c}\right)$ and $s = \exp(-A)$.

(iii) (a) Construct a function that takes two inputs, x and t, and calculates ${}_tp_x$. Your function should
also output an error if the inputted value of x is less than 50.

(b) Calculate ${}_0p_{50}$ with your function.

(c) Show the output of your function when inputting x = 45 and t = 2. [6]
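One way this function might be written in R is sketched below, based on the Makeham survival formula with s = exp(-A) and g = exp(-B / log c). The names tpx, s, g and c_par, and the wording of the error message, are illustrative assumptions.

```r
# Makeham parameters from the question
A     <- 0.0025
B     <- 0.00004
c_par <- 1.11

# Constants in the survival formula
s <- exp(-A)
g <- exp(-B / log(c_par))

# Survival probability t_p_x = s^t * g^(c^x * (c^t - 1)),
# raising an error for ages below 50
tpx <- function(x, t) {
  if (any(x < 50)) {
    stop("x must be at least 50")
  }
  s^t * g^(c_par^x * (c_par^t - 1))
}

print(tpx(50, 0))

# try() lets the script continue after the expected error for x = 45
try(tpx(45, 2))
```

The value for t = 0 should be exactly 1, which gives a quick sanity check on the formula.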

(iv) Calculate the following numerical values, using your function for ${}_tp_{50}$:

(a) the probability that a life aged exactly 50 will survive to age 100

(b) the complete expectation of life for a life aged exactly 50

(c) the curtate expectation of life for a life aged exactly 50

(d) the central rate of mortality for age 50. [8]
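A sketch of how these four quantities could be computed in R, using the standard definitions: the complete expectation as the integral of ${}_tp_{50}$ over t, the curtate expectation as the sum of ${}_kp_{50}$ over whole years k, and the central rate as $q_{50}$ divided by the integral of ${}_tp_{50}$ over the first year. The function and variable names are illustrative, and stopping the integral and the sum at t = 50 is an assumption about how the limiting age of 100 is applied.

```r
# Makeham parameters and survival function (names are illustrative)
A     <- 0.0025
B     <- 0.00004
c_par <- 1.11
s <- exp(-A)
g <- exp(-B / log(c_par))

tpx <- function(x, t) {
  if (any(x < 50)) stop("x must be at least 50")
  s^t * g^(c_par^x * (c_par^t - 1))
}

# (a) Probability that a life aged 50 survives to age 100
p_survive_100 <- tpx(50, 50)

# (b) Complete expectation of life: integral of t_p_50 from 0 to 50
e_complete <- integrate(function(t) tpx(50, t), lower = 0, upper = 50)$value

# (c) Curtate expectation of life: sum of k_p_50 for k = 1, ..., 50
e_curtate <- sum(tpx(50, 1:50))

# (d) Central rate of mortality: m_50 = q_50 / integral of t_p_50 over [0, 1]
q50 <- 1 - tpx(50, 1)
m50 <- q50 / integrate(function(t) tpx(50, t), lower = 0, upper = 1)$value

print(c(p_survive_100, e_complete, e_curtate, m50))
```

As a check, the central rate should lie between $\mu_{50}$ and $\mu_{51}$, since it is a weighted average of the force of mortality over the year of age.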

There are 10,000 individuals in the population that are currently aged 50.

(v) (a) Plot a line graph showing the expected number of lives alive at ages 50 to 100, based on the
given Makeham's law of mortality.

(b) Update your graph to include a line showing the expected number of lives alive at ages 50 to
100 based on the uniform distribution of deaths (UDD) assumption and a limiting age of
100. [6]

(vi) (a) State the complete expected future lifetime for individuals currently aged 50 based on the
UDD assumption and a limiting age of 100.

(b) Compare your answers to parts (iv)(a) and (vi)(a) using your graph from part (v)(b). [3]

[Total 35]

3. Happy Life insurance company is assessing a list of potential customers provided by a data analysis
company.

Happy Life sent marketing material to a representative sample of approximately 100 people and
recorded whether or not they made a purchase in the month that followed. The company then
produced a file containing information about each individual (such as age, income etc) as well as
whether or not a successful sale was made.

The marketing department is considering using a decision tree to gauge the prospects of the full list of
potential customers, based on the data collected for the representative sample.

They have provided you with a data file ‘HappyLife.txt’ containing information on the representative
sample of approximately 100 potential customers, which includes the following columns:

● AGE, ie age last birthday (eg 35)

● SEX (M or F)

● MARRIED (recorded as Y or N)

● CHILDREN, ie do they have any dependent children? (Y or N)

● HIGH, ie does their estimated household income exceed a specified level? (Y or N)

● SALE, ie was a successful sale made? (Y or N)

Some fields are recorded as NA where the data was not available or was considered unreliable.

You are given that the command na.omit(<data>) removes rows with NA from <data>.


(i) (a) Read the data file HappyLife.txt into an object called happy. You should ensure that
character columns are read as factors.

Hint: you may wish to use the stringsAsFactors argument in read.table().

(b) Update happy by removing any rows containing NA.

(c) Show that the number of rows in the updated object is 100. [2]

Happy Life wants to use a decision tree to predict sales.

(ii) Create a training data set called happy_train by randomly selecting 60% of the rows of happy,
setting a seed of 38328. You should store the selected row indices in the vector training_rows. [2]

In order to construct the tree, they are considering using the column SEX or the column
CHILDREN for the first split.

(iii) (a) Calculate the Gini index after splitting the 60 training individuals by the column SEX.

(b) Calculate the Gini index after splitting the 60 training individuals by the column CHILDREN.

(c) Explain which split would be preferred when using the greedy approach. [9]
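The Gini index after a split can be computed as the size-weighted average of the node impurities, where each node's impurity is one minus the sum of squared class proportions. The sketch below shows one way to do this in R; the function name gini_split and the four-row example are hypothetical and are not taken from HappyLife.txt.

```r
# Weighted Gini index after splitting `outcome` by the categories in `group`.
# Assumes every category in `group` has at least one observation.
gini_split <- function(group, outcome) {
  tab       <- table(group, outcome)            # rows = nodes, columns = classes
  n_node    <- rowSums(tab)                     # number of observations in each node
  node_gini <- 1 - rowSums((tab / n_node)^2)    # impurity of each node
  sum(n_node / sum(n_node) * node_gini)         # size-weighted average
}

# Hypothetical toy data: the M node is evenly mixed, the F node is pure
sex  <- c("M", "M", "F", "F")
sale <- c("Y", "N", "Y", "Y")
print(gini_split(sex, sale))   # 0.5 * 0.5 + 0.5 * 0 = 0.25
```

For the exam data, the same function could be applied to the SEX and CHILDREN columns of the training set, with the lower value indicating the purer split.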

The marketing department decides to try the following tree in order to predict sales:


(iv) Construct a function in R that will determine the predicted outcome using this tree based on the
five input variables.

Your function should take five inputs (values for AGE, SEX, MARRIED, CHILDREN and HIGH) and
return a value of either “Y” for a sale or “N” for no sale.

Hint: you may wish to use the function ifelse(). [5]

(v) Test that your function is working correctly by running it for the following two individuals and
comparing the output to the given decision tree:

● a 45-year-old male who is married with children and has an income lower than the
specified level

● a 45-year-old female who is not married, has no children and has an income higher than
the specified level. [2]

(vi) (a) Construct a confusion matrix for comparing the tree's predictions with the actual outcomes
for the test data (ie the customers not in the training data).

(b) Calculate the precision and recall metrics for the tree's performance on the test data
(treating SALE as the positive outcome). [6]
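As a sketch of the mechanics in R, the example below builds a confusion matrix with table() and derives precision as TP / (TP + FP) and recall as TP / (TP + FN), treating "Y" as the positive outcome. The vectors actual and predicted are hypothetical toy values, not results from the HappyLife data.

```r
# Hypothetical actual and predicted outcomes for six test individuals
actual    <- factor(c("Y", "Y", "N", "N", "Y", "N"), levels = c("N", "Y"))
predicted <- factor(c("Y", "N", "N", "Y", "Y", "N"), levels = c("N", "Y"))

# Confusion matrix: rows are predictions, columns are actual outcomes
conf_mat <- table(Predicted = predicted, Actual = actual)
print(conf_mat)

tp <- conf_mat["Y", "Y"]   # predicted sale, actual sale
fp <- conf_mat["Y", "N"]   # predicted sale, actual no sale
fn <- conf_mat["N", "Y"]   # predicted no sale, actual sale

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
print(c(precision = precision, recall = recall))
```

Fixing the factor levels to c("N", "Y") keeps both rows and columns in the table even if one outcome never appears among the predictions.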

(vii) (a) Construct a decision tree from the training data called package.tree using the tree() function
from the tree package.

(b) Construct a confusion matrix for comparing the predictions using package.tree with the
actual outcomes for the test data (treating SALE as the positive outcome).

(c) Calculate the precision and recall metrics for the performance of package.tree on the test
data. [6]

The marketing department wants to use one of these two trees for deciding who to market
to from the full list of potential customers.

(viii) Discuss which tree the department may wish to use, using your answers to part (vi) and part
(vii). [2]

[Total 34]
