CS2A Workbook

www.sankhyiki.in
+91-9711150002
INDEX
1. Reinsurance
2. Risk Models
3. Survival Models
4. Estimating the lifetime distribution function
5. Proportional hazards models
6. Exposed to Risk
7. Graduation
8. Mortality Projection
9. Stochastic Processes
10. Markov Chains
11. Time-homogeneous and inhomogeneous Markov Jump processes
12. Time Series
13. Extreme Value Theory
14. Copulas
Satya Niketan | North Campus | Mumbai | Jaipur | Kolkata | Siliguri

REINSURANCE & LOSS DISTRIBUTION
1. The loss amount, X, on a certain type of insurance policy has a Pareto
distribution with density function f(x), where:
$f(x) = \dfrac{3 \times 400^3}{(400 + x)^4}$  (x > 0)
A policyholder deductible of £100 is applied to these policies.
(i) Calculate the expected claim size paid by the insurance company.
(ii) Comment on the difference between your answer to (i) and the expected
loss amount, E(X). [UK April 2002]
2. The last ten claims under a particular class of insurance policy were:
1330 201 111 2368 617
309 35 4,685 442 843
(i) Assuming that the claims came from a lognormal distribution with
parameters μ and σ, derive the formula for the maximum likelihood
estimates of these parameters and estimate the parameters based on the
observed data.
(ii) Assuming that the claims come from a Pareto distribution with
parameters α and λ, use the method of moments to estimate these
parameters.
(iii) Assuming that the claims come from a Weibull distribution with
parameters c and γ, use the method of percentiles (based on the 25th and
75th percentiles) to estimate these parameters.
(iv) If the insurance company takes out reinsurance cover with an individual
excess of loss of 3,000 estimate the percentage of claims that will involve
the reinsurer under each of the three models above. [UK April 2002]
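As a numerical check on question 2(i): for a lognormal sample the maximum likelihood estimates are the mean and the divisor-n standard deviation of the logged claims. The sketch below is a minimal illustration only; the variable names are chosen here and are not part of the workbook.

```python
import math

# The ten observed claims from question 2.
claims = [1330, 201, 111, 2368, 617, 309, 35, 4685, 442, 843]

# log(X) is normal when X is lognormal, so the MLEs are the sample
# mean and the divisor-n variance of the logged data.
logs = [math.log(x) for x in claims]
n = len(logs)
mu_hat = sum(logs) / n
sigma2_hat = sum((y - mu_hat) ** 2 for y in logs) / n
sigma_hat = math.sqrt(sigma2_hat)

print(f"mu_hat = {mu_hat:.5f}, sigma_hat = {sigma_hat:.5f}")
```

The same logged data can be reused for part (iv), since P(X > 3,000) under the lognormal model is 1 - Φ((ln 3,000 - μ)/σ).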
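3. (i) Show that:
$\int_a^b \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(\ln x - \mu)^2}{2\sigma^2}\right\} dx = e^{\mu + \frac{1}{2}\sigma^2}\left[\Phi\left(\frac{\ln b - \mu - \sigma^2}{\sigma}\right) - \Phi\left(\frac{\ln a - \mu - \sigma^2}{\sigma}\right)\right]$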
(ii) Individual claim amounts on a certain type of general insurance policy
have a log-normal distribution, with mean 264 and standard deviation
346. A policyholder excess of 100 is a standard condition on each policy,
so that the insurance company only covers the loss amount in excess of
100.
(a) Calculate the expected claim size payable by the insurance
company.
(b) Next year, claims are expected to increase by 10%. Also, a new
condition is introduced on all policies so that the maximum amount
that the insurance company will pay on any claim is 1,000. The
policyholder excess will remain unchanged at 100.
Calculate the expected claim size payable by the insurance
company. [UK Sept 2002]
4. Claims on a portfolio of general insurance policies have a Pareto distribution
with density function f(x), where:
$f(x) = \dfrac{\alpha \lambda^{\alpha}}{x^{\alpha + 1}}$  (x > λ)
Excess of loss reinsurance is arranged with retention M (M > λ).
(i) (a) Show that $P(X > x) = (\lambda / x)^{\alpha}$ (x > λ).
(b) Derive an expression for the expected amount paid by the
reinsurer on a claim which involves the reinsurer.
(ii) Last year 10 claims were received, of which 4 involved the reinsurer. The
claims which involved the reinsurer are denoted by {x_j : j = 1, 2, ..., 4
(x_j > M)}.
Write down the likelihood for these data. [UK April 2003]
5. (i) The distribution of claims on a portfolio of general insurance policies is a
Weibull distribution, with density function f1(x) where:
$f_1(x) = 2cx e^{-cx^2}$  (x > 0)
It is expected that one claim out of every 100 will exceed £1000. Use this
information to estimate c.
(ii) An alternative suggestion is that the density function is f2(x) where:
$f_2(x) = \lambda e^{-\lambda x}$  (x > 0)
Use the same information as in part (i) to estimate λ.
(iii) For each of f1(x) and f2(x) calculate the value of M such that
P(X > M) = 0.001. [UK Sept 2003]
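The percentile matching in question 5 can be checked numerically. This is a minimal sketch, not a model solution. The tail probability 0.001 used for part (iii) is an assumption of the sketch, chosen because it reproduces the listed answers (1,225 and 1,500). All variable names are illustrative.

```python
import math

# P(X > 1000) = 0.01 under both candidate models.
x0, p0 = 1000.0, 0.01

# Weibull f1: survival function exp(-c x^2), so c = -ln(p0) / x0^2.
c = -math.log(p0) / x0 ** 2
# Exponential f2: survival function exp(-lam x), so lam = -ln(p0) / x0.
lam = -math.log(p0) / x0

# Part (iii): M with tail probability p1 (assumed value, see lead-in).
p1 = 0.001
m_weibull = math.sqrt(-math.log(p1) / c)
m_exponential = -math.log(p1) / lam

print(c, lam, m_weibull, m_exponential)
```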

6. The loss severity distribution for a portfolio of household insurance policies is
assumed to be Pareto with parameters α = 3.5, λ = 1,000.
Next year, losses are expected to increase by 5% and the insurer has decided to
introduce a policyholder excess of £100.
Calculate the probability that a loss next year is borne entirely by the
policyholder. [UK April 2004]
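Question 6 relies on the fact that multiplying a Pareto(α, λ) loss by a factor k gives a Pareto(α, kλ) loss. A minimal sketch of the calculation, with illustrative variable names:

```python
alpha, lam = 3.5, 1000.0          # Pareto parameters this year
inflation, excess = 1.05, 100.0

# Scaling a Pareto(alpha, lam) loss by k gives Pareto(alpha, k * lam).
lam_next = inflation * lam

# Pareto distribution function: F(x) = 1 - (lam / (lam + x)) ** alpha.
# A loss is borne entirely by the policyholder when it is below the excess.
prob_below_excess = 1 - (lam_next / (lam_next + excess)) ** alpha

print(round(prob_below_excess, 3))
```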
7. (i) Show that:
$\int_0^d x f(x)\,dx = e^{\mu + \frac{1}{2}\sigma^2}\,\Phi\left(\frac{\log d - \mu - \sigma^2}{\sigma}\right)$
where $f(x) = \dfrac{1}{x\sigma\sqrt{2\pi}} \exp\left\{-\dfrac{1}{2}\left(\dfrac{\log x - \mu}{\sigma}\right)^2\right\}$
(ii) The loss amounts X, from a portfolio of non-life insurance policies are
assumed to be independently distributed with mean £800 and standard
deviation £1,200.
Calculate the values of the parameters of a lognormal distribution with
this mean and standard deviation.
(iii) The company is considering purchasing reinsurance cover, and has to
decide whether to purchase excess-of-loss or proportional reinsurance.
The amounts paid by the direct insurer and reinsurer respectively, are
given by:
$X_I^{(\mathrm{Prop})} = (1 - k)X$,  $X_I^{(\mathrm{XL})} = \min(X, d)$
and
$X_R^{(\mathrm{Prop})} = kX$,  $X_R^{(\mathrm{XL})} = \max(0, X - d)$
where X denotes the loss amount.
Using the loss distribution from (ii), calculate the value of k such that:
$E(X_I^{(\mathrm{Prop})}) = 0.7E(X)$
and show that if d = 1,189.4, $E(X_I^{(\mathrm{XL})}) = 0.7E(X)$.
(iv) Using the values of k and d from (iii), calculate the values of $\mathrm{var}[X_I^{(\mathrm{Prop})}]$
and $\mathrm{var}[X_I^{(\mathrm{XL})}]$. [UK April 2004]
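Parts (ii) and (iii) of question 7 can be checked with a short script. This is a sketch under the usual lognormal moment formulas; the helper names `phi` and `limited_expected_value` are introduced here for illustration only.

```python
import math

def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, sd = 800.0, 1200.0

# Lognormal parameters from the mean and standard deviation:
# sigma^2 = ln(1 + (sd / mean)^2),  mu = ln(mean) - sigma^2 / 2.
sigma2 = math.log(1.0 + (sd / mean) ** 2)
sigma = math.sqrt(sigma2)
mu = math.log(mean) - sigma2 / 2.0

def limited_expected_value(d):
    """E[min(X, d)]: the integral of x f(x) up to d plus d * P(X > d)."""
    below = mean * phi((math.log(d) - mu - sigma2) / sigma)
    above = d * (1.0 - phi((math.log(d) - mu) / sigma))
    return below + above

k = 0.3                                  # (1 - k) E[X] = 0.7 E[X]
insurer_xl = limited_expected_value(1189.4)

print(mu, sigma2, insurer_xl, 0.7 * mean)
```

The variance of the proportional share follows directly as (1 - k)² var(X); the excess-of-loss variance needs E[min(X, d)²], which the sketch does not compute.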

8. (i) The random variable X has a Pareto distribution with parameters α and λ.
Show that for L, d > 0:
( )
∫ ( ) 0 1
( ) ( )
(ii) Claims on a certain class of insurance policy have a Pareto distribution
with mean £3,000 and standard deviation £6,000. The insurance company
arranges a layer of excess-of-loss reinsurance with a retention level of
£8,000. The maximum amount the reinsurer will pay on any individual
claim is £6,000.
(a) Calculate the mean claim amount paid by the reinsurer on claims which
involve the reinsurer.
(b) Next year the claim amounts on these policies are expected to increase by
10% but the reinsurance treaty will remain unchanged. Calculate the mean
claim amount to be paid next year by the reinsurer on claims, which
involve the reinsurer. [UK Sept 2004]
9. An insurer believes that claim amounts, X, on its portfolio of pet insurance
policies follow an exponential distribution with mean £200.
A reinsurance policy is arranged such that the reinsurer pays XR, where:
$X_R = \begin{cases} 0 & \text{if } X \le 50 \\ X - 50 & \text{if } 50 < X \le M \\ M - 50 & \text{if } X > M \end{cases}$
Calculate M such that E(XR) = £100. [UK April 2005]
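For question 9 the expected reinsurer payment is the integral of the survival function over the layer from 50 to M, which has a closed form for the exponential distribution. A minimal sketch, with illustrative names:

```python
import math

mean, lower, target = 200.0, 50.0, 100.0

# E(X_R) = integral from 50 to M of P(X > x) dx
#        = mean * (exp(-lower / mean) - exp(-M / mean)).
# Setting this equal to the target and solving for M:
tail_at_m = math.exp(-lower / mean) - target / mean
upper_limit = -mean * math.log(tail_at_m)

def expected_reinsurer_payment(m):
    """Expected layer payment for upper limit m."""
    return mean * (math.exp(-lower / mean) - math.exp(-m / mean))

print(upper_limit, expected_reinsurer_payment(upper_limit))
```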
10. An insurer believes that claims from a particular type of policy follow a Pareto
distribution with parameters α = 2.5 and λ = 300. The insurer wishes to introduce
a deductible such that 25% of losses result in no claim on the insurer.
(i) Calculate the size of the deductible.
(ii) Calculate the average claim amount net of the deductible. [UK Sept 2005]
11. (i) Let X denote the claim amount under an insurance policy, and suppose
that X has a probability density f_X(x) for x > 0. The insurer has an
individual excess of loss reinsurance arrangement with a retention of £M.
Let Y be the amount paid by the insurer net of reinsurance. Express Y in
terms of X and hence derive an expression for the probability density
function of Y in terms of f_X(x).
For a particular class of policy X is believed to follow a Weibull distribution
with probability density function:
$f_X(x) = 0.75\,c\,x^{-0.25} e^{-c x^{0.75}}$  (x > 0)
where c is an unknown constant. The insurer has an individual excess of loss
reinsurance arrangement with retention £500. The following claims data are
observed:
Claims below retention: 78, 104, 116, 135, 189, 243, 270, 350, 411, 491
Claims above retention: 3 in total
Total number of claims: 13
(ii) Estimate c using maximum likelihood estimation.
(iii) Apply the method of percentiles using the median claim to estimate c.
[UK Sept 2006]
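Parts (ii) and (iii) of question 11 can be verified numerically. The sketch assumes the survival function P(X > x) = exp(-c x^0.75) implied by the stated density; variable names are illustrative.

```python
import math

below = [78, 104, 116, 135, 189, 243, 270, 350, 411, 491]
n_above, retention, gamma = 3, 500.0, 0.75

# Likelihood: product of f_X(x_i) over the uncensored claims times
# P(X > 500)^3, where P(X > x) = exp(-c x^0.75).
# Setting d(log L)/dc = 0 gives c = n / (sum x_i^0.75 + 3 * 500^0.75).
exposure = sum(x ** gamma for x in below) + n_above * retention ** gamma
c_mle = len(below) / exposure

# Method of percentiles: the median of all 13 claims is the 7th smallest,
# which lies below the retention and is therefore observed.
median = sorted(below)[6]
c_percentile = math.log(2.0) / median ** gamma

print(c_mle, c_percentile)
```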
12. The random variable X has an exponential distribution with mean 1,000.
Individual claim amounts on a certain type of insurance policy, Y, are such that:
Y = X for 0 < X < 2,000
and P(Y = 2,000) = P(X ≥ 2,000)
The insurer applies a deductible of 100 on claims from this type of insurance.
Calculate the mean of the distribution of individual claim amounts paid by the
insurer. [UK April 2007]
13. Losses on a portfolio of insurance policies in 2006 are assumed to have an
exponential distribution with parameter λ. In 2007 loss amounts have increased
by a factor k (so that a loss incurred in 2007 is k times an equivalent loss
incurred in 2006).
(i) Show that the distribution of loss amounts in 2007 is also exponential and
determine the parameter of the distribution.
Over the calendar years 2006 and 2007 the insurer had in place an individual
excess-of-loss reinsurance arrangement with retention of M. Claims paid by the
insurer were:
2006: 4 amounts of M and 10 claims under M for a total of 13,500.
2007: 6 amounts of M and 12 claims under M for a total of 17,000.

(ii) Show that the maximum likelihood estimate of λ is:
$\hat{\lambda} = \dfrac{22}{13{,}500 + \dfrac{17{,}000}{k} + 4M + \dfrac{6M}{k}}$
(iii) The insurer is negotiating a new reinsurance arrangement for 2008. The
retention was set at 1,600 when the current arrangement was put in place
in 2006. Loss inflation between 2006 and 2007 was 10% (i.e. k = 1.1) and
further loss inflation of 5% is expected between 2007 and 2008.
(a) Use this information to calculate $\hat{\lambda}$.
(b) The insurer wishes to set the retention M′ for 2008 such that the
expected (net of reinsurance) payment per claim for 2008 is the
same as the expected payment per claim for 2006. Calculate the
value of M′, using your estimate of λ from (iii)(a).
[UK Sept 2008]
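Question 13(iii) can be worked through numerically. The sketch below assumes that the stated totals (13,500 and 17,000) refer to the claims under M, and that 2007 losses are exponential with parameter λ/k; function and variable names are illustrative.

```python
import math

k, retention = 1.1, 1600.0

# Log-likelihood: 22 ln(lam) - lam * (13500 + 4M + (17000 + 6M) / k) + const,
# since 2007 losses are exponential with parameter lam / k.
denominator = 13500 + 4 * retention + (17000 + 6 * retention) / k
lam_hat = 22 / denominator

def expected_net_payment(rate, m):
    """E[min(X, m)] for an exponential loss with the given rate."""
    return (1 - math.exp(-rate * m)) / rate

# 2008 losses: two rounds of inflation relative to 2006, 10% then 5%.
rate_2008 = lam_hat / (1.1 * 1.05)
target = expected_net_payment(lam_hat, retention)

# Solve expected_net_payment(rate_2008, m) = target in closed form.
retention_2008 = -math.log(1 - rate_2008 * target) / rate_2008

print(lam_hat, retention_2008)
```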
14. The following claim amounts are believed to come from a lognormal
distribution with unknown parameters μ and σ²:
50, 87, 103, 119, 126, 154, 183, 203
Estimate the parameters μ and σ² using:
(i) the method of moments;
(ii) the method of percentiles using the upper and lower quartiles.
[UK Sept 2009]
15. An underwriter has suggested that losses on a certain class of policies follow a
Weibull distribution. She estimates that the 10th percentile loss is 20 and the 90th
percentile loss is 95.
(i) Calculate the parameters of the Weibull distribution that fit these
percentiles.
(ii) Calculate the 99.5th percentile loss. [UK Sept 2010]
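Fitting a Weibull distribution to two percentiles, as in question 15, reduces to two equations in c and γ. The sketch assumes the parameterisation F(x) = 1 - exp(-c x^γ); the names are illustrative.

```python
import math

# Weibull distribution function assumed here: F(x) = 1 - exp(-c * x**gamma)
x10, x90 = 20.0, 95.0

# -ln(1 - p) = c * x_p**gamma at each percentile; dividing the two
# equations eliminates c.
h10, h90 = -math.log(0.9), -math.log(0.1)
gamma = math.log(h90 / h10) / math.log(x90 / x10)
c = h10 / x10 ** gamma

def percentile(p):
    """Loss amount exceeded with probability 1 - p under the fitted model."""
    return (-math.log(1 - p) / c) ** (1 / gamma)

loss_995 = percentile(0.995)
print(gamma, c, loss_995)
```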
16. An insurance company has a portfolio of policies under which individual loss
amounts follow an exponential distribution with mean 1/λ. There is an
individual excess of loss reinsurance arrangement in place with retention level
100. In one year, the insurer observes:
85 claims for amounts below 100 with mean claim amount 42; and 39 claims for
amounts above the retention level.
(i) Calculate the maximum likelihood estimate of λ.
(ii) Show that the estimate of λ produced by applying the method of moments
to the distribution of amounts paid by the insurer is 0.011164.
[UK Sept 2010]
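Both estimates in question 16 can be reproduced with a few lines. The method of moments equation has no closed-form solution, so the sketch solves it by bisection (one possible numerical method, chosen here for simplicity); names are illustrative.

```python
import math

n_below, mean_below = 85, 42.0
n_above, retention = 39, 100.0

# MLE: uncensored claims contribute lam * exp(-lam x), censored ones
# contribute exp(-lam * retention).
total_paid = n_below * mean_below + n_above * retention
lam_mle = n_below / total_paid

# Method of moments: match E[min(X, M)] = (1 - exp(-lam M)) / lam
# to the sample mean of the insurer's payments.
sample_mean = total_paid / (n_below + n_above)

def insurer_mean(lam):
    return (1 - math.exp(-lam * retention)) / lam

lo, hi = 1e-6, 1.0            # insurer_mean is decreasing in lam
for _ in range(200):
    mid = (lo + hi) / 2
    if insurer_mean(mid) > sample_mean:
        lo = mid              # mean too high, so lam must be larger
    else:
        hi = mid
lam_mom = (lo + hi) / 2

print(lam_mle, lam_mom)
```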
17. Loss amounts under a class of insurance policies follow an exponential
distribution with mean 100. The insurer wishes to enter into an individual excess
of loss reinsurance arrangement with retention level M set such that 8 out of 10
claims will not involve the reinsurer.
(i) Find the retention M.
For a given claim, let Y denote the amount paid by the insurer and Z the amount
paid by the reinsurer.
(ii) Calculate E(Y) and E(Z). [UK Sept 2011]
18. Claim amounts on a certain type of insurance policy follow a distribution with
density
( ) for x>0
where c is an unknown positive constant. The insurer has in place individual
excess of loss reinsurance with an excess of 50. The following ten payments are
made by the insurer:
Losses below the retention: 23, 37, 41, 11, 19, 33
Losses above the retention: 50, 50, 50, 50
Calculate the maximum likelihood estimate of c. [UK April 2012]
19. Claims arising on a particular type of insurance policy are believed to follow a
Pareto distribution. Data for the last several years shows that mean claim size is
170 and the standard deviation is 400.
(i) Fit a Pareto distribution to this data using method of moments.
(ii) Calculate median claim using the fitted parameters. [UK Sept 2012]
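The method of moments fit in question 19 has a closed form, because the ratio variance/mean² depends on α only. A minimal sketch, using the Pareto(α, λ) parameterisation with mean λ/(α - 1); variable names are illustrative.

```python
mean, sd = 170.0, 400.0

# Pareto(alpha, lam): mean = lam / (alpha - 1),
# variance = alpha * lam^2 / ((alpha - 1)^2 (alpha - 2)).
# Hence variance / mean^2 = alpha / (alpha - 2).
ratio = (sd / mean) ** 2
alpha = 2 * ratio / (ratio - 1)
lam = mean * (alpha - 1)

# The median m solves (lam / (lam + m)) ** alpha = 0.5.
median = lam * (2 ** (1 / alpha) - 1)

print(alpha, lam, median)
```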
20. Claims on a certain type of insurance policy are believed to follow an

exponential distribution. The upper quartile claim size is 240. Calculate the
mean claim size. [UK April 2013]
21. Claim amounts on a certain type of insurance policy follow an exponential
distribution with mean 100. The insurance company purchases a special type of
reinsurance policy so that for a given claim X the reinsurance company pays
0 if 0 < X < 80;
0.5X - 40 if 80 ≤ X < 160;
X - 120 if X ≥ 160.
Calculate the expected amount paid by the reinsurance company on a randomly
chosen claim. [UK Sept 2013]
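The payment schedule in question 21 can be written as the sum of two half-share layers, which makes the expectation easy to evaluate for an exponential loss. A minimal sketch with illustrative names:

```python
import math

mean = 100.0

def layer_mean(d):
    """E[max(X - d, 0)] for an exponential loss: mean * exp(-d / mean)."""
    return mean * math.exp(-d / mean)

# The schedule equals 0.5 * max(X - 80, 0) + 0.5 * max(X - 160, 0):
# slope 0.5 between 80 and 160, slope 1 above 160.
expected_payment = 0.5 * layer_mean(80) + 0.5 * layer_mean(160)

def payment(x):
    """Reinsurer's payment on a claim of size x, as stated in the question."""
    if x < 80:
        return 0.0
    if x < 160:
        return 0.5 * x - 40
    return x - 120

print(round(expected_payment, 2))
```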
22. Claim amounts arising from a certain type of insurance policy are believed to
follow a Lognormal distribution. One thousand claims are observed and the
following summary statistics are prepared:
Mean claim amount     230
Standard deviation    110
Lower quartile         80
Upper quartile        510
(i) Fit a Lognormal distribution to these claims using
(a) The method of moments
(b) The method of percentiles
(ii) Compare the fitted distributions from part (i). [UK April 2014]
23. The random variable X follows a Pareto distribution with parameters α and λ.
(i) Show that for L, d > 0
( )
∫ ( ) 0 1
( ) ( )
Claims on a certain type of motor insurance policy follow a Pareto distribution
with mean 16,000 and standard deviation 20,000. The insurance company has an
excess of loss reinsurance policy with a retention level of 40,000 and a maximum
amount paid by the reinsurer of 25,000.
(ii) Determine the mean claim amount paid by the reinsurer on claims that
involve the reinsurer.
Claim amounts increase by 5%.
(iii) State the new distribution of claim amounts. [UK Sept 2014]

24. An insurer believes claim amounts (in thousands of INR) from its
property portfolio follow a Pareto distribution with parameters α = 3 and
λ = 300. The insurer wishes to introduce a deductible such that 20% of the
losses result in no claim for the insurer.
i) Calculate the size of the deductible.
ii) Calculate the average claim amount net of deductible. [India May 2014]
25. A general insurer believes that claims in the motor insurance portfolio arise
as an Exp(λ) distribution. There is a retention limit of Rs. 1,00,000 in force, and
claims in excess of Rs. 1,00,000 are paid by the reinsurer.
The insurer, wishing to estimate λ, observes last year's claims and finds that, out
of a total of 250 claims, the average amount of the 226 claims that did not exceed
Rs. 1,00,000 was Rs. 540. Each of the remaining 24 claims was above Rs.
1,00,000 and is yet to be settled by the reinsurer.
Write down the likelihood function clearly and find the maximum likelihood estimate of λ.
[India May 2014]
26. Individual claims may be regarded as realizations of a random variable

200X, where X has the distribution with p.d.f.
. /( )
( )
{ }
In addition, for each claim, there is a 25% chance that an additional fixed
expense of 500 will be incurred. Calculate the mean and variance of the
total individual claim amounts. [India Nov 2012]
27. The claim amount arising from policies of a general insurance portfolio is
assumed to have probability density function f(x) given by
( ) for x >0
c being an unspecified parameter. In a year, there are 1000 claims of amounts.

The median of these claim amounts is 5000, the mean is 5120.5, and it is also
known that

(i) Derive an explicit expression for the maximum likelihood estimator of
the parameter c on the basis of the data.
(ii) Compute the MLE of c from the given data summary.
(iii) Compute the method of moments estimate of c from the given data
summary.
(iv) Compute the method of percentiles estimate of c from the given data
summary. [India May 2012]
28. The individual claim amounts for the current year, from a stable portfolio
of a large insurer, has the probability density function
( )
( )
( )
The portfolio is reinsured by an excess of loss reinsurance arrangement with a
fixed retention limit Rs 600 lakhs. The claim amount is expected to inflate at a
constant rate of 10% per annum from now.
i) Calculate the probability density function of the individual claim amounts
after n years.
ii) Calculate the expected size of the individual claim amounts after n years.
iii) Calculate the expected claim amount paid by the insurer in respect of an
individual claim, after n years.
iv) What happens to the expected claim amount paid by the insurer after n
years, as n tends to infinity? Explain either by general reasoning or by
analyzing the result of part (iii).
v) What happens to the insurer's share of the expected claim amount
paid, as n tends to infinity? Explain. [India May 2012]
29. Claims from a certain portfolio have a Pareto distribution with parameters
α = 3 and λ = 500. A retention limit of £400 is in force, with the excess of this
amount on any claim being paid by a reinsurer.
(i) What proportion of claims involve the reinsurer?
(ii) What is the mean amount paid by the reinsurer on all claims?
(iii) What is the mean amount paid by the reinsurer on all claims in which it
is involved?
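The three quantities in question 29 are linked: the mean paid on all claims is the probability of involvement times the conditional mean. A minimal sketch, using the standard result that X - M given X > M is Pareto(α, λ + M); names are illustrative.

```python
alpha, lam, retention = 3.0, 500.0, 400.0

# (i) P(X > M) for a Pareto(alpha, lam) loss.
prob_reinsurer = (lam / (lam + retention)) ** alpha

# (iii) X - M given X > M is Pareto(alpha, lam + M), with mean
# (lam + M) / (alpha - 1).
mean_given_involved = (lam + retention) / (alpha - 1)

# (ii) Unconditional mean = P(X > M) * conditional mean.
mean_all_claims = prob_reinsurer * mean_given_involved

print(prob_reinsurer, mean_all_claims, mean_given_involved)
```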

30. The amount, X, of a claim, in thousands of pounds, from an insurance
portfolio has the lognormal distribution with mean 12.2 and standard
deviation 16. Consider an excess of loss reinsurance policy with a
retention of £28,000 so that the claim paid by the insurer (£'000) is given
by Y, where:
$Y = \begin{cases} X & \text{if } X \le 28 \\ 28 & \text{if } X > 28 \end{cases}$
(a) Determine the probability that a claim involves the reinsurer.

(b) Calculate the mean and variance of the claims paid by the insurer.
(c) Given that a claim is referred to the reinsurer, what is the conditional
expected value paid by the reinsurer? [UK April 1999]
31. A specialist motor insurer writes policies with individual excesses of £500 per
claim. The insurer has taken out a reinsurance policy whereby the insurer pays
out a maximum of £4,500 in respect of each individual claim, the rest being paid
by the reinsurer. The individual claims, gross of reinsurance and the excess, are
believed to follow an exponential distribution with parameter λ.
Over the last year, the insurer has gathered the following data:
There were 5 claims which were not processed because the loss was less
than the excess.
There were 11 claims where the insurer paid out £4,500 and the reinsurer the
remainder.
There were 26 other claims in respect of which the insurer paid out a total of
£76,457.
Derive the log-likelihood function of λ. [UK Sept 2001]
32. (i) (a) Explain why an insurance company might purchase reinsurance.
(b) Describe two types of reinsurance.
The claim amounts on a particular type of insurance policy follow a Pareto
distribution with mean 270 and standard deviation 340.
(ii) Determine the lowest retention amount such that under excess of loss
reinsurance the probability of a claim involving the reinsurer is 5%.
[UK April 2015]

33. A general insurance company writes claims, whose amounts have a lognormal
distribution, with mean 300 and standard deviation 400. The insurance
company purchases excess of loss reinsurance with retention 500 per claim.
(i) Calculate the average expected claim size payable by the insurance
company.
Next year, claim inflation is 10%, but the retention amount remains the same.
(ii) Explain whether the average expected claim size payable by the insurance
company next year would increase by 10%. [UK April 2016]
34. An Insurer writes two classes of insurance business A and B.

(i) On class A the insurer is planning to buy a reinsurance policy such that
the reinsurer pays the excess of the claim amount over 250, if the claim
amount exceeds 250. However, in case the claim amount exceeds a pre-
defined amount M (which is greater than 250), the excess claim amount
over M reverts back to the insurer. No payment is made by the reinsurer if
claim payments are less than or equal to 250. Assuming that the original
claim payments follow an exponential distribution with mean 1000
calculate M such that the expected claim payments made by the reinsurer
equals 500.
(ii) On class B the insurer decides to introduce a policy excess such that 40%
of the losses result in no claim for the insurer. The insurer believes that
claims from class B follow a Pareto distribution with parameters α = 3 and
λ = 600. Calculate the average claim amount paid by the insurer under
class B net of the policy excess. [India Oct 2015]
35. Claims on a home insurance policy have a Pareto distribution with parameters
α = 4 and λ = 7,500. The insurer effects an individual excess of loss reinsurance
treaty with a retention limit of £3,000.
(i) (a) Calculate the probability that a claim involves the reinsurer.
(b) Calculate the insurer's expected payment per claim.
Next year the claim amounts on these policies are expected to increase by 10%
but the reinsurance treaty will remain unchanged.
(ii) (a) Calculate the probability that a claim now involves the reinsurer.
(b) Explain whether the insurer's expected payment per claim will also
increase by 10%.
(c) Calculate the reinsurer's expected claim payment next year on
those claims in which it is involved.
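The effect of inflation with a fixed retention in question 35 can be seen by evaluating the same formulas before and after rescaling λ. A minimal sketch; the function names are introduced here for illustration.

```python
alpha, lam, retention = 4.0, 7500.0, 3000.0

def prob_exceeds(scale, m):
    """P(X > m) for a Pareto(alpha, scale) loss."""
    return (scale / (scale + m)) ** alpha

def insurer_mean(scale, m):
    """E[min(X, m)] = scale/(alpha-1) * (1 - (scale/(scale+m))**(alpha-1))."""
    return scale / (alpha - 1) * (1 - (scale / (scale + m)) ** (alpha - 1))

p_now = prob_exceeds(lam, retention)
net_now = insurer_mean(lam, retention)

# 10% inflation rescales the Pareto: 1.1 X ~ Pareto(alpha, 1.1 * lam).
lam_next = 1.1 * lam
p_next = prob_exceeds(lam_next, retention)
net_next = insurer_mean(lam_next, retention)
growth = net_next / net_now - 1      # insurer's increase, below 10%

# Reinsurer's mean payment on claims in which it is involved.
reinsurer_given_involved = (lam_next + retention) / (alpha - 1)

print(p_now, net_now, p_next, growth, reinsurer_given_involved)
```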

36. Insurance Company A has taken out an individual excess of loss reinsurance
contract with a retention limit of £40,000. Individual claim amounts, gross of
reinsurance, are believed to follow an exponential distribution with unknown
parameter λ.
Over the last year, the following claims data are observed:
Claims below retention: 12,220 10,429 36,834 14,623
36,932 13,205 28,506
Claims above retention: 3 in total
(i) (a) Estimate λ using maximum likelihood estimation.
(b) Apply the method of percentiles using the median claim to estimate λ.
Insurance Company B has a policyholder excess of £50,000 on its policies. The
individual claim amounts, X, are believed to have a Pareto(α, 200000)
distribution (before the excess is applied) where α is the unknown parameter.
(ii) (a) Show that the conditional distribution of the amount paid by the
insurer, Y, is Pareto(α, 250000).
The amounts paid by the insurer, y_i, on the last five claims (i.e. after the £50,000
excess has been deducted) were:
£153,000 £376,000 £120,000 £20,000 £108,000
(b) Use this information and the distribution from part (a) to
determine α̂, the maximum likelihood estimate of α.
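The three estimates in question 36 can be reproduced as follows. This is a sketch only; it takes the Pareto(α, 250000) result of part (ii)(a) as given, and the variable names are illustrative.

```python
import math

# Company A: exponential claims, retention 40,000.
below = [12220, 10429, 36834, 14623, 36932, 13205, 28506]
n_above, retention = 3, 40000.0

# Censored-data MLE: uncensored count over total exposure.
lam_mle = len(below) / (sum(below) + n_above * retention)

# With 10 claims the median is the average of the 5th and 6th smallest;
# both lie below the retention, so they are observed.
ordered = sorted(below)
median = (ordered[4] + ordered[5]) / 2
lam_percentile = math.log(2) / median

# Company B: payments y_i treated as Pareto(alpha, 250000), so
# log L = n ln(alpha) + n alpha ln(250000) - (alpha + 1) sum ln(250000 + y_i).
payments = [153000, 376000, 120000, 20000, 108000]
scale = 250000.0
alpha_mle = len(payments) / sum(math.log(1 + y / scale) for y in payments)

print(lam_mle, lam_percentile, alpha_mle)
```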

ANSWERS
1. (i) E(Y) = 250
(ii) The expected loss amount, E(X) = 200. The mean amount of Y is
greater than this because we have removed some of the smallest
claims by introducing the deductible.
2. (i) 6.19596 

ii)  = 5.51012  = 4,934.5274
(iii) 0.83219, c = 0.003485( )
(iv) Lognormal = 9.5%, Pareto = 7.3% and Weibull = 6.5%
3. (ii) (a) 260 (b) 251.81
4. (i) (b) E[Z|Z>0] = (ii) L = ∏ 2 ( ) 3
5. (i) c = 4.605 × 10^-6 (ii) λ = 4.605 × 10^-3 (iii) M = 1,225 and 1,500
6. 0.273
7. (ii) (iv) $\mathrm{var}[X_I^{(\mathrm{Prop})}]$ = 705,600, $\mathrm{var}[X_I^{(\mathrm{XL})}]$ = 159,000
8. (ii) (a) E[Z|Z>0] = 3,656.08 (b) E[Z|Z>0] = 3,711
9. M = 255.45
10. (i) D = 36.587 (ii) E[Y|X>D] = 224.39
( )
11. (i) ( ) { } (ii) c= 0.0110 (iii) c = 0.0104
( )
12. 850.43
13. (i) Exp(λ/k) (iii)(a) (b) M′ = 1,496.5
14. (i) µ = 4.79 (ii) µ = 4.84 σ
15. (i) c = 0.00028 (ii) 144.73
16. (i) 0.011379
17. (i) M= 160.94 (ii) E[Y] = 80 and E[Z] = 20
18. c=
19. (i) α = 2.441, λ = 244.95 (ii) 80.44
20. 173.12
21. 32.56
22. (i) (a) µ = 5.3351, σ = 0.45385 (b) µ = 5.30822, σ = 1.37315

23. (ii) E[Z|Z>0] = 14,819.10 (iii) X ( )

24. (i) D= 23.165 (ii) E[X-D|X>D] = 161.58
25. L= ( )
M.L.E. of λ = 0.00008961
26. E[X] = 5/3 V[X] = 25/18
27. (i) c= ∑ (ii) c= 1.024 (iii) c = 2.995
(iv) c= 2.772
( )( )
28. (i) , ( ) -
(ii) E[ - ( )
(iii) (iii) E( - ( )
(iv) As n goes to infinity, the insurer's expected claim payment
approaches 600.
(v) From part (ii), the expected claim size is seen to go to infinity as n
goes to infinity. On the other hand, the insurer's expected claim
payment saturates to 600. Therefore, the direct insurer's share goes
to zero.
29. (i) 0.171468 (ii)77.16 (iii) 450
30. (a) 0.09165 (b) Mean = 10,245 and Var = (c) 21,300
31. logL(loglog(1- )
32. (i) (a) To protect itself from the risk of large claims
(b) Excess of loss and Proportional Reinsurance (ii) 880.88
33. (i) 228
(ii) The insurance company's expected claims would increase by less than 10%,
since the chance of a claim exceeding the retention has increased while the
retention remains the same, hence the reinsurer will pick up a greater share of the
claims.
34. (i) M = 1277.25 (ii) 355.69
35. (i)(a) 0.260308 (b)1,588.92 (ii)(a) 0.28920
(b) The average claim amount retained by the insurance company will increase
by less than 10%. This is because the retention limit is unchanged, ie the insurer
still pays a maximum amount of £3,000 in respect of each claim. The amounts
that the insurer has to pay out on small claims (that were less than £3,000 / 1.1 )
will increase by 10%.
(c) 3750
36. (i)(a) 0.0000257 (b) 0.0000212 (ii)(b) 2.25

RISK MODELS
1. For each of m independent risks, there is probability 0.2 that a claim is made in a
year and probability 0.8 that no claim is made. Claim sizes are independent with
mean 400 and variance 110.
Determine the expected value and the variance of the total amount claimed in
one year. [UK April 2002]
2. (i) Derive the MGF of the total amount, T, claimed if the number of claims,
N, has a Poisson distribution with mean λ > 0 and the claim severity
distribution has MGF M(t).
(ii) A portfolio consists of 210 risks each of which gives rise to claims as a
Poisson process. The claim severity distribution is exponential. The
portfolio is divided into 3 groups, as follows: -
Group   Number of risks   Poisson rate per risk   Mean of claim severity distribution
1       40                1                       400
2       120               2                       500
3       50                2.5                     600
(a) Derive the MGF of the total claim amount S from all 210 independent risks
in one time unit.
(b) Show that S has a compound Poisson distribution and determine the
corresponding Poisson parameter and the claim severity density.
[UK Sept 2002]
3. (i) Let N be the number of claims on a risk in one year. Suppose claims
X1, X2, ... are independent, identically distributed random variables,
independent of N. Let S be the total amount claimed in one year.
(a) Derive E(S) and var(S) in terms of the mean and variance of N and
X1.
(b) Derive an expression for the MGF M_S(t) of S in terms of the MGFs
M_X(t) and M_N(t) of X1 and N respectively.
(c) If N has a Poisson distribution with mean λ show that:
$M_S(t) = \exp(\lambda(M_X(t) - 1))$

(d) If N has a binomial distribution with parameters m and q,
determine the MGF of S in terms of m, q and M_X(t).
(ii) A portfolio consists of 500 independent risks. For the ith risk, with
probability 1 – qi there are no claims in one year, and with probability qi
there is exactly one claim (0 < qi < 1). For all risks, if there is a claim, it has
mean μ, variance σ² and MGF M(t). Let T be the total amount claimed on
the whole portfolio in one year.
(a) Determine the mean and variance of T.
The amount claimed in one year on risk i is approximated by a
compound Poisson random variable with Poisson parameter qi and
claims with the same mean μ, the same variance σ² and the same
MGF M(t) as above. Let T̃ denote the total amount claimed on the
whole portfolio in one year in this approximate model.
(b) Determine the mean and variance of T̃, and compare your answers
to those in (ii)(a).
Assume that qi = 0.02 for all i, and if a claim occurs, it is of size μ
with probability one.
(c) Derive the MGF of T, and show that T has a compound binomial
distribution.
(d) Determine the MGF of the approximating T̃, and show that T̃ has a
compound Poisson distribution. [UK April 2003]
4. A portfolio consists of a total of 120 independent risks. On each risk, no more
than one event can occur each year, and the probability of an event occurring is
0.02. When such an event does occur, the number of claims, N, has the following
distribution:
$P(N = x) = 0.4 \times 0.6^{x-1}$  (x = 1, 2, ...)
Determine the mean and variance of the distribution of the number of claims
which arise from this portfolio in a year. [UK Sept 2003]
5. Claims occur in a Poisson process with rate 20. Individual claims are independent
random variables with density $f(x) = \dfrac{3}{(1 + x)^4}$, x > 0, independent of the arrival
process. Calculate the mean and variance of the total amount claimed by time t = 2.
[UK Sept 2003]
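For a compound Poisson total S over a period with expected claim count rate × t, the moments are E[S] = (rate × t) E[X] and var[S] = (rate × t) E[X²]. The sketch below applies this to question 5, reading the density as Pareto with α = 3 and λ = 1; variable names are illustrative.

```python
rate, horizon = 20.0, 2.0

# Claim sizes: f(x) = 3 / (1 + x)^4 is Pareto with alpha = 3, lam = 1.
alpha, lam = 3.0, 1.0
m1 = lam / (alpha - 1)                              # E[X]
m2 = 2 * lam ** 2 / ((alpha - 1) * (alpha - 2))     # E[X^2]

# Compound Poisson moments over the period [0, horizon].
expected_claims = rate * horizon
mean_total = expected_claims * m1
var_total = expected_claims * m2

print(mean_total, var_total)
```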

6. A portfolio consists of two types of policies. For type 1, the number of claims in a
year has a Poisson distribution with mean 1.5 and the claim sizes are
exponentially distributed with mean 5. For type 2, the number of claims in a year
has a Poisson distribution with mean 2 and the claim sizes are exponentially
distributed with mean 4. Let S be the total amount claimed on the whole
portfolio in one year. All policies are assumed to be independent.
(i) Determine the mean and variance of S.
(ii) Derive the MGF of S and show that S has a compound Poisson
distribution. [UK April 2004]
7. The number of claims arising from a hurricane in a particular region has a
Poisson distribution with mean λ. The claim severity distribution has mean 0.5
and variance 1.
(i) Determine the mean and variance of the total amount of claims arising
from a hurricane.
(ii) The number of hurricanes in this region in one year has a Poisson
distribution with mean μ. Determine the mean and variance of the total
amount claimed from all the hurricanes in this region in one year.
[UK Sept 2004]
8. A general insurance company has a portfolio of fire insurance policies, which
offer cover for just one fire each year.
Within the portfolio, there are three types of buildings for which the average cost
of a claim and probability of a claim are given in table below.
Type of building   Number of risks covered   Average cost of a claim (£000s)   Probability of a claim
Small              147                       12.4                              0.031
Medium             218                       27.8                              0.028
Large              21                        130.3                             0.017
It is assumed that the cost of a claim has an exponential distribution, and that all
the buildings in the portfolio represent independent risks for this insurance
cover.
(i) Show that the mean and standard deviation of annual aggregate claims
from this portfolio of insurance policies are £272,715 and £150,671,
respectively, and calculate the coefficient of skewness.

(ii) Using a normal distribution to approximate the distribution of annual
aggregate claims, calculate the premium loading factor necessary such
that the probability that annual aggregate claims exceed premium income
is 0.05.
(iii) Market conditions dictate that the insurer can only charge a premium
which includes a loading of 25%, Calculate the amount of capital that the
insurer must allocate to this line of business in order to ensure that the
probability that annual aggregate claims exceed premium income and
capital is 0.05 (again using a normal approximation).
(iv) Comment on the assumption of independence and the use of a normal
approximation, in relation to your answers to (ii) and (iii). [UK Sept 2004]
9. The number of claims on a portfolio of washing machine insurance policies
follows a Poisson distribution with parameter 50. Individual claim amounts for
repairs are a random variable 100X where X has a distribution with probability
density function:
$f(x) = \frac{3}{32}(6x - x^2 - 5)$ for $1 < x < 5$, and $f(x) = 0$ otherwise.
In addition, for each claim (independently of the cost of the repair) there is a 30%
chance that an additional fixed amount of £200 will be payable in respect of
water damage.
(i) Calculate the mean and variance of the total individual claim amounts.
(ii) Calculate the mean and variance of the aggregate claims on the portfolio.
[UK Sept 2005]
10. (i) Let N be a random variable representing the number of claims arising
from a portfolio of insurance policies. Let Xi denote the size of the ith claim
and suppose that X1, X2, … are independent identically distributed random
variables, all having the same distribution as X. The claim sizes are
independent of the number of claims. Let S = X1 + X2 + … + XN denote the
total claim size. Show that:
$M_S(t) = M_N[\log M_X(t)]$
(ii) Suppose that N has a Type 2 negative binomial distribution with
parameters k > 0 and 0 < p < 1. That is:

$P(N = x) = \frac{\Gamma(k + x)}{\Gamma(x + 1)\,\Gamma(k)}\, p^k q^x$, x = 0, 1, 2, …, where q = 1 − p.
Suppose that X has an exponential distribution with mean 1/λ. Derive an
expression for MS(t).
(iii) Now suppose that the number of claims on another portfolio is R with the
size of the ith claim given by Yi. Let T = Y1 + Y2 + … + YR. Suppose that R has
a binomial distribution with parameters k and 1 − p, and that Yi has an
exponential distribution with mean 1/β. Show that if β is chosen
appropriately then S and T have the same distribution. [UK April 2006]
11. (i) State two conditions for a risk to be insurable. [UK April 2007]
12. The total claims arising from a certain portfolio of insurance policies over a given
month is represented by:
$S = \sum_{i=1}^{N} X_i$ if N > 0, and S = 0 if N = 0,
where N has a Poisson distribution with mean 2 and X1, X2, …, XN is a sequence
of independent and identically distributed random variables that are also
independent of N. Their distribution is such that P(X1 = 1) = 1/3 and P(X1 = 2) = 2/3.
An aggregate reinsurance contract has been arranged such that the amount paid
by the reinsurer is S - 3 (if S > 3) and zero otherwise.
The aggregate claims paid by the direct insurer and the reinsurer are denoted by
SI and SR respectively. Calculate E(SI ) and E(SR). [UK April 2007]
13. The total claim amount, S on a portfolio of insurance policies has a compound
Poisson distribution with Poisson parameter 50. Individual loss amounts have an
exponential distribution with mean 75. However, the terms of the policies mean
that the maximum sum payable by the insurer in respect of a single claim is 100.
(i) Find E(S) and var(S).
(ii) Use the method of moments to fit as an approximation to S:
(a) a normal distribution
(b) a log-normal distribution
(iii) For each fitted distribution, calculate P(S > 3000). [UK Sept 2007]

14. A bicycle wheel manufacturer claims that its products are virtually indestructible
in accidents and therefore offers a guarantee to purchasers of pairs of its wheels.
There are 250 bicycles covered, each of which has a probability p of being
involved in an accident (independently). Despite the manufacturer's publicity, if
a bicycle is involved in an accident, there is in fact a probability of 0.1 for each
wheel (independently) that the wheel will need to be replaced at a cost of £100.
Let S denote the total cost of replacement wheels in a year.
(i) Show that the MGF of S is given by:
$M_S(t) = \left[ \frac{p\,e^{200t} + 18p\,e^{100t} + 81p}{100} + 1 - p \right]^{250}$
(ii) Show that E(S) = 5,000p and var(S) = 550,000p − 100,000p².
Suppose instead that the manufacturer models the cost of replacement
wheels as a random variable T based on a portfolio of 500 wheels, each of
which (independently) has a probability of 0.1p of requiring replacement.
(iii) Derive expressions for E(T) and Var(T) in terms of p.
(iv) Suppose p = 0.05.
(a) Calculate the mean and variance of S and T.
(b) Calculate the probabilities that S and T exceed £500.
(c) Comment on the differences. [UK April 2008]
15. An insurance portfolio contains policies for three categories of policyholder: A, B
and C. The number of claims in a year, N, on an individual policy follows a
Poisson distribution with mean λ. Individual claim sizes are assumed to be
exponentially distributed with mean 4 and are independent from claim to claim.
The distribution of λ, depending on the category of the policyholder, is
Category   Value of λ   Proportion of policyholders
A          2            20%
B          3            60%
C          4            20%
Denote by S the total amount claimed by a policyholder in one year.
(i) Prove that E[S] = E(E[S|λ]).
(ii) Show that E[S|λ] = 4λ and var[S|λ] = 32λ.
(iii) Calculate E[S]
(iv) Calculate Var[S] [UK April 2009]

16. Individual claims under a certain type of insurance policy are for either 1 (with
probability α) or 2 (with probability 1 − α).
The insurer is considering entering into an excess of loss reinsurance
arrangement with retention 1 + k (where k < 1). Let Xi denote the amount paid by
the insurer (net of reinsurance) on the ith claim.
(i) Calculate and simplify expressions for the mean and variance of Xi.
Now assume that α = 0.2. The number of claims in a year follows a Poisson
distribution with mean 500. The insurer wishes to set the retention so that the
probability that aggregate claims in a year will exceed 700 is less than 1%.
(ii) Show that setting k = 0.334 gives the desired result for the insurer.
[UK April 2009]
17. The total number of claims N on a portfolio of insurance policies has a Poisson
distribution with mean λ. Individual claim amounts are independent of N and
each other, and follow a distribution X with mean μ and variance σ². S denotes
the total aggregate claims in the year. The random variable S therefore has a
compound Poisson distribution.
(i) Derive an expression for the moment generating function of S in terms of
the moment generating function of X.
(ii) Derive expressions for the mean and variance of S in terms of λ, μ and σ.
For a particular type of policy, individual losses are exponentially distributed
with mean 100. For losses above 200 the insurer incurs an additional expense of
50 per claim.
(iii) Calculate the mean and variance of S for a portfolio of such policies with
λ = 500. [UK Sept 2009]
18. The number of claims N on a portfolio of insurance policies follows a binomial
distribution with parameters n and p. Individual claim amounts follow an
exponential distribution with mean 1/λ. The insurer has in place an individual
excess of loss reinsurance arrangement with retention M.
(i) Derive an expression, involving M and λ, for the probability that an
individual claim involves the reinsurer.
Let Ii be an indicator variable taking the value 1 if the ith claim involves the
reinsurer and 0 otherwise.
(ii) Evaluate the moment generating function $M_{I_i}(t)$.
Let K be the number of claims involving the reinsurer so that K = I1 + … + IN.

(iii) (a) Find the moment generating function of K.
(b) Deduce that K follows a binomial distribution with parameters that
you should specify. [UK April 2010]
19. An insurance company has issued life insurance policies to 1,000 individuals.
Each life has a probability q of dying in the coming year. In a warm year,
q = 0.001 and in a cold year q = 0.005. The probability of a warm year is 50% and
the probability of a cold year is 50%. Let N be the aggregate number of claims
across the portfolio in the coming year.
(i) Calculate the mean and variance of N.
(ii) Calculate the alternative values for the mean and variance of N assuming
that q is a constant 0.003.
(iii) Comment on the results of (i) and (ii). [UK April 2010]
20. An insurance company has a portfolio of 10,000 policies covering buildings
against the risk of flood damage.
(i) State the conditions under which the annual number of claims on the
portfolio can be modelled by a binomial distribution B(n,p) with
n = 10,000.
These conditions are satisfied and p = 0.03. Individual claim amounts follow a
normal distribution with mean 400 and standard deviation 50. The insurer
wishes to take out proportional reinsurance with the retention α set such that the
probability of aggregate payments on the portfolio after reinsurance exceeding
120,000 is 1%.
(ii) Calculate α assuming that aggregate annual claims can be approximated
by a normal distribution.
This reinsurance arrangement is set up with a reinsurer who uses a premium
loading of 15%.
(iii) Calculate the annual premium charged by the reinsurer.
As an alternative, the reinsurer has offered an individual excess of loss
reinsurance arrangement with retention of M for the same annual premium. The
reinsurer uses the same 15% loading to calculate premiums for this arrangement.
(iv) Show that the retention M is approximately 358.50. [UK Sept 2010]

21. The annual number of claims on an insurance policy within a certain portfolio
follows a Poisson distribution with mean μ. The parameter μ varies from policy
to policy and can be considered as a random variable that follows an exponential
distribution with mean 1/λ.
Find the unconditional distribution of the annual number of claims on a
randomly chosen policy from the portfolio. [UK April 2011]
22. Claim amounts on a certain type of insurance policy depend on a parameter θ
which varies from policy to policy. The mean and variance of the claim amount X
given θ are specified by
E[X|θ] = 200 + θ
V[X|θ] = 10 + 2θ
The parameter θ follows a normal distribution with mean 20 and variance 4.
Find the unconditional mean and variance of X. [UK Sept 2012]
23. Individual claim amounts from a particular type of insurance policy follow a
normal distribution with mean 150 and standard deviation 30. Claim numbers on
an individual policy follow a Poisson distribution with parameter 0.25. The
insurance company uses a premium loading of 70% to calculate premiums.
(i) Calculate the annual premium charged by the insurance company.
The insurance company has an individual excess of loss reinsurance arrangement
with retention of 200 with a reinsurer who uses a premium loading of 120%.
(ii) Calculate the probability that an individual claim does not exceed the
retention.
(iii) Calculate the probability for a particular policy that in a given year there
are no claims which exceed the retention.
(iv) Calculate the premium charged by the reinsurer.
(v) Calculate the insurance company's expected profit. [UK Sept 2012]
24. Claim numbers on a portfolio of insurance policies follow a Poisson process with
parameter λ. Individual claim amounts X follow a distribution with moments
mi = E(X^i) for i = 1, 2, 3, …. Let S denote the aggregate claims for the portfolio.
You may assume that the mean of S is λm1 and the variance of S is λm2.
(i) Derive the third central moment of S and show that the coefficient of
skewness of S is $\lambda m_3 / (\lambda m_2)^{3/2}$.
(ii) Show that S is positively skewed regardless of the distribution of X.
(iii) Show that the distribution of S tends to symmetry as λ → ∞.
[UK April 2013]

25. An insurance company has a portfolio of 1,000 car insurance policies. Claims
arise on individual policies according to a Poisson process with annual rate μ.
The insurance company believes that μ follows a gamma distribution with
parameters α = 2 and λ = 8.
(i) (a) Show that the average annual number of claims per policy is 0.25.
(b) Show that the variance of the number of annual claims per policy is
0.28125.
Individual claim amounts follow a gamma distribution with density
$f(x) = \frac{x}{1{,}000{,}000}\, e^{-x/1000}$ for x > 0.
(ii) Calculate the mean and variance of the annual aggregate claims for the
whole portfolio.
The insurance company has agreed an aggregate excess of loss reinsurance
contract with a retention of £0.55m (this means that the reinsurance company
will pay the excess above £0.55m if the aggregate claims on the portfolio in a
given year exceed £0.55m).
(iii) Calculate, using a Normal approximation, the probability of aggregate
claims exceeding the retention in any year.
For each of the last three years, the total claim amount has in fact exceeded the
retention.
(iv) Comment on this outcome in light of the calculation in part (iii).
[UK April 2013]
26. An insurance company offers dental insurance to the employees of a small firm.
The annual number of claims follows a Poisson process with rate 20. Individual
loss amounts follow an exponential distribution with mean 100. In order to
increase the take-up rate, the insurance company has guaranteed to pay a
minimum amount of £50 per qualifying claim. Let S be the total claim amount on
the portfolio for a given year.
(i) Show that the mean and variance of S are 2,213.06 and 413,918.40
respectively.
[You may use without proof the result that if In= ∫ then
In = ]
(ii) (a) Fit a log-normal distribution for S using the method of moments.
(b) Estimate the probability that S is greater than 4,000.

Sarah, the insurance company's actuary, has instead approximated S by a
Normal distribution.
(iii) Explain, without performing any further calculations, whether the
probability that she calculates that S exceeds 4,000 will be greater or
smaller than the calculation in part (ii). [UK Sept 2013]
27. (i) List six of the characteristics that insurable risks usually have.
(ii) List two key characteristics of a short term insurance contract.
[UK April 2014]
28. Individual claim amounts on a portfolio of motor insurance policies follow a
Gamma distribution with parameters α and λ. It is known that λ = 3 for all
drivers, but the parameter α varies across the population. 70% of drivers have
α = 300 and the remaining 30% have α = 600.
Claims on the portfolio follow a Poisson process with annual rate 500 and the
likelihood of a claim arising is independent of the parameter α.
Calculate the mean and variance of aggregate annual claims on the portfolio.
[UK April 2014]
29. An insurance company has a portfolio of 240 insurance policies. The probability
of a claim on the ith policy in a year is pi, independently from policy to policy,
and there is no possibility of more than one claim. Claim amounts on the ith
policy follow an exponential distribution with mean 100/pi.
Let X denote the aggregate annual claims on the portfolio.
Determine the mean and variance of X. [UK Sept 2014]
30. The number of claims, N, in a given year on a particular type of insurance policy
is given by:
$P(N = n) = 0.8 \times 0.2^n$, n = 0, 1, 2, …
Individual claim amounts are independent from claim to claim and follow a
Pareto distribution with parameters α = 5 and λ = 1,000.
(i) Calculate the mean and variance of the aggregate annual claims per
policy.
(ii) Calculate the probability that aggregate annual claims exceed 400 using:
(a) a Normal approximation. (b) a Lognormal approximation.
(iii) Explain which approximation in part (ii) you believe is more reliable.
[UK April 2015]

31. (i) State the simplifications usually made in the basic model for short term
insurance contracts.
(ii) Give two examples of forms of insurance that can be regarded as short
term insurance contracts. [UK Sept 2015]
32. The number of claims in an insurance company follows type 2 negative binomial
distribution with mean and variance equal to 100 and 150 respectively.
Individual claim amounts follow exponential distribution with mean 100.
(i) What are the advantages of negative binomial distribution compared to
Poisson distribution for number of claims?
(ii) Deduce MGF of aggregate claims and calculate mean and variance.
[India Sept 2015]
33. An insurance company introduces a one year health insurance product which
pays a fixed benefit upon surgical procedures as specified in the policy contract.
The maximum number of claims permissible under the contract is limited to 2.
The benefit payable upon surgery is divided into two categories, minor and
major, where Minor Surgery Benefit = Rs. 100000 and Major Surgery Benefit = Rs.
200000.
The probabilities associated with minor and major surgical claims are 0.7 and 0.3
respectively.
Assume that the number of claims from each policy follows a discrete distribution
with the following probability function:
Probability (number of claims equals 0) = 0.7
Probability (number of claims equals 1) = 0.2
Probability (number of claims equals 2) = 0.1
Derive the distribution function of the aggregate claim amount from an
individual policy over the coming year. [India May 2015]
34. In the country of Tyreia, a car tyre manufacturer offers a guarantee to purchasers
of its tyres. There are 500 cars covered, each of which has a probability p of being
involved in an accident (independently) and if a car is involved in an accident,
there is a probability of 0.1 for each tyre (independently) that the tyre will need
to be replaced at a cost of 5 units. Let S denote the total cost of replacement tyres
in a year. (Assume each car has 4 tyres)
(i) Find the moment generating function of S.
(ii) Hence show that E(S) = 1000p and Var(S) = 6500p − 2000p².

Suppose instead that the manufacturer models the cost of replacement tyres as a
random variable T based on a portfolio of 2000 tyres, each of which
(independently) has a probability of 0.1p of requiring replacement.
(iii) Derive expressions for E(T) and Var(T) in terms of p.

(iv) Suppose p = 0.05.
(a) Calculate the mean and variance of S and T.
(b) Calculate the probabilities that S and T exceed 75 units.
(c) Comment on the differences. [India Nov 2014]
35. The total number of claims N on a portfolio of insurance policies has a Poisson
distribution with mean λ. Individual claim amounts are independent of N and
each other, and follow a distribution X with mean μ and variance σ². S denotes
the total aggregate claims in the year. The random variable S therefore has a
compound Poisson distribution.
(i) Derive an expression for the moment generating function of S in terms of
the moment generating function of X.
(ii) Derive expressions for the mean and variance of S in terms of λ, μ and σ.
For a particular type of policy, individual losses are exponentially distributed
with mean 100. For losses above 200 the insurer incurs an additional expense of
50 per claim.
(iii) Calculate the mean and variance of S for a portfolio of such policies with
λ = 500. [India May 2014]
36. An insurance company issues 1000 policies to professional cyclists, where the
probability of claim in one year is p for each policy. The cyclists participate in
cycle races held at either mountain area or plain area. The value of p is 0.03 in
mountain area and it is 0.01 in plain area. The probability of being in mountain
or plain area in the next year is 50%. Let N be the aggregate number of claims
from the portfolio in the coming year.
(a) Calculate the mean and variance of N.
(b) Calculate an alternative approximation for mean and variance of N with
approximate common value of p as 0.02.
An actuary re-examines the portfolio and segregates it into two groups: a high-risk
and a low-risk category. The aggregate claim amount from each category follows a

compound Poisson distribution. The high-risk category includes 400 policies
with Poisson parameter 0.2 and fixed individual claim amount of INR 4000. The
low risk category includes 400 policies with Poisson parameter 0.1 and
individual claim amounts are either of INR 8000 with probability 0.6 or of INR
10000 with probability 0.4. All policies are assumed to be independent and let S
be the random variable denoting the aggregate claim amount from the portfolio
over the next year.
(c) Calculate the mean and variance of S.
(d) Using normal approximation of S calculate the value of y where P(S > y) =
0.2. [India May 2013]
37. The number of deaths (S) in a railway division in a particular year is the
aggregate over the number of deaths in different fatal accidents. The number of
fatal accidents in a year (N), and the number of deaths in the ith fatal accident (Ui)
for i = 1,2, …, N have the following distributions.
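$P(N = n) = \frac{1}{4}\left(\frac{3}{4}\right)^n$, n = 0, 1, 2, …
$P(U_i = u) = \frac{3}{4}\left(\frac{1}{4}\right)^{u-1}$, u = 1, 2, 3, …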
Further, the numbers of deaths in different fatal accidents are independent of one
another, and are also independent of N.
(i) Calculate the moment generating function of N, and hence its mean and
variance.
(ii) Calculate the mean and variance of Ui.
(iii) Calculate the mean and variance of S.
(iv) The railway authorities provide accidental death cover for up to two
deaths per fatal accident per year, and engage a reinsurer for covering the
remaining deaths. Calculate the mean and variance of Y, the number of
deaths covered by the reinsurer over a year. [India Nov 2011]

ANSWERS
1. E(S) = 80m V(S) = 25,622m
()
2. (i) () , -
(ii)( ) () * ,( ) - ,( ) - ,( ) -+
(ii)(b) ( )
3. (i) (a) E[S] = E[X1]E[N] V[S] = E[N]V[X1] + V[N][E(X1)]²
(b) ( ( ( )) (d) ( ) ( ( ))
(ii) (a) E[T] = ∑  V[T] = ∑ ∑ ( )
(b) E[ ̌ - ∑ V[ ̌ - ( )∑
(c) () ( ) Form of compound binomial with parameters 500

and 0.02
(d) ̌( ) ( ( )) Form of compound poisson with parameters 10.
4. Mean = 6 Variance = 23.7
5. E[S(2)] = 20 V[S(2)] = 40
6. (i) E[S] = 15.5 V[S] = 139
(ii) $M_S(t) = \exp\left\{3.5\left[\tfrac{3}{7}(1 - 5t)^{-1} + \tfrac{4}{7}(1 - 4t)^{-1} - 1\right]\right\}$, which is of the form of the MGF of a
compound Poisson distribution with parameter 3.5.
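The figures in answer 6(i) can be checked with a short calculation. The sketch below is illustrative only: the function name is not from the workbook, and it assumes the standard compound Poisson results E[S] = λE[X] and Var[S] = λE[X²], with E[X²] = 2m² for an exponential claim of mean m.

```python
def compound_poisson_moments(lam, mean_claim):
    """Mean and variance of a compound Poisson sum with exponential claims."""
    second_moment = 2 * mean_claim ** 2  # E[X^2] for an exponential with this mean
    return lam * mean_claim, lam * second_moment

# The two policy types are independent, so their means and variances add.
m1, v1 = compound_poisson_moments(1.5, 5)
m2, v2 = compound_poisson_moments(2.0, 4)
mean_S, var_S = m1 + m2, v1 + v2
print(mean_S, var_S)
```

Running it reproduces the mean and variance quoted in answer 6(i).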
7. (i) E[S] = 0.5λ V[S] = 1.25λ
(ii) E[T] = 0.5λμ V[T] = 1.25λμ + 0.25λ²μ
8. (i) Coefficient of skewness = 1.6 (ii) θ ≈ 0.909
(iii) Capital = 179,660
(iv) Independence could be doubtful because building all classified in one category
are likely to be facing similar risks or be in the same area. The normal
distribution is not a good approximation since it is a symmetrical distribution,
whereas it is likely that the distribution of S is positively skewed.
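The numbers in question 8 can be reproduced as follows. This is a sketch under the question's own assumptions (independent policies, at most one exponential claim each, Normal approximation); the variable names are illustrative. Because it uses the exact Normal quantile rather than a rounded table value, the capital figure may differ from the answer key in the last digits.

```python
from math import sqrt
from statistics import NormalDist

# (number of risks, mean claim in £000s, claim probability) from the question's table
risks = [(147, 12.4, 0.031), (218, 27.8, 0.028), (21, 130.3, 0.017)]

# Each policy pays I*X with I ~ Bernoulli(q) and X exponential with mean m,
# so E = q*m and Var = q*E[X^2] - (q*m)^2 = q*m^2*(2 - q).
mean = sum(n * q * m for n, m, q in risks)
var = sum(n * q * m ** 2 * (2 - q) for n, m, q in risks)
sd = sqrt(var)

z = NormalDist().inv_cdf(0.95)          # upper 5% point of N(0, 1)
loading = z * sd / mean                 # part (ii): premium = (1 + loading) * mean
capital = mean + z * sd - 1.25 * mean   # part (iii): loading fixed at 25%
print(round(mean, 3), round(sd, 3), round(loading, 4), round(capital, 2))
```

All amounts are in £000s, matching the table.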
9. (i) Mean = 360 Variance = 16,400
(ii) Mean = 18000 Variance = 7,300,000

10. (ii) $M_S(t) = \left[\frac{p(\lambda - t)}{p\lambda - t}\right]^k$
11. (i) The policyholder must have an interest in the risk to be insured.
(ii) The risk must be of a quantifiable financial nature.
12. E[SI] = 2.203 E[SR] = 1.130
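One way to check answer 12 is to compute the first few probabilities of S and split the expected claims at the retention. The sketch below uses the Panjer recursion for a compound Poisson sum, which is a choice made here and not necessarily the method intended by the workbook; all names are illustrative.

```python
from math import exp

lam = 2.0
claim_pmf = {1: 1 / 3, 2: 2 / 3}   # P(X = 1), P(X = 2)
retention = 3

# Panjer recursion for a compound Poisson sum on the integers:
# g[0] = exp(-lam), g[r] = (lam / r) * sum_j j * f[j] * g[r - j].
g = [exp(-lam)]
for r in range(1, retention):
    g.append(lam / r * sum(j * f * g[r - j] for j, f in claim_pmf.items() if j <= r))

# The insurer pays min(S, retention); the reinsurer pays the rest.
mean_S = lam * sum(x * f for x, f in claim_pmf.items())
mean_insurer = sum(s * p for s, p in enumerate(g)) + retention * (1 - sum(g))
mean_reinsurer = mean_S - mean_insurer
print(round(mean_insurer, 3), round(mean_reinsurer, 3))
```

Only P(S = 0), P(S = 1) and P(S = 2) are needed, because every outcome of 3 or more costs the insurer exactly 3.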
13. (i) E[S] = 2,761.51 Var[S] = 216,529
(ii) (a) E[S] = 2,761.51 2 = Var[S] = 216,529
(b) 7.90953 2 = 0.028
(iii) Normal = 0.3050 Lognormal = 0.2810
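The method-of-moments fits in answer 13 can be sketched numerically. The code assumes the capped claim is min(X, 100) with X exponential of mean 75, as the question states; the variable names are illustrative. Tail probabilities computed this way may differ slightly from the answer key, which appears to rely on rounded table values.

```python
from math import exp, log, sqrt
from statistics import NormalDist

lam, theta, cap = 50, 75.0, 100.0   # Poisson rate, exponential mean, per-claim limit

# Moments of the capped claim Y = min(X, cap) for X exponential with mean theta.
tail = exp(-cap / theta)
m1 = theta * (1 - tail)
m2 = 2 * theta ** 2 * (1 - tail) - 2 * theta * cap * tail

mean_S, var_S = lam * m1, lam * m2   # compound Poisson: lam*E[Y], lam*E[Y^2]

# Method of moments: match the mean and variance of S.
normal = NormalDist(mean_S, sqrt(var_S))
sigma2 = log(1 + var_S / mean_S ** 2)
mu = log(mean_S) - sigma2 / 2
log_S = NormalDist(mu, sqrt(sigma2))  # distribution of log S under the lognormal fit

p_normal = 1 - normal.cdf(3000)
p_lognormal = 1 - log_S.cdf(log(3000))
print(round(mean_S, 2), round(var_S), round(mu, 5), round(sigma2, 3))
print(round(p_normal, 4), round(p_lognormal, 4))
```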
14. (iii) E[T] = 5,000p Var[T] = 500,000p − 50,000p²
(iv) (a) E[S] = 250, E[T] = 250, Var[S] = 27,250 and Var[T] = 24,875
(b) P(S>500) = P(S>550) = 0.0346 (Applying Continuity correction)
P(T>500) = P(T>550) = 0.0286 (Applying Continuity correction)
(c) Both S and T have the same mean, but variance for S is larger. This makes the tail
values more likely and hence the probability of S exceeding 500 is larger.
15. (iii) E[S] = 12 Var[S] = 102.4
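16. (i) E[Xi] = 1 + k(1 − α) Var[Xi] = k²α(1 − α)
17. (i) $M_S(t) = \exp\{\lambda (M_X(t) - 1)\}$ (ii) E[S] = λμ V[S] = λ(μ² + σ²)
(iii) E[S] = 53,383.38 V[S] = 12,199,198
18. (i) P(X > M) = $e^{-\lambda M}$ (ii) $M_{I_i}(t) = 1 - e^{-\lambda M} + e^{-\lambda M} e^t$
(iii) (a) $M_K(t) = \left[1 - p e^{-\lambda M} + p e^{-\lambda M} e^t\right]^n$ (b) K ~ Bin(n, $p e^{-\lambda M}$)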
19. (i) E[N] = 3 V[N] = 6.987
(ii) E[N] = 3 V[N] = 2.991
(iii) The mean is the same in parts (i) and (ii) because the mean depends only on the
expected value of q, which is 0.003 in both cases. However, the variance is bigger in (i) as
q is a random variable in (i) and Var(1000q) will be bigger in part (i) and zero in part (ii)

as q is constant in part (ii). This reflects more uncertainty in the number of claims when
we assume that q is a random variable than there is when q is constant.
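The variance decomposition behind answer 19 can be checked directly. This sketch assumes N is Binomial(1000, q) given q, as in the question, and uses Var[N] = E[Var(N | q)] + Var(E[N | q]); the names are illustrative.

```python
n = 1000
scenarios = [(0.5, 0.001), (0.5, 0.005)]   # (probability, q) for warm and cold years

# Conditional on q, N ~ Binomial(n, q).
mean_q = sum(w * q for w, q in scenarios)
var_q = sum(w * (q - mean_q) ** 2 for w, q in scenarios)

mean_N = n * mean_q
# Var[N] = E[Var(N | q)] + Var(E[N | q])
var_N = sum(w * n * q * (1 - q) for w, q in scenarios) + n ** 2 * var_q

# Part (ii): q fixed at its mean, so only the binomial variance remains.
var_N_fixed = n * mean_q * (1 - mean_q)
print(round(mean_N, 3), round(var_N, 3), round(var_N_fixed, 3))
```

The gap between the two variances is n² Var(q), the term that vanishes when q is constant.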
20. (i) The assumptions are:
• There is at most one claim per policy.
• The probability of a claim, p, is equal for each policy.
• The policies and claims on policies are independent.
(ii) α = 0.882348 (iii) 16,236
21. P(N = n) = $\frac{\lambda}{(1 + \lambda)^{n+1}}$, n = 0, 1, 2, …
22. E[X] = 220 V[X] = 54
23. (i) 63.75 (ii) 0.95224 (iii) 0.9881
(iv) 0.32703 (v) 26.07
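Answer 23 can be reproduced with the standard library. The sketch assumes the closed form E[(X − M)+] = σ[φ(z) − z(1 − Φ(z))] with z = (M − μ)/σ for a Normal claim, and treats claims above the retention as a thinned Poisson process; the names are illustrative. Small differences from the answer key are expected where the key uses interpolated table values.

```python
from math import exp
from statistics import NormalDist

lam, retention = 0.25, 200.0
claim = NormalDist(150, 30)
std = NormalDist()

premium = 1.7 * lam * claim.mean                      # (i) 70% loading
p_below = claim.cdf(retention)                        # (ii)
p_no_large = exp(-lam * (1 - p_below))                # (iii) thinned Poisson process

# E[(X - M)+] for a normal claim: sigma * (pdf(z) - z * (1 - cdf(z)))
z = (retention - claim.mean) / claim.stdev
excess = claim.stdev * (std.pdf(z) - z * (1 - std.cdf(z)))
reins_premium = 2.2 * lam * excess                    # (iv) 120% loading
profit = premium - reins_premium - lam * (claim.mean - excess)   # (v)
print(round(premium, 2), round(p_below, 4), round(p_no_large, 4),
      round(reins_premium, 3), round(profit, 2))
```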
24. (i) Third central moment of S is λm3.
(ii) Since X takes only positive values, we have m3 > 0. This means the coefficient of skewness
is always positive.
25. (ii) Mean = 500,000 Variance = 1,625,000,000
(iii) Prob = 0.1074
(iv) The probability of exceeding the retention three years in a row is 0.1074³ = 0.00124.
This is very low, so it is more likely that the insurance
company's belief about the distribution of claim amounts is incorrect. The normal
approximation tails off quickly and so underestimates the probability of extreme events.
26. (ii)(a) μ = 7.6616 σ² = 0.0811 (b) 0.01319
(iii) The probability under the Normal approximation will be smaller. This is because the log-normal distribution has a "fat tail" and hence gives
more weight to extreme outcomes.
28. E[S] = 65,000 V[S] = 9,521,665
29. E[X] = 24000 V[X] = $10{,}000 \sum_{i=1}^{240} \left(\frac{2}{p_i} - 1\right)$
30. (i) E(S) = 62.5 V(S) = 45,572.92 (ii) (a) 0.0569 (b) 0.0249
(iii) The Pareto distribution is significantly skewed and the Normal distribution is
not. The Normal approximation in (ii)(a) has standard deviation 213.48 and mean 62.5, so
negative values of S (which are impossible in reality) are less than 1 standard deviation
from the mean. The approximation in (ii)(b) will therefore be more reliable.
31. (i) The model assumes that the mean and standard deviation of the claim amounts are
known with certainty. Model assumes that claims are settled as soon as the incident
occurs, with no delays.
No allowance for expenses is made.
No allowance for interest.
(ii) Car insurance, contents insurance (or other similar examples)
32. (i) One advantage that the negative binomial distribution has over the Poisson
distribution is that its variance exceeds its mean. Mean and variance are equal for the
Poisson distribution. Thus, the negative binomial distribution may give a better fit to a
data set which has a sample variance in excess of the sample mean. This is often the case
in practice.
(ii) $M_S(t) = \left[\frac{1 - 100t}{1 - 150t}\right]^{200}$ Mean = 10,000 and Variance = 25,00,000
33.
S     0     100000   200000   300000   400000
CDF   0.7   0.84     0.949    0.991    1
34. (i) $M_S(t) = \left[1 - p + p\,(0.9 + 0.1 e^{5t})^4\right]^{500}$
(iii) E(T) = 1000p V(T) = 5000p(1 − 0.1p)
(iv) (a) E(S) = E(T) = 50, V(S) = 320 and V(T) = 248.75
(b) P(S > 75) = 0.06202 and P(T > 75) = 0.0406
(c) The two distributions have the same mean but different variances, the variance being higher for S
than for T. This means the probability would be higher under S than under T.
Though the probability under the two distributions is small in absolute terms, it is still
higher by 50% under S compared to T.
35. (i) $M_S(t) = \exp\{\lambda (M_X(t) - 1)\}$ (ii) E(S) = λμ V(S) = λ(μ² + σ²)
(iii) Mean = 53,383.38 and Variance = 12,199,198.36
36. (a) E(N) = 20 and V(N) = 119.5 (b) E(N) = 20 and V(N) = 19.6
(c) E(S) = 6,72,000 and V(S) = (66453)² (d) 727926
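Parts (c) and (d) of answer 36 can be checked as below. The sketch assumes each category is a sum of independent compound Poisson policies, so rates, means and variances add across policies; the names are illustrative. The percentile uses the exact Normal quantile, so it can differ from the key by a unit or two.

```python
from math import sqrt
from statistics import NormalDist

# (number of policies, Poisson rate per policy, {claim amount: probability})
groups = [
    (400, 0.2, {4000: 1.0}),
    (400, 0.1, {8000: 0.6, 10000: 0.4}),
]

# Compound Poisson: mean = rate * E[X], variance = rate * E[X^2]; independent groups add.
mean_S = sum(n * lam * sum(x * p for x, p in pmf.items()) for n, lam, pmf in groups)
var_S = sum(n * lam * sum(x ** 2 * p for x, p in pmf.items()) for n, lam, pmf in groups)

# Normal approximation: P(S > y) = 0.2 means y is the 80th percentile.
y = mean_S + NormalDist().inv_cdf(0.8) * sqrt(var_S)
print(round(mean_S), round(sqrt(var_S)), round(y))
```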
37. (i) $M_N(t) = \frac{1}{4 - 3e^t}$, E(N) = 3 and V(N) = 12 (ii) E(U) = 4/3 and V(U) = 4/9
(iii) E(S) = 4 and V(S) = 68/3 (iv) E(Y) = 1/4 and V(Y) = 23/48

SURVIVAL MODELS
1. Show that if the force of mortality μ_{x+t} (0 ≤ t < 1) is given by
$\mu_{x+t} = \frac{q_x}{1 - t\,q_x}$,
this implies that deaths between exact ages x and x + 1 are uniformly distributed.
[UK April 2005]
2. Studies of the lifetimes of a certain type of electric light bulb have shown that the
probability of failure, q0, during the first day of use is 0.05 and after the first day
of use the "force of failure", μx, is constant at 0.01.
(i) Calculate the probability that a light bulb will fail within the first 20 days.
(ii) Calculate the complete expectation of life (in days) of:
(a) a one-day old light bulb (b) a new light bulb
(iii) Comment on the difference between the complete expectations of life
calculated in (ii) (a) and (b). [UK Sept 2005]
3. Calculate 0.25p80 and 0.25p80.5, using the ELT15 (Females) mortality table and
assuming a uniform distribution of deaths. [UK Sept 2006]
4. (i) Define the hazard rate, h(t), of a random variable T denoting lifetime.
(ii) An investigation is undertaken into the mortality of men aged between
exact ages 50 and 55 years. A sample of n men is followed from their 50th
birthdays until either they die or they reach their 55th birthdays.
The hazard of death (or force of mortality) between these ages, h(t), is
assumed to have the following form: h(t) = α +βt where α and β are
parameters to be estimated and t is measured in years since the 50th
birthday.
(a) Derive an expression for the survival function between ages 50 and
55 years.
(b) Sketch this on a graph
(c) Comment on the appropriateness of the assumed form of the
hazard for modelling mortality over this age range.
[UK April 2007]
5. Below is an extract from English Life Table 15 (Males).
Age x   lx
58      88,792
62      84,173

(i) Estimate l60 under each of the following assumptions:
(a) a uniform distribution of deaths between exact ages 58 and 62
years; and
(b) a constant force of mortality between exact ages 58 and 62 years
(ii) Find the actual value of l60 in the tables and hence comment on the
relative validity of the two assumptions you used in part (i).
[UK Sept 2007]
6. (i) Explain the meaning of the rates of mortality usually denoted qx and mx ,
and the relationship between them.
(ii) Write down a formula for tqx, 0 ≤ t ≤ 1, under each of the following
assumptions about the distribution of deaths in the age range [x, x+1]:
(a) uniform distribution of deaths
(b) constant force of mortality
A group of animals experiences a mortality rate qx = 0.1.
(iii) Calculate mx under each of the assumptions (a) and (b) above.
(iv) Comment on your results in part (iii). [UK Sept 2008]
7. Below is an extract from English Life Table 15 (females).
Age x (years)   Number of survivors to exact age x out of 100,000 births
30              98,617
40              97,952
(i) Calculate 5q30 under each of the two following alternative assumptions:
(a) a uniform distribution of deaths (UDD) between ages 30 and 40
years
(b) a constant force of mortality between ages 30 and 40 years
(ii) Calculate the number of survivors to exact age 35 years out of 100,000
births under each of the assumptions in (i) above.

English Life Table 15 (females) was originally calculated using data classified by
single years of age. The number of survivors to exact age 35 years was 98,359.
(iii) Comment on the appropriateness of the assumptions of UDD and a
constant force of mortality between ages 30 and 40 years in this example.
[UK April 2009]
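The two interpolation assumptions in question 7 can be compared numerically. This is a minimal sketch with illustrative variable names: under UDD, lx is linear in age, so l35 is the arithmetic mean of l30 and l40; under a constant force, lx is exponential in age, so l35 is their geometric mean.

```python
from math import sqrt

l30, l40 = 98617, 97952   # ELT15 (females) values quoted in the question

# UDD over [30, 40]: l_x is linear in x, so l35 is the arithmetic mean.
l35_udd = (l30 + l40) / 2
# Constant force over [30, 40]: l_x is exponential in x, so l35 is the geometric mean.
l35_cfm = sqrt(l30 * l40)

q_udd = 1 - l35_udd / l30   # 5q30 under UDD
q_cfm = 1 - l35_cfm / l30   # 5q30 under constant force
print(l35_udd, round(l35_cfm, 1), round(q_udd, 6), round(q_cfm, 6))
```

Both estimates of l35 fall below the published 98,359, which is the starting point for the comment asked for in part (iii).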
8. (i) Prove that, under Gompertz's Law (μx = Bc^x), the probability of survival from age x
to age x + t, tpx, is given by:
$_tp_x = \exp\left[-\frac{B c^x}{\ln c}\left(c^t - 1\right)\right]$
For a certain population, estimates of survival probabilities are available as
follows:
1p50 = 0.995 2p50 = 0.989
(ii) Calculate values of B and c consistent with these observations.
(iii) Comment on the calculation performed in (ii) compared with the usual
process for estimating the parameters from a set of crude mortality rates.
[UK April 2009]
9. Let Tx be a random variable denoting future lifetime after age x, and let T be
another random variable denoting the lifetime of a new-born person.
(i) (a) Define, in terms of probabilities, Sx(t), which represents the
survival function of Tx.
(b) Derive an expression relating Sx(t) to S(t), the survival function of T.
(ii) Define, in terms of probabilities involving Tx, the force of mortality, μx+t.
The Weibull distribution has a survival function given by
( ) ( ) where λ and β are parameters (λ, β > 0).
(iii) Derive an expression for the Weibull force of mortality in terms of λ and β.
(iv) Sketch, on the same graph, the Weibull force of mortality for 0 ≤ t ≤ 5 for
the following pairs of values of λ and β:
λ = 1, β = 0.5 ; λ = 1, β = 1.0; λ = 1, β = 1.5
[UK April 2009]

10. Describe the difference between the following assumptions about mortality
between any two ages, x and y (y > x):
• uniform distribution of deaths
• constant force of mortality
In your answer, explain the shape of the survival function between ages x and y
under each of the two assumptions. [UK Sept 2009]
11. Write down integral equations for the mean and variance of the complete future
lifetime at age x, Tx. [UK April 2010]
12. (i) Write down a formula for tqx (0 ≤ t ≤ 1) under each of the following
assumptions:
(a) uniform distribution of deaths
(b) constant force of mortality
(ii) Calculate 0.5p60 to six decimal places under each assumption given
q60 = 0.05.
(iii) Comment on the relative magnitude of your answers to part (ii).

[UK Sept 2010]
13. A study of the mortality of a certain species of insect reveals that for the first 30
days of life, the insects are subject to a constant force of mortality of 0.05. After 30
days, the force of mortality increases according to the formula:
μ30+x = 0.05 exp(0.01x), where x is the number of days after day 30.
(i) Calculate the probability that a newly born insect will survive for at least
10 days.
(ii) Calculate the probability that an insect aged 10 days will survive for at
least a further 30 days.
(iii) Calculate the age in days by which 90 per cent of insects are expected to
have died. [UK April 2011]

14. (i) Describe what is represented by each of the central rate of mortality, mx,
and the initial rate of mortality, qx.
(ii) State the circumstance in which mx = μx. [UK Sept 2011]
15. The mortality of a certain species of furry animal has been studied. It is known
that at ages over five years the force of mortality, μ, is constant, but the variation
in mortality with age below five years of age is not understood. Let the
proportion of furry animals that survive to exact age five years be 5p0.
(i) Show that, for furry animals that die at ages over five years, the average
age at death in years is (5μ + 1)/μ.
(ii) Obtain an expression, in terms of μ and 5p0, for the proportion of all furry
animals that die between exact ages 10 and 15 years.
A new investigation of this species of furry animal revealed that 30 per cent of
those born survived to exact age 10 years and 20 per cent of those born survived
to exact age 15 years.
(iii) Calculate μ and 5p0. [UK April 2013]
16. (i) Define the force of mortality, μx+t of a random variable T denoting length
of life.
The mortality of a certain species of animal has been studied. It is known that at
ages under five years the force of mortality, μ, is constant.
(ii) Write down an expression, in terms of μ, for the probability that an animal
will survive from birth to exact age five years.
Mortality of these animals at ages over five years exact is incompletely
understood.
However it is known that the probability that an animal aged exactly five years
will survive until exact age 10 years is twice the probability that an animal aged
exactly five years will survive until exact age 20 years.
Assume that the force of mortality, λ, is constant at ages over five years exact.
(iii) Calculate λ.
(iv) Calculate the expectation of life at birth for these animals if λ = μ.
(v) Derive an expression, in terms only of μ, for the expectation of life at birth
for these animals if λ ≠ μ. [UK Sept 2014]

17. The mortality of a rare form of flying beetle is being studied. It has been
discovered that beetles kept in a protected environment have a constant force of
mortality, μ, but that those in the wild have a force of mortality which is 50%
higher. It has been proven that the beetles revert immediately to the higher rate
of mortality if they are released from the protected environment.
A beetle born and always living in the wild has a 58% chance of living for eight
days. Calculate the probability of living the same length of time for:
(a) a beetle born and reared in the protected environment.
(b) a beetle born in the protected environment which is scheduled to be
released into the wild after six days. [UK April 2015]
18. The integrated hazard for mortality for a group of lives over the period (0,t) ,
where t is measured in weeks, is being modelled by the function:
0 1
( ) * +
(i) Find an expression for h(t) , the hazard function at time t .

(ii) Sketch a graph of h(t) .
(iii) Suggest a context where a hazard function with this shape might be
appropriate.
19. Calculate the complete and curtate expectation of life for an animal subject to a
constant force of mortality of 0.05 per annum.
20. For a particular population, the curtate expectations of life are ex = 40.20 and ex+1 = 39.27. Calculate qx.
21. The "Very-ruthless Management Consultancy Company" pays very high wages
but also has a very high failure rate, both from sackings and through people
leaving. A life table for a typical new recruit (with durations measured in years)
would be:
Duration No of lives
0 100,000
1 72,000
2 51,000
3 36,000
4 24,000
5 15,000
6 10,000
7 6,000
8 2,500
9 0

75 graduates started working at the company on 1 September this year. Calculate
the following:
(i) The expected number of complete years that a graduate will complete
with the company.
(ii) A graduate's expected "lifetime" with the company.
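22. A mortality table is defined such that:
$${}_t p_x = \left(1 - \frac{t}{110 - x}\right)^{1/2} \quad \text{for } 0 \le x < 110 \text{ and } 0 \le t < 110 - x$$
$${}_t p_x = 0 \quad \text{for } t \ge 110 - x$$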
Calculate:
(i) the complete expectation of life at exact age 45.
(ii) the force of mortality at age 45.
23. A certain species of insect is subject to a constant force of mortality of μ per day.
Determine an exact expression in terms of μ for the curtate expectation of life of a
newborn insect.
24. In a certain population, the force of mortality at age x is given by:
Calculate the probability that a life now aged exactly 73 will die between exact
age 79 and exact age 82.

ANSWERS
2. (i) 0.21439 (ii) (a) 100 days (b) 95.975 days (UDD) / 95.97478 (CFM)
(iii) The complete expectation of life of a light bulb at any age is an average of the
future lifetimes of all bulbs which have not failed before that age. The value of $\mathring{e}_0$
is lower than $\mathring{e}_1$ because the average $\mathring{e}_0$ includes the very short lifetimes of the
relatively large proportion of bulbs which fail in the first day, which deflate the
average, whereas $\mathring{e}_1$ excludes these.
3. 0.98510 and 0.98464
( )
4. (i) The hazard function is defined as h(t) = .
(ii) (a) ( ) 0 1
(c) If both α and β are positive, then the formula implies a force of mortality
which increases with age, which is sensible for this age range. The parameter α
measures the 'level' of mortality and the parameter β measures the rate of
increase with age. Varying these permits quite a wide range of forms for S(t). So
the formula seems appropriate.
5. (i) (a) 86,482.5 (b) 86,452
(ii) The actual value of l60 from the tables is 86,714. This shows that neither
assumption is very accurate, but that the uniform distribution of deaths (UDD) is
closer than the constant force of mortality. The UDD assumption is better than
the constant force of mortality assumption because UDD implies an increasing
force of mortality over this age range, which is biologically more plausible than
the assumption of a constant force. The fact that the actual value of l60 is
considerably greater than that implied by the UDD assumption suggests that the
true rate of increase of the force of mortality over this age range in English Life
Table 15 (males) is even greater than that implied by UDD.
6. (i) qx is the probability that a life aged exactly x will die before reaching exact age
x+1, and is called the initial rate of mortality.
mx is called the central rate of mortality and represents the probability that a life
alive between the ages of x and x+1 dies.

They are related by
$$m_x = \frac{q_x}{\int_0^1 {}_t p_x \, dt}$$
(ii) (a) UDD - tqx = t*qx (b) CFM - tqx = 1 - exp(-μt)
(iii) UDD – 0.105263 CFM – 0.1053605
(iv) The UDD assumption implies an increasing mortality rate over [x, x+1].
CFM is obviously constant. For a given number of deaths over the period, the
estimated exposure would be highest if we assumed an increasing mortality rate.
We would expect the central rate to be highest for the assumption with the lowest
estimated exposure, hence CFM > UDD is the expected order.
7. (i) (a) 0.0033716 (b) 0.0033773 (ii) 98,283.9
(iii) The actual number of survivors to exact age 35 years is higher (or,
equivalently, mortality is lighter) than that under either the UDD or the constant
force assumptions.
The actual number of survivors implies that there were 258 deaths between ages
30 and 35 years and 407 deaths between ages 35 and 40 years.
The actual data reveal that the force of mortality is higher between ages 35 and
40 years than it is between ages 30 and 35 years for females in English Life Table
15, which suggests that the force of mortality is increasing over this age range.
The assumption of UDD implies an increasing force of mortality.
The actual force of mortality seems to be increasing even faster than is implied by
UDD.
A constant force of mortality is unlikely to be realistic for this age range.
Used over a 10-year age span the assumption of UDD is unlikely to be
appropriate, whereas used over single years of age it is acceptable.
8. (ii) c = 1.20665 B = 3.797 × 10^-7
(iii) In this example, only two observations are provided so there is an analytical
solution to the Gompertz model.

This is unrealistic as in general a graduation process would be used to provide
a fit to a set of crude rates.
This could be done by weighted least squares or maximum likelihood.
The more general graduation process allows the fitting of more complex models
from the Gompertz-Makeham family which have the form
μx = polynomial(1) + exp(polynomial(2))
the parameters of which cannot always so easily be estimated by the method
used in part (ii).
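The two-observation solution in part (ii) can be checked numerically. The following is a minimal Python sketch, assuming Gompertz's Law in the form μx = B c^x, so that -ln(tpx) = B c^x (c^t - 1) / ln c. The function name is illustrative only and does not come from the workbook.

```python
import math

def gompertz_from_two_probs(p1, p2, x=50):
    """Solve for Gompertz B and c given 1p_x = p1 and 2p_x = p2.

    Assumes mu_x = B * c**x (hypothetical helper, for illustration).
    """
    # -ln(tp_x) = B * c**x * (c**t - 1) / ln(c), so the ratio of the two
    # log-probabilities is (c**2 - 1) / (c - 1) = c + 1, which removes B.
    c = math.log(p2) / math.log(p1) - 1.0
    # Substitute c back into the t = 1 equation to recover B.
    B = -math.log(p1) * math.log(c) / (c ** x * (c - 1.0))
    return B, c

B, c = gompertz_from_two_probs(0.995, 0.989)
print(f"c = {c:.5f}, B = {B:.4g}")
```

Dividing the two log-survival probabilities eliminates B, which is why two observations are exactly enough to pin down both parameters.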
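9. (i) (a) Sx(t) = P[Tx > t] (b) Sx(t) = S(x + t) / S(x)
(ii) μx+t = lim (h → 0+) of (1/h) P[Tx ≤ t + h | Tx > t]
(iii) μx+t = λβt^(β-1)
(iv) [Sketch not reproduced. With λ = 1, the force of mortality decreases with t for β = 0.5, is constant at 1 for β = 1.0, and increases with t for β = 1.5.]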
10. A uniform distribution of deaths means EITHER that deaths are evenly spaced
between the ages x and y, OR that tqx = t·qx (0 ≤ t ≤ 1), OR that tpx μx+t is
constant for 0 ≤ t ≤ 1.
It also means that the survival function decreases linearly between ages x and y.
The assumption of a constant force of mortality between any two ages means
EITHER that the hazard does not change with age over this age range, OR that
tpx = (px)^t. This implies that the survival function decreases exponentially between
ages x and y.

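11. With ω denoting the limiting age:
$$E[T_x] = \int_0^{\omega - x} t \, {}_t p_x \, \mu_{x+t} \, dt = \int_0^{\omega - x} {}_t p_x \, dt$$
$$\mathrm{Var}[T_x] = \int_0^{\omega - x} t^2 \, {}_t p_x \, \mu_{x+t} \, dt - \left(E[T_x]\right)^2$$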
12. (i) (a) UDD - tqx = t*qx (b) CFM - tqx = 1 - exp(-μt)
(ii) (a) 0.975000 (b) 0.974679 (c) 0.974359
(iii) The Balducci assumption has the smallest value, and the uniform
distribution of deaths (UDD) the largest value. This is because the UDD implies
an increasing force of mortality over the year of age, whereas the Balducci
assumption implies a decreasing force and a constant force is clearly constant.
The higher the force of mortality in the second half of the year of age relative to
its magnitude in the first half of the year of age, the higher the probability of
survival to age 60.5 years.
The difference between the three values of 0.5p60 is very small in this case.
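The three values quoted in part (ii) can be reproduced as follows. This is a sketch assuming the standard single-year forms for 0 ≤ t ≤ 1: UDD gives tpx = 1 - t·qx, a constant force gives tpx = (1 - qx)^t, and Balducci gives tpx = (1 - qx) / (1 - (1 - t)·qx). The function names are hypothetical.

```python
def p_udd(q, t):
    """tp_x under a uniform distribution of deaths."""
    return 1.0 - t * q

def p_cfm(q, t):
    """tp_x under a constant force of mortality."""
    return (1.0 - q) ** t

def p_balducci(q, t):
    """tp_x under the Balducci assumption: (1-t)q_{x+t} = (1-t) * q_x."""
    return (1.0 - q) / (1.0 - (1.0 - t) * q)

q60, t = 0.05, 0.5
results = {
    "UDD": p_udd(q60, t),
    "CFM": p_cfm(q60, t),
    "Balducci": p_balducci(q60, t),
}
for name, value in results.items():
    print(f"{name}: {value:.6f}")
```

The ordering UDD > CFM > Balducci matches the comment in part (iii).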
13. (i) 0.6065 (ii) 0.2174 (iii) 44.89 days
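The figures for question 13 can be checked with a short sketch. It assumes the integrated hazard is 0.05t up to day 30 and 1.5 + 5(exp(0.01x) - 1) for x days beyond day 30, which follows from integrating the stated force of mortality. The function names are illustrative.

```python
import math

def integrated_hazard(t):
    """Cumulative force of mortality from birth to age t days."""
    if t <= 30:
        return 0.05 * t
    x = t - 30  # days after day 30
    # Integral of 0.05 * exp(0.01 s) over 0..x is 5 * (exp(0.01 x) - 1).
    return 0.05 * 30 + 5.0 * (math.exp(0.01 * x) - 1.0)

def survival(t):
    return math.exp(-integrated_hazard(t))

p_i = survival(10)                   # (i) newborn survives 10 days
p_ii = survival(40) / survival(10)   # (ii) aged 10 survives a further 30 days
# (iii) solve 1.5 + 5 * (exp(0.01 x) - 1) = -ln(0.1) for x, then add 30.
x_star = 100.0 * math.log(1.0 + (-math.log(0.1) - 1.5) / 5.0)
age_90pct_dead = 30.0 + x_star

print(round(p_i, 4), round(p_ii, 4), round(age_90pct_dead, 2))
```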
14. (i) mx is the probability of dying between exact ages x and x+1 per person-year
lived between exact ages x and x+1.
qx is the probability that a life alive at exact age x dies before exact age x+1
(ii) mx and μx are equal when the force of mortality μx+t is constant for 0 ≤ t < 1.
15. (ii) 10p0(1 – 5p10) = 5p0(5p5 – 10p5) = 5p0(exp(-5μ) – exp(-10μ))
(iii) μ = 0.0811, 5p0 = 0.45
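A quick numerical check of parts (ii) and (iii), assuming a constant force μ above age five so that 15p0 / 10p0 = exp(-5μ). Variable names are illustrative.

```python
import math

p10_0 = 0.30  # proportion of births surviving to exact age 10
p15_0 = 0.20  # proportion of births surviving to exact age 15

# Constant force above age 5: 15p0 / 10p0 = 5p10 = exp(-5 * mu).
mu = math.log(p10_0 / p15_0) / 5.0
# 10p0 = 5p0 * exp(-5 * mu), so rearrange for 5p0.
p5_0 = p10_0 * math.exp(5.0 * mu)
# Part (ii) expression: proportion of all births dying between ages 10 and 15.
dying_10_to_15 = p5_0 * (math.exp(-5.0 * mu) - math.exp(-10.0 * mu))

print(round(mu, 4), round(p5_0, 2), round(dying_10_to_15, 2))
```

The last figure equals 10p0 - 15p0, which is a useful sanity check on the expression in part (ii).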
16. (ii) 5p0 = exp(-5μ) (iii) λ = ln 2 / 10 = 0.0693 (iv) 14.43 yrs
(v) (1 - exp(-5μ))/μ + 14.43 exp(-5μ)
17. (a) 0.6955 (b) 0.6646
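The two probabilities in question 17 follow from one calibration step, sketched below. The sketch assumes the wild force of mortality is 1.5μ throughout and that the released beetle spends six days at μ followed by two days at 1.5μ. Variable names are illustrative.

```python
import math

p_wild_8 = 0.58  # 8-day survival probability for a beetle always in the wild

# Wild force is 1.5 * mu, so exp(-8 * 1.5 * mu) = 0.58.
mu = -math.log(p_wild_8) / (1.5 * 8)

# (a) protected for all eight days
p_protected = math.exp(-8 * mu)
# (b) six days protected, then two days in the wild
p_released = math.exp(-6 * mu) * math.exp(-2 * 1.5 * mu)

print(round(p_protected, 4), round(p_released, 4))
```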

18. (i) ( ) (ii)

* +
(iii) A hazard function with this shape might be appropriate, for example, for
modelling the mortality of patients recovering from an operation where there is a
high-risk period (represented by the hump) a couple of weeks after the operation
is carried out.
19. Complete expectation: 1/0.05 = 20 years. Curtate expectation: 1/(exp(0.05) - 1) = 19.504 years.
20. 0.00174
21. (i) 2.165
(ii) The complete "expectation of life" is equal to the curtate expectation plus ½,
i.e., 2.665 years. However, this is based on the (quite dubious!) assumption that
exits occur evenly over each year.
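The curtate figure in question 21 is the sum of the survival probabilities to each whole duration. A minimal sketch using the life table from the question (the list name is illustrative):

```python
# Number of lives at durations 0, 1, ..., 9 from the question's life table.
lives = [100_000, 72_000, 51_000, 36_000, 24_000,
         15_000, 10_000, 6_000, 2_500, 0]

# Curtate expectation: sum over k >= 1 of l_k / l_0.
curtate = sum(lives[1:]) / lives[0]
# Complete expectation under the assumption that exits are spread
# evenly over each year (adds half a year on average).
complete_approx = curtate + 0.5

print(curtate, complete_approx)
```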
22. (i) 43.33 years (ii) 1/130
23. Curtate expectation of life = exp(-μ) / (1 - exp(-μ)) = 1 / (exp(μ) - 1)
24. 0.13487

ESTIMATING THE LIFETIME DISTRIBUTION FUNCTION

1. A study of the mortality of 12 laboratory-bred insects was undertaken. The
insects were observed from birth until either they died or the period of study
ended, at which point those insects still alive were treated as censored.
The following table shows the Kaplan-Meier estimate of the survival function,
based on data from the 12 insects.
t (weeks)      S(t)
0 ≤ t < 1      1.0000
1 ≤ t < 3      0.9167
3 ≤ t < 6      0.7130
6 ≤ t          0.4278
(i) Calculate the number of insects dying at durations 3 and 6 weeks.
(ii) Calculate the number of insects whose history was censored.

[UK April 2005]
2. A lecturer at a university gives a course on Survival Models consisting of 8
lectures. 50 students initially register for the course and all attend the first
lecture, but as the course proceeds the numbers attending lectures gradually fall.
Some students switch to another course. Others intend to sit the Survival Models
examination but simply stop attending lectures because they are so boring. In
this university, students who decide not to attend a lecture are not permitted to
attend any subsequent lectures.
The table below gives the number of students switching courses and stopping
attending lectures after each of the first 7 lectures of the course.

Lecture   Number of students   Number of students ceasing to attend lectures
Number    switching courses    but remaining registered for Survival Models
1         5                    1
2         3                    0
3         2                    3
4         0                    1
5         0                    2
6         0                    1
7         0                    0
The university's Teaching Quality Monitoring Service has devised an Index of
Lecture Boringness. This index is defined as the Kaplan-Meier estimate of the
proportion of students remaining registered for the course who attend the final
lecture. In calculating the Index, students who switch courses are to be treated as
censored after the last lecture they attend.
(i) Calculate the Index of Lecture Boringness for the Survival Models course.
(ii) Explain whether the censoring in this example is likely to be
non-informative. [UK Sept 2005]
3. A life assurance company carried out an investigation of the mortality of male
life assurance policyholders. The investigation followed a group of 100
policyholders from their 60th birthday until their 65th birthday, or until they
died or cancelled their policy (whichever event occurred first).
The ages at which policyholders died or cancelled their policies were as follows:
Died (age in years and months)
60y 5m, 61y 1m, 62y 6m, 63y 0m, 63y 0m, 63y 8m and 64y 3m
Cancelled policy (age in years and months)
60y 2m, 60y 3m, 60y 8m, 61y 0m, 61y 0m, 61y 0m, 61y 5m, 62y 2m, 62y 9m,
63y 9m and 64y 5m
(i) Explain which types of censoring are present in the investigation.
(ii) Calculate the Nelson-Aalen estimate of the integrated hazard for these
policyholders.

(iii) Sketch the estimated integrated hazard function.
(iv) Estimate the probability that a policyholder will survive to age 65.
[UK April 2006]
4. A life insurance company has carried out a mortality investigation. It followed a
sample of independent policyholders aged between 50 and 55 years.
Policyholders were followed from their 50th birthday until they died, they
withdrew from the investigation while still alive, or they celebrated their 55th
birthday (whichever of these events occurred first).
(i) Describe the censoring that is present in this investigation.
An extract from the data for 12 policyholders is shown in the table below.
Policyholder   Last age at which policyholder   Outcome
               was observed (years and months)
1              50 years 3 months                Died
2              50 years 6 months                Withdrew
11             55 years 0 months                Still alive
12             55 years 0 months                Still alive
(ii) Calculate the Nelson-Aalen estimate of the survival function.

(iii) Sketch on a suitably labelled graph the Nelson-Aalen estimate of the
survival function. [UK Sept 2006]
5. A medical study was carried out between 1 January 2001 and 1 January 2006, to
assess the survival rates of cancer patients. The patients all underwent surgery
during 2001 and then attended 3-monthly check-ups throughout the study.
The following data were collected:

For those patients who died during the study exact dates of death were recorded
as follows:
Patient Date of surgery Date of death
A 1 April 2001 1 August 2005
B 1 April 2001 1 October 2001
C 1 May 2001 1 March 2002
D 1 September 2001 1 August 2003
E 1 October 2001 1 August 2002
For those patients who survived to the end of the study:

Patient Date of surgery
F 1 February 2001
G 1 March 2001
H 1 April 2001
I 1 June 2001
J 1 September 2001
K 1 September 2001
L 1 November 2001
For those patients with whom the hospital lost contact before the end of the
investigation:
Patient Date of surgery Date of last check-up
M 1 February 2001 1 August 2003
N 1 June 2001 1 March 2002
O 1 September 2001 1 September 2005
(i) Explain whether and where each of the following types of censoring is
present in this investigation:
(a) type I censoring
(b) interval censoring; and
(c) informative censoring
(ii) Calculate the Kaplan-Meier estimate of the survival function for these
patients. State any assumptions that you make.
(iii) Hence estimate the probability that a patient will die within 4 years of
surgery. [UK April 2007]
6. In order to boost sales, a national newspaper in a European country wishes to
compile a "fair play league table" for the country's leading football clubs. On 1
December it undertakes a survey of all the players who play for these clubs, in
which it collects the following data:
• number of games played by each player since the beginning of the season
(the football season in this country begins in September); and
• for each player who had been dismissed from the field of play between
the beginning of the season and 1 December (inclusive), the number of
games he had played before the game in which he was first dismissed
No games were played on 1 December.
The statistic the newspaper proposes to use in order to construct its "fair play
league table‖ is the probability that a player will not have been dismissed in any
of his first 10 games. It plans to calculate this statistic for each of the 20 leading
clubs.
The following table shows the data collected for the players of the club which
was top of the league on 1 December.
Player   Total number of   Number of times   Games played before
         games played      dismissed         first dismissal
1        12                0
2        12                0
3        12                1                 5
4        12                0
5        12                1                 7
6        12                0
7        10                0
8        9                 1                 0
9        9                 1                 5
10       8                 0
11       6                 2                 2
12       5                 0
13       5                 0
14       4                 1                 0
15       4                 0
(i) (a) Explain how the Kaplan-Meier estimator can be used to estimate
the newspaper's statistic from these data.
(b) Comment on the way in which censoring arises and on the type of
censoring produced.
(ii) Calculate the newspaper's statistic using the data above. [UK Sept 2007]
7. An investigation into the mortality of patients following a specific type of major
operation was undertaken. A sample of 10 patients was followed from the date
of the operation until either they died, or they left the hospital where the
operation was carried out, or a period of 30 days had elapsed (whichever of these
events occurred first). The data on the 10 patients are given in the table below.

Patient Number   Duration of observation (days)   Reason for observation ceasing
1                2                                Died
2                6                                Died
3                12                               Died
4                20                               Left Hospital
5                24                               Left Hospital
6                27                               Died
7                30                               Study ended
8                30                               Study ended
9                30                               Study ended
10               30                               Study ended
(i) State whether the following types of censoring are present in this
investigation. In each case give a reason for your answer.
(a) Type I (b) Type II (c) Random
(ii) State, with a reason, whether the censoring in this investigation is likely to
be informative.
(iii) Calculate the value of the Kaplan-Meier estimate of the survival function
at duration 28 days.
(iv) Write down the Kaplan-Meier estimate of the hazard of death at duration
8 days.
(v) Sketch the Kaplan-Meier estimate of the survival function. [UK April 2008]
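For parts (iii) and (v), the Kaplan-Meier calculation can be sketched in Python as below. This is an illustrative sketch rather than a model answer: the function name and data layout are assumptions, and it adopts the usual convention that lives censored at a given duration are still counted as at risk for deaths at that duration.

```python
from collections import Counter

def kaplan_meier(observations):
    """Return [(t_j, n_j, d_j, S(t_j))] for each death time t_j.

    observations: list of (duration, died) pairs, where died is True
    for a death and False for a censored life (hypothetical layout).
    """
    deaths = Counter(t for t, died in observations if died)
    steps, survival = [], 1.0
    for t in sorted(deaths):
        # At risk just before t: everyone still under observation at t.
        n = sum(1 for u, _ in observations if u >= t)
        d = deaths[t]
        survival *= 1.0 - d / n
        steps.append((t, n, d, survival))
    return steps

# Data from the table in question 7 (duration in days, died?).
patients = [(2, True), (6, True), (12, True), (20, False), (24, False),
            (27, True), (30, False), (30, False), (30, False), (30, False)]

steps = kaplan_meier(patients)
for t, n, d, s in steps:
    print(f"t = {t:2d}  at risk = {n:2d}  deaths = {d}  S(t) = {s:.4f}")

# The estimate is a step function, so S(28) is the value at the last
# death time not exceeding 28.
s_28 = [s for t, _, _, s in steps if t <= 28][-1]
```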
8. In an investigation of reconviction rates among those who have served prison
sentences, let X be a random variable which measures the duration from the date
of release from prison until the ex-prisoner is convicted of a subsequent offence.
The investigation monitored a sample of 100 ex-prisoners (who were all released
on the same date) at one-monthly intervals from their date of release for a period
of 6 months. Those who could not be traced in any month were removed from
the sample at that point and not traced in subsequent months. Reconviction was
assumed to take place at the duration that a prisoner was first known to have
been reconvicted.
(i) Express the hazard rate at duration x months in terms of probabilities.
The investigation produced the following data for a sample of 100 ex-prisoners
Months since   Number of prisoners   Number who had been reconvicted
release        contacted             since last contact
1              100                   0
2              97                    0
3              95                    4
4              90                    3
5              85                    5
6              80                    0
(ii) Calculate the Nelson-Aalen estimate of the survival function.
A previous investigation found that the probability that a prisoner would be
reconvicted within 6 months of release was 0.2.
(iii) Estimate confidence intervals around the integrated hazard using the
results from part (ii) to test the hypothesis that the rate of reconviction has
declined since the previous investigation. [UK Sept 2008]
9. An electronics company developed a revolutionary new battery which it
believed would make it enormous profits. It commissioned a sub-contractor to
estimate the survival function of battery life for the first 12 prototypes. The sub-
contractor inserted each prototype battery into an identical electrical device at
the same time and measured the duration elapsing between the time each device
was switched on and the time its battery ran out. The sub-contractor was
instructed to terminate the test immediately after the failure of the 8th battery,
and to return all 12 batteries to the company.
When the test was complete, the sub-contractor reported that he had terminated
the test after 150 days. He further reported that:
• two batteries had failed after 97 days
• three further batteries had failed after 120 days
• two further batteries had failed after 141 days
• one further battery had failed after 150 days
However, he reported that he was only able to return 11 batteries, as one had
exploded after 110 days, and he had treated this battery as censored at that
duration when working out the Kaplan-Meier estimate of the survival function.
(i) State, with reasons, the forms of censoring present in this study.
(ii) Calculate the Kaplan-Meier estimate of the survival function based on the
information supplied by the sub-contractor.
In his report, the sub-contractor claimed that the Kaplan-Meier estimate of the
survival function at the duration when the investigation was terminated was
0.2727.
(iii) Explain why the sub-contractor's Kaplan-Meier estimate would be
consistent with him having stolen the battery he claimed had exploded.
[UK Sept 2009]
10. A certain profession admits new members to the status of student. Students may
qualify as fellows of the profession by virtue of passing a series of examinations.

Normally student members sit the examinations whilst working for an employer.
There are two sessions of the examinations each year.
An employer provides study support to student members of the profession. It
wishes to assess the cost of providing this study support and therefore wishes to
know the average time it can expect to take for its students to qualify.
The employer has maintained records for 23 of its students who all sat their first
examination in the first session of 2003. The students' progress has been recorded
up to and including the last session of 2009. The following data records the
number of sessions which had been held before the specified event occurred for a
student in this cohort:
Qualified: 6, 8, 8, 9, 9, 9, 11, 11, 13, 13, 13
Stopped studying: 4, 5, 8, 11, 14
The remaining seven students were still studying for the examinations at the end
of 2009.
(i) Determine the median number of sessions taken to qualify for those
students who qualified during the period of observation.
(ii) Calculate the Kaplan-Meier estimate of the survival function, S(t), for the
"hazard" of qualifying, where t is the number of sessions of examinations
since 1 January 2003.
(iii) Hence estimate the median number of sessions to qualify for the students
of this employer.
(iv) Explain the difference between the results in (i) and (iii) above.
[UK April 2010]
11. A researcher is reviewing a study published in a medical journal into survival
after a certain major operation. The journal only gives the following summary
information:
• the study followed 16 patients from the point of surgery
• the patients were studied until the earliest of five years after the
operation, the end of the study or the withdrawal of the patient
from the study
• the Nelson-Aalen estimate, S(t), of the survival function was as
follows
Duration since operation t (years) S(t)
1
0.9355
0.7122
0.6285

(i) Describe the types of censoring which are present in the study.
(ii) Calculate the number of deaths which occurred, classified by duration
since the operation.
(iii) Calculate the number of patients who were censored. [UK Sept 2010]
12. At Miracle Cure hospital a pioneering new surgery was tested to replace human
lungs with synthetic implants. Operations were carried out throughout June
2010. Patients who underwent the surgery were monitored daily until the end of
August 2010, or until they died or left hospital if sooner. The results are shown
below. Where no date is given, the patient was alive and still in hospital at the
end of August.
Patient   Date of Surgery   Date of leaving   Reason for leaving
                            observation       observation
A         June 1            June 3            Died
B         June 3            July 2            Left Hospital
C         June 5
D         June 8
E         June 9            July 11           Died
F         June 12
G         June 16           June 21           Died
H         June 17           Aug 12            Left Hospital
I         June 22
J         June 24           June 29           Died
K         June 25           Aug 20            Died
L         June 26
M         June 29           Aug 6             Left Hospital
N         June 30
(i) Explain whether each of the following types of censoring is present and
for those present explain where they occur:
• right censoring
• left censoring
• informative censoring
(ii) Calculate the Kaplan-Meier estimate of the survival function for these
patients, stating all assumptions that you make.
(iii) Sketch, on a suitably labelled graph, the Kaplan-Meier estimate of the
survival function.
(iv) Estimate the probability that a patient will die within four weeks of
surgery. [UK April 2011]

13. A new weedkiller was tested which was designed to kill weeds growing in grass.
The weedkiller was administered via a single application to 20 test areas of grass.
Within hours of applying the weedkiller, the leaves of all the weeds went black
and died, but after a time some of the weeds re-grew as the weedkiller did not
always kill the roots.
The test lasted for 12 months, but after six months five of the test areas were
accidentally ploughed up and so the trial on these areas had to be discontinued.
None of these five areas had shown any weed re-growth at the time they were
ploughed up.
• Ten of the remaining 15 areas experienced a re-growth of weeds at the
following durations (in months): 1, 2, 2, 2, 5, 5, 8, 8, 8, 8.
• Five areas still had no weed re-growth when the trial ended after 12
months.
(i) Describe, giving reasons, the types of censoring present in the data.
(ii) Estimate the probability that there is no re-growth of weeds nine months
after application of the weedkiller using either the Kaplan-Meier or the
Nelson-Aalen estimator. [UK Sept 2011]
14. Mr Bunn the baker made 12 pies to sell in his shop. He placed the pies in the
shop at 9 a.m. During the rest of the day the following events took place.
Time      Event
10 a.m.   A boy bought two pies
11 a.m.   A man bought three pies
12 noon   Mr Bunn accidentally sat on one pie and squashed it so it
          could not be sold
1 p.m.    A woman bought two pies
2 p.m.    A dog from across the street ran into Mr Bunn's shop and
          stole two pies
3 p.m.    A girl on the way home from school bought one pie
5 p.m.    Mr Bunn closed for the day and the remaining pie was still
          in the shop
(i) Estimate the time it takes Mr Bunn to sell 40% of the pies he makes, using
the Nelson-Aalen estimator.
(ii) Comment on whether you think this estimate would be a good basis for
Mr Bunn to plan his future production of pies. [UK April 2012]
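The Nelson-Aalen calculation in part (i) can be sketched as follows. The sketch makes several assumptions that are not stated in the question: time is measured in hours since 9 a.m., a sale is treated as the event of interest, and the squashed pie, the stolen pies and the unsold pie are all treated as censored. The function name and data layout are illustrative.

```python
import math
from collections import Counter

def nelson_aalen(observations):
    """Return [(t_j, cumulative hazard, exp(-cumulative hazard))].

    observations: list of (time, event) pairs, with event True when the
    event of interest occurred and False when censored (assumed layout).
    """
    events = Counter(t for t, happened in observations if happened)
    steps, cumulative = [], 0.0
    for t in sorted(events):
        # At risk just before t: everything still under observation at t.
        n = sum(1 for u, _ in observations if u >= t)
        cumulative += events[t] / n
        steps.append((t, cumulative, math.exp(-cumulative)))
    return steps

# Hours since 9 a.m.; True = sold, False = censored (assumption).
pies = ([(1, True)] * 2 + [(2, True)] * 3 + [(3, False)] +
        [(4, True)] * 2 + [(5, False)] * 2 + [(6, True)] + [(8, False)])

steps = nelson_aalen(pies)
for t, h, s in steps:
    print(f"t = {t}  integrated hazard = {h:.4f}  S(t) = {s:.4f}")

# 40% of pies sold corresponds to the first time S(t) falls to 0.6 or below.
first_time = next(t for t, _, s in steps if s <= 0.6)
```

Under these assumptions the estimated survival function first drops below 0.6 four hours after opening.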
15. A certain town runs a training course for traffic wardens each year. The course
lasts for 30 days, but the examination which enables someone to qualify as a
traffic warden can be sat any day during the course. In 2011 there were 13
participants who started the training course. The following table has been
compiled to show the day each candidate qualified or the day each candidate
who did not qualify left the course.
Candidate   Day Qualified   Day left without qualifying
A 30
B 5
C 21
D 19
E 12
F 30
G 1
H 19
I 12
J 30
K 15
L 10
M 24
(i) Explain whether the following types of censoring are present:
• interval censoring
• right censoring
• informative censoring
(ii) Calculate the Kaplan-Meier estimate of the non-qualification function.

(iii) Sketch a graph of the Kaplan-Meier estimate, labelling the axes.
When the data were gathered, the reasons for exit of candidates D and H were
accidentally transposed, and those for candidates B and L were also accidentally
transposed.
(iv) Explain how your answer to part (ii) would change if you had access to
the correct (i.e. untransposed) data for candidates D, H, B and L.
[UK Sept 2012]
16. In the context of a survival model:
(i) Define right censoring, Type I censoring and Type II censoring.
(ii) Give an example of a practical situation in which censoring would be
informative. [UK April 2013]
17. The Shining Light company has developed a new type of light bulb which it
recently tested. 1,000 bulbs were switched on and observed until they failed, or
until 500 hours had elapsed. For each bulb that failed, the duration in hours until
failure was noted. Due to an earth tremor after 200 hours, 200 bulbs shattered
and had to be removed from the test before failure.
The results showed that 10 bulbs failed after 50 hours, 20 bulbs failed after 100
hours, 50 bulbs failed after 250 hours, 300 bulbs failed after 400 hours and 50
bulbs failed after 450 hours.
(i) Calculate the Kaplan-Meier estimate of the survival function, S(t), for the
light bulbs in the test.
(ii) Sketch the Kaplan-Meier estimate calculated in part (i).
(iii) Estimate the probability that a bulb will not have failed after each of the
following durations: 300 hours, 400 hours and 600 hours. If it is not
possible to obtain an estimate for any of the durations without additional
assumptions, explain why. [UK April 2013]
18. (i) Explain what is meant by censoring in the context of a mortality
investigation.
A trial was conducted on the effectiveness of a new cream to treat a skin
condition. 100 sufferers applied the cream daily for four weeks or until their
symptoms disappeared if this happened sooner. Some of the sufferers left the
trial before their symptoms disappeared.
(ii) Describe two types of censoring that are present and state to whom they
apply.
The following data were collected.
Numbers of   Day symptoms   Numbers of   Day they left
sufferers    disappeared    sufferers    the trial
2            6              3            2
1            7              1            10
1            10             3            13
2            14
(iii) Calculate the Nelson-Aalen estimate of the survival function for this trial.
(iv) Sketch the survival function, labelling the axes.
(v) Estimate the probability that a person using the cream will still have
symptoms of the skin condition after two weeks. [UK Sept 2013]
19. (i) Describe what is meant by censoring in the context of a mortality

investigation.

(ii) Explain what right-censoring, left-censoring and interval censoring are,

giving an example of each.
A toy manufacturer is testing the lifetime of its new electric children's toy. 500
are set going at 9 a.m. one morning on test rigs plugged into the electricity
supply and are run until 5 p.m. the next day or until they fail, whichever comes
first. Unfortunately the cleaner unplugged a test rig on which 17 toys were still
working at 7 p.m. on the first evening in order to plug his floor polisher in. Then,
as he left work three hours later, he took three of the still working toys for his
children to play with. Of the other 480 toys it was found that 12 failed after four
hours, 25 failed after 11 hours and a further 8 failed after 31 hours.
(iii) Explain which forms of censoring are present in this investigation.

(iv) Calculate the Nelson-Aalen estimate of the survival function.
(v) Sketch a graph of the Nelson-Aalen estimate of the survival function,
labeling the axes.
(vi) Comment on the length of time for which a new toy has a 60% probability
of surviving. [UK April 2014]
20. An investigation was undertaken into the length of post-operative stay in
hospital after a particular type of surgery. All patients undergoing this surgery
between 1 January and 31 January 2013 were observed until either they left the
hospital, died, or underwent a second operation. The event of interest was
leaving the hospital. Patients who died or underwent a second operation during
the period of investigation were treated as censored at the date of death or
second operation respectively. The investigation ended on 28 February 2013, and
patients who were still in the hospital at that time were treated as censored.
(i) State, with reasons, whether the following types of censoring are present
in this investigation:
• right • Type I • Type II • random
(ii) Comment on whether censoring in this investigation is likely to be
informative.
The following data relate to 11 patients included in the investigation.
Date of operation   Date observation ended   Reason that observation ended
2 January           30 January               Second Operation
5 January           7 January                Died
10 January          24 January               Left Hospital
12 January          12 February              Left Hospital
15 January          29 January               Left Hospital
20 January          21 January               Died
23 January          28 February              End of Investigation
24 January          31 January               Second Operation
(iii) Calculate the Kaplan-Meier estimate of the survivor function for
remaining in the hospital.
(iv) Sketch the Kaplan-Meier estimate of the survivor function, labelling the
axes.
(v) Comment on the results of the investigation. [UK Sept 2014]
21. A study was made of a group of people seeking jobs. 700 people who were just
starting to look for work were followed for a period of eight months in a series of
interviews after exactly one month, two months, etc. If the job seeker found a job
during a month, the job was assumed to have started at the end of the month.
Unfortunately, the study was unable to maintain contact with all the job seekers.
The data from the study are shown in the table below:
Months since      Found         Contact
start of study    employment    lost
1                 100           50
2                 70            0
3                 50            20
4                 40            20
5                 20            30
6                 20            60
7                 12            38
8                 6             0
(i) (a) Describe two types of censoring present in the investigation.
(b) Describe an example of a person to whom each type applies.
(ii) Calculate the Kaplan-Meier estimate of the survival function for "remaining
without employment".
A Weibull distribution with a rate h(t) given by the formula
$h(t) = \lambda^{\beta} \beta t^{\beta - 1}$ was
fitted to these data. The estimated value of λ was 0.18 and the estimated value of
β was 0.3.
(iii) Test the goodness-of-fit of the data to this Weibull distribution.
[UK April 2015]

22. (i) Define how the following forms of censoring arise in a survival
investigation:
- right censoring - type I censoring - random censoring
An experience analysis is conducted where the event of interest is the lapse of a
term assurance policy.
(ii) Explain whether each form of censoring listed in part (i) occurs in each of
the following situations. If it is not possible to state whether a form of
censoring occurs, explain why this is the case.
(a) A policyholder dies.
(b) A subset of the policies is migrated to a new administration system
and no data are provided from the new system to the experience
analysis team.
(c) A policy reaches its maturity date. [UK Sept 2015]
23. A school offers a one year course in a foreign language as an evening class. This
is divided into three terms of 13 weeks each with one lesson per week. At the end
of each lesson all the students sit a test and any that pass are awarded a
qualification, and no longer attend the course.
Last year 33 students started the course. Of these 13 dropped out before
completing the year, and 16 passed the test before the end of the year. The last
lesson attended by the students who did not stay for the whole 39 lessons is
shown in the table below along with their reason for leaving.
Number of    Last lesson    Reason for
students     attended       leaving
5            1              Dropped out
1            6              Dropped out
2            7              Passed out
2            13             Dropped out
5            14             Passed out
6            27             Passed out
4            28             Dropped out
1            30             Dropped out
3            36             Passed out
(i) Calculate the Nelson-Aalen estimate of the survival function.
(ii) Sketch a graph of the Nelson-Aalen estimate of the survival function,
labeling the axes.
(iii) Determine the probability that a student who starts the course passes by
the end of the year.
Since only four students had not passed by the end of the year and a total of 16
had passed, the school claims in its publicity that 80% of students are awarded
the qualification by the end of the year.
(iv) Comment on the school's claim in light of your answer to part (iii).
[UK Sept 2015]

24. (i) Assume that the force of mortality between consecutive integer ages, y
and y + 1, is constant and takes the value $\mu_y$.
Let $T_x$ be the future lifetime after age x ($x \le y$) and $S_x(t)$ be the survival
function of $T_x$. Show that:
$\mu_y = \log[S_x(y - x)] - \log[S_x(y + 1 - x)]$
(ii) An investigation was carried out into the mortality of male life office
policyholders. Each life was observed from his 50th birthday until the first
of three possible events occurred: his 55th birthday, his death, or the
lapsing of his policy. For those policyholders who died or allowed their
policies to lapse, the exact age at exit was recorded.
Using the result from part (i) or otherwise, describe how the data arising
from this investigation could be used to estimate:
(a) $\mu_{50}$ (b) ${}_5q_{50}$ [UK April 2006]
25. An investigation was undertaken into the time spent waiting in check-out queues
at a supermarket. A random sample of customers was surveyed, and the times at
which they joined the check-out queue and completed their purchases were
recorded. If they left the check-out queue without completing a purchase, the
time at which they left was also recorded. Below are the data for 12 customers.
Customer    Time joined    Time purchase    Time left without
number                     completed        making purchase
1 10.00 a.m. 10.08 a.m.
2 10.07 a.m. 10.09 a.m.
3 10.10 a.m. 10.16 a.m.
4 10.25 a.m. 10.31 a.m.
5 10.30 a.m. 10.32 a.m.
6 10.45 a.m. 10.49 a.m.
7 11.10 a.m. 11.20 a.m.
8 11.15 a.m. 11.21 a.m.
9 11.35 a.m. 11.40 a.m.
10 11.58 a.m. 12.09 p.m.
11 12.10 p.m. 12.14 p.m.
12 12.15 p.m. 12.22 p.m.
(i) Calculate the Kaplan-Meier estimate of the survival function of the
duration between joining the queue and completing a purchase.

The supermarket decides to introduce a scheme under which any customer who
has to wait at a checkout for more than 10 minutes receives a $2 refund on the
cost of their shopping. The supermarket has 20,000 customers per day.
(ii) Give an estimate of the daily cost of the new scheme.
(iii) Comment on the assumptions that you have made in obtaining the
estimate in (ii). [UK April 2016]
26. (i) Explain the differences between random censoring and Type I censoring
in the context of an investigation into the mortality of life insurance
policyholders. Include in your explanation a statement of the
circumstances in which the censoring will be random, and the
circumstances in which it will be Type I, and give an example of each.
(ii) Explain what non-informative censoring in the investigation in (i) means.
Describe a situation in which censoring might be informative in this
investigation.

ANSWERS
1. (i) 2 insects died at duration 3 weeks and 2 insects died at duration 6 weeks.
(ii) 7 insects were censored.
2. (i) 0.807
(ii) Censoring in this case is unlikely to be non-informative. This is because the

students who switched courses were probably less interested in the subject
matter of Survival Models than those who remained registered.
Therefore they would have been more likely, had they not switched courses, to
cease attending lectures than those who did not switch.
3. (i) The following types of censoring will be present:
• Right censoring because some policyholders cancel their policy before the
end of the period.
• Type I censoring because the investigation stops at a fixed time.
• Random censoring because some lives cancel their policy at an unknown
time.
• Informative censoring because those who cancel their policy tend to be in
better health.
(ii) ̂
(iv) 0.9243
4. (i) There will be Type I censoring of lives that survive to age 55 years. There will
be random censoring of lives that withdraw before age 55 years.

(ii) ̂ ( )
5. (i) (a) Type I censoring is present for those lives still under observation at 31
December 2005 as the censoring times are known in advance.
(b) Interval censoring would be present if we only knew death occurred between
check-ups. However, actual dates of death are known, so interval censoring is
not present.
(c) Informative censoring is not likely to be present. The censoring of lives gives
us no information about future lifetimes.
(ii) ̂ ( ) (iii) 0.2821
6. (i) (a) If, for player i, Ti is the number of games played before he is dismissed, and
Ci is the total number of games played before 1 December, and di = 1 if the player
had been dismissed before 1 December and 0 otherwise.
(b) Censoring in these data arises because not all players have been dismissed
before 1 December. Those players who have yet to be dismissed on that date are
right-censored. This censoring is random [NOT Type I], because the metric of
"duration" is the number of games played since the start of the season, and this
may vary from player to player.
(ii) ̂ ( )
7. (i) Type I censoring is present because the study ends at a predetermined

duration of 30 days.
Type II censoring is not present because the study did not end after a
predetermined number of patients had died.

Random censoring is present because the duration at which a patient left hospital
before the study ended can be considered as a random variable.
(ii) Yes. Those patients who left hospital before 30 days had elapsed are more
likely to be recovering well than those patients who remained in hospital, and so
will probably be less likely to die.
(iii) 0.56 (iv) 0
(v)
( )
8. (i)
(ii) ̂ ( ) {
(iii) The CI for the integrated hazard is (0.0601, 0.2085) and the CI for S(x) is (0.8118, 0.9417). Since the 95
percent confidence interval around S(x) in the current investigation does not
include the value 0.8, and our estimate of S(x) > 0.8 we conclude that the rate of
reconviction has declined since the previous investigation.
9. (i) Type II censoring as the study was terminated after a pre-determined number
of failures. Random censoring of the device which exploded.
(ii) ̂ ( )
(iii) Since 5/18 is not equal to 0.2727, the sub-contractor's story is internally
inconsistent. The Kaplan-Meier estimate of the survival function after the failure
of the 8th battery of 0.2727 would be obtained had only 11 batteries been tested at

the start, and no battery being censored. Therefore the value of S(150) reported
by the sub-contractor is consistent with him having stolen the last battery.
10. (i) 9 sessions
(ii) ̂ ( )
(iii) The median time to qualify as estimated by the Kaplan-Meier estimate is the
first time at which S(t) is below 0.5. Therefore the estimate is 13 sessions.
(iv) The estimate based on students qualifying during the period is a biased
estimate because it does not contain information about students still studying at
the end of the period, or about those who dropped out (stopped studying
without qualifying).
The students still studying at the end of 2009 have (by definition) a longer period
to qualification than those who qualified in the period.
Hence the Kaplan-Meier estimate is higher than the median using only students
who qualified during the period.
11. (i) Type I (right censoring) of patients who survive to duration 5 years. Random
censoring of patients who withdraw from the study.
(ii) 1 death at duration 1 year, 3 deaths at duration 3 years and 1 death at

duration 4 years
(iii) 11
12. (i) Right censoring is present for those still alive and in hospital at the end of
August OR for those who left hospital while still alive
Left censoring is not present
The censoring is likely to be informative, since those leaving hospital are likely to
be in much better health than those who remain.

(ii) ̂ ( )
(iii) (iv) 0.2143
13. (i) Right censoring: some areas never developed new weeds. Type I censoring as
the study lasts for a pre-determined time. Random censoring as the accidental
ploughing happened at a time which was not predetermined.
Interval censoring as we do not know exactly when in each month the weed re-
growth happened. Non-informative censoring as the fact that an area was
ploughed up tells us nothing about the duration to weed re-growth in any of the
remaining areas.
(ii) KM – 0.3889 NA – 0.4596
14. (i) We need t for which S(t) = 0.6. Therefore it will be 4 hours until Mr Bunn has
sold 40% of his pies.
(ii) The estimate would not be a good basis on which to plan future production.
And how long it takes to sell 40% of your goods is not very relevant for future
production.
It is based on only one day's experience, and a good basis for future production
should be based on several days, probably involving different days of the week.
Sales of pies may vary seasonally: data from a winter's day may tell Mr Bunn
little about the demand for pies in summer.
Mr Bunn might be more careful in future not to sit on his pies, and might take
steps to avoid the dog from across the street stealing pies.
The proportion of pies sold will depend on the number of pies Mr Bunn stocks.
He should not assume if he had twice as many pies he would still sell 40% of
them in 4 hours. Mr Bunn may vary his sales strategy, by, for example, reducing
his prices. The method does, however, take account of censored data.

15. (i) Interval - No. We are counting in days and we know which day each event
occurred.
Right - Yes. The end of the course at day 30 cut short the investigation when not
all candidates had qualified.
Informative - Possible. Those who left during the 30 days will probably take
longer to qualify than those who stayed.
(ii) ̂ ( )
(iii)
(iv) Since qualifications are assumed to happen before censorships, swapping D

and H will have no effect at all, as the order of the two events will simply be
reversed. Swapping B and L will reduce the value of nj at time 10 which will
increase the value of the hazard at duration 10 compared with previously at
duration 5. This will increase S(t) from durations 5 to immediately before 10 and
reduce it at durations 10 and over.
16. (i) Right censoring. The duration to the event is not known exactly, but is known
to exceed some value. OR the censoring mechanism cuts short observations in
progress.
Type I censoring. The durations at which observations will be censored are

specified in advance.
Type II censoring. Observation continues until a pre-determined

number/proportion of individuals have experienced the event of interest.

(ii) An investigation of mortality based on life office data in which individuals
who discontinue paying their premiums are censored.
17. (i) ̂ ( )
(ii)
(iii) S(300) = 0.9070 and S(400) = 0.5291. S(600) cannot be estimated without
additional assumptions as it lies outside the range of our data.
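The part (iii) figures can be checked with a short Kaplan-Meier calculation. The sketch below is illustrative only: it assumes the 200 bulbs that shattered are treated as right-censored at 200 hours, and the function name `kaplan_meier` is a hypothetical helper, not part of the workbook.

```python
def kaplan_meier(n_start, events):
    """Return [(time, S(time))] at each failure time.

    events: list of (time, failures, censored) sorted by time.
    Assumes failures at a given time occur before censorings at that time.
    """
    at_risk = n_start
    surv = 1.0
    estimate = []
    for time, failures, censored in events:
        if failures > 0:
            surv *= 1 - failures / at_risk
            estimate.append((time, surv))
        at_risk -= failures + censored
    return estimate


# Question 17 data: (hours, bulbs failed, bulbs censored)
bulbs = [
    (50, 10, 0),
    (100, 20, 0),
    (200, 0, 200),  # shattered in the earth tremor, treated as censored
    (250, 50, 0),
    (400, 300, 0),
    (450, 50, 0),
]

km = kaplan_meier(1000, bulbs)
for time, surv in km:
    print(f"S({time}) = {surv:.4f}")
```

S(300) is read from the step at 250 hours and S(400) from the step at 400 hours. The last step is at 450 hours and observation stops at 500 hours, so the estimate says nothing about 600 hours.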
18. (i) Censoring is the mechanism which prevents us from knowing when an
individual entered the investigation or the exact date of death.
(ii) Right Censoring. The trial is cut short after four weeks when some patients
had still not recovered. OR The trial is cut short when some patients left the trial
before their symptoms disappeared.
Type I Censoring. Censoring times are known in advance for all those patients
still not recovered at the end of the trial.
Random Censoring. The time at which patients left the trial before their
symptoms disappeared is a random variable.
Non-Informative Censoring. There is no reason to believe that those who left the
trial had more or less chance of being cured by the cream than those who
remained.

(iii) ̂ ( )
(iv) (v) 0.93777
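The value quoted for part (v) can be reproduced from the question 18 data. This is a minimal sketch, assuming that sufferers who left the trial are right-censored on the day they left and that recoveries on a given day are counted before departures on that day; `nelson_aalen` is a hypothetical helper name.

```python
import math


def nelson_aalen(n_start, events):
    """Return [(time, cumulative hazard, S(time))] at each event time.

    events: list of (time, events_of_interest, censored) sorted by time.
    Assumes events at a given time occur before censorings at that time.
    """
    at_risk = n_start
    cum_hazard = 0.0
    estimate = []
    for time, occurred, censored in events:
        if occurred > 0:
            cum_hazard += occurred / at_risk
            estimate.append((time, cum_hazard, math.exp(-cum_hazard)))
        at_risk -= occurred + censored
    return estimate


# Question 18 data: (day, symptoms disappeared, left the trial)
cream_trial = [
    (2, 0, 3),
    (6, 2, 0),
    (7, 1, 0),
    (10, 1, 1),
    (13, 0, 3),
    (14, 2, 0),
]

na = nelson_aalen(100, cream_trial)
for day, cum_hazard, surv in na:
    print(f"day {day}: Lambda = {cum_hazard:.4f}, S = {surv:.5f}")
```

The final row (day 14) gives the estimated probability of still having symptoms after two weeks.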
19. (i) Censoring is the mechanism which prevents us from knowing when an
individual entered the investigation or the exact date of death.
(ii) Right-censoring cuts short the investigation in progress so we do not know

exactly when the event of interest happened, we only know it happened after a
certain date.
An example of this might be in a mortality investigation conducted over a period

of one year, all those still alive at the end of the year will die some time after the
end of the investigation, but we do not know when.
Left-censoring prevents us from knowing when entry into the state which we
wish to observe took place.
An example arises in medical studies in which patients are subject to regular

examinations. Discovery of a condition tells us only that the onset fell in the
period since the previous examination, the time elapsed since onset has been left
censored.
Interval-censoring happens if we can only say that an event of interest fell within
some interval of time, rather than exactly when it happened.
For example in a mortality investigation when we only know the calendar year of
death rather than the precise date of death.
(iii) Right-censoring is present as the observation was cut short while in progress
for those toys which were unplugged, taken and which remained working at the
end of the trial.

Type I censoring is present as the trial ended at a predetermined time, so all

those toys still working were Type I censored.
The censoring is likely to be non-informative censoring. The toys which were

unplugged and taken are unlikely to have any special features such as working
for longer or shorter overall than the rest of the toys in the trial.
Random censoring is present as the action of the cleaner censored the toys at
times which were random.
(iv) ̂ ( ) {
(v)
(vi) We do not know the length of time for which a new toy has a 60% chance of
surviving, only that it is some time in excess of 32 hours.
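The conclusion in part (vi) follows from the size of the Nelson-Aalen estimate at the last failure time. The sketch below is an illustration under stated assumptions: durations are measured in hours from the 9 a.m. start, and the unplugged and removed toys are treated as censored at 10 and 13 hours respectively.

```python
import math

# Durations in hours since the 9 a.m. start:
#   7 p.m. on day one -> 10 hours (17 toys unplugged)
#   three hours later -> 13 hours (3 toys taken)
#   5 p.m. next day   -> 32 hours (end of test)
# Each row: (hours, failures, censored)
toys = [
    (4, 12, 0),
    (10, 0, 17),
    (11, 25, 0),
    (13, 0, 3),
    (31, 8, 0),
]

at_risk = 500
cum_hazard = 0.0
survival = {}
for hours, failures, censored in toys:
    if failures:
        cum_hazard += failures / at_risk
        survival[hours] = math.exp(-cum_hazard)
    at_risk -= failures + censored

still_running_at_end = at_risk  # toys Type I censored at 32 hours
for hours, surv in survival.items():
    print(f"S({hours}) = {surv:.4f}")
print("toys still working at 32 hours:", still_running_at_end)
```

Because the estimate is still well above 0.6 after the last observed failure, the data cannot locate the time at which survival drops to 60%.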
20. (i) Right censoring - Yes, of patients not experiencing the event of interest before
28 February either because they died, or because they had a second operation, or
because they remained in the hospital until 28 February, each of which outcomes
cut short observations in progress.
Type I censoring - Yes, of those patients remaining in hospital on 28 February,

since this date was fixed in advance of the investigation.
Type II censoring - No, as the end of the investigation was determined by time,
not by the number of patients who had left hospital.
Random censoring - Yes, of patients who died or who had a second operation,
the times of which were not known in advance of the investigation and can be
considered as random variables.

(ii) Censoring is likely to be informative. Those patients who died or who

underwent a second operation were probably recovering less well than patients
who left hospital. Had they not died or undergone a second operation, they
would probably have remained in hospital for longer than those patients who
were not censored.
(iii) ̂ ( )
(iv)
(v) Deaths occur soon after the operation. There is a high hazard of leaving the
hospital after 14 days.
It may be that clinical protocols regard 14 days as the minimum period for which
patients who have had this operation should remain in hospital, no matter how
well they seem to be recovering.
The fact that censoring is informative is likely to bias the estimate.
The results may not be credible or may have a large variance because the sample
size is very small. The data only allow us to make estimates of "survival" up to a
duration of 36 days.
21. (i) Right censoring - The exact duration of the event is not known, but only that it
exceeds some duration. Example: job seekers with whom contact was lost during
the investigation (or those still seeking jobs at the end of the investigation)
Random censoring - The time at which contact was lost may be regarded as a
random variable. Example: a job seeker with whom contact was lost during the
investigation.

Type I censoring - The censoring times were known in advance (as they were
determined by the fixed period of the investigation). Example: a person still
without work after 8 months.
Interval censoring - The censoring mechanism prevents us from knowing exactly

when the event of interest took place, only that it fell within a certain period.
Example: EITHER a person who actually found a job after 5.5 months (say) is
recorded as having found a job after 6 months; OR a person who was still seeking
work at the end of the investigation found a job within the interval [8,∞)
(ii) ̂ ( )
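A Kaplan-Meier table for the job-seeker data in question 21 can be built as follows. This is a sketch, not the examiners' working: it assumes that people with whom contact was lost in a given month are censored just after that month's job starts, so they still count in the at-risk number for that month.

```python
# Each row: (month, found employment, contact lost)
job_seekers = [
    (1, 100, 50), (2, 70, 0), (3, 50, 20), (4, 40, 20),
    (5, 20, 30), (6, 20, 60), (7, 12, 38), (8, 6, 0),
]

at_risk = 700
surv = 1.0
km_table = []  # rows of (month, at risk, found employment, S(month))
for month, found, lost in job_seekers:
    surv *= 1 - found / at_risk
    km_table.append((month, at_risk, found, surv))
    # Assumption: people lost in a month are censored after that month's events
    at_risk -= found + lost

for month, n, d, s in km_table:
    print(f"t = {month}: n = {n}, d = {d}, S(t) = {s:.4f}")
```

If the lost contacts were instead censored before the month's job starts, each at-risk number would fall by that month's "contact lost" count and the estimates would be slightly lower.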
(iii) The null hypothesis is that the durations at which job seekers find work
follow a Weibull distribution with parameters λ = 0.18 and β = 0.3. The calculated
value of the chi-squared statistic is 16.40. This should be compared with the
critical value at the 5% level with 6 degrees of freedom (because we have eight
durations and two parameters have been fitted, and 8 - 2 = 6) which is 12.59.
Since 16.40 > 12.59 we reject the null hypothesis that the time to employment
follows the Weibull distribution.
22. (i) Right censoring refers to a life ceasing to be observed prior to the event of
interest occurring.
Type I censoring occurs when the censoring times are known in advance and
lives will be considered censored on a pre-determined date regardless of whether
the event of interest has occurred.
Random censoring refers to the time of censoring being a random variable such
that censoring may occur as a random event prior to the event of interest.
(ii) (a) Right censoring occurs because the censoring means no information is
available about whether the policy would subsequently have lapsed.

This is not Type I censoring as it would not be known in advance when the
policyholder would die.
Random censoring occurs as the time of death is a random variable.
(b) It is right censoring as it removes information about whether the policies

subsequently lapsed.
It is not clear whether this is Type I censoring because it is not known whether
the migration was anticipated in the observation plan.
For the same reason it is not clear whether it is random censoring.
(c) It is in theory right censoring, but in practice the event of interest cannot occur
after the censoring date.
It is Type I censoring as the maturity date would be known in advance.
23. (i) ̂ ( )
(ii)
(iii) 48.48%
(iv) The school has ignored those students who dropped out during the year.
Since they did not pass, their exclusion would clearly increase the proportion
who pass.
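The two percentages can be set side by side. This only restates the counts given in the question (33 starters, 16 passes, 13 drop-outs and 4 students still attending at the end of the year):

```latex
\[
\frac{16}{33} \approx 48.48\%
\qquad\text{versus}\qquad
\frac{16}{16 + 4} = 80\%
\]
```

The school's figure uses only the 20 students who stayed or passed as its denominator, whereas the answer to part (iii) uses all 33 who started.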
24. (ii) (a) Using the result from part (i) and putting x = 50, y = 50 gives
$\mu_{50} = -\log[S_{50}(1)]$. Since we have censored data, because of the possibility of
policy lapse, we should estimate $S_{50}(1)$ using the Kaplan-Meier or Nelson-Aalen
estimator and hence obtain an estimate of $\mu_{50}$.

(b) 5q50 = 1 – 5p50, and, since 5p50 = S50(5) , 5q50 can be estimated directly as
1 - S50(5), where S50(5) is the Kaplan-Meier or Nelson-Aalen estimator of the
probability of a life aged 50 years surviving for a further 5 years.
25. (i) ̂ ( )
(ii) 0.2540 × 20,000 × $2 = $10,159
(iii) The survey data mainly relate to the morning. We assume that the staffing
levels of the check-outs relative to customer flow remain the same in the
afternoons.
We assume that the introduction of the compensation scheme does not
change customers' behaviour (for example discouraging customers from
leaving the queue).
The sample size (12) is very small compared to the daily customer base
(20,000) which produces a very "steppy" result. We have had to use the
value for S(10) which is also the value for S(8). A larger sample size may
give a smoother more accurate picture.
26. (i) Both random censoring and Type I censoring are examples of right censoring.
Right censoring occurs when a life exits the investigation for a reason other than
death. With random censoring, the censoring times are not known in advance –
they are not chosen by the investigator and are random variables.
An example of random censoring in life insurance is the event of a policyholder
choosing to surrender a policy.
Type I censoring occurs when the censoring times are known in advance, ie the
censoring times are chosen by the investigator. An example of Type I censoring is
when observation ceases for all those still alive at the end of the period of
investigation.
(ii) Censoring is non-informative if it gives no information about the future

patterns of mortality by age for the censored lives.
In the context of this investigation, non-informative censoring occurs if at any
given time, lives are equally likely to be censored regardless of their subsequent
force of mortality. This means that we cannot tell anything about a person's
mortality after the date of the censoring event from the fact that they have been

censored. In this investigation withdrawals might be informative, since lives that

are in better health may be more likely to surrender their policies than those in a
poor state of health. Lives that are censored are therefore likely to have lighter
mortality than those that remain in the investigation.

PROPORTIONAL HAZARDS MODELS

1. (i) Write down the equation of the Cox proportional hazards model in which
the hazard function depends on duration t and a vector of covariates z.
You should define all the other terms that you use.
(ii) Explain why the Cox model is sometimes described as semi-parametric.

[UK April 2005]
2. An investigation was carried out into the effects of lifestyle factors on the
mortality of people aged between 50 and 65 years. The investigation took the
form of a prospective study following a sample of several hundred individuals
from their 50th birthdays until their 65th birthdays and collecting data on the
following covariates for each person:
X1 Sex (a categorical variable with 0 = female, 1 = male)
X2 Cigarette smoking (a categorical variable with 0 = non-smoker, 1 =

smoker)
X3 Alcohol consumption (a categorical variable with 0 = consumes fewer

than 21 units of alcohol per week, 1 = consumes 21 or more units of
alcohol per week)
In addition, data were collected on the age at death for persons who died during
the period of investigation. In order to analyse the data, it was decided to use a
Gompertz hazard, $\mu_x = Bc^x$, where x is the duration since the start of the
observation.
(i) Explain why the Gompertz hazard might be appropriate for analysing the
mortality of persons aged between 50 and 65 years.
(ii) Show that the substitution: $B = \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3)$, in the
Gompertz model (where $\beta_0, \ldots, \beta_3$ are parameters to be estimated), leads
to a proportional hazards model for this particular analysis.
(iii) Using the Gompertz hazard, the parameter estimates in the proportional
hazards model were as follows:
Covariate              Parameter    Parameter estimate
Sex                    $\beta_1$    0.40
Cigarette smoking      $\beta_2$    0.75
Alcohol consumption    $\beta_3$    -0.2
                       $\beta_0$    -5.00
                       c            1.10

(a) Describe the characteristics of the person to whom the baseline hazard
applies in this model.
(b) Calculate the estimated hazard for a female cigarette smoker aged 55
years who does not consume alcohol.
(c) Show that, according to this model, a cigarette smoker at any age has a
risk of death roughly equal to that of a non-smoker aged eight years older.
[UK Sept 2005]
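One way to set up the arithmetic for parts (iii)(b) and (iii)(c) is sketched below. It is illustrative only and rests on two assumptions that should be checked against the question: the duration x is taken as age minus 50, and "does not consume alcohol" is coded as X3 = 0. The function and variable names are hypothetical.

```python
import math

# Parameter estimates from part (iii)
beta0, beta_sex, beta_smoke, beta_alcohol = -5.00, 0.40, 0.75, -0.2
c = 1.10


def hazard(age, sex, smoker, heavy_drinker):
    """Gompertz proportional hazard B * c**x with x = age - 50 (assumed)."""
    x = age - 50
    b = math.exp(beta0 + beta_sex * sex + beta_smoke * smoker
                 + beta_alcohol * heavy_drinker)
    return b * c ** x


# (b) female (0), smoker (1), fewer than 21 units per week (0), aged 55
h_b = hazard(55, sex=0, smoker=1, heavy_drinker=0)

# (c) smoker aged 52 versus an otherwise identical non-smoker aged 60
ratio = hazard(52, 0, 1, 0) / hazard(60, 0, 0, 0)

print(f"hazard (b) = {h_b:.4f}")
print(f"smoker / non-smoker eight years older = {ratio:.3f}")
```

The ratio in (c) reduces to exp(0.75) / 1.10^8 whatever the starting age, which is why the comparison holds "at any age".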
3. A Cox proportional hazards model was estimated to assess the effect on survival
of a person's sex and his or her self-esteem (measured on a three-point scale as
low, medium or high). The baseline category was males with low self-esteem.
Write down the equation of the model, using algebraic symbols to represent
variables and parameters and defining all the symbols that you use.
[UK April 2006]
4. An investigation was undertaken into the effect of a new treatment on the

survival times of cancer patients. Two groups of patients were identified. One
group was given the new treatment and the other an existing treatment. The
following model was considered:
$h(z, t) = h_0(t) \exp(\beta z^T)$ where: $h(z, t)$ is the hazard at time t, where t is the time
since the start of treatment and $h_0(t)$ is the baseline hazard at time t
$z$ is a vector of covariates such that:
z1 = sex (a categorical variable with 0 = female, 1 = male)
z2 = treatment (a categorical variable with 0 = existing treatment,
1 = new treatment)
and $\beta$ is a vector of parameters, ($\beta_1$, $\beta_2$).
The results of the investigation showed that, if the model is correct:
A the risk of death for a male patient is 1.02 times that of a female
patient; and
B the risk of death for a patient given the existing treatment is 1.05
times that for a patient given the new treatment

(i) Estimate the value of the parameters $\beta_1$ and $\beta_2$.
(ii) Estimate the ratio by which the risk of death for a male patient who has
been given the new treatment is greater or less than that for a female
patient given the existing treatment.
(iii) Determine, in terms of the baseline hazard only, the probability that a
male patient will die within 3 years of receiving the new treatment.
[UK Sept 2006]
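Parts (i) and (ii) come down to taking logarithms of the stated hazard ratios. The sketch below assumes the usual Cox form, in which the hazard is the baseline hazard multiplied by exp(beta1 * z1 + beta2 * z2); the variable names are illustrative.

```python
import math

# Statement A: male / female hazard ratio is 1.02, so exp(beta1) = 1.02
beta1 = math.log(1.02)

# Statement B: existing / new hazard ratio is 1.05, so exp(-beta2) = 1.05
beta2 = -math.log(1.05)

# Part (ii): male on new treatment relative to female on existing treatment
ratio = math.exp(beta1 * 1 + beta2 * 1) / math.exp(beta1 * 0 + beta2 * 0)

print(f"beta1 = {beta1:.4f}, beta2 = {beta2:.4f}, ratio = {ratio:.4f}")
```

The sign of beta2 is negative because z2 = 1 codes the new treatment, which carries the lower risk in statement B.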
5. (i) Compare the advantages and disadvantages of fully parametric models

and the Cox regression model for assessing the impact of covariates on
survival.
You have been asked to investigate the impact of a set of covariates, including
age, sex, smoking, region of residence, educational attainment and amount of
exercise undertaken, on the risk of heart attack. Data are available from a
prospective study which followed a set of several thousand persons from an
initial interview until their first heart attack, or until their death from a cause
other than a heart attack, or until 10 years had elapsed since the initial interview
(whichever of these occurred first).
(ii) State the types of censoring present in this study, and explain how each
arises.
(iii) Describe a criterion which would allow you to select those covariates
which have a statistically significant effect on the risk of heart attack,
when controlling the other covariates of the model.
Suppose your final model is a Cox model which has three covariates: age
(measured in age last birthday minus 50 at the initial interview), sex (male = 0,
female = 1) and smoking (non-smoker = 0, smoker = 1), and that the estimated
parameters are:
Age 0.01
Sex -0.4
Smoking 0.5
Sex x smoking -0.25

where "sex x smoking" is an additional covariate formed by multiplying the two
covariates "sex" and "smoking".
(iv) Describe the final model's estimate of the effect of sex and of smoking
behaviour on the risk of heart attack.
(v) Use the results of the model to determine how old a female smoker must
be at the initial interview to have the same risk of heart attack as a male
non-smoker aged 50 years at the initial interview. [UK Sept 2007]
6. An education authority provides children with musical instrument tuition. The
authority is concerned about the number of children giving up playing their
instrument and is testing a new tuition method with a proportion of the children
which it hopes will improve persistency rates. Data have been collected and a
Cox proportional hazards model has been fitted for the hazard of giving up
playing the instrument. Symmetric 95% confidence intervals (based upon
standard errors) for the regression parameters are shown below.
Covariate            95% confidence interval
Instrument
  Piano              0
  Violin             [-0.05, 0.19]
  Trumpet            [0.07, 0.21]
Tuition method
  Traditional        0
  New                [-0.15, 0.05]
Sex
  Male               [-0.08, 0.12]
  Female             0
(i) Write down a general expression for the Cox proportional hazards model,
defining all terms that you use.

(ii) State the regression parameters for the fitted model.
(iii) Describe the class of children to which the baseline hazard applies.
(iv) Discuss the suggestion that the new tuition method has improved the
chances of children continuing to play their instrument.
(v) Calculate, using the results from the model, the probability that a boy will
still be playing the piano after 4 years if provided with the new tuition
method, given that the probability that a girl will still be playing the
trumpet after 4 years following the traditional method is 0.7.
[UK April 2008]
7. A study was undertaken into the length of spells of unemployment among
young people in a certain city. A sample of young people was monitored from
the time they started to claim unemployment benefit until either they resumed
work, or they moved away from the city. None of the members of the sample
died during the study.
The study investigated the impact of age, sex and educational qualifications on
the hazard of returning to work using the following covariates:
A a young person's age when he or she started claiming benefit (measured
in exact years since his or her 16th birthday)
S a dummy variable taking the value 1 if the person was male and 0 if the
person was female
E a dummy variable taking the value 1 if the person had passed a school
leaving examination in mathematics, and 0 otherwise
with associated parameters βA, βS and βE.
The investigators decided to use a Cox proportional hazards regression model
for the study.
(i) Explain what is meant by a proportional hazards model.
(ii) Explain why the Cox model is a popular model for the analysis of survival
data.
(iii) (a) Write down the equation of the model that was estimated, defining
the terms you use (other than those defined above).

(b) List the characteristics of the young person to whom the baseline
hazard applies.
The results showed:
• The hazard of resuming work for males who started claiming benefit
aged 17 years exact and who had passed the mathematics examination
was 1.5 times the hazard for males who started claiming benefit aged 16
years exact but who had not passed the mathematics examination.
• Females who had passed the mathematics examination were twice as
likely to take up a new job as were males of the same age who had failed
the mathematics examination.
• Females who started claiming benefit aged 20 years exact and who had
passed the mathematics examination were twice as likely to resume work
as were males who started claiming benefit aged 16 years exact and who
had also passed the mathematics examination.
(iv) Calculate the estimated values of the parameters βA, βS and βE.
[UK Sept 2009]
8. (i) Write down the hazard function for the Cox proportional hazards model
defining all the terms that you use.
A farmer is concerned that he is losing a lot of his birds to a predator, so he
decides to build a new enclosure using taller fencing. This fencing is expensive
and he cannot afford to build a large enough area for all his birds. He therefore
decides to put half his birds in the new enclosure and leave the others in the
existing enclosure. He is convinced that the new enclosure is an improvement,
but has asked an actuarial student to determine whether the new enclosure will
result in an increase in the life expectancy of his birds. The student has fitted a
Cox proportional hazards model to data on the duration until a bird is killed by a
predator and calculated the following figures relating to the regression
parameters:
                      Parameter estimate   Variance
Bird       Chicken    0                    0
           Duck       -0.210               0.002
           Goose      0.075                0.004
Enclosure  New        0.125                0.0015
           Old        0                    0
Sex        Male       0.2                  0.0026
           Female     0                    0

(ii) State the features of the bird to which the baseline hazard applies.
(iii) For each regression parameter:
(a) Define the associated covariate.
(b) Calculate the 95% confidence interval based on the standard error.
(iv) Comment on the farmer's belief that the new enclosure will result in an
increase in his birds' life expectancy.
(v) Calculate, using this model, the probability that a female duck in the new
enclosure has been killed by a predator at the end of six months, given
that the probability that a male goose in the old enclosure has been killed
at the end of the same period is 0.1 (all other decrements can be ignored).
[UK April 2010]
9. A study is made of the impact of regular exercise and gender on the risk of
developing heart disease among 50–70 year olds. A sample of people is followed
from exact age 50 years until either they develop heart disease or they attain the
age of 70 years. The study uses a Cox regression model.
(i) List reasons why the Cox regression model is a suitable model for
analyses of this kind.
The investigator defined two covariates as follows:
• Z1 = 1 if male, 0 if female.
• Z2 = 1 if takes regular exercise, 0 otherwise.
The investigator then fitted three models, one with just gender as a covariate, a
second with gender and exercise as covariates, and a third with gender, exercise
and the interaction between them as covariates. The maximised log-likelihoods
of the three models and the maximum likelihood estimates of the parameters in
the third model were as follows:
Model                              Maximised log-likelihood
null model                         –1,269
gender                             –1,256
gender + exercise                  –1,250
gender + exercise + interaction    –1,246

Covariate Parameter
Gender 0.2
Exercise –0.3
Interaction –0.35
(ii) Show that the interaction term is required in the model by performing a
suitable statistical test.
(iii) Interpret the results of the model. [UK Sept 2011]
10. A new drug treatment for patients suffering from a chronic skin disease with
visible symptoms was tested. The drug was administered through a daily dose
for the duration of the trial. As soon as the drug regime started, the symptoms
disappeared in all patients, but after some time had a tendency to reappear as the
agent causing the disease developed resistance to the drug. The trial lasted for six
months.
The data below show the number of patients experiencing a return of their
symptoms in each month after the drug regime started.
Month   Number of patient-months   Number of patients experiencing
        exposed to risk            a return of their symptoms
1       200                        5
2       190                        8
3       175                        15
4       150                        10
5       135                        6
6       125                        3
(i) Calculate the hazard of symptoms returning in each month.
As part of the investigation, it is desired to assess the impact of certain risk
factors on the hazard of symptoms returning. It is suggested that to achieve this,
the hazard could be modelled using either a Gompertz model or a
semi-parametric model.
(ii) Comment on the use of each of these models in this situation. [UK April 2012]

11. For a particular investigation the hazard of mortality is assumed to take the form:
h(t) = A + Bt where A and B are constants and t represents time.
For each life i in the investigation (i = 1, …, n) information was collected on the
length of time the life was observed, ti, and whether the life exited due to death
(δi = 1 if the life died, 0 otherwise).
(i) Show that the likelihood of the data is given by:
$$L = \prod_{i=1}^{n} (A + B t_i)^{\delta_i} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
(ii) Derive two simultaneous equations from which the maximum likelihood
estimates of the parameters A and B can be calculated. [UK April 2012]
12. (i) State one advantage of a semi-parametric model over a fully parametric
one.
(ii) Write down a general expression for the Cox proportional hazards model,
defining all the terms you use.
A life office is trying to understand the impact of certain factors on the lapse rates
of its policies. It has studied the lapse rates on a block of business subdivided by:
• sex of policyholder (Male or Female)
• policy type (Term Assurance or Whole Life)
• sales channel (Internet, Direct Sales Force or Independent Financial Adviser)
The office has fitted a Cox proportional hazards model to the data and has
calculated the following regression parameters:
Covariate                        Regression parameter
Female                           0.2
Male                             0
Term Assurance                   -0.1
Whole Life                       0
Internet                         0.4
Independent Financial Adviser    -0.2
Direct Sales Force               0
(iii) State the sex/sales channel/policy type combination to which the baseline
hazard relates.
A Term Assurance is sold to a Female by an Independent Financial Adviser.
(iv) Calculate the probability that this Term Assurance is still in force after five
years given that 60% of Whole Life policies bought on the Internet by
Males have lapsed by the end of year five. [UK Sept 2012]

13. (i) State the form of the hazard function for the Cox Regression Model,
defining all the terms used.
(ii) State two advantages of the Cox Regression Model.
Susanna is studying for an on-line test. She has collected data on past attempts at
the test and has fitted a Cox Regression Model to the success rate using three
covariates:
Employment Z1 = 0 if an employee, and 1 if self-employed
Attempt Z2 = 0 if first attempt, and 1 if subsequent attempt
Study time Z3 = 0 if no study time taken, and 1 if study time taken
Having analysed the data Susanna estimates the parameters as:
Employment    0.4
Attempt       -0.2
Study time    1.15
Bill is an employee. He has taken study time and is attempting the test for the
second time. Ben is self-employed and is attempting the test for the first time
without taking study time.
(iii) Calculate how much more or less likely Ben is to pass, compared with Bill.
Susanna subsequently discovers that the effect of the number of attempts is
different for employees and the self-employed.
(iv) Explain how the model could be adjusted to take this into account.
[UK April 2013]
14. (i) Explain why the Gompertz model is commonly used in investigations of
human mortality.
The following model of mortality was used in an investigation of the effects of
where someone lives and income on the risk of death.
loge μx = α + β0 x + β1 U + β2 I,
where μx is the force of mortality at age x, U takes the value 1 if the person lives
in an urban area and 0 if the person lives in a rural area, I is annual income in US
dollars, and α, β0, β1 and β2 are parameters.
(ii) Show that the model is both a Gompertz model and a proportional
hazards model.
The estimates of the parameters were α = -9.0, β0 = 0.09, β1 = 0.3 and β2 = -0.0001.
(iii) Calculate the predicted force of mortality for an urban resident aged 40
years with an annual income of $20,000.
(iv) Calculate the additional income that an urban resident must have in order
to have the same force of mortality as a rural resident of the same age.

(v) Calculate the 10-year survival probability for an urban resident aged 40
years whose annual income is $20,000.
(vi) Determine the age of a rural resident with the same income as an urban
resident aged 40 years, who has the same chance of surviving for the next
10 years. [UK Sept 2013]
15. An investigation has been performed into risk factors for liver disease in persons
currently resident in the United Kingdom (UK) and aged over 50 years. It
considered the impact of three covariates: age at the start of the investigation,
weekly alcohol consumption and previous residence in a tropical country.
The investigation used a Cox regression model for the hazard of developing the
disease, h(t), with three parameters, βA, βC and βT, as follows:
h(t) = h0(t) exp(βA A + βC C + βT T).
A was defined as exact age at the start of the investigation less 50 years.
C represented weekly alcohol consumption, and took the value 1 if the person
consumed more than the recommended maximum per week (a heavy drinker)
and 0 otherwise.
T represented previous residence in a tropical country, and took the value 1 if the
person had lived in a tropical country for more than 12 months and 0 otherwise.
(i) State the characteristics of a person to whom the baseline hazard, h0(t),
applies.
The results of the investigation revealed that the hazard was:
• twice as high for a heavy drinker aged 60 years exact at the start of the
investigation than for a person aged 50 years exact at the start of the
investigation who was not a heavy drinker, where neither had previously
lived in a tropical country.
• four times as high for a heavy drinker who had previously lived in a
tropical country for more than 12 months than for a non-heavy drinker of
the same age who had not previously lived in a tropical country.
• three times as high for a person who had lived in a tropical country for
more than 12 months than for a person of the same age and drinking
habits who had always lived in the UK.

(ii) Calculate βA, βC and βT.
The probability of a person aged 50 years exact at the start of the investigation,
who does not drink heavily and has always lived in the UK remaining free of the
disease for 10 years is 0.8.
(iii) Show that the probability of a person of the same age and drinking habits,
who has lived for more than 12 months in a tropical country, remaining
free of the disease for 10 years is slightly over one half. [UK April 2014]
16. (i) Explain what is meant by a proportional hazards model.
(ii) Outline three reasons why the Cox proportional hazards model is widely
used in empirical work. [UK April 2015]
17. (i) Describe what is meant by a proportional hazards model.
A pharmaceutical company is interested in testing a new treatment for a
debilitating but non-fatal condition in cows. A randomised trial was carried out
in which a sample of cows with the condition was assigned to either the new
treatment or the previous treatment. The event of interest was the recovery of a
cow from the condition. The results were analysed using a Cox regression model.
The final model estimated the hazard, h(t,x) as:
h(t,x) = h0(t) exp(β0 z + β1 x + β2 xz),
where: h0(t) is the baseline hazard;
z is a covariate taking the value 1 if the cow was assigned the new treatment and
0 if the cow was assigned the previous treatment;
x is a covariate denoting the length of time (in days) for which the cow had been
suffering from the condition when treatment was started; and t is the number of
days since treatment started.
β0, β1 and β2 are parameters. Their estimated values were β0 = 0.8, β1 = 0.4 and
β2 = -0.1.
(ii) Determine the characteristics of the baseline cow.

For a particular cow, the new treatment and the previous treatment have exactly
the same hazard.
(iii) Calculate the number of days for which that cow had the condition before
the initiation of treatment.
Under the previous treatment, cows whose treatment began after they had been
suffering from the condition for three days had a median recovery time of 14
days once treatment had started.
(iv) Calculate the proportion of these cows, which would still have had the
condition after 14 days if they had been given the new treatment.
[UK Sept 2015]
18. A study is being conducted, using the Cox regression model, into how smoking
affects a patient's future lifetime after they have had a serious heart attack. The
survival times and smoking status for 6 patients are shown in the table below.
Patients have been labelled as 'censored' if they were still alive at the end of the
investigation or if their death was not considered to be attributable to the heart
attack.
Patient No   Time to death   Smoker   Censored
1            3               Yes      No
2            Still Alive     No       Yes
3            9               No       No
4            10              Yes      Yes
5            8               No       Yes
6            7               No       No
The force of mortality, μi(t), for life i at duration t is modelled as:
μi(t) = μ0(t) exp(β Zi) where:
t is the duration in weeks since having a heart attack
μ0(t) is the baseline hazard function at time t
Zi = 1 if life i is a smoker, 0 otherwise
β is a regression parameter
Write down the partial likelihood function of β given these data values.

ANSWERS
1. (i) If the hazard for life i is $\lambda(t; z_i)$, then $\lambda(t; z_i) = \lambda_0(t)\exp(\beta z_i^T)$, where $\lambda_0(t)$ is the
baseline hazard, and $\beta$ is a vector of regression parameters.
(ii) The model is semi-parametric because it is possible to estimate β from the data
without estimating the baseline hazard. Therefore the baseline hazard can have
any shape determined by the data.
2. (i) Taking logarithms of the Gompertz hazard produces log μx = log B + x log c
which indicates that the rate of increase of the hazard with age is constant.
Empirically, this is often a reasonable assumption for middle ages and older
ages, which include the age range 50 - 65 years.
(ii) (a) The baseline hazard in this model relates to a female, non-smoker, who
drinks less than 21 units of alcohol per week.
(b) 0.23
3. h(t) = h0(t) exp[β1 F + β2 M + β3 H] where
h(t) is the estimated hazard, h0(t) is the baseline hazard,
F is a variable taking the value 1 if the life is female, and 0 otherwise,
M is a variable taking the value 1 if the life has medium self-esteem and 0
otherwise,
H is a variable taking the value 1 if the life has high self-esteem and 0 otherwise,
and β1, β2 and β3 are parameters to be estimated.
4. (i) β̂1 = 0.0198 and β̂2 = -0.0488 (ii) 0.9714
(iii) $1 - \exp\left(-0.9714 \int_0^3 h_0(s)\,ds\right)$, where $h_0(s)$ is the baseline hazard.
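The figures in 4(i) and 4(ii) can be checked with a short Python sketch. It assumes, as the signs of the estimates suggest, that the two covariates are indicators taking the value 1 for a male patient and 1 for the new treatment; the variable names are illustrative only.

```python
import math

# Hazard ratios stated in the question
hr_male_vs_female = 1.02     # statement A
hr_existing_vs_new = 1.05    # statement B

# Assumed coding: z1 = 1 for a male, z2 = 1 for the new treatment
beta1_hat = math.log(hr_male_vs_female)      # effect of being male
beta2_hat = -math.log(hr_existing_vs_new)    # effect of the new treatment

# Hazard of a male on the new treatment relative to a female on the existing one
ratio = math.exp(beta1_hat + beta2_hat)

print(round(beta1_hat, 4), round(beta2_hat, 4), round(ratio, 4))
```

Because the ratio is below 1, the male patient on the new treatment has the lower hazard of the two.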
5. (i) Fully parametric models are good for comparing homogeneous groups, as
confidence intervals for the fitted parameters give a test of difference between the
groups which should be better than non-parametric procedures, or
semi-parametric procedures such as the Cox model.
But parametric methods need foreknowledge of the form of the hazard function,
which might be the object of the study.

The Cox model is semi-parametric so such knowledge is not required.
The Cox model is a standard feature of many statistical packages for estimating
survival models, but many parametric distributions are not, and numerical
methods may be required, entailing additional programming.
(ii) Type I censoring, since the investigation ends after a period which is fixed in
advance. Random censoring, since death from a cause other than a heart attack is
a random variable and may occur at any time.
(iii) The likelihood ratio statistic is a common criterion. Suppose we fit a model
with p covariates and another model with p+q covariates which include all the p
covariates of the first model.
Then if the maximised log-likelihoods of the two models are Lp and Lp+q, then
the statistic -2(Lp - Lp+q ) has a chi-squared distribution with q degrees of
freedom, under the hypothesis that the extra q covariates have no effect in the
presence of the original p covariates.
This statistic can be used either with full likelihoods or with partial likelihoods in
the Cox model. It can be used to test the statistical significance of any
set of q covariates in the presence of any other disjoint set of p covariates.
(iv) Holding other factors constant, females have a lower risk of heart attack than
males, and smokers have a higher risk than non-smokers, but the effect of
smoking varies for men and women.
(v) The woman's age at interview must be 65 years.
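The reasoning behind 5(iv) and 5(v) can be sketched in Python as follows. The dictionary keys and the function name `linear_predictor` are illustrative, not part of the original question; the parameter values are those given in the question.

```python
import math

# Estimated parameters from the question
beta = {"age": 0.01, "sex": -0.4, "smoking": 0.5, "sex_x_smoking": -0.25}

def linear_predictor(age_minus_50, sex, smoking):
    """Log hazard ratio relative to the baseline (male non-smoker aged 50)."""
    return (beta["age"] * age_minus_50
            + beta["sex"] * sex
            + beta["smoking"] * smoking
            + beta["sex_x_smoking"] * sex * smoking)

# Part (iv): hazard ratios at age 50 relative to a male non-smoker
hr_male_smoker = math.exp(linear_predictor(0, 0, 1))
hr_female_nonsmoker = math.exp(linear_predictor(0, 1, 0))
hr_female_smoker = math.exp(linear_predictor(0, 1, 1))

# Part (v): age offset that brings a female smoker's predictor back to zero
age_offset = -linear_predictor(0, 1, 1) / beta["age"]
age_at_interview = 50 + age_offset

print(round(hr_male_smoker, 3), round(hr_female_nonsmoker, 3),
      round(hr_female_smoker, 3), round(age_at_interview, 2))
```

Smoking multiplies the hazard by exp(0.5) for men but only by exp(0.5 - 0.25) for women, which is the sense in which the effect of smoking differs by sex.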
6. (i) $\lambda(t; z_i) = \lambda_0(t)\exp(\beta z_i^T)$ where: $\lambda(t; z_i)$ is the hazard at time t, and $\lambda_0(t)$ is the
baseline hazard at time t,
$z_i$ are the covariates and $\beta$ is the vector of regression parameters.
(ii) z1 = 1 if plays violin, 0 otherwise         β1 = 0.07
     z2 = 1 if plays trumpet, 0 otherwise        β2 = 0.14
     z3 = 1 if new tuition method, 0 otherwise   β3 = -0.05
     z4 = 1 if male, 0 otherwise                 β4 = 0.02
(iii) Baseline hazard refers to a female, following traditional tuition method and
playing the piano

(iv) The parameter associated with the new tuition method is -0.05. Because the
parameter is negative, the hazard of dropping out is reduced by the new tuition
method. Therefore the new tuition method does appear to improve the chances
of a child continuing with his or her instrument.
However the 95% confidence interval for the parameter spans zero. So at the 5%
significance level it is not possible to conclude that the new tuition method has
improved the chances of children continuing to play their instrument.
(v) 0.74014
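The figure in 6(v) follows from the proportional hazards relationship S(t) = S0(t)^exp(βz): two lives share the same baseline, so one survival probability can be converted into the other. A minimal Python sketch, in which the helper name `survival_under_ph` is hypothetical:

```python
import math

def survival_under_ph(s_ref, lp_target, lp_ref):
    """Survival of a target life given the survival of a reference life.

    Under proportional hazards S(t) = S0(t) ** exp(lp), so
    S_target = S_ref ** exp(lp_target - lp_ref).
    """
    return s_ref ** math.exp(lp_target - lp_ref)

# Point estimates (midpoints of the confidence intervals)
beta_trumpet, beta_new, beta_male = 0.14, -0.05, 0.02

lp_boy_piano_new = beta_new + beta_male    # piano is the baseline instrument
lp_girl_trumpet_trad = beta_trumpet        # female and traditional are baseline

p = survival_under_ph(0.7, lp_boy_piano_new, lp_girl_trumpet_trad)
print(round(p, 5))
```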
7. (i) A proportional hazards (PH) model is a model which allows investigators to
assess the impact of risk factors, or covariates, on the hazard of experiencing an
event. In a PH model the hazard is assumed to be the product of two terms, one
which depends only on duration, and the other which depends only on the
values of the covariates.
Under a PH model, the hazards of different lives with covariate vectors z1 and z2
are in the same proportion at all times.
(ii) Cox's model ensures that the hazard is always positive. Standard software
packages often include Cox's model. Cox's model allows the general "shape" of
the hazard function for all individuals to be determined by the data, giving a
high degree of flexibility while an exponential term accounts for differences
between individuals.
This means that if we are not primarily concerned with the precise form of the
hazard, we can ignore the shape of the baseline hazard and estimate the effects of
the covariates from the data directly.
(iii) (a) λ(t) = λ0(t) exp(βA A + βE E + βS S), where λ(t) is the estimated hazard and
λ0(t) is the baseline hazard.
(b) A female aged exactly 16 years when she first claimed benefit who had not
passed the school mathematics examination.
(iv) βA = 0.0811, βE = 0.3244, βS = -0.3688
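The three reported hazard ratios in question 7 translate into three linear equations in the parameters. A short Python sketch of the solution, with the elimination step written out in the comments (variable names are illustrative):

```python
import math

# The three results give:
#   beta_A + beta_E   = ln 1.5   (male aged 17, passed vs male aged 16, not passed)
#   beta_E - beta_S   = ln 2     (female, passed vs male, failed, same age)
#   4*beta_A - beta_S = ln 2     (female aged 20, passed vs male aged 16, passed)
# Subtracting the third equation from the second gives beta_E = 4 * beta_A,
# and substituting into the first gives 5 * beta_A = ln 1.5.
beta_A = math.log(1.5) / 5
beta_E = 4 * beta_A
beta_S = beta_E - math.log(2)

print(round(beta_A, 4), round(beta_E, 4), round(beta_S, 4))
```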

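8. (i) $\lambda(t; z_i) = \lambda_0(t)\exp(\beta z_i^T)$ where: $\lambda(t; z_i)$ is the hazard at time t, $\lambda_0(t)$ is the
baseline hazard at time t, $z_i$ are the covariates and $\beta$ is the vector of regression
parameters.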

(ii) The baseline hazard refers to a female chicken in the old enclosure
(iii) (a) z1 = 1 if Duck, 0 otherwise; z2 = 1 if Goose, 0 otherwise; z3 = 1 if New
enclosure, 0 otherwise; and z4 = 1 if Male, 0 otherwise.
(b) 95% confidence intervals: β1: (-0.298, -0.122); β2: (-0.049, 0.199);
β3: (0.049, 0.201); and β4: (0.100, 0.300).
(iv) The parameter for the new enclosure is 0.125, so the ratio of the hazards for
two otherwise identical birds is exp(0.125) = 1.133.
So the hazard appears to have got worse.
The 95% confidence interval is entirely positive (it does not include zero),
so at the 5% significance level the deterioration is statistically significant.
(v) 0.07087
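The confidence intervals in 8(iii)(b) and the probability in 8(v) can be reproduced with the sketch below. It uses the normal approximation (estimate ± 1.96 standard errors); the dictionary layout and the helper `ci95` are illustrative choices rather than anything prescribed by the question.

```python
import math

# (estimate, variance) for each non-baseline level, from the question
params = {
    "duck": (-0.210, 0.002),
    "goose": (0.075, 0.004),
    "new_enclosure": (0.125, 0.0015),
    "male": (0.2, 0.0026),
}

def ci95(estimate, variance):
    """Symmetric 95% interval based on the standard error."""
    half_width = 1.96 * math.sqrt(variance)
    return (estimate - half_width, estimate + half_width)

intervals = {name: ci95(est, var) for name, (est, var) in params.items()}

# Part (v): convert the male goose (old enclosure) probability into the
# female duck (new enclosure) probability via S_target = S_ref ** hazard ratio
lp_female_duck_new = params["duck"][0] + params["new_enclosure"][0]
lp_male_goose_old = params["goose"][0] + params["male"][0]
surv_goose = 1 - 0.1
surv_duck = surv_goose ** math.exp(lp_female_duck_new - lp_male_goose_old)
prob_killed = 1 - surv_duck

for name, (lo, hi) in intervals.items():
    print(name, round(lo, 3), round(hi, 3))
print(round(prob_killed, 5))
```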
9. (i) Cox's model ensures that the hazard is always positive. Standard software
packages often include Cox's model.
Cox's model allows the general "shape" of the hazard function for all individuals
to be determined by the data, giving a high degree of flexibility.
The data in this investigation are censored, and Cox‘s model can handle censored
data.
In Cox‘s model the hazards of individuals with different values of the covariates
are proportional, meaning that they bear the same ratio to one another at all ages.
If we are not primarily concerned with the precise form of the hazard, we can
ignore the shape of the baseline hazard and estimate the effects of the covariates
from the data directly.
(ii) A suitable statistical test is that using the likelihood ratio statistic.
We compare the model with gender + exercise with the model with gender +
exercise + the interaction.
If the log-likelihoods for these two models are L and Linteraction respectively, then
the test statistic is -2(L - Linteraction).
This is equal to -2{-1,250 – (-1,246)} = -2(-4) = 8.
Under the null hypothesis that the parameter on the interaction term is zero, this
statistic has a chi-squared distribution with one degree of freedom (since the
interaction term involves one parameter).

Since 8 > 7.879, the critical value of the chi-squared distribution with one degree
of freedom at the 0.5% level (or 8 > 3.84 at the 5% level),
we reject the null hypothesis even at the 0.5% significance level and conclude
that the interaction term is required in the model.
(iii) The baseline category is females who do not take regular exercise.
The hazards of developing heart disease in the other three categories, relative to
the baseline category, are as follows:
Gender   Regular exercise   Hazard relative to baseline
Male     No                 exp(0.2) = 1.22
Male     Yes                exp(0.2 – 0.3 – 0.35) = 0.64
Female   Yes                exp(-0.3) = 0.74
Males who do not take regular exercise are more likely to develop heart disease
than females.
Regular exercise decreases the risk of heart disease for both males and females.
The effect of regular exercise in reducing the risk of heart disease is greater for
males than for females, so much so that among those who take regular exercise,
males have a lower risk of developing heart disease than females.
10. (i) Month 1 5/200 = 0.025 Month 2 8/190 = 0.042
Month 3 15/175 = 0.086 Month 4 10/150 = 0.067
Month 5 6/135 = 0.044 Month 6 3/125 = 0.024
(ii) To assess the impact of risk factors, a proportional hazards model would be
useful because of its simple interpretation or because it allows the effect of each
individual risk factor to be assessed.
The Gompertz model can be framed as a proportional hazards model, as can a
semi-parametric model (such as the Cox model).

The Gompertz model would not be appropriate here, as it has a monotonically
increasing or decreasing hazard, whereas it is clear from part (i) that the hazard
of symptoms returning first rises and then falls with duration.
A semi-parametric model allows the shape of the hazard to be determined by the
data. The semi-parametric model would be better than the Gompertz in this case.
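The monthly hazards in 10(i), and the rise-then-fall pattern relied on in 10(ii), can be checked with a few lines of Python. The list names are illustrative; the figures are those in the question.

```python
exposure = [200, 190, 175, 150, 135, 125]   # patient-months exposed to risk
events = [5, 8, 15, 10, 6, 3]               # returns of symptoms

# Hazard in each month: events divided by exposure
hazards = [d / e for d, e in zip(events, exposure)]

# Locate the peak and check the hazard rises before it and falls after it
peak_month = hazards.index(max(hazards)) + 1
rises = all(a < b for a, b in zip(hazards[:peak_month - 1], hazards[1:peak_month]))
falls = all(a > b for a, b in zip(hazards[peak_month - 1:], hazards[peak_month:]))

print([round(h, 3) for h in hazards], peak_month, rises, falls)
```

A hazard that peaks in the middle of the observation period cannot be reproduced by a Gompertz hazard, which is monotonic in duration.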
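11. (ii) $\sum_{i=1}^{n}\left[\frac{\delta_i}{A + B t_i} - t_i\right] = 0$ and $\sum_{i=1}^{n}\left[\frac{\delta_i t_i}{A + B t_i} - \frac{t_i^2}{2}\right] = 0$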
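The steps leading from the hazard h(t) = A + Bt to the log-likelihood can be written out as follows. This is a sketch of the standard argument; S(t) denotes the survival function, a symbol that is not used in the question itself.

```latex
S(t) = \exp\left(-\int_0^t (A + Bs)\,ds\right)
     = \exp\left(-At - \tfrac{1}{2}Bt^2\right)

L = \prod_{i=1}^{n} h(t_i)^{\delta_i}\, S(t_i)
  = \prod_{i=1}^{n} (A + Bt_i)^{\delta_i} \exp\left(-At_i - \tfrac{1}{2}Bt_i^2\right)

\log L = \sum_{i=1}^{n} \left[ \delta_i \log(A + Bt_i) - At_i - \tfrac{1}{2}Bt_i^2 \right]
```

Differentiating log L with respect to A and then with respect to B, and setting each derivative to zero, gives the two simultaneous equations.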
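12. (i) We do not need to know the general shape of the hazard/distribution.
(ii) $\lambda(t; z_i) = \lambda_0(t)\exp(\beta z_i^T)$ where: $\lambda(t; z_i)$ is the hazard at time t, $\lambda_0(t)$ is
the baseline hazard at time t, $z_i$ are the covariates and $\beta$ is the vector of
regression parameters.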
(iii) Baseline hazard refers to a male sold a whole life policy by the direct sales
force.
(iv) 0.57364

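13. (i) $\lambda(t; z_i) = \lambda_0(t)\exp(\beta z_i^T)$ where: $\lambda(t; z_i)$ is the hazard at time t, $\lambda_0(t)$ is the
baseline hazard at time t, $z_i$ are the covariates and $\beta$ is the vector of regression
parameters.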
(ii) It ensures the hazard is always positive. The log-hazard is linear.
You can ignore the shape of the baseline hazard and calculate the effect of
covariates directly from the data.
It is widely available in standard computer packages OR is a popular,
well-established model.
(iii) Ben is only exp(-0.55) = 57.7% as likely to pass as Bill, i.e. 42.3% less likely to
pass than Bill.
(iv) The model could be adjusted by including a covariate measuring the
interaction between the number of attempts and employment status.
The covariate would be equal to Z1Z2 and would take the value 1 for a self-
employed person on his or her second or subsequent attempt, and 0 otherwise.

The effect of the number of attempts for an employee would be equal to exp(β2),
where β2 is the parameter related to Z2. For a self-employed person, the effect of
the number of attempts would be equal to exp(β2 + β3), where β3 is the parameter
related to the interaction term.
14. (i) The Gompertz model is simple to understand and to apply, having only two
parameters. It also fits human mortality at older ages well (e.g. 30–85 years).
(iii) 0.000825 (iv) 3000 (v) 0.9867 (vi) 43.33 years
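The numerical answers to question 14 can be reproduced with the sketch below. The function names `mu` and `survival` are illustrative; the survival probability uses the closed form of the integrated Gompertz hazard, μ(x)(exp(β0 n) - 1)/β0.

```python
import math

alpha, b0, b1, b2 = -9.0, 0.09, 0.3, -0.0001

def mu(x, urban, income):
    """Force of mortality at age x under the fitted model."""
    return math.exp(alpha + b0 * x + b1 * urban + b2 * income)

def survival(x, urban, income, n):
    """n-year survival probability from age x.

    The integral of mu(x + s) over 0 <= s <= n equals
    mu(x) * (exp(b0 * n) - 1) / b0.
    """
    cumulative_hazard = mu(x, urban, income) * (math.exp(b0 * n) - 1) / b0
    return math.exp(-cumulative_hazard)

mu_40 = mu(40, 1, 20000)             # part (iii)
extra_income = b1 / -b2              # part (iv): income offsetting the urban effect
p_10 = survival(40, 1, 20000, 10)    # part (v)
rural_age = 40 + b1 / b0             # part (vi): age shift offsetting the urban effect

print(round(mu_40, 6), round(extra_income), round(p_10, 4), round(rural_age, 2))
```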
15. (i) A person who is aged 50 years at the start of the investigation, is not a heavy
drinker, and has not lived for 12 months or more in a tropical country.
(ii) βA = 0.0405, βC = 0.2877 and βT = 1.099
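A short Python sketch of the working for 15(ii), extended to the check asked for in 15(iii). The variable names are illustrative; the three equations come directly from the three results stated in the question.

```python
import math

# Third result: tropical residence trebles the hazard
beta_T = math.log(3)
# Second result: beta_C + beta_T = ln 4
beta_C = math.log(4) - beta_T
# First result: 10 * beta_A + beta_C = ln 2
beta_A = (math.log(2) - beta_C) / 10

# Part (iii): same age and drinking habits, but has lived in the tropics,
# so the 10-year disease-free probability is 0.8 raised to exp(beta_T) = 3
p_baseline = 0.8
p_tropical = p_baseline ** math.exp(beta_T)

print(round(beta_A, 4), round(beta_C, 4), round(beta_T, 3), round(p_tropical, 3))
```

The result, 0.8 cubed, is 0.512, which is slightly over one half.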
16. (i) In a proportional hazards model the hazard of experiencing an event may be
factorised into two components: one depending only on duration since some
start event, which is known as the baseline hazard, and the other depending only
on a set of covariates and associated parameters.
Thus the ratio between the hazards for any two individuals with different values
of the covariates is constant across all durations.
The baseline hazard applies to an individual with the value zero on all
covariates.
(ii) The proportionality of the hazards makes estimating the impact of covariates
on the hazard straightforward (through partial likelihood).
Widely available statistical software packages have built-in routines for the Cox
model.
The Cox model is semi-parametric so the baseline hazard does not need to be
specified, and can be determined by the data (as with a Kaplan-Meier hazard).
It ensures that the hazard is always positive. It is easy to communicate.
17. (i) A proportional hazards model is used to estimate the effect of covariates on
the hazard of experiencing an event. In a proportional hazards model the hazard
is assumed to factorise into two components, one depending only on duration,
and the other depending only on the covariates. The ratio between the hazards
for persons with any two values of a covariate is the same at all durations.
(ii) A cow who started the previous treatment immediately the condition
appeared.
(iii) 8 days (iv) 0.319
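The answers to 17(iii) and 17(iv) can be checked with the following sketch. The function name `log_hazard_ratio` is illustrative. Part (iv) uses the fact that a median recovery time of 14 days means half the cows on the previous treatment still have the condition at day 14.

```python
import math

b0, b1, b2 = 0.8, 0.4, -0.1

def log_hazard_ratio(z, x):
    """Log of the hazard relative to the baseline cow (z = 0, x = 0)."""
    return b0 * z + b1 * x + b2 * x * z

# (iii) The two treatments have equal hazard when b0 + b2 * x = 0
x_equal = -b0 / b2

# (iv) Hazard ratio of new vs previous treatment for x = 3 days
ratio = math.exp(log_hazard_ratio(1, 3) - log_hazard_ratio(0, 3))
still_ill_new = 0.5 ** ratio

print(round(x_equal, 2), round(still_ill_new, 3))
```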
18. $L(\beta) = \dfrac{e^{\beta}}{(2e^{\beta} + 4)(e^{\beta} + 4)(e^{\beta} + 2)}$
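The partial likelihood for question 18 can be built up mechanically from the risk sets, as in the sketch below. Two assumptions are made here and are not stated in the source: smokers are coded Z = 1 and non-smokers Z = 0, and patient 2 (still alive) is treated as at risk at every observed death time. The function names are illustrative.

```python
import math

INF = float("inf")

# (exit time in weeks, assumed smoker indicator Z, death observed?)
patients = [
    (3, 1, True),      # patient 1
    (INF, 0, False),   # patient 2: still alive, assumed at risk throughout
    (9, 0, True),      # patient 3
    (10, 1, False),    # patient 4
    (8, 0, False),     # patient 5
    (7, 0, True),      # patient 6
]

def partial_likelihood(beta):
    """Cox partial likelihood: one factor per observed death."""
    value = 1.0
    for time, z, died in patients:
        if not died:
            continue
        # Risk set: everyone whose exit time is at or after this death time
        denom = sum(math.exp(beta * zj) for tj, zj, _ in patients if tj >= time)
        value *= math.exp(beta * z) / denom
    return value

def closed_form(beta):
    """The same quantity written as a single fraction."""
    e = math.exp(beta)
    return e / ((2 * e + 4) * (e + 4) * (e + 2))

print(partial_likelihood(0.0), closed_form(0.0))
```

The deaths at weeks 3, 7 and 9 have risk sets containing 2, 1 and 1 smokers and 4, 4 and 2 non-smokers respectively, which gives the three factors in the denominator.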

EXPOSED TO RISK
1. An investigation into mortality collects the following data:
θx = total number of policies under which death claims are made when the
policyholder is aged x last birthday in each calendar year
Px(t) = number of in-force policies where the policyholder was aged x nearest
birthday on 1 January in year t
(i) State the principle of correspondence.
(ii) Obtain an expression, in terms of the Px(t), for the central exposed to risk,
E^c_x, which corresponds to the claims data and which may be used to
estimate the force of mortality in year t at each age x, μx. State any
assumptions you make. [UK April 2005]
2. (i) (a) Explain why it is important to sub-divide data when carrying out
mortality investigations.
(b) Describe the problems that can arise with sub-dividing data.
(ii) List four factors which are often used to sub-divide life assurance data.
[UK April 2006]
3. A national mortality investigation is carried out over the calendar years 2002,
2003 and 2004. Data are collected from a number of insurance companies.
Deaths during the period of the investigation, θx, are classified by age nearest at
death.
Each insurance company provides details of the number of in-force policies on 1
January 2002, 2003, 2004 and 2005, where policyholders are classified by age
nearest birthday, Px(t).
(i) (a) State the rate year implied by the classification of deaths.
(b) State the ages of the lives at the start of the rate interval.
(ii) Derive an expression for the exposed to risk, in terms of Px(t), which may
be used to estimate the force of mortality in year t at each age. State any
assumptions you make.

(iii) Describe how your answer to (ii) would change if the census information
provided by some companies was P*x(t), the number of in-force policies
on 1 January each year, where policyholders are classified by age last
birthday. [UK Sept 2006]
4. The actuary to a large pension scheme carried out an investigation of the
mortality of the scheme's pensioners over the two years from 1 January 2005 to 1
January 2007.
(i) List the data required by the actuary for an exact calculation of the central
exposed to risk for lives aged x.
The following is an extract from the data collected by the actuary.
Age x              Number of pensioners at                 Deaths during
nearest birthday   1 Jan 2005   1 Jan 2006   1 Jan 2007    2005   2006
63                 1,248        1,312        1,290         10     6
64                 1,465        1,386        1,405         13     15
65                 1,678        1,720        1,622         16     23
66                 1,719        1,642        1,667         22     19
67                 1,686        1,695        1,601         19     25
(ii) (a) Derive an expression that could be used to estimate the central
exposed to risk using the available data. State any assumptions you
make.
(b) Use the data to estimate μ65. State any further assumptions that
you make. [UK April 2007]
5. List four factors in respect of which life insurance mortality statistics are often
subdivided. [UK April 2008]
6. (i) List the data needed for the exact calculation of a central exposed to risk
depending on age.
An investigation studied the mortality of persons aged between exact ages 40
and 41 years. The investigation began on 1 January 2008 and ended on 31
December 2008. The following table gives details of 10 lives involved in the
investigation.
Life Date of 40th birthday Date of death
1 1 March 2007 –
2 1 May 2007 1 October 2008
3 1 July 2007 –
4 1 October 2007 –
5 1 December 2007 1 February 2008
6 1 February 2008 –
7 1 April 2008 –
8 1 June 2008 1 November 2008
9 1 August 2008 –
10 1 December 2008 –
Persons with no date of death given were still alive when the investigation
ended.
(ii) Calculate a central exposed to risk using the data for the 10 lives in the
sample.
(iii) (a) Calculate the maximum likelihood estimate of the hazard of death
at age 40 last birthday.
(b) Hence, or otherwise, estimate q40. [UK Sept 2009]
7. (i) In the context of mortality investigations describe the principle of
correspondence and give an example of a situation in which it may be
hard to adhere to this principle.
On 1 January 2005 a country introduced a comprehensive system of death
registration, which classified deaths by age last birthday on the date of death.
The government of the country wishes to obtain estimates of the force of
mortality, μx, by single years of age x for the period between 1 January 2005 and 1
January 2008. Annual population censuses have been taken on 30 June each year
since 2004, which classify the population by age last birthday. However the only
copy of the data from the population census of 30 June 2006 was lost when the
computer disc on which it was stored was being transferred between
government departments.
Let the population aged x last birthday on 30 June in year t be denoted by the
symbol Px,t, and the number of deaths during the period of investigation of
persons aged x be denoted by the symbol dx.
(ii) Derive an expression in terms of Px,t and dx which may be used to
estimate μx. [UK Sept 2009]
8. List four factors often used to subdivide life insurance mortality statistics.
[UK April 2010]
9. An oil company has discovered a vast deposit of oil in an equatorial swamp. The
area is extremely unhealthy and inhabited by venomous spiders. There is an
antidote to bites from these spiders but it is expensive. The antidote acts instantly
but does not provide future immunity. The company commissions a study to
estimate the rate of being bitten by the spiders among its employees, in order to
determine the amount of antidote to provide.
Employees of the company are posted to the swamp for six month tours of duty
starting on 1 January, 1 April, 1 July or 1 October. The first employees to be
posted arrived on 1 January 2008. The swamp is so inaccessible that no
employees are allowed to leave before their six month tours of duty are
completed.
Accidental deaths are common in this dangerous location. The table below gives
some data from the study.
Quarter beginning    Number of new arrivals    Number of accidental     Number of spider bites
                     at start of quarter       deaths during quarter    during quarter
1 January 2008       90                        10                       15
1 April 2008         80                        8                        25
1 July 2008          114                       10                       30
1 October 2008       126                       13                       40
(i) Estimate the quarterly rate of being bitten by a spider for each quarter of
2008, stating any assumptions you make.
(ii) Suggest reasons why the assumptions you made in (i) might not be valid.
[UK April 2010]
10. Two neighbouring small countries have for many years taken annual censuses of
their populations on 1 January in which each inhabitant must give his or her age.
Country A uses an "age last birthday" definition of age, whereas Country B uses
an "age nearest birthday" definition. Each country has also operated a system in
which deaths are recorded on an "age nearest birthday at date of death" basis.
On 30 June 2009 Country A invaded Country B and the two countries became
one state. The new government wishes to estimate a single set of age-specific
death rates, μx, for the new unified state using the census data taken in the years
before the invasion.
Derive a formula which the new government may use to estimate μx in terms of
the recorded number of deaths in each country, and the population of each
country recorded as being aged x in the censuses. State any assumptions you
make. [UK Sept 2010]
11. (i) State the principle of correspondence as it applies to the estimation of
mortality rates.
(ii) Explain why it might be difficult to ensure the principle of
correspondence is adhered to, and give a specific example of an
investigation where this may be the case.
An actuary was asked to investigate the mortality of lives in a particular
geographical area. Data are available of the population of this area, classified by
age last birthday, on 1 January in each year. Data on the number of deaths in this
area in each calendar year, classified by age nearest birthday at death, are also
available.
(iii) Derive a formula which would allow the actuary to estimate the force of
mortality at age x + f, μx+ f, in a particular calendar year, in terms of the
available data, and derive a value for f.
(iv) List four factors other than geographical location which a government
statistical office might use to subdivide data for national mortality
analysis. [UK Sept 2011]
12. (i) Explain the reasons why data are subdivided when conducting mortality
investigations.
(ii) Describe the problems which can arise with subdividing data.
[UK April 2012]
13. (i) List four factors other than age and smoker status by which life insurance
mortality statistics are often subdivided.
Two offices in different towns of the same life insurance company write 25-year
term assurance policies. Below are data from these two offices relating to
policyholders of the same age. Both deaths and policies in force are on an age last
birthday basis.
                                       Gasperton    Great Hawking
Policies in force on 1 January 2009    2,000        1,770
Policies in force on 1 January 2010    2,100        1,674
Deaths in calendar year 2009           25           21
(ii) Calculate the central death rate for the calendar year 2009 at this age for
the offices in Gasperton and Great Hawking.
A detailed examination of the records shows that 50% of the policyholders in
Gasperton at both censuses were smokers, and 20% of policyholders in Great
Hawking at both censuses were smokers. National death rates at this age for
smokers in 2009 were 40% higher than those for non-smokers.
(iii) Estimate the central death rates for smokers and non-smokers in
Gasperton and Great Hawking.
The life insurance company charges policyholders in Gasperton and Great
Hawking the same premiums for the 25-year term assurance policies. It charges
smokers in both towns 40% more than non-smokers.
(iv) Comment on the company's pricing structure in the light of your results
from parts (ii) and (iii) above. [UK April 2012]
14. (i) State the principle of correspondence as it applies to mortality rates.
A life insurance company has the following data:
Number of policies in force on
Age last birthday    1 January 2009    1 January 2010    1 July 2010    1 January 2011
49                   2,000             2,100             2,300          2,500
50                   2,100             2,200             2,300          2,400
51                   2,300             2,400             2,500          2,600
Number of deaths classified by age next birthday and calendar year
Age next birthday    2009    2010
49                   175     200
50                   200     225
51                   225     235
(ii) Estimate, using these data, the force of mortality at age 50 next birthday
for the period 1 January 2009 to 1 January 2011.
(iii) State the exact age to which your answer to part (ii) relates. [UK Sept 2012]
15. Population censuses in a certain country are taken each year on the President's
birthday, provided that the President's astrological advisor deems the taking of a
census favourable. Censuses record the age of every inhabitant in completed
years (that is, curtate age). Deaths in this country are registered as they happen,
and classified according to age nearest birthday at the time of death.
Below are some data from the three most recent censuses.
Age in completed    Population 2006    Population 2009    Population 2010
years               (thousands)        (thousands)        (thousands)
64                  300                320                350
65                  290                310                330
66                  280                300                320
Between the censuses of 2006 and 2009 there were a total of 3,000 deaths to
inhabitants aged 65 nearest birthday, and between the censuses of 2009 and 2010
there were a total of 1,000 deaths to inhabitants aged 65 nearest birthday.
(i) Estimate, stating any assumptions you make, the death rate at age 65
years for each of the following periods:
• the period between the 2006 and 2009 censuses
• the period between the 2009 and 2010 censuses
(ii) Explain the exact age to which your estimates apply. [UK April 2013]
16. Data are often subdivided when investigating mortality statistics.
(i) Explain why this is done.

(ii) Discuss one potential problem with sub-dividing mortality data.
(iii) List four factors which are commonly used to sub-divide mortality data.
[UK Sept 2013]
17. (i) Explain why data are subdivided into homogeneous groups when
mortality investigations are conducted.
(ii) List four factors, other than age and sex, by which mortality statistics are
often subdivided. [UK April 2014]
18. (i) State the principle of correspondence as it relates to mortality
investigations.
Two small countries conduct population censuses on an annual basis. Country A
records its population on 1 February every year based on an age definition of age
last birthday. Country B records its population on every 1 August using a
definition of age nearest birthday. Each country records deaths as they happen
based on age next birthday.
Below are some data from the last few years.
Country A
Age last birthday       Population         Population         Population
                        1 February 2011    1 February 2012    1 February 2013
44                      382,000            394,000            401,000
45                      374,000            381,000            385,000
46                      354,000            372,000            375,000
Country B
Age nearest birthday    Population         Population         Population
                        1 August 2011      1 August 2012      1 August 2013
44                      382,000            394,000            401,000
45                      374,000            381,000            385,000
46                      354,000            372,000            375,000
In the combined lands of Countries A and B in the calendar year 2012 there were
4,800 deaths of those aged 46 next birthday and 4,500 deaths of those aged 45
next birthday.
The two countries decide to form an economic union, after which it will be
mandatory to offer the same rates for life insurance to residents of each country.
(ii) Estimate the death rate at age 45 years last birthday for the two countries
combined.
(iii) Explain the exact age to which your estimate relates. [UK April 2014]
19. (i) Explain the census approximation for calculating the exposed to risk
between any two census dates.
A mortality investigation bureau has collected the following information on the
number of policies in force each year from different companies.
Age    Year    Company A    Company B    Company C
54     2011    3,400        1,250        5,780
       2012    3,350        1,450        5,500
       2013    3,000        1,500        6,010
55     2011    3,250        1,190        6,000
       2012    3,390        1,300        5,960
       2013    3,100        1,440        6,030
56     2011    3,270        1,150        5,950
       2012    3,020        1,300        5,980
       2013    2,950        1,500        5,990
• Company A has provided in-force policy data as at the beginning of each
calendar year using age nearest birthday.
• Company B has provided in-force policy data as at the financial year
closing date (which was 31 March in each year) using age last birthday.
• Company C has provided in-force policy data as at the end of each
calendar year using age next birthday.
(ii) Calculate the contribution to central exposed to risk for lives aged 55 last
birthday for the calendar year 2012 for each of the companies.
[UK Sept 2014]
20. (i) State the principle of correspondence as it applies to death rates.
A nightclub opens at 10.00 p.m. and closes at 2.00 a.m. It admits only people aged
over 21 years on the production of an identity card giving date of birth.
The table below shows the number of people entering in various intervals
between 10.00 p.m. and 2.00 a.m. on 30 June 2013. No-one was admitted after
1.00 a.m., and you may assume that all those who enter the premises stay until
2.00 a.m.
Year of birth    10.00–11.30 p.m.    11.30–12.00 p.m.    12.00 p.m.–1.00 a.m.
1989             100                 300                 200
1990             200                 400                 350
1991             150                 400                 300
1992             100                 250                 200
During the period of opening, 40 people aged 22 last birthday required medical
attention for heat exhaustion.
(ii) Calculate the rate per person-hour at which those attending the night club
aged 22 last birthday required medical attention for heat exhaustion,
stating any assumptions you make. [UK April 2015]
21. (i) State why it is important to divide data into homogeneous classes when
undertaking mortality investigations.
(ii) List four factors, apart from smoking behaviour, by which mortality data
are often classified by life insurance companies.
In a particular life insurance market, it has for many years been the practice for
all companies to charge smokers higher premiums than non-smokers for the
same term assurance policy. Suppose one company decides to switch to charging
smokers and non-smokers the same premiums for term assurance policies. The
other companies retain differential pricing for smokers and non-smokers.
(iii) Discuss the likely implications for the company making the switch.
[UK April 2015]
22. List four factors, other than age and sex, by which mortality statistics are often
subdivided. [UK Sept 2015]
23. Company A and Company B are two small insurance companies which have
recently merged to form Company C. Company C is reviewing its premium rates
for a whole of life product and so is conducting an analysis of mortality rates
experienced.
Company A recorded the number of policies in force every 1 January using a
definition of age next birthday whereas Company B recorded the number of
policies in force every 1 April using an age definition of age last birthday. Both
companies recorded deaths as they happened using an age definition of age last
birthday.
These are the data for the most recent years.
Company A
Age next birthday    Number of policies    Number of policies    Number of policies
                     1 Jan 2012            1 Jan 2013            1 Jan 2014
51                   8,192                 6,421                 8,118
52                   7,684                 8,298                 7,187
53                   9,421                 8,016                 9,026
Company B
Age last birthday    Number of policies    Number of policies    Number of policies
                     1 April 2012          1 April 2013          1 April 2014
51                   4,496                 3,817                 4,872
52                   5,281                 5,218                 3,812
53                   4,992                 5,076                 5,076
In the calendar year 2013 Company A recorded 28 deaths of those aged 52 last
birthday and Company B recorded 17 deaths of those aged 52 last birthday.
(i) Estimate the force of mortality for the combined company for age 52 last
birthday, stating all assumptions that you make.
(ii) Explain the exact age to which your estimate applies. [UK Sept 2015]
24. You have been given the following data relating to an insurance company
mortality investigation.
Age last    Policies in force on 1 July       Deaths in
birthday    2015    2016    2017    2018      2015    2016    2017    2018
63          4192    4444    4885    4889      104     100     117     109
64          3998    4200    4664    4334      122     114     130     124
65          3940    4166    4321    4533      118     120     129     140
Calculate estimates of the force of mortality for those lives aged 63, 64 and
65 last birthday, indicating clearly the ages to which your estimates relate.
State any assumptions you make.
ANSWERS
1. (i) The principle of correspondence states that a life alive at time t should be
included in the exposure at age x at time t if and only if, were that life to die
immediately, he or she would be counted in the deaths data θx at age x.
(ii) [ , ( ) ( )- , ( ) ( )-]
2. (i) (a) The models of mortality we use assume that we can observe a group of
lives with the same mortality characteristics. This is not possible in practice.
However, data can be sub-divided according to certain characteristics that we
know to have a significant effect on mortality. This will reduce the heterogeneity
of each group, so that we can at least observe groups with similar, but not the
same, characteristics.
(b) Sub-dividing data using many factors can result in the numbers in each class
being too low. It is necessary to strike a balance between homogeneity of the
group and retaining a large enough group to make statistical analysis possible.
Sufficient data may not be collected to allow sub-division. This may be because
marketing pressures mean proposal forms are kept to a minimum.
(ii) The following are factors often used: Sex, Age, Type of policy, Smoker/Non-
smoker status, Level of underwriting, Duration in force, Sales channel, Policy
size, Occupation (or social class) of policyholder, Known impairments,
Geographical region.
3. (i) (a) The age definition changes 6 months before/after each birthday, so this is a
life year rate interval.
(b) Lives are aged x − ½ at the start of the rate interval.
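(ii) $E^c_x(t) \approx \tfrac{1}{2}\left[P_x(t) + P_x(t+1)\right]$, assuming that $P_x(t)$ varies linearly over the calendar year.
(iii) Assuming that birthdays are uniformly distributed over the calendar year, $P_x(t) \approx \tfrac{1}{2}\left[P^*_{x-1}(t) + P^*_x(t)\right]$, so that
$E^c_x(t) \approx \tfrac{1}{4}\left[P^*_{x-1}(t) + P^*_x(t) + P^*_{x-1}(t+1) + P^*_x(t+1)\right]$.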
4. (i) For each pensioner in the investigation, the actuary would need: Date of entry
into the investigation (the latest of date of retirement, date of xth birthday and 1
January 2005) and Date of exit from the investigation (the earliest of date of
death, date of (x+1)th birthday and 1 January 2007)
(ii)(a) $E^c_x = \sum_{t=0}^{1} \tfrac{1}{2}\left(P_{x,t} + P_{x,t+1}\right)$, where $P_{x,t}$ is the number of pensioners aged x
nearest birthday at time t, measured in years from 1 January 2005. This assumes that $P_{x,t}$
is linear over each calendar year.
(b) 0.01157
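The figure in 4(ii)(b) can be checked with a short Python sketch of the trapezium rule applied to the age-65 row of the table. It assumes, as the answer does, that the pensioner count is linear between census dates, and additionally that deaths are classified on the same age-nearest-birthday basis as the census. The function name `trapezium_exposure` is illustrative only and does not come from the workbook.

```python
def trapezium_exposure(counts, gap_years=1.0):
    """Central exposed to risk from equally spaced census counts,
    assuming the population is linear between census dates."""
    return sum(gap_years * 0.5 * (a + b) for a, b in zip(counts, counts[1:]))

# Age 65 nearest birthday: census counts at 1 Jan 2005, 2006 and 2007
pensioners_65 = [1678, 1720, 1622]
deaths_65 = 16 + 23  # deaths during 2005 and 2006

exposure_65 = trapezium_exposure(pensioners_65)
mu_65 = deaths_65 / exposure_65
print(exposure_65, round(mu_65, 5))
```

The exposure is 3,370 life-years, and dividing the 39 deaths by it gives the quoted estimate.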
5. Sex, Age, Type of policy, Smoker/non-smoker, Level of underwriting, Duration
in force, Sales channel, Policy size, Known impairments or Occupation.
6. (i) For each life we need the date of birth, the date of entry into observation and
the date of exit from observation.
(ii) 53 months or 4.42 yrs (iii) (a) 0.4528 (b) 0.369
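The exposure in 6(ii) can be reproduced life by life. The sketch below is one way to do it in Python, assuming that each life is observed from the later of 1 January 2008 and the 40th birthday until the earliest of death, the 41st birthday and the end of 2008. Because every date in the table falls on the first of a month, dates are reduced to whole-month indices; the helper `month_index` is a hypothetical name introduced for this illustration.

```python
def month_index(year, month):
    """Months since year 0 for a date falling on the 1st of a month."""
    return year * 12 + (month - 1)

START = month_index(2008, 1)   # 1 January 2008
END = month_index(2009, 1)     # close of 31 December 2008

# (40th birthday, date of death or None), each given as (year, month)
lives = [
    ((2007, 3), None), ((2007, 5), (2008, 10)), ((2007, 7), None),
    ((2007, 10), None), ((2007, 12), (2008, 2)), ((2008, 2), None),
    ((2008, 4), None), ((2008, 6), (2008, 11)), ((2008, 8), None),
    ((2008, 12), None),
]

exposure_months = 0
deaths = 0
for birthday40, death in lives:
    b40 = month_index(*birthday40)
    entry = max(START, b40)        # later of start date and 40th birthday
    exit_ = min(END, b40 + 12)     # earlier of end date and 41st birthday
    if death is not None:
        d = month_index(*death)
        if entry <= d < exit_:     # death observed while aged 40 last birthday
            deaths += 1
        exit_ = min(exit_, d)
    exposure_months += max(0, exit_ - entry)

mu_hat = deaths / (exposure_months / 12)
print(exposure_months, deaths, round(mu_hat, 4))
```

Only lives 5 and 8 die while under observation at age 40 (life 2 dies after the 41st birthday), so the estimate is 2 divided by 53/12 years. The value 0.369 quoted for q40 is what μ/(1 + μ/2) gives; 1 − exp(−μ) would give about 0.364 instead.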
7. (i) The principle of correspondence states that a life alive at time t should be
included in the exposure at age x at time t if and only if, were that life to die
immediately, he or she would be counted in the deaths data at age x. Problems in
adhering to this can arise when the deaths data and the exposed-to-risk data
come from two different sources. These may classify lives differently.
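(ii) $\hat{\mu}_x = d_x / E^c_x$, where $E^c_x \approx \tfrac{1}{8}P_{x,2004} + \tfrac{11}{8}P_{x,2005} + \tfrac{11}{8}P_{x,2007} + \tfrac{1}{8}P_{x,2008}$, assuming that the population varies linearly between the available census dates.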
8. Sex, Age, Type of policy, Smoker/non-smoker status, Level of underwriting OR
lifestyle/participation in dangerous sports, Duration in force, Sales channel,
Policy size, Occupation of policyholder, Known impairments or Post code OR
region/county/country OR address.
9. (i) 0.176, 0.160, 0.162 and 0.176. We assume that all spider bites are treated.
(ii) The assumption that there are no deaths apart from accidental deaths is
unlikely to be true, and probably the company would have data on these which
could be included in the calculations.
Accidental deaths may be more likely among employees in their first quarter
than their second, as those in their second quarter have more experience.
Accidental deaths may be more likely at the beginning of a quarter, when there
are newly arrived employees.
The experience of the quarter beginning 1 January may be different from that of
other quarters because that is the first quarter that any employees are stationed
in the swamp, and they may not know about the spiders when they arrive. In
subsequent quarters they may be able to adjust their arrangements to reduce the
possibility of being bitten.
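The four rates quoted in 9(i) can be reproduced with the Python sketch below. The assumptions are not spelled out in the answer; they are chosen here because they reproduce the quoted figures: accidental deaths occur on average halfway through a quarter, deaths fall on first-quarter and second-quarter employees in proportion to their numbers, nobody leaves except by death or at the end of a six-month tour, and a bitten employee remains exposed to further bites.

```python
arrivals = [90, 80, 114, 126]   # new arrivals at the start of each quarter of 2008
deaths = [10, 8, 10, 13]        # accidental deaths during each quarter
bites = [15, 25, 30, 40]        # spider bites during each quarter

rates = []
carried_over = 0.0              # survivors of the previous quarter's arrivals
for new, d, b in zip(arrivals, deaths, bites):
    present = carried_over + new       # employees at the start of the quarter
    exposure = present - 0.5 * d       # deaths assumed to occur mid-quarter on average
    rates.append(b / exposure)
    # Deaths are assumed to fall on the two cohorts in proportion to their size;
    # only this quarter's arrivals stay on for a second quarter.
    carried_over = new - d * new / present

print([round(r, 3) for r in rates])
```

The exposures work out to 85, 156, 185 and 227.5 person-quarters.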
10. where is the population ages x last

birthday
where is the population ages x nearest birthday
and
11. (i) A life alive at time t should be included in the exposure at age x at time t if and
only if, were that life to die immediately, he or she would be counted in the
deaths data at age x.
(ii) When the deaths data and the exposed to risk data come from different
sources. E.g. occupational mortality investigations where deaths data come from
death registers and exposed to risk data from census OR where deaths data come
from claims department of an office, whereas exposed to risk data are based on
policies in force, which come from a different part of the office.
(iii) where is the population aged x
0 ( ) ( )1
last birthday on 1 January in year t.
(iv) Sex, Age, Marital status, Occupation, Socio-economic status, Ethnic origin,
Educational attainment, Housing tenure and Disability, chronic health condition,
limiting long-term illness
12. (i) Users of data require rates subdivided by age and other criteria. Models are
based on the assumption that we can observe groups of identical lives. Therefore
it is important that we analyse groups of lives which are homogeneous (or have
the same mortality). This can, for example, help avoid anti-selection.
(ii) Small numbers in some sub-groups can lead to scanty data and non-credible
rates or a large variance. Sometimes relevant factors cannot be used because the
relevant information cannot be collected on the proposal form: the questions
are unlikely to be answered honestly, are intrusive or impractical for marketing
or administrative reasons, would make the questionnaire too long, or cannot be
asked by law. It can be difficult to ensure that events data
and exposed-to-risk data are subdivided in the same way, leading to the
principle of correspondence being violated.
13. (i) Gender, Type of policy, Level of underwriting, Duration in force, Sales
channel, Policy size, Occupation, Known impairments, Postcode/geographical
area, Education, Socio-economic class / income and Marital status.
(ii) Gasperton: 0.0122; Great Hawking: 0.0122
(iii) Gasperton: smokers 0.0142, non-smokers 0.0102
Great Hawking: smokers 0.0158, non-smokers 0.0113
(iv) The company would do better to vary the premiums on the basis of
geographical area, as it is clear that death rates in Great Hawking for both
smokers and non-smokers are higher than those in Gasperton.
If the company does not differentiate its prices on the basis of geographical area,
it may lose business in Gasperton to a rival company which does differentiate;
conversely in Great Hawking it may attract new business from rival companies,
but will underprice the product and hence risk its life assurance fund becoming
insolvent.
There are relatively little data, so it might be worth adopting a "wait and see"
approach.
1.4 times the death rate will not translate as 1.4 times the premium. The
difference may be relatively small (although it is a 25 year term assurance so it
probably is pretty significant).
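The rates in 13(ii) and 13(iii) can be reproduced with the following Python sketch. It assumes a trapezium approximation to the 2009 exposure and that the national ratio of smoker to non-smoker death rates (1.4) applies within each office. The function names are illustrative and not taken from the workbook.

```python
def central_rate(p_start, p_end, deaths):
    """Central death rate for one calendar year (trapezium exposure)."""
    return deaths / (0.5 * (p_start + p_end))

def split_by_smoking(overall_rate, smoker_share, ratio=1.4):
    """Split an overall rate into (smoker, non-smoker) rates, assuming
    smoker rate = ratio * non-smoker rate."""
    non_smoker = overall_rate / (smoker_share * ratio + (1 - smoker_share))
    return ratio * non_smoker, non_smoker

offices = {
    # name: (in force 1 Jan 2009, in force 1 Jan 2010, deaths 2009, smoker share)
    "Gasperton": (2000, 2100, 25, 0.5),
    "Great Hawking": (1770, 1674, 21, 0.2),
}
results = {}
for name, (p0, p1, d, share) in offices.items():
    m = central_rate(p0, p1, d)
    results[name] = (m, *split_by_smoking(m, share))   # (overall, smoker, non-smoker)
    print(name, [round(v, 4) for v in results[name]])
```

The two offices have the same overall rate, but the lower proportion of smokers in Great Hawking implies higher rates there for smokers and non-smokers alike, which is the point made in (iv).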
14. (i) The principle of correspondence states that a life should be included in the
denominator of the rate at time t if and only if, were that life to die at time t, his
or her death would be counted in the numerator.
(ii) μ̂ = 0.0977
(iii) The estimate μ̂ applies to the middle of the rate interval,
which is exact age 49.5 years.
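The estimate for 14(ii) can be worked through in Python as below. The sketch assumes that policies in force vary linearly between census dates, which are unequally spaced here, and uses the fact that lives aged 50 next birthday are those aged 49 last birthday, so the age-49 census row corresponds to the age-50 deaths.

```python
# Policies in force at age 49 last birthday (= age 50 next birthday),
# as (years since 1 Jan 2009, count)
census = [(0.0, 2000), (1.0, 2100), (1.5, 2300), (2.0, 2500)]
deaths_50_next = 200 + 225      # deaths aged 50 next birthday in 2009 and 2010

# Trapezium rule over each interval between consecutive censuses
exposure = sum(
    (t1 - t0) * 0.5 * (p0 + p1)
    for (t0, p0), (t1, p1) in zip(census, census[1:])
)
mu_hat = deaths_50_next / exposure
print(exposure, round(mu_hat, 4))
```

The exposure is 2,050 + 1,100 + 1,200 = 4,350 life-years.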
15. (i) 0.00328 for the period of 2006 – 2009 and
0.00305 for the period of 2009 – 2010
(ii) The rate interval is the life year, starting at age x – 0.5. The age in the middle
of the rate interval is thus x, so the estimate relates to exact age 65 years.
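A Python sketch of the calculation for 15(i) follows. It assumes that birthdays are spread evenly over the year, so that the population aged 65 nearest birthday is roughly half of those aged 64 plus half of those aged 65 in completed years; that censuses fall on the same date in each year they are held; and that the population is linear between censuses. The helper names are hypothetical.

```python
# Census populations (thousands) by age in completed years
pop = {
    2006: {64: 300, 65: 290},
    2009: {64: 320, 65: 310},
    2010: {64: 350, 65: 330},
}

def nearest_65(year):
    """Approximate population aged 65 nearest birthday (thousands)."""
    return 0.5 * (pop[year][64] + pop[year][65])

def death_rate(year_from, year_to, deaths):
    """Deaths divided by a trapezium estimate of the exposure in life-years."""
    years = year_to - year_from
    exposure = years * 0.5 * (nearest_65(year_from) + nearest_65(year_to)) * 1000
    return deaths / exposure

rate_2006_2009 = death_rate(2006, 2009, 3000)
rate_2009_2010 = death_rate(2009, 2010, 1000)
print(round(rate_2006_2009, 5), round(rate_2009_2010, 5))
```

The exposures are 915,000 and 327,500 life-years respectively.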
16. (i) All our models and analyses are based on the assumption that we can observe
groups of identical lives (or at least, lives whose mortality characteristics are the
same).
In practice, this is never possible. However, we can at least subdivide our data
according to characteristics known, from experience, to have a significant effect
on mortality. This ought to reduce the heterogeneity of each class so formed.
(ii) The number of lives in each subdivision may become small. This will lead to
estimates of mortality that are unreliable, with large standard errors. OR
Information about the factors which affect mortality may be unavailable because
it was not asked on the insurance proposal form, or population census OR
Information about the factors which affect mortality may be unreliable because
respondents gave inaccurate or false answers to questions.
(iii) Sex, Age, Type of policy (which often reflects the reason for insuring),
Smoker/non-smoker status, Level of underwriting, Duration in force, Sales
channel, Policy size, Occupation of policyholder, Known impairments,
Postcode/geographical location and Marital status.
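17. (i) All our models and analyses are based on the assumption that we can observe
groups of identical lives (or at least, lives whose mortality characteristics are the
same).
Although in practice this is never possible, we can at least subdivide our data
according to characteristics known, from experience, to have a significant effect
on mortality. This ought to reduce the heterogeneity of each class so formed.
(ii) Type of policy (which often reflects the reason for insuring), Smoker/non-smoker
status, Level of underwriting, Duration in force, Sales channel, Policy
size, Occupation of policyholder OR socio-economic class, Known impairments,
Postcode/geographical location and Marital status.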
18. (i) A life alive at time t should be included in the exposed-to-risk at age x at time
t if and only if, were that life to die immediately, he or she would be included in
the deaths data dx at age x.
(ii) 0.006338
(iii) The rate interval is the life year starting at age 45 exact. The estimate relates
to the age in the middle of the rate interval, which is 45.5 years.
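The figure in 18(ii) can be reproduced with the Python sketch below. It assumes that populations are linear between censuses, that months are of equal length (so 1 February and 1 August fall 1/12 and 7/12 of the way through the year), and that birthdays are spread evenly, so that Country B's population aged 45 last birthday is half of those aged 45 nearest plus half of those aged 46 nearest. Age 45 last birthday corresponds to age 46 next birthday, hence the 4,800 deaths. The function name is illustrative.

```python
def calendar_year_exposure(p_prev, p_mid, p_next, f):
    """Exposure over a calendar year from annual censuses taken a fraction f
    through the previous, current and next year, assuming the population
    is linear between censuses."""
    p_start = p_prev + (1 - f) * (p_mid - p_prev)   # interpolated 1 January
    p_end = p_mid + (1 - f) * (p_next - p_mid)      # interpolated 31 December
    return f * 0.5 * (p_start + p_mid) + (1 - f) * 0.5 * (p_mid + p_end)

# Country A: age 45 last birthday, censuses on 1 February 2011, 2012, 2013
a = calendar_year_exposure(374_000, 381_000, 385_000, f=1 / 12)

# Country B: average the "45 nearest" and "46 nearest" counts at each 1 August
b_counts = [0.5 * (p45 + p46) for p45, p46 in
            [(374_000, 354_000), (381_000, 372_000), (385_000, 375_000)]]
b = calendar_year_exposure(*b_counts, f=7 / 12)

rate = 4_800 / (a + b)
print(round(a, 2), round(b, 2), round(rate, 6))
```

The two exposures are about 382,656 and 374,677 life-years.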
19. (i) In survival investigations, population counts will only be available at census
dates. Define Px,t to be the number of lives under observation, aged x last
birthday, at any time t and suppose that we have the values of Px,t only if t is a
census date.
We require the exposed to risk, $E^c_x$, over the interval between the first census and
the last.
This is $E^c_x = \int_{t_1}^{t_2} P_{x,s}\,ds$, where t1 and t2 are the two census dates.
To evaluate this, we usually assume that Px,s is linear between census dates. If the
censuses are one year apart this leads to the trapezium approximation:
$E^c_x \approx \tfrac{1}{2}\left(P_{x,t_1} + P_{x,t_2}\right)$
(ii) Company A – 3115, Company B – 1335.9375, Company C – 5965
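The three contributions in 19(ii) can be checked with the Python sketch below. It assumes linear interpolation between census dates, birthdays spread evenly over the year (for the age-nearest data of Company A), and that 31 March falls one quarter of the way through the year. The function `trapezium` is a hypothetical helper.

```python
def trapezium(p_start, p_end, years=1.0):
    """Exposure over an interval, assuming the population is linear across it."""
    return years * 0.5 * (p_start + p_end)

# Company A: 1 January counts, age nearest birthday.
# Aged 55 last birthday ~ half of "55 nearest" plus half of "56 nearest".
a_2012 = 0.5 * (3390 + 3020)
a_2013 = 0.5 * (3100 + 2950)
company_a = trapezium(a_2012, a_2013)

# Company B: 31 March counts, age last birthday; interpolate to the year ends.
b_2011, b_2012, b_2013 = 1190, 1300, 1440
b_start = b_2011 + 0.75 * (b_2012 - b_2011)   # 1 January 2012
b_end = b_2012 + 0.75 * (b_2013 - b_2012)     # 31 December 2012
company_b = trapezium(b_start, b_2012, 0.25) + trapezium(b_2012, b_end, 0.75)

# Company C: 31 December counts, age next birthday; 55 last = 56 next.
company_c = trapezium(5950, 5980)
print(company_a, company_b, company_c)
```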
20. (i) A life alive at age x at time t should be included in the exposed-to-risk if and
only if, were that life to die immediately, his or her death would be included in
the deaths at age x, dx.
(ii) 0.02045 per person-hour
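The rate in 20(ii) can be reproduced with the Python sketch below. The assumptions are chosen because they reproduce the quoted figure: arrivals are spread evenly over each entry interval, birthdays are spread evenly over the calendar year (so half of those born in 1990 and half of those born in 1991 are aged 22 last birthday), and nobody changes age during the evening.

```python
# (average hours spent inside, entrants born 1990, entrants born 1991)
intervals = [
    (3.25, 200, 150),   # entered 10.00-11.30 p.m., on average at 10.45 p.m.
    (2.25, 400, 400),   # entered 11.30 p.m.-midnight, on average at 11.45 p.m.
    (1.50, 350, 300),   # entered midnight-1.00 a.m., on average at 12.30 a.m.
]

# Person-hours for those aged 22 last birthday: half of each birth-year cohort
person_hours = sum(h * 0.5 * (n1990 + n1991) for h, n1990, n1991 in intervals)
rate = 40 / person_hours
print(person_hours, round(rate, 5))
```

The exposure comes to 1,956.25 person-hours.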
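21. (i) All our models and analyses are based on the assumption that we can observe
groups of identical lives (or at least, lives whose mortality characteristics are the
same).
Although in practice this is never possible, we can at least subdivide our data
according to characteristics known, from experience, to have a significant effect
on mortality. This ought to reduce the heterogeneity of each class so formed.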
(ii) Sex, Age, Type of policy, Level of underwriting, Duration in force, Sales
channel, Policy size, Occupation or socio-economic group, Known
impairments/medical history, Postcode/geographic location and Marital status
(iii) EITHER If the company changing its policy charges both smokers and non-
smokers a premium equal to the rate typically charged to smokers, then, relative
to other companies, it will become poor value for non-smokers.
The company changing its policy will therefore lose business from non-smokers
(whom it will charge more than an actuarially fair premium). The portfolio will
(eventually) be made up mostly of smokers (whom it will charge an actuarially
fair premium).
The volume of business sold is likely to decrease, possibly to the extent that it
does not cover the expenses estimated in the pricing basis.
OR If the company changing its policy charges both smokers and non-smokers a
premium equal to the rate typically charged to non-smokers, then relative to
other companies, it will become good value for smokers (and acceptable value
for non-smokers).
The company changing its policy will therefore attract more business from
smokers (whom it will charge less than an actuarially fair premium). This is a
form of anti-selection.
The smoker business is likely to be unprofitable, although the increase in
business will reduce the overheads per policy. This is likely to lead to losses for
the company changing its policy.
OR
If the company changing its policy charges both smokers and non-smokers a
premium somewhere between the rate typically charged to smokers and the rate
typically charged to non-smokers, then relative to other companies, it becomes
good value for smokers and poor value for non-smokers.
The company changing its policy will therefore attract business from smokers
and lose business from non-smokers (whom it will charge more than an
actuarially fair premium). This is a form of anti-selection.
The smoker business is likely to be unprofitable, but any remaining non-smoker
business will be profitable. This may eventually lead to losses for the company
changing its policy.
22. Type of policy (which often reflects the reason for insuring), Smoker/non-smoker
status, Level of underwriting, Duration in force, Sales channel, Policy
size, Occupation of policyholder, Known impairments, Postcode/geographical
region and Marital status
23. (i) 0.0034
(ii) The estimate μ̂52 applies to the age at the middle of the rate interval, which is
age 52.5 exact.
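The estimate in 23(i) can be reproduced with the Python sketch below. It assumes that policies in force are linear between census dates, that 1 April falls one quarter of the way through the year, and that the two companies' experience can be pooled. Lives aged 52 last birthday are those aged 53 next birthday, so Company A's age-53 row is used.

```python
# Company A: 1 January counts, age next birthday; 52 last = 53 next.
exposure_a = 0.5 * (8016 + 9026)

# Company B: 1 April counts at age 52 last birthday.
b_2012, b_2013, b_2014 = 5281, 5218, 3812
b_start = b_2012 + 0.75 * (b_2013 - b_2012)   # interpolated 1 January 2013
b_end = b_2013 + 0.75 * (b_2014 - b_2013)     # interpolated 1 January 2014
exposure_b = 0.25 * 0.5 * (b_start + b_2013) + 0.75 * 0.5 * (b_2013 + b_end)

# Pooled estimate for calendar year 2013
mu_hat = (28 + 17) / (exposure_a + exposure_b)
print(round(exposure_a + exposure_b, 2), round(mu_hat, 4))
```

The combined exposure is about 13,345.5 life-years.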
24. (i)
Age    Exposed to risk    Deaths    μ̂
63     18,410             430       0.0234
64     17,196             490       0.0285
65     16,960             507       0.0299
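The table for question 24 can be rebuilt from the question data with the Python sketch below. It assumes that the count on 1 July approximates the average number in force over that calendar year (so each count contributes one year of exposure) and that deaths are recorded by age last birthday at death.

```python
in_force = {   # policies in force on 1 July 2015..2018, by age last birthday
    63: [4192, 4444, 4885, 4889],
    64: [3998, 4200, 4664, 4334],
    65: [3940, 4166, 4321, 4533],
}
deaths = {     # deaths in calendar years 2015..2018
    63: [104, 100, 117, 109],
    64: [122, 114, 130, 124],
    65: [118, 120, 129, 140],
}

estimates = {}
for age in in_force:
    exposure = sum(in_force[age])        # each mid-year count ~ one year of exposure
    total_deaths = sum(deaths[age])
    estimates[age] = (exposure, total_deaths, total_deaths / exposure)
    print(age, exposure, total_deaths, round(estimates[age][2], 4))
```

With an age-last-birthday definition the average age in each rate interval is x + ½, so under these assumptions the three estimates relate to ages 63.5, 64.5 and 65.5.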
GRADUATION
1. An investigation of mortality over the whole age range produced crude estimates
of qx for exact ages x from 2 years to 93 years inclusive. The actual deaths at each
age were compared with the number of deaths which would have been expected
had the mortality of the lives in the investigation been the same as English Life
Table 15 (ELT15). 53 of the deviations were positive and 39 were negative.
Test whether the underlying mortality of the lives in the investigation is
represented by ELT15. [UK April 2005]
2. A life insurance company has investigated the recent mortality experience of its
male term assurance policyholders by estimating the mortality rate at each age,
qx. It is proposed that the crude rates might be graduated by reference to a
standard mortality table for male permanent assurance policyholders with forces
of mortality , so that the forces of mortality implied by the graduated
rates qx are given by the function:
, where k is a constant.
(i) Describe how the suitability of the above function for graduating the
crude rates could be investigated.
(ii) (a) Explain how the constant k can be estimated by weighted least
squares.
(b) Suggest suitable weights.
(iii) Explain how the smoothness of the graduated rates is achieved.

[UK April 2005]
3. Describe the advantages and disadvantages of graduating a set of observed
mortality rates using a parametric formula. [UK Sept 2005]
4. An investigation was carried out into the mortality of male undergraduate
students at a large university. The resulting crude rates were graduated
graphically. The following table shows the observed numbers of deaths at each
age x, dx, and the rates q̊x obtained from the graduation, together with the number of
lives exposed to risk at each age.
Age x    dx    q̊x        Exposed-to-risk
18       6     0.0012    5,200
19       8     0.0013    5,000
20       12    0.0015    4,800
21       8     0.0017    5,000
22       9     0.0019    3,800
23       6     0.0020    3,600
24       8     0.0021    3,200
(i) Test whether the overall fit of the graduated rates to the crude data is
satisfactory using a chi-squared test.
(ii) Comment on your results in (i).
(iii) (a) Describe three possible shortcomings in a graduation which the
chi-squared test cannot detect, and
(b) State a test which can be used to detect each one. [UK Sept 2005]
5. An investigation was undertaken into the mortality of male term assurance

policyholders for a large life insurance company. The crude mortality rates were
graduated using a formula of the form: . An extract of the results
is shown below.
Standardised deviation: $z_x = E_x(\hat{q}_x - \mathring{q}_x) / \sqrt{E_x \mathring{q}_x (1 - \mathring{q}_x)}$
Age x    Exposure (years) Ex    Crude mortality rate q̂x    Graduated mortality rate q̊x    Standardised deviation zx
40       11,037                 0.00290                     0.00348                         -1.035
41       12,010                 0.00333                     0.00358                         -0.459
42       11,654                 0.00300                     0.00368                         -1.212
43       9,658                  0.00300                     0.00379                         -1.264
44       8,457                  0.00319                     0.00391                         -1.061
45       10,541                 0.00427                     0.00402                         0.406
46       7,410                  0.00472                     0.00415                         0.763
47       12,042                 0.00399                     0.00428                         -0.487
48       14,038                 0.00406                     0.00441                         -0.626
49       11,479                 0.00375                     0.00455                         -1.274
50       12,480                 0.00409                     0.00469                         -0.981
51       10,567                 0.00407                     0.00485                         -1.154
52       9,187                  0.00512                     0.00500                         0.163
53       14,027                 0.00456                     0.00517                         -1.007
54       11,581                 0.00466                     0.00534                         -1.004
(i) Test the graduation for goodness of fit using the chi-squared test.
(ii) (a) By inspection of the data, suggest one aspect of the graduated rates
where adherence to data seems inadequate.
(b) Explain why this may not be detected by the chi-squared test.
(c) Carry out one other test that may detect this deficiency.
(iii) Suggest how the graduation could be adjusted to correct the deficiency
identified. [UK April 2006]
6. (i) (a) Describe the general form of the polynomial formula used to
graduate the most recent standard tables produced for use by UK
life insurance companies.
(b) Show how the Gompertz and Makeham formulae arise as special
cases of this formula.
(ii) An investigation was undertaken of the mortality of persons aged
between 40 and 75 years who are known to be suffering from a
degenerative disease. It is suggested that the crude estimates be graduated
using the formula:
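$\mathring{\mu}_{x+1/2} = \exp\left[ b_0 + b_1 \left(x + \tfrac{1}{2}\right) + b_2 \left(x + \tfrac{1}{2}\right)^2 \right]$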
(a) Explain why this might be a sensible formula to choose for this
class of lives.
(b) Suggest two techniques which can be used to perform the
graduation.
(iii) The table below shows the crude and graduated mortality rates for part of
the relevant age range, together with the exposed to risk at each age and
the standardised deviation at each age.
Age last birthday x   Graduated force of mortality   Crude force of mortality   Exposed to risk   Standardised deviation
50 0.08127 0.07941 340 -0.12031
51 0.08770 0.08438 320 -0.20055
52 0.09439 0.09000 300 -0.24749
53 0.10133 0.10345 290 0.11341
54 0.10853 0.09200 250 -0.79336
55 0.11600 0.10000 200 -0.66436
56 0.12373 0.11176 170 -0.44369
57 0.13175 0.12222 180 -0.35225
Test this graduation for:
(a) overall goodness-of-fit (b) bias; and (c) the existence of individual ages
at which the graduated rates depart to a substantial degree from the
observed rates. [UK Sept 2006]
7. An insurance company is investigating the mortality of its annuity policyholders.
It is proposed that the crude mortality rates be graduated for use in future
premium calculations.
(i) (a) Suggest, with reasons, a suitable method of graduation in this case.
(b) Describe how you would graduate the crude rates.
(ii) Comment on any further considerations that the company should take
into account before using the graduated rates for premium calculations.
[UK April 2007]
8. An insurance company is concerned that the ratio between the mortality of its
female and male pensioners is unlike the corresponding ratio among insured
pensioners in general. It conducts an investigation and estimates the mortality of
male and female pensioners, ̂ and ̂ . It then uses the ̂ to calculate what
the expected mortality of its female pensioners would be if the ratio between
male and female mortality rates reflected the corresponding ratio in the PMA92
and PFA92 tables, , using the formula ̃ ̃ .
The table below shows, for a range of ages, the numbers of female deaths actually
observed in the investigation and the number which would be expected from the
̃ .
Age x   Actual deaths   Expected deaths
65 30 28.4
66 20 30.1
67 25 31.2
68 40 33.5
69 45 34.1
70 50 41.8
71 50 46.5
72 45 44.5
(i) Describe and carry out an overall test of the hypothesis that the ratios
between male and female death rates among the company's pensioners
are the same as those of insured pensioners in general. Clearly state your
conclusion.
(ii) Investigate further the possible existence of unusual ratios between male
and female death rates among the company's pensioners, using two other
appropriate statistical tests. [UK April 2007]
9. A national mortality investigation was carried out. It was suggested that the
mortality of the male population could be represented by the following
graduated rates: , where is from the standard tables,
ELT15(males).
The table below shows the graduated rates for part of the age range, together
with the exposed to risk, expected and actual deaths at each age. The squared
standardized deviations that were calculated are also shown.
The standardised deviations were calculated as
$z_x = (\text{actual deaths} - \text{expected deaths}) / \sqrt{\text{expected deaths}}$.
Age x   Graduated rates   Exposed to risk   Expected deaths   Deaths   Squared standardized deviations
50 0.00549 10,850 59.57 52 0.9611
51 0.00610 9,812 59.85 54 0.5742
52 0.00679 10,054 68.27 60 1.0010
53 0.00757 9,650 73.05 65 0.8872
54 0.00845 8,563 72.36 64 0.9653
55 0.00945 10,656 100.70 87 1.8637
56 0.01057 9,667 102.18 88 1.9679
57 0.01182 9,560 113.00 97 2.2653
58 0.01323 8,968 118.65 103 2.0634
59 0.01483 8,455 125.39 105 3.3150
(i) Test this graduation for overall goodness-of-fit.
(ii) Comment on your findings in (i). [UK Sept 2007]
10. (i) Explain why crude mortality rates are graduated before being used for
financial calculations.
(ii) List two methods of graduating a set of crude mortality rates and state, for
each method:
(a) under what circumstances it should be used; and
(b) how smoothness is ensured. [UK Sept 2007]
11. Describe how smoothness is ensured when mortality rates are graduated using
each of the following methods:
(a) fitting a parametric formula (b) graphical graduation [UK April 2008]
12. An investigation was carried out into mortality rates among a certain class of
female pensioners. Crude mortality rates were estimated by single years of age
from ages 65–89 years last birthday inclusive. The investigators decided to ask an
actuary to compare the crude rates with a standard table. They calculated the
relevant standardised deviations, printed them out and sent them to the actuary.
Unfortunately, because of a printing error, the right-hand edge of the document
containing the standardised deviations failed to print properly. The actuary was
unable to read the magnitude of the standardised deviations. However, the sign
of each deviation was clear. This revealed that the crude mortality rates were
higher than the standard table rates at ages 65–72 years and 75–84 years
inclusive, but that the crude mortality rates were lower than the standard table
rates at ages 73–74 years and 85–89 years inclusive.
The null hypothesis to be tested is that the crude mortality rates come from a
population with underlying mortality consistent with that in the standard table.
(i) List two statistical tests of the null hypothesis which the actuary could
carry out on the basis of the information received.
(ii) Carry out both tests. For each test, state what feature of the experience it is
specifically testing, and give your conclusion. [UK April 2008]
13. An investigation into the mortality experience of a sample of the male student
population of a large university has been carried out. The university authorities
wish to know whether the mortality of male students at the university is the
same as that of males in the country as a whole. They have drawn up the
following table.
Age x   Number of deaths   Expected number of deaths assuming national mortality
18 13 10
19 15 12
20 14 14
21 20 12
22 12 8
23 8 5
Carry out an overall test of the university authorities' hypothesis, stating your
conclusion. [UK Sept 2008]
14. A life insurance company has a small group of policies written on impaired lives
and has conducted an investigation into the mortality of these policyholders. It is
proposed that the crude mortality rates be graduated for use in future premium
calculations.
Discuss the suitability of two methods of graduation that the insurance company
could use. [UK April 2009]
15. Explain the basis underlying the grouping of signs test, and derive the formula
for the probability of exactly t positive groups by considering the possible
arrangements of a set of positive and negative signs. [UK April 2009]
16. An investigation into the mortality of men engaged in a hazardous occupation
was carried out. The following is an extract from the results.
Age x   Initial exposed-to-risk Ex   Observed deaths θx   Crude rate θx/Ex
30 950 12 0.0126
31 1,200 14 0.0117
32 1,200 16 0.0133
33 900 9 0.0100
34 1,000 11 0.0110
35 1,100 15 0.0136
36 800 10 0.0125
37 1,250 16 0.0128
38 1,400 17 0.0121
It was decided to graduate the results with reference to English Life Table 15
(males). The formula used for the graduation was .
(i) Using a test of the overall fit of the graduated rates to the data, test the
hypothesis that the underlying mortality of men in the hazardous
occupation is in accordance with the graduation formula given above.
(ii) Test the graduation using two other tests which detect different features
of the graduation. For each test you apply:
(a) State the feature of the graduation it is designed to detect.
(b) Carry out the test.
(c) State your conclusion. [UK Sept 2009]
17. (i) State three different methods of graduating raw mortality data and for
each method give an example of a situation when the method would be
appropriate.
A life insurance company last priced its whole of life contract 30 years ago using
a standard mortality table. The company wishes to establish whether recent
mortality experience in the portfolio of business is in line with the pricing basis.
These are the data:
Columns 1 to 3: recent experience. Columns 4 and 5: extract from the standard table
used for pricing the product.
Age last birthday   Exposed to risk during 2009   Deaths during 2009   x   Number of survivors to age x
50 2,381 16 50 32,669
51 3,177 21 51 32,513
52 3,460 22 52 32,338
53 1,955 15 53 32,143
54 3,485 24 54 31,926
55 3,122 29 55 31,685
56 2,781 26 56 31,417
57 3,150 31 57 31,121
58 3,651 39 58 30,795
59 3,991 48 59 30,435
60 30,039
(ii) Test the goodness of fit of these data with the pricing basis and comment
on your results.
(iii) (a) State, with reasons, one further test which you would deem
appropriate to perform on these data.
(b) Carry out that test. [UK April 2010]
18. A large pension scheme conducts an investigation into the mortality of its
younger male pensioners. The crude mortality rates are graduated using a
standard table by subtracting a constant from the rates given in the table.
A trainee has been asked to test the goodness-of-fit of the proposed graduation
using a chi-squared test. The trainee's workings are reproduced below:
"Test H0: good fit against H1: bad fit.
Age   Actual Deaths   Expected Deaths   (Actual Deaths – Expected Deaths)²/Actual Deaths
60 8 8.23 0.00661
61 8 10.01 0.50501
62 10 10.52 0.02704
63 12 14.80 0.65333
64 14 14.21 0.00315
65 13 17.37 1.46899
Test Statistic = 2.66413
Age range is 65–60 = 5 years so 5 degrees of freedom.
Two-tailed test so take 2 * 2.66413 = 5.32826 and compare against tabulated value
of chi-square distribution with 5 degrees of freedom at 2.5% level, which is
12.833.
So we accept the null hypothesis."
Identify the errors in the trainee's workings, without performing any detailed
calculations. [UK Sept 2010]
19. (i) Outline the circumstances under which graphical graduation of crude
mortality rates might be useful.
(ii) List the steps involved in graphical graduation. [UK Sept 2010]
20. Rocky Bay is a small seaside town in the north of Europe. In a leaflet advertising
the town, the tourist office has claimed that "in August, Rocky Bay has a
Mediterranean climate". An actuarial student spent August 2009 on holiday in
Rocky Bay with his family, and became sceptical of this claim. When he returned
home, he thought it might be interesting to examine the claim by applying some
of the methods he had learned while studying for the Core Technical subjects.
For each of the 31 days in August 2009 he collected data recorded by various
meteorological offices on the maximum temperature in Rocky Bay and the mean
of the maximum temperatures reported on the same day at a range of places in
the Mediterranean region.
The data are shown below, where, for each of the days in August, "+" means that
Rocky Bay had the higher maximum temperature and "–" means that the
Mediterranean average was higher.
1 2 3 4 5 6 7 8 9 10 11 12
- - - - - - - - - - - -
13 14 15 16 17 18 19 20 21 22 23 24
+ + + + - - - - - - - -
25 26 27 28 29 30 31
- - - - + + +
(i) Carry out a statistical test to examine the tourist office's claim.
(ii) Suggest reasons why the test might not be an appropriate way to examine
the tourist office's claim. [UK Sept 2010]
21. (i) Explain why a mortality experience would need to be graduated.

An actuary has conducted investigations into the mortality of the following
classes of lives:
(a) the female members of a medium-sized pension scheme
(b) the male population of a large industrial country
(c) the population of a particular species of reptile in the zoological
collections of the southern hemisphere
The actuary wishes to graduate the crude rates.
(ii) State appropriate methods of graduation for each of the three classes of
lives and, for each class, briefly explain your choice. [UK April 2011]
22. An historian has investigated the force of mortality from tuberculosis in a
particular town in a developed country in the 1860s using a sample of records
from a cemetery.
He wishes to test whether the underlying mortality from tuberculosis in the town
is the same as the national force of mortality from this cause of death, as reported
in death registration data. The data are shown in the table below.
Age-group   Deaths in sample   Central exposed to risk in sample   National force of mortality
5–14        13                 3,685                               0.0051
15–24 47 2,540 0.0199
25–34 52 1,938 0.0309
35–44 50 1,687 0.0316
45–54 33 1,386 0.0286
55–64 23 1,018 0.0230
65–74 13 663 0.0202
75–84 3 260 0.0070
(i) Carry out an overall test of the null hypothesis that the underlying
mortality from tuberculosis in the town is the same as the national force of
mortality, and state your conclusion.
(ii) (a) Identify two differences between the experience of the sample and
the national experience which the test you performed in (i) might
not detect.
(b) Carry out a test for each of the differences in (ii)(a).
(iii) Comment on the results from all the tests carried out in (i) and (ii).
[UK April 2011]
23. (i) Describe three shortcomings of the χ² test for comparing crude estimates
of mortality with a standard table and why they may occur.
The following table gives an extract of data from a mortality investigation
conducted in the rural highlands of a developed country. The raw data have
been graduated by reference to a standard mortality table of assured lives.
Age x   Expected deaths   Observed deaths   zx   zx²
60 36.15 35 –0.191 0.037
61 28.92 24 –0.915 0.837
62 31.34 27 –0.775 0.601
63 38.01 35 –0.488 0.238
64 26.88 32 0.988 0.975
65 37.59 36 –0.259 0.067
66 33.85 34 0.026 0.001
67 26.66 32 1.034 1.070
68 22.37 26 0.767 0.589
69 18.69 33 3.310 10.956
70 18.24 22 0.880 0.775
(ii) For each of the three shortcomings you described in (i):

(a) name a test that would detect that shortcoming.
(b) carry out the test on the data above.
(iii) Comment on your results from (ii). [UK Sept 2011]
24. The mortality experience of a large company pension scheme is to be tested to
see if the experience of males aged 65–72 years is consistent with a standard
table. The results were collated by the firm conducting the analysis on a
computer spreadsheet, with positive and negative standardised deviations being
distinguished only by being in a different coloured font. Unfortunately the
results have been supplied to the company in the form of a printout produced on
a black-and-white printer from which it is not possible to tell the signs of the
deviations.
The values of the standardised deviations shown are as follows:

0.052, 0.967, 2.528, 0.328, 1.234, 0.250, 1.023, 0.756
(i) Suggest two tests which could be conducted from the information given.
(ii) Carry out the tests you suggested in your answer to part (i).
[UK April 2012]
25. (i) Describe a situation when graduation of raw mortality data using a
parametric formula might be appropriate and explain why.
(ii) (a) State another method of graduation.
(b) Suggest a situation in which its use may be appropriate.
A large insurance company has graduated the mortality experience of part of its
business. The original data and the graduated rates are as follows.
Age   Exposed to risk   Number of deaths   Graduated rates
40 1284 4 0.00240
41 2038 4 0.00266
42 1952 12 0.00297
43 2158 7 0.00332
44 2480 11 0.00371
45 1456 7 0.00415
46 2100 12 0.00464
47 1866 16 0.00519
48 1989 15 0.00577
49 1725 10 0.00642
(iii) Test this graduation for overall goodness of fit.

(iv) Discuss whether it may be necessary to test for smoothness.
(v) Test the data for individual outliers. [UK Sept 2012]
26. A life office compared the mortality of its policyholders in the age range 30 to 60
years inclusive with a set of mortality rates prepared by the Continuous
Mortality Investigation (CMI). The mortality of the life office policyholders was
higher than the CMI rates at ages 30–35, 38–41, 45–50 and 54–59 years inclusive,
and lower than the CMI rates at all other ages in the age range.
(i) Perform two tests of the null hypothesis that the underlying mortality of
the life office policyholders is represented by the CMI rates.
(ii) Comment on your results from part (i).
(iii) Explain the problem which duplicate policies cause in the context of the
CMI mortality investigations. [UK April 2013]
27. (i) (a) State three different methods of graduating crude mortality data.
(b) Give, for each method, one advantage and one disadvantage.
An insurance company has graduated the experience of one block of its life
business against a standard table, the following is an extract of the data.
Age x   Exposed to risk   Observed deaths   Graduated rates
30 36,254 26 0.000590
31 37,259 20 0.000602
32 28,057 23 0.000617
33 31,944 23 0.000636
34 30,005 26 0.000660
35 28,389 12 0.000689
36 36,124 31 0.000724
37 28,152 22 0.000765
38 24,001 25 0.000813
39 30,448 31 0.000870
(ii) Carry out a test for overall goodness of fit.

(iii) Carry out two other statistical tests to check the validity of the graduation.
(iv) Discuss, with reference to the tests you have performed, whether it would
be reasonable for the company to use the graduated rates to price life
insurance policies. [UK Sept 2013]
28. (i) (a) State three features which are desirable when a graduation is
performed.
(b) Explain why they are desirable.
The actuary to a large pension scheme has attempted to graduate the scheme's
recent mortality experience with reference to a table used for similar sized
schemes in a different industry. He has calculated the standardised deviations
between the crude and the graduated rates, zx, at each age and has sent you a
printout of the figures over a small range of ages. Unfortunately the dot matrix
printer on which he printed the results was very old and the dots which would
form the minus sign in front of numbers no longer function, so you cannot tell
which of the standardised deviations is positive and which negative. Below are
the data which you have.
Age Standardised deviation
60 2.40
61 0.08
62 0.80
63 0.76
64 1.04
65 0.77
66 1.30
67 1.76
68 0.28
69 0.68
70 0.93
(ii) (a) Carry out an overall goodness-of-fit test on the data.

(b) Comment on your result.
(iii) (a) List four defects of a graduation which the test you have carried
out would fail to detect.
(b) Suggest, for each of the defects, a test which could be used to detect
it.
(iv) Carry out one of the tests suggested in part (iii)(b). [UK April 2014]
29. A life insurance company is developing a new class of annuity business. It has
conducted a study of mortality among lives it believes represent this new
business. It wishes to graduate the data so that they are suitable for use in
financial calculations. It decides to use a standard table as a basis for graduation
and the function: where are the graduated rates and are the
rates from the standard table.
The table below gives some results from the graduation.
Age x   Crude rates   Graduated rates   Exposed to risk
70 0.0167 0.022661 1,200
71 0.0209 0.024783 1,194
72 0.0236 0.027204 973
73 0.0324 0.029956 956
74 0.0362 0.033072 912
75 0.0402 0.036587 845
76 0.0561 0.040357 820

77 0.0623 0.044962 369
78 0.0552 0.049899 489
79 0.0640 0.055390 500
(i) Carry out an overall test of the goodness-of-fit of this graduation to the
crude rates.
(ii) List three defects of a graduation which the test you conducted in (i) may
not detect.
(iii) Perform, for each of two of the defects listed in (ii), an additional test
which can detect the defect.
(iv) Comment on the results of the tests carried out in parts (i) and (iii).
[UK Sept 2014]
ANSWERS
1. The null hypothesis is that the observed rates are a sample from a population in
which English Life Table 15 represents the true rates. If the null hypothesis is
true, then the observed number of positive deviations, P, will be such that
P ~ Binomial(92, ½).
We use the normal approximation to the Binomial distribution because we have
more than 20 ages. This means that, approximately, P ~ Normal(46, 23).
The z-score associated with the probability of getting 53 positive deviations if the
null hypothesis is true is, therefore, using a continuity correction,
$z = (52.5 - 46) / \sqrt{23} = 1.355$.
We use a two-tailed test, since both an excess of positive and an excess of
negative deviations are of interest. Using a 5% significance level, we have
-1.96 < 1.355 < +1.96.
This means we have insufficient evidence to reject the null hypothesis.
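The arithmetic of this signs test can be checked with a short sketch. The function name `signs_test_z` is an illustrative choice, not something from the workbook, and the continuity correction is applied by moving the observed count half a unit towards the mean, which reproduces the 1.355 quoted above.

```python
import math

def signs_test_z(n_ages, n_positive):
    """Continuity-corrected z-score for the signs test.

    Under the null hypothesis the number of positive deviations is
    Binomial(n_ages, 0.5), approximated here by a normal distribution.
    (Hypothetical helper written for this illustration.)
    """
    mean = n_ages * 0.5
    variance = n_ages * 0.25
    # Continuity correction: shift the count half a unit towards the mean.
    if n_positive > mean:
        corrected = n_positive - 0.5
    elif n_positive < mean:
        corrected = n_positive + 0.5
    else:
        corrected = n_positive
    return (corrected - mean) / math.sqrt(variance)

z = signs_test_z(92, 53)          # 92 ages, 53 positive deviations
reject_null = abs(z) > 1.96       # two-tailed test at the 5% level
print(round(z, 3), reject_null)
```

With 92 ages and 53 positive deviations the statistic stays inside (-1.96, 1.96), matching the conclusion above.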
2. (i) The suitability of a linear relationship between and could be

investigated by plotting –log(1 - qx) against –log(1 - qsx) or by plotting and
and looking for a linear relationship. An approximately linear relationship
will suffice. If data are scarce, too close a fit is not to be expected, especially at
extreme ages.
(ii) (a) We can work with either or . The value of k which minimizes either
∑ ( ) or ∑ ( ) should be found (note that the
summations are over all relevant ages x). At each age there will be a different
sample size or exposed to risk, Ex. This will usually be largest at ages where
many term assurances are sold (e.g. ages 25 to 50 years) and smaller at other
ages.
(b) The estimation procedure should pay more attention to ages where there are
lots of data. These ages should have a greater influence on the choice of k than
other ages. This implies weights wxEx. A suitable choice would be or
or
(iii) The graduated forces of mortality are a linear function of the forces in the
standard table. Since the forces in the standard table should already be smooth, a
linear function of them will also be smooth.
3. Advantages:
The graduated rates will progress smoothly provided the number of parameters
is small.
Good for producing standard tables.
Can easily be extended to more complex formulae, provided optimisation can be
achieved.
Can fit the same formula to different experiences and compare parameter values
to highlight differences between them.
Disadvantages:
It can be hard to find a formula to fit well at all ages without having lots of
parameters.
Care is required when extrapolating: the fit is bound to be best at ages where we
have lots of data, and can often be poor at extreme ages.
4. (i) T.S. = 4.4673. Since we have 7 ages, we compare this with the tabulated value
at the 5% level with, say, 4 degrees of freedom (since we lose 2 to 3 degrees of
freedom for every 10 ages graduated graphically). The tabulated value with 4
degrees of freedom is 9.488. Since 4.4673 < 9.488 we have no evidence to reject
the null hypothesis.
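As a check on the test statistic, the following sketch recomputes it from the data in question 4. It assumes expected deaths are exposure times graduated rate and that each squared deviation is divided by the expected deaths (the factor 1 - q is ignored, which is what reproduces the figure quoted above). The variable and function names are illustrative.

```python
# Data from question 4 (ages 18 to 24).
deaths = [6, 8, 12, 8, 9, 6, 8]
graduated_rates = [0.0012, 0.0013, 0.0015, 0.0017, 0.0019, 0.0020, 0.0021]
exposed_to_risk = [5200, 5000, 4800, 5000, 3800, 3600, 3200]

def chi_squared_statistic(deaths, rates, exposure):
    """Sum of (actual - expected)^2 / expected over all ages."""
    total = 0.0
    for d, q, e in zip(deaths, rates, exposure):
        expected = e * q
        total += (d - expected) ** 2 / expected
    return total

test_statistic = chi_squared_statistic(deaths, graduated_rates, exposed_to_risk)
critical_value = 9.488  # tabulated 5% point with 4 degrees of freedom, as quoted above
print(round(test_statistic, 3), test_statistic < critical_value)
```

The largest single contribution comes from age 20 (12 deaths against 7.2 expected), which is the deviation singled out in part (ii).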
(ii) On the basis of the chi-squared test, the graphical graduation adheres to the
data satisfactorily. However, there is a large deviation at age 20 which requires
further investigation.
(iii) Possible shortcomings, and the relevant tests are:
There may be long runs of deviations of the same sign caused by
overgraduation. These can be detected by the grouping of signs test or the serial
correlations test.
There may be one or two large deviations at particular ages, balanced by lots of
small deviations (as in the example in part (i)). These can be detected by the
individual standardised deviations test.
The graduated rates may be too high or too low over the whole of the age range,
but by an amount too small for the chi-squared test to detect. The signs test or the
cumulative deviations test will detect this.
The results of the graduation may not be smooth. This can be detected by looking
at the third order differences of the graduated rates. If the rates are smooth, these
should be small in magnitude compared with the quantities themselves and
should progress regularly.
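A minimal sketch of the third-difference check follows. The helper `differences` is a hypothetical name, and the input rates are the graduated rates for ages 40 to 47 from the table in question 5, used purely as sample input.

```python
def differences(values, order):
    """Return the finite differences of the given order of a sequence."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

# Graduated rates for ages 40-47 from question 5, used only as sample input.
graduated = [0.00348, 0.00358, 0.00368, 0.00379, 0.00391, 0.00402, 0.00415, 0.00428]

third = differences(graduated, 3)
largest = max(abs(d) for d in third)
# Smoothness is judged by comparing the third differences with the rates themselves.
print(largest, min(graduated))
```

Here the third differences are two orders of magnitude smaller than the rates, although with rates rounded to five decimal places part of what remains is rounding noise.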
5. (i) The observed value of T.S. is 12.816. The critical value of the chi-squared
distribution with 12 degrees of freedom at the 5% level is 21.03. This is greater
than the observed value of T.S. and so we have insufficient evidence to reject the
null hypothesis.
(ii) (a) The obvious problem with the graduation is one of overall bias. The
graduated rates are consistently too high, resulting in too many negative
deviations.
(b) This is not detected by the test because the test statistic is the sum of the
squared deviations and so information on the sign and some information on the
size of the individual deviations is lost. The test would detect large bias, but in
this case the graduated and crude rates are close enough that the statistic is
below the critical value.
(c) Signs test: the P-value is 0.0176, which is less than 0.025 (this is a two-tailed
test), and so we reject the null hypothesis.
Cumulative deviations test: T.S. = -2.715. This is a two-tailed test. Since
|-2.715| > 1.96, we reject the null hypothesis.
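The chi-squared statistic in part (i) and the signs test P-value can both be recovered from the standardised deviations tabulated in question 5. The sketch below assumes the exact Binomial(15, 0.5) distribution for the number of positive deviations; `binomial_lower_tail` is an illustrative helper name.

```python
from math import comb

# Standardised deviations for ages 40 to 54, from the table in question 5.
z_values = [-1.035, -0.459, -1.212, -1.264, -1.061, 0.406, 0.763, -0.487,
            -0.626, -1.274, -0.981, -1.154, 0.163, -1.007, -1.004]

def binomial_lower_tail(n, k):
    """P(X <= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k + 1)) / 2 ** n

chi_squared = sum(z * z for z in z_values)           # statistic used in part (i)
n_positive = sum(1 for z in z_values if z > 0)       # number of positive deviations
p_value = binomial_lower_tail(len(z_values), n_positive)
print(round(chi_squared, 3), n_positive, round(p_value, 4))
```

Only 3 of the 15 deviations are positive, which is why the signs test picks up the bias that the chi-squared test misses.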
(iii) The problem is that the graduated rates are too high. There doesn't appear to
be a problem with the overall shape. So we should be able to adjust the
parameters rather than change the underlying equation.
The problem persists across the whole age range, so the first adjustment to try
would be to decrease the value of .
6. (i) (a) The general form is $\mu_x = \text{polynomial}(1) + \exp(\text{polynomial}(2))$, where
polynomial (1) takes the form $\alpha_0 + \alpha_1 x + \alpha_2 x^2 + \dots$
and polynomial (2) takes the form $\beta_0 + \beta_1 x + \beta_2 x^2 + \dots$.
(b) In the case of the Gompertz formula $\mu_x = Bc^x$, putting $B = \exp(\beta_0)$ and
$c = \exp(\beta_1)$, we can rewrite the formula as
$\mu_x = \exp(\beta_0)\exp(\beta_1 x) = \exp(\beta_0 + \beta_1 x)$,
which is of the required form if $\alpha_i = 0$ for all i and $\beta_i = 0$ for i = 2, 3, ….
Similarly the Makeham formula $\mu_x = A + Bc^x$ can be expressed in the required
form by putting $A = \alpha_0$, $B = \exp(\beta_0)$ and $c = \exp(\beta_1)$.
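The special cases can be illustrated numerically. In this sketch the function name `gm_force_of_mortality` and the parameter values A, B and c are hypothetical, chosen only to show that the general formula collapses to the Gompertz and Makeham forms; the list of beta coefficients is assumed to be non-empty.

```python
import math

def gm_force_of_mortality(x, alphas, betas):
    """mu_x = polynomial(1) + exp(polynomial(2)).

    alphas and betas hold the polynomial coefficients in increasing
    powers of x; betas is assumed to contain at least one coefficient.
    """
    poly1 = sum(a * x ** i for i, a in enumerate(alphas))
    poly2 = sum(b * x ** i for i, b in enumerate(betas))
    return poly1 + math.exp(poly2)

# Hypothetical parameter values, for illustration only.
A, B, c = 0.0005, 0.00005, 1.1
age = 50

gompertz = gm_force_of_mortality(age, [], [math.log(B), math.log(c)])
makeham = gm_force_of_mortality(age, [A], [math.log(B), math.log(c)])
print(gompertz, B * c ** age)        # the two values agree
print(makeham, A + B * c ** age)     # the two values agree
```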
(ii) (a) The Gompertz formula written μx = exp(β0 + β1x) is an exponential
function which implies that the rate of increase of mortality with age is constant.
This is often a reasonable assumption for ordinary lives at middle ages and older
ages. In the special case of the impaired lives known to be suffering from a
degenerative disease, it is plausible to suppose that the rate of increase of
mortality might increase with age.
The term . /
(b) The graduation can be achieved by maximum likelihood estimation of the
parameters or by ordinary least squares regression of $\log[\hat{\mu}_{x+1/2}]$ on
$\left(x + \tfrac{1}{2}\right)$ and $\left(x + \tfrac{1}{2}\right)^2$.
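(iii) (a) In this case, we have 8 ages, but 3 parameters were estimated when
performing the graduation, so there are 5 degrees of freedom. The observed test
statistic is 1.52052 and the critical value of the chi-squared distribution with 5
degrees of freedom at the 5% level is 11.07. Since 1.52052 < 11.07, we have
insufficient evidence to reject the null hypothesis and conclude that the
graduation adheres satisfactorily to the data.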
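(b) To test for bias we use EITHER the signs test or the cumulative deviations
test.
Signs test: P-value = 0.0352. Cumulative deviations test: T.S. = -0.9335.
Both tests are two-tailed. Since 0.0352 > 0.025 and |-0.9335| < 1.96, neither test
gives evidence of bias at the 5% level.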
(c) To test for the existence of individual ages at which the graduated rates depart
greatly from the observed rates we can use the Individual Standardised
Deviations Test.
There are no ages at which the absolute value of zx exceeds 1.96. Therefore we do
not reject the null hypothesis and conclude that there are no outliers.
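The figures in parts (a) and (c) follow directly from the standardised deviations tabulated in question 6, as the sketch below shows. The variable names are illustrative, and the 1.96 threshold is the two-tailed 5% point used in the answer above.

```python
# Standardised deviations for ages 50 to 57, from the table in question 6.
ages = list(range(50, 58))
z_values = [-0.12031, -0.20055, -0.24749, 0.11341,
            -0.79336, -0.66436, -0.44369, -0.35225]

# (a) Chi-squared statistic: the sum of the squared standardised deviations.
chi_squared = sum(z * z for z in z_values)

# (c) Individual standardised deviations test: flag ages where |z| exceeds 1.96.
outlier_ages = [age for age, z in zip(ages, z_values) if abs(z) > 1.96]

print(round(chi_squared, 5), outlier_ages)
```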
7. (i) (a) Graduation by reference to a standard table would be appropriate. There
are likely to be existing standard tables which are suitable and this method is
suitable for relatively small data sets.
Alternatively, graduation by parametric formula would be suitable if the volume
of data was large enough. But that is unlikely to be the case here.
Graphical graduation would not be appropriate for rates for premium
calculations.
(b) Assuming graduation by reference to a standard table:
• Select a suitable table, based on a similar group of lives.
• Plot the crude rates against the rates from the standard table to identify a simple
relationship.
• Find the best-fit parameters, using maximum likelihood or least squares
estimates.
• Test the graduation for goodness of fit. If the fit is not adequate, the process
should be repeated.
(ii) Considerations include:
• As the premiums are for annuity policies, it is important not to overestimate
the mortality rates, as the premiums would be too low.
• The rates will be based on current mortality; the company should also take into
account expected future changes, especially any reductions in mortality rates.
• Premiums charged by other insurers: if rates are too high the company will fail
to attract business; if too low, it may attract too much unprofitable business.
8. (i) Test statistic = 11.3343. The critical value of the chi-squared distribution
with 8 degrees of freedom at the 5% level of statistical significance is 15.51.
Since 11.3343 < 15.51, we have no reason to reject the null hypothesis that the
sex ratios of death rates among the company's pensioners are the same as those
prevailing in the PMA92 and PFA92 tables.
(ii) Signs test: P-value = 0.1094. Since this is greater than 0.025 (two-tailed test),
the sex ratios of death rates among the company's pensioners are not
systematically higher or lower than those derived from the PMA92 and PFA92
tables.
Cumulative deviations test: T.S. = 0.875, and since |0.875| < 1.96 using a
two-tailed test, the sex ratios of death rates among the company's pensioners are
not systematically higher or lower than those derived from the PMA92 and PFA92
tables.
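The chi-squared and cumulative deviations statistics quoted in this answer can be reproduced from the actual and expected deaths in question 8. This is a sketch that assumes the variance of the deaths at each age is approximated by the expected deaths; the function names are illustrative.

```python
import math

# Actual and expected female deaths for ages 65 to 72, from question 8.
actual = [30, 20, 25, 40, 45, 50, 50, 45]
expected = [28.4, 30.1, 31.2, 33.5, 34.1, 41.8, 46.5, 44.5]

def chi_squared_statistic(actual, expected):
    """Sum of (actual - expected)^2 / expected over all ages."""
    return sum((a - e) ** 2 / e for a, e in zip(actual, expected))

def cumulative_deviations_statistic(actual, expected):
    """Total deviation divided by the square root of total expected deaths."""
    return (sum(actual) - sum(expected)) / math.sqrt(sum(expected))

chi_squared = chi_squared_statistic(actual, expected)
cumulative = cumulative_deviations_statistic(actual, expected)
print(round(chi_squared, 4), round(cumulative, 3))
```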
9. (i) T.S = 15.8623
(ii) From the data we can see that the actual deaths are lower than those expected
at all ages. The graduated rates are too high; the graduation should be revisited.
At these ages the force of mortality increases with age, so a suitable adjustment
may be to reduce the age shift relative to the standard table from 2 years.
The standardised deviations also appear to show a systematic increase with age,
showing that departure of the graduated rates from the actual rates increases
with age. There appear to be no outliers (all the zx's have absolute values below
1.96).
10. (i) We assume that mortality rates progress smoothly with age. Therefore a crude
estimate at age x carries information about the rates at adjacent ages, and
graduation allows us to use this fact to "improve" the estimate at age x by
smoothing.
This reduces the sampling errors at each age. It is desirable that financial
quantities progress smoothly with age, as irregularities are hard to justify to
clients.
(ii) Any two of the following three methods are acceptable:
By parametric formula:
Should be used for large experiences, especially if the aim is to produce a
standard table. Depends on a suitable formula being found which fits the data
well. Provided the number of parameters is small, the resulting curve should be
smooth.
With reference to a standard table:
Should be used if a standard table for a class of lives similar to the experience is
available, and the experience we are interested in does not provide much data.
The standard table will be smooth, and provided the function linking the
graduated rates to the rates in the standard table is simple, this smoothness will
be "transferred" to the graduated rates.
Graphical:
Should be used if a quick check is needed, or data are very scanty. The graduation should be
tested for smoothness using the third differences of the graduated rates, which
should be small in magnitude and progress regularly with age. If the smoothness
is unsatisfactory, the curve can be adjusted ("hand-polishing") and the
smoothness tested again.
11. (a) Provided a formula with a small number of parameters is chosen the resulting
graduation will be acceptably smooth.
(b) The graduation should be tested for smoothness using the third differences of
the graduated rates which should be small in magnitude and progress regularly.
A further iterative process, which involves manual adjustment of the graduation
(called "hand-polishing"), is sometimes necessary to ensure smoothness.
12. (i) Since we do not know the values of the rates in the crude experience but only
the signs of the deviations the tests we can carry out are limited. We can,
however, perform the signs test and the grouping of signs test.
(ii) The signs test looks for overall bias. We have 25 ages, and at 18 of these the
crude rates exceed the standard table rates (i.e. we have positive deviations).
If the null hypothesis is true, then the observed number of positive deviations, P,
will be such that P ~ Binomial(25, 0.5).
We use the normal approximation to the Binomial distribution because we have
a large number of ages (> 20). This means that, approximately, P ~ Normal(12.5,
6.25). With 18 positive deviations, T.S. = (17.5 - 12.5) / 2.5 = 2.00 (using a
continuity correction). Using a 5% significance level, we
have 2.00 > 1.96. This means we have just sufficient evidence to reject the null
hypothesis.
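The signs test calculation above can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names are hypothetical, and it assumes a two-sided test with a continuity correction that moves the observed count half a unit towards the mean. Working with the 7 negative deviations instead gives the same magnitude with the opposite sign.

```python
from math import erf, sqrt


def signs_test_z(n_ages, n_positive):
    """Continuity-corrected z statistic under P ~ Binomial(n_ages, 0.5)."""
    mean = n_ages * 0.5
    sd = sqrt(n_ages * 0.25)
    # Continuity correction: shift the observed count 0.5 towards the mean.
    if n_positive > mean:
        corrected = n_positive - 0.5
    elif n_positive < mean:
        corrected = n_positive + 0.5
    else:
        corrected = n_positive
    return (corrected - mean) / sd


def two_sided_p_value(z):
    """Two-sided p-value from the standard normal distribution."""
    phi = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    return 2.0 * (1.0 - phi)


z_positive = signs_test_z(25, 18)   # based on the 18 positive deviations
z_negative = signs_test_z(25, 7)    # based on the 7 negative deviations
p_value = two_sided_p_value(z_positive)
print(z_positive, z_negative, round(p_value, 4))
```

The statistic has magnitude 2.00, just beyond the 1.96 critical value, which matches the "just sufficient evidence" conclusion.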
The grouping of signs test looks for long runs or clumps of ages with the same
sign, indicating that the crude experience is different from the standard
experience over a substantial age range.
The number of runs of positive signs is 2 (65–72 years and 75–84 years). We have
25 ages and 18 positive signs in total, which means 7 negative signs.
Using the table provided under n1 = 18 and n2 = 7, we find that, under the null
hypothesis, the greatest number of positive runs x for which the probability of x
or fewer positive runs is less than 0.05 is 3. Since we only have 2 runs, we
conclude that the probability of obtaining 2 or fewer runs is much less than 0.05.
Therefore we reject the null hypothesis.
13. T.S. = 10.783, Since 10.783 < 12.59 there is insufficient evidence to reject the
hypothesis that the mortality rate of men in the University is the same as that of
the national population.
14. Graduation by reference to a standard table might be appropriate, if a suitable
standard table could be found. However the fact that the company insures
non-standard lives makes it unlikely that a suitable standard table would exist.
Graphical graduation might be used if no suitable standard table can be found.
However it is a last resort as it is difficult to obtain results which are smooth and
which adhere to the data.
Graduation using a parametric formula is unlikely to be appropriate as the
amount of data in this investigation is likely to be small and it is unlikely that the
company will want to produce a standard table.
15. Suppose we have a set of n crude mortality rates for a given age range x to
x + n - 1, and we wish to compare them to a standard set of n mortality rates for
the same age range.
If the mortality underlying the crude rates is the same as that of the standard set
of rates (the null hypothesis), then we should expect the difference between the
two sets of rates to be due only to sampling variability.
The grouping of signs test tests the null hypothesis by examining the number of
groups of consecutive positive deviations among the n ages, where a positive
deviation occurs when the crude rate exceeds the corresponding rate in the
standard set.
Suppose there are a total of m positive deviations, n – m negative deviations and
G positive groups. Then the number of possible ways to arrange t positive
groups among n – m negative deviations is $\binom{n-m+1}{t}$.
There are $\binom{m-1}{t-1}$ ways to arrange m positive signs into t positive groups.
There are $\binom{n}{m}$ ways to arrange m positive and n – m negative signs.
Therefore the probability of exactly t positive groups is
$$\frac{\binom{m-1}{t-1}\binom{n-m+1}{t}}{\binom{n}{m}}$$
The grouping of signs test then evaluates Pr[t ≤ G] under the null hypothesis. If
this is less than 0.05 we reject the null hypothesis at the 5% level.
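The probability described above is easy to evaluate directly. The sketch below is illustrative only (the function names are hypothetical). It applies the standard grouping of signs formula to the figures from question 12, where there are n = 25 ages, m = 18 positive deviations and G = 2 positive groups.

```python
from math import comb


def prob_exact_groups(t, m, n):
    """Pr[exactly t positive groups] with m positive signs among n ages."""
    if t < 1 or t > m:
        return 0.0
    # comb returns 0 when t exceeds the n - m + 1 available gaps.
    return comb(m - 1, t - 1) * comb(n - m + 1, t) / comb(n, m)


def prob_at_most_groups(g, m, n):
    """Pr[t <= g]: the quantity compared with 0.05 in the test."""
    return sum(prob_exact_groups(t, m, n) for t in range(1, g + 1))


p_two_or_fewer = prob_at_most_groups(2, 18, 25)
p_three_or_fewer = prob_at_most_groups(3, 18, 25)
p_four_or_fewer = prob_at_most_groups(4, 18, 25)
print(p_two_or_fewer, p_three_or_fewer, p_four_or_fewer)
```

For these figures the cumulative probability first exceeds 0.05 at four groups, so three is the critical value, in line with the tabulated value used in question 12.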
16. (i) T.S. = 4.808, no evidence to reject H0
(ii) Signs Test – P(more than or equal to 6 positive deviations) = 0.2539, cannot
reject H0.
Cumulative deviations test – T.S. = 0.7457, cannot reject H0.
Grouping of Signs Test – P-Value = 0.0476, reject H0.
17. (i) By reference to a standard table – appropriate if data are scanty or a table of
similar lives exists.
Graphical graduation – appropriate if a "quick and dirty" result is needed OR for
scanty data where no other method is appropriate.
By parametric formula – appropriate if the experience is large.
(ii) T.S. = 3.84, reject H0.
(iii) Signs Test : P-Value = 0.246, cannot reject H0.
18. The null hypothesis is poorly expressed – should be "underlying rates are the
graduated rates" or similar.
The test statistic is incorrect – the denominator should be expected deaths.
Cannot comment on figures in table as no access to workings.
Number of ages is 6 not 5.
However fewer than 6 degrees of freedom is appropriate, because we should deduct
1 for the estimated parameter and some for the choice of standard table.
This is a one-tailed test not two-tailed.
Even if it were two-tailed, multiplying test statistic by 2 is inappropriate.
The trainee has not stated the level of significance to which he or she is working
(presumably 5 per cent)
Does not explain that the reason for conclusion is 12.833 > 5.32826.
The null hypothesis should never be "accepted"; rather it is "not rejected".
The trainee has not stated his or her conclusion in terms of the null hypothesis
All the graduated rates are above the crude rates so although the graduation has
been accepted it is suspect.
19. (i) Graphical graduation might be used when EITHER a quick visual impression
OR a rough estimate is all that is required,
This is useful when the data are scanty and EITHER there is very little prior
knowledge about the class of lives being analysed so that a suitable standard
table cannot be found OR the experience of a professional person can be called
upon
(ii) Plot the crude data, preferably on a logarithmic scale.
If data are scanty, group ages together, choosing evenly spaced groups and
making sure there are a reasonable number of deaths (e.g. at least 5) in each
group.
Plot approximate confidence limits or error bars around the plotted crude rates.
Draw the curve as smoothly as possible, trying to capture the overall shape of the
crude rates.
Test the graduation for goodness-of-fit and EITHER test for smoothness OR
examine third differences. If the graduation fails the test, re-draw the curve.
"Hand polishing" individual ages may be necessary to ensure adequate
smoothness.
20. (i) Signs Test – Using Normal approximation T.S. = -3.05, reject H0.
Grouping of Signs Test – T.S. = 4.04, reject H0.
(ii) Runs of consecutive days with the same sign are likely since the weather
tends to be determined by atmospheric conditions lasting more than one day.
The Mediterranean averages are averages for the month of August 2009, not
long-run averages.
August 2009 might have been an unusually hot month in the Mediterranean
region.
Maximum temperature is not the only measure of climate, also consider mean
temperature, hours of sunshine, windiness, etc.
Choice of locations used for Mediterranean data could be important.
Also tests just look at whether one is higher or lower – the difference in each case
could be negligible (e.g. 25.001 degrees vs 25.002 degrees)
A non-standard measurement method might have been used in Rocky Bay,
which confounds the comparison.
21. (i) We believe that mortality varies smoothly with age (and evidence from large
experiences supports this belief).
Therefore the crude estimate of mortality at any age carries information about
mortality at adjacent ages. By smoothing the experience, we can make use of data
at adjacent ages to improve the estimates at each age.
This reduces sampling (or random) errors. The mortality experience may be used
in financial calculations. Irregularities, jumps and anomalies in financial
quantities (such as premiums for life insurance contracts) are hard to justify to
customers.
(ii) (a) Female members of a medium-sized pension scheme. With reference to a

standard table, because there are many extant tables dealing with female
pensioners.
(b) Male population of a large industrial country. By parametric formula, because

the experience is large. OR because the graduated rates may form a new standard
table for the country.
(c) Population of a particular species of reptile in the zoological collections of the

southern hemisphere. Graphical, because no suitable standard table is likely to
exist and the experience is small.
22. (i) T.S. = 4.438, cannot reject H0.
(ii) (a) Small bias which is not great enough for the chi-squared test to detect.
(b) Signs Test : P-Value = 0.0352, cannot reject H0.
Cumulative deviation test : T.S. = 1.6595, cannot reject H0.
(iii) In none of the tests we have performed do we reject the null hypothesis.
Therefore it seems that the mortality from tuberculosis in the town is the same as
the national force of mortality.
23. (i) Outliers. Since all the information is summarised in one number, a few large
deviations may be offset or hidden by a large number of small deviations.
Small bias. Since the squares of the differences are used, the signs of the
differences are lost, hence small but consistent bias above or below may not be
noticed.
Clumps or runs. Again, because the squares of the differences are used, the signs
of the differences are lost, so significant groups (clumps or runs) of bias over
ranges of the data may not be detected.
(ii) (a) A few large deviations or outliers – Individual Standardised Deviations

Test.
Small but consistent bias – Signs Test OR Cumulative Deviations Test.
Clumps or runs of bias over ranges of the data - Grouping of Signs Test OR Serial
Correlations Test.
24. (i) Chi-squared test (for overall goodness of fit)

(Modified) individual standardised deviations test (for outliers)
(ii) Chi-squared test : T.S. = 10.64
25. (i) When preparing standard tables OR when graduating data from a large
industrywide scheme, or a national population because there will be lots of data
available.
(ii) (a) EITHER Graphical graduation OR Graduation with reference to a

standard table
(b) EITHER Graphical graduation may be suitable for an analysis of a newly
discovered insect (as data will be scanty and an existing table will not exist) OR
Graduation with reference to a standard table is useful if data are scanty and a
suitable standard table exists (e.g. for female pensioners from a small scheme).
(iii) T.S. = 13.62, cannot reject H0.
(iv) It is not necessary to test for smoothness if the graduation was performed
using a parametric formula or a standard table, provided that a small number of
parameters were used in the formula, or in the function linking to the rates in the
standard table.
It will be necessary to test for smoothness if the graduation was performed

graphically but this is unlikely to be the case with data from a large insurance
company.
(v) The null hypothesis is that the graduated rates are the same as the true
underlying rates in the block of business (i.e. the same as in part (iii)).
We would expect the individual deviations to be distributed Normal(0,1) and
therefore only 1 in 20 of the z_x values should have absolute magnitude greater than 1.96 (or
none should be outside -3 to +3). Looking at the z_x values we see that the largest one is
2.576 and the next is 2.0294. Since they are both greater in magnitude than 1.96
we have sufficient evidence to reject the null hypothesis.
26. (i) Signs Test: T.S. = 2.16, we reject H0.

Grouping of Signs Test: T.S. = -2.7, we reject H0.
(ii) The life office's rates are, overall, different from the CMI rates (actually they
are higher). Additional tests are needed to examine the magnitude of the
difference between the two sets of rates.
The shape of the life office's mortality rates is also rather different from the CMI
schedule, and this might require further investigation, OR
The Grouping of Signs test suggests clumping of the deviations. It is possible that
the difference between the shape of the two sets of rates is so small in magnitude
as to be negligible.
(iii) We can no longer be sure that we are observing a collection of independent

claims. It is quite possible that two distinct death claims are the result of the
death of the same life. The effect of this is to increase the variance of the number
of claims, by a factor which may depend on age. This may affect tests based on
standardised deviations.
27. (ii) T.S. = 11.56
28. (ii) T.S. = 14.852, cannot reject H0.
29. (i) T.S. = 12.295, cannot reject H0.
(ii) There may be one or two large deviations at individual ages, the effect of
which are insufficient to raise the chi-squared value above the critical level.
Small but consistent bias across the whole of the age range.
The graduation might be the wrong shape, in that the graduated rates might be
higher than the crude rates in one part of the age range, and systematically lower
in another part of the age range. This will lead to runs or clumps of deviations of
the same sign. The rates may not progress smoothly from age to age.
MORTALITY PROJECTION
1. (i) Explain the notation and meaning of the parameters $\alpha_x$ and $f_{n,x}$ in the
following reduction factor formula:
$$R_{x,t} = \alpha_x + (1 - \alpha_x)(1 - f_{n,x})^{t/n}$$
(ii) State briefly how the values of these parameters are usually determined.
(iii) The mortality rate for the base year of a mortality projection has been
estimated to be: m60,0 = 0.006
It is believed that the minimum possible mortality rate for lives aged 60 is
0.0012. It is also believed that 30% of the maximum possible reduction in
mortality at this age will have occurred by ten years' time.
Using an appropriate reduction factor, calculate the projected mortality
rate for lives aged 60 in 20 years' time.
(iv) Describe the advantages and disadvantages of using an expectation-based

approach to mortality projections.
2. (i) Discuss a major difficulty that is present in a three-factor age-period-

cohort mortality projection model that is not found in either an age-period
or age-cohort model.
(ii) The following Lee-Carter model has been fitted to mortality data covering
two age groups (centred on ages 60 and 70), and a 41-year time period
from 1990 to 2030 inclusive:
$$\ln m_{x,t} = a_x + b_x k_t + e_{x,t}$$
(a) Define in words the symbols $a_x$, $b_x$, $k_t$ and $e_{x,t}$.
(b) State the constraints that are normally imposed on bx and kt in
order for the model to be uniquely specified.
(c) In this model kt has been set to cover a 41-year time period from
1990 to 2030 inclusive, such that for projection (calendar) year t :
kt+1 = kt - 0.02 + et
where et is a normally distributed random variable with zero mean
and common variance.
Identify the numerical values of kt ( t =1990,1991, ...2029,2030 ),
ignoring error terms. Hint: they need to satisfy the constraint for kt
that you specified in part (b).
(iii) Mortality has been improving over time for both ages included in the
model in part (ii). You have been given the following further information
about the model: $b_{60} = 3b_{70}$, $\hat{m}_{60,2010} = 0.00176$, $\hat{m}_{70,2010} = 0.01328$
where $\hat{m}_{x,t}$ is the predicted mortality rate at age x in calendar year t calculated
from the fitted model (ie ignoring error terms).
(a) State what the above information indicates about the impact of the
time trend on mortality at the two ages.
(b) Use the above information to complete the specification of the
model.
(c) Use the model to calculate the projected values of $\hat{m}_{60,2025}$ and
$\hat{m}_{70,2025}$.
(iv) Describe the main disadvantages of the Lee-Carter model.
3. You have fitted a model to mortality data that are subdivided by age and time
period, with a view to using the model to project future mortality rates. For a
particular age, the model is defined as:
$$\ln\left[E\left(\frac{D_{x,t}}{E^c_{x,t}}\right)\right] = a + bt + ct^2$$
where $D_{x,t}$ is the random number of deaths, and $E^c_{x,t}$ is the central exposed to
risk for age group x in time period t ( t = 0 is the year 1975).
(i) If $m_{x,t}$ is the central rate of mortality for exact age x in time period t , show
that the above model is equivalent to:
$$m_{x,t} = A B^{t} C^{t^2}$$
stating the values of the parameters A, B and C.
(ii) The model had been fitted to existing data covering the years 1975 to 2017
inclusive. At age 55 the maximum likelihood estimates of the parameters
are: $\hat{a} = -6$, $\hat{b} = -0.007$, $\hat{c} = 0.00007$
and a plot of the predicted values of m55,t is shown in the graph below:
A colleague has commented that this model is not an adequate fit to the
observed data and suggests replacing the quadratic function with a cubic
spline function, again fitting a different function for each age.
(a) Set out the revised mortality projection model that uses a cubic spline
function as suggested by your colleague, defining all the symbols used.
(b) Give a possible reason for the inadequate fit of the original model and
explain how the use of the cubic spline function could improve the model
as suggested.
(c) A second colleague has challenged the use of cubic splines for this
purpose, arguing that the resulting fitted model tends to be too "rough".
Explain what is meant by "rough" in this context, and describe how the
method of p-splines could be used to help address this difficulty.
(iii) Describe the disadvantages of using the p-spline approach.
4. In a particular country, Y and Z are important terminal diseases that are

significant causes of death for men at older ages. The following represents a
Markov jump model of the process, for male lives aged 70, showing annual
constant transition rates:
(i) Calculate the probability that a healthy male life aged exactly 70 is dead
by the end of the coming year.
(ii) An early diagnosis of Disease Z can prevent the disease from entering the
terminal phase and can lead to a full recovery.
A national screening program has been planned that will increase the
rates of early diagnosis of Disease Z, and this is expected to reduce the
rate of contracting the terminal phase of the illness by 70% of the current
rate (i.e. the transition rate from H to Z in the above Markov model should
reduce by 70%). All other transition rates are expected to remain the same
as before.
Calculate the revised probability of dying over the year, and hence the
percentage reduction in the overall probability of mortality achieved.
(iii) Without performing any more calculations, explain whether a similar
screening program for Disease Y (which would reduce the transition
rate from H to Y by 70%) would result in a greater or lower percentage
reduction in the overall 1-year probability of mortality.
5. The following Lee-Carter mortality projection model is being fitted to some
historical data: $\ln m_{x,t} = a_x + b_x k_t + e_{x,t}$ where:
$m_{x,t}$ is the central mortality rate at age x in Year t
$a_x$ and $b_x$ are factors relating to mortality rates projected for age x
$k_t$ is a factor relating to mortality rates projected for Year t
$e_{x,t}$ is an independent and identically distributed error term.
In this particular model there are 37 different projection years ( t = 0,1,..., 36 ),
where t = 0 is the base calendar year for the projection.
(i) State the constraints that are typically imposed on the estimated values of
bx and kt when fitting the model.
(ii) A model has been fitted to the data, and it is found that estimated values
of $k_t$ are related as follows:
$$\hat{k}_t = \hat{k}_{t-1} - 0.01$$
Given that these values satisfy the overall constraints specified in part (i),
calculate the estimated values of k0 and k10 for this model.
(iii) The following ratio is used to show the projected change in mortality at a
particular age x over the first ten years of the projection:
$$\frac{\hat{m}_{x,10}}{\hat{m}_{x,0}}$$
where $\hat{m}_{x,t}$ is
the predicted mortality rate from the fitted model ignoring error terms.
(a) Calculate this ratio for the case where $\hat{b}_x = 1$.
(b) Three of the estimated values of $b_x$ are:
$\hat{b}_{50} = -0.14$, $\hat{b}_{65} = 0.28$, $\hat{b}_{75} = 1.30$
Calculate the values of the above ratio for x = 50, 65 and 75.
(iv) With reference to the values you have calculated in part (iii) or otherwise,
explain how the sign and magnitude of the value of the bx parameter
influences the impact of the assumed time trend on projected mortality
rates using the Lee-Carter model.
ANSWERS
1. (i) $\alpha_x$ is the lowest level, expressed as a proportion of the current mortality rate at
age x, to which the mortality rate at age x can reduce at any time in the future. $f_{n,x}$
is the proportion of the maximum possible reduction (of $(1 - \alpha_x)$) that is expected
to have occurred by n years' time.
(ii) Both parameters could be set by expert opinion, perhaps assisted by some
analysis of relevant recent observed mortality trends.
(iii)
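The working for part (iii) can be sketched as follows. This is an illustrative calculation rather than the original model answer: it assumes the usual reduction factor form R = alpha + (1 - alpha)(1 - f)^(t/n), takes alpha as the minimum rate divided by the base rate, and uses hypothetical variable names.

```python
def reduction_factor(alpha, f_n, n, t):
    """Assumed form: R = alpha + (1 - alpha) * (1 - f_n) ** (t / n)."""
    return alpha + (1.0 - alpha) * (1.0 - f_n) ** (t / n)


m_base = 0.006                 # m(60, 0) from the question
alpha_60 = 0.0012 / m_base     # lowest attainable proportion of the base rate
f_10_60 = 0.30                 # share of the maximum reduction achieved by year 10

r_20 = reduction_factor(alpha_60, f_10_60, n=10, t=20)
m_projected = m_base * r_20    # projected rate at age 60 in 20 years' time
print(round(r_20, 3), round(m_projected, 6))
```

Under these assumptions the reduction factor is 0.2 + 0.8 x 0.7^2 = 0.592, giving a projected rate of about 0.00355.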
(iv) Advantages
- The method is easy to implement.
Disadvantages
- The effect of such factors as lifestyle changes and prevention of hitherto
major causes of death are difficult to predict, as they have not occurred
before, and experts may fail to judge the extent of the impact of these on
future mortality adequately.
- Because the parameters are themselves target forecasts, there is a
circularity in the theoretical basis of the projection model (because
forecasts are being used to construct a model whose purpose should be to
produce those forecasts).
- Setting the target levels leads to an under-estimation of the true level of
uncertainty around the forecasts.
2. (i) Three-factor models have the logical problem that each factor is linearly
dependent on the other two. So we need to ensure that the three arguments of
the function work together in a consistent way in the formulae.
(ii) (a) In the Lee-Carter model:
ax is the mean value of ln(mx,t) averaged over all periods t
kt is the effect of time on mortality
bx is the extent to which mortality is affected by the time trend at age x
ex,t is the error term (independently and identically distributed with zero
mean and common variance).
(b) $\sum_x b_x = 1$ and $\sum_t k_t = 0$
(c) kt = 0.4, 0.38, .... - 0.38, - 0.4 for t =1990,1991, ...2029,2030 respectively
(iii)(a) Mortality rates at age 60 are assumed to be improving at three times the
rate at which they are improving at age 70.
(b) $b_{70} = 0.25$, $b_{60} = 0.75$, $a_{60} = -6.34244$, $a_{70} = -4.32150$
(c) $\hat{m}_{60,2025} = 0.00141$, $\hat{m}_{70,2025} = 0.01232$
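The figures in parts (ii)(c) and (iii) can be reproduced with a short script. This is a sketch under stated assumptions: the usual Lee-Carter constraints (the b values sum to 1 and the k values sum to 0), a drift of -0.02 per year in k, and error terms ignored. The variable and function names are hypothetical.

```python
from math import exp, log

YEARS = range(1990, 2031)            # 41 calendar years
DRIFT = -0.02                        # k falls by 0.02 each year
n_years = len(YEARS)

# Centre the linear sequence so that the k values sum to zero.
k_first = -DRIFT * (n_years - 1) / 2
k = {year: k_first + DRIFT * i for i, year in enumerate(YEARS)}

# b60 = 3 * b70 together with b60 + b70 = 1 (assumed constraint).
b70 = 1 / 4
b60 = 3 * b70

# k is zero in 2010, so a_x is the log of the fitted 2010 rate.
a60 = log(0.00176) - b60 * k[2010]
a70 = log(0.01328) - b70 * k[2010]


def projected_rate(a, b, year):
    """Fitted central mortality rate exp(a + b * k_year)."""
    return exp(a + b * k[year])


m60_2025 = projected_rate(a60, b60, 2025)
m70_2025 = projected_rate(a70, b70, 2025)
print(round(a60, 5), round(a70, 5), round(m60_2025, 5), round(m70_2025, 5))
```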
(iv) Future estimates of mortality at different ages are heavily dependent on the
original estimates of the parameters ax and bx , which are assumed to remain
constant into the future. These parameters are estimated from past data, and will
incorporate any roughness contained in the data. In particular, they may be
distorted by past period events, which might affect different ages to different
degrees. If the estimated bx values show variability from age to age, it is possible
for the forecast age-specific mortality rates to "cross over" (such that, for example,
projected rates may increase with age at one duration, but decrease with age at
the next).
There is a tendency for Lee-Carter forecasts to become increasingly rough over
time. The model assumes that the underlying rates of mortality change are
constant over time across all ages, when there is empirical evidence that this is
not so.
The Lee-Carter model does not include a cohort term, whereas there is evidence
from some countries that certain cohorts exhibit higher mortality improvements
than others.
Unless observed rates are used for the forecasting, it can produce "jump-off"
effects (ie an implausible jump between the most recent observed mortality rate
and the forecast for the first future period).
3. (i) $A = e^a$, $B = e^b$, $C = e^c$
(ii) (a) The mortality projection model would now be:
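$$\ln\left[E\left(\frac{D_{x,t}}{E^c_{x,t}}\right)\right] = \sum_{j=1}^{J} \theta_j B_j(t)$$
where there are J knots positioned at values $t_1, t_2, \ldots, t_J$, $\theta_j$ are parameters to
be fitted from the data, and:
$$B_j(t) = \begin{cases} 0 & t < t_j \\ (t - t_j)^3 & t \ge t_j \end{cases}$$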
(b) The trend in mortality over time is unlikely to follow a quadratic function,
even after it has been log-transformed, as in this model, because the progression
of predicted values is likely to be too smooth.
There may be significant variations in the trends in the past data that may be
relevant to future projections and which we would therefore like the model to
take into account.
Spline functions are very flexible models in terms of the shape of the function
being fitted.
Adherence to data can be improved both by increasing the number of knots
used, and by placing the knots in locations where the greatest changes in
curvature of the trend line occur.
However, some smoothing is still a requirement, and using cubic splines
generally produces the smoothest result (compared to using splines of higher
orders).
(c) The problem with splines is that they can be too flexible, and may cause the
model to include historical trend variations that are either short-term or past-
specific, and which are not expected to recur in future.
To include these features in the model may then be inappropriate or unhelpful
when we attempt to use the model for forecasting purposes.

One symptom of this over-adherence, or roughness, in the model, is that the
sequence of estimated parameters $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_J$ may form an uneven progression,
and smoothing this progression can help reduce the roughness in the predicted
values from the model.
The method of p-splines attempts to find an optimal model by introducing a
penalty for models that have excessive "roughness".
The method may be implemented as follows:
- Specify the knot spacing and degree of the polynomials in each spline.
- Define a roughness penalty, $P(\theta)$, which increases with the variability of
adjacent coefficients. This, in effect, measures the amount of roughness in
the fitted model.
- Define a smoothing parameter, $\lambda$, such that if $\lambda = 0$, there is no penalty
for increased roughness, but as $\lambda$ increases, roughness is increasingly
penalised.
- Estimate the parameters of the model, including the number of splines,
by maximising the penalised log-likelihood:
$$l_p(\theta) = l(\theta) - \tfrac{1}{2} \lambda P(\theta)$$
where $l(\theta)$ would be the usual log-likelihood for the model.
- The penalised log-likelihood is effectively trying to balance smoothness
and adherence to the data.
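The trade-off in the steps above can be illustrated with a small sketch. It is not the workbook's own implementation. It assumes a roughness penalty equal to the sum of squared second differences of adjacent coefficients (one common choice), a penalised log-likelihood of the form l minus one half of lambda times the penalty, and hypothetical function names and coefficient values.

```python
def roughness_penalty(theta):
    """Assumed penalty: sum of squared second differences of the coefficients."""
    return sum(
        (theta[j] - 2 * theta[j + 1] + theta[j + 2]) ** 2
        for j in range(len(theta) - 2)
    )


def penalised_log_likelihood(log_lik, theta, lam):
    """Usual log-likelihood reduced by half of lam times the roughness penalty."""
    return log_lik - 0.5 * lam * roughness_penalty(theta)


smooth_theta = [1.0, 2.0, 3.0, 4.0]   # evenly progressing coefficients
rough_theta = [1.0, 3.0, 2.0, 4.0]    # uneven progression

# Same illustrative log-likelihood for both fits, so only the penalty differs.
lp_smooth = penalised_log_likelihood(-100.0, smooth_theta, lam=2.0)
lp_rough = penalised_log_likelihood(-100.0, rough_theta, lam=2.0)
print(lp_smooth, lp_rough)
```

With lam set to 0 the two fits score the same, and as lam increases the uneven sequence is penalised more heavily, which is the balance between smoothness and adherence that the method is trying to strike.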
(iii) When applied to ages separately, mortality at different ages is forecast
independently so there is a danger that there will be roughness between adjacent
ages. There is no explanatory element to the projection (in the way that time
series methods use a structure for mortality and an identifiable time series for
projection). p-splines tend to be over-responsive to adding an extra year of data.
4. (i) 0.016639 (ii) q70 = 0.015311 which is 0.001328 lower than the previous value
of 0.016639. This is a reduction of 8.0%.
(iii) The reduction in mortality rate would be less, for two reasons:
(1) People with Disease Y live for longer on average than those with
disease Z. So, cutting the number of people contracting Disease Y
will have a proportionately lower impact on the total number dying
during the year compared to Disease Z (ie Z is a more serious
disease than Y, so reducing the incidence of Z should have the
bigger impact on mortality rates).
(2) The transition rate from H to Y is lower than from H to Z. So
reducing this rate to 30% of its current level will cause a smaller
reduction in the number of people contracting Disease Y over the
year. So, even if the mortality rates for the two diseases were the
same, the impact on the number of people dying would be less (ie
Disease Z is commoner than Disease Y, so there are fewer deaths

from Disease Y that can be prevented).
5. (i) $\sum_x b_x = 1$ and $\sum_t k_t = 0$ (ii) $\hat{k}_0 = 0.18$, $\hat{k}_{10} = 0.08$
(iii) (a) 0.905 (b) 1.014, 0.972, 0.878
(iv) When $\hat{b}_x = 1$, the projected change in mortality over time directly reflects the
change in the time trend function $\hat{k}_t$ over the specified time period (eg in this
model this leads to a 9.5% reduction in mortality over the first ten years of the
projection).
When $\hat{b}_x$ is positive, the change in mortality over time is in the same direction as
the time trend function (eg in this model positive $\hat{b}_x$ apply at ages 65 and 75 and
so mortality is projected to reduce over the ten-year projection period at these
ages).
When $\hat{b}_x$ is negative, the trend in mortality assumed at that age is in the opposite
direction to the time trend function in the model (eg in this model a negative
value of $\hat{b}_x$ applies at age 50 and so mortality rates are predicted to rise over the
ten-year period at this age).
When $0 < |\hat{b}_x| < 1$, the change in mortality over time is smaller in absolute terms
than the change in the time trend function (eg this applies at ages 50 and 65 in
this model, where changes in mortality of +1.4% and -2.8% respectively are
projected, both of which are less in absolute terms than the 9.5% change obtained
when $\hat{b}_x = 1$).
When $\hat{b}_x > 1$, the change in mortality over time is greater in absolute terms than
the change in the time trend function (eg in this model this applies at age 75,
where a reduction of 12.2% in mortality is projected for the ten-year period).
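The ratios quoted in part (iii) can be reproduced as follows. This sketch assumes that the fitted time index falls by 0.01 per year (consistent with the ratio 0.905 = exp(-0.1) when the b value is 1) and that the k values over t = 0, 1, ..., 36 sum to zero. The names used are hypothetical.

```python
from math import exp

N_YEARS = 37        # t = 0, 1, ..., 36
DRIFT = -0.01       # assumed annual change in the fitted k values

# Centre the linear sequence so that the k values sum to zero.
k0 = -DRIFT * (N_YEARS - 1) / 2
k = [k0 + DRIFT * t for t in range(N_YEARS)]


def ten_year_ratio(b):
    """Ratio of fitted rates at t = 10 and t = 0: exp(b * (k[10] - k[0]))."""
    return exp(b * (k[10] - k[0]))


ratios = {b: round(ten_year_ratio(b), 3) for b in (1.0, -0.14, 0.28, 1.30)}
print(round(k[0], 2), round(k[10], 2), ratios)
```

A negative b value gives a ratio above 1 (mortality rising against the trend), while a b value above 1 amplifies the ten-year reduction, matching the discussion in part (iv).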
STOCHASTIC PROCESSES
1. (i) Define each of the following examples of a stochastic process
(a) a symmetric simple random walk
(b) a compound Poisson process
(ii) For each of the processes in (i), classify it as a stochastic process according
to its state space and the time that it operates on. [UK April 2005]
2. (i) In the context of a stochastic process denoted by $\{X_t : t \in J\}$, define:

(a) state space (b) time set (c) sample path
(ii) Stochastic process models can be placed in one of four categories

according to whether the state space is continuous or discrete, and
whether the time set is continuous or discrete. For each of the four
categories:
(a) State a stochastic process model of that type.
(b) Give an example of a problem an actuary may wish to study using
a model from that category. [UK Sept 2005]
3. In the context of a stochastic process $\{X_t : t \in J\}$, explain the meaning of the
following conditions: (a) strict stationarity (b) weak stationarity [UK April 2006]
4. (i) Define the following types of a stochastic process:

(a) a Poisson process
(b) a compound Poisson process; and
(c) a general random walk
(ii) For each of the processes in (i), state whether it operates in continuous or
discrete time and whether it has a continuous or discrete state space.
(iii) For each of the processes in (i), describe one practical situation in which an
actuary could use such a process to model a real world phenomenon.
[UK Sept 2006]
5. (a) Define, in the context of stochastic processes, a:

1. mixed process 2. counting process
(b) Give an example application of each type of process. [UK April 2007]
6. (i) Define the following stochastic processes:

(a) Poisson process (b) compound Poisson process
(ii) Identify the circumstances in which a compound Poisson process is also a
Poisson process. [UK April 2008]
7. (i) Explain how the classification of stochastic processes according to the

nature of their state space and time space leads to a four way
classification.
(ii) For each of the four types of process:
(a) give an example of a statistical model
(b) write down a problem of relevance to the operation of:
• a food retailer
• a general insurance company [UK April 2009]
8. For each of the following processes:

counting process; general random walk;
compound Poisson process; Poisson process;
Markov jump chain.
(a) State whether the state space is discrete, continuous or can be either.
(b) State whether the time set is discrete, continuous, or can be either.
[UK April 2010]
9. Describe how a strictly stationary stochastic process differs from a weakly

stationary stochastic process. [UK April 2011]
10. (i) Define a general random walk.

(ii) State the conditions under which a general random walk would become a
simple random walk. [UK April 2012]
11. For both of the following sets of four stochastic processes, place each process in a
separate cell of the following table, so that each cell correctly describes the state
space and the time space of the process placed in it. Within each set, all four
processes should be placed in the table.
                            Time space
                            Discrete          Continuous
State space   Discrete
              Continuous
(a) General Random Walk, Compound Poisson Process, Counting Process,
Poisson Process
(b) Simple Random Walk, Compound Poisson Process, Counting Process, White
Noise [UK April 2013]
12. (i) Define a Poisson process.
A bus route in a large town has one bus scheduled every 15 minutes. Traffic
conditions in the town are such that the arrival times of buses at a particular bus
stop may be assumed to follow a Poisson process.
Mr Bean arrives at the bus stop at 12 midday to find no bus at the stop. He
intends to get on the first bus to arrive.
(ii) Determine the probability that the first bus will not have arrived by 1.00
pm the same day.
The first bus arrived at 1.10 pm but was full, so Mr Bean was unable to board it.
(iii) Explain how much longer Mr Bean can expect to wait for the second bus
to arrive.
(iv) Calculate the probability that at least two more buses will arrive between
1.10 pm and 1.20 pm. [UK Sept 2013]
13. For a simple random walk:

(i) Define the process.
(ii) Write down the nature of the state space and time space in which it
operates.
(iii) Describe an example of a practical application of the process.
[UK April 2015]
14. A football match between two teams, Team A and Team B, is being decided by a
penalty competition. Each team takes one penalty alternately. Team A goes first.
Let Xi be the total number of penalties scored by team A minus the total number
of penalties scored by team B after the ith penalty has been taken. If Xi = 2, team
A wins and the competition stops. If Xi = –2, team B wins and the competition
stops.
(i) Determine the possible sample paths for the process Xi for i = 1, 2, 3, 4.
Suppose the chance of team A scoring each of its penalties is 0.5, and the chance
of team B scoring each of its penalties is 0.4.
(ii) Determine the distribution of Xi for i = 2 and i = 3.
ANSWERS
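1. (i) (a) Xn = Y1 + Y2 + ... + Yn, where the steps Yj are mutually independent with
Pr[Yj = 1] = p and Pr[Yj = -1] = 1 - p.
(b) Let Nt be a Poisson process, t ≥ 0, and let Y1, Y2, …, Yj, …, be a sequence of i.i.d.
random variables. Then a compound Poisson process is defined by
Xt = Y1 + Y2 + ... + YNt, that is, the sum of Yj over j = 1, 2, …, Nt.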
(ii) (a) A simple random walk operates on discrete time and has a discrete state
space (the set of all integers, Z).
(b) A compound Poisson process operates on continuous time. It has a discrete or
continuous state space depending on whether the variables Yj are discrete or
continuous respectively.
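The compound Poisson definition in answer 1 can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name, the rate of 3 events per unit time, the horizon of 10 and the exponential jump sizes are all hypothetical choices made for illustration, not values from the question.

```python
import random


def simulate_compound_poisson(rate, horizon, draw_jump, rng):
    """Simulate X_t = Y_1 + ... + Y_{N_t} at time t = horizon.

    rate      -- constant intensity of the underlying Poisson process N_t
    horizon   -- the time t at which the process is observed
    draw_jump -- hypothetical callback returning one jump size Y_j
    rng       -- a random.Random instance, passed in for reproducibility
    Returns the pair (N_t, X_t).
    """
    n_events, total, clock = 0, 0.0, 0.0
    while True:
        # Inter-arrival times of a Poisson process are exponential(rate).
        clock += rng.expovariate(rate)
        if clock > horizon:
            return n_events, total
        n_events += 1
        total += draw_jump(rng)


rng = random.Random(2024)

# Continuous jump sizes (say, claim amounts) give a continuous state space.
claims, aggregate = simulate_compound_poisson(
    3.0, 10.0, lambda r: r.expovariate(0.01), rng
)

# Jumps identically equal to 1 reduce X_t to the Poisson count N_t itself.
count, unit_total = simulate_compound_poisson(3.0, 10.0, lambda r: 1, rng)
```

With unit jumps the simulated total always equals the event count, which is the special case in which a compound Poisson process is itself a Poisson process.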
2. (i) (a) The state space is the set of values which it is possible for each random
variable Xt to take.
(b) The time set is the set J, the times at which the process contains a random
variable Xt.
(c) A sample path is a joint realisation of the variables Xt for all t in J, that is a set
of values for Xt (at each time in the time set) calculated using the previous values
for Xt in the sample path.
(ii) Discrete State Space, Discrete Time

(a) Simple random walk, Markov chain, or any other suitable example
(b) Any reasonable example. For example: No Claims Discount systems, Credit
Rating at end of each year
Discrete State Space, Continuous Time

(a) Poisson process, Markov jump process, for example
(b) Any reasonable example. For example: Claims received by an insurer, Status
of pension scheme member
Continuous State Space, Discrete Time

(a) General random walk, time series, for example
(b) Any reasonable example. For example: Share prices at end of each trading
day, Inflation index
Continuous State Space, Continuous Time

(a) Brownian motion, diffusion or Itô process, for example. Compound Poisson
process if the defined state space is continuous.
(b) Any reasonable example. For example: Share prices during trading period,
Value of claims received by insurer
3. (a) For a process to be strictly stationary, the joint distributions of Xt1, Xt2, …, Xtn and
Xk+t1, Xk+t2, …, Xk+tn must be identical for all k, t1, t2, …, tn in J and all integers n. This
means that the statistical properties of the process remain unchanged over time.
(b) Because strict stationarity is difficult to test fully in real life, we also use the
less stringent condition of weak stationarity. Weak stationarity requires that the
mean of the process, E[Xt] = m(t), is constant and the covariance,
E[(Xs - m(s)) (Xt - m(t))], depends only on the time difference t-s.
4. (ii) (a) A Poisson process operates in continuous time and has a discrete state
space, the set of nonnegative integers.
(b) A compound Poisson process operates in continuous time. It has a discrete or
continuous state space depending on whether the variables Yj are discrete or
continuous respectively.
(c) A general random walk operates in discrete time. Again, this has a discrete or
continuous state space according to whether the variables Yj have a discrete or
continuous distribution.
(iii) (a) Examples of a Poisson process:

• claims arriving to an insurance company through time
• car accidents reported over time
• arrival of customers at a service point over time
(b) A standard example of a compound Poisson process used by actuaries is for

modelling the total amount of claims to an insurance company over time.
(c) Examples of a general random walk:

• modelling share prices daily
• inflation index, measured on say a monthly basis
5. Mixed process
(a) Is a stochastic process that operates in continuous time, which can also change
value at predetermined discrete instants.
(b) The number of contributors to a pension scheme can be modelled as a mixed
process with state space S = {1, 2, 3, ...} and time interval J = [0, ∞).
Counting process
(a) Is a process, X, in discrete or continuous time, whose state space is the natural
numbers {0, 1, 2, …}. X(t) is a non-decreasing function of t.
(b) Number of claims reported to an insurer by time t.
6. A compound Poisson process meets the conditions for being a Poisson process if
Yi is an indicator function OR if each Yi is identically 1 (which is a special case of
the indicator function)
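7.
Type of process: SS Discrete and TS Discrete
Statistical model: Markov chain
Problem of relevance to food retailer: Whether or not particular product out of stock at the end of each day
Problem of relevance to a general insurer: NCD

Type of process: SS Discrete and TS Continuous
Statistical model: Counting Process
Problem of relevance to food retailer: Rate of arrival of customers in shop
Problem of relevance to a general insurer: Number of claims received monitored continuously

Type of process: SS Continuous and TS Discrete
Statistical model: White Noise
Problem of relevance to food retailer: Value of goods in stock at the end of each day
Problem of relevance to a general insurer: Total amount insured on a certain type of policy valued at the end of each month

Type of process: SS Continuous and TS Continuous
Statistical model: Compound Poisson Process
Problem of relevance to food retailer: Volume (or value) of trade in shop over a continuous period of time
Problem of relevance to a general insurer: Value of claims arriving monitored continuously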
8.
Process                     State Space               Time Set
Counting Process            Discrete                  Discrete or Continuous
General Random Walk         Discrete or Continuous    Discrete
Compound Poisson Process    Discrete or Continuous    Continuous
Poisson Process             Discrete                  Continuous
Markov Jump Chain           Discrete                  Discrete
11. (a)
                            Time space
                            Discrete               Continuous
State space   Discrete      Counting process       Poisson process
              Continuous    General random walk    Compound Poisson process
(b)
                            Time space
                            Discrete               Continuous
State space   Discrete      Simple random walk     Counting process
              Continuous    White noise            Compound Poisson process
12. (ii) 0.0183 (iii) 15 minutes (iv) 0.1443
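The figures in answer 12 can be checked numerically. The sketch below assumes, as the question implies, a Poisson process with a rate of 4 buses per hour (one every 15 minutes on average); the function and variable names are illustrative only.

```python
import math

RATE_PER_HOUR = 4.0  # one bus every 15 minutes on average


def prob_no_arrivals(rate, duration):
    """P(N = 0) for a Poisson process with the given rate over `duration`."""
    return math.exp(-rate * duration)


def prob_at_least(k, rate, duration):
    """P(N >= k), computed as 1 minus the sum of the first k Poisson terms."""
    mean = rate * duration
    below = sum(
        math.exp(-mean) * mean ** n / math.factorial(n) for n in range(k)
    )
    return 1.0 - below


# (ii) No bus in the hour from 12 midday to 1.00 pm.
p_no_bus = prob_no_arrivals(RATE_PER_HOUR, 1.0)

# (iii) By the memoryless property, the expected further wait is 1/rate.
expected_wait_minutes = 60.0 / RATE_PER_HOUR

# (iv) At least two buses in the ten minutes from 1.10 pm to 1.20 pm.
p_two_or_more = prob_at_least(2, RATE_PER_HOUR, 10.0 / 60.0)
```

Part (iii) uses only the memoryless property: the time already waited, and the arrival of the first bus, do not change the distribution of the wait for the next one.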
13. (i) This is defined as Xn = Y1 + Y2 + ... + Yn where the random variables Yj (the steps
of the walk) are mutually independent with the common probability distribution:
Pr[Yj = 1] = p, Pr[Yj = -1] = 1 - p.
(ii) It operates in discrete time with a discrete state space.
(iii) Any reasonable practical application e.g. cumulative results of the Oxford vs
Cambridge boat race (net lead of Cambridge over Oxford) measured annually.
OR how much a gambler has won or lost if he wins or loses £1 on every bet
14. (i) Team A goes first, so at i = 1 the process can have the values 1 (if Team A
scores) or 0 (if Team A misses).
Team B then has a go. If Team B scores, then X2 = X1 – 1.
If Team B misses, then X2 = X1.
Team A then has another go. If Team A scores, then X3 = X2 + 1.
If Team A misses, then X3 = X2.
If X3 = 2 the competition stops. Otherwise Team B takes the fourth penalty: if Team B
scores, then X4 = X3 – 1, and if Team B misses, then X4 = X3.
Hence possible sample paths for Xi (i = 1, 2, 3, 4) are:
0, 0, 0, 0      0, 0, 0, –1     0, 0, 1, 0      0, 0, 1, 1
0, –1, 0, 0     0, –1, 0, –1    0, –1, –1, –1   0, –1, –1, –2
1, 0, 0, –1     1, 0, 0, 0      1, 0, 1, 0      1, 0, 1, 1
1, 1, 1, 0      1, 1, 1, 1      1, 1, 2 (process ends at i = 3)
(ii)
x P(X2 = x) P(X3 = x)
-1 0.2 0.1
0 0.5 0.35
1 0.3 0.4
2 0 0.15
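The table in answer 14 (ii) can be reproduced by pushing the distribution of Xi forward one penalty at a time. This is a minimal sketch; the function names are illustrative, and the only inputs are the scoring probabilities 0.5 and 0.4 and the stopping rule at ±2 from the question.

```python
from collections import defaultdict

P_A_SCORES = 0.5
P_B_SCORES = 0.4


def advance(dist, i):
    """Move from the distribution of X_{i-1} to that of X_i.

    Team A takes the odd-numbered penalties (a goal adds 1) and
    Team B takes the even-numbered ones (a goal subtracts 1).
    """
    a_shoots = i % 2 == 1
    p_score = P_A_SCORES if a_shoots else P_B_SCORES
    change = 1 if a_shoots else -1
    new = defaultdict(float)
    for x, prob in dist.items():
        if abs(x) == 2:
            # The competition has already stopped; the value is frozen.
            new[x] += prob
            continue
        new[x + change] += prob * p_score
        new[x] += prob * (1.0 - p_score)
    return dict(new)


def distribution(i):
    """Distribution of X_i, starting from X_0 = 0 with certainty."""
    dist = {0: 1.0}
    for k in range(1, i + 1):
        dist = advance(dist, k)
    return dist


dist_x2 = distribution(2)
dist_x3 = distribution(3)
```

Each call to `distribution` returns a dictionary mapping the attainable values of Xi to their probabilities, so the two columns of the table can be read off from `dist_x2` and `dist_x3`.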
MARKOV CHAINS
1. Let Y1, Y3, Y5,…, be a sequence of independent and identically distributed
random variables with ( ) ( ) and define
for k = 1, 2, 3,…
(i) Show that {Yk : k =1, 2,...} is a sequence of independent and identically
distributed random variables.
Hint: You may use the fact that, if X, Y are two variables that take only
two values and E(XY) = E(X)E(Y), then X, Y are independent.
(ii) Explain whether or not {Yk : k = 1, 2,...} constitutes a Markov chain.
(iii) (a) State the transition probabilities pij(n) = P(Ym+n = j |Ym = i) of the
sequence {Yk : k = 1, 2,... }
(b) Hence show that these probabilities do not depend on the current
state and that they satisfy the Chapman-Kolmogorov equations.
[UK April 2005]
2. A No-Claims Discount system operated by a motor insurer has the following

four levels:
Level 1: 0% discount
The rules for moving between these levels are as follows:
• Following a year with no claims, move to the next higher level, or remain
at level 4.
• Following a year with one claim, move to the next lower level, or remain
at level 1.
• Following a year with two or more claims, move back two levels, or move
to level 1 (from level 2) or remain at level 1.
For a given policyholder the probability of no claims in a given year is 0.85 and
the probability of making one claim is 0.12.
X(t) denotes the level of the policyholder in year t.
(i) (a) Explain why X(t) is a Markov chain.

(b) Write down the transition matrix of this chain.
(ii) Calculate the probability that a policyholder who is currently at level 2

will be at level 2 after:
(a) one year (b) two years (c) three years
(iii) Explain whether the chain is irreducible and/or aperiodic.
(iv) Calculate the long-run probability that a policyholder is in discount level

2. [UK April 2005]
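A numerical check for a question of this type can be sketched with plain matrix powers. The transition matrix below is one reading of the stated rules (probabilities 0.85, 0.12 and hence 0.03 for two or more claims) and should be compared with your own answer to part (i)(b); the helper names are illustrative, and using the 200th power as a stand-in for the long-run distribution is an assumption that relies on the chain converging.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(p, steps):
    """p raised to a non-negative integer power by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


P_NONE, P_ONE = 0.85, 0.12
P_MANY = 1.0 - P_NONE - P_ONE  # two or more claims

# Rows and columns are levels 1 to 4, in that order.
NCD = [
    [P_ONE + P_MANY, P_NONE, 0.0, 0.0],
    [P_ONE + P_MANY, 0.0, P_NONE, 0.0],
    [P_MANY, P_ONE, 0.0, P_NONE],
    [0.0, P_MANY, P_ONE, P_NONE],
]

LEVEL_2 = 1  # zero-based index of level 2

# Probability of being at level 2 after 1, 2 and 3 years, starting at level 2.
stay_probs = [mat_pow(NCD, n)[LEVEL_2][LEVEL_2] for n in (1, 2, 3)]

# Approximate long-run distribution: any row of a high power of the matrix.
long_run = mat_pow(NCD, 200)[LEVEL_2]
```

The entry `long_run[LEVEL_2]` is then the approximate long-run probability of being at level 2, which can be compared with the value found by solving the stationary equations by hand.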
3. A die is rolled repeatedly. Consider the following two sequences:
I Bn is the largest number rolled in the first n outcomes.
II Cn is the number of sixes rolled in the first n outcomes.
For each of these two sequences:
(a) Explain why it is a Markov chain.

(b) Determine the state space of the chain.
(c) Derive the transition probabilities.
(d) Explain whether the chain is irreducible and/or aperiodic.
(e) Describe the equilibrium distribution of the chain. [UK Sept 2005]
4. A motor insurer's No Claims Discount system uses the following levels of

discount {0%, 25%, 40%, 50%}. Following a claim free year a policyholder moves
up one discount level (or remains on 50% discount). If the policyholder makes
one (or more) claims in a year they move down one level (or remain at 0%
discount).
The insurer estimates that the probability of making at least one claim in a year is
0.1 if the policyholder made no claims the previous year, and 0.25 if they made a
claim the previous year.
New policyholders should be ignored.
(i) Explain why the system with state space {0%, 25%, 40%, 50%} does not
form a Markov chain.
(ii) (a) Show how a Markov chain can be constructed by the introduction
of additional states.
(b) Write down the transition matrix for this expanded system, or
draw its transition diagram.
(iii) Comment on the appropriateness of the current No Claims Discount
system. [UK April 2006]
5. Employees of a company are given a performance appraisal each year. The

appraisal results in each employee's performance being rated as High (H),
Medium (M) or Low (L). From evidence using previous data it is believed that
the performance rating of an employee evolves as a Markov chain with transition
matrix:
for some parameter α.
(i) Draw the transition graph of the chain.
(ii) Determine the range of values for α for which the matrix P is a valid
transition matrix.
(iii) Explain whether the chain is irreducible and/or aperiodic.
(iv) For α = 0.2, calculate the proportion of employees who, in the long run,
are in state L.
(v) Given that α = 0.2, calculate the probability that an employee's rating in the
third year, X3, is L:
(a) in the case that the employee's rating in the first year, X1, is H
(b) in the case X1 = M
(c) in the case X1 = L [UK April 2006]
6. The credit-worthiness of debt issued by companies is assessed at the end of each

year by a credit rating agency. The ratings are A (the most credit-worthy), B and
D (debt defaulted). Historic evidence supports the view that the credit rating of a
debt can be modelled as a Markov chain with one-year transition matrix
X=
(i) Determine the probability that a company rated A will never be rated B in
the future.
(ii) (a) Calculate the second order transition probabilities of the Markov
chain.
(b) Hence calculate the expected number of defaults within the next
two years from a group of 100 companies, all initially rated A.
The manager of a portfolio investing in company debt follows a downgrade

trigger strategy. Under this strategy, any debt in a company whose rating has
fallen to B at the end of a year is sold and replaced with debt in an A-rated
company.
(iii) Calculate the expected number of defaults for this investment manager
over the next two years, given that the portfolio initially consists of 100 A-
rated bonds.
(iv) Comment on the suggestion that the downgrade trigger strategy will
improve the return on the portfolio. [UK Sept 2006]
7. A manufacturer uses a test rig to estimate the failure rate in a batch of electronic
components. The rig holds 100 components and is designed to detect when a
component fails, at which point it immediately replaces the component with
another from the same batch. The following are recorded for each of the n
components used in the test (i = 1, 2, …, n):
si = time at which component i placed on the rig
ti = time at which component i removed from rig
The test rig was fully loaded and was run for two years continuously.
You should assume that the force of failure, , of a component is constant and
component failures are independent.
(i) Show that the contribution to the likelihood from component i is:
( ( ))
(ii) Derive the maximum likelihood estimator for . [UK Sept 2006]
8. A motor insurance company wishes to estimate the proportion of policyholders

who make at least one claim within a year. From historical data, the company
believes that the probability a policyholder makes a claim in any given year
depends on the number of claims the policyholder made in the previous two
years. In particular:
• the probability that a policyholder who had claims in both previous years
will make a claim in the current year is 0.25
• the probability that a policyholder who had claims in one of the previous
two years will make a claim in the current year is 0.15; and
• the probability that a policyholder who had no claims in the previous two
years will make a claim in the current year is 0.1
(i) Construct this as a Markov chain model, identifying clearly the states of
the chain.
(ii) Write down the transition matrix of the chain.
(iii) Explain why this Markov chain will converge to a stationary distribution.
(iv) Calculate the proportion of policyholders who, in the long run, make at
least one claim at a given year. [UK Sept 2006]
9. A three state process with state space {A, B, C} is believed to follow a Markov
chain with the following possible transitions:
An instrument was used to monitor this process, but it was set up incorrectly and
only recorded the state occupied after every two time periods. From these
observations the following two-step transition probabilities have been estimated:
P2AA = 0.5625
P2AB = 0.125
P2BA = 0.475
P2CC = 0.4
Calculate the one-step transition matrix consistent with these estimates.
[UK April 2007]
10. Every person has two chromosomes, each being a copy of one of the
chromosomes from one of their parents. There are two types of chromosomes
labelled X and Y. A child born with an X and a Y chromosome is male and a child
with two X chromosomes is female.
The blood-clotting disorder haemophilia is caused by a defective X chromosome

(X*). A female with the defective chromosome (X*X) will not usually exhibit
symptoms of the disease but may pass the defective gene to her children and so
is known as a carrier. A male with the defective chromosome (X*Y) suffers from
the disease and is known as a haemophiliac.
A medical researcher wishes to study the progress of the disease through the first
born child in each generation, starting with a female carrier.
You may assume:
• every parent has an equal chance of passing either of their chromosomes to
their children
• the partner of each person in the study does not carry a defective X
chromosome; and
• no new genetic defects occur
(i) Show that the expected progress of the disease through the generations
may be modelled as a Markov chain and specify carefully:
(a) the state space; and
(b) the transition diagram
(ii) State, with reasons, whether the chain is:

(a) irreducible; and
(b) aperiodic
(iii) Calculate the stationary distribution of the Markov chain. [UK April 2007]
11. A no-claims discount system has 3 levels of discount: 0%, 25% and 50%. The
rules for moving between discount levels are:
• After a claim-free year, move up to the next higher level or remain at the
50% discount level.
• After a year with one or more claims, move down to the next lower level
or remain at the 0% discount level.
The long-run probability that a policyholder is in the maximum discount level is
0.75.
Calculate the probability that a given policyholder has a claim-free year,
assuming that this probability is constant. [UK Sept 2007]
12. In a game of tennis, when the score is at "Deuce" the player winning the next
point holds "Advantage". If a player holding "Advantage" wins the following
point that player wins the game, but if that point is won by the other player the
score returns to "Deuce".
When Andrew plays tennis against Ben, the probability of Andrew winning any
point is 0.6. Consider a particular game when the score is at "Deuce".
(i) Show that the subsequent score in the game can be modelled as a Markov
Chain, specifying both:
(a) the state space
(b) the transition matrix
(ii) State, with reasons, whether the chain is:
(a) irreducible; and
(b) aperiodic
(iii) Calculate the number of points, which must be played before there is
more than a 90% chance of the game having been completed.
(iv) (a) Calculate the probability that Andrew wins the game.
(b) Comment on your answer. [UK Sept 2007]
13. In a certain small country all listed companies are required to have their accounts
audited on an annual basis by one of the three authorised audit firms (A, B and
C). The terms of engagement of each of the audit firms require that a minimum
of two annual audits must be conducted by the newly appointed firm. Whenever
a company is able to choose to change auditors, the likelihood that it will retain
its auditors for a further year is (80%, 70%, 90%) where the current auditor is
(A,B,C) respectively. If changing auditors a company is equally likely to choose
either of the alternative firms.
(i) A company has just changed auditors to firm A. Calculate the expected
number of audits which will be undertaken before the company changes
auditors again.
(ii) Formulate a Markov chain which can be used to model the audit firm
used by a company, specifying:
(a) the state space
(b) the transition matrix
(iii) Calculate the expected proportion of companies using each audit firm in
the long term. [UK April 2008]
14. A No-Claims Discount system operated by a motor insurer has the following
four levels:
The rules for moving between these levels are as follows:
• Following a year with no claims, move to the next higher level, or remain
at level 4.
• Following a year with one claim, move to the next lower level, or remain
at level 1.
• Following a year with two or more claims, move down two levels, or
move to level 1 (from level 2) or remain at level 1.
For a given policyholder in a given year the probability of no claims is 0.85 and
the probability of making one claim is 0.12.
(i) Write down the transition matrix of this No-Claims Discount process.
(ii) Calculate the probability that a policyholder who is currently at level 2
will be at level 2 after:
(a) one year.
(b) two years.
(iii) Calculate the long-run probability that a policyholder is in discount level
2. [UK Sept 2008]
15. Consider the random variable defined by Xn = Y1 + Y2 + ... + Yn, with each Yi mutually
independent with probability:
P[Yi = 1] = p, P[Yi = -1] = 1 - p, 0 < p < 1
(i) Write down the state space and transition graph of the sequence Xn.
(ii) State, with reasons, whether the process:
(a) is aperiodic.
(b) is reducible.
(c) admits a stationary distribution.
Consider j > i > 0.
(iii) Derive an expression for the number of upward movements in the
sequence Xn between t and (t + m) if Xt= i and Xt+m= j.
(iv) Derive expressions for the m-step transition probabilities pij(m).
(v) Show how the one-step transition probabilities would alter if Xn was
restricted to non-negative numbers by introducing:
(a) a reflecting boundary at zero.
(b) an absorbing boundary at zero.
(vi) For each of the examples in part (v), explain whether the transition
probabilities pij(m) would increase, decrease or stay the same.
(Calculation of the transition probabilities is not required.) [UK Sept 2008]
16. (i) Explain what is meant by a time-homogeneous Markov chain.
Consider the time-homogeneous two-state Markov chain with transition matrix:

. /
(ii) Explain the range of values that a and b can take which result in this being
a valid Markov chain which is:
(a) irreducible (b) periodic [UK April 2009]
17. A motor insurer operates a no claims discount system with the following levels
of discount {0%, 25%, 50%, 60%}.
The rules governing a policyholder's discount level, based upon the number of
claims made in the previous year, are as follows:
• Following a year with no claims, the policyholder moves up one discount
level, or remains at the 60% level.
• Following a year with one claim, the policyholder moves down one
discount level, or remains at 0% level.
• Following a year with two or more claims, the policyholder moves down
two discount levels (subject to a limit of the 0% discount level).
The number of claims made by a policyholder in a year is assumed to follow a

Poisson distribution with mean 0.30.
(i) Determine the transition matrix for the no claims discount system.
(ii) Calculate the stationary distribution of the system, π.
(iii) Calculate the expected average long term level of discount.
The following data shows the number of the insurer's 130,200 policyholders in
the portfolio classified by the number of claims each policyholder made in the
last year.
This information was used to estimate the mean of 0.30.
No claims 96,632
One claim 28,648
Two claims 4,400
Three claims 476
Four claims 36
Five claims 8
(iv) Test the goodness of fit of these data to a Poisson distribution with mean
0.30.
(v) Comment on the implications of your conclusion in (iv) for the average
level of discount applied. [UK April 2009]
18. (i) State the Markov property.

A stochastic process X(t) operates with state space S.
(ii) Prove that if the process has independent increments it satisfies the
Markov property.
(iii) (a) Describe the difference between a Markov chain and a Markov
jump process.
(b) Explain what is meant by a Markov chain being irreducible.

An actuarial student can see the office lift (elevator) from his desk. The lift has an
indicator which displays on which of the office's five floors it is at any point in
time. For light relief the student decides to construct a model to predict the
movements of the lift.
(iv) Explain whether it would be appropriate to select a model which is:
(a) irreducible (b) has the Markov property [UK Sept 2009]
19. A firm rents cars and operates from three locations — the Airport, the Beach and
the City. Customers may return vehicles to any of the three locations.
The company estimates that the probability of a car being returned to each
location is as follows:
                    Car returned to
Car hired from      Airport    Beach    City
Airport             0.5        0.25     0.25
Beach               0.25       0.75     0
City                0.25       0.25     0.5
(i) Calculate the 2-step transition matrix.
(ii) Calculate the stationary distribution π.
It is suggested that the cars should be based at each location in proportion to the
stationary distribution.
(iii) Comment on this suggestion.
(iv) Sketch, using your answers to parts (i) and (ii), a graph showing the
probability that a car currently located at the Airport is subsequently at
the Airport, Beach or City against the number of times the car has been
rented. [UK Sept 2009]
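Answers to a question like this one can be checked by iterating the chain directly. The sketch below uses the return probabilities from the table in the question; the helper names and the choice of 100 rentals as "long run" are illustrative assumptions.

```python
STATES = ["Airport", "Beach", "City"]

# One-step transition matrix: row = hired from, column = returned to.
P = [
    [0.5, 0.25, 0.25],
    [0.25, 0.75, 0.0],
    [0.25, 0.25, 0.5],
]


def two_step(p):
    """The 2-step transition matrix, i.e. p multiplied by itself."""
    n = len(p)
    return [
        [sum(p[i][k] * p[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def next_distribution(dist, p):
    """One rental later: multiply the row vector dist by p."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


P2 = two_step(P)

# Distribution of a car that starts at the Airport, after each rental.
dist = [1.0, 0.0, 0.0]
path = [dist]
for _ in range(100):
    dist = next_distribution(dist, P)
    path.append(dist)

# After many rentals the distribution settles near the stationary one.
stationary_estimate = path[-1]
```

The list `path` holds the three probabilities after 0, 1, 2, ... rentals, which are the points needed for the sketch in part (iv): each curve starts at 1 or 0 and levels off at the corresponding stationary probability.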
20. A Markov Chain with state space {A, B, C} has the following properties:
• it is irreducible
• it is periodic
• the probability of moving from A to B equals the probability of moving
from A to C
(i) Show that these properties uniquely define the process.
(ii) Sketch a transition diagram for the process. [UK April 2010]
21. An airline runs a frequent flyer scheme with four classes of member: in
ascending order Ordinary, Bronze, Silver and Gold. Members receive benefits
according to their class. Members who book two or more flights in a given
calendar year move up one class for the following year (or remain Gold
members), members who book exactly one flight in a given calendar year stay at
the same class, and members who book no flights in a given calendar year move
down one class (or remain Ordinary members).
Let the proportions of members booking 0, 1 and 2+ flights in a given year be p0,
p1 and p2+ respectively.
(i) (a) Explain how this scheme can be modelled as a Markov chain.
(b) Explain why there must be a unique stationary distribution for the
proportion of members in each class.
(ii) Write down the transition matrix of the process.
The airline's research has shown that in any given year, 40% of members book no
flights, 40% book exactly one flight, and 20% book two or more flights.
(iii) Calculate the stationary probability distribution.
The cost of running the scheme per member per year is as follows:
Ordinary members £0
Bronze members £10
Silver members £20
Gold members £30
The airline makes a profit of £10 per passenger for every flight before taking into
account costs associated with the frequent flyer scheme.
(iv) Assess whether the airline makes a profit on the members of the scheme.
[UK April 2010]
22. A pet shop has four glass tanks in which snakes for sale are held. The shop can
stock at most four snakes at any one time because:
• if more than one snake were held in the same tank, the snakes would
attempt to eat each other and
• having snakes loose in the shop would not be popular with the
neighbours
The number of snakes sold by the shop each day is a random variable with the
following distribution:
Number of Snakes Potentially Sold
in Day (if stock is sufficient)        Probability
None                                   0.4
One                                    0.4
Two                                    0.2
If the shop has no snakes in stock at the end of a day, the owner contacts his
snake supplier to order four more snakes. The snakes are delivered the following
morning before the shop opens. The snake supplier makes a charge of C for the
delivery.
(i) Write down the transition matrix for the number of snakes in stock when
the shop opens in a morning, given the number in stock when the shop
opened the previous day.
(ii) Calculate the stationary distribution for the number of snakes in stock
when the shop opens, using your transition matrix in part (i).
(iii) Calculate the expected long term average number of restocking orders
placed by the shop owner per trading day.
If a customer arrives intending to purchase a snake, and there is none in stock,
the sale is lost to a rival pet shop.
(iv) Calculate the expected long term number of sales lost per trading day.
The owner is unhappy about losing these sales as there is a profit on each sale of
P. He therefore considers changing his restocking approach to place an order
before he has run out of snakes. The charge for the delivery remains at C
irrespective of how many snakes are delivered.
(v) Evaluate the expected number of restocking orders, and number of lost
sales per trading day, if the owner decides to restock if there are fewer
than two snakes remaining in stock at the end of the day.
(vi) Explain why restocking when two or more snakes remain in stock cannot
optimise the shop's profits.
The pet shop owner wishes to maximise the profit he makes on snakes.
(vii) Derive a condition in terms of C and P under which the owner should
change from only restocking where there are no snakes in stock, to
restocking when there are fewer than two snakes in stock. [UK Sept 2010]
23. Distinguish between the conditions under which a Markov chain:

(a) has at least one stationary distribution.
(b) has a unique stationary distribution.
(c) converges to a unique stationary distribution. [UK April 2011]
24. Children at a school are given weekly grade sheets, in which their effort is
graded in four levels: 1 "Poor", 2 "Satisfactory", 3 "Good" and 4 "Excellent".
Subject to a maximum level of Excellent and a minimum level of Poor, between
each week and the next, a child has:
• a 20 per cent chance of moving up one level.
• a 20 per cent chance of moving down one level.
• a 10 per cent chance of moving up two levels.
• a 10 per cent chance of moving down two levels.
Moving up or down three levels in a single week is not possible.
(i) Write down the transition matrix of this process.
Children are graded on Friday afternoon in each week. On Friday of the first
week of the school year, as there is little evidence on which to base an
assessment, all children are graded "Satisfactory".
(ii) Calculate the probability distribution of the process after the grading on
Friday of the third week of the school year. [UK April 2011]
25. Farmer Giles makes hay each year and he makes far more than he could possibly
store and use himself, but he does not always sell it all. He has decided to offer
incentives for people to buy large quantities so it does not sit in his field
deteriorating. He has devised the following "discount" scheme.
He has a Base price, B of £8 per bale. Then he has three levels of discount: Good
price, G, is a 10% discount, Loyalty price, L is a 20% discount and Super price, S,
is a 25% discount on the Base price.
• Customers who increase their order compared with last year move to one
higher discount level, or remain at level S.
• Customers who maintain their order from last year stay at the same
discount level.
• Customers who reduce their order from last year drop one level of
discount or remain at level B provided that they maintained or increased
their order the previous year.

• Customers who reduce their order from last year drop two levels of
discount if they also reduced their order last year, subject to remaining at
the lowest level B.
(i) Explain why a process with the state space of {B, G, L, S} does not display
the Markov property.
(ii) (a) Define any additional state(s) required to model the system with
the Markov property.
(b) Construct a transition graph of this Markov process clearly
labeling all the states.
Farmer Giles thinks that each year customers have a 60% likelihood of increasing
their order and a 30% likelihood of reducing it, irrespective of the discount level
they are currently in.
(iii) (a) Write down the transition matrix for the Markov process.
(b) Calculate the stationary distribution.
(c) Hence calculate the long run average price he will get for each bale
of hay.
(iv) Calculate the probability that a customer who is currently paying the
Loyalty price, L, will be paying L in two years' time.
(v) Suggest reasons why the assumptions Farmer Giles has made about his
customers' behaviour may not be valid. [UK April 2011]
26. The diagrams below show three Markov chains, where arrows indicate a non-
zero transition probability.
State whether each of the chains is:
(a) irreducible.
(b) periodic, giving the period where relevant.
[UK Sept 2011]
27. An actuary walks from his house to the office each morning, and walks back
again each evening. He owns two umbrellas. If it is raining at the time he sets off,
and one or both of his umbrellas is available, he takes an umbrella with him.
However if it is not raining at the time he sets off he always forgets to take an
umbrella.
Assume that the probability of it raining when he sets off on any particular
journey is a constant p, independent of other journeys.
This situation is examined as a Markov Chain with state space {0,1,2}
representing the number of his umbrellas at the actuary's current location (office
or home) and each time step representing one journey.
(i) Explain why the transition graph for this process is given by:
(ii) Derive the transition matrix for the number of umbrellas at the actuary's
house before he leaves each morning, based on the number before he
leaves the previous morning.
(iii) Calculate the stationary distribution for the Markov Chain.
(iv) Calculate the long run proportion of journeys (to or from the office) on
which the actuary sets out in the rain without an umbrella.
The actuary considers that the weather at the start of a journey, rather than being
independent of past history, depends upon the weather at the start of the
previous journey. He believes that if it was raining at the start of a journey the
probability of it raining at the start of the next journey is r (0 < r <1), and if it was
not raining at the start of a journey the probability of it raining at the start of the
next journey is s (0 < s < 1, r ≠ s).
(v) Write down the transition matrix for the Markov Chain for the weather.
(vi) Explain why the process with three states {0,1,2}, being the number of his
umbrellas at the actuary's current location, would no longer satisfy the
Markov property.
(vii) Describe the additional state(s) needed for the Markov property to be
satisfied, and draw a transition diagram for the expanded system.
[UK Sept 2011]
28. The series Yi records, for each time period i, whether a car driver is accident free
during that period (Yi = 0) or has at least one accident (Yi = 1).
Define Xi = ∑ Yj (summed over j = 1, …, i), with state space {0, 1, 2,…}.
An insurer makes an assumption about the driver's accident proneness by
considering that the probability of a driver having at least one accident is related
to the proportion of previous time periods in which the driver had at least one
accident as follows:
( ) . / ( )
(i) Demonstrate that the series Xi satisfies the Markov property, whilst Yi
does not.
(ii) Explain whether the series Xi is:
(a) irreducible
(b) time homogeneous
(iii) Draw the transition graph for Xi covering all transitions which could
occur in the first three time periods, including the transition probabilities.
(iv) Calculate the probability that the driver has accidents during exactly two
of the first three time periods.
(v) Comment on the appropriateness of the insurer's assumption about
accident proneness. [UK April 2012]
29. A company operates a sick pay scheme as follows:

• Healthy employees pay a percentage of salary to fund the scheme.
• For the first two consecutive months an employee is sick, the sick pay
scheme pays their full salary.
• For the third and subsequent consecutive months of sickness the sick pay
is reduced to 50% of full salary.
To simplify administration the scheme operates on whole months only, that is for
a particular month's payroll an employee is either healthy or sick for the purpose
of the scheme.
The company's experience is that 10% of healthy employees become sick the
following month, and that sick employees have a 75% chance of being healthy
the next month.
The scheme is to be modelled using a Markov Chain.
(i) Explain what is meant by a Markov Chain.
(ii) Identify the minimum number of states under which the payments under
the scheme can be modelled using a time homogeneous Markov Chain,
specifying these states.
(iii) Draw a transition graph for this Markov chain.
(iv) Derive the stationary distribution for this process.
(v) Calculate the minimum percentage of salary which healthy employees
should pay for the scheme to cover the sick pay costs.
(vi) Calculate the contributions required if, instead, sick pay continued at
100% of salary indefinitely.
(vii) Comment on the benefit to the scheme of the reduction in sick pay to 50%
from the third month. [UK April 2012]
30. A no claims discount system operates with three levels of discount, 0%, 15% and
40%. If a policyholder makes no claim during the year he moves up a level of
discount (or remains at the maximum level). If he makes one claim during the
year he moves down one level of discount (or remains at the minimum level) and
if he makes two or more claims he moves down to, or remains at, the minimum
level.
The probability for each policyholder of making two or more claims in a year is
25% of the probability of making only one claim.
The long-term probability of being at the 15% level is the same as the long-term
probability of being at the 40% level.
(i) Derive the probability of a policyholder making only one claim in a given
year.
(ii) Determine the probability that a policyholder at the 0% level this year will
be at the 40% level after three years.
(iii) Estimate the probability that a policyholder at the 0% level this year will
be at the 40% level after 20 years, without calculating the associated
transition matrix. [UK Sept 2012]
31. (i) Define the stationary distribution of a Markov chain.

A baseball stadium hosts a match each evening. As matches take place in the
evening, floodlights are needed. The floodlights have a tendency to break down.
If the floodlights break down, the game has to be abandoned and this costs the
stadium $10,000. If the floodlights work throughout one match there is a 5%
chance that they will fail and lead to the abandonment of the next match.
The stadium has an arrangement with the Floodwatch repair company who are
brought in the morning after a floodlight breakdown and charge $1,000 per day.
There is a 60% chance they are able to repair the floodlights such that the evening
game can take place and be completed without needing to be abandoned. If they
are still broken the repair company is used (and paid) again each day until the
lights are fixed, with the same 60% chance of fixing the lights each day.
(ii) Write down the transition matrix for the process which describes whether
the floodlights are working or not.
(iii) Derive the long run proportion of games which have to be abandoned.
The stadium manager is unhappy with the number of games being abandoned,
and contacts the Light Fantastic repair company who are estimated to have an
80% chance of repairing floodlights each day. However Light Fantastic will
charge more than Floodwatch.
(iv) Calculate the maximum amount the stadium should be prepared to pay
Light Fantastic to improve profitability. [UK Sept 2012]
32. (i) Explain what is meant by a time inhomogeneous Markov chain and give
an example of one.
A No Claims Discount system is operated by a car insurer. There are four levels
of discount: 0%, 10%, 25% and 40%. After a claim-free year a policy holder moves
up one level (or remains at the 40% level). If a policy holder makes one claim in a
year he or she moves down one level (or remains at the 0% level). A policy
holder who makes more than one claim in a year moves down two levels (or
moves to or remains at the 0% level). Changes in level can only happen at the
end of each year.
(ii) Describe, giving an example, the nature of the boundaries of this process.
(iii) (a) State how many states are required to model this as a Markov
chain.
(b) Draw the transition graph.
The probability of a claim in any given month is assumed to be constant at 0.04.
At most one claim can be made per month and claims are independent.
(iv) Calculate the proportion of policyholders in the long run who are at the
25% level.
(v) Discuss the appropriateness of the model. [UK April 2013]
33. The two football teams in a particular city are called United and City and there is
intense rivalry between them. A researcher has collected the following history on
the results of the last 20 matches between the teams from the earliest to the most
recent, where:
U indicates a win for United;
C indicates a win for City;
D indicates a draw.
UCCDDUCDCUUDUDCCUDCC
The researcher has assumed that the probability of each result for the next match
depends only on the most recent result. He therefore decides to fit a Markov
chain to this data.
(i) Estimate the transition probabilities for the Markov chain.
(ii) Estimate the probability that United will win at least two of the next three
matches against City. [UK Sept 2013]
34. A motor insurer offers a No Claims Discount scheme which operates as follows.
The discount levels are {0%, 25%, 50%, 60%}. Following a claim-free year a
policyholder moves up one discount level (or stays at the maximum discount).
After a year with one or more claims the policyholder moves down two discount
levels (or moves to, or stays in, the 0% discount level).
The probability of making at least one claim in any year is 0.2.
(i) Write down the transition matrix of the Markov chain with state space
{0%, 25%, 50%, 60%}.
(ii) State, giving reasons, whether the process is:
(a) irreducible (b) aperiodic.

(iii) Calculate the proportion of drivers in each discount level in the stationary
distribution.
The insurer introduces a "protected" No Claims Discount scheme, such that if
the 60% discount is reached the driver remains at that level regardless of how
many claims they subsequently make.
(iv) Explain, without doing any further calculations, how the answers to parts
(ii) and (iii) would change as a result of introducing the "protected" No
Claims Discount scheme. [UK Sept 2013]
35. An industrial kiln is used to produce batches of tiles and is run with a standard
firing cycle. After each firing cycle is finished, a maintenance inspection is
undertaken on the heating element which rates it as being in Excellent, Good or
Poor condition, or notes that the element has Failed.
The probabilities of the heating element being in each condition at the end of a
cycle, based on the condition at the start of the cycle are as follows:
START        END
             Excellent   Good   Poor   Failed
Excellent    0.5         0.2    0.2    0.1
Good         0           0.5    0.3    0.2
Poor         0           0      0.5    0.5
Failed       0           0      0      1
(i) Write down the name of the stochastic process which describes the
condition of a single heating element over time.
(ii) Explain whether the process describing the condition of a single heating
element is: (a) irreducible. (b) periodic.
(iii) Derive the probability that the condition of a single heating element is
assessed as being in Poor condition at the inspection after two cycles, if the
heating element is currently in Excellent condition.
If the heating element fails during the firing cycle, the entire batch of tiles in the
kiln is wasted at a cost of £1,000. Additionally a new heating element needs to be
installed at a cost of £50 which will, of course, be in Excellent condition.
(iv) Write down the transition matrix for the condition of the heating element
in the kiln at the start of each cycle, allowing for replacement of failed
heating elements.
(v) Calculate the long term probabilities for the condition of the heating
element in the kiln at the start of a cycle.
The kiln is fired 100 times per year.
(vi) Calculate the expected annual cost incurred due to failures of heating
elements.
The company is concerned about the cost of ruined tiles and decides to change its
policy to replace the heating element if it is rated as in Poor condition.
(vii) Evaluate the impact of the change in replacement policy on the
profitability of the company. [UK April 2014]
36. A sports league has two divisions {1,2} with Division 1 being the higher. Each
season the bottom team in Division 1 is relegated to Division 2, and the top team
in Division 2 is promoted to Division 1.
Analysis of the movements of teams between divisions indicates that the
probabilities of finishing top or bottom of a division differs if a team has just
been promoted or relegated, compared with the probabilities in subsequent
seasons.
The probabilities are as follows:
Finishing   If promoted       If relegated      If neither promoted nor
position    previous season   previous season   relegated previous season
Top         0.1               0.25              0.15
Bottom      0.3               0.25              0.15
Other       0.6               0.5               0.7
(i) Write down the minimum number of states required to model this as a
Markov chain.
(ii) Draw a transition graph for the Markov chain.

(iii) Write down the transition matrix for the Markov chain.
(iv) Explain whether the Markov chain is: (a) irreducible. (b) aperiodic.
Team A has just been promoted to Division 1.
(v) Calculate the minimum number of seasons before there is at least a 60%
probability of Team A having been relegated to Division 2. [UK Sept 2014]
37. A motor insurance company offers annually renewable policies. To encourage

policyholders to renew each year it offers a No Claims Discount system which
reduces the premiums for those people who claim less often. There are four
levels of premium:
0: no discount
1: 15% discount
2: 25% discount
3: 40% discount
A policyholder who does not make a claim in the year, moves up one level of
discount the following year (or stays at the maximum level).
A policyholder who makes one or more claims in a year moves down one level
of discount if they did not claim in the previous year (or remains at the lowest
level) but if they made at least one claim in the previous year they move down
two levels of discount (subject to not going below the lowest level).
(i) (a) Explain how many states are required to model this as a Markov
chain.
(b) Draw the transition graph of the process.
The probability, p, of making at least one claim in any year is constant and
independent of whether a claim was made in the previous year.
(ii) Calculate the proportion of policyholders who are at the 25% discount
level in the long run given that the proportion at the 40% level is nine
times that at the 15% level.
(iii) (a) Explain how the state space of the process would change if the
probability of making a claim in any one year depended upon
whether a claim was made in the previous year.
(b) Write down the transition matrix for this new process.
[UK Sept 2014]
38. (i) Describe what is meant by a Markov chain.

A simplified model of the Internet consists of the following websites with links
between the websites as shown in the diagram below.
An internet user is assumed to browse by randomly clicking any of the links on

the website he is on with equal probability.
(ii) Calculate the transition matrix for the Markov chain representing which
website the internet user is on.
(iii) Calculate, of the total number of visits, what proportion are made to each
website in the long term. [UK April 2015]
39. A profession has examination papers in two subjects, A and B, each of which is
marked by a team of examiners. After each examination session, examiners are
given the choice of remaining on the same team, switching to the other team, or
taking a session's holiday.
In recent sessions, 10% of subject A's examiners have elected to switch to subject
B and 10% to take a holiday. Subject B is more onerous to mark than subject A,
and in recent sessions, 20% of subject B's examiners have elected to take a
holiday in the next session, with 20% moving to subject A.
After a session's holiday, the profession allocates examiners equally between
subjects A and B. No examiner is permitted to take holiday for two consecutive
sessions.
(i) Sketch the transition graph for the process.
(ii) Determine the transition matrix for this process.
(iii) Calculate the proportion of the profession's examiners marking for
subjects A and B in the long run.
The profession considers that in future, an equal number of examiners is likely to
be required for each subject. It proposes to try to ensure this by adjusting the
proportion of those examiners on holiday who, when they return to marking, are
allocated to subjects A and B.
(iv) Calculate the proportion of examiners who, on returning from holiday,
should be allocated to subject B in order to have an equal number of
examiners on each subject in the long run. [UK Sept 2015]
40. The weather in a particular city during the summer months is very variable. A
research team has recorded the weather each day during the first three weeks of
July. They use the notation S to denote a sunny day, C to denote a cloudy day,
and R to denote a rainy day. Their results are as follows:
Week 1: SSRCSCC
Week 2: SCRRCSS
Week 3: RCCSCCS
One of the team suggests that the weather each day depends only on the weather
for the previous day and decides to fit a Markov chain to the data.
(i) Estimate the transition probabilities for the Markov chain.
(ii) The team plans to hold its summer barbecue on 23 July. Estimate the
probability that this will be a sunny day.
41. The manager of a sales team keeps records of how much each of the three sales
staff (Andy, Brenda and Carol) sells each week. The data suggests that the sales
staff member who makes the most sales each week can be modelled using a
Markov Chain with the following transition matrix:
Andy 0.4 0.3 0.3
Brenda 0.3 0.5 0.2
Carol 0.2 0.3 0.5
Brenda made the most sales in the first week in April.
(i) Calculate the probability that each member of the sales staff makes the
most sales in the third week of April.
(ii) Calculate the long-term proportion of weeks in which each member of the
sales staff makes the most sales.
The manager is keen to encourage competition in the team, so he introduces an
"Employee of the Week" incentive. He awards "Employee of the Week" to the
member of the sales staff who makes the most sales unless this is the same
employee who was awarded "Employee of the Week" last week. If last week's
"Employee of the Week" makes the most sales the manager will decide which of
the other two staff should be "Employee of the Week" and is equally likely to
choose either.
(iii) Justify why whoever is awarded "Employee of the Week" can NOT be
modelled as a Markov Chain with state space {Andy, Brenda, Carol}.
(iv) Identify a state space with the minimum number of states required to
model the sequence of "Employees of the Week" as a Markov Chain.
[UK April 2018]
42. A small town is served by a single funeral director. The funeral director collects
corpses immediately following death and stores them in a refrigerator pending
embalming. The number of deaths per day in this town has the following
probability distribution:
Number of deaths per day Probability
0 0.497
1 0.348
2 0.122
3 0.028
4 0.005
The embalmer can embalm exactly one corpse per day. He works on a corpse
from the refrigerator if there is one, but if the refrigerator is empty he works on
the first corpse to arrive that day. Corpses are removed from the refrigerator
immediately before being embalmed and are not returned there after embalming.
The refrigerator has room for four corpses. If more space is needed, the funeral
director has to ask the local hospital if there is spare capacity in the hospital's
refrigerator.
(i) Determine the transition matrix for the number of corpses in the funeral
director's refrigerator.
(ii) Calculate the long-run probability of there being 0, 1, 2, 3 and 4 corpses in
the refrigerator.
(iii) Calculate the probability that the funeral director has to contact the
hospital on any given day.
The embalmer has not had a day off for years. The funeral director says that from
now on the embalmer must not work on Christmas Day.
(iv) Calculate the probability that the funeral director will need to contact the
hospital on Christmas Day when the embalmer is not working.
[UK Sept 2018]
43. A company has for many years offered a car insurance policy with four levels of
No Claims Discount (NCD): 0%, 15%, 30% and 40%. A policyholder who does
not claim in a year moves up one level of discount, or remains at the highest
level. A policyholder who claims one or more times in a year moves down a level
of discount or remains at the lowest level. The company pays a maximum of
three claims in any year on any one policy.
The company has established that:

• the arrival of claims follows a Poisson process with a rate of 0.35 per year.
• the average cost per claim is £2,500.
• the proportion of policyholders at each level of discount is as follows:
Discount level Proportion of policyholders
0% 4.4%
15% 10.5%
30% 25.1%
40% 60.0%
(i) Calculate the premium paid by a policyholder at the 40% discount level
ignoring expenses and profit.
The company has decided to introduce a protected NCD feature whereby
policyholders can make one claim on their policy in a year and, rather than move
down a level of discount, remain at the level they are at. All other features of the
policy remain the same.
(ii) Draw the transition graph for this process.

(iii) Calculate the premium paid, in the long term, by a policyholder at the
40% discount level of the policy with protected NCD, ignoring expenses
and profit.
(iv) Discuss THREE issues with the policy with protected NCD, which may
each be either a disadvantage or an advantage to the company.
[UK Sept 2017]
ANSWERS
1. (ii) Not Markov (iii) pij(n) = P(Ym+n = j |Ym = i)=0.5
2. (i)(a) It is clear that X(t) is a Markov chain; knowing the present state, any
additional information about the past is irrelevant for predicting the next
transition.
(b)
(ii) (a) 0 (b) 0.2295 (c) 0.062475
(iii) The chain is irreducible as any state is reachable from any other. It is also
aperiodic.
(iv) 0.05269
3. (a) Given the current state (the largest outcome or the number of sixes) up to the
nth roll, no additional information is required to predict the status of the chain
after the next roll. Therefore both Bn and Cn have the Markov property.
(b) Bn has state space {1, 2, 3, 4, 5, 6}, the state space for Cn is the set of non-
negative integers.
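(c) P(Bn+1 = j | Bn = i) = i/6 if j = i, 1/6 if j > i, and 0 if j < i;
P(Cn+1 = i + 1 | Cn = i) = 1/6 and P(Cn+1 = i | Cn = i) = 5/6.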
(d) Aperiodic; not irreducible
(e) In the long run, Bn will reach state 6 and will remain there; hence in
equilibrium P(Bn = 6) = 1 for sufficiently large n.
Cn cannot decrease and has an infinite state space; therefore, it is certain that it
will escape to infinity with probability one.
4. (i) This is not a Markov chain because it does not possess the Markov property,
that is transition probabilities do not depend only on the current state.
Specifically, if you are in the 25% discount level, the transition probability to
state 0% is 0.25 if a claim was made last year and 0.1 if the previous year was
claim free.
(ii)
(iii) In theory, the insurer should just use 2 NCD states according to whether the
policyholder made a claim in the previous year. This is because the company
believes the claims frequency is the same for drivers who have not made a claim
for 1, 2, 3…years (i.e. it remains at 0.1 whether the driver has been claims-free for
1 or 10 years).
However there may be other reasons for adopting this scale:
• Marketing or competitive pressures.
• It may discourage the policyholder from making small claims, or
encourage careful driving, to preserve their discount.
5.
(ii) , -
(iii) The chain is both irreducible, as every state can be reached from every other
state, and aperiodic, as the chain may remain at its current state for all H, M, L.
(iv) 1/3 (v) (a) 0.1008 (b) 0.28 (c) 0.6192
6. (i) 0.375 (ii)(a) X2= (b) 6.26%
(iii) 5.91
(iv) The expected number of defaults has been reduced by this strategy. (The
variance of the number of defaults would also reduce.)
However it is not possible to tell whether the overall return is improved as this
depends on the price at which bonds were bought and sold at the end of year 1.
The price of the debt sold may have been depressed by the companies having
been downgraded to rating B, and the manager loses out on any increase in
price if they recover.
The "downgrade trigger" strategy will incur dealing costs, which should be
considered when comparing the returns.
∑
7. (ii) ̂
8. (i) Consider the following four states that the policyholder might be at the end of
a year:
• the policyholder has made at least one claim both in the year just ended
and the previous one (state A)
• the policyholder has made no claims in the year just ended but s/he made at
least one claim during the previous year (state B)
• the policyholder has made at least one claim in the year just ended but not
in the previous one (state C)
• the policyholder has made no claim during either the year ended or the
previous one (state D)
If the year ended is year n, and Xn denotes the current state of the policyholder,
then Xn constitutes a Markov chain.
(ii)
(iii) Since this Markov chain has a finite state space and is irreducible and
aperiodic, it converges to a unique stationary distribution.
(iv) 12/107
9.
10. (i) The state space consists of the four possible combinations of chromosomes:
Female non-carrier (FN) or XX
Female carrier (FC) or X*X
Male non-sufferer (MN) or XY
Male haemophiliac (MH) or X*Y
Using the assumption that there is an equal chance of either chromosome

being inherited:
• A female non-carrier will lead to a female non-carrier or male non-carrier.
• A female carrier may produce: X*X, XX, X*Y, XY all with equal probability.
• A male non-sufferer will lead to female non-carrier or male non-carrier.
• A male haemophiliac may produce: X*X or XY (because his partner must

provide an X) with equal probability.
The transition diagram is therefore:
(ii)(a) The chain is reducible because once it enters states FN or MN it cannot

access FC or MH.
(b) The chain is aperiodic.
(iii) (0.5, 0, 0.5, 0)
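
As a rough numerical check of part (iii), the sketch below builds the transition matrix implied by the inheritance rules in part (i) and iterates an arbitrary starting distribution. The state ordering (FN, FC, MN, MH), the helper name `step` and the starting vector are illustrative assumptions rather than part of the original solution. The iterated distribution should settle close to (0.5, 0, 0.5, 0).

```python
# Transition matrix implied by answer 10(i); states ordered FN, FC, MN, MH
# (this ordering is an assumption made for the sketch).
P = [
    [0.50, 0.00, 0.50, 0.00],  # FN -> FN or MN
    [0.25, 0.25, 0.25, 0.25],  # FC -> XX, X*X, XY, X*Y, all equally likely
    [0.50, 0.00, 0.50, 0.00],  # MN -> FN or MN
    [0.00, 0.50, 0.50, 0.00],  # MH -> X*X (FC) or XY (MN)
]


def step(dist, matrix):
    """Return the distribution after one transition (row vector times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


dist = [0.0, 1.0, 0.0, 0.0]  # arbitrary start: a female carrier
for _ in range(200):
    dist = step(dist, P)

long_run = [round(x, 4) for x in dist]
print(long_run)
```

The mass on FC and MH decays because neither state can be re-entered from FN or MN, which is the reducibility noted in part (ii)(a).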
11. 0.7913
12. (i) State space:{Deuce, Advantage A(ndrew), Advantage B(en), Game A(ndrew),
Game B(en)}.
Deuce Adv A Adv B Game A Game B
Deuce 0 0.6 0.4 0 0

Adv A 0.4 0 0 0.6 0
Adv B 0.6 0 0 0 0.4
Game A 0 0 0 1 0
Game B 0 0 0 0 1
The chain is Markov because the probability of moving to the next state does
not depend on history prior to entering that state (because the probability of
each player winning a point is constant)
(ii) Reducible (iii) not aperiodic (iv) 8 points
(v) (a) 0.6923 (b) This is higher than 0.6 because Ben has to win at least two points
in a row to win the game.
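
The figure in (v)(a) can be checked numerically from the matrix in part (i). The sketch below assumes the game starts at Deuce and pushes the starting distribution through the chain until almost all probability has been absorbed in Game A or Game B. The helper `step` and the variable names are illustrative. The closed form p^2 / (p^2 + q^2) with p = 0.6 and q = 0.4 is shown alongside for comparison.

```python
# Transition matrix from answer 12(i); order: Deuce, Adv A, Adv B, Game A, Game B.
P = [
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.6, 0.0],
    [0.6, 0.0, 0.0, 0.0, 0.4],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def step(dist, matrix):
    """Return the distribution after one transition (row vector times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


dist = [1.0, 0.0, 0.0, 0.0, 0.0]  # assumption: the game is currently at Deuce
for _ in range(500):
    dist = step(dist, P)

prob_a_wins = dist[3]                          # mass absorbed in Game A
closed_form = 0.6**2 / (0.6**2 + 0.4**2)       # p^2 / (p^2 + q^2)
print(round(prob_a_wins, 4), round(closed_form, 4))
```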
13. (i) 6 (ii) State space = {AL, A, BL, B, CL, C} where subscript L
indicates locked in to the current auditor.
(iii) (0.28125, 0.203125, 0.515625)
14. (i)
(ii) (a) 0 (b) 0.2295 (iii) 0.05269
15. (i) State space is the set of integers Z.

Transition graph:
(ii) (a) Not aperiodic, period =2 (b) irreducible

(c) No stationary distribution will exist because the state space is infinite.
( ) ( ) ( )
(iii) (iv) {
(v) (a) Reflecting boundary implies P[Xi+1 = 1│Xi = 0] = 1 (or p01(1) = 1)
(b) Absorbing boundary implies P[Xi+1 = 0│Xi = 0] = 1 (or p00(1) = 1)
(vi) In (a) some sample paths which would have taken X below zero will be
reflected, increasing the probability of reaching j at step m.
So the m-step transition probabilities would increase.
In (b) any sample path which reaches zero would no longer be able to access
state j so the transition probabilities would decrease.
16. (i) A Markov chain is a stochastic process with discrete states operating in
discrete time in which the probabilities of moving from one state to another
are dependent only on the present state of the process.
(ii) (a) 0 < a ≤1 and 0 < b ≤1

(b) The chain is periodic only if it must alternate between the states,
so a = 1 and b = 1.
17. (i)
(ii) (0.048574, 0.107674, 0.218685, 0.625067) (iii) 51.13%

(iv) Reject H0
(v) As the goodness-of-fit test fails, the discount level calculated assuming the
Poisson distribution may be incorrect. The goodness-of-fit test fails due to a
larger number of multiple claims than expected.
Conversely a higher number of policyholders make no claims than expected

(within the mean of 0.30), so the average discount level may be understated.
The average discount level calculated from the data could usefully be compared
with that estimated using the Poisson distribution.
18. (i) The Markov property states that the future development of a process can be
predicted from its present state alone without reference to its past history.
(iii) (a) A Markov chain is a stochastic process with the Markov property which
has a discrete time set with a discrete state space. A Markov jump process is a
stochastic process with the Markov property which has a continuous time set
with a discrete state space.
(b) A Markov chain is irreducible if any state can be reached from any other state.
(iv)(a) A lift could not serve its purpose unless it could return to each of the
floors which it serves. This means an irreducible model would be appropriate.
(b) Suppose, for example, the lift is currently at the third floor, with its last two
states being the fourth floor and the fifth floor. In such a case the lift is more
likely to be heading downwards than upwards. So the past history is likely to
provide information on the likely future movement of the lift, unless the state
space is very complicated (involving a number of past floors as well as the
current floor). Therefore a Markov model is unlikely to be appropriate.
19. (i)
(ii) (1/3, 1/2, 1/6)
(iii) The stationary distribution gives the long run probability that a particular car
will be at each location. However this does not take into account the demand
for hiring vehicles at each location, or the amount of space available at each
location. These factors are likely to be more important in determining how many
cars to base at each site.
(iv)
20.
21. (i) (a) The state space is discrete (with four states: O – ordinary passenger, B –
bronze member, S – silver member and G – gold member)
The probability that a passenger has a particular membership status next year
depends only on their membership status in the current year (i.e. the status in
previous years is not relevant). Therefore the process is Markov.
(b) The state space is finite and therefore there is at least one stationary
probability distribution. Since any state can be reached from any other state, the
Markov chain is irreducible. Therefore the stationary probability distribution is
unique.
(ii)
(iii) (0.5333, 0.2667, 0.1333, 0.0667)
22. (i)
(ii) (10/43, 21/86, 9/43, 27/86) (iii) 0.1884 (iv) 0.0465
(v) 0.2455
(vi) Restocking at two or more snakes would not result in fewer lost sales than
restocking at 1, because the probability of selling more than 2 snakes is zero. It
would, however, result in more restocking charges than restocking at 1.
Therefore it must result in lower profits than restocking at 1, so it is not optimal.
(vii) C< 0.8148P
23. (a) A Markov chain with a finite state space has at least one stationary probability
distribution.
(b) An irreducible Markov chain with a finite state space has a unique stationary
probability distribution.
(c) A Markov chain with a finite state space which is irreducible, and which is
also aperiodic converges to a unique stationary probability distribution.
24. (i)
(ii) There is a 35% probability that a child will be graded Poor, 27% that a child
will be graded Satisfactory, 21% that a child will be graded Good and 17% that a
child will be graded Excellent.
25. (i) Past history is needed to decide where to go in the chain. If a customer is at L
and reduces his or her order, you need to know what level of discount he or she was
at the previous year to determine whether he or she drops one or two levels of
discount.
(ii) The L level needs to be split into two:
L+ is Loyalty Price with no reduction in demand last year
L– is Loyalty Price with reduction in demand last year
(iii) (a) (b) (0.1358, 0.12346, 0.09877, 0.14815, 0.49383)

(c) 6.5181
(iv) 0.262
(v) A constant figure takes no account of the amount of hay which Farmer Giles
has to sell: for example, a drought year could produce very little, which one large
customer may buy in its entirety.
The amount of hay in the local market is important. Another supplier may try a
heavily discounted year to get into the market. Customers' behaviour may depend
on the discount level they are at. There may be national trends in the demand for
hay, e.g. a sudden trend towards vegetarianism.
A 60% chance of increasing may be implausible, as field space is likely to be
limited, so a constant increase in numbers is unlikely. Customers' behaviour may
depend on the amount of hay they typically purchase.
26. (a) A Yes, irreducible. B No, not irreducible. C Yes, irreducible.
(b) A Yes, period is 2, B No, not periodic. C No, not periodic.
27. (i) Transitions from state "Zero"
No umbrellas to take so must be two at the other location.
Transitions from state "One"
If it does not rain, then there remains one at each location, probability 1 - p.
If it does rain, both umbrellas end up at the next destination, probability p.
Transitions from state "Two"
If it does not rain, then forgets to take an umbrella so none is at the next location,
probability 1 - p.
If it does rain, takes one of the umbrellas to the other location, probability p.
(ii)
(iii) ( ) (iv) ( ) (v)
(vi) This would not satisfy the Markov property because (in states "One" and
"Two") we would need to know, in addition, whether it was raining or not on the
last journey to determine the future evolution of the process. For example, if in
state "Two", the probability of next moving to "Zero" is 1-r if it rained on the last
journey and 1-s if it did not. As r does not equal s, the Markov property is not
satisfied.
(vii) If we expand the states to include information about whether it rained on
the last journey, then the Markov property is satisfied. Five states are needed, as
the process cannot be in the position with zero umbrellas when it rained on the
last journey, so the state space is {Zero, One Rained, One Did Not Rain, Two
Rained, Two Did Not Rain}.
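The transitions described in 27(i) can be collected into a matrix and checked numerically. The sketch below is illustrative only: the state order (Zero, One, Two), the function names and the value p = 1/4 are assumptions, and the candidate stationary vector ((1 - p), 1, 1)/(3 - p) is obtained here by solving π = πP by hand rather than taken from the printed answer.

```python
from fractions import Fraction


def umbrella_matrix(p):
    """Transition matrix over states (Zero, One, Two) as described in 27(i)."""
    q = 1 - p
    return [
        [0, 0, 1],  # Zero: nothing to carry, so both umbrellas are at the other end
        [0, q, p],  # One: stays One if dry, becomes Two if it rains
        [q, p, 0],  # Two: becomes Zero if dry, becomes One if it rains
    ]


def is_stationary(pi, P):
    """Check pi = pi P exactly, component by component."""
    n = len(P)
    return all(sum(pi[i] * P[i][j] for i in range(n)) == pi[j] for j in range(n))


p = Fraction(1, 4)  # hypothetical rain probability, for illustration only
P = umbrella_matrix(p)
pi = [(1 - p) / (3 - p), 1 / (3 - p), 1 / (3 - p)]
print(pi, is_stationary(pi, P))
```

Exact fractions are used so that the stationarity check is an equality test rather than a tolerance comparison.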
28. (i) The series Xi depends only on the current state and hence satisfies the Markov
property, whereas Yi depends on all of its previous values.
(ii) (a) Not irreducible

(b) The probabilities depend on the number of time periods n so the process is
not time homogeneous
(iii) (iv) 17/64
(v) It is reasonable to assume that probability of having an accident depends on

the number of previous accidents. It is also reasonable that the effect of a
previous accident should wear off over time. There are likely to be other factors
which have a significant effect on the probability of an accident, such as the fact
that people who have recently had an accident might drive more carefully.
May want to give more weighting to recent years.
29. (i) A process with a discrete state space and discrete time space where the future
development is only dependent on the current state occupied.
(ii) 3
(iii)
(iv) (15/17, 3/34, 1/34) (v) 12.917% (vi) 13.333% of salary
(vii) The reduction in cost is calculated as 3.23%. This is not particularly

significant either relative to the likely uncertainty in the assumptions or because
recovery rates are so high. The reduction in sick pay is likely to encourage
employees to try to get back into work.
30. (i) 0.4 (ii) 0.25 (iii) 0.3125
31. (i) Let S be the state space. We say that {πj | j∈S} is a stationary probability
distribution for a Markov chain with transition matrix P if the following hold for
all j∈S : π = πP, Σπj = 1 and πj ≥ 0.
(ii) With state space {Working, Broken}

Transition matrix A =. /
(iii) 1/13 (iv) 4,384.62 per day
32. (i) A Markov chain is a discrete time, discrete space Markov process. For a time-
inhomogeneous Markov chain, the transition probabilities depend on the
absolute values of time, rather than just the time difference.
The value of "time" can be represented by many factors, for example the time of
year, age or duration. An example might be a No Claims Discount scheme where
the probability of a claim reflects trends in accident frequency over time.
(ii) Both boundaries are mixed as policyholders can either stay in that state for
consecutive periods or move back to another state. E.g. When at the maximum
40% level, a policyholder who makes no claim will stay there the next year,
whereas one who makes one claim will drop to the 25% level and one who
makes more than one claim will drop to the 10% level.
(iii)
(iv) 24.47%
(v) Equal probability of an accident in every month is pretty unlikely. Perhaps

more accidents in winter when driving conditions are worse, or in summer,
when mileage is higher.
The probability of a second claim may differ from the first and may be
dependent upon the level the person is at (e.g. does it make a difference to the
future premium?)
Claim probability may depend upon policyholder age/sex or car size/age, and
on many other factors (occupation, geographical area, marital status, mileage,
where car is stored, etc.)
Claim levels may be affected by the past history of a person's claims (so the
process is no longer Markov).
Unrealistic to assume at most one claim per month.
33. (i)
From/To   U     C     D
U         1/6   1/3   1/2
C         2/7   3/7   2/7
D         1/3   1/2   1/6
(ii) 0.15873
34. (i) (ii) Irreducible and aperiodic
(iii) (0.08257, 0.18349, 0.14679, 0.58716)
(iv) The 60% discount level becomes an absorbing state and so it is no longer
irreducible. However it is still aperiodic because you cannot get out of the
absorbing state 60% and the other states still have no period. The process would
now be stationary when all drivers are in the absorbing 60% discount level. OR
The new stationary distribution is [0,0,0,1] because the 60% state is now
absorbing.
35. (i) Markov chain.
(ii) (a) It is not irreducible because a heating element cannot move to a state of
being in better condition.
(b) It is not periodic because it can remain in each state (or any other suitable
reason).
(iii) 0.26.
(iv)
            Excellent   Good   Poor
Excellent   0.6         0.2    0.2
Good        0.2         0.5    0.3
Poor        0.5         0      0.5
(v) (25/51, 10/51, 16/51) (vi) 27,735 (vii) 11,092
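As a check on 35(v), the stationary distribution can be computed directly from the matrix in 35(iv) by solving π = πP together with Σπj = 1. This is a minimal sketch using exact fractions; the function name `stationary` is illustrative, and the approach (Gaussian elimination on the balance equations) is one of several that would work.

```python
from fractions import Fraction


def stationary(P):
    """Solve pi = pi P with sum(pi) = 1 by Gaussian elimination (exact arithmetic)."""
    n = len(P)
    # First n-1 rows: balance equations sum_i pi_i * (P[i][j] - delta_ij) = 0.
    A = [[Fraction(P[i][j]) - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n - 1)]
    # Last row: normalisation sum_i pi_i = 1.
    A.append([Fraction(1)] * n + [Fraction(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Rows and columns ordered Excellent, Good, Poor, as in 35(iv).
P = [
    ["3/5", "1/5", "1/5"],
    ["1/5", "1/2", "3/10"],
    ["1/2", "0", "1/2"],
]
pi = stationary(P)
print(pi)
```

The solver assumes the chain has a unique stationary distribution (for example, it is irreducible); otherwise the pivot search fails.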
36. (i) Four states
(ii) State Space {1 just Promoted, 1 Same division, 2 Same division, 2 just
relegated}
(iii)
     1P     1S     2S     2R
1P   0      0.7    0      0.3
1S   0      0.85   0      0.15
2S   0.15   0      0.85   0
2R   0.25   0      0.75   0
(iv) (a) the chain is irreducible because every state can eventually be reached
from every other state.
(b) the chain is aperiodic because it can loop in states 1S or 2S and, being
irreducible, every state has the same period.
(v) Probability of being relegated in first year is 0.3 and 0.15 in each subsequent
year.
37. (i) Five states
(ii) 0.213018
(iii) (a) Six states are now required because the probability of a person in
discount level 1 moving to discount level 2 depends upon whether a claim was
made the previous year or not.
Hence discount level 1 must be split into
1+ = no claim made previous year and
1- = claim made previous year
(iv)
38. (ii) (iii) (1/4, 1/4, 3/8, 1/8)
39. (i)
(ii)
(iii) In the long run 58.8% of examiners are marking subject A and 29.4% are
marking subject B.
(iv) All those returning from holiday will have to be allocated to subject B.
40. (i) (ii) 0.31973
41. (i) Andy 0.31 Brenda 0.40 Carol 0.29

(ii) Andy 19/64 Brenda 3/8 Carol 21/64
(iii) In this case, to know who will be "Employee of the week", we need to know
who was "Employee of the week" last week as well as who made most sales this
week.
Suppose Andy made most sales this week. If he was "Employee of the week" last
week his probability of being "Employee of the week" this week is 0, but if
Brenda was "Employee of the week" last week, Andy will be "Employee of the
week" this week.
So additional states are needed to model "Employee of the week" as a Markov
chain.
(iv) This needs nine states, i.e. 3 by 3, defined by
Most Sales : "Employee of the Week" last week
Andy : Andy       Andy : Brenda       Andy : Carol
Brenda : Andy     Brenda : Brenda     Brenda : Carol
Carol : Andy      Carol : Brenda      Carol : Carol
42. (i) The number of corpses in the refrigerator one morning is the number the
previous morning, plus the number of deaths that day less the one the embalmer
embalmed
(ii) (0.626, 0.195, 0.103, 0.051, 0.025) (iii) 0.006 (iv) 0.025
43. (i) 783.67 (iii) 862.71

(iv) This may be a common feature in the market. If competitors offer it and this
company does not, it may lose business. The previous system may have
discouraged claims if it meant that people lost their NCD. Introducing the new
system may change the incidence of claims. Or the average size of claims may
change (smaller ones may have gone unreported previously).
The one-off increase in premium when they introduce the scheme may prompt
otherwise loyal customers to shop around for a better deal. If the company is the
first in the market to launch this option, they may win lots of new business.
Extra administrative costs may be incurred. The protected NCD may appear
unfair to policyholders as customers not making a claim can end up with the
same discount as those who made a claim.
The new system may embody a moral hazard as it could make customers drive
less carefully.
TIME–HOMOGENEOUS AND INHOMOGENEOUS MARKOV JUMP PROCESSES
1. Marital status is considered using the following time-homogeneous, continuous-
time Markov jump process:
• the transition rate from unmarried to married is 0.1 per annum
• the divorce rate is equivalent to a transition rate of 0.05 per annum
• the mortality rate for any individual is equivalent to a transition rate of
0.025 per annum, independent of marital status
The state space of the process consists of five states: Never Married (NM),
Married (M), Widowed (W), Divorced (DIV) and Dead (D).
Px is the probability that a person currently in state x, and who has never
previously been widowed, will die without ever being widowed.
(i) Construct a transition diagram between the five states.

(ii) Show, by general reasoning or otherwise, that PNM equals PDIV.
(iii) Demonstrate that:
and
(iv) Calculate the probability of never being widowed if currently in state NM.
(v) Suggest two ways in which the model could be made more realistic.
[UK April 2005]
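One way to approach part (iv) is first-step analysis on the jump chain. The sketch below is illustrative and rests on assumptions not stated in the question: a married person is widowed at the spouse's mortality rate (0.025 per annum), and a divorced person remarries at the same rate as a never-married person, so that PDIV = PNM. Variable names are hypothetical.

```python
from fractions import Fraction

marry = Fraction(1, 10)    # unmarried -> married
divorce = Fraction(1, 20)  # married -> divorced
death = Fraction(1, 40)    # any state -> dead
widowhood = death          # assumption: spouse dies at the same mortality rate

# From NM (and, by assumption, DIV): the next jump is death or marriage.
out_nm = marry + death
a, b = death / out_nm, marry / out_nm      # P_NM = a + b * P_M

# From M: the next jump is own death, divorce or widowhood.
out_m = divorce + death + widowhood
c, d = death / out_m, divorce / out_m      # P_M = c + d * P_DIV, with P_DIV = P_NM

p_nm = (a + b * c) / (1 - b * d)
p_m = c + d * p_nm
print(p_nm, p_m)
```

Under these assumptions the two linear equations reduce to a single equation in PNM, which is why no matrix solver is needed.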
2. An insurance policy covers the repair of a washing machine, and is subject to a

maximum of 3 claims over the year of coverage.
The time until the machine breaks down has been estimated to follow an
exponential distribution with the following annualised frequencies, λ:
λ = 1/10 if the machine has not suffered any previous breakdown
λ = 1/5 if the machine has broken down once previously
λ = 1/4 if the machine has broken down on two or more occasions
As soon as a breakdown occurs an engineer is despatched. It can be assumed that
the repair is made immediately, and that it is always possible to repair the
machine.
The washing machine has never broken down at the start of the year (time t = 0).
Pi(t) is the probability that the machine has suffered i breakdowns by time t.
(i) Draw a transition diagram for the process defined by the number of
breakdowns occurring up to time t.
(ii) Write down the Kolmogorov equations obeyed by P0(t), P1(t) and P2(t) .
(iii) (a) Derive an expression for P0(t) and
(b) demonstrate that P1(t) = e^(-t/10) - e^(-t/5).
(iv) Derive an expression for P2(t).
(v) Calculate the expected number of claims under the policy. [UK April 2005]
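The forward equations for this question can be checked numerically. The sketch below assumes breakdown rates of 1/10, 1/5 and 1/4 per annum after 0, 1 and 2 previous breakdowns; the closed forms for P0, P1 and P2 are derived here by integrating those equations and are not quoted from an official solution. It also assumes that the number of claims is the number of breakdowns capped at 3.

```python
import math

RATES = (0.1, 0.2, 0.25)  # breakdown rates after 0, 1 and 2 previous breakdowns


def p0(t):
    return math.exp(-t / 10)


def p1(t):
    return math.exp(-t / 10) - math.exp(-t / 5)


def p2(t):
    return (4 / 3) * math.exp(-t / 10) - 4 * math.exp(-t / 5) + (8 / 3) * math.exp(-t / 4)


def euler(t_end, steps=100_000):
    """Integrate the forward equations with a simple Euler scheme."""
    h = t_end / steps
    q0, q1, q2 = 1.0, 0.0, 0.0
    for _ in range(steps):
        d0 = -RATES[0] * q0
        d1 = RATES[0] * q0 - RATES[1] * q1
        d2 = RATES[1] * q1 - RATES[2] * q2
        q0, q1, q2 = q0 + h * d0, q1 + h * d1, q2 + h * d2
    return q0, q1, q2


def expected_claims(t=1.0):
    """Expected number of claims by time t, with claims capped at 3."""
    p3_plus = 1 - p0(t) - p1(t) - p2(t)
    return p1(t) + 2 * p2(t) + 3 * p3_plus


print(euler(1.0), (p0(1.0), p1(1.0), p2(1.0)), expected_claims())
```

Comparing the Euler output with the closed forms is a quick way to catch sign or coefficient slips in the hand derivation.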
3. A life insurance company prices its long-term sickness policies using a three-state
Markov model in continuous time. The states are healthy (H), ill (I) and dead (D).
The forces of transition in the model are HI = , IH = , HD = , ID =  and they
are assumed to be constant over time.
For a group of policyholders observed over a 1-year period, there are:

23 transitions from State H to State I ;
15 transitions from State I to State H;
3 deaths from State H;
5 deaths from State I.
The total time spent in State H is 652 years and the total time spent in State I is 44
years.
(i) Write down the likelihood function for these data.
(ii) Derive the maximum likelihood estimate of .
(iii) Estimate the standard deviation of ̃, the maximum likelihood estimator
of         [UK Sept 2005]
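For a constant transition rate, the log-likelihood contribution of d observed transitions over waiting time v is d log(rate) - rate × v, which is maximised at d / v. The sketch below applies this to the data in the question; the dictionary layout and function names are illustrative, and the standard deviation uses the usual asymptotic approximation sqrt(d) / v.

```python
import math

waiting = {"H": 652.0, "I": 44.0}  # total years spent in each state
counts = {("H", "I"): 23, ("I", "H"): 15, ("H", "D"): 3, ("I", "D"): 5}


def log_lik_term(rate, d, v):
    """Log-likelihood contribution of one constant transition rate."""
    return d * math.log(rate) - rate * v


def mle(transition):
    """Maximum likelihood estimate: transitions divided by waiting time in the origin state."""
    return counts[transition] / waiting[transition[0]]


def est_sd(transition):
    """Asymptotic standard deviation of the estimator, evaluated at the estimate."""
    return math.sqrt(counts[transition]) / waiting[transition[0]]


for tr in counts:
    print(tr, round(mle(tr), 5), round(est_sd(tr), 5))
```

Evaluating `log_lik_term` slightly above and below the estimate is a simple sanity check that the stationary point is a maximum.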

4. Claims arrive at an insurance company according to a Poisson process with rate
λ per week.
Assume time is expressed in weeks.
(i) Show that, given that there is exactly one claim in the time interval
[t, t + s], the time of the claim arrival is uniformly distributed on [t, t + s].
(ii) State the joint density of the holding times T0, T1,…, Tn between
successive claims.
(iii) Show that, given that there are n claims in the time interval [0, t], the
number of claims in the interval [0, s] for s < t is binomial with parameters
n and s/t. [UK Sept 2005]
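The result in part (iii) can be checked numerically before attempting the proof. The sketch below uses the independent-increments property to compute P(N(s) = k | N(t) = n) from Poisson probabilities and compares it with the binomial(n, s/t) probability. The rate, s, t and n used in the check are arbitrary illustrative values.

```python
import math


def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)


def conditional_pmf(k, n, s, t, lam):
    """P(N(s) = k | N(t) = n), using independence of claims in [0, s] and (s, t]."""
    joint = poisson_pmf(k, lam * s) * poisson_pmf(n - k, lam * (t - s))
    return joint / poisson_pmf(n, lam * t)


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, s, t = 6, 2.0, 5.0
for lam in (0.5, 3.0):
    for k in range(n + 1):
        print(lam, k, conditional_pmf(k, n, s, t, lam), binomial_pmf(k, n, s / t))
```

The conditional probabilities do not change with the rate, which mirrors the cancellation of the rate in the algebraic proof.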
5. A Markov jump process Xt with state space S = {0, 1, 2,… , N} has the following
transition rates:
(i) Write down the generator matrix and the Kolmogorov forward equations
(in component form) associated with this process.
( )
(ii) Verify that for and for all j i, the function ( ) ( )
is a solution to the forward equations in (i).
(iii) Identify the distribution of the holding times associated with the jump
process. [UK Sept 2005]
6. A time-inhomogeneous Markov jump process has state space {A, B} and the
transition rate for switching between states equals 2t, regardless of the state
currently occupied, where t is time.
The process starts in state A at t = 0.

(i) Calculate the probability that the process remains in state A until at least
time s.
(ii) Show that the probability that the process is in state B at time T, and that it
is in its first visit to state B, is given by T^2 exp(-T^2).
(iii) (a) Sketch the probability function given in (ii).
(b) Give an explanation of the shape of the probability function.
(c) Calculate the time at which it is most likely that the process is in its
first visit to state B. [UK Sept 2005]
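With transition rate 2t, the probability of no jump in [a, b] is exp(-(b^2 - a^2)). Integrating over the time s of the first jump then gives T^2 exp(-T^2) for the probability in part (ii). The sketch below checks this by numerical integration and locates the maximum on a grid; the function names, grid and step counts are illustrative choices.

```python
import math


def stay_prob(a, b):
    """P(no jump in [a, b]) when the rate out of the current state is 2t."""
    return math.exp(-(b * b - a * a))


def first_visit_b(T, steps=2000):
    """Midpoint rule over the time s of the single jump from A to B in [0, T]."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += stay_prob(0.0, s) * 2 * s * stay_prob(s, T) * h
    return total


def closed_form(T):
    return T * T * math.exp(-T * T)


grid = [i / 1000 for i in range(1, 3001)]
t_star = max(grid, key=closed_form)
print(first_visit_b(1.5), closed_form(1.5), t_star)
```

The grid search is only a check; differentiating T^2 exp(-T^2) gives the stationary point analytically.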
7. A savings provider offers a regular premium pension contract, under which the
customer is able to cease paying in premiums and restart them at a later date. In
order to profit test the product, the provider set up the four-state Markov model
shown in the following diagram:
Show, from first principles, that under this model:
[UK April 2006]
8. (i) (a) Explain what is meant by a Markov jump process.

(b) Explain the condition needed for such a process to be time-
homogeneous.
(ii) Outline the principal difficulties in fitting a Markov jump process model
with time-inhomogeneous rates.
A company provides sick pay for a maximum period of six months to its
employees who are unable to work. The following three-state, time-
inhomogeneous Markov jump process has been chosen to model future sick pay
costs for an individual:
Where Sick means unable to work and Healthy means fit to work.
The time dependence of the transition rates is to reflect increased mortality and
morbidity rates as an employee gets older. Time is expressed in years.
(iii) Write down Kolmogorov's forward equations for this process, specifying
the appropriate transition matrix.
(iv) (a) Given an employee is sick at time w < T, write down an expression
for the probability that he or she is sick throughout the period
w < t < T.
(b) Given that a transition out of state H occurred at time w, state the
probability that the transition was into state S.
(c) For an employee who is healthy at time , give an approximate
expression for the probability that there is a transition out of state
H in a small time interval [w, w + dw], where w >. Your
expression should be in terms of the transition rates and PHH (,w)
only.
(v) Using the results of part (iv) or otherwise, derive an expression for the
probability that an employee is sick at time T and has been sick for less
than 6 months, given that they were healthy at time < T – 0.5. Your
expression should be in terms of the transition rates and PHH (,w)
only.
(vi) Comment on the suggestions that:
(a) (t) should also depend on the holding time in state S, and
(b) mortality rates can be ignored. [UK April 2006]
9. The price of a stock can either take a value above a certain point (state A), or take
a value below that point (state B). Assume that the evolution of the stock price in
time can be modelled by a two-state Markov jump process with homogeneous
transition rates AB =, BA=.
The process starts in state A at t = 0 and time is measured in weeks.
(i) Write down the generator matrix of the Markov jump process.
(ii) State the distribution of the holding time in each of states A and B.
(iii) If  =3, find the value of t such that the probability that no transition to
state B has occurred until time t is 0.2.
(iv) Assuming all the information about the price of the stock is available for a
time interval [0,T], explain how the model parameters  and can be
estimated from the available data.
(v) State what you would test to determine whether the data support the
assumption of a two-state Markov jump process model for the stock price.
[UK Sept 2006]
10. (i) Explain the difference between a time-homogeneous and a time-
inhomogeneous Poisson process.
An insurance company assumes that the arrival of motor insurance claims
follows an inhomogeneous Poisson process.
Data on claim arrival times are available for several consecutive years.
(ii) (a) Describe the main steps in the verification of the company's
assumption.
(b) State one statistical test that can be used to test the validity of the
assumption.
(iii) The company concludes that an inhomogeneous Poisson process with rate
( ) ( ) is a suitable fit to the claim data (where t is measured

in years).
(a) Comment on the suitability of this transition rate for motor
insurance claims.
(b) Write down the Kolmogorov forward equations for P0 j (s,t).
(c) Verify that these equations are satisfied by:
( ( )) ( ( ))
( )
for some f(s,t) which you should identify.
[Note that ∫ .]
(d) Comment on the form of the solution compared with the case
where is constant. [UK Sept 2006]
11. The members of a particular profession work exclusively in partnerships. A
certain partnership is concerned that it is losing trained technical staff to its
competitors. Informal debriefing interviews with individuals leaving the
partnership suggest that one reason for this is that the duration elapsing between
becoming fully qualified and being made a partner is longer in this partnership
than in the profession as a whole.
The partnership decides to investigate whether this claim is true using a
multiple-state model with three states: (1) fully qualified but not yet a partner, (2)
fully qualified and a partner, (3) working for another partnership. The period of
the investigation is to be 1 January 1997 to 31 December 2006.
(i) (a) Draw and label a state-space diagram depicting the chosen model,
showing possible transitions between the three states.
(b) State any assumptions implied by the diagram you have drawn
and comment on their appropriateness.
(ii) (a) State what data would be required in order to estimate the
transition intensity of moving from state (1) to state (2) for
employees aged 30 years last birthday.
(b) Write down the likelihood of these data.
(c) Derive an expression for the maximum likelihood estimate of this
transition intensity.
The investigation assumes that all transition intensities are constant within each
year of age. [UK April 2007]
12. (i) Consider two Poisson processes, one with rate λ and the other with rate μ.
Prove that the sum of events arising from either of these processes is also a
Poisson process with rate (λ + μ).
(ii) (a) Explain what is meant by a Markov jump chain.
(b) Describe the circumstances in which the outcome of the Markov
jump chain differs from the standard Markov chain with the same
transition matrix.
An airline has N adjacent check-in desks at a particular airport, each of which
can handle any customer from that airline. Arrivals of passengers at the check-in
area are assumed to follow a Poisson process with rate q. The time taken to
check-in a passenger is assumed to follow an exponential distribution with mean
1/a.
(iii) Show that the number of desks occupied, together with the number of
passengers waiting for a desk to become available, can be formulated as a
Markov jump process and specify:
(b) the transition diagram
(iv) State the Kolmogorov forward equations for the process, in component
form.
(v) Comment on the appropriateness of the assumptions made regarding
passenger arrival and the check-in process.
(vi) (a) Set out the transition matrix of the jump chain associated with the
airline check-in process.
(b) Determine the probability that all desks are in use before any
passenger has completed the check-in process, given that no
passengers have arrived at check-in at the outset. [UK April 2007]
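For part (vi)(b), one reading is that, starting from an empty check-in area, every jump of the chain must be an arrival until all N desks are busy. With k desks busy (0 < k < N), the next event is an arrival with probability q / (q + ka). The sketch below multiplies these probabilities; the function name and the example values are hypothetical.

```python
from fractions import Fraction


def prob_all_desks_busy_first(n_desks, q, a):
    """P(all desks become occupied before any check-in completes), starting empty."""
    prob = Fraction(1)
    # From state 0 the only possible jump is an arrival, so start at 1 busy desk.
    for busy in range(1, n_desks):
        prob *= Fraction(q) / (Fraction(q) + busy * Fraction(a))
    return prob


# Illustrative values only: 3 desks, arrival rate 2, service rate 1.
print(prob_all_desks_busy_first(3, 2, 1))
```

Only the jump chain matters here, because the question concerns the order of events and not their timing.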
13. The following data have been collected from observation of a three-state process
in continuous time:
State      Total time spent    Total transitions to
occupied   in state (hours)    State A          State B          State C
A          50                  Not applicable   110              90
B          25                  80               Not applicable   45
C          90                  120              15               Not applicable
It is proposed to fit a Markov jump model to this data set.

(i) (a) List all the parameters of the model.
(b) Describe the assumptions underlying the model.
(ii) (a) Estimate the parameters of the model.
(b) Give the estimated generator matrix.
The following additional data in respect of secondary transitions were collected
from observation of the same process.
Triplet of successive   Observed number      Triplet of successive   Observed number
transitions             of triplets, nijk    transitions             of triplets, nijk
ABC                     42                   BCA                     38
ABA                     68                   BCB                     7
ACA                     85                   CAB                     64
ACB                     4                    CAC                     56
BAB                     50                   CBA                     8
BAC                     30                   CBC                     7
(iii) State the distribution of the number of transitions from state i to state j,
given the number of transitions out of state i.
(iv) Test the goodness-of-fit of the model by considering whether triplets of
successive transitions adhere to the distribution given in (iii).
[Hint: Use the test statistic ΣΣΣ (nijk - E)^2 / E, summed over i, j and k, where E
is the expected number of triplets under the distribution in (iii).]
(v) Identify two other aspects of the appropriateness of the fitted model that
could be tested, stating suitable tests in each case.
(vi) Outline two methods for simulating the Markov jump process, without
performing any calculations. [UK Sept 2007]
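Parts (ii) and (iv) can be sketched as follows. Transition rates are estimated as transitions divided by time spent in the origin state. For the triplet test, the expected count for triplet ijk is taken here as the number of observed triplets beginning ij multiplied by the estimated jump-chain probability of moving from j to k. That construction of E, and all names in the code, are assumptions for illustration rather than the examiners' method.

```python
STATES = "ABC"
time_in = {"A": 50.0, "B": 25.0, "C": 90.0}
n = {("A", "B"): 110, ("A", "C"): 90, ("B", "A"): 80,
     ("B", "C"): 45, ("C", "A"): 120, ("C", "B"): 15}
triplets = {"ABC": 42, "ABA": 68, "ACA": 85, "ACB": 4, "BAB": 50, "BAC": 30,
            "BCA": 38, "BCB": 7, "CAB": 64, "CAC": 56, "CBA": 8, "CBC": 7}


def generator():
    """Estimated generator matrix, rows and columns ordered A, B, C."""
    rows = []
    for i in STATES:
        off = {j: n[(i, j)] / time_in[i] for j in STATES if j != i}
        rows.append([-sum(off.values()) if j == i else off[j] for j in STATES])
    return rows


def jump_prob(i, j):
    """Estimated probability that a jump out of state i goes to state j."""
    return n[(i, j)] / sum(c for (a, _), c in n.items() if a == i)


def chi_square():
    """Sum of (observed - expected)^2 / expected over all twelve triplets."""
    stat = 0.0
    for trip, observed in triplets.items():
        pair_total = sum(c for t, c in triplets.items() if t[:2] == trip[:2])
        expected = pair_total * jump_prob(trip[1], trip[2])
        stat += (observed - expected) ** 2 / expected
    return stat


print(generator())
print(chi_square())
```

The statistic would then be compared with a chi-square critical value; the appropriate degrees of freedom depend on how the constraints are counted and are not fixed by this sketch.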
14. An internet service provider (ISP) is modelling the capacity requirements for its
network. It assumes that if a customer is not currently connected to the internet
("offline") the probability of connecting in the short time interval [t, t + dt] is
0.2dt + o(dt). If the customer is connected to the internet ("online") then it
assumes the probability of disconnecting in the time interval is given by

0.8dt + o(dt).
The probabilities that the customer is online and offline at time t are PON(t) and
POFF(t) respectively.
(i) Explain why the status of an individual customer can be considered as a
Markov Jump Process.
(ii) Write down Kolmogorov's forward equation for P'OFF(t).
(iii) Solve the equation in part (ii) to obtain a formula for the probability that a
customer is offline at time t, given that they were offline at time 0.
(iv) Calculate the expected proportion of time spent online over the period
[0,t].
[HINT: Consider the expected value of an indicator function which takes the
value 1 if offline and 0 otherwise.]
(v) (a) Sketch a graph of your answer to (iv) above.
(b) Explain its shape. [UK April 2008]
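Since PON(t) = 1 - POFF(t), the forward equation reduces to P'OFF(t) = 0.8 - POFF(t), which with POFF(0) = 1 has solution 0.8 + 0.2 exp(-t). Averaging 1 - POFF over [0, t] gives the expected proportion of time online. The sketch below states these closed forms (derived here, not quoted from a marking schedule) and checks the first against a simple Euler integration; function names are illustrative.

```python
import math


def p_off(t):
    """P(offline at time t | offline at time 0)."""
    return 0.8 + 0.2 * math.exp(-t)


def online_fraction(t):
    """Expected proportion of [0, t] spent online, starting offline (t > 0)."""
    return 0.2 - 0.2 * (1 - math.exp(-t)) / t


def euler_p_off(t_end, steps=100_000):
    """Euler integration of P'OFF(t) = 0.8 - POFF(t) from POFF(0) = 1."""
    h = t_end / steps
    p = 1.0
    for _ in range(steps):
        p += h * (0.8 - p)
    return p


print(p_off(2.0), euler_p_off(2.0), online_fraction(0.5), online_fraction(5.0))
```

The online proportion starts near 0 and rises towards the long-run value of 0.2, which is the shape asked about in part (v).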
15. An investigation was carried out into the relationship between sickness and
mortality in an historical population of working class men. The investigation
used a three-state model with the states:
1 Healthy
2 Sick
3 Dead
Let the probability that a person in state i at time x will be in state j at time x+t be
tp^ij_x. Let the transition intensity at time x+t between any two states i and j be
μ^ij_(x+t).
(i) Draw a diagram showing the three states and the possible transitions
between them.
(ii) Show from first principles that
(iii) Write down the likelihood of the data in the investigation in terms of the
transition rates and the waiting times in the Healthy and Sick states, under
the assumption that the transition rates are constant.
The investigation collected the following data:
• man-years in Healthy state: 265
• man-years in Sick state: 140
• number of transitions from Healthy to Sick: 20
• number of transitions from Sick to Dead: 40
(iv) Derive the maximum likelihood estimator of the transition rate from Sick
to Dead.
(v) Hence estimate:
(a) the value of the constant transition rate from Sick to Dead
(b) 95 per cent confidence intervals around this transition rate
[UK April 2008]
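For part (v), an approximate confidence interval can be built from the estimate and its asymptotic standard error. The sketch below assumes the usual normal approximation with standard error sqrt(d) / v, where d is the number of transitions and v the waiting time; the helper name `rate_ci` is illustrative.

```python
import math


def rate_ci(events, exposure, z=1.96):
    """Point estimate and approximate confidence interval for a constant rate."""
    est = events / exposure
    se = math.sqrt(events) / exposure
    return est, (est - z * se, est + z * se)


# Sick -> Dead: 40 transitions over 140 man-years in the Sick state.
estimate, (low, high) = rate_ci(40, 140)
print(estimate, low, high)
```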
16. In the village of Selborne in southern England in the year 1637 the number of
babies born each month was as follows
January 2 July 5
February 1 August 1
March 1 September 0
April 2 October 2
May 1 November 0
June 2 December 3
Data show that over the 20 years before 1637 there was an average of 1.5 births
per month. You may assume that births in the village historically follow a
Poisson process.
An historian has suggested that the large number of births in July 1637 is
unusual.
(i) Carry out a test of the historian's suggestion, stating your conclusion.
(ii) Comment on the assumption that births follow a Poisson process.
[UK Sept 2008]
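A starting point for part (i) is the probability of seeing 5 or more births in a month when monthly births are Poisson with mean 1.5. The sketch below also computes, as a further assumption, the probability that at least one of 12 independent months shows 5 or more births, since July was singled out after the data were seen. Names are illustrative.

```python
import math


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    return 1 - sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))


p_single = poisson_tail(5, 1.5)          # one pre-specified month
p_any_month = 1 - (1 - p_single) ** 12   # at least one month out of twelve
print(p_single, p_any_month)
```

Which of the two probabilities is the relevant one depends on how the hypothesis is framed, and that choice should be stated in the conclusion.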
17. A company pension scheme, with a compulsory scheme retirement age of 65, is
modelled using a multiple state model with the following categories:
1 currently employed by the company
2 no longer employed by the company, but not yet receiving a pension
3 pension in payment, pension commenced early due to ill health retirement
4 pension in payment, pension commenced at scheme retirement age
5 dead
(i) Describe the nature of the state space and time space for this process.
(ii) Draw and label a transition diagram indicating appropriate transitions
between the states.
For i,j in {1,2,3,4,5}, let:
tp^1i_x = the probability that a life is in state i at age x+t, given they are in state 1
at age x
μ^ij_(x+t) = the transition intensity from state i to state j at age x+t
(iii) Write down equations which could be used to determine the evolution of
tp^1i_x (for each i) appropriate for:
(a) x + t < 65.
(b) x + t = 65.
(c) x + t > 65. [UK Sept 2008]
18. There is a population of ten cats in a certain neighbourhood. Whenever a cat

which has fleas meets a cat without fleas, there is a 50% probability that some of
the fleas transfer to the other cat such that both cats harbour fleas thereafter.
Contacts between two of the neighbourhood cats occur according to a Poisson
process with rate μ, and these meetings are equally likely to involve any of the
possible pairs of individuals. Assume that once infected a cat continues to have
fleas, and that none of the cats' owners has taken any preventative measures.
(i) If the number of cats currently infected is x, explain why the number of
possible pairings of cats which could result in a new flea infection is
x(10 – x).
(ii) Show how the number of infected cats at any time, X(t), can be formulated
as a Markov jump process, specifying:
(a) the state space
(b) the Kolmogorov differential equations in matrix form
(iii) State the distribution of the holding times of the Markov jump process.
(iv) Calculate the expected time until all the cats have fleas, starting from a
single flea-infected cat. [UK April 2009]
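Part (iv) can be sketched by summing expected holding times. The code below assumes that μ is the total rate of meetings across all pairs, that each of the 45 pairs is equally likely, and that a meeting between an infected and an uninfected cat transfers fleas with probability 1/2, giving a rate of μ x(10 - x)/90 out of state x. The result is expressed as a multiple of 1/μ; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def expected_time_coefficient(n_cats=10, p_transfer=Fraction(1, 2)):
    """Expected time until every cat has fleas, in units of 1/mu, from one infected cat."""
    pairs = comb(n_cats, 2)
    total = Fraction(0)
    for x in range(1, n_cats):
        # Rate out of state x, divided by mu.
        rate_per_mu = p_transfer * Fraction(x * (n_cats - x), pairs)
        # Mean holding time in state x is the reciprocal of the rate.
        total += 1 / rate_per_mu
    return total


coef = expected_time_coefficient()
print(coef, float(coef))
```

If μ were instead read as a rate per pair, the coefficient would scale by a factor of 45, so the interpretation of μ should be stated with the answer.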
19. An investigation into mortality by cause of death used the four-state Markov
model shown below.
(i) Show from first principles that
The investigation was carried out separately for each year of age, and the
transition intensities were assumed to be constant within each single year of age.
(ii) (a) Write down, defining all the terms you use, the likelihood for the
transition intensities.
(b) Derive the maximum likelihood estimator of the force of mortality
from heart disease for any single year of age.
The investigation produced the following data for persons aged 64 last birthday:
Total waiting time in the state Alive 1,065 person-years

Number of deaths from heart disease 34
Number of deaths from cancer 36
Number of deaths from other causes 42
(iii) (a) Calculate the maximum likelihood estimate (MLE) of the force of
mortality from heart disease at age 64 last birthday.
(b) Estimate an approximate 95% confidence interval for the MLE of
the force of mortality from heart disease at age 64 last birthday.
(iv) Discuss how you might use this model to analyse the impact of risk
factors on the death rate from heart disease and suggest, giving reasons, a
suitable alternative model. [UK April 2009]
20. The complaints department of a company has two employees, both of whom
work five days per week.
The company models the arrival of complaints using a Poisson process with rate
1.25 per working day.
(i) List the assumptions underlying the Poisson process model.
On receipt of a complaint, it is immediately assessed as being straightforward, of
medium difficulty or complicated. 60% of cases are assessed as straightforward
and 10% are assessed as complicated. The time taken in person-days' effort to
prepare responses is assumed to follow an exponential distribution, with
parameters 2 for straightforward complaints, 1 for medium difficulty complaints
and 0.25 for complicated complaints.
(ii) Calculate the average number of person-days' work expected to be
generated by complaints arriving during a five-day working week.
(iii) Define a state space under which the number of outstanding complaints
can be modelled as a Markov jump process.
The company has a service standard of responding to complaints within a fixed

number of days of receipt. It is considering using this Markov jump process to
model the probability of failing to meet this service standard.
(iv) Discuss the appropriateness of using the model for this purpose, with
reference to the assumptions being made. [UK Sept 2009]
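The arithmetic in part (ii) can be laid out as below. The sketch assumes that the exponential parameters are rates, so that the mean effort is the reciprocal of the parameter, and that medium difficulty complaints make up the remaining 30% of cases. Variable names are illustrative.

```python
from fractions import Fraction

arrival_rate = Fraction(5, 4)  # complaints per working day
working_days = 5

# category: (share of complaints, exponential rate parameter)
mix = {
    "straightforward": (Fraction(3, 5), Fraction(2)),
    "medium": (Fraction(3, 10), Fraction(1)),       # assumed remainder: 1 - 0.6 - 0.1
    "complicated": (Fraction(1, 10), Fraction(1, 4)),
}

# Mean person-days per complaint: weighted average of the exponential means 1/rate.
mean_effort = sum(share / rate for share, rate in mix.values())
weekly_work = arrival_rate * working_days * mean_effort
print(mean_effort, weekly_work)
```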
21. A researcher is studying a certain incurable disease. The disease can be fatal, but
often sufferers survive with the condition for a number of years. The researcher
wishes to project the number of deaths caused by the disease by using a multiple
state model with state space:
{H – Healthy, I – Infected, D(from disease) – Dead (caused by the disease),
D(not from disease) – Dead (not caused by the disease)}.
The transition rates, dependent on age x, are as follows:

• a mortality rate from the Healthy state of μ(x)
• a rate of infection with the disease σ(x)
• a mortality rate from the Infected state of υ(x) of which ρ(x) relates to Deaths
caused by the disease
(i) Draw a transition diagram for the multiple state model.
(ii) Write down Kolmogorov's forward equations governing the transitions
by specifying the transition matrix.
(iii) Determine integral expressions, in terms of the transition rates and any
expressions previously determined, for:
(a) $P_{HH}(x, x + t)$
(b) $P_{HI}(x, x + t)$
(c) $P_{HD(\text{from disease})}(x, x + t)$ [UK Sept 2009]
22. A government has introduced a two-tier driving test system. Once someone
applies for a provisional licence they are considered a Learner driver. Learner
drivers who score 90% or more on the primary examination (which can be taken
at any time) become Qualified. Those who score between 50% and 90% are
obliged to sit a secondary examination and are given driving status Restricted.
Those who score 50% or below on the primary examination remain as Learners.
Restricted drivers who pass the secondary examination become Qualified, but
those who fail revert back to Learner status and are obliged to start again.
(i) Sketch a diagram showing the possible transitions between the states.
(ii) Write down the likelihood of the data, assuming transition rates between
states are constant over time, clearly defining all terms you use.
Figures over the first year of the new system based on those who applied for a
provisional licence during that time in one area showed the following:
Person-months in Learner State 1,161

Person-months in Restricted State 1,940
Number of transitions from Learner to Restricted 382
Number of transitions from Restricted to Learner 230
Number of transitions from Restricted to Qualified 110
Number of transitions from Learner to Qualified 217
(iii) (a) Derive the maximum likelihood estimator of the transition rate
from Restricted to Learner.
(b) Estimate the constant transition rate from Restricted to Learner.
[UK April 2010]
23. A reinsurance policy provides cover in respect of a single occurrence of a

specified catastrophic event. If such an event occurs, future cover is suspended.
However if a reinstatement premium is paid within one time period of
occurrence of the event then the insurance coverage is reinstated. If a second
specified event occurs it is not permitted to reinstate the cover and the policy will
lapse.
The transition rate for the hazard of the specified event is a constant 0.1. Whilst
policies are eligible for reinstatement, the transition rate for resumption of cover
through paying a reinstatement premium is 0.05.
(i) Explain whether a time homogeneous or time inhomogeneous model

would be more appropriate for modelling this situation.
(ii) (a) Explain why a model with state space {Cover In Force, Suspended,
Lapsed} does not possess the Markov property.
(b) Suggest, giving reasons, additional state(s) such that the expanded
system would possess the Markov property.
(iii) Sketch a transition diagram for the expanded system.
(iv) Derive the probability that a policy remains in the Cover In Force state
continuously from time 0 to time t.
(v) Derive the probability that a policy is in the Suspended state at time t > 1
if it is in state Cover In Force at time 0. [UK April 2010]
24. A study is undertaken of marriage patterns for women in a country where

bigamy is not permitted. A sample of women is interviewed and asked about the
start and end dates of all their marriages and where the marriages had ended,
whether this was due to death or divorce (all other reasons can be ignored). The
investigators are interested in estimating the rate of first marriage for all women
and the rate of re-marriage among widows.
(i) Draw a diagram illustrating a multiple-state model which the
investigators could use to make their estimates, using the four states:
"Never married", "Married", "Widowed" and "Divorced".
(ii) Derive from first principles the Kolmogorov differential equation for first
marriages.
(iii) Write down the likelihood of the data in terms of the waiting times in each
state, the numbers of transitions of each type, and the transition
intensities, assuming the transition intensities are constant.
(iv) Derive the maximum likelihood estimator of the rate of first marriage.
[UK Sept 2010]
25. At a certain airport, taxis for the city centre depart from a single terminus. The
taxis are all of the same make and model, and each can seat four passengers (not
including the driver). The terminus is arranged so that empty taxis queue in a
single line, and passengers must join the front taxi in the line. As soon as it is full,
each taxi departs. A strict environmental law forbids any taxi from departing
unless it is full. Taxis are so numerous that there is always at least one taxi
waiting in line.
Customers arrive at the terminus according to a Poisson process with a rate β per
minute.
(i) Explain how the number of passengers waiting in the front taxi can be
modelled as a Markov jump process.
(ii) Write down, for this process:

(a) the generator matrix
(b) Kolmogorov's forward equations in component form
(iii) Calculate the expected time a passenger arriving at the terminus will have
to wait until his or her taxi departs.
The four-passenger taxis were highly polluting, and the government instituted a
"scrappage" scheme whereby taxi drivers were given a subsidy to replace their
old four-passenger taxis with new "greener" models. Two such models were on
the market, one of which had a capacity of three passengers and the other of
which had a capacity of five passengers (again, not including the driver in each
case). Half the taxis were replaced with three-passenger models, and half with
five-passenger models.
Assume that, after the replacement, three-passenger and five-passenger models

arrive randomly at the terminus.
(iv) Write down the transition matrix of the Markov jump chain describing the
number of passengers in the front taxi after the vehicle replacement.
(v) Calculate the expected waiting time for a passenger arriving at the
terminus after the vehicle scrappage scheme and compare this with your
answer to part (iii). [UK Sept 2010]
26. (i) Define a Markov jump process.

A study of a tropical disease used a three-state Markov process model with
states:
1. Not suffering from the disease

2. Suffering from the disease
3. Dead
The disease can be fatal, but most sufferers recover. Let ${}_t p_x^{ij}$ be the probability
that a person in state i at age x is in state j at age x+t. Let $\mu_{x+t}^{ij}$ be the transition
intensity from state i to state j at age x+t.
(ii) Show from first principles that:
The study revealed that sufferers who contract the disease a second or
subsequent time are more likely to die, and less likely to recover, than first-time
sufferers.
(iii) Draw a diagram showing the states and possible transitions of a model
which allows for this effect yet retains the Markov property.
[UK April 2011]
27. A recording instrument is set up to observe a continuous time process, and stores
the results for the most recent 250 transitions. The data collected are as follows:
State i   Total time spent        Number of transitions to
          in state i (hours)   State A   State B   State C
A                 35              N/A       60        45
B                150               50      N/A        25
C                210               55       15       N/A
(N/A = not applicable)
It is proposed to fit a Markov jump model using the data.
(i) (a) State all the parameters of the model.

(b) Outline the assumptions underlying the model.
(ii) (a) Estimate the parameters of the model.

(b) Write down the estimated generator matrix of the model.
(iii) Specify the distribution of the number of transitions from state i to state j,
given the number of transitions out of state i. [UK Sept 2011]
28. A continuous-time Markov process with states {Able to work (A), Temporarily
unable to work (T), Permanently unable to work (P), Dead (D)} is used to model
the cost of providing an incapacity benefit when a person is permanently unable
to work. The generator matrix, with rates expressed per annum, for the process is
estimated as:
        A       T       P       D
A    -0.15    0.10    0.02    0.03
T     0.45   -0.60    0.10    0.05
P     0       0      -0.20    0.20
D     0       0       0       0
(i) Draw the transition graph for the process.
(ii) Calculate the probability of a person remaining in state A for at least 5

years continuously.
Define F(i) to be the probability that a person, currently in state i, will never be in
state P.
(iii) Derive an expression for:
(a) F(A) by conditioning on the first move out of state A.

(b) F(T) by conditioning on the first move out of state T.
(iv) Calculate F(A) and F(T).
(v) Calculate the expected future duration spent in state P, for a person
currently in state A. [UK Sept 2011]
29. An investigation was conducted into the effect marriage has on mortality and a
model was constructed with three states: 1 Single, 2 Married and 3 Dead. It is
assumed that transition rates between states are constant.
(i) Sketch a diagram showing the possible transitions between states.
(ii) Write down an expression for the likelihood of the data in terms of
transition rates and waiting times, defining all the terms you use.
The following data were collected from information on males and females in
their thirties.
Years spent in Married state 40,062

Years spent in Single state 10,298
Number of transitions from Married to Single 1,382
Number of transitions from Single to Dead 12
Number of transitions from Married to Dead 9
(iii) Derive the maximum likelihood estimator of the transition rate from
Single to Dead.
(iv) Estimate the constant transition rate from Single to Dead and its variance.
[UK April 2012]
30. The volatility of equity prices is classified as being High (H) or Low (L) according
to whether it is above or below a particular level. The volatility status is assumed
to follow a Markov jump process with constant transition rates ϕLH = μ and
ϕHL = ρ.
(i) Write down the generator matrix of the Markov jump process.
(ii) State the distribution of holding times in each state.
A history of equity price volatility is available over a representative time period.
(iii) Explain how the parameters μ and ρ can be estimated.
Let $P_{ij}(s, s+t)$ be the probability that the process is in state j at time s+t given that it
was in state i at time s (i, j = H, L), where t ≥ 0. Let $\bar{P}_{ii}(s, s+t)$ be the probability that
the process remains in state i from time s to time s+t.
(iv) Write down Kolmogorov‘s forward equations for and
Equity price volatility is Low at time zero.
(v) Derive an expression for the time after which there is a greater than 50%
chance of having experienced a period of high equity price volatility.
(vi) Solve the Kolmogorov equation to obtain an expression for ( ).

[UK Sept 2012]
31. On a small distant planet lives a race of aliens. The aliens can die in one of two
ways, either through illness, or by being sacrificed according to the ancient
custom of the planet. Aliens who die from either cause may, some time later,
become zombies.
(i) Draw a multiple-state diagram with four states illustrating the process by
which aliens die and become zombies, labelling the four states and the
possible transitions between them.
(ii) Write down the likelihood of the process in terms of the transition
intensities, the numbers of events observed and the waiting times in the
relevant states, clearly defining all the terms you use.
(iii) Derive the maximum likelihood estimator of the death rate from illness.
The aliens take censuses of their population every ten years (where the year is an
"alien year", which is the length of time their planet takes to orbit their sun). On
1 January in alien year 46,567, there were 3,189 live aliens in the population. On 1
January in alien year 46,577 there were 2,811 live aliens in the population. During
the intervening ten alien years, a total of 3,690 aliens died from illness and 2,310
were sacrificed, and the annual death rates from illness and sacrifice were
constant and the same for each alien.
(iv) Estimate the annual death rates from illness and from sacrifice over the
ten alien years between alien years 46,567 and 46,577.
The rate at which aliens who have died from either cause become zombies is 0.1
per alien year.
(v) Calculate the probabilities that an alien alive in alien year 46,567 will, ten
alien years later:
(a) still be alive (b) be dead but not a zombie [UK Sept 2012]
32. During a football match, the referee can caution players if they commit an offence
by showing them a yellow card. If a player commits a second offence which the
referee deems worthy of a caution, they are shown a red card, and are sent off the
pitch and take no further part in the match. If the referee considers a particularly
serious offence has been committed, he can show a red card to a player who has
not previously been cautioned, and send the player off immediately.
The football team manager can also decide to substitute one player for another at
any point in the match so that the substituted player takes no further part in the
match. Due to the risk of a player being sent off, the manager is more likely to
substitute a player who has been shown a yellow card. Experience shows that
players who have been shown a yellow card play more carefully to try to avoid a
second offence.
The rate at which uncautioned players are shown a yellow card is 1/10 per hour.
The rate at which those players who have already been shown a yellow card are
shown a red card is 1/15 per hour.
The rate at which uncautioned players are shown a red card is 1/40 per hour.
The rate at which players are substituted is 1/10 per hour if they have not been
shown a yellow card, and 1/5 if they have been shown a yellow card.
(i) Sketch a transition graph showing the possible transitions between states
for a given player.
(ii) Write down the compact form of the Kolmogorov forward equations,
specifying the generator matrix.
A football match lasts 1.5 hours.

(iii) Solve the Kolmogorov equation for the probability that a player who
starts the match remains in the game for the whole match without being
shown a yellow card or a red card.
(iv) Calculate the probability that a player who starts the match is sent off
during the match without previously having been cautioned.
Consider a match that continued indefinitely rather than ending after 1.5 hours.
(v) (a) Derive the probability that in this instance a player is sent off
without previously having been cautioned.
(b) Explain your result. [UK April 2013]
33. Outside an apartment block there is a small car park with three parking spaces. A
prospective purchaser of an apartment in the block is concerned about how often
he would return in his car to find that there was no empty parking space
available. He decides to model the number of parking spaces free at any time
using a time homogeneous Markov Jump Process where:
• The probability that a car will arrive seeking a parking space in a short
interval dt is A.dt + o(dt).
• For each car which is currently parked, the probability that its owner
drives the car away in a short interval dt is B.dt + o(dt) where A, B > 0.
(i) Specify the state space for the above process.

(ii) Draw a transition graph of the process.
(iii) Write down the generator matrix for the process.
(iv) Derive the probability that, given all the parking spaces are full, they will
remain full for at least the next two hours.
(v) Explain what is meant by a jump chain.
(vi) Specify the transition matrix for the jump chain associated with this
process.
Suppose there are currently two empty parking spaces.

(vii) Determine the probability that all the spaces become full before any cars
are driven away.
(viii) Derive the probability that the car park becomes full before the car park
becomes empty.
(ix) Comment on the prospective purchaser's assumptions regarding the
arrival and departure of cars. [UK Sept 2013]
34. In a computer game a player starts with three lives. Events in the game which
cause the player to lose a life occur with a probability dt + o(dt) in a small time
interval dt.
However the player can also find extra lives. The probability of finding an extra
life in a small time interval dt is dt + o(dt). The game ends when a player runs
out of lives.
(i) Outline the state space for the process which describes the number of lives
a player has.
(ii) Draw a transition graph for the process, including the relevant transition
rates.
(iii) Determine the generator matrix for the process.
(iv) Explain what is meant by a Markov jump chain.
(v) Determine the transition matrix for the jump chain associated with the
process.
(vi) Determine the probability that a game ends without the player finding an
extra life. [UK April 2015]
ANSWERS
1. (i)
(iv) 2/3
(v) Make mortality and marriage rates age dependent.
Divorce rate dependent on duration of marriage.
Divorce rate dependent on whether previously divorced.
Make mortality rate marital status-dependent.
2. (i)
(ii) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
(iii) (a) ( )
(iv) ( ) [ ] (v) 0.1049
3. (i) ( ) ( )
(ii) ̂
(iii) 0.00736
4. (ii) Since holding times are independent, each having an exponential distribution,
their joint density is ( )
5. (i)
(ii) For i = j(<N), the solution in (ii) implies that ( ) so that the
distribution of the holding times T0, T1,..., TN -1 is exponential with parameter .
For i = N, this is obviously not true; once the chain reaches state N, it stays there
forever.
6. (i) ̅̅̅̅ ( )
(iii) (a)
(b) Initially probability increases from 0 at T = 0, and accelerates as the transition

rate from A to B increases.
However, as transitions increase, it becomes more likely that the process has
already visited state B and jumped back to A. Therefore the probability of being
in the first visit to B tends (exponentially) to zero.
(c) t=1
8. (i) (a) A continuous-time Markov process $\{X_t, t \ge 0\}$ with a discrete state space S is
called a Markov jump process.
(b) In the case where the probabilities $P[X_t = j \mid X_s = i]$ for i, j in S and
$0 \le s \le t$ depend only on the length of the time interval t - s, the process is called
time-homogeneous.
(ii) A model with time-inhomogeneous rates has more parameters, and there
may not be sufficient data available to estimate these parameters. Also, the
solution to Kolmogorov's equations may not be easy (or even possible) to find
analytically.
(iv) (a) Pr(Waiting time > T-w|Xw = S) 0 ∫ ( ( ) ( )) 1
( )
(b) ( ) ( )
(c) PHH(,w).( (w) + (w)) dw
(v) ∫ ( ) ( ) , ∫ ( ( ) ( )) -
(vi) (a) This is likely to improve the predictive power of the model because:
(i) There is empirical evidence that recovery rates depend on the duration
of the sickness.
(ii) The limit of 6 months on sick pay may cause some durational effects around
this point.
However this would make the model more complicated to analyse, and increase
the volume of data required to fit parameters reliably.
(b) For individuals in employment mortality rates are likely to be low, and may
be ignorable. It is less likely that mortality out of state S could be excluded.
9. (i) Generator matrix, with states ordered A, B:
      A     B
A    -σ     σ
B     ρ    -ρ
(ii) The distribution is exponential in both cases; with parameter σ in state A, ρ in

state B.
(iii) 0.54 weeks
(iv) The time spent in state A before the next visit to B has mean 1/σ.
Therefore a reasonable estimate for σ is the reciprocal of the mean length of each
visit:
$\hat{\sigma}$ = (Number of transitions from A to B) / (Total time spent in state A up until
the last transition from A to B).
Similarly we can estimate $\hat{\rho}$.
(v) Testing whether the successive holding times are exponential variables and
independent would be best. Any procedure which does this test is acceptable.
10. (i) The probability that an event occurs during the short time interval between t
and t + h is approximately equal to λ(t).h for small h where λ(t) is called the rate
of the process. For a time-inhomogeneous process, λ(t) depends on the current
time t; for a time-homogeneous process it is independent of time.
(ii) (a) Divide the time period into intervals of a suitable size, say one month.
Estimate the arrival rate separately for each time period.
See if the observed data match the pattern which would be expected if the model
were accurate and if the parameters had their values given by their estimates.
If not, the model should be revised.
(b) A goodness of fit test, such as the chi-squared test, should be carried out for
each time period chosen. Tests for serial correlation [e.g. portmanteau test] should
use the whole data set at once.
(iii) (a) This implies that claims are seasonal with period 12 months, and that
claims in the peak (presumably winter) are double those at the low point of the
year. This would be reasonable in a climate where driving conditions are worse
in winter.
(d) Solution is of the same form, except that for the homogeneous case
f(s,t) = λ(t-s).
11. (i) (a)
(b) The chosen model ignores death among persons in the relevant age groups.
Since mortality in this age group among professional people is likely to be low,
this seems reasonable.
This diagram assumes that demotion is possible, i.e. someone who has become a
partner can return to non-partnership status without leaving the company.
The assumption is also made that a new employee joining from another company
can do so as a partner.
(ii) (a) Assume we have data on N individuals (i = 1, ..., N).
We should need to know for each individual:
• the total waiting time during the calendar years 1997–2006 in state (1) when
aged 30 last birthday
• whether or not the individual was made a partner between exact ages 30 and
31 years during the calendar years 1997–2006 while remaining in the company.
(b) ∏ , ( ) -( ) where vi is the waiting time at age 30

last birthday in state (1) for individual i.
di is an indicator variable such that di = 1 if individual i was made a partner

while aged 30 last birthday during the period of the investigation and di = 0
otherwise.
(c) The maximum likelihood estimate is $\sum_i d_i / \sum_i v_i$, with both sums taken over i = 1, ..., N.
12. (ii) (a) A jump chain is formed by recording the state of a Markov jump process
only at the instant when a transition has just been made. The jump chain is in
itself a Markov chain.
(b) The outcome of the jump chain can only differ from that of the standard
Markov chain if the jump process enters an absorbing state. As the jump process
will make no further transitions once it enters an absorbing state, the jump chain
"stops". It is possible to model the jump chain as though transitions continue to
occur but the chain continues to occupy the same state.
(iii) The possible states are 0 to N desks in use with no passengers queuing, and
N desks in use with 0, 1, 2, ….. passengers in the queue.
When all desks are occupied and there are M passengers in the queue denote the
state as N:M.
State space is: {0, 1, 2, …., N - 1, N : 0, N : 1, N : 2, …..}
(iv)
(v) Poisson process is usually suitable for arrivals at a service point. Rate may be
time inhomogeneous because passengers may aim to arrive a couple of hours
before the flight — so a time-inhomogeneous Poisson process may be better.
However if the airline operates many flights this may not be an issue. Passengers
may be checked-in in family groups rather than individually. There is likely to be
a minimum time for processing a check-in due to standard security questions etc,
so exponential distribution may not hold.
(vi)(a)
(b) This is the probability that all the first N transitions are to the right in the
transition diagram. The probability of each transition is given by the elements in
the upper half of the jump chain transition matrix in (vi)(a). Required probability
is therefore ∏
13. (i) (a) The parameters are:
• the rate of leaving state i, λi, for each i,
• the jump-chain transition probabilities, rij, for j ≠ i, where rij is the conditional
probability that the next transition is to state j given the current state is i.
(b) The assumptions are as follows.
• The holding time in each state is exponentially distributed. The parameter of

this distribution varies only by state i. The distribution is independent of
anything that happened prior to the current arrival in state i.
• The destination of the jump on leaving state i is independent of holding time,

and of anything that happened prior to the current arrival in state i.
(ii) (a) $\hat{r}_{AB} = 11/20$, $\hat{r}_{AC} = 9/20$, $\hat{r}_{BA} = 16/25$, $\hat{r}_{BC} = 9/25$, $\hat{r}_{CA} = 24/27 = 8/9$ and $\hat{r}_{CB} = 1/9$
(b) A =
(iii) Distribution is binomial with mean n.rij and variance n.rij (1 - rij), where n is
the given number of transitions.
(iv) T.S =7.7335, no evidence to reject H0.

(v) Holding times — are these exponentially distributed?
A chi-squared goodness of fit test would be appropriate
Is destination of jump independent of the holding time?
There is no obvious test statistic for doing this. A suitable test would be to
classify jumps as being from short, medium and long holding times and
investigating these graphically.
(vi) Approximate and Exact Method ( Mention theory)
14. (i) Operates in continuous time (t ≥ 0) with discrete state space {ONline, OFFline},
and transition probability does not depend on history prior to arrival in current
state (Markov property).
(ii) $P'_{OFF}(t) = 0.8\,P_{ON}(t) - 0.2\,P_{OFF}(t)$
(iii) $P_{OFF}(t) = 0.8 + 0.2e^{-t}$
(iv) ( )
(v)
Shape: starts at zero as given offline at that point, asymptotes to ratio of
connection to (connection + disconnection) rates.
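As a quick numerical check of parts (ii) and (iii), the sketch below integrates the forward equation with a simple Euler scheme and compares the result with the closed form. It assumes the process starts offline, so that P_OFF(0) = 1, which is what the stated solution implies; the function names and step count are illustrative only.

```python
import math


def p_off_closed_form(t):
    """Closed-form answer to part (iii): P_OFF(t) = 0.8 + 0.2 * exp(-t)."""
    return 0.8 + 0.2 * math.exp(-t)


def p_off_euler(t_end, steps=100_000):
    """Euler integration of P_OFF'(t) = 0.8 * P_ON(t) - 0.2 * P_OFF(t)."""
    dt = t_end / steps
    p_off = 1.0  # assumption: the process is offline at time 0
    for _ in range(steps):
        p_on = 1.0 - p_off  # two-state model, so probabilities sum to 1
        p_off += dt * (0.8 * p_on - 0.2 * p_off)
    return p_off


for t in (0.5, 1.0, 5.0):
    print(t, round(p_off_euler(t), 4), round(p_off_closed_form(t), 4))
```

The two columns should agree to the displayed precision, and both tend to 0.8 as t grows.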
15. (iii) $\exp[-(\mu_{12} + \mu_{13})v_1]\,\exp[-(\mu_{23} + \mu_{21})v_2]\,\mu_{12}^{d_{12}}\,\mu_{21}^{d_{21}}\,\mu_{13}^{d_{13}}\,\mu_{23}^{d_{23}}$, where $v_i$ is the
total observed waiting time in state i, and $d_{ij}$ is the number of transitions
observed from state i to state j.
(iv) ̂
(v) (a) 0.2857 (b) (0.1972, 0.3742)
16. (ii) The assumption that births follow a Poisson process is unlikely to be entirely
realistic EITHER because of the occurrence of multiple births
(twins and triplets) OR because births tend to occur seasonally OR because the
process might be time inhomogeneous.
17. (i) The state space is discrete with states as given in the question.
The process operates in continuous time.

However, at the compulsory scheme retirement age of 65 there is a discrete step
change.
This is sometimes described as a mixed process.
(ii)
(iii)
18. (i) There are x infected cats and hence 10 – x uninfected cats. Flea transmission
requires one of the x infected cats to meet one of the (10 - x) uninfected cats.
(ii) The total number of pairings of cats is 10C2= 45.

So the probability of a meeting resulting in an increase in the number of cats
with fleas is 0.5x(10 - x)/45.
As this depends only on the number of cats currently infected, and meetings
occur according to a Poisson process, the number of infected cats over time
follows a Markov jump process.
(a) The state space is the number of cats infected {0, 1, 2, ..., 10}
(iii) Holding times are exponentially distributed, with mean 90/(μx(10 - x)).

(iv) 50.92/μ
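The coefficient in part (iv) can be reproduced by summing the mean holding times from part (iii). This sketch assumes the quantity asked for is the expected time for the number of infected cats to rise from 1 to 10, and it works in units of 1/μ; the constant and function names are illustrative.

```python
from fractions import Fraction
from math import comb

N_CATS = 10
P_TRANSFER = Fraction(1, 2)   # chance that a mixed meeting passes on fleas
PAIRINGS = comb(N_CATS, 2)    # 45 possible pairs of cats


def mean_holding_time(x):
    """Mean time in state x (x cats infected), in units of 1/mu."""
    return PAIRINGS / (P_TRANSFER * x * (N_CATS - x))


# Assumption: start with 1 infected cat and stop when all 10 are infected.
total = sum(mean_holding_time(x) for x in range(1, N_CATS))
print(total, float(total))
```

Working with `Fraction` keeps the sum exact until the final conversion to a decimal.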
19. (iii) (a) 0.0319249 (b) (0.0212,0.0427)

(iv) Using the four state model, the lives in the investigation would have to be
stratified according to the risk factors and the transition intensities estimated
separately for each stratum.
This is likely to run into problems of small numbers.
Using a Cox regression model with death from heart disease as the event of
interest and the risk factors as covariates would avoid this problem.
Lives who died from other causes could be treated as censored at the durations
when they died.
20. (ii) 6.25 person-days

(iii) r – straightforward, s – medium, t – complicated.
where r = 0,1,2,3,4,5,…., s = 0,1,2,3,4,5,…… and t = 0,1,2,3,4,5,…..
(iv) EITHER The model will only give an approximation. OR The model is not
suitable for this purpose.
The model could not be used to do this without extending the state space to
consider the time the complaint has been in the queue. There are only two
employees, so holidays and sickness are important factors not taken into
account.
The model assumes complaints are time-homogeneous. We do not know the
nature of the business, but for some industries complaints would be seasonal
e.g. holiday companies.
The model assumes that complaint arrivals are independent, but more
complaints might be expected if the company has had a quality control
problem at a particular time. If struggling to meet the service standard, action
would be taken, such as overtime, or prioritising easy cases. Staff may be
able to deal with complaints which are similar to other recent complaints very
quickly, using standard 'template' responses.
The memoryless property is unlikely to be realistic as the work required to
complete the case could be assessed and then worked through to a schedule.
The Markov jump process could be used to estimate the probability that a
complaint is responded to within a given number of days of receipt.
So the model could be used to estimate the probability of a complaint not being
responded to in the stated time, that is the failure to meet the service standard.
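The figure in part (ii) follows from multiplying the expected number of complaints in a week by the mean effort per complaint. The sketch below assumes the remaining 30% of complaints are of medium difficulty and that each exponential parameter is a rate, so the mean effort is its reciprocal; the variable names are illustrative.

```python
ARRIVAL_RATE_PER_DAY = 1.25
DAYS_PER_WEEK = 5

# (proportion of complaints, exponential rate of effort in person-days)
complaint_types = {
    "straightforward": (0.60, 2.0),
    "medium": (0.30, 1.0),      # assumption: 100% - 60% - 10%
    "complicated": (0.10, 0.25),
}

# Mean of an exponential distribution with rate r is 1 / r.
mean_effort = sum(p / rate for p, rate in complaint_types.values())
expected_complaints = ARRIVAL_RATE_PER_DAY * DAYS_PER_WEEK
expected_work = expected_complaints * mean_effort

print(mean_effort, expected_complaints, expected_work)
```

The mean effort works out at one person-day per complaint, so the weekly workload equals the expected number of arrivals.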
21. (i)
(iii)
22. (i)
(ii) *( ) + *( ) + (iii) ̂
23. (i) A time inhomogeneous model should be used.

This is because transition probabilities out of the "Suspended" state between times s
and t may depend not only on the time difference t – s but also on the duration s
the policy has been in that state (e.g. the probability of remaining in the
suspended state for t = 0.75 and s = 0.25 is exp(–0.025), but the probability for t =
1.25 and s = 0.75 is 0).
(ii) (a) A model with this state space would not satisfy the Markov property
because a policy can only be reinstated once, so if in state Cover in Force we
would need to know if the policy has previously been Suspended.
(b) A Markov model could be obtained by expanding the state space to {Cover In
Force, Suspended, Reinstated, Lapsed}. In this case the future transitions will
depend only on the state currently occupied and duration, irrespective of
previous states.
(iv) exp(-0.1t) (v) 0.1025exp(-0.1t)
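The constant 0.1025 in part (v) can be checked numerically. The sketch below assumes that, to be Suspended at time t > 1, the first event must have occurred at some time s in (t - 1, t) with no reinstatement between s and t, and that a policy suspended for more than one period has lapsed. Under that assumption the integral has the closed form 2(e^0.05 - 1)exp(-0.1t). Function names are illustrative.

```python
import math

HAZARD = 0.1      # rate of the specified event
REINSTATE = 0.05  # rate of reinstatement while eligible


def p_suspended_numeric(t, steps=100_000):
    """Midpoint-rule integral over the event time s in (t - 1, t)."""
    if t <= 1:
        raise ValueError("this sketch only covers t > 1")
    ds = 1.0 / steps
    total = 0.0
    for k in range(steps):
        s = (t - 1.0) + (k + 0.5) * ds
        density_event_at_s = HAZARD * math.exp(-HAZARD * s)
        not_reinstated = math.exp(-REINSTATE * (t - s))
        total += density_event_at_s * not_reinstated * ds
    return total


def p_suspended_closed_form(t):
    return 2.0 * (math.exp(0.05) - 1.0) * math.exp(-HAZARD * t)


print(round(2.0 * (math.exp(0.05) - 1.0), 4))
print(p_suspended_numeric(2.0), p_suspended_closed_form(2.0))
```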
24. (i)
(ii)
(iii)
(iv) ̂
25. (i) A Markov jump process is a continuous-time Markov process with a discrete
state space.
For a process to be Markov, the future development of the process must depend
only on its current state.
This is the case here, as the future of the process depends only on the number of
passengers currently in the front taxi.
The number of passengers in the front taxi also has a discrete state space
{0, 1, 2, 3}.
(ii) (iv) (v)
26. (i) A Markov jump process is a continuous time, discrete state process.
(ii)
27. (i) (a) The parameters are the rate of leaving state i, λi, for each i, and the jump-
chain transition probabilities, rij, for j ≠ i, where rij is the conditional probability
that the next transition is to state j given the current state is i.
(b) The assumptions are as follows.
EITHER The holding time in each state is exponentially distributed OR The
transition intensities from each state are not time-dependent.
The parameter of this distribution varies only by state i, so that the distribution
is independent of anything that happened prior to the arrival in current state i.
The destination of the jump on leaving state i is independent of holding time,
and of anything that happened prior to the current arrival in state i.
(ii) (a) r̂_AB = 60/105 = 4/7, r̂_AC = 45/105 = 3/7, r̂_BA = 50/75 = 2/3,
r̂_BC = 25/75 = 1/3, r̂_CA = 55/70 = 11/14 and r̂_CB = 15/70 = 3/14
(b)
(iii) EITHER Binomial, with mean n.rij and variance n.rij.(1 – rij), n being the
number of transitions out of state i.
28. (i) (ii) 0.472
(iii) (a) F(A) =2/3*F(T)+1/5 (b) F(T) =3/4*F(A)+1/12

(iv) F(A) = 23/45 and F(T) = 7/15 (v) 22/9 years
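The two equations in (iii) form a small linear system, so the values quoted in (iv) can be checked by substitution. A minimal Python sketch using exact fractions; the function name is illustrative and not part of the workbook:

```python
from fractions import Fraction

def solve_first_passage():
    """Solve F(A) = 2/3*F(T) + 1/5 and F(T) = 3/4*F(A) + 1/12 exactly."""
    a_coef, a_const = Fraction(2, 3), Fraction(1, 5)    # F(A) = a_coef*F(T) + a_const
    t_coef, t_const = Fraction(3, 4), Fraction(1, 12)   # F(T) = t_coef*F(A) + t_const
    # Substitute the second equation into the first and solve for F(A).
    f_a = (a_coef * t_const + a_const) / (1 - a_coef * t_coef)
    f_t = t_coef * f_a + t_const
    return f_a, f_t

f_a, f_t = solve_first_passage()
print(f_a, f_t)
```

Using Fraction rather than floats keeps the result directly comparable with the fractions given in the answer.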
29. (i)
(ii)
(iii) ̂ (iv) ̂ and variance = 1.13
30. (i) 0 1
(ii) The holding times are exponentially distributed with parameter μ in state L
and ρ in state H.
(iii) The time spent in state L before the next visit to H has mean 1/μ.
Therefore a reasonable estimate for μ is the reciprocal of the mean length of each
visit:
= (Number of transitions from L to H) / (Total time spent in state L)
Similarly estimate for ρ is the reciprocal of the mean length of each visit:
= (Number of transitions from H to L) / (Total time spent in state H)
(iv) (v) T= ln(2)/μ
(vi)
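The estimator described in 30(iii), the number of transitions out of a state divided by the total time spent in it, can be sketched as follows. The counts and exposure times below are hypothetical illustration values, not data from the question:

```python
def estimate_rate(num_transitions, total_time_in_state):
    """Estimate a transition rate as (transitions out of the state) / (time spent in it)."""
    if total_time_in_state <= 0:
        raise ValueError("total time in state must be positive")
    return num_transitions / total_time_in_state

# Hypothetical observations: 12 moves L -> H over 48 time units spent in L,
# and 9 moves H -> L over 18 time units spent in H.
mu_hat = estimate_rate(12, 48.0)
rho_hat = estimate_rate(9, 18.0)
print(mu_hat, rho_hat)
```

The reciprocal of each estimate is the estimated mean holding time in the corresponding state.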
31. (i)
(ii)
(iii) ̂ (iv) 0.123 and 0.077
(v)(a) 0.135 (b) 0.465
32. (i) (iii) 71.36% (iv) 0.03183

(v) (a) 1/9
(b) This is the ratio of the transition rate for "straight to sent off" to the total
transition rate out of state U.
33. (i) The state space is {0,1,2,3} where the number indicates the number of available
spaces.
(ii)
(iii) (iv) P00(2) =exp(-6B)

(v) If a Markov jump process Xt is examined only at the times of transition, the
resulting process is called the jump chain associated with Xt.
(vi) (vii) (A/(A+B)).(A/(A+2B))
(viii)
(ix) A time-inhomogeneous model may be more appropriate. Residents may
come and go at particular times, for example if they drive to work.
They are unlikely to be moving their cars as regularly in the middle of the night.
The assumption of independent arrivals is questionable because a family might have
two cars arriving/leaving at the same time OR people might arrive and wait until a
space becomes available, thus leading to a queue.
The Markov assumption may not be valid because neighbours may know from
experience when cars are moved and time their arrival accordingly.
The model assumes those parking cars are competent drivers, and do not park so
as to take up 2 spaces.
34. (i) {0, 1, 2, 3, 4, …}
(ii)
(iii)
(iv) A jump chain is each distinct state visited in the order visited where the time
set is the times when states are moved between.
(v)
(vi) ( )
TIME SERIES
1. The model fitted to the data is:

X t  5.67  0.61X t 1  et  0.23et 1
The most recently observed value in the series is X20 = 8.2, with estimated
residual e20 = –1.38.
(a) Evaluate estimates x̂ 20 (1) and x̂ 20 (2) for X21 and X22.
(b) The simplest form of the method of exponential smoothing used at
time 19 gave a forecast for X20 of 8.37. Assuming the smoothing
parameter is equal to 0.2, find the forecast of X21.
(c) Give an example of a circumstance in which a form of exponential
smoothing might be expected to outperform Box-Jenkins forecasting in
the prediction of future values of the time series. [UK Sept 2002]
2. An ARIMA process X satisfies the recursion;

X_t = α1 X_{t-1} + α2 X_{t-2} + e_t + β e_{t-1}
where e_t is white noise with variance σ².
(a) Write down a condition in terms of the roots of an equation for X to be
stationary.
(b) Show, in the case where α2 = –0.5, that X is stationary as long as
|α1| < 1.5. [UK Sept 2003]
3. (i) Calculate the autocovariance function {γ_k : k ≥ 0} and the autocorrelation
function {ρ_k : k ≥ 0} for the m th order moving average process:
( )
where {e_t : t ≥ 0} is a sequence of uncorrelated, zero-mean random variables
with common variance σ_e².
(ii) Explain whether or not the process is invertible in the case where m=2.
[UK Sept 2003]
4. A Box-Jenkins model-fitting procedure suggests that the best fitting model for a
set of normalised share price data x1 , …, xn is ARMA(1,2), with equation:
X t  0.63X t 1  e t  0.45e t 1  0.34e t 2
where {e1, e2,...} is a sequence of uncorrelated, zero-mean random variables with
variance σ².
(i) Determine whether the model is stationary and/or invertible.
(ii) Calculate γ0, γ1, γ2, the autocovariance function of the fitted model at lags 0, 1
and 2, in terms of σ². [UK April 2004]
5. Consider the second order autoregressive process:

X_t = 0.6 X_{t-1} + 0.3 X_{t-2} + e_t
(a) Determine whether the process can be stationary.

(b) State, with a reason, whether the process possesses the Markov property.
(c) Assuming that σ = 1, calculate the values of γ0, γ1, γ2. [UK Sept 2004]
6. (i) Write down the defining equation of an ARMA(1,1) process, identifying

the parameters of the process.
(ii) Explain what it means to say that a time series is stationary and state (but
do not prove) a condition needed to ensure that an ARMA(1,1) process
can be stationary.
(iii) Outline the method of moments parameter estimation technique as it
would be applied to estimate the parameters of an ARMA(1,1) process.
(iv) Suppose an individual has fitted the following model to a dataset:
x t  9.12  0.71x t 1  e t  0.17e t 1
The most recently observed value in the series is x25 = 14.82 with estimated
residual ê 25  1.98 .
(a) Obtain estimates x̂ 25 (1) and x̂ 25 (2) for x26 and x27.
(b) The simplest form of exponential smoothing used at time 24 gave a
forecast for x25 of 12.97. Assuming the smoothing parameter is equal to
0.3, find the forecast for x26.
(v) Discuss when the method of exponential smoothing might in practice be

preferred to a method based on the Box-Jenkins technique. [UK Sept 2004]
7. Yt , t = 1, 2, 3, ... is a time series defined by:

Yt  0.8Yt 1  Zt  0.2Zt 1
where Zt , t= 0, 1,... is a sequence of independent zero-mean variables with

common variance σ². Derive the autocorrelation ρ_k, k = 0,1,2,… [UK April 2005]
8. The following time series model is used for the monthly inflation rate (Yt) in a
particular country:
Y_t = 0.4 Y_{t-1} + 0.2 Y_{t-2} + Z_t + 0.025
where {Zt} is a sequence of uncorrelated identically distributed random variables

whose distributions are normal with mean zero.
(i) Derive the values of pd and q, when this model is considered as an
ARIMA(p, d, q) model.
(ii) Determine whether {Yt} is a stationary process.
(iii) Assuming an infinite history, calculate the expected value of the rate of
inflation over this.
(iv) Calculate the autocorrelation function of {Yt}.
(v) Explain how the equivalent infinite-order moving average representation
of {Yt} may be derived.
[UK Sept 2005]
9. (i) Derive the autocovariance and autocorrelation functions of the AR(1) process.
X_t = α X_{t-1} + e_t
where |α| < 1 and the e_t form a white noise process.
(ii) The time series Z_t is believed to follow an ARIMA(1,d,0) process for some
value of d. The time series Z_t^(k) is obtained by differencing k times and the
sample autocorrelations, {r_i : i = 1,2,...,10}, are shown in the table below for
various values of k.
       k=0    k=1    k=2   k=3   k=4   k=5
r1     100%   100%   83%   3%    45%   64%
r2     100%   100%   66%   12%   5%    13%
r3     100%   100%   54%   11%   4%    3%
r4     100%   99%    45%   1%    6%    4%
r5     100%   99%    37%   3%    4%    6%
r6     100%   99%    30%   12%   12%   1%
r7     99%    98%    27%   3%    7%    9%
r8     99%    98%    24%   3%    0%    4%
r9     99%    97%    19%   3%    5%    6%
r10    99%    97%    13%   7%    5%    4%
Suggest, with reasons, appropriate values for d and the parameter in the
underlying AR(1) process. [UK April 2006]
10. State the Markov property and explain briefly whether the following processes
are Markov:
AR(4);
ARMA (1, 1). [UK Sept 2006]
11. (i) Explain the concept of cointegrated time series.

(ii) Give two examples of circumstances when it is reasonable that two processes
may be cointegrated. [UK April 2007]
12. A modeller has attempted to fit an ARMA(p,q) model to a set of data using the
Box-Jenkins methodology. The plot of residuals based on this proposed fit is
shown below.
Residuals based on fitted model
(i) Under the assumptions of the model, the residuals should form a white
noise process.
(a) By inspection of the chart, suggest two reasons to suspect that the
residuals do not form a white noise process.
(b) Define what is meant by a turning point.
(c) Perform a significance test on the number of turning points in the data
above. (There are 100 points in the data and 59 turning points.)
(ii) On your suggestion, the original fitted model is discarded, and re-
parameterised to:
Xn2  5  0.9(Xn1  5)  en2  0.5en
Given the following observations:
x99 = 2, x100 = 7, ê99 = –0.7, ê100 = 1.4
Use the Box-Jenkins methodology to calculate the forward estimates
x̂100 (1), x̂100 (2) and x̂100 (3) . [UK April 2007]
13. An investment actuary notices that the volatility of the price of a particular asset
is much higher following a significant change in the price of the asset.
Define an ARCH model and explain what particular properties of the model
would make it appropriate for modelling this asset. [UK Sept 2007]
14. The time series Xt is assumed to be stationary and to follow an ARMA(2,1)

process defined by:
8 1 1
Xt  1 X t 1  X t 2  Z t  Z t 1
15 15 7
where Zt are independent N(0,1) random variables.
(i) Determine the roots of the characteristic polynomial, and explain how
their values relate to the stationarity of the process.
(ii) (a) Find the autocorrelation function for lags 0, 1 and 2.
(b) Derive the autocorrelation at lag k in the form
A B
k  
ck d k
(iii) Determine the mean and variance of Xt. [UK Sept 2007]
15. Consider the following model applied to some quarterly data:
where et is a white noise process with mean zero and variance .
(i) Express in terms of and the roots of the characteristic polynomial of
the MA part, and give conditions for invertibility of the model.
(ii) Derive the autocorrelation function (ACF) for Yt.
For our particular data the sample ACF is:
Lag ACF
1 0.73
2 0.14
3 0.37
4 0.59
5 0.24
6 0.12
7 0.07
(iii) Explain whether these results confirm the initial belief that the model
could be appropriate for these data. [UK April 2008]
16. (i) Describe the difference between strictly stationary processes and weakly
stationary processes.
(ii) Explain why weakly stationary multivariate normal processes are also
strictly stationary.
(iii) Show that the following bivariate time series process, (X_n, Y_n)^T, is weakly
stationary:
X_n = 0.5 X_{n-1} + 0.3 Y_{n-1} + e_n^x
Y_n = 0.1 X_{n-1} + 0.8 Y_{n-1} + e_n^y
where e_n^x and e_n^y are two independent white noise processes.
(iv) Determine the positive values of c for which the process
X_n = (0.5 + c) X_{n-1} + 0.3 Y_{n-1} + e_n^x
Y_n = 0.1 X_{n-1} + (0.8 + c) Y_{n-1} + e_n^y
is stationary. [UK April 2008]
17. Consider the ARCH(1) process:
X_t = μ + e_t √( α0 + α1 (X_{t-1} – μ)² )
where et are independent normal random variables with variance 1 and mean 0.
Show that, for s = 1,2,…,t–1, Xt and Xt-s are:
(i) uncorrelated
(ii) not independent [UK Sept 2008]
18. A sample of 50 consecutive observations is taken from a stationary process. The
table below gives values for the sample autocorrelation function (ACF) and the
sample partial autocorrelation function (PACF):
Lag ACF PACF
1 0.854 0.854
2 0.820 0.371
3 0.762 0.085
The sample variance of the observations is 1.253.
(i) Suggest an appropriate model, based on this information, giving your
reasoning.
(ii) Consider the AR(1) model:
Y_t = a1 Y_{t-1} + e_t
where e_t is a white noise error term with mean zero and variance σ².
Calculate method of moments (Yule-Walker) estimates for the parameters
a1 and σ² on the basis of the observed sample.
(iii) Consider the AR(2) model:
Y_t = a1 Y_{t-1} + a2 Y_{t-2} + e_t
where e_t is a white noise error term with mean zero and variance σ².
Calculate method of moments (Yule-Walker) estimates for the parameters
a1, a2 and σ² on the basis of the observed sample.
(iv) List two statistical tests that you should apply to the residuals after fitting
a model to time series data. [UK Sept 2008]
19. Let Y_t be a stationary time series with autocovariance function γ_Y(s).
(i) Show that the new series X_t = a + bt + Y_t, where a and b are fixed non-zero
constants, is not stationary.
(ii) Express the autocovariance function of ΔX_t = X_t – X_{t-1} in terms of γ_Y(s) and
show that this new series is stationary.
(iii) Show that if Y_t is a moving average process of order 1, then the series ΔX_t
is not invertible and has variance larger than that of Y_t. [UK April 2009]
20. Consider the stationary autoregressive process of order 1 given by

Y_t = 2α Y_{t-1} + Z_t,  |α| < 0.5
where Z_t denotes white noise with mean zero and variance σ².
Express Y_t in the form Y_t = Σ_{j=0}^{∞} a_j Z_{t-j} and hence or otherwise find an expression
for the variance of Y_t in terms of α and σ. [UK Sept 2009]
21. The following data is observed from n = 500 realisations from a time series:
Σ_{i=1}^{n} x_i = 13,153.32,  Σ_{i=1}^{n} (x_i – x̄)² = 3,153.67  and
Σ_{i=1}^{n-1} (x_i – x̄)(x_{i+1} – x̄) = 2,176.03
(i) Estimate, using the data above, the parameters μ, α1 and σ from the
model:
X_t – μ = α1 (X_{t-1} – μ) + ε_t
where ε_t is a white noise process with variance σ².
(ii) After fitting the model with the parameters found in (i), it was calculated
that the number of turning points of the residuals series ε̂_t is 280.
Perform a statistical test to check whether there is evidence that ε̂_t is not
generated from a white noise process. [UK Sept 2009]
22. The following two models have been suggested for representing some quarterly
data with underlying seasonality.
Model 1: Y_t = α Y_{t-4} + e_t
Model 2: Y_t = β e_{t-4} + e_t
where e_t is a white noise process in each case.

(i) Determine the autocorrelation function for each model.
The observed quarterly data is used to calculate the sample
autocorrelation.
(ii) State the features of the sample autocorrelation that would lead you to
prefer Model 1. [UK April 2010]
23. Observations y1, y2,..., yn are made from a random walk process given by:
Y_0 = 0 and Y_t = a + Y_{t-1} + e_t for t > 0
where e_t is a white noise process with variance σ².

(i) Derive expressions for E(Yt) and var(Yt) and explain why the process is
not stationary.
(ii) Show that γ_{t,s} = cov(Y_t, Y_{t-s}) for s < t is linear in s.
(iii) Explain how you would use the observed data to estimate the parameters
a and σ.
(iv) Derive expressions for the one-step and two-step forecasts for Yn+1 and
Yn+2. [UK April 2010]
24. A time series model is specified by:

Y_t = 2α Y_{t-1} – α² Y_{t-2} + e_t
where e_t is a white noise process with variance σ².
(i) Determine the values of α for which the process is stationary.
(ii) Derive the auto-covariances γ0 and γ1 for this process and find a general
recursive expression for γ_k for k ≥ 2.
(iii) Show that the auto-covariance function can be written in the form:
γ_k = A α^k + k B α^k
for some values of A, B which you should specify in terms of the constants
α and σ². [UK Sept 2010]
25. Consider the time series Y_t = 0.7 + 0.4 Y_{t-1} + 0.12 Y_{t-2} + e_t, where e_t is a white noise
process with variance σ².
(i) Identify the model as an ARIMA(p,d,q) process.
(ii) Determine whether Y is a stationary process.
(iii) Calculate E(Y_t).
(iv) Calculate the auto-correlations ρ1, ρ2, ρ3 and ρ4. [UK April 2011]
26. Consider the time series

Yt = 0.1 + 0.4Yt-1 + 0.9et-1 + et
(i) Identify the model as an ARIMA(p,d,q) process.
(ii) Determine whether Yt is:
(a) a stationary process
(b) an invertible process
(iii) Calculate E(Yt) and find the auto-covariance function for Yt.
(iii) Determine the MA(∞) representation for Yt. [UK Sept 2011]
27. Consider the time series model (1 – αB)³ X_t = e_t
where B is the backwards shift operator and e_t is a white noise process with
variance σ².
(i) Determine for which values of α the process is stationary.
Now assume that α = 0.4.
(ii) (a) Write down the Yule-Walker equations.
(b) Calculate the first two values of the auto-correlation function ρ1 and ρ2.
(iii) Describe the behaviour of ρ_k and the partial autocorrelation function φ_k
as k → ∞. [UK April 2012]

28. In order to model a particular seasonal data set an actuary is considering using a
model of the form
(1 – B³)(1 – (α + β)B + αβB²) X_t = e_t
where B is the backward shift operator and e_t is a white noise process with
variance σ².
(i) Show that for a suitable choice of s the seasonal difference series
Y_t = X_t – X_{t-s} is stationary for a range of values of α and β which you should
specify.
After appropriate seasonal differencing the following sample autocorrelation
values for the series Y_t are observed: ρ̂1 = 0.2 and ρ̂2 = 0.7.
(ii) Estimate the parameters α and β based on this information.
[HINT: let X = α + β, Y = αβ and find a quadratic equation with roots α and β.]
(iii) Forecast the next two observations x̂101 and x̂102 based on the parameters
estimated in part (ii) and the observed values x1, x2,…, x100 of X_t.
[UK Sept 2012]
29. An actuary is considering the time series model defined by
X_t = α X_{t-1} + e_t
where e_t is a sequence of independent Normally distributed random variables
with mean 0 and variance σ². The series begins with the fixed value X_0 = 0.
(i) Show that the conditional distribution of X_t given X_{t-1} is Normal and
hence show that the likelihood of making observations x1, x2, …, xn from
this model is:
L = ∏_{i=1}^{n} (1 / (σ√(2π))) exp( –(x_i – α x_{i-1})² / (2σ²) )
(ii) Show that the maximum likelihood estimate of α can also be regarded as a
least squares estimate.
(iii) Find the maximum likelihood estimates of α and σ².
(iv) Derive the Yule-Walker equations for the model and hence derive
estimates of α and σ² based on observed values of the autocovariance
function.
(v) Comment on the difference between the estimates of α in parts (iii) and
(iv). [UK April 2013]
30. (i) State the three main stages in the Box-Jenkins approach to fitting an
ARIMA time series model.
(ii) Explain, with reasons, which ARIMA time series would fit the observed
data in the charts below.
Now consider the time series model given by
X_t = α1 X_{t-1} + α2 X_{t-2} + β1 e_{t-1} + e_t
(iii) Derive the Yule-Walker equations for this model.
(iv) Explain whether the partial auto-correlation function for this model can
ever give a zero value. [UK Sept 2013]
31. A sequence of 100 observations was made from a time series and the following
values of the sample auto-covariance function (SACF) were observed:
Lag SACF
1 0.68
2 0.55
3 0.30
4 0.06
The sample mean and variance of the same observations are 1.35 and 0.9
respectively.
(i) Calculate the first two values of the partial correlation function φ̂1 and φ̂2.
(ii) Estimate the parameters (including σ²) of the following models which are
to be fitted to the observed data and can be assumed to be stationary.
(a) Yt = a0 + a1 Yt-1 + et
(b) Yt = a0 + a1 Yt-1 + a2 Yt-2 + et
In each case e_t is a white noise process with variance σ².

(iii) Explain whether the assumption of stationarity is necessary for the
estimation for each of the models in part (ii).
(iv) Explain whether each of the models in part (ii) satisfies the Markov
property. [UK April 2014]
32. (i) List the main steps in the Box-Jenkins approach to fitting an ARIMA time
series to observed data.
Observations x1, x2, …, x200 are made from a stationary time series and the
following summary statistics are calculated:
∑ ∑( ̅) ∑( ̅ )( ̅)
∑( ̅ )( ̅)
(ii) Calculate the values of the sample auto-covariances γ̂0, γ̂1 and γ̂2.
(iii) Calculate the first two values of the partial correlation function φ̂1 and φ̂2.
The following model is proposed to fit the observed data:
X_t – μ = a1 (X_{t-1} – μ) + e_t
(iv) Estimate the parameters μ, a1 and σ² in the proposed model.
After fitting the model in part (iv) the 200 observed residual values ê_t were
calculated. The number of turning points in the residual series was 110.
(v) Carry out a statistical test at the 95% significance level to test the
hypothesis that ê_t is generated from a white noise process.
[UK Sept 2014]
33. The following time series model is being used to model monthly data:
Y_t = Y_{t-1} + Y_{t-12} – Y_{t-13} + e_t + β1 e_{t-1} + β12 e_{t-12} + β1 β12 e_{t-13}
(i) Perform two differencing transformations and show that the result is a
moving average process which you may assume to be stationary.
(ii) Explain why this transformation is called seasonal differencing.
(iii) Derive the auto-correlation function of the model generated in part (i).
[UK April 2015]
34. Consider the following pair of equations:
where are independent white noise processes.

(i) (a) Show that these equations can be represented as
where M and N are matrices to be determined.

(b) Determine the values of  for which these equations represent a
stationary bivariate time series model.
(ii) Show that the following set of equations represents a VAR(p) (vector auto
regressive) process, by specifying the order and the relevant parameters:
[UK Sept 2015]

35. Consider the following time series model:
Yt = 1 + 0.6Yt-1 + 0.16Yt-2 + et
(i) Determine whether Yt is stationary and identify it as an ARMA(p,q)
process.
(ii) Calculate E(Yt).
(iii) Calculate for the first four lags:
• the autocorrelation values ρ1, ρ2, ρ3, ρ4 and
• the partial autocorrelation values φ1, φ2, φ3, φ4. [UK April 2016]
ANSWERS
1. (a) ̂( ) ̂( ) (b) ̂ ( )
(c) Exponential smoothing might be expected to outperform Box-Jenkins
forecasting when a slowly varying trend or multiplicative seasonal variation is
present.
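The simplest form of exponential smoothing updates the previous forecast with the latest observation. A minimal Python sketch of that update, applied to the figures given in Question 1(b); the function name is illustrative:

```python
def smooth_forecast(previous_forecast, observation, alpha):
    """One step of simple exponential smoothing.

    new forecast = alpha * latest observation + (1 - alpha) * previous forecast
    """
    if not 0 <= alpha <= 1:
        raise ValueError("smoothing parameter must lie in [0, 1]")
    return alpha * observation + (1 - alpha) * previous_forecast

# Question 1(b): forecast for X20 was 8.37, observed X20 = 8.2, alpha = 0.2.
forecast_x21 = smooth_forecast(8.37, 8.2, 0.2)
print(round(forecast_x21, 3))
```

The same function applies to Question 6(iv)(b) with a previous forecast of 12.97, an observation of 14.82 and a smoothing parameter of 0.3.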
2. (a) The characteristic equation for this process is
1-
The process will be stationary if the modulus of each root of this equation
is > 1.
3. (i) {
(ii) Not invertible
4. (i) Since both roots are strictly greater than 1 in magnitude, the model is
invertible.
(ii)
5. (a) The roots of the characteristic equation are 1.082 and -3.082. Since both these roots are
strictly greater than 1 in magnitude, we see that this is a stationary model.
(b) We can see straight away that this process does not possess the Markov
property because, if we are told the value of X_{t-2}, we can see from the
formula used to calculate X_t that this will influence the values, and hence
the probabilities, for X_t.
(c) .
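The stationarity check in part (a) can be reproduced numerically. The sketch below assumes an AR(2) process of the form X_t = a1 X_{t-1} + a2 X_{t-2} + e_t, whose characteristic equation is 1 - a1 z - a2 z² = 0; the function names are illustrative:

```python
import cmath

def ar2_roots(a1, a2):
    """Roots of 1 - a1*z - a2*z**2 = 0, i.e. a2*z**2 + a1*z - 1 = 0."""
    if a2 == 0:
        raise ValueError("a2 must be non-zero for a second-order process")
    disc = cmath.sqrt(a1 * a1 + 4 * a2)
    return (-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)

def is_stationary(a1, a2):
    """Stationary if every root lies strictly outside the unit circle."""
    return all(abs(z) > 1 for z in ar2_roots(a1, a2))

roots = ar2_roots(0.6, 0.3)   # coefficients from Question 5
print([round(z.real, 3) for z in roots], is_stationary(0.6, 0.3))
```

Using cmath rather than math lets the same function handle complex roots, where the modulus is what matters.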
6. (i) ( )
(ii) Stationarity and the condition for stationarity: A time series is described as
"stationary" if its statistical properties do not vary over time. For practical
purposes, it is sufficient for a series to be "weakly stationary", which
requires its first two moments to be constant over time. In other words,
the mean and variance take constant values, and the covariance depends
only on the lag, not on the time t.
Stationarity is an issue relating only to the autoregressive terms, and is not
affected by adding or subtracting constants. So the stationarity of this
model is the same as for the simpler model , i.e it is

stationary if (and only if)
( )( ) ̂ ( ) , ̂( ) , (b) ̂ ( )
(v) Exponential smoothing methods may be preferred to the Box-Jenkins
approach when there is a slowly varying trend, or when there is multiplicative,
rather than additive, seasonal variation.
7.
8. (i) d =0, p = 2 and q = 0.
(ii) We showed that the series was stationary in part(i).
(iii) 6.25%
(iv)
(v) Equivalent infinite-order moving average: What we are trying to do here
is express Y_t in terms of the Z_t's. To do this we need to make use of the
characteristic polynomial of Y_t. We can express the time series as:
( )( )
Once we've done this we can invert the characteristic polynomial to give:
( ) ( )
the characteristic polynomial from part(i) and the mean from
part(iii), we get:
( )( )
Since:
( ) ( )
( ) ( ) ( )
Hence:
( ) ( )
( )
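Inverting the characteristic polynomial is equivalent to a simple recursion for the moving average weights. The sketch below assumes the Question 8 model in the form Y_t = 0.4 Y_{t-1} + 0.2 Y_{t-2} + Z_t + 0.025 and writes Y_t - mean = Σ ψ_j Z_{t-j}; the names psi and ar_psi_weights are illustrative:

```python
def ar_psi_weights(ar_coeffs, n_weights):
    """MA(infinity) weights of an AR(p) process.

    psi_0 = 1 and psi_j = sum_i a_i * psi_{j-i} over those i with j - i >= 0.
    """
    psi = [1.0]
    for j in range(1, n_weights):
        psi.append(sum(a * psi[j - i]
                       for i, a in enumerate(ar_coeffs, start=1)
                       if j - i >= 0))
    return psi

def ar_mean(constant, ar_coeffs):
    """Long-run mean of a stationary AR(p) process with a constant term."""
    return constant / (1 - sum(ar_coeffs))

print(ar_psi_weights([0.4, 0.2], 4))
print(ar_mean(0.025, [0.4, 0.2]))
```

The mean returned by ar_mean agrees with the 6.25% quoted in part (iii).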
9. (i)
(ii) Values for the parameters: If a process is ARIMA(1,d,0), then if we

difference it d times we will get an ARMA(1,0) process, i.e an AR(1)
process.
We have seen in part (i) that for an AR(1) process, the population
autocorrelation function decays exponentially. The column which
suggests an exponential decay function for the sample autocorrelations is
the column k = 2. So we set d = 2 and difference twice.
Setting the first sample autocorrelation r1 equal to the formula for the first
population autocorrelation ρ1 calculated in part (i), we find that:
α=0.83
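As a rough check on the choice d = 2, the theoretical AR(1) autocorrelations α^k with α = 0.83 can be set beside the k = 2 column of the table in the question. A minimal Python sketch (the helper name is illustrative):

```python
def ar1_acf(alpha, max_lag):
    """Theoretical autocorrelations rho_k = alpha**k of an AR(1) process, k = 1..max_lag."""
    return [alpha ** k for k in range(1, max_lag + 1)]

# Sample autocorrelations r1..r10 from the k = 2 column of the question's table.
sample_k2 = [0.83, 0.66, 0.54, 0.45, 0.37, 0.30, 0.27, 0.24, 0.19, 0.13]
theoretical = ar1_acf(0.83, len(sample_k2))

for lag, (r, rho) in enumerate(zip(sample_k2, theoretical), start=1):
    print(f"lag {lag:2d}: sample {r:.2f}  theoretical {rho:.2f}")
```

The two sequences decay at a broadly similar geometric rate, which is the pattern the answer relies on.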
10. The Markov property states that the future development of a process can be
predicted from its present state alone, without any reference to its past history
in terms of probabilities:
( | ) ( | )
An AR(4) process is defined by:
This clearly does not have the Markov property since the definition of the
process at the time t depends on the values at times t-2, t-3 and t-4 as well as t-1.
An ARMA(1,1) process is defined by:
Rearranging this definition gives us:
( ) ( )
( )( )
( )( )
This clearly does not have the Markov property since the definition of the
process at time t depends on the values at the times t-2, t-3, t-4, etc as well as t-1.
11. (i) Cointegrated time series

X and Y are said to be cointegrated if:
• X and Y are I(1) random processes
• There exists a non-zero vector (α, β) such that αX + βY is
stationary. The vector (α, β) is called a cointegrating vector.
(ii) Two processes might be cointegrated if:
• One of the processes is driving the other, or if
• Both are being driven by the same underlying process.
12. (i)(a) Inspection of the chart

A white noise process consists of independent random variables with zero mean
and constant variance. However:
1. The mean value of the residuals does not appear to be zero. There are many
more positive residuals than negatives suggesting that the underlying mean
of the residual distribution is greater than zero.
2. The variance of the residual distribution does not appear to be constant. The
variability in the residuals is small for small t but appears to increase with
increasing time.
So the residuals do not form a white noise process.
(i)(b) Definition of a turning point
If e1,e2,….,eN is a sequence of residuals, we say that ek is a turning point if either

ek-1 < ek and ek > ek+1, or ek-1 > ek and ek < ek+1.
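(i)(c) E(T) = (2/3)(N – 2) = (2/3)(98) = 65.33
var(T) = (16N – 29)/90 = 1571/90 = 17.46
P(T ≤ 59) ≈ P(Z < (59.5 – 65.33)/√17.46) = P(Z < –1.40) ≈ 0.08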
But this is a two-sided test (as either a very small or a very large number of
turning points would indicate the residuals are not white noise) so there is about
a 16% chance of getting as extreme a number of turning points as this, even if
the data are white noise.
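The turning point test can be sketched in a few lines. The code below assumes the usual Normal approximation, with E(T) = 2(N - 2)/3 and var(T) = (16N - 29)/90 under the white noise hypothesis, and applies a continuity correction of 0.5; the function name is illustrative:

```python
import math

def turning_point_test(n, observed, continuity=True):
    """Two-sided turning point test for a series of n values.

    Returns (mean, variance, z, two-sided p-value).
    """
    mean = 2 * (n - 2) / 3
    var = (16 * n - 29) / 90
    diff = observed - mean
    if continuity:
        # Shrink the distance from the mean by 0.5, never past zero.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    z = diff / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return mean, var, z, p_value

mean, var, z, p = turning_point_test(100, 59)
print(round(mean, 2), round(var, 2), round(z, 2), round(p, 2))
```

With N = 100 and 59 turning points this gives a two-sided probability of roughly 16%, in line with the conclusion above. The same function can be applied to the residual tests in Questions 21 and 32.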
(ii). ̂ ( ) , ̂ ( ) , ̂ ( )
13. The ARCH(p) models are defined by the relation:
X_t = μ + e_t √( α0 + Σ_{k=1}^{p} αk (X_{t-k} – μ)² )
where {e_t} is a sequence of independent standard normal random variables.
ARCH models can be used for modelling financial time series. If Z_t is
the price of an asset at the end of the t-th trading day, the ARCH model can be
used to model X_t = ln(Z_t / Z_{t-1}), interpreted as the daily return on day t.
The ARCH family of models captures the feature frequently observed in asset
price data that a significant change in the price of an asset is often followed by a
period of high volatility. A significant deviation of Xt-k from the mean µ gives
rise to an increase in the volatility of the asset price.
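The volatility clustering described here can be illustrated with a short simulation. The sketch assumes the ARCH(1) form X_t = μ + e_t √(α0 + α1 (X_{t-1} - μ)²) with standard normal e_t; the parameter values and function names are illustrative choices, not taken from the question:

```python
import math
import random

def conditional_sd(x_prev, mu, alpha0, alpha1):
    """Standard deviation of X_t given the previous value under ARCH(1)."""
    return math.sqrt(alpha0 + alpha1 * (x_prev - mu) ** 2)

def simulate_arch1(n, mu, alpha0, alpha1, seed=0):
    """Simulate n values of an ARCH(1) process, starting from X_0 = mu."""
    rng = random.Random(seed)   # fixed seed so runs are reproducible
    x_prev = mu
    path = []
    for _ in range(n):
        x_prev = mu + rng.gauss(0.0, 1.0) * conditional_sd(x_prev, mu, alpha0, alpha1)
        path.append(x_prev)
    return path

path = simulate_arch1(250, mu=0.0, alpha0=0.5, alpha1=0.5)
print(len(path), conditional_sd(3.0, 0.0, 0.5, 0.5), conditional_sd(0.0, 0.0, 0.5, 0.5))
```

The two conditional_sd calls show the key property: a previous value far from μ produces a larger standard deviation for the next value than a previous value at μ.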
14. (i)
Since both these solutions are greater than 1 in magnitude we conclude that the
series is stationary.
(ii) (a)
(ii)(b) i.e c = 3 and d = 5.
(iii) ( )
15. (i) √
The time series is invertible if the roots λi, of the characteristic equation of the
MA part are greater than one in magnitude :
| |
|√ | | |
(ii) ( )( ) ( )
( )
Hence, ( )( )
( )( )
(iii) Now ρ2, ρ6 and ρ7 are zero, so we would expect r2, r6 and r7 to be close to
zero. They do not appear to be (we have insufficient information to carry
out a formal test). So it appears that the sample ACFs are not consistent
with the theoretical ACFs.
16. (i) Strictly and weakly stationary time series

A process is strictly stationary if:
( ) ( )
A process is weakly stationary if:
• E(X_t) is constant
• cov(X_t, X_{t+k}) is constant for a given lag k.
(ii) Weakly stationary multivariate is strictly stationary
A normal distribution is defined by its mean,µ, and its variance, σ2 only.

So if these are constant (as per the weakly stationary definition) then this
will uniquely define the process. Hence it will also be strictly stationary.
(iv) The two eigenvalue conditions give c < 0.121 and c < 0.579, so the process is
stationary for 0 < c < 0.121.
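The bounds in (iv) come from the eigenvalues of the coefficient matrix. The sketch below assumes the Question 16 process has coefficient matrix [[0.5, 0.3], [0.1, 0.8]] with c added to both diagonal entries, so that each eigenvalue is shifted by c; the function names are illustrative:

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from the characteristic quadratic."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def shift_bounds(a, b, c, d):
    """Upper bound on the diagonal shift for each eigenvalue: lambda + shift < 1."""
    return tuple(1 - lam for lam in eigenvalues_2x2(a, b, c, d))

bounds = shift_bounds(0.5, 0.3, 0.1, 0.8)
print([round(x, 3) for x in bounds])
```

Both eigenvalues of the unshifted matrix are positive and less than 1, so for positive c only these upper bounds matter and the smaller one is binding.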
18. (i) From the figures it looks like the ACF is decaying slowly and the PACF is
cutting off after lag 2. This is a characteristic of an AR(2) model.
(ii) â1 = 0.854, σ̂² = 0.339
(iii) â1 = 0.568, â2 = 0.335, σ̂² = 0.301
(iv) The appropriate tests are the Portmanteau (Ljung and Box) and Turning
Points tests.
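The method of moments estimates in parts (ii) and (iii) follow from the Yule-Walker equations. A minimal Python sketch using the sample values from Question 18 (r1 = 0.854, r2 = 0.820, sample variance 1.253); the function names are illustrative:

```python
def yule_walker_ar1(rho1, gamma0):
    """AR(1): a1 = rho1 and sigma^2 = gamma0 * (1 - rho1^2)."""
    a1 = rho1
    sigma2 = gamma0 * (1 - rho1 ** 2)
    return a1, sigma2

def yule_walker_ar2(rho1, rho2, gamma0):
    """AR(2): solve rho1 = a1 + a2*rho1 and rho2 = a1*rho1 + a2 for a1, a2."""
    denom = 1 - rho1 ** 2
    a1 = rho1 * (1 - rho2) / denom
    a2 = (rho2 - rho1 ** 2) / denom
    sigma2 = gamma0 * (1 - a1 * rho1 - a2 * rho2)
    return a1, a2, sigma2

print([round(v, 3) for v in yule_walker_ar1(0.854, 1.253)])
print([round(v, 3) for v in yule_walker_ar2(0.854, 0.820, 1.253)])
```

The variance estimates from the two functions reproduce the 0.339 and 0.301 quoted in the answers.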
20. ∑ ( ) ( )
21.
22.
(ii) Features that would lead you to prefer Model 1
As we noted in part (i):
23. (i) ( ) ( )
(iv) Forecast values
Since , the one-step ahead forecast for Yn+1 is:
̂( ) ̂ ( ) ̂
Also since, , the two-step ahead forecast for Yn+2 is:
̂( ) ̂ ̂ ( ) ( ) ̂ ̂ ( )
√
24. (i)
(ii)
(iii)
25. (i) Yt is an ARIMA(2, 0, 0) if it is stationary.
(ii) The characteristic polynomial is:
( )
which has roots -5 and 1.667. Since all are of magnitude greater than 1, the
process is stationary.
(iii) ( )
26. (i) The model is an ARIMA(1, 0, 1) process.
(ii) (a) We have already shown in part (i) that the process is stationary.
(ii) (b)
Since this is greater than 1 in absolute value, the process is invertible.
(iii)
. /
(v) ∑
27. (i)
(ii)(a)
(ii)(b)
(iii) The time series is an AR(3) series.
The autocorrelation function ρk will decay (i.e. tend to 0) as k → ∞.
The partial autocorrelation function φk will cut off (i.e. be 0) for k > 3.
28. (ii)
(iii) ̂
̂
̂
29. (iv) ̂ ̂
̂
̂ ̂ ̂̂ ̂
̂
̂ ∑ ( ̅)
̂ ∑ ( ̅) ( ̅)
(v) The autocovariance estimates involve x̄ whereas the maximum likelihood
estimates don't.
30. (i) The three main stages are (a) tentative model identification (b) model
fitting and (c) diagnostics
(ii) Since the auto-correlation is non-zero for the first lag only and the partial
autocorrelation function decays exponentially it is likely that the observed
data comes from an MA(1) (or equivalently an ARMA(0,1) or ARIMA(0,0,1)
model).
(iii) ρ_n = α1 ρ_{n-1} + α2 ρ_{n-2}
(iv) The presence of the term β1 e_{t-1} means that the PACF will decay
exponentially to zero, but it will never get there, so that the PACF will
always be non-zero.
31. (i) ̂ ̂
(ii) (a) ̂ ̂ ̂
(iii) Stationarity is necessary for both models since the Yule-Walker equations
do not hold without the existence of the auto-covariance function.
(iv) Model (a) does satisfy the Markov property since the current value
depends only on the previous value.
This does not hold for Model (b).
32. (i) The three main stages are (a) tentative model identification (b) model
fitting and (c) diagnostics.
(ii) ̂ ̂ ̂
(iii) ̂ ̂
(iv) ̂ ̂ ̂
(v) The number of turning points T is approximately Normally distributed
with E(T) = 132 and Var(T) = 5.936². So a 95% CI for T is (120.4, 143.6).
Our observed value T = 110 does not lie within the 95% confidence
interval. Therefore we have evidence to reject H0 and conclude that
the observed ê_t do not come from a white noise process.
A different model is required.
33. (i) Set X_t = (1 – B^12)(1 – B)Y_t where B is the backward shift operator
i.e. X_t = Y_t – Y_{t-1} – Y_{t-12} + Y_{t-13}
then we have X_t = e_t + β1 e_{t-1} + β12 e_{t-12} + β1β12 e_{t-13} = (1 + β1 B)(1 + β12 B^12) e_t, which
is a moving average process [of order 13].
(ii) This is called seasonal differencing because it compares the monthly change
in Yt with the corresponding monthly change at the same time last year.
(iii) ( )( ) ( )
( )
( )( ) ( )
and for all other s
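The double differencing in part (i), a seasonal difference at lag 12 followed by a first difference, can be sketched as below. The test series is a hypothetical one made of a linear trend plus a fixed monthly pattern, chosen only to show that both components are removed:

```python
def difference(series, lag=1):
    """Lagged difference: element j of the result is series[j + lag] - series[j]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def seasonal_and_first_difference(series, season=12):
    """Apply (1 - B^season) then (1 - B).

    Each output value equals Y_t - Y_{t-1} - Y_{t-season} + Y_{t-season-1}.
    """
    return difference(difference(series, season), 1)

# Hypothetical monthly data: linear trend plus a repeating 12-month pattern.
y = [2 * t + (t % 12) for t in range(40)]
x = seasonal_and_first_difference(y)
print(len(x), set(x))
```

Each call to difference shortens the series by its lag, so 13 observations are lost in total.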
34. (i) [ ] 0 1 (ii)
(iii)
35. (i) ARMA(2,0) (ii) 25/6
(iii)
EXTREME VALUE THEORY
1. (i) Explain what is meant by an extreme event and give two examples in an
insurance context.
(ii) Explain why it is important to model extreme events separately from
other events.
2. (i) Describe the generalised extreme value (GEV) distribution.

(ii) Outline an alternative approach that can be used in place of the GEV
distribution to model extreme events.
(iii) State the key advantage of the method outlined in (ii) over that described
in (i).
3. The claim amounts in a general insurance portfolio are independent and follow
an exponential distribution with mean £2,500.
(i) Calculate the probability that an individual claim will exceed £10,000.
(ii) Calculate the probability that, in a sample of 100 claims, the largest claim
will exceed £10,000 using:
(a) an exact method
(b) an approximation based on a Gumbel-type GEV distribution.
(iii) State the two key assumptions made in (ii)(a).
4. If individual losses, X, follow a Pa(α, λ) distribution, determine the distribution of
the threshold exceedances, W = X - u | X > u.
5. Compare the limiting value of the density functions for a Gamma(α, λ) and an
Exp(λ) distribution when α > 1 and hence determine which has the heavier tail.
6. (i) Determine the hazard rate for the Weibull distribution with parameters
c > 0 and γ > 0.
(ii) Comment on the behaviour of the hazard rate.
7. (i) Show that: ∫ ( )

Hint: use the substitutionu = 3 and transform the integrand into the
PDF of the Gamma(2,1) distribution.
(ii) Hence deduce an expression involving a chi-squared probability for the

mean residual life for the W(3,1/2) distribution.
(iii) By calculating the values of mean residual life function when x = 1 , x = 4
and x = 9, determine whether the mean residual life of the W(3,1/2)
distribution is an increasing or decreasing function of x.
8. (i) Show that the mean residual life of the Gamma(2,1) distribution is given
by: e(x) = (x+2)/(x+1)
(ii) Use the mean residual life to compare the tail of the Gamma (2,1)
distribution with that of the Exp(1) distribution.
9. (i) Explain why claim amounts from general insurance policies are typically
modelled using statistical distributions with heavy tails.
Claim amounts on a portfolio of insurance policies are assumed to follow a
Weibull distribution. A quarter of losses are below 15 and a quarter of losses are
above 80.
(ii) Estimate the parameters c, γ of the Weibull distribution that fits this data.
(iii) Determine whether or not this Weibull distribution has a heavier tail than
that of the exponential distribution with parameter c, by considering
your estimate of γ.
ANSWERS
1. (i) Extreme events are outcomes that have a very low probability of occurrence
but involve very large sums of money.
In an insurance context, they may arise as a result of a single cause that has a
high financial cost (eg a bodily injury claim or complete destruction of a
building) or as an accumulation of events with a related cause (eg flood damage
to a large number of houses in one town).
(ii) The majority of risk events fall within the main body of the fitted distribution
and can usually be modelled reasonably accurately by one of the standard
statistical distributions.
However, there is usually a lack of past data on extreme events.
If a distribution is fitted to the whole dataset, the parameter estimates will reflect
where the bulk of the data values lie rather than the extreme events. This might
mean the fitted distribution understates the probability of future extreme events.
Therefore, a different approach to modelling extreme events is taken, eg by
considering the distribution of block maxima or the distribution of threshold
exceedances.
2. (i) The maximum value, X_M, in a sample of n IID random variables X_1, X_2, ..., X_n
tends to a particular distribution as the sample size increases. This is called the
generalised extreme value (GEV) distribution.
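The GEV distribution has CDF:
H(x) = exp{-[1 + γ(x - α)/β]^(-1/γ)} for γ ≠ 0 (where 1 + γ(x - α)/β > 0), and
H(x) = exp{-exp[-(x - α)/β]} for γ = 0,
where α is the location parameter, β > 0 the scale parameter and γ the shape parameter.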
(ii) As an alternative to focusing upon a single maximum value, we can consider
the distribution of all the claim values that exceed some threshold, u. The
distribution of X - u | X > u is called the threshold exceedances distribution.
A similar theory to GEV predicts that the limiting distribution, as u → ∞, is a
generalised Pareto distribution (GPD).
(iii) The GPD method has the advantage that it uses a larger part of the data and
models all the large claims above the threshold, not just the single highest value.
3. (i) 0.0183 (ii)(a) 0.8425 (b) 0.8398

(iii) The two key assumptions are that all claims come from an exponential
distribution with mean £2,500 and that they are statistically independent.
4. Pareto(α, λ + u)
5. The Gamma distribution has the heavier tail
6. (i) h(x) = cγx^(γ-1)
(ii) If γ > 1, then this hazard rate is an increasing function of x, which corresponds
to a light tail.
If 0 < γ < 1, then the hazard rate is a decreasing function of x, which corresponds
to a heavy tail.
( )
7. (ii) ( )
(iii) When x=1, e(x) = 0.8887 When x=4, e(x) = 1.5599 When x=9, e(x) = 2.1608
The mean residual life is an increasing function of x, suggesting that this
distribution has a heavy tail.
8. Gamma(2,1) has a lighter tail than Exp(1)

9. (i) Compared to other forms of insurance, general insurance claims are positively
skewed with long tails. Therefore, they have more extreme claims and so
have heavier tails.
(ii)
(iii) The Weibull distribution has a heavier tail
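The numerical answers to questions 3 and 9 above can be reproduced with a short script. This is a sketch under stated assumptions, not the workbook's own method: the function names are hypothetical; the Gumbel approximation assumes the usual normalising constants for exponential maxima (location = mean × ln n, scale = mean); and the Weibull fit assumes the parameterisation F(x) = 1 - exp(-c x^γ).

```python
import math


def prob_claim_exceeds(x, mean):
    """P(X > x) for an exponential claim with the given mean."""
    return math.exp(-x / mean)


def prob_max_exceeds_exact(x, mean, n):
    """P(max of n IID exponential claims > x), computed exactly."""
    return 1.0 - (1.0 - math.exp(-x / mean)) ** n


def prob_max_exceeds_gumbel(x, mean, n):
    """Gumbel-type GEV approximation to P(max > x).

    Assumed normalising constants: location = mean * ln(n), scale = mean.
    """
    location = mean * math.log(n)
    scale = mean
    return 1.0 - math.exp(-math.exp(-(x - location) / scale))


def fit_weibull_from_quantiles(x1, p1, x2, p2):
    """Solve F(x1) = p1 and F(x2) = p2 for F(x) = 1 - exp(-c * x**gamma)."""
    a1 = -math.log(1.0 - p1)
    a2 = -math.log(1.0 - p2)
    gamma = math.log(a2 / a1) / math.log(x2 / x1)
    c = a1 / x1 ** gamma
    return c, gamma


# Question 3: exponential claims with mean 2,500 and a sample of 100 claims.
p_single = prob_claim_exceeds(10_000, 2_500)
p_exact = prob_max_exceeds_exact(10_000, 2_500, 100)
p_gumbel = prob_max_exceeds_gumbel(10_000, 2_500, 100)

# Question 9: a quarter of losses below 15 and a quarter above 80.
c_hat, gamma_hat = fit_weibull_from_quantiles(15, 0.25, 80, 0.75)

print(round(p_single, 4), round(p_exact, 4), round(p_gumbel, 4))
print(round(c_hat, 4), round(gamma_hat, 4))
```

The first line of output matches the answers given for question 3. For question 9 the fitted γ is a little below 1, which is consistent with the conclusion in 9(iii) that the Weibull distribution has the heavier tail.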
COPULAS
1. List, in words, the three technical properties which a copula function must satisfy
to ensure that it correctly captures the properties expected of a joint distribution
function.
2. An investor purchases three 5-year bonds from different companies within the
same industry sector. The probability that an individual bond defaults within the
first year is 10%.
(i) Using a Gumbel copula with parameter α = 2, calculate the probability
that all three bonds default within the first year.
(ii) Discuss the suitability of the Gumbel copula in this situation.
3. For the Clayton copula:

(i) Determine whether the generator function ψ(t) = (1/α)(t^(-α) - 1) is valid.
(ii) Determine the inverse generator function.
(iii) Derive the Clayton copula function in the bivariate case.
4. For the Frank copula:

(i) Determine whether the generator function ψ(t) = -ln[(e^(-αt) - 1)/(e^(-α) - 1)] is valid.
(ii) Determine the inverse generator function.
(iii) Derive the Frank copula function.
5. (i) Derive the coefficient of lower and upper tail dependence for the Clayton
copula in the case where the parameter α > 0.
(ii) Comment on how the value of the parameter α affects the degree of lower
tail dependence in the case of the Clayton copula.
6. Derive the coefficient of lower tail dependence for the Gumbel copula in the case
where the parameter α > 1.
7. Let X and Y be two random variables representing the future lifetimes of two 40-
year old individuals. The two lives are married. You are given that:
PX 25) = 0.17831 and P(Y 25) = 0.11086
(i) Calculate the joint probability that both lives will die by the age of 65
using:
(a) the Gumbel copula with α = 5
(b) the Clayton copula with α = 5
(c) the Frank copula with α = 5.
(ii) Comment on the results as well as on which copula you think is most
appropriate to use for modeling joint life expectancy.
8. Suppose that X and Y are random variables that can each take values in the range
( ) and that have the following characteristics:
 The marginal cumulative distribution function of X is ( ) ( )
 The marginal cumulative distribution function of Y is ( ) ( )
 The joint cumulative distribution function of X and Y is
( ) ( )
(i) Show that the copula function for X and Y is ( ) ( )
(ii) Show that this is an Archimedean copula with generator function
( )
(iii) Determine the coefficients of lower and upper tail dependence for this
copula.
9. You are modelling the returns on a portfolio of ten corporate bonds. Your
definition of default is that the return in any one year is less than minus 60%. The
probability that a single bond will default is 10%. You believe that the returns on
the bonds are linked by a Gumbel copula, with a single parameter α = 2.
The generator function for the Gumbel copula is ψ(F(x)) = [-ln(F(x))]^α.
(i) Calculate the probability that all ten bonds will have defaulted in one
year's time.
(ii) Explain the relevance of the correlation coefficient and the choice of
copula when considering the relationship between two or more variables.
(iii) Discuss the choice of the Gumbel copula in this case.
10. (i) List three Archimedean copulas.

(ii) Explain different situations in which it would be appropriate to use each
of these copulas.
11. Southwest Re is a start-up reinsurance company that is assessing its economic

capital using a Value at Risk approach calibrated to the 95th percentile loss over
one year. During its first year, Southwest Re underwrote four excess of loss
reinsurance treaties with the following features:
Treaty                Excess    Probability of no loss occurring
                                (ie below excess)
Cornwall Insurance    £50m      0.995
Devon Insurance       £50m      0.985
Somerset Insurance    £50m      0.975
Dorset Insurance      £50m      0.965
Claims on the reinsurance treaties are assumed to be linked by a Gumbel copula
with parameter α = 2.5.
The generator function for a Gumbel copula with parameter α is:
ψ(F(x)) = [-ln(F(x))]^α
The Chief Capital Officer has suggested that, because the probability of no losses
occurring on the four reinsurance treaties is greater than 95%, the reinsurer does
not need to hold any economic capital.
Verify the Chief Capital Officer's claim that the probability of no losses occurring
on the four reinsurance treaties is greater than 95%.
12. An investment company is analysing the likelihood of two corporate bonds

defaulting and is trying to decide which copula to use to model their dependence
structure.
Bond A has a probability of default in the following year of 0.05.
Bond B has a probability of default in the following year of 0.15.
You are given the following generator functions:

Gumbel Copula : ψ(F(x)) = [-ln(F(x))]^α
Clayton Copula : ψ(F(x)) = (1/α)((F(x))^(-α) - 1)
(i) Calculate the probability of both bonds defaulting in the following year
using:
(a) a Gumbel copula with parameter α = 2
(b) a Clayton copula with parameter α = 2.
(ii) Explain which copula would be more appropriate.
13. Privet Partners, an investment company, is planning to launch a new investment

fund which will invest in rare books and old vinyl music records ('records').
The returns on these two asset types appear to diversify each other reasonably
well – the Kendall's tau between the returns on books and records is 0.6.
However, it does appear that when returns are poor, they are more likely to be
poor for both asset classes; strong positive returns for either books or records are
less likely to coincide.
(i) Describe the coefficient of lower tail dependence and its relevance to risk
modelling.
(ii) Recommend, with justification, an appropriate copula that could be used
to model joint returns between the book and record asset classes.
(iii) Discuss how your answer to part (ii) would change if you were modeling
losses rather than returns, with losses being defined as having the
opposite sign to returns.
ANSWERS
1. Three technical properties a copula function must satisfy are:
1. A copula is an increasing function of its inputs.
2. If all the marginal CDFs are equal to 1 except for one of the marginal CDFs
then the copula function is equal to the value of that one marginal CDF.
3. A copula function always returns a valid probability.
2. (i) 0.0185
(ii) The Gumbel copula exhibits (non-zero) upper-tail dependence, the degree of
which can be varied by adjusting the single parameter. However, it exhibits no
lower tail dependence.
Hence, the Gumbel copula is appropriate if we believe that the three investments
are likely to behave similarly as the term approaches five years but not at early
durations.
This is unlikely to be the case though. If one bond defaults early on, then it may
be indicative of problems in the industry sector or the economy and so the other
investments may also be likely to default early on.
If we believe the performance of investments issued by companies within the
industry are much more closely associated (eg subject to the same systemic and
operational risk factors), then a copula that exhibits both lower and upper tail
dependence, such as the Student's t copula, may be more appropriate.
3. (ii) ψ^(-1)(x) = (1 + αx)^(-1/α)
4. (ii) ψ^(-1)(x) = -(1/α) ln[1 + e^(-x)(e^(-α) - 1)]
5. (ii) As α increases, -1/α increases and hence 2^(-1/α) increases. So the higher the value
of the parameter α, the higher the degree of lower tail dependence.
7. (i)(a) 0.0986 (b) 0.1089 (c) 0.0583

(ii) The Clayton copula gives the highest probability of both lives dying within 25
years. This is because the Clayton copula exhibits lower tail dependence. This
means that if one life does not survive for long (ie dies early), there is a high
probability that the other life will not survive for long (ie will also die early).
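The Gumbel copula gives a lower probability of both lives dying within 25
years than the Clayton copula. This is because the Gumbel copula exhibits upper
tail dependence rather than lower tail dependence. This means that if one life
survives for a long time, there is a high probability that the other life will also
survive for a long time.
The Frank copula, which exhibits no tail dependence, gives the lowest probability
of the three (0.0583).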
Studies also suggest that if one member of a married couple dies, this can
precipitate the death of the other member ('broken heart syndrome'). On this
basis, we might choose to use a copula function where there is a degree of
positive interdependence throughout, eg the co-monotonic (or minimum)
copula.
8. (iii)
9. (i) 0.0688%
(ii) Both are important in describing the overall relationship between the
dependent variables.
The correlation coefficient indicates the overall level of dependence between
the bond returns. The higher the value of the coefficient the greater the degree
of dependence.
The copula describes the shape of this relationship (ie how the level of
dependence varies with the level of return on the bonds).
(iii) The Gumbel copula has upper-tail dependence (the degree of this
dependence can be tailored by the choice of parameter).
This copula is suitable if the portfolio's bond returns are closely related in the
upper tail (ie extreme positive returns). If the opposite were more likely to be
the case (ie the bonds' rates of default may be related and hence poor
returns occur together) then a copula with lower-tail dependence (such as the
Clayton copula) may be more suitable.
However, the Gumbel and Clayton (Archimedean) copulas are parameterised
by a single parameter only. This implicitly assumes that the shape and level
of correlation between each pair of bonds is identical, which might not be
the case. A wider range of relationships could be described by a
two-parameter copula, such as the t-copula.
10. (i) The Gumbel, Frank and Clayton copulas are all Archimedean copulas.
(ii) The main difference between the copulas is in the tail dependency.
Gumbel copula: modelling situations where associations increase for extreme
high values.
Frank copula: modelling the relationship between equity returns and bond
returns, which do not usually show tail dependence because their returns are
not directly dependent on each other.
Clayton copula: modelling situations where negative events are thought to
happen together, eg returns from a portfolio of shares where poor or negative
returns are likely to occur simultaneously on a number of investments (eg a
market crash).
The Clayton copula can also exhibit negative interdependence, making it
similar to the Frank copula in this respect.
11. The joint probability of no losses is 95.843%. This is just greater than 95% and
hence would seem to support the Chief Capital Officer's claim.
12. (i) (a) 0.0289 (b) 0.0475
(ii) The Clayton copula has lower-tail dependency (but no upper-tail
dependency) whereas the Gumbel copula has upper-tail dependency (but no
lower-tail dependency).
The choice of copula will depend on what dependent behaviour we might
expect these two bonds to exhibit, which in turn may depend on the
economic climate. For example, if the companies were closely related and
susceptible to a recession, then a Gumbel copula may be more appropriate.
Both copulas give a higher probability of both bonds defaulting than under
independence (joint probability of default = 0.05 * 0.15 = 0.0075 ).
13. (i) Tail dependencies are used to describe joint concentrations of risk at the
extremes of the marginal distributions.
For distributions of asset returns, lower tail dependence considers whether
very poor returns on one asset class are likely to be associated with very poor
returns on another asset class.
The coefficient lies in the range [0,1]. A value of zero indicates no dependence
in the tail. A value of 1 indicates perfect correlation in the tail.
As different copulas exhibit different tail dependence, the presence of tail
dependence can help in the selection of an appropriate copula.
(ii) A copula with lower tail dependency, but no upper tail dependency, is
required. The Clayton copula is suitable.
(iii) Modelling losses requires a copula with upper tail dependency. The Gumbel
copula could be used.
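The numerical copula answers above (questions 2, 7, 9, 11 and 12) can be reproduced from the closed forms of the Gumbel, Clayton and Frank copulas. The sketch below is illustrative: the function names are hypothetical, the parameter is called alpha throughout, and the Gumbel and Clayton functions use the standard multivariate Archimedean forms implied by the generator functions quoted in the questions.

```python
import math


def gumbel_copula(us, alpha):
    """Gumbel copula: exp(-(sum of (-ln u_i)^alpha)^(1/alpha))."""
    total = sum((-math.log(u)) ** alpha for u in us)
    return math.exp(-total ** (1.0 / alpha))


def clayton_copula(us, alpha):
    """Clayton copula: (sum of u_i^(-alpha) - d + 1)^(-1/alpha), d = len(us)."""
    total = sum(u ** (-alpha) for u in us) - (len(us) - 1)
    return total ** (-1.0 / alpha)


def frank_copula(u, v, alpha):
    """Bivariate Frank copula."""
    numerator = (math.exp(-alpha * u) - 1.0) * (math.exp(-alpha * v) - 1.0)
    denominator = math.exp(-alpha) - 1.0
    return -math.log(1.0 + numerator / denominator) / alpha


# Question 2(i): three bonds, each with a 10% default probability, alpha = 2.
q2 = gumbel_copula([0.1] * 3, 2)

# Question 7(i): joint probability that both lives die within 25 years.
u, v = 0.17831, 0.11086
q7_gumbel = gumbel_copula([u, v], 5)
q7_clayton = clayton_copula([u, v], 5)
q7_frank = frank_copula(u, v, 5)

# Question 9(i): ten bonds, each with a 10% default probability, alpha = 2.
q9 = gumbel_copula([0.1] * 10, 2)

# Question 11: probability of no loss on any of the four treaties, alpha = 2.5.
q11 = gumbel_copula([0.995, 0.985, 0.975, 0.965], 2.5)

# Question 12(i): two bonds with default probabilities 0.05 and 0.15, alpha = 2.
q12_gumbel = gumbel_copula([0.05, 0.15], 2)
q12_clayton = clayton_copula([0.05, 0.15], 2)

print(round(q2, 4), round(q9 * 100, 4), round(q11 * 100, 3))
print(round(q7_gumbel, 4), round(q7_clayton, 4), round(q7_frank, 4))
print(round(q12_gumbel, 4), round(q12_clayton, 4))
```

For question 7 the ordering of the three results is Clayton, then Gumbel, then Frank, which is worth keeping in mind when reading the commentary in answer 7(ii).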
