
2.5 Conjugate Classes of Distributions


The above three cases are contained in the general framework that we are
going to discuss in this section. Let the observation vector $\mathbf{X}$ have independent components, which, up to some weighting, have the same conditional
distribution $F_\vartheta$ given $\Theta = \vartheta$. We look at the family of possible marginal distributions $\mathcal{F} = \{F_\vartheta : \vartheta \in \Theta\}$. In insurance practice, the specification of the
family $\mathcal{F}$ is in itself a problem. Sometimes, there are indications about which
families might be appropriate. For example, for modelling the random variable of the number of claims, the Poisson distribution is often a reasonable
choice. However, in many situations, we have no idea about the distribution
of the observations and specifying $\mathcal{F}$ is very problematic. How we deal with
such situations will be discussed in Chapter 3.
The choice of the family $\mathcal{F}$ is not easy, but the choice of the structural
function $U(\vartheta)$, or more precisely, the choice of a family $\mathcal{U} = \{U_\tau(\vartheta) : \tau \in \mathcal{T}\}$
of structural functions indexed by the hyperparameter $\tau \in \mathcal{T}$, to which $U(\vartheta)$
belongs, is even more difficult.
The following conditions should be satisfied so that the whole framework
provides an instrument which is suitable for use in practice:
i) The family $\mathcal{U}$ must be big enough so that it contains sufficiently many
distributions which could, in reality, describe the collective.
ii) The family $\mathcal{U}$ should be as small as possible to reflect our understanding
of the collective.
iii) Ideally, the families $\mathcal{F}$ and $\mathcal{U}$ should be chosen so that the model is mathematically tractable and in particular, so that the Bayes premium can be
written in an analytic form.
In order to guarantee such behaviour, a useful concept is that of conjugate
families of distributions.
Definition 2.16. The family $\mathcal{U}$ is conjugate to the family $\mathcal{F}$ if for all $\tau \in \mathcal{T}$
and for all realizations $\mathbf{x}$ of the observation vector $\mathbf{X}$ there exists a $\tau' \in \mathcal{T}$
such that

$$U_\tau(\vartheta \mid \mathbf{X} = \mathbf{x}) = U_{\tau'}(\vartheta) \quad \text{for all } \vartheta \in \Theta, \qquad (2.26)$$

i.e. the a posteriori distribution of $\Theta$ given $\mathbf{X}$ is again in $\mathcal{U}$ for every a priori
distribution from $\mathcal{U}$.
Remarks:
• The biggest possible family $\mathcal{U}$ (that which contains all distribution functions) is conjugate to any given family $\mathcal{F}$. However, in order that our
resulting model be useful, it is important that $\mathcal{U}$ should be as small as
possible, while still containing the range of "realistic" distributions for the
collective.
• In Section 2.4 we saw three examples of conjugate classes of distributions
(Poisson-gamma, binomial-beta, normal-normal).

2.5.1 The Exponential Class and Its Associated Conjugate Families

Definition 2.17. A distribution is said to be of the exponential type if it can
be expressed as

$$dF(x) = \exp\left[\frac{x\vartheta - b(\vartheta)}{\sigma^2/w} + c\left(x, \sigma^2/w\right)\right]d\nu(x), \qquad x \in A \subseteq \mathbb{R}. \qquad (2.27)$$

In (2.27), $\nu$ denotes either the Lebesgue measure or the counting measure, $b(\cdot)$ is
some real-valued, twice-differentiable function of $\vartheta$, and $w$ and $\sigma^2$ are some
real-valued constants.
Definition 2.18. The class of distributions of the exponential type as defined
in (2.27) is referred to as the one (real-valued) parameter exponential class

$$\mathcal{F}_{\exp} = \{F_\vartheta : \vartheta \in \Theta\}. \qquad (2.28)$$
Remarks:
• The one-parameter exponential class $\mathcal{F}_{\exp}$ covers a large class of families of distributions. It includes, among others, the families of the Poisson,
Bernoulli, gamma, normal and inverse-Gaussian distributions. It plays a
central role in the framework of generalized linear models (GLM), which itself belongs to the standard repertoire of tools used in the calculation of
premiums depending on several rating factors.
• Each such family within $\mathcal{F}_{\exp}$ is characterized by the specific form of
$b(\cdot)$ and $c(\cdot, \cdot)$. We will denote such a specified family by $\mathcal{F}_{\exp}^{b,c}$.
• The above parameterization of the density is the standard notation used in
the GLM literature (see, for example, McCullagh and Nelder [MN89] or the
description of the GENMOD procedure [SAS93]), and is often referred to
as the natural parametrization. The term $\sigma^2/w$ in (2.27) could be replaced
by a more general function $a(\sigma^2)$. In the literature, the exponential class is
often defined in this slightly more general way. However, this more general
form is of a rather theoretical nature, since in practical applications, and
also in the GENMOD procedure, it is almost always true that $a(\sigma^2) = \sigma^2/w$.
• The parameter $\vartheta$ is referred to as the canonical parameter. In our context,
$\vartheta$ is to be interpreted as the risk profile taking values in $\Theta$. Observe that $\vartheta$ is
here one-dimensional and real. The parameter $\sigma^2$ is called the dispersion
parameter and is assumed to be fixed. Lastly, the quantity $w$ denotes an a priori
known weight ($w$ = weight) associated with the observation. Whereas the
dispersion parameter $\sigma^2$ is constant over observations, the weight $w$ may
vary among the components of the observation vector.
• Another interesting parametrization can be found in [Ger95].
Let us now take a specific family $\mathcal{F}_{\exp}^{b,c} \subseteq \mathcal{F}_{\exp}$, characterized by the specific
form of $b(\cdot)$ and $c(\cdot, \cdot)$.

Theorem 2.19. We assume that, for a given $\vartheta$, the components of the vector
$\mathbf{X} = (X_1, \ldots, X_n)$ are independent with distribution $F_\vartheta \in \mathcal{F}_{\exp}^{b,c}$, each with the
same dispersion parameter $\sigma^2$ and with weights $w_j$, $j = 1, 2, \ldots, n$.
We consider the family

$$\mathcal{U}_{\exp}^b = \left\{u_\tau(\vartheta) : \tau = \left(x_0, \tau^2\right) \in \mathbb{R} \times \mathbb{R}^+\right\}, \qquad (2.29)$$

where

$$u_\tau(\vartheta) = \exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2} + d\left(x_0, \tau^2\right)\right], \qquad \vartheta \in \Theta, \qquad (2.30)$$

are densities (with respect to the Lebesgue measure).

Then it holds that $\mathcal{U}_{\exp}^b$ is conjugate to $\mathcal{F}_{\exp}^{b,c}$.

Remarks:
• $x_0$ and $\tau^2$ are hyperparameters.
• Note that $\exp\left[d\left(x_0, \tau^2\right)\right]$ in (2.30) is simply a normalizing factor.
• Note that the family conjugate to $\mathcal{F}_{\exp}^{b,c}$ depends only on the function $b(\vartheta)$
and not on $c\left(x, \sigma^2/w\right)$.
Proof of Theorem 2.19:
For the a posteriori density of $\Theta$ given $\mathbf{X}$ we get

$$u(\vartheta \mid \mathbf{X} = \mathbf{x}) \propto \prod_{j=1}^n \exp\left[\frac{x_j\vartheta - b(\vartheta)}{\sigma^2/w_j}\right] \cdot \exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]$$

$$= \exp\left[\frac{\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)^{-1}\left(\frac{\sigma^2}{\tau^2}x_0 + w_\bullet\bar{x}\right)\vartheta - b(\vartheta)}{\sigma^2/\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)}\right],$$

where $\bar{x} = \sum_{j=1}^n \frac{w_j}{w_\bullet}x_j$ and $w_\bullet = \sum_{j=1}^n w_j$.

We see immediately that with the a priori distribution (with the hyperparameters $x_0$, $\tau^2$) the a posteriori distribution, given $\mathbf{X} = \mathbf{x}$, is again in $\mathcal{U}_{\exp}^b$,
with updated hyperparameters

$$x_0' = \left(\frac{\sigma^2}{\tau^2}x_0 + w_\bullet\bar{x}\right)\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)^{-1} \quad \text{and} \quad \tau'^2 = \sigma^2\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)^{-1}. \qquad (2.31)$$

This proves Theorem 2.19. $\Box$
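To make the mechanics of the update rule (2.31) concrete, here is a minimal sketch in Python (the function name and all numerical values are our own illustration, not from the text):

```python
import numpy as np

def update_hyperparameters(x0, tau2, x, w, sigma2):
    """Posterior hyperparameters (x0', tau'^2) according to (2.31)."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    w_dot = w.sum()                    # w_bullet, the total weight
    x_bar = np.sum(w * x) / w_dot      # weighted mean of the observations
    denom = sigma2 / tau2 + w_dot
    return (sigma2 / tau2 * x0 + w_dot * x_bar) / denom, sigma2 / denom

# Illustrative values only:
x0_new, tau2_new = update_hyperparameters(
    x0=0.10, tau2=0.04, x=[0.08, 0.15, 0.12], w=[1.0, 2.0, 1.5], sigma2=1.0)
```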

We now want to determine $P^{\mathrm{ind}}$, $P^{\mathrm{coll}}$ and $P^{\mathrm{Bayes}}$.



Theorem 2.20. For the family $\mathcal{F}_{\exp}^{b,c}$ and its conjugate family $\mathcal{U}_{\exp}^b$ we have

i)
$$P^{\mathrm{ind}}(\vartheta) = b'(\vartheta) \quad \text{and} \quad \operatorname{Var}\left[X_j \mid \Theta = \vartheta, w_j\right] = b''(\vartheta)\,\sigma^2/w_j. \qquad (2.32)$$

If the region $\Theta$ is such that $\exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]$ disappears on the boundary of $\Theta$
for each possible value $x_0$, then we have

ii)
$$P^{\mathrm{coll}} = x_0, \qquad (2.33)$$

iii)
$$P^{\mathrm{Bayes}} = \alpha\bar{X} + (1 - \alpha)P^{\mathrm{coll}}, \qquad (2.34)$$

where
$$\bar{X} = \sum_j \frac{w_j}{w_\bullet}X_j, \qquad \alpha = \frac{w_\bullet}{w_\bullet + \sigma^2/\tau^2}.$$

Remarks:
• Note that $P^{\mathrm{Bayes}}$ is a weighted average of the individual claims experience
and the collective premium. It is a linear function of the observations, and
therefore a credibility premium. The case where $P^{\mathrm{Bayes}}$ is of a credibility
type is often referred to as exact credibility in the literature.
• The credibility weight $\alpha$ has the same form as in (2.25), with $w_\bullet$ now playing
the role of the number of observation years.
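Continuing the sketch after Theorem 2.19, one can check numerically that the exact-credibility form (2.34) reproduces the updated hyperparameter $x_0'$ of (2.31), which is precisely the a posteriori collective premium (values are again illustrative, not from the text):

```python
import numpy as np

def bayes_premium(x0, tau2, x, w, sigma2):
    """Exact credibility premium (2.34): alpha * X_bar + (1 - alpha) * x0."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    w_dot = w.sum()
    x_bar = np.sum(w * x) / w_dot
    alpha = w_dot / (w_dot + sigma2 / tau2)    # credibility weight
    return alpha * x_bar + (1.0 - alpha) * x0

# With the same illustrative values as in the previous sketch, this returns
# exactly the updated hyperparameter x0' of (2.31):
p = bayes_premium(0.10, 0.04, [0.08, 0.15, 0.12], [1.0, 2.0, 1.5], 1.0)
```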

Proof of Theorem 2.20:
For the proof of (2.32) we consider the moment-generating function of $X$ given
$\Theta = \vartheta$ and given the weight $w$:

$$m_X(r) = E\left[e^{rX} \mid \Theta = \vartheta\right]$$
$$= \int e^{rx}\exp\left[\frac{x\vartheta - b(\vartheta)}{\sigma^2/w} + c\left(x, \sigma^2/w\right)\right]d\nu(x)$$
$$= \int \exp\left[\frac{x\left(\vartheta + r\sigma^2/w\right) - b\left(\vartheta + r\sigma^2/w\right)}{\sigma^2/w} + c\left(x, \sigma^2/w\right)\right]d\nu(x) \times \exp\left[\frac{b\left(\vartheta + r\sigma^2/w\right) - b(\vartheta)}{\sigma^2/w}\right]$$
$$= \exp\left[\frac{b\left(\vartheta + r\sigma^2/w\right) - b(\vartheta)}{\sigma^2/w}\right],$$

where the last equality follows because the integral term is equal to 1 (integration of a probability density function). The cumulant-generating function
is therefore given by

$$k_X(r) := \ln m_X(r) = \frac{b\left(\vartheta + r\sigma^2/w\right) - b(\vartheta)}{\sigma^2/w}.$$

From this we get

$$P^{\mathrm{ind}}(\vartheta) := E[X \mid \Theta = \vartheta] = k_X'(0) = b'(\vartheta),$$
$$\operatorname{Var}[X \mid \Theta = \vartheta] = k_X''(0) = b''(\vartheta)\,\sigma^2/w.$$

We have thus proved (2.32).


For the proof of (2.33) we see that

$$P^{\mathrm{coll}} = \int_\Theta \mu(\vartheta)\exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]d\vartheta \cdot \exp\left\{d\left(x_0, \tau^2\right)\right\}$$
$$= \int_\Theta b'(\vartheta)\exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]d\vartheta \cdot \exp\left\{d\left(x_0, \tau^2\right)\right\}$$
$$= \int_\Theta \left(x_0 - \left(x_0 - b'(\vartheta)\right)\right)\exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]d\vartheta \cdot \exp\left\{d\left(x_0, \tau^2\right)\right\}$$
$$= x_0 - \tau^2\exp\left\{d\left(x_0, \tau^2\right)\right\}\exp\left[\frac{x_0\vartheta - b(\vartheta)}{\tau^2}\right]\bigg|_{\partial\Theta}, \qquad (2.35)$$

where $\partial\Theta$ denotes the boundary of $\Theta$. The choice of the parameter region
$\Theta$ turns out to be crucial. Under the technical assumption made in Theorem
2.20, that $\Theta$ has been chosen in such a way that the boundary term disappears
for every $x_0$, we get

$$P^{\mathrm{coll}} = x_0,$$

so that we have proved (2.33).
From the proof of Theorem 2.19, where we showed that the a posteriori
distribution of $\Theta$ given $\mathbf{X} = \mathbf{x}$ also belongs to $\mathcal{U}_{\exp}^b$ with hyperparameters

$$x_0' = \left(\frac{\sigma^2}{\tau^2}x_0 + w_\bullet\bar{x}\right)\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)^{-1}, \qquad \tau'^2 = \sigma^2\left(\frac{\sigma^2}{\tau^2} + w_\bullet\right)^{-1},$$

we get without further ado

$$P^{\mathrm{Bayes}} = \widehat{\mu(\Theta)} = E[\mu(\Theta) \mid \mathbf{X}] = \alpha\bar{X} + (1 - \alpha)x_0,$$

where
$$\bar{X} = \sum_{j=1}^n \frac{w_j}{w_\bullet}X_j, \qquad w_\bullet = \sum_{j=1}^n w_j, \qquad \alpha = \frac{w_\bullet}{w_\bullet + \sigma^2/\tau^2}.$$

This proves (2.34) and thus ends the proof of Theorem 2.20. $\Box$

Under the following special assumptions, we get the classical examples
(including those already seen in Section 2.4):
a) Poisson-Gamma: We have frequency observations $X_j = N_j/w_j$, where, conditional on $\vartheta$, $N_j$ is Poisson distributed with parameter $\lambda_j = w_j\vartheta$. The
density of $X_j$ is then given by

$$f_\vartheta(x) = e^{-w_j\vartheta}\frac{(w_j\vartheta)^{w_jx}}{(w_jx)!} \quad \text{for } x = \frac{k}{w_j},\; k \in \mathbb{N}_0.$$

This can be written in the form (2.27) as follows:

$$\tilde{\vartheta} = \log\vartheta, \qquad b\left(\tilde{\vartheta}\right) = \exp\left(\tilde{\vartheta}\right), \qquad \tilde{\sigma}^2 = 1, \qquad \tilde{w} = w_j,$$
$$c\left(x, \tilde{\sigma}^2/\tilde{w}\right) = -\log\left((\tilde{w}x)!\right) + \tilde{w}x\log\tilde{w}.$$

To find the conjugate family of prior distributions $\mathcal{U}_{\exp}^b$ we insert $\tilde{\vartheta}$ and
$b\left(\tilde{\vartheta}\right)$ into (2.30) and we get

$$u\left(\tilde{\vartheta}\right) \propto \exp\left[\frac{x_0\tilde{\vartheta} - \exp\left(\tilde{\vartheta}\right)}{\tau^2}\right].$$

This density, expressed in terms of the original variable $\vartheta$ rather than $\tilde{\vartheta}$,
becomes

$$u(\vartheta) \propto \frac{1}{\vartheta}\cdot\exp\left[\frac{x_0}{\tau^2}\log\vartheta - \frac{\vartheta}{\tau^2}\right] \qquad (2.36)$$
$$= \vartheta^{\frac{x_0}{\tau^2} - 1}e^{-\frac{1}{\tau^2}\vartheta}.$$

Note that by the change of variables from $\tilde{\vartheta}$ to $\vartheta$ we have to take into
account the first derivative $d\tilde{\vartheta}/d\vartheta$ (the term $1/\vartheta$ on the right-hand side of
(2.36)).
Hence

$$\mathcal{U}_{\exp}^b = \left\{u(\vartheta) : u(\vartheta) \propto \vartheta^{\frac{x_0}{\tau^2} - 1}e^{-\frac{1}{\tau^2}\vartheta};\; x_0, \tau^2 > 0\right\},$$

which is the family of the gamma distributions.

For $P^{\mathrm{coll}}$ we get

$$P^{\mathrm{coll}} = E[\Theta] = \frac{x_0}{\tau^2}\cdot\tau^2 = x_0,$$

hence (2.33) is fulfilled.
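Numerically, this is the familiar gamma-Poisson update: the gamma prior with shape $x_0/\tau^2$ and rate $1/\tau^2$ leads to a gamma posterior whose mean is the credibility premium of Theorem 2.20. A short sketch (claim counts and volumes are invented for illustration):

```python
import numpy as np

N = np.array([3.0, 1.0, 4.0])     # claim counts N_j (illustrative)
w = np.array([25.0, 10.0, 30.0])  # volumes w_j
x0, tau2 = 0.12, 0.05             # prior: Gamma(shape=x0/tau2, rate=1/tau2)

posterior_mean = (x0 / tau2 + N.sum()) / (1.0 / tau2 + w.sum())

# Credibility form (2.34), with sigma^2 = 1 in the Poisson case:
X_bar = N.sum() / w.sum()
alpha = w.sum() / (w.sum() + 1.0 / tau2)
assert np.isclose(posterior_mean, alpha * X_bar + (1 - alpha) * x0)
```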
b) Binomial-Beta: We have frequency observations $X_j = N_j/w_j$, where, conditional on $\vartheta$, $N_j$ has a binomial distribution with $n = w_j$ and $p = \vartheta$. The
density of $X_j$ is then given by

$$f_\vartheta(x) = \binom{w_j}{w_jx}\vartheta^{w_jx}(1-\vartheta)^{w_j - w_jx} \quad \text{for } x = k/w_j,\; k = 0, \ldots, w_j,\; w_j \in \mathbb{N}.$$

This can be written in the form of (2.27) as follows:

$$\tilde{\vartheta} = \log\left(\frac{\vartheta}{1-\vartheta}\right), \qquad b\left(\tilde{\vartheta}\right) = \log\left(1 + e^{\tilde{\vartheta}}\right), \qquad \tilde{\sigma}^2 = 1,$$
$$\tilde{w} = w_j, \qquad c\left(x, \tilde{\sigma}^2/\tilde{w}\right) = \log\binom{\tilde{w}}{\tilde{w}x}.$$

From (2.30) we get

$$\mathcal{U}_{\exp}^b = \left\{u(\vartheta) : u(\vartheta) \propto \vartheta^{\frac{x_0}{\tau^2} - 1}(1-\vartheta)^{\frac{1-x_0}{\tau^2} - 1};\; 0 < x_0 < 1,\; \tau^2 > 0\right\},$$

which is the family of beta distributions.

For $P^{\mathrm{coll}}$ we get

$$P^{\mathrm{coll}} = E[\Theta] = \frac{x_0/\tau^2}{x_0/\tau^2 + (1-x_0)/\tau^2} = x_0.$$
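The collective premium can be confirmed directly from the mean of the beta distribution (a two-line sketch; the hyperparameter values are arbitrary):

```python
from scipy.stats import beta

x0, tau2 = 0.3, 0.1
a, b = x0 / tau2, (1.0 - x0) / tau2          # beta parameters read off (2.30)
assert abs(beta(a, b).mean() - x0) < 1e-12   # P^coll = E[Theta] = x0
```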

c) Gamma-Gamma: We have observations $X_j$ that are, conditional on $\vartheta$,
gamma distributed with shape parameter $w_j\gamma$ and scale parameter $w_j\gamma\vartheta$,
where $w_j$ is the weight associated with $X_j$. In particular, this is the case
if the observations $X_j$ are the average of $w_j$ independent claim sizes, each
of them gamma distributed with shape parameter $\gamma$ and scale parameter
$\gamma\vartheta$. The density of $X_j$ is then given by

$$f_\vartheta(x) = \frac{(w_j\gamma\vartheta)^{w_j\gamma}}{\Gamma(w_j\gamma)}x^{w_j\gamma - 1}e^{-w_j\gamma\vartheta x}.$$

This can be written in the form (2.27) as follows:

$$\tilde{\vartheta} = -\vartheta, \qquad b\left(\tilde{\vartheta}\right) = -\log\left(-\tilde{\vartheta}\right), \qquad \tilde{\sigma}^2 = \frac{1}{\gamma}, \qquad \tilde{w} = w_j,$$
$$c\left(x, \tilde{\sigma}^2/\tilde{w}\right) = (\tilde{w}\gamma - 1)\log(x) - \log(\Gamma(\tilde{w}\gamma)) + \tilde{w}\gamma\log(\tilde{w}\gamma).$$

Note that
$$\mu(\vartheta) := E_\vartheta(X_j) = \vartheta^{-1}.$$

From (2.30) we get

$$\mathcal{U}_{\exp}^b = \left\{u(\vartheta) : u(\vartheta) \propto \vartheta^{\frac{1}{\tau^2}}e^{-\frac{x_0}{\tau^2}\vartheta};\; \tau^2 \in (0, 1),\; x_0 > 0\right\},$$

which is the family of gamma distributions.

One can easily check that
$$P^{\mathrm{coll}} = E[\mu(\Theta)] = x_0.$$
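The last identity, $E\left[\Theta^{-1}\right] = x_0$, follows from the mean of the inverse of a gamma variable with shape $1/\tau^2 + 1$ and rate $x_0/\tau^2$; a quick Monte Carlo sketch (parameter values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x0, tau2 = 2.0, 0.5
# Conjugate prior: gamma with shape 1/tau2 + 1 and rate x0/tau2
# (NumPy parameterizes the gamma by its scale, i.e. 1/rate).
theta = rng.gamma(shape=1.0 / tau2 + 1.0, scale=tau2 / x0, size=1_000_000)
print(np.mean(1.0 / theta))   # approximately x0 = 2.0
```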

d) Normal-Normal: We have observations $X_j$ that are, conditional on $\vartheta$,
normally distributed with expected value $\vartheta$ and variance $\sigma^2/w_j$. The density of $X_j$ is then given by

$$f_{\vartheta,\sigma,w_j}(x) = \left(2\pi\sigma^2/w_j\right)^{-1/2}\exp\left\{-\frac{(x-\vartheta)^2}{2\sigma^2/w_j}\right\}.$$

This can be written in the form of (2.27) as follows:

$$\tilde{\vartheta} = \vartheta, \qquad b\left(\tilde{\vartheta}\right) = \tilde{\vartheta}^2/2, \qquad \tilde{\sigma}^2 = \sigma^2, \qquad \tilde{w} = w_j,$$
$$c\left(x, \tilde{\sigma}^2/\tilde{w}\right) = -\frac{1}{2}\left(\frac{x^2}{\tilde{\sigma}^2/\tilde{w}} + \log\left(2\pi\tilde{\sigma}^2/\tilde{w}\right)\right).$$

From (2.30) we get that

$$\mathcal{U}_{\exp}^b = \left\{u(\vartheta) : u(\vartheta) \propto \exp\left(-\frac{(\vartheta - x_0)^2}{2\tau^2}\right);\; \left(x_0, \tau^2\right) \in \mathbb{R} \times \mathbb{R}^+\right\},$$

i.e. the family of conjugate distributions is the family of normal distributions. From this we can immediately see that

$$P^{\mathrm{coll}} = E[\Theta] = x_0.$$
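In the normal case, the update (2.31) coincides with the usual precision-weighted normal-normal posterior, which is easy to verify numerically (a sketch with invented values):

```python
import numpy as np

x = np.array([101.0, 97.5, 103.2])   # observations X_j (illustrative)
w = np.array([1.0, 2.0, 1.0])        # weights w_j
sigma2, x0, tau2 = 4.0, 100.0, 9.0

# Update (2.31):
w_dot, x_bar = w.sum(), np.sum(w * x) / w.sum()
x0_new = (sigma2 / tau2 * x0 + w_dot * x_bar) / (sigma2 / tau2 + w_dot)
tau2_new = sigma2 / (sigma2 / tau2 + w_dot)

# Classical normal-normal posterior via precision weighting:
prec = 1.0 / tau2 + w_dot / sigma2
mean = (x0 / tau2 + np.sum(w * x) / sigma2) / prec
assert np.isclose(x0_new, mean) and np.isclose(tau2_new, 1.0 / prec)
```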

e) Geometric-Beta: We have observations $X_j$ that are, conditional on $\vartheta$,
geometrically distributed with parameter $\vartheta$ (i.e. without weights, or
equivalently, all weights $w_j$ are equal to 1). The density of $X_j$ is then
given by

$$f_\vartheta(x) = (1-\vartheta)^x\vartheta, \qquad x \in \mathbb{N}_0.$$

This can be written in the form of (2.27) as follows:

$$\tilde{\vartheta} = \log(1-\vartheta), \qquad b\left(\tilde{\vartheta}\right) = -\log\left(1 - e^{\tilde{\vartheta}}\right),$$
$$\tilde{\sigma}^2 = 1, \qquad \tilde{w} = 1, \qquad c\left(x, \tilde{\sigma}^2/\tilde{w}\right) = 0.$$

Note that
$$\mu(\vartheta) = E_\vartheta(X_j) = \frac{1-\vartheta}{\vartheta}.$$

From (2.30) we get that

$$\mathcal{U}_{\exp}^b = \left\{u(\vartheta) : u(\vartheta) \propto (1-\vartheta)^{\frac{x_0}{\tau^2} - 1}\vartheta^{\frac{1}{\tau^2}};\; x_0 > 0,\; 0 < \tau^2 < 1\right\},$$

which is the family of the beta distributions.

Again, one can easily verify that
$$P^{\mathrm{coll}} = E[\mu(\Theta)] = x_0.$$
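This last expectation can also be checked with a short numerical sketch (hyperparameters arbitrary; the beta parameters below are read off the density above):

```python
from scipy.stats import beta

x0, tau2 = 1.5, 0.5
a, b = 1.0 / tau2 + 1.0, x0 / tau2   # prior on theta: Beta(a, b)
print(beta(a, b).expect(lambda t: (1.0 - t) / t))   # approximately x0 = 1.5
```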

2.5.2 Construction of Conjugate Classes

The following theorem is often helpful when we are looking for a conjugate
family $\mathcal{U}$ to the family $\mathcal{F} = \{F_\vartheta : \vartheta \in \Theta\}$.
Theorem 2.21.
Assumptions: $\mathcal{F}$ and $\mathcal{U}$ satisfy the following conditions:
i) The likelihood functions $l(\vartheta) = f_\vartheta(x)$, $x$ fixed, are proportional to an element of $\mathcal{U}$, i.e. for every possible observation $x \in A$, there exists a $u_x \in \mathcal{U}$
such that $u_x(\vartheta) = f_\vartheta(x)\left(\int f_\vartheta(x)\,d\vartheta\right)^{-1}$.
ii) $\mathcal{U}$ is closed under the product operation, i.e. for every pair $u, v \in \mathcal{U}$ we
have that $u(\cdot)v(\cdot)\left(\int u(\vartheta)v(\vartheta)\,d\vartheta\right)^{-1} \in \mathcal{U}$.
Then it holds that $\mathcal{U}$ is conjugate to $\mathcal{F}$.
Proof: Given $\mathbf{X} = x$, we find the following a posteriori distribution:

$$u(\vartheta \mid x) = \frac{f_\vartheta(x)u(\vartheta)}{\int f_\vartheta(x)u(\vartheta)\,d\vartheta} = u_x(\vartheta)\,u(\vartheta)\,\frac{\int f_\vartheta(x)\,d\vartheta}{\int f_\vartheta(x)u(\vartheta)\,d\vartheta} \in \mathcal{U}.$$

This proves Theorem 2.21. $\Box$

Theorem 2.21 indicates how we should construct a family $\mathcal{U}$ which is conjugate to the family $\mathcal{F}$. Define

$$\mathcal{U}_0 = \left\{u_x : u_x(\vartheta) = f_\vartheta(x)\left(\int f_\vartheta(x)\,d\vartheta\right)^{-1},\; x \in A\right\}.$$

If $\mathcal{U}_0$ is closed under the product operation, then $\mathcal{U}_0$ is conjugate to $\mathcal{F}$. If not,
it can often be extended in a natural way.

Example:
Let $\mathcal{F}$ be the family of binomial distributions $\mathcal{F} = \{f_\vartheta(x) : \vartheta \in [0, 1]\}$ with

$$f_\vartheta(x) = \binom{n}{x_\bullet}\vartheta^{x_\bullet}(1-\vartheta)^{n - x_\bullet}, \qquad x_\bullet = \sum_{j=1}^n x_j,\; x_j \in \{0, 1\}.$$

Then we see that $\mathcal{U}_0$ consists of the beta distributions with $(a, b) \in \{(1, n+1),
(2, n), (3, n-1), \ldots, (n+1, 1)\}$.

$\mathcal{U}_0$ is not closed under the product operation. A natural extension is

$$\mathcal{U} = \{\mathrm{Beta}(a, b) : a, b > 0\}.$$

It is easy to check that $\mathcal{U}$ is closed under the product operation and that $\mathcal{U}$ is
therefore, by Theorem 2.21, conjugate to $\mathcal{F}$.
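The closure property rests on the fact that the product of two beta densities with parameters $(a_1, b_1)$ and $(a_2, b_2)$ is, up to renormalization, again a beta density with parameters $(a_1 + a_2 - 1, b_1 + b_2 - 1)$. A minimal numerical sketch (parameters arbitrary):

```python
import numpy as np
from scipy.stats import beta

t = np.linspace(0.01, 0.99, 99)
a1, b1, a2, b2 = 2.0, 5.0, 3.5, 1.5
prod = beta(a1, b1).pdf(t) * beta(a2, b2).pdf(t)
# The pointwise ratio to the claimed beta density is a constant, i.e. the
# renormalized product is exactly Beta(a1 + a2 - 1, b1 + b2 - 1):
ratio = prod / beta(a1 + a2 - 1, b1 + b2 - 1).pdf(t)
assert np.allclose(ratio, ratio[0])
```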

2.6 Another Type of Example: the Pareto-Gamma Case


The model and the results of this section are taken from [Ryt90].

Motivation
A frequent assumption in the practice of reinsurance is that the claim amounts
of those claims which exceed a given limit $x_0$ are Pareto distributed. Typically,
based on information from numerous portfolios, the reinsurer has an "a priori"
idea about the level of the "Pareto parameter" $\vartheta$. He also collects information
from the primary insurer about all claims exceeding a particular limit $c_0$, and
he can also use this information to estimate $\vartheta$. The question then arises as to
how he can best combine both sources of information in order to estimate $\vartheta$
as accurately as possible.
Let $\mathbf{X}' = (X_1, \ldots, X_n)$ be the vector of observations of all the claim sizes, of
those claims belonging to a given contract, whose size exceeds $x_0$. We assume
that the $X_j$ ($j = 1, \ldots, n$) are independent and Pareto distributed,

$$X_j \sim \mathrm{Pareto}(x_0, \vartheta), \qquad (2.37)$$

with density and distribution function

$$f_\vartheta(x) = \frac{\vartheta}{x_0}\left(\frac{x}{x_0}\right)^{-(\vartheta+1)} \quad \text{and} \quad F_\vartheta(x) = 1 - \left(\frac{x}{x_0}\right)^{-\vartheta} \quad \text{for } x \geq x_0, \qquad (2.38)$$

and with moments

$$\mu(\vartheta) = x_0\frac{\vartheta}{\vartheta - 1}, \quad \text{if } \vartheta > 1,$$
$$\sigma^2(\vartheta) = x_0^2\frac{\vartheta}{(\vartheta - 1)^2(\vartheta - 2)}, \quad \text{if } \vartheta > 2.$$

In order to incorporate the a priori knowledge, we further assume that
the Pareto parameter $\vartheta$ is itself the realization of a random variable $\Theta$ with
distribution function $U(\vartheta)$. In order to specify an appropriate class $\mathcal{U}$ of a priori distributions, to which $U(\vartheta)$ belongs, we use the technique from Theorem
2.21.
The likelihood function leads to the following family:

$$\mathcal{U}_0 = \left\{u_x : u_x(\vartheta) \propto \vartheta^n\exp\left(-\left(\sum_{j=1}^n \ln\frac{x_j}{x_0}\right)\vartheta\right)\right\}.$$

The elements of $\mathcal{U}_0$ are gamma distributions. The natural extension of $\mathcal{U}_0$ is
therefore the family of the gamma distributions, i.e.

$$\mathcal{U} = \left\{\text{distributions with densities } u(\vartheta) = \frac{\beta^\gamma}{\Gamma(\gamma)}\vartheta^{\gamma - 1}\exp\{-\beta\vartheta\}\right\}. \qquad (2.39)$$

Since the family $\mathcal{U}$ is closed with respect to the product operation, we have
that $\mathcal{U}$ is conjugate to

$$\mathcal{F} = \{\mathrm{Pareto}(x_0, \vartheta) : \vartheta, x_0 > 0\}.$$

We are looking for the best estimator of the Pareto parameter $\Theta$.
From the form of the likelihood function (2.38) and with the form of the a
priori distribution (2.39) we get for the a posteriori density of $\Theta$ given $\mathbf{x}$

$$u_x(\vartheta) \propto \vartheta^{\gamma + n - 1}\exp\left\{-\left(\beta + \sum_{j=1}^n \ln\frac{x_j}{x_0}\right)\vartheta\right\}.$$

We see that $u_x(\vartheta)$ is again the density of a gamma distribution with updated
parameters

$$\gamma' = \gamma + n \quad \text{and} \quad \beta' = \beta + \sum_{j=1}^n \ln\left(\frac{x_j}{x_0}\right).$$
The Bayes estimator for $\Theta$ is therefore given by

$$\widetilde{\vartheta} = E[\Theta \mid \mathbf{X}] = \frac{\gamma + n}{\beta + \sum_{j=1}^n \ln\left(\frac{X_j}{x_0}\right)}. \qquad (2.40)$$
Formula (2.40) can be written in another, more easily interpretable form. In
order to do this, we first consider the maximum-likelihood estimator for the
Pareto parameter $\vartheta$, i.e. the estimator that one might use if we only had data
and no a priori information. This is given by

$$\widehat{\vartheta}^{\mathrm{MLE}} = \frac{n}{\sum_{j=1}^n \ln\left(\frac{X_j}{x_0}\right)}.$$

(2.40) can now be written as

$$\widetilde{\vartheta} = \alpha\cdot\widehat{\vartheta}^{\mathrm{MLE}} + (1 - \alpha)\cdot\frac{\gamma}{\beta},$$

where

$$\alpha = \left(\sum_{j=1}^n \ln\frac{X_j}{x_0}\right)\left(\beta + \sum_{j=1}^n \ln\frac{X_j}{x_0}\right)^{-1}.$$

The Bayes estimator for $\Theta$ is therefore a weighted average of the maximum-likelihood estimator and the a priori expected value $E[\Theta] = \gamma/\beta$, where,
however, in contrast to the earlier examples, the weight $\alpha$ depends on the
observations and is not a credibility weight in the strict sense of Theorem
2.20.
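A short numerical sketch of (2.40) (all parameter values invented for illustration) makes the weighted-average decomposition explicit:

```python
import numpy as np

rng = np.random.default_rng(1)
x0, theta_true = 1.0, 2.5          # threshold and "true" Pareto parameter
gamma_, beta_ = 5.0, 2.0           # prior: Theta ~ Gamma(gamma_, beta_)
# Sample Pareto claims by inversion: X = x0 * U^(-1/theta)
X = x0 * rng.uniform(size=50) ** (-1.0 / theta_true)

s = np.sum(np.log(X / x0))
theta_bayes = (gamma_ + len(X)) / (beta_ + s)    # Bayes estimator (2.40)
theta_mle = len(X) / s                           # maximum-likelihood estimator
alpha = s / (beta_ + s)                          # data-dependent weight
assert np.isclose(theta_bayes, alpha * theta_mle + (1 - alpha) * gamma_ / beta_)
```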
