Method of Moments


Definition. If $\{X_1, \dots, X_n\}$ is a sample from a population, then the empirical k-th moment of this sample is defined to be

\[ \frac{X_1^k + \cdots + X_n^k}{n} \]

Example. For a sample $\{X_1, X_2, X_3\}$ the empirical second moment is $\frac{X_1^2 + X_2^2 + X_3^2}{3}$.

Example. The empirical first moment of a sample $\{X_1, \dots, X_n\}$ is $\frac{X_1 + \cdots + X_n}{n}$, which we usually denote by $\bar{X}$, or sometimes by $\bar{X}_n$ to emphasize that there are n observations.
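The definition translates directly into code. The following is a minimal sketch; the function name `empirical_moment` and the three-point sample are illustrative assumptions, not part of the notes.

```python
def empirical_moment(sample, k):
    """Return the empirical k-th moment: the average of x**k over the sample."""
    if not sample:
        raise ValueError("sample must be non-empty")
    return sum(x ** k for x in sample) / len(sample)


# Hypothetical sample {X1, X2, X3}, used only to illustrate the definition.
sample = [1.0, 2.0, 3.0]
first_moment = empirical_moment(sample, 1)    # the sample mean
second_moment = empirical_moment(sample, 2)   # (1 + 4 + 9) / 3
```

The same function also covers negative moments (k = -1), provided no observation is zero.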

Assume a parametric distribution defined in terms of r parameters $(\theta_1, \dots, \theta_r)$; for example, the distribution Exponential($\theta$) is defined in terms of only one parameter, while the Normal($\mu, \sigma^2$) has two parameters. Suppose that we have observed n data points from the population under study. We want to use this data set to estimate the parameters of the model. Based on this sample we can calculate the empirical moments. Then, to estimate the parameters of the model, we match the first r empirical moments with their theoretical counterparts:

theoretical k-th moment = empirical (sample) k-th moment, k = 1, ..., r

So here we have r equations in the r unknowns $(\theta_1, \dots, \theta_r)$. Solving this system of equations gives the estimates of the parameters.

Note. Some manuscripts use the notation $E(M), \dots, E(M^r)$ to denote the sample moments.

Example ∗. Four losses are observed from a Gamma distribution. The observed losses are
200 , 300 , 350 , and 450 . Find the method of moments estimate for α.

Solution. First Step: The Gamma distribution has two parameters $\alpha$ and $\theta$. The theoretical first moment is $E(X) = \alpha\theta$ and the theoretical second moment is $E(X^2) = \alpha(\alpha+1)\theta^2$.

Second Step: Calculate the empirical counterparts, i.e. the sample moments:

\[ \text{sample first moment} = \frac{200 + 300 + 350 + 450}{4} = 325 \]

\[ \text{sample second moment} = \frac{200^2 + 300^2 + 350^2 + 450^2}{4} = 113{,}750 \]

Third Step: Form the equations:

\[ E(X) = E(M) \;\Rightarrow\; \alpha\theta = 325 \qquad (1) \]

\[ E(X^2) = E(M^2) \;\Rightarrow\; \alpha(\alpha+1)\theta^2 = 113{,}750 \qquad (2) \]

Dividing (2) by the square of (1) eliminates $\theta$:

\[ \frac{\alpha(\alpha+1)\theta^2}{(\alpha\theta)^2} = \frac{113{,}750}{(325)^2} \;\Rightarrow\; \frac{\alpha+1}{\alpha} = 1.0769 \;\Rightarrow\; \alpha = 13 \]
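The three steps can be checked numerically. The sketch below assumes the closed form obtained by solving the two moment equations, $\alpha = m_1^2/(m_2 - m_1^2)$ and $\theta = (m_2 - m_1^2)/m_1$, where $m_1$ and $m_2$ are the sample moments; the function name `gamma_mom` is an illustrative choice.

```python
def gamma_mom(sample):
    """Method-of-moments estimates (alpha, theta) for a Gamma(alpha, theta) model."""
    n = len(sample)
    m1 = sum(sample) / n                  # empirical first moment
    m2 = sum(x * x for x in sample) / n   # empirical second moment
    spread = m2 - m1 ** 2                 # positive unless all observations coincide
    if spread <= 0:
        raise ValueError("need at least two distinct observations")
    alpha = m1 ** 2 / spread              # from (alpha + 1) / alpha = m2 / m1**2
    theta = spread / m1                   # from alpha * theta = m1
    return alpha, theta


alpha_hat, theta_hat = gamma_mom([200, 300, 350, 450])
```

For the four observed losses this reproduces the estimate of $\alpha$ found in the solution, together with the matching $\theta$.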

Example ∗. You are given the following sample of five claims:

4 , 5 , 21 , 99 , 421

You fit a Pareto distribution using the method of moments. Determine the 95th percentile of
the fitted distribution.

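Solution. For the Pareto distribution we have $E(X) = \frac{\theta}{\alpha-1}$ and $E(X^2) = \frac{2\theta^2}{(\alpha-1)(\alpha-2)}$. On the other hand:

\[ E(M) = \frac{4 + 5 + 21 + 99 + 421}{5} = 110 \]

\[ E(M^2) = \frac{4^2 + 5^2 + 21^2 + 99^2 + 421^2}{5} = 37504.8 \]

Next Step: Form the equations:

\[ E(X) = E(M) \;\Rightarrow\; \frac{\theta}{\alpha-1} = 110 \qquad (1) \]

\[ E(X^2) = E(M^2) \;\Rightarrow\; \frac{2\theta^2}{(\alpha-1)(\alpha-2)} = 37504.8 \qquad (2) \]

Dividing (2) by the square of (1):

\[ \frac{2\theta^2}{(\alpha-1)(\alpha-2)} \Big/ \left(\frac{\theta}{\alpha-1}\right)^2 = \frac{37504.8}{110^2} = 3.0996 \;\Rightarrow\; \frac{2(\alpha-1)}{\alpha-2} = 3.0996 \]

Cross multiplying gives $\alpha = 3.819$, and putting this in (1) gives $\theta = 310.08$. The 95th percentile x of the fitted distribution then satisfies

\[ 0.95 = F(x) = 1 - \left(\frac{\theta}{x+\theta}\right)^{\alpha} = 1 - \left(\frac{310.08}{x + 310.08}\right)^{3.819} \;\Rightarrow\; x = 369 \]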
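The Pareto fit can be reproduced in a few lines. This sketch assumes the rearrangement $\alpha = 2(R-1)/(R-2)$ with $R = E(M^2)/E(M)^2$, which follows from the ratio equation above, and inverts the distribution function for the percentile; the function names are illustrative.

```python
def pareto_mom(sample):
    """Method-of-moments estimates (alpha, theta) for a two-parameter Pareto."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = sum(x * x for x in sample) / n
    ratio = m2 / m1 ** 2                  # equals 2(alpha - 1) / (alpha - 2)
    if ratio <= 2:
        raise ValueError("moment equations have no solution with alpha > 2")
    alpha = 2 * (ratio - 1) / (ratio - 2)
    theta = m1 * (alpha - 1)              # from theta / (alpha - 1) = m1
    return alpha, theta


def pareto_percentile(p, alpha, theta):
    """Solve p = 1 - (theta / (x + theta))**alpha for x."""
    return theta * ((1 - p) ** (-1 / alpha) - 1)


alpha_hat, theta_hat = pareto_mom([4, 5, 21, 99, 421])
x95 = pareto_percentile(0.95, alpha_hat, theta_hat)
```

The guard on `ratio` reflects that the second moment of this Pareto only exists for $\alpha > 2$, so the sample ratio must exceed 2 for the equations to be solvable.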

Example ∗. For a sample of dental claims x1 , x2 , ... , x10 , you are given:
(i). $\sum x_i = 3860$ and $\sum x_i^2 = 4{,}574{,}802$.
(ii). Claims are assumed to follow a lognormal distribution with parameters µ and σ.
(iii). µ and σ are estimated using the method of moments.
Calculate E(X ∧ 500) for the fitted distribution.

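Solution. For the lognormal distribution we have $E(X) = \exp(\mu + \tfrac{1}{2}\sigma^2)$ and $E(X^2) = \exp(2\mu + 2\sigma^2)$. We equate these with their sample counterparts:

\[ \exp(\mu + \tfrac{1}{2}\sigma^2) = \frac{3860}{10} = 386 \]

\[ \exp(2\mu + 2\sigma^2) = \frac{4{,}574{,}802}{10} = 457{,}480.2 \]

\[ \Rightarrow\; \exp(2\mu + 2\sigma^2) \Big/ \left(\exp(\mu + \tfrac{1}{2}\sigma^2)\right)^2 = \frac{457{,}480.2}{(386)^2} = 3.07 \;\Rightarrow\; \exp(\sigma^2) = 3.07 \]

\[ \Rightarrow\; \sigma^2 = \ln(3.07) = 1.122 \;\Rightarrow\; \exp(2\mu + 2(1.122)) = 457{,}480.2 \;\Rightarrow\; \mu = 5.39 \]

Now (from the table):

\[ E(X \wedge r) = e^{\mu + \frac{1}{2}\sigma^2}\, \Phi\!\left(\frac{\ln r - \mu - \sigma^2}{\sigma}\right) + r\left[1 - \Phi\!\left(\frac{\ln r - \mu}{\sigma}\right)\right] \]

\[ E(X \wedge 500) = 386\, \Phi\!\left(\frac{\ln 500 - 5.39 - 1.122}{\sqrt{1.122}}\right) + 500\left[1 - \Phi\!\left(\frac{\ln 500 - 5.39}{\sqrt{1.122}}\right)\right] \]

\[ = 386\, \Phi(-0.29) + 500\left[1 - \Phi(0.77)\right] = 259 \]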
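The lognormal calculation can be verified without a normal table. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`) and uses illustrative function names; it works with unrounded intermediate values, so the result agrees with the solution only up to rounding.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_mom(m1, m2):
    """Solve exp(mu + sigma2/2) = m1 and exp(2*mu + 2*sigma2) = m2."""
    sigma2 = log(m2 / m1 ** 2)            # exp(sigma2) = m2 / m1**2
    mu = log(m1) - sigma2 / 2
    return mu, sigma2


def lognormal_limited_ev(mu, sigma2, limit):
    """Limited expected value E(X ∧ limit) of a lognormal(mu, sigma)."""
    sigma = sqrt(sigma2)
    phi = NormalDist().cdf                # standard normal distribution function
    z = log(limit)
    return (exp(mu + sigma2 / 2) * phi((z - mu - sigma2) / sigma)
            + limit * (1 - phi((z - mu) / sigma)))


mu_hat, sigma2_hat = lognormal_mom(3860 / 10, 4574802 / 10)
lev_500 = lognormal_limited_ev(mu_hat, sigma2_hat, 500)
```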

Here is an example for a mixture distribution:

Example ∗. You are given the following:

• The random variable X has the density function $f(x) = w f_1(x) + (1-w) f_2(x)$, $0 < x < \infty$, $0 \le w \le 1$.

• A single observation of the random variable X yields the value 1.

• $\int_0^\infty x f_1(x)\,dx = 1$

• $\int_0^\infty x f_2(x)\,dx = 2$

• $f_2(1) = 2 f_1(1) \ne 0$

Determine the method of moments estimate of w.

Solution. Writing $X_1$ and $X_2$ for random variables with densities $f_1$ and $f_2$:

\[ E(X) = w\,E(X_1) + (1-w)\,E(X_2) = w(1) + (1-w)(2) = 2 - w \]

But the single observation is $x_1 = 1$, so the sample mean is also 1. By equating the theoretical mean with the sample mean, we get:

\[ 2 - w = 1 \;\Rightarrow\; w = 1 \]

Here is an example for non-classical distributions:

Example ∗. You are given the following:

• The random variable X has the density function $f(x) = \frac{2}{\theta^2}(\theta - x)$, $0 < x < \theta$.

• A random sample of two observations of X yields values 0.50 and 0.90.

Determine θ̄, the method of moments estimator of θ.

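Solution.

\[ E(X) = \frac{2}{\theta^2}\int_0^\theta x(\theta - x)\,dx = \frac{2}{\theta^2}\left(\theta\int_0^\theta x\,dx - \int_0^\theta x^2\,dx\right) = \frac{2}{\theta^2}\left(\frac{\theta^3}{2} - \frac{\theta^3}{3}\right) = \frac{\theta}{3} \]

On the other hand, the sample first moment is:

\[ \frac{0.5 + 0.9}{2} = 0.7 \]

Matching the two values gives us:

\[ \frac{\theta}{3} = 0.7 \;\Rightarrow\; \theta = 2.1 \]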

Here is an example for dealing with discrete distributions:

Example. We want to estimate the parameters $\beta$ and r in the negative binomial distribution. The first and second empirical moments are 6 and 60. Find the method of moments estimate of $P(N \ge 2)$.

Solution. The theoretical moments are

\[ E(N) = r\beta, \qquad E(N^2) = \mathrm{Var}(N) + E(N)^2 = r\beta(1+\beta) + (r\beta)^2 \]

so the moment equations are

\[ r\beta = 6, \qquad r\beta(1+\beta) + (r\beta)^2 = 60 \]

By putting $r\beta = 6$ in the second equation, we get

\[ 6(1+\beta) + 36 = 60 \;\Rightarrow\; \beta = 3 \;\Rightarrow\; r = 2 \]

\[ P(N \ge 2) = 1 - P(N=0) - P(N=1) = 1 - \frac{1}{(1+\beta)^r} - \frac{r\beta}{(1+\beta)^{r+1}} = 1 - \frac{1}{4^2} - \frac{6}{4^3} = 0.8438 \]
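The negative binomial fit reduces to two lines of algebra, sketched below. The sketch assumes the parametrization used in the solution (mean $r\beta$, variance $r\beta(1+\beta)$); the function names are illustrative.

```python
def negbin_mom(m1, m2):
    """Solve r*beta = m1 and r*beta*(1 + beta) + (r*beta)**2 = m2."""
    variance = m2 - m1 ** 2
    beta = variance / m1 - 1              # since variance / mean = 1 + beta
    if beta <= 0:
        raise ValueError("sample variance must exceed the sample mean")
    return m1 / beta, beta                # (r, beta)


def prob_at_least_two(r, beta):
    """P(N >= 2) = 1 - P(N = 0) - P(N = 1) for the negative binomial."""
    p0 = (1 + beta) ** (-r)
    p1 = r * beta / (1 + beta) ** (r + 1)
    return 1 - p0 - p1


r_hat, beta_hat = negbin_mom(6, 60)
answer = prob_at_least_two(r_hat, beta_hat)
```

The guard makes explicit that this model can only be fitted by moments when the sample variance exceeds the sample mean.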

Below is an example for dealing with negative moments:

Example. We have observed the following claim sizes

10 , 13 , 16 , 20 , 23

We want to fit an inverse exponential model to this data. Calculate the method of moments
estimate for the probability of claim being higher than 12.

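Solution. For this distribution the mean and the higher positive moments do not exist, so we match a negative moment instead. If $\theta$ is the parameter of this distribution, then we have

\[ E(X^{-1}) = \theta^{-1} \]

On the other hand, the sample negative moment is:

\[ \frac{\frac{1}{10} + \frac{1}{13} + \frac{1}{16} + \frac{1}{20} + \frac{1}{23}}{5} = 0.06658 \]

which is the reciprocal of the harmonic mean of the sample. Matching the two values gives us:

\[ E(X^{-1}) = E(M^{-1}) \;\Rightarrow\; \theta^{-1} = 0.06658 \;\Rightarrow\; \theta = 15.0195 \]

Using the table, we will have:

\[ F(x) = \exp\!\left(-\frac{\theta}{x}\right) \;\Rightarrow\; P(X > 12) = 1 - F(12) = 1 - \exp\!\left(-\frac{15.0195}{12}\right) = 0.7140 \]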
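A short numerical check of the inverse exponential fit follows. It is a sketch that assumes the distribution function $F(x) = \exp(-\theta/x)$; the variable names are illustrative.

```python
from math import exp
from statistics import harmonic_mean

claims = [10, 13, 16, 20, 23]

# Empirical moment of order -1: the average of the reciprocals.
mean_reciprocal = sum(1 / x for x in claims) / len(claims)

# Matching E(1/X) = 1/theta gives theta = 1 / (mean of reciprocals).
theta_hat = 1 / mean_reciprocal

# The same value via the standard library: theta equals the harmonic mean.
theta_check = harmonic_mean(claims)

# Survival probability under the assumed F(x) = exp(-theta / x).
prob_above_12 = 1 - exp(-theta_hat / 12)
```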

Below is an example dealing with censoring. But before getting to the example, note that in the presence of right censoring the expected value of a claim is actually the limited expected value $E(X \wedge u)$, where u is the policy limit. The moments for this distribution are $E[(X \wedge u)^k]$. These moments are used for method of moments estimation, so the model distribution and the sample distribution are both censored.

Example. We have observed the following 10 values of claim sizes:

100 , 100 , 150 , 170 , 170 , 200 , 220 , 300+ , 300+ , 300+

where the last three are censored observations. We want to fit the distribution Pareto($\alpha = 2$, $\theta$) to this data. What is the method of moments estimate for $\theta$?

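Solution.

\[ E(M \wedge 300) = \frac{100 + 100 + 150 + 170 + 170 + 200 + 220 + 300 + 300 + 300}{10} = 201 \]

Using the table, with $\alpha = 2$ we will have:

\[ E(X \wedge 300) = \frac{\theta}{\alpha-1}\left[1 - \left(\frac{\theta}{300+\theta}\right)^{\alpha-1}\right] = \frac{300\,\theta}{300+\theta} \]

\[ E(X \wedge 300) = E(M \wedge 300) \;\Rightarrow\; \frac{300\,\theta}{300+\theta} = 201 \;\Rightarrow\; 300\,\theta = 201(300+\theta) \;\Rightarrow\; \theta = \frac{(201)(300)}{99} = 609.1 \]

Note. If we had $\alpha = 3$, then we would need to solve the equation

\[ \frac{\theta}{2}\left[1 - \left(\frac{\theta}{300+\theta}\right)^{2}\right] = 201 \]

which is no longer linear in $\theta$ and is most conveniently solved numerically.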
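The equation in the note can be handed to a simple root finder. The sketch below uses plain bisection and relies on the assumption that $E(X \wedge u)$ increases with $\theta$ from 0 towards the limit u, so a sign change can always be bracketed; the function names and the tolerance are illustrative choices.

```python
def pareto_limited_ev(theta, alpha, limit):
    """E(X ∧ limit) for a Pareto(alpha, theta) with alpha != 1."""
    return theta / (alpha - 1) * (1 - (theta / (limit + theta)) ** (alpha - 1))


def solve_theta(target, alpha, limit, tol=1e-9):
    """Bisection for the theta at which pareto_limited_ev equals target."""
    if not 0 < target < limit:
        raise ValueError("target must lie strictly between 0 and the limit")
    lo, hi = 0.0, 1.0
    while pareto_limited_ev(hi, alpha, limit) < target:
        hi *= 2                           # grow the bracket until it holds the root
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if pareto_limited_ev(mid, alpha, limit) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


theta_alpha2 = solve_theta(201, alpha=2, limit=300)   # case solved by hand above
theta_alpha3 = solve_theta(201, alpha=3, limit=300)   # case from the note
```

Running the solver on the $\alpha = 2$ case first is a cheap way to confirm it against the hand calculation before trusting it for $\alpha = 3$.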

Below is an example dealing with left truncation. In this type of situation, the model distribution and the sample distribution must both be truncated. Note that this process yields an estimate for the underlying distribution, not for the truncated one.

Example. The following values of a random variable have been observed

0.55 , 0.60 , 0.73 , 0.82 , 0.95

We want to fit the distribution with density

\[ f(x) = 1 + c + 2cx, \qquad 0 \le x \le 1 \]

where $0 \le c \le 1$. What is the MME for c if the distribution is assumed to be truncated at x = 0.63?

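Solution. Only the observations above 0.63 enter the sample moment:

\[ E(M \mid M > 0.63) = \frac{0.73 + 0.82 + 0.95}{3} = 0.8333 \]

\[ E(X \mid X > 0.63) = \frac{\int_{0.63}^{1} x f(x)\,dx}{\int_{0.63}^{1} f(x)\,dx} = \frac{\int_{0.63}^{1} (x + cx + 2cx^2)\,dx}{\int_{0.63}^{1} (1 + c + 2cx)\,dx} = \frac{\left[\frac{1+c}{2}x^2 + \frac{2}{3}cx^3\right]_{0.63}^{1}}{\left[(1+c)x + cx^2\right]_{0.63}^{1}} = \frac{0.3016 + 0.8015\,c}{0.37 + 0.9731\,c} \]

\[ E(X \mid X > 0.63) = E(M \mid M > 0.63) \;\Rightarrow\; \frac{0.3016 + 0.8015\,c}{0.37 + 0.9731\,c} = 0.8333 \;\Rightarrow\; c \approx -0.72 \]

This value falls outside the admissible range $0 \le c \le 1$. Over that range the truncated model mean only varies between about 0.815 (at c = 0) and 0.821 (at c = 1), which is below the sample value 0.8333, so the moment equation has no admissible solution for this sample.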

Below is an example dealing with (complete) grouped data. In general, assume that we have k intervals formed by the boundary points

\[ c_0 < c_1 < \cdots < c_k \]

that n is the total number of observations, and that $n_j$ is the number of observations falling in the j-th interval $(c_{j-1}, c_j]$, so that $n = \sum_{j=1}^{k} n_j$. For grouped data no adjustment is needed for the model being fitted, but since the exact values of the sample points are not known, we need to approximate the sample moments. If in each interval $(c_{j-1}, c_j]$ no point has an advantage over another point, then we can assume that the sample points are uniformly distributed over this interval, so the average value in each interval is just the midpoint of the interval. The weighted average

\[ \sum_{j=1}^{k} \frac{c_{j-1} + c_j}{2}\,\frac{n_j}{n} = \frac{1}{n}\sum_{j=1}^{k} n_j\,\frac{c_{j-1} + c_j}{2} \]

is then taken for the sample mean $\bar{x}$.

Example. Consider the following grouped data:

Interval          Number of observations
(0 , 2]           25
(2 , 10]          10
(10 , 100]        10
(100 , 1000]      5

We want to fit an Exponential(θ) to this data. Find the MME for θ.

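Solution.

\[ E(M) = \frac{1}{n}\sum_{j=1}^{k} n_j\,\frac{c_{j-1} + c_j}{2} = \frac{1}{50}\left[\left(\frac{0+2}{2}\right)(25) + \left(\frac{2+10}{2}\right)(10) + \left(\frac{10+100}{2}\right)(10) + \left(\frac{100+1000}{2}\right)(5)\right] = 67.7 \]

Since $E(X) = \theta$ for the Exponential($\theta$) distribution:

\[ E(X) = E(M) \;\Rightarrow\; \theta = 67.7 \]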
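The midpoint approximation is easy to code. The sketch below assumes each interval is given as a `(lower, upper)` pair and uses the fact that the mean of an Exponential($\theta$) is $\theta$, so the approximate sample mean is itself the estimate; the function name `grouped_mean` is illustrative.

```python
def grouped_mean(intervals, counts):
    """Approximate sample mean of grouped data using interval midpoints."""
    total = sum(counts)
    weighted = sum(n_j * (lo + hi) / 2
                   for (lo, hi), n_j in zip(intervals, counts))
    return weighted / total


intervals = [(0, 2), (2, 10), (10, 100), (100, 1000)]
counts = [25, 10, 10, 5]

# For an Exponential(theta) model E(X) = theta, so the estimate is the mean.
theta_hat = grouped_mean(intervals, counts)
```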

Finally, we see in an example how to calculate the limited expected value $E(X \wedge u)$ for grouped data. Suppose you are given the following grouped data:

Interval          Number of observations
(0 , 2]           25
(2 , 10]          10
(10 , 100]        10
at 100            45

This data is right-censored at 100. Then the limited expected value E(X ∧ 100) is calculated
through:
\[ E(X \wedge 100) = \frac{1}{90}\left[\left(\frac{0+2}{2}\right)(25) + \left(\frac{2+10}{2}\right)(10) + \left(\frac{10+100}{2}\right)(10) + (100)(45)\right] = 57.06 \]
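The censored case only changes how the last group contributes: each censored observation counts at the limit itself. A minimal sketch, with an illustrative function name and argument layout:

```python
def grouped_limited_mean(intervals, counts, limit, n_censored):
    """Approximate E(X ∧ limit) from grouped data right-censored at limit."""
    total = sum(counts) + n_censored
    uncensored = sum(n_j * (lo + hi) / 2
                     for (lo, hi), n_j in zip(intervals, counts))
    # Censored observations are recorded at the limit, not at a midpoint.
    return (uncensored + limit * n_censored) / total


lev_100 = grouped_limited_mean(
    intervals=[(0, 2), (2, 10), (10, 100)],
    counts=[25, 10, 10],
    limit=100,
    n_censored=45,
)
```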
