
Statistics 512 Notes 18:

Multiparameter maximum likelihood estimation


We consider $X_1, \ldots, X_n$ iid with pdf $f(x; \theta)$, where $\theta = (\theta_1, \ldots, \theta_p)$ is p-dimensional.
As before,
$$L(\theta) = \prod_{i=1}^n f(x_i; \theta_1, \ldots, \theta_p)$$
$$l(\theta) = \log L(\theta) = \sum_{i=1}^n \log f(x_i; \theta_1, \ldots, \theta_p)$$
The maximum likelihood estimate is
$$\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} l(\theta)$$
We can find critical points of the likelihood function by solving the vector equation
$$\frac{\partial l}{\partial \theta_1}(\theta_1, \ldots, \theta_p) = 0, \quad \frac{\partial l}{\partial \theta_2}(\theta_1, \ldots, \theta_p) = 0, \quad \ldots, \quad \frac{\partial l}{\partial \theta_p}(\theta_1, \ldots, \theta_p) = 0.$$
We then need to verify that the critical point is a global maximum.
Example 1: Normal distribution
$X_1, \ldots, X_n$ iid $N(\mu, \sigma^2)$
$$f(x_1, \ldots, x_n; \mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)$$
$$l(\mu, \sigma) = -n \log \sigma - \frac{n}{2} \log 2\pi - \frac{1}{2\sigma^2} \sum_{i=1}^n (X_i - \mu)^2$$
The partials with respect to $\mu$ and $\sigma$ are
$$\frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^n (X_i - \mu)$$
$$\frac{\partial l}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^n (X_i - \mu)^2$$
Setting the first partial equal to zero and solving for the MLE, we obtain
$$\hat{\mu}_{MLE} = \bar{X}.$$
Setting the second partial equal to zero and substituting the MLE for $\mu$, we find that the MLE for $\sigma$ is
$$\hat{\sigma}_{MLE} = \sqrt{\frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2}.$$
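In R these closed-form MLEs are one-liners. A minimal sketch, assuming a hypothetical data vector x (the simulated values mean=2, sd=3 are arbitrary):
# Closed-form normal MLEs for a data vector x
x=rnorm(100,mean=2,sd=3)                 # simulated data for illustration
muhatmle=mean(x)                         # muhat_MLE = sample mean
sigmahatmle=sqrt(mean((x-muhatmle)^2))   # divisor n, not n-1
Note that sigmahatmle uses divisor $n$, so it differs slightly from R's sd(x), which uses divisor $n-1$.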
To verify that this critical point is a maximum, we need to check the following second derivative conditions:
(1) The two second-order partial derivatives are negative at the critical point:
$$\left. \frac{\partial^2 l}{\partial \mu^2} \right|_{\hat{\mu}_{MLE}, \hat{\sigma}_{MLE}} < 0 \quad \text{and} \quad \left. \frac{\partial^2 l}{\partial \sigma^2} \right|_{\hat{\mu}_{MLE}, \hat{\sigma}_{MLE}} < 0$$
(2) The determinant of the matrix of second-order partial derivatives (the Hessian) is positive:
$$\left| \begin{matrix} \dfrac{\partial^2 l}{\partial \mu^2} & \dfrac{\partial^2 l}{\partial \mu \partial \sigma} \\[6pt] \dfrac{\partial^2 l}{\partial \sigma \partial \mu} & \dfrac{\partial^2 l}{\partial \sigma^2} \end{matrix} \right|_{\hat{\mu}_{MLE}, \hat{\sigma}_{MLE}} > 0$$
See additional sheet for verification of (1) and (2) for
normal distribution.
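These conditions can also be checked numerically. A minimal sketch on simulated data (the values mean=2, sd=3 are arbitrary): optim minimizes the negative log likelihood, so at the MLE the Hessian it returns should be positive definite, which matches conditions (1) and (2) for $l$.
# Numerical check of the second-order conditions for the normal MLE
negloglik=function(theta,x) -sum(dnorm(x,mean=theta[1],sd=theta[2],log=TRUE))
x=rnorm(100,mean=2,sd=3)                       # simulated data
fit=optim(c(mean(x),sd(x)),negloglik,x=x,hessian=TRUE)
fit$hessian                  # Hessian of -l at the MLE
eigen(fit$hessian)$values    # both eigenvalues positive: l has a strict local max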
Example 2: Gamma distribution
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha-1} e^{-x/\beta}, & 0 < x < \infty \\ 0, & \text{elsewhere} \end{cases}$$
$$l(\alpha, \beta) = \sum_{i=1}^n \left[ -\log \Gamma(\alpha) - \alpha \log \beta + (\alpha - 1) \log X_i - X_i/\beta \right]$$
The partial derivatives are
$$\frac{\partial l}{\partial \alpha} = -n \frac{\Gamma'(\alpha)}{\Gamma(\alpha)} - n \log \beta + \sum_{i=1}^n \log X_i$$
$$\frac{\partial l}{\partial \beta} = -\frac{n\alpha}{\beta} + \frac{1}{\beta^2} \sum_{i=1}^n X_i$$
Setting the second partial derivative equal to zero, we find
$$\hat{\beta}_{MLE} = \frac{\sum_{i=1}^n X_i}{n \hat{\alpha}_{MLE}} = \frac{\bar{X}}{\hat{\alpha}_{MLE}}$$
When this solution is substituted into the first partial derivative, we obtain a nonlinear equation for the MLE of $\alpha$:
$$-n \frac{\Gamma'(\hat{\alpha}_{MLE})}{\Gamma(\hat{\alpha}_{MLE})} + n \log \hat{\alpha}_{MLE} - n \log \bar{X} + \sum_{i=1}^n \log X_i = 0$$
This equation cannot be solved in closed form. Newton's method or another iterative method can be used.
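For instance, a bare-bones Newton iteration, as a sketch: it uses R's built-in digamma and trigamma functions for $\Gamma'(\alpha)/\Gamma(\alpha)$ and its derivative, and the starting value 0.5 is an arbitrary choice.
# Newton's method for the nonlinear equation in alphahat_MLE
newtonalpha=function(xvec,alpha=0.5,tol=1e-8){
  n=length(xvec)
  repeat{
    g=-n*digamma(alpha)+n*log(alpha)-n*log(mean(xvec))+sum(log(xvec))
    gprime=-n*trigamma(alpha)+n/alpha    # derivative of g with respect to alpha
    alphanew=alpha-g/gprime
    if(abs(alphanew-alpha)<tol) return(alphanew)
    alpha=alphanew
  }
}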
digamma(x) = function in R that computes the derivative of the log of the gamma function of x, $\Gamma'(x)/\Gamma(x)$.
uniroot(f,interval) = function in R that finds an approximate zero of a function on the interval. There should be only one zero, and the function values at the lower and upper endpoints of the interval should have opposite signs.
alphahatfunc=function(alpha,xvec){
  # Left-hand side of the nonlinear equation for alphahat_MLE;
  # the root of this function in alpha is the MLE of alpha
  n=length(xvec)
  eq=-n*digamma(alpha)-n*log(mean(xvec))+n*log(alpha)+
    sum(log(xvec))
  eq
}
> alphahatfunc(.3779155,illinoisrainfall)
[1] 65.25308
> alphahatfunc(.5,illinoisrainfall)
[1] -45.27781
> alpharoot=uniroot(alphahatfunc,interval=c(.377,.5),xvec=illinoisrainfall)
> alpharoot
$root
[1] 0.4407967
$f.root
[1] -0.004515694
$iter
[1] 4
$estim.prec
[1] 6.103516e-05
> betahatmle=mean(illinoisrainfall)/.4407967
> betahatmle
[1] 0.5090602

$$\hat{\alpha}_{MLE} \approx .4408, \quad \hat{\beta}_{MLE} \approx .5091$$

Consistency, asymptotic distribution and optimality of MLE for multiparameter estimation

Theorem 6.4.1: Let $X_1, \ldots, X_n$ be iid with pdf $f(x; \theta)$, $\theta = (\theta_1, \ldots, \theta_p)$, for $\theta \in \Omega$. Assume the regularity conditions (R6-R9) hold [similar to (R0)-(R5), assumptions that the log likelihood is smooth]. Then
(a) $\hat{\theta}_{MLE} \xrightarrow{P} \theta$
(b) $\sqrt{n}\,(\hat{\theta}_{MLE} - \theta) \xrightarrow{D} N_p(0, I(\theta)^{-1})$
where $I(\theta)$ is the Fisher information matrix of $\theta$,
$$I(\theta) = \mathrm{Cov}\left( \frac{\partial}{\partial \theta_1} \log f(X; \theta), \ldots, \frac{\partial}{\partial \theta_p} \log f(X; \theta) \right).$$
As in the univariate case, the Fisher information matrix can be expressed in terms of the second derivatives of the log likelihood function under the regularity conditions:
$$I_{jk}(\theta) = \mathrm{Cov}\left( \frac{\partial}{\partial \theta_j} \log f(X; \theta), \frac{\partial}{\partial \theta_k} \log f(X; \theta) \right) = -E\left[ \frac{\partial^2}{\partial \theta_j \partial \theta_k} \log f(X; \theta) \right].$$
Corollary 6.4.1: Let $X_1, \ldots, X_n$ be iid with pdf $f(x; \theta)$, $\theta = (\theta_1, \ldots, \theta_p)$, for $\theta \in \Omega$. Assume the regularity conditions (R6-R9) hold. Then $\hat{\theta}_{MLE}$ is an asymptotically efficient estimate in the sense that the asymptotic covariance matrix of any other consistent estimate is at least as large (in particular, the asymptotic variance of each component of $\hat{\theta}_{MLE}$ is at least as large).
Note on practical use of theorem:
It is also true that
$$\sqrt{n}\, I(\theta)^{1/2} (\hat{\theta}_{MLE} - \theta) \xrightarrow{D} N_p(0, \text{identity matrix}).$$
Thus,
$$\hat{\theta}_{MLE} \approx N\left( \theta, \frac{1}{n} I(\theta)^{-1} \right),$$
which can be used to form approximate confidence intervals.
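In R this recipe translates into a small helper. A sketch, where thetahat and fisherinfo are hypothetical placeholders for the MLE vector and the single-observation information matrix $I(\hat{\theta})$:
# Approximate Wald intervals from thetahat ~ N(theta, (1/n) I(thetahat)^{-1})
waldci=function(thetahat,fisherinfo,n,level=0.95){
  se=sqrt(diag(solve(fisherinfo))/n)    # standard errors from (1/n) I^{-1}
  z=qnorm(1-(1-level)/2)                # 1.96 for level=0.95
  cbind(lower=thetahat-z*se,upper=thetahat+z*se)
}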
Example 1: $X_1, \ldots, X_n$ iid $N(\mu, \sigma^2)$.
$$I(\mu, \sigma) = -E \begin{bmatrix} -\dfrac{1}{\sigma^2} & -\dfrac{2(X-\mu)}{\sigma^3} \\[6pt] -\dfrac{2(X-\mu)}{\sigma^3} & \dfrac{1}{\sigma^2} - \dfrac{3(X-\mu)^2}{\sigma^4} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{\sigma^2} & 0 \\[6pt] 0 & \dfrac{2}{\sigma^2} \end{bmatrix}$$
Thus,
$$I(\mu, \sigma)^{-1} = \begin{bmatrix} \sigma^2 & 0 \\ 0 & \dfrac{\sigma^2}{2} \end{bmatrix}$$
Thus,
$$(\hat{\mu}_{MLE}, \hat{\sigma}_{MLE}) \approx N\left( (\mu, \sigma),\ \begin{pmatrix} \dfrac{\sigma^2}{n} & 0 \\[6pt] 0 & \dfrac{\sigma^2}{2n} \end{pmatrix} \right)$$
To form approximate confidence intervals in practice, we can substitute the MLE estimates into the covariance matrix:
$$(\hat{\mu}_{MLE}, \hat{\sigma}_{MLE}) \approx N\left( (\mu, \sigma),\ \begin{pmatrix} \dfrac{\hat{\sigma}_{MLE}^2}{n} & 0 \\[6pt] 0 & \dfrac{\hat{\sigma}_{MLE}^2}{2n} \end{pmatrix} \right)$$
Thus, an approximate 95% confidence interval for $\mu$ is
$$\hat{\mu}_{MLE} \pm 1.96 \frac{\hat{\sigma}_{MLE}}{\sqrt{n}}$$
and an approximate 95% confidence interval for $\sigma$ is
$$\hat{\sigma}_{MLE} \pm 1.96 \frac{\hat{\sigma}_{MLE}}{\sqrt{2n}}.$$
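A sketch of these two intervals in R, on the same kind of simulated data as above:
# Approximate 95% CIs for mu and sigma of a normal sample
x=rnorm(100,mean=2,sd=3)                    # simulated data for illustration
n=length(x)
muhat=mean(x)
sigmahat=sqrt(mean((x-muhat)^2))
muhat+c(-1,1)*1.96*sigmahat/sqrt(n)         # CI for mu
sigmahat+c(-1,1)*1.96*sigmahat/sqrt(2*n)    # CI for sigma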
Example 2: Gamma distribution:
$$I(\alpha, \beta) = -E \begin{bmatrix} -\dfrac{\Gamma''(\alpha)\Gamma(\alpha) - [\Gamma'(\alpha)]^2}{\Gamma(\alpha)^2} & -\dfrac{1}{\beta} \\[6pt] -\dfrac{1}{\beta} & \dfrac{\alpha}{\beta^2} - \dfrac{2X}{\beta^3} \end{bmatrix} = \begin{bmatrix} \dfrac{\Gamma''(\alpha)\Gamma(\alpha) - [\Gamma'(\alpha)]^2}{\Gamma(\alpha)^2} & \dfrac{1}{\beta} \\[6pt] \dfrac{1}{\beta} & \dfrac{\alpha}{\beta^2} \end{bmatrix}$$
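The upper-left entry is the trigamma function $\psi'(\alpha) = \frac{d}{d\alpha}\frac{\Gamma'(\alpha)}{\Gamma(\alpha)}$, which R provides directly:
trigamma(.4408)    # upper-left entry of the estimated information, about 6.133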
For the Illinois rainfall data, $\hat{\alpha}_{MLE} = .4408$, $\hat{\beta}_{MLE} = .5091$. Thus,
$$I(\hat{\alpha}_{MLE}, \hat{\beta}_{MLE}) = \begin{bmatrix} \dfrac{\Gamma''(.4408)\Gamma(.4408) - [\Gamma'(.4408)]^2}{\Gamma(.4408)^2} & \dfrac{1}{.5091} \\[6pt] \dfrac{1}{.5091} & \dfrac{.4408}{.5091^2} \end{bmatrix} = \begin{bmatrix} 6.133 & 1.964 \\ 1.964 & 1.701 \end{bmatrix}$$
> infmat=matrix(c(6.133,1.964,1.964,1.704),ncol=2)
> invinfmat=solve(infmat)
> invinfmat
           [,1]       [,2]
[1,]  0.2584428 -0.2978765
[2,] -0.2978765  0.9301816
Thus,
$$(\hat{\alpha}_{MLE}, \hat{\beta}_{MLE}) \approx N\left( (\alpha, \beta),\ \begin{pmatrix} \dfrac{0.259}{227} & \dfrac{-0.298}{227} \\[6pt] \dfrac{-0.298}{227} & \dfrac{0.930}{227} \end{pmatrix} \right)$$
Thus, approximate 95% confidence intervals for $\alpha$ and $\beta$ are
$$\alpha: \quad 0.441 \pm 1.96\sqrt{\frac{0.259}{227}} = (0.375, 0.507)$$
$$\beta: \quad 0.509 \pm 1.96\sqrt{\frac{0.930}{227}} = (0.384, 0.634)$$
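These intervals can be read off from invinfmat computed above (a sketch; 227 is the Illinois rainfall sample size):
# 95% CIs for (alpha, beta) from the inverse information matrix
n=227
se=sqrt(diag(invinfmat)/n)        # standard errors of (alphahat, betahat)
c(0.441,0.509)-1.96*se            # lower endpoints
c(0.441,0.509)+1.96*se            # upper endpoints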
Note: We can also use the observed Fisher information to form confidence intervals based on maximum likelihood estimates, where in place of the information matrix we use the observed information matrix $O$, where
$$O_{jk} = -\sum_{i=1}^n \left. \frac{\partial^2}{\partial \theta_j \partial \theta_k} \log f(X_i; \theta) \right|_{\theta = \hat{\theta}_{MLE}}.$$
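For the gamma example, the observed information can be approximated numerically with optimHess. A sketch, writing the log likelihood with R's dgamma and plugging in the MLEs from above:
# Observed information = minus the Hessian of the log likelihood at the MLE
loglik=function(theta,x) sum(dgamma(x,shape=theta[1],scale=theta[2],log=TRUE))
obsinfo=-optimHess(c(0.4408,0.5091),loglik,x=illinoisrainfall)
solve(obsinfo)    # estimated covariance matrix; compare with invinfmat/227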
We could also use the parametric bootstrap to form confidence intervals based on maximum likelihood estimates, where we resample from $f(x; \hat{\theta}_{MLE})$.
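A sketch of a parametric bootstrap percentile interval for $\alpha$, reusing alphahatfunc from above; B=1000 resamples is an arbitrary choice, and the interval passed to uniroot is assumed wide enough to bracket the root for every resample:
# Parametric bootstrap: resample from f(x; thetahat_MLE) and refit
B=1000
n=length(illinoisrainfall)
alphastar=numeric(B)
for(b in 1:B){
  xstar=rgamma(n,shape=.4408,scale=.5091)   # data simulated from the fitted gamma
  alphastar[b]=uniroot(alphahatfunc,interval=c(.05,5),xvec=xstar)$root
}
quantile(alphastar,c(.025,.975))    # approximate 95% CI for alpha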