
Examples · Conditional MLE: Introduction · Identification · Asymptotic Normality · Hypothesis Testing

Extremum Estimators (Estimadores Extremos)
Maximum Likelihood Estimation (MLE)
Cristine Campos de Xavier Pinto
CEDEPLAR/UFMG
May 2010
Instead of using conditional mean and variance assumptions, we are going to use a full distributional assumption.

We assume that we have an i.i.d. sample $\{(x_i, y_i)\}_{i=1}^N$, where $x_i \in \mathbb{R}^K$ and $y_i \in \mathbb{R}^G$, and we are interested in estimating a model for the conditional distribution of $y_i$ given $x_i$.

Assumption: The density of $y_i$ given $x_i$ is known up to a finite number of unknown parameters.

We impose a parametric model for the conditional density.

The vector $y_i$ can be continuous or discrete, or it can have both continuous and discrete characteristics.
Example 1: Suppose we have a latent variable $y_i^*$ that follows the linear model
$$y_i^* = x_i \beta + \varepsilon_i$$
where $\varepsilon_i$ is independent of $x_i$.

$x_i$ is a $1 \times K$ vector with the first element equal to unity.

$\beta$ is a $K \times 1$ vector of parameters.

$\varepsilon_i \sim N(0, 1)$.

Instead of observing $y_i^*$, we observe only a binary variable that equals the sign of $y_i^*$:
$$y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \leq 0 \end{cases}$$
Using the assumptions above, we need to obtain the distribution of $y_i$ given $x_i$:
$$\begin{aligned}
\Pr[y_i = 1 \mid x_i] &= \Pr[y_i^* > 0 \mid x_i] \\
&= \Pr[x_i \beta + \varepsilon_i > 0 \mid x_i] \\
&= \Pr[\varepsilon_i > -x_i \beta \mid x_i] \\
&= 1 - \Phi(-x_i \beta) = \Phi(x_i \beta)
\end{aligned}$$
and
$$\Pr[y_i = 0 \mid x_i] = 1 - \Phi(x_i \beta)$$

Using the information above, the density of $y_i$ given $x_i$ is
$$f(y \mid x_i) = [\Phi(x_i \beta)]^y \, [1 - \Phi(x_i \beta)]^{1-y}, \quad y = 0, 1$$

Given the support conditions, $f(y \mid x_i)$ is zero if $y \notin \{0, 1\}$.
Example 2: Let $\{y_i\}_{i=1}^N$ be independent with common distribution defined by
$$y_i \sim \begin{cases} N(\mu_1, \sigma_1^2) & \text{with probability } \pi \\ N(\mu_2, \sigma_2^2) & \text{with probability } 1 - \pi \end{cases}$$

In this case, we are doing unconditional MLE, and we are interested in estimating $\theta = (\mu_1, \sigma_1^2, \mu_2, \sigma_2^2, \pi)$.

In this case the density of $y_i$ is
$$f(y_i) = \frac{\pi}{\sqrt{2\pi\sigma_1^2}} \exp\left[-\frac{(y_i - \mu_1)^2}{2\sigma_1^2}\right] + \frac{1 - \pi}{\sqrt{2\pi\sigma_2^2}} \exp\left[-\frac{(y_i - \mu_2)^2}{2\sigma_2^2}\right]$$
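A minimal sketch of this two-component mixture density, with illustrative parameter values (the mixing weight and component parameters below are arbitrary choices, not from the text); the crude Riemann sum confirms the density integrates to approximately one:

```python
# Two-component normal mixture density from Example 2 (illustrative parameters).
from math import exp, pi, sqrt

def normal_pdf(y, mu, sig2):
    # Density of N(mu, sig2) at y.
    return exp(-(y - mu) ** 2 / (2.0 * sig2)) / sqrt(2.0 * pi * sig2)

def mixture_pdf(y, w, mu1, sig2_1, mu2, sig2_2):
    # f(y) = w * N(mu1, sig2_1) + (1 - w) * N(mu2, sig2_2)
    return w * normal_pdf(y, mu1, sig2_1) + (1.0 - w) * normal_pdf(y, mu2, sig2_2)

# Crude check that the mixture density integrates to (approximately) one.
grid = [-15.0 + 0.01 * k for k in range(3001)]
mass = sum(mixture_pdf(y, 0.3, -1.0, 1.0, 2.0, 0.5) for y in grid) * 0.01
print(round(mass, 3))  # ~ 1.0
```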
$p_0(y \mid x)$: true conditional density of $Y_i$ given $X_i = x$.

$\mathcal{X} \subset \mathbb{R}^K$: all possible values for $X_i$; $\mathcal{Y}$: all possible values for $Y_i$. $\mathcal{X}$ and $\mathcal{Y}$ are the supports of the random vectors $X_i$ and $Y_i$.

For all $x \in \mathcal{X}$, we assume that $p_0(\cdot \mid x)$ is a density with respect to a $\sigma$-finite measure, denoted by $\nu(dy)$.

We can choose $\nu(dy)$ in such a way that $Y_i$ can be discrete, continuous, or a mixture of the two.
In MLE, we minimize the distance between the conditional density of $Y$ implied by the model and the true density.

Conditional Kullback-Leibler Information Inequality: For any nonnegative function $f(\cdot \mid x)$ such that
$$\int_{\mathcal{Y}} f(y \mid x) \, \nu(dy) = 1 \quad \text{for all } x \in \mathcal{X},$$
the Kullback-Leibler information inequality is
$$\mathcal{K}(f; x) = \int_{\mathcal{Y}} \log\left[\frac{p_0(y \mid x)}{f(y \mid x)}\right] p_0(y \mid x) \, \nu(dy) \geq 0, \quad \text{for all } x \in \mathcal{X}$$

Note that this integral is equal to zero for $f = p_0$.

For each $x$, $\mathcal{K}(f; x)$ is minimized at $f = p_0$.
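The inequality is easy to illustrate numerically in the simplest discrete case, a Bernoulli outcome, where the integral is a two-term sum (the probabilities 0.3 and 0.7 below are arbitrary illustrative values):

```python
# Kullback-Leibler divergence K(f) for a Bernoulli y with true success
# probability p0 and candidate density with success probability p.
from math import log

def klic(p0, p):
    # K = sum over y in {0, 1} of log(p0(y) / f(y)) * p0(y)
    return p0 * log(p0 / p) + (1.0 - p0) * log((1.0 - p0) / (1.0 - p))

print(klic(0.3, 0.7))  # strictly positive for f != p0
print(klic(0.3, 0.3))  # exactly zero at f = p0
```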
Let's apply this inequality to a parametric model for $p_0(y \mid x)$.

A parametric model for $p_0(y \mid x)$ can be defined as
$$\left\{f(\cdot \mid x; \theta), \ \theta \in \Theta, \ \Theta \subset \mathbb{R}^P\right\}$$
for which $\int_{\mathcal{Y}} f(y \mid x; \theta) \, \nu(dy) = 1$ for each $x \in \mathcal{X}$ and each $\theta \in \Theta$.

This parametric model is a correctly specified model of the conditional density $p_0(\cdot \mid \cdot)$ if for some $\theta_0 \in \Theta$,
$$f(\cdot \mid x; \theta_0) = p_0(\cdot \mid x) \quad \text{for all } x \in \mathcal{X}$$
Notice that for each $x \in \mathcal{X}$, we can write $\mathcal{K}(f; x)$ as
$$E[\log p_0(Y_i \mid X_i) \mid X_i = x] - E[\log f(Y_i \mid X_i; \theta) \mid X_i = x]$$
and if the parametric model is correctly specified, we have
$$E[\log f(Y_i \mid X_i; \theta_0) \mid X_i = x] \geq E[\log f(Y_i \mid X_i; \theta) \mid X_i = x]$$

In terms of the conditional log-likelihood for observation $i$,
$$E[\ell_i(\theta_0) \mid X_i = x] \geq E[\ell_i(\theta) \mid X_i = x]$$
where
$$\ell_i(\theta) = \ell(y_i, x_i, \theta) = \log f(y_i \mid x_i; \theta)$$
Taking expectations of the expression above, we can see that $\theta_0$ solves
$$\max_{\theta \in \Theta} \underbrace{E[\log f(Y \mid X; \theta)]}_{Q_0(\theta)}$$

Using the sample analog, the CMLE estimator $\hat{\theta}$ maximizes
$$Q_N(\theta) = \frac{1}{N} \sum_{i=1}^N \log f(y_i \mid x_i; \theta)$$
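As a sketch of the sample analog, the unconditional case of Example 2 can be simulated: draw from a two-component normal mixture and profile $Q_N$ over $\mu_1$, holding the other parameters at their (illustrative, not from the text) data-generating values:

```python
# Profile of Q_N(theta) over mu_1 for the mixture of Example 2.
import random
from math import exp, log, pi, sqrt

def normal_pdf(y, mu, sig2):
    return exp(-(y - mu) ** 2 / (2.0 * sig2)) / sqrt(2.0 * pi * sig2)

def Q_N(sample, mu1, w=0.5, mu2=2.0, sig2=1.0):
    # Q_N = (1/N) * sum_i log f(y_i; theta), other parameters held fixed.
    return sum(log(w * normal_pdf(y, mu1, sig2) + (1 - w) * normal_pdf(y, mu2, sig2))
               for y in sample) / len(sample)

random.seed(42)
# True mixture: N(-2, 1) with probability 0.5, N(2, 1) otherwise.
sample = [random.gauss(-2.0 if random.random() < 0.5 else 2.0, 1.0)
          for _ in range(2000)]

grid = [-3.0 + 0.1 * k for k in range(21)]   # candidate values for mu_1
best = max(grid, key=lambda m: Q_N(sample, m))
print(best)  # the sample objective peaks near the true mu_1 = -2.0
```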

$\theta_0$ is identified if $\theta \neq \theta_0$ implies that $f(y \mid X; \theta) \neq f(y \mid X; \theta_0)$.

Information Inequality: If $\theta_0$ is identified and $E[\,|\log f(y \mid X; \theta)|\,] < \infty$ for all $\theta \in \Theta$, then $Q_0(\theta) = E[\log f(y \mid X; \theta)]$ has a unique maximum at $\theta_0$.
Consistency: Let $\{(x_i, y_i) : i = 1, 2, \ldots\}$ be a random sample with $x_i \in \mathcal{X}$ and $y_i \in \mathcal{Y}$. Let $\Theta \subset \mathbb{R}^P$ be the parameter set and denote the parametric model of the conditional density by $\{f(\cdot \mid x, \theta) : x \in \mathcal{X}, \theta \in \Theta\}$. Assume that:

(i) $f(\cdot \mid x, \theta)$ is a density with respect to the measure $\nu(dy)$ for all $x \in \mathcal{X}$ and $\theta \in \Theta$;

(ii) for some $\theta_0 \in \Theta$, $p_0(\cdot \mid x) = f(\cdot \mid x, \theta_0)$ for all $x \in \mathcal{X}$, and if $\theta \neq \theta_0$, then $f(Y \mid X; \theta) \neq f(Y \mid X; \theta_0)$;

(iii) $\Theta$ is compact;

(iv) $\log f(Y \mid X; \theta)$ is continuous at each $\theta \in \Theta$ with probability one;

(v) $E[\sup_{\theta \in \Theta} |\log f(Y \mid X; \theta)|] < \infty$;

then $\hat{\theta} \stackrel{p}{\to} \theta_0$.
Example: Back to our first example.

In this case, the log-likelihood function for observation $i$ is
$$\ell_i(\beta) = y_i \log \Phi(x_i \beta) + (1 - y_i) \log[1 - \Phi(x_i \beta)]$$

$\hat{\beta}$ solves the following maximization problem:
$$\max_{\beta} \frac{1}{N} \sum_{i=1}^N \left\{y_i \log \Phi(x_i \beta) + (1 - y_i) \log[1 - \Phi(x_i \beta)]\right\}$$

Note that this function is continuous in $\beta$.

MLE only works when the density is correctly specified.

If the latent model is not linear, or if $\varepsilon$ is not independent of $x_i$ and normally distributed, the density of $y_i$ given $x_i$ is incorrect.
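This maximization can be sketched on simulated data; the following is a minimal Fisher-scoring (Newton-type) solver, where the sample size and the data-generating value of $\beta$ are illustrative assumptions:

```python
# Probit CMLE by Fisher scoring on simulated data (illustrative DGP).
import numpy as np
from math import erf, sqrt, pi

Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

def phi(z):
    # Standard normal pdf (vectorized through numpy arithmetic).
    return np.exp(-z ** 2 / 2.0) / sqrt(2.0 * pi)

rng = np.random.default_rng(0)
N = 5000
beta_true = np.array([0.5, -1.0])
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = (X @ beta_true + rng.normal(size=N) > 0).astype(float)

beta = np.zeros(2)
for _ in range(25):                   # scoring iterations
    xb = X @ beta
    p = Phi(xb)
    score = X.T @ (phi(xb) * (y - p) / (p * (1.0 - p)))  # sum_i s_i(beta)
    W = phi(xb) ** 2 / (p * (1.0 - p))
    H = -(X * W[:, None]).T @ X       # expected Hessian, summed over i
    beta = beta - np.linalg.solve(H, score)

print(beta)  # should be close to the true (0.5, -1.0)
```

The weighting matrix used here anticipates the expected-Hessian formula derived for this example later in the notes.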
To get the asymptotic linear representation of the MLE, we need to assume that $\theta_0$ is in the interior of $\Theta$ and that $\ell_i(\theta)$ is twice continuously differentiable on the interior of $\Theta$.

The score of the log-likelihood for each observation is
$$s_i(\theta) = \nabla_\theta \ell_i(\theta)' = \left(\frac{\partial \ell_i(\theta)}{\partial \theta_1}, \frac{\partial \ell_i(\theta)}{\partial \theta_2}, \ldots, \frac{\partial \ell_i(\theta)}{\partial \theta_P}\right)'$$

Under some regularity conditions, we can show that
$$E[s_i(\theta_0)] = 0$$

Let's try to show this.
Using the definition of expectation,
$$E[s_i(\theta_0) \mid x_i] = \int_{\mathcal{Y}} s(y, x_i, \theta_0) \, f(y \mid x_i; \theta_0) \, \nu(dy)$$

If integration and differentiation can be interchanged on $\mathrm{int}(\Theta)$,
$$\nabla_\theta \left[\int_{\mathcal{Y}} f(y \mid x; \theta) \, \nu(dy)\right] = \int_{\mathcal{Y}} \nabla_\theta f(y \mid x; \theta) \, \nu(dy)$$
for all $x \in \mathcal{X}$, $\theta \in \mathrm{int}(\Theta)$.

Since $\int_{\mathcal{Y}} f(y \mid x; \theta) \, \nu(dy) = 1$ for all $\theta \in \Theta$,
$$\nabla_\theta \left[\int_{\mathcal{Y}} f(y \mid x; \theta) \, \nu(dy)\right] = 0, \quad \text{and} \quad \int_{\mathcal{Y}} \nabla_\theta f(y \mid x; \theta) \, \nu(dy) = 0$$
Notice that
$$\nabla_\theta f(y \mid x; \theta) = \nabla_\theta \log f(y \mid x; \theta) \, f(y \mid x; \theta)$$
and so
$$\int_{\mathcal{Y}} \nabla_\theta \log f(y \mid x; \theta) \, f(y \mid x; \theta) \, \nu(dy) = 0$$

Evaluating this expression at $\theta_0$ and transposing it, we have
$$\int_{\mathcal{Y}} s(y, x_i, \theta_0) \, f(y \mid x; \theta_0) \, \nu(dy) = 0$$
that is, $E[s_i(\theta_0) \mid x_i] = 0$, and by iterated expectations $E[s_i(\theta_0)] = 0$.
Example: Let's get the score for the first example:
$$\nabla_\beta \ell_i(\beta) = \nabla_\beta \left(y_i \log \Phi(x_i \beta) + (1 - y_i) \log[1 - \Phi(x_i \beta)]\right)$$

In this case,
$$\nabla_\beta \ell_i(\beta) = y_i \nabla_\beta \log \Phi(x_i \beta) + (1 - y_i) \nabla_\beta \log[1 - \Phi(x_i \beta)]$$

Notice that
$$\nabla_\beta \log \Phi(x_i \beta) = \frac{\phi(x_i \beta) \, x_i'}{\Phi(x_i \beta)}, \quad \nabla_\beta \log[1 - \Phi(x_i \beta)] = -\frac{\phi(x_i \beta) \, x_i'}{1 - \Phi(x_i \beta)}$$

At the end,
$$\nabla_\beta \ell_i(\beta) = \frac{\phi(x_i \beta) \, x_i' \, (y_i - \Phi(x_i \beta))}{\Phi(x_i \beta) \, (1 - \Phi(x_i \beta))}$$
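A quick way to sanity-check this analytic score is to compare it with a central-difference numerical derivative of $\ell_i$ at an arbitrary point (the values of $y$, $x$, and $\beta$ below are illustrative):

```python
# Analytic probit score vs. a numerical gradient of the log-likelihood.
from math import erf, sqrt, log, exp, pi

Phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
phi = lambda z: exp(-z ** 2 / 2.0) / sqrt(2.0 * pi)

def loglik(y, x, b):
    xb = sum(u * v for u, v in zip(x, b))
    return y * log(Phi(xb)) + (1 - y) * log(1.0 - Phi(xb))

def score(y, x, b):
    # phi(x*b) * x' * (y - Phi(x*b)) / [Phi(x*b) * (1 - Phi(x*b))]
    xb = sum(u * v for u, v in zip(x, b))
    c = phi(xb) * (y - Phi(xb)) / (Phi(xb) * (1.0 - Phi(xb)))
    return [c * xk for xk in x]

y, x, b, h = 1, [1.0, -0.7], [0.3, 0.6], 1e-6
num = [(loglik(y, x, [b[0] + h, b[1]]) - loglik(y, x, [b[0] - h, b[1]])) / (2 * h),
       (loglik(y, x, [b[0], b[1] + h]) - loglik(y, x, [b[0], b[1] - h])) / (2 * h)]
gap = max(abs(a - n) for a, n in zip(score(y, x, b), num))
print(gap)  # analytic and numerical gradients agree up to discretization error
```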
Let's show that $E[\nabla_\beta \ell_i(\beta_0)] = 0$.

Define $u_i = y_i - \Phi(x_i \beta_0) = y_i - E[y_i \mid x_i]$, so that
$$s_i(\beta_0) = \frac{\phi(x_i \beta_0) \, x_i' \, u_i}{\Phi(x_i \beta_0) \, (1 - \Phi(x_i \beta_0))}$$

Notice that $E[u_i \mid x_i] = 0$, so
$$E[s_i(\beta_0) \mid x_i] = \frac{\phi(x_i \beta_0) \, x_i' \, E[u_i \mid x_i]}{\Phi(x_i \beta_0) \, (1 - \Phi(x_i \beta_0))} = 0$$
which implies that
$$E[s_i(\beta_0)] = 0$$
The Hessian for each observation $i$ is a $P \times P$ matrix of second partial derivatives of $\ell_i(\theta)$:
$$H_i(\theta) = \nabla_\theta s_i(\theta) = \nabla^2_{\theta\theta'} \ell_i(\theta)$$

Let's try to get the asymptotic linear representation of the MLE.

First, we take a mean value expansion of the first-order condition around $\theta_0$:
$$\frac{1}{\sqrt{N}} \sum_{i=1}^N \nabla_\theta \ell_i(\hat{\theta}) = \frac{1}{\sqrt{N}} \sum_{i=1}^N \nabla_\theta \ell_i(\theta_0) + \left[\frac{1}{N} \sum_{i=1}^N \nabla^2_{\theta\theta'} \ell_i(\bar{\theta})\right] \sqrt{N} \left(\hat{\theta} - \theta_0\right)$$
where $\bar{\theta}$ lies between $\hat{\theta}$ and $\theta_0$, and the left-hand side is zero at the MLE.
Under some conditions,
$$\frac{1}{N} \sum_{i=1}^N \nabla^2_{\theta\theta'} \ell_i(\bar{\theta}) \stackrel{p}{\to} E[H_i(\theta_0)] = H_0$$

We can write
$$\sqrt{N} \left(\hat{\theta} - \theta_0\right) = -H_0^{-1} \frac{1}{\sqrt{N}} \sum_{i=1}^N s_i(\theta_0) + o_p(1)$$

We know that $E[s_i(\theta_0)] = 0$ and
$$\mathrm{Var}[s_i(\theta_0)] = E\left[s_i(\theta_0) \, s_i(\theta_0)'\right] = I(\theta_0) < \infty$$

$I_0 = I(\theta_0)$: the information matrix.

Under standard regularity conditions,
$$\sqrt{N} \left(\hat{\theta} - \theta_0\right) \stackrel{d}{\to} N\left(0, \, H_0^{-1} I_0 H_0^{-1}\right)$$
Note that
$$\nabla_\theta \left[\int_{\mathcal{Y}} s_i(\theta_0) \, f(y \mid x_i; \theta_0) \, \nu(dy)\right] = 0$$

Assuming that differentiation under the integral is allowed,
$$\begin{aligned}
\nabla_\theta \left[\int_{\mathcal{Y}} s_i(\theta) \, f(y \mid x_i; \theta) \, \nu(dy)\right]
&= \int_{\mathcal{Y}} \nabla_\theta \left(s_i(\theta) \, f(y \mid x_i; \theta)\right) \nu(dy) \\
&= \int_{\mathcal{Y}} \nabla_\theta s_i(\theta) \, f(y \mid x_i; \theta) \, \nu(dy) + \int_{\mathcal{Y}} s_i(\theta) \, \nabla_\theta f(y \mid x_i; \theta)' \, \nu(dy) \\
&= \int_{\mathcal{Y}} \nabla_\theta s_i(\theta) \, f(y \mid x_i; \theta) \, \nu(dy) + \int_{\mathcal{Y}} s_i(\theta) \left[\nabla_\theta \log f(y \mid x_i; \theta)\right]' f(y \mid x_i; \theta) \, \nu(dy) \\
&= \int_{\mathcal{Y}} \nabla_\theta s_i(\theta) \, f(y \mid x_i; \theta) \, \nu(dy) + \int_{\mathcal{Y}} s_i(\theta) \, s_i(\theta)' \, f(y \mid x_i; \theta) \, \nu(dy)
\end{aligned}$$
At the end,
$$E[\nabla_\theta s_i(\theta) \mid x_i] = -E\left[s_i(\theta) \, s_i(\theta)' \mid x_i\right]$$

Conditional Information Equality:
$$E[H_i(\theta_0) \mid x_i] = -E\left[s_i(\theta_0) \, s_i(\theta_0)' \mid x_i\right]$$

Using the law of iterated expectations, we have the (unconditional) Information Equality:
$$E[H_i(\theta_0)] = -E\left[s_i(\theta_0) \, s_i(\theta_0)'\right]$$

In other words,
$$H_0 = -I(\theta_0)$$
Asymptotic Normality of the MLE: Suppose we have a random sample $\{(x_i, y_i)\}_{i=1}^N$ and the hypotheses of the consistency theorem are satisfied. If

(a) $\theta_0 \in \mathrm{interior}(\Theta)$;

(b) $f(y \mid x; \theta)$ is twice continuously differentiable and $f(y \mid x; \theta) > 0$ in a neighborhood $\mathcal{N}$ of $\theta_0$;

(c) $\int \sup_{\theta \in \mathcal{N}} \|\nabla_\theta f(y \mid x; \theta)\| \, \nu(dy) < \infty$ and $\int \sup_{\theta \in \mathcal{N}} \|\nabla^2_{\theta\theta'} f(y \mid x; \theta)\| \, \nu(dy) < \infty$;

(d) $I = E\left[\nabla_\theta \log f(y \mid x; \theta_0) \left(\nabla_\theta \log f(y \mid x; \theta_0)\right)'\right]$ exists and is nonsingular;

(e) $E\left[\sup_{\theta \in \mathcal{N}} \|\nabla^2_{\theta\theta'} \log f(y \mid x; \theta)\|\right] < \infty$;

then
$$\sqrt{N} \left(\hat{\theta} - \theta_0\right) \stackrel{d}{\to} N\left(0, \, I(\theta_0)^{-1}\right)$$
To estimate the asymptotic variance, we need to estimate $I_0$. There are several ways to estimate this matrix.

Using the sample analogs of the moments:
$$\hat{I}_1 = \frac{1}{N} \sum_{i=1}^N s\left(y_i, x_i, \hat{\theta}\right) s\left(y_i, x_i, \hat{\theta}\right)'$$
$$\hat{I}_2 = -\frac{1}{N} \sum_{i=1}^N \nabla^2_{\theta\theta'} \log f\left(y_i \mid x_i; \hat{\theta}\right)$$

Another possible estimator is the sample average of the conditional information matrix. Let $I(x; \theta) = E\left[s_i(\theta) \, s_i(\theta)' \mid x\right]$. Using the law of iterated expectations and the sample analog,
$$\hat{I}_3 = \frac{1}{N} \sum_{i=1}^N I\left(x_i; \hat{\theta}\right)$$
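For the probit example, $\hat{I}_1$ and $\hat{I}_3$ have closed forms and can be compared directly. A sketch on simulated data, evaluated at the data-generating $\beta$ rather than an estimate to keep it short (the sample size and coefficients are illustrative assumptions):

```python
# Outer-product (I_hat_1) vs. conditional-information (I_hat_3) estimators
# for the probit model, evaluated at the true beta on simulated data.
import numpy as np
from math import erf, sqrt, pi

Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

rng = np.random.default_rng(1)
N = 4000
beta = np.array([0.2, 0.8])
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = (X @ beta + rng.normal(size=N) > 0).astype(float)

xb = X @ beta
p = Phi(xb)
pdf = np.exp(-xb ** 2 / 2.0) / sqrt(2.0 * pi)   # standard normal pdf at xb

S = (pdf * (y - p) / (p * (1.0 - p)))[:, None] * X  # rows are the scores s_i'
I1 = S.T @ S / N                                    # outer product of the score
w = pdf ** 2 / (p * (1.0 - p))
I3 = (X * w[:, None]).T @ X / N                     # average of E[s s' | x_i]

print(np.abs(I1 - I3).max())  # small: the two estimates agree in large samples
```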
The regularity conditions for consistency of each of these estimators are weak, and in general they will be consistent when the likelihood function is twice differentiable.

There are some properties of these estimators that help us decide which one to use:

$\hat{I}_1$ is easier to compute than $\hat{I}_2$, which is easier to compute than $\hat{I}_3$.

$\hat{I}_1$ is always positive semidefinite, but it can behave poorly in finite samples.

$\hat{I}_2$ is not guaranteed to be positive definite.

$\hat{I}_3$ is positive definite if it exists and has better small-sample properties than $\hat{I}_1$.

None of these estimators is consistent if the conditional density of $y$ given $x$ is misspecified. In that case, we need to use the general extremum estimator formula.
Example:
$$\nabla_\beta \ell_i(\beta) = \frac{\phi(x_i \beta) \, x_i' \, (y_i - \Phi(x_i \beta))}{\Phi(x_i \beta) \, (1 - \Phi(x_i \beta))}$$

The MLE is the solution of the system of equations
$$\frac{1}{N} \sum_{i=1}^N \frac{\phi(x_i \hat{\beta}) \, x_i' \, \left(y_i - \Phi(x_i \hat{\beta})\right)}{\Phi(x_i \hat{\beta}) \left(1 - \Phi(x_i \hat{\beta})\right)} = 0$$

For each observation $i$, the second derivative is
$$H_i(\beta) = -\frac{\phi(x_i \beta)^2 \, x_i' x_i}{\Phi(x_i \beta) \left(1 - \Phi(x_i \beta)\right)} - \left(y_i - \Phi(x_i \beta)\right) \frac{\left[x_i \beta \, \Phi(x_i \beta) \left(1 - \Phi(x_i \beta)\right) + \phi(x_i \beta) \left(1 - 2\Phi(x_i \beta)\right)\right] \phi(x_i \beta) \, x_i' x_i}{\left[\Phi(x_i \beta)\right]^2 \left[1 - \Phi(x_i \beta)\right]^2}$$
This expression is very long; however, when we take the conditional expectation evaluated at $\beta_0$, the term multiplying $(y_i - \Phi(x_i \beta_0))$ drops out:
$$E[H_i(\beta_0) \mid x_i] = -\frac{\phi(x_i \beta_0)^2 \, x_i' x_i}{\Phi(x_i \beta_0) \left(1 - \Phi(x_i \beta_0)\right)}$$
and the asymptotic variance of the MLE in this example can be estimated by
$$\left[\frac{1}{N} \sum_{i=1}^N \frac{\phi(x_i \hat{\beta})^2 \, x_i' x_i}{\Phi(x_i \hat{\beta}) \left(1 - \Phi(x_i \hat{\beta})\right)}\right]^{-1}$$
which is always positive definite when the inverse exists.
We can use Wald, LM, or QLR tests in this case.

In the MLE set-up, if the information equality holds, these tests have the same limiting distribution.

We will come back to the properties of these tests and the efficiency of MLE when we talk about GMM.

However, since MLE is based on distributional assumptions, it is important to have a specification test that can be used in this context.

One way to think about these specification tests is to test moment conditions implied by the conditional density specification. Let $w_i = (x_i, y_i)$, and suppose that when $f(\cdot \mid x; \theta)$ is correctly specified,
$$H_0 : E[g(w_i, \theta_0)] = 0$$
where $g(w_i, \theta_0)$ is a $Q \times 1$ vector.
Note that $g(w_i, \theta_0)$ cannot contain elements of the score.

One test is based on how far the sample average of $g(w_i, \hat{\theta})$ is from zero.

The test statistic will be based on the equality
$$\frac{1}{\sqrt{N}} \sum_{i=1}^N g\left(w_i, \hat{\theta}\right) = \frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g\left(w_i, \hat{\theta}\right) - \Pi_0' \, s_i\left(\hat{\theta}\right)\right]$$
where $\sum_{i=1}^N s_i(\hat{\theta}) = 0$ and
$$\Pi_0 = \left[E\left(s_i(\theta_0) \, s_i(\theta_0)'\right)\right]^{-1} \left[E\left(s_i(\theta_0) \, g_i(\theta_0)'\right)\right]$$

$\Pi_0$ is a $P \times Q$ matrix of population coefficients from a regression of $g_i(\theta_0)'$ on $s_i(\theta_0)'$.
Doing a mean value expansion around $\theta_0$,
$$\frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g\left(w_i, \hat{\theta}\right) - \Pi_0' \, s_i\left(\hat{\theta}\right)\right] = \frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g(w_i, \theta_0) - \Pi_0' \, s_i(\theta_0)\right] + E\left[\nabla_\theta g(w_i, \theta_0) - \Pi_0' \nabla_\theta s_i(\theta_0)\right] \sqrt{N} \left(\hat{\theta} - \theta_0\right) + o_p(1)$$

If the density is correctly specified, $E[\nabla_\theta g(w_i, \theta_0) - \Pi_0' \nabla_\theta s_i(\theta_0)] = 0$, since
$$E\left[\Pi_0' \nabla_\theta s_i(\theta_0)\right] = -\Pi_0' \, E\left[s_i(\theta_0) \, s_i(\theta_0)'\right] = -E\left[g_i(\theta_0) \, s_i(\theta_0)'\right]$$
and, using the same argument as in the conditional information equality,
$$E[\nabla_\theta g(w_i, \theta_0) \mid x_i] = -E\left[g_i(\theta_0) \, s_i(\theta_0)' \mid x_i\right]$$
Using the results above,
$$\frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g\left(w_i, \hat{\theta}\right) - \Pi_0' \, s_i\left(\hat{\theta}\right)\right] = \frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g(w_i, \theta_0) - \Pi_0' \, s_i(\theta_0)\right] + o_p(1)$$

We can get a consistent estimator for $\Pi_0$:
$$\hat{\Pi} = \left[\frac{1}{N} \sum_{i=1}^N s_i\left(\hat{\theta}\right) s_i\left(\hat{\theta}\right)'\right]^{-1} \left[\frac{1}{N} \sum_{i=1}^N s_i\left(\hat{\theta}\right) g_i\left(\hat{\theta}\right)'\right]$$
and the asymptotic variance of $\frac{1}{\sqrt{N}} \sum_{i=1}^N \left[g(w_i, \hat{\theta}) - \hat{\Pi}' s_i(\hat{\theta})\right]$ can be estimated by
$$\frac{1}{N} \sum_{i=1}^N \left[g\left(w_i, \hat{\theta}\right) - \hat{\Pi}' s_i\left(\hat{\theta}\right)\right] \left[g\left(w_i, \hat{\theta}\right) - \hat{\Pi}' s_i\left(\hat{\theta}\right)\right]'$$
The Newey-Tauchen-White (NTW) statistic is
$$NTW = \left[\sum_{i=1}^N g\left(w_i, \hat{\theta}\right)\right]' \left\{\sum_{i=1}^N \left[g\left(w_i, \hat{\theta}\right) - \hat{\Pi}' s_i\left(\hat{\theta}\right)\right] \left[g\left(w_i, \hat{\theta}\right) - \hat{\Pi}' s_i\left(\hat{\theta}\right)\right]'\right\}^{-1} \left[\sum_{i=1}^N g\left(w_i, \hat{\theta}\right)\right]$$

Under the null that the density is correctly specified,
$$NTW \stackrel{d}{\to} \chi^2_Q$$
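The mechanics of assembling this quadratic form can be sketched numerically. The arrays below are artificial stand-ins for the score rows $s_i'$ and moment rows $g_i'$ (not scores from an actual estimated likelihood); the score columns are demeaned to mimic the first-order condition $\sum_i s_i(\hat{\theta}) = 0$:

```python
# Assembling the NTW statistic (Q = 1) from stand-in score and moment rows.
import numpy as np

rng = np.random.default_rng(3)
N, P, Q = 500, 2, 1
S = rng.normal(size=(N, P))
S = S - S.mean(axis=0)                        # mimic sum_i s_i(theta_hat) = 0
G = 0.5 * S[:, :Q] + rng.normal(size=(N, Q))  # moments correlated with the score

Pi_hat = np.linalg.solve(S.T @ S, S.T @ G)    # regression of g_i' on s_i' (P x Q)
R = G - S @ Pi_hat                            # residual moments g_i - Pi' s_i
g_sum = G.sum(axis=0)
V = R.T @ R                                   # middle matrix of the quadratic form
NTW = float(g_sum @ np.linalg.solve(V, g_sum))
print(NTW)  # a nonnegative statistic, roughly a chi-squared_1 draw here
```

With scores and moments computed from an actual estimated model, $NTW$ would be compared against $\chi^2_Q$ critical values.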
Quasi-MLE

In general, we do not know the true (conditional) distribution function.

In QMLE, we use a normal density function to approximate the distribution when we do not know that distribution.

In this case, since the model is not correctly specified, $I(\theta_0) \neq -E[H(\theta_0)]$, and the asymptotic variance of the QMLE is
$$\frac{H_0^{-1} I_0 H_0^{-1}}{N}$$

In this case, there is no true value at which the model coincides with $p_0$; the QMLE converges to the value that solves the following maximization problem:
$$\max_{\theta \in \Theta} E_0[\ell(\theta)] = \max_{\theta \in \Theta} \int \log f(y \mid x; \theta) \, p_0(y \mid x) \, \nu(dy)$$
Using the Kullback-Leibler Information Criterion, we are minimizing the distance between the true density and the parametric family:
$$\mathcal{K}(f; x) = \int_{\mathcal{Y}} \log\left[\frac{p_0(y \mid x)}{f(y \mid x)}\right] p_0(y \mid x) \, \nu(dy)$$
and we try to find the pseudo-true parameter value that makes the parametric density as close as possible to the true density.

This estimator is consistent for the parameters of the conditional mean if the assumed density belongs to the linear exponential family and the conditional mean is correctly specified.
References

Amemiya, chapter 4.
Wooldridge, chapter 13.
Ruud, chapters 14 and 15.
Newey, W. and D. McFadden (1994). "Large Sample Estimation and Hypothesis Testing", Handbook of Econometrics, Volume IV, chapter 36.
