
Asymptotic Theory: OLS Large Sample Theory

Modelos Lineares
Asymptotic Theory and Generalized Least Squares (Teoria Assintótica e MQO Generalizado)
Cristine Campos de Xavier Pinto
CEDEPLAR/UFMG
May 2010
Cristine Campos de Xavier Pinto Institute
Modelos Lineares
When do you need to use asymptotic results?

Statistics are nonlinear in the underlying data.
The joint distribution of those data is not parametrically specified.
Asymptotic Theory

Idea: as the sample size increases, the difference between the true distribution of the statistic and its approximate distribution (the normal distribution) shrinks to zero.

Two main steps in this approximation (Powell's notes):
Step 1 (limit theorems): show that a sample average converges to its expectation and that a standardized version of the statistic has a distribution function that converges to the standard normal (LLN and CLT).
Step 2: approximate the distribution of a smooth function of sample averages by showing that this smooth function is approximately a linear function of sample averages (Slutsky theorems).

Ingredients: regularity conditions to apply the limit and Slutsky theorems, plus rules for mathematical manipulation of the approximating objects.
Types of Convergences

$X_N$ or $Y_N$: sequences of random vectors (or matrices).

Cumulative distribution function: $F_N(x) = \Pr[X_N \le x]$, where $x$ has the same dimension as $X_N$ and the inequality holds if it is true for each component.

Convergence in distribution (or convergence in law): $X_N \stackrel{d}{\to} F(x)$ as $N \to \infty$ if
$$\lim_{N \to \infty} F_N(x) = F(x)$$
at all points $x$ where $F(x)$ is continuous.

Implication 1: if $X_N \stackrel{d}{\to} F(x)$, we can use the cdf $F(x)$ to approximate probabilities for $X_N$ if $N$ is large, i.e.,
$$\Pr[X_N \in A] \approx \int_A dF(x).$$

If $F_N(x)$ converges to the cdf of a degenerate random variable, we have another type of convergence.

Convergence in probability: the sequence of random vectors $X_N$ converges to the (nonrandom) vector $x$, written
$$\operatorname*{plim}_{N \to \infty} X_N = x \quad \text{or} \quad X_N \stackrel{p}{\to} x \text{ as } N \to \infty,$$
if
$$F_N(c) = \Pr[X_N \le c] \to \mathbf{1}\{x \le c\}$$
at every continuity point $c$ of the limit.

Equivalently, we can define convergence in probability using the textbook definition: $X_N \stackrel{p}{\to} x$ if for any number $\varepsilon > 0$
$$\lim_{N \to \infty} \Pr[\,\|X_N - x\| > \varepsilon\,] = 0.$$
We can show the equivalence of these two definitions.

Convergence in mean square: a stronger form of convergence of $X_N$ to the (nonrandom) vector $x$: $X_N \stackrel{ms}{\to} x$ as $N \to \infty$ if
$$\lim_{N \to \infty} E\left[\|X_N - x\|^2\right] = 0.$$

Almost sure convergence (strong convergence): $X_N \stackrel{as}{\to} x$ as $N \to \infty$ if
$$\Pr\left[\lim_{N \to \infty} X_N = x\right] = 1.$$

We know that
$$ms \Rightarrow p \Rightarrow d \qquad \text{and} \qquad as \Rightarrow p.$$
Important Inequalities

Markov's inequality: if $Z$ is a nonnegative scalar random variable, i.e., $\Pr[Z \ge 0] = 1$, with finite expectation, then for any $K > 0$
$$\Pr[Z \ge K] \le \frac{E[Z]}{K}.$$

Chebyshev's inequality: take $Z = (X - E[X])^2$ for a scalar random variable $X$ with finite second moment and $K = \varepsilon^2$ for any $\varepsilon > 0$; then
$$\Pr[\,|X - E[X]| \ge \varepsilon\,] = \Pr\left[(X - E[X])^2 \ge \varepsilon^2\right] \le \frac{\operatorname{Var}[X]}{\varepsilon^2}.$$
Law of Large Numbers

Weak Law of Large Numbers: if $\bar{X}_N = \frac{1}{N}\sum_{i=1}^N X_i$ is the sample average of scalar i.i.d. random variables $\{X_i\}_{i=1}^N$ with $E[X_i] = \mu$ and $\operatorname{Var}[X_i] = \sigma^2 < \infty$, then
$$\bar{X}_N \stackrel{p}{\to} \mu = E[X_i].$$

Proof: calculating the mean and variance of sums of i.i.d. random variables, we can see that $E[\bar{X}_N] = \mu$ and
$$\operatorname{Var}[\bar{X}_N] = \frac{\sigma^2}{N} = E\left[(\bar{X}_N - \mu)^2\right].$$
Note that as $N \to \infty$, $E[(\bar{X}_N - \mu)^2] \to 0$. By definition $\bar{X}_N \stackrel{ms}{\to} \mu$ as $N \to \infty$, which implies that $\bar{X}_N \stackrel{p}{\to} \mu$.

There are very general conditions that yield other LLNs for heterogeneous (non-constant variance) and dependent random variables.

Strong Law of Large Numbers: relaxes the condition of existence of second moments and shows almost sure convergence.
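The concentration of the sample mean described by the WLLN is easy to see by simulation. This is a minimal sketch; the distribution, parameter values, and all variable names are illustrative choices of ours, not part of the notes.

```python
import numpy as np

# WLLN sketch: the sample mean of i.i.d. draws concentrates around mu.
# The normal distribution and the values of mu, sigma are illustrative.
rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0

def max_deviation(n, reps=500):
    """Largest |X_bar_N - mu| across `reps` replications of sample size n."""
    draws = rng.normal(mu, sigma, size=(reps, n))
    return np.abs(draws.mean(axis=1) - mu).max()

dev_small = max_deviation(10)       # sd of the mean is about 0.95
dev_large = max_deviation(10_000)   # sd of the mean is about 0.03
```

Even the worst deviation across 500 replications shrinks sharply as $N$ grows, which is exactly the mean-square argument in the proof above.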
Central Limit Theorems

CLT (Lindeberg-Lévy): if $\bar{X}_N = \frac{1}{N}\sum_{i=1}^N X_i$ is the sample average of scalar i.i.d. random variables $\{X_i\}_{i=1}^N$ with $E[X_i] = \mu$ and $\operatorname{Var}[X_i] = \sigma^2 < \infty$, then
$$Z_N = \frac{\bar{X}_N - E[\bar{X}_N]}{\sqrt{\operatorname{Var}[\bar{X}_N]}} = \sqrt{N}\,\frac{\bar{X}_N - \mu}{\sigma} \stackrel{d}{\to} \mathcal{N}(0, 1).$$

We can rewrite this result as
$$\sqrt{N}\left(\bar{X}_N - \mu\right) \stackrel{d}{\to} \mathcal{N}\left(0, \sigma^2\right).$$
In this form, it is explicit that $\bar{X}_N$ converges to its mean at exactly the rate $\sqrt{N}$.

Limiting distribution of $Z_N$: $\mathcal{N}(0, 1)$.
Asymptotic distribution of $\bar{X}_N$:
$$\bar{X}_N \stackrel{a}{\sim} \mathcal{N}\left(\mu, \frac{\sigma^2}{N}\right).$$

There are different generalizations of the CLT, which relax the i.i.d. assumption while imposing stronger restrictions on the existence of moments or on the tail behavior of the distribution.

To extend the CLT and the WLLN to random vectors $X_i$, we need to use the Cramér-Wold proposition.

Cramér-Wold: if the scalar random variable $Y_N = \lambda' X_N$ converges in distribution to $Y = \lambda' X$ for any fixed vector $\lambda$ having the same dimension as $X_N$, which means that
$$\lambda' X_N \stackrel{d}{\to} \lambda' X \text{ for any vector } \lambda,$$
then
$$X_N \stackrel{d}{\to} X.$$

Using Cramér-Wold, we can get a multivariate version of the Lindeberg-Lévy CLT: a sample mean $\bar{X}_N$ of i.i.d. random vectors $X_i$ with $E[X_i] = \mu$ and $\operatorname{Var}[X_i] = \Sigma$ has
$$\sqrt{N}\left(\bar{X}_N - \mu\right) \stackrel{d}{\to} \mathcal{N}(0, \Sigma).$$
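The standardization in the Lindeberg-Lévy CLT can be checked numerically even for a skewed parent distribution. A minimal sketch, where the Exponential(1) parent (so $\mu = \sigma = 1$) and all sample sizes are illustrative assumptions:

```python
import numpy as np

# CLT sketch: for i.i.d. Exponential(1) draws (mu = 1, sigma = 1), the
# standardized mean Z_N = sqrt(N) * (X_bar_N - mu) / sigma is approximately
# standard normal when N is large. Setup is illustrative.
rng = np.random.default_rng(1)
n, reps = 500, 5_000
x = rng.exponential(1.0, size=(reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - 1.0) / 1.0

z_mean = z.mean()              # should be close to 0
z_var = z.var()                # should be close to 1
tail = (z <= 1.645).mean()     # should approach Phi(1.645), about 0.95
```

Despite the strong skewness of the exponential distribution, the first two moments of $Z_N$ and its tail probabilities already match the standard normal closely at $N = 500$.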
Slutsky Theorems

We need to extend the limit theorems to statistics that are functions of sample averages. We are going to use the smoothness properties of these functions.

Continuity theorem: if $X_N \stackrel{p}{\to} x_0$ and the function $g(x)$ is continuous at $x = x_0$, then
$$g(X_N) \stackrel{p}{\to} g(x_0).$$

Slutsky's theorem: if the random vectors $X_N$ and $Y_N$ have the same length, and if $X_N \stackrel{p}{\to} x_0$ and $Y_N \stackrel{d}{\to} Y$, then
$$X_N + Y_N \stackrel{d}{\to} x_0 + Y, \qquad X_N' Y_N \stackrel{d}{\to} x_0' Y.$$

Delta method: if $\hat{\theta}_N$ is a random vector that is asymptotically normal, with
$$\sqrt{N}\left(\hat{\theta}_N - \theta_0\right) \stackrel{d}{\to} \mathcal{N}(0, \Sigma)$$
for some $\theta_0$, and if $g(\theta)$ is continuously differentiable at $\theta = \theta_0$ with Jacobian matrix
$$G_0 = \frac{\partial g(\theta_0)}{\partial \theta'}$$
that has full rank, then
$$\sqrt{N}\left(g\left(\hat{\theta}_N\right) - g(\theta_0)\right) \stackrel{d}{\to} \mathcal{N}\left(0, G_0 \Sigma G_0'\right).$$
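A scalar delta-method check: take $\hat{\theta}_N$ to be a sample mean and $g(\theta) = \theta^2$, so $G_0 = 2\theta_0$ and the predicted asymptotic variance is $G_0^2 \sigma^2$. The choice of $g$, the normal parent, and all parameter values are illustrative assumptions of this sketch:

```python
import numpy as np

# Delta-method sketch: theta_hat is a sample mean of N(theta0, sigma^2) draws,
# g(theta) = theta**2, so G0 = 2 * theta0 and the asymptotic variance of
# sqrt(N) * (g(theta_hat) - g(theta0)) is G0**2 * sigma**2.
rng = np.random.default_rng(2)
theta0, sigma, n, reps = 2.0, 1.0, 200, 20_000

theta_hat = rng.normal(theta0, sigma, size=(reps, n)).mean(axis=1)
scaled = np.sqrt(n) * (theta_hat**2 - theta0**2)

mc_var = scaled.var()                       # Monte Carlo variance
delta_var = (2 * theta0) ** 2 * sigma**2    # delta-method prediction: 16
```

The Monte Carlo variance of the scaled, transformed estimator should sit close to the delta-method prediction of 16.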
Linear Model

$$y = X\beta + \varepsilon$$

A1: $\{(Y_i, X_i) : i = 1, \ldots, N\}$ is an i.i.d. sample with bounded fourth moments.
A2**: $\operatorname{rank}\left(E\left[X'X\right]\right) = K$.
A3**: $E\left[X'\varepsilon\right] = 0$ (a weaker assumption than before), $E[\varepsilon] = 0$.

Exercise: show that $E[\varepsilon \mid X] = 0$ implies that $E\left[X'\varepsilon\right] = 0$.
OLS Estimator

Note that under assumptions A1, A2** and A3**, the parameter vector $\beta$ is identified.

Identification (linear models under random sampling): $\beta$ can be written as a function of population moments of the observable variables.

How do we show that $\beta$ is identified by these two assumptions?

By the analog principle, the OLS estimator is
$$\hat{\beta} = \left(X'X\right)^{-1} X'Y = \left(\sum_{i=1}^N X_i' X_i\right)^{-1}\left(\sum_{i=1}^N X_i' Y_i\right) = \beta + \left(\frac{1}{N}\sum_{i=1}^N X_i' X_i\right)^{-1}\left(\frac{1}{N}\sum_{i=1}^N X_i' \varepsilon_i\right).$$
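The closed form and the sampling-error decomposition above are direct to verify numerically. A minimal sketch in which the design, the true $\beta$, and all names are illustrative:

```python
import numpy as np

# OLS sketch: compute beta_hat = (X'X)^{-1} X'Y and verify the algebraic
# decomposition beta_hat = beta + (X'X)^{-1} X'eps on simulated data.
rng = np.random.default_rng(3)
n, beta = 1_000, np.array([1.0, -2.0, 0.5])

X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
eps = rng.normal(size=n)
y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
decomposition = beta + XtX_inv @ X.T @ eps   # identical by algebra
```

The decomposition makes the asymptotic argument transparent: consistency requires exactly that the second term vanishes as $N$ grows.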
OLS Estimator: Consistency

Under assumption A2**, $X'X$ is nonsingular with probability approaching one.

Using the LLN and the continuous mapping theorem, we can show that
$$\left(\frac{1}{N}\sum_{i=1}^N X_i' X_i\right)^{-1} \stackrel{p}{\to} \left(E\left[X'X\right]\right)^{-1}.$$

By the LLN and using A3**,
$$\frac{1}{N}\sum_{i=1}^N X_i' \varepsilon_i \stackrel{p}{\to} E\left[X'\varepsilon\right] = 0.$$

Using Slutsky's theorem,
$$\hat{\beta} \stackrel{p}{\to} \beta + \left(E\left[X'X\right]\right)^{-1} \cdot 0 = \beta.$$

Theorem. Under assumptions A1, A2** and A3**, the OLS estimator is consistent.

This theorem shows that OLS consistently estimates conditional expectations that are linear in the parameters.

If A2** or A3** fails, then $\beta$ is not identified.

Under A1, A2** and A3**, $\hat{\beta}$ is not necessarily unbiased. As we showed last class, we need stronger assumptions to get unbiasedness.

If we impose the assumption that $\varepsilon$ is independent of $X$, and $E[\varepsilon] = 0$, then the following assumptions hold:
$$E[\varepsilon \mid X] = 0, \qquad \operatorname{Var}[\varepsilon \mid X] = \sigma^2 I_N.$$
OLS Estimator: Asymptotic Inference

Note that we can write
$$\sqrt{N}\left(\hat{\beta} - \beta\right) = \left(\frac{1}{N}\sum_{i=1}^N X_i' X_i\right)^{-1}\left(\frac{1}{\sqrt{N}}\sum_{i=1}^N X_i' \varepsilon_i\right).$$

Applying the LLN, we get that
$$\left(\frac{1}{N}\sum_{i=1}^N X_i' X_i\right)^{-1} \stackrel{p}{\to} \left(E\left[X'X\right]\right)^{-1} = A^{-1}.$$

Note that $\left\{X_i' \varepsilon_i\right\}_{i=1}^N$ is a sequence of i.i.d. random vectors with mean zero whose elements have finite variance (by assumption A1), so we can apply the CLT:
$$\frac{1}{\sqrt{N}}\sum_{i=1}^N X_i' \varepsilon_i \stackrel{d}{\to} \mathcal{N}(0, B),$$
where $B = E\left[\varepsilon^2 X'X\right]$.

Theorem. Under assumptions A1, A2** and A3**,
$$\sqrt{N}\left(\hat{\beta} - \beta\right) \stackrel{a}{\sim} \mathcal{N}\left(0, A^{-1} B A^{-1}\right).$$
OLS Estimator: Asymptotic Inference

Suppose that we impose the following assumption:
A4: $\operatorname{Var}[\varepsilon \mid X] = \sigma^2$.

Theorem. Under assumptions A1, A2**, A3** and A4,
$$\sqrt{N}\left(\hat{\beta} - \beta\right) \stackrel{d}{\to} \mathcal{N}\left(0, \sigma^2 A^{-1}\right).$$

Proof. Using the law of iterated expectations, we can show that under A4,
$$B = E\left[\varepsilon^2 X'X\right] = \sigma^2 A.$$

Under the classical assumptions of OLS, we can estimate the asymptotic variance of $\hat{\beta}$ by
$$\widehat{\operatorname{Var}}\left[\hat{\beta}\right] = \hat{\sigma}^2\left(X'X\right)^{-1} = \hat{\sigma}^2\left(\sum_{i=1}^N X_i' X_i\right)^{-1},$$
where
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^N \hat{\varepsilon}_i^2}{N - K}.$$
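A Monte Carlo check that the classical formula $\hat{\sigma}^2 (X'X)^{-1}$ tracks the true sampling variability of $\hat{\beta}$ under homoskedasticity. The design and all names are illustrative assumptions of this sketch:

```python
import numpy as np

# Sketch: under homoskedastic errors, the classical standard error
# sqrt(sigma_hat^2 * [(X'X)^{-1}]_jj) should match the Monte Carlo
# standard deviation of the slope estimate across replications.
rng = np.random.default_rng(4)
n, reps, beta = 200, 2_000, np.array([1.0, 2.0])

slopes, classical_se = [], []
for _ in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.normal(size=n)            # homoskedastic errors
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2_hat = resid @ resid / (n - X.shape[1])   # sigma_hat^2 with N - K
    slopes.append(b[1])
    classical_se.append(np.sqrt(sigma2_hat * XtX_inv[1, 1]))

empirical_sd = np.std(slopes)       # true sampling spread of the slope
mean_se = np.mean(classical_se)     # what the classical formula reports
```

When A4 holds, the two numbers agree; the next slides show how this breaks under heteroskedasticity.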
Heteroskedasticity

If assumption A4 does not hold, we still get consistency and asymptotic normality, but the final asymptotic variance is different.

In general, assumption A4 fails. In this case, we cannot use the previous formula to estimate the variance of OLS.

There are two solutions for violations of assumption A4:
1. Specify a model for $\operatorname{Var}[y \mid x]$ and use generalized least squares, which leads to an estimator different from OLS.
2. Adjust the variance and test statistics so that they are valid in the presence of heteroskedasticity.
Heteroskedasticity-Robust Inference

Without assumption A4, the asymptotic variance of $\hat{\beta}$ is
$$\operatorname{AVar}\left[\hat{\beta}\right] = \frac{A^{-1} B A^{-1}}{N}.$$

In this case, we can estimate the asymptotic variance of $\hat{\beta}$ by
$$\widehat{\operatorname{AVar}}\left[\hat{\beta}\right] = \left(\sum_{i=1}^N X_i' X_i\right)^{-1}\left(\sum_{i=1}^N \hat{\varepsilon}_i^2 X_i' X_i\right)\left(\sum_{i=1}^N X_i' X_i\right)^{-1}.$$

The square roots of the diagonal elements of this matrix are the Eicker-Huber-White standard errors.

Once we get the standard errors, we can compute our test statistics, such as t statistics.

However, if A4 fails, the usual F test cannot be used to test multiple linear restrictions; we need to use the heteroskedasticity-robust Wald statistic, which is an approximation for this test.
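The sandwich estimator above is a few lines of linear algebra. In this sketch the heteroskedastic design ($\operatorname{Var}[\varepsilon \mid x] = 1 + x^2$) and all names are illustrative assumptions:

```python
import numpy as np

# Sketch of the Eicker-Huber-White sandwich estimator:
# (sum X_i'X_i)^{-1} (sum e_i^2 X_i'X_i) (sum X_i'X_i)^{-1}.
rng = np.random.default_rng(5)
n, beta = 5_000, np.array([1.0, 0.5])

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
eps = rng.normal(size=n) * np.sqrt(1.0 + x**2)   # Var[eps | x] = 1 + x^2
y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

meat = (X * e[:, None] ** 2).T @ X       # sum of e_i^2 X_i'X_i
V_robust = XtX_inv @ meat @ XtX_inv      # robust variance of beta_hat
se_robust = np.sqrt(np.diag(V_robust))

# classical (homoskedastic) standard errors, for comparison
sigma2_hat = e @ e / (n - 2)
se_classical = np.sqrt(np.diag(sigma2_hat * XtX_inv))
```

With variance increasing in $x^2$, the classical formula understates the slope's uncertainty, while the robust formula captures it.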
Heteroskedasticity-Robust Inference

Suppose we can write our restrictions as
$$R\beta = r,$$
where $R$ is $Q \times K$ with rank $Q \le K$ and $r$ is $Q \times 1$.

In this case, the robust Wald statistic is
$$W = \left(R\hat{\beta} - r\right)'\left(R\hat{V}R'\right)^{-1}\left(R\hat{\beta} - r\right),$$
where
$$\hat{V} = \left(\sum_{i=1}^N X_i' X_i\right)^{-1}\left(\sum_{i=1}^N \hat{\varepsilon}_i^2 X_i' X_i\right)\left(\sum_{i=1}^N X_i' X_i\right)^{-1}.$$

Under $H_0$: $W \stackrel{a}{\sim} \chi^2_Q$.
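The robust Wald statistic for a single linear restriction can be assembled as follows. The restriction $\beta_1 + \beta_2 = 0$ (true in the simulated design) and all settings are illustrative assumptions:

```python
import numpy as np

# Robust Wald sketch: W = (R b - r)' (R V R')^{-1} (R b - r) for H0: R beta = r,
# with V the Eicker-Huber-White sandwich. H0 is true in this design.
rng = np.random.default_rng(6)
n = 2_000
beta = np.array([1.0, 0.5, -0.5])   # beta_1 + beta_2 = 0, so H0 below holds

X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
eps = rng.normal(size=n) * (1.0 + np.abs(X[:, 1]))   # heteroskedastic errors
y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
V = XtX_inv @ ((X * e[:, None] ** 2).T @ X) @ XtX_inv

R = np.array([[0.0, 1.0, 1.0]])     # H0: beta_1 + beta_2 = 0
r = np.array([0.0])
diff = R @ b - r
W = float(diff @ np.linalg.inv(R @ V @ R.T) @ diff)
```

Since $H_0$ holds in the simulation, $W$ should look like a draw from $\chi^2_1$, i.e., it should rarely exceed the usual critical values.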
Generalized Least Squares

In the GLS model, instead of A4 we impose the following assumption on the variance of $\varepsilon$:
$$\operatorname{Var}[\varepsilon \mid X] = \Omega.$$

Heteroskedastic linear model: $\Omega = \operatorname{diag}\left(\sigma_i^2\right)$ for some variances $\left\{\sigma_i^2 : i = 1, \ldots, N\right\}$.

Idea: we need to transform the data to satisfy the conditions of the classical regression model.

Generalized least squares estimator:
$$\hat{\beta}_{GLS} = \arg\min_b\, (y - Xb)'\,\Omega^{-1}(y - Xb) = \arg\min_b \sum_{i=1}^N w_i \left(Y_i - X_i b\right)^2,$$
where $w_i = \frac{1}{\sigma_i^2}$.

We just multiply our model by $\Omega^{-1/2}$:
$$\underbrace{\Omega^{-1/2} y}_{y^{+}} = \underbrace{\Omega^{-1/2} X}_{X^{+}}\,\beta + \underbrace{\Omega^{-1/2} \varepsilon}_{\varepsilon^{+}}.$$
Note that
$$\operatorname{Var}\left[\varepsilon^{+} \mid X\right] = I,$$
and we are back to the homoskedastic model.

The GLS estimator is
$$\hat{\beta}_{GLS} = \left(X' \Omega^{-1} X\right)^{-1}\left(X' \Omega^{-1} y\right).$$
In this case, with $\Omega$ diagonal, the generalized least squares estimator reduces to weighted least squares (WLS).

If we do not know $\Omega$, this estimator is not feasible. To construct a feasible GLS estimator, we need to replace the unknown $\Omega$ by an estimator $\hat{\Omega}$. To do this, we need to model the variance terms
$$\operatorname{Var}[\varepsilon_i \mid X_i] = \sigma_i^2.$$
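The equivalence between the GLS formula and least squares on the rescaled data is easy to confirm when $\Omega$ is known. The variance function and all names here are illustrative assumptions:

```python
import numpy as np

# Sketch: with Omega = diag(sigma_i^2) known, the GLS formula
# (X' Omega^{-1} X)^{-1} X' Omega^{-1} y equals OLS on data rescaled
# by 1/sigma_i (weighted least squares).
rng = np.random.default_rng(7)
n, beta = 500, np.array([1.0, 2.0])

x = rng.uniform(1.0, 3.0, size=n)
X = np.column_stack([np.ones(n), x])
sigma2 = 0.5 * x**2                       # known heteroskedasticity
y = X @ beta + rng.normal(size=n) * np.sqrt(sigma2)

Omega_inv = np.diag(1.0 / sigma2)
b_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

w = 1.0 / np.sqrt(sigma2)                 # multiply the model by Omega^{-1/2}
b_wls = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
```

The two estimates coincide up to floating-point error, which is exactly the $\Omega^{-1/2}$ transformation argument above.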
Generalized Least Squares

In general, applications of feasible WLS assume multiplicative heteroskedasticity models:
$$\varepsilon_i = c_i u_i,$$
where $\{u_i\}_{i=1}^N$ are i.i.d. random variables with $E[u_i \mid X_i] = 0$ and $\operatorname{Var}[u_i \mid X_i] = \sigma^2$.

We assume that the heteroskedasticity function $c_i^2$ has a single-index form
$$c_i^2 = h\left(Z_i' \gamma\right),$$
where:
$Z_i$ are observable functions of the regressors $X$;
the function $h(\cdot)$ is normalized so that $h(0) = 1$, with derivative $h'(\cdot)$ assumed to be nonzero at zero, $h'(0) \ne 0$.

Examples:
1. Exponential heteroskedasticity:
$$\operatorname{Var}[\varepsilon_i \mid X_i] = \sigma^2 \exp\left(X_i' \delta\right).$$
2. Variance proportional to the square of the mean:
$$\operatorname{Var}[\varepsilon_i \mid X_i] = \sigma^2\left(\alpha + X_i' \beta\right)^2, \quad \alpha \ne 0,$$
so
$$\sigma_i^2 = \tilde{\sigma}^2\left(1 + X_i' \tilde{\beta}\right)^2, \quad \text{where } \tilde{\sigma}^2 = \sigma^2 \alpha^2 \text{ and } \tilde{\beta} = \frac{\beta}{\alpha}.$$
Score Test

Before correcting for heteroskedasticity, we perform a diagnostic test for the presence of heteroskedasticity.

We want to test the null:
$$H_0: c_i = 1 \iff H_0: \gamma = 0.$$

Under $H_0$:
$$E\left[u_i^2 \mid X_i\right] = E\left[\varepsilon_i^2 \mid X_i\right] = \sigma^2$$
and
$$\operatorname{Var}\left[u_i^2 \mid X_i\right] = \operatorname{Var}\left[\varepsilon_i^2 \mid X_i\right] = \tau^2 > 0.$$

The null hypothesis generates a linear model for $\varepsilon_i^2$:
$$\varepsilon_i^2 = \sigma^2 + Z_i' \gamma + r_i,$$
where $E[r_i \mid Z_i] = 0$ and $\operatorname{Var}[r_i \mid Z_i] = \tau^2 > 0$, and the true $\gamma = 0$ under $H_0$.

To implement the test, we replace $\varepsilon_i^2$ by $\hat{\varepsilon}_i^2 = \left(y_i - x_i' \hat{\beta}\right)^2$.

The score test (or LM test) has the following steps:
1. Construct the squared least squares residuals $\hat{\varepsilon}_i^2 = \left(y_i - x_i' \hat{\beta}\right)^2$, where $\hat{\beta} = \left(X'X\right)^{-1} X'Y$.
2. Regress $\hat{\varepsilon}_i^2$ on $1, Z_i$ and obtain the $R^2$ from this "squared residual regression".
3. Construct the test statistic $T = N R^2$. Under $H_0$, $T \stackrel{d}{\to} \chi^2(p)$, where $p = \dim(\gamma) = \dim(Z_i)$. We reject if $T$ exceeds the upper critical value of the chi-square distribution with $p$ degrees of freedom.

There are different forms of this test, but all of them are based on the same idea.

If we reject $H_0$, we need to correct for heteroskedasticity.
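The three steps of the score test can be sketched as below. The choice $Z_i = x_i^2$, the data-generating variance $1 + x^2$, and all names are illustrative assumptions:

```python
import numpy as np

# Score (LM) test sketch: regress squared OLS residuals on (1, Z_i),
# form T = N * R^2, and compare with the chi-square(p) critical value.
rng = np.random.default_rng(8)
n = 1_000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n) * np.sqrt(1 + x**2)

# Step 1: squared least squares residuals
b = np.linalg.lstsq(X, y, rcond=None)[0]
e2 = (y - X @ b) ** 2

# Step 2: "squared residual regression" on (1, Z_i) with Z_i = x_i^2
Z = np.column_stack([np.ones(n), x**2])
g = np.linalg.lstsq(Z, e2, rcond=None)[0]
fitted = Z @ g
r2 = 1 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)

# Step 3: T = N * R^2, compared with the chi2(1) critical value 3.84
T = n * r2
```

Because the simulated errors are genuinely heteroskedastic, $T$ lands far above the 5% critical value of 3.84 and the test rejects $H_0$.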
Feasible GLS

To get a feasible WLS estimator, we use the fact that
$$E\left[\varepsilon_i^2 \mid X_i\right] = \sigma_i^2 = \sigma^2 h\left(Z_i' \gamma\right).$$

The procedure has two steps:
1. Replace $\varepsilon_i^2$ by $\hat{\varepsilon}_i^2$, and estimate $\sigma^2$ and $\gamma$ by least squares.
2. Calculate $h\left(Z_i' \hat{\gamma}\right)$ and then replace $y_i$ and $x_i$ by
$$y_i^{+} = \frac{y_i}{\sqrt{h\left(Z_i' \hat{\gamma}\right)}} \quad \text{and} \quad x_i^{+} = \frac{x_i}{\sqrt{h\left(Z_i' \hat{\gamma}\right)}},$$
and run the least squares regression of $y_i^{+}$ on $x_i^{+}$.
3. If we have the correct specification of the heteroskedasticity $\left(\sigma_i^2 = \sigma^2 h\left(Z_i' \gamma\right)\right)$, we can use the usual least squares standard error formulae and tests to do inference.

Example: exponential heteroskedasticity,
$$\varepsilon_i^2 = u_i^2 \exp\left(X_i' \delta\right).$$
We can take logarithms of both sides to get
$$\log\left(\varepsilon_i^2\right) = \log\left(u_i^2\right) + X_i' \delta.$$
Making the assumption that $u_i$ is independent of $X_i$,
$$E\left[\log\left(\varepsilon_i^2\right) \mid X_i\right] = E\left[\log\left(u_i^2\right)\right] + X_i' \delta.$$
In the first step, we run a regression of $\log\left(\hat{\varepsilon}_i^2\right)$ on a constant and $X_i$, and get $\hat{\delta}$.
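The two-step procedure for the exponential case can be sketched end to end. The variance parameters, the design, and all names are illustrative assumptions; note that the scale of the estimated weights cancels in the second step, so the intercept shift from $E[\log(u_i^2)]$ is harmless:

```python
import numpy as np

# Feasible GLS sketch under exponential heteroskedasticity:
# Step 1 regresses log(e_hat^2) on (1, x) to estimate the variance index;
# Step 2 reweights by 1/sqrt(h_hat) and reruns least squares.
rng = np.random.default_rng(9)
n = 2_000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta = np.array([1.0, 2.0])
sigma2 = np.exp(0.5 + 1.0 * x)                   # Var[eps | x] = exp(X' delta)
y = X @ beta + rng.normal(size=n) * np.sqrt(sigma2)

# Step 1: OLS residuals, then regress log(e_hat^2) on (1, x)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b_ols
gamma = np.linalg.lstsq(X, np.log(e**2), rcond=None)[0]

# Step 2: weight by 1/sqrt(h_hat) and rerun least squares
h_hat = np.exp(X @ gamma)
w = 1.0 / np.sqrt(h_hat)
b_fgls = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
```

With the variance function correctly specified, the FGLS estimate sits close to the true $\beta$ and is asymptotically more efficient than OLS.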
Properties of the GLS Estimator

Note that in general we can write
$$\Omega = \sigma^2 \Sigma.$$

The GLS estimator
$$\hat{\beta}_{GLS} = \left(X' \Sigma^{-1} X\right)^{-1}\left(X' \Sigma^{-1} Y\right)$$
is a linear estimator that satisfies the hypotheses of the classical linear model.

Using the CLT, we can show that
$$\sqrt{N}\left(\hat{\beta}_{GLS} - \beta\right) \stackrel{d}{\to} \mathcal{N}(0, V),$$
where
$$V = \sigma^2 \operatorname{plim}\left(\frac{1}{N} X' \Sigma^{-1} X\right)^{-1}.$$
Properties of the Feasible GLS Estimator

$$\hat{\beta}_{FGLS} = \left(X' \hat{\Sigma}^{-1} X\right)^{-1}\left(X' \hat{\Sigma}^{-1} Y\right).$$

Since $\hat{\Sigma}$ can be a function of $Y$, the estimator is no longer linear in $Y$.

Under general conditions, we can show that FGLS and GLS are asymptotically equivalent:
$$\sqrt{N}\left(\hat{\beta}_{FGLS} - \hat{\beta}_{GLS}\right) \stackrel{p}{\to} 0,$$
so that
$$\sqrt{N}\left(\hat{\beta}_{FGLS} - \beta\right) \stackrel{d}{\to} \mathcal{N}(0, V),$$
where
$$V = \operatorname{plim}\, s^2_{FGLS}\left(\frac{1}{N} X' \hat{\Sigma}^{-1} X\right)^{-1}$$
and
$$s^2_{FGLS} = \frac{1}{N - K}\left(Y - X\hat{\beta}_{FGLS}\right)' \hat{\Sigma}^{-1}\left(Y - X\hat{\beta}_{FGLS}\right).$$
Properties of the Feasible GLS Estimator

Suppose that the parametric form of $\Omega(\cdot)$ is incorrect. In this case, $\sigma_i^2 \ne \sigma^2 h\left(Z_i' \gamma\right)$.

The asymptotic distribution of the FGLS estimator is going to change. In this case,
$$\sqrt{N}\left(\hat{\beta}_{FGLS} - \beta\right) \stackrel{d}{\to} \mathcal{N}\left(0, D^{-1} C D^{-1}\right),$$
where
$$D = \operatorname{plim}\left(\frac{1}{N} X' \hat{\Sigma}^{-1} X\right), \qquad C = \operatorname{plim} \frac{1}{N} X' \hat{\Sigma}^{-1}\, \Omega\, \hat{\Sigma}^{-1} X,$$
for $\Omega = \operatorname{Var}[\varepsilon \mid X]$ and $\hat{\Sigma} = \operatorname{diag}\left(h\left(Z_i' \hat{\gamma}\right)\right)$.
Properties of the Feasible GLS Estimator

How do we get a consistent estimator of $C$?

Suppose that $\sigma_i^2 \ne \sigma^2 h\left(Z_i' \gamma\right)$. In this case, we can consistently estimate $C$ by
$$\hat{C} = \frac{1}{N} X'\, \operatorname{diag}\left(\frac{\hat{\varepsilon}_i^2}{\left[h\left(Z_i' \hat{\gamma}\right)\right]^2}\right) X = \frac{1}{N}\sum_{i=1}^N \frac{\hat{\varepsilon}_i^2}{\left[h\left(Z_i' \hat{\gamma}\right)\right]^2}\, X_i' X_i,$$
where $\hat{\varepsilon}_i = Y_i - X_i \hat{\beta}_{FGLS}$.

Note that this is the same Eicker-Huber-White matrix that we used in the robust variance estimator, taking $\hat{\Sigma} = I$.
References

Amemiya: chapters 3 and 6
Goldberger: chapters 9 and 25
Hayashi: chapter 3
Wooldridge: chapter 4
