Large Sample Results For The Linear Model: Walter Sosa-Escudero

Large Sample Theory
Large Sample Results for

Consistency of S 2
Large Sample Results for the Linear Model

Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009
February 19, 2009
Large Sample Theory

Consistency of S 2
Why Large Sample Theory?
= . Easy, mostly due to

= (X 0 X)1 X 0 Y , then E()
linearity.
What about = g(X, Y ). In many cases, impossible to
derive finite sample properties without being too specific
about g(.).
Large Sample Theory

Consistency of S 2
Consider H0 : = 0 vs. HA : 6= 0. We relied on checking

We had to assume u

= 0, based on the distribution of .
normal and, through linearity, we got the distribution of .
It is equivalent to think about H0 : = 0 vs. HA : 6= 0

for 6= 0 and check whether
= 0. Why? If we choose
cleverly, the distribution of is much easier to get than that

For example, if we set = n, when n , ' N .
of .
Large Sample Theory

Consistency of S 2
Example: X (, 2 ), H0 : = 0 , S 2 = (n 1)1
X normal:
P
2
(Xi X)
0 a
X
Z=p
' tn1
S 2 /n
Reject if |z| > c , with P r(|Z| > c ) = .

X not normal:
0
X
Z=p
' N (0, 1)
S 2 /n
Reject if |z| > c , with P r(|Z| > c ) = .

If X is normal, we use the t distribution. The result holds for any
sample size.
If X is not normal, we use the normal distribution. The result holds
asymptotically.
Large Sample Theory

Consistency of S 2
What do we mean by valid asymptotically ?

How relevant is the previous result in practice, when n is
necesarily finite
Large Sample Theory

Consistency of S 2
X is normal
We should use the t distribution
to compute c .
What if we use the normal?
The approximation works fine.
For n = 30 differences are
negligible.
Large Sample Theory

Consistency of S 2
X is not-normal
Here we do not have
information to
compute c .
What if we use the
normal?
Uniform case: the
approximation works
well!
Large Sample Theory

Consistency of S 2
Large sample approximations work well in many cases, even

far from infinity.
In most cases it is much easier to derive the asymptotic
approximations instead of the finite sample result.
Large Sample Theory

Consistency of S 2
Basic Concepts
Convergence in probability:
A sequence {zn } of random scalars converges in probability to a
non-random constant a if for every > 0:
lim P r [|zn a| > ] = 0
We use the notation zn a, or plim zn = a

It extendes naturally to sequences of random vectors or
matrices, requiring element-by-element convergence.
Large Sample Theory

Consistency of S 2
Almost sure convergence:

A sequence {zn } of random scalars converges almost surely to a
constant a if

P r lim zn = = 1
n
a.s.
We use the notation zn a, or plim zn = a

Almost sure convergence implies convergence in probability.
Large Sample Theory

Consistency of S 2
Convergence in Distribution A sequence {zn } of random scalars

with distribution cumulative distribution Fn converges in
distribution to a random scalar z with distribution F
lim Fn = F
at every continuity point of F .

F is the limiting distribution of {zn }.
d
Notation: zn z.
Large Sample Theory

Consistency of S 2
Law of Large Numbers

Let zn be a secuence of random scalars, and construct a new
sequence
Pn
zi
zn = i=1
n
Kolmogorovs Strong LLN: {zn } and i.i.d. secuence of random

a.s.
scalars with E(zi ) = (exists and is finite). Then zn .
Large Sample Theory

Consistency of S 2
Central Limit Theorem
Lindeberg-Levy CLT: {zn } and i.i.d. secuence of random scalars

with E(zi ) = and V (zi ) = 2 (both exist and are finite). Then
zn d
n
N (0, 1)
Careful, the theorem does not refer to zn but to a

transformation involving n.
Large Sample Theory

Consistency of S 2
Useful Results
Continuity: {zn } a sequence of random vectors, z a random

vector, a a vector of constants. Let g(.) be a vector valued
continuous function that does not depend on n
p
a.s.
a.s.
zn a g(zn ) g(a), and zn a g(zn ) g(a)

zn z g(zn ) g(z)
p
Product rule: zn 0 and xn x, then zn xn 0
Slutzkys Theorem:
d
xn x and yn , then xn + yn x + .
p
d
d
xn x, An A, then An xn Ax, provided An and xn are
conformable.
Large Sample Theory

Consistency of S 2
Cramer Wold Device (restricted): xn a sequence of K 1

d
random vectors. If for any vector <K , 0 xn N then xn

converges to a multivariate normal random variable.
p
Asymptotic equivalence: If (xn yn ) 0 and yn Z, then

d
xn Z. We will say that xn and yn are asymptotically

equivalent.
{xn , un }, i.i.d., then {xi x0i } and {xi ui } are iid.
Large Sample Theory

Consistency of S 2
Estimators as sequences of random variables

An estimator n is any function of the sample. The sequence
{n }n refers to the sequence formed by increasing the sample size
progressively.
Consistency: n is consistent for 0 if n 0 (pis weak and
a.s. strong consistency.
Asymptotic normality: n is asymptotically normal if

d
n(n 0 ) N (0, ).
is the asymptotic variance. Careful with this!
Such estimators are usually called n-consistent.
Large Sample Theory

Consistency of S 2
Large Sample Properties of OLS estimators

Aside: some new notation
Let us write our linear model
yi = 1 x1i + 2 x2i + + K xKi + ui ,
i = 1, . . . , n
as: yi = x0i + ui
where xi is an K 1 vector (x1i , x2i , . . . , xKi )0 .
Note that xi x0i is a K K matrix. It is easy to check
P
P
X 0 X = ni=1 xi x0i , X 0 Y = ni=1 xi yi
P
P
n = ( n xi x0 )1 ( n xi yi )
i=1
i=1
n refers to the sequence of OLS estimators formed by increasing

the sample size.
Large Sample Theory

Consistency of S 2
The Model
Assumptions for asymptotic analysis

1
Linearity: yi = x0i 0 + ui
Random sample: {xi , ui } is a jointly i.i.d. process.
Predeterminedness: E(xik ui ) = 0 for all i and k = 1, . . . , K.
Rank condition: x E(xi x0i ) finite positive definite.
V (xi ui ) = S finite positive definite.
i = 1, . . . , n.
Large Sample Theory

Consistency of S 2
The results
p
Consistency: n 0 .
Asymptotic normality:
p

1
n n 0 N (0, 1
x Sx )
Plan: first prove results, then we discuss the assumptions.
Large Sample Theory

Consistency of S 2
Consistency
n = 0 + (X 0 X)1 X 0 u
0 1 0
XX
Xu
= 0 +
n
n
We will show:
p
n1 X 0 u 0
0 1
X X
does not explode
n
This argument will appear several times in this course!
Large Sample Theory

Consistency of S 2
The hth element of
X0u
n
is
n
n
1 X
1X
xhi ui =
zi ,
n
n
i=1
zi xhi ui
i=1
with E(zi ) = 0, so by Kolmogorovs LLN

n
1X
zi =
n
i=1
hence
Pn
i=1 xhi ui
n

X 0u
n
E(xhi ui ) = 0
Large Sample Theory

Consistency of S 2
Pn
x x
The (h, j) element of XnX is i=1n hi ji . Since E(xhi xji ) = x,hj ,

which exists and is finte, by the LLN (element-by-element)
X 0X p
x
n
Since x is invertible and by continuity:

X 0X
n
1
1
x
Summarizing:
n = 0 +
X0X
n
1
1
x
X0u
n
p
Hence, by the product rule: n 0
Large Sample Theory

Consistency of S 2
Asymptotic normality
The starting point is now:
X 0 X 1 X 0 u

n n 0 =
n
n
We have already shown:

X 0X
n
1
1
x
Now the goal is to find the asymptotic distribution of

use our continuity results.
0
X
u
n
and
Large Sample Theory

Consistency of S 2
X 0u

=
X 0u
n
n
is a vector of K VAs. By the Cramer-Wold Device, we will find

the distribution of:

0 X 0u
nc
n
for every vector c <K . Note:
P
P 0
P

0 X 0u
0 xi ui
c xi ui
zi
nc
= nc
= n
n
n
n
n
n
with zi c0 xi ui , a scalar random variable
Large Sample Theory

Consistency of S 2
It is easy to check (do it as exersise) that the assumptions imply:

E(zi ) = 0
V (zi ) = c0 Sc < .
Then, by the Lindeberg-Levy CLT applied to n

z:
0
d
0 X u
n (
z 0) = c
N (0, c0 Sc)
n
hence by the Cramer-Wald device:
0
Xu d
N (0, S)
n
Large Sample Theory

Consistency of S 2
Then:

0 1

X X
n n 0 =
n
p
1
x
0
X
u
n
N (0, S)
By Slutzkys theorem and by the linearity property of multivariate

normal variables:
p

1
n n 0 N (0, 1
x Sx )
Large Sample Theory

Consistency of S 2
A useful simplification
If we further assume E(u2i |xi ) = 02 , then using LIE:
S = V (xi ui ) = E(xi ui ui x0i ) = 02 E(xi x0i ) = 02 x
So the asymptotic variance of n reduces to
1
2 1
1
2 1
1
x Sx = 0 x x x = 0 x
Large Sample Theory

Consistency of S 2
Comments on the assumptions
Predeterminedness vs. Mean independence: E(xik ui ) = 0 is

an orthogonality assumption. If there is a constant in the
model, E(ui ) = 0 and this implies Cov(xi , ui ) = 0.
E(ui |xik ) = 0 is a stronger assumption, since it implies that
ui is uncorrelated to any (measurable) function of xi .
Predeterminedness vs. Strict Exogeneity: our old assumption

was E(u|X) = 0, which implies E(ui Xi ) = 0. We are
requiring a bit less than we did to show unbiasedness (careful).
Example: Yt = 1 + 2 Yt1 + ut , t = 1, 2, ... E(ut Yt1 ) = 0
(consistency), but E(u|Y1 ) 6= 0 (why?). The OLS estimator will
be consistent but biased!
Large Sample Theory

Consistency of S 2
Rank condition as no multicollinearity in the limit: this

guarantees the invertibility of n1 X 0 X in the limit.
Rank condition requires variability in X The rank condition

implies
1 0
p
X X E(xi x0i ), finite pd.
n
Consider the following model:
1
yi = 0 + 1 + ui , i = 1, . . . , n
i
Here the explanatory variable provides less information as
n .
Large Sample Theory

Consistency of S 2
We are not ruling out conditional heteroskedasticity!: the iid

assumption implies E(u2i ) is constant (unconditional
homoskedasticity), but the error term can be conditionally
heteroskedastic.
The intercept. Note that

Cov(Xk , ui ) = E(Xki ui ) E(Xki )E(ui )
If there is a constant in the model, E(X1i ui ) = E(ui ) = 0.
Then, this joint with predeterminedness implies no
contemporaneous correlation between explanatory variables
and the error term.
Do not confuse consistency with unbiasedness.
Large Sample Theory

Consistency of S 2
Extensions
The iid case: No fixed regressors, heteroskedasticity,

dependences.
Heterogeneity: moment restrictions (Markov, Lyapunov CLT).
Dependences: martingale difference, ergodic theorem, etc.
Large Sample Theory

Consistency of S 2
Consistency of S 2
p
Result: Under Assumptions 1-4, s2 E(u2i ), provided E(u2i )

exists and is finite.
Proof:
s2 =
e0 e
n
e0 e
n
u0 M u
=
=
nK
nK n
nK
n
Now
u0 M u = u0 (I X(X 0 X)1 X 0 )u
Ill let you finish the proof in the homework.

Large Sample Results For The Linear Model: Walter Sosa-Escudero

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Large Sample Results For The Linear Model: Walter Sosa-Escudero

Uploaded by

Copyright:

Available Formats

Large Sample Theory