
ASYMPTOTIC THEORY

Convergence of Deterministic Sequences

A sequence of non-random numbers {aN : N = 1, 2, · · · } converges to its limit a iff ∀ε > 0, ∃Nε such that ∀N > Nε, |aN − a| < ε.

Written as: aN → a as N → ∞.

E.g. If aN = 2 + 1/N, then lim_{N→∞} aN = 2. Thus aN → 2 as N → ∞.
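As a quick numerical check of this ε-Nε definition (a minimal sketch in Python; the threshold Nε = ⌈1/ε⌉ follows from |aN − 2| = 1/N):

```python
import math

# For a_N = 2 + 1/N we have |a_N - 2| = 1/N, which is < eps
# whenever N > 1/eps, so N_eps = ceil(1/eps) is a valid threshold.
for eps in [0.1, 0.01, 0.001]:
    N_eps = math.ceil(1 / eps)
    a_N = 2 + 1 / (N_eps + 1)              # first term beyond the threshold
    print(eps, N_eps, abs(a_N - 2) < eps)  # prints True each time
```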

A sequence {aN : N = 1, 2, · · · } is bounded iff ∃b < ∞ such that |aN| ≤ b ∀N = 1, 2, · · · . Otherwise {aN} is unbounded.

E.g. If aN = (−1)^N, then {aN} does not have a limit (is not convergent), but is bounded by b = 1.
If aN = N^(1/4), {aN} is not bounded. In fact aN → ∞ as N → ∞.

A sequence {aN} is O(N^λ) (i.e., at most of order N^λ) iff {N^(−λ) aN} is bounded.
As a special case, when λ = 0, {aN} is bounded, and then {aN} is O(1).

E.g., if aN = 10 + √N, then {aN} = O(N^(1/2)), as {N^(−1/2) aN} = {10/√N + 1} is bounded by 11.

A sequence {aN} is o(N^λ) iff N^(−λ) aN → 0 as N → ∞.

As a special case, when λ = 0, aN → 0 as N → ∞, and then {aN} is o(1).
E.g. if aN = log N, then {aN} = o(N^λ) ∀λ > 1 (indeed for any λ > 0), since N^(−λ) log N = (log N)/N^λ.

The numerator increases at a decreasing rate while the denominator increases at an increasing rate as N → ∞. So the ratio decreases as N → ∞ and converges to 0 in the limit.
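A quick numerical illustration (a sketch in Python with numpy) of the ratio (log N)/N^λ shrinking toward 0, here with λ = 1/2:

```python
import numpy as np

# (log N)/N^lambda -> 0 as N grows, illustrating log N = o(N^lambda).
lam = 0.5
for N in [10, 10**3, 10**6, 10**9]:
    print(N, np.log(N) / N**lam)
# prints 0.728..., 0.218..., 0.0138..., 0.000655..., decreasing toward 0
```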
Convergence in Probability

Random sequences: Perhaps the best example of a sequence of random variables is the sequence of a particular sample statistic (say, x̄ = (1/n) Σ_{i=1}^n xi) computed for different sample sizes.

A sequence of random variables {xN : N = 1, 2, · · · } converges in probability to the constant a iff ∀ε > 0 and ∀η > 0, ∃Nε,η such that P[|xN − a| > ε] < η whenever N > Nε,η.
It means that the probability that a term sufficiently high up in the sequence deviates from the constant “a” by any finite magnitude is infinitesimally small.

Written as: P[|xN − a| > ε] → 0 as N → ∞, or
xN →p a (xN converges in probability to a), or
plim xN = a (a is the probability limit of xN).

As a special case, when a = 0, {xN} is op(1), i.e., xN →p 0.
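A Monte Carlo sketch (Python with numpy; the exponential distribution with mean 1 is an arbitrary illustrative choice) showing P[|x̄N − µ| > ε] falling toward 0 as N grows:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, eps, reps = 1.0, 0.1, 1000

# Estimate P[|xbar_N - mu| > eps] by simulation for increasing N.
for N in [10, 100, 1000, 10000]:
    xbar = rng.exponential(mu, size=(reps, N)).mean(axis=1)
    print(N, np.mean(np.abs(xbar - mu) > eps))
# The estimated probability shrinks toward 0: xbar_N ->p mu.
```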

A sequence of random variables {xN} is (eventually) bounded in probability iff ∀ε > 0, ∃bε and an integer Nε such that P[|xN| ≥ bε] < ε ∀N > Nε.
It means that the probability that a term sufficiently high up in the sequence lies beyond the range [−bε, bε] can be made arbitrarily small.

When we say a sequence is bounded in probability, we shall mean eventual boundedness.

We say {xN} is Op(1) when {xN} is bounded in probability.


The way to visualize a sequence of random variables is as a sequence of probability densities, one associated with each term.
The notion of convergence in probability then states that the densities associated with terms higher and higher up in the sequence are more and more concentrated around the constant “a” in the domain of x. In the limit, the density is degenerate at “a”.
(Eventual) boundedness in probability suggests that eventually (i.e. except for a finite number of terms in the sequence), for any pre-specified tail-probability the sequence will be contained within the range [−b, b].
The tail-probability is the probability mass lying outside [−b, b] (the shaded tail regions of the density). It can be made as small as desired (indistinguishable from 0 in the limit) and yet we can obtain the bounds.

A sequence that is unbounded in probability means that the probability of an extreme event (far out in the tails) does not vanish, however high up the term may be in the sequence.

The above notions of op(1) and Op(1) may be generalized in tandem with the deterministic cases.

A random sequence {xN : N = 1, 2, · · · } is op(N^δ) for δ ∈ ℝ iff N^(−δ) xN →p 0, i.e., ∀ε > 0 and ∀η > 0, P[|N^(−δ) xN − 0| > ε] < η whenever N > Nε,η.
As a special case of this we get xN →p 0 when δ = 0.
A random sequence {xN : N = 1, 2, · · · } is Op(N^δ) iff ∀ε > 0, ∃bε and Nε such that P[|N^(−δ) xN| ≥ bε] < ε ∀N > Nε; i.e., {N^(−δ) xN} is bounded in probability.
As a special case we get that {xN} is bounded in probability when δ = 0.

A random sequence {xN : N = 1, 2, · · · } is op(aN), where {aN} is a positive non-random sequence, iff xN/aN = op(1); we then write xN = op(aN).

A random sequence {xN : N = 1, 2, · · · } is Op(aN), where {aN} is a positive non-random sequence, iff xN/aN = Op(1); we then write xN = Op(aN).

Now we state some results without proving them.

LEMMA 1: If xN →p a, then xN = Op(1). So if a random sequence converges in probability to any real number, then the sequence is bounded in probability.
This result also holds for a vector xN or a matrix XN.

LEMMA 2:
i) op(1) + op(1) = op(1)
ii) Op(1) + Op(1) = Op(1)
iii) Op(1) · Op(1) = Op(1)
iv) op(1) · Op(1) = op(1)

Since a sequence that is op(1) is also Op(1) (by Lemma 1, with a = 0), we also have op(1) + Op(1) = Op(1) [from (ii)] and op(1) · op(1) = op(1) [from (iv)].

The notions of boundedness and convergence in probability of random sequences extend to vectors and matrices as well.

Let {xN} be a sequence of random K × 1 vectors. Then xN →p a, where a is a K × 1 constant vector, iff xNj →p aj for j = 1, · · · , K.

This in turn is equivalent to ||xN − a|| →p 0, where ||b|| = (b′b)^(1/2) is the Euclidean norm of the vector b.

Let {ZN} be a sequence of random M × K matrices. Then ZN →p B, where B is an M × K constant matrix, iff ||ZN − B|| →p 0, where ||A|| = [tr(A′A)]^(1/2).

LEMMA 3: Let {ZN : N = 1, 2, · · · } be a sequence of J × K random matrices such that ZN = op(1), and let {xN : N = 1, 2, · · · } be a sequence of random J × 1 vectors such that {xN} = Op(1). Then Z′N xN = op(1).
This result follows from the above definitions and parts (i) and (iv) of Lemma 2.
LEMMA 4 (Slutsky’s Theorem): Let g : ℝ^K → ℝ^J be a vector-valued function continuous at some point c ∈ ℝ^K. Let {xN : N = 1, 2, · · · } be a sequence of K × 1 random vectors such that xN →p c. Then,
g(xN) →p g(c) as N → ∞.

In other words, plim g(xN) = g(plim xN), if g(·) is continuous at plim xN.

This theorem gives the plim operator its advantage over the E(·) operator [in general E g(x) ≠ g(E x)], and this is why finite-sample analysis of many estimators is difficult whereas we can still characterize their limiting properties.
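A simulation sketch (Python with numpy; the choice g(x) = 1/x and exponential draws with mean 2 are arbitrary illustrations) of plim g(x̄N) = g(plim x̄N):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0  # plim of the sample mean xbar_N

# g(x) = 1/x is continuous at mu = 2, so by Slutsky's Theorem
# g(xbar_N) ->p g(mu) = 0.5 (even though E[1/xbar_N] != 1/E[xbar_N]).
for N in [10, 100, 10000]:
    xbar = rng.exponential(mu, size=(1000, N)).mean(axis=1)
    g = 1.0 / xbar
    print(N, np.mean(np.abs(g - 0.5) > 0.05))
# The exceedance probability falls toward 0 as N grows.
```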
A Digression – Probability Space:

Elements of the structure

• A non-empty set Ω of possible outcomes (the sample space),

• a family ℱ of subsets of Ω representing possible events (the event space), and

• a real-valued function P(·) on ℱ such that ∀E ∈ ℱ, P(E) is interpreted as the probability of event E.
E.g. Tossing two coins simultaneously → a random experiment:

→ Ω = {(H, H), (H, T), (T, H), (T, T)}

→ ℱ is a family of subsets of Ω. In this case the power set of Ω can serve as ℱ. Events may be identified as follows:
(i) The event of getting at least one head, A = {(H, H), (H, T), (T, H)}
(ii) The event of getting exactly one head, B = {(H, T), (T, H)}
→ If the coins are fair then for each ω ∈ Ω, P({ω}) = 1/4.
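A small sketch (plain Python) of this two-coin probability space, recovering P(A) = 3/4 and P(B) = 1/2 by summing the probabilities of the equally likely outcomes:

```python
from itertools import product

# Sample space for two fair coins; each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
P = {w: 1 / len(omega) for w in omega}

A = [w for w in omega if "H" in w]           # at least one head
B = [w for w in omega if w.count("H") == 1]  # exactly one head
print(sum(P[w] for w in A))  # 0.75
print(sum(P[w] for w in B))  # 0.5
```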

In general, a structure (Ω, ℱ, P(·)) is a probability space iff ∀A and B ∈ ℱ,
A1. ℱ is a suitable algebra of sets on Ω.
[For a finite probability space, ℱ is a Boolean algebra, i.e., closed under complementation and finite unions and intersections. For a countable probability space, ℱ is a σ-algebra, i.e., closed under complementation and countable unions and intersections. For an uncountable probability space, ℱ is the σ-algebra generated by a family of subsets of Ω.]
A2. P(A) ≥ 0 ∀A ∈ ℱ
A3. P(Ω) = 1
A4. If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B).

Thus, (Ω, ℱ) is a measurable space and P(·), a suitable measure on (Ω, ℱ) satisfying the four axioms, is a mapping from ℱ to ℝ such that (for countable Ω) if A ∈ ℱ, then P(A) = Σ_{ω∈A} P({ω}). Then (Ω, ℱ, P(·)) is a probability space.

A random variable is a (measurable) mapping from Ω to ℝ.

Let (Ω, ℱ, P(·)) be a probability space. A sequence of events {ΩN : N = 1, 2, · · · } ∈ ℱ is said to occur with probability approaching one (w.p.a.1) iff P(ΩN) → 1 as N → ∞. [This is weaker than occurring “almost surely”, which requires probability exactly one.]

The complement of ΩN, viz. Ω^c_N, can occur for every N, but its chance of occurrence goes to 0 as N → ∞.

COROLLARY 1: Let {ZN : N = 1, 2, · · · } be a sequence of random K × K matrices and let A be a non-random, invertible K × K matrix. If ZN →p A, then
1. ZN⁻¹ exists w.p.a.1
2. ZN⁻¹ →p A⁻¹, or plim ZN⁻¹ = A⁻¹ (in an appropriate sense).

Proof: The determinant is a continuous function on the space of all square matrices. Hence det(ZN) →p det(A) [by Lemma 4].

But det(A) ≠ 0 as A is non-singular. Hence P[det(ZN) ≠ 0] → 1 as N → ∞. [Part 1 proved]

How do we define ZN⁻¹ when ZN may be singular?

Let ΩN be the set of outcomes ω such that ZN(ω) is non-singular. By part 1, P(ΩN) → 1 as N → ∞.

Define a new sequence of matrices by

Z̃N(ω) ≡ ZN(ω) if ω ∈ ΩN,
Z̃N(ω) ≡ IK if ω ∉ ΩN.

Then P(Z̃N = ZN) = P(ΩN) → 1 as N → ∞. Now since ZN →p A, hence Z̃N →p A (by transitivity).
The inverse function is continuous on the space of invertible matrices, so Z̃N⁻¹ →p A⁻¹ [by Lemma 4]. This implies ZN⁻¹ →p A⁻¹, as the fact that ZN can be singular with vanishing probability does not affect the analysis.
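A simulation sketch (Python with numpy; the matrix A and the Op(N^(−1/2)) noise are arbitrary illustrative choices) of ZN⁻¹ converging to A⁻¹:

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[2.0, 1.0], [1.0, 3.0]])  # non-random, invertible
A_inv = np.linalg.inv(A)

# Z_N = A + noise of size O_p(N^(-1/2)), so Z_N ->p A; by Corollary 1,
# Z_N is invertible w.p.a.1 and inv(Z_N) ->p inv(A).
for N in [10, 1000, 100000]:
    Z_N = A + rng.standard_normal((2, 2)) / np.sqrt(N)
    print(N, np.linalg.norm(np.linalg.inv(Z_N) - A_inv))
# The Frobenius-norm error shrinks toward 0.
```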
Convergence in Distribution:

A sequence of random variables {xN : N = 1, 2, · · · } converges in distribution to the continuous random variable x iff FN(ξ) → F(ξ) as N → ∞ ∀ξ ∈ ℝ, where FN is the c.d.f. of xN and F is the (continuous) c.d.f. of x.

Written as: xN →d x.
When x ∼ N(µ, σ²), we say xN →d N(µ, σ²), or xN ∼a N(µ, σ²) [xN is asymptotically normal].

E.g. xN ≡ (SN − Np)/[Np(1 − p)]^(1/2) ∼a N(0, 1), where SN ∼ bin(N, p).

So xN is not required to be continuous for any finite N, but the limiting distribution is continuous.
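A simulation sketch (Python with numpy; p = 0.3 and the evaluation points are arbitrary choices) of the c.d.f. of the standardized binomial approaching the N(0, 1) c.d.f.:

```python
import math
import numpy as np

rng = np.random.default_rng(3)
p = 0.3

def phi(z):  # standard normal c.d.f.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# x_N = (S_N - N p)/sqrt(N p (1 - p)) is discrete for every finite N,
# yet its c.d.f. approaches the continuous N(0,1) c.d.f.
for N in [10, 100, 10000]:
    S_N = rng.binomial(N, p, size=20000)
    x_N = (S_N - N * p) / math.sqrt(N * p * (1 - p))
    print(N, max(abs(np.mean(x_N <= z) - phi(z)) for z in (-1.0, 0.0, 1.0)))
# The largest c.d.f. discrepancy at these points shrinks with N.
```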

A sequence of K × 1 random vectors {xN : N = 1, 2, · · · } converges in distribution to the continuous random vector x iff for every K × 1 non-random vector c with c′c = 1, c′xN →d c′x.

Written as: xN →d x.

E.g.: When x ∼ N(m, V) (i.e. c′x ∼ N(c′m, c′Vc)), then c′xN →d N(c′m, c′Vc) for all c such that c′c = 1 ⇒ xN →d N(m, V).

LEMMA 5: If xN →d x, where x is any K × 1 random vector, then xN = Op(1).
We already know from Lemma 1 a sufficient condition for a random sequence to be bounded in probability. This lemma provides us with yet another sufficient condition for the same.

LEMMA 6: Let {xN} be a sequence of K × 1 random vectors such that xN →d x. If g : ℝ^K → ℝ^J is a continuous function, then g(xN) →d g(x).

This is called the Continuous Mapping Theorem. It is the counterpart of Slutsky’s Theorem for convergence in distribution.

It is extremely useful for finding the asymptotic distribution of a test statistic once the limiting distribution of an estimator is known.
COROLLARY 2: If {zN} is a sequence of K × 1 random vectors such that zN →d N(0, V), then
1. For any K × M non-random matrix A, A′zN →d N(0, A′VA)
2. z′N V⁻¹zN →d χ²_K

The first result is intuitive from the knowledge that A′zN is a linear combination of the elements of zN. As zN →d N(·), hence A′zN →d N(·), the linear transformation being a continuous mapping.

Also z′N V⁻¹zN = (V^(−1/2)zN)′(V^(−1/2)zN). As zN →d N(0, V), hence V^(−1/2)zN →d N(0, IK) by the above logic. Thus (V^(−1/2)zN)′(V^(−1/2)zN) ∼a χ²_K, being asymptotically a sum of squares of K independent standard normal variates.
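A simulation sketch (Python with numpy; the positive definite V and the centered-exponential wi are arbitrary choices) checking that the quadratic form z′N V⁻¹zN has approximately the χ²_2 moments when zN is a CLT-standardized sum:

```python
import numpy as np

rng = np.random.default_rng(4)
K, N, reps = 2, 1000, 2000
V = np.array([[1.0, 0.5], [0.5, 2.0]])  # positive definite var(w_i)
V_inv = np.linalg.inv(V)

# w_i iid with mean 0 and variance V (centered exponentials mixed by the
# Cholesky factor L); z_N = N^(-1/2) sum_i w_i ->d N(0, V) by the CLT,
# so q = z_N' V^{-1} z_N ->d chi-squared with K degrees of freedom.
L = np.linalg.cholesky(V)
w = (rng.exponential(1.0, size=(reps, N, K)) - 1.0) @ L.T
z_N = w.sum(axis=1) / np.sqrt(N)
q = np.einsum('ri,ij,rj->r', z_N, V_inv, z_N)
print(q.mean(), q.var())  # approx K = 2 and 2K = 4, the chi2_K moments
```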
LEMMA 7: Let {xN} and {zN} be sequences of K × 1 random vectors. If zN →d z and xN − zN →p 0, then xN →d z.

This is the asymptotic equivalence lemma. It is extremely useful and frequently used in asymptotic analysis.
Limit Theorems for Random Samples

THEOREM 1: Let {wi : i = 1, 2, · · · } be a sequence of iid G × 1 random vectors such that E(|wig|) < ∞, g = 1, · · · , G. Then the sequence satisfies the Weak Law of Large Numbers (WLLN):
N⁻¹ Σ_{i=1}^N wi →p µw, where µw ≡ E(wi).

This is the familiar result that plim x̄ = µ.
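A sketch (Python with numpy; the G = 2 normal components and their means are arbitrary choices) of the running vector mean along one growing sample settling at µw:

```python
import numpy as np

rng = np.random.default_rng(5)
mu_w = np.array([1.0, -2.0])

# One long iid sample of G = 2 vectors; the running means
# N^{-1} sum_i w_i settle at mu_w componentwise (WLLN).
w = mu_w + rng.standard_normal((10**6, 2)) * np.array([1.0, 3.0])
for N in [100, 10**4, 10**6]:
    print(N, w[:N].mean(axis=0))  # approaches [1.0, -2.0]
```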

THEOREM 2 (Lindeberg-Levy): Let {wi : i = 1, 2, · · · } be a sequence of iid G × 1 random vectors such that E(wig²) < ∞, g = 1, · · · , G, and E(wi) = 0. Then {wi : i = 1, 2, · · · } satisfies the Central Limit Theorem (CLT):
N^(−1/2) Σ_{i=1}^N wi →d N(0, B), where B = var(wi) = E(wi w′i) is positive (semi-)definite.
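A sketch (Python with numpy; the skewed two-component wi with B = I₂ is an arbitrary choice) of N^(−1/2) Σ wi acquiring the N(0, B) shape:

```python
import numpy as np

rng = np.random.default_rng(6)
N, reps = 2000, 2000

# iid w_i with E(w_i) = 0: two independent centered exponential
# components, so B = var(w_i) = I_2 despite the skewness of w_i.
w = rng.exponential(1.0, size=(reps, N, 2)) - 1.0
s = w.sum(axis=1) / np.sqrt(N)   # N^(-1/2) sum_i w_i

print(np.cov(s.T))               # approx B = I_2
print(np.mean(s[:, 0] <= 1.0))   # approx Phi(1) = 0.8413
```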
