
Chapter 7: Introduction

Suppose X1, X2, ..., Xn is a random sample on a random variable X which has a N(µ, σ²) distribution. Denote the sample mean by X̄_n = (1/n) Σ_{i=1}^n X_i. Then it is well known that X̄_n ∼ N(µ, σ²/n). What if X does not have a normal distribution?
Suppose X1, X2, ..., Xn is a random sample on a random variable X which has an EXP(θ) distribution. Denote the sample sum by nX̄_n = Σ_{i=1}^n X_i. Then it is well known that nX̄_n ∼ GAM(θ, n). What if X does not have an exponential distribution?
Suppose X1, X2, ..., Xn is a random sample on a random variable X which has a distribution defined by the pdf f_X(x). Further, suppose that t_n = t(x1, ..., xn) is a function of x1, ..., xn such that T_n = t(X1, ..., Xn) is a random variable. Several special forms of T_n are the sample mean T_n = X̄_n = (1/n) Σ_{i=1}^n X_i, the sample variance T_n = S_n² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)², the smallest order statistic T_n = X_(1), the largest order statistic T_n = X_(n), and so on. These random variables play a key role in obtaining exact procedures for estimation, confidence intervals, and tests of unknown parameters of the distribution.
In some cases the pdf of Tn is obtained easily, but there are many important cases
where the derivation is not tractable.
In many of these, it is possible to obtain useful approximate results that apply when n
is large. These results are based on the notions of convergence in distribution and limiting
distribution.

Chapter 7: Sequences of Random Variables

Consider a sequence of random variables Y1, Y2, ... with a corresponding sequence of CDFs G1(y), G2(y), ..., so that for each n = 1, 2, ...,

G_n(y) = P[Y_n ≤ y].

1 Convergence in distribution

If the CDF of Y_n is G_n(y) for each n = 1, 2, ..., and if for some CDF G_Y(y) of a random variable Y,

lim_{n→∞} G_n(y) = G_Y(y)

for all values y at which G_Y(y) is continuous, then the sequence Y1, Y2, ... is said to converge in distribution to Y, denoted by Y_n →d Y. The distribution corresponding to the CDF G_Y(y) is called the limiting distribution of Y_n.
Example 7.2.1 of the book Let X1, ..., Xn be a random sample from a uniform distribution, X ∼ UNIF(0, 1). Then

f_X(x) = 1, 0 < x < 1,
and zero otherwise, and

F_X(x) = 0 if x ≤ 0, x if 0 < x < 1, and 1 if x ≥ 1.
Further, let Y_n = X_{n:n}, the largest order statistic. Then it follows that the CDF G_n(y) of Y_n is

G_n(y) = [F_X(y)]^n = y^n, 0 < y < 1,

and zero if y ≤ 0 and one if y ≥ 1. Of course, when 0 < y < 1, y^n approaches 0 as n approaches ∞, and when y ≤ 0 or y ≥ 1, G_n(y) is a sequence of constants, with respective limits 0 and 1. Thus,

lim_{n→∞} G_n(y) = 0 if y < 1, and 1 if y ≥ 1.

The degenerate random variable


A random variable X is degenerate if, for some constant µ, P(X = µ) = 1. The CDF of X is given by

F_X(x) = 0 if x < µ, and 1 if x ≥ µ.

The moment generating function of X is

M_X(t) = exp(µt),

and

Var(X) = 0.

Now, let Y be a degenerate random variable with P(Y = 1) = 1. Then

G_Y(y) = 0 if y < 1, and 1 if y ≥ 1.

Now, one can check that

lim_{n→∞} G_n(y) = G_Y(y)

for all values y at which G_Y(y) is continuous. This situation is illustrated in Figure 1 (Figure 7.1 in the book), which shows G_n(y) and G_Y(y) for n = 2, 5, and 10.
Figure 1: Comparison of CDFs G n (y) with limiting degenerate CDF GY (y)

Thus, Y_n = X_{n:n} →d Y, where the random variable Y has a degenerate distribution with P[Y = 1] = 1. In other words, the nth order statistic from a uniform (0, 1) distribution converges to a degenerate random variable; that is, the nth order statistic from a uniform (0, 1) distribution has a limiting distribution which degenerates at 1.
Example 7.2.2 of the book Let X1, ..., Xn be a random sample from an exponential distribution, X ∼ EXP(θ). Then

f_X(x) = (1/θ) exp(−x/θ), x > 0, θ > 0,

and zero otherwise, and

F_X(x) = 0 if x ≤ 0, and 1 − exp(−x/θ) if x > 0.

Further, let Y_n = X_{1:n}, the smallest order statistic. Then it follows that the CDF G_n(y) of Y_n is

G_n(y) = 1 − [1 − F_X(y)]^n = 1 − exp(−ny/θ), y > 0,

and zero if y ≤ 0. We have lim_{n→∞} G_n(y) = 1 if y > 0, because 0 < exp(−y/θ) < 1 in this case. Also, notice that the limit at y = 0 is zero, since G_n(0) = 0 for every n. Thus,

lim_{n→∞} G_n(y) = 0 if y ≤ 0, and 1 if y > 0.
Observe that the limiting function is not only discontinuous at y = 0 but also not even continuous from the right at y = 0, and right-continuity is a requirement of a CDF. Now, define the CDF of a degenerate random variable Y as

G_Y(y) = 0 if y < 0, and 1 if y ≥ 0.

Now, note that the limiting function lim_{n→∞} G_n(y) and G_Y(y) are equal except at the one point y = 0, but this is not a problem, because the definition of convergence in distribution requires only that the limiting function agree with a CDF at its points of continuity, and y = 0 is the point of discontinuity of G_Y(y). Thus, X_{1:n} →d Y, where the random variable Y has a degenerate distribution with P[Y = 0] = 1. That is, the first order statistic from an exponential distribution converges to a degenerate random variable.
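A quick simulation along the same lines (again a sketch assuming NumPy; θ = 2 and the cutoff 0.1 are arbitrary illustrative values) shows the smallest order statistic collapsing to 0:

    import numpy as np

    rng = np.random.default_rng(0)
    theta = 2.0
    for n in [5, 50, 500]:
        minima = rng.exponential(theta, size=(100_000, n)).min(axis=1)
        # empirical P[X_{1:n} <= 0.1] versus G_n(0.1) = 1 - exp(-n(0.1)/theta);
        # both approach 1 for any fixed positive cutoff
        print(n, (minima <= 0.1).mean(), 1 - np.exp(-n * 0.1 / theta))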

2 Stochastic convergence

A sequence of random variables Y1, Y2, ... is said to converge stochastically to a constant c if it has a limiting distribution that is degenerate at c.

3 Non-degenerate limiting distributions

In the earlier examples the limiting distributions were degenerate. But not all limiting distributions are degenerate, as seen in the next example. The following limits are useful in many problems:

lim_{n→∞} (1 + c/n)^{nb} = exp(cb),

lim_{n→∞} (1 + c/n + d(n)/n)^{nb} = exp(cb) if lim_{n→∞} d(n) = 0.

The Pareto distribution

A random variable X is said to have a Pareto distribution with parameters θ and κ, denoted by X ∼ PAR(θ, κ), if its density is given by

f_X(x; θ, κ) = κ / [θ(1 + x/θ)^{κ+1}], x > 0, θ > 0, κ > 0.

The CDF is given by

F_X(x; θ, κ) = 1 − (1 + x/θ)^{−κ}, x > 0.

Example 7.2.3 of the book Let X1, ..., Xn be a random sample from a Pareto distribution, X ∼ PAR(1, 1), and let Y_n = nX_{1:n}. The CDF of X is

F_X(x) = 1 − (1 + x)^{−1}, x > 0,

so the CDF of Y_n is

G_n(y) = P[Y_n ≤ y]
       = P[nX_{1:n} ≤ y]
       = P[X_{1:n} ≤ y/n]
       = 1 − [1 − F_X(y/n)]^n
       = 1 − (1 + y/n)^{−n}, y > 0.

Now, taking the limit of G_n(y) as n → ∞, for y > 0, we get

lim_{n→∞} G_n(y) = 1 − lim_{n→∞} (1 + y/n)^{−n} = 1 − exp(−y).
We know that if Y ∼ EXP(1), then

G_Y(y) = 0 if y ≤ 0, and 1 − exp(−y) if y > 0.

Now, observe that

lim_{n→∞} G_n(y) = G_Y(y).

This is illustrated in Figure 2 (Figure 7.2 in the book), which shows the graphs of G_Y(y) and G_n(y) for n = 1, 2, and 5. Thus the limiting distribution of nX_{1:n} is exponential, which is a non-degenerate distribution.

Figure 2: Comparison of CDFs G_n(y) with limiting CDF G_Y(y)
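As a numerical sanity check (a sketch assuming NumPy and SciPy; n = 50 and the seed are arbitrary), one can sample PAR(1, 1) by inverting its CDF and compare nX_{1:n} with EXP(1) via a Kolmogorov-Smirnov statistic:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 50
    u = rng.uniform(size=(100_000, n))
    x = u / (1.0 - u)                 # inverse-CDF sampling: F(x) = x/(1+x)
    y = n * x.min(axis=1)             # Y_n = n X_{1:n}
    # a small KS statistic indicates closeness to the EXP(1) CDF 1 - exp(-y)
    print(stats.kstest(y, "expon"))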
4 Limiting distribution does not exist

The following example shows that a sequence of random variables need not have a limiting
distribution.
Example 7.2.4 of the book Let X1, ..., Xn be a random sample from a Pareto distribution, X ∼ PAR(1, 1), and let Y_n = X_{n:n}. The CDF of X is

F_X(x) = 1 − (1 + x)^{−1} = x/(1 + x), x > 0,

so the CDF of Y_n is

G_n(y) = P[Y_n ≤ y] = P[X_{n:n} ≤ y] = [F_X(y)]^n = [y/(1 + y)]^n, y > 0,

and zero otherwise. Because y/(1 + y) < 1, we have lim_{n→∞} G_n(y) = 0 for all y > 0, which cannot be a CDF because it does not approach one as y → ∞.

5 Limiting distribution - some more problems

Example 7.2.5 of the book Let X1, ..., Xn be a random sample from a Pareto distribution, X ∼ PAR(1, 1), and let Y_n = X_{n:n}/n. The CDF of X is

F_X(x) = 1 − (1 + x)^{−1} = x/(1 + x), x > 0,

so the CDF of Y_n is

G_n(y) = P[Y_n ≤ y]
       = P[X_{n:n}/n ≤ y]
       = [F_X(ny)]^n
       = [ny/(1 + ny)]^n
       = [1 + 1/(ny)]^{−n}, y > 0,

and zero otherwise. Now, taking the limit of G_n(y) as n → ∞ for y > 0 (use the result lim_{n→∞} (1 + c/n)^{nb} = exp(cb) with c = 1/y and b = −1), we get

lim_{n→∞} G_n(y) = lim_{n→∞} [1 + 1/(ny)]^{−n} = exp(−1/y), y > 0.
Now, consider the CDF of Y (you can show that it is a CDF) given by

G_Y(y) = 0 if y ≤ 0, and exp(−1/y) if y > 0.

Then,

lim_{n→∞} G_n(y) = G_Y(y)

for all values y at which G_Y(y) is continuous. Hence, Y_n = X_{n:n}/n →d Y, where the random variable Y has a non-degenerate distribution with pdf

f_Y(y) = y^{−2} exp(−1/y), y > 0.

The distribution defined above is a special case of the inverse-gamma distribution.
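The limit can again be checked by simulation (a sketch assuming NumPy; n = 200 and the grid of evaluation points are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    u = rng.uniform(size=(100_000, n))
    x = u / (1.0 - u)            # PAR(1,1) samples via inverse CDF
    y = x.max(axis=1) / n        # Y_n = X_{n:n}/n
    for t in [0.5, 1.0, 2.0, 5.0]:
        # empirical P[Y_n <= t] versus the limiting CDF exp(-1/t)
        print(t, (y <= t).mean(), np.exp(-1.0 / t))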
Example 7.2.6 of the book Let X1, ..., Xn be a random sample from an exponential distribution, X ∼ EXP(θ). Then

f_X(x) = (1/θ) exp(−x/θ), x > 0, θ > 0,

and zero otherwise, and

F_X(x) = 0 if x ≤ 0, and 1 − exp(−x/θ) if x > 0.

Further, let Y_n = X_{n:n}/θ − ln n. Then it follows that the CDF G_n(y) of Y_n is

G_n(y) = P[Y_n ≤ y]
       = P[X_{n:n}/θ − ln n ≤ y]
       = P[X_{n:n} ≤ θ(y + ln n)]
       = [F_X(θ(y + ln n))]^n
       = [1 − exp(−(y + ln n))]^n
       = [1 − exp(−y) exp(−ln n)]^n
       = [1 − exp(−y)/n]^n, y > −ln n,

and zero if y ≤ −ln n. Now, taking the limit of G_n(y) as n → ∞, we get

lim_{n→∞} G_n(y) = lim_{n→∞} [1 − exp(−y)/n]^n = exp(−exp(−y)), −∞ < y < ∞.

Now, consider the CDF of Y (you can show that it is a CDF) given by

G_Y(y) = exp(−exp(−y)), −∞ < y < ∞.

Then,

lim_{n→∞} G_n(y) = G_Y(y)

for all values y. Hence, Y_n = X_{n:n}/θ − ln n →d Y, where the random variable Y has a non-degenerate distribution with pdf

f_Y(y) = exp(−exp(−y)) exp(−y), −∞ < y < ∞.

The distribution defined by the above density is called the Gumbel distribution.
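Numerically (a sketch assuming NumPy and SciPy; θ = 2 and n = 100 are arbitrary), the centered maximum can be compared with the standard Gumbel distribution, which SciPy provides as gumbel_r:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    theta, n = 2.0, 100
    x = rng.exponential(theta, size=(100_000, n))
    y = x.max(axis=1) / theta - np.log(n)    # Y_n = X_{n:n}/theta - ln n
    # a small KS statistic indicates closeness to the Gumbel CDF exp(-exp(-y))
    print(stats.kstest(y, "gumbel_r"))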
Example 7.2.7 of the book Let X1, ..., Xn be a random sample from a normal distribution, X ∼ N(µ, σ²). Then Y_n = X̄_n ∼ N(µ, σ²/n). Further,

√n(Y_n − µ)/σ ∼ N(0, 1),

which gives

G_n(y) = P[Y_n ≤ y] = P[√n(Y_n − µ)/σ ≤ √n(y − µ)/σ] = Φ(√n(y − µ)/σ), −∞ < y < ∞,
where Φ(·) is the CDF of a standard normal distribution. To make the explanation easier, let us write the above expression in terms of an integral as

G_n(y) = P[Y_n ≤ y] = ∫_{−∞}^{√n(y−µ)/σ} φ(z) dz,

where φ(z) is the pdf of a standard normal variable. Now, consider three cases: y < µ, y = µ, and y > µ.

If y < µ, then √n(y − µ)/σ < 0 and √n(y − µ)/σ → −∞ as n → ∞, giving

lim_{n→∞} G_n(y) = 0, if y < µ.

If y = µ, then √n(y − µ)/σ = 0 and

lim_{n→∞} G_n(y) = ∫_{−∞}^{0} φ(z) dz = 1/2, if y = µ.

If y > µ, then √n(y − µ)/σ > 0 and √n(y − µ)/σ → ∞ as n → ∞, giving

lim_{n→∞} G_n(y) = 1, if y > µ.
Thus,

lim_{n→∞} G_n(y) = 0 if y < µ, 1/2 if y = µ, and 1 if y > µ.

Now, let Y be a degenerate random variable with P(Y = µ) = 1. Then

G_Y(y) = 0 if y < µ, and 1 if y ≥ µ.

Now, note that the limiting function lim_{n→∞} G_n(y) and G_Y(y) are equal except at the one point y = µ, but this is not a problem, because the definition of convergence in distribution requires only that the limiting function agree with a CDF at its points of continuity, and y = µ is the point of discontinuity of G_Y(y).

Thus, X̄_n →d Y, where the random variable Y has a degenerate distribution with P[Y = µ] = 1. That is, the sample mean from a normal distribution with mean µ and variance σ² converges to a degenerate random variable. Needless to say, this type of convergence is also called stochastic convergence.

6 Limiting distributions - conclusion

We have seen that if X1, ..., Xn is a random sample from a Pareto distribution, X ∼ PAR(1, 1), then

• the limiting distribution of nX_{1:n} is exponential, which is a non-degenerate distribution (Example 7.2.3);

• the limiting distribution of X_{n:n} does not exist (Example 7.2.4);

• X_{n:n}/n →d Y, where the random variable Y has a non-degenerate distribution (inverse-gamma distribution) with pdf f_Y(y) = y^{−2} exp(−1/y), y > 0 (Example 7.2.5).

Also, we have shown that if X1, ..., Xn is a random sample from an exponential distribution, X ∼ EXP(θ), then

• X_{1:n} →d Y, where the random variable Y has a degenerate distribution with P[Y = 0] = 1 (Example 7.2.2);

• X_{n:n}/θ − ln n →d Y, where the random variable Y has a non-degenerate distribution (Gumbel distribution) with pdf f_Y(y) = exp(−exp(−y)) exp(−y), −∞ < y < ∞ (Example 7.2.6).

Also, we have shown that X̄_n →d Y, where the random variable Y has a degenerate distribution with P[Y = µ] = 1 and X̄_n is the sample mean from a normal distribution with mean µ and variance σ² (Example 7.2.7).

That is, a sequence of random variables may converge in distribution to a degenerate random variable, or to a non-degenerate random variable, or may not converge at all.

7 Approximation using limiting distribution

We have seen that if X1, ..., Xn is a random sample from an exponential distribution, X ∼ EXP(θ), then Y_n = X_{n:n}/θ − ln n has the CDF

G_n(y) = [1 − exp(−y)/n]^n, y > −ln n,

and zero if y ≤ −ln n. Further, Y_n = X_{n:n}/θ − ln n →d Y, where the random variable Y has a non-degenerate distribution (Gumbel distribution) with CDF and pdf

G_Y(y) = exp(−exp(−y)), −∞ < y < ∞,

and

f_Y(y) = exp(−exp(−y)) exp(−y), −∞ < y < ∞,

respectively. We now illustrate the accuracy of this limiting CDF when it is used as an approximation to G_n(y) for large n.

Suppose that the lifetime in months of a certain type of component is a random variable X ∼ EXP(1), and suppose that 10 independent components are connected in a parallel system. The time to failure of the system is T = X_{10:10}, and its CDF is

F_T(t) = P[T ≤ t] = [F_X(t)]^10 = [1 − exp(−t)]^10, t > 0.

This CDF is evaluated at t = 1, 2, 5, and 7 months in the table below.

t              1      2      5      7
F_T(t)         0.010  0.234  0.935  0.9909
G(t − ln 10)   0.025  0.258  0.935  0.9909

To approximate these probabilities with the limiting distribution, write

F_T(t) = P[T ≤ t]
       = P[Y_10 + ln 10 ≤ t]
       = P[Y_10 ≤ t − ln 10]
       ≈ G(t − ln 10),

where

G(t − ln 10) = exp(−exp{−(t − ln 10)})
             = exp(−exp(−t) exp(ln 10))
             = exp(−10 exp(−t)).
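The table can be reproduced directly from the two closed-form expressions above (a minimal sketch assuming NumPy is available):

    import numpy as np

    for t in [1.0, 2.0, 5.0, 7.0]:
        exact = (1.0 - np.exp(-t)) ** 10       # F_T(t) = [1 - exp(-t)]^10
        approx = np.exp(-10.0 * np.exp(-t))    # G(t - ln 10) = exp(-10 exp(-t))
        print(t, round(exact, 4), round(approx, 4))

The Gumbel approximation already agrees to three decimals by t = 5.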

Chapter 7: The Central Limit Theorem

In the previous examples, the exact CDF was known for each finite n, and the limiting distribution was obtained directly from this sequence. One advantage of limiting distributions is that it often may be possible to determine the limiting distribution without knowing the exact form of the CDF for finite n. The limiting distribution then may provide a useful approximation when the exact probabilities are not available. One method of accomplishing this is to make use of MGFs. The following theorem (Theorem 7.3.1) is stated without proof.

Theorem 7.3.1 Let Y1, Y2, ... be a sequence of random variables with respective CDFs G1(y), G2(y), ... and MGFs M1(t), M2(t), .... If M(t) is the MGF of a CDF G(y), and if lim_{n→∞} M_n(t) = M(t) for all t in an open interval containing zero, −h < t < h, then lim_{n→∞} G_n(y) = G(y) for all continuity points of G(y).

The Bernoulli distribution

A discrete random variable X is said to have a Bernoulli distribution with parameter θ, denoted by X ∼ BIN(1, θ), if its probability function is given by

f_X(x; θ) = θ^x (1 − θ)^{1−x}, x = 0, 1, 0 < θ < 1.

Further, if X1, ..., Xn are independent Bernoulli random variables, X_i ∼ BIN(1, θ), i = 1, ..., n, then the sum Y = Σ_{i=1}^n X_i has a binomial distribution, Y ∼ BIN(n, θ), defined by the probability function

f_Y(y; n, θ) = C(n, y) θ^y (1 − θ)^{n−y}, y = 0, 1, ..., n, 0 < θ < 1,

where C(n, y) = n!/[y!(n − y)!] is the binomial coefficient. The MGF of Y is given by

M_Y(t) = [1 − θ + θ exp(t)]^n, −h < t < h, h > 0.
The Poisson distribution

A discrete random variable X is said to have the Poisson distribution with parameter µ > 0 if it has discrete pdf of the form

f_X(x; µ) = exp(−µ) µ^x / x!, x = 0, 1, ..., µ > 0.

A special notation that designates that a random variable X has the Poisson distribution with parameter µ is X ∼ POI(µ). The MGF of X is given by

M_X(t) = exp[µ(exp(t) − 1)], −h < t < h, h > 0.

It has been shown in Theorem 3.2.3 of the book that if X ∼ BIN(n, p), then for each value x = 0, 1, 2, ..., as n → ∞ and p → 0 with np = µ held constant,

lim_{n→∞} C(n, x) p^x (1 − p)^{n−x} = exp(−µ) µ^x / x!.

Example 7.3.1 of the book Let X1, ..., Xn be a random sample from a Bernoulli distribution, X_i ∼ BIN(1, p), i = 1, ..., n, and consider the sum Y_n = Σ_{i=1}^n X_i, which has a binomial distribution, Y_n ∼ BIN(n, p), with MGF

M_{Y_n}(t) = [1 − p + p exp(t)]^n, −h < t < h, h > 0.

If we let p → 0 as n → ∞ in such a way that np = µ, for fixed µ > 0, then

M_{Y_n}(t) = [1 − p + p exp(t)]^n
           = [1 − µ/n + (µ/n) exp(t)]^n
           = [1 + (µ/n)(exp(t) − 1)]^n.

Now, using the result

lim_{n→∞} (1 + c/n)^{nb} = exp(cb),

we have

lim_{n→∞} M_{Y_n}(t) = lim_{n→∞} [1 + (µ/n)(exp(t) − 1)]^n = exp[µ(exp(t) − 1)], −h < t < h, h > 0,

which is the MGF of a Poisson distribution with mean µ. Thus Y_n →d Y ∼ POI(µ).
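The convergence of the pmfs themselves can be inspected directly (a sketch assuming SciPy; µ = 3 and the range 0-15 are arbitrary choices):

    from scipy import stats

    mu = 3.0
    for n in [10, 100, 1000]:
        # largest pointwise gap between the BIN(n, mu/n) and POI(mu) pmfs
        err = max(abs(stats.binom.pmf(x, n, mu / n) - stats.poisson.pmf(x, mu))
                  for x in range(16))
        print(n, err)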
Example 7.3.2 of the book (Bernoulli Law of Large Numbers) Let X1, ..., Xn be a random sample from a Bernoulli distribution, X_i ∼ BIN(1, p), i = 1, ..., n, and consider the sequence of sample proportions W_n = Y_n/n = (1/n) Σ_{i=1}^n X_i. The MGF of W_n in this case is

M_{W_n}(t) = E[exp(tW_n)]
           = E[exp((t/n) Y_n)]
           = M_{Y_n}(t/n)
           = [1 − p + p exp(t/n)]^n, −h < t/n < h, h > 0.

Expanding exp(t/n) by the power series expansion of the exponential function,

exp(t/n) = 1 + (t/n)/1! + (t/n)²/2! + (t/n)³/3! + ...,

in the above expression, we get

M_{W_n}(t) = [1 − p + p(1 + (t/n)/1! + (t/n)²/2! + (t/n)³/3! + ...)]^n
           = [1 + pt/n + p(t²/(2n²) + t³/(6n³) + ...)]^n
           = [1 + pt/n + d(n)/n]^n,

where

d(n) = p(t²/(2n) + t³/(6n²) + ...).

It can easily be verified that for a fixed value of p, lim_{n→∞} d(n) = 0. Now, applying the result

lim_{n→∞} (1 + c/n + d(n)/n)^{nb} = exp(cb) if lim_{n→∞} d(n) = 0,

we obtain

lim_{n→∞} M_{W_n}(t) = lim_{n→∞} [1 + pt/n + d(n)/n]^n = exp(pt),

which is the MGF of a degenerate distribution with probability mass concentrated at p, and thus W_n converges stochastically to p as n approaches infinity.
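A short simulation illustrates this stochastic convergence (a sketch assuming NumPy; p = 0.3 and the tolerance 0.02 are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.3
    for n in [10, 100, 10_000]:
        w = rng.binomial(n, p, size=100_000) / n     # W_n = Y_n / n
        # P[|W_n - p| > 0.02] shrinks toward 0 as n grows
        print(n, (np.abs(w - p) > 0.02).mean())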

Example 7.3.3 of the book Let X1, ..., Xn be a random sample from a Bernoulli distribution, X_i ∼ BIN(1, p), i = 1, ..., n, and consider the sequence of "standardized" variables

Z_n = (Σ_{i=1}^n X_i − np)/√(np(1 − p)) = (Y_n − np)/σ_n = Y_n/σ_n − np/σ_n,

where E(Y_n) = np and Var(Y_n) = np(1 − p) = σ_n². The MGF of Z_n in this case is

M_{Z_n}(t) = E[exp(tZ_n)]
           = E[exp(tY_n/σ_n − tnp/σ_n)]
           = exp(−npt/σ_n) E[exp((t/σ_n) Y_n)]
           = exp(−npt/σ_n) M_{Y_n}(t/σ_n)
           = {exp(−pt/σ_n) [1 − p + p exp(t/σ_n)]}^n, −h < t/σ_n < h, h > 0.

Expanding exp(−pt/σ_n) and exp(t/σ_n) by the power series expansion of the exponential function,

exp(−pt/σ_n) = 1 − (pt/σ_n)/1! + (pt/σ_n)²/2! − ...

and

exp(t/σ_n) = 1 + (t/σ_n)/1! + (t/σ_n)²/2! + ...,

in the above expression, we get

M_{Z_n}(t) = {[1 − pt/σ_n + p²t²/(2σ_n²) − ...][1 + pt/σ_n + pt²/(2σ_n²) + ...]}^n
           = [1 + (pt²/(2σ_n²) − p²t²/σ_n² + p²t²/(2σ_n²)) + (terms of order σ_n⁻³ and smaller)]^n.

Substituting σ_n² = np(1 − p) in the second-order terms gives

pt²/(2σ_n²) − p²t²/σ_n² + p²t²/(2σ_n²) = (p − p²)t²/(2σ_n²) = p(1 − p)t²/[2np(1 − p)] = t²/(2n),

while the terms of order σ_n⁻³ and smaller can be collected as d(n)/n; each of them carries at least one extra factor 1/σ_n = [np(1 − p)]^{−1/2}, so it can be checked that, for a fixed value of p, d(n) → 0 as n → ∞. Now M_{Z_n}(t) can be written as

M_{Z_n}(t) = [1 + t²/(2n) + d(n)/n]^n.
Finally, applying the result

lim_{n→∞} (1 + c/n + d(n)/n)^{nb} = exp(cb) if lim_{n→∞} d(n) = 0,

we obtain

lim_{n→∞} M_{Z_n}(t) = lim_{n→∞} [1 + t²/(2n) + d(n)/n]^n = exp(t²/2),

which is the MGF of the standard normal distribution, and so Z_n →d Z ∼ N(0, 1). This is an example of a special limiting result known as the Central Limit Theorem.
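The normal limit can be seen empirically (a sketch assuming NumPy and SciPy; p = 0.4 and the evaluation point z = 1 are arbitrary):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    p = 0.4
    for n in [10, 100, 1000]:
        y = rng.binomial(n, p, size=100_000)
        z = (y - n * p) / np.sqrt(n * p * (1 - p))   # standardized binomial
        # empirical P[Z_n <= 1] versus Phi(1) ~ 0.8413
        print(n, (z <= 1.0).mean(), stats.norm.cdf(1.0))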

Theorem 7.3.2 Let X1, X2, ..., Xn be a random sample from a distribution with mean µ and variance σ² < ∞. Then the limiting distribution of

Z_n = (Σ_{i=1}^n X_i − nµ)/(√n σ)

is the standard normal: Z_n →d Z ∼ N(0, 1).

Let m(t) denote the MGF of X − µ, m(t) = M_{X−µ}(t), and note that m(0) = 1, m′(0) = E(X − µ) = 0, and m″(0) = E(X − µ)² = σ². Expanding m(t) by the Taylor series formula about 0 gives, for some ξ between 0 and t,

m(t) = m(0) + m′(0)t + m″(ξ)t²/2
     = 1 + m″(ξ)t²/2
     = 1 + (m″(ξ) − σ² + σ²)t²/2
     = 1 + σ²t²/2 + (m″(ξ) − σ²)t²/2.
Now we may write

Z_n = (Σ_{i=1}^n X_i − nµ)/(√n σ) = Σ_{i=1}^n (X_i − µ)/(√n σ),

and

M_{Z_n}(t) = E[exp(tZ_n)]
           = E[exp(t Σ_{i=1}^n (X_i − µ)/(√n σ))]
           = E[Π_{i=1}^n exp(t(X_i − µ)/(√n σ))]
           = Π_{i=1}^n E[exp(t(X_i − µ)/(√n σ))]    (because X1, ..., Xn are independent)
           = {E[exp((t/(√n σ))(X − µ))]}^n          (because X1, ..., Xn are identically distributed)
           = [m(t/(√n σ))]^n
           = [1 + t²/(2n) + (m″(ξ) − σ²)t²/(2nσ²)]^n, 0 < |ξ| < |t|/(√n σ).

Observe that as n → ∞, |t|/(√n σ) → 0, therefore ξ → 0, and consequently m″(ξ) − σ² → 0. So we take

d(n) = (m″(ξ) − σ²)t²/(2σ²), with lim_{n→∞} d(n) = 0,

and re-write M_{Z_n}(t) as

M_{Z_n}(t) = [1 + t²/(2n) + d(n)/n]^n.
Finally, applying the result

lim_{n→∞} (1 + c/n + d(n)/n)^{nb} = exp(cb) if lim_{n→∞} d(n) = 0,

we obtain

lim_{n→∞} M_{Z_n}(t) = lim_{n→∞} [1 + t²/(2n) + d(n)/n]^n = exp(t²/2).

Or, equivalently,

lim_{n→∞} G_{Z_n}(z) = Φ(z),

which shows that Z_n →d Z ∼ N(0, 1).
The major application of the CLT is to provide an approximate distribution in cases
where the exact distribution is unknown or intractable.
Example 7.3.4 of the book Let X1, ..., Xn be a random sample from a uniform distribution, X_i ∼ UNIF(0, 1), i = 1, ..., n, and let Y_n = Σ_{i=1}^n X_i. Because E(X_i) = 1/2 and Var(X_i) = 1/12, we have the approximation

Y_n ∼ N(n/2, n/12), approximately.

For example, if n = 12, then approximately

Y_12 − 6 ∼ N(0, 1).

This approximation is so close that it often is used to simulate standard normal random
numbers in computer applications. Of course this requires 12 uniform random numbers
to be generated to obtain one random number from the standard normal distribution.
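A minimal sketch of this classical generator (assuming NumPy and SciPy; the seed and sample count are arbitrary):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Y_12 - 6, where Y_12 is a sum of 12 independent UNIF(0,1) variables
    z = rng.uniform(0.0, 1.0, size=(100_000, 12)).sum(axis=1) - 6.0
    print(z.mean(), z.std())          # approximately 0 and 1
    print(stats.kstest(z, "norm"))    # close to N(0,1), though not exact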

Chapter 7: Approximations For The Binomial Distribution

Let X1, ..., Xn be a random sample from a Bernoulli distribution, X_i ∼ BIN(1, p), i = 1, ..., n, and consider the sum Y_n = Σ_{i=1}^n X_i, which has a binomial distribution, Y_n ∼ BIN(n, p).

1. If we let p → 0 as n → ∞ in such a way that np = µ, for fixed µ > 0, then it has been shown that Y_n →d Y ∼ POI(µ). (Example 7.3.1 of the book)

2. If p is fixed, then W_n = Y_n/n = (1/n) Σ_{i=1}^n X_i converges stochastically to p as n approaches infinity. (Example 7.3.2 of the book)

3. For a fixed value of p, the sequence of "standardized" variables

Z_n = (Y_n − np)/√(np(1 − p))

converges in distribution to a standard normal variable. (Example 7.3.3 of the book)

Here we will concentrate on the last case, which states that for a fixed value of p a suitably standardized sequence of binomial random variables converges to a standard normal distribution, suggesting a normal approximation. In particular, it suggests that for large n and fixed p, approximately Y_n ∼ N(np, np(1 − p)). This approximation works best when p is close to 0.5, because the binomial distribution is symmetric when p = 0.5. The accuracy required in any approximation depends on the application. One guideline is to use the normal approximation when np > 5 and n(1 − p) > 5, but again this would depend on the accuracy required.
Example 7.4.1 of the book The probability that a basketball player hits a shot is p = 0.5. If he takes 20 shots, what is the probability that he hits at least nine? The exact probability is

P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8]
            = 1 − Σ_{y=0}^{8} C(20, y) (0.5)^y (1 − 0.5)^{20−y}
            = 1 − (0.5)^{20} Σ_{y=0}^{8} C(20, y)
            = 0.7483.

A normal approximation is

P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8]
            = 1 − P[(Y_20 − np)/√(np(1 − p)) ≤ (8 − 10)/√((20)(0.5)(0.5))]   (n = 20, p = 0.5)
            = 1 − P[(Y_20 − np)/√(np(1 − p)) ≤ −2/√5].

Now,

(Y_20 − np)/√(np(1 − p)) ∼ N(0, 1), approximately,

and therefore

P[Y_20 ≥ 9] ≈ 1 − Φ(−2/√5)
            = 1 − [1 − Φ(2/√5)]
            = Φ(2/√5)
            = Φ(0.894427)
            = 0.8133.
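Both numbers are easy to verify (a minimal sketch assuming SciPy):

    import numpy as np
    from scipy import stats

    n, p = 20, 0.5
    exact = 1 - stats.binom.cdf(8, n, p)        # P[Y_20 >= 9] = 0.7483
    z = (8 - n * p) / np.sqrt(n * p * (1 - p))  # -2/sqrt(5)
    approx = 1 - stats.norm.cdf(z)              # Phi(2/sqrt(5)) ~ 0.81
    print(exact, approx)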
Because the binomial distribution is discrete and the normal distribution is continuous, the approximation can be improved by making a continuity correction. In particular, each binomial probability b(y; n, p) has the same value as the area of a rectangle of height b(y; n, p) and with the interval [y − 0.5, y + 0.5] as its base, because the length of the base is one unit. The area of this rectangle can be approximated by the area under the pdf of Y ∼ N(np, np(1 − p)), which corresponds to fitting a normal distribution with the same mean and variance as Y_n ∼ BIN(n, p). This is illustrated for the case of n = 20, p = 0.5, and y = 7 in Figure 3, where the exact probability is b(7; 20, 0.5) = C(20, 7)(0.5)^7(0.5)^13 = 0.0739. The approximation, which is the shaded area in Figure 3, is

P[6.5 ≤ Y ≤ 7.5] = P[(6.5 − 10)/√((20)(0.5)(0.5)) ≤ (Y − np)/√(np(1 − p)) ≤ (7.5 − 10)/√((20)(0.5)(0.5))]
                 = P[−1.56525 ≤ Z ≤ −1.11803]
                 = Φ(−1.11803) − Φ(−1.56525)
                 = 0.0732,

where Y ∼ N(np, np(1 − p)) and Z = (Y − np)/√(np(1 − p)) ∼ N(0, 1) with n = 20 and p = 0.5.

Figure 3: Continuity correction for normal approximation of a binomial probability

The same idea can be used with other binomial probabilities, such as

P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8],

where we approximate P[Y_20 ≤ 8] as

P[Y_20 ≤ 8] ≈ P[Y ≤ 8.5]
            = P[(Y − np)/√(np(1 − p)) ≤ (8.5 − 10)/√((20)(0.5)(0.5))]
            = P[Z ≤ −0.67082]
            = Φ(−0.67082),

and

P[Y_20 ≥ 9] ≈ 1 − P[Y ≤ 8.5] = 1 − Φ(−0.67082) = 0.7486,

which is much closer to the exact value than without the continuity correction. The situation is shown in Figure 4.
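The corrected approximation can be checked the same way (a minimal sketch assuming SciPy):

    import numpy as np
    from scipy import stats

    n, p = 20, 0.5
    sd = np.sqrt(n * p * (1 - p))
    exact = 1 - stats.binom.cdf(8, n, p)                 # 0.7483
    corrected = 1 - stats.norm.cdf((8.5 - n * p) / sd)   # 1 - Phi(-0.67082)
    print(exact, corrected)  # the corrected value is much closer to the exact one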
Example 7.4.2 of the book Suppose that Y_n ∼ POI(n), where n is a positive integer. From the reproductive property of the Poisson distribution, we know that Y_n has the same distribution as a sum Σ_{i=1}^n X_i, where X1, ..., Xn are independent, X_i ∼ POI(1). According to the CLT,

Z_n = (Y_n − n)/√n →d Z ∼ N(0, 1),

which suggests the approximation Y_n ∼ N(n, n), approximately, for large n. For example, with n = 20, suppose we desire to find P[10 ≤ Y_20 ≤ 30]. The exact value is

P[10 ≤ Y_20 ≤ 30] = Σ_{y=10}^{30} exp(−20)(20)^y / y! = 0.982,
Figure 4: The normal approximation for a binomial distribution

and the approximate value is

P[9.5 ≤ Y ≤ 30.5] = P[(9.5 − 20)/√20 ≤ (Y − n)/√n ≤ (30.5 − 20)/√20]   (with (Y − n)/√n ∼ N(0, 1) approximately)
                  = P[−2.34787 ≤ Z ≤ 2.34787]
                  = Φ(2.34787) − Φ(−2.34787)
                  = 2Φ(2.34787) − 1
                  = 2(0.991) − 1
                  = 0.982.
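Again, both values are easy to reproduce (a minimal sketch assuming SciPy):

    import numpy as np
    from scipy import stats

    n = 20
    exact = stats.poisson.cdf(30, n) - stats.poisson.cdf(9, n)  # P[10 <= Y <= 30]
    lo, hi = (9.5 - n) / np.sqrt(n), (30.5 - n) / np.sqrt(n)
    approx = stats.norm.cdf(hi) - stats.norm.cdf(lo)            # continuity-corrected
    print(exact, approx)    # both approximately 0.982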

To compute Φ, you can use an online normal calculator (https://stattrek.com/online-calculator/normal.aspx).
