
CHAPTER 7

LIMITING DISTRIBUTIONS

7.1
INTRODUCTION

In Chapter 6, general methods were discussed for deriving the distribution of a
function of n random variables, say Y = u(X_1, ..., X_n). In some cases the pdf of
Y is obtained easily, but there are many important cases where the derivation is
not tractable. In many of these, it is possible to obtain useful approximate results
that apply when n is large. These results are based on the notions of convergence
in distribution and limiting distribution.


7.2
SEQUENCES OF RANDOM VARIABLES

Consider a sequence of random variables Y_1, Y_2, ... with a corresponding
sequence of CDFs G_1(y), G_2(y), ..., so that for each n = 1, 2, ...,

    G_n(y) = P[Y_n ≤ y]                                           (7.2.1)

Definition 7.2.1
If Y_n ~ G_n(y) for each n = 1, 2, ..., and if for some CDF G(y),

    lim_{n→∞} G_n(y) = G(y)                                       (7.2.2)

for all values y at which G(y) is continuous, then the sequence Y_1, Y_2, ... is said to
converge in distribution to Y ~ G(y), denoted by Y_n →d Y. The distribution corresponding
to the CDF G(y) is called the limiting distribution of Y_n.

Example 7.2.1  Let X_1, ..., X_n be a random sample from a uniform distribution, X_i ~ UNIF(0, 1),
and let Y_n = X_{n:n}, the largest order statistic. From the results of Chapter 6, it
follows that the CDF of Y_n is

    G_n(y) = y^n,   0 < y < 1                                     (7.2.3)

zero if y ≤ 0, and one if y ≥ 1. Of course, when 0 < y < 1, y^n approaches 0 as n
approaches ∞, and when y ≤ 0 or y ≥ 1, G_n(y) is a sequence of constants, with

FIGURE 7.1  Comparison of CDFs G_n(y) with limiting degenerate CDF G(y)

respective limits 0 or 1. Thus, lim_{n→∞} G_n(y) = G(y) where

    G(y) = 0  if y < 1
    G(y) = 1  if y ≥ 1                                            (7.2.4)

This situation is illustrated in Figure 7.1, which shows G_n(y) and G(y) for n = 2,
5, and 10.

The function defined by equation (7.2.4) is the CDF of a random variable that
is concentrated at one value, y = 1. Such distributions occur often as limiting
distributions.
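The degenerate limit in equation (7.2.4) is easy to check numerically. A minimal Python sketch (the function name G_n is ours, mirroring the notation above):

```python
def G_n(y, n):
    """CDF of the largest order statistic from a UNIF(0, 1) sample of size n."""
    if y <= 0:
        return 0.0
    if y >= 1:
        return 1.0
    return y ** n

# For fixed y < 1 the values G_n(y) = y^n sink toward 0 as n grows,
# while G_n(y) = 1 for every y >= 1, matching the degenerate CDF G(y).
for n in (2, 5, 10, 100):
    print(n, G_n(0.9, n))
```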

Definition 7.2.2
The function G(y) is the CDF of a degenerate distribution at the value y = c if

    G(y) = 0  if y < c
    G(y) = 1  if y ≥ c                                            (7.2.5)

In other words, G(y) is the CDF of a discrete distribution that assigns probability
one to the value y = c and zero otherwise.

Example 7.2.2  Let X_1, X_2, ..., X_n be a random sample from an exponential distribution,
X_i ~ EXP(θ), and let Y_n = X_{1:n} be the smallest order statistic. It follows that the
CDF of Y_n is

    G_n(y) = 1 − e^{−ny/θ},   y > 0                               (7.2.6)

and zero otherwise. We have lim_{n→∞} G_n(y) = 1 if y > 0, because e^{−y/θ} < 1 in this
case. Thus, the limit is zero if y ≤ 0 and one if y > 0, which corresponds to a
degenerate distribution at the value y = 0. Notice that the limit at y = 0 is zero,
which means that the limiting function is not only discontinuous at y = 0, but
also not even continuous from the right at y = 0, which is a requirement of a
CDF. This is not a problem, because Definition 7.2.1 requires only that the limiting
function agree with a CDF at its points of continuity.

Definition 7.2.3
A sequence of random variables, Y_1, Y_2, ..., is said to converge stochastically to a
constant c if it has a limiting distribution that is degenerate at y = c.


An alternative formulation of stochastic convergence will be considered in


Section 7.6, and a more general concept called convergence in probability will be
discussed in Section 7.7.
Not all limiting distributions are degenerate, as seen in the next example. The
following limits are useful in many problems:

    lim_{n→∞} (1 + c/n)^{nb} = e^{bc}                             (7.2.7)

    lim_{n→∞} (1 + c/n + d(n)/n)^{nb} = e^{bc}   if lim_{n→∞} d(n) = 0   (7.2.8)

These are obtained easily from expansions involving the natural logarithm.
For example, limit (7.2.7) follows from the expansion nb ln(1 + c/n)
= nb(c/n + ...) = bc + ..., where the remaining terms approach zero as n → ∞.
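These limits can also be verified numerically. A small Python sketch (power_limit is our name for the quantity in limit (7.2.8)):

```python
import math

def power_limit(c, b, n, d=0.0):
    """Evaluate (1 + c/n + d/n)^(nb), which approaches e^(bc) as n -> infinity,
    provided the remainder term d = d(n) tends to zero."""
    return (1.0 + c / n + d / n) ** (n * b)

# (1 - 1/n)^n -> e^(-1) and (1 + 2/n)^(3n) -> e^6
for n in (10, 100, 10**6):
    print(n, power_limit(-1.0, 1.0, n), power_limit(2.0, 3.0, n))
print(math.exp(-1), math.exp(6))
```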

Example 7.2.3  Suppose that X_1, ..., X_n is a random sample from a Pareto distribution,
X_i ~ PAR(1, 1), and let Y_n = nX_{1:n}. The CDF of X_i is F(x) = 1 − (1 + x)^{−1};
x > 0, so the CDF of Y_n is

    G_n(y) = 1 − (1 + y/n)^{−n},   y > 0                          (7.2.9)

and zero otherwise. Using limit (7.2.7), we obtain the limit G(y) = 1 − e^{−y}; y > 0, and zero otherwise,
which is the CDF of an exponential distribution, EXP(1). This is illustrated in
Figure 7.2, which shows the graphs of G(y) and G_n(y) for n = 1, 2, and 5.

FIGURE 7.2  Comparison of CDFs G_n(y) with limiting CDF G(y) = 1 − e^{−y}
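The pointwise convergence in Example 7.2.3 can be checked directly. A Python sketch (function names are ours):

```python
import math

def G_n(y, n):
    """CDF of Y_n = n * X_{1:n} for a PAR(1, 1) sample, equation (7.2.9)."""
    return 1.0 - (1.0 + y / n) ** (-n) if y > 0 else 0.0

def G(y):
    """Limiting EXP(1) CDF, G(y) = 1 - e^(-y) for y > 0."""
    return 1.0 - math.exp(-y) if y > 0 else 0.0

# The pointwise error at y = 1 shrinks as n grows.
for n in (1, 2, 5, 100):
    print(n, round(G_n(1.0, n), 4), round(G(1.0), 4))
```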

The following example shows that a sequence of random variables need not
have a limiting distribution.


Example 7.2.4  For the random sample of the previous example, let us consider the largest order
statistic, Y_n = X_{n:n}. The CDF of Y_n is

    G_n(y) = [y/(1 + y)]^n,   y > 0                               (7.2.10)

and zero otherwise. Because y/(1 + y) < 1, we have lim_{n→∞} G_n(y) = G(y) = 0 for all y,
which is not a CDF because it does not approach one as y → ∞.

Example 7.2.5  In the previous example, suppose that we consider a rescaled variable,
Y_n = (1/n)X_{n:n}, which has CDF

    G_n(y) = [1 + 1/(ny)]^{−n},   y > 0                           (7.2.11)

and zero otherwise. Using limit (7.2.7), we obtain the CDF G(y) = e^{−1/y}; y > 0.

Example 7.2.6  For the random sample of Example 7.2.2, consider the modified sequence
Y_n = (1/θ)X_{n:n} − ln n. The CDF is

    G_n(y) = [1 − (1/n)e^{−y}]^n,   y > −ln n                     (7.2.12)

and zero otherwise. Following from limit (7.2.7), the limiting CDF is
G(y) = exp(−e^{−y}); −∞ < y < ∞.

We now illustrate the accuracy when this limiting CDF is used as an approximation
to G_n(y) for large n. Suppose that the lifetime in months of a certain type
of component is a random variable X ~ EXP(1), and suppose that 10 independent
components are connected in a parallel system. The time to failure of the
system is T = X_{10:10}, and the CDF is F_T(t) = (1 − e^{−t})^{10}; t > 0. This CDF is
evaluated at t = 1, 2, 5, and 7 months in the table below. To
approximate these probabilities with the limiting distribution,

    F_T(t) = P[T ≤ t]
           = P[Y_10 + ln 10 ≤ t]
           ≈ G(t − ln 10)
           = exp(−e^{−(t − ln 10)})
           = exp(−10e^{−t})


The approximate probabilities are given in the table for comparison.

    t:             1       2       5       7
    F_T(t):        0.010   0.234   0.935   0.9909
    G(t − ln 10):  0.025   0.258   0.935   0.9909

The approximation should improve as n increases.
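The table can be reproduced directly from the two formulas. A Python sketch (function names are ours):

```python
import math

def exact_cdf(t, n=10):
    """Exact CDF of the system lifetime T = X_{n:n} for EXP(1) components."""
    return (1.0 - math.exp(-t)) ** n

def approx_cdf(t, n=10):
    """Extreme-value approximation exp(-n * e^(-t)) from Example 7.2.6."""
    return math.exp(-n * math.exp(-t))

for t in (1, 2, 5, 7):
    print(t, round(exact_cdf(t), 4), round(approx_cdf(t), 4))
```

The two columns agree to three decimals by t = 5, as in the table.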

Example 7.2.7  Consider the sample mean of a random sample from a normal distribution,
X_i ~ N(μ, σ^2), and let Y_n = X̄_n. From the results of the previous chapter, X̄_n ~ N(μ, σ^2/n), and

    G_n(y) = Φ(√n(y − μ)/σ)                                       (7.2.13)

where Φ denotes the standard normal CDF. The limiting CDF is degenerate at y = μ, because lim_{n→∞} G_n(y) = 0 if y < μ, 1/2 if
y = μ, and 1 if y > μ, so that the sample mean converges stochastically to μ. We
will show this in a more general setting in a later section.

Certain limiting distributions are easier to derive by using moment generating
functions.

7.3
THE CENTRAL LIMIT THEOREM

In the previous examples, the exact CDF was known for each finite n, and the
limiting distribution was obtained directly from this sequence. One advantage of
limiting distributions is that it often may be possible to determine the limiting
distribution without knowing the exact form of the CDF for finite n. The limiting
distribution then may provide a useful approximation when the exact probabilities
are not available. One method of accomplishing this result is to make use of
MGFs. The following theorem is stated without proof.

Theorem 7.3.1  Let Y_1, Y_2, ... be a sequence of random variables with respective CDFs G_1(y),
G_2(y), ... and MGFs M_1(t), M_2(t), .... If M(t) is the MGF of a CDF G(y), and if
lim_{n→∞} M_n(t) = M(t) for all t in an open interval containing zero, −h < t < h, then
lim_{n→∞} G_n(y) = G(y) for all continuity points of G(y).


Example 7.3.1  Let X_1, ..., X_n be a random sample from a Bernoulli distribution,
X_i ~ BIN(1, p), and consider Y_n = Σ_{i=1}^n X_i. If we let p → 0 as n → ∞ in such a way
that np = μ, for fixed μ > 0, then

    M_n(t) = (pe^t + q)^n
           = [1 + p(e^t − 1)]^n
           = [1 + μ(e^t − 1)/n]^n                                 (7.3.1)

and from limit (7.2.7) we have

    lim_{n→∞} M_n(t) = e^{μ(e^t − 1)}                             (7.3.2)

which is the MGF of the Poisson distribution with mean μ. This is consistent
with the result of Theorem 3.2.3 and is somewhat easier to verify. We conclude
that Y_n →d Y ~ POI(μ).
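The same Poisson limit can be seen numerically by comparing pmfs rather than MGFs. A Python sketch (function names are ours):

```python
import math

def binom_pmf(y, n, p):
    """Binomial pmf b(y; n, p)."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def pois_pmf(y, mu):
    """Poisson pmf with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

# Large n, small p, with np = mu = 2 held fixed.
mu, n = 2.0, 1000
for y in range(5):
    print(y, round(binom_pmf(y, n, mu / n), 5), round(pois_pmf(y, mu), 5))
```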

Example 7.3.2  Bernoulli Law of Large Numbers  Suppose now that we keep p fixed and consider
the sequence of sample proportions, Y_n = p̂_n = Σ_{i=1}^n X_i/n. By using the series
expansion e^u = 1 + u + u^2/2 + ... with u = t/n, we obtain

    M_n(t) = (pe^{t/n} + q)^n
           = [1 + pt/n + d(n)/n]^n                                (7.3.3)

where d(n)/n involves the disregarded terms of the series expansion, and d(n) → 0
as n → ∞. From limit (7.2.8) we have

    lim_{n→∞} M_n(t) = e^{pt}                                     (7.3.4)

which is the MGF of a degenerate distribution at y = p, and thus p̂_n converges
stochastically to p as n approaches infinity.

Note that this example provides an approach to answering the question that
was raised in Chapter 1 about statistical regularity. If, in a sequence of M independent
trials of an experiment, Y_A represents the number of occurrences of an
event A, then f_A = Y_A/M is the relative frequency of occurrence of A. Because the
Bernoulli parameter has the value p = P(A) in this case, it follows that f_A converges
stochastically to P(A) as M → ∞. For example, if a coin is tossed repeatedly,
and A = {H}, then the successive relative frequencies of A correspond to a
sequence of random variables that will converge stochastically to p = 1/2 for an
unbiased coin. Even though different sequences of tosses generally produce different
observed numerical sequences of f_A, in the long run they all tend to stabilize
near 1/2.

Example 7.3.3  Now we consider the sequence of "standardized" variables:

    Z_n = (Y_n − np)/√(npq)                                       (7.3.5)

With the simplified notation σ_n = √(npq), we have Z_n = Y_n/σ_n − np/σ_n. Using the
series expansion of the previous example,

    M_n(t) = e^{−npt/σ_n}(pe^{t/σ_n} + q)^n
           = [e^{−pt/σ_n}(pe^{t/σ_n} + q)]^n
           = [1 + pq t^2/(2σ_n^2) + ...]^n
           = [1 + t^2/(2n) + d(n)/n]^n                            (7.3.6)

where d(n) → 0 as n → ∞. Thus,

    lim_{n→∞} M_n(t) = e^{t^2/2}                                  (7.3.7)

which is the MGF of the standard normal distribution, and so Z_n →d Z ~ N(0, 1).
This is an example of a special limiting result known as the Central Limit
Theorem.

Theorem 7.3.2  Central Limit Theorem (CLT)  If X_1, ..., X_n is a random sample from a distribution
with mean μ and variance σ^2 < ∞, then the limiting distribution of

    Z_n = (Σ_{i=1}^n X_i − nμ)/(√n σ)                             (7.3.8)

is the standard normal, Z_n →d Z ~ N(0, 1) as n → ∞.


Proof
This limiting result holds for random samples from any distribution with finite
mean and variance, but the proof will be outlined under the stronger assumption
that the MGF of the distribution exists. The proof can be modified for the more
general case by using a more general concept called a characteristic function,
which we will not consider here.

Let m(t) denote the MGF of X − μ, m(t) = M_{X−μ}(t), and note that m(0) = 1,
m′(0) = E(X − μ) = 0, and m″(0) = E(X − μ)^2 = σ^2. Expanding m(t) by the Taylor
series formula about 0 gives, for some ξ between 0 and t,

    m(t) = m(0) + m′(0)t + m″(ξ)t^2/2
         = 1 + m″(ξ)t^2/2
         = 1 + σ^2 t^2/2 + [m″(ξ) − σ^2]t^2/2                     (7.3.9)

by adding and subtracting σ^2 t^2/2. Now we may write

    M_n(t) = E exp[t Σ_{i=1}^n (X_i − μ)/(√n σ)]
           = [m(t/(√n σ))]^n
           = [1 + t^2/(2n) + (m″(ξ) − σ^2)t^2/(2nσ^2)]^n,   |ξ| < |t|/(√n σ)

As n → ∞, t/(√n σ) → 0, ξ → 0, and m″(ξ) − σ^2 → 0, so

    M_n(t) = [1 + t^2/(2n) + d(n)/n]^n                            (7.3.10)

where d(n) → 0 as n → ∞. It follows that

    lim_{n→∞} M_n(t) = e^{t^2/2}                                  (7.3.11)

or

    lim_{n→∞} F_{Z_n}(z) = Φ(z)                                   (7.3.12)

which means that Z_n →d Z ~ N(0, 1).

Note that the variable in limit (7.3.8) also can be related to the sample mean,

    Z_n = (X̄_n − μ)/(σ/√n) = √n(X̄_n − μ)/σ                      (7.3.13)

The major application of the CLT is to provide an approximate distribution in
cases where the exact distribution is unknown or intractable.
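The CLT is easy to illustrate by simulation. A Python sketch using UNIF(0, 1) summands, which anticipates Example 7.3.4 (the function name is ours):

```python
import random
import statistics

def standardized_mean(n, rng):
    """Z_n = sqrt(n) * (Xbar_n - mu) / sigma for a UNIF(0, 1) sample,
    where mu = 1/2 and sigma^2 = 1/12."""
    mu, sigma = 0.5, (1.0 / 12.0) ** 0.5
    xbar = sum(rng.random() for _ in range(n)) / n
    return n ** 0.5 * (xbar - mu) / sigma

rng = random.Random(1)
zs = [standardized_mean(30, rng) for _ in range(20000)]
# Sample mean near 0, sample standard deviation near 1, and roughly 95%
# of the Z_n values inside (-1.96, 1.96), as N(0, 1) predicts.
print(round(statistics.mean(zs), 3), round(statistics.stdev(zs), 3))
print(sum(abs(z) < 1.96 for z in zs) / len(zs))
```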

Example 7.3.4  Let X_1, ..., X_n be a random sample from a uniform distribution, X_i ~ UNIF(0, 1),
and let Y_n = Σ_{i=1}^n X_i. Because E(X_i) = 1/2 and Var(X_i) = 1/12, we have the
approximation

    Y_n ≈ N(n/2, n/12)

For example, if n = 12, then approximately

    Y_12 − 6 ~ N(0, 1)

This approximation is so close that it often is used to simulate standard normal
random numbers in computer applications. Of course, this requires 12 uniform
random numbers to be generated to obtain one normal random number.
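A sketch of this classical generator in Python (the function name approx_std_normal is ours):

```python
import random
import statistics

def approx_std_normal(rng):
    """Y_12 - 6: the sum of 12 UNIF(0, 1) variates has mean 6 and variance 1,
    so subtracting 6 gives an approximately N(0, 1) variate."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(0)
zs = [approx_std_normal(rng) for _ in range(50000)]
print(round(statistics.mean(zs), 3), round(statistics.variance(zs), 3))
```

One limitation worth noting: the generated values can never fall outside (−6, 6), so the extreme tails are truncated.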

7.4
APPROXIMATIONS FOR THE BINOMIAL DISTRIBUTION

Examples 7.3.1 through 7.3.3 demonstrated that various limiting distributions
apply, depending on how the sequence of binomial variables is standardized and
also on assumptions about the behavior of p as n → ∞.
Example 7.3.1 suggests that for a binomial variable Y_n ~ BIN(n, p), if n is large
and p is small, then approximately Y_n ~ POI(np). This was discussed in a different
context, and an illustration was given in Example 3.2.9 of Chapter 3.
Example 7.3.3 considered a fixed value of p, and a suitably standardized
sequence was found to have a standard normal limiting distribution, suggesting a normal
approximation. In particular, it suggests that for large n and fixed p, approximately
Y_n ~ N(np, npq). This approximation works best when p is close to 0.5,
because the binomial distribution is symmetric when p = 0.5. The accuracy


required in any approximation depends on the application. One guideline is to
use the normal approximation when np ≥ 5 and nq ≥ 5, but again this would
depend on the accuracy required.

Example 7.4.1  The probability that a basketball player hits a shot is p = 0.5. If he takes 20
shots, what is the probability that he hits at least nine? The exact probability is

    P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8]
                = 1 − Σ_{y=0}^{8} C(20, y)(0.5)^y(0.5)^{20−y} = 0.7483

A normal approximation is

    P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8]
                ≈ 1 − Φ((8 − 10)/√5)
                = 1 − Φ(−0.89) = 0.8133
Because the binomial distribution is discrete and the normal distribution is
continuous, the approximation can be improved by making a continuity correction.
In particular, each binomial probability b(y; n, p) has the same value as the
area of a rectangle of height b(y; n, p) and with the interval [y − 0.5, y + 0.5] as
its base, because the length of the base is one unit. The area of this rectangle can
be approximated by the area under the pdf of Y ~ N(np, npq), which corresponds
to fitting a normal distribution with the same mean and variance as
BIN(n, p). This is illustrated for the case of n = 20, p = 0.5, and y = 7 in
Figure 7.3, where the exact probability is b(7; 20, 0.5) = C(20, 7)(0.5)^7(0.5)^13
= 0.0739. The approximation, which is the shaded area in the figure, is

FIGURE 7.3  Continuity correction for normal approximation of a binomial probability, b(7; 20, 0.5) = 0.0739


    Φ((7.5 − 10)/√5) − Φ((6.5 − 10)/√5) = Φ(−1.12) − Φ(−1.57) = 0.0732

The same idea can be used with other binomial probabilities, such as

    P[Y_20 ≥ 9] = 1 − P[Y_20 ≤ 8]
                ≈ 1 − Φ((8.5 − 10)/√5)
                = 1 − Φ(−0.67)
                = 0.7486

which is much closer to the exact value than without the continuity correction.
The situation is shown in Figure 7.4.
In general, if Y_n ~ BIN(n, p) and a ≤ b are integers, then

    P[a ≤ Y_n ≤ b] ≈ Φ((b + 0.5 − np)/√(npq)) − Φ((a − 0.5 − np)/√(npq))   (7.4.1)

Continuity corrections also are useful with other discrete distributions that can
be approximated by the normal distribution.
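Equation (7.4.1) and the exact binomial computations above can be reproduced with standard-library tools only, obtaining Φ from math.erf (function names are ours):

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_normal_approx(a, b, n, p):
    """P[a <= Y_n <= b] by equation (7.4.1), with continuity correction."""
    mu, sd = n * p, math.sqrt(n * p * (1.0 - p))
    return Phi((b + 0.5 - mu) / sd) - Phi((a - 0.5 - mu) / sd)

# P[Y_20 >= 9] = P[9 <= Y_20 <= 20] with n = 20, p = 0.5.
exact = 1.0 - sum(math.comb(20, y) for y in range(9)) / 2 ** 20
print(round(exact, 4), round(binom_normal_approx(9, 20, 20, 0.5), 4))
```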

FIGURE 7.4 The normal approximation for a binomial distribution


Example 7.4.2  Suppose that Y_n ~ POI(n), where n is a positive integer. From the results
of Chapter 6, we know that Y_n has the same distribution as a sum Σ_{i=1}^n X_i,
where X_1, ..., X_n are independent, X_i ~ POI(1). According to the CLT,
Z_n = (Y_n − n)/√n →d Z ~ N(0, 1), which suggests the approximation Y_n ≈ N(n, n)
for large n. For example, with n = 20, we desire to find P[10 ≤ Y_20 ≤ 30]. The exact
value is Σ_{y=10}^{30} e^{−20}(20)^y/y! = 0.982, and the approximate value is

    Φ((30.5 − 20)/√20) − Φ((9.5 − 20)/√20) = Φ(2.35) − Φ(−2.35) = 0.981

which is quite close to the exact value.
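Both values can be reproduced in a few lines. A Python sketch (Φ again via math.erf):

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 20
# Exact Poisson(20) probability of the interval [10, 30].
exact = sum(math.exp(-n) * n ** y / math.factorial(y) for y in range(10, 31))
# Normal approximation N(n, n) with continuity correction.
approx = Phi((30.5 - n) / math.sqrt(n)) - Phi((9.5 - n) / math.sqrt(n))
print(round(exact, 3), round(approx, 3))
```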

7.5
ASYMPTOTIC NORMAL DISTRIBUTIONS

From the CLT it follows that when the sample mean is standardized according to
equation (7.3.13), the corresponding sequence Z_n →d Z ~ N(0, 1).
It would not be unreasonable to consider the distribution of the sample mean
X̄_n as approximately N(μ, σ^2/n) for large n. This is an example of a more general
notion.

Definition 7.5.1
If Y_1, Y_2, ... is a sequence of random variables and m and c are constants such that

    Z_n = √n(Y_n − m)/c →d Z ~ N(0, 1)                            (7.5.1)

as n → ∞, then Y_n is said to have an asymptotic normal distribution with asymptotic
mean m and asymptotic variance c^2/n.

Example 7.5.1  Consider the random sample of Example 4.6.3, which involved n = 40 lifetimes of
electrical parts, X_i ~ EXP(100). By the CLT, X̄_n has an asymptotic normal distribution
with mean m = 100 and variance c^2/n = (100)^2/40 = 250.


ASYMPTOTIC DISTRIBUTION OF CENTRAL ORDER STATISTICS

In Section 7.2 we showed several examples that involved extreme order statistics,
such as the largest and smallest, with limiting distributions that were not normal.
Under certain conditions, it is possible to show that "central" order statistics are
asymptotically normal.

Theorem 7.5.1  Let X_1, ..., X_n be a random sample from a continuous distribution with a pdf
f(x) that is continuous and nonzero at the pth percentile, x_p, for 0 < p < 1. If
k/n → p (with k − np bounded), then the sequence of kth order statistics, X_{k:n}, is
asymptotically normal with mean x_p and variance c^2/n, where

    c^2 = p(1 − p)/[f(x_p)]^2                                     (7.5.2)

Example 7.5.2  Let X_1, ..., X_n be a random sample from an exponential distribution,
X_i ~ EXP(1), so that f(x) = e^{−x} and F(x) = 1 − e^{−x}; x > 0. For odd n, let
k = (n + 1)/2, so that Y_k = X_{k:n} is the sample median. If p = 0.5, then the median
is x_{0.5} = −ln(0.5) = ln 2 and

    c^2 = 0.5(1 − 0.5)/[f(ln 2)]^2 = 0.25/(0.5)^2 = 1

Thus, X_{k:n} is asymptotically normal with asymptotic mean x_{0.5} = ln 2 and
asymptotic variance c^2/n = 1/n.
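This asymptotic behavior of the exponential sample median is easy to check by simulation. A Python sketch (function name ours; 5000 replications at n = 201):

```python
import math
import random
import statistics

def exp_sample_median(n, rng):
    """Sample median (n odd) of n EXP(1) observations."""
    xs = sorted(rng.expovariate(1.0) for _ in range(n))
    return xs[n // 2]

rng = random.Random(7)
n = 201
meds = [exp_sample_median(n, rng) for _ in range(5000)]
# Theory: asymptotic mean ln 2 ~ 0.693 and asymptotic variance 1/n,
# so n times the sample variance of the medians should be near 1.
print(round(statistics.mean(meds), 3), round(n * statistics.variance(meds), 2))
```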

Example 7.5.3  Suppose that X_1, ..., X_n is a random sample from a uniform distribution,
X_i ~ UNIF(0, 1), so that f(x) = 1 and F(x) = x; 0 < x < 1. Also assume that n is
odd and k = (n + 1)/2, so that Y_k = X_{k:n} is the middle order statistic or sample
median. Formula (6.5.3) gives the pdf of Y_k, which has a special form because
k − 1 = n − k = (n − 1)/2 in this example. The pdf is

    g_k(y) = n!/{[((n − 1)/2)!]^2} [y(1 − y)]^{(n−1)/2},   0 < y < 1   (7.5.3)

According to the theorem, with p = 0.5, the pth percentile is x_{0.5} = 0.5 and
c^2 = 0.5(1 − 0.5)/[1]^2 = 0.25, so that Z_n = √n(Y_k − 0.5)/0.5 →d Z ~ N(0, 1).
Actually, this is strongly suggested by the pdf (7.5.3) after the transformation
z = √n(y − 0.5)/0.5, which has inverse transformation y = 0.5 + 0.5z/√n and
Jacobian J = 0.5/√n. The resulting pdf is

    f_n(z) = [n!(0.5)^n / ({[(n − 1)/2]!}^2 √n)] (1 − z^2/n)^{(n−1)/2},   |z| < √n   (7.5.4)


It follows from limit (7.2.7), and the fact that (1 − z^2/n)^{−1/2} → 1, that

    lim_{n→∞} (1 − z^2/n)^{(n−1)/2} = e^{−z^2/2}

and it is also possible to show that the constant in (7.5.4) approaches 1/√(2π) as
n → ∞.
Thus, in the example, the sequence of pdfs corresponding to Z_n converges to a
standard normal pdf. It is not obvious that this will imply that the CDFs also
converge, but this can be proved. However, we will not pursue this point.

7.6
PROPERTIES OF STOCHASTIC CONVERGENCE

We encountered several examples in which a sequence of random variables converged
stochastically to a constant. For instance, in Example 7.3.2 we discovered
that the sample proportion converges stochastically to the population proportion.
Clearly, this is a useful general concept for evaluating estimators of
unknown population parameters, and it would be reasonable to require that a
good estimator should have the property that it converges stochastically to the
parameter value as the sample size approaches infinity.
The following theorem, stated without proof, provides an alternate criterion for
showing stochastic convergence.

Theorem 7.6.1  The sequence Y_1, Y_2, ... converges stochastically to c if and only if for every
ε > 0,

    lim_{n→∞} P[|Y_n − c| < ε] = 1                                (7.6.1)

A sequence of random variables that satisfies condition (7.6.1) is also said to
converge in probability to the constant c, denoted by Y_n →p c. The notion of convergence
in probability will be discussed in a more general context in the next
section.

Example 7.6.1  Example 7.3.2 verified the so-called Bernoulli Law of Large Numbers with the
MGF approach. It also can be verified with the previous theorem and the Chebychev
inequality. Specifically, the mean and variance of p̂_n are E(p̂_n) = p and
Var(p̂_n) = pq/n, so that

    P[|p̂_n − p| < ε] ≥ 1 − pq/(nε^2)                             (7.6.2)

for any ε > 0, so lim_{n→∞} P[|p̂_n − p| < ε] = 1.
This same approach can be used to prove a more general result, usually
referred to as the Law of Large Numbers (LLN).

Theorem 7.6.2  If X_1, ..., X_n is a random sample from a distribution with finite mean μ and
variance σ^2, then the sequence of sample means converges in probability to μ,
X̄_n →p μ.

Proof
This follows from the fact that E(X̄_n) = μ, Var(X̄_n) = σ^2/n, and thus

    P[|X̄_n − μ| < ε] ≥ 1 − σ^2/(nε^2)                            (7.6.3)

so that lim_{n→∞} P[|X̄_n − μ| < ε] = 1.

These results further illustrate that the sample mean provides a good estimate
of the population mean, in the sense that the probability approaches 1 that X̄_n is
arbitrarily close to μ as n → ∞.
Actually, the right side of inequality (7.6.3) provides additional information.
Namely, for any ε > 0 and 0 < δ < 1, if n > σ^2/(δε^2), then

    P[μ − ε < X̄_n < μ + ε] ≥ 1 − δ
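This inequality can be read as a conservative sample-size rule. A Python sketch (the function name chebyshev_n is ours):

```python
import math

def chebyshev_n(sigma2, eps, delta):
    """A sample size n >= sigma^2/(delta * eps^2), large enough that the
    Chebyshev bound gives P[|Xbar_n - mu| < eps] >= 1 - delta."""
    return math.ceil(sigma2 / (delta * eps ** 2))

# With sigma^2 = 1, to be within eps = 0.1 of mu with probability >= 0.95:
n = chebyshev_n(1.0, 0.1, 0.05)
print(n)
```

The rule is conservative: it makes no distributional assumption, so it typically demands far more observations than a CLT-based calculation would.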
The following theorem, which is stated without proof, asserts that a sequence
of asymptotically normal variables converges in probability to the asymptotic
mean.

Theorem 7.6.3  If Z_n = √n(Y_n − m)/c →d Z ~ N(0, 1), then Y_n →p m.

Example 7.6.2  We found in Examples 7.5.2 and 7.5.3 that the sample median X_{k:n} is asymptotically
normal with asymptotic mean x_{0.5}, the distribution median. It follows
from the theorem that X_{k:n} →p x_{0.5} as n → ∞, with k/n → 0.5.
Similarly, under the conditions of Theorem 7.5.1, it follows that if k/n → p,
then the kth smallest order statistic converges stochastically to the pth percentile,
X_{k:n} →p x_p.


7.7
ADDITIONAL LIMIT THEOREMS

Definition 7.7.1
Convergence in Probability  The sequence of random variables Y_n is said to converge
in probability to Y, written Y_n →p Y, if

    lim_{n→∞} P[|Y_n − Y| < ε] = 1                                (7.7.1)

It follows from equation (7.6.1) that stochastic convergence is equivalent to
convergence in probability to the constant c, and for the most part we will
restrict attention to this special case. Note that convergence in probability is a
stronger property than convergence in distribution. This should not be surprising,
because convergence in distribution does not impose any requirement on
the joint distribution of Y_n and Y, whereas convergence in probability does. The
following theorem is stated without proof.

Theorem 7.7.1  For a sequence of random variables, if

    Y_n →p Y

then

    Y_n →d Y

For the special case Y = c, the limiting distribution is the degenerate distribution
P[Y = c] = 1. This was the condition we initially used to define stochastic
convergence.

Theorem 7.7.2  If Y_n →p c, then for any function g(y) that is continuous at c,

    g(Y_n) →p g(c)

Proof
Because g(y) is continuous at c, it follows that for every ε > 0 a δ > 0 exists such
that |y − c| < δ implies |g(y) − g(c)| < ε. This, in turn, implies that

    P[|g(Y_n) − g(c)| < ε] ≥ P[|Y_n − c| < δ]

because P(B) ≥ P(A) whenever A ⊂ B. But because Y_n →p c, it follows for every
ε > 0 that

    lim_{n→∞} P[|g(Y_n) − g(c)| < ε] ≥ lim_{n→∞} P[|Y_n − c| < δ] = 1

The left-hand limit cannot exceed 1, so it must equal 1, and g(Y_n) →p g(c).

Theorem 7.7.2 is also valid if Y_n and c are k-dimensional vectors. Thus this
theorem is very useful, and examples of the types of results that follow are listed
in the next theorem.

Theorem 7.7.3  If X_n and Y_n are two sequences of random variables such that X_n →p c and
Y_n →p d, then:

1. aX_n + bY_n →p ac + bd.
2. X_nY_n →p cd.
3. X_n/c →p 1, for c ≠ 0.
4. 1/X_n →p 1/c if P[X_n ≠ 0] = 1 for all n, c ≠ 0.
5. √X_n →p √c if P[X_n ≥ 0] = 1 for all n.

Example 7.7.1  Suppose that Y_n ~ BIN(n, p). We know that p̂_n = Y_n/n →p p. Thus it follows that
p̂_n(1 − p̂_n) →p p(1 − p).

The following theorem is helpful in determining limits in distribution.

Theorem 7.7.4  Slutsky's Theorem  If X_n and Y_n are two sequences of random variables such
that X_n →p c and Y_n →d Y, then:

1. X_n + Y_n →d c + Y.
2. X_nY_n →d cY.
3. Y_n/X_n →d Y/c; c ≠ 0.


Note that as a special case, X_n could be an ordinary numerical sequence such
as X_n = n/(n − 1).

Example 7.7.2  Consider a random sample of size n from a Bernoulli distribution,
X_i ~ BIN(1, p). We know that

    (p̂_n − p)/√(p(1 − p)/n) →d Z ~ N(0, 1)

We also know that p̂_n(1 − p̂_n) →p p(1 − p), so dividing by [p̂_n(1 − p̂_n)/(p(1 − p))]^{1/2} gives

    (p̂_n − p)/√(p̂_n(1 − p̂_n)/n) →d Z ~ N(0, 1)                  (7.7.2)

Theorem 7.7.2 also may be generalized.

Theorem 7.7.5  If Y_n →d Y, then for any continuous function g(y), g(Y_n) →d g(Y).
Note that g(y) is assumed not to depend on n.

Theorem 7.7.6  If √n(Y_n − m)/c →d Z ~ N(0, 1), and if g(y) has a nonzero derivative at y = m,
g′(m) ≠ 0, then

    √n[g(Y_n) − g(m)]/|cg′(m)| →d Z ~ N(0, 1)

Proof
Define u(y) = [g(y) − g(m)]/(y − m) − g′(m) if y ≠ m, and let u(m) = 0. It follows
that u(y) is continuous at m with u(m) = 0, and thus g′(m) + u(Y_n) →p g′(m). Furthermore,

    √n[g(Y_n) − g(m)]/[cg′(m)] = [√n(Y_n − m)/c] · [g′(m) + u(Y_n)]/g′(m)

From Theorem 7.7.3, we have [g′(m) + u(Y_n)]/g′(m) →p 1, and the result follows
from Theorem 7.7.4.


According to our earlier interpretation of an asymptotic normal distribution,
we conclude that for large n, if Y_n ≈ N(m, c^2/n), then approximately

    g(Y_n) ≈ N(g(m), c^2[g′(m)]^2/n)                              (7.7.3)

Note the similarities between this result and the approximate mean and
variance formulas given in Section 2.4.

Example 7.7.3  The Central Limit Theorem says that the sample mean is asymptotically normally
distributed,

    √n(X̄_n − μ)/σ →d Z ~ N(0, 1)

or, approximately for large n,

    X̄_n ≈ N(μ, σ^2/n)

We now know from Theorem 7.7.6 that differentiable functions of X̄_n also will be
asymptotically normally distributed. For example, if g(X̄_n) = X̄_n^2, then g′(μ) = 2μ,
and approximately,

    X̄_n^2 ≈ N(μ^2, 4μ^2σ^2/n)
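The claim for g(X̄_n) = X̄_n^2 can be checked by simulation. A Python sketch with hypothetical parameter values μ = 3, σ = 2, n = 400 (our choices):

```python
import random
import statistics

# Delta-method check: with X_i ~ N(mu, sigma^2), the squared sample mean
# Xbar_n^2 should be approximately N(mu^2, 4 * mu^2 * sigma^2 / n).
mu, sigma, n = 3.0, 2.0, 400
rng = random.Random(3)

def xbar_squared(rng):
    xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    return xbar ** 2

vals = [xbar_squared(rng) for _ in range(4000)]
# Theory: mean near mu^2 = 9, variance near 4 * mu^2 * sigma^2 / n = 0.36.
print(round(statistics.mean(vals), 3), round(statistics.variance(vals), 4))
```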

7.8*
ASYMPTOTIC DISTRIBUTIONS OF EXTREME ORDER STATISTICS

* Advanced (or optional) topic

As noted in Section 7.5, the central order statistics, X_{k:n}, are asymptotically normally
distributed as n → ∞ and k/n → p. If extreme order statistics such as X_{1:n},
X_{2:n}, and X_{n:n} are standardized so that they have a nondegenerate limiting distribution,
this limiting distribution will not be normal. Examples of such limiting
distributions were given earlier. It can be shown that the nondegenerate limiting
distribution of an extreme variable must belong to one of three possible types of
distributions. Thus, these three types of distributions are useful when studying
extremes, analogous to the way the normal distribution is useful when studying
means through the Central Limit Theorem.
For example, in studying floods, the variable of interest may be the maximum
flood stage during the year. This variable may behave approximately like the


maximum of a large number of independent flood levels attained through the
year. Thus, one of the three limiting types may provide a good model for this
variable. Similarly, the strength of a chain is equal to that of its weakest link, or
the strength of a ceramic may be the strength at its weakest flaw, where the
number of flaws, n, may be quite large. Also, the lifetime of a system of independent
and identically distributed components connected in series is equal to the
minimum lifetime of the components. Again, one of the limiting distributions may
provide a good approximation for the lifetime of the system, even though the
distribution of the lifetimes of the individual components may not be known.
Similarly, the lifetime of a system of components connected in parallel is equal to
the maximum lifetime of the components.
The following theorems, which are stated without proof, are useful in studying
the asymptotic behavior of extreme order statistics.

Theorem 7.8.1  If the limit of a sequence of CDFs is a continuous CDF, F(y) = lim_{n→∞} F_n(y), then
for any a > 0 and b,

    lim_{n→∞} F_n(a_n y + b_n) = F(ay + b)                        (7.8.1)

if and only if lim_{n→∞} a_n = a > 0 and lim_{n→∞} b_n = b.

Theorem 7.8.2  If the limit of a sequence of CDFs is a continuous CDF, and if
lim_{n→∞} F_n(a_n y + b_n) = G(y) for a_n > 0 and all real y, then lim_{n→∞} F_n(α_n y + β_n) = G(y)
for α_n > 0, if and only if α_n/a_n → 1 and (β_n − b_n)/a_n → 0 as n → ∞.

LIMITING DISTRIBUTIONS OF MAXIMUMS

Let X_{1:n}, ..., X_{n:n} denote an ordered random sample of size n from a distribution
with CDF F(x). In the context of extreme-value theory, the maximum X_{n:n} is said
to have a (nondegenerate) limiting distribution G(y) if there exist sequences of
standardizing constants {a_n} and {b_n} with a_n > 0 such that the standardized
variable, Y_n = (X_{n:n} − b_n)/a_n, converges in distribution to G(y),

    Y_n = (X_{n:n} − b_n)/a_n →d Y ~ G(y)                         (7.8.2)

That is, if we say that X_{n:n} has a limiting distribution of type G, we will mean that
the limiting distribution of the standardized variable Y_n is a nondegenerate distribution
G(y). As suggested by Theorems 7.8.1 and 7.8.2, if G(y) is continuous,
the sequence of standardizing constants will not be unique; however, it is not


possible to obtain a limiting distribution of a different type by changing the standardizing
constants.
Recall that the exact distribution of X_{n:n} is given by

    F_{n:n}(x) = [F(x)]^n                                         (7.8.3)

If we consider Y_n = (X_{n:n} − b_n)/a_n, then the exact distribution of Y_n is

    G_n(y) = P[Y_n ≤ y] = F_{n:n}(a_n y + b_n)
           = [F(a_n y + b_n)]^n                                   (7.8.4)

Thus, the limiting distribution of X_{n:n} (or more correctly, Y_n) is given by

    G(y) = lim_{n→∞} G_n(y) = lim_{n→∞} [F(a_n y + b_n)]^n       (7.8.5)

Thus, equation (7.8.5) provides a direct approach for determining a limiting
extreme-value distribution, if sequences {a_n} and {b_n} can be found that result in
a nondegenerate limit.
Recall from Example 7.2.6 that if X ~ EXP(1), then we may let a_n = 1 and
b_n = ln n. Thus,

    G_n(y) = [F(y + ln n)]^n = [1 − (1/n)e^{−y}]^n                (7.8.6)

and thus,

    G(y) = lim_{n→∞} [1 − (1/n)e^{−y}]^n = exp(−e^{−y})          (7.8.7)

The three possible types of limiting distributions are provided in the following
theorem, which is stated without proof.

Theorem 7.8.3  If Y_n = (X_{n:n} − b_n)/a_n has a limiting distribution G(y), then G(y) must be one of the
following three types of extreme-value distributions:

Type I (for maximums) (Exponential type)

    G^(1)(y) = exp(−e^{−y}),   −∞ < y < ∞                         (7.8.8)

Type II (for maximums) (Cauchy type)

    G^(2)(y) = exp(−y^{−γ}),   y > 0, γ > 0                       (7.8.9)

and zero if y ≤ 0.

Type III (for maximums) (Limited type)

    G^(3)(y) = exp[−(−y)^γ],   y < 0, γ > 0                       (7.8.10)

and one if y ≥ 0.

The limiting distribution of the maximum from densities such as the normal,
lognormal, logistic, and gamma distributions is a Type I extreme-value distribu-

PDF compression, OCR, web optimization using a watermarked evaluation copy of CVISION PDFCompressor
7.8 ASYMFTOTIC DISTRIBUTIONS OF EXTREME ORDER STATISTICS 253

tion. Generally speaking, such densities have tails no thicker than the exponential
distribution. This class includes a large number of the most common distribu-
tions, and the Type I extreme-value distribution (for maximum) should provide a
useful model for many types of variables related to maximums. Of course, a loca-
tion parameter and a scale parameter would need to be introduced into the
model when applied directly to the nonstandardized variable X_{n:n}.
The Type II limiting distribution results for maximums from densities with
thicker tails, such as the Cauchy distribution. The Type III case may arise from
densities with finite upper limits on the range of the variables.
The following theorem provides an alternative form to equation (7.8.5), which is sometimes more convenient for carrying out the limit.

Theorem 7.8.4 (Gnedenko) In determining the limiting distribution of Y_n = (X_{n:n} − b_n)/a_n,

lim_{n→∞} G_n(y) = lim_{n→∞} [F(a_n y + b_n)]^n = G(y)  (7.8.11)

if and only if

lim_{n→∞} n[1 − F(a_n y + b_n)] = −ln G(y)  (7.8.12)

In many cases the greatest difficulty involves determining suitable standardizing sequences so that a nondegenerate limiting distribution will result. For a given CDF, F(x), it is possible to use Theorem 7.8.4 to solve for a_n and b_n in terms of F(x) for each of the three possible types of limiting distributions. Thus, if the limiting type for F(x) is known, then a_n and b_n can be computed. If the type is not known, then a_n and b_n can be computed for each type and then applied to see which type works out. One property of a CDF that is useful in expressing the standardizing constants is its "characteristic largest value."

Definition 7.8.1
The characteristic largest value, u_n, of a CDF F(x) is defined by the equation

n[1 − F(u_n)] = 1  (7.8.13)

For a random sample of size n from F(x), the expected number of observations that will exceed u_n is 1. The probability that one observation will exceed u_n is

p = P[X > u_n] = 1 − F(u_n)

and the expected number for n independent observations is

np = n[1 − F(u_n)]
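For instance (a sketch using the EXP(θ) distribution treated in the examples below, with θ = 2 chosen arbitrarily), u_n has a closed form, and the defining property n[1 − F(u_n)] = 1 is easy to verify:

```python
import math

theta = 2.0   # illustrative scale parameter

def F(x):
    # CDF of EXP(theta)
    return 1.0 - math.exp(-x / theta)

def u(n):
    # Characteristic largest value: solve n[1 - F(u_n)] = 1,
    # i.e. n e^{-u/theta} = 1, giving u_n = theta * ln n
    return theta * math.log(n)

# Expected number of exceedances of u_n in a sample of size n is 1
for n in (10, 100, 1000):
    print(n, n * (1.0 - F(u(n))))
```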


Theorem 7.8.5 Let X ~ F(x), and assume that Y_n = (X_{n:n} − b_n)/a_n has a limiting distribution.

1. If F(x) is continuous and strictly increasing, then the limiting distribution of Y_n is of exponential type if and only if

lim_{n→∞} n[1 − F(a_n y + b_n)] = e^{−y}   −∞ < y < ∞  (7.8.14)

where b_n = u_n and a_n is the solution of

F(a_n + u_n) = 1 − (ne)^{−1}

2. G(y) is of Cauchy type if and only if

lim_{y→∞} [1 − F(y)]/[1 − F(ky)] = k^γ   k > 0, γ > 0  (7.8.15)

and in this case, a_n = u_n and b_n = 0.

3. G(y) is of limited type if and only if

lim_{y→0⁻} [1 − F(ky + x_0)]/[1 − F(y + x_0)] = k^γ   k > 0  (7.8.16)

where x_0 = max {x | F(x) < 1}, the upper limit of x. Also b_n = x_0 and a_n = x_0 − u_n.

Example 7.8.1 Suppose again that X ~ EXP(θ), and we are interested in the maximum of a random sample of size n. The characteristic largest value u_n is obtained from

n[1 − F(u_n)] = n[1 − (1 − e^{−u_n/θ})] = ne^{−u_n/θ} = 1

which gives

u_n = θ ln n

We happen to know that the exponential density falls in the Type I case, so we will try that case first. We have b_n = u_n = θ ln n, and a_n is determined from

F(a_n + u_n) = 1 − e^{−(a_n + u_n)/θ} = 1 − 1/(ne)

which gives a_n = θ. Thus, if the exponential density is in the Type I case, we know that

(X_{n:n} − θ ln n)/θ  →d  Y ~ G^(1)(y)  (7.8.17)


This is verified easily by using condition 1 of Theorem 7.8.5, because

lim_{n→∞} n[1 − F(a_n y + b_n)] = lim_{n→∞} n e^{−(θy + θ ln n)/θ}
= lim_{n→∞} n e^{−y}/n
= e^{−y}   −∞ < y < ∞
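A simulation sketch of this example (illustrative parameter values, not from the text): standardized maxima (X_{n:n} − θ ln n)/θ from EXP(θ) samples should follow the Type I CDF exp(−e^{−y}) closely.

```python
import math
import random

random.seed(1)
theta, n, reps = 2.0, 200, 4000

# Simulate standardized maxima (X_{n:n} - theta*ln n)/theta
ys = []
for _ in range(reps):
    m = max(random.expovariate(1.0 / theta) for _ in range(n))
    ys.append((m - theta * math.log(n)) / theta)

# Empirical CDF at y = 0.5 versus the Type I limit
y0 = 0.5
emp = sum(y <= y0 for y in ys) / reps
lim = math.exp(-math.exp(-y0))
print(emp, lim)
```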
Example 7.8.2 The density for the CDF F(x) = 1 − x^{−θ}, x ≥ 1, has a thick upper tail, so one would expect the limiting distribution to be of Cauchy type. If we check the Cauchy-type condition given in Theorem 7.8.5, we find

lim_{y→∞} [1 − F(y)]/[1 − F(ky)] = lim_{y→∞} y^{−θ}/(ky)^{−θ} = k^θ  (7.8.18)

so the limiting distribution is of Cauchy type with γ = θ. Also, we have n[1 − F(u_n)] = n u_n^{−θ} = 1, which gives u_n = n^{1/θ} = a_n, and we let b_n = 0 in this case. Thus we know that

Y_n = X_{n:n}/n^{1/θ}  →d  Y ~ G^(2)(y)  (7.8.19)

Now that we know how to standardize the variable, we also can verify this result directly by Theorem 7.8.4. We have

lim_{n→∞} n[1 − F(a_n y + b_n)] = lim_{n→∞} n(n^{1/θ} y)^{−θ} = y^{−θ} = −ln G(y)  (7.8.20)

so G(y) = exp(−y^{−θ}), which is the Cauchy type with γ = θ.
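A similar simulation sketch for this example (θ = 3 chosen arbitrarily): sampling from F(x) = 1 − x^{−θ} by the inverse-CDF method, the ratio X_{n:n}/n^{1/θ} should follow the Type II limit exp(−y^{−θ}).

```python
import math
import random

random.seed(2)
theta, n, reps = 3.0, 500, 4000

a_n = n ** (1.0 / theta)   # standardizing constant a_n = n^{1/theta}
ys = []
for _ in range(reps):
    # Inverse-CDF sampling from F(x) = 1 - x^{-theta}: X = (1 - U)^{-1/theta}
    m = max((1.0 - random.random()) ** (-1.0 / theta) for _ in range(n))
    ys.append(m / a_n)

# Empirical CDF at y = 1.5 versus the Type II limit with gamma = theta
y0 = 1.5
emp = sum(y <= y0 for y in ys) / reps
lim = math.exp(-y0 ** (-theta))
print(emp, lim)
```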

Example 7.8.3 For X ~ UNIF(0, 1), where F(x) = x, 0 < x < 1, we should expect a Type III limiting distribution. We have

n[1 − F(u_n)] = n(1 − u_n) = 1

which gives u_n = 1 − 1/n. Thus, b_n = x_0 = 1 and a_n = x_0 − u_n = 1/n. Checking condition 3 of Theorem 7.8.5,

lim_{y→0⁻} [1 − F(ky + x_0)]/[1 − F(y + x_0)] = lim_{y→0⁻} [1 − (ky + 1)]/[1 − (y + 1)] = lim_{y→0⁻} (−ky)/(−y) = k

so the limiting distribution of Y_n = n(X_{n:n} − 1) is Type III with γ = 1. Again, if we look directly at Theorem 7.8.4 to further illustrate, we have

lim_{n→∞} n[1 − F(a_n y + b_n)] = lim_{n→∞} n[1 − (y/n + 1)] = −y = −ln G(y)  (7.8.21)


and Y_n = n(X_{n:n} − 1)  →d  Y ~ G(y), where

G(y) = G^(3)(y) = e^y   y < 0
                = 1     0 ≤ y
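The uniform case is also easy to simulate (an illustrative sketch, not from the text): n(X_{n:n} − 1) should follow G(y) = e^y for y < 0.

```python
import math
import random

random.seed(3)
n, reps = 400, 4000

# Standardized maxima n(X_{n:n} - 1) from UNIF(0,1) samples; always negative
ys = [n * (max(random.random() for _ in range(n)) - 1.0) for _ in range(reps)]

# Empirical CDF at y = -1 versus the Type III limit with gamma = 1
y0 = -1.0
emp = sum(y <= y0 for y in ys) / reps
lim = math.exp(y0)   # G(y) = e^y for y < 0
print(emp, lim)
```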

LIMITING DISTRIBUTIONS OF MINIMUMS

If a nondegenerate limiting distribution exists for the minimum of a random sample, then it also will be one of three possible types. Indeed, the distribution of a minimum can be related to the distribution of a maximum, because

min (x_1, ..., x_n) = −max (−x_1, ..., −x_n)  (7.8.22)

Thus, all the results on maximums can be modified to apply to minimums if the details can be sorted out.
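Identity (7.8.22) is a pointwise fact about any sample, which a two-line check illustrates (the normal sample here is an arbitrary choice):

```python
import random

random.seed(4)
xs = [random.gauss(0.0, 1.0) for _ in range(1000)]

# min(x_1,...,x_n) = -max(-x_1,...,-x_n), equation (7.8.22);
# negation is exact in floating point, so equality holds exactly
assert min(xs) == -max(-x for x in xs)
print(min(xs))
```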
Let X be continuous, X ~ F(x), and let Z = −X, so that F_Z(z) = 1 − F(−z). Note also that X_{1:n} = −Z_{n:n}. Now consider W_n = (X_{1:n} + b_n)/a_n. We have

G_{W_n}(w) = P[(X_{1:n} + b_n)/a_n ≤ w]
= P[(−Z_{n:n} + b_n)/a_n ≤ w]
= P[(Z_{n:n} − b_n)/a_n ≥ −w]
= 1 − G_{Y_n}(−w)

The limiting distribution of W_n, say H(w), then is given by

H(w) = lim_{n→∞} G_{W_n}(w) = lim_{n→∞} [1 − G_{Y_n}(−w)] = 1 − G(−w)

where G(y) now denotes the limiting distribution of Y_n = (Z_{n:n} − b_n)/a_n. Thus, to find H(w), the limiting distribution for a minimum, the first step is to determine

F_Z(z) = 1 − F(−z)

then determine a_n, b_n, and the limiting distribution G(y) by the methods described for maximums as applied to F_Z(z). Then the limiting distribution for W_n is

H(w) = 1 − G(−w)  (7.8.23)


Note that if F(x) belongs to one limiting type, it is possible that F_Z(z) will belong to a different type. For example, maximums from EXP(θ) have a Type I limiting distribution, whereas F_Z(z) in this case has a Type III limiting distribution, so the limiting distribution of the minimum will be a transformed Type III distribution.
In summary, a straightforward procedure for determining a_n, b_n, and H(w) is first to find F_Z(z) and apply the methods for maximums to determine G(y) for Y_n = (Z_{n:n} − b_n)/a_n, and then to use equation (7.8.23) to obtain H(w). It also is possible to express the results directly in terms of the original distribution F(x).

Definition 7.8.2
The smallest characteristic value is the value s_n defined by

nF(s_n) = 1  (7.8.24)

It follows from equation (7.8.22) that s_n(x) = −u_n(z). Similarly, the condition F_Z(a_n + u_n(z)) = 1 − 1/(ne) becomes F(s_n − a_n) = 1/(ne), and so on.

Theorem 7.8.6 If W_n = (X_{1:n} + b_n)/a_n has a limiting distribution H(w), then H(w) must be one of the following three types of extreme-value distributions:

1. Type I (for minimums) (Exponential type)
In this case, b_n = −s_n, W_n = (X_{1:n} − s_n)/a_n, a_n is defined by

F(s_n − a_n) = 1/(ne)

and

H^(1)(w) = 1 − G^(1)(−w) = 1 − exp(−e^w)   −∞ < w < ∞

if and only if lim_{n→∞} nF(a_n y + s_n) = e^y.

2. Type II (for minimums) (Cauchy type)
In this case, a_n = −s_n, b_n = 0, W_n = −X_{1:n}/s_n, and

H^(2)(w) = 1 − G^(2)(−w) = 1 − exp[−(−w)^{−γ}]   w < 0, γ > 0

if and only if

lim_{y→−∞} F(y)/F(ky) = k^γ   k > 0, γ > 0

or

lim_{n→∞} nF(s_n y) = y^{−γ}   y > 0

3. Type III (for minimums) (Limited type)
If x_1 = min {x | F(x) > 0} denotes the lower limit for x (that is, x_1(x) = −x_0(z)), then

b_n = −x_1   a_n = s_n − x_1

and

H^(3)(w) = 1 − G^(3)(−w) = 1 − exp(−w^γ)   w > 0, γ > 0

if and only if

lim_{y→0⁺} F(ky + x_1)/F(y + x_1) = k^γ

or

lim_{n→∞} nF[(x_1 − s_n)y + x_1] = (−y)^γ   y < 0

Note that the Type I distribution for minimums is known as the Type I extreme-value distribution. Also, the Type III distribution for minimums is a Weibull distribution. Recall that the limiting distribution for maximums is Type I for many of the common densities. In determining the type of limiting distribution of the minimum, it is necessary to consider the thickness of the right-hand tail of F_Z(z), where Z = −X. Thus the limiting distribution of the minimum for some of these common densities, such as the exponential and gamma, belongs to Type III. This may be one reason that the Weibull distribution often is encountered in applications.

Example 7.8.4 We now consider the minimum of a random sample of size n from EXP(θ). We already know in this case that X_{1:n} ~ EXP(θ/n), and so nX_{1:n}/θ ~ EXP(1). Thus the limiting distribution of nX_{1:n}/θ is also EXP(1), which is the Type III case with γ = 1. If we did not know the answer, then we would guess that the limiting distribution was Type III, because the range of the variable Z = −X is limited on the right. Checking condition 3 in Theorem 7.8.6, we have x_1 = 0 and

lim_{y→0⁺} F(ky + x_1)/F(y + x_1) = lim_{y→0⁺} [1 − exp(−ky/θ)]/[1 − exp(−y/θ)] = lim_{y→0⁺} [k exp(−ky/θ)]/[exp(−y/θ)] = k

Thus, we know that H(w) = 1 − e^{−w}, where

W_n = (X_{1:n} − x_1)/(s_n − x_1) = X_{1:n}/s_n ≈ nX_{1:n}/θ

since s_n = −θ ln (1 − 1/n) ≈ θ/n.
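A final simulation sketch for this example (illustrative parameter values): minima of EXP(θ) samples, standardized as nX_{1:n}/θ, should follow H(w) = 1 − e^{−w}, and here the agreement is exact in distribution since nX_{1:n}/θ ~ EXP(1) for every n.

```python
import math
import random

random.seed(5)
theta, n, reps = 2.0, 100, 4000

# Standardized minima n*X_{1:n}/theta from EXP(theta) samples
ws = []
for _ in range(reps):
    m = min(random.expovariate(1.0 / theta) for _ in range(n))
    ws.append(n * m / theta)

# Empirical CDF at w = 1 versus H(w) = 1 - e^{-w} (Type III, gamma = 1)
w0 = 1.0
emp = sum(w <= w0 for w in ws) / reps
lim = 1.0 - math.exp(-w0)
print(emp, lim)
```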

