Generating Functions


GENERATING FUNCTION METHODS

1. Generating Functions

Ultimately, the subject of probability theory is about the business of calculating probabilities and expectations, either exactly or approximately. One of the more useful and mysterious tools for doing such calculations is the use of generating functions. Generating functions provide a bridge between the discrete world of combinatorics and the continuous world of function theory and analysis.

Definition 1. The generating function of a sequence {a_n}_{n≥0} of real (or complex) numbers is the function A(z) defined by the power series

(1) A(z) = ∑_{n=0}^∞ a_n z^n.

Observe that for an arbitrary sequence {a_n} the series (1) need not converge for all complex values of the argument z. In fact, for some sequences the series (1) diverges for every z except z = 0: this is the case, for instance, if a_n = n^n. But for many sequences of interest, there will exist a positive number R such that, for all complex numbers z with |z| < R, the series (1) converges absolutely. In such cases, the generating function A(z) is said to have positive radius of convergence. The generating functions in all of the problems considered in these notes will have positive radius of convergence. Notice that if the entries of the sequence {a_n} are probabilities, that is, if 0 ≤ a_n ≤ 1 for all n, then the series (1) converges absolutely for all z such that |z| < 1.

If the generating function A(z) has positive radius of convergence then, at least in principle, all information about the sequence {a_n} is encapsulated in the generating function A(z). In particular, each coefficient a_n can be recovered from the function A(z), since n! a_n is the nth derivative of A(z) at z = 0. Other information may also be recovered from the generating function: for example, if the sequence {a_n} is a discrete probability density, then its mean may be obtained by evaluating A′(z) at z = 1, and all of the higher moments may be recovered from the higher derivatives of A(z) at z = 1.

A crucially important property of generating functions is the multiplication law: the generating function of the convolution of two sequences is the product of their generating functions. This is the basis of most uses of generating functions in random walk theory, and of all of the examples considered below. The calculation that establishes this property is spelled out in section 3. Note that for probability generating functions, this fact is a consequence of the multiplication law for expectations of independent random variables: if X and Y are independent, nonnegative-integer valued random variables, then

(2) E z^{X+Y} = (E z^X)(E z^Y).
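The multiplication law is easy to check numerically. The following sketch (an added illustration, not part of the original notes; the two densities are arbitrary choices) convolves two probability densities on the nonnegative integers and verifies that the generating function of the convolution is the product of the generating functions.

```python
import numpy as np

def convolve(a, b):
    """Convolution c_n = sum_k a_k b_{n-k} of two coefficient sequences."""
    c = np.zeros(len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# X ~ Bernoulli(1/2) on {0,1}; Y ~ uniform on {0,1,2}; c is the density of X + Y.
a = np.array([0.5, 0.5])
b = np.array([1/3, 1/3, 1/3])
c = convolve(a, b)

z = 0.7                                    # any point with |z| < 1
A = sum(p * z**n for n, p in enumerate(a))
B = sum(p * z**n for n, p in enumerate(b))
C = sum(p * z**n for n, p in enumerate(c))
print(np.isclose(C, A * B))                # True: E z^(X+Y) = (E z^X)(E z^Y)
```

Note that convolving coefficient sequences is exactly polynomial multiplication, which is one way to remember the multiplication law.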

2. A First Passage Time for Simple Random Walk

One of the principal reasons that generating functions are so useful in combinatorial problems is that recursive relationships among sequences imply functional equations relating their generating functions. In some cases, the functional equations can be solved, giving explicit formulas for the generating functions. In other cases, the functional equations cannot be solved explicitly, but nevertheless may be exploited to obtain useful information about the sequence(s). In this section we shall discuss a simple but important example where interesting information can be extracted from a functional equation for a generating function.

Simple Random Walk: The dynamical behavior of simple random walk has been of interest to mathematicians and physicists for the past century. As some of you will soon learn, it is also of central importance in mathematical finance theory, where it is used in the tree method for numerical approximations to options prices. Because certain financial contracts are barrier-activated (that is, they take effect if and when an asset [e.g., stock] price reaches a certain level), first-passage times (such as the random variable discussed below) are of particular interest.

Simple random walk on the integers is the Markov chain that evolves as follows: given that the current state is x ∈ Z, the next state is x ± 1, with probability 1/2 for each of the two possibilities. More formally, let X_1, X_2, . . . be independent, identically distributed Rademacher random variables, that is, random variables whose distribution is given by

(3) P{X_i = +1} = P{X_i = −1} = 1/2,

and for each n = 0, 1, 2, . . . , let S_n = X_1 + X_2 + · · · + X_n. The sequence {S_n} is called simple random walk with initial point S_0 = 0. The one-sided first-passage time τ(x) = τ_x is the integer-valued random variable

(4) τ(x) = min{n : S_n = x},

that is, the first time that the discrepancy between the number of Heads and the number of Tails reaches x. We shall see that τ(x) < ∞ with probability one. Abbreviate τ = τ(1). Note that τ is a stopping time for the simple random walk (see Ross for the definition of a stopping time).

Probability Generating Function of τ: To study the distribution of the random variable τ, we shall obtain an explicit algebraic expression for its probability generating function

(5) F(z) := E z^τ = ∑_{n=1}^∞ z^n P{τ = n}.

(Note: the domain of the argument z is the set of complex numbers with |z| < 1. Since we do not yet know that τ < ∞ with probability 1, we must interpret z^τ as 0 on the event that τ = ∞.)

The key to the analysis is the Markov property. Consider the first step of the random walk: either S_1 = +1, in which case τ = 1, or S_1 = −1. On the event that S_1 = −1, the random walk must first return to 0 before it can reach the level +1. But the amount of time it takes to reach 0 starting from −1 has the same distribution as τ; and upon reaching 0, the amount of additional time to reach +1 again has the same distribution as τ, and, by the Markov property, is conditionally independent of the time taken to get from −1 to 0. Therefore,

(6) F(z) = z/2 + (z/2) E z^{τ′+τ″},

where τ′, τ″ are independent random variables each with the same distribution as τ. Because the probability generating function of a sum of independent random variables is the product of their p.g.f.s, it follows that

(7) F(z) = (z + z F(z)²)/2.

Since this is a quadratic equation in F(z), the quadratic formula implies that F(z) = (1 ± √(1 − z²))/z. But which is it: + or −? For this, observe that F(z) must take values between 0 and 1 when 0 < z < 1. It is a routine calculus exercise to show that only one of the two possibilities has this property, and so

(8) F(z) = (1 − √(1 − z²))/z.

Consequences: First, observe that F is continuous at z = 1 and lim_{z→1−} F(z) = 1, where the limit is taken from the left, through real values of z approaching 1. Consequently, P{τ < ∞} = 1, because by the Monotone Convergence Theorem,

1 = lim_{z→1−} F(z) = lim_{z→1−} ∑_{n=1}^∞ z^n P{τ = n} = ∑_{n=1}^∞ P{τ = n} = P{τ < ∞}.

This has an interesting corollary: the simple random walk must, with probability one, visit every integer infinitely many times (in the language of Markov chain theory, the process is recurrent). The reasoning is as follows: Since τ < ∞ almost surely, the random walk must at some time visit the state x = 1. But upon reaching the level 1, the process starts afresh, that is, the subsequent steps are, conditional on the steps taken up until time τ, independent coin tosses; thus, at some future time the random walk must, with probability one, visit the state 2. Similarly, it must visit every x = 3, 4, . . . . By the same reasoning, it must, at some time after first reaching the level 1, return to the initial state 0. But this implies that there must be an infinite chain of visits to 1, 0, 1, 0, . . . . Since all states communicate, it follows that all states are recurrent, and so must be visited infinitely often.

Second, note that F(z) is not differentiable at z = 1; therefore, Eτ = ∞. This, of course, could be proved without an explicit formula for the generating function F(z) by using Wald's identity: if it were the case that Eτ < ∞, then Wald would imply that 1 = E S_τ = (Eτ)(E X_1) = 0, a contradiction. The fact that Eτ = ∞ implies that the simple random walk is null recurrent: after the first step, the random walk is at ±1; in either case, the number of additional steps needed to return to the initial state 0 has the same distribution as τ, and so the expected return time to 0 is infinite.

Third, observe that the explicit formula (8) allows us to write an explicit expression for the discrete density of τ. According to Newton's binomial formula,

(9) √(1 − z²) = ∑_{n=0}^∞ \binom{1/2}{n} (−z²)^n,

and so, after a small bit of unpleasant algebra, we obtain

(10) P{τ = 2n − 1} = (−1)^{n−1} \binom{1/2}{n}.

Problem 1. (a) Verify that P{τ = 2n − 1} = 2^{−2n+1} (2n − 1)^{−1} \binom{2n−1}{n}. Observe that this implies that P{τ = 2n − 1} = P{S_{2n−1} = 1}/(2n − 1). (b) If possible, find a combinatorial derivation of this identity that uses neither formula (10) nor generating functions. (c) Show that P{τ = 2n − 1} ∼ C/n^{3/2} for some constant C, and identify C.

Remark 1. Problem 1c asserts that the density of τ obeys a power law with exponent 3/2. Those of you who attended J. Milton's colloquium several weeks ago will recall that such a power law arose in his experimental data.
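Formula (10), in the equivalent form given in Problem 1(a), can be checked by direct simulation. The following sketch is an added illustration, not part of the original notes; the truncation max_steps is a simulation convenience, needed because Eτ = ∞ (as shown above).

```python
import random
from math import comb

def sample_tau(rng, max_steps=100_000):
    """One sample of tau = min{n : S_n = 1}, truncated at max_steps."""
    s = 0
    for n in range(1, max_steps + 1):
        s += rng.choice((1, -1))
        if s == 1:
            return n
    return None                      # truncated: walk had not yet reached 1

rng = random.Random(0)
samples = [sample_tau(rng) for _ in range(20_000)]

for n in (1, 2, 3, 4):
    exact = comb(2*n - 1, n) / ((2*n - 1) * 2**(2*n - 1))      # Problem 1(a)
    empirical = sum(t == 2*n - 1 for t in samples) / len(samples)
    print(f"P(tau = {2*n - 1}): exact {exact:.5f}  empirical {empirical:.5f}")
```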

Problem 2. (a) Show that the generating function F(z) given by equation (8) satisfies the relation

(11) 1 − F(z) ∼ √2 · √(1 − z) as z → 1−.

(b) The random variable τ(m) = min{n : S_n = m} is the sum of m independent copies of τ = τ(1), and so its probability generating function is the mth power of F(z). Use this fact and the result of part (a) to show that for every real number λ > 0,

(12) lim_{m→∞} E exp{−λ τ(m)/m²} = e^{−√(2λ)}.

Remark 2. The function ψ(λ) = exp{−√(2λ)} is the Laplace transform of a probability density called the one-sided stable law of exponent 1/2. You will hear more about this density when we talk about Brownian motion later in the course. The result of Problem 2b, together with the continuity theorem for Laplace transforms, implies that the rescaled random variables τ(m)/m² converge in distribution to the one-sided stable law of exponent 1/2.
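Since the probability generating function of τ(m) is F(z)^m, the expectation in (12) is exactly F(e^{−λ/m²})^m, so the limit can be checked numerically without any simulation. A minimal sketch (added illustration):

```python
from math import sqrt, exp

def F(z):
    """Probability generating function of tau, equation (8)."""
    return (1 - sqrt(1 - z*z)) / z

lam = 1.0
for m in (10, 100, 1000, 10000):
    val = F(exp(-lam / m**2)) ** m     # E exp(-lam * tau(m) / m^2), by Problem 2(b)
    print(m, val)
print("limit:", exp(-sqrt(2*lam)))     # e^{-sqrt(2 lam)} ~ 0.2431
```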

3. Linear Recursions and Renewal Theory

3.1. The Renewal Equation. Let {f_k}_{k≥1} be a probability distribution on the positive integers with finite mean μ = ∑_k k f_k, and let X_1, X_2, . . . be a sequence of independent, identically distributed random variables with common discrete distribution {f_k}. Define the renewal sequence associated to the distribution {f_k} to be the sequence

(13) u_m = P{S_n = m for some n ≥ 0} = ∑_{n=0}^∞ P{S_n = m},

where S_n = X_1 + X_2 + · · · + X_n (and S_0 = 0). Note that u_0 = 1.

A simple linear recursion for the sequence u_m, called the renewal equation, may be obtained by conditioning on the first step S_1 = X_1:

(14) u_m = ∑_{k=1}^m f_k u_{m−k} + δ_0(m),

where δ_0 is the Kronecker delta (1 at 0, and 0 elsewhere). The renewal equation is a particularly simple kind of recursive relation: the right side is just the convolution of the sequences f_k and u_m. The appearance of a convolution should always suggest the use of some kind of generating function or transform (Fourier or Laplace), because these always convert convolution to multiplication. Let's try it: define generating functions

(15) U(z) = ∑_{m=0}^∞ u_m z^m,
(16) F(z) = ∑_{k=1}^∞ f_k z^k.

Observe that if you multiply the renewal equation (14) by z^m and sum over m, then the left side becomes U(z), so

(17) U(z) = 1 + ∑_{m=0}^∞ ∑_{k=1}^m f_k u_{m−k} z^m
          = 1 + ∑_{k=1}^∞ f_k z^k ∑_{m=k}^∞ u_{m−k} z^{m−k}
          = 1 + F(z) U(z).

Thus, we have a simple functional equation relating the generating functions U(z) and F(z). It may be solved for U(z) in terms of F(z):

(18) U(z) = 1/(1 − F(z)).
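The functional equation (18) is easy to test numerically. The following sketch (an added illustration; the step distribution f_1 = f_2 = 1/2 anticipates the example in section 3.2 below) computes u_m from the recursion (14) and compares the truncated series U(z) against 1/(1 − F(z)) at a sample point.

```python
import numpy as np

def renewal_sequence(f, n_terms):
    """u_0, ..., u_{n_terms-1} via the renewal equation (14); f[k] = f_k, f[0] = 0."""
    u = np.zeros(n_terms)
    u[0] = 1.0
    for m in range(1, n_terms):
        u[m] = sum(f[k] * u[m - k] for k in range(1, min(m, len(f) - 1) + 1))
    return u

f = np.array([0.0, 0.5, 0.5])         # step distribution f_1 = f_2 = 1/2
u = renewal_sequence(f, 60)

z = 0.5                                # test point inside the radius of convergence
U = sum(um * z**m for m, um in enumerate(u))     # truncated U(z)
F = sum(fk * z**k for k, fk in enumerate(f))     # F(z)
print(U, 1 / (1 - F))                  # agree up to the ~z^60 truncation error
```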

3.2. Partial Fraction Decompositions. Formula (18) tells us how the generating function of the renewal sequence is related to the probability generating function of the steps X_j. Extracting useful information from this relation is, in general, a difficult analytical task. However, in the special case where the probability distribution {f_k} has finite support, the method of partial fraction decomposition provides an effective method for recovering the terms u_m of the renewal sequence. Observe that when the probability distribution {f_k} has finite support, its generating function F(z) is a polynomial, and so in this case the generating function U(z) is a rational function.[1]

The strategy behind the method of partial fraction decomposition rests on the fact that a simple pole may be expanded as a geometric series: in particular, for |z| < 1,

(19) (1 − z)^{−1} = ∑_{n=0}^∞ z^n.

Differentiating with respect to z repeatedly gives a formula for a pole of order k + 1:

(20) (1 − z)^{−k−1} = ∑_{n=k}^∞ \binom{n}{k} z^{n−k}.

Suppose now that we could write the generating function U(z) as a sum of poles C/(1 − (z/ζ))^{k+1} (such a sum is called a partial fraction decomposition). Then each of the poles could be expanded in a series of type (19) or (20), and so the coefficients of U(z) could be obtained by adding the corresponding coefficients in the series expansions for the poles.

[1] A rational function is the ratio of two polynomials.

Example: Consider the probability distribution f_1 = f_2 = 1/2 (a two-sided die, if you can envision such a thing). The generating function F is given by F(z) = (z + z²)/2. The problem is to obtain a partial fraction decomposition for (1 − F(z))^{−1}. To do this, observe that at every pole z = ζ the function 1 − F(z) must take the value 0. Thus, we look for potential poles at the zeros of the polynomial 1 − F(z). In the case under consideration, the polynomial is quadratic, with roots ζ_1 = 1 and ζ_2 = −2. Since each of these is a simple root, both poles should be simple; thus, we should try

1/(1 − (z + z²)/2) = C_1/(1 − z) + C_2/(1 + (z/2)).

The values of C_1 and C_2 can be gotten either by adding the fractions and seeing what works, or by differentiating both sides and seeing what happens at each of the two poles. The upshot is that C_1 = 2/3 and C_2 = 1/3. Thus,

(21) U(z) = 1/(1 − F(z)) = (2/3)/(1 − z) + (1/3)/(1 + (z/2)).
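As a sanity check on the algebra, the residues can also be extracted numerically, by evaluating (1 − z)U(z) and (1 + z/2)U(z) near the two poles (an added illustration, not part of the original notes):

```python
def U(z):
    """U(z) = 1/(1 - F(z)) for the two-sided die, F(z) = (z + z^2)/2."""
    return 1 / (1 - (z + z*z) / 2)

eps = 1e-7
C1 = eps * U(1 - eps)                    # (1 - z) U(z) near the pole z = 1
C2 = (1 + (-2 + eps) / 2) * U(-2 + eps)  # (1 + z/2) U(z) near the pole z = -2
print(C1, C2)                            # ~ 2/3 and 1/3
```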

We can now read off the renewal sequence u_m by expanding the two poles in geometric series:

(22) u_m = 2/3 + (1/3)(−2)^{−m}.

There are several things worth noting. First, the renewal sequence u_m has limit 2/3. This equals 1/μ, where μ = 3/2 is the mean of the distribution {f_k}. We should be reassured by this, because it is what the Feller–Erdős–Pollard Renewal Theorem predicts the limit should be. Second, the remainder term (1/3)(−2)^{−m} decays exponentially in m. As we shall see, this is always the case for distributions {f_k} with finite support. It is not always the case for arbitrary distributions {f_k}, however.

Problem 3. The Fibonacci sequence 1, 1, 2, 3, 5, 8, . . . is the sequence a_n such that a_1 = a_2 = 1 and such that a_{m+2} = a_m + a_{m+1}. (A) Find a functional equation for the generating function of the Fibonacci sequence. (B) Use the method of partial fractions to deduce a formula for the terms of the Fibonacci sequence. Note: your answer should involve the so-called golden ratio (the larger root of the equation x² − x − 1 = 0).

3.3. Step Distributions with Finite Support. Assume now that the step distribution {f_k} has finite support, is nontrivial (that is, does not assign probability 1 to a single point), and is nonarithmetic (that is, it does not give probability 1 to a proper arithmetic progression). Then the generating function F(z) = ∑ f_k z^k is a polynomial of degree at least two. By the Fundamental Theorem of Algebra, 1 − F(z) may be written as a product of linear factors:

(23) 1 − F(z) = C ∏_{j=1}^K (1 − z/ζ_j).

Lemma 1. If the step distribution {f_k} is nontrivial, nonarithmetic, and has finite support, then the polynomial 1 − F(z) has a simple root at ζ_1 = 1, and all other roots ζ_j satisfy the inequality |ζ_j| > 1.

Proof: It is clear that ζ_1 = 1 is a root, since F(z) is a probability generating function. To see that ζ_1 = 1 is a simple root (that is, occurs only once in the product (23)), note that if it were a multiple root then it would have to be a root of the derivative F′(z) (since the factor (1 − z) would occur at least twice in the product (23)). If this were the case, then F′(1) = 0 would be the mean of the probability distribution {f_k}. But since this distribution has support contained in {1, 2, 3, . . .}, its mean is at least 1.

In order that ζ be a root of 1 − F(z), it must be the case that F(ζ) = 1. Since F(z) is a probability generating function, this can only happen if |ζ| ≥ 1. Thus, to complete the proof we must show that there are no roots of modulus one other than ζ = 1. Suppose, then, that ζ = e^{iθ} is such that F(ζ) = 1, equivalently, ∑ f_k e^{ikθ} = 1. Then for every k such that f_k > 0 it must be that e^{ikθ} = 1. This implies that θ is an integer multiple of 2π/k, and that this is true for every k such that f_k > 0. Since the distribution {f_k} is nonarithmetic, the greatest common divisor of the integers k such that f_k > 0 is 1. Hence, θ is an integer multiple of 2π, and so ζ = 1.

Corollary 2. If the step distribution {f_k} is nontrivial, nonarithmetic, and has finite support, then

(24) 1/(1 − F(z)) = 1/(μ(1 − z)) + ∑_{r=1}^R C_r/(1 − z/ζ_r)^{k_r},

where μ is the mean of the distribution {f_k} and the poles ζ_r are all of modulus strictly greater than 1.

Proof: The only thing that remains to be proved is that the simple pole at 1 has residue 1/μ. To see this, write the simple pole term as C/(1 − z), with C to be determined, and multiply both sides of equation (24) by 1 − z:

(1 − z)/(1 − F(z)) = C + (1 − z) ∑_r C_r/(1 − z/ζ_r)^{k_r}.

Now take the limit of both sides as z → 1: the limit of the right side is clearly C, and the limit of the left side is 1/μ, because μ is the derivative of F(z) at z = 1. Hence, C = 1/μ.

Corollary 3. If the step distribution {f_k} is nontrivial, nonarithmetic, and has finite support, then

(25) lim_{m→∞} u_m = 1/μ,

and the remainder decays exponentially as m → ∞.

Remark. The last corollary is a special case of the Feller–Erdős–Pollard Renewal Theorem. This theorem asserts that (25) is true under the weaker hypothesis that the step distribution is nontrivial and nonarithmetic. Various proofs of the Feller–Erdős–Pollard theorem are known, some of which exploit the relation (18). Partial fraction decomposition does not work in the general case, though.
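Lemma 1 and Corollary 3 can be seen concretely for a specific finite-support distribution. The sketch below (an added illustration; the step distribution on {1, 2, 3} is an arbitrary choice) locates the roots of 1 − F(z) numerically and exhibits the exponential decay of u_m − 1/μ.

```python
import numpy as np

f = np.array([0.0, 0.2, 0.5, 0.3])     # a step distribution on {1,2,3}; mean mu = 2.1
mu = sum(k * fk for k, fk in enumerate(f))

# Roots of the polynomial 1 - F(z); numpy.roots wants coefficients in decreasing degree.
coeffs = -f[::-1]
coeffs[-1] += 1.0                      # add the constant term 1
print(np.roots(coeffs))                # a simple root at 1; all others of modulus > 1

u = np.zeros(40); u[0] = 1.0           # renewal sequence via the recursion (14)
for m in range(1, 40):
    u[m] = sum(f[k] * u[m - k] for k in range(1, min(m, 3) + 1))
print([abs(u[m] - 1/mu) for m in (10, 20, 30)])   # exponential decay to 0
```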

3.4. Delayed Renewal Processes. In many applications of renewal theory the increments X_1, X_2, . . . are the times between successive occurrences (renewals) in some random process. In such applications it is sometimes desirable to incorporate a random initial delay S_0 ≥ 0 until the first renewal. Assume that this initial delay is independent of all subsequent interoccurrence times X_1, X_2, . . . , and has distribution {g_k}_{k≥0}, possibly different from the interoccurrence time distribution {f_k}. Set

(26) S̃_n = S_0 + X_1 + X_2 + · · · + X_n = S_0 + S_n,

where S_n is the time of the nth occurrence in the undelayed renewal process. Define the delayed renewal sequence

(27) v_m = P{S̃_n = m for some n ≥ 0}.

Conditioning on the value of the initial delay S_0 leads to a relation between the delayed renewal sequence v_m and the (undelayed) renewal sequence u_m:

(28) v_m = ∑_{k=0}^m g_k u_{m−k}.

This is, once again, a simple convolution, and so the corresponding generating functions V(z) = ∑ v_m z^m and G(z) = ∑ g_m z^m are related by multiplication:

(29) V(z) = G(z) U(z) = G(z)/(1 − F(z)).

Equilibrium Delay Distribution: There is an interesting choice of the initial delay distribution {g_k}, the so-called equilibrium residual lifetime distribution or equilibrium delay distribution, that makes the delayed renewal sequence constant, that is, v_m = 1/μ for all m = 1, 2, . . . . The relation (29) makes it apparent what the generating function G(z) of such a distribution must be in order for this to occur: the right side of (29) must equal z/(μ(1 − z)), a simple pole at z = 1 with residue 1/μ, and so the only possible choice for G is

(30) G(z) = z(1 − F(z))/(μ(1 − z)).

Is this really the probability generating function of a probability distribution on the positive integers? There are several ways to verify this, but the most direct is to work from (30) to deduce what the density must be. Multiply both sides of (30) by μ(1 − z) to get

μ ∑_{k=1}^∞ g_k (z^k − z^{k+1}) = z ∑_{k=1}^∞ f_k (1 − z^k).

Now you can read off from the powers of z what the values g_k must be: in particular, it must be the case that for every k ≥ 1, μ(g_k − g_{k+1}) = f_k, and so, by telescoping,

(31) g_k = μ^{−1} ∑_{r=k}^∞ f_r for k = 1, 2, 3, . . . .

That this is a probability distribution on the positive integers follows from summation by parts:

∑_{k=1}^∞ g_k = μ^{−1} ∑_{k=1}^∞ ∑_{r=k}^∞ f_r = μ^{−1} ∑_{r=1}^∞ ∑_{k=1}^r f_r = μ^{−1} ∑_{r=1}^∞ r f_r = μ^{−1} · μ = 1.
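Here is a numerical check (an added illustration, reusing the same arbitrary step distribution on {1, 2, 3} as in the earlier sketch) that the equilibrium delay distribution (31) makes the delayed renewal sequence identically equal to 1/μ from m = 1 on:

```python
import numpy as np

f = np.array([0.0, 0.2, 0.5, 0.3])               # interoccurrence distribution; mu = 2.1
mu = sum(k * fk for k, fk in enumerate(f))

# Equilibrium delay distribution (31): g_k = (1/mu) * sum_{r >= k} f_r for k >= 1
g = np.zeros(len(f))
g[1:] = np.cumsum(f[::-1])[::-1][1:] / mu        # tail sums of f, divided by mu

u = np.zeros(30); u[0] = 1.0                     # undelayed renewal sequence, (14)
for m in range(1, 30):
    u[m] = sum(f[k] * u[m - k] for k in range(1, min(m, len(f) - 1) + 1))

# Delayed renewal sequence (28): v = g * u (convolution)
v = [sum(g[k] * u[m - k] for k in range(0, min(m, len(g) - 1) + 1)) for m in range(30)]
print(v[1:6], 1/mu)                              # v_m = 1/mu exactly for every m >= 1
```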
The Renewal Equation: The fundamental linear recursive relation (14) for the renewal sequence u_m is a special case of what is called the Renewal Equation. The Renewal Equation is a linear convolution equation relating two sequences a_m and b_m, as follows:

(32) a_m = b_m + ∑_{k=1}^m f_k a_{m−k}.

Let

(33) A(z) = ∑_{m=0}^∞ a_m z^m and B(z) = ∑_{m=0}^∞ b_m z^m

be the generating functions of the two sequences; then by essentially the same calculation as was used to obtain the functional equation (18),

(34) A(z) = B(z)/(1 − F(z)) = B(z) U(z).

The second equality shows that the sequence a_m may be recovered by convolving the sequence b_m with the renewal sequence u_m:

(35) a_m = ∑_{k=0}^m b_k u_{m−k}.
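The representation (35) can be verified directly: the following sketch (an added illustration; the sequence b_m is an arbitrary summable choice) solves the renewal equation (32) by forward recursion and compares the result with the convolution b ∗ u.

```python
import numpy as np

f = np.array([0.0, 0.2, 0.5, 0.3])
b = np.array([1.0, 0.5, 0.25] + [0.0] * 27)      # an arbitrary summable sequence b_m

N = 30
u = np.zeros(N); u[0] = 1.0                      # renewal sequence, recursion (14)
a = np.zeros(N)                                  # solution of (32) by direct recursion
for m in range(N):
    if m > 0:
        u[m] = sum(f[k] * u[m - k] for k in range(1, min(m, 3) + 1))
    a[m] = b[m] + sum(f[k] * a[m - k] for k in range(1, min(m, 3) + 1))

conv = [sum(b[k] * u[m - k] for k in range(m + 1)) for m in range(N)]
print(np.allclose(a, conv))                      # True: a = b * u, as (35) asserts
```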

Problem 4. For each integer m ≥ 0, define the residual lifetime random variable Z_m to be the time remaining until the next renewal, that is, Z_m = S_{N(m)} − m, where N(m) = min{n : S_n > m}. Fix r ≥ 1, and define a_m = P{Z_m = r}. (A) Write a renewal equation for the sequence a_m. (B) Use the method of partial fraction decomposition to give an explicit formula for the distribution of Z_m when the interoccurrence time distribution is f_k = k/2^{k+1} for k = 1, 2, 3, . . . .

Problem 5. (A) Assume that the sequence b_m is absolutely summable, that is, ∑_m |b_m| < ∞. Using the Feller–Erdős–Pollard theorem (which asserts that lim_{m→∞} u_m = 1/μ), show that the sequence a_m defined by (35) satisfies

(36) lim_{m→∞} a_m = μ^{−1} ∑_{k=0}^∞ b_k.

(B) What does this relation imply about the sequence a_m = P{Z_m = r} discussed in Problem 4 above?

4. Galton-Watson Processes

Galton-Watson processes were introduced by Francis Galton in 1889 as a simple mathematical model for the propagation of family names. They were reinvented by Leo Szilard in the late 1930s as models for the proliferation of free neutrons in a nuclear fission reaction.[2] The extinction probability formulas that we shall derive below played a key role in the calculation of the critical mass of fissionable material needed for a sustained chain reaction. Galton-Watson processes continue to play a fundamental role in both the theory and applications of stochastic processes.

[2] This had important historical consequences, as it was Szilard who convinced Einstein to suggest to President Roosevelt that the U.S. begin the Manhattan project. The story is told in detail in Richard Rhodes' book The Making of the Atomic Bomb.

First, an informal description: A population of individuals (which may represent people, organisms, free neutrons, etc., depending on the context) evolves in discrete time n = 0, 1, 2, . . . according to the following rules. Each nth generation individual produces a random number (possibly 0) of individuals (called offspring) in the (n + 1)st generation. The offspring counts for distinct individuals are (conditionally) independent, independent of the offspring counts of individuals from earlier generations, and identically distributed, with common distribution {p_k}_{k≥0}. The state Z_n of the Galton-Watson process at time n is the number of individuals in the nth generation. More formally,

Definition 2. A Galton-Watson process {Z_n}_{n≥0} with offspring distribution {p_k}_{k≥0} is a discrete-time Markov chain taking values in the set Z_+ of nonnegative integers whose transition probabilities are as follows:

(37) P{Z_{n+1} = k | Z_n = m} = p_k^{*m}.

Here {p_k^{*m}} denotes the mth convolution power of the distribution {p_k}. In other words, the conditional distribution of Z_{n+1} given that Z_n = m is the distribution of the sum of m i.i.d. random variables each with distribution {p_k}. The default initial state is Z_0 = 1.

4.1. Recursive Structure and Generating Functions. The Galton-Watson process Z_n has a simple recursive structure that makes it amenable to analysis by generating function methods. Each of the first-generation individuals behaves independently of the others; moreover, all of its descendants (the offspring of the offspring, etc.) behave independently of the descendants of the other first-generation individuals. Thus, each of the first-generation individuals engenders an independent copy of the Galton-Watson process. It follows that a Galton-Watson process is gotten by conjoining to the single individual in the 0th generation Z_1 (conditionally) independent copies of the Galton-Watson process. Note the similarity with the recursive property (6); here the recursion involves a union rather than a sum. The recursive structure leads to a simple set of relations among the probability generating functions of the random variables Z_n:

Proposition 4. Denote by φ_n(t) = E t^{Z_n} the probability generating function of the random variable Z_n, and by φ(t) = ∑_{k=0}^∞ p_k t^k the probability generating function of the offspring distribution. Then φ_n is the n-fold composition of φ with itself, that is,

(38) φ_0(t) = t and
(39) φ_{n+1}(t) = φ(φ_n(t)) = φ_n(φ(t)) for all n ≥ 0.

Proof: There are two ways to proceed, both simple. The first uses the recursive structure directly to deduce that Z_{n+1} is the sum of Z_1 conditionally independent copies of Z_n. Thus,

φ_{n+1}(t) = E t^{Z_{n+1}} = E φ_n(t)^{Z_1} = φ(φ_n(t)).

The second argument relies on the fact that the generating function of the mth convolution power {p_k^{*m}} is the mth power of the generating function φ(t) of {p_k}. Thus,

φ_{n+1}(t) = E t^{Z_{n+1}} = ∑_{k=0}^∞ E(t^{Z_{n+1}} | Z_n = k) P(Z_n = k) = ∑_{k=0}^∞ φ(t)^k P(Z_n = k) = φ_n(φ(t)).

By induction on n, this is the (n + 1)st iterate of the function φ(t).

Problem 6. (A) Show that if the mean offspring number μ := ∑_k k p_k is finite, then the expected size of the nth generation is E Z_n = μ^n. (B) Show that if the variance σ² = ∑_k (k − μ)² p_k is finite, then the variance of Z_n is finite, and give a formula for it.

4.2. Extinction Probability. If for some n the population size Z_n = 0, then the population size is 0 in all subsequent generations. In such an event, the population is said to be extinct. The first time that the population size is 0 (formally, τ = min{n : Z_n = 0}, or τ = ∞ if there is no such n) is called the extinction time. The most obvious and natural question concerning the behavior of a Galton-Watson process is: What is the probability P{τ < ∞} of extinction?

Proposition 5. The probability of extinction is the smallest nonnegative root t = ζ of the equation

(40) φ(t) = t.

Proof: As in the problems considered in sections 2 and 3, the key idea is recursion. Consider what must happen in order for the event τ < ∞ of extinction to occur: either (a) the single individual alive at time 0 has no offspring; or (b) each of its offspring must engender a Galton-Watson process that reaches extinction. Possibility (a) occurs with probability p_0. Conditional on the event that Z_1 = k, possibility (b) occurs with probability ζ^k. Therefore,

ζ = p_0 + ∑_{k=1}^∞ p_k ζ^k = φ(ζ),

that is, the extinction probability ζ is a root of the Fixed-Point Equation (40).

There is an alternative proof that ζ = φ(ζ) that uses the iteration formula (39) for the probability generating function of Z_n. Observe that the probability of the event Z_n = 0 is easily recovered from the generating function φ_n(t):

P{Z_n = 0} = φ_n(0).

By the nature of the Galton-Watson process, these probabilities are nondecreasing in n, because if Z_n = 0 then Z_{n+1} = 0. Therefore, the limit ζ := lim_{n→∞} φ_n(0) exists, and its value is the extinction probability for the Galton-Watson process. The limit must be a root of the Fixed-Point Equation, because by the continuity of φ,

φ(ζ) = φ(lim_{n→∞} φ_n(0)) = lim_{n→∞} φ(φ_n(0)) = lim_{n→∞} φ_{n+1}(0) = ζ.

Finally, it remains to show that ζ is the smallest nonnegative root of the Fixed-Point Equation. This follows from the monotonicity of the probability generating functions φ_n: if ζ* is any nonnegative root, then since 0 ≤ ζ*, we have φ_n(0) ≤ φ_n(ζ*) = ζ*. Taking the limit of each side as n → ∞ reveals that ζ ≤ ζ*.
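The second proof suggests an algorithm for computing the extinction probability: simply iterate φ starting from t = 0. A minimal sketch (an added illustration; the three offspring distributions are arbitrary choices), anticipating the classification by mean offspring number established below:

```python
def extinction_probability(p, n_iter=200):
    """Iterate zeta = lim phi_n(0), per the second proof above; p[k] = p_k."""
    phi = lambda t: sum(pk * t**k for k, pk in enumerate(p))
    t = 0.0
    for _ in range(n_iter):
        t = phi(t)
    return t

# Subcritical, critical, and supercritical examples:
print(extinction_probability([0.5, 0.3, 0.2]))    # mu = 0.7 < 1  ->  1.0
print(extinction_probability([0.25, 0.5, 0.25]))  # mu = 1: limit is 1, but the
                                                  # convergence is only ~1 - c/n,
                                                  # so 200 iterations print ~0.98
print(extinction_probability([0.2, 0.3, 0.5]))    # mu = 1.3 > 1  ->  0.4, the
                                                  # smaller root of phi(t) = t
```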

It now behooves us to find out what we can about the roots of the Fixed-Point Equation (40). First, observe that there is always at least one nonnegative root, to wit, t = 1, because φ(1) = 1 for any probability generating function. If {p_k}_{k≥0} is the degenerate probability distribution that puts p_1 = 1 and p_k = 0 for all other k, then φ(t) ≡ t, and so every t is a root of (40). This is not a very interesting case, however, because the Galton-Watson process with this offspring distribution does not change in time: the population size is always 1. In every other case, the roots of (40) are isolated. In fact, the next proposition shows that there are at most two.

Proposition 6. Unless the offspring distribution is the degenerate distribution that puts mass 1 at k = 1, the Fixed-Point Equation (40) has either one or two roots. If the offspring distribution is nondegenerate (that is, if p_k < 1 for every k), then the Fixed-Point Equation has a root t = ζ < 1 smaller than one if and only if the mean number of offspring μ := ∑_k k p_k is greater than one.

Combining this result with Proposition 5, we find that for nondegenerate offspring distributions, extinction is certain (that is, has probability one) if and only if the mean offspring number μ is ≤ 1.

Proof: Consider first the case where the offspring distribution is degenerate, that is, p_k = 1 for some integer k ≥ 0 with k ≠ 1. There are two subcases: (a) If p_0 = 1, then the generating function φ(t) ≡ 1, and so the only root of (40) is t = 1. (b) If p_k = 1 for some integer k ≥ 2, then φ(t) = t^k, and so the only real roots of (40) are t = 0, 1. Observe that the behavior of the Galton-Watson process in either case is uninteresting, as no randomness is involved.

Assume now that the offspring distribution is nondegenerate. There are again two cases. (a) If p_0 + p_1 = 1, then φ(t) = p_0 + p_1 t is linear in t, with slope p_1 < 1, and so the equation (40) has only the single root t = 1. Note that the Galton-Watson process for this offspring distribution is only slightly more interesting than when the offspring distribution is degenerate: the population size remains at 1 for a geometrically distributed number of steps, then drops to 0. (b) In all other cases, the generating function φ(t) is genuinely nonlinear (it has at least one term p_k t^k with positive coefficient and k ≥ 2), and so its second derivative φ″(t) is strictly positive for all t > 0. Consequently, φ(t) is strictly convex, and so there can be no more than two roots of (40).

Next, consider the two possibilities for the mean offspring number μ. Observe that if the mean offspring number is finite, then μ = φ′(1). Since the second derivative φ″(t) > 0, the first derivative φ′(t) is strictly increasing. Consequently, if μ = φ′(1) ≤ 1, then φ′(t) < 1 for all t < 1, and so the existence of a root t = ζ < 1 of the Fixed-Point Equation is impossible, as this would contradict the Mean Value Theorem of calculus (if φ(1) = 1 and φ(ζ) = ζ, then the mean slope of φ(t) on the interval ζ < t < 1 would be 1). Finally, suppose that μ > 1 (or μ = ∞). If p_0 = 0, then the Fixed-Point Equation has roots t = 0, 1, and because φ(t) is strictly convex, there are no other positive roots. If p_0 > 0, then φ(0) = p_0 > 0. Since φ′(1) = μ > 1, it must be the case that φ(t) < t for values of t < 1 sufficiently near 1. Thus,

φ(0) − 0 > 0 and φ(t*) − t* < 0

for some 0 < t* < 1. By the Intermediate Value Theorem, there must exist ζ ∈ (0, t*) such that φ(ζ) − ζ = 0.

4.3. Problems.

Problem 7. Suppose that the offspring distribution is nondegenerate, with mean μ ≠ 1, and let ζ be the smallest positive root of the Fixed-Point Equation. (A) Show that if μ ≠ 1, then the root ζ is an attractive fixed point of φ, that is, φ′(ζ) < 1. (B) Prove that for a suitable positive constant C,

ζ − φ_n(0) ∼ C φ′(ζ)^n.

(Hence the term attractive fixed point.) Hint: All but the first few terms of the sequence φ_n(0) are in a small neighborhood of ζ, where the Taylor series φ(t) = ζ + φ′(ζ)(t − ζ) + · · · provides a good approximation to φ(t).

Problem 8. Suppose that the offspring distribution is nondegenerate, with mean μ = 1. This is called the critical case. Suppose also that the offspring distribution has finite variance σ². (A) Prove that for a suitable positive constant C,

1 − φ_n(0) ∼ C/n.

(B) Use the result of part (A) to conclude that the distribution of the extinction time τ has the following scaling property: for every x > 1,

lim_{n→∞} P(τ > nx | τ > n) = 1/x.

Hint for part (A): The Taylor series approximation to φ(t) at ζ = 1 leads to the following approximate relationship, valid for large n:

1 − φ_{n+1}(0) ≈ 1 − φ_n(0) − (1/2) φ″(1)(1 − φ_n(0))²,

which at first does not seem to help, but on further inspection does. The trick is to change variables: if x_n is a sequence of positive numbers that satisfies the recursion x_{n+1} = x_n − b x_n²,

then the sequence y_n := 1/x_n satisfies y_{n+1} = y_n + b + b/y_n + · · · .

Problem 9. There's a Galton-Watson process in my random walk! Let S_n be the simple nearest-neighbor random walk on the integers (cf. section 2), and let T be the time of first return to the origin, that is, the smallest n ≥ 1 such that S_n = 0. Define Z_0 = 1 and, for k ≥ 1,

Z_k = ∑_{n=0}^{T−1} 1{S_n = k and S_{n+1} = k + 1}.

In words, Z_k is the number of times that the random walk S_n crosses from k to k + 1 before first visiting 0. (A) Prove that the sequence {Z_k}_{k≥0} is a Galton-Watson process, and identify the offspring distribution. (B) Use the result of (A) and our fundamental theorem about extinction probabilities to give another proof that the simple symmetric nearest-neighbor random walk is recurrent. (C) Show that the total number of individuals ever born in the course of the Galton-Watson process is T/2, and show that the extinction time of the Galton-Watson process is the maximum displacement M from 0 attained by the random walk before its first return to the origin. What does the result of Problem 8, part (B), tell you about the distribution of M?
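The last question can be explored by simulation. In the sketch below (an added experiment, not part of the original problem set), excursions are started with a +1 first step, and the walk is stopped early at a cap since only small thresholds are tabulated; both choices are simulation conveniences. By the gambler's ruin formula, the empirical tail should come out near P{M ≥ m} = 1/m, matching the 1/x scaling in Problem 8(B).

```python
import random

CAP = 32   # stop early once the walk reaches this level (we only tabulate m < CAP)

def excursion_max(rng):
    """Max level reached by an SRW excursion started with a +1 step, capped at CAP."""
    s, m = 1, 1
    while s != 0 and s != CAP:
        s += rng.choice((1, -1))
        m = max(m, s)
    return m

rng = random.Random(2)
maxima = [excursion_max(rng) for _ in range(50_000)]
for m in (2, 4, 8, 16):
    print(m, sum(M >= m for M in maxima) / len(maxima), 1 / m)   # empirical vs 1/m
```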
