Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

Journal of Research of the Na tional Bureau of Standards Vol. 50, No.

4, April 1953 Research Paper 2411

Contributions to the Theory of Markov Chains


Kai Lai Chung 1
The fundam entals of t he tJl eory of d enumerable Markov cha in s with station a ry tra ns i-
tion probabilit ies were laid down by Kolmogorov , and fur ther work was don e by Doblin.
The t heo ry of rec urrent e vents of F eller is closely related , if no t coexten sive. So me n ew
resul ts obtained by T. E. Harris turn out to t ie up very nicely with so me amplification s of
Doblin 's work. Harris was led to consid er t he probabilities of hi ttin g one state befor e
another , starting fro m a third one. This idea of co nsid erin g t hree states, on e initial , one
"taboo", and o ne fin a l, is more full y d eveloped in t he prese n t work. The notion o f first
passage t im e to t he " uni on" or " in tersection " of two states is also in trodu ced h ere. The
in terplay bet ween t hese no t ions is illustra ted.

The fundamen tals of t h e theory of denumerable i, .j, k, l , h, deno te positive integers and wil l be
Markov chains 2 wit h stationary transition proba- used as state labels:
bilities (DMCS) were laid down by Kolmogol'ov
[IJ 3 and furthcr work was done by Doblin [2]. The a,
theory of reC UlTent events of F ell er [3] is closeh- Pi7)= P(X,, = .i IXo = i); .,
p .IO) = {
r elated, if not coextensive. R ecently some inter est- 1, ~ = .7
ing n ew r esults were obtained by '1'. E . Hal'l'is [4] and
communicated to t he au thor. They t urn out to tie kP/j)= P(X ,, = j, Xv~ lc , l ..s;v < n IX o= i)
up very nicely with some amplifications of Doblin's Filj')= P(X,,=j, Xv ~j, l ..s;v < n IXo = i)
work t he author was engaged in. Alt hou gh Hal'l'is'
main purpose lies elsewher e, he was led to consider kFi';)= P(X ,,= j, X,, ~ .i, ~ lc , l ..s;v < n !X o= i)
the probabili ties of hitting one state before another,
star ting from a third one. This idea of considering Q* = ~ Q(nJ
t hree (instead of the customary two) states, one n= l
initial , one" taboo," and one final , will be more full:,-
develop ed in t h e present work. The notion of first where Q may stand for any of the symbols kP ij, Fij,
passage time t o t h e "union" 01' " intersection" of or kFij.
two states will also be introdu ced h er e. The inter- We offer th e following clue to th e above no tat ions.
play b etween these notions will b e illustra ted . The letter P design ates "passage" ; the letter F,
R ecorded r es ul ts in this paper will be labeled as "first passage"; th e first right-hand subscrip t desig-
formulas and th eorems, r espectively. Rel evan t r e- nates th e initial state; the second , the final state; the
marks as to the ir origin or signifi cance will be found left-hand subscript designates tb e "taboo state,"
in the body of the paper. The author is indeb ted to namely, one to be esch ewed during th e passage (r:x -
Dr. Harris for communicating some of his r esults elusive of bo t h terminals) ; ·the star on a letter WIth
b efore publication . su bscl'ipts designates the s um of th e corresponding
infin i te series (fini te or + 00 ) summed from n = 1 ad
1. The sequence of random variables { X ,,}, n= 0, inj. W e admit th at this is no t th e most logical
1,2, . . . forms a DMCS. The sta tes will be de- system of notations we co uld have inven ted. For
noted simply by the positive integers. The (one- instance, we have the superB uity n i )= jP.I;d , and if
step ) transition probability from the state i to th e we bad allowed more than on e left-hand subscript ,·
state j will be denoted by Pg). B ecause of station- we could have used only one letter P and written
arity we have kF?i)=j,kPt:l. However, we consider our no tations
to be preferred to the arbitrary use of all sorts of
Pg)= P(X m+ l = j lX m = i) letters from the Latin and Greek alphabets. Also,
after painful deliberations we decided not to define
°
for all integers m ~ for whieh the conditional prob-
ability is d efin ed . With this understanding, we shall
kPi7), FiT , or kF/? ), while r eserving the righ t to do so
later in some cases.
permit ourselves to write m = O in the definitions to FORMULA I: I} i ~j, then
follow , as if th e conditional probabilities were always
defin ed . F:; = iF7';(1 + jP: i ) , (I )
NOTA'I'IONS:
where on the right side 0· 00 is to be taken as 0.5
n, N, v, r, s, denote positive integers and will be
used as time parameters or general numerals; • T his n aturally suggests t be consideration of morc t han on c taboo statc .
• This follows also from t he eas ily intcrpreted relations
1 National Bureau of Standards and Insti tute of Statistics, Uni versity of co " n _ _ l _ _ ~:, .
North Carolina , Chapel Hill, N. C. l+ j P ;.:= ~ (, F lI ) - " - •
n= O I -,ll'" , F~1
2 "Denumerable" means "with a denumerable number oC states;JI "chain"
refers to a process with an in tegral time parameter . Th e convention that O· '" is to be taken as 0 w ill be un derstood in sim il a r cir-
3 Fi gures in brackets indicate tbe literature references at tbe end of this paper. cumstances.

203
J
I
PROOF: We start from the formula FORMULA III : If i ~j, then

(1)
(III)
where we agree that jPi?) = 1. Equation (1) is
proved as follows. Either the state i is not entered
at all during the passage from i to j, which contin- PROOF: We start from the formula
gency contributes the term corresponding to v=O on
the right side of (1); or there is a last entry of i,
occurring at the vth step, 1~v~n-1, which con- (3)
tingency contributes the general term.
Summing (1) over n, we obtain
where we agree that iPi?) = 0. The proof of (3) IS
entirely similar to that of (1).
Summing (3) from n = O to n = N, we obtain

We need an elementary lemma which is frequently


Since the terms of the double series are nonnegative, useful in such connections.
the inversion is justified and (I) is proved. Moreover,
this proves that if jF,j > 0, then jPi~ ro. It follows < LEMMA. LetO~a .~ l, b. ;:::O; ~ a,> O, lim b,=
v=o
from (I) that jFij= O if, and only if, F,j = O, namely, v-+cn

P,<'; )= 0 for all n. B~ + ro o Then


N
FORMULA II: If j ~k, then L2 a.b N _.
lim
N-too
.=0
-=--="N:r--- B.
(II)
L2
v=O
a.
(This formula is easily interpreted in terms of math-
ematical expectations.) Applying the lemma to (4) we obtain (III). That
iP,j < ro is clear from (IIa) , and the remarks at the
PROOF: We start from the formula end of the proof of (I),
THEOREM I, The limit
n
kP .('! )
11
= '" ~
kF\J
.(~ ) kP 11
('! -, ) , (2)
v=1
where as before kP/?)= 1. If we ignore the left-hand
subscripts, (2) reduces to a familiar formula. The (5)
proof of the latter extends immediately to (2).
Summing (2) over n we obtain
00 00 n exists, and is equal to any of the following three
k
P*if -- 'L...J
" k
p (n ) _
if - L...J £.-J k F if
'" '" (' )kp ii(n - , ) expressions:
n=1 n=l v=1
l +jPi~ Fij jFi~.
(IVa, b, c)
l + i Pij' F~ iFti'

the first always, the second if i~j, the third if Fij Fi~ > 0.
We note the following corollaries to (I) and (II),
to be used later. PROOF. Doblin [2] has shown, trivially, that
FORMULA IIa: If i~j, then
N
(II a) "'P !'!)
.L..J '1
'
]1m n=O
- N- - =
F*
'ii ' (6)
FORMULA lIb: If j ~k, then N-t oo ' " P ('! )
.L..J
n=O "

(lIb) Comparing (III) and (6), we obtain (IVb) if i~j.


(IVa) now follows from (IIc) and obviously holds for
FORMULA IIc: If i~j, then i=j. If FijFi~ > O, then t he denominator of (IVc) is
not zero, and this is then equal to (IVb) by (lIb),
(II c) with k =i.
204
That the limit (5) exists, and is finite and not zero , tributed random variables. Evidently we have
was proved by Doblin [2]; that it is equal to (IVa)
was previously proved by the author [5]. The
<
<X>

present approach seems to be the simplest. E (Ws) = L; jPii )= jP,i co •


n =1
COROLLARY. If i,j ,k ,l, are distinct states of one
, class, 6 then Now by definition we have vy,,:::;'n< vY,,+l and
Y.- l Y.
F,i (1 + tP/;) L; W. :::;' ZN:::;' VI+ L;Ws.
F :Z (l +jPti)' 8= 1 8=1

Consequently,
Naturally there are other expressions for it, and Y.-l Y.
we omit the tedious considerations when some of the L; Ws Z L;W.
states are identical. 8= 1 <~< ~ + ~. (8)
2. W e now consider two states i and j belonging Yn - Y ,,- Y n Y"
to the sam e r ecurrent class, namely:
Applying Iiliintehine-Kolmogorov's s trong law of
large numb er s (see, e. g., [9J p . 208) to the sequence
{ W.} we obtain:
A fundamental idea in t h e theory of DM CS, already
found in Kolmogorov's work, is that whatever tran-
spires b etween s uccessive en tries at a r ecurrent state
forms a sequen ce of indep end ent events. Using this (9)
idea, Harris [4] and L evy [10], indep endently of each
other , discovered theorem 2. Our proof is somewhat
differ ent from theirs. Moreov er , P (v 1 <+ co) = 1. It follows from (8) and
L et i ~j and define (9) that

Y,, = the numb er of v, l :::;'v :::;'n, such that X .= i ; P (lim Z n=iP;i ) = 1. (10)
n --+<Xl Yn
Z ,, = the nll1nb er of v, l :::;'v:::;' n, s uch that X .=j. N ow F,i = 1. H ence th eorem 2 follows from (10)
and theorem 1, using (IVb) there.
In words, Y " (or Z n) is the number of entries at the This theorem includes as special case a previous
state i (or j ) in th e first n steps. Using the average r esult by Erdos and t h e author [7] . Consider in-
ergodic theorem (see (11) below) it is easy to show dependen t, identically distributed random variables
t ha t if j is a positive state, and P (XOEC) = 1, wher e { Un} which assume only integer values with m ean
C is the class containing j, th en we have zero. They form a DMCS with all integer s as the
states. Since the m ean is zero, all possible states
Z are r ecurrent by a theorem of Fuchs and t h e author
p(lim n " = 1) = 1. [8] .7 Without loss of generality, we may suppose that
n--+<x> ""'P C,)
£......J
v=O
11 every integer is a possible, therefore r ecurrent, state.
11
The following theorem covers both positive and null 'W riting S ,,= L; Uv, we see that
classes . v=1
THEOREM2 (Harris-Levy). If i and j are two states
in a recurrent class C and P (XOEC) = 1, then

""'P C~)

[
Z n l' £......J
1·lm -"' = n ] 11

P
,,--+ <x> Y n
Im -v= O- - = 1 .
n--+ <x> ±Pii)
(7) P (lim yZ "=
n--+Q) n
1)=1,
v= O
which is th eorem 8 in [7] . Needless to say, as far as
PROOF . Since i is reClll'l'ent, we have P(1im Y ,,= this statement is concerned, Harris' approach is
n---><x>
+ co ) = 1. L et VI <V2< . . . b e the successive in-
incomparably better . However, we note that ther e
we actually proved a sharper result, i. e.
dices V, such that X .=i. L et lVs = the numb er of
v, vs< v< vs+! such that X .= j. Then as r emarked
above, the W s's are indep endent, identically dis-
'Slightly generalizing Kolmogorov, we define a class to be a set of states such 7 T bis important step cannot be circumventcd by tbe present, more general,
that for all Y two states i alld j belonging to it, wc bavc F;! l'j'>O
. See [6]. approach.

205
n H ence we ob tain from t he above:
for ever y ~ > O, wher e k f n = 2.: P (Sv=i). See also
v=1
th eorem 7 in [7J . It would b e of inter es t t o investi- FORMULA VI. If j ~ k, then
gate correspondin g strong relations for th e gen eral
M arkov-chain case, using p erhaps a more precise
form of th e stron g law of lar ge numb ers.
3. W e now consider a positive r ecurren t class C. FORMULA VII. If j ~ k, then
According t o Kolmogorov, in C all m ean r ecurren ce
and firs t passage times are finite, n am ely , for all m(i, j n k )= m ij+kF ij m jk= m ik+jF tkmkl' (VII) I

i,.1 ~ C we h ave
'" Since
m ij= 2.: nF \/)
n=1
< ro.
W e in troduce the notions of first p assage t o .1Uk a nd we dedu ce from (VII ) :
t o .in k , as follo ws . L et .1 ~ k .
T (i, j Uk ) = th e smallest in teger n ~ 1 such th at FORMULA VIII . If j~ k, then
X n=.1 or X n= k , which ever h appens fir st, given th at
Xo = i; (VIII )
T (i, j n k )= th e sm all est in teger n~ 1 such th a t
th ere exist two in tegers nl and nz su ch th at 1h :;:n z, W e no te th e following sp ecial case (i = k) of (VIII ):
l ~nl~n, l ~nz~n, and X "l=.1, X nz= k , given th a t
Xo = i; (VIlla)
m(i,.1Uk )= E { T (i,j U k ) } ;
m(i,j n k ) = E { T (i, j n k ) }. This last formula is due to H arris [4], who also d e-
L r, t, w denote th e sample poin t. Put rived from it th e following relation:
e ~ = { w :Xo(w) = i, X v(w) ~ .1 , ~ k
(VIII b)
for 1 ~ v ~n ; Xn (W) =j} ;

ej = Ue~.
71=1
Now in a positive class th e ergodic th eorem of K ol-
mogorov holds: 8
Thus e j is th e even t that Xo=i and th e state .i is ,. l ~ p () 1
11m - L..J .~ = - -. (11)
r each ed b efore th e state k. Since i, .1, and k b elong n-.'" n V= 1 " m jj
to on e r ecurren t class, we h ave
Thus (VlIIb ) t urns ou t to b e a sp ecial case of I

theorem 1, using (IVc) and no ting th at F 7k= F :i = 1.


Dividing (VIII ) by th e produ ct of (VIlla) and
L et P (Xo = i) = c> O. W e have th e following rela- (VIIlb ) we ob tain
tions, immediate consequ en ces of t h e d efinitions.

cm(i,.1U k )= ,ti {J~~ n P (dw)+ .r n P (dw) }


(12)

By formula (lIb), th e right side is kPt;, s:nce Fik= 1 in


Cm ij= ~ {J~~ n P (dw) +J~~ (n + m kj)dP (w) } th e presen t case. Thus w e ob tain
FORMULA (IX) . If j~ k then

= cm(i , .1U k )+ P (ek)m kj m ik+ mkj- m ij


(IX)
m jj
cm(i, n j lc) = ~ {L (n + m jk)P (dw)
As an application consider, as in th e Cen tral Limit
Theorem for Markov chains, r andom variables { Y n }
+ i~ (n + m kJ)P (dW) } attach ed to th e M arkov ch ain { Xn } in th e following
way: Yn = Xi if X n= i where th e x/s are arbitrary
real numbers.
T HEOREM 3. L et i be a positive state. Given Xo =i,
let Vo denote the smallest n ~ 1 such that X n= i. Then
N ow b y defini tion we h ave
if the series on the right-hand side converges a bsolu tely ,
8 T his theorem actuall y establishes t he limit of Pli) as n-.",. T he average form
(11 ) is an easy consequence of a H ardy-Littlewood 'rauberian theorem.

206

L
This r elation is in striking r esemblance to a familiar
formula in the elementary calculu s of probabilities
according to whi ch if A and B arc any two event~
then
P (A UB ) + p (A n B ) = P (A) + P (B ).
~rhe generalizations of the last r elation to fl.ny
fimte number of events is known as Poincar e's for-
(Xb) mula (sec, for example, [9], p. 61 ) ; and we immedi-
ately suspect that th e same may be true for the m ean
first passage t imes. This is indeed so. W e d efine
/ PROOF. It is more convenient to consid er new m(i ,jl l.} .. . Uj ,) and m (i,jln . . . n j,) as t h e obvioLls
! variables Zn defined as follows: extenslOns fro m th e case s= 2. We sh all also write
/ In ... . n j ,....;-j to denotejl n .. . n j r_l n j ,+ln . . .n j , if
o if X .=i for som e v, 1 ~v < n;
J =.7r (1 ~ r ~ s) andjl n . . . n j, if j is not one of the
Zn= { j r's .
Xj if X n=j and X ..~i, 1 ~v < n , FO :l~ M u LA XI. If jl, . . .,.1, are distinct states in a
. where j may be i in the last-written line. Evid ently
positive cLass to which i also belongs, then
we have then

E { t Yn I Xo= i } =E {~ ZnIXo=i}
n=1 n=1 + l $ r' <~" < ' 3$8 m(i, j rln .1r2n j(3) - ... +
(- l )'-lm (i, j1 n . .. n js). (XI)

P.ROOF. Put

e~ ={ w:Xo(w) = i, XV (W)~jl"'" .1s


by (IX) with i = k.
Furthermore, we have for 1 ~ v< n, X n(W) = .1T}
e= U e~l '
E {(~ Y n) 2IX o= i } = E {(~ ZnYIXo=i}
T

7> = 1

W e have, as at th e b eginning of section 3,


= E{f:(Z~ + 2 $ ~ ZrZ,) IXo = i }.
7> = 1 r<8$ n1
m(i, .1' ln . . . n.ir l )

As before, we obtain r eadily


~~ i~ {n+m(jr, .1r1 n
I
= c- 1 .. . n jrl....;-j,) }P(dw)
E{~
n=1
Z~ IXo= i } = ~ m xi.
:' = 1 m jj
ii
= m(i, j lU . . . Uj ,)
Next we have 8
+ ~ c-IP(e')m(j"
,= 1
jT. n ... n j . ....;- ';.). (13)
~ ~ E(ZrZ, IX o=i)=
r = 18= r+ l S ub stitute (13 ) into the r ight side of (XI ) and
consider the typical term c- IP(er)m(.i" .1'ln
n j rl )' where r is distinct from rl, . . . ,1'1. It
appears on the right side of (XI) once in m (i,.1T 1 n .
By (IX) this red uces to (Xb) . n .1'l) . and. once in m(i,j,n j rl n . .. n j'I)' with
OppOSIte SIgns; .and does not appear in any other
The t·wo expressions on the left sides of (Xa) and terms. H en ce ItS n et coefficient on the rio'ht side
(Xb ) play an important role in Doblin's C entral of (XI) is zero. It remains only to consider th e
Limit Theorem for DMCS. W e r efer t h e r eader to t erm m(i,.iI U . . . U.i,). This appears once in
[2] for details . They are h ere eval uated in what every term and h ence its n et coefficient is
see:rns to us more tangible terms .
4. From formulas (VI) and (VII) it follo ws that

, Doob pointed out to me that this is a case of Wald 's equation for Markov
chains . Therefore (XI) is established.
207
We remark that trivial as this proof is, it does not where, as later, an unspecified summation runs from
exactly correspond with the familiar proof of Poin- 1 to co Summing (16) over n, we obtain
care's formula, and we do not know if there is any
closer relation between the two apparent twins. " k P*.11.
L..J P hiCI) -_ k P*i j - k P i CI)j ,
h;><k
We also leave possible extensions suggested by the
known extensions of Poincare's formula to the or
interested reader. "L..J k P*i h P hi(1)_ P* + Pki(1) - P if(1)
- k if (17 ) . (
h
5. We now give another method of computing
~P7;. This method requires the ergodic theorem
since
(11). An interesting byproduct is the following:
THEOREM 5. If i, j and k (not necessarily distinct)
belong to a positive class, then We assert that in general

~ {p Cn)_pCn)} -_ mjk-mik•
£.....J ik ik
n=1 mkk
PROOF. Using the familiar formula (d. the remark This is readily shown by induction on n , starting
after (2» with (17). Now sum from n = 1 to n = N, divide by
P iCn)-" " F C'ik) p Cnkk -,) , N, and let N -7 co. By (11) and theorem 5 we obtain
k -.LJ
v =1
we have ( L: kPth)ml =kP;j+ f; { P~i) -P;(i)}
h v=1
j j

Now,
Substituting from (11), we see that the right side
of (14) is, as N -7 OJ, asymptotically equal to

-£ {F ;'2 -F }'2} N-
v= 1 mkk
v. (15)
= f;1 f;Fi~)
f; P(Xvr!'-k, 1::; v<n l Xo=i)= n= = m ik'
n =1 v=n
Now since (19)
00 00 (18) and (19) give (IX) .
F;*k = Fik= l, L:vF;'2= mik< OJ, L:vF}V2= m jk< OJ,
v= 1 v= 1
References
we have, as N -7 co
[1] A. N. Kolmogorov, Anfangsgrlinde der Theorie der
Markoffschen Ketten mit un endlich vielen moglichen
Zustanden, Matematiceskii Sbornik, N . S., vo!. 1, 607
to 610 (1936); Bu!. Univ. Etat Moscow 1, (1937) (in
Russian) .
Using this in (15), we see that its limit as N -7 OJ IS [2] W. Doblin, Sur deux problemes de M. Kolmogoroff con-
cernant les chaines denombrables, Bu!. Soc. Math.
France 66, 210 t o 220 (1938) .
lim _1_ £ (-v ){ F ;'1 -F}'n m jk- mik. [3] "ltV. F eller , Fluctuation t heory of recurrent events, Trans.
N ->oo mkk v= 1 mkk Am. Mat h. Soc. 67, 98 to 119 (1949) . See also [10],
chap. 12.
We note that theorem 5 gives a convenient de- [4] T. E. Harris, First passage and recurrence dis tributions .
termination of the mean first passage times in terms Tran s. Am. Math. Soc. 73, 471 to 486 (1952) .
of the transition probabilities; in particular [5] K. L . Chung, An ergodic theorem for stationary Markov
chains with a count able numb er of states, Proc. Int er-
national Congress of Math. II, 568 (1950) .
[6] K. L . Chung, Lecture notes on Marko v chains, Graduate
Mat hematical Statistics Society , Columbia University,
New York, N. Y . ~...c 1951 ) .
[7] K. L. Chung and Y . Erdos, Probabili ty limi t theorems
We do not know what the situation is in a null class. assuming only the first moment I, M em. Am. Mat h.
All we can infer from theorem 1, (IVc), is that if Soc. 6 (1951).
i and j belong to one recurrent class and jFt; r!'- iFtil [8] K. L. Chung and W. Fuchs, On t he distribution of values
of sums of random variables, M em . Am. Math. Soc .
then 6 (1951) .
[9] W. Feller, An introduction to probability theory and its
applications (John Wiley & Sons, New York, N . Y.,
1950) .
[10J P. Levy , System es Markoviens et stationnaires. Ca s
To return to kP7i' If j r!'- k, we have evidently denombrables. Arln. Sci. Ecole Norm. Sup. (3) , 68,
327 to 381 (1952) .
"£....J;\: p "Cn)
I.
P hiCI) -_k p ic"i +1) , (16)
h;>< k WASHINGTON, January 14, 1952.
208

You might also like