Markov Chain


Chapter 29 Markov Chains

Kuo-Hao Chang
Dept. of Industrial Engineering & Logistics Management
National Tsing Hua University
Stochastic Process (SP)
A sequence of random variables {Xt} indexed by time t.

Xt : state of the system at time t (varies as t varies)


t can be discrete (discrete time SP) or continuous (continuous time SP)

Example:
1. Weather

Xt = 0 if day t is dry, and Xt = 1 if day t is rainy

2. Inventory
Dt : # of cameras that would be sold in week t if the
inventory is not depleted (i.e., demand)
Xt : # of cameras on hand at the end of week t

Markovian property
Pr{Xt+1 = j|X0 = k0 , X1 = k1 , . . . , Xt−1 = kt−1 , Xt = i}
= Pr{Xt+1 = j|Xt = i}
for t = 0, 1, . . . and every sequence i, j, k0 , k1 , . . . , kt−1

Markov chain
A stochastic process {Xt }t=0,1,... is a Markov chain if it has the
Markovian Property.

Transition Probabilities (one-step)
Pr{Xt+1 = j|Xt = i}

Stationary Transition Probabilities

The transition probabilities are said to be stationary if, for each i, j,
Pr{Xt+1 = j|Xt = i} = Pr{X1 = j|X0 = i} for all t = 0, 1, 2, . . .

n-step stationary transition probabilities


For each i, j and n (n = 0, 1, 2, . . .)
Pr{Xt+n = j|Xt = i} = Pr{Xn = j|X0 = i} for all t = 0, 1, 2, . . .

Let
Pij = Pr{Xt+1 = j|Xt = i} (= Pij^(1))
Pij^(n) = Pr{Xt+n = j|Xt = i}

Two properties
1. Pij^(n) ≥ 0 for all i, j; n = 0, 1, 2, . . .
2. Σ_{j=0}^{M} Pij^(n) = 1 for all i; n = 0, 1, 2, . . .

n-step transition matrix

P^(n) = [Pij^(n)], the (M + 1) × (M + 1) matrix whose (i, j) entry is the n-step transition probability, with rows and columns indexed by the states 0, 1, . . . , M.
We assume
1. A finite number of states
2. Stationary transition probabilities

Example:
1. Weather
Suppose
Pr{Xt+1 = 0|Xt = 0} = 0.8, Pr{Xt+1 = 0|Xt = 1} = 0.6
Then the (one-step) transition matrix is

P =
| 0.8  0.2 |
| 0.6  0.4 |

(rows: today's state, 0 = dry, 1 = rainy; columns: tomorrow's state)
2. Inventory
Note :
(1) Xt : the state of the system at time t
(2) Pr{Xt+1 = j|Xt = i, Xt−1 = k1 , Xt−2 = k2 , . . .}
= Pr{Xt+1 = j|Xt = i}
So {Xt }t=0,1,2... is a Markov chain

Suppose Dt+1 is a Poisson r.v. with a mean of 1.


Pr{Dt+1 = n} = (1^n · e^−1)/n!, for n = 0, 1, 2, . . .
Pr{Dt+1 = 0} = e^−1 = 0.368
Pr{Dt+1 = 1} = e^−1 = 0.368
Pr{Dt+1 = 2} = (1/2) e^−1 = 0.184
Pr{Dt+1 ≥ 3} = 1 − Pr{Dt+1 ≤ 2} = 0.080


Further suppose the ordering policy is: if Xt = 0, order 3 cameras; if Xt > 0, do not order any cameras.
Possible states = 0, 1, 2, 3 cameras on hand
Then (for t = 0, 1, 2, . . .)

Xt+1 = max{3 − Dt+1, 0}   if Xt = 0
Xt+1 = max{Xt − Dt+1, 0}  if Xt ≥ 1

So when Xt = 0
Pr{Xt+1 = 3} = Pr{Dt+1 = 0} = P03 = 0.368
Pr{Xt+1 = 2} = Pr{Dt+1 = 1} = P02 = 0.368
Pr{Xt+1 = 1} = Pr{Dt+1 = 2} = P01 = 0.184
Pr{Xt+1 = 0} = Pr{Dt+1 ≥ 3} = P00 = 0.080

when Xt ≥ 1
P11 = Pr{Dt+1 = 0} = 0.368
P10 = Pr{Dt+1 ≥ 1} = 1 − Pr{Dt+1 = 0} = 0.632
P22 = Pr{Dt+1 = 0} = 0.368
P21 = Pr{Dt+1 = 1} = 0.368
P20 = Pr{Dt+1 ≥ 2} = 1 − Pr{Dt+1 ≤ 1} = 1 − (0.368 + 0.368) = 0.264

Similarly, we can calculate P33 , P32 , P31 , P30

So the transition matrix (states 0, 1, 2, 3) is

P =
| 0.080  0.184  0.368  0.368 |
| 0.632  0.368  0.000  0.000 |
| 0.264  0.368  0.368  0.000 |
| 0.080  0.184  0.368  0.368 |

(row 3 equals row 0: when Xt = 3 the next state is max{3 − Dt+1, 0}, the same expression as when Xt = 0 after the order of 3 cameras arrives)
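As an illustration, this matrix can be rebuilt directly from the Poisson demand distribution and the ordering rule. The following is a minimal sketch, assuming NumPy and SciPy are available; the variable names are illustrative and not part of the slides.

```python
import numpy as np
from scipy.stats import poisson

M = 3                      # states 0..3 cameras on hand
demand = poisson(mu=1)     # weekly demand Dt ~ Poisson(1)

P = np.zeros((M + 1, M + 1))
for i in range(M + 1):
    stock = 3 if i == 0 else i           # order up to 3 cameras when the shelf is empty
    for j in range(1, stock + 1):
        P[i, j] = demand.pmf(stock - j)  # end the week with j > 0 cameras: demand = stock - j
    P[i, 0] = 1 - demand.cdf(stock - 1)  # end the week with 0 cameras: demand >= stock

print(P.round(3))          # rows match the matrix above
```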
The corresponding state transition diagram (not reproduced here) has one node per state 0, 1, 2, 3, with a directed arc from i to j for every positive Pij, labeled with that probability.
3. Gambling
Starting with $1: win $1 with probability P > 0, or lose $1 with probability 1 − P > 0. The game ends when $3 or $0 is reached.

Note: States 0 and 3 are called absorbing states (a state that is never left once the process enters it).

Chapman-Kolmogorov Equations

Pij^(n) = Σ_{k=0}^{M} Pik^(m) Pkj^(n−m)

for i = 0, 1, 2, . . . , M; j = 0, 1, 2, . . . , M; and any m = 1, 2, . . . , n − 1.

Special cases:
(1) m = 1:     Pij^(n) = Σ_{k=0}^{M} Pik Pkj^(n−1), for all states i, j
(2) m = n − 1: Pij^(n) = Σ_{k=0}^{M} Pik^(n−1) Pkj, for all states i, j

Based on the above, we can obtain Pij^(2) = Σ_{k=0}^{M} Pik Pkj.

Using matrix notation, P^(2) = P · P = P^2.
Similarly, for n = 3, 4, . . .

P^(n) = P · P^(n−1) = P^(n−1) · P = P^n

When n → ∞, all rows of P^n have identical entries (the steady-state probabilities) – the initial state is no longer relevant.
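To see this convergence numerically, here is a small sketch for the weather example, assuming NumPy (not part of the original slides):

```python
import numpy as np

# one-step transition matrix of the weather example (state 0 = dry, 1 = rainy)
P = np.array([[0.8, 0.2],
              [0.6, 0.4]])

for n in (1, 2, 5, 10, 20):
    Pn = np.linalg.matrix_power(P, n)    # n-step transition matrix P^n
    print(f"n = {n}\n{Pn.round(4)}\n")
# for large n both rows approach (0.75, 0.25), the steady-state probabilities
```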

Unconditional state probabilities


Pr{Xn = j|X0 = i} = Pij^(n)   (conditional probabilities)
Pr{Xn = j} = Pr{X0 = 0} P0j^(n) + Pr{X0 = 1} P1j^(n) + · · · + Pr{X0 = M} PMj^(n)   (unconditional probabilities)
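A minimal sketch of this computation (NumPy assumed; the initial distribution below is a hypothetical choice for illustration):

```python
import numpy as np

P = np.array([[0.8, 0.2],
              [0.6, 0.4]])              # weather example
p0 = np.array([0.4, 0.6])               # hypothetical Pr{X0 = 0}, Pr{X0 = 1}

n = 5
pn = p0 @ np.linalg.matrix_power(P, n)  # row vector of unconditional probabilities Pr{Xn = j}
print(pn)
```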

1. State j is accessible from state i if Pij^(n) > 0 for some n ≥ 0. (i → j)
2. If state j is accessible from state i and state i is accessible
from state j, then state i and j are said to communicate.
(i ↔ j)
Note:
(1) Any state communicates with itself
(because Pii^(0) = Pr{X0 = i|X0 = i} = 1).
(2) If state i communicates with state j, then state j
communicates with state i.
(3) If state i communicates with state j and state j
communicates with state k, then state i communicates with
state k.
3. A class includes all states that communicate with each
other. (A class may consist of a single state)
4. If there is only one class, i.e., all states communicate, the
Markov chain is said to be irreducible.
Classification of states of a Markov chain
• Transient state – if, upon entering this state, the process might never return to this state again.
  A state i is transient ↔ there exists a state j (j ≠ i) that is accessible from state i but not vice versa.
• Recurrent state – if, upon entering this state, the process definitely will return to this state again.
• Absorbing state – if, upon entering this state, the process will never leave this state again.
Gambling example: states 1, 2 are transient and states 0, 3 are recurrent (also absorbing).

Some properties that you should know about states :
1. Recurrence is a class property (all states in a class are
either recurrent or transient).
2. In a finite-state Markov chain, not all states can be
transient.
3. Following the above, all states in an irreducible finite-state
Markov chain are recurrent.
4. If there exists a value of n for which Pij^(n) > 0 for all i and j, then all states are accessible from one another.

Example (a five-state chain whose transition matrix is not reproduced here):

state 0: recurrent     state 3: transient
state 1: recurrent     state 4: transient
state 2: absorbing (recurrent)

Periodicity Properties
Period of state i:
The integer t (t > 1) such that Pii^(n) = 0 for all values of n other than t, 2t, 3t, . . . , where t is the largest integer with this property.

Aperiodic state (has period 1):
If there are two consecutive numbers s and s + 1 such that the process can be in state i at times s and s + 1, then state i is aperiodic.

Gambling example : Both states 1 and 2 have period 2.

Periodicity is a class property (all states in one class have the same period).

For a finite-state Markov chain, recurrent states that are aperiodic are called ergodic states.

Long run properties of Markov chains

lim_{n→∞} Pij^(n) = πj > 0

• a limiting probability that the system will be in each state j after a large number of transitions
• this probability is independent of the initial state

The πj uniquely satisfy the following steady-state equations:
(1) πj = Σ_{i=0}^{M} πi Pij, for j = 0, 1, 2, . . . , M
(2) Σ_{j=0}^{M} πj = 1
(M + 2 equations, M + 1 unknowns)
In matrix form, π = πP, where π = (π0, π1, . . . , πM).

πj :
(1) steady-state probabilities, i.e., the probability of finding the
process in a certain state, say j, after a large number of
transitions tends to the value πj , independent of the
probability distribution of the initial state.
(2) also known as stationary probabilities (not to be confused
with stationary transition probabilities), i.e.,
if Pr{X0 = j} = πj then Pr{Xn = j} = πj

Weather example :
π0 = 0.8π0 + 0.6π1
π1 = 0.2π0 + 0.4π1
π0 + π1 = 1
⇒ π0 = 0.75, π1 = 0.25
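These values can be verified numerically. Here is a sketch assuming NumPy; replacing one redundant balance equation by the normalization condition is a standard device, not something prescribed by the slides:

```python
import numpy as np

P = np.array([[0.8, 0.2],
              [0.6, 0.4]])
M = P.shape[0]

# solve pi = pi * P together with sum(pi) = 1:
A = P.T - np.eye(M)      # (P^T - I) pi = 0
A[-1, :] = 1.0           # replace the last (redundant) equation by sum(pi) = 1
b = np.zeros(M)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)                # -> [0.75 0.25]
```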

Note:
1. If i and j are recurrent states belonging to different classes, then Pij^(n) = 0 for all n.
2. If j is a transient state, then lim_{n→∞} Pij^(n) = 0 for all i.

Expected Average Cost per Unit Time
• For any irreducible ergodic Markov chain, lim_{n→∞} Pij^(n) exists and is independent of i.
  (Note: In a finite-state Markov chain, recurrent states that are aperiodic are called ergodic.)
• If the states are not aperiodic, lim_{n→∞} Pij^(n) may not exist.
  Ex (e.g., a two-state chain that alternates deterministically between its states):
  P00^(n) = 1 if n is even, and P00^(n) = 0 if n is odd.

• lim_{n→∞} (1/n) Σ_{k=1}^{n} Pij^(k) = πj always exists for an irreducible finite-state Markov chain, where the πj satisfy the steady-state equations above.

• Suppose a cost C(Xt) is incurred when the process is in state Xt at time t, for t = 1, 2, . . . , and that C(·) is independent of t. The expected average cost incurred over the first n periods is given by

  E[(1/n) Σ_{t=1}^{n} C(Xt)]

Since lim_{n→∞} (1/n) Σ_{k=1}^{n} Pij^(k) = πj, the long-run expected average cost per unit time is given by

lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt)] = Σ_{j=0}^{M} πj C(j)

Ex:
For the inventory problem, suppose

C(Xt) = 0 if Xt = 0,  2 if Xt = 1,  8 if Xt = 2,  18 if Xt = 3

Using the steady-state probabilities of the inventory chain, (π0, π1, π2, π3) = (0.286, 0.285, 0.263, 0.166),

lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt)] = 0.286 · 0 + 0.285 · 2 + 0.263 · 8 + 0.166 · 18 = 5.662
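The steady-state probabilities and the 5.662 figure can be reproduced with the same linear-algebra device, using the inventory transition matrix built earlier (a sketch, NumPy assumed):

```python
import numpy as np

P = np.array([[0.080, 0.184, 0.368, 0.368],
              [0.632, 0.368, 0.000, 0.000],
              [0.264, 0.368, 0.368, 0.000],
              [0.080, 0.184, 0.368, 0.368]])

A = P.T - np.eye(4)
A[-1, :] = 1.0                           # normalization replaces one redundant equation
pi = np.linalg.solve(A, np.array([0.0, 0.0, 0.0, 1.0]))   # ~ [0.286, 0.285, 0.263, 0.166]

cost = np.array([0.0, 2.0, 8.0, 18.0])   # C(j) for j = 0, 1, 2, 3
print(pi.round(3), pi @ cost)            # long-run average cost ~ 5.66
```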


Actual Average Cost per Unit Time

lim_{n→∞} (1/n) Σ_{t=1}^{n} C(Xt) = Σ_{j=0}^{M} πj C(j)   (= lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt)])

Special case:
If C(Xt) = 1 when Xt = j and C(Xt) = 0 when Xt ≠ j, then the long-run expected fraction of time the system is in state j is given by

lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt)] = lim_{n→∞} E(fraction of time the system is in state j) = πj
Expected Average Cost per Unit Time for Complex
Functions
For the inventory example, suppose the costs to be considered
are the ordering cost and the penalty cost for unsatisfied
demand.
Assume that the number of cameras ordered to arrive at the
beginning of week t depends only on the state of the process
Xt−1 (the number of cameras in stock) when the order is placed
at the end of week t − 1.
The long-run expected average cost per unit time is given by

lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt−1, Dt)] = Σ_{j=0}^{M} k(j) πj

where k(j) = E[C(j, Dt)] (the conditional expectation taken with respect to Dt, given the state j).
The long-run actual average cost per unit time is given by

lim_{n→∞} (1/n) Σ_{t=1}^{n} C(Xt−1, Dt) = Σ_{j=0}^{M} k(j) πj

Example:

C(Xt−1, Dt) = 10 + 25 · 3 + 50 · max{Dt − 3, 0}   if Xt−1 = 0
C(Xt−1, Dt) = 50 · max{Dt − Xt−1, 0}              if Xt−1 ≥ 1

for t = 1, 2, . . .
Hence, C(0, Dt) = 85 + 50 · max{Dt − 3, 0}.

k(0) = E[C(0, Dt)] = 85 + 50 · E(max{Dt − 3, 0})
     = 85 + 50{PD(4) + 2PD(5) + 3PD(6) + · · · } = 86.2

Similarly, k(1) = 18.4, k(2) = 5.2, k(3) = 1.2.

So the long-run expected average cost per week is given by

Σ_{j=0}^{3} k(j) πj = 86.2(0.286) + 18.4(0.285) + 5.2(0.263) + 1.2(0.166) = 31.46
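A short sketch of how the k(j) values can be computed, assuming SciPy's Poisson distribution; the truncation of the Poisson support at 30 is an arbitrary numerical choice, not from the slides:

```python
import numpy as np
from scipy.stats import poisson

demand = poisson(mu=1)
ns = np.arange(0, 31)                    # truncated Poisson support for the expectation
pmf = demand.pmf(ns)

def k(j):
    """Expected one-week cost k(j) = E[C(j, D)] given j cameras in stock at the order point."""
    order_cost = 10 + 25 * 3 if j == 0 else 0    # fixed cost + per-unit cost of ordering 3
    stock = 3 if j == 0 else j                   # stock available during the week
    shortage = np.maximum(ns - stock, 0)         # unsatisfied demand
    return order_cost + 50 * np.sum(shortage * pmf)

pi = np.array([0.286, 0.285, 0.263, 0.166])
ks = np.array([k(j) for j in range(4)])          # ~ [86.2, 18.4, 5.2, 1.2]
print(ks.round(1), (ks @ pi).round(2))           # weekly cost ~ 31.4 (slide: 31.46, using rounded k(j))
```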

In general, if
1. {Xt } is an irreducible (finite-state) Markov chain.
2. Associated with this Markov chain is a sequence of random
variables {Dt } which are independent and identically
distributed.
3. For a fixed m = 0, ±1, ±2, . . . , a cost C(Xt , Dt+m ) is
incurred at time t, for t = 0, 1, 2, . . ..
4. The sequence X0 , X1 , X2 , . . . , Xt must be independent of
Dt+m .
then

lim_{n→∞} E[(1/n) Σ_{t=1}^{n} C(Xt, Dt+m)] = Σ_{j=0}^{M} k(j) πj

where k(j) = E[C(j, Dt+m)]

(for "essentially" all paths of the process)

First passage time – the length of time the process takes to go from state i to state j for the first time.
Recurrence time – the "first" passage time when i = j.

Example:
X0 = 3, X1 = 2, X2 = 1, X3 = 0, X4 = 3, X5 = 1
the first passage time in going from state 3 to state 1 is 2 weeks.
Let fij^(n) be the probability that the first passage time from state i to j is equal to n. Then

fij^(1) = Pij^(1) = Pij
fij^(2) = Σ_{k≠j} Pik fkj^(1)
fij^(n) = Σ_{k≠j} Pik fkj^(n−1)

Example (Inventory):

f30^(1) = P30 = 0.08
f30^(2) = P31 f10^(1) + P32 f20^(1) + P33 f30^(1)
        = 0.184(0.632) + 0.368(0.264) + 0.368(0.08) = 0.243
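The recursion is straightforward to implement; a minimal sketch (NumPy assumed) that reproduces f30^(1) and f30^(2) for the inventory chain:

```python
import numpy as np

P = np.array([[0.080, 0.184, 0.368, 0.368],
              [0.632, 0.368, 0.000, 0.000],
              [0.264, 0.368, 0.368, 0.000],
              [0.080, 0.184, 0.368, 0.368]])

def first_passage_probs(P, j, n_max):
    """f[n][i] = probability that the first passage time from state i to state j equals n."""
    M = P.shape[0]
    f = np.zeros((n_max + 1, M))
    f[1] = P[:, j]                                   # f_ij^(1) = P_ij
    for n in range(2, n_max + 1):
        for i in range(M):
            # sum over intermediate states k != j of P_ik * f_kj^(n-1)
            f[n, i] = sum(P[i, k] * f[n - 1, k] for k in range(M) if k != j)
    return f

f = first_passage_probs(P, j=0, n_max=2)
print(f[1][3], f[2][3])                              # ~ 0.080 and ~ 0.243
```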

For fixed i and j, Σ_{n=1}^{∞} fij^(n) ≤ 1. (It can be "< 1" because a process initially in state i may never reach state j.)

Only when Σ_{n=1}^{∞} fij^(n) = 1 can the fij^(n) be considered a probability distribution for the random variable, the first passage time.

Expected first passage time

μij = ∞                        if Σ_{n=1}^{∞} fij^(n) < 1
μij = Σ_{n=1}^{∞} n fij^(n)    if Σ_{n=1}^{∞} fij^(n) = 1

If Σ_{n=1}^{∞} fij^(n) = 1, then μij uniquely satisfies the equation

μij = 1 · Pij + Σ_{k≠j} (1 + μkj) Pik
    = Pij + Σ_{k≠j} Pik + Σ_{k≠j} Pik μkj
    = 1 + Σ_{k≠j} Pik μkj
Example (Inventory):

μ30 = 1 + P31 μ10 + P32 μ20 + P33 μ30
μ20 = 1 + P21 μ10 + P22 μ20 + P23 μ30
μ10 = 1 + P11 μ10 + P12 μ20 + P13 μ30

or

μ30 = 1 + 0.184 μ10 + 0.368 μ20 + 0.368 μ30
μ20 = 1 + 0.368 μ10 + 0.368 μ20
μ10 = 1 + 0.368 μ10

⇒ µ10 = 1.58 (weeks), µ20 = 2.51 (weeks), µ30 = 3.50 (weeks)
So the expected time until the cameras are out of stock is 3.5
weeks.
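Equivalently, the expected first passage times to state 0 solve the linear system μi0 = 1 + Σ_{k≠0} Pik μk0; a sketch (NumPy assumed):

```python
import numpy as np

P = np.array([[0.080, 0.184, 0.368, 0.368],
              [0.632, 0.368, 0.000, 0.000],
              [0.264, 0.368, 0.368, 0.000],
              [0.080, 0.184, 0.368, 0.368]])

j = 0
others = [i for i in range(4) if i != j]       # states 1, 2, 3
Q = P[np.ix_(others, others)]                  # transitions among the non-target states
mu = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))   # (I - Q) mu = 1
print(mu)                                      # ~ [1.58, 2.50, 3.50] weeks
```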

Expected recurrence time

If the steady-state probabilities are (π0, π1, . . . , πM), the expected recurrence times are μii = 1/πi, for i = 0, 1, . . . , M.
For the inventory example, μ00 = 1/π0 = 1/0.286 = 3.5 (weeks).

Probability of absorption
If state k is an absorbing state, and the process starts in state i,
the probability of ever going to state k is called the probability
of absorption, denoted by fik .

fik satisfies

fik = Σ_{j=0}^{M} Pij fjk, for i = 0, 1, . . . , M

subject to
fkk = 1,
fik = 0, if state i is recurrent and i ≠ k.

Random Walk
A random walk is a Markov chain with the property that if the process is in state i, then in a single transition it either stays at i or moves to one of the two states immediately adjacent to i. The gambling example below is a random walk with absorbing barriers.

Example (Gambling):

Two players (A and B), each having $2, agree to keep playing
the game and betting $1 at a time until one player is broke.

Pr(A wins a single bet) = 1/3, Pr(B wins a single bet) = 2/3
State: # of dollars that player A has before each bet

So

f00 = 1
f10 = (2/3) f00 + (1/3) f20
f20 = (2/3) f10 + (1/3) f30
f30 = (2/3) f20 + (1/3) f40
f40 = 0

⇒ f20 = 4/5, i.e., the probability that A loses all her money is 0.8
Check: f24 = 0.2 (the probability that B loses all her money)
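A small sketch (NumPy assumed) that solves the absorption-probability equations fik = Σ_j Pij fjk for this chain, restricted to the transient states:

```python
import numpy as np

p = 1 / 3                                  # probability that A wins a single bet
P = np.zeros((5, 5))                       # states 0..4 = number of dollars A holds
P[0, 0] = P[4, 4] = 1.0                    # absorbing barriers
for i in range(1, 4):
    P[i, i + 1] = p                        # A wins the bet
    P[i, i - 1] = 1 - p                    # A loses the bet

transient = [1, 2, 3]
Q = P[np.ix_(transient, transient)]
b = P[transient, 0]                        # one-step absorption into state 0
f = np.linalg.solve(np.eye(3) - Q, b)      # f_i0 for i = 1, 2, 3
print(f)                                   # f_20 ~ 0.8: A eventually goes broke with probability 4/5
```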

Continuous Time Markov Chains
Previously: t is discrete, state is discrete.
Now: t is continuous, state is discrete.
X(t): the state of the system at time t (one of the possible values
from 0, 1, 2, . . . , M )
Markovian property
Pr{X(t + s) = j|X(s) = i and X(r) = x(r)}
= Pr{X(t + s) = j|X(s) = i}, for all r ≥ 0, s > r, and t > 0.

Pr{X(t + s) = j|X(s) = i} — transition probability
Pr{X(t + s) = j|X(s) = i} = Pr{X(t) = j|X(0) = i}
— stationary transition probability
Let Pij (t) = Pr{X(t) = j|X(0) = i}
— continuous time transition probability function
Assume
lim_{t→0} Pij(t) = 1 if i = j, and 0 if i ≠ j.
A continuous time stochastic process {X(t) : t ≥ 0} is a
continuous time Markov chain if it has the Markovian property.

As with discrete time Markov chains, we assume
1. A finite number of states
2. Stationary transition probabilities

Let Ti denote the amount of time the process spends in state i before making a transition to a different state. The Markovian property implies
Pr{Ti > t + s|Ti > s} = Pr{Ti > t}
Only one (continuous) probability distribution possesses this memoryless property – the exponential distribution.

Check: if Ti ∼ Exp(qi),
Pr{Ti > t + s|Ti > s} = e^{−qi(t+s)} / e^{−qi·s} = e^{−qi·t} = Pr{Ti > t}

For continuous time Markov chains:

1. The random variable Ti has an exponential distribution with a mean of 1/qi.

2. When leaving state i, the process moves to a state j with
probability Pij , where the Pij satisfy the conditions

Pii = 0 for all i,
and
Σ_{j=0}^{M} Pij = 1 for all i.

3. The next state visited after state i is independent of the time spent in state i.

Transition intensities

qi = lim_{t→0} [1 − Pii(t)] / t = lim_{t→0} −[Pii(t) − Pii(0)] / t = −(d/dt) Pii(0),  for i = 0, 1, 2, . . . , M
qij = lim_{t→0} Pij(t) / t = qi · Pij = (d/dt) Pij(0),  for all j ≠ i

where Pij(t) is the continuous time transition probability function.
qi and qij are also interpreted as transition rates:
qi : the expected number of times that the process leaves state i per unit of time spent in state i (i.e., qi = 1/E(Ti))
qij : the expected number of times that the process transits from state i to state j per unit of time spent in state i
qi = Σ_{j≠i} qij

Steady-State Probabilities
For any state i, j, and nonnegative numbers t and s (0 ≤ s ≤ t)
Pij(t) = Σ_{k=0}^{M} Pik(s) Pkj(t − s)

(the continuous time version of the Chapman-Kolmogorov equations)

• A pair of states i and j are said to communicate if there are times t1 and t2 such that Pij(t1) > 0 and Pji(t2) > 0.
• All states that communicate are said to form a class.
• If all states form a single class (i.e., the Markov chain is irreducible), then Pij(t) > 0 for all t > 0 and all states i and j.

lim_{t→∞} Pij(t) = πj (steady-state probabilities) always exists and is independent of the initial state of the Markov chain, for j = 0, 1, . . . , M.

The πj satisfy

πj = Σ_{i=0}^{M} πi Pij(t), for j = 0, 1, . . . , M and every t ≥ 0

A more useful system of equations (the balance equations):

πj qj = Σ_{i≠j} πi qij, for j = 0, 1, . . . , M

(leaving rate) = (entering rate)

and Σ_{j=0}^{M} πj = 1

Example:
A certain shop has two identical machines that are operated
continuously except when they are broken down. Because they
break down fairly frequently, the top-priority assignment for a
full-time maintenance person is to repair them whenever
needed. The time required to repair a machine has an
exponential distribution with a mean of 1/2 day. Once the repair
of a machine is completed, the time until the next breakdown of
that machine has an exponential distribution with a mean of 1
day. These distributions are independent.

Let X(t) = # of machines broken down at time t.


{X(t); t ≥ 0} is a continuous time Markov chain (why?)
The possible states are 0, 1, 2

q02 = 0, q20 = 0, q21 = 2, q10 = 2, q12 = 1, q01 = 2

So
q0 = q01 = 2
q1 = q10 + q12 = 3
q2 = q21 = 2

Balance equations

2π0 = 2π1
3π1 = 2π0 + 2π2
2π2 = π1
π0 + π1 + π2 = 1
⇒ (π0, π1, π2) = (2/5, 2/5, 1/5)
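A sketch (NumPy assumed) that solves these balance equations by assembling the transition rates into a rate matrix with −qi on the diagonal and replacing one equation by the normalization condition:

```python
import numpy as np

# rate matrix for the two-machine repair example (states = number of broken machines)
Q = np.array([[-2.0,  2.0,  0.0],    # q01 = 2: either working machine breaks down
              [ 2.0, -3.0,  1.0],    # q10 = 2: repair finishes; q12 = 1: the other machine breaks
              [ 0.0,  2.0, -2.0]])   # q21 = 2: repair finishes

# steady state: pi * Q = 0 together with sum(pi) = 1
A = Q.T.copy()
A[-1, :] = 1.0
pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
print(pi)                            # -> [0.4 0.4 0.2], i.e., (2/5, 2/5, 1/5)
```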

