Markov chains_lectures_DCM_2024


Markov Chains

• Innocent Ndoh Mbue, PhD (Professor in Ecoinformatics)
• E-mail: dndoh2009@gmail.com
• Tel: 653754070/677540384
Markov Chain:
a process with a finite number of states (or outcomes, or events) in
which the probability of being in a particular state at step n + 1
depends only on the state occupied at step n.

Prof. Andrei A. Markov (1856-1922) published his result in 1906.

2
A sequence of trials of an experiment is a Markov chain if:
1. the outcome of each experiment is one of a set of discrete states;
2. the outcome of an experiment depends only on the present state, and not on any
past states.

Categories

3
…Categories
 If the time parameter is discrete {t1, t2, t3, ...}, it is called a Discrete-Time Markov
Chain (DTMC).
 If the time parameter is continuous (t ≥ 0), it is called a Continuous-Time Markov
Chain (CTMC).

If Xn = j, we say that the process is in state j. The numbers P(Xm+1 = j | Xm = i) are called
the transition probabilities. We assume that the transition probabilities do not depend
on time; that is, pij = P(Xm+1 = j | Xm = i) does not depend on m.
4
State Transition Matrix and Diagram
We often list the transition probabilities in a matrix. The matrix is
called the state transition matrix or transition probability matrix and is
usually shown by P. Assuming the states are 1, 2, ⋯, r, then the state
transition matrix is given by

5
Example: two states, labelled 0 and 1
P00 = P(Xt+1 = 0 | Xt = 0) = 1/4
P01 = P(Xt+1 = 1 | Xt = 0) = 3/4
P10 = P(Xt+1 = 0 | Xt = 1) = 1/2
P11 = P(Xt+1 = 1 | Xt = 1) = 1/2

6
7
Transition matrix features:

 It is square, since all possible states must be used both as rows and as columns.
 All entries are between 0 and 1, because all entries represent probabilities.
 The sum of the entries in any row must be 1, since the numbers in the row give the
probability of changing from the state at the left to one of the states indicated across
the top.

8
Matrix Representation

The transition probability matrix M = (a_st) over the states A, B, C, D:

        A     B     C     D
  A    0.95  0     0.05  0
  B    0.2   0.5   0     0.3
  C    0     0.2   0     0.8
  D    0     0     1     0

M is a stochastic matrix: Σ_t a_st = 1 for every state s.

The initial distribution vector (u1 … um) defines the distribution of X1 (p(X1 = si) = ui).

Then after one move, the distribution is changed to X2 = X1 M.

9
Matrix Representation

        A     B     C     D
  A    0.95  0     0.05  0
  B    0.2   0.5   0     0.3
  C    0     0.2   0     0.8
  D    0     0     1     0

Example: if X1 = (0, 1, 0, 0) then X2 = (0.2, 0.5, 0, 0.3),
and if X1 = (0, 0, 0.5, 0.5) then X2 = (0, 0.1, 0.5, 0.4).

The i-th distribution is Xi = X1 M^(i-1).
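A small NumPy check of these two computations (again an illustrative sketch, not part of the slides):

```python
import numpy as np

# Transition matrix over states A, B, C, D (rows sum to 1)
M = np.array([
    [0.95, 0.0, 0.05, 0.0],
    [0.20, 0.5, 0.00, 0.3],
    [0.00, 0.2, 0.00, 0.8],
    [0.00, 0.0, 1.00, 0.0],
])

X1 = np.array([0.0, 1.0, 0.0, 0.0])      # start in state B
print(X1 @ M)                             # -> [0.2 0.5 0.  0.3]

X1 = np.array([0.0, 0.0, 0.5, 0.5])      # half C, half D
print(X1 @ M)                             # -> [0.  0.1 0.5 0.4]

# i-th distribution: Xi = X1 M^(i-1)
i = 4
print(X1 @ np.linalg.matrix_power(M, i - 1))
```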

10
Representation of a Markov Chain as a Digraph

State Transition Diagram

        A     B     C     D
  A    0.95  0     0.05  0
  B    0.2   0.5   0     0.3
  C    0     0.2   0     0.8
  D    0     0     1     0

A state transition diagram: each state is a node, and each directed edge A → B is
associated with the positive transition probability from A to B.

11
Example: Consider the Markov chain shown in Figure below:

12
Solution

13
Exercise: Write each of the transition diagrams (a), (b), (c) as a transition matrix.

14
State Probability Distributions:

Consider a Markov chain {Xn, n = 0, 1, 2, ...}, where Xn ∈ S = {1, 2, ⋯, r}.

Suppose that we know the probability distribution of X0. More specifically, define the
row vector π(0) as:

π(0) = [P(X0 = 1)P(X0 = 2)⋯P(X0 = r)].

How can we obtain the probability distribution of X1, X2, ⋯?

We can use the law of total probability. More specifically, for any j∈S,
we can write:

15
If we generally define

we can rewrite the above result in the form of matrix multiplication

where P is the state transition matrix. Similarly, we can write:

More generally, we can write:
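(The displayed equations did not survive the text extraction; reconstructed in standard notation, consistent with the definitions of π(0) and P above, the chain of results referred to here is:)

$$\pi^{(n)} = \big[\,P(X_n = 1)\ \ P(X_n = 2)\ \cdots\ P(X_n = r)\,\big],$$
$$\pi^{(1)} = \pi^{(0)} P, \qquad \pi^{(2)} = \pi^{(1)} P = \pi^{(0)} P^{2}, \qquad \pi^{(n)} = \pi^{(0)} P^{n}.$$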

16
Our work thus far is summarized below:

Example

17
Solution

18
Solved example
Voting Trends At the end of June in a presidential election year, 40% of the voters
were registered as liberal, 45% as conservative, and 15% as independent. Over a one-
month period, the liberals retained 80% of their constituency, while 15% switched to
conservative and 5% to independent. The conservatives retained 70% and lost 20% to
the liberals. The independents retained 60% and lost 20% each to the conservatives
and liberals. Assume that these trends continue.
a) Write a transition matrix using this information.
b) Write a probability vector for the initial distribution.
Find the percent of each type of voter at the end of each of the following months:
c) July
d) August
e) September
f) October

Solution
a) Transition matrix (rows = current affiliation, columns = next month's affiliation):

                   L      C      I
  Liberal         0.8    0.15   0.05
  Conservative    0.2    0.7    0.1
  Independent     0.2    0.2    0.6

b) Initial distribution
40% of the voters were registered as liberal, 45% as conservative, and 15% as
independent, so the probability vector for the initial distribution is:
X0 = π(0) = [0.4 0.45 0.15]

c) Percent of each type of voter at the end of July:
44% L; 40.5% C; 15.5% I
d) Percent of each type of voter at the end of August:
46.4% L; 38.05% C; 15.55% I
e) Percent of each type of voter at the end of September (apply X0 · P^n with n = 3):
47.84% L; 36.705% C; 15.455% I
f) Percent of each type of voter at the end of October:
48.704% L; 35.9605% C; 15.3355% I
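These month-by-month figures can be reproduced with a few lines of NumPy (an illustrative sketch, not part of the original slides):

```python
import numpy as np

# Transition matrix: rows/columns ordered Liberal, Conservative, Independent
P = np.array([
    [0.8, 0.15, 0.05],
    [0.2, 0.70, 0.10],
    [0.2, 0.20, 0.60],
])
x = np.array([0.40, 0.45, 0.15])   # end-of-June distribution

for month in ["July", "August", "September", "October"]:
    x = x @ P                       # one month of transitions
    print(month, np.round(x, 6))
# July      [0.44     0.405    0.155   ]
# August    [0.464    0.3805   0.1555  ]
# September [0.4784   0.36705  0.15455 ]
# October   [0.48704  0.359605 0.153355]
```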

20
Chapman-Kolmogorov equations
Multiple step transition probabilities

21
Chapman-Kolmogorov equation
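(The equation itself did not survive the text extraction; reconstructed in standard notation, consistent with the transition probabilities defined earlier, it reads:)

$$p^{(m+n)}_{ij} = \sum_{k \in S} p^{(m)}_{ik}\, p^{(n)}_{kj}, \qquad \text{equivalently} \qquad P^{(m+n)} = P^{(m)} P^{(n)}.$$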

22
2-step transition probabilities

23
n-step, m-step and (m + n)-step

24
Interpretation

Chapman-Kolmogorov is intuitive. Recall:

25
Matrix form

n-step transition probabilities

26
n-Step Transition Probabilities:

Consider a Markov chain {Xn, n = 0, 1, 2, ...}, where Xn ∈ S. If X0 = i, then X1 = j with
probability pij. That is, pij gives us the probability of going from state i to state j in
one step. Now suppose that we are interested in finding the probability of going from
state i to state j in two steps, i.e.,

p_ij^(2) = P(X2 = j | X0 = i).

We can find this probability by applying the law of total probability. In particular, we
argue that X1 can take one of the possible values in S. Thus, we can write:

p_ij^(2) = Σ_{k ∈ S} P(X2 = j | X1 = k, X0 = i) P(X1 = k | X0 = i) = Σ_{k ∈ S} p_ik p_kj.

We conclude that the two-step transition matrix is P^(2) = P · P = P².
27
Example: Happy-Sad

28
29
Regular Transition Matrices

One of the many applications of Markov chains is in finding long-range predictions.
It is not possible to make long-range predictions with all transition matrices, but for a
large set of transition matrices, long-range predictions are possible. Such predictions
are always possible with regular transition matrices.

A transition matrix is regular if some power of the matrix contains all positive
entries. A Markov chain is a regular Markov chain if its transition matrix is regular.

30
NOTE
If a transition matrix P has some zero entries, and P2 does as well, you may wonder
how far you must compute Pk to be certain that the matrix is not regular.

The answer is that if zeros occur in the identical places in both Pk and Pk+1 for any k,
they will appear in those places for all higher powers of P, so P is not regular.

Suppose that v is any probability vector. It can be shown that for a regular Markov
chain with a transition matrix P, there exists a single vector V that does not depend on
v, such that v·Pⁿ gets closer and closer to V as n gets larger and larger.
PROPERTIES OF REGULAR MARKOV CHAINS

Suppose a regular Markov chain has a transition matrix P.

1) As n gets larger and larger, the product v·Pⁿ approaches a unique vector V for any
initial probability vector v. Vector V is called the equilibrium/steady-state vector or
fixed vector.

2) Vector V has the property that V·P = V.

3) To find V, solve the system of equations obtained from the matrix equation
V·P = V together with the fact that the sum of the entries of V is 1.

4) The powers Pⁿ come closer and closer to a matrix whose rows are all made up of
the entries of the equilibrium vector V.

32
EQUILIBRIUM VECTOR OF A MARKOV CHAIN
If a Markov chain with transition matrix P is regular, then there is a unique vector V
such that, for any probability vector v and for large values of n

Vector V is called the equilibrium vector or the fixed vector of the Markov chain.

If a Markov chain with transition matrix P is regular, then there exists a probability
vector V such that

This vector V gives the long-range trend of the Markov chain. Vector V is found by
solving a system of linear equations, as shown in the next example.
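A small sketch of step 3) above in NumPy (not from the slides): solve V·P = V together with the constraint that the entries of V sum to 1. The voting-trends matrix from the earlier example is reused purely as an illustration.

```python
import numpy as np

# Regular transition matrix (voting-trends example: L, C, I)
P = np.array([
    [0.8, 0.15, 0.05],
    [0.2, 0.70, 0.10],
    [0.2, 0.20, 0.60],
])
n = P.shape[0]

# V P = V  <=>  V (P - I) = 0; append the constraint sum(V) = 1
A = np.vstack([(P - np.eye(n)).T, np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
V, *_ = np.linalg.lstsq(A, b, rcond=None)
print(V)               # equilibrium vector
print(V @ P)           # equals V again (steady state)
```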

Example:

33
34
35
36
37
38
39
Markov Chain (cont.)

X1 → X2 → ⋯ → Xn-1 → Xn

• For each integer n, a Markov chain assigns probability to sequences (x1 … xn) over D
(i.e., xi ∈ D) as follows:

p((x1, x2, ..., xn)) = p(X1 = x1) ∏_{i=2}^{n} p(Xi = xi | Xi-1 = xi-1)
                     = p(x1) ∏_{i=2}^{n} a_{x_{i-1} x_i}
Similarly, (X1,…, Xi ,…) is a sequence of probability
distributions over D. There is a rich theory which studies the
properties of these sequences. A bit of it is presented next.
40
Markov Chain (cont.)

X1 → X2 → ⋯ → Xn-1 → Xn

Similarly, each Xi is a probability distribution over D, which is determined by the
initial distribution (p1, ..., pn) and the transition matrix M.
There is a rich theory which studies the properties of such "Markov sequences"
(X1, …, Xi, …). A bit of this theory is presented next.

41
Random walk with borders (gambling)

42
43
44
45
46
47
48
Classification of States

Performance questions to be answered:
 How often is a certain state visited?
 How much time will the system spend in a state?
 What is the average length of the intervals between visits?

Other properties:
Irreducible
Recurrent
Mean recurrence time
Aperiodic
Homogeneous

49
Classification of States
The first definition concerns the accessibility of states from each other: If it is
possible to go from state i to state j, we say that state j is accessible from state i. In
particular, we can provide the following definitions.

50
We say that two states i and j communicate, written i ↔ j.

51
Example.
Consider a Markov chain with state space S = {0, 1, 2, 3} and transition matrix

We want to determine which states communicate. States 0 and 1 communicate with
each other; states 2 and 3 are not accessible from states 0 and 1, and state 3 is not
accessible from state 2. The chain is therefore not irreducible; it has three classes:
{0, 1}, {2}, {3}.

Now consider a Markov chain with state space S = {0, 1, 2} and transition matrix

Conclusion?

All the states communicate, so the chain is irreducible.
Example 2:

Find the communicating classes associated with the transition diagram shown.

Solution:
{1, 2, 3}, {4, 5}.
State 2 leads to state 4, but state 4 does not lead back to state 2, so they are in
different communicating classes.

Definition: A communicating class of states is closed if it is not possible to
leave that class.

Example: In the transition diagram above:
• Class {1, 2, 3} is not closed: it is possible to escape to class {4, 5}.
• Class {4, 5} is closed: it is not possible to escape.

53
Example:
Consider the Markov chain shown in the figure that follows. It is assumed that when
there is an arrow from state i to state j, then pij > 0. Find the equivalence classes for
this Markov chain.
54
There are four communicating classes in this Markov chain. Looking at the figure below,
we notice that states 1 and 2 communicate with each other, but they do not
communicate with any other nodes in the graph. Similarly, nodes 3 and 4
communicate with each other, but they do not communicate with any other nodes in
the graph. State 5 does not communicate with any other states, so it by itself is a
class. Finally, states 6, 7, and 8 form another class.

Thus, the classes are:
Class 1 = {state 1, state 2},
Class 2 = {state 3, state 4},
Class 3 = {state 5},
Class 4 = {state 6, state 7, state 8}.
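A small sketch (Python, not from the slides) of how communicating classes can be computed mechanically: i and j are in the same class exactly when each is reachable from the other, so it suffices to compute reachability from every state. The adjacency list below is a hypothetical stand-in for the chain in the figure, chosen only so that it produces the four classes described above.

```python
from collections import defaultdict

def reachable(adj, start):
    """Set of states reachable from `start` (including start) by DFS."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def communicating_classes(adj):
    states = list(adj)
    reach = {s: reachable(adj, s) for s in states}
    classes, assigned = [], set()
    for i in states:
        if i in assigned:
            continue
        # i and j communicate iff each is reachable from the other
        cls = {j for j in states if j in reach[i] and i in reach[j]}
        classes.append(cls)
        assigned |= cls
    return classes

# Hypothetical one-step transitions consistent with the four classes above
adj = defaultdict(list, {1: [2, 3], 2: [1], 3: [4], 4: [3, 6],
                         5: [3, 7], 6: [7], 7: [8], 8: [6]})
print(communicating_classes(adj))   # -> [{1, 2}, {3, 4}, {5}, {6, 7, 8}]
```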
55
A Markov chain is said to be irreducible if all states communicate with each other.
Looking at Figure below:

Notice that there are two kinds of classes. In particular, if at any time the Markov
chain enters Class 4, it will always stay in that class.

On the other hand, for other classes this is not true. For example, if X0=1, then the
Markov chain might stay in Class 1 for a while, but at some point, it will leave that
class and it will never return to that class again. The states in Class 4 are called
recurrent states, while the other states in this chain are called transient.
56
Recurrent states
In general, a state is said to be recurrent if, any time that we leave that state, we will
return to that state in the future with probability one. On the other hand, if the
probability of returning is less than one, the state is called transient. Here, we provide
a formal definition:

Consider a Markov chain and assume X0=i. If i is a recurrent state, then the
chain will return to state i any time it leaves that state. Therefore, the chain will
visit state i an infinite number of times. On the other hand, if i is a transient
state, the chain will return to state i with probability fii<1. Thus, in that case, the
total number of visits to state i will be a Geometric random variable with
parameter 1−fii.
57
Recurrent – Transient State

58
59
60
61
Example 1
Show that in a finite Markov chain, there is at least one recurrent class.

Solution

Consider a finite Markov chain with r states, S={1,2,⋯,r}.


Suppose by contradiction that all states are transient.

Then, starting from time 0, the chain might visit state 1 several times, but at some
point the chain will leave state 1 and will never return to it. That is, there exists an
integer M1>0 such that Xn≠1, for all n ≥ M1. Similarly, there exists an integer M2>0
such that Xn≠2, for all n≥M2, and so on. Now, if you choose

n ≥ max{M1,M2,⋯,Mr},

then Xn cannot be equal to any of the states 1,2,⋯,r.

This is a contradiction, so we conclude that there must be at least one recurrent state,
which means that there must be at least one recurrent class.

62
Ex2
On the set S = {0, 1, . . . , n}, consider the Markov chain whose transition matrix P is
given for 0 ≤ x ≤ n − 1 by:

with the state n absorbing (i.e. P(n, x) = 1 if x = n and 0 otherwise), and with 0 < p < 1.

a) Draw the graph associated with this Markov chain. Which states of this chain are
recurrent and which are transient?

b) With S = {1, . . . , 6}, complete the following matrix so that it is the transition matrix
of a Markov chain

and determine which of its states are transient and which are recurrent.


Solution
a) There are two equivalence classes (two irreducible sub-chains) in this chain. The
first contains the states {0, 1, . . . , n − 1} and the other contains the state n.

The state n is recurrent, since P{Tn < ∞ | X0 = n} = 1.

The other states are either all recurrent or all transient
(since they belong to the same strongly connected component of the graph).

Let us study the state n − 1. We have
P{Tn−1 < ∞ | X0 = n − 1} ≤ P{X1 = 0 | X0 = n − 1}
= 1 − p
< 1
The states 0 to n − 1 are therefore transient.
64
… Solution
b) Draw the graph of the Markov chain.

We see that there are two closed strongly connected components (with no outgoing
edges): C1 = {1, 2} and C2 = {3, 5}. The set {4, 6} is also strongly connected, but it is
not closed.

If we group these connected components together and draw the associated tree, the
set {4, 6} is the root of the tree, and the nodes C1 and C2 are its children and are the
leaves of the tree.

Since the graph is finite, we know that the states in the leaves are recurrent and the
others are transient. Let us prove this in the present case.

65
… Solution

One shows in the same way that the states of C2 are recurrent.

Let us now show that state 4 is not recurrent (since it communicates with 6, this will
also show that 6 is transient). Since 6 is the only state reachable in one step from 4
that can then allow a return to 4, we have

P[T4 < ∞ | X0 = 4] ≤ P[X1 = 6 | X0 = 4] < 1.

So 4 and 6 are transient.


66
Periodicity
Consider the Markov chain shown in Figure below:

There is a periodic pattern in this chain. Starting from state 0, we only return to 0 at
times n = 3, 6, ⋯. In other words, p_00^(n) = 0 if n is not divisible by 3. Such a state is
called a periodic state with period d(0) = 3.

67
Periodic/aperiodic

A class is said to be periodic if its states are periodic. Similarly, a class is said to be
aperiodic if its states are aperiodic. Finally, a Markov chain is said to be aperiodic if
all of its states are aperiodic.
If i↔j, then d(i)=d(j).

68
Example 2
Consider the Markov chain from before.

Is Class 1 = {state 1, state 2} aperiodic?
Is Class 2 = {state 3, state 4} aperiodic?
Is Class 4 = {state 6, state 7, state 8} aperiodic?

Solution

(gcd: greatest common divisor)


Irreducible

A Markov chain is said to be irreducible if it has only one communicating class, that
is, if all states communicate with each other.
70
71
gcd : greatest common divisor

72
73
Ergodic Markov Chains

A Markov chain is ergodic if:
1. the corresponding graph is strongly connected, and
2. it is not periodic.

Ergodic Markov chains are important since they guarantee that the corresponding
Markovian process converges to a unique distribution, in which all states have
strictly positive probability.

74
Ex2
Consider the following 7 × 7 matrix Q:

75
Ex4

76
Ex5
Consider a gambler who plays the following game at a casino:

77
“Good” Markov chains
A Markov chain is good if the distributions Xi, as i → ∞:

(1) converge to a unique distribution, independent of the initial distribution;

(2) in that unique distribution, each state has a positive probability.

The Fundamental Theorem of Finite Markov Chains:
A Markov chain is good ⇔ the corresponding graph is ergodic.

We will prove one direction, by showing that non-ergodic Markov chains are not good.
78
Examples of “Bad” Markov Chains

A Markov chain is not “good” if either:
1. it does not converge to a unique distribution, or
2. it does converge to a unique distribution, but some states in this distribution have
zero probability.

79
Bad case 1: Mutual Unreachability

Consider two initial distributions:
a) p(X1 = A) = 1 (p(X1 = x) = 0 if x ≠ A).
b) p(X1 = C) = 1.

In case a), the sequence will stay at A forever.
In case b), it will stay in {C, D} forever.

Fact 1: If G has two states which are unreachable from each other, then {Xi} cannot
converge to a distribution which is independent of the initial distribution.
80
Bad case 2: Transient States

Once the process moves from B to D, it will never come back.

81
Bad case 2: Transient States

Fact 2: For each initial distribution, with probability 1 a transient state will be visited
only a finite number of times.

Proof: Let A be a transient state, and let X be the set of states from which A is
unreachable. It is enough to show that, starting from any state, with probability 1 a
state in X is reached after a finite number of steps. (Exercise: complete the proof.)
82
Corollary: A good Markov
Chain is irreducible

83
Bad case 3: Periodic Markov Chains

Recall: A Markov chain is periodic if all the states in it have a period k > 1. The chain
above has period 2.
In the chain above, consider the initial distribution p(B) = 1.
Then states {B, C} are visited (with positive probability) only in odd steps, and states
{A, D, E} are visited only in even steps.

84
Bad case 3: Periodic States

Fact 3: In a periodic Markov chain (of period k > 1) there are initial distributions
under which the states are visited in a periodic manner. Under such initial
distributions Xi does not converge as i → ∞.

Corollary: A good Markov chain is not periodic.
85
The Fundamental Theorem of Finite Markov Chains:
We have proved that non-ergodic Markov chains are not good.
A proof of the other part (based on Perron-Frobenius theory) is beyond the scope of
this course:

If a Markov chain is ergodic, then
1. it has a unique stationary distribution vector V > 0, which is an eigenvector of the
transition matrix;
2. for any initial distribution, the distributions Xi converge to V as i → ∞.

86
Hidden Markov Model

Hidden states:  S1 → S2 → ⋯ → SL-1 → SL   (transitions M)
Observations:   x1   x2       xL-1    xL   (emissions T)

A Markov chain (s1, …, sL): p(s1, …, sL) = ∏_{i=1}^{L} p(si | si-1),
and for each state s and a symbol x we have p(Xi = x | Si = s).

Application in communication: the message sent is (s1, …, sm) but we receive
(x1, …, xm). Compute the most likely message sent.
Application in speech recognition: the word said is (s1, …, sm) but we recorded
(x1, …, xm). Compute the most likely word said.

87
Hidden Markov Model

Notations:
Markov chain transition probabilities: p(Si+1 = t | Si = s) = a_st
Emission probabilities: p(Xi = b | Si = s) = e_s(b)

For Markov chains we know: p(s) = p(s1, …, sL) = ∏_{i=1}^{L} p(si | si-1)

What is p(s, x) = p(s1, …, sL; x1, …, xL)?

88
Hidden Markov Model

p(Xi = b | Si = s) = e_s(b) means that the probability of xi depends only on the
probability of si.
Formally, this is equivalent to the conditional independence assumption:
p(Xi = xi | x1, .., xi-1, xi+1, .., xL, s1, .., si, .., sL) = e_si(xi)

Thus p(s, x) = p(s1, …, sL; x1, …, xL) = ∏_{i=1}^{L} p(si | si-1) e_si(xi)

89
Absorbing Markov Chains
Not all Markov chains are regular. In fact, some of the most important life science
applications of Markov chains do not involve transition matrices that are regular. One
type of Markov chain that is widely used in the life sciences is called an absorbing
Markov chain.

When we use the ideas of Markov chains to model living organisms, a common state
is death. Once the organism enters that state, it is not possible to leave. In this
situation, the organism has entered an absorbing state.

For example, suppose a Markov chain has the following transition matrix.

The matrix shows that P12, the probability of going from state 1 to state 2, is 0.6, and
that P22, the probability of staying in state 2, is 1.

Thus, once state 2 is entered, it is impossible to leave. For this reason, state 2 is called
an absorbing state.

90
The resulting transition diagram shows that it is not possible to leave
state 2.

Generalizing from this example leads to the following definition

91
Absorbing Markov Chains
Identify all absorbing states in the Markov chains having the following matrices.
Decide whether the Markov chain is absorbing

92
Solution

a) Since P11 = 1 and P33 = 1, both state 1 and state 3 are absorbing states. (Once
these states are reached, they cannot be left.) The only non-absorbing state is state 2.

There is a 0.3 probability of going from state 2 to the absorbing state 1, and a 0.2
probability of going from state 2 to state 3, so it is possible to go from the
non-absorbing state to an absorbing state. This Markov chain is absorbing. The
transition diagram is shown in the figure.

93
b) States 2 and 4 are absorbing, and states 1 and 3 are non-absorbing.
From state 1, it is possible to go only to states 1 or 3; from state 3 it is possible to go
only to states 1 or 3.

As the transition diagram in the figure below shows, neither non-absorbing state leads
to an absorbing state, so this Markov chain is non-absorbing.

94
Let P be the transition matrix for an absorbing Markov chain.
Rearrange the rows and columns of P so that the absorbing states come first.
Matrix P will have the form:

where Im is an identity matrix, with m equal to the number of absorbing states, and O
is a matrix of all zeros.

 The element in row i, column j of the fundamental matrix gives the number of
visits to state j that are expected to occur before absorption, given that the
current state is state i.
 The product FR gives the matrix of probabilities that a particular initial
non-absorbing state will lead to a particular absorbing state.
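The rearranged form referred to above is the standard block form P = [[Iₘ, O], [R, Q]], and the fundamental matrix (whose defining formula is in the part of the slide not reproduced here) is the standard F = (I − Q)⁻¹, where Q is the block of transitions among non-absorbing states. A hedged NumPy sketch, using a small made-up absorbing chain:

```python
import numpy as np

# Hypothetical absorbing chain in standard form: the two absorbing states
# come first, so P = [[I, O], [R, Q]].
R = np.array([[0.3, 0.2],      # non-absorbing -> absorbing
              [0.1, 0.1]])
Q = np.array([[0.4, 0.1],      # non-absorbing -> non-absorbing
              [0.3, 0.5]])

# Fundamental matrix: expected visits to each transient state before absorption
F = np.linalg.inv(np.eye(Q.shape[0]) - Q)
print(F)

# FR: probability that each starting transient state ends in each absorbing state
print(F @ R)
print((F @ R).sum(axis=1))     # each row sums to 1 (absorption is certain)
```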
Using the Law of Total Probability with Recursion

In this section, we will use this technique to find:

 absorption probabilities,

 mean hitting times, and

 mean return times.

96
Consider the Markov chain shown in the state transition diagram:

The state transition matrix of this Markov chain is given by the following matrix

How many classes are there? For each class, mention if it is recurrent or transient.

There are three classes: Class 1 consists of one state, state 0, which is a recurrent state.
Class 2 consists of two states, states 1 and 2, both of which are transient. Finally,
Class 3 consists of one state, state 3, which is a recurrent state.

97
Note that states 0 and 3 have the following property:
 once you enter those states, you never leave them.

For this reason, we call them absorbing states. For our example here, there are two
absorbing states. The process will eventually get absorbed in one of them. The first
question that we would like to address deals with finding absorption probabilities.

Absorption Probabilities
Consider the Markov chain in the figure below once more:

Let's define ai as the absorption probability in state 0 if we start from state i. More
specifically,

98
By the above definition, we have a0=1 and a3= 0. To find the values of a1 and a2, we
apply the law of total probability with recursion. The main idea is the following:

 if Xn=i, then the next state will be Xn+1=k with probability pik.

Thus, we can write:

Solving the above equations will give us the values of a1 and a2. More specifically,
using Equation above, we obtain
We also know a0=1 and a3=0. Solving for
a1 and a2, we obtain

99
Exercise:
Consider the above Markov chain once more.

Let's define bi as the absorption probability in state 3 if we start from state i. Use the
above procedure to obtain bi for i = 0,1,2,3.

100
From the definition of bi and the Markov chain graph above, we have b0 = 0 and b3 = 1.
Now, using the same recursion as before, we have:

101
Example 2

102
In sum,

 In general, a finite Markov chain might have several transient as well as several
recurrent classes.
 As n increases, the chain will get absorbed in one of the recurrent classes and it
will stay there forever.
 We can use the above procedure to find the probability that the chain will get
absorbed in each of the recurrent classes.
 In particular, we can replace each recurrent class with one absorbing state. Then,
the resulting chain consists of only transient and absorbing states.
 We can then follow the above procedure to find absorption probabilities.
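A sketch of this procedure in Python (the four-state chain below is hypothetical, since the slide's figure is not reproduced): set a_i = 1 on the target absorbing state, a_i = 0 on the other absorbing states, and solve the linear equations a_i = Σ_k p_ik a_k for the transient states.

```python
import numpy as np

# Hypothetical chain on states {0, 1, 2, 3}; states 0 and 3 are absorbing
P = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.3, 0.2, 0.3, 0.2],
    [0.1, 0.4, 0.3, 0.2],
    [0.0, 0.0, 0.0, 1.0],
])
transient = [1, 2]
target = 0                      # absorption probability into state 0

# a_i = sum_k p_ik a_k for transient i, with a_target = 1 and a = 0 on the
# other absorbing state; this gives (I - Q) a_transient = r
Q = P[np.ix_(transient, transient)]
r = P[transient, target]        # one-step probability of hitting the target
a_transient = np.linalg.solve(np.eye(len(transient)) - Q, r)
print(dict(zip(transient, a_transient)))
```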
103
Mean Hitting Times
Definition: the expected time until the process hits a certain set of states for the first
time.

Again, consider the Markov chain below:

Let's define ti as the number of steps needed until the chain hits state 0 or state 3,
given that X0=i. In other words, ti is the expected time (number of steps) until the
chain is absorbed in 0 or 3, given that X0=i. By this definition, we have t0=t3= 0

To find t1 and t2, we use the law of total probability with recursion as before. For
example, if X0=1, then after one step, we have X1=0 or X1=2. Thus, we can write

Similarly, we can write:

104
In sum,

105
Vector of hitting probabilities

Let A be some subset of the state space S. (A need not be a communicating class: it
can be any subset required, including a subset consisting of a single state, e.g. A = {4}.)

The hitting probability from state i to set A is the probability of ever reaching the set
A, starting from initial state i.

We write this probability as h_iA. Thus

h_iA = P(Xt ∈ A for some t ≥ 0 | X0 = i).

106
Example:

Let set A = {1, 3} as shown

The hitting probability for set A is:


 1 starting from states 1 or 3
(We are starting in set A, so we hit it immediately);
 0 starting from states 4 or 5
(The set {4, 5} is a closed class, so we can never escape out to set
A);
 0.3 starting from state 2
(We could hit A at the first step (probability 0.3), but otherwise we
move to state 4 and get stuck in the closed class {4, 5} (probability 0.7).)

107
We can summarize all the information from the example above in a vector of hitting
probabilities:
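Collecting the values just computed (states 1 through 5), the vector is

$$h^{A} = \big(h_{1A},\, h_{2A},\, h_{3A},\, h_{4A},\, h_{5A}\big) = (1,\ 0.3,\ 1,\ 0,\ 0).$$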

108
Example

109
Mean Return Times

110
Calculating the mean hitting times
The vector of expected hitting times mA = (miA : i ∈ S) is the
minimal non-negative solution to the following equations:

111
112
113
Example:Consider the Markov chain shown in Figure.
Let tk be the expected number of steps until the
chain hits state 1 for the first time, given that X0=k.
Clearly, t1=0. Also, let r1 be the mean return time to
state 1.

1) Find t2 and t3
2) Find r1

Solution

114
115
Stationary and Limiting Distributions
Here, we would like to discuss long-term behavior of Markov chains. In particular,
we would like to know the fraction of times that the Markov chain spends in each
state as n becomes large. More specifically, we would like to study the distributions

as n → ∞. To better understand the subject, we will first look at an example and then
provide a general analysis.

116
Example
Consider a Markov chain with two possible states, S={0,1}. In particular, suppose
that the transition matrix is given by

where a and b are two real numbers in the interval [0,1] such that 0<a+b<2. Suppose
that the system is in state 0 at time n=0 with probability α, i.e.,

117
118
119
120
Example 2:
Consider a Markov chain with two possible states, S={0,1}. In particular, suppose
that the transition matrix is given by:

where a and b are two real numbers in the interval [0,1] such that 0<a+b<2. Find the
mean return times, r0 and r1, for this Markov chain.

121
122
Finite Markov Chains:
Here, we consider Markov chains with a finite number of states. In general, a finite
Markov chain can consist of several transient as well as recurrent states.

As n becomes large the chain will enter a recurrent class and it will stay there forever.
Therefore, when studying long-run behaviors we focus only on the recurrent classes.

If a finite Markov chain has more than one recurrent class, then the chain will get
absorbed in one of the recurrent classes.
It turns out that in this case the Markov chain has a well-defined limiting behavior if it
is aperiodic (states have period 1). How do we find the limiting distribution? The trick
is to find a stationary distribution.

Here is the idea: If π = [π1, π2, ⋯] is a limiting distribution for a Markov chain, then we
have:

123
Similarly, we can write:

We can explain the equation π = πP intuitively:

Suppose that Xn has distribution π. As we saw before, πP gives the probability
distribution of Xn+1. If we have π = πP, we conclude that Xn and Xn+1 have the same
distribution. In other words, the chain has reached its steady-state (limiting)
distribution. We can equivalently write π = πP as:

πj = Σ_{k ∈ S} πk pkj, for all j ∈ S.

The right-hand side gives the probability of going to state j in the next step. When we
equate both sides, we are implying that the probability of being in state j in the next
step is the same as the probability of being in state j now.
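As a computational aside (a Python sketch, not part of the slides): π is a left eigenvector of P with eigenvalue 1, so it can also be obtained from an eigendecomposition and then normalized to sum to 1.

```python
import numpy as np

P = np.array([[0.8, 0.15, 0.05],    # any regular transition matrix works;
              [0.2, 0.70, 0.10],    # the voting-trends matrix is reused here
              [0.2, 0.20, 0.60]])

# Left eigenvectors of P are right eigenvectors of P.T
vals, vecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(vals - 1.0))   # eigenvalue closest to 1
pi = np.real(vecs[:, k])
pi = pi / pi.sum()                  # normalize so the entries sum to 1
print(pi)
print(pi @ P)                       # equals pi: stationary
```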
124
We now summarize the discussion in the following theorem.

125
126
127
128
Example
Consider the Markov chain shown below:

a) Is this chain irreducible?
b) Is this chain aperiodic?
c) Find the stationary distribution for this chain.
d) Is the stationary distribution a limiting distribution for the chain?

129
Solution
a) The chain is irreducible since we can go from any state to any other state in a finite
number of steps.
b) Since there is a self-transition, i.e., p11 > 0, we conclude that the chain is aperiodic.
c) To find the stationary distribution, we need to solve:

d) Since the chain is irreducible and aperiodic, we conclude that the above stationary
distribution is a limiting distribution.

130
Countably Infinite Markov Chains

Theorem

131
where rj is the mean return time to state j.
How do we use the above theorem? Consider an infinite Markov chain {Xn,n
=0,1,2,...}, where Xn∈S={0,1,2,⋯}. Assume that the chain is irreducible and
aperiodic. We first try to find a stationary distribution π by solving the equations
132
Example
Consider the Markov chain shown in Figure below.

Assume that 0 < p < 1/2. Does this chain have a limiting distribution?


133
Solution
This chain is irreducible since all states communicate with each other. It is also
aperiodic since it includes a self-transition, P00>0. Let's write the equations for a
stationary distribution.

For state 0, we can write:

134
135
Consider the Markov chain shown in Figure below:

Assume that 1/2 < p < 1. Does this chain have a limiting distribution? For all
i, j ∈ {0, 1, 2, ⋯}, find
lim_{n→∞} P(Xn = j | X0 = i).
Solution
The chain is irreducible since all states communicate with each other. It is also
aperiodic since it includes a self-transition, P00>0. Let's write the equations for a
stationary distribution.
For state 0, we can write :

136
137
