Markov chains_lectures_DCM_2024
• E-mail: dndoh2009@gmail.com
• Tel: 653754070/677540384
Markov Chain:
a process with a finite number of states (or outcomes, or events) in
which the probability of being in a particular state at step n + 1
depends only on the state occupied at step n.
A sequence of trials of an experiment is a Markov chain if:
1. the outcome of each experiment is one of a set of discrete states;
2. the outcome of an experiment depends only on the present state, and not on any
past states.
Categories
If the time parameter is discrete {t1, t2, t3, …}, the process is called a Discrete Time Markov Chain (DTMC).
If Xn = j, we say that the process is in state j. The numbers P(Xm+1 = j | Xm = i) are called the transition probabilities. We assume that the transition probabilities do not depend on time; that is, pij = P(Xm+1 = j | Xm = i) does not depend on m (the chain is time-homogeneous).
State Transition Matrix and Diagram
We often list the transition probabilities in a matrix. The matrix is
called the state transition matrix or transition probability matrix and is
usually shown by P. Assuming the states are 1, 2, ⋯, r, then the state
transition matrix is given by
Example: a chain with two states, 0 and 1:
P00 = P(Xt+1 = 0 | Xt = 0) = 1/4
P01 = P(Xt+1 = 1 | Xt = 0) = 3/4
P10 = P(Xt+1 = 0 | Xt = 1) = 1/2
P11 = P(Xt+1 = 1 | Xt = 1) = 1/2
so the state transition matrix is
P = [ 1/4  3/4
      1/2  1/2 ]
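The two-state example above can be checked with a short sketch (plain Python; the matrix entries are the ones given on this slide):

```python
# Two-state chain: P00 = 1/4, P01 = 3/4, P10 = 1/2, P11 = 1/2.
P = [[0.25, 0.75],
     [0.50, 0.50]]

# Every row of a transition matrix must sum to 1.
for row in P:
    assert abs(sum(row) - 1.0) < 1e-12

def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Starting surely in state 0, after one step the chain is in state 0
# with probability 1/4 and in state 1 with probability 3/4.
print(step([1.0, 0.0], P))  # [0.25, 0.75]
```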
Transition matrix features:
Matrix Representation
The transition probabilities form the matrix M = (ast). For the states A, B, C, D:

         A     B     C     D
    A   0.95   0    0.05   0
    B   0.2   0.5    0    0.3
    C    0    0.2    0    0.8
    D    0     0     1     0

M is a stochastic matrix: Σt ast = 1 for every state s (each row sums to 1).
The initial distribution vector (u1 … um) defines the distribution of X1 (p(X1 = si) = ui).
Matrix Representation
Example: with the matrix M above, if X1 = (0, 1, 0, 0) then X2 = X1·M = (0.2, 0.5, 0, 0.3).
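The update X2 = X1·M is a row-vector times matrix product; a minimal sketch in plain Python, using the slide's matrix M:

```python
# 4-state chain with states A, B, C, D and the matrix M from the slide.
M = [[0.95, 0.0, 0.05, 0.0],
     [0.20, 0.5, 0.00, 0.3],
     [0.00, 0.2, 0.00, 0.8],
     [0.00, 0.0, 1.00, 0.0]]

def next_dist(x, M):
    """Row-vector times matrix: the state distribution one step later."""
    n = len(M)
    return [sum(x[i] * M[i][j] for i in range(n)) for j in range(n)]

X1 = [0.0, 1.0, 0.0, 0.0]   # the chain starts in state B
X2 = next_dist(X1, M)
print(X2)  # [0.2, 0.5, 0.0, 0.3]
```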
Representation of a Markov Chain as a Digraph (State Transition Diagram)
Each state is a node, and each nonzero transition probability ast labels a directed edge from s to t. The diagram shown corresponds to the matrix M above (edge labels 0.95, 0.2, 0.5, 0.8, 1, …).
Example: Consider the Markov chain shown in Figure below:
Solution
Exercise: write each of the transition diagrams (a), (b), (c) as a transition matrix.
State Probability Distributions:
We can use the law of total probability. More specifically, for any j∈S,
we can write:
If we generally define
Our work thus far is summarized below:
Example
Solution
Solved example
Voting Trends At the end of June in a presidential election year, 40% of the voters
were registered as liberal, 45% as conservative, and 15% as independent. Over a one-
month period, the liberals retained 80% of their constituency, while 15% switched to
conservative and 5% to independent. The conservatives retained 70% and lost 20% to
the liberals. The independents retained 60% and lost 20% each to the conservatives
and liberals. Assume that these trends continue.
a) Write a transition matrix using this information.
b) Write a probability vector for the initial distribution.
Find the percent of each type of voter at the end of each of
the following months.
c) July
d) August
e) September
f) October
Solution
a) Transition matrix (rows: current state; columns: next state):

                    L      C      I
    Liberal        0.80   0.15   0.05
    Conservative   0.20   0.70   0.10
    Independent    0.20   0.20   0.60
b) Initial distribution
40% of the voters were registered as liberal, 45% as conservative, and 15% as
independent:
b) The probability vector for the initial distribution:
X0 = π(0) = [0.4 0.45 0.15]
c) Percent of each type of voter at the end of July: X1 = X0·P:
44% L; 40.5% C; 15.5% I
d) Percent of each type of voter at the end of August: X2 = X1·P:
46.4% L; 38.05% C; 15.55% I
e) Percent of each type of voter at the end of September: apply Xn = X0·Pⁿ with n = 3:
47.84% L; 36.705% C; 15.455% I
f) Percent of each type of voter at the end of October: n = 4:
48.704% L; 35.9605% C; 15.3355% I
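The month-by-month figures above can be reproduced with a short sketch (plain Python; states in the order Liberal, Conservative, Independent):

```python
# Voting-trends chain: X_n = X_0 * P^n.
P = [[0.80, 0.15, 0.05],
     [0.20, 0.70, 0.10],
     [0.20, 0.20, 0.60]]
x = [0.40, 0.45, 0.15]      # initial distribution at the end of June

def step(x, P):
    """One month forward: x <- x * P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

for month in ["July", "August", "September", "October"]:
    x = step(x, P)
    print(month, [round(v, 6) for v in x])
# July    [0.44, 0.405, 0.155]
# October [0.48704, 0.359605, 0.153355]
```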
Chapman-Kolmogorov equations
Multiple step transition probabilities
Chapman-Kolmogorov equation
2-step transition probabilities
n-step, m-step and (m + n)-step
Interpretation
Matrix form
n-Step Transition Probabilities:
Consider a Markov chain {Xn, n = 0,1,2,...}, where Xn∈S. If X0= i, then X1=j with
probability pij. That is, pij gives us the probability of going from state i to state j in
one step. Now suppose that we are interested in finding the probability of going from
state i to state j in two steps, i.e.,
We can find this probability by applying the law of total probability. In particular, we argue that X1 can take one of the possible values in S. Thus, we can write
p(2)ij = P(X2 = j | X0 = i) = Σk∈S pik pkj.
We conclude that the two-step transition matrix is obtained by squaring P: P(2) = P·P.
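The identity p(2)ij = Σk pik pkj is exactly matrix multiplication; a sketch using the earlier two-state matrix:

```python
# Two-step transition probabilities are the entries of P * P.
P = [[0.25, 0.75],
     [0.50, 0.50]]

def matmul(A, B):
    """Product of two square matrices: (AB)_ij = sum_k A_ik * B_kj."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P2 = matmul(P, P)
# p(2)_00 = p00*p00 + p01*p10 = 0.25*0.25 + 0.75*0.5 = 0.4375
print(P2[0][0])  # 0.4375
```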
Example: Happy-Sad
Regular Transition Matrices
A transition matrix is regular if some power of the matrix contains all positive
entries. A Markov chain is a regular Markov chain if its transition matrix is regular.
NOTE
If a transition matrix P has some zero entries, and P2 does as well, you may wonder
how far you must compute Pk to be certain that the matrix is not regular.
The answer is that if zeros occur in the identical places in both Pk and Pk+1 for any k,
they will appear in those places for all higher powers of P, so P is not regular.
Suppose that v is any probability vector. It can be shown that for a regular Markov chain with a transition matrix P, there exists a single vector V that does not depend on v, such that v·Pⁿ gets closer and closer to V as n gets larger and larger.
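The note's stopping rule can be turned into a small regularity test. This is a sketch, not the lecture's own code; the extra cutoff (n−1)²+1 is the standard Wielandt bound (a primitive n-state matrix is guaranteed to have an all-positive power by then), added so the loop always terminates even for periodic chains whose zero patterns keep alternating:

```python
# Test whether a transition matrix P is regular: some power of P must have
# all positive entries.  If the zero pattern of P^(k+1) repeats that of P^k,
# no higher power can help, so P is not regular.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def zero_pattern(A):
    return [[x == 0 for x in row] for row in A]

def is_regular(P):
    n = len(P)
    Pk = P
    for _ in range((n - 1) ** 2 + 1):      # Wielandt bound: enough powers to decide
        if all(x > 0 for row in Pk for x in row):
            return True
        nxt = matmul(Pk, P)
        if zero_pattern(nxt) == zero_pattern(Pk):
            return False                   # zeros stuck in place: never regular
        Pk = nxt
    return False

print(is_regular([[0.25, 0.75], [0.5, 0.5]]))  # True  (already all positive)
print(is_regular([[0.0, 1.0], [1.0, 0.0]]))    # False (period-2 cycle)
```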
PROPERTIES OF REGULAR MARKOV CHAINS
EQUILIBRIUM VECTOR OF A MARKOV CHAIN
If a Markov chain with transition matrix P is regular, then there is a unique vector V such that, for any probability vector v and for large values of n,
v·Pⁿ ≈ V.
Vector V is called the equilibrium vector or the fixed vector of the Markov chain.
If a Markov chain with transition matrix P is regular, then there exists a probability vector V such that
V·P = V.
This vector V gives the long-range trend of the Markov chain. Vector V is found by solving a system of linear equations, as shown in the next example.
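Solving V·P = V together with the normalization ΣVi = 1 is a small linear system. A sketch (plain Python, with a minimal Gaussian-elimination helper) using the voting matrix from the earlier example; the equilibrium it produces, (0.5, 0.35, 0.15), can be verified by substitution:

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

P = [[0.80, 0.15, 0.05],
     [0.20, 0.70, 0.10],
     [0.20, 0.20, 0.60]]
n = len(P)
# Equation j: sum_i V_i * P[i][j] - V_j = 0; last equation replaced by sum_i V_i = 1.
A = [[P[i][j] - (1.0 if i == j else 0.0) for i in range(n)] for j in range(n)]
A[-1] = [1.0] * n
b = [0.0] * (n - 1) + [1.0]
V = solve(A, b)
print([round(v, 4) for v in V])  # [0.5, 0.35, 0.15]
```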
Example:
Markov Chain (cont.)
X1 → X2 → ⋯ → Xn−1 → Xn
p(x1, …, xn) = p(x1) · Π (i = 2 to n) a(x_{i−1}, x_i)
Similarly, (X1, …, Xi, …) is a sequence of probability distributions over D. There is a rich theory which studies the properties of these sequences. A bit of it is presented next.
Random walk with borders (gambling)
Classification of States
Other Properties:
Irreducible
Recurrent
Mean Recurrent Time
Aperiodic
Homogeneous
Classification of States
The first definition concerns the accessibility of states from each other: If it is
possible to go from state i to state j, we say that state j is accessible from state i. In
particular, we can provide the following definitions.
We say that two states i and j communicate, written i ↔ j, if i is accessible from j and j is accessible from i.
Example.
Consider a Markov chain with state space S = {0, 1, 2, 3} and transition matrix
Solution:
{1, 2, 3}, {4, 5}.
State 2 leads to state 4, but state 4 does not lead back to state 2, so they are in
different communicating classes.
Definition: A communicating class of states is closed if it is not possible to
leave that class.
Example:
Consider the Markov chain shown in Figure that follows.
Notice that there are two kinds of classes. In particular, if at any time the Markov
chain enters Class 4, it will always stay in that class.
On the other hand, for other classes this is not true. For example, if X0=1, then the
Markov chain might stay in Class 1 for a while, but at some point, it will leave that
class and it will never return to that class again. The states in Class 4 are called
recurrent states, while the other states in this chain are called transient.
Recurrent states
In general, a state is said to be recurrent if, any time that we leave that state, we will
return to that state in the future with probability one. On the other hand, if the
probability of returning is less than one, the state is called transient. Here, we provide
a formal definition:
Consider a Markov chain and assume X0=i. If i is a recurrent state, then the
chain will return to state i any time it leaves that state. Therefore, the chain will
visit state i an infinite number of times. On the other hand, if i is a transient
state, the chain will return to state i with probability fii<1. Thus, in that case, the
total number of visits to state i will be a Geometric random variable with
parameter 1−fii.
Recurrent – Transient State
Example 1
Show that in a finite Markov chain, there is at least one recurrent class.
Solution
Assume, for the sake of contradiction, that all states are transient. Then, starting from time 0, the chain might visit state 1 several times, but at some point the chain will leave state 1 and will never return to it. That is, there exists an integer M1 > 0 such that Xn ≠ 1 for all n ≥ M1. Similarly, there exists an integer M2 > 0 such that Xn ≠ 2 for all n ≥ M2, and so on. Now, if you choose
n ≥ max{M1, M2, ⋯, Mr},
then Xn cannot be equal to any state in {1, 2, ⋯, r}. This is a contradiction, so we conclude that there must be at least one recurrent state, which means that there must be at least one recurrent class.
Ex2
On the set S = {0, 1, . . . , n}, consider the Markov chain with transition matrix P given for 0 ≤ x ≤ n − 1 by:
b) Let S = {1, . . . , 6}; complete the following matrix so that it is the transition matrix of a Markov chain.
Let us study state n − 1. We have
P{Tn−1 < ∞ | X0 = n − 1} ≤ P{X1 = 0 | X0 = n − 1} = 1 − p < 1.
States 0 to n − 1 are therefore transient.
… Solution
b) Draw the graph of the Markov chain.
Since the graph is finite, we know that the states in the leaves are recurrent and the others are transient. Let us show this in the present case.
… Solution
There is a periodic pattern in this chain. Starting from state 0, we only return to 0 at times n = 3, 6, ⋯. In other words, p(n)00 = 0 if n is not divisible by 3. Such a state is called a periodic state with period d(0) = 3.
Periodic/aperiodic
A class is said to be periodic if its states are periodic. Similarly, a class is said to be
aperiodic if its states are aperiodic. Finally, a Markov chain is said to be aperiodic if
all of its states are aperiodic.
If i↔j, then d(i)=d(j).
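The period d(i) = gcd{n ≥ 1 : p(n)ii > 0} can be approximated by inspecting powers of P up to a cutoff; a plain-Python sketch (the cutoff 24 is an assumption that is ample for the small examples here, where the gcd stabilizes after a few returns):

```python
from math import gcd

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def period(P, i, max_n=24):
    """gcd of the return times n <= max_n with P^n[i][i] > 0."""
    d, Pn = 0, P
    for n in range(1, max_n + 1):
        if Pn[i][i] > 0:
            d = gcd(d, n)
        Pn = matmul(Pn, P)
    return d

cycle3 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # 0 -> 1 -> 2 -> 0: returns at 3, 6, ...
print(period(cycle3, 0))   # 3
lazy = [[0.5, 0.5], [1.0, 0.0]]              # self-loop at state 0
print(period(lazy, 0))     # 1
```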
Example 2
Consider the Markov chain
Solution
Ergodic Markov Chains
Ex2
Consider the following 7 × 7 matrix Q:
Ex4
Ex5
Consider a gambler who plays the following game at a casino:
“Good” Markov chains
A Markov chain is good if the distributions of Xi converge as i → ∞:
Bad case 1: Mutual Unreachability
b) p(X1= C) = 1
Bad case 2: Transient States
Bad case 3: Periodic Markov Chains
Bad case 3: Periodic States
Fact 3: In a periodic Markov Chain (of period k > 1) there are initial distributions under which the states are visited in a periodic manner. Under such initial distributions Xi does not converge as i → ∞.
Hidden Markov Model
S1 → S2 → ⋯ → SL−1 → SL    (hidden states; transitions M)
↓      ↓        ↓        ↓     (emissions T)
x1    x2      xL−1      xL
A Markov chain (s1, …, sL): p(s1, …, sL) = Π (i = 1 to L) p(si | si−1),
and for each state s and a symbol x we have p(Xi = x | Si = s).
Hidden Markov Model
Notations:
Markov Chain transition probabilities: p(Si+1 = t | Si = s) = ast
Emission probabilities: p(Xi = b | Si = s) = es(b)
For Markov Chains we know: p(s) = p(s1, …, sL) = Π (i = 1 to L) p(si | si−1)
Absorbing Markov Chains
Not all Markov chains are regular. In fact, some of the most important life science
applications of Markov chains do not involve transition matrices that are regular. One
type of Markov chain that is widely used in the life sciences is called an absorbing
Markov chain.
When we use the ideas of Markov chains to model living organisms, a common state
is death. Once the organism enters that state, it is not possible to leave. In this
situation, the organism has entered an absorbing state.
The resulting transition diagram shows that it is not possible to leave
state 2.
Absorbing Markov Chains
Identify all absorbing states in the Markov chains having the following matrices. Decide whether each Markov chain is absorbing.
Solution
a) Since P11 = 1 and P33 = 1, both state 1 and state 3 are absorbing states. (Once these states are reached, they cannot be left.) The only nonabsorbing state is state 2.
There is a 0.3 probability of going from state 2 to the absorbing state 1, and a 0.2
probability of going from state 2 to state 3, so that it is possible to go from the
nonabsorbing state to an absorbing state. This Markov chain is absorbing. The
transition diagram is shown in Figure.
b) States 2 and 4 are absorbing, with states 1 and 3 non-absorbing.
From state 1, it is possible to go only to states 1 or 3; from state 3 it is possible to go
only to states 1 or 3.
As the transition diagram in Figure below shows, neither non-absorbing state leads to
an absorbing state, so that this Markov chain is nonabsorbing.
Let P be the transition matrix for an absorbing Markov chain.
Rearrange the rows and columns of P so that the absorbing states come first. Matrix P will then have the form
P = [ Im  O
      R   Q ]
where Im is an identity matrix, with m equal to the number of absorbing states, and O is a matrix of all zeros. The fundamental matrix is F = (In − Q)⁻¹, where n is the number of nonabsorbing states and In is the n × n identity matrix.
The element in row i, column j of the fundamental matrix gives the expected number of visits to state j before absorption, given that the current state is state i.
The product FR gives the matrix of probabilities that a particular initial nonabsorbing state will lead to a particular absorbing state.
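A sketch of F and FR for the absorbing chain from solution (a) above (states 1 and 3 absorbing; from state 2: p21 = 0.3, p22 = 0.5, p23 = 0.2). With a single nonabsorbing state, Q is 1 × 1 and the matrix inverse reduces to a scalar:

```python
# Absorbing chain from solution (a): the only nonabsorbing state is state 2.
Q = [[0.5]]          # nonabsorbing -> nonabsorbing (state 2 -> state 2)
R = [[0.3, 0.2]]     # nonabsorbing -> absorbing (state 2 -> states 1 and 3)

F = 1.0 / (1.0 - Q[0][0])    # expected visits to state 2 before absorption
FR = [F * r for r in R[0]]   # absorption probabilities starting from state 2

print(F)    # 2.0
print(FR)   # [0.6, 0.4]
```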
Using the Law of Total Probability with Recursion to Find Absorption Probabilities
Consider the Markov chain shown in the state transition diagram:
The state transition matrix of this Markov chain is given by the following matrix
How many classes are there? For each class, mention if it is recurrent or transient.
There are three classes: Class 1 consists of one state, state 0, which is a recurrent state.
Class two consists of two states, states 1 and 2, both of which are transient. Finally,
class three consists of one state, state 3, which is a recurrent state.
Note that states 0 and 3 have the following property:
once you enter those states, you never leave them.
For this reason, we call them absorbing states. For our example here, there are two
absorbing states. The process will eventually get absorbed in one of them. The first
question that we would like to address deals with finding absorption probabilities.
Absorption Probabilities
Consider the Markov chain in Figure below once more:
Let's define ai as the absorption probability in state 0 if we start from state i. More
specifically,
By the above definition, we have a0 = 1 and a3 = 0. To find the values of a1 and a2, we apply the law of total probability with recursion. The main idea is the following: if Xn = i, then the next state will be Xn+1 = k with probability pik. Thus,
ai = Σk pik ak, for i = 1, 2.
Solving these equations, together with a0 = 1 and a3 = 0, gives the values of a1 and a2.
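The actual transition probabilities of this example live in the figure, so the sketch below uses hypothetical values (NOT the figure's): a simple random walk on {0, 1, 2, 3} where each inner state moves left or right with probability 1/2, absorbed at 0 and 3. It iterates the recursion ai = Σk pik ak with a0 = 1, a3 = 0:

```python
# HYPOTHETICAL chain (not the figure's values): simple random walk on
# {0, 1, 2, 3}, absorbed at 0 and 3.
p = {(1, 0): 0.5, (1, 2): 0.5,
     (2, 1): 0.5, (2, 3): 0.5}

a = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}   # a0 = 1, a3 = 0; initial guesses for a1, a2
for _ in range(200):                    # fixed-point iteration of a_i = sum_k p_ik a_k
    a[1] = p[(1, 0)] * a[0] + p[(1, 2)] * a[2]
    a[2] = p[(2, 1)] * a[1] + p[(2, 3)] * a[3]

print(round(a[1], 6), round(a[2], 6))  # 0.666667 0.333333
```

For this symmetric walk the fixed point matches the gambler's-ruin formula: absorption at 0 from state i has probability (3 − i)/3.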
Exercise:
Consider the Markov chain above once more.
Let's define bi as the absorption probability in state 3 if we start from state i. Use the
above procedure to obtain bi for i = 0,1,2,3.
From the definition of bi and the Markov chain graph above, we have b0 = 0 and b3 = 1. Now, from the recursion bi = Σk pik bk, we have:
Example 2
In sum,
In general, a finite Markov chain might have several transient as well as several
recurrent classes.
As n increases, the chain will get absorbed in one of the recurrent classes and it
will stay there forever.
We can use the above procedure to find the probability that the chain will get
absorbed in each of the recurrent classes.
In particular, we can replace each recurrent class with one absorbing state. Then,
the resulting chain consists of only transient and absorbing states.
We can then follow the above procedure to find absorption probabilities.
Mean Hitting Times
Definition: the expected time until the process hits a certain set of states for the first
time.
Let's define ti as the expected number of steps needed until the chain hits state 0 or state 3, given that X0 = i. In other words, ti is the expected time (number of steps) until the chain is absorbed in 0 or 3, given that X0 = i. By this definition, we have t0 = t3 = 0.
To find t1 and t2, we use the law of total probability with recursion as before. For example, if X0 = 1, then after one step we have X1 = 0 or X1 = 2. Thus, we can write
t1 = 1 + p10 t0 + p12 t2.
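With the figure's numbers unavailable, here is the same recursion on the hypothetical random walk on {0, 1, 2, 3} used earlier (each inner state moves left or right with probability 1/2): ti = 1 + Σk pik tk with t0 = t3 = 0:

```python
# HYPOTHETICAL chain (not the figure's values): symmetric walk on {0, 1, 2, 3}.
t = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}   # t0 = t3 = 0 by definition
for _ in range(200):                    # fixed-point iteration of t_i = 1 + sum_k p_ik t_k
    t[1] = 1.0 + 0.5 * t[0] + 0.5 * t[2]
    t[2] = 1.0 + 0.5 * t[1] + 0.5 * t[3]

print(t[1], t[2])   # both converge to 2.0
```

This agrees with the known mean absorption time i(3 − i) for this symmetric walk: t1 = t2 = 2.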
In sum,
Vector of hitting probabilities
Example:
We can summarize all the information from the example above in a vector of hitting
probabilities:
Example
Mean Return Times
Calculating the mean hitting times
The vector of expected hitting times mA = (miA : i ∈ S) is the minimal non-negative solution to the following equations:
miA = 0 for i ∈ A,
miA = 1 + Σj pij mjA for i ∉ A.
Example: Consider the Markov chain shown in Figure.
Let tk be the expected number of steps until the
chain hits state 1 for the first time, given that X0=k.
Clearly, t1=0. Also, let r1 be the mean return time to
state 1.
1) Find t2 and t3
2) Find r1
Solution
Stationary and Limiting Distributions
Here, we would like to discuss long-term behavior of Markov chains. In particular,
we would like to know the fraction of times that the Markov chain spends in each
state as n becomes large. More specifically, we would like to study the distributions
as n→∞. To better understand the subject, we will first look at an example and then provide a general analysis.
Example
Consider a Markov chain with two possible states, S={0,1}. In particular, suppose
that the transition matrix is given by
where a and b are two real numbers in the interval [0,1] such that 0<a+b<2. Suppose
that the system is in state 0 at time n=0 with probability α, i.e.,
Example 2:
Consider a Markov chain with two possible states, S={0,1}. In particular, suppose
that the transition matrix is given by:
where a and b are two real numbers in the interval [0,1] such that 0<a+b<2. Find the
mean return times, r0 and r1, for this Markov chain.
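The matrix itself is in the figure; assuming the usual two-state parameterization P = [[1−a, a], [b, 1−b]] (consistent with the stated condition 0 < a + b < 2), the stationary distribution is π = (b/(a+b), a/(a+b)) and the mean return times are rj = 1/πj. A sketch under that assumption:

```python
# ASSUMED parameterization (not confirmed by the figure): P = [[1-a, a], [b, 1-b]].
a, b = 0.3, 0.2    # example values; any choice with 0 < a + b < 2 works

pi0 = b / (a + b)              # stationary probability of state 0
pi1 = a / (a + b)
r0, r1 = 1.0 / pi0, 1.0 / pi1  # mean return times r_j = 1 / pi_j

# Sanity check: pi is stationary for this P, i.e. pi0*(1-a) + pi1*b == pi0.
assert abs(pi0 * (1 - a) + pi1 * b - pi0) < 1e-12
print(r0, r1)   # 2.5 and about 1.6667
```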
Finite Markov Chains:
Here, we consider Markov chains with a finite number of states. In general, a finite
Markov chain can consist of several transient as well as recurrent states.
As n becomes large the chain will enter a recurrent class and it will stay there forever.
Therefore, when studying long-run behaviors we focus only on the recurrent classes.
If a finite Markov chain has more than one recurrent class, then the chain will get
absorbed in one of the recurrent classes.
It turns out that in this case the Markov chain has a well-defined limiting behavior if it
is aperiodic (states have period 1). How do we find the limiting distribution? The trick
is to find a stationary distribution.
Here is the idea: if π = [π1, π2, ⋯] is a limiting distribution for a Markov chain, then we have:
π = π·P, together with Σj πj = 1.
Similarly, we can write:
The righthand side gives the probability of going to state j in the next step. When we
equate both sides, we are implying that the probability of being in state j in the next
step is the same as the probability of being in state j now.
We now summarize the discussion in the following theorem.
Example
Consider the Markov chain shown below:
Solution
The chain is irreducible since we can go from any state to any other state in a finite number of steps.
Since there is a self-transition, i.e., p11>0, we conclude that the chain is aperiodic.
To find the stationary distribution, we need to solve π = π·P together with Σj πj = 1.
d) Since the chain is irreducible and aperiodic, we conclude that the above stationary
distribution is a limiting distribution.
Countably Infinite Markov Chains
Theorem
where rj is the mean return time to state j.
How do we use the above theorem? Consider an infinite Markov chain {Xn, n = 0, 1, 2, ...}, where Xn ∈ S = {0, 1, 2, ⋯}. Assume that the chain is irreducible and aperiodic. We first try to find a stationary distribution π by solving the equations π = π·P and Σj πj = 1.
Example
Consider the Markov chain shown in Figure below.
Consider the Markov chain shown in Figure below:
Assume that 1/2 < p < 1. Does this chain have a limiting distribution? For all i, j ∈ {0, 1, 2, ⋯}, find
lim (n→∞) P(Xn = j | X0 = i).
Solution
The chain is irreducible since all states communicate with each other. It is also
aperiodic since it includes a self-transition, P00>0. Let's write the equations for a
stationary distribution.
For state 0, we can write :