
What is a Markov Model?

A Markov Model (Markov Chain) is:

similar to a finite-state automaton, but with probabilities of
transitioning from one state to another

[Figure: five states S1-S5 connected by arcs labeled with transition
probabilities]

transitions from state to state occur at discrete time intervals

the model can only be in one state at any given time

Elements of a Markov Model (Chain):

clock: t = {1, 2, 3, ..., T}

N states: Q = {1, 2, 3, ..., N}
the single state j at time t is referred to as q_t

N events: E = {e_1, e_2, e_3, ..., e_N}

initial probabilities:
pi_j = P[q_1 = j]                  1 <= j <= N

transition probabilities:
a_ij = P[q_t = j | q_t-1 = i]      1 <= i, j <= N
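Given pi_j and a_ij as defined above, generating a state sequence is straightforward to sketch. The two-state chain below and all of its numbers are hypothetical, not taken from the slides' figures:

```python
import random

# Hypothetical two-state chain; the probabilities are illustrative only.
pi = {1: 0.6, 2: 0.4}              # initial probabilities pi_j = P[q_1 = j]
A = {1: {1: 0.7, 2: 0.3},          # transition probabilities a_ij
     2: {1: 0.4, 2: 0.6}}

def sample_chain(pi, A, T, rng=random.Random(0)):
    """Draw a state sequence q_1 .. q_T from the Markov chain (pi, A)."""
    def draw(dist):
        r, acc = rng.random(), 0.0
        for state, p in dist.items():
            acc += p
            if r < acc:
                return state
        return state               # guard against floating-point round-off
    q = [draw(pi)]                 # q_1 ~ pi
    for _ in range(T - 1):
        q.append(draw(A[q[-1]]))   # the next state depends only on q_{t-1}
    return q

states = sample_chain(pi, A, 10)
```

At each step only the current state's row of A is consulted, which is exactly the "one state at any given time" property above.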


Discrete Markov Processes

Consider a system described by the following process:

At any given time, the system can be in one of N possible states {1, 2, ..., N}

At regular times, the system undergoes a transition to a new state

Transitions between states can be described probabilistically

Markov property:

In general, the probability that the system is in state q_t = j is a function of
the complete history of the system. To simplify the analysis, however, we will
assume that the state of the system depends only on its immediate past:

P[q_t = j | q_t-1 = i, q_t-2 = k, ...] = P[q_t = j | q_t-1 = i]

This is known as a first-order Markov process.

We will also assume that the transition probability between any two states is
independent of time:

a_ij = P[q_t = j | q_t-1 = i],  with  a_ij >= 0  and  sum over j=1..N of a_ij = 1
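Under the first-order assumption, the probability of an entire state sequence factors into the initial probability times the pairwise transition probabilities. A minimal sketch with a hypothetical two-state chain:

```python
# Sketch of P[q_1 .. q_T] = pi_{q1} * prod_t a_{q_{t-1} q_t} under the
# first-order Markov assumption. The chain below is hypothetical.
pi = [0.5, 0.5]                 # pi_j, states indexed 0 and 1
A = [[0.9, 0.1],
     [0.2, 0.8]]                # a_ij = P[q_t = j | q_t-1 = i]

# Each row of A must satisfy a_ij >= 0 and sum_j a_ij = 1.
assert all(abs(sum(row) - 1.0) < 1e-12 for row in A)

def sequence_prob(q, pi, A):
    p = pi[q[0]]
    for prev, cur in zip(q, q[1:]):
        p *= A[prev][cur]       # only the immediate past matters
    return p

p = sequence_prob([0, 0, 1], pi, A)   # 0.5 * 0.9 * 0.1 = 0.045
```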

What is a Hidden Markov Model?

A Hidden Markov Model:

has more than one event associated with each state

all events have some probability of being emitted at each state

given a sequence of observations, we can't determine exactly
the state sequence

we can compute the probabilities of different state sequences
given an observation sequence

doubly stochastic (probabilities of both emitting events and
transitioning between states); the exact state sequence is hidden

Elements of a Hidden Markov Model:

clock: t = {1, 2, 3, ..., T}

N states: Q = {1, 2, 3, ..., N}

M events: E = {e_1, e_2, e_3, ..., e_M}

initial probabilities:
pi_j = P[q_1 = j]                    1 <= j <= N

transition probabilities:
a_ij = P[q_t = j | q_t-1 = i]        1 <= i, j <= N

observation probabilities:
b_j(k)   = P[o_t = e_k | q_t = j]    1 <= k <= M
b_j(o_t) = P[o_t = e_k | q_t = j]    1 <= k <= M

A = matrix of a_ij values, B = set of observation probabilities,
pi = vector of pi_j values

Entire model: lambda = (A, B, pi)
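With lambda = (A, B, pi) in hand, the joint probability of an observation sequence O and a given state sequence Q is the product of the initial, transition, and observation probabilities along the path. A minimal sketch; all model numbers below are illustrative:

```python
# P(O, Q | lambda) = pi_{q1} b_{q1}(o_1) * prod_t a_{q_{t-1} q_t} b_{q_t}(o_t)
# Hypothetical two-state, two-event model (states and events indexed from 0).
pi = [1.0, 0.0]
A = [[0.6, 0.4],
     [0.0, 1.0]]
B = [[0.9, 0.1],        # b_j(k) = P[o_t = e_k | q_t = j]
     [0.2, 0.8]]

def joint_prob(O, Q, pi, A, B):
    """Probability of observing O along the known state path Q."""
    p = pi[Q[0]] * B[Q[0]][O[0]]
    for t in range(1, len(O)):
        p *= A[Q[t - 1]][Q[t]] * B[Q[t]][O[t]]
    return p

p = joint_prob(O=[0, 1], Q=[0, 1], pi=pi, A=A, B=B)  # 1.0*0.9 * 0.4*0.8
```

Because the model is doubly stochastic, every time step contributes two factors: one for the (hidden) transition and one for the emitted event.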

HMM Topologies

There are a number of common topologies for HMMs:

Ergodic (fully-connected)

[Figure: three-state ergodic HMM with pi_1 = 0.4, pi_2 = 0.2, pi_3 = 0.4
and transition probabilities between every pair of states]

Bakis (left-to-right)

[Figure: four-state left-to-right HMM with pi_1 = 1.0,
pi_2 = pi_3 = pi_4 = 0.0; transitions only self-loop or move forward]

HMM Topologies

The topology must be specified in advance by the system designer

A common use in speech is to have one HMM per phoneme, and three states
per phoneme. The phoneme-level HMMs can then be connected to form
word-level HMMs.

[Figure: three-state left-to-right phoneme HMMs (states A1-A3, B1-B3,
T1-T3), each with pi_1 = 1.0, pi_2 = pi_3 = 0.0, concatenated into a
word-level model]
HMM generation of observation sequences

The three basic HMM problems
Forward and Backward procedures
The Forward procedure
The Backward procedure
The Viterbi algorithm
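The Forward procedure listed above is not reproduced in these slides, but its standard recursion is alpha_1(j) = pi_j b_j(o_1) and alpha_t+1(j) = [sum over i of alpha_t(i) a_ij] b_j(o_t+1), with P(O | lambda) = sum over j of alpha_T(j). A hedged sketch with illustrative model numbers:

```python
# Sketch of the standard Forward procedure for computing P(O | lambda),
# where alpha_t(j) = P(o_1 .. o_t, q_t = j | lambda).
def forward(O, pi, A, B):
    N = len(pi)
    alpha = [pi[j] * B[j][O[0]] for j in range(N)]         # alpha_1(j)
    for o in O[1:]:
        # alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
    return sum(alpha)                                      # P(O | lambda)

# Hypothetical two-state, two-event model.
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
p = forward([0, 1], pi, A, B)
```

Summing over all paths at each step is what makes this O(N^2 T) instead of enumerating all N^T state sequences.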


The Viterbi algorithm

The problem with choosing the individually most likely states is that the
overall state sequence may not be valid. Consider a situation where the
individually most likely states are q_t = i and q_t+1 = j, but the
transition probability a_ij = 0.

To avoid this problem, it is common instead to look for the single best
state sequence, at the expense of having sub-optimal individual states.
This is accomplished with the Viterbi algorithm.

To find the single best state sequence, we define yet another variable:

delta_t(i) = max over q_1, q_2, ..., q_t-1 of
             P[q_1 q_2 ... q_t-1, q_t = i, o_1 o_2 ... o_t | lambda]

which represents the highest probability along a single path that
accounts for the first t observations and ends at state i.

By induction, delta_t+1(j) can be computed as:

delta_t+1(j) = [max over i of delta_t(i) a_ij] b_j(o_t+1)

To retrieve the state sequence, we also need to keep track of the state
that maximizes delta_t(i) at each time t, which is done by constructing
an array:

psi_t+1(j) = arg max over 1 <= i <= N of delta_t(i) a_ij

psi_t+1(j) is the state at time t from which a transition to state j
maximizes the probability delta_t+1(j).
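The delta/psi recursion can be sketched directly; the model numbers below are illustrative, not from the slides:

```python
# Sketch of the Viterbi algorithm: delta_{t+1}(j) = [max_i delta_t(i) a_ij] * b_j(o_{t+1}),
# with psi recording each argmax so the best path can be recovered by backtracking.
def viterbi(O, pi, A, B):
    N = len(pi)
    delta = [pi[j] * B[j][O[0]] for j in range(N)]          # delta_1(j)
    psi = []                                                # psi_t(j) per step
    for o in O[1:]:
        step, new_delta = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)                             # psi_{t+1}(j)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        delta, psi = new_delta, psi + [step]
    # Backtrack from the best final state through the psi arrays.
    q = [max(range(N), key=lambda j: delta[j])]
    for step in reversed(psi):
        q.append(step[q[-1]])
    return list(reversed(q))

# Hypothetical two-state, two-event model.
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
path = viterbi([0, 0, 1], pi, A, B)
```

Unlike the Forward procedure, each step takes a max rather than a sum, so the result is a single best path instead of a total probability.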
Baum-Welch re-estimation
Re-estimation procedure

Example

Word sequence:    SHOW ALL ALERTS
Phoneme sequence: SH OW AA L AX L ER TS
State sequence:   SH SH SH OW OW OW AA AA AA L L L AX AX AX L L L ER ER ER TS TS TS
