Professional Documents
Culture Documents
Lecture 3 PDF
Lecture 3 PDF
Mehryar Mohri
Courant Institute of Mathematical Sciences
mohri@cims.nyu.edu
This Lecture
Shortest-distance algorithms
Composition
Epsilon-removal
Determinization
Pushing
Minimization
• Generalization of Floyd-Warshall.
• Single-source shortest-distance algorithm.
Mehryar Mohri - Speech Recognition page 3 Courant Institute, NYU
Shortest-Distance Problem
Definition: for any regulated weighted transducer T ,
define the shortest distance from state q to F as
!
d(q, F ) = w[π].
π∈P (q,F )
Problem: compute d(q, F ) for all states q ∈ Q .
Algorithms:
• Generalization of Floyd-Warshall.
• Single-source shortest-distance algorithm.
Mehryar Mohri - Speech Recognition page 4 Courant Institute, NYU
All-Pairs Shortest-Distance Algorithm
(MM, 2002)
Assumption: closed semiring (not necessarily
idempotent).
Idea: generalization of Floyd-Warshall algorithm.
Properties:
implementation.
• Tropical semiring.
• Probability semiring when including infinity or
when restricted to well-defined closures.
Generic-Single-Source-Shortest-Distance (G, s)
1 for i ← 1 to |Q|
2 do d[i] ← r[i] ← 0
3 d[s] ← r[s] ← 1
4 S ← {s}
5 while S $= ∅
6 do q ← head(S)
7 Dequeue(S)
8 r" ← r[q]
9 r[q] ← 0
10 for each e ∈ E[q]
11 do if d[n[e]] $= d[n[e]] ⊕ (r" ⊗ w[e])
12 then d[n[e]] ← d[n[e]] ⊕ (r" ⊗ w[e])
13 r[n[e]] ← r[n[e]] ⊕ (r" ⊗ w[e])
14 if n[e] $∈ S
15 then Enqueue(S, n[e])
16 d[s] ← 1
Mehryar Mohri - Speech Recognition page 9 Courant Institute, NYU
Notes
Complexity:
Semiring Frameworks and Algorithms for Shortest-Distance Problems 1
As mentioned
ing Frameworks andbefore, the Generic-Single-Source-Shortest-Distance
Algorithms for Shortest-Distance Problems algo
rithm works for withshortest-first and FIFO
any queue discipline. orders.
Some queue disciplines are better than oth
•
ers. The appropriate choice depends on the semiring K and the specific restriction
is: imposedlinear on G. With for acyclic
a good choice,graphs using topological
the maximum number of times order.
a vertex is in
serted in S (maxq∈Q N (q)) can be limited.9 The algorithm is then very efficient. I
O(|Q| + (T⊕ classical
the next sections, we will present some
+ T⊗ )|E|)
algorithms using the following queu
xt sections, Approximation:
disciplines: we will topological
examine order,
the - k -closed
!shortest-first
use semiring,
of theorder, first-in
generic e.g.,
first-out
algorithms
case, since the number of simple paths from s to q may be exponential in the size o
for
order. In the wors
just presented
us semirings. graphs
the graph
Thisinwill
(|Q| + |E|), probability
show how semiring.
the same algorithm can be used in diff
the complexity of the algorithm is exponential.
xts by just The modifying
results presented the underlying
in this section algebra.
can be extended to cover the case of non
commutative semirings [28]. Our framework can also be generalized by introducin
10
right
lassical and left
shortest-distance
Mehryar Mohri
semirings
- Speech Recognition
[28]. A right semiring is an algebraic
algorithms page 10
structure similar t
Courant Institute, NYU
This Lecture
Shortest-distance algorithms
Composition
Epsilon-removal
Determinization
Pushing
Minimization
a:a/0.6
b:a/0.2
b:b/0.3 2 a:b/0.5 a:b/0.3 2 b:a/0.5
a:b/0.1 b:b/0.1
0 1 b:b/0.4 3/0.7 0 1 3/0.6
a:b/0.4
a:b/0.2
a:a/.04 (0, 1)
a:a/.02 (3, 2)
a:b/.18
a:b/.01 b:a/.06 a:a/0.1
(0, 0) (1, 1) (2, 1) (3, 1)
a:b/.24
A' )
Redundant ε-Paths Problem
!"!
'
#"!(
(
$"!(
*
%"%
+
,%-
!#"! !#"! !#"! !#"! (MM et al. 1996)
T1 0 a:a!"% 1 b:! ! "&2 c:! %"!3 d:d 4 0 a:d 1 !:e 2 d:a 3 T2
!
B' ) ' ( *
+.*
T = T!1 ◦ F ◦ T!2 .
Mehryar Mohri - Speech Recognition page 15 Courant Institute, NYU
Correctness of Filter
Proposition: filter F allows a unique path between
two states of the following grid.
a
!1 :!1 !1 :!1 !1:!1
(0,0) (1,0) (2,0)
!1:!1
b !2:!2 !2 :!1
!2 :!2
!2 :!1
!2 :!2 !2:!1
x:x
1
c x:x
!1 :!1 !1 :!1
(0,1) (1,1) (2,1) 0
!2:!2 !2:!2
!2 :!1 !2 :!1
!2 :!2 !2 :!2 !2 :!2 x:x 2
!1 :!1 !1 :!1
(0,2) (1,2) (2,2)
Proof: Observe
Fig. 3. (a) Redundantthat a Anecessary
!-paths. straightforwardand sufficient
generalization of the !-free case co
condition
all the pathsisfrom
that the
(0, 0) to (2,following
2) for example,sequences
(b) Filter transducer M allowing a unique !-path.
be
even when composing just two simple
forbidden: ab , ba , ac , and bc .
Mehryar Mohri - Speech Recognition page 16 Courant Institute, NYU
Correctness of Filter
Proof (cont.): Let σ = {a, b, c, x}, then set of
sequences forbidden is exactly
L = σ (ab + ba + ac + bc)σ .
∗ ∗
A det(A) det(A)
Fig. 4. (a) Finite automaton A representing the set of disallowed sequences. (b) Auto
a)result
Finite
Mehryar automaton
ofMohri
the- Speech A representing
Recognition
determinization of A. the set ofaredisallowed
Subsets indicated sequences.
page 17
at each (b)(c)Automaton
Courant
state. AutomatonBC
Institute, NYU
Other Filters
(Pereira and Riley, 1997)
ε1:ε1
x:x ε2:ε2
ε2:ε2
0 1
x:x
Sequential Filter.
• Computation
!
of the ε-closure at each state:
" #
C[p] = (q, d! [p, q]) : d! [p, q] != 0) with d! [p, q] = w[π].
• Removal of εs.
π∈P (p,!,q)
• On-demand construction.
Mehryar Mohri - Speech Recognition page 20 Courant Institute, NYU
Illustration
!/0.9
!/0.9
!/0.1
!/0.1
!/0.4
!/0.4 3 3
b/0.8b/0.8
b/0.6
b/0.6 a/0.5
a/0.5
a/0.1 !/0.2 !/0.8 a/0.1a/0.1
a/0.1 !/0.2 !/0.8 0 1/1.21/1.2
00 11 2 2 4/0.2
4/0.2 0
c/0.9c/0.9
c/0.7
c/0.7 b/0.3
b/0.3
(a)
(a) (b) (b)
b/0.8 a/1.3
a/1.3
a/0.1
a/0.1
a/0.1 b/0.3 1 1 b/0.5b/0.5
0 1/1.2 4/0.2 0 0 a/0.3
a/0.3 b/0.6
b/0.6
c/0.9 a/0.7 b/0.8 4/0 4/0
b/0.8 a/0.7
a/0.7
c/0.7
c/0.7 2
2 b/1.8
b/1.8
(b) (c)(c)
!-removalininthe
Figure 1:1: !-removal
Figure thetropical semiring.
tropical semiring.(a)(a)
Weighted
Weighted autom
aut
b/0.5
1 Mehryar !-transitions. (b)
!-transitions.
Mohri - Speech Recognition (b)Weighted
Weightedautomaton
automaton BBequivalent to to
equivalent
page 21 Courant A result
Institute,
A NYU of th
result o
Main Algorithm
Shortest-distance algorithms:
{(0, b, 0)}
a:a/0.2
a:b/0.1
(a) (b)
Non-Determinizable Transducer
Figure 3: (a) Non-determinizable weighted automaton over the tropical semiring; states 1
nd 2 are non-twin siblings. (b) The first states created by determinization applied to the
utomaton of Figure (a). The algorithm does not halt and produces an infinite number of
tates in this case.
(a) (b)
Figure 4: (a) Non-determinizable functional transducer, or weighted automaton over the
tring semiring; states 1 and 2 are non-twin siblings. (b) Determinization does not halt and
reates an infinite number of states in this case.
• c=c; !
x:u/w q
• u u = (uv) u v .
−1 " −1 " " I y:v’/c’
x:u’/w’ q’
u’ u’ v’
Theorem 1 If T has the twins property then T
u w u v w
(Choffrut, 1978; Mohri, 1997)
Mehryar Mohri - Speech Recognition page 28 Courant Institute, NYU
Determinizability
(Choffrut, 1978; MM 1997; Allauzen and MM, 2002)
Theorem: a trim unambiguous weighted
automaton over the tropical semiring is
determinizable iff it has the twins property.
Theorem: let T be a weighted transducer over the
tropical semiring. Then, if T has the twins
property, then it is determinizable.
Algorithm for testing the twins property:
•
q∈Q
e/1
e/1 f/11
f/11 e/11
e/11 f/1
f/1
22 22
a/0.3266
a/0.3266
a/0
a/0 e/0.3132
e/0
e/0 b/1.3266
b/1.3266 1 1 e/0.3132
b/1
b/1 11 f/1.3132
f/1
f/1 f/1.3132
c/4
c/4 c/4.3266
c/4.3266
00 3/03/0 0 0 3/-0.6393/-0.63
d/0 e/10
e/10 e/0.3132
e/0.3132
d/10.326
d/10.326
e/1
e/1 f/11
f/11
22
f/1.3132
f/1.3132
e/11.326
e/11.326 2 2
a:a a:d
a:a b: ! 3 c: ! a:d
c:d a:d b: ! 3 c: ! a: ! f: !
0 1 2 d:d 4 5 6
c:d! a:d a: ! f: !
0 b: 1 2 4 5 6
d:d
b: !
d/0
d/0 c/1
c/1
d/3
d/3 33 e/2
e/2
a/0
a/0 e/1
e/1
00 11 d/4
d/4 66 7/0
7/0
e/3
e/3
b/1
b/1
55
a/2
a/2 c/3
c/3
44
22
b/0
b/0 c/0
c/0
d/0
d/0 c/1
c/1
d/6
d/6 33 e/0
e/0
a/0
a/0 e/0
e/0
00 11 d/6
d/6 66 7/6
7/6
e/0
e/0
b/1
b/1
55
a/3
a/3 c/0
c/0
44
d/0
d/0
b/0
b/0
c/1
c/1
a/0
a/0 a/3
a/3 22 c/0
c/0
00 11
b/1
b/1 e/0
e/0 e/0
e/0
d/6
d/6 33 66 7/6
7/6
Illustration
a:A 1 3 6 7
0 c:!
0 b:CCDDB c:D
b:C b:!
4 b:C 5
4 5
2
b:B c:C a:DB
Fig. 6. Transducer T2 obtained by quasi-determinization from T .
Fig. 5. Sequential transducer T .
d:D e:D f:BC
a:A 1 3 6 7
This ST is0 not minimal
The application considered as
of quasi-determinization
c:D
an automaton.
leads to theThe application
transducer of the
T2 (figure
b:C
automata
(6)) which computes4 theleads
minimization sameto5function.
b:C the reduced transducer
Only represented
the output in figure
labels differ from
(7) which
those of T is
. the minimal ST as previously defined.
2 Fig. 5. Sequential transducer T .
Transducers are often usedb:!in both directions,
c:! a:DB from inputs to outputs and vice
The application
versa. of quasi-determinization
The first stage of quasi-determinization
d:CDB
leadsin to the transducer
e:C the minimization
f: !
Talgorithm
2 (figure
a:ABCDB 1 3 6 7
(6))also
has whichan computes
interesting the same
effect on function. Only
the reverse the output
application labels
of the differtrans-
minimal from
0 c:!
those ofIndeed,
ducer. T . b:CCDDB
since it reduces
b:! the ambiguities, the cost of matching a string
4 5
with the output strings of the transducer
2
is reduced.
b:! c:! a:DB
a:DB
Fig. 6. Transducer T2 b:!
obtained
d:CDB
2 by c:!
quasi-determinization
e:C f: !
from T .
a:ABCDB 1
a:ABCDB 3 e:C 6 f: ! 7
0 1 3 5 6
d:CDBc:!
This ST is0 not minimal considered as an automaton. The application of the
b:CCDDB
b:CCDDB
b:!
automata minimization4 leads to5 the reduced transducer represented in figure
(7) which is the minimal
Fig. 7.ST as previously
Minimal sequentialdefined.
transducer T . 3
Fig. 6. Transducer T2 obtained by quasi-determinization
Mehryar Mohri - Speech Recognition page 39 from .
TInstitute,
Courant NYU
References
• Cyril Allauzen and Mehryar Mohri. Efficient Algorithms for Testing the Twins Property.
Journal of Automata, Languages and Combinatorics, 8(2):117-144, 2003.
• John E. Hopcroft. An n log n algorithm for minimizing the states in a finite automaton. In
The Theory of Machines and Computations, pages 189-196. Academic Press, 1971.
• Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. The Design Principles of a
Weighted Finite-State Transducer Library. Theoretical Computer Science, 231:17-32, January
2000.
• Mehryar Mohri and Michael Riley. A Weight Pushing Algorithm for Large Vocabulary
Speech Recognition. In Proceedings of the 7th European Conference on Speech
Communication and Technology (Eurospeech'01). Aalborg, Denmark, September 2001.
• Fernando Pereira and Michael Riley. Finite State Language Processing, chapter Speech
Recognition by Composition of Weighted Finite Automata. The MIT Press, 1997.