Professional Documents
Culture Documents
Lecture 2 PDF
Lecture 2 PDF
Mehryar Mohri
Courant Institute of Mathematical Sciences
mohri@cims.nyu.edu
1
Software Libraries
FSM Library: Finite-State Machine Library. General
software utilities for building, combining,
optimizing, and searching weighted automata and
transducers (MM, Pereira, and Riley, 2000).
http://www.research.att.com/projects/mohri/fsm
OpenFst Library: Open-source Finite-state
transducer Library (Allauzen et al., 2007).
http://www.openfst.org
http://www.research.att.com/projects/mohri/grm
DCD Library: Decoder Library. General software
collection for speech recognition decoding and
related functions (MM and Riley, 2003).
http://www.research.att.com/~fsmtools/dcd
Mehryar Mohri - Speech Recognition page Courant3 Institute, NYU
3
FSM Library
The FSM utilities construct, combine, minimize, and
search weighted finite-states transducers.
• User Program Level: Programs that read from
and write to files or pipelines, fsm(1):
fsmintersect in1.fsm in2.fsm >out.fsm
• C(++) Library Level: Library archive of C(++)
functions that implements the user program
level, fsm(3):
Fsm in1 = FSMLoad("in1.fsm");
Fsm in2 = FSMLoad("in2.fsm");
Fsm out = FSMIntersect(fsm1, fsm2);
FSMDump("out.fsm", out);
• automata/acceptor files,
• transducer files,
• symbols files.
Binary format: compiled representation used by
all FSM utilities.
Printing
• fsmprint -iA.syms <A.fsm >A.txt
Drawing
• fsmdraw -iA.syms <A.fsm | dot -Tps >A.ps
Semiring Set ⊕ ⊗ 0 1
Boolean {0, 1} ∨ ∧ 0 1
Probability R+ + × 0 1
Log R ∪ {−∞, +∞} ⊕log + +∞ 0
Tropical R ∪ {−∞, +∞} min + +∞ 0
green/0.3 blue/0
0 1 2 /0.8
yellow/0.6
green:blue/0.3 blue:green/0
0 1 2 /0.8
yellow:red/0.6
i[π]:o[π]
p[π] n[π]
previous state next state or
or source state destination state
Sets of paths
• P (R1 , R2 ) : paths from R1 ⊆ Q to R2 ⊆ Q .
• P (R1 , x, R2 ) : paths in P (R1 , R2 ) with input label x.
• P (R1 , x, y, R2 ) : paths in P (R1 , x, R2 ) with output
label y.
Mehryar Mohri - Speech Recognition page Courant
13Institute, NYU
13
General Definitions
Alphabets: input Σ , output ∆.
States: Q, initial states I , final states F.
Transitions: E ⊆ Q × (Σ ∪ {!}) × (∆ ∪ {!}) × K × Q.
Weight functions:
• initial: λ : I → K .
• final: ρ : F → K .
Transducer T = (Σ, ∆, Q, I, F, E, λ, ρ)
∀x ∈ Σ∗ , y ∈ ∆∗ ,
!
[[T ]](x, y) = λ(p[π]) ⊗ w[π] ⊗ ρ(n[π])
π∈P (I,x,y,F )
[[A]](x) =
Sum of the weights of all successful
paths labeled with x
[[A]](abb) = .1 × .2 × .3 × .1 + .5 × .3 × .6 × .1
Product
!
[[T1 ⊗ T2 ]](x, y) = [[T1 ]](x1 , y1 ) ⊗ [[T2 ]](x2 , y2 ).
x=x1 x2
Closure y=y1 y2
!
∞
[[T ∗ ]](x, y) = [[T ]]n (x, y)
n=0
• lazy implementation.
Mehryar Mohri - Speech Recognition page Courant20Institute, NYU
20
Sum - Illustration
Program: fsmunion A.fsm B.fsm >C.fsm
red/0.5 green/0.4 1 /0
Graphical
0
green/0.3 representation:
1
blue/0
2 /0.8
0 blue/1.2
yellow/0.6 2 /0.3
red/0.5 green/0.4 1 /0
red/0.5 green/0.4 1 /0
red/0.5
0 blue/1.2 1 /0
red/0.5 green/0.3 blue/0 green/0.4
0 red/0.5 1 2 /0.8 green/0.4 1 /0
yellow/0.6 2 /0.3
green/0.3 blue/0 0 blue/1.2
0 green/0.3 1 blue/0 2 /0.8 0 blue/1.2
0 1 2 /0.8
yellow/0.6
yellow/0.6 2 /0.3
2 /0.3
A.fsa B.fsa
2 /0.3
2 /0.3 eps/0.3 2 /0.3
B.fsa
B.fsa C.fsa
blue/0
Mehryar Mohri - Speech Recognition page Courant
2 /0.8 23Institute, NYU
yellow/0.6 23
This Lecture
Weighted automata and transducers
Rational operations
Elementary unary operations
Fundamental binary operations
Optimization algorithms
Search algorithms
Inversion
[[T −1
]](x, y) = [[T ]](y, x)
Projection
!
[[A]](x) = [[T ]](x, y)
y
Linear-time complexity, lazy implementation (not
for reversal).
Mehryar Mohri - Speech Recognition page Courant
25Institute, NYU
25
Reversal - Illustration
Program: fsmreverse A.fsm >C.fsm
Graphical representation:
red/0.5
green/1.2 3 /0
red/0.5
eps/0 4 green/1.2A.fsa
blue/0 green/0.3
0 3 2 red/0.5 1 /0
eps/0.3 blue/2
yellow/0.6
eps/0 4 green/1.2
5
blue/0 green/0.3
0 3 2 1 /0
eps/0.3 blue/2
yellow/0.6
5 C.fsa
Mehryar Mohri - Speech Recognition page Courant
26Institute, NYU
26
Inversion - Illustration
Program: fsminvert A.fsm >C.fsm
red:bird/0.5
Graphical representation:
green:pig/0.3 blue:cat/0
0
red:bird/0.5 1 2 /0.8
yellow:dog/0.6
green:pig/0.3 blue:cat/0
0 1 2 /0.8
yellow:dog/0.6
A.fst
bird:red/0.5
A.fst
pig:green/0.3 cat:blue/0
0
bird:red/0.5 1 2 /0.8
dog:yellow/0.6
pig:green/0.3 cat:blue/0
0 1 2 /0.8
dog:yellow/0.6
C.fst
Mehryar Mohri - Speech Recognition page Courant
27Institute, NYU
27
Projection - Illustration
Program: fsmproject -1 T.fsm >A.fsm
red:bird/0.5
Graphical representation:
green:pig/0.3 blue:cat/0
0 1 2 /0.8
red:bird/0.5
yellow:dog/0.6
green:pig/0.3 blue:cat/0
0 1 2 /0.8
yellow:dog/0.6
T.fst
red/0.5
A.fst
bird:red/0.5green/0.3 blue/0
0 1 2 /0.8
yellow/0.6
pig:green/0.3 cat:blue/0
0 1 2 /0.8
dog:yellow/0.6
• lazy implementation.
c:b/0.9
c:b/0.9
c:b/0.9
!:e
1,1 (!1:!1)a:d1,2 !:e 1,2 a:a b:! c:! d:d
0,0 (x:x) 1,1 0 1 2 3 4
:! b:e b:!
(!1:!1) a:a b:! c:!
0 1 2
!2) b:!
(!2:!1) (!2:!2) b:e b:!
(!2:!2) (!2:!1) (!2:!2)
!:e
2,1 2,2 !:e
(!1:!1) 2,2
2,1 (!1:!1) 0
a:d
1
!:e
2
d:a
3
! c:! 0
a:d
1
!:e
2
d:a
2) c:!
(!2:!2) c:!
(!2:!2) (!2:!2)
3,1 !:e 3,2
(!1:!1) !:e
3,1d:a (!1:!1) 3,2
(x:x) d:a
(x:x)
4,3
4,3
ndant !-paths. A straightforward generalization of the !-free case could generate all the paths
3, 2)Figure
when 8: Redundant
composing the!-paths. A straightforward
two simple transducers ongeneralization
the right-handofside.
the !-free case could gen
from (1, 1) to (3, 2) when composing the two simple transducers on the right-hand side.
λ[p[π]]⊗w[π]⊗x. Thus, x can be viewed as the resid- nal state and its final weight ! is obta
33
Solution - Filter F for Composition
!1:!1
!2:!1 !1:!1
x:x 1
x:x
0 !2:!2
!2:!2
x:x 2
0
green/0.3 Graphical
1
representation:
blue/0
2 /0.8 red/0.2 yellow/1.3
0 /0 1 2 /0.5
yellow/0.6 blue/0.6
red/0.5
red/0.5 green/0.4
green/0.4
green/0.3
green/0.3 blue/0
blue/0 red/0.2 yellow/1.3
00 11 22 /0.8
/0.8
00 /0
/0 red/0.2
11
yellow/1.3
22 /0.5
/0.5
yellow/0.6
yellow/0.6 blue/0.6
A.fsa blue/0.6 B.fsa
A.fsa
A.fsa B.fsa
B.fsa
blue/0.6 3 /0.8
red/0.7 green/0.7
0 1 2 blue/0.6
yellow/1.9
blue/0.6 33 /0.8
/0.8
red/0.7
red/0.7 green/0.7
green/0.7
00 11 22 yellow/1.9
yellow/1.9 4 /1.3
44 /1.3
/1.3
C.fsa
Mehryar Mohri - Speech Recognition C.fsa
C.fsa page Courant
35Institute, NYU
35
Difference - Illustration
Program: fsmdifference A.fsm B.fsm >C.fsm
Graphical representation:
green
red/0.5
red/0.5 green
red/0.5 red green/0.4 yellow
green/0.3 blue/0 0 1 2
0 1 2 /0.8
green/0.3 blue/0
yellow/0.6 blue
red
red/0.2 yellow
yellow/1.3
0 0 green/0.3 1 1 blue/0 2 /0.8
2 /0.8 0 /0 0 11 22 /0.5
yellow/0.6
yellow/0.6 blue
blue/0.6
A.fsa B.fsa
A.fsa B.fsa
B.fsa
A.fsa red/0.5
red/0.5
red/0.5
red/0.5 1 2 green/0.3
3 /0.8
red/0.5
green/0.3
blue/0.6 blue/0
0 red/0.5 1 2 green/0.3 3 4 /0.8
red/0.7 green/0.7 yellow/0.6
blue/0
0 0 green/0.31 2 3
yellow/1.9 4 /0.8
yellow/0.6
4 /1.3
C.fsa
C.fsa
Mehryar Mohri - Speech Recognition C.fsa page Courant
36Institute, NYU
36
This Lecture
Weighted automata and transducers
Rational operations
Elementary unary operations
Fundamental binary operations
Optimization algorithms
Search algorithms
green/0.2
3
red/0.5 4 /0.2 blue/0 2 /0.8
green/0.3 yellow/0.6
0
red/0.5 1 blue/0
red/0 2 /0.8
green/0.3 yellow/0.6
0 1 5
red/0
A.fsa 5
red/0.5
A.fsa
green/0.3 blue/0
0 1 2 /0.8
yellow/0.6
C.fsa
Corinna
Mehryar Mohri - Speech Recognition Cortes and Mehryar Mohri - WFSTs in Computational Biology
page Courant
40Institute, NYU
40
Connection - Algorithm
Definition:
• Input: weighted transducer T1 .
• Output: equivalent weighted transducer T2 with
all states connected.
Description:
3. Depth-first search of T1 from I1 .
4. Mark accessible and coaccessible states.
5. Keep marked states and corresponding
transitions.
Complexity: linear O(|Q1 | + |E1 |).
Mehryar Mohri - Speech Recognition page Courant
41Institute, NYU
41
ε-Removal - Illustration
Program: fsmrmepsilon T.fsm >TP.fsm
Graphical representation:
!:!/0.3
4/0 1.6
b:b/0.5 1 0.6
!:!/0.3 2 b:a/0.7 0.2
!:!/0.3 !:!/1 4/0 1.6
a:b/0.1 1 b:b/0.5 4/0 1 1.6 1
0 !:!/0.6 b:b/0.5 2 1 0.8 0.6 3 2
2 b:a/0.7 4/0 0.2 0.6
!:!/0.2
a:b/0.1 !:!/1 a:a/0.8 b:a/0.7 0 0.2
0 a:b/0.1 1 !:!/0.6 !:!/1 1
0 1 !:!/0.6
b:a/0.9 0.8 0.3 3 1 2
!:!/0.2 b:a/0.4 3 a:a/0.8 4/0 0.8 3 2
!:!/0.2 a:a/0.8 4/0 0
0 0.3
b:a/0.4 b:a/0.9 3 0.3
b:a/0.4 b:a/0.9 3
a:a/1.6
a:a/1.6
a:a/1.6
b:a/1
b:a/1
b:a/1
b:a/1.7 b:a/1.5
b:a/2.3
0 b:a/1.5
a:b/0.1
b:a/1.7 b:a/1.5
b:a/1.7 a:a/1.4
b:a/2.3
1 b:a/2.3 4/0
00 a:b/0.1
a:b/0.1 b:b/0.7 b:b/0.5 a:a/1.4
11 a:a/1.4 b:a/0.7 4/04/0
b:b/0.7
b:b/0.7 b:b/0.5
b:b/0.5
2 b:a/0.7
b:a/0.7
b:a/0.4
b:a/0.9 22
b:a/0.4 a:a/0.8
b:a/0.4 b:a/0.9
b:a/0.9
a:a/0.8
a:a/0.8
b:a/1.7
3
b:a/1.7
b:a/1.7
33
{(1,2),(2,0)}/2
a/1 {(1,2),(2,0)}/2 b/1
a/1 b/1
{(0,0)} {(3,0)}/0
{(0,0)} b/1 b/3 {(3,0)}/0
b/1 b/3
{(1,0),(2,3)}/0
{(1,0),(2,3)}/0
{(0, b, 0)}
a:a/0.2
a:b/0.1
• Tropical semiring:
a/0
a/0 a/0
a/0
e/0
e/0 e/0
e/0
b/1
b/1 11 b/1
b/1 11
f/1
f/1 f/1
f/1
c/4
c/4 c/4
c/4
00 3/0 00 3/0
3/0
d/0
d/0 e/10
e/10 d/10
d/10 e/0
e/0
e/1
e/1 f/11
f/11 e/11
e/11 f/1
f/1
22 22
e/1
e/1 f/11
f/11
22
f/1.3132
f/1.3132
e/11.326
e/11.326 2 2
a:a
a:a a:d
a:d
b:!
b: 3 c: !
c:d
c:d a:d
a:d a:
a:!! f:f:!!
00 11 22 d:d 44 55 66
b:b:!!
d/0
d/0 c/1
c/1
d/3
d/3 33 e/2
e/2
a/0
a/0 e/1
e/1
00 11 d/4
d/4 66 7/0
7/0
e/3
e/3
b/1
b/1
55
a/2
a/2 c/3
c/3
44
22
b/0
b/0 c/0
c/0
d/0
d/0 c/1
c/1
d/6
d/6 33 e/0
e/0
a/0
a/0 e/0
e/0
00 11 d/6
d/6 66 7/6
7/6
e/0
e/0
b/1
b/1
55
a/3
a/3 c/0
c/0
44
d/0
d/0
b/0
b/0
c/1
c/1
a/0
a/0 a/3
a/3 22 c/0
c/0
00 11
b/1
b/1 e/0
e/0 e/0
e/0
d/6
d/6 33 66 7/6
7/6
3 /0.4
D.fsa
D.fsa
red/0 blue/0
0 1 2 /1.3
red/0 yellow/0.3
blue/0
0 1 2 /1.3
yellow/0.3
M.fsa
M.fsa
Mehryar Mohri - Speech Recognition page Courant
59Institute, NYU
59
This Lecture
Weighted automata and transducers
Rational operations
Elementary unary operations
Fundamental binary operations
Optimization algorithms
Search algorithms
A.fsa
A.fsa
green/0.3 blue/0
0 green/0.3 1 blue/0 2 /0.8
0 1 2 /0.8
C.fsa
C.fsa
red/0.5
red/0.5 1 red/0.5 2 green/0.3
red/0.5 1 2 green/0.3
green/0.3 blue/0
0 green/0.3 3 blue/0 4 /0.8
0 3 4 /0.8
yellow/0.6
yellow/0.6
A.fsa
A.fsa
green/0.3
green/0.3 blue/0
blue/0
00 1 1 22/0.8
/0.8
yellow/0.6
C.fsa
C.fsa
• Cyril Allauzen, Mehryar Mohri, and Brian Roark. The Design Principles and Algorithms of a
Weighted Grammar Library. International Journal of Foundations of Computer Science, 16(3):
403-421, 2005.
• Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri.
OpenFst: a general and efficient weighted finite-state transducer library. In Proceedings of
the Ninth International Conference on Implementation and Application of Automata, (CIAA
2007), volume 4783 of Lecture Notes in Computer Science, pages 11-23. Springer, Berlin,
2007.
• Mehryar Mohri. Weighted Grammar Tools: the GRM Library. In Robustness in Language and
Speech Technology. pages 165-186. Kluwer Academic Publishers, The Netherlands, 2001.
• Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. The Design Principles of a
Weighted Finite-State Transducer Library. Theoretical Computer Science, 231:17-32, January
2000.
• Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. Weighted Automata in Text and
Speech Processing. In Proceedings of the 12th biennial European Conference on Artificial
Intelligence (ECAI-96),Workshop on Extended finite state models of language. Budapest,
Hungary, 1996. John Wiley and Sons, Chichester.
• Mehryar Mohri and Michael Riley. A Weight Pushing Algorithm for Large Vocabulary
Speech Recognition. In Proceedings of the 7th European Conference on Speech
Communication and Technology (Eurospeech'01). Aalborg, Denmark, September 2001.
• Fernando Pereira and Michael Riley. Finite State Language Processing, chapter Speech
Recognition by Composition of Weighted Finite Automata. The MIT Press, 1997.