Professional Documents
Culture Documents
Lexical Analysis2 Modefied
Lexical Analysis2 Modefied
Lexical Analysis2 Modefied
Scanners
Tokens
Regular expressions
Finite automata
3
Scanners
token
characters Scanner Parser
next token
Symbol
Table
4
Tokens
Scanner
Scanner
definition in Scanner
Generator
Mata Language
Program in
Scanner Token types &
programming
semantic values
language
7
Languages
9
Regular Expressions
is a RE denoting L = {}
If a alphabet, then a is a RE denoting L = {a}
Suppose r and s are RE denoting L(r) and L(s)
alternation: (r) | (s) is a RE denoting L(r) L(s)
concatenation: (r) • (s) is a RE denoting L(r)L(s)
repetition: (r)* is a RE denoting (L(r))*
(r) is a RE denoting L(r)
10
Examples
a|b {a, b}
(a | b)(a | b) {aa, ab, ba, bb}
a* {, a, aa, aaa, ...}
(a | b)* the set of all strings of a’s and b’s
a | a*b the set containing the string a and all
strings consisting of zero or more a’s followed
by a b
11
Regular Definitions
if {return IF;}
[a-z][a-z0-9]* {return ID;}
[0-9]+ {return NUM;}
([0-9]+“.”[0-9]*)|([0-9]*“.”[0-9]+) {return REAL;}
(“--”[a-z]*“\n”)|(“ ” | “\n” | “\t”)+
{/*do nothing for white spaces and comments*/}
. { error(); }
14
Finite Automata
15
Nondeterministic Finite Automata
(NFA)
An NFA consists of
– A finite set of states
– A finite set of input symbols
– A transition function that maps (state,
symbol) pairs to sets of states
– A state distinguished as start state
– A set of states distinguished as final states
16
An Example
18
An Example
(a | b)*abb aabb
start a b b
0 1 2 3
19
An Example
(a | b)*abb aaba
start a b b
0 1 2 3
a
b
20
Another Example
aa* | bb*
a
a
2 3
start
1
4 5
b
b
22
Deterministic Finite Automata (DFA)
23
An Example
RE: (a | b)*abb
States: {1, 2, 3, 4}
Input symbols: {a, b}
Transition function:
(1,a) = {2}, (2,a) = {2}, (3,a) = {2}, (4,a) = {2}
(1,b) = {1}, (2,b) = {3}, (3,b) = {4}, (4,b) = {1}
Start state: 1
Final state: {4}
24
Finite-State Transition Diagram
(a | b)*abb
b
a
start b b
1 a 2 3 4
a
b a
25
Acceptance of DFA
26
An Example
(a | b)*abb aabb
b
a
start b b
1 a 2 3 4
a
b a
27
An Example
(a | b)*abb aaba
b
a
start b b
1 a 2 3 4
a
b a
28
Combined Finite Automata
start i f
if 1 2 3 IF
ID
start a-z a-z,0-9
[a-z][a-z0-9]* 1 2
0-9 REAL
([0-9]+“.”[0-9]*) 0-9
.
2 3 0-9
| start
1 0-9
([0-9]*“.”[0-9]+) . 4 5 0-9
29 REAL
Combined Finite Automata
i f
2 3 4 IF
ID
start a-z a-z,0-9
1 5 6
0-9 REAL
0-9
.
8 9 0-9
7
. 0-9
NFA 10 11 0-9
REAL
30
Combined Finite Automata
f IF
2 3
g-z
a-e a-z,0-9
i 4
j-z a-z,0-9
ID
start 1 a-h 0-9 REAL
0-9 .
5 6 0-9
.
0-9
DFA 7 8 0-9
31 REAL
Recognizing the Longest Match
32
An Example
f IF
2 3
iffail+ g-z S C L P
a-e a-z,0-9 1 0 0
i 4
j-z a-z,0-9 i 2 0 0
ID f 3 3 2
start a-h 0-9 f 4 4 3
1 0-9 REAL
. a 4 4 4
5 6 0-9i 4 4 5
.
0-9 l 4 4 6
DFA 7 8 0-9+ ?
REAL
33