Lecture 4 Lexical Analyzer
Lecture 04
Structure of the Generated Analyzer
[Figure: a Lex program is translated by the Lex compiler into a transition table and actions; an automaton simulator uses them to scan the input buffer for lexemes.]
Implementing the Lookahead Operator
• Reading Assignment
– Implementing the Lookahead Operator
Regular Expression to DFA
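[Figure: the automaton for r, with start state 0 and accepting state F, and the automaton for the augmented expression r#, in which a transition on # leads from F to the new accepting state F′.]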
Syntax Tree
[Figure: syntax tree; the leaves a and b are numbered as positions 1 and 2.]
NFA for (a|b)*abb#
[Figure: NFA for (a|b)*abb# with lettered states A, B, C, D, E, F and numbered states 1 to 6, shown beside the syntax tree whose leaves a, b, a, b, b, # carry the position numbers 1 to 6.]
• Lettered states are non-important states
• Numbered states are important states
– Numbers correspond to the position numbers in the syntax tree
DFA for (a|b)*abb#
[Figure: the NFA for (a|b)*abb# and the corresponding DFA, whose states are the position sets {1,2,3} (start), {1,2,3,4}, {1,2,3,5} and {1,2,3,6}, with transitions on a and b.]
Terminology
• Nullable:
– Nodes that are the root of some sub-expression that can generate the empty
string
• If n is a cat-node (•) with children c1 and c2
– nullable(n) = nullable(c1) and nullable(c2)
• If n is a star-node (*) with child c1
– nullable(n) = true
Terminology
• Firstpos(n):
– Set of positions that can match the first symbol of a string generated by
the sub-expression rooted at n
• If n is an or-node (|) with children c1 and c2
– firstpos(n) = firstpos(c1) ∪ firstpos(c2)
• If n is a cat-node (•) with children c1 and c2
– firstpos(n) = if nullable(c1) then firstpos(c1) ∪ firstpos(c2)
else firstpos(c1)
• Lastpos(n):
– Set of positions that can match the last symbol of a string generated by
the sub-expression rooted at n
• If n is an or-node (|) with children c1 and c2
– lastpos(n) = lastpos(c1) ∪ lastpos(c2)
• If n is a cat-node (•) with children c1 and c2
– lastpos(n) = if nullable(c2) then lastpos(c1) ∪ lastpos(c2)
else lastpos(c2)
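The rules for nullable, firstpos and lastpos can be sketched as three small recursive functions. This is only an illustrative sketch: the `Node` class and the `leaf` and `cat` helpers are hypothetical names, and the rules for leaves, for nullable at an or-node, and for firstpos/lastpos at a star-node are the usual textbook ones, assumed here because the slides do not state them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                       # "leaf", "or", "cat" or "star"
    left: Optional["Node"] = None   # c1 (the only child of a star-node)
    right: Optional["Node"] = None  # c2
    pos: Optional[int] = None       # position number of a leaf


def nullable(n):
    if n.kind == "leaf":
        return False                # assumption: no ε-leaves in this sketch
    if n.kind == "or":              # assumption: standard or-node rule
        return nullable(n.left) or nullable(n.right)
    if n.kind == "cat":
        return nullable(n.left) and nullable(n.right)
    return True                     # star-node


def firstpos(n):
    if n.kind == "leaf":
        return {n.pos}
    if n.kind == "star":            # assumption: same as the child
        return firstpos(n.left)
    if n.kind == "or":
        return firstpos(n.left) | firstpos(n.right)
    if nullable(n.left):            # cat-node with nullable c1
        return firstpos(n.left) | firstpos(n.right)
    return firstpos(n.left)


def lastpos(n):
    if n.kind == "leaf":
        return {n.pos}
    if n.kind == "star":            # assumption: same as the child
        return lastpos(n.left)
    if n.kind == "or":
        return lastpos(n.left) | lastpos(n.right)
    if nullable(n.right):           # cat-node with nullable c2
        return lastpos(n.left) | lastpos(n.right)
    return lastpos(n.right)


def leaf(pos):
    return Node("leaf", pos=pos)


def cat(c1, c2):
    return Node("cat", c1, c2)


# Syntax tree for (a|b)*abb# with leaf positions 1..6
star = Node("star", Node("or", leaf(1), leaf(2)))
tree = cat(cat(cat(cat(star, leaf(3)), leaf(4)), leaf(5)), leaf(6))

print(nullable(tree), sorted(firstpos(tree)), sorted(lastpos(tree)))
# prints: False [1, 2, 3] [6]
```

Recomputing nullable on every call keeps the sketch close to the rules; a real implementation would more likely store the three values on each node during one bottom-up pass.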
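[Figure: syntax tree for (a|b)*abb# with firstpos written to the left and lastpos to the right of each node.]
Node                  firstpos     lastpos
• (root, above #)     {1, 2, 3}    {6}
# (position 6)        {6}          {6}
• (above b at 5)      {1, 2, 3}    {5}
b (position 5)        {5}          {5}
• (above b at 4)      {1, 2, 3}    {4}
b (position 4)        {4}          {4}
• (above a at 3)      {1, 2, 3}    {3}
a (position 3)        {3}          {3}
*                     {1, 2}       {1, 2}
|                     {1, 2}       {1, 2}
a (position 1)        {1}          {1}
b (position 2)        {2}          {2}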
Terminology
• Followpos(i):
– Tells what positions can follow position i in the syntax tree
• Rule 1:
If n is a cat-node with left child c1 and right child c2 and i is a
position in lastpos(c1), then all positions in firstpos(c2) are in
followpos(i)
• Rule 2:
If n is a star-node, and i is a position in lastpos(n), then all
positions in firstpos(n) are in followpos(i)
• After computing firstpos and lastpos for each node, the followpos of
each position can be computed by making a depth-first traversal
of the syntax tree
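The two rules can be applied in a single depth-first (post-order) traversal. The sketch below is a hypothetical helper, not the lecture's own code: it encodes the tree as nested tuples, with a leaf written as its position number, and it assumes the usual leaf, or-node and star-node rules for nullable, firstpos and lastpos.

```python
from collections import defaultdict


def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos) of node and fill followpos in place.

    A node is an int (a leaf position), ("star", child),
    or ("or" | "cat", left, right).
    """
    if isinstance(node, int):
        followpos[node]                 # make sure every position has an entry
        return False, {node}, {node}
    kind = node[0]
    if kind == "star":
        _, first, last = analyze(node[1], followpos)
        for i in last:                  # Rule 2
            followpos[i] |= first
        return True, first, last
    n1, f1, l1 = analyze(node[1], followpos)
    n2, f2, l2 = analyze(node[2], followpos)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    for i in l1:                        # Rule 1 (cat-node)
        followpos[i] |= f2
    first = f1 | f2 if n1 else f1
    last = l1 | l2 if n2 else l2
    return n1 and n2, first, last


# (a|b)*abb# with positions 1..6
TREE = ("cat", ("cat", ("cat", ("cat", ("star", ("or", 1, 2)), 3), 4), 5), 6)

followpos = defaultdict(set)
analyze(TREE, followpos)
for position in sorted(followpos):
    print(position, sorted(followpos[position]))
```

For this tree the loop prints {1,2,3} for positions 1 and 2, then {4}, {5}, {6} for positions 3 to 5, and an empty set for position 6.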
followpos example
[Figure: the annotated syntax tree for (a|b)*abb#, next to the followpos table as filled in so far.]
Node    followpos
1       {1, 2}
2       {1, 2}
3       {}
4       {}
5       {}
6       {}
• At star-node:
– lastpos(*) = {1,2} and firstpos(*) = {1,2}
– According to Rule 2:
» followpos(1) = {1,2}
» followpos(2) = {1,2}
followpos example
[Figure: the annotated syntax tree for (a|b)*abb#, next to the followpos table as filled in so far.]
Node    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {}
4       {}
5       {}
6       {}
• At the cat-node above the star-node, ‘*’ is the left child and ‘a’ is the right
child
– lastpos(*) = {1,2} and firstpos(a) = {3}
– According to Rule 1, position 3 is added to followpos(1) and followpos(2):
» followpos(1) = {1,2,3}
» followpos(2) = {1,2,3}
followpos example
[Figure: the annotated syntax tree for (a|b)*abb#, next to the followpos table.]
Node    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {4}
4       {5}
5       {6}
6       {}
[Figure: followpos drawn as a directed graph on the positions 1 to 6.]
Construction of DFA from RE
• Method:
1. Construct syntax tree ST for augmented RE r#
2. Construct the functions nullable, firstpos, lastpos and
followpos for ST
3. Construct Dstates: set of states of D
Dtrans: transition table for D
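Step 3 can be sketched as a worklist loop over unmarked states. The function name `build_dfa`, the `symbol_at` map and the decision to leave out an explicit dead state are assumptions made for this illustration; the followpos values are the ones computed for (a|b)*abb# in this lecture.

```python
def build_dfa(start, followpos, symbol_at, alphabet):
    """Build Dstates and Dtrans from firstpos(root) and the followpos table."""
    start = frozenset(start)
    dstates = [start]                   # states in order of discovery
    dtrans = {}                         # (state, symbol) -> state
    unmarked = [start]
    while unmarked:
        S = unmarked.pop()              # "mark" S
        for a in alphabet:
            # union of followpos(p) over all positions p in S labelled a
            U = frozenset().union(
                *(followpos[p] for p in S if symbol_at[p] == a))
            if not U:
                continue                # assumption: no explicit dead state
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtrans[S, a] = U
    return dstates, dtrans


FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}

dstates, dtrans = build_dfa({1, 2, 3}, FOLLOWPOS, SYMBOL_AT, "ab")
name = {S: "ABCD"[i] for i, S in enumerate(dstates)}
for S in dstates:
    accepting = 6 in S                  # 6 is the position of #
    print(name[S], sorted(S), name[dtrans[S, "a"]], name[dtrans[S, "b"]],
          "accepting" if accepting else "")
```

A state is accepting exactly when it contains the position of the end marker #, which is why the sketch tests for position 6.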
Construction of DFA from RE
• Algorithm
– firstpos(root) = {1,2,3} ≡ A (unmarked)
– For the input symbol a, the positions in A are 1 and 3
∴ followpos(1) ∪ followpos(3) = {1,2,3,4} ≡ B
– For the input symbol b, the position in A is 2
∴ followpos(2) = {1,2,3} ≡ A
Dstates          a    b
{1,2,3} ≡ A      B    A
{1,2,3,4} ≡ B
DFA for (a|b)*abb#
[Figure: the annotated syntax tree for (a|b)*abb#, next to the followpos table.]
Node    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {4}
4       {5}
5       {6}
6       -
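• {1,2,3,4} ≡ B (unmarked)
– On a, the positions are 1 and 3: followpos(1) ∪ followpos(3) = {1,2,3,4} ≡ B
– On b, the positions are 2 and 4: followpos(2) ∪ followpos(4) = {1,2,3,5} ≡ C
• {1,2,3,5} ≡ C (unmarked)
– On a, the positions are 1 and 3: followpos(1) ∪ followpos(3) = {1,2,3,4} ≡ B
– On b, the positions are 2 and 5: followpos(2) ∪ followpos(5) = {1,2,3,6} ≡ D
• {1,2,3,6} ≡ D (unmarked)
– On a, the positions are 1 and 3: followpos(1) ∪ followpos(3) = {1,2,3,4} ≡ B
– On b, the position is 2: followpos(2) = {1,2,3} ≡ A
Dstates          a    b
{1,2,3} ≡ A      B    A
{1,2,3,4} ≡ B    B    C
{1,2,3,5} ≡ C    B    D
{1,2,3,6} ≡ D    B    A
• D contains position 6, the position of #, so D is the accepting state
[Figure: the resulting DFA with states 123 (start), 1234, 1235 and 1236.]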
DFA State Minimization
[Figure: the DFA for (a|b)*abb# with states 123, 1234, 1235 and 1236 and transitions on a and b.]
Example:
• A partitioning of a set...
...breaks the set into non-overlapping subsets.
(The partition breaks the set into “groups”)
• Example:
S = {A, B, C, D, E, F, G}
π = {(A B) (C D E F) (G) }
π2 = {(A) (B C) (D E F G) }
• We can “refine” a partition...
πi = { (A B C) (D E) (F G) }
• πi = { (A B D) (C E) }, with groups P1 = (A B D) and P2 = (C E)
• Consider one group... (A B D)
• Look at output edges on some symbol (e.g., “x”)
[Figure: states s and t with transitions on symbol a to states p and q, shown for the DFA D and for DMIN.]
newPartition(π)
1. Set πnew = π
2. For (each group G of π)
3. partition G into subgroups such that two states s and t
are in the same subgroup iff for all input symbols a, states s and t
have transitions on a to states in the same group of π
4. replace G in πnew by the set of all subgroups formed
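One way to sketch newPartition, and the loop that repeats it until the partition stops changing, is shown below. The five-state DFA used as input is not the DFA built in this lecture: it is an assumed example (the familiar subset-construction DFA for (a|b)*abb) chosen because two of its states can be merged. The function names and the initial split into accepting and non-accepting states are likewise assumptions of this sketch, which also assumes that every state has a transition on every symbol.

```python
def new_partition(partition, dtrans, alphabet):
    """One refinement pass over a list of groups (frozensets of states)."""
    group_of = {s: i for i, g in enumerate(partition) for s in g}
    refined = []
    for g in partition:
        subgroups = {}
        for s in sorted(g):
            # s and t stay together iff, for every symbol, their
            # transitions lead into the same group of the old partition
            key = tuple(group_of[dtrans[s, a]] for a in alphabet)
            subgroups.setdefault(key, set()).add(s)
        refined.extend(frozenset(sub) for sub in subgroups.values())
    return refined


def minimize(states, accepting, dtrans, alphabet):
    """Refine {accepting, non-accepting} until no group is split any more."""
    partition = [frozenset(accepting), frozenset(states - accepting)]
    while True:
        refined = new_partition(partition, dtrans, alphabet)
        if len(refined) == len(partition):   # refinement only ever splits
            return refined
        partition = refined


# Assumed example DFA for (a|b)*abb with a redundant state; E is accepting.
DTRANS = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

groups = minimize(set("ABCDE"), {"E"}, DTRANS, "ab")
print(sorted(sorted(g) for g in groups))
# prints: [['A', 'C'], ['B'], ['D'], ['E']]
```

Each final group becomes one state of the minimized DFA, so here A and C collapse into a single state.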
Example
NFA to Regular Expression
We can write
– A=aB
– B=bD|bC
– C=aD
– D=aB|ε
NFA to Regular Expression
4.
• A = a (b | b a) D
• D = a (b | b a) D | ε
5. Factor:
• A = (a b | a b a) D
• D = (a b | a b a) D | ε
6. Arden:
• A = (a b | a b a) D
• D = (a b | a b a)* ε
7. Remove epsilon:
• A = (a b | a b a) D
• D = (a b | a b a)*
8. Substitute:
• A = (a b | a b a) (a b | a b a)*
9. Simplify:
• A = (a b | a b a)+
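The derived expression can be sanity-checked against the original equations A = aB, B = bD | bC, C = aD, D = aB | ε by brute force. The sketch below reads the equations as an NFA (D is accepting because it can produce ε) and compares it with Python's `re` module on all short strings over {a, b}. The helper names are hypothetical, and a check up to a fixed length is evidence rather than a proof.

```python
import re
from itertools import product

# Transitions read off the equations: A = aB, B = bD | bC, C = aD, D = aB | ε
EDGES = {
    ("A", "a"): {"B"},
    ("B", "b"): {"D", "C"},
    ("C", "a"): {"D"},
    ("D", "a"): {"B"},
}
ACCEPTING = {"D"}                       # D = ... | ε

DERIVED = re.compile(r"(ab|aba)+")      # result of step 9


def generated_by_equations(s):
    """Simulate the NFA given by the equations, starting from A."""
    current = {"A"}
    for ch in s:
        current = {t for q in current for t in EDGES.get((q, ch), ())}
    return bool(current & ACCEPTING)


def agrees_up_to(max_len):
    """Compare both descriptions on every string of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if generated_by_equations(s) != bool(DERIVED.fullmatch(s)):
                return False
    return True


print(agrees_up_to(10))
# prints: True
```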
NFA to Regular Expression
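[Figure: an NFA with states 0 to 4, ε-transitions out of the start state 0, and edges labeled a and b; and the NFA for (a|b)*abb with lettered states A to E and numbered states 1 to 6.]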
Trading Time for Space in DFA Simulation
[Figure: states s, q, r and t, with a transition labeled a.]