
CPSC 388 – Compiler Design and Construction

Scanner – Regular Expressions to DFA


Announcements
 ACM Programming contest
(Tues 8pm)
 PROG 1 Feedback
 Linux Install Fest – When?
Saturday?, Fliers, CDROMS, Bring
Laptops (do at own risk)
 LUG
 Understanding Editors (Eclipse, Vi,
Emacs)
Scanners

[Figure: Source Code → Lexical Analyzer (Scanner) → Token Stream]
[Figure: the scanner is specified with Regular Expressions, which are converted to Nondeterministic Finite State Automata and then to Deterministic Finite State Automata]
Regular Expressions
 Easy way to express a language that is
accepted by an FSA
 Rules:
 ε is a regular expression
 Any symbol in Σ is a regular expression
 If r and s are any regular expressions, then so are:
 r|s denotes union, i.e. “r or s”
 rs denotes r followed by s (concatenation)
 (r)* denotes concatenation of r with itself zero or
more times (Kleene closure)
 () used for controlling order of operations
RE to NFA: Step 1
 Create a tree from the Regular
Expression
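 Example: (a(a|b))*
[Figure: syntax tree with * at the root; its child is Cat; Cat's children are a and |; the children of | are a and b]
•Leaf Nodes are either members of Σ or ε
•Internal Nodes are operators: cat, |, *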
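Step 1 can be sketched in Python as a small recursive-descent parser. This is only an illustration under stated assumptions: symbols are single characters, `ε` is written literally, and the function name `parse_regex` and the tuple-based node shapes are hypothetical choices for this sketch, not part of the course material.

```python
def parse_regex(text):
    """Parse a regular expression into a nested-tuple syntax tree.

    Node shapes (hypothetical, chosen for this sketch):
      ("sym", c)      leaf: a symbol of the alphabet, or "ε"
      ("or", l, r)    l | r
      ("cat", l, r)   l followed by r
      ("star", r)     r*
    """
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def parse_or():                      # lowest precedence: r|s
        nonlocal pos
        node = parse_cat()
        while peek() == "|":
            pos += 1
            node = ("or", node, parse_cat())
        return node

    def parse_cat():                     # middle precedence: rs
        node = parse_star()
        while peek() is not None and peek() not in "|)":
            node = ("cat", node, parse_star())
        return node

    def parse_star():                    # highest precedence: r*
        nonlocal pos
        node = parse_atom()
        while peek() == "*":
            pos += 1
            node = ("star", node)
        return node

    def parse_atom():                    # a symbol or a parenthesized RE
        nonlocal pos
        ch = peek()
        if ch is None or ch in "|)*":
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        return ("sym", ch)

    tree = parse_or()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree


tree = parse_regex("(a(a|b))*")
```

For the slide's example, `tree` has `star` at the root, `cat` below it, and the leaves `a`, `a`, `b` at the bottom, matching the tree drawn above.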
RE to NFA: Step 2
 Do a Post-Order Traversal of Tree
(children processed before parent)
 At each node follow rules for
conversion from an RE to an NFA
Leaf Nodes
 Either ε or member of Σ
[Figure: NFA for an ε leaf: S --ε--> F]
[Figure: NFA for a symbol leaf a: S --a--> F]
Internal Nodes
 Need to keep track of left (l) and right
(r) NFA and merge them into a single
NFA
 Or
 Concatenation
 Kleene Closure
Or Node
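[Figure: new start state S with ε-transitions into the NFAs l and r; ε-transitions from the ends of l and r into a new accepting state F]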
Concatenation Node

[Figure: the NFA l followed directly by the NFA r]
Kleene Closure
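[Figure: new states S and F around the inner NFA, with four ε-transitions: S into the inner NFA, the inner NFA out to F, the end of the inner NFA back to its start, and S directly to F]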
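The leaf, Or, Concatenation and Kleene Closure rules can be sketched as one post-order walk in Python. This is a minimal sketch, not the course's reference implementation: the tuple tree shapes, the name `tree_to_nfa`, and the integer state numbering are assumptions. Concatenation here links l to r with an extra ε edge, a common variant; merging l's final state with r's start state works as well.

```python
from itertools import count

EPS = "ε"  # label used for ε-transitions in this sketch


def tree_to_nfa(tree):
    """Post-order walk of a regex tree; returns (start, final, edges).

    `edges` is a list of (source, label, target) triples. Tree nodes are
    tuples: ("sym", c), ("or", l, r), ("cat", l, r), ("star", r).
    """
    fresh = count()          # generator of unused state numbers
    edges = []

    def build(node):
        kind = node[0]
        if kind == "sym":                       # leaf: S --c--> F
            s, f = next(fresh), next(fresh)
            edges.append((s, node[1], f))
            return s, f
        if kind == "cat":                       # l, then r
            ls, lf = build(node[1])
            rs, rf = build(node[2])
            edges.append((lf, EPS, rs))         # link l's final to r's start
            return ls, rf
        if kind == "or":                        # new S and F around l and r
            ls, lf = build(node[1])
            rs, rf = build(node[2])
            s, f = next(fresh), next(fresh)
            edges.extend([(s, EPS, ls), (s, EPS, rs),
                          (lf, EPS, f), (rf, EPS, f)])
            return s, f
        if kind == "star":                      # new S and F around r
            rs, rf = build(node[1])
            s, f = next(fresh), next(fresh)
            edges.extend([(s, EPS, rs), (rf, EPS, f),   # enter and leave r
                          (rf, EPS, rs),                # repeat r
                          (s, EPS, f)])                 # zero repetitions
            return s, f
        raise ValueError(f"unknown node kind: {kind!r}")

    start, final = build(tree)
    return start, final, edges


# Tree for (a(a|b))*, written out by hand.
tree = ("star", ("cat", ("sym", "a"), ("or", ("sym", "a"), ("sym", "b"))))
start, final, edges = tree_to_nfa(tree)
```

Because children are built before their parent, each operator node only has to wire together the start and final states returned for its operands.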
Try It
 Convert the regular expression to an
NFA
(a|b)*abb
 First convert RE to a tree
 Then convert tree to NFA
NFA to DFA
 Recall that a DFA can be represented
as a transition table
State | + | - | Digit
S     | A | A | B
A     |   |   | B
B     |   |   | B
(columns are input characters; rows are states)
Operations on NFA
 ε-closure(t) – Set of NFA states
reachable from NFA state t on
ε-transitions alone.
 ε-closure(T) – Set of NFA states
reachable from some NFA state t in
set T on ε-transitions alone.
 move(T,a) – Set of NFA states to
which there is a transition on input
symbol a from some state t in T
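These operations can be sketched in Python as follows. The dictionary encoding of the NFA, the `EPS` label, and the function signatures are assumptions made for this sketch; the NFA itself is the one for (a|b)*abb used in the worked example, with its state names S, 1 to 9, and F.

```python
EPS = "ε"

# NFA for (a|b)*abb: (state, label) -> set of next states.
NFA = {
    ("S", EPS): {"1", "7"},
    ("1", EPS): {"2", "4"},
    ("2", "a"): {"3"},
    ("4", "b"): {"5"},
    ("3", EPS): {"6"},
    ("5", EPS): {"6"},
    ("6", EPS): {"1", "7"},
    ("7", "a"): {"8"},
    ("8", "b"): {"9"},
    ("9", "b"): {"F"},
}


def epsilon_closure(nfa, states):
    """All states reachable from `states` using ε-transitions only."""
    closure = set(states)
    stack = list(states)
    while stack:
        t = stack.pop()
        for u in nfa.get((t, EPS), ()):
            if u not in closure:      # visit each state at most once
                closure.add(u)
                stack.append(u)
    return closure


def move(nfa, states, symbol):
    """States reachable from some state in `states` on one `symbol` edge."""
    result = set()
    for t in states:
        result |= nfa.get((t, symbol), set())
    return result


start_closure = epsilon_closure(NFA, {"S"})
after_a = epsilon_closure(NFA, move(NFA, start_closure, "a"))
```

ε-closure(t) for a single state is just the call with a one-element set, as in `epsilon_closure(NFA, {"S"})`.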
NFA to DFA Algorithm
Initially ε-closure(s) is the only state
in DFA and it is unmarked
while (there is an unmarked state T in DFA) {
    mark T;
    for (each input symbol a) {
        U = ε-closure(move(T,a));
        if (U not in DFA)
            add U unmarked to DFA;
        transition[T,a] = U;
    }
}
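A runnable sketch of this algorithm in Python, applied to the NFA for (a|b)*abb. The data layout and the name `nfa_to_dfa` are assumptions for illustration. One deviation from the pseudocode is labeled in the code: an empty set U is skipped rather than added as a dead state.

```python
EPS = "ε"

# NFA for (a|b)*abb: (state, label) -> set of next states.
NFA = {
    ("S", EPS): {"1", "7"}, ("1", EPS): {"2", "4"},
    ("2", "a"): {"3"},      ("4", "b"): {"5"},
    ("3", EPS): {"6"},      ("5", EPS): {"6"},
    ("6", EPS): {"1", "7"}, ("7", "a"): {"8"},
    ("8", "b"): {"9"},      ("9", "b"): {"F"},
}


def epsilon_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        for u in NFA.get((stack.pop(), EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


def move(states, symbol):
    return {u for t in states for u in NFA.get((t, symbol), ())}


def nfa_to_dfa(start, alphabet):
    """Subset construction; DFA states are frozensets of NFA states."""
    start_set = frozenset(epsilon_closure({start}))
    dfa_states = {start_set}
    unmarked = [start_set]
    transition = {}
    while unmarked:
        T = unmarked.pop()                 # taking T off the list marks it
        for a in alphabet:
            U = frozenset(epsilon_closure(move(T, a)))
            if not U:                      # assumption: omit the dead state
                continue
            if U not in dfa_states:
                dfa_states.add(U)
                unmarked.append(U)
            transition[T, a] = U
    return start_set, dfa_states, transition


start_set, dfa_states, transition = nfa_to_dfa("S", ("a", "b"))
# A DFA state accepts if it contains the NFA's accepting state F.
accepting = {T for T in dfa_states if "F" in T}
```

Using `frozenset` lets each set of NFA states serve directly as a dictionary key, so the check "U not in DFA" is a plain set lookup.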
Try it
 Take NFA from previous example and
construct DFA
Regular Expression: (a|b)*abb
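[Figure: NFA with start state S, states 1 to 9, and accepting state F]
ε-transitions: S→1, S→7, 1→2, 1→4, 3→6, 5→6, 6→1, 6→7
Transitions on a: 2→3, 7→8
Transitions on b: 4→5, 8→9, 9→F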
Corresponding DFA
DFA state | NFA states    | on a | on b
NewS      | S,1,2,4,7     | B    | C
B         | 1,2,3,4,6,7,8 | B    | D
C         | 1,2,4,5,6,7   | B    | C
D         | 1,2,4,5,6,7,9 | B    | NewF
NewF      | 1,2,4,5,6,7,F | B    | C
Start State and Accepting States
 The Start State for the DFA is
ε-closure(s)
 The accepting states in the DFA are
those states that contain an accepting
state from the NFA
Efficiency of Algorithms
 RE -> NFA
O(|r|) where |r| is the size of the RE
 NFA -> DFA
O(|r|^2 · 2^|r|) – worst case
(not seen in typical programming languages)
 Recognition of a string by DFA
O(|x|) where |x| is the length of the string
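The O(|x|) bound for recognition follows from doing one table lookup per input character. A minimal sketch in Python, using the DFA for (a|b)*abb from the worked example; the dictionary layout and the name `dfa_accepts` are assumptions for this sketch.

```python
# Transition table of the DFA for (a|b)*abb: (state, character) -> state.
TRANSITION = {
    ("NewS", "a"): "B", ("NewS", "b"): "C",
    ("B", "a"): "B",    ("B", "b"): "D",
    ("C", "a"): "B",    ("C", "b"): "C",
    ("D", "a"): "B",    ("D", "b"): "NewF",
    ("NewF", "a"): "B", ("NewF", "b"): "C",
}
START = "NewS"
ACCEPTING = {"NewF"}


def dfa_accepts(x):
    """Run the DFA over x: one dictionary lookup per character."""
    state = START
    for ch in x:
        state = TRANSITION.get((state, ch))
        if state is None:        # no transition defined: reject
            return False
    return state in ACCEPTING
```

For example, `dfa_accepts("babb")` follows NewS, C, B, D, NewF and accepts, while `dfa_accepts("abba")` ends in B and rejects.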
More Practice
 Convert RE to NFA
((ε|a)b*)*

 Convert NFA to DFA


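[Figure: NFA with start state S and states 1 to 4]
ε-transitions: S→1, S→3
Transitions on a: 1→2, 2→2
Transitions on b: 3→4, 4→4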
Solution to Practice
 RE to NFA
[Figure: NFA for ((ε|a)b*)* with states S, 1 to 9, and F; one transition on a, one on b, and all remaining transitions on ε]
Solution to Practice
 NFA to DFA
DFA state | NFA states | on a | on b
NewS      | S,1,3      | A    | B
A         | 2          | A    |
B         | 4          |      | B
Summary of Scanners
 Lexemes
 Tokens
 Regular Expressions, Extended RE
 Regular Definitions
 Finite Automata (DFA & NFA)
 Conversion from RE->NFA->DFA
 JLex
