UNIT1 - Lexical Analysis1
Lexical Analysis
[Figure: source program --read char--> lexical analyzer --pass token--> parser; the parser sends "get next" requests back to the lexical analyzer; both consult the symbol table]
Functions
Grouping input characters into tokens
Stripping out comments and white spaces
Correlating error messages with the source program
NUM 10
Character-at-a-time I/O
Block / Buffered I/O
Block/Buffered I/O
Utilize a block of memory
Stage data from the source into the buffer one block at a time
Maintain two blocks: asynchronous I/O fills one block while lexical analysis proceeds on the other
[Figure: buffer divided into Block 1 and Block 2]
if forward at end of first half then begin
    reload second half ;
    forward := forward + 1
end
else if forward at end of second half then begin
    { checking if forward ptr is at the end of 2nd half }
    reload first half ;
    move forward to beginning of first half
end
else forward := forward + 1;
[Figure: input buffer holding E = M * C * * 2 followed by the eof sentinel]
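The buffer-pair scheme with an eof sentinel can be sketched as follows. This is a minimal illustration, not the slides' own code: the class name `BufferPair`, the method `next_char`, the half size `n`, and the choice of `"\0"` as the sentinel are all assumptions made for the example. Each half is followed by one sentinel slot, so the common case needs only one test per character.

```python
import io

EOF = "\0"  # sentinel character (assumption: it never occurs in the source text)


class BufferPair:
    """Two buffer halves of n characters, each followed by a sentinel slot."""

    def __init__(self, stream, n=4):
        self.stream = stream
        self.n = n
        # Layout: [half 1: n chars][EOF][half 2: n chars][EOF]
        self.buf = [EOF] * (2 * n + 2)
        self._reload(0)
        self.forward = 0

    def _reload(self, start):
        """Fill one half beginning at index start; pad a short read with EOF."""
        chunk = self.stream.read(self.n)
        for i in range(self.n):
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def next_char(self):
        """Return the next source character, or EOF at the end of the input."""
        c = self.buf[self.forward]
        if c == EOF:
            if self.forward == self.n:              # sentinel after first half
                self._reload(self.n + 1)
                self.forward = self.n + 1
                c = self.buf[self.forward]
            elif self.forward == 2 * self.n + 1:    # sentinel after second half
                self._reload(0)
                self.forward = 0
                c = self.buf[self.forward]
            if c == EOF:
                return EOF                          # real end of input
        self.forward += 1
        return c
```

A caller would loop on `next_char()` until it returns `EOF`; the reloads happen transparently when `forward` reaches the sentinel that closes a half.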
OPERATION                                  DEFINITION
union of L and M, written L ∪ M            L ∪ M = {s | s is in L or s is in M}
concatenation of L and M, written LM       LM = {st | s is in L and t is in M}
Kleene closure of L, written L*            L* = ∪ (i ≥ 0) L^i
positive closure of L, written L+          L+ = ∪ (i ≥ 1) L^i
L = {A, B, C, D } D = {1, 2, 3}
L ∪ D = {A, B, C, D, 1, 2, 3}
LD = {A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3}
L^2 = {AA, AB, AC, AD, BA, BB, BC, BD, CA, … DD}
L^4 = L^2 L^2 = {all 4-letter strings over L}
L* = {all possible strings over L, plus ε}
L+ = L* − {ε}
L (L ∪ D)* = ??
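The operations on languages can be checked on the example sets with a short sketch. The helper names `concat`, `power`, and `closure_up_to` are made up for this illustration; because L* and L+ are infinite, `closure_up_to` only builds the finite part with strings of at most `k` symbols from the alphabet.

```python
from itertools import product

L = {"A", "B", "C", "D"}
D = {"1", "2", "3"}


def concat(X, Y):
    """Concatenation XY = {st | s in X and t in Y}."""
    return {s + t for s, t in product(X, Y)}


def power(X, i):
    """X^i, with X^0 = {ε}; the empty string plays the role of ε."""
    result = {""}
    for _ in range(i):
        result = concat(result, X)
    return result


def closure_up_to(X, k, start=0):
    """Finite part of X* (start=0) or X+ (start=1): union of X^i for start <= i <= k."""
    out = set()
    for i in range(start, k + 1):
        out |= power(X, i)
    return out


union = L | D                                     # L ∪ D, 7 elements
LD = concat(L, D)                                 # 12 strings: A1, A2, ... D3
identifiers = concat(L, closure_up_to(L | D, 2))  # finite part of L (L ∪ D)*
```

The last line hints at the answer to the open question: L (L ∪ D)* is the set of strings of letters and digits that begin with a letter.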
More examples
a | b denotes the set {a,b}
(a|b) (a|b) denotes the set {aa, ab,ba,bb}
a* denotes {ε, a, aa, aaa, …}
(a|b)* denotes all strings of a’s and b’s, including ε; it is also equal to (a*b*)*
L = {A, B, C, D } D = {1, 2, 3}
A | B | C | D = L
(A | B | C | D) (A | B | C | D) = L^2
(A | B | C | D)* = L*
(A | B | C | D) ((A | B | C | D) | (1 | 2 | 3)) = L (L ∪ D)
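The regular expression examples above can be tried out with Python's `re` module. This is only a sketch: Python's pattern syntax stands in for the abstract notation of the slides, and the helpers `matches` and `strings_up_to` are names invented for the example. The check at the end compares (a|b)* and (a*b*)* on every short string over {a, b, c}.

```python
import re
from itertools import product


def matches(pattern, s):
    """True if the whole string s belongs to the language of pattern."""
    return re.fullmatch(pattern, s) is not None


def strings_up_to(alphabet, n):
    """Yield every string over alphabet of length 0..n."""
    for k in range(n + 1):
        for symbols in product(alphabet, repeat=k):
            yield "".join(symbols)


# (a|b)(a|b) denotes exactly {aa, ab, ba, bb}
two_letter = {s for s in strings_up_to("ab", 3) if matches("(a|b)(a|b)", s)}

# (a|b)* and (a*b*)* agree on all strings of length <= 4 over {a, b, c}
same_language = all(
    matches("(a|b)*", s) == matches("(a*b*)*", s)
    for s in strings_up_to("abc", 4)
)
```

Agreement on short strings is evidence rather than a proof; the equality itself follows because every string of a's and b's is a sequence of blocks of the form a*b*.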
AXIOM                          DESCRIPTION
r | s = s | r                  | is commutative
r | (s | t) = (r | s) | t      | is associative
(r s) t = r (s t)              concatenation is associative
r (s | t) = r s | r t          concatenation distributes over |
(s | t) r = s r | t r
ε r = r                        ε is the identity element for concatenation
r ε = r
blank   → b
tab     → ^T
newline → ^M
delim   → blank | tab | newline
ws      → delim+
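The regular definition for white space translates almost directly into a pattern. In this sketch the names `DELIM`, `WS`, and `skip_ws` are assumptions, and the concrete characters (space, `\t`, `\n`) are Python's usual blank, tab, and newline rather than the control-key notation of the slide.

```python
import re

DELIM = r"[ \t\n]"              # delim → blank | tab | newline
WS = re.compile(f"{DELIM}+")    # ws → delim+


def skip_ws(text, pos=0):
    """Return the index of the first non-delimiter character at or after pos."""
    m = WS.match(text, pos)
    return m.end() if m else pos
```

A lexical analyzer would typically call such a routine before starting each token, so that white space never reaches the parser.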
[Figure: transition diagram notation: a state, an accepting state, a transition labelled a, and a transition labelled "other"]
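[Figure: transition diagram for relational operators. Recoverable part:
 state 4 (*)                      return(relop, LT)
 state 5, reached on =            return(relop, EQ)
 state 6, reached on >
 state 6 --=--> state 7           return(relop, GE)
 state 6 --other--> state 8 (*)   return(relop, GT)
 (*) marks an accepting state that retracts the input by one character]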
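The relop diagram can be turned into code one state at a time. The sketch below assumes the usual textbook version of the diagram: states 4 to 8 and their actions come from the figure, while states 1 to 3 and the tokens LE and NE are assumptions filled in from that standard layout. The function name `relop` and its return shape are also invented for the example.

```python
def relop(text, pos=0):
    """Return ((token, attribute), next_pos), or None if no relop starts at pos."""
    state = 0
    i = pos
    while True:
        c = text[i] if i < len(text) else ""   # "" stands for end of input
        if state == 0:
            if c == "<":
                state = 1
            elif c == "=":
                return ("relop", "EQ"), i + 1   # state 5
            elif c == ">":
                state = 6
            else:
                return None                     # not a relational operator
        elif state == 1:
            if c == "=":
                return ("relop", "LE"), i + 1   # state 2 (assumed)
            if c == ">":
                return ("relop", "NE"), i + 1   # state 3 (assumed)
            return ("relop", "LT"), i           # state 4*: "other" is not consumed
        elif state == 6:
            if c == "=":
                return ("relop", "GE"), i + 1   # state 7
            return ("relop", "GT"), i           # state 8*: "other" is not consumed
        i += 1
```

Returning `i` instead of `i + 1` in the starred states has the same effect as reading the extra character and then retracting by one.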
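[Figure: transition diagrams for id (a loop on "letter or digit") and for num (transitions on digit, with an exponent part introduced by E; the accepting action is return(num, install_num()))]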
switch (state) {
    …
    case 9:  c = nextchar();
             if (isletter(c)) state = 10; else state = failure();
             break;
    case 10: ….
    case 11: retract(1); insert(id); return;
}
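The same state-by-state structure can be sketched for identifiers. States 9, 10, and 11 follow the fragment above; what state 10 does is elided in the slides, so the loop on letters and digits here is an assumption based on the "letter or digit" label in the id diagram. The function name `scan_id` is hypothetical, and symbol-table insertion is left out.

```python
def scan_id(text, pos=0):
    """Return (lexeme, next_pos) for an identifier starting at pos, else None."""
    state = 9
    i = pos
    while True:
        if state == 9:                  # expect the first letter
            if i < len(text) and text[i].isalpha():
                state = 10
                i += 1
            else:
                return None             # plays the role of failure()
        elif state == 10:               # assumed: loop on letter or digit
            if i < len(text) and text[i].isalnum():
                i += 1
            else:
                state = 11              # "other": leave the character unread
        elif state == 11:               # accept; nothing to retract here because
            return text[pos:i], i       # the "other" character was never consumed
```

For example, `scan_id("rate2 = 5")` stops at the blank and yields the lexeme `rate2` together with the position where the next token starts.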
[Figure: a finite automaton (FA) takes a string x as input and answers Yes or No]
An FA contains:
• a set of states (S)
• a set of input symbols (Σ)
• a start state (s0)
• a set of final (accepting) states (F)
• a set of transitions (δ)
06/22/2009 Department of Computer Science, ER&DCI Institute of Technology
Finite Automata: 2 types
S = {0, 1, 2, 3}
Σ = {a, b}
s0 = 0
F = {3}

Transition Diagram
[Figure: start --> 0 --a--> 1 --b--> 2 --b--> 3, with self-loops on state 0 for a and b]

Transition Table
state    a         b
0        {0, 1}    {0}
1        --        {2}
2        --        {3}

ε (null) moves possible
[Figure: i --ε--> j]
Switch state but do not use any input symbol
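An NFA transition table like the one above maps a state and an input symbol to a set of states, so a simulation has to track every state the machine could be in. A minimal sketch for this example, which has no ε-moves; the names `NFA`, `START`, `FINAL`, and `nfa_accepts` are chosen only for illustration:

```python
# Transition table of the example NFA; missing entries ("--") mean "no move".
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
FINAL = {3}


def nfa_accepts(s):
    """True if some path labelled s leads from START to a state in FINAL."""
    current = {START}
    for c in s:
        nxt = set()
        for q in current:
            nxt |= NFA.get((q, c), set())
        current = nxt
    return bool(current & FINAL)
```

The input is accepted when at least one of the tracked states is accepting after the last symbol, which is how this NFA accepts exactly the strings ending in abb.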
Acceptance of NFA
aa* | bb*
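[Figure: NFA for aa* | bb*: start state 0 with ε-moves to states 1 and 3; 1 --a--> 2 with a self-loop on 2 for a; 3 --b--> 4 with a self-loop on 4 for b]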
Since each entry of a DFA transition table holds exactly one next state, with no alternatives to choose between, a DFA is easily simulated by an algorithm.
s := s0;
c := nextchar;
while c ≠ eof do
    s := move(s, c);
    c := nextchar;
end;
if s is in F then return “yes”
else return “no”
Example - DFA
[Figure: DFA transition diagram, start --> 0 --a--> 1 --b--> 2 --b--> 3, with the remaining edges as listed in the table]

Transition Table
state    a    b
0        1    0
1        1    2
2        1    3
3        1    0
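The DFA simulation loop is short enough to write out for the example table. In this sketch the dictionary `DFA`, the function `dfa_accepts`, and the choice of state 0 as start state and state 3 as the only accepting state are assumptions read off the example; the end of the Python string takes the place of eof.

```python
# DFA transition table from the example: DFA[state][symbol] -> next state.
DFA = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}


def dfa_accepts(s, start=0, final=frozenset({3})):
    """Simulate the DFA on s; exactly one move per input symbol.

    Raises KeyError for symbols outside {a, b}, since the table has no
    entry for them.
    """
    state = start
    for c in s:
        state = DFA[state][c]   # s := move(s, c)
    return state in final
```

Unlike the NFA case there is a single current state, so the running time is proportional to the length of the input.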
Conversion of NFA to DFA
Why?
Method
y = ε-closure(T): the set of NFA states reachable from some state in T using ε-moves alone (T itself included)
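Computing an ε-closure is a plain graph search over the ε-edges. The sketch below is one common way to write it; the function name `eps_closure` and the dictionary `EPS` are assumptions, and the sample ε-moves (from state 0 to states 1 and 3) are those of the usual NFA for aa* | bb*, used here only as test data.

```python
def eps_closure(T, eps):
    """Return all states reachable from the set T by ε-moves alone.

    eps maps a state to the set of states reachable by a single ε-move.
    """
    closure = set(T)
    stack = list(T)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in closure:    # visit each state only once
                closure.add(r)
                stack.append(r)
    return closure


# Assumed ε-moves of an NFA for aa* | bb*: 0 --ε--> 1 and 0 --ε--> 3
EPS = {0: {1, 3}}
```

Because each state is pushed at most once, the search terminates even when the ε-edges form a cycle.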
[Figure: NFA, start --> 0 --a--> 1 --b--> 2 --b--> 3, with self-loops on state 0 for a and b]
δ(0, a) = {0,1}
δ(0, b) = {0}
δ({0,1}, a) = {0,1}
δ({0,1}, b) = {0,2}
δ({0,2}, a) = {0,1}
δ({0,2}, b) = {0,3}

New states
A = {0}
B = {0,1}
C = {0,2}
D = {0,3}

state    a    b
A        B    A
B        B    C
C        B    D
D        B    A
NFA to DFA conversion (cont.)
[Figure: resulting DFA, start --> A --a--> B --b--> C --b--> D, with the remaining edges as listed in the table]

state    a    b
A        B    A
B        B    C
C        B    D
D        B    A
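The conversion worked through above can be reproduced mechanically. The following is a sketch of the subset construction restricted to NFAs without ε-moves, which is all this example needs; with ε-moves, each new set would also have to be closed under ε. The function name `subset_construction` and the scheme of naming new states A, B, C, … in order of discovery are assumptions chosen to match the table.

```python
from collections import deque

# Example NFA: (state, symbol) -> set of next states
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}


def subset_construction(nfa, start, alphabet):
    """Build DFA states as sets of NFA states (no ε-moves assumed).

    Returns (names, table): names maps each state set to a letter, and
    table maps (letter, symbol) to the letter of the next DFA state.
    An empty set, if ever reached, would be named like any other state
    and act as a dead state; that does not happen in this example.
    """
    start_set = frozenset({start})
    names = {start_set: "A"}
    table = {}
    queue = deque([start_set])
    while queue:
        T = queue.popleft()
        for a in alphabet:
            U = frozenset(r for q in T for r in nfa.get((q, a), ()))
            if U not in names:                      # a new DFA state
                names[U] = chr(ord("A") + len(names))
                queue.append(U)
            table[(names[T], a)] = names[U]
    return names, table


names, table = subset_construction(NFA, 0, "ab")
```

Running it on the example yields the four states A = {0}, B = {0,1}, C = {0,2}, D = {0,3}, and a DFA state is accepting when its set contains an accepting NFA state, here only D.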