Download as pdf or txt
Download as pdf or txt
You are on page 1of 26

1

Syntax Analysis
Part I
Chapter 4
2

Defining LL(1) Grammar


• A grammar G is LL(1) if for each collections of
productions
A  1 | 2 | … | n
for nonterminal A the following holds:

1. FIRST(i)  FIRST(j) =  for all i  j


2. if i *  then
2.a. j *  for all i  j
2.b. FIRST(j)  FOLLOW(A) = 
for all i  j
3

Predictive Parsing: Variant 1


Recursive Descent Parsing
• Grammar must be LL(1)
• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntactic
category of input tokens
• When a nonterminal has multiple productions,
each production is implemented in a branch of a
selection statement based on input look-ahead
information
4

Using FIRST and FOLLOW to


Write a Recursive Descent Parser
procedure rest();
begin
expr  term rest if lookahead in FIRST(+ term rest) then
rest  + term rest match(‘+’); term(); rest()
else if lookahead in FIRST(- term rest) then
| - term rest match(‘-’); term(); rest()
| else if lookahead in FOLLOW(rest) then
term  id return
else error()
end;

FIRST(+ term rest) = { + }


FIRST(- term rest) = { - }
FOLLOW(rest) = { $ }
5

Example 9: LL(1) Grammars are Unambiguous


SR  e S, will be selected because of e
Ambiguous grammar A FIRST() FOLLOW(A)
S  i E t S SR | a S  i E t S SR i e$
SR  e S |  Sa a
Eb SR  e S e e$
SR   
Eb b t
Error: duplicate table entry
a b e i t $
S Sa S  i E t S SR
SR  
SR SR  
SR  e S
E Eb
6

Example 9: Below Grammar is


Ambiguous
Ambiguous grammar A FIRST() FOLLOW(A)
S  i E t S SR | a S  i E t S SR i e$
SR  e S |  Sa a
Eb SR  e S e e$
SR   
Eb b t

SR  e S |  (1 =  and 2 = eS)

1 = * , First (2)= {e} and Follow (SR) = {e, $}


FOLLOW(SR)  FIRST(2) = {e, $}  {e} ≠ 
7

Non-LL(1) Examples

Grammar Not LL(1) because


SSa|a Left recursive
SaS|a FIRST(a S)  FIRST(a)  
SaR|
RS| For R: S *  and  * 
SaRa For R:
RS| FIRST(S)  FOLLOW(R)  
8

Predictive Parsing: Variant 2


Non-Recursive (Table Driven)
• Given an LL(1) grammar G=(N,T,P,S)
construct a table M[A,a] for A  N, a  T
and use a driver program with a stack
input a + b $

stack
Predictive parsing
X output
program (driver)
Y
Z Parsing table
$ M
9

Predictive Parsing Program


push($)
(Driver)
push(S)
a := lookahead
repeat
X := pop()
if X is a terminal or X = $ then
match(X) // move to next token, a := lookahead
else if M[X,a] = X  Y1Y2…Yk then
push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top
produce output and/or invoke actions
else error()
endif
until X = $
10

Tracing from Parsing Table: Example 10


E  TE’
E’  + TE’ | 
T  FT’
T’  * FT’ | 
F  ( E ) | id Input String: id + id * id$
Table M

Non- INPUT SYMBOL


terminal
id + * ( ) $
E ETE’ ETE’
E’ E’+TE’ E’ E’
T TFT’ TFT’
T’ T’ T’*FT’ T’ T’
F Fid F(E)
11

Trace of Example 10
STACK INPUT OUTPUT E  TE’
$E id + id * id$ E’  + TE’ | 
$E’T id + id * id$ E TE’
T  FT’
$E’T’F id + id * id$ T FT’ T’  * FT’ | 
$E’T’id id + id * id$ F  id F  ( E ) | id
$E’T’ + id * id$
$E’ + id * id$ T’   Expend Input
$E’T+ + id * id$ E’  +TE’
$E’T id * id$
$E’T’F id * id$ T FT’
$E’T’id id * id$ F  id
$E’T’ * id$
$E’T’F* * id$ T’  *FT’
$E’T’F id$
$E’T’id id$ F  id
$E’T’ $
$E’ $ T’  
$ $ E’  
12

Leftmost Derivation for Example 10


The leftmost derivation for the example is as follows:

E  TE’  FT’E’  id T’E’  id E’  id + TE’  id + FT’E’


 id + id T’E’  id + id * FT’E’  id + id * id T’E’
 id + id * id E’  id + id * id
Example 11: Repeated 13

A
Parsing Table FIRST() FOLLOW(A)
E  T ER ( id $)
ER  + T ER + $)
E  T ER
ER  + T ER |  ER   
T  F TR T  F TR ( id +$)
TR  * F TR |  TR  * F TR * +$)
F  ( E ) | id TR   
F(E) ( *+$)
F  id id

id + * ( ) $
E E  T ER E  T ER
ER ER  + T ER ER   ER  
T T  F TR T  F TR
TR TR   TR  * F TR TR   TR  
F F  id F(E)
14

Example 11: Table-Driven Parsing


Stack Input Production applied
$E id+id*id$
$ERT id+id*id$ E  T ER
$ERTRF id+id*id$ T  F TR
$ERTRid id+id*id$ F  id
$ERTR +id*id$
$ER +id*id$ TR  
$ERT+ +id*id$ ER  + T ER
$ERT id*id$
$ERTRF id*id$ T  F TR
$ERTRid id*id$ F  id
$ERTR *id$
$ERTRF* *id$ TR  * F TR
$ERTRF id$
$ERTRid id$ F  id
$ERTR $
$ER $ TR  
$ $ ER  
Error Recovery
15

When Do Errors Occur? Recall Predictive Parser Function:

a + b $ Input

Stack X Predictive Parsing Output


Program
Y
Z
Parsing Table
$ M[A,a]

1. If X is a terminal and it doesn’t match input.


2. If M[ X, Input ] is empty – No allowable actions
Consider two recovery techniques:
A. Panic Mode
B. Phrase-level Recovery
16

Panic Mode Recovery

General Approach: Modify the empty cells of the Parsing Table.


1. if M[A,a] = {empty} and a belongs to Follow(A) then we set
M[A,a] = “synch”
Error-recovery Strategy :
If A=top-of-the-stack and a=current-input,
1. If A is NT and M[A,a] = {empty} then skip a from the input.
2. If A is NT and M[A,a] = {synch} then pop A.
3. If A is a terminal and A!=a then pop token (essentially inserting
it).
1. Panic Mode Recovery 17

Add synchronizing actions to


undefined entries based on FOLLOW

First(E,F,T) = { (, id } FOLLOW(E) = { $ ) }
FOLLOW(ER) = { $ ) }
First(ER) = { +,  } FOLLOW(T) = { + $ ) }
First(TR) = { *,  } FOLLOW(TR) = { + $ ) }
FOLLOW(F) = { * + $ ) }

id + * ( ) $
E E  T ER E  T ER synch synch
ER ER  + T ER ER   ER  
T T  F TR synch T  F TR synch synch
TR TR   TR  * F TR TR   TR  
F F  id synch synch F(E) synch synch
synch: pop A and skip input till synch token
or skip until FIRST(A) found
18

Example 12: Panic Mode Recovery


Stack Input Production applied
E  T ER $E )id*+id$ Error, skip )
ER  + T ER |  $E id*+id$ id is in First of (E)
$ERT id*+id$ E  T ER
T  F TR $ERTRF id*+id$ T  F TR
TR  * F TR |  $ERTRid id*+id$ F  id
F  ( E ) | id $ERTR *+id$
$ERTRF* *+id$ TR  * F TR
$ERTRF +id$ Error, M[F, +]= synch
$ERTR +id$ F is popped
$ER +id$ TR  
$ERT+ +id$ ER  + T ER
$ ERT id$
$ERTRF id$ T  F TR
$ERTRid id$ F  id
$ERTR $
$ER $ TR  
$ $ ER  
19

2. Phrase-Level Recovery
Change input stream by inserting missing *
For example: id id is changed into id * id

id + * ( ) $
E E  T ER E  T ER synch synch
ER ER  + T ER ER   ER  
T T  F TR synch T  F TR synch synch
TR insert * TR   TR  * F TR TR   TR  
F F  id synch synch F(E) synch synch

insert *: insert missing * and redo the production


20

Example 13: Phrase-Level Recovery


Stack Input Production applied
E  T ER $E id id$
ER  + T ER |  $ERT id id$ E  T ER
$ERTRF id id$ T  F TR
T  F TR $ERTRid id id$ F  id
TR  * F TR |  $ERTR id$
F  ( E ) | id $ERTR *id$ Error, insert * in input
$ERTRF * *id$ TR  * F TR
$ERTRF id$
First(E,F,T) = { (, id } $ERTRid id$ F  id
First(ER) = { +,  } $ ERTR $
First(TR) = { *,  } $ER $ TR  
$ $ ER  
21

Adding Error Productions


E  T ER
Add error production:
ER  + T ER | 
TR  F TR
T  F TR
to ignore missing *, e.g.: id id
TR  * F TR | 
F  ( E ) | id

id + * ( ) $
E E  T ER E  T ER synch synch
ER ER  + T ER ER   ER  
T T  F TR synch T  F TR synch synch
TR TR  F T R TR   TR  * F TR TR   TR  
F F  id synch synch F(E) synch synch
22

Example 14: Adding Error Productions


Stack Input Production applied
E  T ER $E id id$
ER  + T ER |  $ERT id id$ E  T ER
$ERTRF id id$ T  F TR
T  F TR $ERTRid id id$ F  id
TR  * F TR |  $ERTR id$
F  ( E ) | id $ERTR id$ Add Production
$ERTRF id$ TR  F TR
$ERTRid id$ F  id
First(E,F,T) = { (, id } $ ERTR $
First(ER) = { +,  } $ER $ TR  
First(TR) = { *,  } $ $ ER  
23

Revised Parsing Table: Panic Mode?

Non- INPUT SYMBOL


terminal
id + * ( ) $
E ETE’ ETE’
E’ E’+TE’ E’ E’
T TFT’ TFT’
T’ T’ T’*FT’ T’ T’
F Fid F(E)

From Follow sets. Pop top of stack Skip input symbol: emp.
NT “synch” action
24

Example 15: Revised Parsing Table


STACK INPUT Remark
$E + id * + id$ error, skip + E  TE’
$E id * + id$ E’  + TE’ | 
$E’T id * + id$ T  FT’
$E’T’F id * + id$ T’  * FT’ | 
$E’T’id id * + id$ F  ( E ) | id
$E’T’ * + id$
$E’T’F* * + id$
$E’T’F + id$ error, M[F,+] = synch
$E’T’ + id$ F has been popped
$E’ + id$
$E’T+ + id$
$E’T id$
$E’T’F id$ Follow(E,E’) = { ), $}
$E’T’id id$
$E’T’ $ Follow(F) = { *, +, ), $ }
$E’ $ Follow(T,T’) = { +, ) , $}
$ $
25
How Would You Implement TD Parser
• Stack – Easy to handle.
• Input Stream – Responsibility of lexical analyzer
• Key Issue – How is parsing table implemented ?
One approach: Assign unique IDS

Non- INPUT SYMBOL


terminal
id + * ( ) $
E ETE’ ETE’ synch synch
E’ E’+TE’ E’ E’
T TFT’ synch TFT’ synch synch
T’ T’ T’*FT’ T’ T’
F Fid synch synch F(E) synch synch

All rules have Also for blanks


synch actions
unique IDs which handle
errors
26

Revised Parsing Table:


Non- INPUT SYMBOL
terminal
id + * ( ) $
E 1 18 19 1 9 10
E’ 20 2 21 22 3 3
T 4 11 23 4 12 13
T’ 24 6 5 25 6 6
F 8 14 15 7 16 17

1 ETE’
2 E’+TE’ 9 – 17 : 18 – 25 :
3 E’ Sync Error
4 TFT’ Actions Handlers
5 T’*FT’
6 T’
7 F(E)
8 Fid

You might also like