Download as pdf or txt
Download as pdf or txt
You are on page 1of 28

Compilers

by
Marwa Yusuf

Lecture 9
Mon. 13-3-2023

Chapter 4 (4.6.1 to 4.6.4)

Syntax Analysis
Intro to LR Parsing

LR(k) parsing: L for scanning left to right, R for
rightmost derivation in reverse, K look-ahead
symbols.

We will only consider K<= 1. LR by default is
LR(1).

Mar 8, 2023 2
LR Grammar

LR parser is table driven.

LR Grammar: a grammar for which you can
construct a parsing table (as will be shown).

For a grammar to be LR, sufficient that a left-to
right shift reduce can recognize handles of right-
sentential forms when they appear on top of the
stack.

Mar 8, 2023 3
Why LR Parser?

For reading.

Mar 8, 2023 4
Items & LR Automaton

Shift-reduce decisions: states to keep track of parsing position.

States represent sets of items.

An (LR(0)) item of a grammar G: a production with a dot at some position in
body.

A → XYZ yields:
– A → ·XYZ (hope to see string derivable from XYZ on the input).
– A → X·YZ (saw a string derivable from X and hope to see a string derivable from YZ).
– A → XY·Z
– A → XYZ· (saw a string derivable from XYZ and may be the time to reduce XYZ to A).

A → ε yields: A → ·

Canonical LR(0): one collection of sets of LR(0) items.
– Provides the basis for constructing a DFA (LR(0) automaton), used for parsing
decisions.
– Each state represents a set of items.

Mar 23, 2022 5


Example Automaton

Mar 23, 2022 6


Constructing canonical LR(0) collection C

Augmented grammar G’ for grammar G = G with new start
symbol S’ and production S’→ S.

Acceptance occurs when and only when about to reduce by S’
→ S.

To construct canonical LR(0) collection we need augmented
grammar and CLOSURE and GOTO functions.

If I is a set of items, CLOSURE(I):
– Add every item in I to CLOSURE(I).
– If A → α·Bβ is in CLOSURE(I), then add each B→·γ until no more
items can be added.

Mar 23, 2022 7


Example

E’ → E
E→E+T|T
T→T*F|F
F → ( E ) | id
● If I is the set {[E’ → ·E]} then CLOSURE(I) is I0 in the prev. figure.

It may be sufficient to list non-terminals, not productions.
1) Kernel items: S’ → ·S and all items with no dots on the left.
2) Nonkernel items: the rest.

Nonkernel are shaded in figure.

Mar 23, 2022 8


CLOSURE computation algorithm
SetOfltems CLOSURE(I) {
J = I;
repeat
for ( each item A → α·Bβ in J )
for ( each production B → γ of G )
if ( B → ·γ is not in J )
add B → ·γ to J;
until no more items are added to J on one round;
return J;
}

Mar 23, 2022 9


GOTO function

The transition from the state for I under input X.

GOTO(I, X): the closure of the set of all items [A → αX·β] such that [A → α·Xβ]
is in I.

If I := {[E’ → E·], [E → E· + T]}
then GOTO(I, +) contains:
E → E +· T
T → ·T * F
T → ·F
F → ·(E)
F → ·id

Find items with + immediately to the right of the dot.

Mar 23, 2022 10


Algorithm to construct C
void items(G'){
C = CLOSURE({[S' -> ·S]});
repeat
for ( each set of items I in C )
for ( each grammar symbol X )
if ( GOTO(I, X) is not empty and not in C )
add GOTO(I, X) to C;
until no new sets of items are added to C on a
round;
}

Mar 23, 2022 11


Example

Mar 23, 2022 12


Use of LR(0) Automaton

SLR (Simple LR) parsing: construction from the
grammar of the LR(0) automaton.

States = sets of items, transitions = GOTO function.

The start state is the CLOSURE({[S’ → ·S]}).
● Statej = set of items Ij.

Shift-reduce parsing decisions: at some state, shift
on next input symbol if there is a transition for that
symbol, otherwise reduce.

Mar 23, 2022 13


Example
LI STACK SYMB INPUT ACTION
N OLS
E

(1) 0 $ id * id $ shift to 5

(2) 0 5 $ id * id $ reduce by F→ id

(3) 0 3 $F * id $ reduce by T→ F

(4) 0 2 $T * id $ shift to 7
(5) 0 2 7 $T* id $ shift to 5

(6) 0 2 7 5 $ T * id $ reduce by F → id

(7) 0 2 7 10 $ T * F $ reduce by T → T * F

(8) 0 2 $T $ reduce by E → T
Mar 23, 2022 14
(9) 0 1 $E $ accept
The LR-Parsing Algorithm

Driver: the same for all LR parsers, parsing table
changes.

Shift state (states from LR(0) automaton): each one
represents info on the stack below, each has a
corresponding grammar symbol.

Mar 23, 2022 15


Structure of the LR Parsing Table

ACTION function: ACTION[i, a] =
– Shift j: shift a but use state as a representative.
– Reduce A → β: β is on top of stack.
– Accept.
– Error.
● GOTO function: GOTO[Ii, A] = Ij.

Mar 23, 2022 16


LR-Parser Configurations

A configuration (stack states, remaining input:
(s0s1 … sm, aiai+1 … an$)
● Recall that states (except s0) represent symbols.
Hence this pair represents a right sentential form:
X1 X2 … Xm ai ai +1 … an

Mar 23, 2022 17


Behavior of the LR Parser
● Given sm on top of stack, ai as next input symbol, if
ACTION[sm, ai] =
– shift s: (s0s1 … sms, ai+1 … an$)
– reduce A→β: (s0s1 … sm-rs, aiai+1 … an$)
● r=length of β, s=GOTO[sm-r, A]

output = semantic action of the production, for now just print it.
– accept: parsing complete.
– error: call error recovery routine.

Mar 23, 2022 18


LR-Parser Algorithm

INPUT : An input string w and an LR-parsing table
with functions ACTION and GOTO for a grammar G.

OUTPUT : If w is in L(G), the reduction steps of a
bottom-up parse for w; otherwise, an error indication.
● METHOD : Initially, the parser has s0 on its stack,
where s0 is the initial state, and w$ in the input buffer.
The parser then executes the following program.

Mar 23, 2022 19


LR-Parser Algorithm
let a be the first symbol of w$;
while (1) { /* repeat forever */
let s be the state on top of the stack;
if ( ACTION[s, a] = shift t ) {
push t onto the stack;
let a be the next input symbol;
} else if ( ACTION [s, a] = reduce A → β ) {
pop |β| symbols off the stack;
let state t now be on top of the stack;
push GOTO[t, A] onto the stack;
output the production A → β;
} else if ( ACTION[s, a] = accept ) break; /* parsing is
done */
else call error-recovery routine;
}
Mar 23, 2022 20
Example
GOTO[0, id] shift and stack 5 reduce by prod. 2


Grammar: STATE ACTION GOTO
id + * ( ) $ E T F
1) E → E + T 0 s5 s4 1 2 3
1 s6 acc
2) E → T
2 r2 s7 r2 r2
3) T → T * F 3 r4 r4 r4 r4
4 s5 s4 8 2 3
4) T → F 5 r6 r6 r6 r6
5) F → (E) 6 s5 s4 9 3
7 s5 s4 10
6) F → id 8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5

Mar 23, 2022 error accept 21


1) E→E+T
2) E→T

Example 3)
4)
5)
T→T*F
T→F
F → (E)
6) F → id
STACK SYMs INPUT ACTION
S ACTION GOTO
1 0 id * id + id $ s5
id + * ( ) $ E T F
2 05 id * id + id $ r F→ id
0 s5 s4 1 2 3
3 03 F * id + id $ r T→ F
1 s6 acc
4 02 T * id + id $ s7
2 r2 s7 r2 r2
5 027 T* id + id $ s5
3 r4 r4 r4 r4
6 0275 T * id + id $ r F → id
4 s5 s4 8 2 3
7 0 2 7 10 T*F + id $ r T → T * F
5 r6 r6 r6 r6
8 02 T + id $ r E → T
6 s5 s4 9 3
9 01 E + id $ s6
7 s5 s4 10
10 0 1 6 E+ id $ s5
8 s6 s11
11 0 1 6 5 E + id $ r F → id
9 r1 s7 r1 r1
12 0 1 6 3 E+F $ rT→F
10 r3 r3 r3 r3
13 0 1 6 9 E+T $ rE→E+T
11 r5 r5 r5 r5
14 0 1 E $ acc
Mar 23, 2022 22
Constructing SLR Parsing Tables

INPUT : An augmented grammar G'.

OUTPUT : The SLR-parsing table functions ACTION and GOTO for G'.

METHOD:
1) Construct C = {Io,I1, ... ,In}, the collection of sets of LR(0) items for G'.
2) State i is constructed from Ii. The parsing actions for state I are determined as follows:
a) If [A → α·aβ] is in Ii and GOTO(Ii, a) = Ij, then set ACTlON[i, a] to "shift j." Here a must be a
terminal.
b) If [A → α·] is in Ii, then set ACTION[i,a] to "reduce A → α" for all a in FOLLOW(A); here A may
not be S'.
c) If [S' → S·] is in Ii, then set ACTION[i, $] to "accept."
If any conflicting actions result from the above rules, we say the grammar is not SLR
(l) . The algorithm fails to produce a parser in this case.
3) The goto transitions for state i are constructed for all non-terminals A using the rule: If
GOTO(Ii, A) = Ij, then GOTO[i, A] = j.
4) All entries not defined by rules (2) and (3) are made "error."
5) The initial state of the parser is the one constructed from the set of items containing [S'
→ ·S].
Mar 23, 2022 23
Constructing SLR Parsing Tables

This algorithm produces SLR(1) table for G.

An LR parser using SLR(1) table for G is SLR(1)
parser for G.

A grammar having SLR(1) table is SLR(1).

Usually (1) is omitted.

Mar 23, 2022 24



FOLLOW(E) = { + , ) , $ }

FOLLOW(T) = { + , * , ) , $ }
Example ●
FOLLOW(F) = { + , * , ) , $ }

STAT ACTION GOTO


E
id + * ( ) $ E T F

0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
Mar 23, 2022 11 r5 r5 r5 r5 25
Note

Every SLR(1) grammar is unambiguous, but not
every unambiguous grammar is SLR(1).

Mar 23, 2022 26


Skipped

From 4.6.5 till the end of chapter.

Mar 23, 2022 27


Thanks!

Mar 8, 2023 28

You might also like