
CS327 - Compilers

Bottom-up Parsing

Abhishek Bichhawat 14/02/2024


Bottom-up Parsing
● More general than top-down parsing
○ Just as efficient as top-down parsing
○ Does not need a specially rewritten grammar
○ Can work with this grammar directly:
E → T + E | T
T → int * T | int | (E)
● Reduces a string to the start symbol by inverting productions
● Generates a rightmost derivation in reverse
Bottom-up Parsing
E → T + E | T
T → int * T | int | (E)

Each line is reduced to the next by inverting the production shown on its right:

Str: int * int + int      T → int
     int * T + int        T → int * T
     T + int              T → int
     T + T                E → T
     T + E                E → T + E
     E
Bottom-up Parsing
E → T + E | T
T → int * T | int | (E)

Str: int * int + int

Parse tree, built from the leaves up by the reductions:

          E
        / | \
       T  +  E
     / | \   |
   int *  T  T
          |  |
         int int
How do reductions happen?
● Split the string into left and right substrings
○ The right substring contains only terminals (unexamined input)
○ The left substring contains a mix of non-terminals and terminals
○ Initially, the string is all terminals; replace some substring with a non-terminal and then proceed
● We will use a marker (|) to mark the split
● Initially, we have |a1a2..an
Shift-reduce Parsing
● Shift moves the marker one position to the right
○ |a1a2..an → a1|a2..an
○ Shifts a terminal into the left substring
● Reduce applies a production in reverse at the right end of the left substring
○ Suppose A → a1a2; then a1a2|a3..an → A|a3..an
Shift-reduce Parsing
E → T + E | T
T → int * T | int | (E)

String                  Stack          Action
| int * int + int       $              shift
int | * int + int       $ int          shift
int * | int + int       $ int *        shift
int * int | + int       $ int * int    reduce T → int
int * T | + int         $ int * T      reduce T → int * T
T | + int               $ T            shift
T + | int               $ T +          shift
T + int |               $ T + int      reduce T → int
T + T |                 $ T + T        reduce E → T
T + E |                 $ T + E        reduce E → T + E
E |                     $ E
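The moves in this trace can be replayed mechanically. The following is a minimal Python sketch, not a parser: the sequence of shift and reduce moves is copied from the example rather than computed, and the names `run`, `script`, `T_INT`, `T_MUL`, `E_T` and `E_PLUS` are made up for illustration.

```python
def run(tokens, script):
    """Apply a scripted list of shift/reduce moves; return one snapshot per step."""
    stack, rest = [], list(tokens)
    snapshots = [" ".join(stack + ["|"] + rest)]
    for move in script:
        if move == "shift":
            stack.append(rest.pop(0))            # move the marker one token right
        else:
            lhs, rhs = move                      # reduce: the RHS must be on top
            assert stack[len(stack) - len(rhs):] == list(rhs), "RHS not on top"
            del stack[len(stack) - len(rhs):]    # pop the RHS ...
            stack.append(lhs)                    # ... and push the LHS
        snapshots.append(" ".join(stack + ["|"] + rest))
    return snapshots

# Productions of the example grammar as (LHS, RHS) pairs.
T_INT = ("T", ("int",))
T_MUL = ("T", ("int", "*", "T"))
E_T = ("E", ("T",))
E_PLUS = ("E", ("T", "+", "E"))

# Move sequence taken from the slide, not derived by the code.
script = ["shift", "shift", "shift", T_INT, T_MUL,
          "shift", "shift", T_INT, E_T, E_PLUS]
trace = run(["int", "*", "int", "+", "int"], script)
for line in trace:
    print(line)
```

Every reduce only touches the top of the stack, which is why a stack is enough to implement the marker.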
Handles
● A handle is a substring that matches the right-hand side of a production and whose reduction is a step towards the start symbol
○ It can be reduced to the LHS non-terminal
○ If int * int | + int reduces to int * T | + int, then int is said to be the handle
● If a grammar is unambiguous, then every right-sentential form has exactly one handle
● Use a stack to implement shift-reduce parsing, with the handle on the top of the stack
Conflicts
● At some steps, the parser may be able to either shift or reduce
○ Known as a shift-reduce conflict
○ Expected and can be removed
● At some steps, the parser may be able to reduce by two different productions
○ Known as a reduce-reduce conflict
○ Indicates a problem with the grammar that needs to be resolved
Conflicts
E → T + E | T
T → int * T | int | (E)

Str: int | * int + int

● At this point, how do we know whether to shift or reduce?
● A reduction by T → int can be performed, but then T | * int + int can no longer be reduced to E, so the parser should shift
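The choice at this point can be made concrete by listing every move that is possible in form, without judging which one is right. This is a small Python sketch; `GRAMMAR` and `possible_moves` are illustrative names, not part of the lecture.

```python
GRAMMAR = [
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("int", "*", "T")),
    ("T", ("int",)),
    ("T", ("(", "E", ")")),
]

def possible_moves(stack, rest):
    """List the moves that are possible in form; some may lead to a dead end."""
    moves = []
    if rest:
        moves.append("shift " + rest[0])
    for lhs, rhs in GRAMMAR:
        # A reduce is possible whenever some RHS sits on top of the stack.
        if len(rhs) <= len(stack) and tuple(stack[-len(rhs):]) == rhs:
            moves.append("reduce " + lhs + " -> " + " ".join(rhs))
    return moves

moves = possible_moves(["int"], ["*", "int", "+", "int"])
print(moves)
```

Both a shift and a reduce show up for `int | * int + int`; deciding between them is the job of the LR machinery.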
LR Parsing
● LR(k) parsers are bottom-up parsers
○ L is for scanning the input left to right
○ R is for constructing a rightmost derivation in reverse
○ k is the number of lookahead symbols
● LR parsers make shift-reduce decisions by maintaining states that keep track of where we are in the parse
○ These states are sets of “items”
○ An item of grammar G is a production of G with a . (dot) somewhere in the RHS
For A → BC: A → .BC, A → B.C, A → BC. are all items of the grammar
○ An item indicates how much of a production we have seen at a given point in parsing
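As a small illustration of the definition, the items of one production can be enumerated by placing the dot at every position of the RHS. In this Python sketch, the tuple representation `(lhs, rhs, dot)` and the helper names `items_of` and `show` are assumptions made for the example.

```python
def items_of(lhs, rhs):
    """All items of one production: the dot at every position in the RHS."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]

def show(item):
    """Render an item with a visible dot, e.g. 'A -> B . C'."""
    lhs, rhs, dot = item
    return lhs + " -> " + " ".join(rhs[:dot] + (".",) + rhs[dot:])

items = items_of("A", ["B", "C"])
for it in items:
    print(show(it))
```

A production with n symbols on the right always has n + 1 items.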
Recognizing Handles - Viable Prefixes
● Handles are the substrings that we want to reduce
○ Handles always appear on the top of the stack; no backtracking
○ Need to recognize handles to correctly shift and reduce
○ No known efficient algorithm recognizes handles for every grammar
● Use heuristics to guess that a substring on the stack is a handle
● The stack contents are a prefix of a right-sentential form (a viable prefix), i.e., if α is on the stack and β is the rest of the input, then we should be able to reduce αβ to the start symbol
● LR parsing is based on the fact that the items can be used to construct an FA, known as the LR(0) automaton, which accepts viable prefixes
Viable Prefixes - Example
E→T+E|T
T → int * T | int | (E)

Given the input (int)

● (E|) is a state of the shift-reduce parse
● (E is a viable prefix, since it is a prefix of the RHS of T → (E)
● The item T → (E.) says that so far we have matched (E of the input and we hope to see ) next
Viable Prefixes - Example
Given the input string (int * int):
(int * | int) is a state in the parse, where

Prefix on the stack                     Item
( is a prefix of T → (E)                T → (.E)
ε is a prefix of E → T                  E → .T
int * is a prefix of T → int * T        T → int * .T
LR(0) Automaton
● The LR(0) automaton takes the stack as input and reports whether or not the symbols on the stack form a viable prefix
● Each state of the automaton contains a set of items
● Start by adding a dummy production S’ → S to the grammar
○ Tells the parser that when we reduce using S’ → S, it should accept the input
● If I (a state in the FA) is a set of items, then compute CLOSURE(I):
○ Every item in I is in CLOSURE(I)
○ If A → α.Bβ is in CLOSURE(I) and B → γ is a production, then add B → .γ to CLOSURE(I); repeat until nothing new can be added
● Compute GOTO(I, X) for a grammar symbol X:
○ The closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I
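CLOSURE and GOTO as defined above can be sketched in a few lines of Python. This is one possible encoding, not the lecture's implementation: items are `(lhs, rhs, dot)` tuples, the dictionary `GRAMMAR` holds the augmented expression grammar, and all names are chosen for illustration.

```python
# Augmented expression grammar; keys are the non-terminals.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

def closure(items):
    """Add B -> .gamma for every item whose dot is in front of a non-terminal B."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for gamma in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], gamma, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)

start = closure({("E'", ("E",), 0)})   # start state of the automaton
after_T = goto(start, "T")             # state reached from the start on T
print(len(start), sorted(after_T))
```

The start state has six items, and `goto(start, "T")` yields exactly the two items E → T. + E and E → T.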
LR(0) Automaton
E → T + E | T
T → int * T | int | (E)

States of the automaton (numbered as in the parse tables) with their item sets and outgoing transitions:

State 1 (start)
  E’ → .E
  E → .T + E
  E → .T
  T → .int * T
  T → .int
  T → .(E)
  Transitions: E to 2, T to 3, int to 4, ( to 11

State 2
  E’ → E.

State 3
  E → T. + E
  E → T.
  Transitions: + to 5

State 4
  T → int. * T
  T → int.
  Transitions: * to 6

State 5
  E → T + .E
  E → .T + E
  E → .T
  T → .int * T
  T → .int
  T → .(E)
  Transitions: E to 7, T to 3, int to 4, ( to 11

State 6
  T → int * .T
  T → .int * T
  T → .int
  T → .(E)
  Transitions: T to 8, int to 4, ( to 11

State 7
  E → T + E.

State 8
  T → int * T.

State 9
  T → (E).

State 10
  T → (E.)
  Transitions: ) to 9

State 11
  T → (.E)
  E → .T + E
  E → .T
  T → .int * T
  T → .int
  T → .(E)
  Transitions: E to 10, T to 3, int to 4, ( to 11
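The whole automaton can be generated by repeatedly applying GOTO to the states found so far. The Python sketch below is an illustrative construction with made-up names (`build_automaton`, `is_viable_prefix`). States are numbered from 0 in discovery order, so the numbers differ from those used on the slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}
SYMBOLS = ["E", "T", "int", "+", "*", "(", ")"]

def closure(items):
    items = set(items)
    changed = True
    while changed:                      # fixed point: stop when nothing is added
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in GRAMMAR:
                for gamma in GRAMMAR[rhs[dot]]:
                    if (rhs[dot], gamma, 0) not in items:
                        items.add((rhs[dot], gamma, 0))
                        changed = True
    return frozenset(items)

def goto(items, x):
    return closure({(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == x})

def build_automaton():
    """Return (states, edges); edges maps (state index, symbol) to a state index."""
    start = closure({("E'", ("E",), 0)})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        for x in SYMBOLS:
            target = goto(state, x)
            if not target:
                continue                # no item has the dot in front of x
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), x)] = states.index(target)
    return states, edges

def is_viable_prefix(symbols, edges):
    """Run the automaton over the stack symbols; every state is accepting."""
    state = 0
    for x in symbols:
        if (state, x) not in edges:
            return False
        state = edges[(state, x)]
    return True

states, edges = build_automaton()
print(len(states))
print(is_viable_prefix(["(", "E"], edges), is_viable_prefix(["(", ")"], edges))
```

For this grammar the construction finds 11 states, and `( E` is accepted as a viable prefix while `( )` is not.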
LR Parsing Algorithm
● Use a stack of states and symbols
○ All states in the FA are numbered
● Two parse tables
○ Goto table (state x symbol → state)
■ GOTO[i, X] = j, if the FA has a transition from state i to state j on symbol X
○ Action table (state x terminal → {shift, reduce, accept, reject})
■ State i has S’ → S., then ACTION[i, $] = accept
■ State i has A → α.tβ and GOTO[i, t] = j, then ACTION[i, t] = shift j
■ State i has A → α. (A ≠ S’) and t ∈ FOLLOW(A), then ACTION[i, t] = reduce A → α
■ None of the above ⇒ ACTION[i, t] = reject
LR Parsing Algorithm
INPUT: An input string w (followed by the end marker $) and an LR parsing table
OUTPUT: If w is accepted, the reduction steps; else error

a ← first symbol of w
while (true) {
    s ← state on the top of the stack   /* initially s0, the start state */
    if (ACTION[s, a] = shift t) {
        push t onto the stack
        a ← next symbol of w
    } else if (ACTION[s, a] = reduce A → β) {
        pop |β| elements off the stack
        t ← state on top of the stack
        push GOTO[t, A] onto the stack
        output production A → β
    } else if (ACTION[s, a] = accept) { break }   /* parsing is done */
    else { reject with error }
}
LR(0) Parsing
● Shift if the item-set contains an item A → α.tβ with a terminal t after the dot, i.e., there is a transition on t from A → α.tβ to A → αt.β
● Reduce by A → α, if the item-set contains the item A → α.
● Example
String : (id)
Grammar : S → E | (E)
E → id
LR(0) Parsing
● Reduce by A → α, if the item-set contains the item A → α.
● Shift if the item-set contains an item A → α.tβ with a terminal t after the dot, i.e., there is a transition on t from A → α.tβ to A → αt.β
● Conflicts
○ Shift-reduce
■ If any state in the DFA has both a shift item and a reduce item, e.g., A → α. and A’ → α’.tβ
○ Reduce-reduce
■ If any state in the DFA has two reduce items, e.g., A → α. and A’ → α’.
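These two conditions are easy to check per state. In the Python sketch below, the item sets are written out by hand for three states of the expression grammar's automaton; `lr0_conflicts` and the variable names are illustrative only.

```python
NONTERMINALS = {"E", "T"}

def lr0_conflicts(state):
    """Classify the LR(0) conflicts in one item set of (lhs, rhs, dot) items."""
    reduces = [(lhs, rhs) for lhs, rhs, dot in state if dot == len(rhs)]
    shifts = [rhs[dot] for lhs, rhs, dot in state
              if dot < len(rhs) and rhs[dot] not in NONTERMINALS]
    found = []
    if reduces and shifts:
        found.append("shift-reduce")    # a complete item next to a dot-before-terminal item
    if len(reduces) > 1:
        found.append("reduce-reduce")   # two different complete items
    return found

# Item sets copied by hand from the expression grammar's automaton.
state_after_T = [("E", ("T", "+", "E"), 1), ("E", ("T",), 1)]
state_after_int = [("T", ("int", "*", "T"), 1), ("T", ("int",), 1)]
state_after_paren = [("T", ("(", "E", ")"), 3)]

for state in (state_after_T, state_after_int, state_after_paren):
    print(lr0_conflicts(state))
```

The first two states each report a shift-reduce conflict, so the expression grammar is not LR(0); the third state has none.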
Shift-reduce Conflict
E → T + E | T
T → int * T | int | (E)

[Figure: the LR(0) automaton for this grammar, repeated from the LR(0) Automaton slides]

● The state {E → T. + E, E → T.} contains both a shift item and a reduce item
● The state {T → int. * T, T → int.} contains both a shift item and a reduce item
● Both states have a shift-reduce conflict under LR(0) parsing
Simple LR Parsing
● Reduce by A → α, if the item-set contains the item A → α. and
the next input symbol is in FOLLOW(A)
● Shift if the item-set contains an item A → α.tβ and the next input symbol is the terminal t, i.e., there is a transition on t from A → α.tβ to A → αt.β
● There may still be conflicts because not all grammars are SLR
● Example: int * int + int
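The SLR rule depends on FOLLOW sets. The Python sketch below computes them for the expression grammar by fixed-point iteration. It assumes a grammar without ε-productions (true here), which keeps FIRST simple; the function names are illustrative.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("int", "*", "T"), ("int",), ("(", "E", ")")],
}

def first_sets():
    """FIRST for each non-terminal; valid only because no production is empty."""
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for a, bodies in GRAMMAR.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in GRAMMAR else {x}
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first

def follow_sets():
    first = first_sets()
    follow = {a: set() for a in GRAMMAR}
    follow["E'"].add("$")                     # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for a, bodies in GRAMMAR.items():
            for body in bodies:
                for i, x in enumerate(body):
                    if x not in GRAMMAR:
                        continue              # FOLLOW is only defined for non-terminals
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:
                        new = follow[a]       # x ends the body: inherit FOLLOW(a)
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

follow = follow_sets()
print(sorted(follow["E"]), sorted(follow["T"]))
```

Since + is not in FOLLOW(E) and * is not in FOLLOW(T), the states {E → T. + E, E → T.} and {T → int. * T, T → int.} no longer have a conflict under the SLR rule.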
SLR Parsing
E → T + E | T
T → int * T | int | (E)

[Figure: the LR(0) automaton for this grammar, repeated]
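GOTO Table ("-" marks an empty entry)

State  int  +   *   (   )   $   E   T
1      4    -   -   11  -   -   2   3
3      -    5   -   -   -   -   -   -
4      -    -   6   -   -   -   -   -
5      4    -   -   11  -   -   7   3
6      4    -   -   11  -   -   -   8
10     -    -   -   -   9   -   -   -
11     4    -   -   11  -   -   10  3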
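ACTION Table (sN = shift and go to state N, rN = reduce by production N, A = accept, "-" = empty entry)

Productions:
1. E’ → E
2. E → T + E
3. E → T
4. T → int * T
5. T → int
6. T → (E)

State  int  +   *   (    )   $
1      s4   -   -   s11  -   -
2      -    -   -   -    -   A
3      -    s5  -   -    r3  r3
4      -    r5  s6  -    r5  r5
5      s4   -   -   s11  -   -
6      s4   -   -   s11  -   -
7      -    -   -   -    r2  r2
8      -    r4  -   -    r4  r4
9      -    r6  -   -    r6  r6
10     -    -   -   -    s9  -
11     s4   -   -   s11  -   -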
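Parse Table (ACTION columns for terminals, GOTO columns for E and T; "-" = empty entry)

Productions:
1. E’ → E
2. E → T + E
3. E → T
4. T → int * T
5. T → int
6. T → (E)

State  int  +   *   (    )   $   E   T
1      s4   -   -   s11  -   -   2   3
2      -    -   -   -    -   A   -   -
3      -    s5  -   -    r3  r3  -   -
4      -    r5  s6  -    r5  r5  -   -
5      s4   -   -   s11  -   -   7   3
6      s4   -   -   s11  -   -   -   8
7      -    -   -   -    r2  r2  -   -
8      -    r4  -   -    r4  r4  -   -
9      -    r6  -   -    r6  r6  -   -
10     -    -   -   -    s9  -   -   -
11     s4   -   -   s11  -   -   10  3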
SLR Parsing with Parse Tables
Input                   Stack (states)   State   Action
| int * int + int $     1                1       Shift 4
int | * int + int $     1 4              4       Shift 6
int * | int + int $     1 4 6            6       Shift 4
int * int | + int $     1 4 6 4          4       Reduce5 (T → int), Goto[6,T] = 8
int * T | + int $       1 4 6 8          8       Reduce4 (T → int * T), Goto[1,T] = 3
T | + int $             1 3              3       Shift 5
T + | int $             1 3 5            5       Shift 4
T + int | $             1 3 5 4          4       Reduce5 (T → int), Goto[5,T] = 3
T + T | $               1 3 5 3          3       Reduce3 (E → T), Goto[5,E] = 7
T + E | $               1 3 5 7          7       Reduce2 (E → T + E), Goto[1,E] = 2
E | $                   1 2              2       Accept
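The trace above can be reproduced with a table-driven driver. The Python sketch below hard-codes the ACTION and GOTO entries of the parse table for the expression grammar and follows the LR parsing algorithm. The dictionary layout, the `"acc"` marker and the function name `parse` are choices made for this example; the stack holds states only.

```python
# Production number -> (LHS, length of RHS), numbered as in the parse table.
PRODUCTIONS = {2: ("E", 3), 3: ("E", 1), 4: ("T", 3), 5: ("T", 1), 6: ("T", 3)}

ACTION = {
    1: {"int": "s4", "(": "s11"},
    2: {"$": "acc"},
    3: {"+": "s5", ")": "r3", "$": "r3"},
    4: {"+": "r5", "*": "s6", ")": "r5", "$": "r5"},
    5: {"int": "s4", "(": "s11"},
    6: {"int": "s4", "(": "s11"},
    7: {")": "r2", "$": "r2"},
    8: {"+": "r4", ")": "r4", "$": "r4"},
    9: {"+": "r6", ")": "r6", "$": "r6"},
    10: {")": "s9"},
    11: {"int": "s4", "(": "s11"},
}
GOTO = {1: {"E": 2, "T": 3}, 5: {"E": 7, "T": 3}, 6: {"T": 8}, 11: {"E": 10, "T": 3}}

def parse(tokens):
    """Return the production numbers in reduction order, or raise SyntaxError."""
    stack = [1]                               # start state
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                       # empty table entry: reject
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return output
        if act[0] == "s":                     # shift: push the state, advance input
            stack.append(int(act[1:]))
            pos += 1
        else:                                 # reduce: pop |RHS| states, then GOTO
            number = int(act[1:])
            lhs, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][lhs])
            output.append(number)

reductions = parse(["int", "*", "int", "+", "int"])
print(reductions)
```

The reductions come out in the order 5, 4, 5, 3, 2, matching the Reduce steps of the trace; read backwards, they give a rightmost derivation of the input.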
