
4/27/2019

Syntax Directed Translation

Outline
Syntax Directed Definitions
Evaluation Orders of SDD’s
Applications of Syntax Directed Translation
Syntax Directed Translation Schemes


Semantic Analysis computes additional information related to the
meaning of the program once the syntactic structure is known.
In typed languages such as C, semantic analysis involves
adding information to the symbol table and
performing type checking.
The information to be computed is beyond the
capabilities of standard parsing techniques, therefore
it is not regarded as syntax.
As with Lexical and Syntax Analysis, Semantic
Analysis needs both a Representation Formalism
and an Implementation Mechanism.

Syntax Directed Translation: Intro


The Principle of Syntax Directed Translation states
that the meaning of an input sentence is related to its
syntactic structure, i.e., to its Parse-Tree.
By Syntax Directed Translations we indicate those
formalisms for specifying translations for
programming language constructs guided by context-
free grammars.
We associate Attributes with the grammar symbols
representing the language constructs.
Values for attributes are computed by Semantic Rules
associated with grammar productions.


Evaluation of Semantic Rules may:
– Generate code
– Print messages
– Insert information into the symbol table
– Perform semantic checks
– Issue error messages

Syntax directed translation schemes

An SDT is a context-free grammar with program fragments
embedded within production bodies.
Those program fragments are called semantic actions.
They can appear at any position within a production body.
Any SDT can be implemented by first building a parse tree
and then performing the actions in left-to-right depth-first
order.
Typically SDT’s are implemented during parsing, without
building a parse tree.


Syntax Directed Definitions


We can associate information with a language construct by attaching attributes
to the grammar symbols.
An attribute is any quantity associated with a programming construct, such as
line numbers, data types, instruction details, table references, memory locations, etc.
Values of attributes are computed by semantic rules associated with
productions.
There are two notations for associating semantic rules with productions:
syntax-directed definitions (SDD) and syntax-directed translation (SDT) schemes.
SDD's are high-level specifications for translation and hide implementation
details.
Syntax directed definition (SDD)
Production          Semantic Rule
E1 → E2 + T         E1.code = E2.code || T.code || '+'
• We may alternatively insert the semantic actions inside the grammar
Syntax directed translation scheme (SDT)
E → E1 + T   {print '+'}      // semantic action
F → id       {print id.val}

• SDD: specifies the values of attributes by associating
semantic rules with the productions.
• SDT scheme: embeds program fragments (also called
semantic actions) within production bodies.
• The position of the action defines the order in which
the action is executed (in the middle of the production or
at the end).
• An SDD is easier to read; easy for specification.
• An SDT scheme can be more efficient; easy for implementation.


Example: SDD vs SDT scheme – infix to postfix trans
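
The contrast can be sketched in Python. This is only an illustration: the parse tree is written by hand as nested tuples (an assumption of this sketch, not something the slides define), and the function names sdd_code and sdt_emit are hypothetical. The first function returns the synthesized attribute code; the second runs the embedded print actions as side effects.

```python
# Hand-built parse trees (assumed encoding for this sketch):
#   ("+", left, right) stands for E -> E1 + T, a plain string stands for F -> id.

def sdd_code(node):
    """SDD style: compute and return the synthesized attribute 'code'."""
    if isinstance(node, str):                      # F -> id
        return node
    op, left, right = node                         # E -> E1 + T
    return sdd_code(left) + sdd_code(right) + op   # E.code = E1.code || T.code || '+'

def sdt_emit(node, out):
    """SDT style: execute the embedded actions in left-to-right depth-first order."""
    if isinstance(node, str):
        out.append(node)                           # F -> id { print id.val }
        return
    op, left, right = node
    sdt_emit(left, out)
    sdt_emit(right, out)
    out.append(op)                                 # E -> E1 + T { print '+' }

tree = ("+", ("+", "a", "b"), "c")                 # parse of a + b + c
emitted = []
sdt_emit(tree, emitted)
print(sdd_code(tree), "".join(emitted))
```

Both styles produce the same postfix string; the SDD version builds it as a value, while the SDT version never stores it and only emits output in the right order.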

Syntax Directed Definitions


Attributes are partitioned into 2 subsets
Synthesized attributes
The value of a synthesized attribute at node N is defined only in terms of
attribute values at the children of N and at N itself
Inherited attributes
The value of an inherited attribute at node N is defined only in terms of
attribute values at N’s parent, N itself and N’s siblings


Form of Syntax Directed Definition


Each production, A → α, is associated with a set of semantic rules:
b := f(c1, c2, . . . , ck), where f is a function and either
1. b is a synthesized attribute of A, and c1, c2, . . . , ck are
attributes of the grammar symbols of the production, OR
2. b is an inherited attribute of a grammar symbol in α, and c1,
c2, . . . , ck are attributes of grammar symbols in α or attributes
of A.
• Note 1. Terminal symbols are assumed to have synthesized
attributes supplied by the lexical analyzer.
• Note 2. Procedure calls (e.g. print in the next slide) define values
of dummy synthesized attributes of the nonterminal on the
left-hand side of the production.

An Example to illustrate working of SDD


SDD for desk calculator program
Production          Semantic Rules
1) L -> E n         print(E.val)
2) E -> E1 + T      E.val = E1.val + T.val
3) E -> T           E.val = T.val
4) T -> T1 * F      T.val = T1.val * F.val
5) T -> F           T.val = F.val
6) F -> (E)         F.val = E.val
7) F -> digit       F.val = digit.lexval

val is an integer-valued synthesized attribute associated with each
nonterminal E, T and F.
The semantic rules for the productions of E, T and F compute the value of
attribute val.
digit has a synthesized attribute lexval.


Synthesized Attributes

An SDD that uses only synthesized attributes is known as an S-attributed definition.
The parse tree for an S-attributed definition can be annotated by evaluating the
semantic rules for the attributes at each node bottom-up, from the leaves to the root.

Annotated parse tree for 3*5 + 4n
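
Since the figure is not reproduced here, the following Python sketch annotates a hand-built parse tree for 3*5 + 4n with the desk calculator rules. The PNode class, the helper functions and the way a production is recognised from its children are all assumptions made for this illustration.

```python
class PNode:
    """Parse-tree node; 'val' is the synthesized attribute filled in by annotate()."""
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval
        self.val = None

def annotate(n):
    """Evaluate val bottom-up: children first, then the node itself."""
    for c in n.children:
        annotate(c)
    kinds = [c.symbol for c in n.children]
    if n.symbol == "digit":
        n.val = n.lexval                                 # supplied by the lexer
    elif kinds == ["E", "+", "T"]:
        n.val = n.children[0].val + n.children[2].val    # E.val = E1.val + T.val
    elif kinds == ["T", "*", "F"]:
        n.val = n.children[0].val * n.children[2].val    # T.val = T1.val * F.val
    elif kinds == ["(", "E", ")"]:
        n.val = n.children[1].val                        # F.val = E.val
    elif kinds == ["E", "n"]:
        n.val = n.children[0].val                        # L -> E n would print E.val
    elif len(n.children) == 1:
        n.val = n.children[0].val                        # E -> T, T -> F, F -> digit
    return n.val

def tok(symbol):
    return PNode(symbol)

def F(d):
    return PNode("F", [PNode("digit", lexval=d)])

term = PNode("T", [PNode("T", [F(3)]), tok("*"), F(5)])            # 3 * 5
expr = PNode("E", [PNode("E", [term]), tok("+"), PNode("T", [F(4)])])
root = PNode("L", [expr, tok("n")])
result = annotate(root)
```

After the call, every nonterminal node carries its val, which is exactly the information an annotated parse tree displays.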

Inherited Attributes
The value at a node in the parse tree is defined in terms of
attributes at the parent and/or siblings of that node.
Convenient for expressing the dependence of a programming
language construct on the context in which it appears
Evaluation order: in general, inherited attributes cannot be
evaluated by a single simple traversal
The order of computation of the inherited attributes of children is
important.
Inherited attributes of children can depend on both left and
right siblings.


Inherited Attributes: An Example


Example. Let us consider the syntax directed definition with both inherited
and synthesized attributes for the grammar for “type declarations”:

The nonterminal T has a synthesized attribute, type, determined by the
tokens int/real in the corresponding production.
• The production D -> T L is associated with the semantic rule L.in := T.type,
which sets the inherited attribute L.in.
• Note: The production L -> L1, id distinguishes the two occurrences of L.
Inherited Attributes: An Example (Cont.)


Synthesized attributes can be evaluated by a PostOrder traversal.
• Inherited attributes that do not depend on right children can be
evaluated by a PreOrder traversal.
• The annotated parse-tree for the input real id1, id2, id3 is:

L.in is then inherited top-down through the tree by the other L-nodes.
• At each L-node the procedure addtype inserts into the symbol
table the type of the identifier.
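
A minimal Python sketch of this top-down flow follows. It assumes the usual declaration grammar D -> T L, T -> int | real, L -> L1, id | id (the production L -> id and the tuple encoding of the tree are assumptions of the sketch); addtype is modelled as a dictionary update.

```python
symbol_table = {}

def addtype(name, type_name):
    """Record the type of an identifier in the symbol table."""
    symbol_table[name] = type_name

def visit_L(node, inherited_type):
    """node is ('L', L1_node, id) for L -> L1, id   or   ('L', id) for L -> id."""
    if len(node) == 3:
        _, l1, ident = node
        visit_L(l1, inherited_type)      # L1.in = L.in
        addtype(ident, inherited_type)   # addtype(id.entry, L.in)
    else:
        addtype(node[1], inherited_type)

def visit_D(type_token, l_node):
    t_type = type_token                  # T.type, synthesized from int / real
    visit_L(l_node, t_type)              # L.in = T.type

# Parse tree of the identifier list in: real id1, id2, id3
id_list = ("L", ("L", ("L", "id1"), "id2"), "id3")
visit_D("real", id_list)
```

The type is computed once at T and then only copied downward, which is why a single pre-order style pass is enough for this definition.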


Implementing Syntax Directed Definitions


Implementing a Syntax Directed Definition consists primarily in finding
an order for the evaluation of attributes
– Each attribute value must be available when a computation is
performed.
• Dependency Graphs are the most general technique used to
evaluate syntax directed definitions with both synthesized and inherited
attributes.
• A Dependency Graph shows the interdependencies among the
attributes of the various nodes of a parse-tree.
– There is a node for each attribute;
– If attribute b depends on an attribute c there is a link from the
node for c to the node for b (b ← c).
• Dependency Rule: If an attribute b depends on an attribute c,
then we need to fire the semantic rule for c first and then the semantic
rule for b.

Evaluation orders for SDD’s


A dependency graph is used to determine the order of
computation of attributes
Dependency graph
For each parse tree node, the dependency graph has a node for
each attribute associated with that node
If a semantic rule defines the value of synthesized
attribute A.b in terms of the value of X.c then the
dependency graph has an edge from X.c to A.b
If a semantic rule defines the value of inherited attribute
B.c in terms of the value of X.a then the dependency
graph has an edge from X.a to B.c
Example!


Constructing dependency graph

Dependency graph


Evaluation Order
• The evaluation order of semantic rules is given by a
Topological Sort of the dependency graph.
• Topological Sort: Any ordering m1, m2, . . . , mk such that if
mi → mj is a link in the dependency graph then mi comes before mj.
• Any topological sort of a dependency graph gives a valid order in which to
evaluate the semantic rules.

• Nodes in the dependency graph are numbered.
• There is an edge to node 5 for L.in from node 4 for T.type because of the semantic rule
L.in := T.type for production D -> T L
• Downward arrows into nodes 7 and 9 because of the rule L1.in = L.in for production
L -> L1, id
• The rule addtype(id.entry, L.in) leads to the creation of dummy attributes (nodes 6, 8, 10).


Ordering the evaluation of attributes


Attributes can be evaluated by building a dependency graph at
compile-time and then finding a topological sort.
A topological sort of a directed graph is a linear ordering of its vertices such
that for every directed edge uv from u to v, u comes before v in the
ordering.
• Disadvantages
1. This method fails if the dependency graph has a cycle: we need
a test for non-circularity;
2. This method is time consuming due to the construction of the
dependency graph.

• The graph in the figure has many valid topological sorts, including:
• 5, 7, 3, 11, 8, 2, 9, 10 (visual left-to-right, top-to-bottom)
• 3, 5, 7, 8, 11, 2, 9, 10 (smallest-numbered available vertex first)
• 5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
• 7, 5, 11, 3, 10, 8, 9, 2 (largest-numbered available vertex first)
• 5, 7, 11, 2, 3, 8, 9, 10 (attempting top-to-bottom, left-to-right)
• 3, 7, 8, 5, 11, 10, 2, 9 (arbitrary)
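
As a rough sketch of how an evaluation order can be computed, the Python function below applies Kahn's algorithm, picking the smallest-numbered available vertex first, and reports a cycle instead of returning a partial order. The example edges are those named earlier for the declaration dependency graph (4 to 5, 5 to 7, 7 to 9, and each L.in node to its dummy addtype node); the edges from the id.entry nodes are left out, which is a simplification of this sketch.

```python
import heapq

def topological_order(nodes, edges):
    """Return one topological order; raise ValueError if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]   # no unmet dependencies
    heapq.heapify(ready)
    order = []
    while ready:
        n = heapq.heappop(ready)                     # smallest-numbered vertex first
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                heapq.heappush(ready, m)
    if len(order) != len(nodes):
        raise ValueError("dependency graph has a cycle")
    return order

# 4 = T.type, 5/7/9 = L.in at the three L-nodes, 6/8/10 = dummy addtype attributes
nodes = [4, 5, 6, 7, 8, 9, 10]
edges = [(4, 5), (5, 6), (5, 7), (7, 8), (7, 9), (9, 10)]
order = topological_order(nodes, edges)
```

The length check at the end doubles as the non-circularity test: if some vertex never reaches indegree zero, it lies on or behind a cycle.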

S-Attributed definitions
An SDD is S-attributed if every attribute is synthesized
We can have a post-order traversal of parse-tree to
evaluate attributes in S-attributed definitions

postorder(N) {
for (each child C of N, from the left) postorder(C);
evaluate the attributes associated with node N;
}


L-Attributed definitions
An SDD is L-attributed if the dependency-graph edges between attributes of
symbols in a production body go from left to right, but not from right to left.
More precisely, each attribute must be either
Synthesized, or
Inherited, but if there is a production A -> X1 X2 … Xn and there
is an inherited attribute Xi.a computed by a rule associated
with this production, then the rule may only use:
Inherited attributes associated with the head A
Either inherited or synthesized attributes associated with the
occurrences of symbols X1, X2, …, Xi-1 located to the left of Xi
Inherited or synthesized attributes associated with this occurrence
of Xi itself, but only in such a way that there is no cycle in the graph

Application of Syntax Directed Translation
Type checking and intermediate code generation
(chapter 6)
Construction of syntax trees
Leaf nodes: Leaf(op,val)
Interior node: Node(op,c1,c2,…,ck)
Example:

Production          Semantic Rules
1) E -> E1 + T      E.node = new Node('+', E1.node, T.node)
2) E -> E1 - T      E.node = new Node('-', E1.node, T.node)
3) E -> T           E.node = T.node
4) T -> (E)         T.node = E.node
5) T -> id          T.node = new Leaf(id, id.entry)
6) T -> num         T.node = new Leaf(num, num.val)


Syntax tree for L-attributed definition


Production          Semantic Rules
1) E -> T E’        E.node = E’.syn
                    E’.inh = T.node
2) E’ -> + T E1’    E1’.inh = new Node('+', E’.inh, T.node)
                    E’.syn = E1’.syn
3) E’ -> - T E1’    E1’.inh = new Node('-', E’.inh, T.node)
                    E’.syn = E1’.syn
4) E’ -> ∈          E’.syn = E’.inh
5) T -> (E)         T.node = E.node
6) T -> id          T.node = new Leaf(id, id.entry)
7) T -> num         T.node = new Leaf(num, num.val)

Intermediate code generation


Intermediate code is the interface between front end
and back end in a compiler
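Ideally the details of the source language are confined to
the front end and the details of the target machine to the
back end; with m front ends and n back ends, m*n compilers can be built

[Figure: Parser -> Static Checker -> Intermediate Code Generator -> Code Generator. The first three stages form the front end; the code generator is the back end.]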


Graphical Representation of IC: Syntax Trees


An abstract syntax tree is a condensed form of a parse tree,
useful for representing language constructs.

Production and corresponding syntax tree

Syntax tree for previous parse tree
- Has fewer nodes than the parse tree

Construction of Syntax Trees


Similar to the translation of an expression into postfix form.
Construct subtrees for sub-expressions by creating a node for
each operator and operand.
The children of an operator node are the roots of the subtrees representing
the sub-expressions that are the operands of that operator.
Each node has multiple fields: one field identifies the
operator and the remaining fields contain pointers to the nodes for the
operands.
The following functions are used


Syntax tree

Syntax tree for a-4+c

Following sequence of function calls will make the syntax tree
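
One plausible sequence of calls is sketched below in Python, assuming constructors shaped like the Leaf(op, val) and Node(op, c1, ..., ck) functions listed under the applications of syntax directed translation. The dataclass definitions, the use of plain strings in place of symbol-table entries, and the show helper are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    op: str          # 'id' or 'num'
    val: object      # symbol-table entry (here just the name) or numeric value

@dataclass(frozen=True)
class Node:
    op: str
    children: tuple

def node(op, *children):
    return Node(op, children)

# Bottom-up construction of the syntax tree for a - 4 + c
p1 = Leaf("id", "a")
p2 = Leaf("num", 4)
p3 = node("-", p1, p2)       # a - 4
p4 = Leaf("id", "c")
p5 = node("+", p3, p4)       # (a - 4) + c, the root

def show(t):
    """Fully parenthesised rendering, used only to inspect the tree shape."""
    if isinstance(t, Leaf):
        return str(t.val)
    return "(" + f" {t.op} ".join(show(c) for c in t.children) + ")"
```

The order of the calls mirrors a post-order walk of the expression: operands are built before the operator node that points to them.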

[Figure: example syntax tree for a-4+c, with the parse tree shown dotted]


Variants of syntax trees


It is sometimes beneficial to create a DAG instead of a
tree for expressions.
This way we can easily show the common sub-expressions
and then use that knowledge during code
generation
Example: a+a*(b-c)+(b-c)*d

[Figure: DAG for a+a*(b-c)+(b-c)*d, in which the leaf a and the node for b-c are shared]

Directed Acyclic Graph (DAG) for expressions


Identifies common subexpressions in an expression.
Has a node for every sub-expression of the expression
An interior node represents an operator and its children
represent its operands.
A node in a DAG representing a common sub-expression
has more than one “parent”, whereas in a syntax tree the
common sub-expression would be represented as a
duplicated sub-tree
For expression
a + a * (b-c) + (b – c) * d


SDT for construction of dag


- Similar to constructing a syntax tree, with a minor difference
- A DAG is obtained if the function for constructing a node first checks whether an identical
node already exists
E.g. mknode(op, left, right) checks for an existing node labeled op with the same children; if it exists, it
returns a pointer to the previously constructed node.
Input: a + a * (b-c) + (b-c) * d

Value-number method for constructing DAG’s

[Figure: DAG nodes stored as records in an array; the leaf for i points to the symbol-table entry for i, and another leaf holds the constant 10]

- Nodes are implemented as records stored in an array, as in the figure above.
- In the figure, each record has a label field that determines the nature of the
node.
- We refer to a node by its index or position in the array.
- The integer index of a node is often called a value number for historical
reasons. For example, using value numbers, we can say node 3 has label
+, its left child is node 1, and its right child is node 2.


Value-number method for constructing a node in a DAG
Algorithm
Search the array for a node M with label op, left child l
and right child r
If there is such a node, return the value number of M
If not, create in the array a new node N with label op, left
child l, and right child r, and return its value number
An obvious way to determine if node M is already in
the array is to keep all previously created nodes on a
list and to check each node on the list to see if it has
the desired signature.
We may use a hash table
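
The algorithm can be sketched in Python with a dictionary playing the role of the hash table. The Dag class and its method names are hypothetical, and leaves are stored with the signature (label, value, None), which is a convention chosen for this sketch.

```python
class Dag:
    def __init__(self):
        self.nodes = []      # array of records; the index is the value number
        self.index = {}      # signature (op, left, right) -> value number

    def node(self, op, left=None, right=None):
        """Return the value number of the node with this signature, creating it if needed."""
        signature = (op, left, right)
        if signature in self.index:          # identical node already exists
            return self.index[signature]
        self.nodes.append(signature)
        self.index[signature] = len(self.nodes) - 1
        return self.index[signature]

    def leaf(self, label, value):
        return self.node(label, value)

# Build the DAG for a + a * (b - c) + (b - c) * d
dag = Dag()
a = dag.leaf("id", "a")
b = dag.leaf("id", "b")
c = dag.leaf("id", "c")
bc1 = dag.node("-", b, c)
prod1 = dag.node("*", a, bc1)
sum1 = dag.node("+", a, prod1)
bc2 = dag.node("-", b, c)          # second occurrence of b - c: reused, not rebuilt
d = dag.leaf("id", "d")
prod2 = dag.node("*", bc2, d)
root = dag.node("+", sum1, prod2)
```

A syntax tree for the same expression would need 13 nodes; the DAG needs 9 because the leaf a and the sub-expression b - c are shared.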

Three address code


A sequence of statements of the general form A := B op C, where A, B, C are
compiler-defined variables or programmer-defined names and op is an
operator.
In three address code there is at most one operator on the right side of
an instruction

Example (three address code for the DAG of a+a*(b-c)+(b-c)*d):

t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4

[Figure: the corresponding DAG]


A three address statement like A := B op C puts B in ARG1, C
in ARG2 and A in RESULT.
Statements with unary operators are assumed not to use
ARG2
Three Address Code for statement A := -B*(C+D)

T1 := -B
T2 := C + D
T3 := T1*T2
A := T3

Forms of three address instructions


x = y op z
x = op y
x=y
goto L
if x goto L and ifFalse x goto L
if x relop y goto L
Procedure calls using:
param x
call p,n
y = call p,n
x = y[i] and x[i] = y
x = &y and x = *y and *x =y


Example
do i = i+1; while (a[i] < v);

Symbolic labels             Position numbers
L: t1 = i + 1               100: t1 = i + 1
   i = t1                   101: i = t1
   t2 = i * 8               102: t2 = i * 8
   t3 = a[t2]               103: t3 = a[t2]
   if t3 < v goto L         104: if t3 < v goto 100

Implementation of Three Address Code


Three address code is thought of as an abstract specification
which can be implemented in a number of ways by different
compilers.
Quadruples
A record structure with four fields: OP, ARG1, ARG2, RESULT.
OP contains an internal code for the operator.
Quadruples for statement A := -B*(C+D)
      OP       ARG1   ARG2   RESULT
(0)   UMINUS   B      -      T1         T1 := -B
(1)   +        C      D      T2         T2 := C + D
(2)   *        T1     T2     T3         T3 := T1*T2
(3)   :=       T3     -      A          A := T3

The contents of fields ARG1, ARG2 and RESULT can be
temporary names or pointers to the symbol table entries for
the names represented by these fields.


Triples
To avoid entering temporary names into the symbol table, we
allow a statement computing a temporary value to represent
that value.
A structure of this sort has only three fields, OP, ARG1 and
ARG2, where ARG1 and ARG2, the arguments of OP, are
either pointers to the symbol table or pointers into the structure
itself.

OP ARG1 ARG2
(0) UMINUS B -
(1) + C D
(2) * (0) (1)
(3) := A (2)
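
To make the relationship between the two tables concrete, the Python sketch below converts the quadruples for A := -B*(C+D) into triples. The tuple layout, the function name to_triples, and the use of an integer index for a reference to an earlier triple are assumptions of this sketch; it also assumes each temporary is defined exactly once.

```python
quads = [
    ("uminus", "B",  None, "T1"),
    ("+",      "C",  "D",  "T2"),
    ("*",      "T1", "T2", "T3"),
    (":=",     "T3", None, "A"),
]

def to_triples(quads):
    """Replace each temporary name by the position of the triple that computes it."""
    defined_at = {}                       # temporary name -> index of its triple
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = defined_at.get(arg1, arg1)
        arg2 = defined_at.get(arg2, arg2)
        if op == ":=":
            triples.append((op, result, arg1))   # the target becomes ARG1
        else:
            triples.append((op, arg1, arg2))
            defined_at[result] = len(triples) - 1
    return triples

triples = to_triples(quads)
```

In the result, strings are names looked up in the symbol table and integers are pointers into the triple structure itself, matching the (0), (1), (2) entries of the table above.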

Indirect Triples
Listing pointers to triples rather than listing the triples
themselves

STATEMENT
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)

       OP       ARG1   ARG2
(14)   UMINUS   B      -
(15)   +        C      D
(16)   *        (14)   (15)
(17)   :=       A      (16)


Comparison of representations
Using quadruples:
When object code is produced, each datum, including each temporary, is assigned a memory location.
Quadruples tend to clutter up the symbol table with temporary names.
Quadruples give each temporary a location that can be accessed immediately
via the symbol table.
With triples we have no idea, unless we scan the code, how many temporaries
are active simultaneously, or how many words must be allocated for
temporaries; therefore the assignment of locations to temporaries is deferred to
code generation.
Code motion: with quadruples, if we move the statement computing A, the statements
using A require no change (useful in optimization).
With triples, moving a statement that defines a temporary value requires changing
all pointers to that statement in the ARG1 and ARG2 arrays.
This is no problem with indirect triples: moving a statement only requires reordering the
STATEMENT list. Indirect triples can also save space compared to quadruples if the same temporary
is used more than once.

SDT to generate 3-address code


S -> id = E    { gen(id.name = E.place); }
E -> E1 + T    { E.place = newTemp;
                 gen(E.place = E1.place + T.place); }
   | T         { E.place = T.place; }
T -> T1 * F    { T.place = newTemp;
                 gen(T.place = T1.place * F.place); }
   | F         { T.place = F.place; }
F -> id        { F.place = id.name; }

Input: x = a + b * c
Output:
t1 = b * c
t2 = a + t1
x = t2
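
The same actions can be sketched in Python over a hand-built expression tree instead of a live parser. The class ThreeAddressGen, its method names and the tuple encoding of the tree are assumptions of this sketch; new_temp and the appends to code stand in for newTemp and gen.

```python
class ThreeAddressGen:
    def __init__(self):
        self.code = []       # generated three-address instructions
        self.temps = 0

    def new_temp(self):
        self.temps += 1
        return f"t{self.temps}"

    def place(self, expr):
        """Return the 'place' attribute of expr, emitting code for its operators."""
        if isinstance(expr, str):            # F -> id { F.place = id.name }
            return expr
        op, left, right = expr               # E -> E1 + T  or  T -> T1 * F
        left_place = self.place(left)
        right_place = self.place(right)
        temp = self.new_temp()               # place = newTemp
        self.code.append(f"{temp}={left_place}{op}{right_place}")
        return temp

    def assign(self, name, expr):
        """S -> id = E { gen(id.name = E.place) }"""
        self.code.append(f"{name}={self.place(expr)}")

gen = ThreeAddressGen()
gen.assign("x", ("+", "a", ("*", "b", "c")))   # x = a + b * c
```

Temporaries are numbered in the order the reductions would happen, so the multiplication receives t1 before the addition receives t2.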


Evaluation by Top Down Parser


S -> id = E    { gen(id.name = E.place); }              (1)
E -> E1 + T    { E.place = newTemp;                     (2a)
                 gen(E.place = E1.place + T.place); }   (2b)
   | T         { E.place = T.place; }                   (3)
T -> T1 * F    { T.place = newTemp;                     (4a)
                 gen(T.place = T1.place * F.place); }   (4b)
   | F         { T.place = F.place; }                   (5)
F -> id        { F.place = id.name; }                   (6)

SDT for type checking


E -> E1 + E2   { if ((E1.type == E2.type) && (E1.type == int)) then E.type = int else error }
   | E1 == E2  { if ((E1.type == E2.type) && (E1.type == int || E1.type == boolean)) then E.type = boolean
                 else error }
   | (E1)      { E.type = E1.type }
   | num       { E.type = int }
   | True      { E.type = boolean }
   | False     { E.type = boolean }

Process the input: (4+20) == 8
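
As an illustration, the rules can be applied to a hand-built tree for (4+20) == 8 in Python. The tuple encoding, the function name check_type and the use of TypeError for the error case are assumptions of this sketch.

```python
def check_type(expr):
    """Return 'int' or 'boolean' for expr, or raise TypeError on a type error."""
    if isinstance(expr, bool):            # True / False (tested before int on purpose,
        return "boolean"                  # because bool is a subclass of int in Python)
    if isinstance(expr, int):             # num
        return "int"
    op, left, right = expr
    left_type = check_type(left)
    right_type = check_type(right)
    if op == "+":
        if left_type == right_type == "int":
            return "int"
        raise TypeError(f"'+' needs two ints, got {left_type} and {right_type}")
    if op == "==":
        if left_type == right_type and left_type in ("int", "boolean"):
            return "boolean"
        raise TypeError(f"'==' needs matching operand types, got {left_type} and {right_type}")
    raise ValueError(f"unknown operator {op!r}")

# (4 + 20) == 8 : parentheses only affect the tree shape, E.type = E1.type
result_type = check_type(("==", ("+", 4, 20), 8))
```

The inner sum is typed int first, and the comparison of two ints then yields boolean, which is the bottom-up order in which the actions would fire during parsing.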


Parse-Stack implementation of postfix SDT’s


In a shift-reduce parser we can easily implement semantic actions
using the parser stack
With each nonterminal (or state) on the stack we can associate a
record holding its attributes
Shift: push the value of the terminal onto the semantic stack
Reduce: execute the semantic action at the end of a production
to evaluate the attribute(s) of the nonterminal on the left side of
the production
– pop k values from the semantic stack, where k is the number of
symbols on the production’s RHS
– push the production’s value onto the semantic stack

[Figure: before reducing by T -> T1 * F the stack holds T, *, F with T.val and F.val; after the reduction it holds T with T.val]

Example
L -> E n       { print(stack[top-1].val);
                 top = top-1; }
E -> E1 + T    { stack[top-2].val = stack[top-2].val + stack[top].val;
                 top = top-2; }
E -> T
T -> T1 * F    { stack[top-2].val = stack[top-2].val * stack[top].val;
                 top = top-2; }
T -> F
F -> (E)       { stack[top-2].val = stack[top-1].val;
                 top = top-2; }
F -> digit
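
The stack manipulation can be traced with a small Python simulation. It does not implement an LR parser: the list of shift and reduce moves for 3*5+4n is written by hand (an assumption of this sketch), and only the semantic stack is modelled.

```python
def run(moves):
    """Replay parser moves on a semantic stack; return the values 'printed' by L -> E n."""
    stack = []
    printed = []
    for kind, arg in moves:
        if kind == "shift":
            stack.append(arg)                       # lexval for digits, None otherwise
        elif arg == "E->E+T":
            stack[-3] = stack[-3] + stack[-1]       # stack[top-2].val += stack[top].val
            del stack[-2:]                          # top = top - 2
        elif arg == "T->T*F":
            stack[-3] = stack[-3] * stack[-1]
            del stack[-2:]
        elif arg == "F->(E)":
            stack[-3] = stack[-2]                   # value of E replaces '('
            del stack[-2:]
        elif arg == "L->En":
            printed.append(stack[-2])               # print(stack[top-1].val)
            del stack[-1:]                          # top = top - 1
        # E->T, T->F, F->digit: the value is already in the right slot
    return printed

moves = [
    ("shift", 3), ("reduce", "F->digit"), ("reduce", "T->F"),
    ("shift", None),                                # '*'
    ("shift", 5), ("reduce", "F->digit"), ("reduce", "T->T*F"),
    ("reduce", "E->T"),
    ("shift", None),                                # '+'
    ("shift", 4), ("reduce", "F->digit"), ("reduce", "T->F"),
    ("reduce", "E->E+T"),
    ("shift", None),                                # 'n'
    ("reduce", "L->En"),
]
output = run(moves)
```

Unit productions need no action because the single value on top of the stack simply changes owner from the body symbol to the head.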


SDT’s with actions inside productions


For a production B -> X {a} Y
If the parse is bottom-up then we perform action “a” as soon as this
occurrence of X appears on the top of the parser stack
If the parser is top-down we perform “a” just before we expand Y
Sometimes we can't do things as easily as explained above
One example is when we are parsing this SDT with a bottom-up parser:

Turn the desk calculator into an SDT that prints the prefix form of an expression
1) L -> E n
2) E -> {print('+');} E1 + T
3) E -> T
4) T -> {print('*');} T1 * F
5) T -> F
6) F -> (E)
7) F -> digit {print(digit.lexval);}

SDT’s with actions inside productions (cont)
Any SDT can be implemented as follows
1. Ignore the actions and produce a parse tree
2. Examine each interior node N and add actions as new children at the
correct position
3. Perform a preorder traversal and execute actions when their nodes
are visited

[Figure: parse tree for 3*5+4 with the actions {print('+');}, {print('*');}, {print(3);}, {print(5);} and {print(4);} added as children]
