Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 43

UNIT IV

Syntax Directed Translation


Syntax Directed Definitions , Evaluation order of SDD’s , Application of SDD and
Syntax Directed Translation Schemes.

Intermediate code
Syntax tree , Polish notation and Three Address code ,static single assignment ,
Translation of Expressions ,Control flow Statements and Boolean Expressions

Run time storage


Storage Organization , Storage allocation Strategies , and Parameter passing
techniques.
SDD(Syntax Directed Definition)
Syntax Directed Definition(SDD) is a Context Free Grammar(CFG) with semantic rules
attached to the Productions.
The Semantic rules define values for attributes associated with the Non-terminal Symbol of the
Production.
SDD is a CFG which consists of attributes and rules.
• Attributes=>They are associated with grammar Symbol.
• Rules=>They are associated with Productions.

Example:

If ‘X’ is a Grammar Symbol and ‘b’ is one of its Attributes , then X.b denotes the value of ‘b’
at a particular Parse-Tree node labeled as ‘X’. Attributes can be number ,types table
reference or Strings.

Production Semantic rules


E->E+T E.val=E.val+T.val
E->T E.val=T.val
Inherited and Synthesized Attributes:
Attributes are the values attached to the nodes of a Parse
tree . Attributes is any property of a Programming Language construct.
Types of Attributes:
1. Synthesized Attribute:
If a node takes value from its childen then it is called Synthesized Attribute.
Example: Production rule A->x
A.val=x.val A.val is Synthesized Attribute because its value depending on the children
attribute value.

A->BCD here A is parent node , BCD are child nodes


A.val=B.val
A.val=C.val
A.val=D.val
2. Inherited Attribute:
If a node takes value from its parent node or siblings.
Example : Production Rule A->xy
x.val=A.val y.val=A.val
x.val=y.val y .val=x.val

A->BCD here A is parent node , BCD are child nodes


C.val=A.val
C.val=B.val
Evaluating an SDD at the Nodes of a Parse Tree:
A Parse tree that shows the value of its attributes at each
node is called an Annotated Parse Tree (APT ).The Process of computing the attribute values at the
nodes is called ANNOTATING or DECORATING the parse-tree.
Example:
Construct Syntax Directed Definition for the following Grammar by using Synthesized Attribute
and Construct Annotated Parse Tree for the Input String 2+3*4
E->E+T
E->T
T->T*F
T->F
F->num

Syntax Directed Definition:

Production Semantic rule


E->E+T E.val->E.val+T.val
E->T E.val->T.val
T->T*F T.val->T.val*F.val
T->F T.val->F.val
F->num F.val->num.lexvalue
Syntax Tree : Parse Tree:
+ E
2 * E + T
3 4 T T * F
F F num(4)

num(2) num(3)

Annotated Parse Tree:


E.val=14

E.val=2 + T.val=12

T.val=2 T.val=3 * F.val=4

F.val=2 F.val =3 num(4)

num(2) num(3)
Evaluation order of SDD’s
The useful tool that can be used for determining an evaluation order for the attribute instances in a given parse tree
is Dependency Graph. Annotated parse tree indicates attributes values and Dependency graph helps to
determine how those values can be computed.
Dependency Graph:
Dependency Graph indicates the flow of information among the attributes instances in a
Particular Parse tree : if an edge exists from one attribute instances to another , it means that the value of the first
is needed to compute the second . These edges are used to express the constraints implied by the semantic rules.
Generally,
1.The Dependency Graph has a node for each attribute associated with X ,for each node labeled by Grammar Symbol X.
2.Let ‘P’ be a Production .If a Semantic rule associated with P defines the value of Synthesized attribute A.b in terms of
the value of X.a,then the dependency graph has an edge from X.a to A.b.
3.If a semantic rule associated with P defines the value of inherited attribute B.a interms of the value of X.a,then the
Dependency Graph has an edge from X.a to B.a.
Example: Consider the following production and rule:
PRODUCTION SEMANTIC RULE
E ->E1+ T E.val= E1.val + T.val
At every node N labeled E, with children corresponding to the body of this production ,the synthesized attribute
val at N is computed using the values of val at the two children , labeled E and T. Thus, a portion of the
Dependency graph for every parse tree .we shall show the parse tree edges as dotted lines, while the edges of
the dependency graph are solid.
Types of SDD:
1.S-Attribute SDD or S-Attribute grammar or S-Attribute Definition .
2.L-Attribute SDD or L-Attribute grammar or L-Attribute Definition .

S-Attribute L-Attribute
1.A SDD that uses only Synthesized 1.A SDD that uses both Synthesized and
Attribute is called as S-Attribute SDD Inherited Attribute is called a L-Attributed
Ex: A->BCD SDD,but each Inherited Attribute is
A.val=B.val restricted to inherits from parent or left
A.val=C.val sibling only.
A.val=D.val Ex:A->xyz {y.val=A.val,y..val=x.val}
Wrong-y.val=z.val

2.Semantic Actions are always placed at 2.Semantic Actions are placed any where
right end of the production .It is also called on RHS
as Postfix SDD.

3.Attributes are Evaluated with bottom-up 3.Attributes are evaluated by traversing


parsing parse tree depth first ,left to right order
SDT(Syntax Directed Translation):During the process of parsing the evoluation of
attributes takes place by putting semantic actions in between { } at the right of
the grammar symbol is called SDT.
Production Semantic Rule
T->T*F T.val={T.val*F.val}
T->F F.val={F.val}
Example 1:
E->E*T
E->T
T->F-T
T->F
F->num

Production Semantic rule


E->E*T {E.val->E.val*T.val}
E->T {E.val->T.val}
T->F -T {T.val->F.val-T.val}
T->F {T.val->F.val}
F->num {F.val->num.lexvalue}
Input : ((4-(2-4)*2)
Parse tree:
E

E * T

T F

F - T num(2)

num(4) F - T

num(2) F

num(4)
Exampe 2: S->xxW {print(1);}
S->y {print(2);}
W->Sz {print(3);}
String xxxxyzz
Parse tree: S

x x W

S z

x x W

S z

y Output 23131
• Translation Scheme to convert infix to postfix expression
String 2+3*4
Production Semantic rule parse tree
E->E+T {printf(“+”);} --1 E
E->T { } ---2
T->T*F {printf(“*”);}---3 E + T 1
T->F {}---4
F->num {printf(num.lvalue}—5 T 2 T * F 3

F 4 F 4 num(4) 5
5 3
num(2) num(3)
Applications of SDD
One of the major applications of SDD is the Construction of Syntax trees .
In a Syntax tree,Operators and Keywords do not appear as leaves,but rather they are associated with interior node
would be the parent of those leaves in the parse tree.
The Syntax tree for the Expression 3*4+5 is given by ,

Construction Syntax Trees for Expressions:


•Construction of a Syntax Tree for an expression is similar to the translation of the expression into postfix form.
•Subtrees are Constructed for sub expressions by creating a node for each operator and operand.
•In the node for an operator ,one field identifies the operator and the remaining fields contain pointers to the nodes for
the operands.
•The operator is called the label of the node.

The Following Functions are used to create the nodes of Syntax trees for expressions with binary operators.
1)mknode(op,left,right): Creates an operator node label with op and two fields containing pointers to left and right
operands.
2)mkleaf(id,entry): Creates an identifier node with label id and a field containing entry,which is a pointer to the
Symbol-table entry for the identifier.
3)mkleaf(num,val): Creates a number node with label num and a field containing val ,the value of the number
Example : Constructs Syntax Trees for a simple expression a-4+c ,the sequence of
functions calls are

1)P1:=mkleaf(id,entry a);
2)P2:=mkleaf(num,4);
3)P3:=mknode(‘-’,P1,P2);
4)P4:=mkleaf(id,entry c);
5)P5:=mknode(‘+’,P3,P4);
Syntax Directed Translation Schemes(SDTS)
Syntax-directed translation schemes are a complementary notation to syntax directed definitions. A syntax-directed
translation scheme (SDT) is a context free grammar with program fragments embedded within production
bodies. The program fragments are called semantic actions and can appear at any position within a production
body.
Any SDT can be implemented by first building a parse tree and then performing the actions in a left-to-right depth-
first order; that is, during a preorder traversal. Typically, SDT's are implemented during parsing, without building a
parse tree. To implement SDT's, two important classes of SDD's:
1. The underlying grammar is LR-parsable, and the SDD is S-attributed.
2. The underlying grammar is LL-parsable, and the SDD is L-attributed.

1. Postfix Translation Schemes:


The simplest SDD implementation occurs when we can parse the grammar bottom-up and the SDD is S-attributed.
In that case, we can construct an SDT in which each action is placed at the end of the production and is executed
along with the reduction of the body to the head of that production. SDT's with all actions at the right ends of the
production bodies are called postfix SDT's.
Example : Let us Construct Postfix SDT for desk Calculator

Here tha action for the production 1 Prints a value.


The given grammar is LR , and SDD is S-Attributed.
2. Parser-Stack Implementation of Postfix SDT's:
A Stack is being used by the bottom up parser to hold information of the parsed
Subtrees .During LR Parsing ,Postfix SDT’s can be implemented by considering the attributes of each grammar
on the stack in a place where grammar symbol is found during reduction.

State/grammar Symbol Synthesized attributes


: :
X X.x
Y Y.y
TOP - > Z Z.z
: :
Parser stack with a field for Synthesized attributes
Here TOS contains three grammar symbols XYZ . They are to be reduced based on the production A->XYZ. Here
X.x is on attribute of X and so on .
3. Eliminating Left Recursion from SDT's:
Since no grammar with left recursion can be parsed deterministically top-
down, we examined left-recursion elimination. When the grammar is part of an SDT, we also need to worry about
how the actions are handled.
In order to Eliminate Left recursion if: A->Aα|β then,
Replace them by following Productions : -
A-> βR
R-> αR|Є,
Where β doesn’t start with A.Here ‘R’ stands for ‘remainder’.
Example: Give the Translation Scheme that Converts infix to postfix form for the following
grammar.Also generate the annotated parse tree for input string 2+6+1
E->E+T {print(‘+’)}
E->T
T->0 {print(‘0’)}
T->1 {print(‘1’)}
T->2 {print(‘2’)}
T->3 {print(‘3’)}
T->4 {print(‘4’)}
T->5 {print(‘5’)}
T->6 {print(‘6’)}
T->7 {print(‘7’)}
T->8 {print(‘8’)}
T->9 {print(‘9’)}
Solution:
Let us Convert this grammar into Non Left Recursion Form
E->TP
T->0|1|2|3|4|5|6|7|8|9
P->+Tp|Є
Production Semantic Action
E->TP
T->0 {print(‘0’)}
T->1 {print(‘1’)}
T->2 {print(‘2’)}
T->3 {print(‘3’)}
T->4 {print(‘4’)} The Annotated Parse Tree For the String 2+6+1
T->5 {print(‘5’)}
T->6 {print(‘6’)}
T->7 {print(‘7’)}
T->8 {print(‘8’)}
T->9 {print(‘9’)}
P->+TP|Є {print(‘+’)}P|Є
Intermediate code
Intermediate code:
• Intermediate Code is a modified input source program which is stored in some data structure.
• The front end translates a source program into an intermediate representation from which the back
end generates target code.
• If the compiler directly translates source code into the machine code without generating
intermediate code then a full native compiler is required for each new machine.
• Intermediate code generator receives input from its predecessor phase and semantic analyzer phase.
It takes input in the form of an annotated syntax tree.

The following are commonly used intermediate code representations(or)Forms of Intermediate Code
1. Syntax Tree
2.Postfix Notation
3. Three-Address Code
1.Syntax Tree:
• Some of the Front-End of compilers translate the input source program into an intermediate form known as
Abstract Syntax tree(AST).
• Here Leaf node represents the Operands like a,b,5 etc.
• The Interior nodes correspond to the Operators like +.- etc.
• The Natural Hierarchical Structure is represented by Syntax Trees.
• The syntax tree not only condenses the parse tree but also offers an improved visual representation of the
program’s syntactic structure,
Example: x = (a + b * c) / (a – b * c)
2.Postfix Notation:
• Also known as reverse Polish notation or suffix notation.
• In the infix notation, the operator is placed between operands, e.g., a + b.
• Postfix notation positions the operator at the right end, as in ab +. For any postfix
expressions e1 and e2 with a binary operator (+) , applying the operator yields e1e2+.
• Postfix notation eliminates the need for parentheses, as the operator’s position and arity allow
unambiguous expression decoding.
• In postfix notation, the operator consistently follows the operand.
Example 1: The postfix representation of the expression
(a + b) * c is : ab + c *
Example 2: The postfix representation of the expression
(a – b) * (c + d) + (a – b) is : ab – cd + *ab -+

3. Three-Address Code:
• A three address statement involves a maximum of three references, consisting of two for operands and one for
the result.
• A sequence of three address statements collectively forms a three address code.
• The typical form of a three address statement is expressed as x = y op z, where x, y, and z represent memory
addresses.
• Each variable (x, y, z) in a three address statement is associated with a specific memory location.
• While a standard three address statement includes three references, there are instances where a statement may
contain fewer than three references, yet it is still categorized as a three address statement.
Example : The three address code for the expression a+b*c+d
T1 = b * c
T2 = a + T1
Example 1: Write Three Address Code for the following expression
a=b+c+d
The Three Address Code for the above expression as
t1=b+c
t2=t1+d
a=t2
Example 2: Write Three Address Code for the following expression
-(a+b)+(c+d)-(a+b+c+d)
The Three Address Code for the above expression as
t1=a+b
t2=uminus t1
t3=c+d
t4=t2+t3
t5=a+b
t6=t3+t5
t7=t4-t6
Representation of Three Address Code :
1.Quadruples 2.Triples 3.Indirect Triples
1.Quadruples: In Quadruples representation each instruction is
splitted into the following 4 different fields
op arg1 arg2 result
Here ,op field used for storing the internal code of the operator
The arg1 and arg2 are used for storing the two operands.
The result field is used for storing the result of the expression.
Example: a=b* -c + b* -c
Three address code as Quadruple representation as
t1=-c Location op arg1 arg2 result
t2=b*t1 (0) - c t1
t3=-c (1) * b t1 t2
(2) - c t3
t4=b*t3
(3) * b t3 t4
t5=t2+t4 (4) + t2 t4 t5
2.Triples : In Triples representation the use of temporary variables are
avoided by refering the pointers in the symbol table.each
instruction is splitted into the following 3 different fields
op arg1 arg2
Here ,op field used for storing the internal code of the operator
The arg1 and arg2 are used for storing the two operands.
Example: a=b* -c + b* -c
Three address code as Triple representation as
t1=-c
Location op arg1 arg2
t2=b*t1 (0) - c
t3=-c (1) * b (0)
t4=b*t3 (2) - c
(3) * b (2)
t5=t2+t4
(4) + (1) (3)
3.Indirect Triples : In this representation the listing of triples is being
done and listing pointers are used instead of using statements.
Example: a=b* -c + b* -c
Three address code as Indirect Triple representation as
t1=-c
t2=b*t1
(0) 100
t3=-c (1) 101
t4=b*t3 (2) 102
t5=t2+t4 (3) 103
(4) 104
Ex1: Translate the following expression to Quadruples,Triples,Indirect Triples
a+b*c/e↑f+b*c
Three address code as 1)Quadruple:
t1=e↑f LOC op arg1 arg2 result
t2=b*c (0) ↑ e f t1
t3=t2/t1 (1) * b c t2
t4=b*a (2) / t2 t1 t3
t5=a+t3 (3) * b a t4
T6=t5+t4 (4) + a t3 t5
2)Triples: (5) + t5 t4 t6

loc op arg1 arg2 3)Indirect Triples:


(0) ↑ e f 35 (0)
(1) * b c 36 (1)
(2) / (1) (0) 37 (2)
(3) * b a 38 (3)
(4) + a (2) 39 (4)
(5) + (4) (3) 40 (5)
Ex 2: Translate the following expression to Quadruples,Triples,Indirect Triples
Example: (a+b)*(c-d)
Three address code as 1)Quadruple representation as:
t1=a+b LOC op arg1 arg2 result
t2=c-d (0) + a b t1
t3=t1*t2 (1) - c d t2
(2) * t1 t2 t3

2)Triples:
loc op arg1 arg2 3)Indirect Triples:
op arg1 arg2
(0) + a b (0) (10)
(0) + a b
(1) - c d (1) (11)
(2) * (0) (1) (1) - c d
(2) (12)
(2) * (10) (11)
Ex 3: Translate the following expression to Quadruples,Triples,Indirect
Triples
Example: R=-(a+b)*(c+d)+(a+b+c)
Three address code as 1)Quadruple representation as:
t1:=a+b
t2:=-t1
t3:=c+d LOC op arg1 arg2 result
(0) + a b t1
t4:=t2*t3
(1) - t1 t2
t5:=t1+c
(2) + c d t3
t6:=t4+t5 R:=t6 (3) * t2 t3 t4
(4) + t1 c t5
(5) + t4 t5 t6
(6) = t6 R
R=-(a+b)*(c+d)+(a+b+c)
Three address code as 2)Triples:
t1:=a+b loc op arg1 arg2
t2:=-t1 (0) + a b
t3:=c+d
(1) - (0)
t4:=t2*t3
(2) + c d
t5:=t1+c
t6:=t4+t5 R:=t6 (3) * (1) (2)
3)Indirect Triples: (4) + (0) c
(5) + (3) (4)
Triple no Statement
(6) = R (5)
(0) (10)
(1) (11)
(2) (12)
(3) (13)
(4) (14)
(5) (15)
(6) (16)
Advantages of Intermediate Code Generation:

• Easier to implement: Intermediate code generation can simplify the code generation
process by reducing the complexity of the input code, making it easier to implement.
• Facilitates code optimization: Intermediate code generation can enable the use of
various code optimization techniques, leading to improved performance and
efficiency of the generated code.
• Platform independence: Intermediate code is platform-independent, meaning that it
can be translated into machine code or bytecode for any platform.
• Code reuse: Intermediate code can be reused in the future to generate code for other
platforms or languages.
• Easier debugging: Intermediate code can be easier to debug than machine code or
bytecode, as it is closer to the original source code.
Static Single Assignment:

SSA is also an Intermediate representation useful for code


optimizations .SSA differs in 3AC as
All assignments in SSA are variable with distinct names.
Example:
if 3AC Its Assignment form is
a=b+d a1=b+d
e=a-f e1=a1-f
a=e*g a2=e1*g
e=h-e e2=h-e3
Translation of Expressions:
Let us see how to translate expressions.Consider assignment statements and
expressions as:
Example 1:
F->F1+F2
Semantic Rules:
F.addr=new temp()
F.code= F1.code||F2.code||gen(F.addr ’=‘F1.addr ’+’F2.addr)
Example 2:
F->F1*F2
Semantic Rules:
F.addr=new temp()
F.code= F1.code||F2.code||gen(F.addr ’=‘F1.addr ’*’F2.addr)
Example 3:
F->id
Semantic Rules:
F.addr=top.get(id,lexme)
Flow of Control Statements: The main idea of converting any flow of control
statements to the 3AC is to simulate the Branching of the flow of control
using the goto statement.
Simple if:
S->if(B) then S1
Semantic rule:
B.true=newlabel()-this function generates a three address code
B.false=S.next;
S1.next=S.next;
S.code=B.code||Label(B.true)||S1.code

if else:
S->if(B) then S1 else S2
Semantic rule:
B.true=newlabel()
B.false=newlabel()
S1.next=S.next;
S2.next=S.next;
S.code=B.code||Label(B.true)||S1.code||Label(B.false)||S2.code
While:
S->while E do S1
Semantic rule:
S.start=newlabel();
E.true=newlabel()
E.false=S.next;
S1.next=S.next;
S.code=label(S.begin)||E.code||label(E.true)||S1.code||label(goto
S.start)
Translation of boolean expressions: A Boolean
expression can be Translated into semantic
rules as shown:
1.B->B1||B2
B.code=B1.code||label(B1.false)||B2.code;
2. B->B1&&B2
B.code=B1.code||label(B1.true)||B2.code;
3.B->!B1
B1.true=B.false
B1.false=B.true
Run time Storage
Activation tree:
• The main aim of runtime systems is to map procedures to their
activations.
• Each execution of the procedure is called an activation .
• A procedure Definition associates an identifier with a statement or a
block of statements.
• The identifier to the procedure is called the procedure name
• An activation tree is used to determine the way the control flows
between procedures.
• During program Execution ,the control flow is sequential among
procedures.
• In a procedure ,the execution begins at the starting point of its body .
• Activation tree for the procedure call fib(4) as fib(4)

fib(3) fib(2)

fib(2) fib(1)
Storage organization: It deals with the division of memory into
different areas ,management of the activation records and layout of
the local data.
a.Sub division of run-time memory: The underlying hardware and
operating system provide a division of memory into the following
areas. Heap area
Stack area
Static area
Code area

Code area: This contains the generated code


Static area: This contains data whose absolute address can be determined
at compile time.In C the global and static variables are kept in this area.
Stack area: The local variables and parameters of all ‘C’ procedures would
reside in this area.
Heap area: This is data entered at run time.
Storage Allocation: Different types of storage allocation schemes are
followed.They are 1.Static Allocation 2.Stack Allocation 3.Heap allocation
1.Static storage allocation
• In static allocation, names are bound to storage locations.
• If memory is created at compile time then the memory will be created in
static area and only once.
• Static allocation supports the dynamic data structure that means memory is
created only at compile time and deallocated after program completion.
• The drawback with static storage allocation is that the size and position of
data objects should be known at compile time.
• Another drawback is restriction of the recursion procedure.
2.Stack Storage Allocation
• In static storage allocation, storage is organized as a stack.
• An activation record is pushed into the stack when activation begins and it is
popped when the activation end.
• It works on the basis of last-in-first-out (LIFO) and this allocation supports the
recursion process.
3.Heap Storage Allocation
• Heap allocation is the most flexible allocation scheme.
• Allocation and deallocation of memory can be done at any time and
at any place depending upon the user's requirement.
• Heap allocation is used to allocate memory to the variables
dynamically and when the variables are no more used then claim it
back.
• Heap storage allocation supports the recursion process.
Example:
fact (int n)
{
if (n<=1)
return 1;
else
return (n * fact(n-1));
}
Parameter passing techniques:
The communication medium among procedures is known as parameter passing. The values of the
variables from a calling procedure are transferred to the called procedure by some mechanism. Before
moving ahead , first go through some basic terminologies pertaining to the values in a program.

r value
The value of the Expression is called its r-value. The value contained in a single variables also becomes
an r-value if it appears on the right-hand side of the assignment operator. r-value can always be
assigned to some other variables.

l-value
The location of the memory where an expression is stored is known as the l-value of that expression . It
ialways appears at the left hand side of the assignment operator.
Eg: Week = day * 7

Based on the parameters there are various parameter passing methods , the most common methods
are
1.Pass by value
The actual parameters are evaluated mand their r values are passed to called procedure.
2.Pass by reference
The l value ,the address of parameters passed to the called routines.

You might also like