Experiment No. 7
Aim: Implementation of the code generation phase of a compiler.
Theory:
What is Code Generation?

The first part of a compiler analyzes the source code into a structure that carries the meaning of the program; this
structure is generally the abstract syntax tree that’s been checked and decorated. (Remember decorated means all
identifier references have been resolved.)

From this structure we can generate the corresponding code in some other language, the target language. This is
what a code generator does.

Some compilers generate code twice: they first generate code in some “intermediate language” like SIL, LLVM IR,
HIR, MIR, CIL, etc. Then they do the “real” code generation into a target language that is directly runnable (or
really close to it), like virtual machine code, assembly language, or machine language.

A code generator produces target code for three-address statements. It uses registers to hold the
operands of each three-address statement.

Example:
Consider the three-address statement x := y + z. It can be translated into the following code sequence, where each
instruction has the form OP source, destination:

MOV y, R0
ADD z, R0
MOV R0, x

Register and Address Descriptors:

A register descriptor keeps track of what is currently in each register. Initially, the register descriptors show
that all registers are empty.

An address descriptor stores the location (a register, a memory address, or both) where the current value of a
name can be found at run time.
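
The two descriptors can be pictured as small tables. The following is a minimal sketch in C, assuming a machine with two registers and short names; the identifiers reg_desc, AddrDesc, lookup, set_reg and location are hypothetical and are not part of the program given later in this experiment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_REGS 2    /* assumed machine size for this sketch */
#define MAX_NAMES 8   /* assumed limit on distinct names */

/* Register descriptor: the name each register holds ("" means empty).
   Static storage is zero-filled, so all registers start out empty. */
static char reg_desc[NUM_REGS][8];

/* Address descriptor: where the current value of one name can be found. */
struct AddrDesc {
    char name[8];
    int in_memory;   /* 1 if the memory copy is up to date */
    int reg;         /* register holding the value, or -1 */
};
static struct AddrDesc addr_desc[MAX_NAMES];
static int num_names = 0;

/* Find the descriptor for name; a new name starts out in memory only. */
struct AddrDesc *lookup(const char *name) {
    for (int i = 0; i < num_names; i++)
        if (strcmp(addr_desc[i].name, name) == 0)
            return &addr_desc[i];
    assert(num_names < MAX_NAMES && strlen(name) < sizeof addr_desc[0].name);
    struct AddrDesc *d = &addr_desc[num_names++];
    strcpy(d->name, name);
    d->in_memory = 1;
    d->reg = -1;
    return d;
}

/* Record that register r now holds a freshly computed value of name. */
void set_reg(int r, const char *name) {
    assert(r >= 0 && r < NUM_REGS);
    if (reg_desc[r][0] != '\0')
        lookup(reg_desc[r])->reg = -1;    /* previous occupant loses r */
    struct AddrDesc *d = lookup(name);
    if (d->reg >= 0)
        reg_desc[d->reg][0] = '\0';       /* name leaves its old register */
    strcpy(reg_desc[r], name);
    d->reg = r;
    d->in_memory = 0;                     /* memory copy is stale until stored */
}

/* Preferred operand location: the register if there is one, else memory. */
const char *location(const char *name, char *buf, size_t size) {
    struct AddrDesc *d = lookup(name);
    if (d->reg >= 0)
        snprintf(buf, size, "R%d", d->reg);
    else
        snprintf(buf, size, "%s", name);
    return buf;
}
```

With these tables, location("y", buf, sizeof buf) yields y while y is only in memory and R1 after set_reg(1, "y"), which is the "prefer the register" rule used by the code-generation algorithm.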

A code-generation algorithm:

The algorithm takes a sequence of three-address statements as input. For each three-address statement of the form
x := y op z, it performs the following actions:

1. Invoke a function getreg to determine the location L where the result of the computation y op z should be stored.
2. Consult the address descriptor for y to determine y', the current location of y. If the value of y is currently
in both memory and a register, prefer the register for y'. If the value of y is not already in L, generate the
instruction MOV y', L to place a copy of y in L.
3. Generate the instruction OP z', L, where z' is the current location of z. If z is in both a register and memory,
prefer the register. Update the address descriptor of x to indicate that x is in location L. If L is a register,
update its descriptor to show that it holds x, and remove x from all other register descriptors.
4. If the current values of y or z have no next use, are not live on exit from the block, and are in registers,
alter the register descriptor to indicate that, after execution of x := y op z, those registers no longer
contain y or z.
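
The four steps above can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the full textbook algorithm: the Stmt type, the names gen_block, getreg, needed and reg_of, the two-register machine, and the spill rule (always evict R0) are choices made for this sketch. The address descriptor is implicit here: a name is either in exactly one register or in memory. Operands are printed source first, as in MOV y', L.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_REGS 2                       /* assumed register count */
typedef struct { const char *x, *y; char op; const char *z; } Stmt; /* x := y op z */

static const char *RNAME[NUM_REGS] = { "R0", "R1" };
static const char *holds[NUM_REGS];      /* register descriptor; NULL = empty */
static const Stmt *blk; static int nstmt;
static const char **live;                /* NULL-terminated names live on exit */
static char *out; static size_t cap;

static void emit(const char *op, const char *src, const char *dst) {
    size_t len = strlen(out);
    snprintf(out + len, cap - len, "%s %s, %s\n", op, src, dst);
}

static int reg_of(const char *v) {       /* register holding v, or -1 */
    for (int r = 0; r < NUM_REGS; r++)
        if (holds[r] && strcmp(holds[r], v) == 0) return r;
    return -1;
}

/* Is the current value of v still needed at or after statement `from`? */
static int needed(const char *v, int from) {
    for (int i = from; i < nstmt; i++) {
        if (!strcmp(blk[i].y, v) || !strcmp(blk[i].z, v)) return 1;
        if (!strcmp(blk[i].x, v)) return 0;       /* redefined before any use */
    }
    for (const char **p = live; *p; p++)
        if (!strcmp(*p, v)) return 1;
    return 0;
}

static int getreg(int i) {
    int r = reg_of(blk[i].y);
    if (r >= 0 && !needed(blk[i].y, i + 1)) return r;  /* reuse y's register */
    for (r = 0; r < NUM_REGS; r++)
        if (!holds[r]) return r;                       /* take an empty register */
    if (needed(holds[0], i)) emit("MOV", RNAME[0], holds[0]); /* spill R0 */
    holds[0] = NULL;
    return 0;
}

void gen_block(const Stmt *s, int n, const char **live_out, char *buf, size_t size) {
    assert(buf != NULL && size > 0);
    blk = s; nstmt = n; live = live_out; out = buf; cap = size;
    out[0] = '\0';
    for (int r = 0; r < NUM_REGS; r++) holds[r] = NULL;
    for (int i = 0; i < n; i++) {
        const char *opn = s[i].op == '+' ? "ADD" : s[i].op == '-' ? "SUB"
                        : s[i].op == '*' ? "MUL" : "DIV";
        int L = getreg(i);                                    /* step 1 */
        int ry = reg_of(s[i].y);
        if (ry != L)                                          /* step 2 */
            emit("MOV", ry >= 0 ? RNAME[ry] : s[i].y, RNAME[L]);
        int rz = reg_of(s[i].z);
        emit(opn, rz >= 0 ? RNAME[rz] : s[i].z, RNAME[L]);    /* step 3 */
        int rx = reg_of(s[i].x);
        if (rx >= 0) holds[rx] = NULL;         /* x now lives only in L */
        holds[L] = s[i].x;
        for (int r = 0; r < NUM_REGS; r++)     /* step 4: free dead registers */
            if (r != L && holds[r] && !needed(holds[r], i + 1)) holds[r] = NULL;
    }
    for (int r = 0; r < NUM_REGS; r++)         /* store names live on exit */
        if (holds[r] && needed(holds[r], n)) emit("MOV", RNAME[r], holds[r]);
}
```

For the block t := a - b, u := a - c, v := t + u, d := v + u with only d live on exit, gen_block fills the buffer with MOV a, R0; SUB b, R0; MOV a, R1; SUB c, R1; ADD R1, R0; ADD R1, R0; MOV R0, d. Step 4 is applied here to every register rather than only to those holding y or z, which has the same effect on this example.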

Generating Code for Assignment Statements:


The assignment statement d := (a-b) + (a-c) + (a-c) can be translated into the following sequence of three-address
code:
t := a - b
u := a - c
v := t + u
d := v + u
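
As a rough illustration, the four statements above can be stored as quadruples (operator, two arguments, result), the same shape of record that a code generator walks over. The Quad type and the helpers format_quad and count_uses are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One three-address statement, result := arg1 op arg2, as a quadruple. */
struct Quad { char op; const char *arg1, *arg2, *result; };

static const struct Quad block[] = {
    { '-', "a", "b", "t" },   /* t := a - b */
    { '-', "a", "c", "u" },   /* u := a - c */
    { '+', "t", "u", "v" },   /* v := t + u */
    { '+', "v", "u", "d" },   /* d := v + u */
};
enum { BLOCK_LEN = sizeof block / sizeof block[0] };

/* Write quadruple q into buf in the form "result := arg1 op arg2". */
void format_quad(const struct Quad *q, char *buf, size_t size) {
    assert(q != NULL && buf != NULL);
    snprintf(buf, size, "%s := %s %c %s", q->result, q->arg1, q->op, q->arg2);
}

/* Count how many times name appears as an operand in the block. */
int count_uses(const char *name) {
    int uses = 0;
    for (int i = 0; i < BLOCK_LEN; i++) {
        if (strcmp(block[i].arg1, name) == 0) uses++;
        if (strcmp(block[i].arg2, name) == 0) uses++;
    }
    return uses;
}
```

Here count_uses("u") returns 2: the repeated subexpression (a-c) is computed once into u and then used by both additions, which is why the three-address form needs only four statements.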
Program:
#include <stdio.h>
#include <string.h>

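/* One three-address statement: result = arg1 op arg2.
   Each name may be at most 4 characters long. */
struct Quadruple {
    char op;
    char arg1[5];
    char arg2[5];
    char result[5];
} quad[15];

int n = 0;            /* number of statements read */
char expn[20][20];    /* raw input statements, e.g. "t=a-b" */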

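/* Emit code for quad[t] with a binary operator, using R0 as the only register.
   This program prints operands destination first: OP dst, src. */
void codegen(const char *op, int t) {
    printf("MOV R0, %s\n", quad[t].arg1);
    printf("%s R0, %s\n", op, quad[t].arg2);
    printf("MOV %s, R0\n", quad[t].result);
}

/* Emit code for a plain copy: result = arg1. */
void assignment(int t) {
    printf("MOV R0, %s\n", quad[t].arg1);
    printf("MOV %s, R0\n", quad[t].result);
}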

/* Split each statement "result=arg1 op arg2" (typed without spaces) into a quadruple. */
void explore() {
    for (int i = 0; i < n; i++) {
        const char *s = expn[i];
        int t = 0, k = 0;

        /* result: everything before '=' (at most 4 characters are kept) */
        for (; s[t] != '\0' && s[t] != '='; t++)
            if (k < 4) quad[i].result[k++] = s[t];
        quad[i].result[k] = '\0';
        if (s[t] == '\0') {          /* no '=' found: nothing to generate */
            quad[i].op = '\0';
            continue;
        }

        quad[i].op = '=';            /* plain copy unless an operator follows */
        t++;

        /* arg1: up to the first operator or the end of the statement */
        for (k = 0; s[t] != '\0' && strchr("+-*/", s[t]) == NULL; t++)
            if (k < 4) quad[i].arg1[k++] = s[t];
        quad[i].arg1[k] = '\0';

        /* arg2: present only when an operator was found */
        k = 0;
        if (s[t] != '\0') {
            quad[i].op = s[t++];
            for (; s[t] != '\0'; t++)
                if (k < 4) quad[i].arg2[k++] = s[t];
        }
        quad[i].arg2[k] = '\0';
    }
}

int main() {
    int m;

    printf("Enter the number of statements\n");
    if (scanf("%d", &m) != 1 || m < 0 || m > 15) {   /* quad[] holds 15 entries */
        printf("Invalid number of statements\n");
        return 1;
    }

    printf("Enter the statements\n");
    for (int i = 0; i < m; i++) {
        /* each statement is typed without spaces, e.g. t=a-b;
           the field width keeps the input inside expn[i] */
        if (scanf("%19s", expn[i]) != 1)
            break;
        n++;
    }

    explore();

    printf("\nCode generated:\n");
    for (int i = 0; i < n; i++) {
        if (quad[i].op == '+')
            codegen("ADD", i);
        else if (quad[i].op == '=')
            assignment(i);
        else if (quad[i].op == '-')
            codegen("SUB", i);
        else if (quad[i].op == '*')
            codegen("MUL", i);
        else if (quad[i].op == '/')
            codegen("DIV", i);
    }

    return 0;
}
Output:

Conclusion: Hence, we have studied and implemented code generation in C.
