
LABORATORY MANUAL

Theory of Compiler

COMP – 433

ACADEMIC YEAR- 1438-1439H

Mohammad Shabbir Alam


Department of Computer Science
Jazan University, Jazan, KSA

Course Coordinator: Mohammad Shabbir Alam

Track Leader: Dr. Shadab Alam

Compiled by Mohammad Shabbir Alam COMP-433


PREFACE

This document has been prepared to serve as a laboratory manual for COMP-433,
Theory of Compiler, for Computer Science students. The manual consists of a set of
experiments designed to allow students to build and verify a small compiler. This set of
experiments covers the relevant topics prescribed in the syllabus and is designed to
reinforce the theoretical concepts taught in the classroom with practical experience in
the lab. An integral part of this course is the project. Students are required to form
groups of two from day one. Compiler construction will be accomplished in
three phases during the semester, and the specifications will be provided on the course
web page.



Lab Objectives

1. To provide an understanding of the peculiarities of language translation
by designing a complete translator for a mini language.

2. To provide an understanding of all the phases of a compiler.



INDEX

1. Syllabus

2. Hardware/Software Requirement

3. Rationale behind the Compiler Design lab

4. Practicals conducted in the lab

5. References

6. New ideas besides University Syllabus



Theory of Compiler LAB

1. Practice of lexical analyzer implementation.

2. Write a program to check whether a string belongs to the grammar or not.

3. Write a program for recursive descent parsing and check the validity of a string.

4. Write a program for computation of FIRST of a non-terminal.

5. Write a program to find the number of whitespace and newline characters.

6. Write a program to check whether a string is a keyword or not.

7. Write a program for implementation of shift-reduce parsing (bottom-up parsing).

8. Write a program to generate three-address code for the assignment operator.

9. Write a program to implement a code generator in C++.

10. Write a program in C++ to perform string matching.

11. Write a C++ program to check the validity of an email address.

Software Requirements:

Language: Visual Studio C++ (Visual Studio .NET)

System configuration on which the lab is conducted:

Processor: Dual Core (2.8 GHz)

RAM: 2 GB

HDD: 340 GB

Monitor: LCD

Keyboard: Multimedia

Operating System: Windows XP

Mouse: Scroll



Rationale behind Compiler Design LAB

A compiler is system software that converts a high-level language into a low-level
language. Human beings cannot comfortably program in the machine language (low-level
language) understood by computers, so we program in a high-level language, and the
compiler is the software that bridges the gap between the user and the computer.

It is a very complicated piece of software: the first compiler took 18 man-years to
build. To make the task manageable, a compiler is divided into six phases:

1) Lexical Analysis

2) Syntax Analysis

3) Semantic Analysis

4) Intermediate Code Generation

5) Code Optimization

6) Code Generation

In the lab sessions students implement a lexical analyzer and code for the other phases
in order to understand in detail how a compiler works and how it is coded.



Practical 1.

A Lexical Analyzer in C++

This practical presents a program that implements a lexical analyzer in C++.

A compiler is responsible for converting a high-level language into machine language.
There are several phases involved in this, and lexical analysis is the first phase.

A lexical analyzer reads the characters of the source code and groups them into
tokens.

The different kinds of tokens are:

Keywords

Identifiers

Operators

Constants

Separators

Consider the following example.

c = a + b;

After lexical analysis, the following table of tokens is produced.

Token    Type

c        identifier
=        operator
a        identifier
+        operator
b        identifier
;        separator



Practical 1. C++ program implementing a very simple lexical analyzer that
reads source code from a file and prints the tokens it finds.

#include <iostream>
#include <fstream>
#include <cstring>
#include <cctype>

using namespace std;

/* Returns 1 if buffer holds one of the 32 C keywords, 0 otherwise. */
int isKeyword(const char buffer[]) {
    const char keywords[32][10] = {
        "auto", "break", "case", "char", "const", "continue", "default", "do",
        "double", "else", "enum", "extern", "float", "for", "goto", "if",
        "int", "long", "register", "return", "short", "signed", "sizeof", "static",
        "struct", "switch", "typedef", "union", "unsigned", "void", "volatile", "while"
    };

    for (int i = 0; i < 32; ++i) {
        if (strcmp(keywords[i], buffer) == 0)
            return 1;
    }
    return 0;
}

/* Prints the word collected in buffer (if any) as a keyword or an identifier. */
void flushWord(char buffer[], int &j) {
    if (j == 0)
        return;
    buffer[j] = '\0';
    j = 0;

    if (isKeyword(buffer) == 1)
        cout << buffer << " is keyword\n";
    else
        cout << buffer << " is identifier\n";
}

int main() {
    char ch, buffer[15];
    const char operators[] = "+-*/%=";   /* single-character operators */
    int j = 0;

    ifstream fin("program.txt");         /* input file */
    if (!fin.is_open()) {
        cout << "error while opening the file\n";
        return 1;
    }

    /* Loop on the read itself so that the end-of-file marker is never processed. */
    while (fin.get(ch)) {
        if (isalnum((unsigned char)ch)) {
            if (j < 14)                  /* leave room for the terminating '\0' */
                buffer[j++] = ch;
        } else {
            flushWord(buffer, j);        /* any other character ends the current word */
            if (ch != '\0' && strchr(operators, ch) != NULL)
                cout << ch << " is operator\n";
        }
    }
    flushWord(buffer, j);                /* word that ends exactly at end of file */

    fin.close();                         /* close input file */
    return 0;
}

OUTPUT



Practical 2 (Syntax Analysis) Implementation of Parser

PROGRAM TO CHECK WHETHER A STRING BELONGS TO A GRAMMAR OR NOT.

#include <stdio.h>

/* Function declarations */
int A(void);
void disp(void);
void error(void);

char s[20];   /* input string */
int i;        /* index of the next unread character in s */

int main(void)
{
    printf("S -> cAd\n");          /* input grammar */
    printf("A -> ab/a\n");
    printf("Enter the String:\n");
    if (scanf("%19s", s) != 1) {
        error();
        return 1;
    }
    i = 0;
    /* S -> cAd: match 'c', then A, then 'd', then the end of the string */
    if (s[i] == 'c') {
        i++;
        if (A() && s[i] == 'd' && s[i + 1] == '\0')
            disp();
        else
            error();
    } else {
        error();
    }
    return 0;
}

/* A -> ab/a: try the longer alternative "ab" first, then fall back to "a".
   Returns 1 and moves i past the matched text, or returns 0. */
int A(void)
{
    if (s[i] == 'a' && s[i + 1] == 'b') {
        i += 2;
        return 1;
    }
    if (s[i] == 'a') {
        i++;
        return 1;
    }
    return 0;
}

void disp(void)
{
    printf("\nstring is valid\n");
}

void error(void)
{
    printf("\nstring is invalid\n");
}
OUTPUT

Enter the String:

cad (string is valid)

cda (string is invalid)

PRACTICAL- 3

Write a program for recursive descent parsing and check the validity of the string
(ccc).

S -> aBb/ccA
A -> b/c
B -> a/b

#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;

int i = 0;      // index of the next unread character in s
char s[10];     // input string

/* Function declarations */
void S();
void A();
void B();
void error();

int main()
{
    /* Display grammar on console */
    cout << "Given grammar is " << endl;
    cout << "S -> aBb/ccA" << endl;
    cout << "A -> b/c" << endl;
    cout << "B -> a/b" << endl;
    cout << "Enter the string" << endl;
    cin >> setw(10) >> s;          // never read more than s can hold
    S();
    if (s[i] == '\0')              // the whole input must be consumed
        cout << "string is valid" << endl;
    else
        cout << "string is invalid" << endl;
    return 0;
}

// S -> aBb/ccA
void S()
{
    if (s[i] == 'a') {
        i++;
        B();
        if (s[i] == 'b')
            i++;
        else
            error();
    } else if (s[i] == 'c') {
        i++;
        if (s[i] == 'c') {
            i++;
            A();
        } else
            error();
    } else
        error();
}

// A -> b/c
void A()
{
    if (s[i] == 'b' || s[i] == 'c')
        i++;
    else
        error();
}

// B -> a/b
void B()
{
    if (s[i] == 'a' || s[i] == 'b')
        i++;
    else
        error();
}

// Reports the failure and stops, so that no later check can still print "valid".
void error()
{
    cout << "string is invalid" << endl;
    exit(0);
}



PRACTICAL -4

PROGRAM FOR COMPUTATION OF FIRST

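#include <stdio.h>

#define MAXSYM 10   /* maximum number of terminals or non-terminals */
#define MAXLEN 20   /* maximum length of a stored production */

/* Simplified FIRST computation: one production per non-terminal, single-
   character symbols, 'e' for epsilon, and a non-terminal may only refer to
   non-terminals entered before it. An 'e' taken from a skipped non-terminal
   stays in the result, so T -> Rs with R -> e is reported as "e s". */
int main(void)
{
    char t[MAXSYM] = {0}, nt[MAXSYM] = {0};      /* terminals, non-terminals */
    char p[MAXSYM][MAXLEN], first[MAXSYM][MAXLEN];
    int i, j, k, nterm, nont, f;

    printf("\n Enter the number of nonterminals in the grammar");
    if (scanf("%d", &nont) != 1 || nont < 1 || nont > MAXSYM)
        return 1;
    printf("\n Enter the non terminals in the grammar\n");
    for (i = 0; i < nont; i++)
        scanf(" %c", &nt[i]);

    printf("\n enter the number of terminals in the grammar (enter e for epsilon)");
    if (scanf("%d", &nterm) != 1 || nterm < 1 || nterm > MAXSYM)
        return 1;
    printf("\n Enter the terminals in the grammar:\n");
    for (i = 0; i < nterm; i++)
        scanf(" %c", &t[i]);

    for (i = 0; i < nont; i++) {
        p[i][0] = nt[i];
        first[i][0] = nt[i];
        first[i][1] = '$';              /* every FIRST row starts out empty */
    }

    printf("\n enter the production:\n");
    for (i = 0; i < nont; i++) {
        printf("\n enter the production for %c (end the production with '$' sign):", p[i][0]);
        j = 0;
        do {                            /* " %c" skips newlines and blanks */
            j++;
            if (scanf(" %c", &p[i][j]) != 1)
                p[i][j] = '$';
        } while (p[i][j] != '$' && j < MAXLEN - 1);
        p[i][j] = '$';                  /* always terminate the production */
    }

    for (i = 0; i < nont; i++) {
        printf("\nthe production for %c ->", p[i][0]);
        for (j = 1; p[i][j] != '$'; j++)
            printf("%c", p[i][j]);
    }

    for (i = 0; i < nont; i++) {
        f = 0;                          /* becomes 1 once FIRST of row i is settled */
        for (j = 1; p[i][j] != '$' && f == 0; j++) {
            for (k = 0; k < nterm && f == 0; k++) {
                if (p[i][j] == t[k]) {  /* a terminal starts the string */
                    first[i][j] = t[k];
                    first[i][j + 1] = '$';
                    f = 1;
                }
            }
            for (k = 0; k < nont && f == 0; k++) {
                if (p[i][j] == nt[k]) { /* copy FIRST of that non-terminal */
                    first[i][j] = first[k][1];
                    first[i][j + 1] = '$';
                    if (first[i][j] != 'e')
                        f = 1;          /* on epsilon, go on to the next symbol */
                    break;
                }
            }
        }
    }

    for (i = 0; i < nont; i++) {
        printf("\n\n the first of %c ->", first[i][0]);
        for (j = 1; first[i][j] != '$'; j++)
            printf("%c\t", first[i][j]);
    }
    return 0;
}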

OUTPUT

Enter the no. of Non-terminals in the grammar:3

Enter the Non-terminals in the grammar:

ERT

Enter the no. of Terminals in the grammar: ( Enter e for epsilon ) 5

Enter the Terminals in the grammar:

ase*+

Enter the productions :

Enter the production for E ( End the production with '$' sign ) :a+s$

Enter the production for R ( End the production with '$' sign ) :e$

Enter the production for T ( End the production with '$' sign ) :Rs$

The production for E -> a+s

The production for R -> e

The production for T -> Rs

The first of E -> a

The first of R -> e

The first of T -> e s



PRACTICAL-5

PROGRAM TO FIND THE NUMBER OF WHITESPACE AND NEWLINE CHARACTERS

#include <stdio.h>
#include <conio.h>   /* getche() and getch(): compiler-specific console input
                        (named _getche and _getch in current Visual C++) */

int main(void)
{
    char str[200], ch;
    int a = 0, space = 0, newline = 0;

    printf("\n enter a string(press escape to quit entering):");
    ch = getche();

    while ((ch != 27) && (a < 199)) {    /* 27 is the ASCII code of Escape */
        str[a] = ch;
        if (str[a] == ' ')
            space++;
        if (str[a] == 13) {              /* 13 is the carriage return sent by Enter */
            newline++;
            printf("\n");
        }
        a++;
        ch = getche();
    }
    str[a] = '\0';

    printf("\n the number of lines used : %d", newline + 1);
    printf("\n the number of spaces used is : %d", space);
    getch();
    return 0;
}

OUTPUT

enter a string(press escape to quit entering):hello!

how r u?

Do you like prog. in compiler?

the number of lines used : 4

the number of spaces used is : 7



PRACTICAL- 6

PROGRAM TO CHECK WHETHER STRING IS KEYWORD OR NOT

#include <stdio.h>
#include <string.h>
#define found 1
#define notfound 0

int main(void)
{
    int i, flag = notfound, result;
    /* each row holds the longest keyword ("switch", 6 letters) plus '\0' */
    char keywords[6][7] = {"void", "if", "else", "for", "while", "switch"};
    char str[32];

    printf("\n enter the string");
    if (fgets(str, sizeof str, stdin) == NULL)
        return 1;
    str[strcspn(str, "\n")] = '\0';      /* drop the newline kept by fgets */
    printf("\n string is %s", str);

    for (i = 0; i < 6; i++) {
        printf(" \n keyword is %s", keywords[i]);
        result = strcmp(keywords[i], str);
        printf(" \t result is %d", result);
        if (result == 0) {
            flag = found;
            printf("\nflag is %d", flag);
            break;
        }
    }

    if (flag == notfound)
        printf("\n\n string is not a keyword");
    else
        printf("\n\n string is keyword");

    return 0;
}



PRACTICAL -7

PROGRAM FOR IMPLEMENTATION OF SHIFT REDUCE PARSING


#include <iostream>
#include <iomanip>
#include <cstring>
using namespace std;

struct grammar {
    char p[20];      // left-hand side (one non-terminal)
    char prod[20];   // right-hand side
} g[10];

int main()
{
    int np;

    cout << "\nEnter Number of productions:";
    cin >> np;
    if (!cin || np < 1 || np > 10)
        return 1;

    // A production is typed without spaces, for example E->E+E
    char ts[23] = "";
    cout << "\nEnter productions:\n";
    for (int i = 0; i < np; i++) {
        cin >> setw(23) >> ts;
        if (strlen(ts) < 4)
            return 1;
        g[i].p[0] = ts[0];
        g[i].p[1] = '\0';
        strcpy(g[i].prod, &ts[3]);       // skip the "X->" prefix
    }

    char ip[20] = "";
    cout << "\nEnter Input:";
    cin >> setw(20) >> ip;
    int lip = strlen(ip);

    char stack[20] = "";   // parse stack, kept as a NUL-terminated string
    int stpos = 0;         // number of symbols on the stack
    int i = 0;             // index of the next input symbol

    cout << "\n\nStack\tInput\tAction";
    while (i < lip) {
        // shift the next input symbol onto the stack
        stack[stpos++] = ip[i++];
        stack[stpos] = '\0';
        cout << "\n" << stack << "\t" << &ip[i] << "\tShifted";

        // reduce as long as some suffix of the stack equals a right-hand side;
        // k = 0 is tried first, so the longest suffix wins
        bool reduced = true;
        while (reduced) {
            reduced = false;
            for (int k = 0; k < stpos && !reduced; k++) {
                for (int m = 0; m < np && !reduced; m++) {
                    if (strcmp(g[m].prod, g[m].p) != 0 &&      // ignore X->X
                        strcmp(&stack[k], g[m].prod) == 0) {
                        stack[k] = g[m].p[0];   // replace the handle by its left-hand side
                        stack[k + 1] = '\0';
                        stpos = k + 1;
                        reduced = true;
                        cout << "\n" << stack << "\t" << &ip[i] << "\tReduced";
                    }
                }
            }
        }
    }

    // The left-hand side of the first production is taken as the start symbol.
    if (stpos == 1 && stack[0] == g[0].p[0])
        cout << "\n String Accepted";
    else
        cout << "\n String Rejected";

    return 0;
}



PRACTICAL-8

PROGRAM FOR GENERATION OF THREE ADDRESS CODE


#include <stdio.h>
#include <string.h>

void pm(void);
void plus(void);
void muldiv(void);

int i, ch, l, addr = 100;
char expr[20], exp1[20], id1[5], op[5], id2[5];

int main(void)
{
    while (1) {
        printf("\n1.assignment\n2.arithmetic\n3.relational\n4.Exit\nEnter the choice:");
        if (scanf("%d", &ch) != 1)
            return 0;
        switch (ch) {
        case 1:
            printf("\nEnter the expression with assignment operator:");
            scanf("%19s", expr);
            /* split at '=': the left part is the target, the right part the value */
            for (i = 0; expr[i] != '=' && expr[i] != '\0'; i++)
                ;
            if (expr[i] != '=') {
                printf("Expression is error\n");
                break;
            }
            printf("Three address code:\ntemp=%s\n%.*s=temp\n", expr + i + 1, i, expr);
            break;

        case 2:
            printf("\nEnter the expression with arithmetic operator:");
            scanf("%19s", expr);
            l = (int)strlen(expr);
            exp1[0] = '\0';
            /* handles three single-character operands joined by two
               operators, for example a+b-c or a-b/c */
            for (i = 0; i < l; i++) {
                if (strchr("+-*/", expr[i]) == NULL)
                    continue;            /* not an operator yet */
                if (i != 1 || l != 5)
                    printf("Expression is error\n");
                else if (expr[i] == '*' || expr[i] == '/')
                    muldiv();
                else if (expr[i + 2] == '/' || expr[i + 2] == '*')
                    pm();                /* a-b/c: the second operator binds tighter */
                else
                    plus();
                break;
            }
            break;

        case 3:
            printf("Enter the expression with relational operator");
            scanf("%4s%4s%4s", id1, op, id2);
            if (strcmp(op, "<") != 0 && strcmp(op, ">") != 0 && strcmp(op, "<=") != 0 &&
                strcmp(op, ">=") != 0 && strcmp(op, "==") != 0 && strcmp(op, "!=") != 0)
                printf("Expression is error");
            else {
                printf("\n%d\tif %s%s%s goto %d", addr, id1, op, id2, addr + 3);
                addr++;
                printf("\n%d\t T:=0", addr);
                addr++;
                printf("\n%d\t goto %d", addr, addr + 2);
                addr++;
                printf("\n%d\t T:=1", addr);
            }
            break;

        case 4:
            return 0;                    /* leave the menu loop */

        default:
            printf("Invalid choice\n");
            break;
        }
    }
}

/* + or - followed by * or /: evaluate the right-hand pair first */
void pm(void)
{
    printf("Three address code:\ntemp=%s\ntemp1=%c%ctemp\n", expr + i + 1, expr[i - 1], expr[i]);
}

/* * or / comes first: evaluate the left-hand pair first */
void muldiv(void)
{
    strncat(exp1, expr, i + 2);
    printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n", exp1, expr[i + 2], expr[i + 3]);
}

/* two operators of the + or - kind: evaluate from left to right */
void plus(void)
{
    strncat(exp1, expr, i + 2);
    printf("Three address code:\ntemp=%s\ntemp1=temp%c%c\n", exp1, expr[i + 2], expr[i + 3]);
}



OUTPUT

1. assignment
2. arithmetic
3. relational
4. Exit
Enter the choice:1
Enter the expression with assignment operator:
a=b
Three address code:
temp=b
a=temp

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a+b-c
Three address code:
temp=a+b
temp1=temp-c

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a-b/c
Three address code:
temp=b/c
temp1=a-temp

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2
Enter the expression with arithmetic operator:
a*b-c
Three address code:
temp=a*b
temp1=temp-c

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:2



Enter the expression with arithmetic operator:a/b*c
Three address code:
temp=a/b
temp1=temp*c
1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:3
Enter the expression with relational operator
a
<=
b

100 if a<=b goto 103


101 T:=0
102 goto 104
103 T:=1

1.assignment
2.arithmetic
3.relational
4.Exit
Enter the choice:4



PRACTICAL-9

Program for IMPLEMENTING CODE GENERATOR IN C++


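This program implements the code generation stage of a compiler in C++. The input is
read from the file AIP.TXT and the output is written to the file ANOP.TXT. Each
arithmetic input line holds an operator followed by two operands and the result
location. An assignment line holds the destination, the source, and a placeholder as
its fourth field.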

#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;

int main()
{
    char op1[10], op2[10], op3[10], op4[10];
    FILE *ip, *op;

    ip = fopen("aip.txt", "r");
    if (!ip) {
        cout << "error in opening file";
        return 1;
    }
    op = fopen("anop.txt", "w");
    if (!op) {
        cout << "error in opening file";
        fclose(ip);
        return 1;
    }

    /* Each record has four fields. Reading stops at the first line that
       does not supply all four, so the last record is never processed twice. */
    while (fscanf(ip, "%9s%9s%9s%9s", op1, op2, op3, op4) == 4) {
        if (strcmp(op1, "+") == 0) {
            fprintf(op, "MOV AX, %s\n", op2);
            fprintf(op, "MOV BX, %s\n", op3);
            fprintf(op, "ADD AX, BX\n");
            fprintf(op, "MOV %s, AX\n", op4);
        }

        if (strcmp(op1, "-") == 0) {
            fprintf(op, "MOV AX, %s\n", op2);
            fprintf(op, "MOV BX, %s\n", op3);
            fprintf(op, "SUB AX, BX\n");
            fprintf(op, "MOV %s, AX\n", op4);
        }

        if (strcmp(op1, "*") == 0) {
            fprintf(op, "MOV AX, %s\n", op2);
            fprintf(op, "MOV BX, %s\n", op3);
            fprintf(op, "MUL AX, BX\n");
            fprintf(op, "MOV %s, AX\n", op4);
        }

        if (strcmp(op1, "/") == 0) {
            fprintf(op, "MOV AX, %s\n", op2);
            fprintf(op, "MOV BX, %s\n", op3);
            fprintf(op, "DIV AX, BX\n");
            fprintf(op, "MOV %s, AX\n", op4);
        }

        if (strcmp(op1, "=") == 0) {
            fprintf(op, "MOV %s, %s\n", op2, op3);
        }
    }

    fclose(ip);
    fclose(op);
    cout << "\n Code generation successful";
    return 0;
}

Input (AIP.TXT)

+ o1 o2 o3

- o3 o1 o3

* o1 o2 o3

/ o3 o1 o3

= o5 o6 %

OUTPUT (ANOP.TXT)

MOV AX, o1
MOV BX, o2
ADD AX, BX
MOV o3, AX
MOV AX, o3
MOV BX, o1
SUB AX, BX
MOV o3, AX
MOV AX, o1
MOV BX, o2
MUL AX, BX
MOV o3, AX
MOV AX, o3
MOV BX, o1
DIV AX, BX
MOV o3, AX
MOV o5, o6



PRACTICAL 10. Write a C++ program to perform string matching.

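#include <iostream>
#include <string>
using namespace std;

int main()
{
    string org, dup;
    int i = 1;                       // running number of the match

    cout << "Enter Original String:";
    getline(cin, org);
    cout << "Enter Pattern String:";
    getline(cin, dup);

    // find() returns string::npos when there is no further match; each
    // search resumes one position after the previous match. The comparison
    // is case-sensitive, so "All" does not match the pattern "all".
    size_t pos = org.find(dup);
    while (pos != string::npos) {
        cout << "\nInstance:" << i << "\tPosition:" << pos << "\t";
        i++;
        pos = org.find(dup, pos + 1);
    }
    return 0;
}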

Output

Enter Original String: All men went to the appall mall


Enter Pattern String: all

Instance:1 Position:23
Instance:2 Position:28



Practical 11. Write a C++ program to check the validation of Email address.

#include <iostream>
#include <string>

using namespace std;

int main()
{
    string input;
    cout << "Enter your email address\n";
    getline(cin, input);

    // the address must contain an '@' ...
    size_t at = input.find('@');
    if (at == string::npos)
    {
        cout << "Missing @ symbol\n";
        return 1;
    }

    // ... and a '.' somewhere after the '@'
    size_t dot = input.find('.', at + 1);
    if (dot == string::npos)
    {
        cout << "Missing . symbol after @\n";
        return 2;
    }

    cout << "Email accepted.\n";
    return 0;
}

Output

Enter your email address
jazanu@gmail.com
Email accepted.

Enter your email address
jazanu gmail.com
Missing @ symbol

Project: Develop a predictive parser.



FAQ

1. What are the Lex and YACC tools?

2. What is the difference between the Lex and YACC tools?

3. Construct an NFA for the regular expression ab*.

4. Are NFA and DFA equivalent?

5. What is the role of the lexical analyzer?

6. What is a token?

7. What is an identifier?

8. State the number of tokens in the following expression:

a = b + c - d * e

9. What is a CFG?

10. State the three rules for computing FIRST.

11. State the three rules for computing FOLLOW.

12. State the errors detected in each phase of the compiler.

REFERENCES

1) Principles of Compiler Design, by Alfred V. Aho and Jeffrey D. Ullman, Narosa Publication.

2) Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman.
