
COLLEGE OF ENGINEERING, PERUMON

KOLLAM- 691601

CS431 COMPILER DESIGN LAB MANUAL


COLLEGE OF ENGINEERING, PERUMON

VISION

An institution of global stature to excel in technical education, research
and development for moulding engineers to lead competitive
professional environment.

MISSION
• To mould quality engineers by providing them with fundamental
knowledge, analytical skills, creativity, innovation, integrity and ethics
to suit the needs of the society.

• To prepare engineers globally competent in technical and leadership
skills to solve increasingly challenging technological problems for
the betterment of the community.
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

VISION

To emerge as a department of global stature in the field of
computer science education, research and development.

MISSION

• To equip graduates in the field of computer science with competent
technical and analytical skills, innovative research capabilities
and leadership potential.
• To instill graduates with integrity, discipline and ethics to work
with commitment for the progress of the society.
PROGRAMME EDUCATIONAL OBJECTIVES (PEOs)
UG Programme in Computer Science and Engineering

1. Graduates shall have strong foundation in fundamental
principles of Computer Science discipline that will prepare
them to pursue higher education.

2. Graduates shall be able to manage and design projects in
multidisciplinary environment by effective team work and
leadership skills.

3. Graduates shall promote life-long learning and ethical
responsibilities in profession for the betterment of the
society.
PROGRAMME OUTCOMES (POs)
UG Programme in Computer Science and Engineering

1. Engineering knowledge: Apply the knowledge of mathematics, science,
engineering fundamentals, and an engineering specialization to the solution of
complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first
principles of mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering
problems and design system components or processes that meet the specified
needs with appropriate consideration for the public health and safety, and the
cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge
and research methods including design of experiments, analysis and
interpretation of data, and synthesis of the information to provide valid
conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources,
and modern engineering and IT tools including prediction and modeling to
complex engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual
knowledge to assess societal, health, safety, legal and cultural issues and the
consequent responsibilities relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate
the knowledge of, and need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and
responsibilities and norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a
member or leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities
with the engineering community and with society at large, such as, being able to
comprehend and write effective reports and design documentation, make
effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding
of the engineering and management principles and apply these to one's own
work, as a member and leader in a team, to manage projects and in
multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability
to engage in independent and life-long learning in the broadest context of
technological change.

PROGRAMME SPECIFIC OUTCOMES (PSOs)


UG Programme in Computer Science and Engineering

1. Ability to apply the theoretical foundations and fundamentals of
computer science in modelling and developing software and
hardware solutions to real-world problems.

2. Ability to use computational, algorithmic and programming skills
along with modern software engineering tools for the design,
development and management of software systems.

3. Ability to use knowledge in computer networks, data processing,
machine intelligence and cyber security to develop secure and
intelligent software systems.

COURSE OUTCOMES AND CO-PO,PSO MAPPING


Course outcomes:
CS431.1 Illustrate the techniques of Lexical Analysis and Syntax Analysis.

CS431.2 Apply the knowledge of Lex tools to develop programs.

CS431.3 Apply the concepts of finite automata for various automata conversion and
minimisation.

CS431.4 Apply the concepts of various types of parsers for a given language.

CS431.5 Generate intermediate code.

CS431.6 Apply various code optimization techniques prior to machine level code generation.

CO-PO,PSO Mapping:
COs PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CS431.1 3 1 3 1 3
CS431.2 3 1 3 2 3
CS431.3 3 1 3 1 3
CS431.4 3 1 3 1 3
CS431.5 3 1 3 1 3
CS431.6 3 1 3 1 3
CO 3 1 3 2 3

EXPERIMENT 1
Design and implement a lexical analyser using C language. The lexical analyser
should ignore spaces, tabs and new lines. From the given input file, the program
should identify the numbers, keywords, identifiers and special characters, and
count each of them. The program should also count the total number of lines in
the input file.
Sample input file : input.txt
int main()
{
int a = 20, b = 30;
char ch;
float f;
}

Sample output:
Numbers in the program are: 20 30
int is a keyword
main is an identifier
int is a keyword
a is an identifier
b is an identifier
char is a keyword
ch is an identifier
float is a keyword
f is an identifier
Special characters are ( ) { = , = ; ; ; }
No.of numbers : 2
No.of keywords: 4
No.of identifiers: 5
No.of special characters: 10
Total no.of lines: 6

EXPERIMENT 2
Write a C program to convert NFA to DFA.
Algorithm
An NFA can have zero, one or more than one move from a given state on a given input symbol. An
NFA can also have NULL moves (moves without input symbol). On the other hand, DFA has one and
only one move from a given state on a given input symbol.
Conversion from NFA to DFA
Suppose there is an NFA N < Q, ∑, q0, δ, F > which recognizes a language L. Then the DFA D < Q’, ∑,
q0, δ’, F’ > can be constructed for language L as:
Step 1: Initially Q’ = ɸ.
Step 2: Add q0 to Q’.
Step 3: For each state in Q’, find the possible set of states for each input symbol using the transition
function of the NFA. If this set of states is not in Q’, add it to Q’.
Step 4: The final states of the DFA are all the states in Q’ which contain at least one final state of the NFA (a member of F).

Sample Code
#include<stdio.h>
#include<stdlib.h>
struct node
{
int st;
struct node *link;
};
struct node1
{

int nst[20];
};

void insert(int ,char, int);


int findalpha(char);
void findfinalstate(void);
int insertdfastate(struct node1);
int compare(struct node1,struct node1);
void printnewstate(struct node1);
static int set[20],nostate,noalpha,s,notransition,nofinal,start,finalstate[20],c,r,buffer[20];
int complete=-1;
char alphabet[20];
static int eclosure[20][20]={0};
struct node1 hash[20];
struct node * transition[20][20]={NULL};
int main(void)
{
int i,j,k,m,t,n,l;
struct node *temp;
struct node1 newstate={0},tmpstate={0};

printf("\nEnter No of alphabets and alphabets?\n");


scanf("%d",&noalpha);
getchar();
for(i=0;i<noalpha;i++)
{

alphabet[i]=getchar();
getchar();
}
printf("Enter the number of states?\n");
scanf("%d",&nostate);
printf("Enter the start state?\n");
scanf("%d",&start);
printf("Enter the number of final states?\n");
scanf("%d",&nofinal);
printf("Enter the final states?\n");
for(i=0;i<nofinal;i++)
scanf("%d",&finalstate[i]);
printf("Enter no of transition?\n");
scanf("%d",&notransition);
printf("NOTE:- [Transition is in the form -> qno alphabet qno]\n");
printf("NOTE:- [States number must be greater than zero]\n");
printf("\nEnter transition?\n");

for(i=0;i<notransition;i++)
{
char sym; /* read the symbol into a char: %c must not be stored into the int counter c */
scanf("%d %c%d",&r,&sym,&s);
insert(r,sym,s);
}
for(i=0;i<20;i++)
{
for(j=0;j<20;j++)
hash[i].nst[j]=0;
}
complete=-1;
i=-1;
printf("\nEquivalent DFA.....\n");
printf(".......................\n");

printf("Transitions of DFA\n");

newstate.nst[start]=start;
insertdfastate(newstate);
while(i!=complete)
{
i++;
newstate=hash[i];
for(k=0;k<noalpha;k++)
{
c=0;
for(j=1;j<=nostate;j++)
set[j]=0;
for(j=1;j<=nostate;j++)
{
l=newstate.nst[j];
if(l!=0)
{
temp=transition[l][k];
while(temp!=NULL)
{
if(set[temp->st]==0)
{
c++;
set[temp->st]=temp->st;
}
temp=temp->link;

}
}
}
printf("\n");
if(c!=0)
{
for(m=1;m<=nostate;m++)
tmpstate.nst[m]=set[m];

insertdfastate(tmpstate);

printnewstate(newstate);
printf("%c\t",alphabet[k]);
printnewstate(tmpstate);
printf("\n");
}
else
{
printnewstate(newstate);
printf("%c\t", alphabet[k]);
printf("NULL\n");
}

}
}
printf("\nStates of DFA:\n");
for(i=0;i<=complete;i++)
printnewstate(hash[i]);
printf("\n Alphabets:\n");
for(i=0;i<noalpha;i++)
printf("%c\t",alphabet[i]);
printf("\n Start State:\n");
printf("q%d",start);
printf("\nFinal states:\n");
findfinalstate();

}
int insertdfastate(struct node1 newstate)
{
int i;
for(i=0;i<=complete;i++)
{
if(compare(hash[i],newstate))
return 0;
}
complete++;
hash[complete]=newstate;
return 1;
}
int compare(struct node1 a,struct node1 b)
{
int i;

for(i=1;i<=nostate;i++)
{
if(a.nst[i]!=b.nst[i])
return 0;
}
return 1;
}

void insert(int r,char c,int s)


{
int j;
struct node *temp;
j=findalpha(c);
if(j==999)
{
printf("error\n");
exit(0);
}
temp=(struct node *) malloc(sizeof(struct node));
temp->st=s;
temp->link=transition[r][j];
transition[r][j]=temp;
}

int findalpha(char c)
{
int i;
for(i=0;i<noalpha;i++)
if(alphabet[i]==c)
return i;
return(999);
}
void findfinalstate()
{
int i,j,k,t;

for(i=0;i<=complete;i++)
{
for(j=1;j<=nostate;j++)
{
for(k=0;k<nofinal;k++)
{
if(hash[i].nst[j]==finalstate[k])
{
printnewstate(hash[i]);
printf("\t");
j=nostate;
break;
}
}
}
}
}
void printnewstate(struct node1 state)
{
int j;
printf("{");
for(j=1;j<=nostate;j++)
{
if(state.nst[j]!=0)
printf("q%d,",state.nst[j]);
}
printf("}\t");

}
Sample input and output – SET 1
Enter no.of alphabets and alphabets: 2 a b
Enter no.of states: 4
Enter start state: 1
Enter no.of final states: 2
Enter the final states: 3 4
Enter the no.of transitions: 8
Enter transitions:
1 a 1
1 b 1
1 a 2
2 b 2
2 a 3
3 a 4
3 b 4
4 b 3

Transitions of DFA:
{q1} a {q1,q2}
{q1} b {q1}
{q1,q2} a {q1,q2,q3}
{q1,q2} b {q1,q2}
{q1,q2,q3} a {q1,q2,q3,q4}
{q1,q2,q3} b {q1,q2,q4}
{q1,q2,q3,q4} a {q1,q2,q3,q4}
{q1,q2,q3,q4} b {q1,q2,q3,q4}
{q1,q2,q4} a {q1,q2,q3}
{q1,q2,q4} b {q1,q2,q3}
States of DFA:
{q1}, {q1,q2}, {q1,q2,q3}, {q1,q2,q4}, {q1,q2,q3,q4}
Alphabets: a b
Final states: {q1,q2,q3}, {q1,q2,q3,q4}, {q1,q2,q4}

Sample input and output – SET 2


Enter no.of alphabets and alphabets: 2 a b
Enter no.of states: 2
Enter start state: 1
Enter no.of final states: 1
Enter the final states: 1
Enter the no.of transitions: 6
Enter transitions:
1 a 1
1 a 2
1 b 2
2 a 2
2 b 1
2 b 2
Transitions of DFA:
{q1} a {q1,q2}
{q1} b {q2}
{q2} a {q2}
{q2} b {q1,q2}
{q1,q2} a {q1,q2}
{q1,q2} b {q1,q2}
States of DFA:
{q1}, {q2}, {q1,q2}
Alphabets: a b
Final states:
{q1}, {q1,q2}
EXPERIMENT 3
Write a C program to find ε – closure of all states of any given NFA with ε transition.
Algorithm

1. Start
2. Read NFA with epsilon moves
3. Repeat for all states
   3.1 Clear the visited set, set state as visited and Push (state)
   3.2 While stack not empty
       3.2.1 u = pop()
       3.2.2 For each epsilon move from u to v
             a) If v is not visited
                i. Set v as visited
                ii. Push (v)
   3.3 Epsilon closure (state) = set of visited states
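
The sample code below uses recursion, so as a complement here is a minimal sketch of the explicit-stack version described in the steps above. It is written under a few assumptions that are not in the manual: states are numbered from 0, the epsilon moves are held in an adjacency matrix eps, and the names MAX_STATES and epsilon_closure are illustrative.

```c
#define MAX_STATES 20

/* eps[u][v] != 0 means there is an epsilon move from state u to state v.
   Writes the closure of `state` into closure[] and returns its size. */
static int epsilon_closure(int eps[MAX_STATES][MAX_STATES], int nstates,
                           int state, int closure[MAX_STATES])
{
    int visited[MAX_STATES] = {0};
    int stack[MAX_STATES];          /* each state is pushed at most once */
    int top = 0, count = 0, u, v;

    visited[state] = 1;             /* 3.1: the state belongs to its own closure */
    stack[top++] = state;
    while (top > 0) {               /* 3.2 */
        u = stack[--top];           /* 3.2.1 */
        closure[count++] = u;
        for (v = 0; v < nstates; v++) {        /* 3.2.2 */
            if (eps[u][v] && !visited[v]) {
                visited[v] = 1;
                stack[top++] = v;
            }
        }
    }
    return count;                   /* 3.3: closure[] holds the visited states */
}
```

Marking a state as visited at the moment it is pushed keeps the stack bounded by the number of states, even when the epsilon moves form a cycle.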
Sample Code
#include<stdio.h>
#include<stdlib.h>
struct node
{
int st;
struct node *link;
};

void findclosure(int,int);
void insert_trantbl(int ,char, int);
int findalpha(char);
void print_e_closure(int);

static int set[20],nostate,noalpha,s,notransition,c,r,buffer[20];


char alphabet[20];
static int e_closure[20][20]={0};
struct node * transition[20][20]={NULL};

int main(void)
{
int i,j,k,m,t,n;
struct node *temp;
printf("Enter the number of alphabets?\n");
scanf("%d",&noalpha);
getchar();
printf("NOTE:- [ use letter e as epsilon]\n");
printf("NOTE:- [e must be last character ,if it is present]\n");
printf("\nEnter alphabets?\n");
for(i=0;i<noalpha;i++)
{
alphabet[i]=getchar();
getchar();
}
printf("\nEnter the number of states?\n");
scanf("%d",&nostate);
printf("\nEnter no of transition?\n");
scanf("%d",&notransition);
printf("NOTE:- [Transition is in the form -> qno alphabet qno]\n");
printf("NOTE:- [States number must be greater than zero]\n");
printf("\nEnter transition?\n");
for(i=0;i<notransition;i++)
{
char sym; /* read the symbol into a char: %c must not be stored into the int counter c */
scanf("%d %c%d",&r,&sym,&s);
insert_trantbl(r,sym,s);
}
printf("\n");
printf("e-closure of states...\n");
printf("-----------------------------\n");
for(i=1;i<=nostate;i++)
{
c=0;
for(j=0;j<20;j++)
{
buffer[j]=0;
e_closure[i][j]=0;
}
findclosure(i,i);
printf("\ne-closure(q%d): ",i);
print_e_closure(i);
}
}

void findclosure(int x,int sta)


{
struct node *temp;
int i;
if(buffer[x])
return;
e_closure[sta][c++]=x;
buffer[x]=1;
if(alphabet[noalpha-1]=='e' && transition[x][noalpha-1]!=NULL)
{
temp=transition[x][noalpha-1];
while(temp!=NULL)
{
findclosure(temp->st,sta);
temp=temp->link;
}
}
}

void insert_trantbl(int r,char c,int s)


{
int j;
struct node *temp;
j=findalpha(c);
if(j==999)
{
printf("error\n");
exit(0);
}
temp=(struct node *)malloc(sizeof(struct node));
temp->st=s;
temp->link=transition[r][j];
transition[r][j]=temp;
}

int findalpha(char c)
{
int i;
for(i=0;i<noalpha;i++)
if(alphabet[i]==c)
return i;
return(999);
}
void print_e_closure(int i)
{
int j;
printf("{");
for(j=0;e_closure[i][j]!=0;j++)
printf("q%d,",e_closure[i][j]);
printf("}");
}

Sample input and output – SET 1


Enter no.of alphabets: 3
Enter alphabets: 0 1 e
Enter no.of states: 3
Enter the no.of transitions: 5
Enter transitions:
1 0 1
1 e 2
2 1 2
2 e 3
3 0 3

e-closure of states:
e-closure {q1}: {q1,q2,q3}
e-closure {q2}: {q2,q3}
e-closure {q3}: {q3}

Sample input and output – SET 2


Enter no.of alphabets: 3
Enter alphabets: 0 1 e
Enter no.of states: 3
Enter the no.of transitions: 3
Enter transitions:
1 0 2
2 e 3
3 1 3

e-closure of states:
e-closure {q1}: {q1}
e-closure {q2}: {q2,q3}
e-closure {q3}: {q3}

EXPERIMENT 4
Write a C program to convert NFA with ε transition to NFA without ε transition.

Algorithm
1. Start
2. Find epsilon closure of each state
3. For each state q in NFA
   a. For each alphabet a in transition except epsilon transition
      I) Find all states reachable on a from the states in epsilon closure (q)
      II) Find the union of the states obtained in above step
      III) Find epsilon closure of all the states in the union list
4. Display the transition table and final states for NFA obtained
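
Step 3 can be sketched compactly if each set of states is stored as a bitmask. The fragment below is an illustration under assumptions that are not part of the manual: states are numbered from 0 so that state q is bit q, closure[q] already holds the epsilon closure of q, delta_a[p] holds the states reachable from p on one fixed symbol a, and the names eps_free_move and eps_free_is_final are hypothetical.

```c
#define MAX_STATES 16   /* one bit per NFA state */

/* New transition of state q on symbol a:
   epsilon-closure( move( epsilon-closure(q), a ) ). */
static unsigned eps_free_move(const unsigned closure[MAX_STATES],
                              const unsigned delta_a[MAX_STATES],
                              int nstates, int q)
{
    unsigned moved = 0, result = 0;
    int p;
    for (p = 0; p < nstates; p++)        /* steps I and II: union of the moves */
        if (closure[q] & (1u << p))
            moved |= delta_a[p];
    for (p = 0; p < nstates; p++)        /* step III: close the union */
        if (moved & (1u << p))
            result |= closure[p];
    return result;
}

/* A state is final in the new NFA when its epsilon closure
   contains at least one final state of the original NFA. */
static int eps_free_is_final(const unsigned closure[MAX_STATES], int q,
                             unsigned nfa_finals)
{
    return (closure[q] & nfa_finals) != 0;
}
```

The function is called once per state and per non-epsilon symbol, passing the row of the transition table for that symbol.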
Sample input and output
Enter no.of alphabets: 3
Enter the alphabets: a b e
Enter no.of states: 3
Enter start state: 1
Enter no.of final states: 1
Enter the final states: 3
Enter the no.of transitions: 5
Enter transitions:
1 a 1
1 e 2
2 b 2
2 e 3
3 a 3

Equivalent NFA without e:


Start state: {q1}
Alphabets: a b
States: {q1}, {q2}, {q3}
Transitions are:
{q1} a {q1, q2, q3}
{q1} b {q2, q3}
{q2} a {q3}
{q2} b {q2,q3}
{q3} b {}
{q3} a {q3}

Final states: {q1}, {q2}, {q3}

Sample Code

#include<stdio.h>
#include<stdlib.h>
struct node
{
int st;
struct node *link;
};
void findclosure(int,int);
void insert_trantbl(int ,char, int);
int findalpha(char);
void findfinalstate(void);
void unionclosure(int);
void Print_S_state(int);
void print_e_closure(int);
static int set[20],nostate,noalpha,s,notransition,nofinal,start,finalstate[20],c,r,buffer[20];
char alphabet[20];
static int e_closure[20][20]={0};
struct node * transition[20][20]={NULL};
int main(void)
{
int i,j,k,m,t,n,s;

struct node *temp;


printf("enter the number of alphabets?\n");
scanf("%d",&noalpha);
getchar();

printf("\nEnter alphabets?\n");
for(i=0;i<noalpha;i++)
{
alphabet[i]=getchar();
getchar();
}
printf("Enter the number of states?\n");
scanf("%d",&nostate);
printf("Enter the start state?\n");
scanf("%d",&start);
printf("Enter the number of final states?\n");
scanf("%d",&nofinal);
printf("Enter the final states?\n");
for(i=0;i<nofinal;i++)
scanf("%d",&finalstate[i]);
printf("Enter no of transition?\n");
scanf("%d",&notransition);
printf("\nEnter transition?\n");
for(i=0;i<notransition;i++)
{
char sym; /* read the symbol into a char: %c must not be stored into the int counter c */
scanf("%d %c%d",&r,&sym,&s);
insert_trantbl(r,sym,s);
}
printf("\n");
for(i=1;i<=nostate;i++)
{
c=0;
for(j=0;j<20;j++)
{
buffer[j]=0;
e_closure[i][j]=0;
}
findclosure(i,i);
}
printf("Equivalent NFA without epsilon\n");
printf("-----------------------------------\n");
printf("start state:");
Print_S_state(start);
printf("\nAlphabets:");
for(i=0;i<noalpha-1;i++)
printf("%c ",alphabet[i]);
printf("\n States :" );
for(s=1;s<=nostate;s++)
{
printf("{");
printf("q%d",s);
printf("},");
}

printf("\nTransitions are...:\n");

for(i=1;i<=nostate;i++)
{
for(j=0;j<noalpha-1;j++)
{
for(m=1;m<=nostate;m++)
set[m]=0;
for(k=0;e_closure[i][k]!=0;k++)
{
t=e_closure[i][k];
temp=transition[t][j];
while(temp!=NULL)
{
unionclosure(temp->st);
temp=temp->link;
}
}
printf("\n");
printf("{");
printf("q%d",i);
printf("}");
printf("\t");
printf("%c\t",alphabet[j] );
printf("{");
for(n=1;n<=nostate;n++)
{
if(set[n]!=0)
printf("q%d,",n);
}
printf("}");
}
}
printf("\n Final states:");
findfinalstate();
}

void findclosure(int x,int sta)


{
struct node *temp;
int i;
if(buffer[x])
return;
e_closure[sta][c++]=x;
buffer[x]=1;
if(alphabet[noalpha-1]=='e' && transition[x][noalpha-1]!=NULL)
{
temp=transition[x][noalpha-1];
while(temp!=NULL)
{
findclosure(temp->st,sta);
temp=temp->link;
}
}
}

void insert_trantbl(int r,char c,int s)


{
int j;
struct node *temp;
j=findalpha(c);
if(j==999)
{
printf("error\n");
exit(0);
}
temp=(struct node *) malloc(sizeof(struct node));
temp->st=s;
temp->link=transition[r][j];
transition[r][j]=temp;
}

int findalpha(char c)
{
int i;
for(i=0;i<noalpha;i++)
if(alphabet[i]==c)
return i;

return(999);
}

void unionclosure(int i)
{
int j=0,k;
while(e_closure[i][j]!=0)
{
k=e_closure[i][j];
set[k]=1;
j++;
}
}
void findfinalstate()
{
int i,j,k,t;
for(i=0;i<nofinal;i++)
{
for(j=1;j<=nostate;j++) /* start at q1 so that the closure of every state is checked */
{
for(k=0;e_closure[j][k]!=0;k++)
{
if(e_closure[j][k]==finalstate[i])
{
printf("{");
printf("q%d",j);
printf("},");
}
}
}
}
}

void print_e_closure(int i)
{
int j;
printf("{");
for(j=0;e_closure[i][j]!=0;j++)
printf("q%d,",e_closure[i][j]);
printf("}\t");
}

void Print_S_state(int i)
{
printf("{");
printf("q%d",i); /* print the state passed in rather than the global */
printf("}\t");
}

EXPERIMENT 5
Write a C program to convert NFA with ε transition to DFA.

Algorithm
Step 1 : Take ε closure for the beginning state of NFA as beginning state of DFA.
Step 2 : Find the states that can be traversed from the present state for each input symbol
(union of transition value and their closures for each state of NFA present in the
current state of DFA).
Step 3 : If any new state is found, take it as current state and repeat step 2.
Step 4 : Repeat Step 2 and Step 3 until no new state is present in the DFA transition table.
Step 5 : Mark the states of DFA which contain a final state of NFA as final states of
DFA.
Note: The functions used for the previous assignments can be used as reference.
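
Steps 1 to 4 amount to a worklist loop over sets of NFA states. The following is a minimal sketch of that loop, assuming (these are not in the manual) that states are numbered from 0 and stored as bits of an unsigned mask, that closure[q] already holds the epsilon closure of q, and that delta[q][a] holds the states reachable from q on the non-epsilon symbol with index a. The names closure_of and eps_nfa_to_dfa and the three size limits are illustrative.

```c
#define MAX_NFA 16   /* NFA states 0..15, one bit per state */
#define MAX_SYM 4    /* input symbols, epsilon excluded */
#define MAX_DFA 64   /* capacity of the DFA state table */

/* Union of the epsilon closures of every NFA state in the bitmask `set`. */
static unsigned closure_of(const unsigned closure[MAX_NFA], int nstates, unsigned set)
{
    unsigned out = 0;
    int q;
    for (q = 0; q < nstates; q++)
        if (set & (1u << q))
            out |= closure[q];
    return out;
}

/* Fills dfa[] with the discovered subsets and next[][] with the DFA
   transitions (as indexes into dfa[]). Returns the number of DFA states,
   or -1 if more than MAX_DFA subsets would be needed. */
static int eps_nfa_to_dfa(unsigned delta[MAX_NFA][MAX_SYM],
                          const unsigned closure[MAX_NFA],
                          int nstates, int nsyms, int start,
                          unsigned dfa[MAX_DFA], int next[MAX_DFA][MAX_SYM])
{
    int count = 0, cur, a, q, k;
    dfa[count++] = closure[start];                 /* Step 1 */
    for (cur = 0; cur < count; cur++) {            /* Steps 3 and 4 */
        for (a = 0; a < nsyms; a++) {
            unsigned moved = 0, target;
            for (q = 0; q < nstates; q++)          /* Step 2: union of the moves */
                if (dfa[cur] & (1u << q))
                    moved |= delta[q][a];
            target = closure_of(closure, nstates, moved);
            for (k = 0; k < count; k++)            /* is this subset already known? */
                if (dfa[k] == target)
                    break;
            if (k == count) {
                if (count == MAX_DFA)
                    return -1;
                dfa[count++] = target;
            }
            next[cur][a] = k;
        }
    }
    return count;
}
```

In this sketch an empty target set is stored like any other subset and acts as a dead state. Step 5 then reduces to testing whether a subset mask shares a bit with the mask of NFA final states.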

Sample input and output

Enter no.of alphabets: 4


Enter the alphabets: a b c e
Enter no.of states: 3
Enter start state: 1
Enter no.of final states: 1
Enter the final states: 3
Enter the no.of transitions: 9
Enter transitions:
1 a 1
1 b 2
1 c 3
2 a 2
2 b 3
2 e 1
3 a 3
3 c 1
3 e 2

Equivalent DFA:
Start state: 1
Alphabets: a b c
States: {q1},{q1,q2},{q1,q2,q3}

Transitions are:
{q1} a {q1}
{q1} b {q1,q2}
{q1} c {q1,q2,q3}
{q1,q2} a {q1,q2}
{q1,q2} b {q1,q2,q3}
{q1,q2} c {q1,q2,q3}
{q1,q2,q3} a {q1,q2,q3}
{q1,q2,q3} b {q1,q2,q3}
{q1,q2,q3} c {q1,q2,q3}

Final states: {q1,q2,q3}

Sample Code
#include<stdio.h>
#include<stdlib.h>
struct node
{
int st;
struct node *link;
};
void findclosure(int,int);
void insert_trantbl(int ,char, int);
int findalpha(char);
void findfinalstate(void);
void unionclosure(int);
void print_e_closure(int);
static int set[20],nostate,noalpha,s,notransition,nofinal,start,finalstate[20],c,r,buffer[20];
char alphabet[20];
static int e_closure[20][20]={0};
struct node * transition[20][20]={NULL};
int main(void)
{
int i,j,k,m,t,n;
struct node *temp;
printf("enter the number of alphabets?\n");
scanf("%d",&noalpha);
getchar();
printf("\nEnter alphabets?\n");
for(i=0;i<noalpha;i++)
{
alphabet[i]=getchar();
getchar();
}
printf("Enter the number of states?\n");
scanf("%d",&nostate);
printf("Enter the start state?\n");
scanf("%d",&start);
printf("Enter the number of final states?\n");
scanf("%d",&nofinal);
printf("Enter the final states?\n");
for(i=0;i<nofinal;i++)
scanf("%d",&finalstate[i]);
printf("Enter no of transition?\n");
scanf("%d",&notransition);
printf("\nEnter transition?\n");
for(i=0;i<notransition;i++)
{
char sym; /* read the symbol into a char: %c must not be stored into the int counter c */
scanf("%d %c%d",&r,&sym,&s);
insert_trantbl(r,sym,s);
}
printf("\n");
for(i=1;i<=nostate;i++)
{
c=0;
for(j=0;j<20;j++)
{
buffer[j]=0;
e_closure[i][j]=0;
}
findclosure(i,i);
}
printf("Equivalent DFA\n");
printf("-----------------------------------\n");
printf("start state:");
printf("%d",start);
printf("\nAlphabets:");
for(i=0;i<noalpha-1;i++)
printf("%c ",alphabet[i]);
printf("\n States:" );
for(i=1;i<=nostate;i++)
print_e_closure(i);
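/* Note: every DFA state listed by this program is the e-closure of a single
NFA state; subsets that are not the closure of one state are not generated. */
printf("\nTransitions are...:\n");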
for(i=1;i<=nostate;i++)
{
for(j=0;j<noalpha-1;j++)
{
for(m=1;m<=nostate;m++)
set[m]=0;
for(k=0;e_closure[i][k]!=0;k++)
{
t=e_closure[i][k];
temp=transition[t][j];
while(temp!=NULL)
{
unionclosure(temp->st);
temp=temp->link;
}
}
printf("\n");
print_e_closure(i);
printf("%c\t",alphabet[j]);
printf("{");
for(n=1;n<=nostate;n++)
{
if(set[n]!=0)
printf("q%d,",n);
}
printf("}");
}
}
printf("\n Final states:");
findfinalstate();
}
void findclosure(int x,int sta)
{
struct node *temp;
int i;
if(buffer[x])
return;
e_closure[sta][c++]=x;
buffer[x]=1;
if(alphabet[noalpha-1]=='e' && transition[x][noalpha-1]!=NULL)
{
temp=transition[x][noalpha-1];
while(temp!=NULL)
{
findclosure(temp->st,sta);
temp=temp->link;
}
}
}
void insert_trantbl(int r,char c,int s)
{
int j;
struct node *temp;
j=findalpha(c);
if(j==999)
{
printf("error\n");
exit(0);
}
temp=(struct node *) malloc(sizeof(struct node));
temp->st=s;
temp->link=transition[r][j];
transition[r][j]=temp;
}
int findalpha(char c)
{
int i;
for(i=0;i<noalpha;i++)
if(alphabet[i]==c)
return i;
return(999);
}
void unionclosure(int i)
{
int j=0,k;
while(e_closure[i][j]!=0)
{
k=e_closure[i][j];
set[k]=1;
j++;
}
}
void findfinalstate()
{
int i,j,k,t;
for(i=0;i<nofinal;i++)
{
for(j=1;j<=nostate;j++)
{
for(k=0;e_closure[j][k]!=0;k++)
{
if(e_closure[j][k]==finalstate[i])
{
print_e_closure(j);
}
}
}
}
}

void print_e_closure(int i)
{
int j,k;
printf("{");
for(j=0;e_closure[i][j]!=0;j++)
printf("q%d,",e_closure[i][j]);
printf("}\t");
}
EXPERIMENT 6
Write a C program to develop an operator precedence parser
Theory:
A parser that reads and understands an operator precedence grammar is called
an operator precedence parser.
In operator precedence parsing,
• Firstly, we define precedence relations between every pair of terminal symbols.
• Secondly, we construct an operator precedence table.

The precedence relations are defined using the following rules:

Rule 1:
• If precedence of b is higher than precedence of a, then we define a < b
• If precedence of b is same as precedence of a, then we define a = b
• If precedence of b is lower than precedence of a, then we define a > b
Rule 2:
• An identifier is always given higher precedence than any other symbol.
• $ symbol is always given the lowest precedence.
Rule 3:
• If two operators have the same precedence, then we go by checking their associativity.

A given input string is parsed using the following steps:

Step 1:
Insert the following:
• $ symbol at the beginning and end of the input string.
• Precedence operator between every two symbols of the string, by referring to
the operator precedence table.
Step 2:
• Start scanning the string from the left in the forward direction until a > symbol is
encountered.
• Keep a pointer on that location.
Step 3:
• From that location, scan the string in the backward direction until a < symbol
is encountered.
• Keep a pointer on that location.
Step 4:
• Everything that lies between < and > forms the handle.
• Replace the handle with the head of the respective production.
Step 5:
• Keep repeating the cycle from Step 2 to Step 4 until the start symbol is
reached.
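
Rules 1 to 3 can also be applied mechanically instead of typing the table by hand. The sketch below is only an illustration under stated assumptions: the terminals are i (identifier), the operators + - * / and the end marker $, every operator is treated as left-associative (so two operators of equal precedence give >), and pairs that cannot be adjacent (i i and $ $) are marked with -. The function names level and relation are hypothetical.

```c
/* Precedence level of a terminal; a larger value binds tighter.
   Returns -1 for a symbol that is not in the table. */
static int level(char t)
{
    switch (t) {
    case '$':           return 0;   /* Rule 2: lowest precedence */
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    case 'i':           return 3;   /* Rule 2: identifiers rank highest */
    default:            return -1;
    }
}

/* Relation between stack-top terminal a and input terminal b. */
static char relation(char a, char b)
{
    int la = level(a), lb = level(b);
    if (la < 0 || lb < 0)
        return '-';                 /* unknown terminal */
    if ((a == 'i' && b == 'i') || (a == '$' && b == '$'))
        return '-';                 /* no relation is defined for these pairs */
    if (la < lb)
        return '<';                 /* Rule 1: b has higher precedence */
    if (la > lb)
        return '>';                 /* Rule 1: b has lower precedence */
    return '>';                     /* Rule 3: same level, left-associative */
}
```

Grammars with parentheses or right-associative operators need extra cases, so a hand-written table remains the more general input.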

Sample Code
#include<stdio.h>
#include<string.h>
int main(void)
{
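char stack[20],ip[20],opt[10][10][2],ter[10]; /* each table cell holds one relation character and its terminating NUL */
int i,j,k,n,top=0,row,col;
for(i=0;i<20;i++)
{
stack[i]=0;
ip[i]=0;
}
for(i=0;i<10;i++)
{
for(j=0;j<10;j++)
{ opt[i][j][0]=0;
opt[i][j][1]=0;
}
}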
printf("Enter the no.of terminals:");
scanf("%d",&n);
printf("\nEnter the terminals:");
scanf("%s",ter);
printf("\nEnter the table values:\n");
for(i=0;i<n;i++)
{
for(j=0;j<n;j++)
{
printf("Enter the value for %c %c:",ter[i],ter[j]);
scanf("%s",opt[i][j]);
}
}
printf("\nOPERATOR PRECEDENCE TABLE:\n");
for(i=0;i<n;i++)
{
printf("\t%c",ter[i]);

}
printf("\n ");
printf("\n"); for(i=0;i<n;i++)
{
printf("\n%c |",ter[i]); for(j=0;j<n;j++)
{
printf("\t%c",opt[i][j][0]);
}
}
stack[top]='$';
printf("\n\nEnter the input string(append with $):");
scanf("%s",ip);
i=0;
printf("\nSTACK\t\t\tINPUT STRING\t\t\tACTION\n");
printf("\n%s\t\t\t%s\t\t\t",stack,ip);
while(i<=strlen(ip))
{
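row=col=-1;
for(k=0;k<n;k++)
{
if(stack[top]==ter[k]) row=k;
if(ip[i]==ter[k]) col=k;
}
if(row<0||col<0)
{
printf("\nString is not accepted"); break; /* symbol is not in the terminal list */
}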
if((stack[top]=='$')&&(ip[i]=='$'))
{
printf("String is ACCEPTED"); break;
}
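else if((opt[row][col][0]=='<') ||(opt[row][col][0]=='='))
{
stack[++top]=opt[row][col][0];
stack[++top]=ip[i];
stack[top+1]='\0';
printf("Shift %c",ip[i]); /* report the symbol before it is blanked out */
ip[i]=' ';
i++;
}
else
{
if(opt[row][col][0]=='>')
{
while(top>0 && stack[top]!='<')
{
--top;
}
if(top==0)
{
printf("\nString is not accepted"); break; /* no handle on the stack */
}
top=top-1;
stack[top+1]='\0'; /* drop the popped handle so the stack prints correctly */
printf("Reduce");
}
else
{
printf("\nString is not accepted"); break;
}
}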
printf("\n"); printf("%s\t\t\t%s\t\t\t",stack,ip);
}
}
Sample input and output: SET 1
Enter the no.of terminals:4
Enter the terminals:i+*$
Enter the table values:
Enter the value for i i:-
Enter the value for i +:>
Enter the value for i *:>
Enter the value for i $:>
Enter the value for + i:<
Enter the value for + +:>
Enter the value for + *:<
Enter the value for + $:>
Enter the value for * i:<
Enter the value for * +:>
Enter the value for * *:>
Enter the value for * $:>
Enter the value for $ i:<
Enter the value for $ +:<
Enter the value for $ *:<
Enter the value for $ $:-
OPERATOR PRECEDENCE TABLE:
     i  +  *  $

i |  -  >  >  >
+ |  <  >  <  >
* |  <  >  >  >
$ |  <  <  <  -
Enter the input string(append with $):i+i*i$
STACK       INPUT STRING    ACTION
$           i+i*i$          Shift i
$<i         +i*i$           Reduce
$           +i*i$           Shift +
$<+         i*i$            Shift i
$<+<i       *i$             Reduce
$<+         *i$             Shift *
$<+<*       i$              Shift i
$<+<*<i     $               Reduce
$<+<*       $               Reduce
$<+         $               Reduce
$           $               String is ACCEPTED

Sample input and output: SET 2


Enter the no.of terminals:6
Enter the terminals:d*+()$
Enter the table values:
Enter the value for d d:-
Enter the value for d *:>
Enter the value for d +:>
Enter the value for d (:-
Enter the value for d ):>
Enter the value for d $:>
Enter the value for * d:<
Enter the value for * *:>
Enter the value for * +:>
Enter the value for * (:<
Enter the value for * ):>
Enter the value for * $:>
Enter the value for + d:<
Enter the value for + *:<
Enter the value for + +:>
Enter the value for + (:<
Enter the value for + ):>
Enter the value for + $:>
Enter the value for ( d:<
Enter the value for ( *:<
Enter the value for ( +:<
Enter the value for ( (:<
Enter the value for ( ):=
Enter the value for ( $:-
Enter the value for ) d:-
Enter the value for ) *:>
Enter the value for ) +:>
Enter the value for ) (:-
Enter the value for ) ):>
Enter the value for ) $:>
Enter the value for $ d:<
Enter the value for $ *:<
Enter the value for $ +:<
Enter the value for $ (:<
Enter the value for $ ):-
Enter the value for $ $:-
OPERATOR PRECEDENCE TABLE:
d * + ( ) $
d - > > - > >
* < > > < > >
+ < < > < > >
( < < < < = -
) - > > - > >
$ < < < < - -
EXPERIMENT 7
Write a C program to find the FIRST and FOLLOW sets of any given grammar.

Theory:

FIRST(X) for a grammar symbol X is the set of terminals that begin the strings
derivable from X.

Rules to compute FIRST set:

1. If x is a terminal, then FIRST(x) = { x }.
2. If X -> Є is a production rule, then add Є to FIRST(X).
3. If X -> Y1 Y2 Y3 ... Yn is a production,
   1. Add everything in FIRST(Y1) except Є to FIRST(X).
   2. If FIRST(Y1) contains Є, also add everything in FIRST(Y2) except Є, and
      continue in the same way with Y3 ... Yn for as long as every preceding
      FIRST set contains Є.
   3. If FIRST(Yi) contains Є for all i = 1 to n, then add Є to FIRST(X).

FOLLOW(X) is the set of terminals that can appear immediately to the right of
non-terminal X in some sentential form.

Rules to compute FOLLOW set:

1. $ is in FOLLOW(S), where S is the starting non-terminal.

2. If A -> pBq is a production, where B is a non-terminal and p and q are strings
of grammar symbols, then everything in FIRST(q) except Є is in FOLLOW(B).

3. If A -> pB is a production, then everything in FOLLOW(A) is in FOLLOW(B).

4. If A -> pBq is a production and FIRST(q) contains Є, then FOLLOW(B)
contains { FIRST(q) – Є } U FOLLOW(A).
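
The FIRST rules can be applied repeatedly until no set grows any further. The sketch below illustrates that fixed-point approach and is not the method used by the sample code that follows. It assumes single-character symbols, upper-case letters for non-terminals, productions written as "A=body", and the character # standing for Є; the names EPS, first, add_sym and compute_first are illustrative.

```c
#include <ctype.h>
#include <string.h>

#define EPS '#'   /* assumed marker for the empty string */

/* first[X - 'A'] is a NUL-terminated string holding the symbols of FIRST(X). */
static char first[26][32];

/* Add c to the set unless it is already present; returns 1 if the set grew. */
static int add_sym(char *set, char c)
{
    size_t len = strlen(set);
    if (strchr(set, c) != NULL || len >= 31)
        return 0;
    set[len] = c;
    set[len + 1] = '\0';
    return 1;
}

/* Apply the FIRST rules to every production until nothing changes. */
static void compute_first(const char *prods[], int n)
{
    int changed = 1, p;
    memset(first, 0, sizeof first);
    while (changed) {
        changed = 0;
        for (p = 0; p < n; p++) {
            char *set = first[prods[p][0] - 'A'];
            const char *y = prods[p] + 2;          /* skip "A=" */
            int all_eps = 1;                       /* every symbol so far derives the empty string */
            for (; *y != '\0' && all_eps; y++) {
                if (*y == EPS)
                    continue;                      /* explicit empty body */
                all_eps = 0;
                if (!isupper((unsigned char)*y)) { /* rule 1: terminal */
                    changed |= add_sym(set, *y);
                } else {                           /* rule 3: copy FIRST(Yi) without the empty string */
                    const char *f = first[*y - 'A'];
                    for (; *f != '\0'; f++) {
                        if (*f == EPS)
                            all_eps = 1;           /* go on to the next symbol */
                        else
                            changed |= add_sym(set, *f);
                    }
                }
            }
            if (all_eps)                           /* rule 2 and the last case of rule 3 */
                changed |= add_sym(set, EPS);
        }
    }
}
```

Because the loop only ever adds symbols to finite sets, it terminates even for left-recursive grammars, which a directly recursive FIRST routine does not handle.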

Sample code:
#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int n,m=0,p,i=0,j=0;
char a[10][10],f[10];
void follow(char c);
void first(char c);
int main(){
int i,z;
char c,ch;
//clrscr();
printf("Enter the no of productions:\n");
scanf("%d",&n);
printf("Enter the productions:\n");
for(i=0;i<n;i++)
scanf("%s%c",a[i],&ch);
do{
m=0;
printf("Enter the elements whose first & follow is to be found:");
scanf("%c",&c);
first(c);
printf("First(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
strcpy(f," ");
//flushall();
m=0;
follow(c);
printf("Follow(%c)={",c);
for(i=0;i<m;i++)
printf("%c",f[i]);
printf("}\n");
printf("Continue(0/1)?");
scanf("%d%c",&z,&ch);
}while(z==1);
return(0);
}
void first(char c)
{
int k;
if(!isupper(c))
f[m++]=c;
for(k=0;k<n;k++)
{
if(a[k][0]==c)
{
if(a[k][2]=='$')
follow(a[k][0]);
else if(islower(a[k][2]))
f[m++]=a[k][2];
else first(a[k][2]);
}
}
}
void follow(char c)
{
int i,j; /* local loop indices: the recursive calls below must not disturb this scan */
if(a[0][0]==c)
f[m++]='$';
for(i=0;i<n;i++)
{
for(j=2;j<(int)strlen(a[i]);j++)
{
if(a[i][j]==c)
{
if(a[i][j+1]!='\0')
first(a[i][j+1]);
if(a[i][j+1]=='\0' && c!=a[i][0])
follow(a[i][0]);
}
}
}
}

Sample input and output


Enter the no of productions:
5
Enter the productions:
S=AbCd
A=Cf
A=a
C=gE
E=h
Enter the elements whose first & follow is to be found:S
First(S)={ga}
Follow(S)={$}
Continue(0/1)?1
Enter the elements whose first & follow is to be found:A
First(A)={ga}
Follow(A)={b}
Continue(0/1)?1
Enter the elements whose first & follow is to be found:C
First(C)={g}
Follow(C)={df}
Continue(0/1)?1
Enter the elements whose first & follow is to be found:E
First(E)={h}
Follow(E)={df}
Continue(0/1)?0
EXPERIMENT 8
Write a C program to perform constant propagation.
Theory:
Constant propagation is the process of substituting the values of known constants
in expressions at compile time. Such constants include variables whose value is
known to be constant, as well as intrinsic functions applied to constant values.
Consider the following pseudocode:
int x = 14;
int y = 7 - x / 2;
return y * (28 / x + 2);
Propagating x yields:
int x = 14;
int y = 7 - 14 / 2;
return y * (28 / 14 + 2);
Continuing to propagate yields the following (which would likely be further
optimized by dead code elimination of both x and y):
int x = 14;
int y = 0;
return 0;
Constant propagation is implemented in compilers using the results of reaching
definition analysis. If all of a variable's reaching definitions are the same
assignment, and that assignment gives the variable a constant, then the variable has
a constant value and can be replaced with that constant.
Constant propagation can also cause conditional branches to simplify to one or
more unconditional statements, when the conditional expression can be evaluated
to true or false at compile time to determine the only possible outcome.
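
The reaching-definition rule can be sketched for straight-line code, where the only definition that reaches a use is the most recent assignment. This is a minimal illustration rather than the lab's sample program: the function name `propagate`, the `x=rhs` statement format, single lowercase-letter variables and single-digit constants are all assumptions made for the example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: propagate constants through n straight-line
   statements of the form "x=rhs". Variables are single lowercase
   letters and constants are single digits (assumptions of this sketch).
   The statements are rewritten in place. */
void propagate(char stmts[][20], int n)
{
    char value[26];   /* value[v-'a'] = digit currently known for v, or 0 */
    int i, j;

    memset(value, 0, sizeof value);
    for (i = 0; i < n; i++) {
        char *s = stmts[i];
        if (strlen(s) < 3 || s[1] != '=' || !islower((unsigned char)s[0]))
            continue;                     /* not of the form x=rhs */

        /* Replace each right-hand-side variable whose reaching
           definition assigns a constant. */
        for (j = 2; s[j] != '\0'; j++)
            if (islower((unsigned char)s[j]) && value[s[j] - 'a'])
                s[j] = value[s[j] - 'a'];

        /* This statement is now the reaching definition of s[0]:
           record its constant, or forget the old one. */
        if (isdigit((unsigned char)s[2]) && s[3] == '\0')
            value[s[0] - 'a'] = s[2];
        else
            value[s[0] - 'a'] = 0;
    }
}
```

Applied to the four statements a=5, b=a+3, c=a, d=c+2, the sketch rewrites them to a=5, b=5+3, c=5, d=5+2. A real compiler would also have to merge definitions at branch joins, which this sketch does not attempt.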

Sample input and output


Enter no:of productions:4
Enter 4 productions:
a=5
b=a+3
c=a
d=c+2
optimized code:
a=5
b=5+3
c=5
d=5+2

Sample Code
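#include<stdio.h>
#include<ctype.h>
int n;
char prod[10][20];
/* Read the statements, one per line, in the form x=expr */
void input()
{
int i;
printf("Enter no:of productions:");
scanf("%d",&n);
if(n>10)
n=10; /* prod[] holds at most 10 statements */
printf("Enter %d productions:\n",n);
for(i=0;i<n;i++)
scanf("%19s",prod[i]);
}
/* Substitute the constant num for variable c in the statements after no */
void replace(char c ,char num, int no)
{
int i;
for(i=no+1;i<n;i++)
{
int j=2;
/* replace every use of c on the right-hand side */
while(prod[i][j]!='\0')
{
if(prod[i][j]==c)
prod[i][j]=num;
j++;
}
/* stop once c is redefined; later uses see the new value */
if(prod[i][0]==c)
break;
}
}
/* A statement of the form x=d (d a single digit) defines a constant */
void check()
{
int i;
for(i=0;i<n;i++)
{
if(isdigit(prod[i][2]) && prod[i][3]=='\0')
{
replace(prod[i][0],prod[i][2],i);
}
}
}
void display()
{
printf("optimized code:\n");
for(int i=0;i<n;i++)
{
printf("%s\n",prod[i]);
}
}
int main()
{
input();
check();
display();
return 0;
}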
EXPERIMENT 9
Write a C program to construct a recursive descent parser for an expression.
Theory:
Recursive descent is a top-down parsing technique that constructs the parse tree
from the top while the input is read from left to right. It uses a procedure for every
terminal and non-terminal entity. This parsing technique recursively parses the input
to make a parse tree, which may or may not require back-tracking. But the grammar
associated with it (if not left factored) cannot avoid back-tracking. A form of
recursive-descent parsing that does not require any back-tracking is known as
predictive parsing.

This parsing technique is regarded as recursive because it uses a context-free
grammar, which is recursive in nature.

Back-tracking

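Top-down parsers start from the root node (start symbol) and match the input
string against the production rules to replace them (if matched). If a chosen
alternative fails to match, the parser returns to the point where it made that choice
and tries the next alternative; this is back-tracking.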
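
A predictive parser avoids back-tracking by choosing each production from the next input character alone. The sketch below assumes a small grammar (E -> T { + T }, T -> F { * F }, F -> digit | ( E )) and also evaluates the expression as it parses; the names `parse`, `expr`, `term` and `factor` are hypothetical and chosen for this illustration only.

```c
#include <ctype.h>

static const char *p;   /* next unread input character (the lookahead) */
static int ok;          /* cleared on the first syntax error */

static int expr(void);

/* F -> digit | ( E ) */
static int factor(void)
{
    if (isdigit((unsigned char)*p))
        return *p++ - '0';
    if (*p == '(') {
        int v;
        p++;
        v = expr();
        if (*p == ')')
            p++;
        else
            ok = 0;     /* missing closing parenthesis */
        return v;
    }
    ok = 0;             /* neither a digit nor '(' */
    return 0;
}

/* T -> F { * F } */
static int term(void)
{
    int v = factor();
    while (ok && *p == '*') {
        p++;
        v *= factor();
    }
    return v;
}

/* E -> T { + T } */
static int expr(void)
{
    int v = term();
    while (ok && *p == '+') {
        p++;
        v += term();
    }
    return v;
}

/* Returns 1 and stores the result in *value if s is a valid
   expression that consumes the whole input, otherwise returns 0. */
int parse(const char *s, int *value)
{
    p = s;
    ok = 1;
    *value = expr();
    return ok && *p == '\0';
}
```

Each non-terminal has one procedure, and every decision is made by inspecting `*p`, so no input is ever re-read. With this grammar `parse("2+3*4", &v)` accepts and leaves 14 in `v`, while `parse("2+*3", &v)` is rejected.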

Sample code:

#include<stdio.h>
#include<ctype.h>
#include<string.h>

void Tprime();
void Eprime();
void E();
void check();
void T();
char expression[10];
int count, flag;
int main()
{
count = 0;
flag = 0;
printf("\nEnter an Algebraic Expression:\t");
scanf("%9s", expression); /* expression[] holds at most 9 characters */
E();
if((strlen(expression) == count) && (flag == 0))
{
printf("\nThe Expression %s is Valid\n", expression);
}
else
{
printf("\nThe Expression %s is Invalid\n", expression);
}
}

void E()
{
T();
Eprime();
}

void T()
{
check();
Tprime();
}

void Tprime()
{
if(expression[count] == '*')
{
count++;
check();
Tprime();
}
}

void check()
{
if(isalnum(expression[count]))
{
count++;
}
else if(expression[count] == '(')
{
count++;
E();
if(expression[count] == ')')
{
count++;
}
else
{
flag = 1;
}
}
else
{
flag = 1;
}
}

void Eprime()
{
if(expression[count] == '+')
{
count++;
T();
Eprime();
}
}
Sample input and output
Enter an Algebraic Expression:
(a+b)*c
The Expression (a+b)*c is Valid

EXPERIMENT 10
Write a C program to construct a Shift Reduce Parser for a given language.

Theory:

Shift-reduce parsing attempts to construct a parse tree for an input string beginning
at the leaves and working up towards the root. In other words, it is a process of
“reducing” (the opposite of deriving a symbol using a production rule) a string w to
the start symbol of a grammar. At every (reduction) step, a particular substring
matching the RHS of a production rule is replaced by the symbol on the LHS of the
production.

Handles:

A “handle” of a string is a substring that matches the RHS of a production and whose
reduction to the non-terminal (on the LHS of the production) represents one step
along the reverse of a rightmost derivation toward reducing to the start symbol.

If S →* αAw → αβw by a rightmost derivation, then A → β in the position following α
is a handle of αβw.

In such a case, it suffices to say that the substring β is a handle of αβw, if the
position of β and the corresponding production are clear.

Consider the following grammar:

E → E + E | E * E | (E) | id

and a right-most derivation is as follows:

E → E + E → E+ E * E → E + E * id3 → E + id2 * id3 → id1 + id2 * id3

The id’s are subscripted for notational convenience.

Note that the reduction is in the opposite direction, from id1 + id2 * id3 back to E.
The handles reduced at successive steps are id1, id2, id3, E * E and E + E.

Implementation of shift reduce parser:

A convenient way to implement a shift-reduce parser is to use a stack to hold
grammar symbols and an input buffer to hold the string w to be parsed. The symbol $ is
used to mark the bottom of the stack and also the right-end of the input.

Notationally, the top of the stack is identified through a separator symbol |, and the
input string to be parsed appears on the right side of |. The stack content appears on
the left of |.

For example, an intermediate stage of parsing can be shown as follows:


$id1 | + id2 * id3$ …. (1)

Here “$id1” is in the stack, while the input yet to be seen is “+ id2 * id3$”.

In a shift-reduce parser, there are two fundamental operations: shift and reduce.

Shift operation: The next input symbol is shifted onto the top of the stack.

After shifting + into the stack, the above state captured in (1) would change into:

$id1 + | id2 * id3$

Reduce operation: Replaces a set of grammar symbols on the top of the stack with
the LHS of a production rule.

After reducing id1 using E → id, the state (1) would change into:

$E | + id2 * id3$
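
The two operations can be sketched on a configuration made of a stack string and the unread input. This is an illustrative sketch only: the names `config`, `init`, `shift` and `reduce` are hypothetical, every grammar symbol is assumed to be one character, and the letter i stands for the token id.

```c
#include <string.h>

/* Parser configuration in the "stack | input" notation used above. */
struct config {
    char stack[32];      /* always starts with the bottom marker '$' */
    const char *input;   /* unread input, terminated by '$' */
};

void init(struct config *c, const char *input)
{
    strcpy(c->stack, "$");
    c->input = input;
}

/* Shift: move the next input symbol onto the top of the stack. */
void shift(struct config *c)
{
    size_t n = strlen(c->stack);
    if (*c->input == '$' || *c->input == '\0' || n + 1 >= sizeof c->stack)
        return;          /* nothing left to shift, or the stack is full */
    c->stack[n] = *c->input++;
    c->stack[n + 1] = '\0';
}

/* Reduce: if the symbols on top of the stack spell rhs, replace them
   with lhs. Returns 1 on success and 0 if rhs is not on top. */
int reduce(struct config *c, const char *rhs, char lhs)
{
    size_t n = strlen(c->stack), k = strlen(rhs);
    if (k == 0 || n - 1 < k || strcmp(c->stack + n - k, rhs) != 0)
        return 0;
    c->stack[n - k] = lhs;
    c->stack[n - k + 1] = '\0';
    return 1;
}
```

Starting from `init(&c, "i+i*i$")`, one `shift` gives the stack `$i`, `reduce(&c, "i", 'E')` turns it into `$E`, and a second `shift` gives `$E+`, mirroring the configurations shown above. Deciding when to shift and when to reduce is the job of the parsing strategy and is not part of this sketch.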
Sample code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
char ip_sym[15],stack[15];
int ip_ptr=0,st_ptr=0,len,i;
char temp[2],temp2[2];
char act[15];
void check();
int main()
{
printf("\n\t\t SHIFT REDUCE PARSER\n");
printf("\n GRAMMAR\n");
printf("\n E->E+E\n E->E/E");
printf("\n E->E*E\n E->a|b");
printf("\n enter the input symbol:\t");
/* fgets() replaces the unsafe gets(); drop the trailing newline */
if(fgets(ip_sym,sizeof(ip_sym),stdin)==NULL)
return 1;
ip_sym[strcspn(ip_sym,"\n")]='\0';
printf("\n\t stack implementation table");
printf("\n stack\t\t input symbol\t\t action");
printf("\n______\t\t ____________\t\t ______\n");
printf("\n $\t\t%s$\t\t\t--",ip_sym);
strcpy(act,"shift ");
temp[0]=ip_sym[ip_ptr];
temp[1]='\0';
strcat(act,temp);
len=strlen(ip_sym);
for(i=0;i<=len-1;i++)
{
/* shift the next input symbol onto the stack */
stack[st_ptr]=ip_sym[ip_ptr];
stack[st_ptr+1]='\0';
ip_sym[ip_ptr]=' ';
ip_ptr++;
printf("\n $%s\t\t%s$\t\t\t%s",stack,ip_sym,act);
strcpy(act,"shift ");
temp[0]=ip_sym[ip_ptr];
temp[1]='\0';
strcat(act,temp);
check(); /* reduce if a handle is on top of the stack */
st_ptr++;
}
/* input exhausted without ACCEPT: this final check reports reject */
check();
return 0;
}
void check()
{
int flag=0;
temp2[0]=stack[st_ptr];
temp2[1]='\0';
/* reduce a terminal a or b to E */
if((!strcmp(temp2,"a"))||(!strcmp(temp2,"b")))
{
stack[st_ptr]='E';
if(!strcmp(temp2,"a"))
printf("\n $%s\t\t%s$\t\t\tE->a",stack, ip_sym);
else
printf("\n $%s\t\t%s$\t\t\tE->b",stack,ip_sym);
flag=1;
}
/* an operator on top of the stack is valid; wait for its right operand */
if((!strcmp(temp2,"+"))||(!strcmp(temp2,"*"))||(!strcmp(temp2,"/")))
{
flag=1;
}
/* reduce the handle E op E to E */
if((!strcmp(stack,"E+E"))||(!strcmp(stack,"E/E"))||(!strcmp(stack,"E*E")))
{
char op=stack[1]; /* save the operator before the stack is overwritten */
strcpy(stack,"E");
st_ptr=0;
printf("\n $%s\t\t%s$\t\t\tE->E%cE",stack,ip_sym,op);
flag=1;
}

if(!strcmp(stack,"E")&&ip_ptr==len)
{
printf("\n $%s\t\t%s$\t\t\tACCEPT\n",stack,ip_sym);
exit(0);
}
if(flag==0)
{
printf("\n%s\t\t\t%s\t\t reject\n",stack,ip_sym);
exit(0);
}
return;
}
Sample input and output
EXPERIMENT 11
Write a C program to perform intermediate code generation.
Theory:
In the first pass of the compiler, the source program is converted into intermediate
code. The second pass converts the intermediate code to target code. Intermediate
code generation is done by the intermediate code generation phase. It takes input
from the front end, which consists of lexical analysis, syntax analysis and semantic
analysis, generates intermediate code and passes it to the code generator.

Advantages of intermediate code:

a. Target code can be generated for any machine just by attaching a back end for the
new machine. This is called retargeting.
b. It is possible to apply machine-independent code optimization. This helps
produce more efficient target code.

Sample code:

#include<stdio.h>
#include<string.h>
#include<ctype.h>
char tempvariables[]={'z','y','x','w','v','u','t'};
int length,top=-1,count=0,tvar=0;
char input[20],tempinput[20];
char prearray[20];
char stack[20];
char threeaddress[10][10];
char concatarray[20];
char *strrev(char *str)
{
if (!str || ! *str)
return str;
int i = strlen(str) - 1, j = 0;
char ch;
while (i > j)
{
ch = str[i];
str[i] = str[j];
str[j] = ch;
i--;
j++;
}
return str;
}
int prec(char op)
{
switch(op)
{
case '+':
case '-': return 1;
case '*':
case '/': return 2;
}
return 0; /* not an operator */
}
int isoperator(char sym)
{
if(sym=='+'||sym=='-'||sym=='*'||sym=='/')
return 1;
else
return 0;
}
void push(char sym)
{
top++;
stack[top]=sym;
}
char pop()
{
top--;
return(stack[top+1]);
}
void display()
{
int i;
printf("\nStack");
for(i=top;i>-1;i--)
printf("%c\t",stack[i]);
}
void del(char sym,int pos)
{
int j,c=0,k;
for(j=0;j<pos;j++)
{
tempinput[j]=prearray[j];
}
tempinput[j]=tempvariables[tvar-1];
k=j+1;
for(j=j+3;j<length;j++)
{
tempinput[k++]=prearray[j];
}
tempinput[k]='\0'; /* terminate so stale characters are not copied */
strcpy(prearray,tempinput);
length=strlen(prearray);
}
void prefix()
{
int i,k=0;
char popval;
for(i=0;i<length;i++)
{
if(isalpha(input[i])|| isdigit(input[i]))
{
prearray[k++]=input[i];
}
else
{
if(top==-1)
push(input[i]);
else
{
while(top!=-1 && prec(stack[top])>=prec(input[i])) /* never read below the stack */
{
prearray[k++]=pop();
}
push(input[i]);
}
}
}
if(top!=-1)
{
for(i=top;i>-1;i--)
{
prearray[k++]=pop();
}
}
}
void generator(char op,char sym1,char sym2)
{
int len=0;
concatarray[len++]=tempvariables[tvar++];
concatarray[len++]='=';
concatarray[len++]=sym1;
concatarray[len++]=op;
concatarray[len++]=sym2;
strcpy(threeaddress[count++],concatarray);
}
int main()
{
int i;
printf("\nEnter the input expression: ");
scanf("%19s",input);
length=strlen(input);
strrev(input); /* reverse in place */
prefix();
strrev(prearray); /* prearray now holds the prefix form */
for(i=0;i<length;i++)
{
if(isoperator(prearray[i]))
{
/* an operator followed by two operands is ready to be emitted */
if((isalpha(prearray[i+1])||
isdigit(prearray[i+1]))&&(isalpha(prearray[i+2])||isdigit(prearray[i+2])))
{
generator(prearray[i],prearray[i+1],prearray[i+2]);
del(prearray[i],i);
i=-1; /* rescan the shortened expression from the start */
}
}
}
printf("\nThree Address Code for the expression:\n\n");
for(i=0;i<count;i++)
{
printf("%s\n",threeaddress[i]);
}
return 0;
}

Sample input and output


Enter the input expression: a+b*c/d-e*f

Three address code for this expression


z=c/d
y=b*z
x=e*f
w=y-x
v=a+w
EXPERIMENT 12
Implementation of Lexical Analyzer using Lex Tool.
• Lexical analysis is the first phase of a compiler.
• Programs that perform lexical analysis are called lexical analyzers or lexers.
• The lexical analyzer takes the source code and breaks it into a series of
tokens, removing any whitespace or comments in the source code.
• In a programming language, keywords, constants, identifiers, strings, numbers,
operators and punctuation symbols can be considered tokens.
• A lexeme is a sequence of characters in the source program that matches
the pattern of a token. It is nothing but an instance of a token.
• Consider the following code given to a lexical analyzer:

#include <stdio.h>
int maximum(int x, int y) {
// This will compare 2 numbers
if (x > y)
return x;
else {
return y;
}
}

• Some examples of tokens created are:

Lexeme      Token
int         Keyword
maximum     Identifier
(           Special character
int         Keyword
x           Identifier
,           Special character
int         Keyword
y           Identifier
)           Special character
{           Special character
if          Keyword
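
The classification in the table above can be sketched as a small hand-written C function. This is only an illustration of the lexeme-to-token mapping: the function name `classify`, the short keyword list and the set of special characters are assumptions made for the example, not part of the LEX tool.

```c
#include <ctype.h>
#include <string.h>

/* Classify one lexeme the way the table above does. The keyword list
   is a small assumed subset chosen for this sketch. */
const char *classify(const char *lexeme)
{
    static const char *keywords[] = { "int", "if", "else", "return" };
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return "Keyword";

    /* identifier: a letter or '_' followed by letters, digits or '_' */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; lexeme[i] != '\0'; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return "Unknown";
        return "Identifier";
    }

    /* a single punctuation character from the assumed set */
    if (lexeme[0] != '\0' && lexeme[1] == '\0' && strchr("(){},;", lexeme[0]))
        return "Special character";

    return "Unknown";
}
```

Keywords are tested before identifiers because a keyword such as int also matches the identifier pattern; a LEX specification resolves the same overlap by listing the keyword rules first.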
• LEX is a tool used to generate a lexical analyzer.
• LEX translates a set of regular expression specifications (given as input in
input_file.l) into a C implementation (lex.yy.c).
• This C program, when compiled, yields an executable lexical analyzer.
Structure of LEX programs
• A LEX program consists of three sections: Declarations, Rules and Auxiliary
functions.
Declarations

%%

Rules

%%

Auxiliary functions

Declarations
• The declarations section consists of two parts: auxiliary declarations and
regular definitions.

• The auxiliary declarations are copied as such by LEX to the output lex.yy.c file.

• This C code consists of instructions to the C compiler and is not processed
by the LEX tool.

• The auxiliary declarations are written in C and are enclosed within
'%{' and '%}'. They are generally used to declare functions, include header files,
or define global variables and constants.

• LEX allows the use of short-hands and extensions to regular expressions for
the regular definitions.

• A regular definition in LEX is of the form: D R, where D is the symbol
representing the regular expression R.

Eg:
/* Auxiliary declaration starts*/
%{
#include<stdio.h>
int global_variable;
%}
/* Auxiliary declaration ends*/
/* Regular definition starts*/

number [0-9]+
op [-+*/^=]
/* Regular definition ends*/

Sample code:
%{
#include<unistd.h>
%}

%option noyywrap
%%
init { printf("INIT\n"); }
begin { printf("BEGINTOK\n"); }
end { printf("END\n"); }
var { printf("VAR\n"); }
[A-Za-z][A-Za-z0-9]* { printf("ID\n"); }
\, {printf("CM\n");}
\; {printf("SC\n");}
\= {printf("EQ\n");}
[0-9]+ { printf("CONST\n");}
\+ {printf("PL\n");}
\- {printf("SUB\n");}
\* {printf("MUL\n");}
\/ {printf("DIV\n");}
\( {printf("OP\n");}
\) {printf("CP\n");}
[ \n\t] ;
. ;
%%
int main()
{
yyin=fopen("file.p","r");
if(yyin==NULL)
{
perror("file.p"); /* input file missing or unreadable */
return 1;
}
yylex();
fclose(yyin);
return 0;
}

Sample input:
init
var a,b,c;
a=4;
begin
b=a*2;
c=a+b;
end

Sample output:
INIT
VAR
ID
CM
ID
CM
ID
SC
ID
EQ
CONST
SC
BEGINTOK
ID
EQ
ID
MUL
CONST
SC
ID
EQ
ID
PL
ID
SC
END

EXPERIMENT 13
Implement a calculator using LEX and YACC.

%{
#include<stdio.h>
#include<stdlib.h> /* atof */
int op=0;
float a,b;
void digi(void);
%}
dig [0-9]+|([0-9]*)"."([0-9]+)
add "+"
sub "-"
mul "*"
div "/"
pow "^"
ln \n
%%
{dig} {digi();}
{add} {op=1;}
{sub} {op=2;}
{mul} {op=3;}
{div} {op=4;}
{pow} {op=5;}
{ln} {printf("\n the result:%f\n\n",a);}
%%
/* Apply the pending operator, if any, to the number just scanned */
void digi(void)
{
if(op==0)
a=atof(yytext);
else
{
b=atof(yytext);
switch(op)
{
case 1:a=a+b;
break;
case 2:a=a-b;
break;
case 3:a=a*b;
break;
case 4:a=a/b;
break;
case 5:
{
/* positive integer exponent: multiply by the base b-1 more times */
float base=a;
for(;b>1;b--)
a=a*base;
break;
}
}
op=0;
}
}
int main(void)
{
yylex();
return 0;
}
int yywrap(void)
{
return 1;
}
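
The evaluation order of the rules above can be illustrated in plain C. The sketch below is an assumption-labelled model, not generated LEX output: the function name `calc` is hypothetical, and `strtod` stands in for the `{dig}` pattern. Like the LEX program, it applies each operator as soon as its right operand has been read, so operator precedence is ignored.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Evaluate one line strictly left to right, as the rules above do. */
double calc(const char *s)
{
    double acc = 0.0;
    char op = 0;                       /* pending operator, 0 if none */

    while (*s != '\0' && *s != '\n') {
        if (isdigit((unsigned char)*s) || *s == '.') {
            char *end;
            double v = strtod(s, &end);
            if (end == s) {            /* a lone '.', skip it */
                s++;
                continue;
            }
            switch (op) {
            case 0:   acc = v;  break;
            case '+': acc += v; break;
            case '-': acc -= v; break;
            case '*': acc *= v; break;
            case '/': acc /= v; break;
            case '^': {                /* positive integer exponent only */
                double base = acc;
                for (; v > 1; v--)
                    acc *= base;
                break;
            }
            }
            op = 0;
            s = end;
        } else {
            if (strchr("+-*/^", *s))
                op = *s;               /* remember the operator */
            s++;
        }
    }
    return acc;
}
```

For the input 2+3*4 this model returns 20 rather than 14, because the addition is carried out before the multiplication is seen. Honouring precedence requires a grammar, which is the part a YACC specification would normally supply.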
