
PRACTICAL FILE

Of

COMPILER DESIGN (CS-654)


Practical File submitted in partial fulfilment of
the requirements for the award of
Bachelor of Engineering
IN
COMPUTER SCIENCE AND ENGINEERING

Submitted by:
Gursimar Singh
(Roll no: CO17325)

Under the Guidance of:

Dr. Gulshan Goyal

CHANDIGARH COLLEGE OF ENGINEERING AND TECHNOLOGY

(DEGREE WING)
Government Institute under Chandigarh (UT) Administration, Affiliated to Panjab University,
Chandigarh
Sector-26, Chandigarh. PIN-160019
Jan – May, 2020
CHANDIGARH COLLEGE OF ENGINEERING AND TECHNOLOGY (DEGREE WING)
Government Institute under Chandigarh (UT) Administration | Affiliated to Panjab University , Chandigarh
Sector-26, Chandigarh. PIN-160019 | Tel. No. 0172-2750947, 2750943
Website: www.ccet.ac.in | Email: principal@ccet.ac.in | Fax. No. :0172-2750872

Department of Computer Sc. & Engineering

ACKNOWLEDGEMENT

It is a great pleasure to present this practical file of Compiler Design. I have put considerable
effort into this lab. However, it would not have been possible without the kind support and help
of our teacher, Dr. Gulshan Goyal. I would like to extend my sincere thanks to him.

I am highly indebted to Chandigarh College of Engineering & Technology (Degree Wing) for
their guidance and constant supervision as well as for providing necessary information
regarding the practical & also for their support in completing the practical file.

They taught me all the basic concepts required for the practical and guided me through each
step of building the programs whenever I was stuck.

I would like to express my special gratitude and thanks to the staff of the institution (C.C.E.T.)
for giving me such attention and time.

I would also like to thank the University for including this Compiler Design Lab as a part of
our curriculum.
INDEX

S. NO.   EXPERIMENT                                                                DATE   PAGE NO.   REMARK

1        Introduction to Compiler.
2        To construct a program to produce categories of DFAs.
3        To construct a program for finding out the number of comments
         from an input program.
4        To construct a lexical analyzer.
5        To construct a program to parse a string for a grammar.
6        Implementation of LL parser.
7        Implementation of SLR parser.
8        Implementation of CLR parser.
9        Implementation of LALR parser.


Practical-1

1.1 Aim- Introduction to Compiler.

1.2 Translator

A program written in a high-level language is called source code. To convert the source
code into machine code, translators are needed.

A translator takes a program written in the source language as input and converts it into a
program in the target language as output.

It also detects and reports errors during translation.

1.2.1 Roles of translator

• Translating the high-level language program input into an equivalent machine
language program.

• Providing diagnostic messages wherever the programmer violates the specification
of the high-level language.

1.2.2 Different type of translators

The different types of translator are as follows:

1.2.2.1 : Compiler

Compiler is a translator which is used to convert programs in high-level language to
low-level language. It translates the entire program and also reports the errors in the source
program encountered during the translation.

Figure 1.1: Working of Compiler, Image Courtesy- http://ecomputernotes.com


1.2.2.2 : Interpreter

Interpreter is a translator which is used to convert programs in high-level language to
low-level language. An interpreter translates the program line by line and reports an error as
soon as it is encountered during the translation process.

It directly executes the operations specified in the source program when the input is given
by the user.

It gives better error diagnostics than a compiler.

Figure 1.2: Working of Interpreter, Image Courtesy- http://ecomputernotes.com

Differences between compiler and interpreter

Sl. No.   Compiler                                    Interpreter

1         Performs the translation of a program       Performs statement by statement
          as a whole.                                 translation.

2         Execution is faster.                        Execution is slower.

3         Requires more memory.                       Memory usage is efficient.

4         Debugging is hard.                          Debugging is easy.

5         Programming languages like C and C++        Programming languages like Python,
          use compilers.                              BASIC and Ruby use interpreters.

1.2.2.3 : Assembler

Assembler is a translator which is used to translate the assembly language code into
machine language code.

Figure 1.3: Working of Assembler, Image Courtesy- http://ecomputernotes.com


1.3 Phases of Compiler
The structure of compiler consists of two parts:

1.3.1 Analysis part

• Analysis part breaks the source program into constituent pieces and imposes a
grammatical structure on them. It then uses this structure to create an intermediate
representation of the source program.

• It is also termed as front end of compiler.

Figure 1.4: Analysis Part, Image Courtesy- http://ecomputernotes.com


1.3.2 Synthesis part

• Synthesis part takes the intermediate representation as input and transforms it to
the target program.

• It is also termed as back end of compiler.

Figure 1.5: Synthesis Part, Image Courtesy- http://ecomputernotes.com


The design of compiler can be decomposed into several phases, each of which converts
one form of source program into another.

1.3.3 The different phases of compiler are as follows:

• Lexical analysis

• Syntax analysis

• Semantic analysis

• Intermediate code generation

• Code optimization

• Code generation

All of the mentioned phases involve the following tasks:

• Symbol table management.

• Error handling.

Figure 1.6: Phases of Compiler, Image Courtesy- http://ecomputernotes.com


1.3.3.1 Lexical Analysis

• Lexical analysis is the first phase of compiler which is also termed as scanning.

• Source program is scanned to read the stream of characters, and those characters
are grouped into sequences called lexemes. For each lexeme a token is produced as output.

• Token: A token represents a lexical unit whose lexemes match a pattern, such as a
keyword, an operator or an identifier.

• Lexeme: A lexeme is an instance of a token, i.e., the group of characters forming a token.

• Pattern: A pattern describes the rule that the lexemes of a token follow. It is the
structure that must be matched by strings.

• Once a token is generated the corresponding entry is made in the symbol table.

Input: stream of characters

Output: Token

Token Template: <token-name, attribute-value>

(e.g.) c=a+b*5;

Lexemes and tokens

Lexemes    Tokens

c          identifier

=          assignment symbol

a          identifier

+          + (addition symbol)

b          identifier

*          * (multiplication symbol)

5          5 (number)

Hence, <id, 1> <=> <id, 2> <+> <id, 3> <*> <5>
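
The token stream above can be reproduced with a short sketch. The function name tokenize and its output format are assumptions made for illustration only; identifiers are numbered in order of first appearance, which stands in for their symbol table index, and the terminating semicolon is skipped as in the example.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: turns a statement such as "c=a+b*5;" into a token
// stream of the form <id,1><=><id,2><+><id,3><*><5>.
std::string tokenize(const std::string& src) {
    std::vector<std::string> symbols;  // position + 1 is the identifier's index
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char ch = static_cast<unsigned char>(src[i]);
        if (std::isspace(ch) || ch == ';') {
            ++i;  // skip blanks and the statement terminator
        } else if (std::isalpha(ch) || ch == '_') {
            // Identifier: a letter or underscore followed by letters, digits, underscores.
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            std::string name = src.substr(start, i - start);
            std::size_t idx = 0;
            while (idx < symbols.size() && symbols[idx] != name) ++idx;
            if (idx == symbols.size()) symbols.push_back(name);  // first appearance
            out += "<id," + std::to_string(idx + 1) + ">";
        } else if (std::isdigit(ch)) {
            // Number: a run of digits, emitted as its own token.
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out += "<" + src.substr(start, i - start) + ">";
        } else {
            // Anything else is treated as a single-character operator.
            out += std::string("<") + src[i] + ">";
            ++i;
        }
    }
    return out;
}
```

Calling tokenize("c=a+b*5;") gives the stream shown in the example, with c, a and b receiving the indices 1, 2 and 3.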

1.3.3.2 Syntax Analysis

• Syntax analysis is the second phase of compiler which is also called as parsing.
• Parser converts the tokens produced by lexical analyzer into a tree like
representation called parse tree.

• A parse tree describes the syntactic structure of the input.

• Syntax tree is a compressed representation of the parse tree in which the operators
appear as interior nodes and the operands of the operator are the children of the node for
that operator.

Input: Tokens

Output: Syntax tree

Figure 1.7: Syntax Tree, Image Courtesy- http://ecomputernotes.com


1.3.3.3 Semantic Analysis

• Semantic analysis is the third phase of compiler.

• It checks for the semantic consistency.

• Type information is gathered and stored in symbol table or in syntax tree.

• Performs type checking.

Figure 1.8: Syntax Annotated Tree, Image Courtesy- http://ecomputernotes.com


1.3.3.4 Intermediate Code Generation
• Intermediate code generation produces intermediate representations for the
source program which are of the following forms:

• Postfix notation

• Three address code

• Syntax tree

Most commonly used form is the three address code.

t1 = inttofloat(5)

t2 = id3 * t1

t3 = id2 + t2

id1 = t3
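
As a rough illustration of how such code can be produced, the sketch below turns a postfix expression into three address code. The function name toThreeAddress is hypothetical, the input is assumed to be a well-formed, space-separated postfix expression, and the type conversion step (inttofloat) is left out to keep the sketch short.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts a space-separated postfix expression into
// three address code and assigns the final value to `target`.
std::vector<std::string> toThreeAddress(const std::string& postfix,
                                        const std::string& target) {
    std::vector<std::string> code;   // generated instructions, in order
    std::vector<std::string> stack;  // operands waiting for an operator
    std::istringstream in(postfix);
    std::string tok;
    int temp = 0;                    // counter for temporaries t1, t2, ...
    while (in >> tok) {
        bool isOp = tok.size() == 1 &&
                    std::string("+-*/").find(tok[0]) != std::string::npos;
        if (isOp && stack.size() >= 2) {
            // Pop the two operands and emit one instruction with a new temporary.
            std::string right = stack.back(); stack.pop_back();
            std::string left = stack.back(); stack.pop_back();
            std::string t = "t" + std::to_string(++temp);
            code.push_back(t + " = " + left + " " + tok + " " + right);
            stack.push_back(t);
        } else {
            stack.push_back(tok);    // operand
        }
    }
    if (!stack.empty()) code.push_back(target + " = " + stack.back());
    return code;
}
```

For the postfix form "id2 id3 5 * +" with target id1, the sketch emits t1 = id3 * 5, then t2 = id2 + t1, then id1 = t2.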

Properties of intermediate code

• It should be easy to produce.

• It should be easy to translate into target program.

Figure 1.9 illustrates the translation of source code through each phase, considering the
statement

c = a + b * 5.

Figure 1.9: Example, Image Courtesy- http://ecomputernotes.com


1.3.4 Symbol Table Management
• Symbol table is used to store all the information about identifiers used in the program.

• It is a data structure containing a record for each identifier, with fields for the
attributes of the identifier.
• It allows finding the record for each identifier quickly and to store or retrieve data
from that record.

• Whenever an identifier is detected in any of the phases, it is stored in the symbol table.

Example

int a, b; float c; char z;

Symbol name    Type     Address

a              int      1000

b              int      1002

c              float    1004

z              char     1008
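
A symbol table of this kind can be sketched with a map from identifier name to its record. The class name SymbolTable, the record layout and the type sizes (2 bytes for int, 4 for float, 1 for char, chosen to match the addresses in the example) are assumptions for illustration, not part of the practical.

```cpp
#include <map>
#include <string>

// One record per identifier, with fields for its attributes.
struct SymbolInfo {
    std::string type;
    int address;
};

// Hypothetical symbol table that assigns consecutive addresses.
class SymbolTable {
public:
    explicit SymbolTable(int base) : next_(base) {}

    // Stores a new identifier; returns false if it is already declared.
    bool insert(const std::string& name, const std::string& type) {
        if (table_.count(name)) return false;
        table_[name] = SymbolInfo{type, next_};
        next_ += sizeOf(type);
        return true;
    }

    // Returns the record for an identifier, or nullptr if it is unknown.
    const SymbolInfo* lookup(const std::string& name) const {
        std::map<std::string, SymbolInfo>::const_iterator it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    // Assumed sizes in bytes; any other type is treated like char.
    static int sizeOf(const std::string& type) {
        if (type == "int") return 2;
        if (type == "float") return 4;
        return 1;
    }

    std::map<std::string, SymbolInfo> table_;
    int next_;  // next free address
};
```

Inserting a and b as int, c as float and z as char into a table created with base address 1000 yields the addresses 1000, 1002, 1004 and 1008 listed above.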

1.3.5 Error Handling

• Each phase can encounter errors. After detecting an error, a phase must handle
the error so that compilation can proceed.

• In lexical analysis, errors occur in separation of tokens.

• In syntax analysis, errors occur during construction of syntax tree.

• In semantic analysis, errors may occur in the following cases:

(i) When the compiler detects constructs that have the right syntactic structure but no
meaning.

(ii) During type conversion.

• In code optimization, errors occur when the result is affected by the optimization.

• In code generation, errors occur when code is missing etc.
1.3.5.1 Error Encountered in Different Phases

Each phase can encounter errors. After detecting an error, a phase must somehow deal
with the error, so that compilation can proceed. A program may have the following kinds
of errors at various stages:

1. Lexical Errors: These include incorrect or misspelled names of identifiers, i.e.,
identifiers typed incorrectly.

2. Syntactical Errors: These include a missing semicolon or unbalanced parentheses.
Syntactic errors are handled by the syntax analyzer (parser).

3. Semantical Errors: These errors are a result of incompatible value assignment.
The semantic errors that the semantic analyzer is expected to recognize are:

• Type mismatch.
• Undeclared variable.
• Reserved identifier misuse.
• Multiple declaration of variable in a scope.
• Accessing an out of scope variable.
• Actual and formal parameter mismatch.

4. Logical errors: These errors occur due to unreachable code or an infinite loop.

1.4 Frequently Asked Questions


Ques-1: What are the different types of translator?

Answer: Compiler, Interpreter, Assembler.

Ques-2: Define compiler.

Answer: Compiler is a translator which is used to convert programs in high-level language
to low-level language.

Ques-3: What are the phases of a compiler?

Answer: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code
Generation, Code Optimization, Code Generation.

Ques-4: In a compiler, the module that checks every character of the source text is called?

Answer: The lexical analyzer.

Ques-5: Mention some of the cousins of a compiler.

Answer: Pre-processors, Assemblers, Loaders and link editor.


Practical-2

2.1 Aim: To construct a program to produce categories of DFAs.


2.2 Input: Category of DFA
• Exactly
• At least
• At most

2.3 Expected Output: Transition table of selected category of DFA and
acceptance or rejection of string by DFA.

Figure 2.1: Algorithm


2.4 Flowchart
Figure 2.2: Flowchart
2.5 Source Code
#include<iostream>
using namespace std;
int main()
{
char input[10];
char cond[10],string[10],z;
int j=0,k=1,m,table[10][10],i=0,choice,states,n,n1,flag=0,n2,input1[10],cond1[10],string1[10];
cout<<"\n\t 1: Exactly";
cout<<"\n\t 2: At least";
cout<<"\n\t 3: At most";

cout<<"\n\tEnter your choice: ";


cin>>choice;
switch(choice)
{

//Case-1
case 1:
{
cout<<"\n\tEnter the input alphabets: ";
for(i=0;i<2;i++)
{
cin>>input[i];
}
for(i=0;i<2;i++)
{
input1[i]=i;
}
cout<<"\n\tEnter the length of string: ";
cin>>n;
cout<<"\n\tEnter the string: ";
for(i=0;i<n;i++)
{
cin>>cond[i];
}
z=input[0];
for(i=0;i<n;i++)
{

if(cond[i]==z)
{
cond1[i]=0;
}
else
{
cond1[i]=1;
}
}
states=n+1;
k=cond1[0];

for(i=0;i<states;i++)
{
for(j=0;j<2;j++)
{

if(i==states-1)
{
if(j==k)
{
table[i][j]=states;
}
else
{
table[i][j]=i;
}
}

else
{
if(j==k)
{
table[i][j]=i+1;
}
else
{
table[i][j]=i;
}
}
}

}
for(i=0;i<states;i++)
{
for(j=0;j<2;j++)
{
if(table[i][j]==states)
{
table[states][0]=states;
table[states][1]=states;
break;
}
}
}

cout<<"\n\tThe Transition Table is: \n";


cout<<" ";
for(j=0;j<2;j++)
{
cout<<"\t"<<input[j];
}
cout<<endl;

for(i=0;i<=states;i++)
{ cout<<"q"<<i<<"\t";
for(j=0;j<2;j++)
{
cout<<"q"<<table[i][j]<<"\t";
}
cout<<endl;
}
cout<<"\n\tEnter the length of required string: ";
cin>>n1;
cout<<"\n\tEnter the string: ";
for(i=0;i<n1;i++)
{
cin>>string[i];
}
for(i=0;i<n1;i++)
{
if(string[i]==z)
{
string1[i]=0;
}
else
{
string1[i]=1;
}
}

// Simulate the DFA: start in q0 and follow one transition per input symbol.
i=0;
for(k=0;k<n1;k++)
{
i=table[i][string1[k]];
}
// q(states-1) is the only final state; q(states) is the dead state.
if(i==states-1)
{
cout<<"\tString is Accepted";
}
else
{
cout<<"\tString not Accepted";
}
break;
}

//Case-2
case 2:
{
cout<<"\n\tEnter the input alphabets: ";
for(i=0;i<2;i++)
{
cin>>input[i];
}
for(i=0;i<2;i++)
{
input1[i]=i;
}
cout<<"\n\tEnter the length of string: ";
cin>>n;
cout<<"\n\tEnter the string: ";
for(i=0;i<n;i++)
{
cin>>cond[i];
}
z=input[0];
for(i=0;i<n;i++)
{
if(cond[i]==z)
{
cond1[i]=0;
}
else
{
cond1[i]=1;
}
}

states=n+1;
k=cond1[0];

for(i=0;i<states;i++)
{
for(j=0;j<2;j++)
{

if(i==states-1)
{
table[i][j]=i;
}

else
{
if(j==k)
{
table[i][j]=i+1;
}
else
{
table[i][j]=i;
}
}
}
}
cout<<"\n\tThe Transition Table is: \n";
cout<<" ";
for(j=0;j<2;j++)
{
cout<<" \t"<<input[j];
}
cout<<endl;
for(i=0;i<states;i++)
{ cout<<"q"<<i<<"\t";
for(j=0;j<2;j++)
{
cout<<"q"<<table[i][j]<<"\t";
}
cout<<endl;
}
cout<<"\n\tEnter the length of required string: ";
cin>>n1;
cout<<"\n\tEnter the string: ";
for(i=0;i<n1;i++)
{
cin>>string[i];
}
z=input[0];
for(i=0;i<n1;i++)
{
if(string[i]==z)
{
string1[i]=0;
}
else
{
string1[i]=1;
}
}

// Simulate the DFA from q0. The final state q(states-1) loops on every symbol,
// so once it is reached the string stays accepted.
i=0;
for(k=0;k<n1;k++)
{
i=table[i][string1[k]];
}
if(i==states-1)
{
cout<<"\tString is Accepted";
}
else
{
cout<<"\tString not Accepted";
}
break;
}
//Case-3
case 3:
{
cout<<"\n\tEnter the input alphabets: ";
for(i=0;i<2;i++)
{
cin>>input[i];
}
for(i=0;i<2;i++)
{
input1[i]=i;
}
cout<<"\n\tEnter the length of string: ";
cin>>n;
cout<<"\n\tEnter the string: ";
for(i=0;i<n;i++)
{
cin>>cond[i];
}
z=input[0];
for(i=0;i<n;i++)
{

if(cond[i]==z)
{
cond1[i]=0;
}
else
{
cond1[i]=1;
}
}

states=n+1;
k=cond1[0];

for(i=0;i<states;i++)
{
for(j=0;j<2;j++)
{

if(i==states-1)
{

if(j==k)
{
table[i][j]=states;
}
else
{
table[i][j]=i;
}
}

else
{
if(j==k)
{
table[i][j]=i+1;
}
else
{
table[i][j]=i;
}
}
}
}
for(i=0;i<states;i++)
{
for(j=0;j<2;j++)
{
if(table[i][j]==states)
{
table[states][0]=states;
table[states][1]=states;
break;
}
}
}
cout<<"\n\tThe Transition Table is: \n";
cout<<" ";
for(j=0;j<2;j++)
{
cout<<" \t"<<input[j];
}
cout<<endl;
for(i=0;i<=states;i++)
{ cout<<"q"<<i<<"\t";
for(j=0;j<2;j++)
{
cout<<"q"<<table[i][j]<<"\t";
}
cout<<endl;
}
cout<<"\n\tEnter the length of required string: ";
cin>>n1;
cout<<"\n\tEnter the string: ";
for(i=0;i<n1;i++)
{
cin>>string[i];
}
z=input[0];
for(i=0;i<n1;i++)

{
if(string[i]==z)
{
string1[i]=0;
}
else
{
string1[i]=1;
}
}

// Simulate the DFA from q0. Every state except the dead state q(states) is final.
i=0;
for(k=0;k<n1;k++)
{
i=table[i][string1[k]];
}
if(i!=states)
{
cout<<"\tString is Accepted";
}
else
{
cout<<"\tString not Accepted";
}
break;
}
}
return 0;
}
2.6 Output

Figure 2.3: Output-1

Figure 2.4: Output-2


Figure 2.5: Output-3

2.7 Frequently Asked Questions

Ques-1: Give the transition function of a DFA.
Answer: δ: Q × Σ → Q.
Ques-2: What are the various categories of DFA?
Answer: Exactly, At least, At most, Ending etc.
Ques-3: What is the purpose of DFA in Compiler Design?
Answer: DFA is used to recognize the tokens during the lexical analysis phase of a
compiler.
Ques-4: A language for which no DFA exists is a
Answer: Non-regular language.
Ques-5: What are the various formats to represent DFA?
Answer: Transition graph, Transition table.
Ques-6: The language of an automaton is
Answer: The set of all strings accepted by the automaton.

FAQs:-

Q1. Can a Dfa have multiple final states?


Ans. Yes.

Q2. Can a DFA have zero states?

Ans. No. The set of states Q must be non-empty because it has to contain the initial state,
so the number of states of a DFA is >= 1.

Q3. Can a DFA have no final state?

Ans. Yes. A DFA has exactly one initial state, but it can have zero, one or more than one final
states. A DFA with no final states rejects every input, so it recognizes L=∅.

Q4. Define DFA.

Ans. A deterministic finite automaton (M) is defined as a five-tuple (Q, ∑, δ, q0, F)

where
Q = a finite, non-empty set of states.
∑ = a finite, non-empty set of input symbols.
δ is the state-transition function, δ: Q × ∑ → Q.
q0 is the initial state, an element of Q.
F is the set of final states, a subset of Q.

Q5. Is NFA more powerful than DFA?

Ans. No, DFAs and NFAs define the same class of languages, the regular languages.
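
The five-tuple definition maps directly onto a small data structure. The sketch below is only an illustration: the struct name Dfa and the builder exactly are hypothetical, and the builder constructs the "exactly n occurrences of one symbol" category from this practical with states q0 to qn and one extra dead state.

```cpp
#include <set>
#include <string>
#include <vector>

// (Q, Σ, δ, q0, F) with states numbered 0 .. delta.size()-1.
struct Dfa {
    std::string alphabet;                 // Σ
    std::vector<std::vector<int> > delta; // δ: delta[state][symbol index]
    int start;                            // q0
    std::set<int> finals;                 // F

    // Follows one transition per symbol and checks the state reached.
    bool accepts(const std::string& w) const {
        int q = start;
        for (std::size_t i = 0; i < w.size(); ++i) {
            std::size_t a = alphabet.find(w[i]);
            if (a == std::string::npos) return false;  // symbol outside Σ
            q = delta[q][a];
        }
        return finals.count(q) > 0;
    }
};

// Hypothetical builder: strings over `alphabet` with exactly n occurrences
// of `counted`. State n is final, state n+1 is the dead state.
Dfa exactly(int n, const std::string& alphabet, char counted) {
    Dfa d;
    d.alphabet = alphabet;
    d.start = 0;
    d.finals.insert(n);
    const int dead = n + 1;
    d.delta.assign(n + 2, std::vector<int>(alphabet.size(), 0));
    for (int q = 0; q <= dead; ++q) {
        for (std::size_t a = 0; a < alphabet.size(); ++a) {
            if (q == dead) d.delta[q][a] = dead;                     // trap
            else if (alphabet[a] == counted) d.delta[q][a] = q + 1;  // count one more
            else d.delta[q][a] = q;                                  // ignore other symbols
        }
    }
    return d;
}
```

With exactly(2, "ab", 'a'), the string aba is accepted, while a and aaa are rejected.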

Output screenshot:
Practical-3

3.1 Aim: To construct a program for finding out the number of comments from an
input program.

3.2 Input: A C++ language program.

3.3 Expected Output: Number of comments (both single and multiline) in the
program.

3.4 Algorithm
Input

• Choosing whether to input program through file or console


• Source Code in C++

Output

• Display the number of singleline and multiline comments

Begin

Algorithm:-
step1:- Initialise the counters n_o_s_c_l = 0 (number of single line comments),
n_o_m_c_l = 0 (number of multiple line comments) and t_c = 0 (total number of comments).
step2:- Input the file name and open it in the read mode.
step3:- //for single line comments:-
while ((ch = fgetc(fp1)) != EOF)
if (ch == '/')
then if ((ch = fgetc(fp1)) == '/')
then n_o_s_c_l++;
step4:- go back to the starting of the file.
step5:- //for multiple line comments:-
while ((ch = fgetc(fp1)) != EOF)
if (ch == '/')
then if ((ch = fgetc(fp1)) == '*')
then n_o_m_c_l++;
step6:- //for total number of comments
t_c = n_o_m_c_l + n_o_s_c_l
End
Figure 3.1: Algorithm
3.5 Flowchart

Figure 3.2: Flowchart

3.6 Source Code

#include<iostream>
#include<cstdio>
using namespace std;

FILE *fp;
int single=0,multi=0,k=0,choice;

// Skips the rest of the current line and counts one single line comment.
void single_comment()
{
int d;
while((d=fgetc(fp))!=EOF)
{
if(d=='\n')
break;
}
single++;
}

// Skips up to the closing "*/" and counts one multiline comment.
void multi_comment()
{
int d,prev=0;
while((d=fgetc(fp))!=EOF)
{
if(prev=='*' && d=='/')
{
multi++;
return;
}
prev=d;
}
}

// Checks whether the character c starts a comment.
void check(int c)
{
int d;
if(c=='/')
{
if((d=fgetc(fp))=='*')
multi_comment();
else if(d=='/')
single_comment();
}
}

int main()
{
char program[1000];
int c;
cout<<"\n\t 1: From File";
cout<<"\n\t 2: From Console";
cout<<"\n\t Enter your choice: ";
cin>>choice;
switch(choice)
{
case 1:
fp = fopen ("comments.txt","r") ;
if(fp==NULL)
{
cout<<"Unable to open comments.txt\n";
return 1;
}
while((c=fgetc(fp))!=EOF)
check(c);
fclose(fp);
break;
case 2:
cin.ignore();
cout<<"Enter the required program "<<"\n";
cin.getline(program,1000);
while(program[k]!='\0')
{
if(program[k]=='/' && program[k+1]=='/')
{
single++;
k+=2;
// The console input is a single line, so a typed backslash followed by n
// is treated as the end of the comment line.
while(program[k]!='\0')
{
if((program[k]=='\\' && program[k+1]=='n')|| program[k]=='\n')
break;
k++;
}
}
else if( program[k]=='/' && program[k+1]=='*')
{
k+=2;
while(program[k]!='\0')
{
if(program[k]=='*' && program[k+1]=='/')
{
multi++;
k+=2;
break;
}
k++;
}
}
else
k++;
}
break;
}
cout<<"\n"<<"No of singleline comments are: "<<single<<"\n";
cout<<"No of multiline comments are: "<<multi<<"\n";
return 0;
}


3.7 Output
Figure 3.3: Output-1
Figure 3.4: Output-2

3.8 Frequently Asked Questions


Ques-1: What is a comment?

Answer: A comment is a programmer-readable explanation in the source code of a
computer program.

Ques-2: What are different types of comments?

Answer: Singleline comments, Multiline comments.

Ques-3: Purpose of Comments.

Answer: Planning and reviewing, Code Description, Debugging, Resource inclusion.

Ques-4: How to define single line comments?

Answer: Single line comment can be defined using single “//”.

Ques-5: What is used to write multi line comment in C++?


Answer: /* …. */.
Practical-4
4.1 Aim: To construct a lexical analyzer.

4.2 Input: C code.

4.3 Expected Output: Display the identifiers, operators, keywords present in code.

4.4 Algorithm
Input

• Source Code in C
• Details:-

• Lexical analysis is the first phase of a compiler. It takes the modified source code
from the language preprocessors, written in the form of sentences. The lexical
analyzer breaks this text into a series of tokens, removing any whitespace or
comments in the source code.
• If the lexical analyzer finds a token invalid, it generates an error. It reads character
streams from the source code, checks for legal tokens, and passes the data to the
syntax analyzer when it demands.

Output

• Display the identifiers, operators, keywords present in input code

Begin

1. Define the tokens of the language for which the lexical analyser is to be built
2. Define the identifiers, keywords, operators for the source program that is to be
scanned by the lexical analyser
3. Traverse the code character by character to identify the keywords, operators and
identifiers
4. Return a token id when an identifier, keyword or operator is found

End
Figure 4.1: Algorithm
4.5 Flowchart

Figure 4.2: Flowchart


4.6 Source Code

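#include<iostream>
#include<fstream>
#include<string.h>
#include<ctype.h>
using namespace std;
// Returns 1 if the word is a C keyword, 0 otherwise.
int keyword(char a[])
{
char keywords[32][10]={"auto","break","case","char","const","continue","default",
"do","double","else","enum","extern","float","for","goto",
"if","int","long","register","return","short","signed",
"sizeof","static","struct","switch","typedef","union",
"unsigned","void","volatile","while"};
int i,flag=0;
for(i=0;i<32;i++){
if(strcmp(keywords[i],a)==0)
{
flag=1;
break;
}
}
return flag;
}
// Prints the category of a completed word.
void classify(char a[])
{
if(keyword(a)==1)
cout<<a<<" is keyword\n";
else if(isdigit((unsigned char)a[0]))
cout<<a<<" is constant\n";
else
cout<<a<<" is identifier\n";
}
int main(){
char c, a[32], operators[] = "+-*/%=";
ifstream fobject("code.txt");
int i,j=0;
if(!fobject)
{
cout<<"Unable to open code.txt\n";
return 1;
}
// get(c) fails at end of file, so no invalid character is ever processed.
while(fobject.get(c))
{
if(isalnum((unsigned char)c) || c=='_')
{
// Collect the characters of the current word (extra characters are dropped).
if(j<31)
a[j++]=c;
}
else
{
// Any other character ends the current word.
if(j!=0)
{
a[j]='\0';
j=0;
classify(a);
}
for(i=0;i<6;i++)
{
if(c==operators[i])
cout<<c<<" is operator\n";
}
}
}
// Flush a word that ends exactly at end of file.
if(j!=0)
{
a[j]='\0';
classify(a);
}
fobject.close();
return 0;
}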

4.7 Output

Figure 4.3: Output

4.8 Frequently Asked Questions


Ques-1: The process of forming tokens from an input stream of characters is called .
Answer: Tokenization.
Ques-2: Define Lexemes and Tokens.

Answer:

• Lexemes are the words derived from the character input stream.
• Tokens are lexemes mapped into a token-name and an attribute-value.
Ques-3: Tool for generating lexical analyzer.

Answer: Lex.

Ques-4: The output of lexical analyzer is.

Answer: Set of tokens

Ques-5: When expression sum=3+2 is tokenized then what is the token category of 3.
Answer: Integer Literal.
Practical-5

5.1 Aim: To construct a program to parse a string for a grammar.

5.2 Input: Context free grammar.

5.3 Expected Output: String generated and its derivation.

5.4 Algorithm
Input

• No. of Production Rules


• Enter the LHS and RHS of rule
• Enter the string to be parsed
• Enter the start symbol

Output

• Whether the string can be parsed or not

Begin

1: Start

2: Enter the production rules for grammar

3: Enter the string to generate

4: Print the parsing steps

5: Stop

ALGORITHM:
· Start.
· Initialise count -> 0, input the string in expr, and set l -> length of expr
· Set expr -> expr + "$"
· Output the production rule of the start symbol (E) followed by the production rule of the
non-terminals in E, i.e. T.
· If expr[count] = '+' then
output "E'->+TE'" and count->count+1
Else if expr[count] = '-'
output "E'->-TE'" and count->count+1
else output "E'->null" and goto next step
· If expr[count] = '*' then
output "T'->*FT'" and count->count+1
Else if expr[count] = '/'
output "T'->/FT'" and count->count+1
else output "T'->null" and goto next step
· If expr[count] is an alphabet
then output "F->id" and count->count+1
Else if expr[count] is a digit
then output "F->digit" and count->count+1
Else if expr[count] is '('
then output "F->(E)", count->count+1 and goto step 4;
afterwards, if expr[count] is not ')' output "Rejected" and Stop
else output "Rejected" and Stop.
· If l = count then
output "Accepted"
else output "Rejected"

End
Figure 5.1: Algorithm
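
The algorithm above corresponds to a recursive descent recogniser for the grammar E->TE', E'->+TE' | -TE' | null, T->FT', T'->*FT' | /FT' | null, F->id | digit | (E). The sketch below is one possible way to write it, with one function per non-terminal; the class name ExprParser is hypothetical, every identifier is assumed to be a single letter, and the sketch only reports acceptance without printing the productions used.

```cpp
#include <cctype>
#include <string>

// Hypothetical recursive descent recogniser for arithmetic expressions.
class ExprParser {
public:
    // "$" is appended as the end marker, as in the algorithm.
    explicit ExprParser(const std::string& s) : expr_(s + "$"), count_(0) {}

    // Accepts only if E matches and the whole input has been consumed.
    bool parse() { return E() && count_ == expr_.size() - 1; }

private:
    bool E() { return T() && Eprime(); }

    bool Eprime() {
        if (expr_[count_] == '+' || expr_[count_] == '-') {
            ++count_;
            return T() && Eprime();
        }
        return true;  // E' -> null
    }

    bool T() { return F() && Tprime(); }

    bool Tprime() {
        if (expr_[count_] == '*' || expr_[count_] == '/') {
            ++count_;
            return F() && Tprime();
        }
        return true;  // T' -> null
    }

    bool F() {
        char c = expr_[count_];
        if (std::isalnum(static_cast<unsigned char>(c))) {  // F -> id | digit
            ++count_;
            return true;
        }
        if (c == '(') {                                     // F -> (E)
            ++count_;
            if (!E()) return false;
            if (expr_[count_] != ')') return false;
            ++count_;
            return true;
        }
        return false;
    }

    std::string expr_;
    std::size_t count_;  // index of the current input symbol
};
```

For example, ExprParser("a+b*5").parse() accepts the string, while ExprParser("a+").parse() rejects it because F finds the end marker instead of an operand.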
5.5 Flowchart

Figure 5.2: Flowchart


5.6 Source Code

#include <iostream>
#include<cstring>
#include<cstdlib>
using namespace std;
//char*arrlhs;
//char**arrrhs;
char str[10];
char strred[30][10];
char rhs[10][10];
int n;
int len_of_string;
int count_reduce=-1;
int counter_for_reduction=0;
int string_counter=0;
void check_rule_for_start(char,char[],int);
void check_rule_for_LHS(char[],int,int);
void reduce(int,char[],int);
void display();
int main()
{
cout<<"Enter the number of rules: ";
cin>>n;
char arrlhs[10];
int i=0;
//char rhs[10][10];
// char strred[30][10];
while(i<n)
{
cout << "Enter the LHS and RHS of the rule"<<"\n";
cin>>arrlhs[i];
//cout<<arrlhs[i]<<endl;
//cout<<"Enter the rhs of the rule"<<endl;
cin>>rhs[i];
i++;
}
//char str[10];
cout<<"Enter the string: ";
cin>>str;
int len=strlen(str);
len_of_string=len;
char start_symbol;
cout<<"Enter the start symbol: ";
cin>>start_symbol;
check_rule_for_start(start_symbol,arrlhs,0);
display();
return 0;
}

void check_rule_for_start(char symbol,char arrlhs[],int str_co)


{
for(int i=0;i<n;i++)
{
if(symbol==arrlhs[i])
{
count_reduce++;
reduce(i,arrlhs,str_co);
//count_reduce--;
}
}
}
void check_rule_for_LHS(char arrlhs[],int leng,int str_co)
{
//cout<<"ye";
int loc_co=count_reduce;
int loc_strc=string_counter;
for(int k=0;k<leng;k++)
{
//cout<<" hehe "<<leng<<" string co "<<string_counter;
if(string_counter==len_of_string)
{string_counter=loc_strc;
return;
/*//cout<<endl<<"parsed"<<endl;
for(int l=0;l<count_reduce;l++)
{
cout<<strred[l]<<" - ";
}
cout<<str;
//exit(1);*/
}
if(strred[loc_co][k]>=97&&strred[loc_co][k]<=122)
{
//cout<<"ho"<<string_counter;
if(strred[loc_co][k]==str[string_counter])
{
//str_co++;
string_counter++;
}
else
{
string_counter=loc_strc;
return;
}
}

else if(strred[loc_co][k]>=65&&strred[loc_co][k]<=90)
{
//cout<<"hello";
counter_for_reduction=k;
check_rule_for_start(strred[loc_co][k],arrlhs,string_counter);
}
}
if(string_counter==len_of_string) {
cout<<endl<<"Parsed"<<endl;

for(int l=0;l<count_reduce;l++)
{
cout<<strred[l]<<" - ";}
cout<<str;
exit(0);}
return;
}
void reduce(int counter,char arrlhs[],int str_co)
{
int prev=0;
for(int z=0;z<10;z++)
{
if(z<counter_for_reduction||z>counter_for_reduction+(int)strlen(rhs[counter]))
{
// Copy the untouched part of the previous sentential form (none exists for the first one).
strred[count_reduce][z]=(count_reduce>0)?strred[count_reduce-1][prev]:'\0';
prev++;
}
else
{
cout<<endl;
prev++;
// Write the right-hand side of the chosen rule into the new sentential form.
for(int k=0;k<(int)strlen(rhs[counter]);k++)
{
char temp=rhs[counter][k];
strred[count_reduce][k]=temp;
z++;
}
}
}
//count_reduce++;
check_rule_for_LHS(arrlhs,strlen(rhs[counter]),str_co);
return;
}

void display(){
cout<<"The RHS after reductions are as follows"<<endl;
for(int i=0;i<count_reduce;i++){
cout<<strred[i]<<endl;
}
}

5.7 Output

Figure 5.3: Output


5.8 Frequently Asked Questions

Ques-1: The entity which generates a language is termed as:

Answer: Grammar.
Ques-2: The grammar can be defined as: G=(V, ∑, p, S)
In the given definition, what does S represent?
Answer: Starting Variable.
Ques-3: Which expression is appropriate?
For production p: a->b where a∈V and b∈ _
Answer: (V∪∑)*.
Ques-4: Are all regular grammars context free?

Answer: Yes.

Ques-5: Are ambiguous grammars context free?

Answer: Yes.
Practical-6

6.1 Aim: Implementation of LL parser.

6.2 Input: Input grammar rules, first and follow of non-terminals.

6.3 Expected Output

1. LL(1) parsing table.


2. If there are more than one entry generated in one cell, then a message is displayed
such as “The given grammar is not LL1” otherwise “The given grammar is LL1”.

METHOD:-
Predictive parser is a recursive descent parser, which has the capability to predict which production
is to be used to replace the input string. The predictive parser does not suffer from backtracking. To
accomplish its tasks, the predictive parser uses a look-ahead pointer, which points to the next input
symbols. To make the parser backtracking free, the predictive parser puts some constraints on the
grammar and accepts only a class of grammars known as LL(k) grammars. Predictive parsing uses a
stack and a parsing table to parse the input and generate a parse tree. Both the stack and the input
contain an end symbol $ to denote that the stack is empty and the input is consumed. The parser
refers to the parsing table to take any decision on the input and stack element combination.

Fig .6.1

In recursive descent parsing, the parser may have more than one production to choose from for a
single instance of input, whereas in a predictive parser, each step has at most one production to
choose. There might be instances where there is no production matching the input string, causing
the parsing procedure to fail.

Algorithm:-
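i. Read the input string.
ii. Using the predictive parsing table, parse the given input string using a stack.
iii. If the top of the stack matches the current input token, pop it and advance the input; otherwise
replace the non-terminal on top of the stack with the right-hand side of the production given by
the parsing table. Repeat the process until $ is reached.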
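
The steps above can be sketched as a small table-driven routine. The function name llParse and the table representation (a map from a non-terminal and a lookahead symbol to the right-hand side to expand) are assumptions for illustration; upper-case letters are taken to be non-terminals, every other character a terminal, and "#" stands for the empty string.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::pair<char, char>, std::string> ParseTable;

// Hypothetical table-driven predictive parser: returns true if `input`
// can be derived from `start` using the given LL(1) table.
bool llParse(const ParseTable& table, char start, const std::string& input) {
    std::string in = input + "$";          // end marker on the input
    std::vector<char> stack;
    stack.push_back('$');                  // end marker on the stack
    stack.push_back(start);
    std::size_t pos = 0;
    while (!stack.empty()) {
        char top = stack.back();
        char look = in[pos];
        if (top == '$')                    // stack exhausted: input must be too
            return look == '$' && pos == in.size() - 1;
        if (!std::isupper(static_cast<unsigned char>(top))) {
            if (top != look) return false; // terminal mismatch
            stack.pop_back();              // match: pop and advance
            ++pos;
        } else {
            ParseTable::const_iterator it = table.find(std::make_pair(top, look));
            if (it == table.end()) return false;  // empty table cell
            stack.pop_back();
            const std::string& rhs = it->second;
            if (rhs != "#")                // push the right-hand side in reverse
                for (std::string::const_reverse_iterator r = rhs.rbegin();
                     r != rhs.rend(); ++r)
                    stack.push_back(*r);
        }
    }
    return false;
}
```

For the grammar S -> CC, C -> cC | d, the table has the entries (S,c) = CC, (S,d) = CC, (C,c) = cC and (C,d) = d; with it, llParse accepts dd and cdccd and rejects d.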

6.4 Algorithm

Input
• Enter the no. of productions
• Enter the productions
• Enter the first values
• Enter the follow values
Output
• LL(1) parsing table
• Determine given grammar is LL1 or not
Begin:
1. Generate FIRST and FOLLOW’s for all the non-terminals in the provided
grammar
2. Now generate FIRST of right side of production rules
3. The predictive parser makes use of one look ahead token
4. If k look ahead tokens are needed, then we say the grammar is LL(k).
a. For k > 1, the columns are the possible sequences of k tokens, and the
tables become large

End
Figure 6.1: Algorithm
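
The program in this practical asks the user to type in the FIRST and FOLLOW values. As a complement, step 1 can also be automated; the sketch below computes the FIRST sets by repeating the usual rules until nothing changes. The function name computeFirst is hypothetical, productions are assumed to be written as "A->rhs" with one alternative per string, upper-case letters are non-terminals and '#' stands for the empty string, matching the input format of the program.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: computes FIRST for every non-terminal that appears
// on a left-hand side.
std::map<char, std::set<char> > computeFirst(const std::vector<std::string>& prods) {
    std::map<char, std::set<char> > first;
    for (std::size_t p = 0; p < prods.size(); ++p) first[prods[p][0]];
    bool changed = true;
    while (changed) {                       // repeat until a fixed point is reached
        changed = false;
        for (std::size_t p = 0; p < prods.size(); ++p) {
            char lhs = prods[p][0];
            std::string rhs = prods[p].substr(3);   // text after "A->"
            std::size_t before = first[lhs].size();
            bool allNullable = true;        // true while every symbol so far can vanish
            for (std::size_t s = 0; s < rhs.size(); ++s) {
                char sym = rhs[s];
                if (sym == '#') break;      // explicit empty production
                if (!std::isupper(static_cast<unsigned char>(sym))) {
                    first[lhs].insert(sym); // a terminal starts the right-hand side
                    allNullable = false;
                    break;
                }
                std::set<char> fs = first[sym];     // copy, so lhs == sym is safe
                bool nullable = fs.count('#') > 0;
                fs.erase('#');
                first[lhs].insert(fs.begin(), fs.end());
                if (!nullable) { allNullable = false; break; }
            }
            if (allNullable) first[lhs].insert('#');
            if (first[lhs].size() != before) changed = true;
        }
    }
    return first;
}
```

For the productions E->TR, R->+TR, R->#, T->FY, Y->*FY, Y->#, F->(E), F->i, the sketch gives FIRST(E) = FIRST(T) = FIRST(F) = { (, i }, FIRST(R) = { +, # } and FIRST(Y) = { *, # }.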
6.5 Flowchart

Figure 6.2: Flowchart


6.6 Source Code

#include<iostream>
#include<string.h>
#include<ctype.h>
using namespace std;
int main()
{
char pro[10][10],first[10][10],follow[10][10],nt[10],ter[10],res[10][10][10],temp[10];
int npro,noter=0,nont=0,i,j,k,flag=0,count[10][10],row=0,col,l,m,n,index;
for(i=0;i<10;i++)
{
for(j=0;j<10;j++)
{
count[i][j]=0;
for(k=0;k<10;k++)
{
res[i][j][k]='\0';
}
}
}
cout<<"Enter the no of productions:";
cin>>npro;
cout<<"Enter the productions:";
for(i=0;i<npro;i++)
{
cin>>pro[i];
}
for(i=0;i<npro;i++)
{
flag=0;
for(j=0;j<nont;j++)
{
if(nt[j]==pro[i][0])
{
flag=1;
}
}
if(flag==0)
{
nt[nont]=pro[i][0];
nont++;
}
}
cout<<"\nEnter the first values:\n";
for(i=0;i<nont;i++)
{
cout<<"First value("<<nt[i]<<"):";
cin>>first[i];
}
cout<<"\nEnter the follow values:\n";
for(i=0;i<nont;i++)
{
cout<<"Follow value("<<nt[i]<<"):";
cin>>follow[i];
}
for(i=0;i<nont;i++)
{
flag=0;
for(j=0;j<strlen(first[i]);j++)
{
for(k=0;k<noter;k++)
{
if(ter[k]==first[i][j])
{
flag=1;
}
}
if(flag==0)
{
if(first[i][j]!='#')
{
ter[noter]=first[i][j];
noter++;
}
}
}
}
for(i=0;i<nont;i++)
{
flag=0;
for(j=0;j<strlen(follow[i]);j++)
{
for(k=0;k<noter;k++)
{
if(ter[k]==follow[i][j])
{
flag=1;
}
}
if(flag==0)
{
ter[noter]=follow[i][j];
noter++;
}
}
}
for(i=0;i<nont;i++)
{
for(j=0;j<strlen(first[i]);j++)
{
flag=0;
if(first[i][j]=='#')
{
col=i;
for(m=0;m<strlen(follow[col]);m++)
{
for(l=0;l<noter;l++)
{
if(ter[l]==follow[col][m])
{
row=l;
}
}
temp[0]=nt[col];
temp[1]='-' ;
temp[2]='>';
temp[3]='#';
temp[4]='\0';
cout<<"temp "<<temp;
strcpy(res[col][row],temp);
count[col][row]+=1;
for(k=0;k<10;k++){
temp[k]='\0'; }
}
}
else{
for(l=0;l<noter;l++)
{
if(ter[l]==first[i][j])
{
row=l;
}
}
for(k=0;k<npro;k++){
if(nt[i]==pro[k][0])
{
col=i;
if((pro[k][3]==first[i][j])&&(pro[k][0]==nt[col]))
{
strcpy(res[col][row],pro[k]);
count[col][row]+=1;
}
else
{
if((isupper(pro[k][3]))&&(pro[k][0]==nt[col]))
{
flag=0;
for(m=0;m<nont;m++)
{
if(nt[m]==pro[k][3]){index=m;flag=1;}
}
if(flag==1){
for(m=0;m<strlen(first[index]);m++)
{if(first[i][j]==first[index][m])
{strcpy(res[col][row],pro[k]);
count[col][row]+=1;}}}}}}}}}}
cout<<"LL1 Table\n\n";
flag=0;
for(i=0;i<noter;i++){
cout<<"\t"<<ter[i];
}
for(j=0;j<nont;j++){
cout<<"\n\n"<<nt[j];
for(k=0;k<noter;k++){
cout<<"\t"<<res[j][k];
if(count[j][k]>1){flag=1;}}}
if(flag==1){cout<<"\nThe given grammar is not LL1";}
else{cout<<"\nThe given grammar is LL1";}}

6.7 Output

Figure 6.3: Output


6.8 Frequently Asked Questions

Ques-1: Define Predictive Parser.

Answer: It is a recursive descent parser that can predict which production is to be used to
replace the input string, so it needs no backtracking.

Ques-2: Define LL Parser.

Answer: It is a top-down parser, that parses the input from left to right, performing
leftmost derivation of sentence.

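Ques-3: Can a string have more than one leftmost or more than one rightmost derivation in an
unambiguous grammar?

Answer: No. Each string has exactly one leftmost and exactly one rightmost derivation.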

Ques-4: What is the use of FIRST and FOLLOW?


Answer: To formalize the task of picking a production rule.

Ques-5: Consider the grammar
S → C C
C → c C | d
The grammar is:

Answer: LL(1).
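
The answer to Ques-5 can be checked mechanically. The following is only a minimal sketch, not the program of this practical: it assumes a grammar without epsilon productions, in which case the grammar is LL(1) exactly when the FIRST sets of the alternatives of each non-terminal are pairwise disjoint. The names Grammar, FirstMap, firstOfAlt, firstSets and isLL1NoEpsilon are hypothetical and introduced only for illustration.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Non-terminal -> list of alternatives. Uppercase letters are non-terminals,
// every other character is a terminal. Assumption: no epsilon productions.
using Grammar = std::map<char, std::vector<std::string>>;
using FirstMap = std::map<char, std::set<char>>;

// FIRST of one alternative, given the FIRST sets found so far.
std::set<char> firstOfAlt(const FirstMap& first, const std::string& rhs) {
    std::set<char> out;
    if (rhs.empty()) return out;              // epsilon is not handled here
    char x = rhs[0];
    if (std::isupper(static_cast<unsigned char>(x))) {
        auto it = first.find(x);
        if (it != first.end()) out = it->second;
    } else {
        out.insert(x);
    }
    return out;
}

// FIRST of every non-terminal, computed by iterating until nothing changes.
FirstMap firstSets(const Grammar& g) {
    FirstMap first;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& entry : g)
            for (const std::string& rhs : entry.second)
                for (char t : firstOfAlt(first, rhs))
                    if (first[entry.first].insert(t).second) changed = true;
    }
    return first;
}

// LL(1) test for epsilon-free grammars: no terminal may start two
// different alternatives of the same non-terminal.
bool isLL1NoEpsilon(const Grammar& g) {
    FirstMap first = firstSets(g);
    for (const auto& entry : g) {
        std::set<char> seen;
        for (const std::string& rhs : entry.second)
            for (char t : firstOfAlt(first, rhs))
                if (!seen.insert(t).second) return false;
    }
    return true;
}
```

For the grammar of Ques-5, the alternatives of C start with c and d respectively and S has a single alternative, so the check succeeds. A grammar such as S → aS | a fails it, because both alternatives start with a.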
Practical-7

7.1 Aim: Implementation of SLR parser.

7.2 Input: Grammar Rules.

7.3 Expected Output: SLR Parsing Table and display that the Grammar is
SLR or Not.

7.4 Algorithm

Input
• Enter the no. of production
• Enter the production
Details
• The LR parser is a non-recursive, shift-reduce, bottom-up parser. It handles a wide class
of context-free grammars, which makes it the most efficient syntax analysis technique.
LR parsers are also known as LR(k) parsers, where L stands for left-to-right scanning
of the input stream, R stands for the construction of a rightmost derivation in reverse,
and k denotes the number of lookahead symbols used to make decisions.

• There are three widely used algorithms for constructing an LR parser: SLR(1),
canonical LR(1) and LALR(1). This practical implements the first.

• SLR(1) – Simple LR Parser:
•• Works on the smallest class of grammars
•• Few states, hence a very small table
•• Simple and fast construction


Output

• Non-Terminal Symbols
• Terminal Symbols
• First of non-terminal
• Follow of non-terminal
• Canonical LR(0) collection for grammar
• SLR parsing table
• Grammar is SLR or not

Begin

Algorithm:

1. Open a text file using file operations.

2. Check whether first input begins with start symbol.

3. Check if there is only one production present.

4. In case there are more than one productions, push the inputs into the array.

5. Calculate and create items using the create items function.

6. Close the file.

End

Figure 7.1: Algorithm
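
The item sets listed under Output are built with two operations, closure and goto. The following is only a minimal sketch of these two operations on LR(0) items, written separately from the program in 7.6. It assumes productions are stored as strings such as "S->CC" (so index 3 is the start of the right-hand side, as in the program below), that uppercase letters are non-terminals, and that there are no '|' alternatives inside one string. The names Item, closure and gotoSet are hypothetical.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An LR(0) item: (production index, dot position inside the string).
using Item = std::pair<int, int>;

// closure: whenever the dot stands before a non-terminal B, add every
// production of B with the dot at the start of its right-hand side.
std::set<Item> closure(const std::vector<std::string>& prods, std::set<Item> items) {
    bool changed = true;
    while (changed) {
        changed = false;
        std::set<Item> snapshot = items;      // iterate over a copy while inserting
        for (const Item& it : snapshot) {
            const std::string& p = prods[it.first];
            if (it.second >= static_cast<int>(p.size())) continue;  // dot at the end
            char next = p[it.second];
            if (!std::isupper(static_cast<unsigned char>(next))) continue;
            for (int j = 0; j < static_cast<int>(prods.size()); ++j)
                if (prods[j][0] == next && items.insert(Item(j, 3)).second)
                    changed = true;
        }
    }
    return items;
}

// goto: move the dot over sym in every item that allows it, then close.
std::set<Item> gotoSet(const std::vector<std::string>& prods,
                       const std::set<Item>& from, char sym) {
    std::set<Item> moved;
    for (const Item& it : from) {
        const std::string& p = prods[it.first];
        if (it.second < static_cast<int>(p.size()) && p[it.second] == sym)
            moved.insert(Item(it.first, it.second + 1));
    }
    return closure(prods, moved);
}
```

With the augmented grammar X->S, S->CC, C->cC, C->d, the closure of the single item (X->.S) gives the four items of I0, and gotoSet on I0 with the symbol C gives the three items S->C.C, C->.cC and C->.d.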

7.5 Flowchart
Figure 7.2: Flowchart
7.6 Source Code

#include <string.h>
#include <stdio.h>
#include<iostream>
using namespace std;
char prod[100][100], term[100], non_term[100], action[100][100][100];
int term_len, non_term_len, first[100][100], follow[100][100],
hash1[100], no_of_prod, hash2[100], canonical[100][100][10], can_len,
go_to[100][100];

void calc_first(int k){


int j, flag=0, len, l, m, i;

if(hash1[k]) return;

hash1[k] = 1;
for(i=0;i<no_of_prod;i++){
if(prod[i][0]==non_term[k]){
flag = 0;
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(!flag){
if(prod[i][j]>='A'&&prod[i][j]<='Z'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
flag = 1;
if(hash1[l]){
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
else{
calc_first(l);
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
}
else if(prod[i][j]!='|'){
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
first[k][l] = 1;
flag = 1;
}
}
else{
if(prod[i][j]=='|') flag=0;
}
}
}
}
}
void calc_follow(int k){
int i, len, j, l, flag=0, m;

if(hash2[k]) return;

hash2[k] = 1;
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(flag){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
flag = 0;
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
if(l<non_term_len){
for(m=0;m<term_len;m++){
if(first[l][m]){
follow[k][m] = 1;
if(term[m]=='^') flag=1;
}
}
}
}
else{
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
if(l<term_len){
follow[k][l] = 1;
}

else{
if(prod[i][j]=='|'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){
if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}
flag=0;
}
}
if(prod[i][j]==non_term[k]){
flag = 1;
}
}
if(flag){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){
if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}
}

void print(int in){


int i, j, len, k;
cout<<"I"<<in<<":\n";

for(i=0;i<10;i++){
if(canonical[in][no_of_prod-1][i]!=-1){
len = strlen(prod[no_of_prod-1]);
for(j=0;j<=len;j++){
if(j==canonical[in][no_of_prod-1][i]) printf(".");
cout<<prod[no_of_prod-1][j];
}
puts("");
}
else break;
}

for(i=0;i<no_of_prod-1;i++){
for(k=0;k<10;k++){
if(canonical[in][i][k]!=-1){
len = strlen(prod[i]);
for(j=0;j<=len;j++){
if(j==canonical[in][i][k]) printf(".");
cout<<prod[i][j];

}
puts("");
}
else break;
}
}
}

void closure(int arr[][10]){


long long i, j, flag=1, k, l;
char c;

for(i=0;i<10;i++){
if(arr[no_of_prod-1][i]!=-1){
c = prod[no_of_prod-1][arr[no_of_prod-1][i]];
for(j=0;j<no_of_prod;j++){
if(prod[j][0]==c){
for(k=0;k<10;k++){
if(arr[j][k]==3) break;
else if(arr[j][k]==-1){
arr[j][k] = 3;
break;
}
}
}
}
}
else break;
}
while(flag){
flag = 0;
for(i=0;i<no_of_prod;i++){
for(k=0;k<10;k++){
if(arr[i][k]!=-1){
c = prod[i][arr[i][k]];
for(j=0;j<no_of_prod;j++){
if(prod[j][0]==c){
for(l=0;l<10;l++){
if(arr[j][l]==3) break;
else if(arr[j][l]==-1){
arr[j][l] = 3;
flag = 1;
break;
}
}
}
}
}
else break;
}
}
}
}

int Goto(int in, int i, int n){


int j, ans=0, arr[100][10], flg=0, k, l, len;
char c;

for(j=0;j<100;j++)
for(k=0;k<10;k++) arr[j][k] = -1;

for(j=0;j<no_of_prod;j++){
for(k=0;k<10;k++){
if(canonical[in][j][k]!=-1){
if(n){
if(prod[j][canonical[in][j][k]]==non_term[i]){
for(l=0;l<10;l++){
if(arr[j][l]==canonical[in][j][k]+1) break;
else if(arr[j][l]==-1){
flg=1;
arr[j][l] = canonical[in][j][k]+1;
break;
}
}
}
}
else{
if(prod[j][canonical[in][j][k]]==term[i]){
for(l=0;l<10;l++){
if(arr[j][l]==canonical[in][j][k]+1) break;
else if(arr[j][l]==-1){
flg=1;
arr[j][l] = canonical[in][j][k]+1;
break;
}
}
}
}
}
else break;
}
}
if(!flg) return 0;
closure(arr);
for(j=0;j<can_len;j++){
flg = 0;
for(k=0;k<no_of_prod;k++){
for(l=0;l<10;l++){
if(canonical[j][k][l]!=arr[k][l]){
flg = 1;
break;
}
}
if(flg) break;
}
if(!flg){
ans = j;
break;
}
}

if(flg){
for(j=0;j<no_of_prod;j++){
for(l=0;l<10;l++) canonical[can_len][j][l] = arr[j][l];
}
ans = can_len;
can_len++;
}
if(n) go_to[in][i] = ans;
else sprintf(action[in][i], "s%d", ans);

if(flg){
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=0;j<10;j++){
if(arr[i][j]!=-1){
if(arr[i][j]==len){
if(i==no_of_prod-1)
sprintf(action[can_len-1][term_len-1], "acc");
else{
c = prod[i][0];
for(k=0;k<non_term_len;k++){
if(non_term[k]==c) break;
}
for(l=0;l<term_len;l++){
if(follow[k][l])
sprintf(action[can_len-1][l], "r%d\0", i+1);
}
}
}
}
else break;
}
}
}
return flg;
}
int main(){

int i, j, len, k, a, b, t, flag;


char c, temp[100];

can_len = 0;
term_len = non_term_len = 0;
cout<<"Enter the number of productions: ";
cin>>no_of_prod;
cout<<"Enter the productions:";
for(i=0;i<no_of_prod;i++){
cin>>prod[i];
len = strlen(prod[i]);
for(j=0;j<len;j++){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
for(k=0;k<non_term_len;k++){
if(non_term[k]==prod[i][j]) break;
}
if(k==non_term_len){
non_term[non_term_len] = prod[i][j];
non_term_len++;
}
}
else if(prod[i][j]!='-' && prod[i][j]!='>'
&& prod[i][j]!='|'){
for(k=0;k<term_len;k++){
if(term[k]==prod[i][j]) break;
}
if(k==term_len){
term[term_len] = prod[i][j];
term_len++;
}
}
}
}

cout<<"\nNon terminals are:";


cout<<non_term[0];
for(i=1;i<non_term_len;i++){
cout<<","<<non_term[i];
}

cout<<"\nTerminals are:";
cout<<term[0];
for(i=1;i<term_len;i++){
cout<<","<<term[i];
}

cout<<"\nFirst\n";
for(i=0;i<non_term_len;i++){
calc_first(i);
cout<<non_term[i]<<" = { ";
for(j=0;j<term_len;j++){
if(first[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(first[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";
}

term[term_len] = '$';
term_len++;

cout<<"\nFollow\n";
follow[0][term_len-1] = 1;
for(i=0;i<non_term_len;i++){
calc_follow(i);
cout<<non_term[i]<<" = { ";
for(j=0;j<term_len;j++){
if(follow[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(follow[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";
}

for(i=0;i<100;i++){
for(j=0;j<100;j++){
sprintf(action[i][j], "\0");
go_to[i][j] = -1;
}
}

sprintf(prod[no_of_prod], "X->%c\0", prod[0][0]);


no_of_prod++;

for(i=0;i<100;i++){
for(j=0;j<no_of_prod;j++){
for(k=0;k<10;k++) canonical[i][j][k] = -1;
}
}

puts("\nCanonical LR(0) collection for grammar");

canonical[0][no_of_prod-1][0] = 3;
closure(canonical[0]);
can_len++;
puts("");
print(0);

flag = 1;
while(flag){
flag = 0;
for(i=0;i<can_len;i++){
for(j=0;j<non_term_len;j++){
if(Goto(i, j, 1)){
cout<<"\nGOTO(I"<<i<<", "<<non_term[j]<<") ";
print(can_len-1);
flag = 1; }}
for(j=0;j<term_len;j++){
if(Goto(i, j, 0)){
cout<<"\nGOTO(I"<<i<<", "<<term[j]<<") ";
print(can_len-1);
flag = 1;}}}}

cout<<"\nSLR Parsing Table\n\n";


cout<<"State\t|\t\tAction";
for(i=0;i<term_len-2;i++)
printf("\t");
cout<<"|\tGoTo\n";
cout<<"--------------------------------------------------------------------------------\n\t|";
for(i=0;i<term_len;i++){
cout<<term[i]<<"\t";}
cout<<"|";
for(i=0;i<non_term_len;i++){
cout<<non_term[i]<<"\t";}
puts("\n--------------------------------------------------------------------------------");
for(i=0;i<can_len;i++){
printf("%d\t|", i);
for(j=0;j<term_len;j++){
cout<<action[i][j]<<"\t";
}
cout<<"|";
for(j=0;j<non_term_len;j++)
{if(go_to[i][j]!=-1) cout<<go_to[i][j]; cout<<"\t";
}
puts("");}
return 0; }

7.7 Output

Figure 7.3: Output


7.8 Frequently Asked Questions

FAQS:
Q1. The construction of the canonical collection of sets of LR(1) items is similar to the
construction of the canonical collection of sets of LR(0) items, with one exception. What is
that exception?
Ans. The closure and goto operations work a little differently, because each LR(1) item also
carries a lookahead symbol.
Q2. Assume that the SLR parser for a grammar G has n1 states and the LALR parser for G has n2
states. Derive the relation between n1 and n2.
Ans. The SLR parser handles a smaller class of grammars, but n1 is still equal to n2.
Q3. How soon can an LR parser detect a syntactic error?
Ans. As soon as it is possible to do so on a left-to-right scan of the input, i.e. the error is
found when the scanned part of the input can no longer be extended to a valid string.
Q4. What is an augmented grammar?
Ans. It is the original grammar with one extra production S' → S added, where S is the
original start symbol.
Q5. What are the steps required in calculating items?
Ans. Write the augmented grammar, find the closure of the initial item and calculate Goto for
each item set.

Practical-8

8.1 Aim: Implementation of CLR parser.

8.2 Input: Input a string to be parsed.

Details:-

The LALR(1) parser is less powerful than the LR(1) parser, and more powerful than the SLR(1)
parser, though they all use the same production rules. The simplification that the LALR parser
introduces consists in merging rules that have identical kernel item sets, because during the
LR(0) state-construction process the lookaheads are not known. This reduces the power of the
parser because not knowing the lookahead symbols can confuse the parser as to which grammar
rule to pick next, resulting in reduce/reduce conflicts. All conflicts that arise in applying a
LALR(1) parser to an unambiguous LR(1) grammar are reduce/reduce conflicts. The SLR(1)
parser performs further merging, which introduces additional conflicts.
The standard example of an LR(1) grammar that cannot be parsed with the LALR(1) parser,
exhibiting such a reduce/reduce conflict, is:
S → a E c
  | a F d
  | b F c
  | b E d
E → e
F → e

In the LALR table construction, two states will be merged into one state and later the
lookaheads will be found to be ambiguous. The one state with lookaheads is:
E → e. {c,d}
F → e. {c,d}
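
The merge described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not part of the program in this practical: a state is reduced to its complete items, each mapped to its set of lookaheads, and the names State, mergeSameCore and hasReduceReduceConflict are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>

// A state reduced to what matters here: each complete item (written like
// "E->e.") mapped to its lookahead set.
using State = std::map<std::string, std::set<char>>;

// Merge two states that have the same core by uniting the lookaheads
// of each item.
State mergeSameCore(const State& a, const State& b) {
    State merged = a;
    for (const auto& entry : b)
        merged[entry.first].insert(entry.second.begin(), entry.second.end());
    return merged;
}

// A reduce/reduce conflict exists when two different complete items
// share a lookahead symbol.
bool hasReduceReduceConflict(const State& s) {
    std::set<char> seen;
    for (const auto& entry : s)
        for (char la : entry.second)
            if (!seen.insert(la).second) return true;
    return false;
}
```

For the grammar above, the LR(1) state reached after "a e" holds E → e. {c} and F → e. {d}, and the state reached after "b e" holds E → e. {d} and F → e. {c}. Neither state has a conflict on its own, but the merged state has both items with lookaheads {c,d}, so the parser cannot choose between the two reductions.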
8.3
8.4 Expected Output: If the entered string satisfies the provided grammar rules,
then a message is displayed “accept the input” otherwise “error in input” or “do not accept
the input”.

8.5 Algorithm

Input

• Enter the no. of production


• Enter the production

Output

• Non-Terminal Symbols
• Terminal Symbols
• First of non-terminal
• Follow of non-terminal
• Canonical LR(1) collection for grammar
• CLR parsing table
• Message of acceptance displayed

Begin
1. Take the grammar rules of the language as input.
2. Build the augmented grammar.
3. Construct the collection of LR(1) item sets {I0, I1, ..., In} for the given grammar.
4. Each item carries one lookahead symbol.
5. The initial item set has '$' as its lookahead symbol.
6. For an item of the form [A -> α . B β, a], calculate FIRST(βa) as the lookahead of the
new items added for B.
7. Calculate the whole set of items in this way.
8. Build the CLR parsing table:

a) If an item set contains the item [S' -> S ., $], enter Accept under the $ column of that
state in the parsing table.

b) If an item set contains an item of the form [A -> w ., a], enter a reduce by that rule
number under each lookahead symbol a of the item.

9. Display the table.

End

Figure 8.1: Algorithm
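
The lookahead step of the algorithm (computing FIRST of β followed by a) can be sketched as below. This is a minimal sketch under stated assumptions rather than the routine used in 8.7: the FIRST sets of the non-terminals are assumed to be known already, '^' is assumed to mark epsilon as in the program of this practical, and the names FirstSets and firstOfBetaA are hypothetical.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>

// FIRST sets of the non-terminals, assumed to be computed beforehand.
// '^' inside a set means the non-terminal can derive epsilon.
using FirstSets = std::map<char, std::set<char>>;

// Lookaheads for the items that closure adds for B in [A -> alpha . B beta, a]:
// the terminals that can begin the string "beta a".
std::set<char> firstOfBetaA(const FirstSets& first, const std::string& beta, char a) {
    std::set<char> result;
    for (char x : beta) {
        if (!std::isupper(static_cast<unsigned char>(x))) {
            result.insert(x);                 // a terminal ends the scan
            return result;
        }
        bool nullable = false;
        auto it = first.find(x);
        if (it != first.end()) {
            for (char t : it->second) {
                if (t == '^') nullable = true;
                else result.insert(t);
            }
        }
        if (!nullable) return result;         // x cannot vanish, stop here
    }
    result.insert(a);                         // all of beta can vanish
    return result;
}
```

For example, when β is empty the result is just {a}, which is why the items added to I0 for the start symbol inherit '$'. When β is a non-terminal C with FIRST(C) = {c, d}, the new items get the lookaheads c and d.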


8.6 Flowchart

Figure 8.2: Flowchart

8.7 Source Code

#include <string.h>
#include <stdio.h>
#include<iostream>
using namespace std;
char prod[100][100], term[100], non_term[100], action[100][100][100];
int term_len, non_term_len, first[100][100], follow[100][100],
hash1[100], no_of_prod, hash2[100], canonical[100][20][10][10], can_len,
go_to[100][100], clr;
void calc_first(int k){
int j, flag=0, len, l, m, i;

if(hash1[k]) return;

hash1[k] = 1;
for(i=0;i<no_of_prod;i++){
if(prod[i][0]==non_term[k]){
flag = 0;
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(!flag){
if(prod[i][j]>='A'&&prod[i][j]<='Z'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
flag = 1;
if(hash1[l]){
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
else{
calc_first(l);
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
}
else if(prod[i][j]!='|'){
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
first[k][l] = 1;
flag = 1;
}
}
else{
if(prod[i][j]=='|') flag=0;
}
}
}
}
}

void calc_follow(int k){


int i, len, j, l, flag=0, m;

if(hash2[k]) return;

hash2[k] = 1;
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(flag){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
flag = 0;
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
if(l<non_term_len){
for(m=0;m<term_len;m++){
if(first[l][m]){
follow[k][m] = 1;
if(term[m]=='^') flag=1;
}
}
}
}
else{
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
if(l<term_len){
follow[k][l] = 1;
}
else{
if(prod[i][j]=='|'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){

if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}
flag=0;
}
}
if(prod[i][j]==non_term[k]){
flag = 1;
}
}
if(flag){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){
if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}
}

void print(int in){


int i, j, len, k, l, prnt;

cout<<"I"<<in<<":\n";
for(i=0;i<10;i++){
prnt = 0;
for(k=0;k<term_len;k++){
if(canonical[in][no_of_prod-1][i][k]){
if(!prnt){
len = strlen(prod[no_of_prod-1]);
for(j=0;j<=len;j++){
if(j==i) printf(".");
cout<<prod[no_of_prod-1][j];
}
cout<<","<<term[k];
prnt = 1;
}
else cout<<"|"<<term[k];
}
}
if(prnt) puts("");
}

for(l=0;l<no_of_prod-1;l++){
for(i=0;i<10;i++){
prnt = 0;
for(k=0;k<term_len;k++){
if(canonical[in][l][i][k]){
if(!prnt){
len = strlen(prod[l]);
for(j=0;j<=len;j++){
if(j==i) printf(".");

cout<<prod[l][j];
}
cout<<","<<term[k];
prnt = 1;
}
else cout<<"|"<<term[k];
}
}
if(prnt) puts("");
}
}
}

void Frst(char * str, int * arr){


int j, flag=0, len, l, m, i, k;

len = strlen(str);

for(j=0;j<len;j++){
if(!flag){
if(str[j]>='A'&& str[j]<='Z'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==str[j]) break;
}
flag = 1;
for(m=0;m<term_len;m++){
if(first[l][m]){
arr[m] = 1;
if(term[m]=='^') flag=0;
}

}
}
else if(str[j]!='|'){
for(l=0;l<term_len;l++){
if(term[l]==str[j]) break;
}
arr[l] = 1;
flag = 1;
}
}
else break;
}
}

void closure(int arr[][10][10]){


int i, j, flag=1, k, l, m, arr1[10], n;
char c, str[100];

while(flag){
flag = 0;
for(i=0;i<no_of_prod;i++){
//printf("** %s\n", prod[i]);
for(k=0;k<10;k++){
for(n=0;n<10;n++){
if(arr[i][k][n]){
c = prod[i][k];
//printf("*** %c %c\n", c, term[n]);
for(j=0;j<no_of_prod;j++){
if(prod[j][0]==c){
//printf("**** %c\n", c);

for(m=0;m<10;m++) arr1[m] = 0;

for(m=k+1;prod[i][m]!='\0';m++){
str[m-k-1] = prod[i][m];
}
str[m-k-1] = term[n];
str[m-k] = '\0';
//printf("***** %s\n", str);

Frst(str, arr1);

for(m=0;m<term_len;m++){
if(arr1[m] && !arr[j][3][m] ){
flag = 1;
arr[j][3][m] = 1;
//printf("****** %s %c\n", prod[j], term[m]);
}
}
}
}
}
}

}
}
}
}

int Goto(int in, int sym, int nt){

// nt = 0, terminal
// nt = 1, non terminal
int j, ans=0, arr[100][10][10]={0}, flg=0, k, l, len, i, m;
char c;
for(i=0;i<no_of_prod;i++){
for(j=0;j<10;j++){
for(k=0;k<term_len;k++){
if(canonical[in][i][j][k]){
if(nt){
if(prod[i][j]==non_term[sym]){
arr[i][j+1][k] = 1;
flg = 1;
}
}
else if(prod[i][j]==term[sym]){
arr[i][j+1][k] = 1;
flg = 1;
}
}
}
}
}

if(!flg) return 0;

closure(arr);

for(j=0;j<can_len;j++){
flg = 0;

for(k=0;k<no_of_prod;k++){
for(l=0;l<10;l++){
for(i=0;i<term_len;i++){
if(canonical[j][k][l][i]!=arr[k][l][i]){
flg = 1;
break;
}
}
if(flg) break;
}
if(flg) break;
}
if(!flg){
ans = j;
break;
}
}

if(flg){
for(j=0;j<no_of_prod;j++){
for(l=0;l<10;l++){
for(i=0;i<term_len;i++)
canonical[can_len][j][l][i] = arr[j][l][i];
}
}
ans = can_len;
can_len++;
}

if(nt) go_to[in][sym] = ans;

else{
if(action[in][sym][0]=='\0')
sprintf(action[in][sym], "s%d", ans);
else{
sscanf(action[in][sym], "s%d", &k);
if(k!=ans){
clr = 0;
len = strlen(action[in][sym]);
sprintf(action[in][sym]+len, ",s%d", ans);
}
}
}

if(flg){
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=0;j<10;j++){
for(l=0;l<term_len;l++){
if(arr[i][j][l] && j==len){
if(i==no_of_prod-1) sprintf(action[can_len-1][term_len-1], "acc");
else{
if(action[can_len-1][l][0]=='\0')
sprintf(action[can_len-1][l], "r%d", i+1);
else{
sscanf(action[can_len-1][l], "r%d", &k);
if(k!=i+1){
clr = 0;
len = strlen(action[can_len-1][l]);
sprintf(action[can_len-1][l]+len, "r%d", i+1);

}
}
}
}
}
}
}
}
return flg;
}

int main(){
int i, j, len, k, a, b, t, flag;
char c, temp[100];

clr = 1;
can_len = 0;
term_len = non_term_len = 0;
cout<<"Enter the number of productions: ";
cin>>no_of_prod;
cout<<"Enter production rules: ";
for(i=0;i<no_of_prod;i++){
cin>>prod[i];
len = strlen(prod[i]);
for(j=0;j<len;j++){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
for(k=0;k<non_term_len;k++){
if(non_term[k]==prod[i][j]) break;
}
if(k==non_term_len){

non_term[non_term_len] = prod[i][j];
non_term_len++;
}
}
else if(prod[i][j]!='-' && prod[i][j]!='>'
&& prod[i][j]!='|'){
for(k=0;k<term_len;k++){
if(term[k]==prod[i][j]) break;
}
if(k==term_len){
term[term_len] = prod[i][j];
term_len++;
}
}
}
}
cout<<"Non terminals are: ";
cout<<non_term[0];
for(i=1;i<non_term_len;i++){
cout<<","<<non_term[i];
}
cout<<"\nTerminals are: ";
cout<<term[0];
for(i=1;i<term_len;i++){
cout<<","<<term[i];
}
cout<<"\nFirst\n";
for(i=0;i<non_term_len;i++){
calc_first(i);
cout<<non_term[i]<<" = { ";

for(j=0;j<term_len;j++){
if(first[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(first[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";
}
term[term_len] = '$';
term_len++;
cout<<"\nFollow\n";
follow[0][term_len-1] = 1;
for(i=0;i<non_term_len;i++){
calc_follow(i);
cout<<non_term[i]<<" = { ";
for(j=0;j<term_len;j++){
if(follow[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(follow[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";

}
for(i=0;i<100;i++){
for(j=0;j<100;j++){
sprintf(action[i][j], "\0");
go_to[i][j] = -1;
}
}
sprintf(prod[no_of_prod], "X->%c\0", prod[0][0]);
no_of_prod++;
puts("\nCanonical LR(1) collection of sets for grammar");
canonical[0][no_of_prod-1][3][term_len-1] = 1;
closure(canonical[0]);
can_len++;
puts("");
print(0);
flag = 1;
while(flag){
flag = 0;
for(i=0;i<can_len;i++){
for(j=0;j<non_term_len;j++){
if(Goto(i, j, 1)){
cout<<"\nGOTO(I"<<i<<", "<<non_term[j]<<") ";
print(can_len-1);
flag = 1;
}
}
for(j=0;j<term_len-1;j++){
if(Goto(i, j, 0)){
cout<<"\nGOTO(I"<<i<<", "<<term[j]<<") ";
print(can_len-1);

flag = 1;
}
}
}
}
if(clr){
puts("\nThe Grammar is CLR(1)");
cout<<"\nCLR Parsing Table\n\n";
cout<<"State\t|\t\tAction";
for(i=0;i<term_len-2;i++) printf("\t");
cout<<"|\tGoTo\n";
cout<<"________________________________________________________________\n\t|";
for(i=0;i<term_len;i++){
cout<<term[i]<<"\t" ;}
printf("|");
for(i=0;i<non_term_len;i++){
cout<<non_term[i]<<"\t";}
puts("\n________________________________________________________________");
for(i=0;i<can_len;i++){
cout<<i<<"\t|";
for(j=0;j<term_len;j++){
cout<<action[i][j]<<"\t";}
printf("|");
for(j=0;j<non_term_len;j++){
if(go_to[i][j]!=-1) cout<<go_to[i][j];
cout<<"\t";}
puts("");}}
else{
puts("\nThe Grammar is not CLR(1)");}
return 0;}
8.8 Output

Figure 8.3: Output


8.9 Frequently Asked Questions

Ques-1: What is the full form of CLR?

Answer: Canonical LR (canonical left-to-right scan, rightmost derivation in reverse).

Ques-2: What does R in CLR parser stand for?

Answer: It stands for rightmost derivation in reverse order.

Ques-3: What does si in the CLR table represent?

Answer: It represents a shift action followed by a move to state i.

Ques-4: What does ri in the CLR table represent?

Answer: It represents a reduce action using production rule no. i.

Ques-5: What will be the output of FIRST for {S -> ab}?

Answer: {a}.
Practical-9

9.1 Aim: Implementation of LALR parser.

9.2 Input: Input a string to be parsed.

9.3 Expected Output: If the entered string satisfies the provided grammar rules,
then a message is displayed “accept the input” otherwise “error in input” or “do not accept
the input”.

9.4 Algorithm
Input

• Enter the no. of production


• Enter the production

Output

• Non-Terminal Symbols
• First of non-terminal
• Follow of non-terminal
• Canonical LR(1) collection for grammar
• Grammar is LALR(1) or not
• Items having same core
• LALR Parsing table
• Message of acceptance displayed

Begin

1. Take the grammar rules of the language as input.

2. Build the augmented grammar.
3. Construct the collection of LR(1) item sets {I0, I1, ..., In} for the given grammar.
4. If different item sets contain the same rules (the same core) but different lookahead
symbols, combine them into a single item set and take the union of the lookahead symbols.
5. Repeat this to create the set of all LALR(1) item sets.
6. Build the parsing table.
7. If there are no parsing conflicts, then the given grammar is said to be an LALR(1)
grammar.

End
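
Step 4 of the algorithm, finding the item sets that share a core, can be sketched as below. This is only an illustrative sketch and not the method of the program in 9.6, which compares the canonical arrays directly. Items are assumed to be stored as a dotted production string plus one lookahead character, and the names Lr1Item, Lr1State, coreOf and groupByCore are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An LR(1) item as (dotted production, lookahead), e.g. ("C->c.C", '$').
using Lr1Item = std::pair<std::string, char>;
using Lr1State = std::set<Lr1Item>;

// The core of a state is its set of items with the lookaheads dropped.
std::set<std::string> coreOf(const Lr1State& s) {
    std::set<std::string> core;
    for (const Lr1Item& it : s) core.insert(it.first);
    return core;
}

// Group the indices of the states that have identical cores. Each group
// becomes one LALR(1) state whose lookaheads are the union over the group.
std::vector<std::vector<int>> groupByCore(const std::vector<Lr1State>& states) {
    std::map<std::set<std::string>, std::vector<int>> byCore;
    for (int i = 0; i < static_cast<int>(states.size()); ++i)
        byCore[coreOf(states[i])].push_back(i);
    std::vector<std::vector<int>> groups;
    for (const auto& entry : byCore) groups.push_back(entry.second);
    return groups;
}
```

A group with more than one index corresponds to a line printed under "Items having same core" by the program. Groups are returned in the order of their cores, not in the order of the state numbers.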

9.5 Flowchart

Figure 9.1: Flowchart


9.6 Source Code

#include<iostream>
#include <string.h>
#include <stdio.h>
using namespace std;
char prod[100][100], term[100], non_term[100], action[100][100][100];
int term_len, non_term_len, first[100][100], follow[100][100],
hash1[100], no_of_prod, hash2[100], canonical[100][20][10][10], can_len,
go_to[100][100], eq_items[100], lalr;
void calc_first(int k){
int j, flag=0, len, l, m, i;
if(hash1[k]) return;
hash1[k] = 1;
for(i=0;i<no_of_prod;i++){
if(prod[i][0]==non_term[k]){
flag = 0;
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(!flag){
if(prod[i][j]>='A'&&prod[i][j]<='Z'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
flag = 1;
if(hash1[l]){
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
else{
calc_first(l);
for(m=0;m<term_len;m++){
if(first[l][m]){
first[k][m] = 1;
if(term[m]=='^') flag=0;
}
}
}
}
else if(prod[i][j]!='|'){
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
first[k][l] = 1;
flag = 1;
}
}
else{
if(prod[i][j]=='|') flag=0;
}
}
}
}
}

void calc_follow(int k){


int i, len, j, l, flag=0, m;
if(hash2[k]) return;
hash2[k] = 1;
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=3;j<len;j++){
if(flag){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
flag = 0;
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][j]) break;
}
if(l<non_term_len){
for(m=0;m<term_len;m++){
if(first[l][m]){
follow[k][m] = 1;
if(term[m]=='^') flag=1;
}
}
}
}
else{
for(l=0;l<term_len;l++){
if(term[l]==prod[i][j]) break;
}
if(l<term_len){
follow[k][l] = 1;
}
else{
if(prod[i][j]=='|'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){
if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}
flag=0;
}
}
if(prod[i][j]==non_term[k]){
flag = 1;
}
}
if(flag){
for(l=0;l<non_term_len;l++){
if(non_term[l]==prod[i][0]) break;
}
if(l<non_term_len){
calc_follow(l);
for(m=0;m<term_len;m++){
if(follow[l][m]) follow[k][m] = 1;
}
}
flag = 0;
}
}

}
void print(int in){
int i, j, len, k, l, prnt;
cout<<"I"<<in<<":\n";
for(i=0;i<10;i++){
prnt = 0;
for(k=0;k<term_len;k++){
if(canonical[in][no_of_prod-1][i][k]){
if(!prnt){
len = strlen(prod[no_of_prod-1]);
for(j=0;j<=len;j++){
if(j==i) printf(".");
cout<<prod[no_of_prod-1][j];
}
cout<<","<<term[k];
prnt = 1;
}
else cout<<"|"<<term[k];
}
}
if(prnt) puts("");
}
for(l=0;l<no_of_prod-1;l++){
for(i=0;i<10;i++){
prnt = 0;
for(k=0;k<term_len;k++){
if(canonical[in][l][i][k]){
if(!prnt){
len = strlen(prod[l]);
for(j=0;j<=len;j++){

if(j==i) cout<<".";
cout<<prod[l][j];
}
cout<<","<<term[k];
prnt = 1;
}
else cout<<"|"<<term[k];
}
}
if(prnt) puts("");
}
}
}
void Frst(char * str, int * arr){
int j, flag=0, len, l, m, i, k;
len = strlen(str);
for(j=0;j<len;j++){
if(!flag){
if(str[j]>='A'&& str[j]<='Z'){
for(l=0;l<non_term_len;l++){
if(non_term[l]==str[j]) break;
}
flag = 1;
for(m=0;m<term_len;m++){
if(first[l][m]){
arr[m] = 1;
if(term[m]=='^') flag=0;
}
}
}

else if(str[j]!='|'){
for(l=0;l<term_len;l++){
if(term[l]==str[j]) break;
}
arr[l] = 1;
flag = 1;
}
}
else break;
}
}

void closure(int arr[][10][10]){


int i, j, flag=1, k, l, m, arr1[10], n;
char c, str[100];
while(flag){
flag = 0;
for(i=0;i<no_of_prod;i++){
//printf("** %s\n", prod[i]);
for(k=0;k<10;k++){
for(n=0;n<10;n++){
if(arr[i][k][n]){
c = prod[i][k];
//printf("*** %c %c\n", c, term[n]);
for(j=0;j<no_of_prod;j++){
if(prod[j][0]==c){
//printf("**** %c\n", c);
for(m=0;m<10;m++) arr1[m] = 0;
for(m=k+1;prod[i][m]!='\0';m++){
str[m-k-1] = prod[i][m];

}
str[m-k-1] = term[n];
str[m-k] = '\0';
//printf("***** %s\n", str);
Frst(str, arr1);

for(m=0;m<term_len;m++){
if(arr1[m] && !arr[j][3][m] ){
flag = 1;
arr[j][3][m] = 1;
//printf("****** %s %c\n", prod[j], term[m]);
}
}
}
}
}
}

}
}
}
}

int Goto(int in, int sym, int nt){


// nt = 0, terminal
// nt = 1, non terminal
int j, ans=0, arr[100][10][10]={0}, flg=0, k, l, len, i, m;
char c;

for(i=0;i<no_of_prod;i++){
for(j=0;j<10;j++){
for(k=0;k<term_len;k++){
if(canonical[in][i][j][k]){
if(nt){
if(prod[i][j]==non_term[sym]){
arr[i][j+1][k] = 1;
flg = 1;
}
}
else if(prod[i][j]==term[sym]){
arr[i][j+1][k] = 1;
flg = 1;
}
}
}
}
}

if(!flg) return 0;

closure(arr);

for(j=0;j<can_len;j++){
flg = 0;
for(k=0;k<no_of_prod;k++){
for(l=0;l<10;l++){
for(i=0;i<term_len;i++){
if(canonical[j][k][l][i]!=arr[k][l][i]){
flg = 1;

break;
}
}
if(flg) break;
}
if(flg) break;
}
if(!flg){
ans = j;
break;
}
}

if(flg){
for(j=0;j<no_of_prod;j++){
for(l=0;l<10;l++){
for(i=0;i<term_len;i++)
canonical[can_len][j][l][i] = arr[j][l][i];
}
}
ans = can_len;
can_len++;
}

if(nt) go_to[in][sym] = ans;


else{
if(action[in][sym][0]=='\0')
sprintf(action[in][sym], "s%d", ans);
else{
sscanf(action[in][sym], "s%d", &k);
if(k!=ans){
lalr = 0;
len = strlen(action[in][sym]);
sprintf(action[in][sym]+len, ",s%d", ans);
}
}
}

if(flg){
for(i=0;i<no_of_prod;i++){
len = strlen(prod[i]);
for(j=0;j<10;j++){
for(l=0;l<term_len;l++){
if(arr[i][j][l] && j==len){
if(i==no_of_prod-1) sprintf(action[can_len-1][term_len-1], "acc");
else{
if(action[can_len-1][l][0]=='\0')
sprintf(action[can_len-1][l], "r%d", i+1);
else{
sscanf(action[can_len-1][l], "r%d", &k);
if(k!=i+1){
lalr = 0;
len = strlen(action[can_len-1][l]);
sprintf(action[can_len-1][l]+len, "r%d", i+1);

}
}
}
}
}

}
}
}
return flg;
}

void print_(int in){


int i;
in = eq_items[in];
for(i=0;i<can_len;i++){
if(eq_items[i]==in) printf("%d", i);
}
}

int main(){
int i, j, len, k, a, b, t, flag, l, m, flg1, flg2;
char c, temp[100];

lalr = 1;
can_len = 0;
term_len = non_term_len = 0;
cout<<"Enter the number of productions: ";
cin>>no_of_prod;
cout<<"Enter production rules: ";
for(i=0;i<no_of_prod;i++){
cin>>prod[i];
len = strlen(prod[i]);
for(j=0;j<len;j++){
if(prod[i][j]>='A' && prod[i][j]<='Z'){
for(k=0;k<non_term_len;k++){
if(non_term[k]==prod[i][j]) break;
}
if(k==non_term_len){
non_term[non_term_len] = prod[i][j];
non_term_len++;
}
}
else if(prod[i][j]!='-' && prod[i][j]!='>'
&& prod[i][j]!='|'){
for(k=0;k<term_len;k++){
if(term[k]==prod[i][j]) break;
}
if(k==term_len){
term[term_len] = prod[i][j];
term_len++;
}
}
}
}

cout<<"\nNon terminals are: ";


cout<<non_term[0];
for(i=1;i<non_term_len;i++){
cout<<","<<non_term[i];
}

cout<<"\nTerminals are: ";
cout<<term[0];
for(i=1;i<term_len;i++){
cout<<","<<term[i];

}
cout<<"\nFirst\n";
for(i=0;i<non_term_len;i++){
calc_first(i);
cout<<non_term[i]<<" = { ";
for(j=0;j<term_len;j++){
if(first[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(first[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";
}

term[term_len] = '$';
term_len++;

cout<<"\nFollow\n";
follow[0][term_len-1] = 1;
for(i=0;i<non_term_len;i++){
calc_follow(i);
cout<<non_term[i]<<" = { ";
for(j=0;j<term_len;j++){
if(follow[i][j]) break;
}
if(j<term_len){
cout<<term[j];
j++;
for(;j<term_len;j++){
if(follow[i][j]) cout<<" ,"<<term[j];
}
}
cout<<" }\n";
}

for(i=0;i<100;i++){
for(j=0;j<100;j++){
sprintf(action[i][j], "\0");
go_to[i][j] = -1;
}
}

sprintf(prod[no_of_prod], "X->%c", prod[0][0]); // Added new initial symbol...
no_of_prod++;

puts("\nCanonical LR(1) collection of sets for grammar");

canonical[0][no_of_prod-1][3][term_len-1] = 1;
closure(canonical[0]);
can_len++;
puts("");
print(0);

flag = 1;
while(flag)
{
flag = 0;
for(i=0;i<can_len;i++)
{
for(j=0;j<non_term_len;j++)
{
if(Goto(i, j, 1)){
cout<<"\nGOTO(I"<<i<<", "<<non_term[j]<<") ";
print(can_len-1);
flag = 1;
}
}
for(j=0;j<term_len-1;j++)
{
if(Goto(i, j, 0)){
cout<<"\nGOTO(I"<<i<<", "<<term[j]<<") ";
print(can_len-1);
flag = 1;
}
}
}
}

if(lalr)
{
puts("\nThe Grammar is LALR(1)");
for(i=0;i<can_len;i++)
{
if(!eq_items[i])
{
eq_items[i] = i;
for(j=i+1;j<can_len;j++)

{
flag = 1;
for(k=0;k<no_of_prod;k++)
{
for(l=0;l<10;l++)
{
flg1 = flg2 = 0;
for(m=0;m<term_len;m++)
{
if(canonical[i][k][l][m]) flg1=1;
if(canonical[j][k][l][m]) flg2=1;
}
if(flg1!=flg2)
{
flag = 0;
break;
}
}
if(!flag) break;
}
if(flag) eq_items[j] = i;
}
}
}

cout<<"\nItems having same core are:\n";


for(i=0;i<can_len;i++)
{
if(eq_items[i]==i)
{
flag = 0;
for(j=i+1;j<can_len;j++)
{
if(eq_items[j]==i)
{
if(!flag)
{
cout<<i<<" , "<<j;
flag = 1;
}
else cout<<","<<j;
}
}
if(flag) puts("");
}
}

cout<<"\nLALR Parsing Table\n\n";


cout<<"State\t|\t\tAction";
for(i=0;i<term_len-2;i++) cout<<"\t";
cout<<"|\tGoTo\n";
cout<<"________________________________________________________________\n\t|";
for(i=0;i<term_len;i++){
cout<<term[i]<<"\t";
}
cout<<"|";
for(i=0;i<non_term_len;i++)
{
cout<<non_term[i]<<"\t";
}
puts("\n________________________________________________________________");
for(i=0;i<can_len;i++)
{
if(eq_items[i]==i)
{
print_(i);
cout<<"\t|";
for(j=0;j<term_len;j++)
{
if(action[i][j][0]=='s')
{
sscanf(action[i][j], "s%d", &k);
cout<<"s";
print_(k);
cout<<"\t";
}
else
if(action[i][j][0]=='r'||action[i][j][0]=='\0')
{
flag = 0;
k = eq_items[i];
for(l=0;l<can_len;l++)
{
if(eq_items[l]==k && action[l][j][0]=='r')
{
if(!flag)
{
cout<<action[l][j];
flag = 1;
}
else cout<<action[l][j]+1;
}
}
cout<<"\t";
}
else cout<<action[i][j]<<"\t";
}
printf("|");
for(j=0;j<non_term_len;j++)
{
if(go_to[i][j]!=-1) print_(go_to[i][j]);
cout<<"\t";
}
puts("");
}
}
}
else
{

puts("\nThe Grammar is not LALR(1)");


}

return 0;
}
9.7 Output

Figure 9.3: Output


9.8 Frequently Asked Questions

Ques-1: What is an LALR parser?

Answer: An LALR (look-ahead LR) parser is a simplified version of a canonical LR parser. It
parses a text according to a set of production rules specified by a formal grammar
for a computer language.

Ques-2: What are the two functions used in LALR ?

Answer: The two functions used in LALR are: CLOSURE() and GOTO()

Ques-3: What does si in the CLR table represent?

Answer: It represents a shift action: the parser shifts the current input symbol and goes to
state no. i.

Ques-4: What does ri in the CLR table represent?

Answer: It represents a reduce action using production rule no. i.

Ques-5: A LALR parser is intermediate in power between SLR and canonical LR parser.
State true or false.

Answer: True.
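
The si and ri entries discussed in Ques-3 and Ques-4 can be decoded mechanically. The following is a minimal sketch, not part of the program above: the names Action, decodeAction and encodeAction are hypothetical, and the spelling "acc" for accept and a blank cell for error are assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Kind of entry found in the ACTION part of an LR parsing table.
enum class ActionKind { Shift, Reduce, Accept, Error };

// number = target state for Shift, production number for Reduce, -1 otherwise.
struct Action
{
    ActionKind kind;
    int number;
};

// Decode a table cell such as "s5" (shift, go to state 5) or "r2"
// (reduce by production 2). "acc" and the blank cell are assumed spellings.
Action decodeAction(const std::string& cell)
{
    if (cell == "acc") return {ActionKind::Accept, -1};
    if (cell.size() < 2 || (cell[0] != 's' && cell[0] != 'r'))
        return {ActionKind::Error, -1};
    int n = 0;
    for (std::size_t i = 1; i < cell.size(); ++i)
    {
        if (!std::isdigit(static_cast<unsigned char>(cell[i])))
            return {ActionKind::Error, -1};
        n = n * 10 + (cell[i] - '0');
    }
    return {cell[0] == 's' ? ActionKind::Shift : ActionKind::Reduce, n};
}

// Render a decoded action back into its table spelling.
std::string encodeAction(const Action& a)
{
    switch (a.kind)
    {
    case ActionKind::Shift:
        assert(a.number >= 0);
        return "s" + std::to_string(a.number);
    case ActionKind::Reduce:
        assert(a.number >= 0);
        return "r" + std::to_string(a.number);
    case ActionKind::Accept:
        return "acc";
    default:
        return "";
    }
}
```

For example, decodeAction("s5") yields a Shift action with number 5, meaning "shift and go to state 5", while decodeAction("r2") yields a Reduce action by production 2.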
Practical -10

10.1 Aim: Implementation of Shift-Reduce parser.


10.2 Introduction:
A shift-reduce parser constructs the parse tree bottom-up, i.e. from the leaves (bottom) to the
root (up). A more general form of shift-reduce parser is the LR parser. The parser requires two
data structures:
• An input buffer for storing the input string.
• A stack for storing the grammar symbols shifted or reduced so far; the production rules are
matched against its top.
Basic Operations –
Shift: The next symbol is moved from the input buffer onto the stack.
Reduce: If a handle appears on top of the stack, it is reduced using the appropriate
production rule, i.e. the RHS of the production rule is popped off the stack and the LHS of
the production rule is pushed onto the stack.
Accept: If only the start symbol is present on the stack and the input buffer is empty, the
parsing action is called accept. An accept action means that the input has been parsed
successfully.
Error: This is the situation in which the parser can perform neither a shift, a reduce,
nor an accept action.
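
The choice between the four operations can be sketched as a single decision function. This is only an illustration under stated assumptions: Rule, Step, findHandle and nextStep are hypothetical names, both the stack and the input use '$' as end marker, the start symbol is the LHS of the first rule, and the parser naively reduces whenever some RHS lies on top of the stack (no lookahead or precedence).

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Rule
{
    std::string lhs;
    std::string rhs;
};

enum class Step { Shift, Reduce, Accept, Error };

// Index of the first rule whose RHS is a suffix of the stack
// (a candidate handle), or -1 if there is none.
int findHandle(const std::string& stack, const std::vector<Rule>& rules)
{
    for (std::size_t i = 0; i < rules.size(); ++i)
    {
        const std::string& rhs = rules[i].rhs;
        if (!rhs.empty() && stack.size() >= rhs.size() &&
            stack.compare(stack.size() - rhs.size(), rhs.size(), rhs) == 0)
            return static_cast<int>(i);
    }
    return -1;
}

// Decide the next operation for the given configuration.
Step nextStep(const std::string& stack, const std::string& input,
              const std::vector<Rule>& rules)
{
    assert(!rules.empty() && !stack.empty() && stack[0] == '$');
    if (findHandle(stack, rules) != -1) return Step::Reduce; // handle on top
    if (!input.empty() && input != "$") return Step::Shift;  // input remains
    if (stack == "$" + rules[0].lhs) return Step::Accept;    // only start symbol
    return Step::Error;                                      // nothing applies
}
```

With the rules E->E+E and E->i, the configuration (stack "$i", input "+i$") gives Reduce, ("$", "i+i$") gives Shift, and ("$E", "$") gives Accept.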
INPUT:
A context-free grammar (its production rules) and a string are provided as the input to the
Shift-Reduce Parser.
OUTPUT:
The output of the parser is a table showing the stack, the remaining input and the action
performed at each step, followed by the verdict on whether the string is accepted or not.

10.3 Algorithm:
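• Step 1: START
• Step 2: Enter the non-terminals (variables), separated by spaces
• Step 3: Enter the terminals, separated by spaces
• Step 4: Enter the production rules, separated by spaces, and store them in a list
• Step 5: Implement a stack; initially the stack is empty
• Step 6: Initially, perform a shift operation by shifting the first element from the input onto the stack
• Step 7: If the elements on top of the stack match the RHS of a production rule
Perform a reduce operation and repeat step 7
• Step 8: Else
Shift the next input symbol (if any) onto the stack
• Step 9: Check whether the whole input string has been processed and no further reduction is possible
• Step 10: If yes, goto step 12
• Step 11: Else, goto step 7
• Step 12: If the stack contains only the start symbol
Print "String Parsed successfully"
• Step 13: Else
Print "String is Not Parsed"
• Step 14: STOP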
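
The loop described by the steps above can be sketched as a non-interactive function. This is a simplified illustration, not the program of section 10.5: Production and shiftReduceAccepts are hypothetical names, every symbol is assumed to be a single character, the start symbol is assumed to be the LHS of the first production, and the grammar is assumed to have no cyclic or empty rules (otherwise the reduce loop would not terminate).

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Production
{
    std::string lhs;
    std::string rhs;
};

// Returns true when `input` can be reduced to the start symbol by
// reducing whenever possible and shifting otherwise.
bool shiftReduceAccepts(const std::vector<Production>& rules,
                        const std::string& input)
{
    assert(!rules.empty());
    std::string stack;    // the stack starts empty
    std::size_t next = 0; // index of the next unread input symbol
    while (true)
    {
        // Reduce: look for a rule whose RHS lies on top of the stack.
        bool reduced = false;
        for (const Production& p : rules)
        {
            if (!p.rhs.empty() && stack.size() >= p.rhs.size() &&
                stack.compare(stack.size() - p.rhs.size(),
                              p.rhs.size(), p.rhs) == 0)
            {
                stack.erase(stack.size() - p.rhs.size()); // pop the RHS
                stack += p.lhs;                           // push the LHS
                reduced = true;
                break;
            }
        }
        if (reduced) continue; // try to reduce again

        // Shift: move the next input symbol onto the stack.
        if (next < input.size())
        {
            stack += input[next++];
            continue;
        }

        // Input exhausted and nothing to reduce: accept only if the
        // stack holds exactly the start symbol.
        return stack == rules[0].lhs;
    }
}
```

With the productions E->E+E and E->i, the call shiftReduceAccepts(rules, "i+i") returns true, while "i+" is rejected because the stack ends as "E+".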

10.4 Flowchart:
10.5 Source Code:

#include <iostream>
#include <string>
#include <vector>
using namespace std;

// A single production rule lhs -> rhs.
class production
{
public:
    string lhs;
    string rhs;
    production(string lhs, string rhs)
    {
        this->lhs = lhs;
        this->rhs = rhs;
    }
};

// The grammar is a list of productions; rules[0].lhs is the start symbol.
class grammar
{
public:
    vector<production> rules;
};

grammar g;

// Read the productions from standard input.
void getGrammar()
{
    cout << "Enter the number of productions: ";
    int n;
    cin >> n;
    for (int i = 0; i < n; i++)
    {
        cout << "Enter the lhs of the rule: ";
        string lhs;
        cin >> lhs;
        cout << "Enter the rhs of the rule: ";
        string rhs;
        cin >> rhs;
        // Store the rule by value; no heap allocation is needed.
        g.rules.push_back(production(lhs, rhs));
    }
}

// Parse the input and print the stack, input buffer and action at each step.
void parse(string input)
{
    string stack = "$";
    input += "$";
    bool errorFlag = false;
    cout << "Stack" << "\t" << "Input Buffer" << "\tAction" << endl;
    while (true)
    {
        bool flag = true; // stays true if no reduction happens in this pass
        for (unsigned int i = 0; i != g.rules.size(); ++i)
        {
            for (unsigned int j = 0; j < stack.length(); ++j)
            {
                // Compare the top j + 1 stack symbols with the rhs of rule i.
                if (stack.substr(stack.length() - (j + 1)) == g.rules[i].rhs)
                {
                    cout << stack << "\t" << input << "\t"
                         << "Reduced by " << g.rules[i].lhs << "->" << g.rules[i].rhs << endl;
                    // Pop the handle and push the lhs in its place.
                    stack.erase(stack.length() - (j + 1));
                    stack += g.rules[i].lhs;
                    flag = false;
                    break;
                }
            }
        }
        if (flag && errorFlag)
        {
            // Neither shift, reduce nor accept is possible.
            cout << stack << "\t" << input << "\tError" << endl;
            break;
        }
        if (flag)
        {
            if (input != "$")
            {
                // Shift the next input symbol onto the stack.
                cout << stack << "\t" << input << "\tShift" << endl;
                stack += input[0];
                input.erase(input.begin(), input.begin() + 1);
            }
            else if (stack == "$" + g.rules[0].lhs)
            {
                // Only the start symbol remains and the input is exhausted.
                cout << stack << "\t" << input << "\tAccept" << endl;
                break;
            }
            else
            {
                errorFlag = true;
            }
        }
    }
}

int main()
{
    getGrammar();
    cout << "Enter a string to parse: ";
    string input;
    cin >> input;
    parse(input);
    return 0;
}

10.6 OUTPUT
