Compiler Design Book Final

COMPILER DESIGN PRACTICAL PROGRAMMING CONCEPTS
VI EDITOR
The default editor that comes with the UNIX operating system is called vi (visual editor). The UNIX
vi editor is a full screen editor and has two modes of operation:
1. Command mode, in which the characters typed are commands that act on the file.
2. Insert mode, in which the text typed is inserted into the file.
In the command mode, every character typed is a command that does something to the text file being
edited; a character typed in the command mode may even cause the vi editor to enter the insert mode.
In the insert mode, every character typed is added to the text in the file; pressing the <Esc> (Escape)
key turns off the Insert mode.
vi Editor Commands
vi filename : It is used to create a new file or edit an existing one.
Esc :wq Enter : It is used to save changes and exit vi.
Esc :q! Enter : It is used to exit vi without saving.
Esc dd : It is used to delete a complete line.
Esc x : It is used to delete a single character.
Esc : It is used to return to vi command mode.
i : It is used to enter vi insert mode.
Esc Ctrl+f : It is used to display the next page.
LEX PRACTICE
Metacharacter Matches
. any character except newline
\n newline
* zero or more copies of the preceding expression
+ one or more copies of the preceding expression
? zero or one copy of the preceding expression
^ beginning of line
$ end of line
a|b a or b
(ab)+ one or more copies of ab (grouping)
"a+b" literal "a+b" (C escapes still work)
[] character class
Table 1: Pattern Matching Primitives
Expression Matches
abc abc
abc* ab, abc, abcc, abccc, ...
abc+ abc, abcc, abccc, abcccc, ...
a(bc)+ abc, abcbc, abcbcbc, ...
a(bc)? a, abc
[abc] one of: a, b, c
[a-z] any letter, a through z
[a\-z] one of: a, -, z
[-az] one of: - a z
[A-Za-z0-9]+ one or more alphanumeric characters
[ \t\n]+ whitespace
[^ab] anything except: a, b
[a^b] a, ^, b
[a|b] a, |, b
a|b a, b
Table 2: Pattern Matching Examples
Name Function
int yylex(void) call to invoke lexer, returns token
char *yytext pointer to matched string
yyleng length of matched string
yylval value associated with token
int yywrap(void) wrapup, return 1 if done, 0 if not done
FILE *yyout output file
FILE *yyin input file
INITIAL initial start condition
BEGIN condition switch start condition
ECHO write matched string
Table 3: Lex Predefined Variables

PROGRAM FOR RECOGNIZING OPERATORS IN A GIVEN INPUT FILE
ALGORITHM
STEP1: Start
STEP2: Declare the header files stdio.h and string.h
STEP3: Start the main section
STEP4: Create character variables c, d to get the characters from Input file.
STEP5: Create two pointer variables fs, fp of ‘FILE’ type to Access the file.
STEP6: Open the files Operators.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP7: Start the while loop with condition !feof(fs).
STEP8: Get the character from Operators.txt and assign it to char ‘c’.
STEP9: Start another while loop with condition!feof(fp).
STEP10: Similarly get the character from ip.c and assign it to char‘d’.
STEP11: If c = = ‘ ‘ || c = = ‘ \ n’ || c = = EOF then break the program
STEP12: If c = = d, then print the character from the input file as output.
STEP13: Rewind fp.
STEP14: Close the opened files and while loops.
STEP15: Stop.
Text File:
+.=-/*%
Input File:
#include<stdio.h>
main()
{
int a=10,b=2,c;
c=a+b;
printf(“%d”,c);
}
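Program Code:
#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;   /* int rather than char, so that EOF can be detected */
    FILE *fs,*fp;
    fs=fopen("operators.txt","r");
    fp=fopen("ip.c","r");
    if(fs==NULL||fp==NULL)
    {
        printf("\n CANNOT OPEN INPUT FILES");
        return 1;
    }
    while(!feof(fs))
    {
        c=fgetc(fs);               /* next candidate operator */
        while(!feof(fp))
        {
            d=fgetc(fp);           /* next character of the input program */
            if(c==' '||c=='\n'||c==EOF)
                break;             /* separators are not operators */
            if(c==d)
            {
                printf("\n %c IS AN OPERATOR",c);
            }
        }
        rewind(fp);                /* scan ip.c again for the next operator */
    }
    fclose(fp);
    fclose(fs);
    return 0;
}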
Compilation:
cc operators.c
Output:
./a.out
= IS AN OPERATOR
= IS AN OPERATOR
+ IS AN OPERATOR
= IS AN OPERATOR
PROGRAM FOR RECOGNIZING SPECIAL SYMBOLS IN A GIVEN INPUT FILE
ALGORITHM:
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create character variables c, d to get the characters from input file.
STEP5: Create two pointer variables fs, fp of ‘FILE’ type to access the file.
STEP6: Open the files Specials.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP7: Start the while loop with condition !feof(fs).
STEP8: Get the character from Specials.txt and assign it to char ‘c’.
STEP9: Start another while loop with condition!feof(fp).
STEP10: Similarly get the character from ip.c and assign it to char‘d’.
STEP11: If c = = ‘ ‘ || c = = ‘ \ n’ || c = = EOF then break the program
STEP12: If c = = d, then print the character from the input file as output.
STEP13: Rewind fp.
STEP14: Close the opened files and while loops.
STEP15: Stop.
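Program Code:
#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;   /* int rather than char, so that EOF can be detected */
    FILE *fs,*fp;
    fs=fopen("specials.txt","r");
    fp=fopen("ip.c","r");
    if(fs==NULL||fp==NULL)
    {
        printf("\n CANNOT OPEN INPUT FILES");
        return 1;
    }
    while(!feof(fs))
    {
        c=fgetc(fs);               /* next candidate special symbol */
        while(!feof(fp))
        {
            d=fgetc(fp);           /* next character of the input program */
            if(c==' '||c=='\n'||c==EOF)
                break;             /* separators are not special symbols */
            if(c==d)
            {
                printf("\n %c IS A SPECIAL SYMBOL",c);
            }
        }
        rewind(fp);                /* scan ip.c again for the next symbol */
    }
    fclose(fp);
    fclose(fs);
    return 0;
}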
Compilation:
cc specials.c
Output:
./a.out
# IS A SPECIAL SYMBOL
( IS A SPECIAL SYMBOL
) IS A SPECIAL SYMBOL
" IS A SPECIAL SYMBOL
" IS A SPECIAL SYMBOL
, IS A SPECIAL SYMBOL
PROGRAM FOR RECOGNIZING CONSTANTS IN A GIVEN INPUT FILE
ALGORITHM
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create character variables c, d to get the characters from the input file and an array a to
store the constants.
STEP5: Create integer variables i, f, ass.
STEP6: Create a pointer variable fp of ‘FILE’ type to access the file.
STEP7: Open the file ip.c using fp in ‘read’ mode.
STEP8: Start the while loop with condition !feof(fp).
STEP9: Get the character from ip.c and assign it to char ‘c’.
STEP10: Get the ascii value of ‘c’ and store it in ass.
STEP11: If ass is between 48 and 57 (a digit), then store the value of c in array ‘a’ and make the
values of f and i equal to 1.
STEP12: Start the while loop with condition f==1.
STEP13: Get the character from ip.c, store it in d, and store the ascii value of d in ass.
STEP14: If ass equals 46 (‘.’) or is between 48 and 57, then store the value of d in array ‘a’,
increment i, and keep f=1.
STEP15: Else make f=0, store ‘\0’ in array ‘a’ as the ‘i’th element and then come out of the loop.
STEP16: Print the elements in the array and close the opened files and while loops.
STEP17: Stop.
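Program Code:
#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;          /* int rather than char, so that EOF can be detected */
    char a[20];
    int i,f,ass;
    FILE *fp;
    fp=fopen("ip.c","r");
    if(fp==NULL)
    {
        printf("\n CANNOT OPEN ip.c");
        return 1;
    }
    while(!feof(fp))
    {
        c=fgetc(fp);
        ass=c;        /* character code of c (its ASCII value on ASCII systems) */
        if((ass>=48) && (ass<=57))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=d;
                /* i<19 keeps the constant inside the 20-character array */
                if((i<19) && ((ass==46) || ((ass>=48) && (ass<=57))))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    break;
                }
            }
            printf("\n %s IS A CONSTANT",a);
        }
    }
    fclose(fp);
    return 0;
}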
Compilation:
cc constants.c
Output:
./a.out
10 IS A CONSTANT
2 IS A CONSTANT
PROGRAM FOR RECOGNIZING KEYWORDS AND IDENTIFIERS IN A GIVEN INPUT FILE
ALGORITHM:
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create two pointer variables fs , fp of ‘FILE’ type to access the file.
STEP5: Create two arrays a , b of ‘char’ type to store the characters and two variables c , d of
‘char’ type to get the characters from the input file
STEP6: Create integer variables ass, f , z, i.
STEP7: Open the files Keywords.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP8: Start a while loop with condition (!feof(fp)).
STEP9: Get the character from the file ip.c and store it in variable ‘c’ .Convert it into its ascii
value.
STEP10: If the ascii value of ‘c’ is ((G.T.E 65) && ( L.T.E 90) || (G.T.E 97) && (L.T.E
122)) then store the character in array ‘a’.
STEP11: The values of f , i are equal to 1 and start another while loop with condition (f = = 1).
STEP12: Get the character from the file ip.c and store it in variable ‘d’ .Convert it into its
ascii value.
STEP13: If the ascii value of ‘d’ is (((G.T.E 65 ) && ( L.T.E 90) || ((G.T.E 97 ) && (L.T.E
122)) || (E.T 95) || (E.T 46) || ((G.T.E 48 ) && ( L.T.E 57))) then Store the character in array
‘a’ and increment ‘i’ value and f = 1.
STEP14: Else f = 0 , a [ i] is the last symbol and write the function fseek(fp,-1,1).
STEP15: Start another while loop with condition (!feof(fs)) and store the string from
Keywords.txt into array ‘b’.
STEP16: Compare a , b and store the value in variable z.
STEP17: If z = 0 it is a Keyword else it is a Identifier.
STEP18: Stop.
Text File:
int void char

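Program Code
#include<stdio.h>
#include<string.h>
int main(void)
{
    FILE *fs,*fp;
    char a[20],b[20];
    int c,d;          /* int rather than char, so that EOF can be detected */
    int ass,f,z,i;
    fs=fopen("Keywords.txt","r");
    fp=fopen("ip.c","r");
    if(fs==NULL||fp==NULL)
    {
        printf("CANNOT OPEN INPUT FILES\n");
        return 1;
    }
    while(!feof(fp))
    {
        c=fgetc(fp);
        ass=c;        /* character code of c (its ASCII value on ASCII systems) */
        if(((ass>=65)&&(ass<=90))||((ass>=97)&&(ass<=122)))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=d;
                /* i<19 keeps the word inside the 20-character array */
                if((i<19)&&(((ass>=65)&&(ass<=90))||((ass>=97)&&(ass<=122))||
                   (ass==95)||(ass==46)||((ass>=48)&&(ass<=57))))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    if(d!=EOF)
                        fseek(fp,-1,SEEK_CUR);   /* put back the character that ended the word */
                    break;
                }
            }
            z=1;
            /* compare the word with every entry of the keyword list */
            while(fscanf(fs,"%19s",b)==1)
            {
                z=strcmp(a,b);
                if(z==0)
                    break;
            }
            if(z==0)
                printf("%s IS A KEYWORD\n",a);
            else
                printf("%s IS AN IDENTIFIER\n",a);
            rewind(fs);   /* the next word is checked against the whole list again */
        }
    }
    fclose(fp);
    fclose(fs);
    return 0;
}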
Compilation:
cc Keywords.c
Output:
./a.out
stdio.h IS AN IDENTIFIER
main IS AN IDENTIFIER
int IS A KEYWORD
a IS AN IDENTIFIER
b IS AN IDENTIFIER
c IS AN IDENTIFIER
a IS AN IDENTIFIER
b IS AN IDENTIFIER
c IS AN IDENTIFIER
printf IS AN IDENTIFIER
c IS AN IDENTIFIER
PROGRAM FOR RECOGNIZING HEADER FILES IN A GIVEN INPUT FILE
ALGORITHM
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create character variables c, d to get the characters from the input file and two arrays a
and b.
STEP5: Create integer variables i, f, ass, z.
STEP6: Create two pointer variables fp and fs of ‘FILE’ type to access the files.
STEP7: Open the files ip.c and headerfiles.txt using fp and fs respectively in ‘read’ mode.
STEP8: Start the while loop with condition !feof(fp).
STEP9: Get the character from ip.c and assign it to char ‘c’.
STEP10: Get the ascii value of ‘c’ and store it in ass.
STEP11: If ass is between 65 and 90 or between 97 and 122 (a letter), then store the value of
c in array ‘a’ and make f and i as 1.
STEP12: Start the while loop with condition f==1.
STEP13: Get the character from ip.c, store it in d, and store the ascii value of ‘d’ in ass.
STEP14: If ass is a letter (65 to 90 or 97 to 122) or equals 46 (‘.’), then store the value of d
in array ‘a’, increment i, and keep f=1.
STEP15: Else make f=0, store ‘\0’ in array ‘a’ as the ‘i’th element, use the fseek function to move
back one character (offset = -1) in the file ip.c, and then come out of the loop.
STEP16: Start another while loop with condition !feof(fs).
STEP17: Scan the file headerfiles.txt using fscanf() and store the result in b.
STEP18: Compare character arrays ‘a’ and ‘b’ and store the result in z.
STEP19: If z==0 then print the elements in the array ‘a’ and come out of the while loop.
STEP20: Else rewind fs.
STEP21: Close all the opened files and while loops.
STEP22: Stop.
Text File:
stdio.h string.h
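Program Code:
#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;          /* int rather than char, so that EOF can be detected */
    char a[20],b[20];
    int i,f,ass,z;
    FILE *fp,*fs;
    fp=fopen("ip.c","r");
    fs=fopen("headerfiles.txt","r");
    if(fp==NULL||fs==NULL)
    {
        printf("\n CANNOT OPEN INPUT FILES");
        return 1;
    }
    while(!feof(fp))
    {
        c=fgetc(fp);
        ass=c;        /* character code of c (its ASCII value on ASCII systems) */
        if(((ass>=65) && (ass<=90)) || ((ass>=97) && (ass<=122)))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=d;
                /* i<19 keeps the word inside the 20-character array */
                if((i<19) && (((ass>=65) && (ass<=90)) || ((ass>=97) && (ass<=122)) || (ass==46)))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    if(d!=EOF)
                        fseek(fp,-1,SEEK_CUR);   /* put back the character that ended the word */
                    break;
                }
            }
            /* compare the word with every entry of the header file list */
            while(fscanf(fs,"%19s",b)==1)
            {
                z=strcmp(a,b);
                if(z==0)
                {
                    printf("\n %s IS A HEADER FILE",a);
                    break;
                }
            }
            rewind(fs);   /* the next word is checked against the whole list again */
        }
    }
    fclose(fp);
    fclose(fs);
    return 0;
}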
Compilation:
cc headerfiles.c
Output:
./a.out
stdio.h IS A HEADER FILE
PROGRAM FOR LEXICAL ANALYSIS WITH SYMBOL TABLE
ALGORITHM:
STEP1: Start
STEP2: Declare the header files stdio.h, string.h and ctype.h
STEP3: Declare a structure named lextable where identifier, arithmetic, relop, value are the character
variables declared inside the structure.
STEP4: Close the structure and start the main section.
STEP5: Declare integer variables i, n and character array ch[50]
STEP6: Declare the array It[80] of type struct
STEP7: Enter your expression and store it in array ch.
STEP8: Find the length of expression using strlen(ch) and assign it to n
STEP9: Start the for loop with condition for ( i=0; i<n; i++)
STEP10: Start the if condition with isalpha(ch[i])
STEP11: The value of It[i].identifier is the value of ch[i] and the remaining values are equal to blank
spaces.
STEP12: Start another if condition, if the character is a digit, then store ch[i] in It[i].value and
remaining are equal to blank spaces
STEP13: Start another if condition, where if the ch[i] is equal to any of arithmetic operation + || -|| * ||
/ || %. Store it in It[i].arithmetic and remaining are blank spaces.
STEP14: Similarly start another if condition where, if the value of ch[i] is equal to any of relational
operation = || < || > || ? store it in It[i].relop and remaining are blank spaces.
STEP15: Print the contents of the lexeme table
STEP16: Stop
Program Code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
struct lextable
{
    char identifier;
    char arithmetic;
    char relop;
    char value;
};
int main(void)
{
    int i, n;
    char ch[50];
    struct lextable It[80];
    printf("\n LEXEME");
    printf("\n ENTER YOUR EXPRESSION");
    scanf("%49s",ch);
    n=strlen(ch);
    printf("\n THE EXPRESSION: \t");
    printf("%s",ch);
    for(i = 0; i < n; i++)
    {
        /* start with every column blank, then fill in the one that applies */
        It[i].identifier = ' ';
        It[i].arithmetic = ' ';
        It[i].relop = ' ';
        It[i].value = ' ';
        if(isalpha((unsigned char)ch[i]))
        {
            It[i].identifier = ch[i];
        }
        else if(isdigit((unsigned char)ch[i]))
        {
            It[i].value = ch[i];
        }
        else if(ch[i] == '+' || ch[i] == '-' || ch[i] == '*' || ch[i] == '/' || ch[i] == '%')
        {
            It[i].arithmetic = ch[i];
        }
        else if(ch[i] == '=' || ch[i] == '<' || ch[i] == '>' || ch[i] == '?')
        {
            It[i].relop = ch[i];
        }
    }
    printf("\n CONTENTS OF LEXEME TABLE ARE:");
    printf("\n IDENTIFIER \t ARITHMETIC \t RELOP \t VALUE \n");
    for(i=0; i<n; i++)
    {
        printf("%c \t\t %c \t\t %c \t\t %c \n", It[i].identifier, It[i].arithmetic, It[i].relop, It[i].value);
    }
    return 0;
}
Compilation:
cc lexanalysis.c
Output:
./a.out
LEXEME
ENTER YOUR EXPRESSION
a+b=5
THE EXPRESSION: a+b=5
CONTENTS OF LEXEME TABLE ARE:
IDENTIFIER    ARITHMETIC    RELOP    VALUE
a
              +
b
                            =
                                     5
PROGRAM TO DESIGN LEXICAL ANALYSIS
#include<string.h>
#include<ctype.h>
#include<stdio.h>
void keyword(char str[])
{
    if(strcmp("for",str)==0||strcmp("while",str)==0||strcmp("do",str)==0||
       strcmp("int",str)==0||strcmp("float",str)==0||strcmp("char",str)==0||
       strcmp("double",str)==0||strcmp("static",str)==0||strcmp("switch",str)==0||
       strcmp("case",str)==0)
        printf("\n%s is a keyword",str);
    else
        printf("\n%s is an identifier",str);
}
int main(void)
{
    FILE *f1,*f2,*f3;
    int c;                 /* int rather than char, so that EOF can be detected */
    char str[32];
    int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0;
    printf("\nEnter the c program");
    /* copy standard input into the file "input" */
    f1=fopen("input","w");
    while((c=getchar())!=EOF)
        putc(c,f1);
    fclose(f1);
    f1=fopen("input","r");
    f2=fopen("identifier","w");
    f3=fopen("specialchar","w");
    while((c=getc(f1))!=EOF)
    {
        if(isdigit(c))
        {
            /* collect the digits of a number */
            tokenvalue=c-'0';
            c=getc(f1);
            while(isdigit(c))
            {
                tokenvalue=tokenvalue*10+c-'0';
                c=getc(f1);
            }
            if(i<100)
                num[i++]=tokenvalue;
            ungetc(c,f1);
        }
        else if(isalpha(c))
        {
            /* copy a word to the identifier file, followed by a blank */
            putc(c,f2);
            c=getc(f1);
            while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
            {
                putc(c,f2);
                c=getc(f1);
            }
            putc(' ',f2);
            ungetc(c,f1);
        }
        else if(c==' '||c=='\t')
            printf(" ");
        else if(c=='\n')
            lineno++;
        else
            putc(c,f3);
    }
    fclose(f2);
    fclose(f3);
    fclose(f1);
    printf("\nThe no's in the program are");
    for(j=0;j<i;j++)
        printf("%d ",num[j]);
    printf("\n");
    f2=fopen("identifier","r");
    k=0;
    printf("The keywords and identifiers are:");
    while((c=getc(f2))!=EOF)
    {
        if(c!=' ')
        {
            if(k<31)           /* keep the word inside str */
                str[k++]=c;
        }
        else
        {
            str[k]='\0';
            keyword(str);
            k=0;
        }
    }
    fclose(f2);
    f3=fopen("specialchar","r");
    printf("\nSpecial characters are");
    while((c=getc(f3))!=EOF)
        printf("%c",c);
    printf("\n");
    fclose(f3);
    printf("Total no. of lines are:%d",lineno);
    return 0;
}
Output:
Enter the C program
a+b*c
Ctrl-D
The no’s in the program are:
The keywords and identifiers are:

a is an identifier and terminal
b is an identifier and terminal
c is an identifier and terminal
Special characters are:

+*
Total no. of lines are: 1
LEX PROGRAMS
LEX SOURCE SPECIFICATION THAT PRINTS * WHENEVER IT RECEIVES A TOKEN
Lex Code:
%%
"Token" printf("*");
%%
Compilation Steps:
lex Filename.l
cc lex.yy.c -ll
Output:
./a.out
This is Token
This is *
LEX SOURCE SPECIFICATION THAT PRINTS INTEGER WHENEVER IT RECEIVES A TOKEN FROM 0 TO 9
Lex Code:
%%
[0-9] printf("INTEGER ");
%%
Compilation:
lex integer.l
cc lex.yy.c -ll
Output:
./a.out
12
INTEGER INTEGER
LEX SOURCE SPECIFICATION THAT PRINTS INTEGER WHENEVER IT RECEIVES A TOKEN OF ANY LENGTH
Lex Code:
%%
[0-9]+ printf("INTEGER");
%%
Compilation:
lex integerlength.l
cc lex.yy.c -ll
Output:
./a.out
123
INTEGER
LEX SOURCE SPECIFICATION THAT IDENTIFIES POSITIVE INTEGER
Lex Code:
%%
"+"?[0-9]+ printf("POSITIVE INTEGER");
%%
Compilation:
lex positiveinteger.l
cc lex.yy.c -ll
Output:
./a.out
+12
POSITIVE INTEGER
-12
-POSITIVE INTEGER
LEX SOURCE SPECIFICATION THAT IDENTIFIES NEGATIVE INTEGER
Lex Code:
%%
"-"[0-9]+ printf("NEGATIVE INTEGER");
%%
Compilation:
lex negativeinteger.l
cc lex.yy.c -ll
Output:
./a.out
-1
NEGATIVE INTEGER
1
1
LEX PROGRAM TO FIND WHETHER THE GIVEN INPUT IS PALINDROME NUMBER OR NOT
%{
#include <stdio.h>
#include <stdlib.h>
int i, s, t, m;
%}
%%
[0-9]+ { i = atoi(yytext);
         s = 0;   /* reset the reversed value for every number */
         t = 0;
         m = atoi(yytext);
         while( i > 0 )
         {
           t = i % 10;
           s = s * 10 + t;
           i = i / 10;
         }
         if( m == s)
           printf(" GIVEN NUMBER IS PALINDROME NUMBER ");
         else
           printf(" GIVEN NUMBER IS NOT A PALINDROME NUMBER ");
       }
%%
Compilation:
lex palindrome.l
cc lex.yy.c -ll
Output:
./a.out
121 GIVEN NUMBER IS PALINDROME NUMBER
120 GIVEN NUMBER IS NOT A PALINDROME NUMBER
LEX PROGRAM TO FIND WHETHER THE GIVEN INPUT IS ARMSTRONG OR NOT
%{
#include <stdio.h>
#include <stdlib.h>
int i, s, t, m;
%}
%%
[0-9]+ { i = atoi( yytext );
         s = 0;   /* reset the sum of cubes for every number */
         t = 0;
         m = atoi( yytext );
         while( i > 0 )
         {
           t = i % 10;
           s = t * t * t + s;
           i = i / 10;
         }
         if( m == s)
           printf("ARMSTRONG NUMBER");
         else
           printf(" NOT AN ARMSTRONG NUMBER");
       }
%%
Compilation:
lex armstrong.l
cc lex.yy.c -ll
Output:
./a.out
153
ARMSTRONG NUMBER
LEX PROGRAM TO RECOGNIZE STRINGS OF NUMBERS (Integers) IN THE INPUT AND SIMPLY PRINT THEM OUT
%{
#include <stdio.h>
%}
%option noyywrap
%%
[0-9]+ {printf("Saw an integer: %s\n", yytext); }
.|\n { }
%%
int main(void)
{
yylex();
return 0;
}
Compilation:
lex filename.l
cc lex.yy.c -ll
Input:
abc123z.!&*2gj6
Output:
./a.out
the program will print:
Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
LEX PROGRAM FOR REMOVING MULTIPLE BLANKS IN GIVEN INPUT TEXT
%%
" "+ printf(" ");
%%
Compilation:
lex blank.l
cc lex.yy.c -ll
Output:
./a.out
This is KITS
This is KITS
LEX PROGRAM THAT PRINTS SUCCESSOR OF THE GIVEN INPUT CHARACTER
%%
[A-Za-z0-9] printf("%c", yytext[0] + 1);
%%
Compilation:
lex successor.l
cc lex.yy.c -ll
Output:
./a.out
kits
ljut
abc
bcd
LEX PROGRAM THAT PRINTS PREDECESSOR OF THE GIVEN INPUT CHARACTER
%%
[a-zA-Z0-9] printf("%c", yytext[0] - 1);
%%
Compilation:
lex predecessor.l
cc lex.yy.c -ll
Output:
kits
jhsr
bcd
abc
LEX PROGRAM TO REMOVE COMMENT LINES IN THE GIVEN INPUT TEXT
%%
"/*"[0-9a-zA-Z]*"*/" printf(" ");
%%
Compilation:
lex commentline.l
cc lex.yy.c -ll
Output:
./a.out
We are of kits /*college*/
We are of kits
LEX PROGRAM FOR CONVERTING REAL NUMBER TO INTEGER NUMBER
First Method:
%{
#include <stdio.h>
int i;
%}
%%
"+"?[0-9]+\.[0-9]+ { i = (yytext[0] == '+') ? 1 : 0;   /* skip a leading + sign */
                     while( yytext[i] != '.')
                     {
                       printf("%c", yytext[i]);
                       i++;
                     }
                   }
%%
Compilation:
lex realtoint.l
cc lex.yy.c -ll
Output:
./a.out
23.5
23
+23.5
23
-23.5
-23
Second Method:
%%
"."[0-9]+ printf(" ");
%%
Compilation:
lex realtoint1.l
cc lex.yy.c -ll
Output:
./a.out
23.5
23
LEX PROGRAM TO COUNT NUMBER OF VOWELS AND CONSONANTS
%{
int v=0,c=0;
%}
%%
[aeiouAEIOU] v++;
[a-zA-Z] c++;
%%
int main(void)
{
printf("ENTER INPUT : \n");
yylex();
printf("VOWELS=%d\nCONSONANTS=%d\n",v,c);
}
LEX PROGRAM TO COUNT THE TYPE OF NUMBERS
%{
int pi=0,ni=0,pf=0,nf=0;
%}
%%
\+?[0-9]+ pi++;
\+?[0-9]*\.[0-9]+ pf++;
\-[0-9]+ ni++;
\-[0-9]*\.[0-9]+ nf++;
%%
main()
{
printf("ENTER INPUT : ");
yylex();
printf("\nPOSITIVE INTEGER : %d",pi);
printf("\nNEGATIVE INTEGER : %d",ni);
printf("\nPOSITIVE FRACTION : %d",pf);
printf("\nNEGATIVE FRACTION : %d\n",nf);
}
LEX PROGRAM TO COUNT NUMBER OF Printf and Scanf Statements
%{
#include "stdio.h"
int pf=0,sf=0;
%}
%%
printf {
pf++;
fprintf(yyout,"%s","writef");
}
scanf {
sf++;
fprintf(yyout,"%s","readf");
}
%%
main()
{
yyin=fopen("file1.l","r+");
yyout=fopen("file2.l","w+");
yylex();
printf("NUMBER OF PRINTF IS %d\n",pf);
printf("NUMBER OF SCANF IS %d\n",sf);
}
LEX PROGRAM TO FIND THE SIMPLE AND COMPOUND STATEMENTS
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
"and" |
"or" |
"but" |
"because" |
"nevertheless" {printf("COMPOUND SENTENCE"); exit(0); }
. ;
\n return 0;
%%
int main(void)
{
printf("\nENTER THE SENTENCE : ");
yylex();
printf("SIMPLE SENTENCE");
return 0;
}
LEX PROGRAM TO COUNT NUMBER OF IDENTIFIERS
%{
#include<stdio.h>
int id=0,flag=0;
%}
%%
"int"|"char"|"float"|"double" { flag=1; printf("%s",yytext); }
";" { flag=0;printf("%s",yytext); }
[a-zA-Z][a-zA-Z0-9]* { if(flag!=0) id++; printf("%s",yytext); }
[a-zA-Z0-9]*"="[0-9]+ { id++; printf("%s",yytext); }
[0] return(0);
%%
main()
{
printf("\n *** output\n");
yyin=fopen("f1.l","r");
yylex();
printf("\nNUMBER OF IDENTIFIERS = %d\n",id);
fclose(yyin);
}
int yywrap()
{
return(1);
}
LEX PROGRAM TO COUNT NUMBER OF WORDS, CHARACTERS, BLANKS AND LINES
%{
int c=0,w=0,l=0,s=0;
%}
%%
[\n] l++;
[ \t] s++;
[^ \t\n]+ { w++; c+=yyleng; }
%%
int main(int argc, char *argv[])

{
if(argc==2)
{
yyin=fopen(argv[1],"r");
yylex();
printf("\nNUMBER OF SPACES = %d",s);
printf("\nCHARACTER=%d",c);
printf("\nLINES=%d",l);
printf("\nWORD=%d\n",w);
}
else
printf("ERROR");
}
LEX PROGRAM TO COUNT NUMBER OF COMMENT LINES
%{
#include<stdio.h>
int cc=0;
%}
%%
"/*"[a-zA-Z0-9 \t\n]*"*/" cc++;
"//"[a-zA-Z0-9 \t]* cc++;
%%
main()
{
yyin=fopen("f1.l","r");
yyout=fopen("f2.l","w");
yylex();
fclose(yyin);
fclose(yyout);
printf("\nTHE NUMBER OF COMMENT LINES = %d\n",cc);
}
LEX PROGRAM TO CHECK THE VALIDITY OF ARITHMETIC STATEMENT
%{
#include<stdio.h>
int opr=0,opd=0;
int n;
%}
%%
[\+\-\*\/] { printf("OPERATORS ARE %s\n",yytext);
opr++;
}
[a-zA-Z]+ { printf("OPERANDS ARE %s\n",yytext);
opd++;
}
[0-9]+ { printf("OPERANDS ARE %s\n",yytext);
opd++;
}
[a-zA-Z]+\+\-\*\/[a-zA-Z]+ { n=0; }
[0-9]+\+\-\*\/[0-9]+ { n=0; }
%%
main()
{
printf("\nENTER THE EXPRESSION : \n");
yylex();
printf("\nNUMBER OF OPERATORS ARE %d",opr);
printf("\nNUMBER OF OPERANDS ARE %d",opd);
if((n==0)&&(opd==opr+1))
printf("\nVALID EXPRESSION\n");
else
printf("\nINVALID EXPRESSION\n");
}
LEX PROGRAM TO COUNT NUMBER OF CONSTANTS
%{
#include<stdio.h>
int cons=0;
%}
%%
[0-9]+ { printf("\n%s",yytext); cons++; }
.|\n ;
%%
main(int argc,char *argv[])

{
if(argc==2)
{
yyin=fopen(argv[1],"r");
yylex();
printf("\nNUMBER OF CONSTANTS : %d\n",cons);
}
else
printf("\nERROR");
}
LEX PROGRAM THAT GENERATES A C PROGRAM WHICH READS THE OUTPUT OF THE UNIX DATE COMMAND ON STANDARD INPUT AND PRINTS ONE OF THE FOLLOWING MESSAGES
Good Morning
Good Afternoon
Good Evening
%{
#include <stdio.h>
%}
Morning [ ](00|01|02|03|04|05|06|07|08|09|10|11)[:]
Afternoon [ ](12|13|14|15|16|17)[:]
Evening [ ](18|19|20|21|22|23)[:]
%%
{Morning} printf("Good Morning ");
{Afternoon} printf("Good Afternoon ");
{Evening} printf("Good Evening ");
. ;
%%
If we assume that the executable file name of the generated C program is “greet”, then we can run the
following command to see the output.
date | greet
BEGIN followed by the name of a start condition places the scanner in the corresponding start
condition.
WRITE A PROGRAM TO IMPLEMENT LEXICAL ANALYZER USING LEX TOOL
LEX
/* program name is lexp.l */

%{
/* program to recognize a c program */
#include <stdio.h>
#include <stdlib.h>
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* { printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int |
float |
char |
double |
while |
for |
do |
if |
break |
continue |
void |
switch |
case |
long |
struct |
const |
typedef |
return |
else |
auto |
default |
enum |
extern |
register |
short |
sizeof |
signed |
static |
unsigned |
union |
volatile |
goto {printf("\n\t%s is a KEYWORD",yytext);}
"/*" {COMMENT = 1;}
"*/" {COMMENT = 0;}
{identifier}\( {if(!COMMENT)printf("\n\nFUNCTION\n\t%s",yytext);}
\{ {if(!COMMENT) printf("\n BLOCK BEGINS");}
\} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\)(\;)? {if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
if (argc > 1)
{
FILE *file;
file = fopen(argv[1],"r");
if(!file)
{
printf("could not open %s \n",argv[1]);
exit(0);
}
yyin = file;
}
yylex();
printf("\n\n");
return 0;
}
int yywrap()
{
return 1;   /* 1 tells the scanner that there is no further input */
}
Input:
$vi var.c
#include<stdio.h>
main()
{
int a,b;
}
Output:
$lex lexp.l
$cc lex.yy.c
$./a.out var.c
#include<stdio.h> is a PREPROCESSOR DIRECTIVE
FUNCTION
main ( )
BLOCK BEGINS
int is a KEYWORD
a IDENTIFIER
b IDENTIFIER
BLOCK ENDS
YACC
YACC (Yet Another Compiler Compiler) is a parser generator developed by Stephen C. Johnson at
AT&T for the UNIX operating system. It generates a parser (the part of a compiler that tries to
make syntactic sense of the source code) from a grammar written in a notation similar to BNF.
YACC generates the code for the parser in the C programming language, and it used to be the
default parser generator on most UNIX operating systems. The parser generated by yacc needs a
lexical analyzer to supply its tokens; lexical analyzer generators such as LEX or FLEX are widely
available. YACC generates an LALR(1) parser. Bison is the GNU version of YACC.
Assume our goal is to write a BASIC compiler. First, we need to specify all pattern matching rules for
lex (bas.l) and grammar rules for yacc (bas.y). Commands to create our compiler, bas.exe, are listed
below:
yacc -d bas.y # create y.tab.h, y.tab.c
lex bas.l # create lex.yy.c
cc lex.yy.c y.tab.c -o bas.exe # compile/link
YACC reads the grammar descriptions in bas.y and generates a parser, function yyparse, in file
y.tab.c. Included in file bas.y are token declarations. The -d option causes yacc to generate
definitions for tokens and place them in file y.tab.h. Lex reads the pattern descriptions in bas.l,
includes file y.tab.h, and generates a lexical analyzer, function yylex, in file lex.yy.c. Finally, the
lexer and parser are compiled and linked together to form the executable, bas.exe. From main, we
call yyparse to run the compiler. Function yyparse automatically calls yylex to obtain each token.
File            Content
filename.lex    Specifies the lex command specification file that defines the lexical analysis rules.
filename.yacc   Specifies the yacc command grammar file that defines the parsing rules, and calls the
                yylex subroutine created by the lex command to provide input.
The programs section contains the following subroutines. Because these subroutines are included in
this file, you do not need to use the YACC library when processing this file.
main            The required main program that calls the yyparse subroutine to start the program.
yyerror(s)      This error-handling subroutine only prints a syntax error message.
yywrap          The wrap-up subroutine that returns a value of 1 when the end of input occurs.
YACC PROGRAMS
YACC PROGRAM FOR ARITHMETIC EXPRESSION
Lex
%{
#include"y.tab.h"
int extern yylval;
%}
%%
[0-9]+ {yylval=atoi(yytext);
return num;}
. return yytext[0];
\n return 0;
%%
Yacc
%{
#include<stdio.h>
int valid=0,temp;
%}
%token num
%left '+' '-'
%left '*' '/'
%%
exp1: exp {temp=$1;}
;
exp: exp '+' exp {$$=$1+$3;}
|exp '-' exp {$$=$1-$3;}
|exp '*' exp {$$=$1*$3;}
|exp '/' exp {if($3==0) {valid=1;$$=0;} else{$$=$1/$3;}}
|'(' exp ')' {$$=$2;}
|num {$$=$1;}
;
%%
int yyerror(const char *s)
{
printf("\n Invalid Expression");
valid=2;
return 0;
}
int main()
{
printf("\n ENTER THE EXPRESSION TO BE EVALUATED::");
yyparse();
if(valid==1)
{
printf("\n DIVISION BY 0!");
}
if(valid==0)
{
printf("\nVALID EXPRESSION\n");
printf("\nTHE VALUE EVALUATED IS %d\n::",temp);
}
}
YACC PROGRAM TO CONVERT DECIMAL NO. TO AN OCTAL NO.
Lex Code:
%{
#include <stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ { yylval = atoi(yytext);
return num;}
\n return 0;
. return yytext[0];
%%
Yacc Code:
%{
#include <stdio.h>
int x, t, r, n;
%}
%token num
%%
Stat : num { x = $1;
t = 1;
n = 0;
while( x != 0)
{
r = x % 8;
n = n + r * t;
t = t * 10;
x = x / 8;
} printf("%d",n);
}
;
%%
Compilation:
lex dectooct.l
yacc -d dectooct.y
cc lex.yy.c y.tab.c -ll -ly
Output:
./a.out
10
12
YACC PROGRAM TO CONVERT OCTAL NO. TO A DECIMAL NO.
Lex Code:
%{
#include <stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ { yylval = atoi(yytext);
return num;}
\n return 0;
. return yytext[0];
%%
Yacc Code:
%{
#include <stdio.h>
int x, t, r, n;
%}
%token num
%%
Stat : num { x = $1;
t = 1;
n = 0;
while( x != 0)
{
r = x % 10;
n = n + r * t;
t = t * 8;
x = x / 10;
} printf("%d",n);
}
;
%%
Compilation:
lex octtodec.l
yacc -d octtodec.y
cc lex.yy.c y.tab.c -ll -ly
Output:
./a.out
12
10
YACC PROGRAM FOR IF-THEN STATEMENTS
(if.l)
ALPHA [A-Za-z]
DIGIT [0-9]
%%
[ \t\n]
if return IF;
then return THEN;
{DIGIT}+ return NUM;
{ALPHA}({ALPHA}|{DIGIT})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
"||" return OR;
"&&" return AND;
. return yytext[0];
%%
(if.y)
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token ID NUM IF THEN LE GE EQ NE OR AND
%right '='
%left AND OR
%left '<' '>' LE GE EQ NE
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted.\n");exit(0);};
ST : IF '(' E2 ')' THEN ST1';'
;
ST1 : ST
|E
;
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;
%%
#include "lex.yy.c"
main()
{
printf("Enter the statement: ");
yyparse();
}
Output:
$ lex if.l
$ yacc if.y
$ gcc y.tab.c -ll -ly
$ ./a.out
Enter the statement: if(i>) then i=1;
syntax error
$ ./a.out
Enter the statement: if(i>8) then i=1;
Input accepted.
$
YACC PROGRAM FOR FOR LOOP STATEMENTS
Lex file: for.l
alpha [A-Za-z]
digit [0-9]
%%
[\t \n]
for return FOR;
{digit}+ return NUM;
{alpha}({alpha}|{digit})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
"||" return OR;
"&&" return AND;
. return yytext[0];
%%
Yacc file: for.y
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token ID NUM FOR LE GE EQ NE OR AND
%right '='
%left OR AND
%left '>' '<' LE GE EQ NE
%left '+' '-'
%left '*' '/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted\n"); exit(0);}
ST : FOR '(' E ';' E2 ';' E ')' DEF

;
DEF : '{' BODY '}'
| E';'
| ST
|
;
BODY : BODY BODY
| E ';'
| ST
|
;
E : ID '=' E
| E '+' E
| E '-' E
| E '*' E
| E '/' E
| E '<' E
| E '>' E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| E '+' '+'
| E '-' '-'
| ID
| NUM
;
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
;
%%
#include "lex.yy.c"
main() {
printf("Enter the expression:\n");
yyparse();
}
Output:
$ lex for.l
$ yacc for.y
conflicts: 25 shift/reduce, 4 reduce/reduce
$ gcc y.tab.c -ll -ly
$ ./a.out
Enter the expression:
for(i=0;i<n;i++)
i=i+1;
Input accepted
$
YACC AND LEX SPECIFICATION FOR TESTING BALANCED PARENTHESES
(bp.y)
%{
#include <ctype.h>
#include <stdio.h>
#include "y.tab.h"
extern int yydebug;
%}
%token OPEN CLOSE
%%
lines : s '\n' {printf("OK\n"); }
;
s:
| OPEN s CLOSE s
;
%%
void yyerror(char * s)
{
fprintf (stderr, "%s\n", s);
}
int yywrap(){return 1; }
int main(void) {
yydebug=1;
return yyparse();}
Lex Specification File (bp.l)

%{
#include"y.tab.h"
%}
%%
[ \t] { }
"(" return OPEN;
")" return CLOSE;
\n|. { return yytext[0];
}
Compilation:
lex bp.l
yacc -dtv bp.y
gcc -o bp y.tab.c lex.yy.c -ly -lfl
Input:
(()) or (()(()()))
Output:
OK
66
COMPILER DESIGN BASIC CONCEPTS AND DEFINITIONS
1. What is a Parser?
A parser for a grammar is a program that takes a string of the language as its input and produces
either the corresponding parse tree or an error.
2. What is the Syntax of a Language?

The rules that determine whether a string is a valid program are called the syntax of the language.
3. What is the Semantics of a Language?

The rules that give meaning to programs are called the semantics of the language.
4. What are tokens?

When a string representing a program is broken into a sequence of substrings, such that each
substring represents a constant, identifier, operator, keyword, etc. of the language, these substrings
are called the tokens of the language.
5. What is Lexical Analysis?

The function of a lexical analyzer is to read the input stream representing the source program,
one character at a time, and to translate it into valid tokens.
6. How can we represent a token in a language?

The Tokens in a Language are represented by a set of Regular Expressions. A regular expression
specifies a set of strings to be matched. It contains text characters and operator characters. The
Advantage of using regular expression is that a recognizer can be automatically generated.
7. Are Lexical Analysis and Parsing two different Passes?

These two can form two different passes of a compiler. The lexical analyzer can store all the
recognized tokens in an intermediate file and give it to the parser as input. However, it is more
convenient to have the lexical analyzer as a coroutine or a subroutine which the parser calls
whenever it requires a token.
8. How are the tokens recognized?

Tokens that are described by regular expressions are recognized in an input string by
means of state transition diagrams and finite automata.
9. How do we write the Regular Expressions?

The following are the most common notations used for writing a regular expression (R.E.):
Symbol   Description
|        OR (alternation)
()       Grouping of a subexpression
*        0 or more occurrences
?        0 or 1 occurrence
+        1 or more occurrences
{n,m}    Between n and m occurrences

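Suppose we want to express 'C' identifiers as a regular expression:
identifier=letter(letter|digit)*
where letter denotes the character set a-z and A-Z, and digit denotes 0-9.
In LEX we can write it as [a-zA-Z][a-zA-Z0-9]*
(C also allows the underscore wherever a letter is allowed; that case is omitted here for simplicity.)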
10. What are the Advantages of using Context-Free grammars?

A context-free grammar gives a precise, easy-to-understand specification of the syntax.
It also makes syntactic ambiguities and conflicts in the grammar easier to detect.
11. What are Parse Trees?

A parse tree is a graphical representation of a derivation that filters out the order in which
the production rules are applied. For example, for a production P→ABC, the parse tree
would be
    P
  / | \
 A  B  C
12. What are Terminals and non-Terminals in a grammar?
Terminals: The basic symbols or tokens of which the language is composed are called
terminals. In a parse tree the leaves represent the terminal symbols.
Non-Terminals: These are syntactic variables in the grammar, each of which represents a set of strings
of the language. In a parse tree all the inner nodes represent non-terminal
symbols.
13. What are Ambiguous Grammars?

A grammar that produces more than one parse tree for the same sentence
is said to be ambiguous. For example, consider the production E→E*E (together with E→id).
The sentence id*id*id has two parse trees, depending on the associativity of the operator '*':
        E                  E
      / | \              / | \
     E  *  E            E  *  E
   / | \                    / | \
  E  *  E                  E  *  E
14. What is bottom-up Parsing?

It is the parsing method in which the parse tree is constructed from the input string, beginning
at the leaves and going up to the root node. Bottom-up parsing is also called shift-reduce
parsing because of the way it is implemented. YACC supports shift-reduce parsing.
15. Why is Operator precedence needed?

Shift-reduce parsing has a basic limitation: on its own it cannot choose between the alternative
parse trees (left-leaning or right-leaning) of an ambiguous grammar. Such a grammar
typically has a production with two non-terminals separated by an operator, such as E→E*E.
The terminal sandwiched between these two non-terminals must therefore be given an associativity
and a precedence. These tell the parser which non-terminal should be reduced first.
16. What is an LALR Parser?

LALR stands for Look-Ahead LR. Like the other LR(1) methods, an LALR parser looks ahead one symbol
in the input string before choosing a reduce action; it differs from canonical LR in that states
with the same core items are merged, which gives much smaller parsing tables. The lookahead helps
the parser to know whether the complete rule has been matched or not.
Consider a grammar G with the production
P→AB|ABC
When the parser has shifted the symbol B, it could reduce AB to P. But if the next symbol is C, then it has
not yet matched the complete rule. An LALR parser inspects the next token, without consuming it,
and only then decides whether to reduce or shift.
17. What is a pass?

A pass is one complete physical scan over the source program (or over an intermediate form of it).
The number of passes of a compiler is the number of times the program is scanned in this way.
18. Define token, lexeme, and pattern.

• A token is a sequence of characters having a collective meaning.
• A pattern is a rule describing the set of strings in the input for which the same token is
produced as output; that is, a rule describing the set of lexemes that can represent a particular
token in the source program.
• A lexeme is a sequence of characters in the source program that is matched by the pattern for a
token.
19. What are the various types of Intermediate code?

Postfix notation, prefix (Polish) notation, syntax trees, and three-address code.
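20. What is Bootstrapping?

Writing a compiler in the language that it compiles, so that the compiler can compile itself, is called "bootstrapping".
21. What is Backpatching?

Filling in the jump targets (labels) that were left unspecified when the three-address statements were generated is called "backpatching".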
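22. What are the functions available in Backpatching?

The functions used in backpatching are as follows:
makelist(i): creates a new list containing only i, the index of a statement,
merge(p1,p2): concatenates the lists p1 and p2 and returns the merged list,
backpatch(p,i): inserts i as the target label of each statement on the list p.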
23. What is a Basic Block?
A basic block is a sequence of consecutive statements in which flow of control enters at the
beginning and leaves at the end, without halting or branching except at the end.
24. What is the use of a Symbol Table?

A compiler uses a symbol table to keep track of scope and binding information about names.
25. What is the Output of the Code Generator?

Object Program.
26. What is the Graphical Representation for Intermediate Representation?

Syntax tree and Directed Acyclic Graph (DAG).
27. What is the parsing technique used in the YACC parser?

The shift-reduce technique (YACC generates an LALR(1) parser).
28. What are the Applications of DAGs?

a. We can automatically detect common sub-expressions.
b. We can determine which identifiers have their values used in the block: they are exactly those
for which a leaf is created at some time.
c. We can determine which statements compute values that could be used outside the block.
29. What are the typical sizes of an Input Buffer?

An input buffer normally holds one disk block of characters; typical sizes are 1024 and 4096 characters.
30. What is meant by parsing?

A parser for a grammar G is a program that takes a string w as input and produces a parse tree for
w if w is a sentence of G, or else produces an error message indicating that w is not a sentence of G.
31. Give the different types of parsing techniques.

There are two types of parsing techniques:
a. Top-down Parsing
b. Bottom-Up Parsing.
32. Give examples of top-down parsing.

Recursive-descent parsing and non-recursive predictive parsing (for LL(1) grammars).
33. Define handle.

A handle is a substring of a string that matches the right side of a production and whose reduction
to the non-terminal on the left side of the production represents one step of the reverse of a rightmost
derivation.
34. What do you mean by handle pruning?

Handle pruning is the process of replacing a handle by the non-terminal on the left side
of the corresponding production. Repeating it yields a rightmost derivation in reverse.
35. When do you say that a grammar is ambiguous?

A Context free grammar is said to be ambiguous if a string in a language of the grammar can be
represented by two or more different parse trees.
36. What is a Cross Compiler?

A compiler that runs on one machine and produces target code for another machine.
37. What is a Sentinel?

A special character that marks the end of a string or of an input buffer, so that the scanner
needs only one test per character.
38. How many types of errors are there? Give examples.

a. Lexical, such as misspelling an identifier, keyword or operator.
b. Syntactic, such as an arithmetic expression with unbalanced parentheses.
c. Semantic, such as an operator applied to an incompatible operand.
d. Logical, such as an infinitely recursive call.
39. What are the problems with top-down parsing?
a. Left recursion,
b. Back tracking,
c. Order of Alternatives,
d. Report of Failures.
40. What is a dependency Graph?

A graph depicting the interdependences among the attributes of the different nodes in a parse tree.
41. What do cc and -ll stand for?

• cc: the C compiler.
• -ll: the linker option that links the lex library (-l followed by the library name l).
42. What is a Flow graph?

The graph that shows the basic blocks and their successor relationship is called “Flow Graph”.
43. How many sections are there in LEX and YACC? Name them.
LEX and YACC specifications both have three sections, separated by %% lines:
a) Declarations section
b) Translation Rules
c) Auxiliary Procedures.
44. How is a LEX program compiled?

• Name the file filename.l
• Run lex on it: lex filename.l
• Compile the generated C file: cc lex.yy.c -ll
• Run the executable to see the output: ./a.out
45. How is a YACC program compiled?

• Name the YACC file filename.y
• Run yacc on it: yacc -d filename.y
• Name the LEX file filename.l
• Run lex on it: lex filename.l
• Compile both generated C files together: cc lex.yy.c y.tab.c -ll -ly
• Run the executable to see the output: ./a.out
46. How many phases are there in a compiler?

There are two groups of phases, with three phases in each:
a) Analysis Phases
i) Lexical (linear, scanning) phase
ii) Syntax (hierarchical, parsing) phase
iii) Semantic phase
b) Synthesis Phases
i) Intermediate code generation
ii) Code Optimization
iii) Code Generation
47. What are the various types of bottom-up parsers?

The various types of bottom-up parser are:
1. Shift-Reduce
2. Operator Precedence
3. LR, where L stands for left-to-right scanning of the input and R for constructing a rightmost
derivation in reverse.
i) SLR (Simple LR)
ii) CLR (Canonical LR)
iii) LALR (Look-Ahead LR)
INDEX

A
Action specification in LEX
Action tables
Action | GOTO tables
Activation records
Addressing modes, machine model and
Algebraic properties, register requirements reduced with
Alphabet, defined for lexical analysis
Ambiguous grammars and bottom-up parsing
AND operator and translation
Arithmetic expressions, translation of
Array references
Arrays, to represent action tables
Attributes, defined
Attributes, dummy synthesized attributes
Attributes, inherited attributes
Attributes, synthesized attributes
Augmented grammars

B
Back end compilers
Back-patching
Backtracking parsers
Block statements and stack allocation
Boolean expressions, translation of
Bootstrap compilers, defined
Bottom-up parsing
Braces {} in syntax-directed translation schemes

C
Call and return sequences, stack allocation and
Canonical collection of sets
Cartesian products, set operation
CASE statements
Code generation phase
Code optimization phase
Compilation, process described
Compilers defined
Concatenation operation, regular sets and
Context-free grammars (CFGs)
Cross-compilers

D
DAGs (Directed acyclic graphs)
Data storage
Data structures for representing parsing tables
Dead states of DFAs
Decrement operators, implementation of
Dependency graphs
Derivation in context-free grammar
Detection, of DFA unreachable and dead states
Deterministic finite automata (DFA)
DO-WHILE statements and translation
Dummy synthesized attributes

E
ε-closure (q), finding
ε-moves
  acceptance of strings by NFAs with
  equivalence of NFAs with and without
  NFAs with
ε-productions
Equivalence of automata
Error handling, detection and report of errors
Errors, handling

F
Finite automata
FOR statements and translation
Front-end compilers

G
Global common sub-expressions, eliminating
GOTO tables

H
Handle pruning
Hash tables for organization of symbol tables

I
IF-THEN-ELSE statements and translation
IF-THEN statements and translation
Increment operators, implementation of
Indirect triple representation
Induction variables of loops
Inherited attributes
Input files, LEX
Intermediate code generation phase
Intersection, set operation

J-K
Jumps and Boolean translation

L
LALR parsing
Language, defined for lexical analysis
Language tokens, lexical analysis and
L-attributed definitions
Left linear grammar
LEX compiler-writing tool
Lexemes
Lexical analysis
Lexical analyzers, design of
Lexical phase
Linear lists for organization of symbol tables
Local common sub-expressions, eliminating
Logical expressions
Loop invariant computations
Loop jamming
Loop optimizations
  Induction variables, reduction of
  Loop detection
  Loop jamming
  Loop unrolling
LR parsers and parsing
LR(1) parsers and parsing

M
Machine model described
Memory
Memory addresses, machine model and

N
Names, access to nonlocal names
Non-deterministic finite automata (NFA)
Non-terminals in context-free grammar
NOT operator and translation

O
Operators for regular expressions
Optimizations of DFAs
OR operator and translation

P
Panic mode recovery
Parsers and parsing
  Predictive top-down parsers
Parse trees in CFG
Pattern specification in LEX
Peephole optimization
Postfix notation
Power set, set operation
Predictive parsing
Prefixes, defined
Procedure calls
Productions (P) in context-free grammar

Q
Quadruple representation

R
Recursion, eliminating left recursion
Recursive descent parsers
Reduce-reduce conflicts
Reducible flow graphs
Reduction of grammar
Registers
Regular expression notation
  Role in lexical analysis
Regular expressions
Regular grammar
Right linear grammar

S
Scope rules and scope information
Search trees for organization of symbol tables
Sentential form handles
Set difference, set operation
Set operations, defined
Sets defined
Shift-reduce conflicts
SLR parsing
Source files, LEX
Stack allocation
Start symbol (S) in context-free grammar
Storage management
  Stack allocation
  Static allocation
  Storage allocation
Strings, defined
SWITCH statements, translation of
Symbol tables
  Scope information
  Search trees for organization
Syntactic phase error recovery
Syntax analysis phase
Syntax-directed definitions
Syntax directed translations and translation schemes
Syntax trees
Synthesized attributes

T
Terminals (T) in context-free grammar
Three-address code
Three-address statements, representation of
Tokens, lexical analysis and
Top-down parsing
Translations and translation schemes
Trees
Triple representation

U
Union set operation
Unit productions defined
Unreachable states of DFAs

V
Variables (V) in context-free grammar

W-X
WHILE statements and translation

Y-Z
YACC
