Compiler Design Book Final
VI EDITOR
The default editor that comes with the UNIX operating system is called vi (visual editor). The UNIX
vi editor is a full-screen editor and has two modes of operation:
1. Command mode, in which commands cause actions to be taken on the file.
2. Insert mode, in which entered text is inserted into the file.
In command mode, every character typed is a command that does something to the text file being
edited; a character typed in command mode may even cause the vi editor to enter insert mode.
In insert mode, every character typed is added to the text in the file; pressing the <Esc> (Escape)
key turns off insert mode.
vi Editor Commands
LEX PRACTICE
Metacharacter   Matches
\n              newline
^               beginning of line
$               end of line
a|b             a or b
[]              character class
Expression      Matches
abc             abc
a(bc)?          a, abc
[ \t\n]+        whitespace
[a^b]           a, ^, b
[a|b]           a, |, b
a|b             a, b
Name              Function
BEGIN condition   switch start condition
ALGORITHM
STEP1: Start
STEP2: Declare the header files stdio.h and string.h
STEP3: Start the main section
STEP4: Create character variables c, d to get the characters from Input file.
STEP5: Create two pointer variables fs, fp of ‘FILE’ type to Access the file.
STEP6: Open the files Operators.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP7: Start the while loop with condition !feof(fs).
STEP8: Get the character from Operators.txt and assign it to char ‘c’.
STEP9: Start another while loop with condition !feof(fp).
STEP10: Similarly, get the character from ip.c and assign it to char ‘d’.
STEP11: If c == ‘ ’ || c == ‘\n’ || c == EOF, then break out of the inner loop.
STEP12: If c == d, then print the character from the input file as output.
STEP13: Rewind fp.
STEP14: Close the opened files and while loops.
STEP15: Stop.
Text File:
+.=-/*%
Input File:
#include<stdio.h>
main()
{
int a=10,b=2,c;
c=a+b;
printf(“%d”,c);
}
Program Code:
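#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;        /* int rather than char, so that EOF can be detected */
    FILE *fs,*fp;
    fs=fopen("operators.txt","r");
    fp=fopen("ip.c","r");
    if(fs==NULL||fp==NULL)
    {
        printf("Cannot open operators.txt or ip.c\n");
        return 1;
    }
    while(!feof(fs))
    {
        c=fgetc(fs);
        while(!feof(fp))
        {
            d=fgetc(fp);
            if(c==' '||c=='\n'||c==EOF)
                break;
            if(c==d)
            {
                printf("\n %c IS AN OPERATOR",c);
            }
        }
        rewind(fp);     /* scan ip.c from the start for the next operator */
    }
    fclose(fp);
    fclose(fs);
    return 0;
}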
Compilation:
cc operators.c
Output:
./a.out
= IS AN OPERATOR
= IS AN OPERATOR
+ IS AN OPERATOR
= IS AN OPERATOR
ALGORITHM:
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create character variables c, d to get the characters from input file.
STEP5: Create two pointer variables fs, fp of ‘FILE’ type to access the file.
STEP6: Open the files Specials.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP7: Start the while loop with condition !feof(fs).
STEP8: Get the character from Specials.txt and assign it to char ‘c’.
STEP9: Start another while loop with condition !feof(fp).
STEP10: Similarly, get the character from ip.c and assign it to char ‘d’.
STEP11: If c == ‘ ’ || c == ‘\n’ || c == EOF, then break out of the inner loop.
STEP12: If c == d, then print the character from the input file as output.
STEP13: Rewind fp.
STEP14: Close the opened files and while loops.
STEP15: Stop.
Program Code:
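#include<stdio.h>
#include<string.h>
int main(void)
{
    int c,d;        /* int rather than char, so that EOF can be detected */
    FILE *fs,*fp;
    fs=fopen("specials.txt","r");
    fp=fopen("ip.c","r");
    if(fs==NULL||fp==NULL)
    {
        printf("Cannot open specials.txt or ip.c\n");
        return 1;
    }
    while(!feof(fs))
    {
        c=fgetc(fs);
        while(!feof(fp))
        {
            d=fgetc(fp);
            if(c==' '||c=='\n'||c==EOF)
                break;
            if(c==d)
            {
                printf("\n %c IS A SPECIAL SYMBOL",c);
            }
        }
        rewind(fp);     /* scan ip.c from the start for the next symbol */
    }
    fclose(fp);
    fclose(fs);
    return 0;
}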
Compilation:
cc specials.c
Output:
./a.out
# IS A SPECIAL SYMBOL
( IS A SPECIAL SYMBOL
) IS A SPECIAL SYMBOL
" IS A SPECIAL SYMBOL
" IS A SPECIAL SYMBOL
, IS A SPECIAL SYMBOL
ALGORITHM
STEP1: Start
STEP2: Declare the header files stdio.h and string.h
STEP3: Start the main section
STEP4 : Create character variables c , d to get the characters from input file and an array a to
store the constants.
STEP5: Create integer variables i , f ,ass
STEP6: Create a pointer variable fp of ‘FILE’ type to access the file.
STEP7: Open the file ip.c using fp in ‘read’ mode.
STEP8: Start the while loop with condition !feof(fp).
STEP9: Get the character from ip.c and assign it to char ‘c’.
STEP10: Get the ASCII value of ‘c’ and store it in ass.
STEP11: If ass >= 48 && ass <= 57 (that is, c is a digit), then store the value of c in array ‘a’ and
set f and i to 1.
STEP12: Start a while loop with condition f == 1.
STEP13: Get the next character from ip.c, store it in d, and store the ASCII value of d in ass.
STEP14: If ass == 46 || (ass >= 48 && ass <= 57) (a ‘.’ or a digit), then store the value of d in
array ‘a’, increment i, and keep f = 1.
STEP15: Else set f = 0, store ‘\0’ in array ‘a’ as the i-th element, and come out of the loop.
STEP16: Print the elements in the array; when the outer loop ends, close the opened file.
STEP17: Stop.
Program Code:
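#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main(void)
{
    int c,d;        /* int rather than char, so that EOF can be detected */
    char a[20];
    int i,f,ass;
    FILE *fp;
    fp=fopen("ip.c","r");
    if(fp==NULL)
    {
        printf("Cannot open ip.c\n");
        return 1;
    }
    while((c=fgetc(fp))!=EOF)
    {
        ass=toascii(c);
        if((ass>=48) && (ass<=57))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=toascii(d);
                /* the i<19 test keeps the constant within a[20] */
                if(d!=EOF && i<19 && ((ass==46) || ((ass>=48) && (ass<=57))))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    break;
                }
            }
            printf("\n %s IS A CONSTANT",a);
        }
    }
    fclose(fp);
    return 0;
}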
Compilation:
cc constants.c
Output:
./a.out
10 IS A CONSTANT
2 IS A CONSTANT
ALGORITHM:
STEP1: Start.
STEP2: Declare the header files stdio.h and string.h.
STEP3: Start the main section.
STEP4: Create two pointer variables fs , fp of ‘FILE’ type to access the file.
STEP5: Create two arrays a , b of ‘char’ type to store the characters and two variables c , d of
‘char’ type to get the characters from the input file
STEP6: Create integer variables ass, f , z, i.
STEP7: Open the files Keywords.txt and ip.c using fs and fp respectively in ‘read’ mode.
STEP8: Start a while loop with condition (!feof(fp)).
STEP9: Get the character from the file ip.c and store it in variable ‘c’ .Convert it into its ascii
value.
STEP10: If the ASCII value of ‘c’ is in 65..90 or 97..122 (a letter), then store the character in
array ‘a’.
STEP11: Set f and i to 1 and start another while loop with condition (f == 1).
STEP12: Get the next character from the file ip.c and store it in variable ‘d’. Convert it into its
ASCII value.
STEP13: If the ASCII value of ‘d’ is in 65..90 or 97..122 (a letter), equals 95 (‘_’) or 46 (‘.’), or is
in 48..57 (a digit), then store the character in array ‘a’, increment ‘i’, and keep f = 1.
STEP14: Else set f = 0, store ‘\0’ in a[i] as the last symbol, and call fseek(fp,-1,1) to step back
one character.
STEP15: Start another while loop with condition (!feof(fs)) and read each string from
Keywords.txt into array ‘b’.
STEP16: Compare a and b and store the result in variable z.
STEP17: If z == 0 it is a keyword, else it is an identifier.
STEP18: Stop.
Text File:
Program Code
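#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main(void)
{
    FILE *fs,*fp;
    char a[20],b[20];
    int c,d;        /* int rather than char, so that EOF can be detected */
    int ass,f,z,i;
    fp=fopen("ip.c","r");
    fs=fopen("Keywords.txt","r");
    if(fp==NULL||fs==NULL)
    {
        printf("Cannot open ip.c or Keywords.txt\n");
        return 1;
    }
    while((c=fgetc(fp))!=EOF)
    {
        ass=toascii(c);
        if(((ass>=65)&&(ass<=90))||((ass>=97)&&(ass<=122)))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=toascii(d);
                if(d!=EOF&&i<19&&(((ass>=65)&&(ass<=90))||((ass>=97)&&(ass<=122))||
                (ass==95)||(ass==46)||((ass>=48)&&(ass<=57))))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    if(d!=EOF)
                        fseek(fp,-1,SEEK_CUR);  /* step back one character */
                    break;
                }
            }
            z=1;            /* assume identifier until a keyword matches */
            rewind(fs);     /* search the keyword list from the start */
            while(fscanf(fs,"%19s",b)==1)
            {
                z=strcmp(a,b);
                if(z==0)
                {
                    printf("\n%s IS A KEYWORD",a);
                    break;
                }
            }
            if(z!=0)
            {
                printf("\n%s IS AN IDENTIFIER",a);
            }
        }
    }
    fclose(fp);
    fclose(fs);
    return 0;
}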
Compilation:
cc Keywords.c
Output:
./a.out
stdio.h IS AN IDENTIFIER
main IS AN IDENTIFIER
int IS A KEYWORD
a IS AN IDENTIFIER
b IS AN IDENTIFIER
c IS AN IDENTIFIER
a IS AN IDENTIFIER
b IS AN IDENTIFIER
c IS AN IDENTIFIER
printf IS AN IDENTIFIER
c IS AN IDENTIFIER
ALGORITHM
STEP1: Start
STEP2: Declare the header files stdio.h and string.h
STEP3: Start the main section
STEP4 : Create character variables c , d to get the characters from input file and two arrays a
and b.
STEP5: Create integer variables i , f ,ass ,z.
STEP6: Create two pointer variables fp and fs of ‘FILE’ type to access the files.
STEP7: Open the files ip.c and headerfiles.txt using fp and fs respectively in ‘read’ mode.
STEP8: Start the while loop with condition !feof(fp).
STEP9: Get the character from ip.c and assign it to char ‘c’.
STEP10: Get the ascii value of ‘c’ and store it in ass.
STEP11: If ass ((G.T.E 65) && ( L.T.E 90) || (G.T.E 97) && ( L.T.E 122)) then store the value of
c in array ‘a’ and make f and i as 1.
STEP12: Start the while loop with condition f==1.
STEP13: Get the next character from ip.c, store it in d, and store the ASCII value of ‘d’ in ass.
STEP14: If ass is in 65..90 or 97..122 (a letter) or equals 46 (‘.’), then store the value of d in
array ‘a’, increment i, and keep f = 1.
STEP15: Else set f = 0, store ‘\0’ in array ‘a’ as the i-th element, use the fseek function to step
back one character (offset = -1) within the file ip.c, and then come out of the loop.
STEP16: Start the while loop with condition !feof(fs).
STEP17: Scan the file headerfiles.c using fscanf() and store the result in b.
STEP18: compare character arrays ‘a’ and ‘b’ and store the result in z.
STEP19: If z==0 then print the elements in the array ‘a’ and come out of the while loop.
STEP20: Else Rewind(fs).
STEP21: Close all the opened files and while loops.
STEP22: stop.
Text File:
stdio.h string.h
Program Code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main(void)
{
    char a[20],b[20];
    int c,d;        /* int rather than char, so that EOF can be detected */
    int i,f,ass,z;
    FILE *fp,*fs;
    fp=fopen("ip.c","r");
    fs=fopen("headerfiles.txt","r");
    if(fp==NULL||fs==NULL)
    {
        printf("Cannot open ip.c or headerfiles.txt\n");
        return 1;
    }
    while((c=fgetc(fp))!=EOF)
    {
        ass=toascii(c);
        if(((ass>=65) && (ass<=90)) || ((ass>=97) && (ass<=122)))
        {
            a[0]=c;
            f=1;
            i=1;
            while(f==1)
            {
                d=fgetc(fp);
                ass=toascii(d);
                if(d!=EOF && i<19 && (((ass>=65) && (ass<=90)) || ((ass>=97) && (ass<=122)) || (ass==46)))
                {
                    a[i]=d;
                    i=i+1;
                    f=1;
                }
                else
                {
                    f=0;
                    a[i]='\0';
                    if(d!=EOF)
                        fseek(fp,-1,SEEK_CUR);  /* step back one character */
                    break;
                }
            }
            rewind(fs);     /* search the header list from the start for every word */
            while(fscanf(fs,"%19s",b)==1)
            {
                z=strcmp(a,b);
                if(z==0)
                {
                    printf("\n %s IS A HEADER FILE",a);
                    break;
                }
            }
        }
    }
    fclose(fp);
    fclose(fs);
    return 0;
}
Compilation:
cc headerfiles.c
Output:
./a.out
stdio.h IS A HEADER FILE
ALGORITHM:
STEP1: Start
STEP2: Declare the header files stdio.h, string.h and ctype.h
STEP3: Declare a structure named lextable where identifier, arithmetic, relop, value are the character
variables declared inside the structure.
STEP4: Close the structure and start the main section.
STEP5: Declare integer variables i, n and character array ch[50]
STEP6: Declare the array It[80] of type struct
STEP7: Enter your expression and store it in array ch.
STEP8: Find the length of expression using strlen(ch) and assign it to n
STEP9: Start the for loop with condition for ( i=0; i<n; i++)
STEP10: Start the if condition with isalpha(ch[i])
STEP11: The value of It[i].identifier is the value of ch[i] and the remaining values are equal to blank
spaces.
STEP12: Start another if condition, if the character is a digit, then store ch[i] in It[i].val and
remaining are equal to blank spaces
STEP13: Start another if condition, where if the ch[i] is equal to any of arithmetic operation + || -|| * ||
/ || %. Store it in It[i].arithmetic and remaining are blank spaces.
STEP14: Similarly start another if condition where, if the value of ch[i] is equal to any of relational
operation = || < || > || ? store it in It[i].relop and remaining are blank spaces.
STEP15: Print the contents of the lexeme table
STEP16: Stop
Program Code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
struct lextable
{
    char identifier;
    char arithmetic;
    char relop;
    char value;
};
int main(void)
{
    int i, n;
    char ch[50];
    struct lextable It[80];
    printf("\n LEXEME");
    printf("\n ENTER YOUR EXPRESSION");
    scanf("%49s",ch);
    n=strlen(ch);
    printf("\n THE EXPRESSION: \t");
    printf("%s",ch);
    for(i = 0; i < n; i++)
    {
        /* start every row blank, then fill in the one column that applies */
        It[i].arithmetic = ' ';
        It[i].identifier = ' ';
        It[i].relop = ' ';
        It[i].value = ' ';
        if(isalpha((unsigned char)ch[i]))
            It[i].identifier = ch[i];
        else if(isdigit((unsigned char)ch[i]))
            It[i].value = ch[i];
        else if(ch[i] == '+' || ch[i] == '-' || ch[i] == '*' || ch[i] == '/' || ch[i] == '%')
            It[i].arithmetic = ch[i];
        else if(ch[i] == '=' || ch[i] == '<' || ch[i] == '>' || ch[i] == '?')
            It[i].relop = ch[i];
    }
    printf("\n CONTENTS OF LEXEME TABLE ARE:");
    printf("\n IDENTIFIER \t ARITHMETIC \t RELOP \t VALUE \n");
    for(i=0; i<n; i++)
    {
        printf("%c \t\t %c \t\t %c \t\t %c \n", It[i].identifier, It[i].arithmetic, It[i].relop, It[i].value);
    }
    return 0;
}
Compilation:
cc lexanalysis.c
Output:
./a.out
LEXEME
ENTER YOUR EXPRESSION
a+b=5
THE EXPRESSION: a+b=5
CONTENTS OF LEXEMETABLE ARE
IDENTIFIER ARITHMETIC RELOP VALUE
a +
b = 5
#include<string.h>
#include<ctype.h>
#include<stdio.h>
void keyword(char str[10])
{
    if(strcmp("for",str)==0||strcmp("while",str)==0||strcmp("do",str)==0||strcmp("int",str)==0||
       strcmp("float",str)==0||strcmp("char",str)==0||strcmp("double",str)==0||strcmp("static",str)==0||
       strcmp("switch",str)==0||strcmp("case",str)==0)
        printf("\n%s is a keyword",str);
    else
        printf("\n%s is an identifier",str);
}
int main(void)
{
FILE *f1,*f2,*f3;
int c;          /* int rather than char, so that EOF can be detected */
char str[10],st1[10];
int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0;
printf("\nEnter the c program");/*gets(st1);*/
f1=fopen("input","w");
while((c=getchar())!=EOF)
putc(c,f1);
fclose(f1);
f1=fopen("input","r");
f2=fopen("identifier","w");
f3=fopen("specialchar","w");
while((c=getc(f1))!=EOF)
{
if(isdigit(c))
{
tokenvalue=c-'0';
c=getc(f1);
while(isdigit(c))
{
tokenvalue=tokenvalue*10+c-'0';
c=getc(f1);
}
num[i++]=tokenvalue;
ungetc(c,f1);
}
else if(isalpha(c))
{
putc(c,f2);
c=getc(f1);
while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
{
putc(c,f2);
c=getc(f1);
}
putc(' ',f2);
ungetc(c,f1);
}
else if(c==' '||c=='\t')
printf(" ");
else if(c=='\n')
lineno++;
else
putc(c,f3);
}
fclose(f2);
fclose(f3);
fclose(f1);
printf("\nThe no's in the program are");
for(j=0;j<i;j++)
printf("%d",num[j]);
printf("\n");
f2=fopen("identifier","r");
k=0;
printf("The keywords and identifiers are:");
while((c=getc(f2))!=EOF)
{
if(c!=' ')
str[k++]=c;
else
{
str[k]='\0';
keyword(str);
k=0;
}
}
fclose(f2);
f3=fopen("specialchar","r");
printf("\nSpecial characters are");
while((c=getc(f3))!=EOF)
printf("%c",c);
printf("\n");
fclose(f3);
printf("Total no. of lines are:%d",lineno);
}
Output:
Enter the C program
a+b*c
Ctrl-D
LEX PROGRAMS
Lex Code:
%%
"Token" printf("*");
%%
Compilation Steps:
lex Filename.l
cc lex.yy.c -ll
Output:
./a.out
This is Token
This is *
Lex Code:
%%
[0-9] printf("INTEGER");
%%
Compilation:
lex integer.l
cc lex.yy.c -ll
Output:
./a.out
12
INTEGER INTEGER
Lex Code:
%%
[0-9]+ printf("INTEGER");
%%
Compilation:
lex integerlength.l
cc lex.yy.c -ll
Output:
./a.out
123
INTEGER
Lex Code:
%%
"+"?[0-9]+ printf("POSITIVE INTEGER");
%%
Compilation:
lex positiveinteger.l
cc lex.yy.c -ll
Output:
./a.out
+12
POSITIVE INTEGER
-12
-POSITIVE INTEGER
Lex Code:
%%
"-"[0-9]+ printf("NEGATIVE INTEGER");
%%
Compilation:
lex negativeinteger.l
cc lex.yy.c -ll
Output:
./a.out
-1
NEGATIVE INTEGER
1
1
%{
#include <stdio.h>
#include <stdlib.h>
int i, s, t, m;
%}
%%
[0-9]+ { i = atoi(yytext);
         s = 0;          /* reset the reversed value for every number */
         t = 0;
         m = atoi(yytext);
         while( i > 0 )
         {
             t = i % 10;
             s = s * 10 + t;
             i = i / 10;
         }
         if( m == s )
             printf(" GIVEN NUMBER IS PALINDROME NUMBER ");
         else
             printf(" GIVEN NUMBER IS NOT A PALINDROME NUMBER ");
       }
%%
Compilation:
lex palindrome.l
cc lex.yy.c -ll
Output:
./a.out
121 GIVEN NUMBER IS PALINDROME NUMBER
120 GIVEN NUMBER IS NOT A PALINDROME NUMBER
%{
#include <stdio.h>
#include <stdlib.h>
int i, s, t, m;
%}
%%
[0-9]+ { i = atoi( yytext );
         s = 0;          /* reset the sum of cubes for every number */
         t = 0;
         m = atoi( yytext );
         while( i > 0 )
         {
             t = i % 10;
             s = t * t * t + s;
             i = i / 10;
         }
         if( m == s )
             printf("ARMSTRONG NUMBER");
         else
             printf(" NOT AN ARMSTRONG NUMBER");
       }
%%
Compilation:
lex armstrong.l
cc lex.yy.c -ll
Output:
./a.out
153
ARMSTRONG NUMBER
%{
#include <stdio.h>
%}
%option noyywrap
%%
[0-9]+ {printf("Saw an integer: %s\n", yytext); }
.|\n { }
%%
int main(void)
{
yylex();
return 0;
}
Compilation:
lex filename.l
cc lex.yy.c -ll
Input:
abc123z.!&*2gj6
Output:
./a.out
the program will print:
Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
%%
" "+ printf(" ");
%%
Compilation:
lex blank.l
cc lex.yy.c -ll
Output:
./a.out
This is KITS
This is KITS
%%
[A-Za-z0-9] printf("%c", yytext[0] + 1);
%%
Compilation:
lex successor.l
cc lex.yy.c -ll
Output:
./a.out
kits
ljut
abc
bcd
%%
[a-zA-Z0-9] printf("%c", yytext[0] - 1);
%%
Compilation:
lex predecessor.l
cc lex.yy.c -ll
Output:
kits
jhsr
bcd
abc
%%
"/*"[0-9a-zA-Z]*"*/" printf(" ");
%%
Compilation:
lex commentline.l
cc lex.yy.c -ll
Output:
./a.out
We are of kits /*college*/
We are of kits
First Method:
%{
#include <stdio.h>
int i;
%}
%%
"+"?[0-9]+\.[0-9]+ { i = (yytext[0] == '+') ? 1 : 0;   /* skip a leading + sign */
                     while( yytext[i] != '.' )
                     {
                         printf("%c", yytext[i]);
                         i++;
                     }
                   }
%%
Compilation:
lex realtoint.l
cc lex.yy.c -ll
Output:
./a.out
23.5
23
+23.5
23
-23.5
-23
Second Method:
%%
"."[0-9]+ printf(" ");
%%
Compilation:
lex realtoint1.l
cc lex.yy.c -ll
Output:
./a.out
23.5
23
%{
int v=0,c=0;
%}
%%
[aeiouAEIOU] v++;
[a-zA-Z] c++;
%%
main()
{
printf("ENTER INPUT : \n");
yylex();
printf("VOWELS=%d\nCONSONANTS=%d\n",v,c);
}
%{
int pi=0,ni=0,pf=0,nf=0;
%}
%%
\+?[0-9]+ pi++;
\+?[0-9]*\.[0-9]+ pf++;
\-[0-9]+ ni++;
\-[0-9]*\.[0-9]+ nf++;
%%
main()
{
printf("ENTER INPUT : ");
yylex();
printf("\nPOSITIVE INTEGER : %d",pi);
printf("\nNEGATIVE INTEGER : %d",ni);
printf("\nPOSITIVE FRACTION : %d",pf);
printf("\nNEGATIVE FRACTION : %d\n",nf);
}
%{
#include "stdio.h"
int pf=0,sf=0;
%}
%%
printf {
pf++;
fprintf(yyout,"%s","writef");
}
scanf {
sf++;
fprintf(yyout,"%s","readf");
}
%%
main()
{
yyin=fopen("file1.l","r+");
yyout=fopen("file2.l","w+");
yylex();
printf("NUMBER OF PRINTF IS %d\n",pf);
printf("NUMBER OF SCANF IS %d\n",sf);
}
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
"and" |
"or" |
"but" |
"because" |
"nevertheless" {printf("COMPOUND SENTENCE"); exit(0); }
. ;
\n return 0;
%%
main()
{
printf("\nENTER THE SENTENCE : ");
yylex();
printf("SIMPLE SENTENCE");
}
%{
#include<stdio.h>
int id=0,flag=0;
%}
%%
"int"|"char"|"float"|"double" { flag=1; printf("%s",yytext); }
";" { flag=0;printf("%s",yytext); }
[a-zA-Z][a-zA-Z0-9]* { if(flag!=0) id++; printf("%s",yytext); }
[a-zA-Z0-9]*"="[0-9]+ { id++; printf("%s",yytext); }
[0] return(0);
%%
main()
{
printf("\n *** output\n");
yyin=fopen("f1.l","r");
yylex();
printf("\nNUMBER OF IDENTIFIERS = %d\n",id);
fclose(yyin);
}
int yywrap()
{
return(1);
}
%{
int c=0,w=0,l=0,s=0;
%}
%%
[\n] l++;
[ \t] s++;
[^ \t\n]+ { w++; c+=yyleng; }
%%
%{
#include<stdio.h>
int cc=0;
%}
%%
"/*"[a-zA-Z0-9 \t\n]*"*/" cc++;
"//"[a-zA-Z0-9 \t]* cc++;
%%
main()
{
yyin=fopen("f1.l","r");
yyout=fopen("f2.l","w");
yylex();
fclose(yyin);
fclose(yyout);
printf("\nTHE NUMBER OF COMMENT LINES = %d\n",cc);
}
%{
#include <stdio.h>
int opr=0,opd=0,n=0;    /* counters used by the rules and by main */
%}
%%
[\+\-\*\/] { printf("OPERATORS ARE %s\n",yytext);
opr++;
}
[a-zA-Z]+ { printf("OPERANDS ARE %s\n",yytext);
opd++;
}
[0-9]+ { printf("OPERANDS ARE %s\n",yytext);
opd++;
}
[a-zA-Z]+\+\-\*\/[a-zA-Z]+ { n=0; }
[0-9]+\+\-\*\/[0-9]+ { n=0; }
%%
main()
{
printf("\nENTER THE EXPRESSION : \n");
yylex();
printf("\nNUMBER OF OPERATORS ARE %d",opr);
printf("\nNUMBER OF OPERANDS ARE %d",opd);
if((n==0)&&(opd==opr+1))
printf("\nVALID EXPRESSION\n");
else
printf("\nINVALID EXPRESSION\n");
}
%{
#include<stdio.h>
int cons=0;
%}
%%
[0-9]+ { printf("\n%s",yytext); cons++; }
. ;
%%
%{
#include <stdio.h>
%}
Morning [ ](00|01|02|03|04|05|06|07|08|09|10|11)[:]
Afternoon [ ](12|13|14|15|16|17)[:]
Evening [ ](18|19|20|21|22|23)[:]
%%
{Morning} printf("Good Morning ");
{Afternoon} printf("Good Afternoon ");
{Evening} printf("Good Evening ");
. ;
If we assume that the executable file name of the generated C program is “greet”, then we can run the
following command to see the output:
date | greet
BEGIN followed by the name of a start condition places the scanner in the corresponding start
condition.
LEX
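%{
/* The definitions section is missing from this listing; the lines below
   are an assumed reconstruction based on the names the rules use. */
#include <stdio.h>
#include <stdlib.h>
int COMMENT = 0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%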
#.* { printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int |
float |
char |
double |
while |
for |
do |
if |
break |
continue |
void |
switch |
case |
long |
struct |
const |
typedef |
return |
else |
auto |
default |
enum |
extern |
register |
short |
sizeof |
signed |
static |
unsigned |
union |
volatile |
goto {printf("\n\t%s is a KEYWORD",yytext);}
"/*" {COMMENT = 1;}
/*{printf("\n\n\t%s is a COMMENT\n",yytext);}*/
"*/" {COMMENT = 0;}
/* printf("\n\n\t%s is a COMMENT\n",yytext);}*/
{identifier}\( {if(!COMMENT)printf("\n\nFUNCTION\n\t%s",yytext);}
\{ {if(!COMMENT) printf("\n BLOCK BEGINS");}
\} {if(!COMMENT) printf("\n BLOCK ENDS");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT) printf("\n\t%s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\)(\;)? {if(!COMMENT) printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc,char **argv)
{
if (argc > 1)
{
FILE *file;
file = fopen(argv[1],"r");
if(!file)
{
printf("could not open %s \n",argv[1]);
exit(0);
}
yyin = file;
}
yylex();
printf("\n\n");
return 0;
}
int yywrap()
{
return 1;       /* 1 tells the scanner that there is no more input */
}
Input:
$vi var.c
#include<stdio.h>
main()
{
int a,b;
}
Output:
$lex lex.l
$cc lex.yy.c
$./a.out var.c
FUNCTION
main ( )
BLOCK BEGINS
int is a KEYWORD
a IDENTIFIER
b IDENTIFIER
BLOCK ENDS
YACC
The computer program YACC (Yet Another Compiler Compiler) is a parser generator developed
by Stephen C. Johnson at AT&T for the UNIX operating system. It generates a parser (the part of
a compiler that tries to make syntactic sense of the source code) based on an analytic grammar written
in a notation similar to BNF. YACC generates the code for the parser in the C programming
language. YACC used to be available as the default parser generator on most UNIX operating systems.
The parser generated by YACC requires a lexical analyzer; lexical analyzer generators such as LEX or
FLEX are widely available. YACC generates an LALR parser. Bison is the GNU version of YACC.
Assume our goal is to write a BASIC compiler. First, we need to specify all pattern matching rules for
lex (bas.l) and grammar rules for yacc (bas.y). Commands to create our compiler, bas.exe, are listed
below:
yacc -d bas.y # create y.tab.h, y.tab.c
lex bas.l # create lex.yy.c
cc lex.yy.c y.tab.c -o bas.exe # compile/link
YACC reads the grammar descriptions in bas.y and generates a parser, function yyparse, in file
y.tab.c. Included in file bas.y are token declarations. The –d option causes yacc to generate
definitions for tokens and place them in file y.tab.h. Lex reads the pattern descriptions in bas.l,
includes file y.tab.h, and generates a lexical analyzer, function yylex, in file lex.yy.c. Finally, the
lexer and parser are compiled and linked together to form the executable, bas.exe. From main, we
call yyparse to run the compiler. Function yyparse automatically calls yylex to obtain each token.
File            Content
filename.lex    Specifies the lex command specification file that defines the lexical analysis rules.
filename.yacc   Specifies the yacc command grammar file that defines the parsing rules, and calls the
                yylex subroutine created by the lex command to provide input.
The programs section contains the following subroutines. Because these subroutines are included in
this file, you do not need to use the YACC library when processing this file.
main The required main program that calls the yyparse subroutine to start the program.
yywrap The wrap-up subroutine that returns a value of 1 when the end of input occurs.
YACC PROGRAMS
Lex
%{
#include"y.tab.h"
extern int yylval;
%}
%%
[0-9]+ {yylval=atoi(yytext);
return num;}
. return yytext[0];
\n return 0;
%%
Yacc
%{
#include<stdio.h>
int valid=0,temp;
%}
%token num
%left '+' '-'
%left '*' '/'
%%
exp1: exp {temp=$1;}
;
exp: exp '+' exp {$$=$1+$3;}
|exp '-' exp {$$=$1-$3;}
|exp '*' exp {$$=$1*$3;}
|exp '/' exp {if($3==0) {valid=1;$$=0;} else{$$=$1/$3;}}
|'(' exp ')' {$$=$2;}
|num {$$=$1;}
;
%%
int yyerror()
{
Lex Code:
%{
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ { yylval = atoi(yytext);
return num;}
\n return 0;
. return yytext[0];
%%
Yacc Code:
%{
#include <stdio.h>
int x, t, r, n;
%}
%token num
%%
Stat : num { x = $1;
             t = 1;
             n = 0;
             while( x != 0 )
             {
                 r = x % 8;
                 n = n + r * t;
                 t = t * 10;
                 x = x / 8;
             } printf("%d",n);
           }
     ;
%%
Compilation:
lex dectooct.l
yacc -d dectooct.y
cc lex.yy.c y.tab.c -ll -ly
Output:
./a.out
10
12
Lex Code:
%{
#include "y.tab.h"
extern int yylval;
%}
%%
[0-9]+ { yylval = atoi(yytext);
return num;}
\n return 0;
. return yytext[0];
%%
Yacc Code:
%{
#include <stdio.h>
int x, t, r, n;
%}
%token num
%%
Stat : num { x = $1;
             t = 1;
             n = 0;
             while( x != 0 )
             {
                 r = x % 10;
                 n = n + r * t;
                 t = t * 8;
                 x = x / 10;
             } printf("%d",n);
           }
     ;
%%
Compilation:
lex octtodec.l
yacc -d octtodec.y
cc lex.yy.c y.tab.c -ll -ly
Output:
./a.out
12
10
(if.l)
ALPHA [A-Za-z]
DIGIT [0-9]
%%
[ \t\n]
if return IF;
then return THEN;
{DIGIT}+ return NUM;
{ALPHA}({ALPHA}|{DIGIT})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
"||" return OR;
"&&" return AND;
. return yytext[0];
%%
(if.y)
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token ID NUM IF THEN LE GE EQ NE OR AND
%right '='
%left AND OR
%left '<' '>' LE GE EQ NE
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted.\n");exit(0);};
ST : IF '(' E2 ')' THEN ST1';'
;
ST1 : ST
|E
;
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;
%%
#include "lex.yy.c"
main()
{
printf("Enter the statement: ");
yyparse();
}
Output:
$ lex if.l
$ yacc if.y
$ gcc y.tab.c -ll -ly
$ ./a.out
Enter the statement: if(i>) then i=1;
syntax error
$ ./a.out
Enter the statement: if(i>8) then i=1;
Input accepted.
$
alpha [A-Za-z]
digit [0-9]
%%
[\t \n]
for return FOR;
{digit}+ return NUM;
{alpha}({alpha}|{digit})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
"||" return OR;
"&&" return AND;
. return yytext[0];
%%
%{
#include <stdio.h>
#include <stdlib.h>
%}
%token ID NUM FOR LE GE EQ NE OR AND
%right '='
%left OR AND
%left '>' '<' LE GE EQ NE
%left '+' '-'
%left '*' '/'
%right UMINUS
%left '!'
%%
S : ST {printf("Input accepted\n"); exit(0);}
E : ID '=' E
| E '+' E
| E '-' E
| E '*' E
| E '/' E
| E '<' E
| E '>' E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| E '+' '+'
| E '-' '-'
| ID
| NUM
;
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
;
%%
#include "lex.yy.c"
main() {
printf("Enter the expression:\n");
yyparse();
}
Output:
$ lex for.l
$ yacc for.y
conflicts: 25 shift/reduce, 4 reduce/reduce
$ gcc y.tab.c -ll -ly
$ ./a.out
Enter the expression:
for(i=0;i<n;i++)
i=i+1;
Input accepted
$
(bp.y)
%{
#include <ctype.h>
#include <stdio.h>
#include "y.tab.h"
extern int yydebug;
%}
%token OPEN CLOSE
%%
lines : s '\n' {printf("OK\n"); }
;
s:
| OPEN s CLOSE s
;
%%
void yyerror(char * s)
{
fprintf (stderr, "%s\n", s);
}
int yywrap(){return 1; }
int main(void) {
yydebug=1;
return yyparse();}
Compilation:
lex bp.lex
yacc -dv bp.y
gcc -o bp y.tab.c lex.yy.c -ly -lfl
Input:
(()) or (()(()()))
Output:
OK
1. What is a Parser?
A Parser for a Grammar is a program which takes in the Language string as its input and produces
either a corresponding Parse tree or an Error.
Symbol Description
| OR (alternation)
() Group of Subexpression
* 0 or more Occurrences
? 0 or 1 Occurrence
+ 1 or more Occurrences
Non-Terminals: These are syntactic variables in the grammar that represent sets of strings
from which the language is composed. In a parse tree, all the inner nodes represent non-terminal
symbols.
E E
/ \ / \
* E E *
/ \ /\
E E E E
43. How many sections are there in LEX and YACC? Name them?
LEX and YACC specifications each have three sections:
a) Declarations section
b) Translation Rules
c) Auxiliary Procedures.