Professional Documents
Culture Documents
Lab Manual SPCC
Lab Manual SPCC
Lab Manual SPCC
Lab Manual
System Programming & Compiler
Construction
Program
T.E
Semester
VI
Subject Name
Practical In-Charge
Name & Signature
Department of Computer Engineering
I/C HOD
Prof. Sandeep Raskar
: Computer Engineering.
Class
: TE
Subject
Experiment List
Serial No.
1.
Name of experiment
2.
3.
4.
5.
6.
7.
8.
9.
10.
Experiment No. 01
Study of 360/370 processor with instructions
Department of Computer Engineering
Theory:
Conclusion:
Experiment No. 2
Assemblers: 2 pass Assembler.
Title: Design and implementation of 2 pass assemblers for X86 machine
Theory:
Assembly Language Syntax:
Each line of a program is one of the following:
an instruction
an assembler directive (or pseudo-op)
a comment
Whitespace (between symbols) and case are ignored. Comments (beginning with ;)
are also ignored.
An instruction has the following format:
LABEL OPCODE OPERANDS ; COMMENTS
Example:
ADDR1,R1,R3
ADDR1,R1,#3
LDR6,NUMBER
BRzLOOP
Schemes of translation:
Symbol Table(ST)
Literal table(LT)
IC:
Address
Opcode fields:
i
Statement class
Opcode
Operand
code
Operand-2 fields
i. Operand class:
S:symbol
L:literals
ii. code
2-Pass Assembler:
Pass-I
I
I.
II.
Pass-II
I
I.
Process pseudo-opcodes.
Algorithm:
Pass-I
Step1: Initialize all variables:
Loc_cntr=0;
Littab_ptr=0;
Step2: Read line();
Break-words();
Step3: While(opcode!=end)
Read line();
Break-words();
case1: if opcode=start or opcode=small model
Department of Computer Engineering
code=search_code_MOT(opcode);
i.
Length=search_size_MOT(opcode);
ii.
Check_operand_literal();
If yes:
Insert_littab(litaral,Loc_cntr)
Littab_ptr++;
If no:
Find operand enry in SYMTAB
Generate IC(IS,code)(S,entry-no);
Loc_cntr=Loc_cntr+size;
Case6:
If opcode=end
i
code=search_code_MOT(opcode);
i.
Length=search_size_MOT(opcode);
//statement
Loc_cntr=Loc_cntr+size;
Step4: stop.
Program:
Program for 2 pass Assemblers
#include<stdio.h>
#include<string.h>
#include<conio.h>
void chk_label();
void chk_opcode();
void READ_LINE();
struct optab
{
char
code[10],objcode[10];
}myoptab[3]={
{"LDA","00"},
{"JMP","01"},
{"STA","02"}
};
struct symtab{
char symbol[10];
int addr;
}mysymtab[10];
int startaddr,locctr,symcount=0,length;
char line[20],label[8],opcode[8],operand[8],programname[10];
/*ASSEMBLER PASS 1 */
void PASS1()
READ_LINE();
if(!strcmp(opcode,"START"))
{
startaddr=atoi(operand);
locctr=startaddr;
strcpy(programname,label);
fprintf(inter,"%s",line);
fgets(line,20,input);
}
else
{
programname[0]='\0';
startaddr=0;
locctr=0;
}
printf("\n %d\t %s\t%s\t %s",locctr,label,opcode,operand);
while(strcmp(line,"END")!=0)
{
READ_LINE();
printf("\n %d\t %s \t%s\t %s",locctr,label,opcode,operand);
printf("\n %d\t\t%s",locctr,line);
fprintf(inter,"%s",line);
fclose(inter);
fclose(input);
}
/*Assesmbler pass 2 */
void PASS2()
{
FILE *inter,*output;
char record[30],part[8],value[5];/*Part array was defined as part[6]
previously*/
int currtxtlen=0,foundopcode,foundoperand,chk,operandaddr,recaddr=0;
inter=fopen("inter.txt","r");
output=fopen("output.txt","w");
fgets(line,20,inter);
READ_LINE();
if(!strcmp(opcode,"START")) fgets(line,20,inter);
printf("\n\nCorresponding Object code is..\n");
printf("\nH^ %s ^ %d ^ %d ",programname,startaddr,length);
fprintf(output,"\nH^ %s ^ %d ^ %d ",programname,startaddr,length);
recaddr=startaddr; record[0]='\0';
while(strcmp(line,"END")!=0)
{
if(!strcmp(mysymtab[chk].symbol,operand))
{
itoa(mysymtab[chk].addr,value,10);
strcat(part,value);
foundoperand=1;
}
if(!foundoperand)strcat(part,"err");
}
}
}
if(!foundopcode)
{
if(strcmp(opcode,"BYTE")==0 || strcmp(opcode,"WORD")||
strcmp(opcode,"RESB"))
{
strcat(part,operand);
}
}
void READ_LINE()
{
char buff[8],word1[8],word2[8],word3[8];
void chk_opcode()
{
int k=0,found=0;
for(k=0;k<3;k++)
if(!strcmp(opcode,myoptab[k].code))
{
locctr+=3;
found=1;
break;
}
if(!found)
{
int main()
{
PASS1();
length=locctr-startaddr;
PASS2();
getch();
Experiment No.3
Macro Processor
Source Code
(with macro)
obj
Macro ProcessorExpanded Compiler
Code
or Assembler
Examples:
Macro
Incr
Macro name
Mov ax,10
Sequence to be abbreviated
Add ax,01
Mend
End of definition
Parameterized Macro
macro
&lable incr &arg1, &arg2
Department of Computer Engineering
Conditional Macro
macro
Vary &count, &arg1,&arg2,&arg3
Add ax, &arg1
AIF (&count EQ 1).FINI
Test if &count=1
Add ax,&arg2
AIF (&count EQ 2).FINI
Test if &count=1
Add ax,&arg3
.FINI mend
Label start with (.) and do not appear in the output.
AGO is unconditional branch
AIF is conditional branch
If Call to this macro is:
Vary 3,var1,var2,var3
Expansion will be:
Department of Computer Engineering
If call is:
Vary 2, var1,var2
Expansion will be:
Add ax,var1
Ad ax,var2
Nested Macro
1
Macro
Adds &arg1,&arg2
Add1 &arg1
Add1 &arg2
1. Macro Instruction defining macros:
macro
Department of Computer Engineering
Recursive Macro
macro
Incr &arg1
Mov ax, &arg1
Add ax,1
AIF (&arg1 EQ 10) .L1
Incr arg1
.L1 mend
Algorithm
Pass 1:
1 Both counters for MDT and MNT i.e. MDTC and MNTC are initialized
first.
1. Macro process in first pass reads each and every line of source code.
2. If it is macro pseudo opcode, the entire macro definition that follows is
saved in MDT
3. The macro name is entered in MNT.
4. When MEND pseudo opcode is encountered that means all macro definition
have been processed, so transfer control to pass2.
Pass 2:
1. Read source code supplied by pass1to see opcode mnemonic of each line is
present in MNT or not.
2. When call is found, the MDTP is set to corresponding macro definition
stored in MDT.
3. The positional vs actual parameter list is prepared. Each line from MDT is
read and positional parameter is replaced by actual parameter.
4. Scanning of input file continues.
5. When MEND pseudo output is countered, MDT terminates expansion of
6. macro.
7. When END pseudo output is encountered, expansion code is transferred to
assembler for further processing.
Program:
#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<stdlib.h>
void main()
{
FILE *f1,*f2,*f3;char mne[20],opnd[20],la[20];clrscr();
f1=fopen("minp2.txt","r");
Input file:
minp2.txt
EX1
MACRO &A,&B
LDA
&A
STA
&B
MEND
SAMPLE START
1000
EX1
N1,N2
N1
RESW
N2
RESW
END
Output files:
dtab2.txt
EX1
&A,&B
LDA
&A
STA
&B
MEND
ntab2.txt
EX1
EXPERIMENT NO. 4
Theory:
Structure of a lex file
The structure of a lex file is intentionally similar to that of a yacc file; files are divided up
into three sections, separated by lines that contain only two percent signs, as follows:
Definition section
%%
Rules section
%%
C code section
The definition section is the place to define macros and to import header files
written in C. It is also possible to write any C code here, which will be copied
verbatim into the generated source file.
The rules section is the most important section; it associates patterns with C
statements. Patterns are simply regular expressions. When the lexer sees some text
in the input matching a given pattern, it executes the associated C code. This is the
basis of how lex operates.
The C code section contains C statements and functions that are copied verbatim
to the generated source file. These statements presumably contain code called by
the rules in the rules section. In large programs it is more convenient to place this
code in a separate file and link it in at compile time.
Program:
%{
#include <stdio.h>
int charcount=0,wordcnt=0,spacecnt=0,linecnt=0;
%}
Department of Computer Engineering
%%
/*** The rule section to identify words,space, and lines***/
/* matches word and counts word and characters */
{word}{wordcnt++;charcnt=charcnt+yyleng;} {BLANK} { spacecnt++;}
/*identify end of line and count line */
{EOL}{linecnt++;}
/* ignore all other and count character */
{charcnt++;}
%%
/* called by lex when input is exhausted.Return 1 if you are done or 0 if more processing
is required*/
int yywrap()
{
return 1;
}
/* The C code section*/
int main(int argc,char *argv[])
{
/* if only one argument not passed to the program*/
if(argc!=2)
{
/* output usage format of the program and exit*/
print (Usage:<./a.out><sourcefile>\n);
exit(0);
Department of Computer Engineering
Conclusion: Thus we have implemented simple parser using yacc and lex.
EXPERIMENT NO. 5
Program:
//IMPLEMENT A NATURL LANGUAGE PARSER USING LEX & YACC
//LEX FILE
%{
/*Recognising parts of speech*/
#include "y.tab.h"
%}
VERB
"is"|"am"|"are"|"were"|"was"|"be"|"being"|"been"|"do"|"does"|"did"|"will"|"has"|"have"|"ha
d"
ADVERB "very"|"simply"|"gently"|"quietly"|"calmly"|"angrily"
PRONOUN "she"|"we"|"they"|"I"|"you"|"he"|"it"
ARTICLE "a"|"an"|"the"
%%
{VERB}
{printf("\nverb");
return(VERB);
}
{ADVERB} {printf("\nadverb");
return(ADVERB);
}
{PRONOUN} {printf("\npronoun");
Department of Computer Engineering
{printf("\nnoun");
return(NOUN);
}
%%
//YACC FILE
%{
/*To recognise english sentences*/
#include<stdio.h>
%}
%token NOUN PRONOUN VERB ADVERB ARTICLE
%%
sentence: subject VERB object {printf("\nSentence is valid");}
;
subject: ARTICLE NOUN
| NOUN
| ARTICLE PRONOUN
| PRONOUN
;
object: ARTICLE NOUN
| NOUN
;
//OUTPUT
[root@localhost assign3]# lex natural.l
[root@localhost assign3]# yacc -d natural.y
[root@localhost assign3]# gcc lex.yy.c y.tab.c -lfl
[root@localhost assign3]# ./a.out
Ram is a boy
noun
verb
article
noun
Conclusion: Thus we have implemented simple parser using yacc and lex.
Experiment No.6
Theory:
Automatic techniques for constructing parsers start with computing
some basic functions for symbols in the grammar. These functions are
useful in understanding both recursive descent and bottom-up LR
parsers.
First(a)
First(a) is the set of terminals that begin strings derived from a, which can include epsilon.
1.
2.
3.
4.
Program:
C code to Find First and Follow in a given Grammar
/* C program to find First and Follow in a given Grammar. */
#include<stdio.h>
#include<string.h>
int i,j,l,m,n=0,o,p,nv,z=0,x=0;
char str[10],temp,temp2[10],temp3[20],*ptr;
struct prod
{
char lhs[10],rhs[10][10],ft[10],fol[10];
int n;
}pro[10];
void findter()
{
int k,t;
for(k=0;k<n;k++)
{
if(temp==pro[k].lhs[0])
{
for(t=0;t<pro[k].n;t++)
{
if( pro[k].rhs[t][0]<65 || pro[k].rhs[t][0]>90 )
pro[i].ft[strlen(pro[i].ft)]=pro[k].rhs[t][0];
else if( pro[k].rhs[t][0]>=65 && pro[k].rhs[t][0]<=90 )
{
temp=pro[k].rhs[t][0];
if(temp=='S')
pro[i].ft[strlen(pro[i].ft)]='#';
findter();
}
}
break;
}
}
Department of Computer Engineering
Conclusion: Thus we have implemented Find First and Follow in a given Grammar
EXPERIMENT NO. 8
EXPERIMENT NO. 9
Theory:
PEEPHOLE OPTIMIZATION
A statement-by-statement code-generations strategy often produce target code that
contains redundant instructions and suboptimal constructs .The quality of such target code
can be improved by applying optimizing transformations to the target program. A simple
but effective technique for improving the target code is peephole optimization, a method
for trying to improving the performance of the target program by examining a short
sequence of target instructions (called the peephole) and replacing these instructions by a
shorter or faster sequence, whenever possible. The peephole is a small, moving window on
the target program. The code in the peephole need not contiguous, although some
implementations do require this. We shall give the following examples of program
transformations that are characteristic of peephole optimizations:
Redundant-instructions elimination
Flow-of-control optimizations
Algebraic simplifications
Use of machine idioms
UNREACHABLE CODE:
Another opportunity for peephole optimizations is the removal of unreachable
instructions. An unlabeled instruction immediately following an unconditional jump may
be removed. This operation can be repeated to eliminate a sequence of instructions. For
example, for debugging purposes, a large program may have within it certain segments
that are executed only if a variable debug is 1.In C, the source code might look like:
#define debug 0
.
If ( debug ) {
Print debugging information
}
If debug =1 goto L2
Goto L2
L1: print debugging information
L2:(a)
One obvious peephole optimization is to eliminate jumps over jumps .Thus no matter what
the value of debug, (a) can be replaced by:
If debug 1 goto
Print debugging information
Department of Computer Engineering
As the argument of the first statement of (c) evaluates to a constant true, it can be replaced
by goto L2. Then all the statement that print debugging aids are manifestly unreachable
and can be eliminated one at a time.
Program
class CO
{
public static void main(String[] args)
{
int i,j,b,k=0;
String op[][]={{"void","main()"," "," "," "," "," "," "},
{"{"," "," "," "," "," "," "," "},
{"pi","=","3.14",",","r","=","5",";"},
{"A","=","pi","*","r","*","r"," "},
{"}"," "," "," "," "," "," "," "}};
String rs[][]=new String[10][10];
String v;char a;
System.out.println("program code is:");
for(i=0;i<5;i++)
{
for(i=0;i<5;i++)
{
for(j=0;j<8;j++)
{
v=op[i][j];
if(v.compareTo("=")==0)
{
a=op[i][j+1].charAt(0);
if(a>=48 && a<=57)
{
rs[k][0]=op[i][j-1];
rs[k++][1]=op[i][j+1];
}
}
}
}
System.out.println("Identifier\tValue");
for(i=0;i<k;i++)
{
for(j=0;j<2;j++)
OUTPUT
Value
pi
3.14
EXPERIMENT NO. 10
Theory:
ld combines a number of object and archive files, relocates their data and ties up symbol
references. Usually the last step in compiling a program is to run ld.
ld accepts Linker Command Language files written in a superset of AT&T's Link Editor
Command Language syntax, to provide explicit and total control over the linking process.
This man page does not describe the command language; see the ld entry in "info", or
the manual ld: the GNU linker, for full details on the command language and on other
aspects of the GNU linker.
This version of ld uses the general purpose BFD libraries to operate on object files. This
allows ld to read, combine, and write object files in many different formats---for example,
COFF or "a.out". Different formats may be linked together to produce any available
kind of object file.
Aside from its flexibility, the GNU linker is more helpful than other linkers in providing
diagnostic information. Many linkers abandon execution immediately upon encountering
an error; whenever possible, ld continues executing, allowing you to identify other errors
(or, in some cases, to get an output file in spite of the error).
Department of Computer Engineering
OPTIONS
The linker supports a plethora of command-line options, but in actual practice few of them
are used in any particular context. For instance, a frequent use of ld is to link standard
Unix object files on a standard, supported Unix system. On such a system, to link a file
"hello.o":
ld -o <output> /lib/crt0.o hello.o -lc
This tells ld to produce a file called output as the result of linking the file
"/lib/crt0.o" with "hello.o" and the library "libc.a", which will come from
the standard search directories. (See the discussion of the -l option below.)
Some of the command-line options to ld may be specified at any point in the command
line. However, options which refer to files, such as -l or -T, cause the file to be read at the
point at which the option appears in the command line, relative to the object files and other
file options. Repeating non-file options with a different argument will either have no
further effect, or override prior occurrences (those further to the left on the command line)
of that option. Options which may be meaningfully specified more than once are noted in
the descriptions below.
Non-option arguments are object files or archives which are to be linked together. They
may follow, precede, or be mixed in with command-line options, except that an object file
argument may not be placed between an option and its argument.
Usually the linker is invoked with at least one object file, but you can specify other forms
of binary input files using -l, -R, and the script command language. If no binary input files
at all are specified, the linker does not produce any output, and issues the message No
input files.
If the linker can not recognize the format of an object file, it will assume that it is a linker
script. A script specified in this way augments the main linker script used for the link
(either the default linker script or the one specified by using -T). This feature permits the
linker to link against a file which appears to be an object or an archive, but actually merely
defines some symbol values, or uses "INPUT" or "GROUP" to load other objects. Note
that specifying a script in this way merely augments the main linker script; use the -T
option to replace the default linker script entirely.
For options whose names are a single letter, option arguments must either follow the
option letter without intervening whitespace, or be given as separate arguments
immediately following the option that requires them.
For options whose names are multiple letters, either one dash or two can precede the
option name; for example, -trace-symbol and --trace-symbol are equivalent. Note - there
is one exception to this rule. Multiple letter options that start with a lower case 'o' can only
be preceeded by two dashes. This is to reduce confusion with the -o option. So for
example -omagic sets the output file name to magic whereas --omagic sets the NMAGIC
flag on the output.
Arguments to multiple-letter options must either be separated from the option name by an
equals sign, or be given as separate arguments immediately following the option that
requires them. For example, --trace-symbol foo and --trace-symbol=foo are equivalent.
Department of Computer Engineering
This is important, because otherwise the compiler driver program may silently drop the
linker options, resulting in a bad link.
Here is a table of the generic command line switches accepted by the GNU linker:
-akeyword
This option is supported for HP/UX compatibility. The keyword argument must be
one of the strings archive, shared, or default. -aarchive is functionally equivalent
to -Bstatic, and the other two keywords are functionally equivalent to -Bdynamic.
This option may be used any number of times.
-Aarchitecture
--architecture=architecture
In the current release of ld, this option is useful only for the Intel 960 family of
architectures. In that ld configuration, the architecture argument identifies the
particular architecture in the 960 family, enabling some safeguards and modifying
the archive-library search path.
Future releases of ld may support similar functionality for other architecture
families.
-b input-format
--format=input-format
ld may be configured to support more than one kind of object file. If your ld is
configured this way, you can use the -b option to specify the binary format for input
object files that follow this option on the command line. Even when ld is configured
to support alternative object formats, you don't usually need to specify this, as ld
should be configured to expect as a default input format the most usual format on
each machine. input-format is a text string, the name of a particular format supported
by the BFD libraries. (You can list the available binary formats with objdump -i.)
You may want to use this option if you are linking files with an unusual binary
format. You can also use -b to switch formats explicitly (when linking object files of
different formats), by including -b input-format before each group of object files in a
particular format.