Professional Documents
Culture Documents
Chapter 1-1
Chapter 1-1
COMPILER DESIGN
1
1.1. Overview of Compilation
A compiler translator the code written in one language to some other language
without changing the meaning of the program. These process is called compilation.
Compiler design covers basic translation mechanism and error detection and
recovery.
It includes lexical, syntax, and sematic analysis as front end, and code
generation and optimization as back-end (ML).
Compiler is a program that can read a program written in HLL like C, C++, and
Java (called source language) and translates it into a LLL(which is target
language/machine program).
3
1.1. Overview of Compilation….(Cont.)
Types of Compiler
Cross-compiler
This is a compiler that runs on platform (A) and produce executable machine code for
another platform (B).
Source-to-source Compiler
This is a compiler that takes the source code of one programming language and translates
it into the source code of another programming language.
Bootstrap Compilers
These compilers are written in a programming language that they have to compile.
Decompiler
Basically, it is not a compiler. It is just the reverse of the compiler. It converts the machine
code into high-level language.
4
1.2. Language Processing System
5
1.2. Language Processing System….(Cont.)
6
1.2. Language Processing System….(Cont.)
Before we go to the details of compilers, we should understand a few other tools
that work closely with compilers.
A. Preprocessor:
A preprocessor, generally considered as of compiler, is a tool that produces
input for compilers.
It deals with macro-processing, augmentation, file inclusion, language extension.
etc.
B. Compiler
compiler is a software which converts a program write in HLL to LLL.
An important role of the compiler is error showing to the programmer.
7
1.2. Language Processing System….(Cont.)
Interpreter:
Like a compiler, an interpreter high-level language into low-level machine
language.
The difference is in the way they read the source code or input.
A compiler reads the whole source code at once, creates tokens, checks
semantics, generates intermediate code, executes the whole program
and may involve many passes.
In contrast, an interpreter reads the program line by line, converts it to
an intermediate code, executes it, then takes the next statement in
sequence.
If an error occurs, an interpreter stops execution and reports it.
Whereas a compiler reads the whole program even if it encounters
several errors. 8
1.2. Language Processing System….(Cont.)
C. Assembler
It is a program that translates assembly language programs (LLL) into machine level
language (MLL)
The output of an assembler is called an object file.
This object file contains a combination of machine instructions and a data that required
to place these instructions in memory.
D. Linker
Linker is a computer program that links and merges various object files together in order to make
an executable file.
The major task of a linker is to search and locate referenced module/routines in a program and
to determine the memory location that the codes will be loaded.
E. Loader
Loader is a part of OS and is responsible for loading executable files into memory and execute
them.
It calculates the size of a program and creates memory space for it. 9
1.3. Analysis and Synthesis in a Compilation
In Compilation process of compiler, there are two major parts/phases:
1. Analysis part/phases
2. Synthesis part/phases
Analysis Part: source code is divide into components, and an intermediate representation is
created from source code.
Lexical Analyzer, Syntax Analyzer and Semantic Analyzer are the parts of this phase.
Synthesis Part: it generates desired target code from the above intermediate representation.
Intermediate Code Generator, Code Generator, and Code Optimizer are the parts of
this phase.
10
1.3. Analysis and Synthesis in a Compilation..(Cont.)
Of the two parts, synthesis requires the most specialized techniques.
Software tools:
Static checkers: reads source code, analyzes source code, then find out the
bugs.
11
1.4. Phases of a compiler
Phase: is logically interrelated operations that takes source program in one
representation and produces operations in another representation.
Each phases take input from its previous stage, has its own representation of
source program, and feeds its output to the next phase of the compiler.
Lexical analyzer
Front-End(Analysis Phase)
Syntax analyzer
Semantic analyzer
13
1.4. Phases of a compiler…(Cont.)
1.4.1. Lexical Analysis
It scan/read the source code as a stream of characters and groups these characters into
meaningful sequences called lexemes.
Then lexical analyzer represents these lexemes in the form of tokens as:
<token-name, attribute-value> where
Token-name is an abstract name that will be used during syntax analysis.
Attribute-value is a value that points to an entry in the symbol table.
Example: consider the following statement/source code.
new val: = old val: + 12
Tokens
new val tokens mapped into identifiers (id1)
= token mapped into assignment operator
old val token mapped into identifier (id2)
+ token mapped into addition operator
12 token represents number
Id1 = id2 + 12; 14
1.4. Phases of a compiler…(Cont.)
1.4.2. Syntax Analysis
Syntax analysis also called Parsing
It is the second phase of compiler, which takes the token produced by lexical analysis as
input and generates a parse tree (or syntax tree).
tokens
assignment
:=
id1 expressions
id2 number
old val 12
15
1.4. Phases of a compiler…(Cont.)
For example, assignment of values is between compatible data types, and adding
string to an integer.
Semantic analyzer also keeps track of identifiers, their types and expressions; as
well as whether identifiers are declared before use or not etc.
id1 = temp3;
17
1.4. Phases of a compiler…(Cont.)
1.4.4. Code Optimization
Example:
new val: = old val: + fact * 1;
temp1 = id3 * 1;
18
1.4. Phases of a compiler…(Cont.)
1.4.5. Code Generation
The code generator takes the optimized representation of the intermediate code
and maps it into the target machine language.
The code generator translates the intermediate code into a sequence of re-
locatable machine code.
Sequence of instruction of machine code performs the task as the intermediate code
would do.
Example: id1:= id2 + id3*1;
MOV R1, id3
MUL R1, #1
MOV R2, id2 Assembly Language Level(LLL)
ADD R1, R2
MOV id1, R1
19
1.4. Phases of a compiler…(Cont.)
Symbol Table
It is a data structure maintained throughout all the phases of a compiler.
All the identifier names along with their types are stored here.
The symbol table makes it is easier for the compiler to quickly search the
identifier record and retrieve it.
Error Handler
It is invoked when a flow error is detected in the source program.
The output of LA is a stream of tokens, which is passed to the next phase
19
1.4. Phases of a compiler…(Cont.)
20
1.5. Compiler Construction Tools
Some commonly used compiler construction tools are available to implement
one or more compiler phases include:
Scanner Generator:
Produce lexical analyzes from a regular-expression description of the language.
Parser Generators:
Automatically produce syntax analyzers from a grammatical description of a
programming language’
Syntax-directed translation engine:
Produce collections of routines for walking a parse tree and generating
intermediate code.
Code Generator Generators:
Produce a code generator form a collection of rules for translating each
operation of the intermediate language into the machine language for a target
machine. 21
1.5. Compiler Construction Tools…(Cont.)
Data-flow analysis engines:
Facilitate the gathering of information about how values are transmitted
from one part of the program to each other part. This is key part of code
optimization.
Compiler-Construction Toolkits:
Provide an integrated set of routines for constructing various phases of a
compiler.
22
U!
YO
NK
H A
T
25