
Introduction to Compiler

Reference:
Compilers: Principles, Techniques, and Tools
Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman
What is a Compiler?
A compiler is a program that can read a program in one
language - the source language - and translate it into an
equivalent program in another language - the target
language.

The compiler also reports error messages for any errors it detects in the source program.
Compilers are classified into
1. Single pass
2. Multipass
3. Just-in-time
4. Debugging
5. Optimizing
A language-processing system
Compilation has 2 parts:
1. Analysis
2. Synthesis

• Analysis part: breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
Analysis part: this intermediate representation is a hierarchical structure called a syntax tree.

[Figure: syntax tree]
Synthesis part:

• The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table.

• The analysis part is often called the front end of the compiler; the synthesis part is the back end.
Analysis has 3 phases:
1. Lexical Analysis (Linear Analysis)
2. Syntax Analysis (Hierarchical Analysis)
3. Semantic Analysis
Synthesis has 3 phases:
1. Intermediate code generation
2. Code optimization
3. Code generation
Phases of a Compiler

Analysis Phase

Synthesis Phase
1. Lexical Analysis (scanning)

Reads the stream of characters making up the source program and groups them into tokens, the basic lexical units of the language such as identifiers, keywords, operators, and constants.

The character sequence forming a token is called the lexeme.

For example, for position:
token = identifier
lexeme = position

Some tokens are associated with values (attributes), such as the symbol-table entry for an identifier or the numeric value of a constant.
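The grouping of characters into (token, lexeme) pairs can be sketched in a few lines of Python. This is only an illustration: the token names, the patterns in `TOKEN_SPEC`, and the sample statement are assumptions chosen for this sketch, not part of the original slides.

```python
import re

# Hypothetical token classes for a tiny assignment language.
TOKEN_SPEC = [
    ("number",     r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign",     r"="),
    ("operator",   r"[+\-*/]"),
    ("skip",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # A lexical error: no token pattern matches at this position.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("position = initial + rate * 60"))
```

Each pair keeps the token class next to its lexeme, so `position` is reported as an `identifier` token whose lexeme is `position`.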


2. Syntax Analysis (parsing)

Groups the tokens into grammatical phrases.

Grammatical phrases are represented by a parse tree.

A syntax tree is a compressed representation of a parse tree.

[Figure: syntax tree for the previous parse tree]

Each interior node represents an operation, and the children of the node represent the arguments of the operation.
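To make the shape of a syntax tree concrete, here is a minimal recursive-descent sketch in Python. The grammar (an assignment whose right-hand side uses +, -, *, / and parentheses), the nested-tuple tree layout, and the sample statement are all assumptions made for illustration.

```python
import re

def parse(source):
    """Parse `id = expr` into a nested-tuple syntax tree: (operator, left, right)."""
    # Simplified scanning: characters outside these patterns are silently skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|[=+\-*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token in {"=", "+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # leaf: an identifier or a number

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())  # interior node: operation with two children
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = advance()
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("position = initial + rate * 60"))
```

The result nests the multiplication under the addition, which reflects the usual precedence rules: each tuple is an interior node whose first element is the operation and whose remaining elements are its arguments.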
3. Semantic Analysis

Uses the syntax tree to check the source program for semantic errors.

An important part of semantic analysis is type checking. For example, given

int a[20];

the assignment a[2.4] = 12 is a semantic error, because an array index must be an integer.
Semantic analysis can also perform type conversions called coercions.

Suppose that all identifiers are of real type: an integer constant used with them in an arithmetic operation must then be converted to a real.
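A coercion pass over a syntax tree might look like the following Python sketch. The nested-tuple tree layout, the `types` dictionary, and the conversion node name `inttoreal` are assumptions introduced for this example.

```python
ARITHMETIC = {"+", "-", "*", "/"}

def coerce(tree, types):
    """Return (typed_tree, type), wrapping int operands of mixed operations in inttoreal."""
    if isinstance(tree, str):
        if tree[0].isdigit():  # numeric literal
            return tree, ("real" if "." in tree else "int")
        if tree not in types:
            raise NameError(f"undeclared identifier {tree!r}")
        return tree, types[tree]

    op, left, right = tree
    if op not in ARITHMETIC:
        raise ValueError(f"unsupported operator {op!r}")
    left, left_type = coerce(left, types)
    right, right_type = coerce(right, types)
    if left_type == right_type:
        return (op, left, right), left_type
    # Mixed int/real operands: convert the int side and make the result real.
    if left_type == "int":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (op, left, right), "real"

declared = {"initial": "real", "rate": "real"}
print(coerce(("+", "initial", ("*", "rate", "60")), declared))
```

With both identifiers declared as real, only the integer constant `60` is wrapped in a conversion node; the rest of the tree is returned unchanged.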
4. Intermediate Code Generation

• After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation.

• This intermediate representation should have two important properties:

1. it should be easy to produce
2. it should be easy to translate into the target machine
Intermediate Code

[Figure: an example of three-address code]
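As a rough illustration of three-address code, the Python sketch below flattens a syntax tree into instructions that each have at most one operator on the right-hand side. The tuple tree layout, the temporary names `t1`, `t2`, ..., and the sample tree are assumptions, not taken from the slides.

```python
from itertools import count

def three_address(tree):
    """Flatten a nested-tuple syntax tree into a list of three-address instructions."""
    code = []
    temp_ids = count(1)

    def emit(node):
        if isinstance(node, str):  # leaf: identifier or constant
            return node
        if node[0] == "=":
            _, target, value = node
            code.append(f"{target} = {emit(value)}")
            return target
        if len(node) == 2:  # unary operation such as ("inttoreal", x)
            op, operand = node
            arg = emit(operand)
            temp = f"t{next(temp_ids)}"
            code.append(f"{temp} = {op}({arg})")
            return temp
        op, left, right = node
        left_arg = emit(left)
        right_arg = emit(right)
        temp = f"t{next(temp_ids)}"  # a fresh temporary holds each intermediate result
        code.append(f"{temp} = {left_arg} {op} {right_arg}")
        return temp

    emit(tree)
    return code

tree = ("=", "position", ("+", "initial", ("*", "rate", ("inttoreal", "60"))))
for line in three_address(tree):
    print(line)
```

Operands are emitted before the operation that uses them, so the instruction order fixes the order of evaluation that was only implicit in the tree.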
5. Code Optimization

The machine-independent code-optimization phase attempts to improve the intermediate code so that better target code will result.

[Figure: optimized code]
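Two simple improvements of this kind can be sketched in Python: converting an integer constant to a real at compile time, and dropping a temporary that is only copied into a variable. The `(dest, op, arg1, arg2)` instruction tuples, the `copy` and `inttoreal` operation names, and the `t<number>` naming of temporaries are assumptions of this sketch, which also assumes a copied temporary is not used again later.

```python
def is_temp(name):
    # Assumption: compiler temporaries are named t1, t2, ...
    return name[0] == "t" and name[1:].isdigit()

def optimize(code):
    """Fold inttoreal on constants and remove a temporary that is only copied."""
    constants = {}  # temporary -> constant computed at compile time
    folded = []
    for dest, op, arg1, arg2 in code:
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            constants[dest] = f"{arg1}.0"  # do the conversion once, now
            continue
        folded.append((dest, op, arg1, arg2))

    result = []
    for instr in folded:
        dest, op, arg1, _ = instr
        # `x = t` directly after `t = a op b` becomes `x = a op b`.
        if op == "copy" and result and result[-1][0] == arg1 and is_temp(arg1):
            _, prev_op, prev1, prev2 = result.pop()
            result.append((dest, prev_op, prev1, prev2))
        else:
            result.append(instr)
    return result

code = [
    ("t1", "inttoreal", "60", None),
    ("t2", "*", "rate", "t1"),
    ("t3", "+", "initial", "t2"),
    ("position", "copy", "t3", None),
]
for instr in optimize(code):
    print(instr)
```

The four input instructions shrink to two: the conversion disappears because its result is known at compile time, and the final copy is merged into the addition.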
6. Code Generation
The code generator takes as input an intermediate representation of the source program and maps it into the target language.
Target code may be relocatable machine code or assembly code.
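The mapping from intermediate code to target code can be illustrated with a deliberately naive Python sketch. The target here is a hypothetical single-register machine with invented `LOAD`, `STORE`, `ADD`, `SUB`, `MUL`, and `DIV` instructions; it does not correspond to any real instruction set, and the `(dest, op, arg1, arg2)` input format is likewise an assumption.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(name):
    # Immediate constants get a '#' prefix; anything else is a memory name.
    return f"#{name}" if name[0].isdigit() else name

def generate(code):
    """Translate (dest, op, arg1, arg2) tuples into assembly text for a made-up machine."""
    lines = []
    for dest, op, arg1, arg2 in code:
        lines.append(f"LOAD  R0, {operand(arg1)}")              # first operand into R0
        lines.append(f"{OPCODES[op]:<5} R0, {operand(arg2)}")   # R0 = R0 op second operand
        lines.append(f"STORE {dest}, R0")                       # write the result back
    return lines

code = [
    ("t2", "*", "rate", "60.0"),
    ("position", "+", "initial", "t2"),
]
print("\n".join(generate(code)))
```

A real code generator would also decide which values stay in registers; this sketch stores every result to memory, which is simple but wasteful.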
Symbol-Table Management
A symbol table is a data structure containing a record for each variable name, with fields for the attributes of the name.

Attributes may provide information about
• the storage allocated for a variable
• its data type
• the scope of the variable

In the case of a procedure name, attributes may provide information about
• the number and types of its arguments
• the method of passing each argument (for example, by value or by reference)
• the type returned

• When an identifier in the source program is detected by the lexical analyzer, the identifier is entered into the symbol table.

• However, its attributes generally cannot be determined during lexical analysis. They are entered during the later phases.
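A symbol table along these lines can be sketched as a dictionary of records. The class name, method names, and attribute keys below are hypothetical and chosen only to mirror the description above.

```python
class SymbolTable:
    """One record per name; attributes are filled in as later phases learn them."""

    def __init__(self):
        self.records = {}

    def enter(self, name):
        # Called when the lexical analyzer first sees an identifier.
        # An existing record is returned unchanged, so attributes are not lost.
        return self.records.setdefault(name, {"name": name})

    def set_attributes(self, name, **attributes):
        # Called by later phases, e.g. once a declaration has been analyzed.
        self.records[name].update(attributes)

    def lookup(self, name):
        return self.records.get(name)

table = SymbolTable()
table.enter("rate")                                      # lexical analysis
table.set_attributes("rate", type="real", scope="global")  # later phases
print(table.lookup("rate"))
```

The record exists as soon as the name is seen, but it starts out holding only the name; the type and scope arrive later, which matches the division of work between the phases.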
Error detection

• Each phase can encounter errors. If an error is detected in a phase, the phase must somehow deal with that error so that compilation can proceed and further errors can be detected.
• Errors can occur during:
  • Lexical analysis
  • Syntax analysis
  • Semantic analysis
Cousins of the compiler

A compiler is assisted by

1. Preprocessors
2. Assemblers
3. Linkers/loaders

[Figure: a language-processing system]

1. Preprocessors
Preprocessors produce input to compilers. Their tasks are:

1. Macro processing
• e.g. in C: #define PI 3.14

2. File inclusion
• e.g. in C: #include <global.h>

3. Rational preprocessors
4. Language extensions
Rational preprocessors
- Augment older languages with more modern flow-of-control and data-structuring facilities.
Language extensions
- These preprocessors attempt to add capabilities to the language by what amounts to built-in macros.
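Macro processing, the first preprocessor task listed above, can be illustrated with a small Python sketch that expands object-like `#define NAME value` macros. It is a deliberately tiny subset and not how the C preprocessor is actually implemented: it ignores function-like macros, `#undef`, and string literals, and the function name is hypothetical.

```python
import re

def expand_macros(source):
    """Expand object-like `#define NAME value` macros in the lines that follow them."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            macros[match.group(1)] = match.group(2).strip()
            continue  # the directive itself is not passed on to the compiler
        for name, value in macros.items():
            # \b keeps PI from matching inside a longer name such as PIXEL.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)

print(expand_macros("#define PI 3.14\narea = PI * r * r;"))
```

The compiler proper never sees the name `PI`; by the time lexical analysis starts, the text already contains `3.14`.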
The Grouping of Phases into Passes
Phases of a compiler are collected into a front end and a back end.

[Figure: front end and back end]

In an implementation, activities from several phases may be grouped together into a pass that reads an input file and writes an output file.

[Figure: example grouping, with the front-end phases in one pass, code optimization as an optional pass, and a back-end pass]

Compiler-Construction Tools

• Specialized tools have been created to help implement various phases of a compiler. Some commonly used compiler-construction tools include:

1. Scanner generators
2. Parser generators
3. Syntax-directed translation engines
4. Automatic code generators
5. Data-flow engines
1. Scanner generators
• Produce lexical analyzers from a regular-expression description of the tokens of a language.

• The basic organization of the resulting lexical analyzer is in effect a finite automaton.

E.g.: LEX
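To show what "in effect a finite automaton" means, here is a hand-written two-state automaton in Python for one token class. The identifier pattern (a letter or underscore followed by letters, digits, or underscores) and the state names are assumptions; a scanner generator would derive a comparable automaton from the regular expressions it is given.

```python
def is_identifier(text):
    """Run a two-state DFA accepting: (letter | _) (letter | digit | _)*"""
    state = "start"
    for ch in text:
        if state == "start" and (ch.isalpha() or ch == "_"):
            state = "in_id"  # first character seen
        elif state == "in_id" and (ch.isalnum() or ch == "_"):
            state = "in_id"  # stay in the accepting state
        else:
            return False     # no transition defined: reject
    return state == "in_id"  # accept only if at least one character was read

print(is_identifier("rate2"), is_identifier("2rate"))
```

The automaton reads each character exactly once and never backs up, which is why scanners built this way are fast.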
2. Parser generators
• Automatically produce syntax analyzers from a grammatical description of a programming language.

• With these tools, this phase is now one of the easiest to implement.

E.g.: YACC
3. Syntax-directed translation engines
• Produce collections of routines for walking a parse tree and generating intermediate code.

4. Automatic code generators
• Produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.
5. Data-flow engines

• These facilitate the gathering of information about how values are transmitted from one part of a program to each other part.

• Data-flow analysis is a key part of code optimization.
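One common data-flow question is which variables are still needed at each point of a program (liveness). The Python sketch below answers it for straight-line code only, with no branches or loops; the `(dest, op, arg1, arg2)` instruction tuples and the function name are assumptions made for this example.

```python
def live_before_each(code, live_at_exit):
    """Backward liveness: for each instruction, the set of names live just before it."""
    live = set(live_at_exit)
    result = []
    for dest, _op, arg1, arg2 in reversed(code):
        live.discard(dest)  # the old value of dest is overwritten here
        for arg in (arg1, arg2):
            if arg is not None and not arg[0].isdigit():  # skip constants
                live.add(arg)  # operands must hold a value before this point
        result.append(set(live))
    result.reverse()
    return result

code = [
    ("t2", "*", "rate", "60.0"),
    ("position", "+", "initial", "t2"),
]
for instr, live in zip(code, live_before_each(code, {"position"})):
    print(instr, sorted(live))
```

Information flows backward from the end of the code: `t2` is live only between the two instructions, which tells an optimizer or register allocator that its storage can be reused afterwards.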
