
Chapter 1
CSE309N
Introduction to Compiling

Introduction to Compilers

- As a Discipline, Involves Multiple CS&E Areas
    - Programming Languages and Algorithms
    - Theory of Computing
    - Software Engineering
    - Computer Architecture
    - Operating Systems
- Has Deceptively Simple Intent:

    Source program --> Compiler --> Target Program
                          |
                          v
                    Error messages

  Source and target programs are both Diverse & Varied

Classifications of Compilers

- Compilers Viewed From Many Perspectives
    - Construction: Single Pass, Multiple Pass, Load & Go
    - Functional: Debugging, Optimizing
- However, All utilize same basic tasks to accomplish their actions

The Model

- The TWO Fundamental Parts:
    - Analysis: Decompose Source into an intermediate representation
    - Synthesis: Target program generation from representation
- We Will Discuss Both in This Class, and FOCUS on analysis.

Important Notes

- Today: There are many Software Tools for helping with the Analysis Part. This Wasn't the Case in Early Days. (Some) analysis is also important in:
    - Structure / Syntax directed editors: Force (syntactically) correct code to be entered
        - Takes input as a sequence of commands to build a source program.
        - Performs:
            - Text creation
            - Text modifications
            - Analyzes the source program

Important Notes (Continued)

- Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.)
    - Analyzes the source program and prints it in such a way that the structure of the program becomes clearly visible.
    - Examples
        - Comments may appear in a special font
        - Statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the stmts.
- Static Checkers: A "quick" compilation to detect rudimentary errors
    - Examples
        - Detects parts of the program that can never be executed
        - A variable used before it is defined
- Interpreters: "real" time execution of code a "line-at-a-time"
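
The "variable used before it is defined" check listed under Static Checkers can be sketched in a few lines. This is only an illustration, assuming a toy program made of straight-line `target := expression` statements; the function name `used_before_defined` is hypothetical and real checkers work on a parsed program with control flow.

```python
import re

def used_before_defined(statements):
    """Return (line_no, name) pairs for names read before any assignment.

    Assumption: every statement has the form `target := expression`.
    """
    defined = set()
    problems = []
    for line_no, stmt in enumerate(statements, start=1):
        target, expr = (part.strip() for part in stmt.split(":=", 1))
        # Every identifier on the right-hand side must already be defined.
        for name in re.findall(r"[A-Za-z_]\w*", expr):
            if name not in defined:
                problems.append((line_no, name))
        defined.add(target)
    return problems

program = ["rate := 5", "position := initial + rate * 60"]
print(used_before_defined(program))  # [(2, 'initial')]
```

Note that nothing is executed: the check only inspects the program text, which is what makes it a static check.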

Important Notes (Continued)

- Compilation Is Not Limited to Programming Language Applications
- Text Formatters
    - LaTeX & TROFF Are Languages Whose Commands Format Text (paragraphs, figures, mathematical structures, etc.)
- Silicon Compilers
    - Textual / Graphical: Take Input and Generate Circuit Design
- Database Query Processors
    - Database Query Languages Are Also Programming Languages
    - Input is compiled Into a Set of Operations for Accessing the Database

The Many Phases of a Compiler

       Source Program
             |
    1  Lexical Analyzer
             |
    2  Syntax Analyzer
             |
    3  Semantic Analyzer
             |
    4  Intermediate Code Generator
             |
    5  Code Optimizer
             |
    6  Code Generator
             |
       Target Program

The Symbol-table Manager and the Error Handler interact with all six phases.

Language-Processing System

       Skeleton Source Program
             |
    1  Pre-Processor
             |   Source program
    2  Compiler
             |   Target Assembly program
    3  Assembler
             |
    4  Relocatable Machine Code
             |
    5  Loader / Link-Editor   <--   Library, relocatable object files
             |
       Executable

The Analysis Task For Compilation

- Three Phases:
    - Linear / Lexical Analysis:
        - L-to-R Scan to Identify Tokens
        - token: sequence of chars having a collective meaning
    - Hierarchical Analysis:
        - Grouping of Tokens Into Meaningful Collection
    - Semantic Analysis:
        - Checking to ensure Correctness of Components

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens which are the basic building blocks

For Example:

    position := initial + rate * 60 ;
    ________ __ _______ _ ____ _ __ _

All are tokens
Blanks, Line breaks, etc. are scanned out
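
A left-to-right scan like the one above can be sketched with regular expressions. This is a minimal illustration, not the course's scanner: the token class names (`ID`, `ASSIGN`, `NUMBER`, `OP`, `SEMI`) and the function name `tokenize` are assumptions made for the example.

```python
import re

# Hypothetical token classes; order matters (":=" must be tried before single characters).
TOKEN_SPEC = [
    ("ASSIGN", r":="),
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+*()]"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),   # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Scan text left to right and return a list of (class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

for token in tokenize("position := initial + rate * 60 ;"):
    print(token)
```

The scan yields eight tokens for the example statement, one per underlined group, and never looks back or recurses.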

Phase 2. Hierarchical Analysis (Parsing or Syntax Analysis)

For previous example, we would have Parse Tree:

    assignment statement
    |-- identifier: position
    |-- :=
    `-- expression
        |-- expression
        |   `-- identifier: initial
        |-- +
        `-- expression
            |-- expression
            |   `-- identifier: rate
            |-- *
            `-- expression
                `-- number: 60

Nodes of tree are constructed using a grammar for the language

What is a Grammar?

- Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens

    statement             is an  assignment statement, or
                                 while statement, or
                                 if statement, or ...
    assignment statement  is an  identifier := expression ;
    expression            is an  (expression), or
                                 expression + expression, or
                                 expression * expression, or
                                 number, or
                                 identifier, or ...
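
The rules above can be turned into a small recursive-descent parser. The sketch below makes two assumptions that are not on the slide: the expression rules are layered into hypothetical `term` and `factor` levels so that `*` binds tighter than `+` (the rules as written do not fix a precedence), and the result is a compact tuple tree rather than the full parse tree. The helper `tokenize` silently skips characters it does not recognize.

```python
import re

def tokenize(text):
    # Simplified scanner for this example only.
    return re.findall(r":=|\d+|[A-Za-z_]\w*|[+*();]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, predicate, what):
        tok = self.peek()
        if tok is None or not predicate(tok):
            raise SyntaxError(f"expected {what}, found {tok!r}")
        self.pos += 1
        return tok

    def assignment(self):        # assignment -> identifier := expression ;
        target = self.expect(str.isidentifier, "identifier")
        self.expect(lambda t: t == ":=", "':='")
        value = self.expression()
        self.expect(lambda t: t == ";", "';'")
        return (":=", target, value)

    def expression(self):        # expression -> term { + term }
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):              # term -> factor { * factor }
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):            # factor -> ( expression ) | number | identifier
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.expression()
            self.expect(lambda t: t == ")", "')'")
            return node
        if tok is not None and tok.isdigit():
            self.pos += 1
            return ("num", int(tok))
        return ("id", self.expect(str.isidentifier, "identifier"))

tree = Parser(tokenize("position := initial + rate * 60 ;")).assignment()
print(tree)
```

Each grammar rule becomes one method, and the nesting of expressions is handled by the methods calling one another recursively.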

Why Have We Divided Analysis in This Manner?

- Lexical Analysis - Scans Input, Its Linear Actions Are Not Recursive
    - Identify Only Individual "words" that are the Tokens of the Language
- Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
    - Verify that the "words" are Correctly Assembled into "sentences"
- What is Third Phase?
    - Determine Whether the Sentences have One and Only One Unambiguous Interpretation
    - ... and do something about it!
    - e.g. "John Took Picture of Mary Out on the Patio"


Phase 3. Semantic Analysis

- Find More Complicated Semantic Errors and Support Code Generation
- Parse Tree Is Augmented With Semantic Actions

Compressed Tree:

    :=
    |-- position
    `-- +
        |-- initial
        `-- *
            |-- rate
            `-- 60

Conversion Action:

    :=
    |-- position
    `-- +
        |-- initial
        `-- *
            |-- rate
            `-- inttoreal
                `-- 60
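
The conversion action can be sketched as a tree walk that inserts `inttoreal` wherever an integer operand meets a real one. The tuple tree shape, the function name `typed`, and the assumption that `initial` and `rate` are declared real are all illustrative choices for this example; the slide only shows the resulting tree.

```python
def typed(node, symbols):
    """Return (node, type), wrapping int operands of mixed arithmetic in inttoreal.

    Assumptions: leaves are ("num", value) or ("id", name); inner nodes are
    (operator, left, right); the only types are "int" and "real".
    """
    kind = node[0]
    if kind == "num":
        return node, "int"
    if kind == "id":
        return node, symbols[node[1]]
    op, left, right = node
    left, left_type = typed(left, symbols)
    right, right_type = typed(right, symbols)
    if left_type != right_type:
        # Mixed int/real arithmetic: convert the int side, result is real.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), left_type

symbols = {"initial": "real", "rate": "real"}   # assumed declarations
tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", 60)))
print(typed(tree, symbols))
```

The walk is bottom-up: each node's type is known only after its operands have been typed, which is why the conversion ends up directly above the constant 60.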

Phase 3. Semantic Analysis (Continued)

- Most Important Activity in This Phase:
    - Type Checking - Legality of Operands
- Many Different Situations:

    Real := int + char ;
    A[int] := A[real] + int ;
    while char <> int do
    .... Etc.

Analysis in Text Formatting

Simple Commands : LaTeX

    \begin{proof}
    \end{proof}
    \noindent
    \section{Introduction}
    $A_i$
    $A_{i_j}$

- Language Commands (begin, single, noindent, section) are Embedded in a stream of text, i.e., a FILE
- \ and $ serve as signals to LaTeX
- What are tokens?
- What is hierarchical structure?
- What kind of semantic analysis is required?

Supporting Phases / Activities for Analysis

- Symbol Table Creation / Maintenance
    - Contains Info (storage, type, scope, args) on Each "Meaningful" Token, Typically Identifiers
    - Data Structure Created / Initialized During Lexical Analysis
    - Utilized / Updated During Later Analysis & Synthesis
- Error Handling
    - Detection of Different Errors Which Correspond to All Phases
    - What Kinds of Errors Are Found During the Analysis Phase?
    - What Happens When an Error Is Found?
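
A symbol table of the kind described above can be sketched as a dictionary of records. The names `Entry`, `SymbolTable`, `intern`, and `lookup` are hypothetical, and only type and storage offset are modelled; scope and argument information would be further fields.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    name: str
    index: int                     # position in the table, e.g. 1 for id1
    type: Optional[str] = None     # filled in by later phases
    offset: Optional[int] = None   # storage location, filled in later

class SymbolTable:
    def __init__(self):
        self._entries = {}

    def intern(self, name):
        """Called by the lexical analyzer: add the name once, return its entry."""
        if name not in self._entries:
            self._entries[name] = Entry(name, len(self._entries) + 1)
        return self._entries[name]

    def lookup(self, name):
        return self._entries[name]

table = SymbolTable()
for word in ["position", "initial", "rate", "rate"]:
    table.intern(word)
table.lookup("rate").type = "real"   # e.g. recorded by a later analysis phase
print([(e.name, f"id{e.index}") for e in table._entries.values()])
```

The entry is created when the identifier is first scanned and is then enriched in place, which mirrors the "created during lexical analysis, updated later" split above.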


The Many Phases of a Compiler (recap)

       Source Program
             |
    1  Lexical Analyzer
             |
    2  Syntax Analyzer
             |
    3  Semantic Analyzer
             |
    4  Intermediate Code Generator
             |
    5  Code Optimizer
             |
    6  Code Generator
             |
       Target Program

The Symbol-table Manager and the Error Handler interact with all six phases.

The Synthesis Task For Compilation

- Intermediate Code Generation
    - Abstract Machine Version of Code - Independent of Architecture
    - Easy to Produce and
    - Easy to translate into target program
- Code Optimization
    - Find More Efficient Ways to Execute Code
    - Replace Code With More Optimal Statements
- Final Code Generation
    - Generate Relocatable Machine Dependent Code
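
As an illustration of "replace code with more optimal statements", the sketch below applies two simple rewrites to a list of intermediate instructions: it performs `inttoreal` on integer constants at compile time, and it removes a final copy from a temporary. The tuple format `(dest, op, arg1, arg2)` and the function name `optimize` are assumptions for the example, and the copy rewrite assumes each temporary is used only once.

```python
def optimize(code):
    constants = {}   # temporaries known to hold a compile-time constant
    out = []
    for dest, op, a, b in code:
        # Substitute known constants into the operands.
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            constants[dest] = float(a)   # do the conversion at compile time
            continue
        if (op == "copy" and out and out[-1][0] == a
                and isinstance(a, str) and a.startswith("temp")):
            # Fold `t := x op y ; d := t` into `d := x op y`.
            _, prev_op, prev_a, prev_b = out.pop()
            out.append((dest, prev_op, prev_a, prev_b))
            continue
        out.append((dest, op, a, b))
    return out

code = [
    ("temp1", "inttoreal", 60, None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for instruction in optimize(code):
    print(instruction)
```

Four instructions shrink to two. The sketch keeps the original temporary names instead of renumbering them.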


Reviewing the Entire Process

    position := initial + rate * 60

  --> lexical analyzer -->

    id1 := id2 + id3 * 60

  --> syntax analyzer -->

    :=
    |-- id1
    `-- +
        |-- id2
        `-- *
            |-- id3
            `-- 60

  --> semantic analyzer -->

    :=
    |-- id1
    `-- +
        |-- id2
        `-- *
            |-- id3
            `-- inttoreal
                `-- 60

  --> intermediate code generator -->

Symbol Table:

    1  position  ....
    2  initial   ....
    3  rate      ....

Errors (can arise in every phase)

Reviewing the Entire Process (Continued)

  --> intermediate code generator -->

    temp1 := inttoreal(60)
    temp2 := id3 * temp1
    temp3 := id2 + temp2
    id1 := temp3                (3 address code)

  --> code optimizer -->

    temp1 := id3 * 60.0
    id1 := id2 + temp1

  --> final code generator -->

    MOV id3, R2
    MUL #60.0, R2
    MOV id2, R1
    ADD R2, R1
    MOV R1, id1

Symbol Table:

    1  position  ....
    2  initial   ....
    3  rate      ....

Errors (can arise in every phase)
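
The step from the semantic tree to 3 address code can be sketched as a post-order walk that gives every inner node a fresh temporary. The tuple tree format and the name `gen_tac` are assumptions for illustration; this is one simple way to produce such code, not necessarily the scheme the course develops later.

```python
def gen_tac(target, tree):
    """Emit three-address statements for `target := tree`."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind in ("id", "num"):
            return str(node[1])            # leaves need no instruction
        if kind == "inttoreal":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        left = walk(node[1])               # operands first, then the operator
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    result = walk(tree)
    code.append(f"{target} := {result}")
    return code

tree = ("+", ("id", "id2"), ("*", ("id", "id3"), ("inttoreal", ("num", 60))))
print("\n".join(gen_tac("id1", tree)))
```

Each emitted statement has at most one operator, which is what makes the form easy to produce and easy to translate further.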

Assemblers

- Assembly code: names are used for instructions, and names are used for memory addresses.
- Two-pass Assembly:
    - First Pass: all identifiers are assigned to memory addresses (0-offset)
      e.g. substitute 0 for a, and 4 for b
    - Second Pass: produce relocatable machine code:

        MOV a, R1       0001 01 00 00000000 *     (Load)
        ADD #2, R1      0011 01 10 00000010       (add)
        MOV R1, b       0010 01 00 00000100 *     (Store)

      * = relocation bit
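
The two passes can be sketched as follows. The field layout is an assumption read off the example rows (4-bit operation code, 2-bit register, 2-bit mode tag with 00 for a memory address and 10 for an immediate value, then an 8-bit operand); only register R1 and the three operations shown are supported, and the names `classify` and `assemble` are hypothetical.

```python
OPCODES = {"LOAD": "0001", "STORE": "0010", "ADD": "0011"}
REGISTERS = {"R1": "01"}
WORD = 4   # bytes per variable, matching "0 for a, and 4 for b"

def classify(instr):
    """Map e.g. ('MOV', 'a', 'R1') to (operation, register, other operand)."""
    op, src, dst = instr
    if op == "MOV" and dst in REGISTERS:
        return "LOAD", dst, src
    if op == "MOV":
        return "STORE", src, dst
    return op, dst, src

def assemble(program):
    # First pass: assign each identifier an offset starting at 0.
    symbols = {}
    for instr in program:
        _, _, operand = classify(instr)
        if not operand.startswith("#") and operand not in symbols:
            symbols[operand] = WORD * len(symbols)
    # Second pass: emit operation code, register, mode tag, 8-bit operand.
    lines = []
    for instr in program:
        operation, reg, operand = classify(instr)
        if operand.startswith("#"):
            tag, value, mark = "10", int(operand[1:]), ""
        else:
            tag, value, mark = "00", symbols[operand], " *"   # relocatable address
        lines.append(f"{OPCODES[operation]} {REGISTERS[reg]} {tag} {value:08b}{mark}")
    return lines

program = [("MOV", "a", "R1"), ("ADD", "#2", "R1"), ("MOV", "R1", "b")]
print("\n".join(assemble(program)))
```

The first pass is needed because an identifier may be used before the assembler has seen enough of the program to know its address.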

Loaders and Link-Editors

- Loader: takes relocatable machine code, alters the addresses, and places the altered instructions into memory.
- Link-editor: takes many (relocatable) machine code programs (with cross-references) and produces a single file.
    - Need to keep track of correspondence between variable names and corresponding addresses in each piece of code.
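
The loader's address alteration can be sketched on textual instruction words in which a trailing `*` marks a relocatable address. That text format, the function name `load`, and the load address 15 are assumptions for the example; overflow past 8 bits is not handled.

```python
def load(relocatable, load_address):
    """Add load_address to every operand whose relocation bit (*) is set."""
    loaded = []
    for line in relocatable:
        fields = line.split()
        if fields[-1] == "*":                       # relocation bit set
            address = int(fields[3], 2) + load_address
            fields = fields[:3] + [f"{address:08b}"]
        loaded.append(" ".join(fields))
    return loaded

relocatable = [
    "0001 01 00 00000000 *",
    "0011 01 10 00000010",
    "0010 01 00 00000100 *",
]
print("\n".join(load(relocatable, 15)))
```

Only the first and third words change; the immediate operand in the second word is a value, not an address, so it must be left alone.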

Compiler Cousins: Preprocessors

- Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

    #define X 3
    #define Y A*B+C
    #define Z getchar()
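
The text substitution performed for `#define` can be imitated in a few lines. This is a deliberately simplified sketch: the function name `expand_macros` is hypothetical, only parameterless macros are handled, replacements are not rescanned for further macros, and string literals are not protected, all of which a real C preprocessor does differently.

```python
import re

def expand_macros(source):
    macros = {}
    output = []
    for line in source.splitlines():
        m = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if m:
            macros[m.group(1)] = m.group(2).strip()   # remember name -> replacement text
            continue
        # Replace whole identifiers only, so X does not match inside "MAX".
        line = re.sub(r"\b\w+\b", lambda w: macros.get(w.group(), w.group()), line)
        output.append(line)
    return "\n".join(output)

source = "#define X 3\n#define Y A*B+C\nint n = X + Y;\nint MAX = 2*Y;"
print(expand_macros(source))
```

Because the substitution is purely textual, `2*Y` becomes `2*A*B+C`, which multiplies only `A*B` by 2. The compiler never sees the macro names, only the substituted text.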

2. File Inclusion

#include in C - bring in another file before compiling

    defs.h         main.c                  main.c after inclusion

    //////         #include "defs.h"       //////
    //////         ...---...---...---      //////
    //////         ...---...---...---      //////
                   ...---...---...---      ...---...---...---
                                           ...---...---...---
                                           ...---...---...---
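
File inclusion amounts to splicing one text into another before compilation. The sketch below keeps the "files" in a dictionary so that it is self-contained; the function name `include_files` is hypothetical, only the quoted form of `#include` is recognized, and there is no protection against a file that includes itself.

```python
import re

def include_files(name, files):
    """Expand #include "file" lines; `files` maps file names to their text."""
    result = []
    for line in files[name].splitlines():
        m = re.match(r'\s*#include\s+"([^"]+)"', line)
        if m:
            # Splice in the (recursively expanded) contents of the named file.
            result.extend(include_files(m.group(1), files).splitlines())
        else:
            result.append(line)
    return "\n".join(result)

files = {
    "defs.h": "#define X 3",
    "main.c": '#include "defs.h"\nint n = X;',
}
print(include_files("main.c", files))
```

The output is a single text in which the contents of defs.h stand where the `#include` line used to be.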

3. Rational Preprocessors

- Augment "Old" Languages With Modern Constructs
- Add Macros for If - Then, While, Etc.
- #define Can Make C Code More Pascal-like

    #define begin {
    #define end }

4. Language Extensions for a Database System

EQUEL - Database query language embedded in C

    ## Retrieve (DN=Department.Dnum) where
    ## Department.Dname = 'Research'

is preprocessed into:

    ingres_system("Retr...Research", ____, ____);

a procedure call in a programming language.

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation
    vs.
Back End : Code Generation + Optimization

Number of Passes:
- A pass: requires r/w intermediate files
- Fewer passes: more efficiency.
- However: fewer passes require more sophisticated memory management and compiler phase interaction.
- Tradeoffs ......

Compiler Construction Tools

- Parser Generators: Produce Syntax Analyzers
- Scanner Generators: Produce Lexical Analyzers
- Syntax-directed Translation Engines: Generate Intermediate Code
- Automatic Code Generators: Generate Actual Code
- Data-Flow Engines: Support Optimization

The End