Compilers Lecture 8

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 19

Compilers Design

Semantic Analysis

1
Introduction
Lexical analysis
Detects inputs with illegal tokens
Parsing
Detects inputs with ill-formed parse trees
Semantic analysis
Last phase in the analysis stage
Catches all remaining errors
Introduction(cont.)
Parsing cannot catch some errors
Some language constructs not context-free
Important checks :
1. All identifiers are declared
2. Types
3. Inheritance relationships
4. Classes defined only once
5. Methods in a class defined only once
6. Reserved identifiers are not misused
 The requirements depend on the language
Scope
Matching identifier declarations with uses
Important static analysis step in most languages
The scope of an identifier is the portion of a program
in which that identifier is accessible
The same identifier may refer to different things in
different parts of the program
Different scopes for same name don’t overlap
An identifier may have restricted scope (private in c+
+)
Scope(cont.)
Most languages have static scope
Scope depends only on the program text, not run-time
behaviour i.e C and Java language.
A few languages are dynamically scoped
Lisp and Javascript
Lisp has changed to mostly static scoping
Scope depends on execution of the program
Scope(cont.)
Method names have complex rules
A method need not be defined in the class in which it
is used, but in some parent class
Methods may also be redefined (overridden)
Scope(cont.)
 It may happen that the same name is declared in several nested
scopes. In this case, it is normal that the declaration closest to a use
of the name will be the one that defines that particular use. For
example consider the following C statement block:
{
int x = 1;
int y = 2;
{
double x = 3.14159265358979;
y += (int)x;
}
y += x;
}
Symbol table
Much of semantic analysis can be expressed as a
recursive descent of an AST:
Before: Process an AST node n
Recurse: Process the children of n
After: Finish processing the AST node n
When performing semantic analysis on a portion of
the AST, we need to know which identifiers are
defined.
Symbol table(cont.)
In the previous example:
Before processing y, add definition of x to current
definitions, overriding any other definition of x
Recurse
After processing y, remove definition of x and restore old
definition of x
Symbol table(cont.)
A symbol table is a data structure that tracks the
current bindings of identifiers
What info?
Textual name
Data type
Declaring procedure
Lexical level of declaration
If array, number and size of dimensions
If procedure, number and type of parameters
Types
A set of values and a set of operations on those values
ex: int operations are +-*/
ex: string operations concat
Classes are one instantiation of the modern notion of
type.
Consider the assembly language fragment
add $r1, $r2, $r3
What are the types of $r1, $r2, $r3?
Types(cont.)
Certain operations are legal for values of each type
 It doesn’t make sense to add a function pointer and an
integer in C
 It does make sense to add two integers
 But both have the same assembly language implementation!
A language’s type system specifies which operations are
valid for which types
The goal of type checking is to ensure that operations are
used with the correct types
 Enforces intended interpretation of values, because nothing
else will!
Types(cont.)
Three kinds of languages:
Statically typed: All or almost all checking of types is
done as part of compilation (C and Java)
Dynamically typed: Almost all checking of types is done
as part of program execution (Lisp, Scheme, and
python)
Untyped: No type checking (machine code)
Static vs. dynamic typing
Static typing
Static checking catches many programming errors at
compile time
Avoids overhead of runtime type checks
Dynamic typing
Static type systems are restrictive
Rapid prototyping difficult within a static type system
Static vs. dynamic typing(cont.)
A lot of code is written in statically typed languages
with an “escape” mechanism
Unsafe casts in C, Java
People retrofit static typing to dynamically typed
languages
For optimization, debugging
It’s debatable whether either compromise represents
the best or worst of both worlds
Type Checking
Type Checking is the process of verifying fully typed
programs
Type Inference is the process of filling in missing type
information
The two are different, but the terms are often used
interchangeably.
Type Checking(cont.)
We have seen two examples of formal notation
specifying parts of a compiler
Regular expressions
Context-free grammars
The appropriate formalism for type checking is logical
rules of inference
Type Checking(cont.)
Inference rules have the form
If Hypothesis is true, then Conclusion is true
Type checking computes via reasoning
If E1 and E2 have certain types, then E3 has a certain
type
Rules of inference are a compact notation for “If-Then”
statements. For example:
If e1 has type Int and e2 has type Int, then e1 + e2 has
type Int
Questions?

19

You might also like