
Compiler Design 13.

Symbol Tables
malolavaradhan

Symbol Tables

The job of the symbol table is to store all the names of the program and information about each name

In block structured languages, roughly speaking, the symbol table collects information from declarations and uses that information whenever a name is used later in the program

This information could be part of the syntax tree, but is put into a table for efficient access to names
If there are different occurrences of the same name, the symbol table assists in name resolution

Either the parser or the lexical analyzer can do

Symbol Table Entries: Simple Variables, Basic Information

Variables (identifiers)

Character string (lexeme), may have limits on number of characters
Data type
Storage class (if not already implied by the data type)
Name and lexical level of block in which it is declared
Other access information, if necessary, such as modifiability constraints
Base address and memory offset, after allocation

Symbol Table Entries: Beyond Simple Variables

Arrays

Also needs number of dimensions
Upper and lower bounds of each dimension

Records and structures

List of fields
Information about each field

Functions and Procedures

Number and types of parameters
Type of return value
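
As a rough illustration, the entry contents listed above could be modelled as records like the following. This is only a sketch: the class and field names (Symbol, ArraySymbol, RecordSymbol, FunctionSymbol, bounds, param_types, and so on) are hypothetical and chosen for this example, not taken from any particular compiler.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Symbol:
    """Basic information kept for every name (hypothetical layout)."""
    lexeme: str                           # character string of the name
    data_type: str                        # e.g. "int", "float"
    storage_class: Optional[str] = None   # only if not implied by the type
    block_name: str = "program"           # block in which the name is declared
    lexical_level: int = 0                # nesting depth of that block
    modifiable: bool = True               # one kind of access information
    base_address: Optional[int] = None    # filled in after allocation
    offset: Optional[int] = None          # filled in after allocation


@dataclass
class ArraySymbol(Symbol):
    # One (lower, upper) pair per dimension.
    bounds: List[Tuple[int, int]] = field(default_factory=list)

    @property
    def dimensions(self) -> int:
        # The number of dimensions follows from the list of bounds.
        return len(self.bounds)


@dataclass
class RecordSymbol(Symbol):
    # Each field is itself described by a Symbol.
    fields: List[Symbol] = field(default_factory=list)


@dataclass
class FunctionSymbol(Symbol):
    param_types: List[str] = field(default_factory=list)
    return_type: str = "void"

    @property
    def param_count(self) -> int:
        return len(self.param_types)


matrix = ArraySymbol("matrix", "float", lexical_level=1, bounds=[(0, 9), (1, 20)])
area = FunctionSymbol("area", "function", param_types=["float", "float"],
                      return_type="float")
```

Here matrix.dimensions is 2 and area.param_count is 2. Storing the bounds as a list and deriving the number of dimensions from it is a design choice of this sketch; a compiler could equally store both.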


Symbol Table Representation

The two main operations are


insert (name) makes an entry for this name
lookup (name) finds the relevant occurrence of the name by searching the table

Lookups occur a lot more often than inserts

Hash tables are commonly used because of their good average time complexity for lookup (O(1))

[Figure: hash table holding the entries var1, class1, var3, fn1, var2, fn2]
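
A minimal sketch of the two main operations, assuming a single scope and using Python's built-in dict (itself a hash table) as the underlying storage. The class name SymbolTable and the choice to raise an exception on a duplicate insert are assumptions of this example.

```python
class SymbolTable:
    """Single-scope symbol table backed by a hash table (Python dict)."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, info=None):
        # Make an entry for this name; a second insert of the same
        # name in the same table is treated as an error in this sketch.
        if name in self._entries:
            raise KeyError(f"'{name}' is already declared")
        self._entries[name] = {} if info is None else info
        return self._entries[name]

    def lookup(self, name):
        # Average-case O(1) search; returns None when the name is absent.
        return self._entries.get(name)


table = SymbolTable()
table.insert("var1", {"type": "int"})
table.insert("fn1", {"type": "function"})
entry = table.lookup("var1")   # the entry inserted above
missing = table.lookup("var9") # None, since var9 was never inserted
```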

Scope Analysis

The scope of a name is tied to the idea of a block in the programming language

Standard blocks (statement sequences, sometimes if statement)
Procedures and functions
Program (global program level)
Universe (predefined functions, etc.)

Names must be unique within the block in which they are declared (no two objects with the same name in one block)

There are some languages with exceptions for different kinds of names (a function and a variable may have the same name)

Declaration Before Use?

We are dealing primarily with languages that require names to be declared

Names of variables, constants, arrays, etc. must be declared before use

Names of functions and procedures vary

C requires functions and procedures to also be declared before use, or at least given a prototype
Java does not require this for methods (can call first, define later in *.java file)

Scope of a name (in a statically scoped language):

The scope of a constant, variable, array, etc. is from the end of its definition to the end of the block in which it is declared

More Symbol Table Functions

In addition to lookup and insert, the symbol table will also need

initializeScope (level), when a block is entered, to create a new hash table entry in the symbol table list

finalizeScope (level), on block exit, to put the current hash table into a background list

Essentially makes a tree structure (scope A may contain scopes B1, B2, B3 ...), where one child may be distinguished as the active block

The symbol tables shown so far are all for the program being compiled; also needed is a way to look up names in the universe
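
The scope entry and exit operations described above might look like the following sketch, which keeps a list of hash tables with the universe as the outermost one. The names ScopedSymbolTable, initialize_scope and finalize_scope are hypothetical; unlike the slide's signatures, this version computes the lexical level itself rather than taking it as a parameter.

```python
class ScopedSymbolTable:
    """List of hash tables, one per open block (innermost last)."""

    def __init__(self, universe=None):
        # Level 0 holds the universe (predefined names).
        self._active = [dict(universe or {})]
        # "Background list" of (level, table) pairs for blocks already exited.
        self._closed = []

    def initialize_scope(self):
        # Called on block entry: push a fresh hash table.
        self._active.append({})
        return len(self._active) - 1   # lexical level of the new block

    def finalize_scope(self):
        # Called on block exit: move the current table to the background list.
        if len(self._active) == 1:
            raise RuntimeError("cannot close the universe scope")
        level = len(self._active) - 1
        self._closed.append((level, self._active.pop()))

    def insert(self, name, info):
        current = self._active[-1]
        if name in current:
            raise KeyError(f"'{name}' is already declared in this block")
        current[name] = info

    def lookup(self, name):
        # Search from the innermost block outwards, ending at the universe.
        for scope in reversed(self._active):
            if name in scope:
                return scope[name]
        return None


table = ScopedSymbolTable(universe={"print": "predefined function"})
table.initialize_scope()          # program level
table.insert("x", "int")
table.initialize_scope()          # nested block
table.insert("x", "float")        # hides the outer x
inner = table.lookup("x")         # "float"
table.finalize_scope()
outer = table.lookup("x")         # "int"
```

Keeping the closed tables rather than discarding them matches the idea of a background list; later phases could still consult them.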

Alternate Representation

The lists of hash tables can be inefficient for lookup since the system has to search up the list of lexical levels

More names tend to be declared at level 0, which makes the most common case the most expensive

An optimization of the symbol table as lists of hash tables is to keep one giant hash table

Within that table each name will have a list of occurrences identified by lexical level

This representation keeps the (essentially) constant time lookup
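
One possible sketch of the single-table representation: each name maps to a list of its occurrences tagged with the lexical level, innermost last. The class name FlatSymbolTable and the extra per-level list of declared names (used here to make block exit cheap) are assumptions of this example, not part of the slides.

```python
from collections import defaultdict


class FlatSymbolTable:
    """One hash table; each name holds a list of (level, info) occurrences."""

    def __init__(self):
        self._table = defaultdict(list)   # name -> [(level, info), ...]
        self._level = 0
        self._declared = [[]]             # names declared at each open level

    def initialize_scope(self):
        self._level += 1
        self._declared.append([])

    def finalize_scope(self):
        if self._level == 0:
            raise RuntimeError("no open block to close")
        # Remove only the occurrences that belong to the block being left.
        for name in self._declared.pop():
            self._table[name].pop()
        self._level -= 1

    def insert(self, name, info):
        occurrences = self._table[name]
        if occurrences and occurrences[-1][0] == self._level:
            raise KeyError(f"'{name}' is already declared at this level")
        occurrences.append((self._level, info))
        self._declared[-1].append(name)

    def lookup(self, name):
        # One hash probe, then the innermost occurrence is at the end.
        occurrences = self._table.get(name)
        return occurrences[-1][1] if occurrences else None


flat = FlatSymbolTable()
flat.insert("count", "int")       # level 0
flat.initialize_scope()
flat.insert("count", "float")     # level 1 hides level 0
hidden = flat.lookup("count")     # "float"
flat.finalize_scope()
visible = flat.lookup("count")    # "int"
```

A lookup of a level 0 name no longer walks through every enclosing table, which is the case the list-of-tables layout handled worst.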

Static Scope

The scoping system described so far assumes that the scope rules are for static scoping

The static program layout of enclosing blocks determines the scoping of a name

There are also languages with dynamic scoping

The scoping of a name depends on the call structure of the program at run-time

The name resolution will be to the closest block on the call stack with a declaration of that name, that is, the most recently called function or block that declares it
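
The difference between the two rules can be shown with a small model. In this sketch a Block records its own declarations and its lexically enclosing block; resolve_static follows the enclosing-block chain, while resolve_dynamic walks a call stack from the most recent call backwards. All names here (Block, resolve_static, resolve_dynamic, the example program) are made up for illustration.

```python
class Block:
    def __init__(self, name, declarations, enclosing=None):
        self.name = name
        self.declarations = declarations   # names declared in this block
        self.enclosing = enclosing         # lexically enclosing block


def resolve_static(name, block):
    # Static scoping: follow the program layout of enclosing blocks.
    while block is not None:
        if name in block.declarations:
            return block.declarations[name]
        block = block.enclosing
    return None


def resolve_dynamic(name, call_stack):
    # Dynamic scoping: closest block on the call stack that declares the name.
    for block in reversed(call_stack):
        if name in block.declarations:
            return block.declarations[name]
    return None


# Example program: 'show' uses x but does not declare it;
# 'caller' declares its own x and then calls 'show'.
program = Block("program", {"x": "x of program"})
show = Block("show", {}, enclosing=program)
caller = Block("caller", {"x": "x of caller"}, enclosing=program)

call_stack = [program, caller, show]       # most recent call last

static_result = resolve_static("x", show)           # "x of program"
dynamic_result = resolve_dynamic("x", call_stack)   # "x of caller"
```

The same use of x inside show resolves to different declarations because only the dynamic rule looks at who called show.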

Object-Oriented Scoping

Languages like Java must keep symbol tables for


The code being compiled
Any external classes that are known and referenced inside the code
The inheritance hierarchy above the class containing the code

One method of implementation is to attach a symbol table to each class with two nesting hierarchies

One for lexical scoping inside individual methods
One to follow the inheritance hierarchy of the class
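
A sketch of the two nesting hierarchies, assuming single inheritance and assuming (as in Java) that names local to a method hide members of the class. ClassScope and MethodScope are hypothetical names for this example.

```python
class ClassScope:
    """Symbol table attached to a class; links to its superclass."""

    def __init__(self, name, members, superclass=None):
        self.name = name
        self.members = members         # fields and methods declared here
        self.superclass = superclass   # next table up the inheritance hierarchy

    def lookup_member(self, name):
        cls = self
        while cls is not None:
            if name in cls.members:
                return cls.members[name]
            cls = cls.superclass
        return None


class MethodScope:
    """Lexical scope inside a method; links to its enclosing scope."""

    def __init__(self, owner, enclosing=None):
        self.owner = owner             # class containing the method
        self.enclosing = enclosing     # enclosing block in the same method
        self.locals = {}

    def lookup(self, name):
        # First hierarchy: lexical nesting inside the method.
        scope = self
        while scope is not None:
            if name in scope.locals:
                return scope.locals[name]
            scope = scope.enclosing
        # Second hierarchy: the class and then its ancestors.
        return self.owner.lookup_member(name)


shape = ClassScope("Shape", {"name": "String field of Shape"})
circle = ClassScope("Circle", {"radius": "double field of Circle"},
                    superclass=shape)

body = MethodScope(owner=circle)
body.locals["radius"] = "double local"     # hides the field
inner_block = MethodScope(owner=circle, enclosing=body)

local_hit = inner_block.lookup("radius")   # "double local"
inherited_hit = inner_block.lookup("name") # "String field of Shape"
```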

Testing and Error Recovery

If a name is used, but the lookup fails to find any definition

Give an error but enter the name with dummy type information so that further uses do not also trigger errors

If a name is defined twice

Give an ambiguity error, choose which type to use in later analysis, usually the first

Testing cases

Include all types of correct declarations

Incorrect cases may include
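
The two recovery strategies for undeclared and doubly declared names could be sketched as follows. The class name CheckedSymbolTable, the "<error>" placeholder type and the message wording are assumptions of this example.

```python
class CheckedSymbolTable:
    """Single-scope table that records errors instead of stopping."""

    ERROR_TYPE = "<error>"   # dummy type for names that were never declared

    def __init__(self):
        self._entries = {}
        self.errors = []

    def declare(self, name, type_name):
        if name in self._entries:
            # Defined twice: report it and keep the first declaration.
            self.errors.append(f"'{name}' is declared more than once")
            return self._entries[name]
        self._entries[name] = type_name
        return type_name

    def use(self, name):
        if name not in self._entries:
            # Undeclared: report once, then enter a dummy entry so that
            # later uses of the same name do not trigger further errors.
            self.errors.append(f"'{name}' is used but not declared")
            self._entries[name] = self.ERROR_TYPE
        return self._entries[name]


checked = CheckedSymbolTable()
checked.declare("total", "int")
checked.declare("total", "float")   # error; "int" is kept
checked.use("count")                # error; dummy entry created
checked.use("count")                # no second error
error_count = len(checked.errors)   # 2
```

The same small table can serve as a starting point for test cases: correct declarations should leave errors empty, while each incorrect case should add exactly one message.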

References

Nancy McCracken's original slides

Linz University Compiler course materials (MicroJava).


Keith Cooper and Linda Torczon, Engineering a Compiler, Elsevier, 2004.
Kenneth C. Louden, Compiler Construction: Principles and Practice, PWS Publishing, 1997.
Per Brinch Hansen, On Pascal Compilers, Prentice-Hall, 1985. Out of print.
