
CSE2002 Theory of Computation and Compiler Design

L T P J C
3 0 0 4 4
Objectives
• Provide the theoretical foundation required for computational models and compiler design
• Discuss Turing machines as an abstract computational model
• Cover compiler algorithms, with a focus on low-level system aspects
Expected Outcome
On successful completion of the course, the student should be able to:
1. Design computational models for formal languages
2. Design scanners and parsers using top-down as well as bottom-up paradigms
3. Design symbol tables and use them for type checking and other semantic checks
4. Implement a language translator
5. Use tools such as lex and YACC to automate parts of the implementation process.

Module | Topics | L Hrs | SLO
1 | Introduction to Languages and Grammars: Overview of a computational model - Languages and grammars - Alphabets - Strings - Operations on languages. Introduction to Compilers - Analysis of the Source Program - Phases of a Compiler | 3 | 1
2 | Regular Expressions and Finite Automata: Finite automata - DFA - NFA - Equivalence of NFA and DFA (With Proof) - Regular expressions - Conversion between RE and FA (With Proof). Lexical Analysis - Recognition of Tokens - Designing a Lexical Analyzer using finite automata | 9 | 9, 6
3 | Myhill-Nerode Theorem - Minimization of FA - Decision properties of regular languages - Pumping lemma for Regular languages (With Proof) | 4 | 5, 9
4 | CFG, PDAs and Turing Machines: CFG - Chomsky Normal Forms - NPDA - DPDA - Membership algorithm for CFG. Syntax Analysis - Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing - LR Parsers | 12 | 1, 6
5 | Turing Machines - Recursive and recursively enumerable languages - Linear bounded automata - Chomsky's hierarchy - Halting problem | 5 | 6, 9
6 | Intermediate Code Generation - Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls | 4 | 11

Proceedings of the 39th Academic Council [17.12.2015] 453


7 | Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis | 4 | 18
8 | Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - Register Allocation and Assignment - A Simple Code Generator - Generating Code from DAG | 3 | 9
9 | Recent Trends - Just-in-time compilation with adaptive optimization for dynamic languages - Parallelizing Compilers | 1 | 9

Project (60 non-contact hours; SLO 9, 18)
# Generally a team project [3 to 4 members]
# Concepts studied in CSE1001/CSE1002/CSE1003 should have been used
# Down to earth application and innovative idea should have been attempted
# Report in digital format with all drawings using a software package to be submitted. [Ex. 1. Design of a traffic light system using sequential circuits OR 2. Design of digital clock]
# Assessment on a continuous basis with a minimum of 3 reviews.

The following is a sample project that may be given to students, to be
implemented in any programming language:

Define a small language that is similar to Stanford's COOL (Classroom Object
Oriented Language). Each project will ultimately result in a working compiler
phase that can interface with the other phases. Students may do the projects
in any programming language and may also integrate some of the tools already
available.

• Develop a lexical analyzer. Tools such as lex or flex for C++ and JLex for Java
may be used.
Input - Set of tokens
Output - Recognition of tokens in the specified language as valid or invalid

• Design and develop a parser (variations may be given). Tools such as
YACC or bison for C++ and CUP for Java may be used; packages for
manipulating trees may also be used to achieve the task.
Input - Text with symbols
Output - Abstract Syntax Tree

• Implement a checker for the static semantics of the language; refer to the
typing rules, identifier scoping rules, and other restrictions of the specified
language.

• Code generator
Input - The constructed AST, after static analysis has been performed
Output - MIPS assembly code
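
The identifier scoping rules mentioned in the static-semantics step above are usually enforced with a scoped symbol table. The following is a minimal sketch in Python, not part of the syllabus: the class name `SymbolTable`, its method names, and the type names `Int` and `String` are illustrative assumptions, and a real project would follow the rules of the language the team defines.

```python
class SymbolTable:
    """Stack of scopes; each scope maps an identifier to its declared type.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.scopes = [{}]  # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards, so inner
        # declarations shadow outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.declare("x", "Int")
table.enter_scope()
table.declare("x", "String")   # shadows the outer x
inner_type = table.lookup("x")
table.exit_scope()
outer_type = table.lookup("x")
```

A type checker would call `lookup` for every identifier use and compare the returned type against what the surrounding expression expects.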

Text/Reference book exercises may also be given as project.



Text Books
1. Introduction to Automata Theory, Languages, and Computation (3rd Edition), John E. Hopcroft,
Rajeev Motwani, Jeffrey D. Ullman, Pearson Education, 2013.
2. Principles of Compiler Design, Alfred V. Aho and Jeffrey D. Ullman, Addison Wesley, 2006.

Reference Books

1. Introduction to Languages and the Theory of Computation, John Martin, McGraw-Hill Higher
Education, 2010.
2. Modern Compiler Implementation in Java, 2nd ed., Andrew W. Appel, Cambridge University Press,
2012.

Theory of Computation and Compiler Design

Knowledge Areas that contain topics and learning outcomes covered in the course

Knowledge Area Total Hours of Coverage

CS: AL(Algorithms and Complexity) / CE: CAO 17

CS: PL(Programming Languages) / CE: CAO 19

CS: DS(Discrete Structures) / CE: DSC 9

Body of Knowledge coverage


[List the Knowledge Units covered in whole or in part in the course. If in part, please indicate which topics
and/or learning outcomes are covered. For those not covered, you might want to indicate whether they are covered
in another course or not covered in your curriculum at all. This section will likely be the most time-consuming to
complete, but is the most valuable for educators planning to adopt the CS2013 guidelines.]

KA | Knowledge Unit | Topics Covered | Hours
CS: AL / CE: ALG | Basic Automata, Computability and Complexity | Introduction to languages and grammars - Chomsky's hierarchy. Finite automata - DFA - NFA - Equivalence of NFA and DFA - Regular expressions - Conversion between RE and FA - Minimization of FA | 8
CS: AL / CE: ALG | Advanced Automata Theory and Computability | CFG - Normal Forms - CNF and GNF - PDA - DPDA - NPDA - Turing Machines - Recursive and recursively enumerable languages | 9
CS: PL / CE: PRF | Language Translation and Execution | Introduction to Compilers - Analysis of the Source Program - Phases of a Compiler - Lexical Analysis - The Role of the Lexical Analyzer - Specification of Tokens - Recognition of Tokens - Finite Automata - From a Regular Expression to an NFA - Design of a Lexical Analyzer | 4
CS: PL / CE: PRF | Syntax Analysis | Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing - LR Parsers - Using Ambiguous Grammars | 6
CS: PL / CE: PRF | Code Generation | Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - A Simple Code Generator | 3
CS: PL / CE: PRF | Advanced Programming Constructs | Register Allocation and Assignment - Generating Code from DAGs - Dynamic Programming Code | 2
CS: PL / CE: PRF | Language Pragmatics | Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls | 4
CS: DS / CE: DSC | Proof Techniques | Decision properties of FAs - Pumping lemma for Regular languages - All Theorems and their proofs | 6
CS: DS / CE: DSC | Graphs and Trees | Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis | 3
Total Hours | | | 45

Where does the course fit in the curriculum?


[In what year do students commonly take the course? Is it compulsory? Does it have pre-requisites, required
following courses? How many students take it?]

This course is a core subject.
• It is suitable from the 4th semester onwards.
• Knowledge of at least one programming language is essential.



What is covered in the course?
[A short description, and/or a concise list of topics - possibly from your course syllabus.(This is likely to be your
longest answer)]
The course introduces the different kinds of computational problems that are to be solved. The
abstract computational models, namely finite automata, pushdown automata and Turing
machines, are taught to the students. Students are expected to design abstract models for
given problems and also to understand the limitations of such models. The course also explains
how a high-level language program is converted into a machine format that the machine can
execute. It gives an overall view of the phases involved in the conversion process, and students
learn to apply the abstract machine models to particular tasks in the compilation process. The
compiler phases of lexical analysis, syntax analysis, code generation and code optimization are
dealt with in detail. An overview of the other phases of compilation is also given. Students are
expected to apply the acquired knowledge to design a language translator.

Part I: Abstract Models of Computation


This part of the course introduces languages and grammars and develops the three
abstract computational models, namely finite automata, pushdown automata and Turing
machines, that generate/accept these languages.
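
As an illustration of the simplest of these models, a deterministic finite automaton can be simulated in a few lines. This is a minimal sketch in Python, not taken from the course material: the class name `DFA`, the state names, and the example language (binary strings with an even number of 1s) are assumptions chosen for illustration.

```python
class DFA:
    """Deterministic finite automaton over single-character symbols.

    Hypothetical helper for illustration only.
    """

    def __init__(self, start, accepting, delta):
        self.start = start          # start state
        self.accepting = accepting  # set of accepting states
        self.delta = delta          # dict: (state, symbol) -> next state

    def accepts(self, text):
        state = self.start
        for symbol in text:
            key = (state, symbol)
            if key not in self.delta:
                # No transition defined: the input is rejected.
                return False
            state = self.delta[key]
        return state in self.accepting


# Example language: binary strings containing an even number of 1s.
EVEN_ONES = DFA(
    start="even",
    accepting={"even"},
    delta={
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
)
```

With this table, `EVEN_ONES.accepts("1010")` is true and `EVEN_ONES.accepts("1")` is false. The same transition-table idea underlies the table-driven scanners used in lexical analysis.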

Part II: Lexical and Syntax Analysis


This part of the course deals with the algorithms and computational models that take a
high-level language program as input and check it for correct syntax.
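
To make the two phases concrete, the sketch below tokenizes an arithmetic expression with a regular expression and then parses it top-down (recursive descent) into a nested-tuple syntax tree. It is a minimal Python illustration under stated assumptions, not the course's prescribed design: the grammar, the token names `NUM` and `OP`, and the function name `parse_expression` are all hypothetical.

```python
import re

# Assumed toy grammar:
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUM | '(' expr ')'
TOKEN_RE = re.compile(r"(?P<NUM>\d+)|(?P<OP>[-+*/()])|(?P<WS>\s+)")


def tokenize(text):
    """Lexical analysis: turn the input text into (kind, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"invalid character {text[pos]!r} at position {pos}")
        if match.lastgroup != "WS":  # whitespace is skipped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Syntax analysis: one method per nonterminal of the grammar."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek()[1] in ("*", "/"):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "NUM":
            return int(value)
        if value == "(":
            node = self.expr()
            if self.advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


def parse_expression(text):
    return Parser(tokenize(text)).parse()
```

For example, `parse_expression("1 + 2 * 3")` returns `("+", 1, ("*", 2, 3))`, which reflects the usual operator precedence. Generators such as lex and YACC produce table-driven equivalents of these two hand-written pieces.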

Part III: Code Generation and Optimization


The algorithms involved in code generation and optimization are explained to students in
this part of the course.
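
A small example may help fix the idea. The sketch below applies one simple optimization, constant folding, to an expression tree and then emits three-address intermediate code. It is a hedged Python illustration, not the method prescribed by the course: the tuple-based tree shape, the function names `fold` and `emit`, and the temporary names `t1`, `t2` are assumptions, and it targets three-address code rather than real machine code.

```python
import operator

# Operators the folding step knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Constant folding: replace operator nodes whose operands are both
    integer constants with the computed constant."""
    if not isinstance(node, tuple):
        return node  # leaf: an int constant or a variable name (str)
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)


def emit(node, code):
    """Append three-address instructions for node to code and return the
    operand (constant, variable, or temporary) that holds its value."""
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    left_name = emit(left, code)
    right_name = emit(right, code)
    temp = f"t{len(code) + 1}"  # one new temporary per instruction
    code.append(f"{temp} = {left_name} {op} {right_name}")
    return temp


tree = ("+", "x", ("*", 2, 3))   # represents x + 2 * 3

unoptimized = []
emit(tree, unoptimized)          # two instructions

optimized = []
emit(fold(tree), optimized)      # one instruction after folding 2 * 3
```

Here `unoptimized` holds `t1 = 2 * 3` followed by `t2 = x + t1`, while `optimized` holds the single instruction `t1 = x + 6`.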

What is the format of the course?


[Is it face to face, online or blended? How many contact hours? Does it have lectures, lab sessions, discussion
classes?]

This course is designed with 150 minutes of in-classroom sessions per week, 30 minutes of
video/reading instructional material per week, and 200 minutes of non-contact time
spent on implementing the course-related project. In general, the course should combine
lectures, in-class discussion, guest lectures, mandatory off-class reading material and
quizzes.

How are students assessed?


[What type, and number, of assignments are students expected to do? (papers, problem sets, programming
projects, etc.). How long do you expect students to spend on completing assessed work?]

• Students are assessed on a combination of group activities, classroom discussion,
assignments, projects, and continuous and final assessment tests.
• A minimum of six assignments shall be given to students in addition to the project. The
assignments may be given in the earlier stage of the course, before the students start the
project.



• Students can earn additional weightage based on a certificate of completion of a related
MOOC.

Session wise plan

Class Hour | Lab Hour | Topic Covered | Levels of mastery | Reference Book | Remarks
3 | | Introduction to Languages, Grammars and Compilers: Overview of a computational model - Languages and grammars - Alphabets - Strings - Operations on languages. Analysis of the Source Program - Phases of a Compiler | Familiarity | T1, T2 | Several applications of automata theory, such as natural language processing and bioinformatics, may be quoted, and compiler design shall be introduced as an application of automata theory that is to be dealt with in detail
1 | | Regular Expressions and Finite Automata: Finite automata - DFA - NFA | Familiarity | T1, R1 | Assignment 1 with exercise problems in text/reference book is to be given
2 | | Design of DFA and NFA - Equivalence of NFA and DFA (With Proof) | Usage | T1, R1 |
3 | | Regular expressions - Conversion between RE and FA (With Proof) | Usage | T1, R1 | Assignment 2 with exercise problems in text/reference book is to be given
3 | | Lexical Analysis - Recognition of Tokens - Designing a Lexical Analyzer using finite automata | Familiarity | T2, R2 |
2 | | Myhill-Nerode Theorem - Minimization of FA | Familiarity | T1, R1 | Assignment 3 with exercise problems in text/reference book is to be given
2 | | Decision properties of regular languages - Pumping lemma for Regular languages (With Proof) | Usage | T1, R1 |
3 | | CFG, PDAs and Turing Machines: CFG - Chomsky Normal Forms | Familiarity | T1, R1 | Assignment 4 with exercise problems in text/reference book is to be given
3 | | NPDA - DPDA - Membership algorithm for CFG | Usage | T1, R1 |
4 | | Syntax Analysis - Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing | Familiarity | T2, R2 | Assignment 5 with exercise problems in text/reference book is to be given
2 | | LR Parsers | Familiarity | T2, R2 |
3 | | Turing Machines - Recursive and recursively enumerable languages | Usage | T1, R1 | Assignment 6 with exercise problems in text/reference book is to be given
2 | | Linear bounded automata - Chomsky's hierarchy - Halting problem | Usage | T1, R1 |
4 | | Intermediate Code Generation - Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls | Familiarity | T2, R2 | These topics can be dealt with in a flipped classroom format. Video lectures may be newly prepared or taken from the web and further discussed in the class
4 | | Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis | Familiarity | T2, R2 |
3 | | Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - Register Allocation and Assignment - A Simple Code Generator - Generating Code from DAG | Familiarity | T2, R2 |
1 | | Recent Trends - Just-in-time compilation with adaptive optimization for dynamic languages - Parallelizing Compilers | Familiarity | T2, R2 |
Total: 45 Hours (3 Credit hours, 15 Weeks schedule)

Approved by the Academic Council on: 17.12.2015

