What Is A Compiler
What is a Compiler?
Features of Compilers
Types of Compiler
Tasks of Compiler
History of Compiler
Steps for Language processing systems
Compiler Construction Tools
Why use a Compiler?
Application of Compilers
Features of Compilers
Correctness
Speed of compilation
Preserve the correct meaning of the code
The speed of the target code
Recognize legal and illegal program constructs
Sarfraz Ali Roll No # 229 BSCS 5th Semester (M)
Good error reporting/handling
Code debugging help
Types of Compiler
Following are the different types of Compiler:
Single Pass Compiler
In a single pass compiler, the source code is directly transformed into machine code. For
example, the Pascal language.
Two Pass Compiler
In a two pass compiler, the source code is processed in two stages: a front end that
produces an intermediate representation and a back end that generates the target code.
The two pass method also simplifies the retargeting process. It also allows multiple
front ends.
Multipass Compilers
The multipass compiler processes the source code or syntax tree of a program several
times. It divides a large program into multiple small programs and processes them,
producing multiple intermediate codes. Each pass takes the output of the previous
pass as its input, so it requires less memory. It is also known as a 'wide
compiler'.
Tasks of Compiler
Main tasks performed by the Compiler are:
Breaks up the source program into pieces and imposes grammatical
structure on them
Allows you to construct the desired target program from the intermediate
representation and also create the symbol table
Compiles source code and detects errors in it
Manage storage of all variables and codes.
Support for separate compilation
Reads and analyzes the entire program, and translates it to a semantically equivalent form
Translating the source code into object code depending upon the type of
machine
History of Compiler
Important Landmark of Compiler’s history are as follows:
The “compiler” word was first used in the early 1950s by Grace Murray
Hopper
Steps for Language processing systems
Linker: The linker helps you to link and merge various object files to create an
executable file. All these files might have been compiled by separate
assemblers. The main task of a linker is to search for the called modules in a
program and to find out the memory locations where all modules are stored.
Loader: The loader is a part of the OS which performs the task of loading
executable files into memory and running them. It also calculates the size of a
program, which determines the additional memory space to allocate.
Compiler Construction Tools
These tools use a specific language or algorithm for specifying and implementing the
components of the compiler. Following are examples of compiler construction tools.
Scanner generators: This tool takes regular expressions as input; for example,
LEX for the Unix operating system.
Syntax-directed translation engines: These software tools produce an
intermediate code by using the parse tree. The goal is to associate one or
more translations with each node of the parse tree.
Application of Compilers
Compiler design helps in the full implementation of high-level programming
languages
Support optimization for Computer Architecture Parallelism
Design of New Memory Hierarchies of Machines
Widely used for Translating Programs
Used with other Software Productivity Tools
Q: Compiler construction?
The process of constructing a compiler is called compiler
construction.
Q: Why do we construct a compiler?
Without a compiler, our machine is not able to understand
source code written in a high-level language.
Example
How Pleasant Is The Weather?
See this lexical analysis example. Here, we can easily recognize that there are five
words: How, Pleasant, Is, The, Weather. This is very natural for us, as we can recognize
the separators, blanks, and the punctuation symbol.
HowPl easantIs Th ewe ather?
Now, check this example: we can also read this. However, it will take some time
because the separators are put in odd places. It is not something which comes to you
immediately.
Basic Terminologies:
Lexical Analyzer Architecture: How tokens are recognized
Roles of the Lexical analyzer
Lexical Errors
Error Recovery in Lexical Analyzer
Lexical Analyzer vs. Parser
Why separate Lexical and Parser?
Advantages of Lexical analysis
Disadvantage of Lexical analysis
Basic Terminologies
What’s a lexeme?
A lexeme is a sequence of characters in the source program that matches the pattern
of a token. It is nothing but an instance of a token.
What’s a token?
Tokens in compiler design are the sequence of characters which represents a unit of
information in the source program.
What is Pattern?
A pattern is a description of the form that the lexemes of a token may take. In the case
of a keyword used as a token, the pattern is simply the sequence of characters that
makes up the keyword.
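To make lexemes, tokens, and patterns concrete, here is a small illustration (added to these notes, not part of the original text) using Python's standard-library tokenize module, which scans Python source the same way a compiler's lexer would:

```python
import io
import tokenize

# Scan a small statement with Python's standard-library scanner.
# Each result pairs a token type (the category) with a lexeme (the exact
# character sequence matched from the source).
source = "total = price + 10"
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type not in (tokenize.NEWLINE, tokenize.ENDMARKER)
]
print(tokens)
# [('NAME', 'total'), ('OP', '='), ('NAME', 'price'), ('OP', '+'), ('NUMBER', '10')]
```

Here NAME, OP, and NUMBER are tokens; "total", "price", "+", and "10" are lexemes; and the pattern for NUMBER is, roughly, "a sequence of digits".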
Lexical analyzer scans the entire source code of the program. It identifies each token
one by one. Scanners are usually implemented to produce tokens only when requested
by a parser. Here is how recognition of tokens in compiler design works-
1. “Get next token” is a command which is sent from the parser to the lexical
analyzer.
2. On receiving this command, the lexical analyzer scans the input until it finds
the next token.
3. It returns the token to Parser.
Lexical Analyzer skips whitespaces and comments while creating these tokens. If any
error is present, then Lexical analyzer will correlate that error with the source file and
line number.
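The "get next token" loop above can be sketched in code. The following is a minimal hand-written scanner (an illustrative sketch added to these notes; the token names and patterns are invented for the example). It skips whitespace and comments, returns one token per request, and reports a lexical error with a position, as described above:

```python
import re

# Token patterns, tried in order; WS and COMMENT are skipped, mirroring
# how the lexical analyzer discards whitespace and comments.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=]"),
    ("WS",      r"\s+"),
    ("COMMENT", r"#[^\n]*"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def get_next_token(source, pos=0):
    """Scan from pos, skip whitespace/comments, return (kind, lexeme, new_pos)."""
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            # No pattern matches: report a lexical error with the position.
            raise SyntaxError(f"lexical error at position {pos}: {source[pos]!r}")
        pos = m.end()
        if m.lastgroup not in ("WS", "COMMENT"):
            return m.lastgroup, m.group(), pos
    return "EOF", "", pos

# The parser's loop: repeatedly issue "get next token" until end of input.
src = "count = count + 1  # increment"
tokens, pos = [], 0
while True:
    kind, lexeme, pos = get_next_token(src, pos)
    if kind == "EOF":
        break
    tokens.append((kind, lexeme))
print(tokens)
# [('ID', 'count'), ('OP', '='), ('ID', 'count'), ('OP', '+'), ('NUMBER', '1')]
```

Note how the whitespace and the comment never reach the token list, exactly as described above.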
Lexical Errors
A character sequence which is not possible to scan into any valid token is a lexical
error. Important facts about the lexical error:
Lexical errors are not very common, but they should be managed by a scanner
Misspellings of identifiers, operators, and keywords are considered lexical errors
Generally, a lexical error is caused by the appearance of some illegal character,
mostly at the beginning of a token.
Summary
Lexical analysis is the very first phase in the compiler designing
Lexemes and Tokens are the sequence of characters that are included in the
source program according to the matching pattern of a token
Lexical analyzer is implemented to scan the entire source code of the program
Lexical analyzer helps to enter identified tokens into the symbol table
A character sequence which is not possible to scan into any valid token is a
lexical error
Removing one character from the remaining input is a useful error recovery
method
Lexical analyser scans the input program while the parser performs syntax analysis
It eases the process of lexical analysis and the syntax analysis by eliminating
unwanted tokens
Lexical analyzer is used by web browsers to format and display a web page
with the help of parsed data from JavaScript, HTML, CSS
The biggest drawback of using a lexical analyzer is the additional
runtime overhead required to generate the lexer tables and construct the
tokens
Lexeme: A lexeme is the lowest level syntactic unit of a language (e.g., total,
start).
Noise words – Noise words are optional words which are inserted in a statement to
enhance the readability of the sentence.
Top-Down Parsing,
Bottom-Up Parsing
Top-Down Parsing:
In top-down parsing, the construction of the parse tree starts at the root and then
proceeds towards the leaves.
1. Predictive Parsing:
A predictive parser can predict which production should be used to replace the specific
input string. It uses a look-ahead pointer, which points to the next
input symbols. Backtracking is not an issue with this parsing technique. It is known as
an LL(1) parser.
2. Recursive Descent Parsing:
This parsing technique recursively parses the input to make a parse tree. It consists of
several small functions, one for each nonterminal in the grammar.
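The idea of one small function per nonterminal can be sketched as follows (an illustration added to these notes; the toy grammar and names are invented). A single look-ahead token decides which production to use, so no backtracking is needed:

```python
# A minimal LL(1) recursive-descent parser for the toy grammar
#   expr -> term rest
#   rest -> '+' term rest | epsilon
#   term -> NUMBER
# There is one function per nonterminal; the look-ahead token alone
# chooses the production to apply.

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens  # list of (kind, lexeme) pairs
        self.pos = 0

    def lookahead(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def match(self, kind):
        if self.lookahead() != kind:
            raise SyntaxError(f"expected {kind}, got {self.lookahead()}")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):                      # expr -> term rest
        left = self.term()
        return self.rest(left)

    def rest(self, left):                # rest -> '+' term rest | epsilon
        if self.lookahead() == "PLUS":   # look-ahead chooses the production
            self.match("PLUS")
            right = self.term()
            return self.rest(("+", left, right))
        return left                      # epsilon-production: nothing to consume

    def term(self):                      # term -> NUMBER
        return int(self.match("NUMBER")[1])

# Parse the token stream for "1 + 2 + 3" into a left-nested tree.
tree = Parser([("NUMBER", "1"), ("PLUS", "+"), ("NUMBER", "2"),
               ("PLUS", "+"), ("NUMBER", "3")]).expr()
print(tree)
# ('+', ('+', 1, 2), 3)
```

Because each decision is made from one token of look-ahead, the parser never has to undo a choice, which is exactly the LL(1) property described above.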
Bottom-Up Parsing:
In bottom-up parsing in compiler design, the construction of the parse tree starts
with the leaves, and then it proceeds towards the root. It is also called shift-reduce
parsing. This type of parser is often created with the help of
software tools.
Panic-Mode Recovery
When the parser encounters an error, this mode ignores the rest of
the statement: it does not process input from the erroneous point up to a delimiter, like a
semi-colon. This is a simple error recovery method.
In this type of recovery method, the parser rejects input symbols one by one
until a single designated group of synchronizing tokens is found. The
synchronizing tokens are generally delimiters, such as a semicolon or a closing brace.
Phrase-Level Recovery:
The compiler corrects the program by inserting or deleting tokens. This allows it to
resume parsing from where it was. It performs correction on the remaining
input: it can replace a prefix of the remaining input with some string, which helps
the parser to continue the process.
Error Productions
Error production recovery expands the grammar of the language so that it
generates the erroneous constructs. The parser then performs error diagnostics
about those constructs.
Global Correction:
The compiler should make as few changes as possible while processing
an incorrect input string. Given an incorrect input string a and a grammar c,
algorithms will search for a parse tree for a related string b, such that the number of
insertions, deletions, and modifications of tokens needed to transform a
into b is as small as possible.
Grammar:
A grammar is a set of structural rules which describe a language. Grammars assign
structure to any sentence. The term also refers to the study of these rules, and this field
includes morphology, phonology, and syntax. A grammar is capable of describing much of the
syntax of programming languages.
Every non-terminal symbol should appear on the left side of at least one production
The goal symbol should never appear on the right side of the ::= of any
production
A rule is recursive if LHS appears in its RHS
Notational Conventions
An optional element may be indicated by enclosing the element in square
brackets, [ … ]. An arbitrary sequence of instances of an element can be indicated
by enclosing the element in braces followed by an asterisk symbol, { … }*.
A choice between alternatives may use the symbol | within a single rule, and it
may be enclosed by parentheses, ( … ), when needed.
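As an invented illustration of these conventions (not from the original text), a few rules might be written as follows:

```
decl     ::= type id [ '=' expr ] ';'       optional part in square brackets [ … ]
arg-list ::= expr { ',' expr }*             zero or more repetitions with { … }*
type     ::= ( 'int' | 'float' | 'char' )   choice of alternatives with | in ( … )
```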
1. Terminals: Terminals are the basic symbols from which strings of the language are formed, such as keywords, operators, and punctuation symbols.
2. Nonterminals: Nonterminals are syntactic variables that denote sets of strings and help define the language generated by the grammar.
Grammar Derivation
Grammar derivation is a sequence of grammar rule which transforms the start symbol
into the string. A derivation proves that the string belongs to the grammar’s language.
Left-most Derivation
When the sentential form of the input is scanned and replaced in left-to-right sequence, it
is known as left-most derivation. The sentential form derived by the left-most
derivation is called the left-sentential form.
Right-most Derivation
The right-most derivation scans and replaces the input with production rules from right to
left. The sentential form derived from the right-most derivation is known as the
right-sentential form.
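The two derivation orders can be illustrated with a small example grammar (added here for illustration):

```
Grammar:   E ::= E + E | id

Left-most derivation of "id + id" (always expand the left-most nonterminal):
   E  =>  E + E  =>  id + E  =>  id + id

Right-most derivation of "id + id" (always expand the right-most nonterminal):
   E  =>  E + E  =>  E + id  =>  id + id
```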
Summary
Syntax analysis is a second phase of the compiler design process that comes
after lexical analysis
The syntactical analyser helps you to apply rules to the code
Sentence, Lexeme, Token, Keywords and reserved words, Noise words,
Comments, Delimiters, Character set, Identifiers are some important terms used
in the Syntax Analysis in Compiler construction
The parser checks that the input string is well-formed and, if not, rejects it
Parsing techniques are divided into two different groups: Top-Down Parsing,
Bottom-Up Parsing
Lexical, syntactical, semantical, and logical errors are some common errors that occur
during the parsing method
A grammar is a set of structural rules which describe a language
Notational conventions symbol may be indicated by enclosing the element in
square brackets
A CFG is a left-recursive grammar if it has at least one production of the form A ::= A x
Grammar derivation is a sequence of grammar rule which transforms the start
symbol into the string
The syntax analyser mainly deals with recursive constructs of the language,
while the lexical analyser eases the task of the syntax analyser
The drawback of the syntax analyser method is that it will never determine whether a
token is valid or not
A compiler should comply with the syntax rules of the programming language in
which the source code is written. However, the compiler is only a program and cannot fix errors
found in that program. So, if you make a mistake, you need to correct the
syntax of your program. Otherwise, it will not compile.
What is Interpreter?
An interpreter is a computer program which converts each high-level program
statement into machine code. This includes source code, pre-compiled code, and
scripts. Both compilers and interpreters do the same job, which is converting a higher-
level programming language to machine code. However, a compiler will convert the
code into machine code (create an exe) before the program runs. Interpreters convert code
into machine code while the program is running.
KEY DIFFERENCE
A compiler transforms code written in a high-level programming language into
machine code, all at once, before the program runs, whereas an interpreter converts
each high-level program statement, one by one, into machine code during the
program run.
Compiled code runs faster while interpreted code runs slower.
Compiler displays all errors after compilation, on the other hand, the Interpreter
displays errors of each line one by one.
Compiler is based on translation linking-loading model, whereas Interpreter is
based on Interpretation Method.
Compiler takes an entire program whereas the Interpreter takes a single line of
code.
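The timing difference above can be loosely demonstrated with Python's built-in compile() and exec() (a rough analogy added for illustration: compile() produces Python bytecode, not real machine code, but the whole-program-first versus statement-by-statement contrast is the same):

```python
program = "x = 2\ny = x * 3"

# Compiler-like: the entire program is translated first...
code_object = compile(program, "<whole-program>", "exec")
compiled_ns = {}
exec(code_object, compiled_ns)          # ...then executed in one go

# Interpreter-like: each statement is translated and run immediately.
interpreted_ns = {}
for statement in program.splitlines():
    exec(statement, interpreted_ns)

print(compiled_ns["y"], interpreted_ns["y"])
# 6 6
```

A syntax error anywhere in the program would be reported by compile() before any statement runs, whereas the statement-by-statement loop would execute the first line before failing on the bad one, mirroring how compilers report all errors up front while interpreters stop at the offending line.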
Role of Compiler
A compiler reads the source code and outputs executable code
It translates software written in a higher-level language into instructions that
the computer can understand. It converts the text that a programmer writes into a
format the CPU can understand.
The process of compilation is relatively complicated. It spends a lot of time
analyzing and processing the program.
The executable result is some form of machine-specific binary code.
Role of Interpreter
The interpreter converts the source code line by line during run time.
The interpreter fully translates a program written in a high-level language into
machine-level language.
Interpreter allows evaluation and modification of the program while it is
executing.
Relatively less time is spent on analyzing and processing the program
Program execution is relatively slow compared to a compiler
HIGH-LEVEL LANGUAGES
High-level languages, like C, C++, JAVA, etc., are very near to English. They make
the programming process easy. However, they must be translated into machine language
before execution. This translation is conducted by either a compiler or
an interpreter. High-level programs are also known as source code.
MACHINE CODE
Machine code is the binary language of instructions that a particular processor can directly understand and execute.
OBJECT CODE
On compilation of source code, the machine code generated for different processors
like Intel, AMD, and ARM is different. To make code portable, the source code is
first converted to object code. It is an intermediary code (similar to machine code)
that no processor will directly execute. At run time, the object code is converted to the
machine code of the underlying platform.