
LEXICAL ANALYSIS AND PARSING

LEXICAL ANALYSIS

• Lexical analysis, also known as scanning, is the first phase of the compilation process. It
converts the high-level source program into a sequence of tokens, which is then passed to the
parser for syntax analysis.
• A token is a category of character sequences that have the same meaning in the source program,
such as identifiers (variable names, function names), operators (+, ++), keywords (for, while, if),
numbers, and delimiters.
• Examples of non-tokens are comments, preprocessor directives, macros, blanks, tabs, and
newlines.
• Lexical analysis is carried out by a lexical analyser (scanner). The lexical analyser breaks the
source text into a series of tokens and discards any whitespace and comments.
• It detects errors with the help of a finite automaton built from the lexical rules of the source
language (for example C or C++), and reports the row and column number of each error.
• It reads the character stream of the source code, checks for legal tokens, and passes each token
to the syntax analyser when the syntax analyser requests it.
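
The points above can be illustrated with a minimal scanner sketch in Python. The token classes, regular expressions, and the names TOKEN_SPEC, KEYWORDS, and tokenize are assumptions made for this example (a tiny C-like subset), not part of any particular compiler. The generator hands out one token at a time, which mirrors the scanner supplying tokens only when the parser asks for them.

```python
import re

# Hypothetical token classes for a tiny C-like language (assumed for this sketch).
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*|/\*.*?\*/"),   # non-token: discarded
    ("WHITESPACE", r"\s+"),                  # non-token: discarded
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"\+\+|--|==|[+\-*/=<>]"),
    ("DELIMITER",  r"[(){};,]"),
]
KEYWORDS = {"for", "while", "if"}

# One combined pattern; each alternative is a named group.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC), re.DOTALL)

def tokenize(source):
    """Yield (kind, lexeme) pairs, skipping whitespace and comments."""
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("WHITESPACE", "COMMENT"):
            continue
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # keywords are identifiers that appear in a reserved list
        yield kind, lexeme

if __name__ == "__main__":
    for token in tokenize("while (i < 10) i++; // loop"):
        print(token)
```

For the input shown, the comment and all blanks are dropped, "while" is classified as a keyword, and "++" is recognised as a single operator rather than two "+" tokens.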
LEXICAL ERRORS

• A lexical error is a character sequence that cannot be scanned into any valid token. When the
analyser encounters such a sequence, it reports an error.
• Lexical errors are not very common, but the scanner must still handle them.
• Misspelled tokens are also considered lexical errors.
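
A small sketch of how a scanner might report a lexical error with its row and column follows. The set of legal tokens (LEGAL), the LexicalError class, and the scan function are hypothetical names chosen for this example; a real language would have a much larger set of patterns.

```python
import re

# Hypothetical set of legal tokens: integers, identifiers, and a few symbols.
LEGAL = re.compile(r"\d+|[A-Za-z_]\w*|[+\-*/=();]")

class LexicalError(Exception):
    """Raised when no legal token starts at the current position."""
    def __init__(self, char, line, column):
        super().__init__(f"illegal character {char!r} at line {line}, column {column}")
        self.char, self.line, self.column = char, line, column

def scan(source):
    """Return the list of lexemes, or raise LexicalError at the first bad character."""
    lexemes = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        col = 0
        while col < len(text):
            if text[col].isspace():
                col += 1
                continue
            m = LEGAL.match(text, col)
            if m is None:
                # Columns are reported 1-based, as editors usually show them.
                raise LexicalError(text[col], line_no, col + 1)
            lexemes.append(m.group())
            col = m.end()
    return lexemes

if __name__ == "__main__":
    try:
        scan("x = 1;\ny = x @ 2;")
    except LexicalError as err:
        print(err)
```

Here the character "@" does not begin any legal token in the assumed language, so the scan stops and reports line 2, column 7.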
PARSING

• Parsing, or syntax analysis, is the second phase, carried out after lexical analysis. It checks
the syntactic structure of the input, i.e. whether the input conforms to the syntax of the language.
• A by-product of this process is typically a tree (parse tree) that represents the structure of the
program.
• The parse tree is constructed from the pre-defined grammar of the language and the input
string. If the input string can be derived from the grammar, it is syntactically correct; if not,
the syntax analyser reports an error.
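
As an illustration of checking an input against a grammar and building a tree, here is a minimal hand-written parser sketch. The grammar (expr -> term ('+' term)*, term -> NUMBER | '(' expr ')'), the nested-tuple tree shape, and the function name parse are assumptions for this example; the tree is a simplified form of a parse tree.

```python
def parse(tokens):
    """Return a tree (nested tuples) for a list of token strings, or raise SyntaxError.

    Assumed grammar:  expr -> term ('+' term)*
                      term -> NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} but found {peek()!r} at token {pos}")
        pos += 1

    def expr():
        node = term()
        while peek() == "+":
            expect("+")
            node = ("expr", node, "+", term())
        return node

    def term():
        nonlocal pos
        tok = peek()
        if tok is not None and tok.isdigit():
            pos += 1
            return ("num", tok)
        if tok == "(":
            expect("(")
            inner = expr()
            expect(")")
            return ("group", inner)
        raise SyntaxError(f"unexpected token {tok!r} at token {pos}")

    tree = expr()
    if peek() is not None:  # leftover input means the string is not in the language
        raise SyntaxError(f"unexpected token {peek()!r} at token {pos}")
    return tree

if __name__ == "__main__":
    print(parse(["1", "+", "(", "2", "+", "3", ")"]))
```

An input such as ["1", "+"] cannot be derived from the assumed grammar, so parse raises SyntaxError instead of returning a tree.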
TYPES OF PARSER

• Top-down parsing: The parser starts building the parse tree from the start symbol and tries to
transform the start symbol into the input.
• Bottom-up parsing: The parser starts with the input symbols and builds the parse tree upwards
until it reaches the start symbol.
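
The bottom-up direction can be sketched with a tiny shift-reduce recogniser. The grammar E -> E + n | n and the function name shift_reduce are assumptions for this example, and the reduction checks are hand-coded for this one grammar rather than driven by a generated parsing table as in a production parser.

```python
def shift_reduce(tokens):
    """Bottom-up recognition for the assumed grammar  E -> E + n | n.

    Returns (accepted, trace), where trace lists the shift and reduce steps taken.
    """
    stack, trace = [], []
    rest = list(tokens)
    while True:
        if stack[-3:] == ["E", "+", "n"]:
            stack[-3:] = ["E"]                 # reduce by E -> E + n
            trace.append("reduce E -> E + n")
        elif stack == ["n"]:
            stack[:] = ["E"]                   # reduce by E -> n
            trace.append("reduce E -> n")
        elif rest:
            stack.append(rest.pop(0))          # shift the next input symbol
            trace.append(f"shift {stack[-1]}")
        else:
            break                              # nothing left to shift or reduce
    # The input is accepted only if everything was reduced to the start symbol.
    return stack == ["E"], trace

if __name__ == "__main__":
    accepted, steps = shift_reduce(["n", "+", "n"])
    print(accepted)
    for step in steps:
        print(step)
```

The trace for "n + n" begins with the input symbols and ends with the start symbol E alone on the stack. A top-down parser for the same grammar would work in the opposite direction, starting from E and expanding it until it matches the input.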
DIFFERENCES BETWEEN A LEXICAL ANALYSER AND A PARSER
Lexical analyser                                | Parser
It is the first phase of the compilation process | It is the second phase of the compilation process
Scans the input program                         | Performs syntax analysis
Identifies tokens                               | Creates an abstract representation of the code
Inserts tokens into the symbol table            | Updates symbol table entries
Generates lexical errors                        | Generates a parse tree of the source code
