The Role of the Lexical Analyzer


The Role of The Lexical Analyzer

[Figure: source program -> lexical analyzer -> (token) -> parser. The parser asks for each token with "get next token"; a symbol table also appears in the diagram.]

Token: a class of lexical units, for example a reserved word.

Pattern: the rule that describes the set of strings that can represent a token.

Lexeme: a sequence of characters in the source program that is matched by the pattern for a token.
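The three terms can be seen side by side in a small sketch. The token names and patterns below are illustrative assumptions, not taken from the slides: each token class is paired with a pattern, and the lexeme is whatever substring of the source that pattern matches.

```python
import re

# Hypothetical token classes, each paired with an illustrative pattern.
TOKEN_PATTERNS = [
    ("IF", r"if\b"),                   # a reserved word
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),   # identifiers
    ("NUM", r"[0-9]+"),                # unsigned integers
    ("RELOP", r"<=|>=|<>|<|>|="),      # relational operators
]

def tokenize(source):
    """Return (token, lexeme) pairs for the source text, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for token, pattern in TOKEN_PATTERNS:
            match = re.compile(pattern).match(source, pos)
            if match:
                # match.group() is the lexeme: the characters the pattern matched.
                pairs.append((token, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no pattern matches at position {pos}")
    return pairs
```

For the input "if count <= 10", the lexeme "count" is matched by the ID pattern and so is reported as an ID token, while "if" is caught first by the reserved-word pattern.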

Input Buffering

Sentinels

Regular Expression:

Tokens are built from symbols of a finite vocabulary. We use regular expressions to define the structure of tokens.

Regular Expressions
The sets of strings defined by regular expressions are termed regular sets.

Definition of regular expressions:
∅ is a regular expression denoting the empty set.
A string s is a regular expression denoting a set containing only s.
If A and B are regular expressions, so are

A | B (alternation)
AB (concatenation)
A* (Kleene closure)
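As a rough illustration of the three operators, the sketch below builds them from two simple expressions using Python's re module. The strings chosen for A and B and the helper in_set are assumptions made for this example only.

```python
import re

A = "ab"   # the string "ab" as a regular expression: denotes {"ab"}
B = "c"    # denotes {"c"}

alternation = f"(?:{A})|(?:{B})"     # A | B
concatenation = f"(?:{A})(?:{B})"    # AB
closure = f"(?:{A})*"                # A*  (zero or more copies of A)

def in_set(regex, s):
    """True if the whole string s belongs to the regular set denoted by regex."""
    return re.fullmatch(regex, s) is not None
```

Here in_set(closure, "") is true because the Kleene closure always includes the empty string, while in_set(closure, "aba") is false because "aba" is not a whole number of copies of "ab".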

Regular Expressions (Contd)


Some examples
Let
D = (0 | 1 | 2 | 3 | 4 | ... | 9)
L = (A | B | ... | Z)

decimal = D+ . D+
ident = L (L | D)* (_ (L | D)+)*
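The ident definition can be checked mechanically by transcribing it into a Python regular expression. This is a minimal sketch; the names D, L, IDENT and is_ident are introduced here for illustration. Read literally, the definition requires an identifier to start with a letter, never end with an underscore, and never contain two underscores in a row.

```python
import re

D = "[0-9]"    # D = (0 | 1 | ... | 9)
L = "[A-Z]"    # L = (A | B | ... | Z)

# ident = L (L | D)* (_ (L | D)+)*
IDENT = re.compile(f"{L}(?:{L}|{D})*(?:_(?:{L}|{D})+)*")

def is_ident(s):
    """True if the whole string s matches the ident definition."""
    return IDENT.fullmatch(s) is not None
```

For instance, is_ident("MAX_SIZE_2") holds, whereas "A__B", "A_" and "1A" are all rejected.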

Some more examples


Identifiers:

Real Numbers:

Recognition of Tokens

A transition diagram

This machine accepts abccabc but rejects abcab. The language it accepts is described by (abc+)+.
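The diagram itself is not reproduced in the text, so the following is only a sketch of one deterministic machine for (abc+)+. The state numbering is an assumption: state 0 is the start state and state 3 is the single accepting state.

```python
ACCEPTING = {3}

# (state, input character) -> next state; a missing entry means "reject".
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 3,
    (3, "c"): 3,   # further c's extend the current "abc+" group
    (3, "a"): 1,   # an a starts the next "abc+" group
}

def accepts(s):
    """Run the machine over s and report whether it ends in an accepting state."""
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING
```

On abcab the machine stops in state 2, which is not accepting, so the string is rejected; on abccabc it finishes in state 3.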

Transition Diagrams:
A transition diagram depicts the actions that take place when the lexical analyzer is called by the parser.

* marks a state at which input retraction must take place
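Input retraction can be sketched with a scanner for relational operators. The operator set and the names scan_relop, char_at and forward are assumptions for this example. After reading "<" or ">", the scanner must look at one more character; if that character does not extend the operator, the forward position is moved back by one so the character is read again as part of the next token.

```python
EOF = ""  # hypothetical end-of-input marker used by this sketch

def char_at(text, i):
    """Character at index i, or EOF once the input is exhausted."""
    return text[i] if i < len(text) else EOF

def scan_relop(text, start):
    """Return (lexeme, next_position) for an operator at text[start], else None."""
    forward = start
    first = char_at(text, forward)
    forward += 1
    if first not in ("<", ">", "="):
        return None
    if first == "=":
        return "=", forward
    second = char_at(text, forward)
    forward += 1                        # the lookahead character has been consumed
    if second == "=" or (first == "<" and second == ">"):
        return first + second, forward  # two-character operator, nothing to retract
    forward -= 1                        # retraction: give the lookahead character back
    return first, forward
```

Scanning "<5" from position 0 yields the lexeme "<" with the next position at 1, so the 5 is still available for the following token.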

Implementing a Transition Diagram

Lex Specification
A lex program consists of three parts:

declarations
%%
translation rules
%%
auxiliary procedures

Finite Automata
A finite automaton is a generalized transition diagram.
It is used to recognize regular sets.
It can be deterministic [only one transition out of a state for a given input] or nondeterministic [more than one transition out of a state for a given input].

Nondeterministic Finite Automata


Recognizes the language of the regular expression (a|b)*abb

Transition table for the finite automaton of the previous figure
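Neither the figure nor its table survives in the text, so the sketch below assumes the usual four-state NFA for (a|b)*abb, with state 0 as the start state and state 3 as the accepting state. The transition table is stored as a dictionary from state to input symbol to a set of next states, and the simulation tracks every state the NFA could be in.

```python
# Assumed transition table: state -> {input symbol -> set of next states}.
NFA = {
    0: {"a": {0, 1}, "b": {0}},   # on a, state 0 has two choices (nondeterminism)
    1: {"b": {2}},
    2: {"b": {3}},
    3: {},
}
START = 0
ACCEPT = {3}

def nfa_accepts(s):
    """Simulate the NFA by following all possible states at once."""
    current = {START}
    for ch in s:
        current = {t for state in current for t in NFA[state].get(ch, set())}
    return bool(current & ACCEPT)
```

The entry for state 0 on input a holds two states, which is what makes this automaton nondeterministic.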
