
Compiler Construction

LAB-4
A scanner needs to compute as many attributes as are necessary to allow further processing.
Since the scanner may have to compute several attributes for each token, it is often
helpful to collect all the attributes into a single structured data type, which we could call a token
record. Such a record could be declared in C as

typedef struct {
    TokenType tokenval;
    char *stringval;
    int numval;
} TokenRecord;

A common arrangement is for the scanner to return the token value only and place the other
attributes in global variables, where they can be accessed by other parts of the compiler.

Although the task of the scanner is to convert the entire source program into a sequence of tokens,
the scanner will rarely do this all at once. Instead, the scanner will operate under the control of
the parser, returning the single next token from the input on demand. So the scanner will be declared
as a function such as

TokenType getToken(void);
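
The arrangement described above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the token set (ID, NUM, ENDFILE, UNKNOWN), the globals tokenString and tokenNumber, and the setInput helper are hypothetical names, and the input is an in-memory string rather than a file.

```cpp
#include <cctype>
#include <string>

// Hypothetical token set, for illustration only.
enum TokenType { ID, NUM, ENDFILE, UNKNOWN };

// Attributes of the most recent token live in globals, as described above.
std::string tokenString; // lexeme of the last token
int tokenNumber = 0;     // numeric value when the last token was NUM

// Assumed in-memory source text and current read position.
static std::string source;
static std::size_t pos = 0;

void setInput(const std::string &text)
{
    source = text;
    pos = 0;
}

// Returns only the token type; the attributes are left in the globals.
TokenType getToken()
{
    while (pos < source.size() && std::isspace(static_cast<unsigned char>(source[pos])))
        ++pos;
    if (pos >= source.size())
        return ENDFILE;

    tokenString.clear();
    unsigned char c = static_cast<unsigned char>(source[pos]);
    if (std::isdigit(c))
    {
        tokenNumber = 0;
        while (pos < source.size() && std::isdigit(static_cast<unsigned char>(source[pos])))
        {
            tokenNumber = tokenNumber * 10 + (source[pos] - '0');
            tokenString += source[pos++];
        }
        return NUM;
    }
    if (std::isalpha(c))
    {
        while (pos < source.size() && std::isalnum(static_cast<unsigned char>(source[pos])))
            tokenString += source[pos++];
        return ID;
    }
    tokenString += source[pos++]; // any other character is reported as UNKNOWN
    return UNKNOWN;
}
```

A parser would call getToken() each time it needs the next token and read tokenString or tokenNumber immediately afterwards, before the next call overwrites them.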

Regular expressions: Regular expressions represent patterns of strings of characters. The patterns
recognized by a scanner are defined by regular expressions.

Reserved words and identifiers: Reserved words are the simplest to write as regular expressions:
they are represented by their fixed sequences of characters. If we wanted to collect the reserved
words into one definition, we could write something like

reserved = if | while | do | ...

Identifiers: Identifiers are strings of characters that are not fixed. Typically, an identifier must
begin with a letter and contain only letters and digits. We can express this in terms of regular
definitions as

letter = [a-zA-Z]
digit = [0-9]
identifier = letter(letter|digit)*
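
As a rough check of these definitions, they can be tried out with the C++ standard library's std::regex. The sketch below is an assumption-laden illustration, not how a scanner is normally built: the function names isReserved and isIdentifier are hypothetical, and the reserved-word pattern contains only the three words actually named in the text.

```cpp
#include <regex>
#include <string>

// reserved = if | while | do | ...  (only the three words named in the text)
bool isReserved(const std::string &text)
{
    static const std::regex reserved("if|while|do");
    return std::regex_match(text, reserved);
}

// letter = [a-zA-Z], digit = [0-9], identifier = letter(letter|digit)*
bool isIdentifier(const std::string &text)
{
    static const std::string letter = "[a-zA-Z]";
    static const std::string digit = "[0-9]";
    static const std::regex identifier(letter + "(" + letter + "|" + digit + ")*");
    // A reserved word also fits the identifier pattern, so it is excluded explicitly.
    return std::regex_match(text, identifier) && !isReserved(text);
}
```

Here isIdentifier("count1") holds, while isIdentifier("1count") and isIdentifier("while") do not.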

Numbers: Numbers can be just sequences of digits (natural numbers), or decimal numbers, or
numbers with an exponent. We can write regular definitions for these numbers as follows:

nat = [0-9]+
signedNat = (+|-)? nat
number = signedNat ("." nat)? (E signedNat)?
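
The three definitions compose directly, which can be seen by building the pattern from its named parts. The following is a minimal sketch using std::regex; the function name isNumber is hypothetical, and the pattern accepts only an upper-case E because that is what the definition above states.

```cpp
#include <regex>
#include <string>

// nat = [0-9]+
// signedNat = (+|-)? nat
// number = signedNat ("." nat)? (E signedNat)?
bool isNumber(const std::string &text)
{
    static const std::string nat = "[0-9]+";
    static const std::string signedNat = "(\\+|-)?" + nat;
    static const std::regex number(signedNat + "(\\." + nat + ")?(E" + signedNat + ")?");
    return std::regex_match(text, number);
}
```

Under these definitions "42", "-3.14" and "1E-5" are numbers, while "1." and ".5" are not, because the fractional part and the leading digits both require at least one digit.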

Finite automata: Finite automata, or finite-state machines, are a mathematical way of describing
particular kinds of algorithms. In particular, finite automata can be used to describe the process of
recognizing patterns in input strings, and so can be used to construct scanners.

Finite automata can be described using transition diagrams; the following example illustrates this.

The transition diagram makes it easy to visualize the scanner algorithm, and code can be written
easily by hand if the scanner is to process a simple language. Consider the following operators:
< , <= , <> , > , >= , =

We can represent them as a transition diagram. Then we can write code using the transition
diagram.
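
One possible hand-coded scanner for these six operators is sketched below. Each case of the switch corresponds to leaving the start state on one character, and each nested test to a second transition. The enum RelOp, its member names, and the function scanRelOp are hypothetical, and the input is assumed to be an in-memory string.

```cpp
#include <string>

// Hypothetical token names for the six operators; NONE means no operator here.
enum RelOp { LT, LE, NE, GT, GE, EQ, NONE };

// Scans one relational operator starting at text[pos].
// On success pos is advanced past the operator; otherwise pos is unchanged.
RelOp scanRelOp(const std::string &text, std::size_t &pos)
{
    if (pos >= text.size())
        return NONE;
    char c = text[pos];
    char next = (pos + 1 < text.size()) ? text[pos + 1] : '\0';
    switch (c)
    {
    case '<':
        ++pos;
        if (next == '=') { ++pos; return LE; } // <=
        if (next == '>') { ++pos; return NE; } // <>
        return LT;                             // < followed by anything else
    case '>':
        ++pos;
        if (next == '=') { ++pos; return GE; } // >=
        return GT;
    case '=':
        ++pos;
        return EQ;
    default:
        return NONE;
    }
}
```

Note that the character following a lone < or > is only inspected, not consumed, so it remains available for the next token.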

Transition Table: In the above code example, the finite automaton has been hardwired right into
the code. It is also possible to express the DFA as a data structure and then write "generic" code
that will take its actions from the data structure.

A simple data structure that is adequate for this purpose is a transition table: a two-dimensional
array, indexed by state and input character, that expresses the values of the transition function T.

Consider the DFA for identifier:

The DFA can be represented by the following transition table.


         Input char
State    Letter    Digit    Other    Accept
1        2         Error    Error    No
2        2         2        3        No
3        -         -        -        Yes

Then the generic code can be expressed as:

state = 1;
ch = next input character;
while not Accept[state] and not error(state) do
    newstate = T[state, ch];
    if Advance[state, ch] then ch = next input char;
    state = newstate;
end while;
if Accept[state] then accept;
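
The generic loop can be written in C++ for the identifier DFA roughly as follows. This is a sketch under assumptions: state 0 is introduced here as an explicit error state, the Advance table is assumed to consume letters and digits but not the "other" character that ends the identifier, and the names classify and matchIdentifier are hypothetical.

```cpp
#include <cctype>
#include <string>

enum CharClass { LETTER, DIGIT, OTHER };

const int ERROR_STATE = 0; // assumed encoding of the "Error" table entries

// T[state][class]; row 0 is the error state, rows 1..3 are the DFA states.
const int T[4][3] = {
    {0, 0, 0}, // error state
    {2, 0, 0}, // state 1: a letter starts an identifier
    {2, 2, 3}, // state 2: letters and digits continue it, anything else ends it
    {0, 0, 0}, // state 3: accepting, never left
};

// Advance[state][class]: whether the transition consumes the character.
const bool Advance[4][3] = {
    {false, false, false},
    {true, false, false},
    {true, true, false}, // the terminating character is not consumed
    {false, false, false},
};

const bool Accept[4] = {false, false, false, true};

CharClass classify(char ch)
{
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return LETTER;
    if (std::isdigit(c)) return DIGIT;
    return OTHER; // includes the '\0' used below to mark end of input
}

// Returns the length of the identifier at the start of text, or 0 if there is none.
std::size_t matchIdentifier(const std::string &text)
{
    std::size_t pos = 0;
    auto current = [&]() -> char { return pos < text.size() ? text[pos] : '\0'; };

    int state = 1;
    char ch = current();
    while (!Accept[state] && state != ERROR_STATE)
    {
        CharClass cls = classify(ch);
        int newState = T[state][cls];
        if (Advance[state][cls])
        {
            ++pos;
            ch = current();
        }
        state = newState;
    }
    return Accept[state] ? pos : 0;
}
```

Only the three tables describe the identifier language; the loop itself would stay the same for a different DFA.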

Lexical Analyzer in C++.


Now that you have studied the basic theory needed to code a finite automaton, use the
suggested style of coding to write code that recognizes the following keywords and
language constructs.

Keywords:

if, do, for, while, begin, end, switch, else, break

All Special Characters in C++:

;  ,  [  ]  (  )  {  }

Operators:
The operators are given in the table below, from highest precedence at the top to lowest at the bottom.

Operators                        Meaning                                                Associativity
* / %                            multiply, divide, mod                                  Left to right
+ -                              add, subtract                                          Left to right
<< >>                            shift left, shift right                                Left to right
< <= > >=                        less than, less or equal, greater, greater or equal    Left to right
== !=                            equal, not equal                                       Left to right
&&                               logical and                                            Left to right
||                               logical or                                             Left to right
= += -= *= /= %= >>= <<=         assignments                                            Right to left

Task 1: Draw the transition diagrams.

Task 2: Draw the transition table.

Task 3: Write the code.

Task 4: Make a symbol table for identifiers with the following functions:
entry *Search(string) and entry *make_entry(string).
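
One way Task 4 might be approached is sketched below. The handout fixes only the two function names and their string argument; everything else is an assumption made for illustration, including the fields of entry, the SymbolTable class, and the choice of std::unordered_map as the underlying container.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical contents of a symbol-table entry.
struct entry
{
    std::string name; // identifier lexeme
    int index;        // order in which the identifier was first seen
};

class SymbolTable
{
public:
    // Returns the entry for name, or nullptr if it has not been entered yet.
    entry *Search(const std::string &name)
    {
        auto it = table.find(name);
        return it == table.end() ? nullptr : &it->second;
    }

    // Returns the existing entry for name, creating it first if necessary.
    entry *make_entry(const std::string &name)
    {
        auto result = table.emplace(name, entry{name, static_cast<int>(table.size())});
        return &result.first->second;
    }

private:
    std::unordered_map<std::string, entry> table;
};
```

The scanner would call make_entry each time it recognizes an identifier, so repeated occurrences of the same name share one entry. Pointers returned here stay valid as the table grows, because std::unordered_map does not move its elements on rehash.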

#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

fstream AMINA; // source file being scanned, shared by main() and lexical()

// A_unknown was added so that an unrecognized character has a defined result.
enum tokentype { A_PLUS_PLUS, A_EQUALSTO, A_plus, A_MINUS_MINUS, A_MINUS_EQUALS,
    A_MINUS, A_MULTIPLY_EQUALS, A_MULTIPLY, A_DIVIDE_EQUALS, A_DIVIDE,
    A_MOD_EQUALS, A_MOD, A_EQUALSTO_EQUALSTO, A_EQUAL, A_NOT_EQUALS, A_NOT_EQUALS_TO,
    A_AND_AND, A_AND, A_OR_OR, A_OR, A_right_opening_bracket,
    A_left_opening_bracket, A_right_round_bracket, A_left_round_bracket, A_comma_op,
    A_semicolon, A_GREATER_OR_EQUALTO, A_input, A_GREATER_THAN,
    A_LESS_OR_EQUALTO, A_output, A_LESS_THAN, A_num, A_if, A_elseif, A_int, A_while, A_do,
    A_for, A_return, A_float, A_double, A_string, A_variable, A_unknown };

// Token record: the token type together with its attributes.
struct AMINA_STRUCTURE
{
    tokentype t = A_unknown;
    string identifier;   // lexeme of an identifier
    string name = "";    // printable token name
    int val = 0;         // value of a numeric literal
};

// Builds a token record with the given type and printable name.
AMINA_STRUCTURE make_token(tokentype t, const string &name)
{
    AMINA_STRUCTURE A1;
    A1.t = t;
    A1.name = name;
    return A1;
}

// Consumes the next input character only if it equals expected.
// Using peek() leaves any other character in the stream for the next token.
bool next_is(char expected)
{
    if (AMINA.peek() == expected)
    {
        AMINA.get();
        return true;
    }
    return false;
}

// Returns the token that starts with character, which main() has already read.
AMINA_STRUCTURE lexical(char character)
{
    // Operators: look one character ahead to choose the longest match.
    if (character == '+')
    {
        if (next_is('+')) return make_token(A_PLUS_PLUS, "PLUS_PLUS");
        if (next_is('=')) return make_token(A_EQUALSTO, "A_EQUALSTO"); // "+="
        return make_token(A_plus, "A_plus");
    }
    else if (character == '-')
    {
        if (next_is('-')) return make_token(A_MINUS_MINUS, "A_MINUS_MINUS");
        if (next_is('=')) return make_token(A_MINUS_EQUALS, "A_MINUS_EQUALS");
        return make_token(A_MINUS, "A_MINUS");
    }
    else if (character == '*')
    {
        if (next_is('=')) return make_token(A_MULTIPLY_EQUALS, "A_MULTIPLY_EQUALS");
        return make_token(A_MULTIPLY, "A_MULTIPLY");
    }
    else if (character == '/')
    {
        if (next_is('=')) return make_token(A_DIVIDE_EQUALS, "A_DIVIDE_EQUALS");
        return make_token(A_DIVIDE, "A_DIVIDE");
    }
    else if (character == '%')
    {
        if (next_is('=')) return make_token(A_MOD_EQUALS, "A_MOD_EQUALS");
        return make_token(A_MOD, "A_MOD");
    }
    else if (character == '=')
    {
        if (next_is('=')) return make_token(A_EQUALSTO_EQUALSTO, "A_EQUALSTO_EQUALSTO");
        return make_token(A_EQUAL, "A_EQUAL");
    }
    else if (character == '!')
    {
        if (next_is('=')) return make_token(A_NOT_EQUALS, "A_NOT_EQUALS");
        return make_token(A_NOT_EQUALS_TO, "A_NOT_EQUALS_TO"); // "!" on its own
    }
    else if (character == '&')
    {
        if (next_is('&')) return make_token(A_AND_AND, "A_AND_AND");
        return make_token(A_AND, "A_AND");
    }
    else if (character == '|')
    {
        if (next_is('|')) return make_token(A_OR_OR, "A_OR_OR");
        return make_token(A_OR, "A_OR");
    }
    // Special characters.
    else if (character == '{')
        return make_token(A_right_opening_bracket, "A_right_opening_bracket");
    else if (character == '}')
        return make_token(A_left_opening_bracket, "A_left_opening_bracket");
    else if (character == '(')
        return make_token(A_right_round_bracket, "A_right_round_bracket");
    else if (character == ')')
        return make_token(A_left_round_bracket, "A_left_round_bracket");
    else if (character == ',')
        return make_token(A_comma_op, "A_comma_op");
    else if (character == ';')
        return make_token(A_semicolon, "A_semicolon");
    else if (character == '>')
    {
        if (next_is('=')) return make_token(A_GREATER_OR_EQUALTO, "A_GREATER_OR_EQUALTO");
        if (next_is('>')) return make_token(A_input, "A_input"); // ">>"
        return make_token(A_GREATER_THAN, "A_GREATER_THAN");
    }
    else if (character == '<')
    {
        if (next_is('=')) return make_token(A_LESS_OR_EQUALTO, "A_LESS_OR_EQUALTO");
        if (next_is('<')) return make_token(A_output, "A_output"); // "<<"
        return make_token(A_LESS_THAN, "A_LESS_THAN");
    }
    // Numbers: a sequence of digits.
    else if (isdigit(static_cast<unsigned char>(character)))
    {
        string s(1, character);
        while (isdigit(AMINA.peek()))
            s += static_cast<char>(AMINA.get());
        AMINA_STRUCTURE A1 = make_token(A_num, "IS A NUMBER");
        A1.val = stoi(s); // assumes the literal fits in an int
        return A1;
    }
    // Keywords and identifiers: a letter followed by letters and digits.
    else if (isalpha(static_cast<unsigned char>(character)))
    {
        string s(1, character);
        while (isalnum(AMINA.peek()))
            s += static_cast<char>(AMINA.get());

        struct keyword { const char *text; tokentype t; const char *name; };
        static const keyword keywords[] = {
            {"if", A_if, "..IF.."},
            // NOTE: a lexeme never contains a space, so "else if" cannot match
            // as written; "else" is currently scanned as an identifier.
            {"else if", A_elseif, "..ELSE_IF.."},
            {"int", A_int, "..INT.."},
            {"while", A_while, "..WHILE.."},
            {"do", A_do, "..DO.."},
            {"return", A_return, "..RETURN.."},
            {"for", A_for, "..FOR.."},
            {"float", A_float, "..FLOAT.."},
            {"double", A_double, "..DOUBLE.."},
            {"string", A_string, "..STRING.."},
        };
        for (const keyword &k : keywords)
        {
            if (s == k.text)
                return make_token(k.t, k.name);
        }
        AMINA_STRUCTURE A1 = make_token(A_variable, "..variable..");
        A1.identifier = s;
        return A1;
    }
    // Any other character is not part of the language handled here.
    return make_token(A_unknown, "UNKNOWN");
}

int main()
{
    AMINA.open("C:\\Users\\amina\\OneDrive\\Desktop\\AMINA.txt", ios::in);
    if (!AMINA)
    {
        cout << "file does not exist";
    }
    else
    {
        char character;
        while (AMINA.get(character))
        {
            // Skip white space between tokens.
            if (character == ' ' || character == '\n' || character == '\t')
            {
                continue;
            }
            // Skip a line comment up to the end of the line (or end of file).
            if (character == '/' && AMINA.peek() == '/')
            {
                while (AMINA.get(character) && character != '\n')
                {
                }
                continue;
            }
            // Scan one token and print its name.
            AMINA_STRUCTURE token = lexical(character);
            cout << token.name << endl;
        }
    }
    system("pause"); // Windows-only; keeps the console window open
    return 0;
}
