MP (Mini Pascal) Specification

Version 1.2

1. Introduction


MP (Mini Pascal) is a language which consists of a subset of Pascal plus some C++ language features. The Pascal features of this language are (details will be discussed later): a few primitive types, one-dimensional arrays, control structures, expressions, compound statements (i.e., blocks), functions and procedures. MP borrows from C++ the comment style, break and continue statements...

2. Program structure
MP does not support separate compilation, so all declarations (variable and function) must reside in a single file. An MP program begins with the program declaration, followed by a declaration part. After that comes the main part of the program (the body), which is a block statement as described in section 2.3. The program is terminated by a period (.).

2.1. Program declaration

The program declaration starts with the program keyword, followed by the name of the program, and ends with a semi-colon (;). For example:
program myMPProgram;

In MP, the program name is provided just for the programmer's convenience. It has no effect on the program (i.e., you may reuse it to name variables or procedures below).

2.2. Declaration part

The declaration part contains a variable (and constant) declaration part and a procedure (and function) declaration part. Note that the procedure declaration part must follow the variable declaration part.

2.2.1. Variable declaration part

The variable declaration part consists of zero or more constant and variable declarations. Each constant declaration must begin with the const keyword, followed by an identifier, an equals sign (=), a literal (integer, real, string or boolean) and a semi-colon. Each variable declaration has the form:
var identifier-list : type ;
The identifier-list is a non-empty comma-separated list of identifiers. The type can be a primitive type, an array type or the string type, all of which are discussed later in section 4 (Types and Values). For example:
const myIntConst = 5;
var my1stVar: integer;
var myArrayVar: array[1..5] of real;
var my2ndVar, my3rdVar: boolean;
var my2ndArray, my3rdArray: array[2..6] of real;
const myRealConst = 5.3e4;
var myString: string;

2.2.2. Procedure declaration part

The procedure declaration part contains zero or more function and procedure declarations. A function declaration begins with the function keyword, then the function name, an opening parenthesis ((), a semi-colon separated parameter list, a closing parenthesis ()), a colon (:), a type (the return type of the function), a semi-colon, a declaration part and the body of the function. A function declaration is terminated by a semi-colon (;).

In the parameter list of a function declaration, there are zero or more parameters. A parameter consists of an identifier, a colon and a parameter type. If two or more consecutive parameters have the same type, they can be written in a shorter form: a comma-delimited list of the parameter names, followed by a colon (:) and the shared parameter type. For example,
function area(a:real;b:real;c:real):real;
could be reduced to
function area(a,b,c:real):real;

A parameter type and a return type may be any type of MP. The declaration part of a function is similar to the declaration part of the program. The body of a function is also simply a block statement. A procedure declaration is similar to a function declaration, except that it uses the keyword procedure instead of function and does not contain a return type. We will use the term procedure to refer to both procedures and functions unless otherwise specified. For example:
procedure foo();
var child_share:integer;
function child1():real;
begin
    return child_share + 1;
end;
procedure child2();
begin
    child_share := 2;
end;
begin
    child2();
    child_share := child1();
    writeln(child_share);
end;

2.3. Block statement

A block statement begins with the keyword begin and ends with the keyword end. Between the two keywords, there may be a list of statements preceded by a local variable (and constant) declaration part (described in section 2.2.1). The list of statements may be empty. Section 7 below describes all kinds of statements in MP. For example:
begin
    //start of declaration part
    var r,s: real;
    const myPI=3.14;
    //list of statements
    r:=2.0;
    s:=r*r*myPI;
end

3. Lexical Structure
3.1. Character Set
An MP program consists of a sequence of characters from the ASCII character set. Blank ( ), tab (\t), formfeed (ASCII FF, \f), carriage return (ASCII CR, \r) and newline (ASCII LF, \n) are whitespace characters. The \n character is the line terminator in MP, and this definition of lines is used to determine the line numbers reported by an MP compiler.

3.2. Program Comments

MP borrows the comment syntax from C++. That is, there are two types of comments: block comment and line comment. A block comment starts with /* and ignores all characters until it reaches */. A line comment ignores all characters from // to the newline character at the end of the current line. For example:
/* This is a block comment */
a := 5; // This is C-style line comment

As in C++, the following rules are enforced:
Comments do not nest.
/* and */ have no special meaning in comments that begin with //.
// has no special meaning in comments that begin with /*.
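The comment rules above can be sketched as a small pre-pass over the source text. This is an illustrative Python sketch, not part of the MP specification: the function name strip_comments is hypothetical, and keeping the newlines of a removed block comment (so that line numbers stay stable) and raising an error for an unterminated block comment are assumptions, since the specification does not say how either case is handled.

```python
def strip_comments(source: str) -> str:
    """Remove MP comments; newlines are kept so line numbers do not shift."""
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith("//", i):
            # Line comment: skip up to, but not including, the newline.
            while i < n and source[i] != "\n":
                i += 1
        elif source.startswith("/*", i):
            # Block comment: ends at the first "*/" because comments do not nest.
            end = source.find("*/", i + 2)
            if end == -1:
                raise SyntaxError("unterminated block comment")  # assumption
            out.append("\n" * source.count("\n", i, end))
            i = end + 2
        elif source[i] == "'":
            # Copy a string literal verbatim so "//" or "/*" inside it survive.
            j = i + 1
            while j < n and source[j] not in "'\n":
                j += 2 if source[j] == "\\" else 1
            j = min(j + 1, n)
            out.append(source[i:j])
            i = j
        else:
            out.append(source[i])
            i += 1
    return "".join(out)
```

For the example in this section, strip_comments leaves only the assignment a := 5; with its surrounding blanks.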

3.3. Tokens Set

In an MP program, there are five kinds of tokens: identifiers, keywords, operators, separators and literals.

3.3.1. Identifiers

Identifiers begin with a letter (A-Z or a-z) or an underscore (_), and may contain letters, underscores, and digits (0-9). MP is case-sensitive, therefore these are distinct identifiers: WriteLn, writeln, and WRITELN. Variable names and procedure names are examples of identifiers.

3.3.2. Keywords

The following character sequences are reserved as keywords and cannot be used as identifiers:
and        array     begin     repeat     const    div
do         break     false     downto     else     end
for        function  if        integer    continue mod
not        of        or        procedure  program  real
return     string    then      to         while    until
var        boolean   true
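As an illustration of how the identifier rule (section 3.3.1) and the keyword list fit together, the following Python sketch classifies a single word. The names KEYWORDS, IDENTIFIER and classify are hypothetical helpers introduced here, not part of MP; the keyword set is copied from the list above.

```python
import re

# The 33 reserved words listed in section 3.3.2 (all lowercase).
KEYWORDS = frozenset(
    """and array begin boolean break const continue div do downto else end
    false for function if integer mod not of or procedure program real
    repeat return string then to true until var while""".split()
)

# A letter or underscore, followed by letters, digits or underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def classify(word: str) -> str:
    """Return 'keyword', 'identifier' or 'invalid' for one candidate word."""
    if IDENTIFIER.fullmatch(word) is None:
        return "invalid"
    # MP is case-sensitive, so only the exact lowercase spelling is reserved.
    return "keyword" if word in KEYWORDS else "identifier"
```

Because the comparison is case-sensitive, classify("while") reports a keyword while classify("While") reports an ordinary identifier.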

3.3.3. Operators

The following is a list of valid operators along with their meaning:

Operator   Meaning
+          Addition or unary plus operator
-          Subtraction or minus operator
*          Multiplication operator
/          Real division
div        Integer division
mod        Modulus
:=         Assignment operator
=          Equal operator
<          Less than operator
>          Greater than operator
<=         Less than or equal operator
>=         Greater than or equal operator
<>         Not equal operator
and        Logical AND
not        Logical NOT
or         Logical OR

3.3.4. Separators

The following characters are the separators:

Separator  Meaning
( )        Parentheses
[ ]        Square brackets
,          Comma
:          Colon
;          Semi-colon
.          Period
..         Range separator

3.3.5. Literals

A literal is a source representation of a value of an integer, real, boolean or string type.

3.3.5.1. Integer literals


Integer literals are values that are always expressed in decimal (base 10). A decimal number is a string of digits (0-9) and is at least one digit long. The following are valid integer numbers:
0 100 255 2500

Integer literals are of type integer.

3.3.5.2. Real literals


Real literals are numbers that may contain an integer portion, a fractional portion, and an exponent part. The integer portion is similar to an integer literal. The fractional portion starts with a dot (.), followed by a non-empty sequence of digits. The exponent part starts with the letter E or e, followed by an optional sign (i.e., - or + or none) and a non-empty sequence of digits. Each of the three parts is optional, subject to the following rules:
If the integer portion does not exist, the fractional portion is required and the exponent part is optional.
If the integer portion exists, at least one of the other two parts (fractional portion and exponent part) is required.
The following are valid real literals:
1.03e-2 1.03e+2 1.03e2 1e+2 1e2 1.0 .1e-2 1.03
.1e+2 1.0e-2 1.0e+2 .1e2 .1 1.0e2 1e-2

Real literals are of type real.
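The two rules for real literals translate directly into a regular expression with one alternative per rule. The Python sketch below is only an illustration of that reading; the names REAL and is_real_literal are hypothetical.

```python
import re

REAL = re.compile(
    r"""
      \d+ (?: \.\d+ (?:[eE][+-]?\d+)?    # integer portion, fraction, optional exponent
            | [eE][+-]?\d+ )             # integer portion followed by exponent only
    | \.\d+ (?:[eE][+-]?\d+)?            # no integer portion: fraction is required
    """,
    re.VERBOSE,
)

def is_real_literal(text: str) -> bool:
    """True if the whole text is a real literal under the rules above."""
    return REAL.fullmatch(text) is not None
```

Under this pattern a bare integer such as 1, or a number with a trailing dot such as 1., is not a real literal, because the fractional portion needs at least one digit.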

3.3.5.3. Boolean literal


A boolean literal is either true or false, formed from ASCII letters. Boolean literals are of type boolean.

3.3.5.4. String literals


String literals consist of zero or more characters enclosed by single quotes ('). Use escape sequences (listed below) to represent special characters within a string. Remember that the quotes are not part of the string. It is a compile-time error for a new line, tab, single quote (') or double quote (") character to appear after the opening (') and before the matching closing (').

Example 1:
'This is a wrong string
causes the following response:
ErrorToken(Unclosed string: 'This is a wrong string)

Example 2 (a literal tab character, shown here as <tab>, appears inside the string):
'This is a wrong string because it includes a <tab> in string'
causes the following response:
ErrorToken(Illegal tab in string: 'This is a wrong string because it includes a <tab>)
identifier in
'string'
ErrorToken(Unclosed string: ')

Example 3:
'This is a wrong string because it includes a
causes the following response:
ErrorToken(Illegal in string: 'This is a wrong string because it includes a )

All the supported escape sequences are as follows:
\b    backspace
\f    formfeed
\r    carriage return
\n    newline
\t    horizontal tab
\'    single quote
\"    double quote
\\    backslash

A single backslash also causes the error message Illegal \ in string: followed by the error string from the first single quote to the single backslash. For example,
Example 4:
'This is a wrong string \ '
causes the following response:
ErrorToken(Illegal \ in string: 'This is a wrong string \)
ErrorToken(Unclosed string: ')

The following are valid examples of string constants:
'This is a string containing tab \t'
'Where\'s the program?'
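One way to read the string-literal rules is as a single pattern: between the quotes, each item is either an ordinary character (anything except a raw single quote, double quote, backslash, newline or tab) or one of the eight escape sequences. The Python sketch below only accepts or rejects a complete literal; it does not reproduce the ErrorToken messages, and the names STRING and is_string_literal are hypothetical.

```python
import re

# Ordinary characters exclude ' " \ newline and tab; a backslash must be
# followed by one of b f r n t ' " \ to form a supported escape sequence.
STRING = re.compile(r"""'(?:[^'"\\\n\t]|\\[bfrnt'"\\])*'""")

def is_string_literal(text: str) -> bool:
    """True if the whole text is a well-formed MP string literal."""
    return STRING.fullmatch(text) is not None
```

Both valid examples above are accepted, while the literal from Example 4 is rejected because its backslash is followed by a blank rather than by a supported escape character.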

4. Types and Values


Types limit the values that a variable can hold (e.g., an identifier x whose type is integer cannot hold the value true), the values that an expression can produce, and the operations supported on those values (e.g., the + operator cannot be applied to two boolean values). The types of all variables and expressions in MP must be known at compile time.

4.1. Boolean type

The keyword boolean denotes a boolean type. Each value of type boolean is either true or false. The if, while, repeat and other control statements work with boolean expressions. The operands of the following operators can be of boolean type:
= <> not and or

4.2. Integer type

The keyword integer is used to represent an integer type. Only these operators can act on integer values:
+ - * / div mod = <> < <= > >=

4.3. Real type

The keyword real represents a real type. The operands of the following operators can be of real type:
+ - * / < <= > >=

4.4. Array type

For simplicity, MP supports only one-dimensional arrays. The element type of an array can only be boolean, integer or real. The range of an array must be specified in its array declaration. The elements of the array are referenced from the lower bound to the upper bound of that range. Each bound is an integer literal, optionally preceded by a unary minus or plus operator. E.g., given a:array[1..5] of integer; a has five elements: a[1],a[2],a[3],a[4],a[5]. An index can be any integer expression (i.e., any expression of type integer).
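Because bounds may be negative, the number of elements and the position of an element follow from simple arithmetic on the declared range. The Python sketch below illustrates this; the function names are hypothetical, and rejecting an out-of-range index is an assumption, since the specification does not state what happens in that case.

```python
def element_count(lower: int, upper: int) -> int:
    """Number of elements in array[lower..upper]."""
    return upper - lower + 1

def element_offset(lower: int, upper: int, index: int) -> int:
    """Zero-based position of a[index] within array[lower..upper]."""
    if not lower <= index <= upper:
        # Assumption: the specification does not define out-of-range access.
        raise IndexError(f"index {index} outside {lower}..{upper}")
    return index - lower
```

For array[1..5] the count is 5 and a[3] sits at offset 2; for array[-2..2] the count is also 5 and a[-2] sits at offset 0.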

4.5. String type

The keyword string is used to represent a string type. A string object can be an operand of the following operators:
+ < <= > >= = <>
The first operator (+) returns a string; all the others return a boolean.

5. Variables
In an MP program, all variables must be declared before use. There are three kinds of variables: global, block-scoped and procedure-scoped. A variable name cannot be used for another variable or a procedure in the same scope. However, it can be reused in other scopes. When a variable is redeclared by another variable in a nested scope, it is hidden in the nested scope.

5.1. Global variables

As discussed above, global variables are variables declared in the global variable declaration part (i.e., outside all functions, procedures and block statements). Global variables are visible from the place where they are declared to the end of the program.

5.2. Block-scoped variables

Block-scoped variables are declared inside a block statement. They serve as local variables for temporary use. They are visible inside the block where they are declared and in all nested blocks. The following fragment of code is legal:
begin
    var r:integer; //block variable
    var s:real; //block variable
    s:=r*r*3.14;
    begin
        // s and r are visible in nested blocks
        s:= s * r + 1;
        if (s > 3) then
        begin
            // r may be redeclared so it masks r declared outside
            var r:boolean;
            s := 1; // s is still visible
        end
    end
end

Storage for a block-scoped variable is allocated when the flow of control enters the block and destroyed as soon as the flow of control leaves the block. Unlike a global variable, a block-scoped variable may be associated with more than one storage location during the execution of the program.

5.3. Procedure-scoped variables

A procedure-scoped variable is one declared inside a procedure but outside any block inside that procedure. Procedure-scoped variables include parameters and variables declared in the variable declaration part of that procedure. They are visible from the place they are declared to the end of the defining procedure. For example:
program DeclDemo;
var a: array[1..5] of integer; // global variable
procedure fill(x:array[1..5] of integer);
var size:integer; //procedure local variable
procedure init(x:array[1..5] of integer);
begin
    var i:integer; //block variable
    for i:=1 to size do x[i]:=-1;
end;
begin
    size:=5;
    init(x);
end;
begin
    fill(a);
end.

6. Expressions
Expressions are constructs which are made up of operators and operands. Expressions work with existing data and return new data. In MP, there exist two types of operations, unary and binary. Unary operations work with one operand and binary operations work with two operands. Regardless of the operator, operands may be literals, constants, variables, data returned by another operator, or data returned by a function call. Operators can be grouped according to the types they operate on. The following describes expressions in more detail.

6.1. Arithmetic Operators

Standard arithmetic operators are listed below.

Operator  Operation             Applicable types
+         Unary sign identity   Integer, real
-         Unary sign negation   Integer, real
+         Addition              Integer, real, string
-         Subtraction           Integer, real
*         Multiplication        Integer, real
div       Integer division      Integer
/         Real division         Integer, real
mod       Integer remainder     Integer

The operands of these operators can be of integer or real type. However, div and mod require all of their operands to be of integer type; otherwise a type mismatch error occurs. If the operands are all of integer type or all of real type, the result has the same type as the operands. If the operands mix integer and real types, the result is of real type. The one exception is the operator /, whose result is always real regardless of the types of its operands. As a special case, the operator + can also take two string operands, in which case it returns a string that is the concatenation of the first and second operands.
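The typing rules for binary arithmetic operators can be summarised as a small decision function. The Python sketch below is one possible reading of the paragraph above, with types represented as the strings "integer", "real" and "string"; the function name and the use of TypeError for a type mismatch are assumptions made for illustration.

```python
NUMERIC = {"integer", "real"}

def arithmetic_result_type(op: str, left: str, right: str) -> str:
    """Result type of `left op right` for the binary arithmetic operators."""
    if op == "+" and left == right == "string":
        return "string"                      # string concatenation
    if op in ("div", "mod"):
        if left == right == "integer":
            return "integer"
        raise TypeError("type mismatch: div and mod need integer operands")
    if op in ("+", "-", "*", "/") and left in NUMERIC and right in NUMERIC:
        if op == "/":
            return "real"                    # real division always yields real
        # Same-typed operands keep their type; mixed operands widen to real.
        return "integer" if left == right == "integer" else "real"
    raise TypeError(f"type mismatch: {left} {op} {right}")
```

For instance, integer * real yields real, integer / integer yields real, and real mod integer is rejected.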

6.2. Boolean Operators

Boolean operators include logical and, not and or. The operation of each is summarized below:

Operator  Logical operation
and       Conjunction
not       Negation
or        Disjunction

6.3. Relational Operators

Relational operators perform arithmetic and literal comparisons. All relational operations result in a boolean type. Relational operators include:

Operator  Meaning                 Applicable types
=         Equal                   integer, boolean, string
<         Less than               integer, real, string
>         Greater than            integer, real, string
<=        Less than or equal      integer, real, string
>=        Greater than or equal   integer, real, string
<>        Not equal               integer, boolean, string

In general, all operands of a relational operator must be of the same type. As a special case, an integer may be compared to a real.

6.4. Index operators

An index operator is used to reference or extract selected elements of an array. It must take the following form:
variable [ expression ]

The expression between [ and ] must be of integer type. The type of the variable must be an array type. The index operator returns the element of the array variable whose index is the value of expression. This operator has the highest precedence. For example,
a[3+foo(2)] := a[b[2]] +3;

The above assignment is valid if the return type of foo is integer and the element type of b is integer.

6.5. Operator Precedence and Associativity

For expressions with three or more operands (e.g., 2 - 244 / 4), rules of precedence apply. The order of precedence for operators is listed from highest to lowest:

Operator type           Operators
Index operators         [ ]
Unary operators         not, +, -
Multiplying operators   *, /, div, mod, and
Adding operators        +, -, or
Relational operators    =, <>, <, >, <=, >=

Operations of higher precedence are performed first; operations of equal precedence are performed from left to right. For instance, the following expression:
7 + 4 * 2

is not the same as:


(7 + 4) * 2

Since multiplication has a higher precedence than addition, multiplication is performed first, followed by addition. Use parentheses to group operations that you want to be performed first. Relational operators are non-associative; all other binary operators are left-associative.
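The effect of the precedence levels and of left associativity can be sketched with a small precedence-climbing evaluator. The Python code below is an illustration only: it covers just integer literals, parentheses, unary + and -, and the operators *, div, mod, + and -, and every name in it (PRECEDENCE, tokenize, evaluate and so on) is hypothetical. It also uses Python's // and % for div and mod, which is an assumption for negative operands because the specification does not define that case.

```python
import re

# Higher number binds tighter; only part of the precedence table is modelled.
PRECEDENCE = {"*": 2, "div": 2, "mod": 2, "+": 1, "-": 1}
TOKEN = re.compile(r"\s*(\d+|div|mod|[-+*()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def apply(op, left, right):
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    if op == "*":
        return left * right
    if op == "div":
        return left // right   # assumption: Python floor division
    return left % right        # mod

def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def primary():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression(1)
            if peek() != ")":
                raise SyntaxError("expected )")
            pos += 1
            return value
        if token in ("+", "-"):
            # Unary operators bind tighter than any binary operator.
            operand = primary()
            return operand if token == "+" else -operand
        return int(token)

    def expression(min_precedence):
        nonlocal pos
        left = primary()
        while PRECEDENCE.get(peek(), 0) >= min_precedence:
            op = tokens[pos]
            pos += 1
            # Parsing the right side one level higher gives left associativity.
            right = expression(PRECEDENCE[op] + 1)
            left = apply(op, left, right)
        return left

    return expression(1)
```

With this sketch, evaluate("7 + 4 * 2") gives 15 and evaluate("(7 + 4) * 2") gives 22, matching the example above, and evaluate("10 - 4 - 3") gives 3 because subtraction groups to the left.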

6.6. Evaluation orders

MP requires that the left-hand operand of a binary operator be evaluated before any part of the right-hand operand is evaluated. Similarly, in a function or procedure call, the actual parameters must be evaluated from left to right. Every operand of an operator must be evaluated before any part of the operation itself is performed. The two exceptions are the logical operators and and or, which are still evaluated from left to right, but evaluation is guaranteed to stop as soon as the truth or falsehood of the result is known. This is known as short-circuit evaluation. We will discuss this later in detail (code generation step).
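Short-circuit evaluation can be made visible by treating each operand as a deferred computation that records when it runs. The Python sketch below models that behaviour; the helper names operand, evaluate_and and evaluate_or are hypothetical and exist only for this illustration.

```python
def operand(name, value, log):
    """Build a deferred operand that records its name when it is evaluated."""
    def thunk():
        log.append(name)
        return value
    return thunk

def evaluate_and(left, right):
    # The right operand is evaluated only if the left one is true.
    return right() if left() else False

def evaluate_or(left, right):
    # The right operand is evaluated only if the left one is false.
    return True if left() else right()
```

For example, evaluating false and x records only the left operand in the log, while true and x records the left operand first and then the right one.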

6.7. A function call

A function call starts with an identifier (which is a function name), then an opening parenthesis ((), a nullable comma-separated list of arguments, which are expressions, and a closing parenthesis ()). For example,
foo(2+3,x); goo();

7. Statements
A statement indicates the action a program performs. There are many kinds of statements, as described below:

7.1. Assignment statement

An assignment statement assigns a value to an object. An assignment takes the following form:
lvalue := expression;
where lvalue can be a variable or an element of an array, and the value returned by the expression is stored in the lvalue. The type of the value returned by the expression must be compatible with the type of lvalue. The following code fragment contains examples of assignment:
begin
    aPI := PI;
    value := foo(5);
    l[3] := value * 2;
end.

7.2. Compound (block) statement

A compound statement is a block statement, as described in section 2.3.

7.3. For statement

The for statement allows repetitive execution of one or more statements. It executes a loop for a predetermined number of iterations. For statements take the following form:
for variable := expression1 to|downto expression2 do statement
First, expression1 is evaluated and assigned to variable. Then MP evaluates expression2. If the to clause is used and the current value of variable is less than or equal to the value of expression2, statement is executed, after which variable is incremented by 1. The process repeats as long as variable does not exceed the value of expression2. Once variable is greater than expression2, statement is skipped (i.e., the statement following the for loop is executed).

If the downto clause is used, the iterative process is the same, except that statement is executed if variable is greater than or equal to expression2, and variable is decremented by 1 after each iteration. Note that variable, expression1 and expression2 must be of integer type, and variable must be a scalar. The following are examples of for loops:
for i := 1 to 100 do
begin
    writeln(i);
    Intarray[i] := i + 1;
end
for x := 5 downto 2 do WriteLn(x);
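The for-loop semantics described in this section can be restated as an equivalent while loop. The Python sketch below does so under two assumptions that the text suggests but does not state outright: expression2 is evaluated once before the first iteration, and break and continue are ignored. The function run_for and its parameters are hypothetical names.

```python
def run_for(start, stop, body, downto=False):
    """Model `for variable := start to|downto stop do body(variable)`."""
    variable = start              # expression1 is evaluated and assigned first
    limit = stop                  # assumption: expression2 is evaluated once
    step = -1 if downto else 1
    while (variable >= limit) if downto else (variable <= limit):
        body(variable)
        variable += step          # incremented or decremented by 1
```

Calling run_for(5, 2, print, downto=True) prints 5, 4, 3 and 2, mirroring the second example above, and run_for(3, 1, print) executes the body zero times.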

7.4. If statement

The if statement conditionally executes one of two statements based on the value of an expression. The form of an if statement is:
if expression then statement [else statement]
where expression evaluates to a boolean value. If expression results in true then the statement following the reserved word then is executed. If expression evaluates to false and an else clause is specified then the statement following else is executed. If no else clause exists and expression is false then the if statement is passed over. The following is an example of an if statement.
if flag then
    writeln('Expression is true')
else
    writeln ('Expression is false');

7.5. Repeat statement

The repeat statement, much like the for statement, executes one or more statements in a loop. Unlike a for statement where the loop condition is tested prior to each iteration, a repeat statement condition is tested after each iteration. Therefore, a repeat loop is executed at least once. A repeat statement has the following form:
repeat <one or more statements> until expression;

where the repeat loop executes repeatedly until the expression evaluates to the boolean value of true. The following is an example of the repeat statement:
repeat
    /*Do something*/
    if (i>5) then quitCondition := true;
    else i := i+1;
until quitCondition;

7.6. While statement

The while statement executes one or more statements in a loop. While statements take the following form:
while expression do statement
where expression evaluates to a boolean value. If the value is true, the while loop executes statement repeatedly until expression becomes false.

7.7. Break statement

The break statement has the following format:
break;
The break statement has the same semantics as in C. It must reside in a loop (i.e., in one of the following: a for loop, a repeat loop, or a while loop). Otherwise, a runtime error will be generated.

7.8. Continue statement

The continue statement has the following format:
continue;
The continue statement has the same semantics as in C. It must reside in a loop (i.e., in one of the following: a for loop, a repeat loop, or a while loop). Otherwise, a runtime error will be generated.

7.9. Return statement

The return statement has the following format:
return expression;
The return statement must reside in a function and is used to return the value of the expression to the caller of the function.

7.10. Call statement


The call statement starts with an identifier (which is a procedure name), then an opening parenthesis ((), a nullable comma-separated list of arguments, which are expressions, and a closing parenthesis ()). A call statement is terminated by a semicolon. For example,
foo(2+3,x); goo();

8. Procedures and Functions


A procedure is a sequence of instructions that is separate from the main code block. Functions are the same as procedures except that functions return a value while procedures do not. Procedures are blocks of code that are called from one or more places throughout your program. Procedures make source code more readable and reduce the size of the executable because repetitive blocks of code are replaced with a call to a procedure.

Both procedures and functions accept parameters. Parameters allow the calling routine to communicate with a procedure. In MP, parameters are passed by value. Thus, only the values of the actual parameters are passed to the corresponding formal parameters, and the procedure has no access to the actual variable. One can modify the values of formal parameters but cannot alter those of actual parameters. When an actual parameter is an array, the array's address (i.e., the value of the actual parameter) is passed to the procedure. Therefore, any modification to the elements of the array parameter affects the elements of the actual array. For example:
program DeclDemo;
var i: integer;
var a: array[1..5] of integer;
procedure initialize(x:array [1..5] of integer);
begin
    var i:integer;
    for i:=1 to 5 do x[i]:=-1;
end;
begin
    initialize(a);
end.
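Python happens to behave in a comparable way, which makes it convenient for illustrating the difference between passing a scalar and passing an array. In the sketch below, which is an analogy rather than a definition of MP semantics, assigning to a scalar parameter leaves the caller's variable untouched, while assigning to the elements of a list parameter changes the caller's list. The names reset, count and a are hypothetical; initialize mirrors the MP procedure above.

```python
def initialize(x):
    """Counterpart of the MP procedure: overwrite every element with -1."""
    for i in range(len(x)):
        x[i] = -1      # modifies the caller's list through the shared reference

def reset(n):
    n = 0              # rebinds only the local parameter

a = [1, 2, 3, 4, 5]
count = 7
initialize(a)          # a is now [-1, -1, -1, -1, -1]
reset(count)           # count is still 7
```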

MP procedures and functions may be called recursively.

9. Built-in Functions
MP provides 12 built-in functions, which are specified in the table below.

Built-in function name   Parameter type   Semantic
readInt()                no               Read an integer number from keyboard
writeInt(anArg)          Integer          Write an integer number to the screen
writeIntLn(anArg)        Integer          Write an integer number to the screen and a new line
readReal()               no               Read a real number from keyboard
writeReal(anArg)         Real             Write a real value to the screen
writeRealLn(anArg)       Real             Write a real value to the screen and a new line
readBool()               no               Read a boolean value from keyboard
writeBool(anArg)         Boolean          Write a boolean value to the screen
writeBoolLn(anArg)       Boolean          Write a boolean value to the screen and a new line
writeStr(anArg)          String           Write a string to the screen
writeStrLn(anArg)        String           Write a string to the screen and a new line
writeLn()                no               Write a new line to the screen
