
Laboratorio di Programmazione, a.a. 2007/2008
Project 2: Translator from the Jack Language to VM Language

JACK Language (chap. 9-10-11-12)
“The Elements of Computing Systems”
by Noam Nisan & Shimon Schocken, MIT Press, 2005

Ing. Nadia Ranaldo
Dipartimento di Ingegneria
Università degli Studi del Sannio

A modern compilation model

[Figure: programs written in some language, some other language, ..., and the Jack language are translated by their respective compilers (Jack compiler: Projects 10-11) into a common VM language. The VM language is realized by a VM implementation over CISC platforms, a VM implementation over RISC platforms, a VM emulator, and a VM over the Hack platform (Projects 7-8). These are written, respectively, in CISC machine language, RISC machine language, ..., a high-level language, and Hack machine language, and run on a CISC machine, a RISC machine, other digital platforms (each equipped with its VM implementation), any computer, and the Hack computer (Projects 1-6). Proj. 9: building an app. Proj. 12: building the OS.]

The OO approach to programming

• Object = entity associated with properties (fields) and operations (methods)
• Objects are instances of classes
  E.g. bank account, employee, transaction, window, gameSession, …
• OO programming: identifying, designing and implementing classes
• Each class is typically:
  – A template for generating and manipulating objects
  and/or
  – A collection of related subroutines.

Jack: a typical OO language – sample applications

Example 0: hello world

    class Main {
        /** Hello World program. */
        function void main() {
            /* Prints some text using the standard library. */
            do Output.printString("Hello World");
            do Output.println();    // New line
            return;
        }
    }

• Java-like syntax
• Comments
• Standard library.

Example 1: procedural programming

    class Main {
        /* Sums up 1+2+3+...+n */
        function int sum(int n) {
            var int i, sum;
            let sum = 0;
            let i = 1;
            while (~(i>n)) {
                let sum = sum + i;
                let i = i + 1;
            }
            return sum;
        }

        function void main() {
            var int n, x;
            let n = Keyboard.readInt("Enter n: ");
            let x = Main.sum(n);
            do Output.printString("The result is: ");
            do Output.printInt(x);
            do Output.println();
            return;
        }
    } // Main

• Jack program = collection of one or more classes
• Jack class = collection of one or more subroutines
• Jack subroutine:
  – Function
  – Method
  – Constructor
  (the example above has functions only, as it is “object-less”)
• There must be one Main class, and one of its functions must be main

Example 2: OO programming

    class BankAccount {
        static int nAccounts;

        // account properties
        field int id;
        field String owner;
        field int balance;

        /* Constructs a new bank account. */
        constructor BankAccount new(String aOwner) {
            let id = nAccounts;
            let nAccounts = nAccounts + 1;
            let owner = aOwner;
            let balance = 0;
            return this;
        }

        // ... More BankAccount methods.
    } // BankAccount

Client code:

    ...
    var int sum;
    var BankAccount b, c;
    let b=BankAccount.new("Joe");
    ...

Example 2: typical OO programming (cont.)

    class BankAccount {
        static int nAccounts;

        // account properties
        field int id;
        field String owner;
        field int balance;

        // Constructor ... (omitted)

        /* Deposits money in this account. */
        method void deposit(int amount) {
            let balance = balance + amount;
            return;
        }

        /* Withdraws money from this account. */
        method void withdraw(int amount){
            if (balance > amount) {
                let balance = balance - amount;
            }
            return;
        }

        // ... More BankAccount methods.
    } // BankAccount

Client code:

    ...
    var int sum;
    var BankAccount b, c;
    let b=BankAccount.new("Joe");
    do b.deposit(5000);
    let c=BankAccount.new("jane");
    let sum = 1000;
    do b.withdraw(sum);
    ...

Example 2: typical OO programming (cont.)

    class BankAccount {
        static int nAccounts;

        // account properties
        field int id;
        field String owner;
        field int balance;

        // Constructor ... (omitted)

        /* Prints information about this account. */
        method void printInfo() {
            do Output.printInt(id);
            do Output.printString(owner);
            do Output.printInt(balance);
            return;
        }

        /* Destroys this account. */
        method void dispose() {
            do Memory.deAlloc(this);
            return;
        }

        // ... More BankAccount methods.
    } // BankAccount

Client code:

    ...
    var int sum;
    var BankAccount b, c;
    // Construct and manipulate
    // b and c ...
    do b.printInfo();
    do b.dispose();
    ...

Example 3: abstract data types

• Motivation: Jack has only 3 primitive data types: int, char, boolean
• API = public contract
• Interface / implementation

Fraction API

[Listing not preserved in the extracted text]

Using the Fraction API (example)

[Listing not preserved in the extracted text]

Example 3: abstract data types (implementation)

[Listing not preserved in the extracted text]

Example 3: abstract data types (implementation cont.)

[Listing not preserved in the extracted text]

Example 4: linked list

    /** Provides a linked list abstraction. */
    class List {
        field int data;
        field List next;

        /* Creates a new List object. */
        constructor List new(int car, List cdr) {
            let data = car;
            let next = cdr;
            return this;
        }

        /* Disposes this List by recursively disposing its tail. */
        method void dispose() {
            if (~(next = null)) {
                do next.dispose();
            }
            do Memory.deAlloc(this);
            return;
        }
        ...
    } // class List.

    class Foo {
        ...
        // Creates a list holding the numbers (2,3,5).
        function void create235() {
            var List v;
            let v = List.new(5,null);
            let v = List.new(2,List.new(3,v));
            ...
        }
    }

Jack language specification

• Syntax
• Data types
• Variable kinds
• Expressions
• Statements
• Subroutine calling

Jack syntax

[Syntax table not preserved in the extracted text]

Jack syntax (cont.)

[Syntax table not preserved in the extracted text]

Jack data types

• Primitive:
  – int       16-bit 2’s complement (15, -2, 3, ...)
  – boolean   0 and –1, standing for false and true respectively
  – char      Unicode character (‘a’, ‘x’, ‘+’, ‘%’, ...)
• Abstract data types (supplied by the OS or by the user):
  – String
  – Fraction
  – List
  – ...
• Application-specific objects:
  – BankAccount
  – Bat / Ball
  – ...

Jack data types: memory allocation

• Object types are represented by a class name and implemented as a reference, i.e. a memory address
• Memory allocation:
  – Primitive variables are allocated memory space when they are declared
  – Object variables are allocated memory space when they are created via a constructor.
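
The 16-bit representation can be illustrated with a small Python sketch. It is not part of the original slides: the helper names `to_int16` and `jack_not` are hypothetical, and the sketch only assumes that values wrap around in 16-bit 2's complement and that false is 0 and true is -1 (all 16 bits set).

```python
def to_int16(value):
    """Wrap an arbitrary Python integer into the 16-bit 2's complement range."""
    value &= 0xFFFF                      # keep the low 16 bits
    return value - 0x10000 if value & 0x8000 else value


def jack_not(value):
    """Bitwise negation on a 16-bit word (the effect of Jack's ~ operator)."""
    return to_int16(~value)


FALSE = 0
TRUE = to_int16(0xFFFF)                  # all bits set, i.e. -1

print(to_int16(32767 + 1))               # overflow wraps around to the most negative value
print(jack_not(FALSE) == TRUE)           # negating false yields true
```

With this encoding a single bitwise negation maps false to true and back, which is one reason -1 rather than 1 is a convenient choice for true.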

Jack variable kinds and scope

[Table not preserved in the extracted text]

Jack expressions

[Expression syntax not preserved in the extracted text]

• No operator priority!

Jack Statements

    let variable = expression;
    or
    let variable[expression] = expression;

    if (expression) {
        statements
    }
    else {
        statements
    }

    while (expression) {
        statements
    }

    do function-or-method-call;

    return expression;
    or
    return;

Jack subroutine calls

• General syntax: subroutineName(arg1, arg2, …)
• Each argument is a valid Jack expression
• Parameter passing is by value

Example: suppose we have function int sqrt(int n)
This function can be invoked as follows:
  – sqrt(17)
  – sqrt(x)
  – sqrt(a*c-17)
  – sqrt(a*sqrt(c-17)+3)
  Etc.

Jack subroutine calls (cont.)

[Slide content not preserved in the extracted text]

Jack program structure

• Each class in a separate file (compilation unit)
• Jack program = collection of classes, containing a Main.main()

Jack standard library = language extensions = OS

    class Math {
        function void init()
        function int abs(int x)
        function int multiply(int x, int y)
        function int divide(int x, int y)
        function int min(int x, int y)
        function int max(int x, int y)
        function int sqrt(int x)
    }

    class String {
        constructor String new(int maxLength)
        method void dispose()
        method int length()
        method char charAt(int j)
        method void setCharAt(int j, char c)
        method String appendChar(char c)
        method void eraseLastChar()
        method int intValue()
        method void setInt(int j)
        function char backSpace()
        function char doubleQuote()
        function char newLine()
    }

    class Array {
        function Array new(int size)
        method void dispose()
    }

    class Output {
        function void moveCursor(int i, int j)
        function void printChar(char c)
        function void printString(String s)
        function void printInt(int i)
        function void println()
        function void backSpace()
    }

    class Screen {
        function void clearScreen()
        function void setColor(boolean b)
        function void drawPixel(int x, int y)
        function void drawLine(int x1, int y1, int x2, int y2)
        function void drawRectangle(int x1, int y1, int x2, int y2)
        function void drawCircle(int x, int y, int r)
    }

    class Memory {
        function int peek(int address)
        function void poke(int address, int value)
        function Array alloc(int size)
        function void deAlloc(Array o)
    }

    class Keyboard {
        function char keyPressed()
        function char readChar()
        function String readLine(String message)
        function int readInt(String message)
    }

    class Sys {
        function void halt()
        function void error(int errorCode)
        function void wait(int duration)
    }

Jack revisited

    /** Computes the average of a sequence of integers. */
    class Main {
        function void main() {
            var Array a;
            var int length;
            var int i, sum;

            let length = Keyboard.readInt("How many numbers? ");
            let a = Array.new(length);   // Constructs the array
            let i = 0;

            while (i < length) {
                let a[i] = Keyboard.readInt("Enter the next number: ");
                let sum = sum + a[i];
                let i = i + 1;
            }

            do Output.printString("The average is: ");
            do Output.printInt(sum / length);
            do Output.println();
            return;
        }
    }

Typical OS functions

Language extensions / standard library:
• Mathematical operations (abs, sqrt, ...)
• Abstract data types (String, Date, ...)
• Output functions (printChar, printString ...)
• Input functions (readChar, readLine ...)
• Graphics functions (drawPixel, drawCircle, ...)
• And more ...

System-oriented services:
• Memory management (objects, arrays, ...)
• I/O device drivers
• Mass storage
• File system
• Multi-tasking
• UI management (shell / windows)
• Security
• Communications
• And more ...

The Jack OS

• Math: Provides basic mathematical operations;
• String: Implements the String type and string-related operations;
• Array: Implements the Array type and array-related operations;
• Output: Handles text output to the screen;
• Screen: Handles graphic output to the screen;
• Keyboard: Handles user input from the keyboard;
• Memory: Handles memory operations;
• Sys: Provides some execution-related services.

Math operations (in the Jack OS)

    class Math {
        function void init()
        function int abs(int x)
        function int multiply(int x, int y)
        function int divide(int x, int y)
        function int min(int x, int y)
        function int max(int x, int y)
        function int sqrt(int x)
    }

String processing (in the Jack OS)

    class String {
        constructor String new(int maxLength)
        method void dispose()
        method int length()
        method char charAt(int j)
        method void setCharAt(int j, char c)
        method String appendChar(char c)
        method void eraseLastChar()
        method int intValue()
        method void setInt(int j)
        function char backSpace()
        function char doubleQuote()
        function char newLine()
    }

Memory management (in the Jack OS)

    class Memory {
        /* Returns the value of the main memory at this address. */
        function int peek(int address)

        /* Sets the contents of the main memory at this address to value. */
        function void poke(int address, int value)

        /* Finds and allocates from the heap a memory block of the specified size
           and returns a reference to its base address. */
        function Array alloc(int size)

        /* De-allocates the given object and frees its memory space. */
        function void deAlloc(Array o)
    }

Memory management (simple)

• When a program constructs (destructs) an object, the OS has to allocate (de-allocate) a RAM block on the heap:
  – alloc(size): returns a reference to a free RAM block of size size
  – deAlloc(object): recycles the RAM block that object points at
• The data structure that this algorithm manages is a single pointer: free.
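
A single-pointer allocator of this kind might look like the following Python sketch. It is an illustration rather than the course's implementation: the class name `SimpleHeap` and the heap bounds (2048 to 16383, the usual Hack heap range) are assumptions, and `de_alloc` is deliberately a no-op because a lone `free` pointer gives the allocator nowhere to record recycled blocks.

```python
class SimpleHeap:
    """Bump allocator: the only state is the 'free' pointer."""

    def __init__(self, heap_base=2048, heap_end=16384):
        # heap_base / heap_end are assumed bounds, not taken from the slides
        self.free = heap_base
        self.heap_end = heap_end

    def alloc(self, size):
        """Return the base address of a fresh block of 'size' words."""
        if size <= 0:
            raise ValueError("block size must be positive")
        if self.free + size > self.heap_end:
            raise MemoryError("heap exhausted")
        block = self.free
        self.free += size                # everything below 'free' is in use
        return block

    def de_alloc(self, block):
        """Simple scheme: memory is never reclaimed."""
        pass


heap = SimpleHeap()
a = heap.alloc(10)
b = heap.alloc(2)
heap.de_alloc(a)                         # has no effect in this scheme
c = heap.alloc(1)
print(a, b, c)
```

The cost of the simplicity is that freed blocks are never reused, which is the limitation an improved scheme (for example one based on a free list) sets out to remove.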
Memory management (improved)

[Figure not preserved in the extracted text]

Graphics primitives (in the Jack OS)

    class Screen {
        function void clearScreen()
        function void setColor(boolean b)
        function void drawPixel(int x, int y)
        function void drawLine(int x1, int y1, int x2, int y2)
        function void drawRectangle(int x1, int y1, int x2, int y2)
        function void drawCircle(int x, int y, int r)
    }


Character output primitives (in the Jack OS)

    class Output {
        function void moveCursor(int i, int j)
        function void printChar(char c)
        function void printString(String s)
        function void printInt(int i)
        function void println()
        function void backSpace()
    }

Keyboard primitives (in the Jack OS)

    class Keyboard {
        /* Returns the character of the currently pressed key on the keyboard;
           if no key is currently pressed, returns 0. */
        function char keyPressed()

        /* Waits until a key is pressed on the keyboard and released, then echoes
           the key to the screen and returns the character of the pressed key. */
        function char readChar()

        /* Prints the message on the screen, reads the line (text until a newline
           character is detected) from the keyboard, echoes the line to the screen,
           and returns its value. This function also handles user backspaces. */
        function String readLine(String message)

        /* Prints the message on the screen, reads the line (text until a newline
           character is detected) from the keyboard, echoes the line to the screen,
           and returns its integer value (until the first non-digit character in
           the line is detected). This function also handles user backspaces. */
        function int readInt(String message)
    }
Compiler architecture (front end)

[Figure: Jack Compiler (Project 2): Jack Program → Syntax Analyzer (Tokenizer → Parser) → Code Generation → VM code]

Front-end:
• Syntax analysis: understanding the semantics implied by the source code
  – Tokenizing: creating a list of “tokens”
  – Parsing: matching the token list with the language grammar
• Code generation: reconstructing the semantics using the target syntax.

Tokenizing / Lexical analysis

• Remove white space
• Construct a token list (language atoms)
• Things to worry about:
  – Language specific rules: e.g. how to treat “++”
  – Language specific token types: keyword, identifier, operator, constant, ...


Jack Tokenizer

Source code:

    if (x < 153) {let city = "Paris";}

Tokenizer’s output:

    <tokens>
      <keyword> if </keyword>
      <symbol> ( </symbol>
      <identifier> x </identifier>
      <symbol> &lt; </symbol>
      <integerConstant> 153 </integerConstant>
      <symbol> ) </symbol>
      <symbol> { </symbol>
      <keyword> let </keyword>
      <identifier> city </identifier>
      <symbol> = </symbol>
      <stringConstant> Paris </stringConstant>
      <symbol> ; </symbol>
      <symbol> } </symbol>
    </tokens>

Parsing

• Each language is characterized by a grammar
• A text is given:
  – The parser, using the grammar, can either accept or reject the text
  – In the process, the parser performs a complete analysis of the text
• The language can be:
  – Context-dependent (English, …)
  – Context-free (Jack, …).
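
The tokenizing step can be sketched in Python with a single regular expression. This is a minimal illustration, not the project's reference tokenizer: the names `KEYWORDS`, `TOKEN_RE`, `tokenize` and `to_xml` are made up for the example, and the keyword and symbol sets are assumed to be those of the Jack language.

```python
import re
from xml.sax.saxutils import escape

# Assumed Jack keyword set
KEYWORDS = {
    "class", "constructor", "function", "method", "field", "static", "var",
    "int", "char", "boolean", "void", "true", "false", "null", "this",
    "let", "do", "if", "else", "while", "return",
}

# One alternative per token type; comments and white space are matched so they can be skipped
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<stringConstant>"[^"\n]*")
  | (?P<integerConstant>\d+)
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<symbol>[{}()\[\].,;+\-*/&|<>=~])
  | (?P<space>\s+)
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return a list of (token type, text) pairs for a Jack source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue                     # removed, as the slide prescribes
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        elif kind == "stringConstant":
            text = text[1:-1]            # drop the surrounding double quotes
        tokens.append((kind, text))
    return tokens


def to_xml(tokens):
    """Render the token list in the XML layout shown on the slide."""
    lines = ["<tokens>"]
    lines += [f"<{kind}> {escape(text)} </{kind}>" for kind, text in tokens]
    lines.append("</tokens>")
    return "\n".join(lines)


print(to_xml(tokenize('if (x < 153) {let city = "Paris";}')))
```

Because every symbol is a single character, an input such as `++` comes out as two `+` symbols; that is one way of settling the language-specific question raised on the tokenizing slide.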
A typical grammar (C/Java-like)

    program:           statement;

    statement:         whileStatement
                     | ifStatement
                     | // other statement possibilities ...
                     | '{' statementSequence '}'

    whileStatement:    'while' '(' expression ')' statement

    ifStatement:       simpleIf
                     | ifElse

    simpleIf:          'if' '(' expression ')' statement

    ifElse:            'if' '(' expression ')' statement 'else' statement

    statementSequence: ''   // null, i.e. the empty sequence
                     | statement ';' statementSequence

    expression:        // definition of an expression comes here

    // more definitions follow

Code sample:

    while (some expression) {
        if (some expression)
            some statement;
        while (some expression) {
            some statement;
            if (some expression)
                some statement;
        }
        while (some expression) {
            some statement;
            some statement;
        }
    }

Code sample:

    if (some expression) {
        statement;
        while (some expression)
            statement;
        statement;
    }
    if (some expression)
        if (some expression)
            some statement;

• Simple (terminal) forms / complex (non-terminal) forms
• Grammar = set of rules on how to construct complex forms from simpler forms
• Highly recursive.

Parse tree

Since the grammar rules are hierarchical, the output generated by the parser can be described in a tree-oriented data structure called parse tree or derivation tree.

Some compilers represent this tree by an explicit data structure that is further used for code generation and error reporting. Other compilers (including the one that we will build) represent the program’s structure implicitly, generating code and reporting errors on the fly.

Input text:

    while (count<=100) {
        /** demonstration */
        count++;
        // ...

Tokenized:

    while ( count <= 100 ) { count ++ ; ...

[Figure: parse tree of the input. The root statement derives a whileStatement, whose children are while, (, expression, ), and a statement; that statement derives {, a statementSequence, and }; the statementSequence derives a statement, ;, and a further statementSequence. The leaves, read left to right, are the tokens while ( count <= 100 ) { count ++ ; ...]

Recursive descent parsing

Code sample:

    while (some expression) {
        some statement;
        some statement;
        while (some expression) {
            while (some expression)
                some statement;
            some statement;
        }
    }

• Top-down approach
• Highly recursive
• LL(1) grammars: the first token determines in which rule we are
• In other grammars you have to look ahead more than one token
• Jack is almost LL(1) (exception: expression).

• This approach parses the tokens stream recursively, using the nested structure prescribed by the grammar.
• For each grammar rule describing a non-terminal, we equip the parser with a recursive routine designed to parse that non-terminal
• If the non-terminal consists of terminal atoms only, the routine can simply process them
• Otherwise, for every non-terminal building block in the rule’s right hand side, the routine can recursively call the routine designed to parse this non-terminal
• The process will continue recursively, until all the terminal atoms have been reached and processed.

Parsing routines:
• parseStatement()
• parseWhileStatement()
• parseIfStatement()
• parseStatementSequence()
• parseExpression()

The Jack grammar

[Grammar rules not preserved in the extracted text]

Notation:
    'x':    x appears verbatim
    x:      x is a language construct
    x?:     x appears 0 or 1 times
    x*:     x appears 0 or more times
    x|y:    either x or y appears
    (x,y):  x appears, then y.
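
The one-routine-per-rule idea can be sketched in Python for the C/Java-like toy grammar shown earlier. Several simplifications are assumptions of this sketch and not part of the slides: the input is an already tokenized list of strings, an expression is a single identifier or number, and any other non-reserved word counts as a simple statement. The class name `Parser` and the tuple-based tree shape are likewise hypothetical.

```python
RESERVED = {"while", "if", "else"}


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected):
        """Consume one terminal, or reject the text."""
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_program(self):
        tree = self.parse_statement()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def parse_statement(self):
        token = self.peek()              # the first token selects the rule
        if token == "while":
            return self.parse_while_statement()
        if token == "if":
            return self.parse_if_statement()
        if token == "{":
            self.eat("{")
            body = self.parse_statement_sequence()
            self.eat("}")
            return ("block", body)
        if token is None or token in RESERVED or not token.isalnum():
            raise SyntaxError(f"statement expected, found {token!r}")
        self.pos += 1                    # assumption: a lone word is a statement
        return ("simple", token)

    def parse_while_statement(self):
        self.eat("while")
        self.eat("(")
        condition = self.parse_expression()
        self.eat(")")
        return ("while", condition, self.parse_statement())

    def parse_if_statement(self):
        self.eat("if")
        self.eat("(")
        condition = self.parse_expression()
        self.eat(")")
        then_branch = self.parse_statement()
        else_branch = None
        if self.peek() == "else":        # ifElse versus simpleIf
            self.eat("else")
            else_branch = self.parse_statement()
        return ("if", condition, then_branch, else_branch)

    def parse_statement_sequence(self):
        statements = []
        while self.peek() not in ("}", None):
            statements.append(self.parse_statement())
            self.eat(";")
        return statements

    def parse_expression(self):
        token = self.peek()
        if token is None or token in RESERVED or not token.isalnum():
            raise SyntaxError(f"expression expected, found {token!r}")
        self.pos += 1
        return token


tree = Parser("while ( x ) { s ; if ( y ) t else u ; }".split()).parse_program()
print(tree)
```

The recursive grammar rule for statementSequence is written here as a loop, a common equivalent form; the parser either returns a tree (accept) or raises `SyntaxError` (reject).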
The Jack grammar (cont.)

[Grammar rules not preserved in the extracted text]

Jack syntax analyzer in action

Source code:

    class Bar {
        method Fraction foo(int y) {
            var int temp;   // a variable
            let temp = (xxx+12)*-63;
            ...
    ...

Syntax analyzer output:

    <varDec>
      <keyword> var </keyword>
      <keyword> int </keyword>
      <identifier> temp </identifier>
      <symbol> ; </symbol>
    </varDec>
    <statements>
      <letStatement>
        <keyword> let </keyword>
        <identifier> temp </identifier>
        <symbol> = </symbol>
        <expression>
          <term>
            <symbol> ( </symbol>
            <expression>
              <term>
                <identifier> xxx </identifier>
              </term>
              <symbol> + </symbol>
              <term>
                <int.Const.> 12 </int.Const.>
              </term>
            </expression>
            ...

• Using the language grammar, a programmer can write a syntax analyzer program
• The syntax analyzer takes a source text file and attempts to match it on the language grammar
• If successful, it generates a parse tree in some structured format, e.g. XML.

This syntax analyzer’s algorithm:
• If xxx is non-terminal, output:
    <xxx>
      Recursive code for the body of xxx
    </xxx>
• If xxx is terminal (keyword, symbol, constant, or identifier), output:
    <xxx>
      xxx value
    </xxx>


Summary and next step

[Figure: Jack Compiler (Project 2): Jack Program → Syntax Analyzer (Tokenizer → Parser) → Code Generation → VM code]

The code generation challenge:
• Extend the syntax analyzer into a full-blown compiler that generates executable VM code
• Two challenges: (a) handling data, and (b) handling commands.

Syntax analysis (review)

[Example: the Bar.foo source fragment and its XML parse tree, as shown under “Jack syntax analyzer in action”]

The code generation challenge:
• Extend the syntax analyzer into a full-blown compiler
• Program = a series of operations that manipulate data
• The compiler should convert each “understood” (parsed) source operation and data item into corresponding operations and data items in the target language
• So we have to generate code for
  – handling data
  – handling operations.

Handling data

When dealing with a variable, say x, we have to know:
• What is x’s data type? Primitive, or ADT (class name)?
  (Need to know in order to properly allocate to it RAM resources)

Symbol table

Classical implementation:
• A list of hash tables, each reflecting a single scope nested within the next one in the list
• The identifier lookup works from the current table upwards.
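
The classical implementation can be sketched in Python as a list of dictionaries. The names `SymbolTable` and `Entry`, and the choice to number variables per kind within each scope, are assumptions made for this illustration; the variables are those of the foo(int k) example used in the array-handling slide.

```python
from collections import namedtuple

Entry = namedtuple("Entry", ["type", "kind", "index"])


class SymbolTable:
    """A list of hash tables; the last one is the current (innermost) scope."""

    def __init__(self):
        self.scopes = [{}]

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def define(self, name, type_, kind):
        """Record a variable in the current scope, numbering it within its kind."""
        scope = self.scopes[-1]
        index = sum(1 for entry in scope.values() if entry.kind == kind)
        scope[name] = Entry(type_, kind, index)
        return scope[name]

    def lookup(self, name):
        """Search from the current table outwards; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()                    # class-level scope
table.define("re", "int", "field")
table.define("im", "int", "field")
table.enter_scope()                      # subroutine-level scope
table.define("k", "int", "argument")
table.define("x", "int", "local")
table.define("y", "int", "local")
table.define("bar", "Array", "local")
print(table.lookup("bar"))               # found in the subroutine scope
print(table.lookup("im"))                # found one level up, in the class scope
```

A Jack compiler needs only two nesting levels (class and subroutine), but the same structure extends to arbitrarily nested scopes.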

Life cycle

• Static: single copy must be kept alive throughout the program duration
• Field: different copies must be kept for each object
• Local: created on subroutine entry, killed on exit
• Argument: similar to local
• Good news: the VM handles all these details !!! Hurray!!!

Handling arrays

Java code:

    class Complex {
        ...
        void foo(int k) {
            int x, y;
            int[] bar;   // declare an array
            ...
            // Construct the array:
            bar = new int[10];
            ...
            bar[k]=19;
        }
        ...
        Main.foo(2);   // Call the foo method
        ...

bar = new int[n] is typically handled by causing the compiler to generate code that performs: bar = Mem.alloc(n)

RAM state following compilation, just after executing bar[k]=19:

    Address   Value   Meaning
    0
    ...
    275               x (local 0)
    276               y (local 1)
    277       4315    bar (local 2)
    ...
    504       2       k (argument 0)
    ...
    4315              (start of the bar array)
    4316
    4317      19
    4318
    ...
    4324
    ...

VM code (pseudo):

    // bar[k]=19, or *(bar+k)=19
    push bar
    push k
    add
    // Use a pointer to access bar[k]
    pop addr    // addr points to bar[k]
    push 19
    pop *addr   // Set bar[k] to 19

VM code (final):

    // bar[k]=19, or *(bar+k)=19
    push local 2
    push argument 0
    add
    // Use the that segment to access bar[k]
    pop pointer 1
    push constant 19
    pop that 0
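
The step from the pseudo code to the final VM code is a lookup of each variable's segment and index. The Python sketch below is one way to express it; the function name `compile_array_store` and the plain dictionary standing in for the symbol table are assumptions, and only the case "array[variable] = constant" from the slide is covered.

```python
def compile_array_store(symbols, array, index_var, value):
    """Emit VM code for  array[index_var] = value  (value is an integer constant)."""
    array_segment, array_index = symbols[array]
    var_segment, var_index = symbols[index_var]
    return [
        f"push {array_segment} {array_index}",   # base address of the array
        f"push {var_segment} {var_index}",       # offset
        "add",                                   # address of the target cell
        "pop pointer 1",                         # anchor the 'that' segment on it
        f"push constant {value}",
        "pop that 0",                            # store the value in the cell
    ]


# Variable mapping taken from the slide's RAM picture
symbols = {
    "x": ("local", 0),
    "y": ("local", 1),
    "bar": ("local", 2),
    "k": ("argument", 0),
}

for line in compile_array_store(symbols, "bar", "k", 19):
    print(line)
```

With bar = 4315 and k = 2 as in the RAM picture, the address computed by `add` is 4317, the cell that ends up holding 19.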
Handling objects: memory allocation

Java code:

    class Complex {
        // Properties (fields):
        int re;   // Real part
        int im;   // Imaginary part
        ...
        /** Constructs a new Complex object. */
        public Complex(int aRe, int aIm) {
            re = aRe;
            im = aIm;
        }
        ...
    }

    // The following code can be in any class:
    public void bla() {
        Complex a, b, c;
        ...
        a = new Complex(5,17);
        b = new Complex(12,192);
        ...
        c = a;   // Only the reference is copied
        ...
    }

Following compilation:

[Figure: RAM state, not preserved in the extracted text]

foo = new ClassName(…) is typically handled by causing the compiler to generate code that performs: foo = Mem.alloc(n)

Handling objects: method calls

Java code:

    class Complex {
        // Properties (fields):
        int re;   // Real part
        int im;   // Imaginary part
        ...
        /** Constructs a new Complex object. */
        public Complex(int aRe, int aIm) {
            re = aRe;
            im = aIm;
        }
        ...
    }

    class Foo {
        ...
        public void foo() {
            Complex x;
            ...
            x = new Complex(1,2);
            x.mult(5);
            ...
        }
    }

Translating x.mult(5):
• Can also be viewed as mult(x,5)
• Generated code:

    // x.mult(5):
    push x
    push 5
    call mult

General rule: each method call foo.bar(v1,v2,...) can be translated into

    push foo
    push v1
    push v2
    ...
    call bar
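
The general rule amounts to passing the object as a hidden first argument. A minimal Python sketch follows; the function name `compile_method_call` is hypothetical, and arguments are assumed to be simple names or constants so that each one needs a single push.

```python
def compile_method_call(obj, method, args):
    """Translate obj.method(args...) into the push/call pattern of the slide."""
    code = [f"push {obj}"]                       # the object goes first
    code += [f"push {arg}" for arg in args]      # then the explicit arguments
    code.append(f"call {method}")
    return code


print(compile_method_call("x", "mult", ["5"]))
print(compile_method_call("foo", "bar", ["v1", "v2"]))
```

Arguments that are themselves expressions would be compiled recursively instead of being pushed directly.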

Generating code for expressions

Program flow: x+g(2,y,-z)*5 → Syntax analysis → [parse tree] → Code generation →

    push x
    push 2
    push y
    push z
    neg
    call g
    push 5
    call mult
    add

For a stack-based target platform, we need to print the tree in postfix notation, also known as Reverse Polish Notation (RPN). Ex. f(x,y) => x,y,f => (in VM) push x, push y, call f

The codeWrite(exp) algorithm is based on a recursive post-order traversal of the underlying parse tree.
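
Since the algorithm's listing is missing from the extracted text, the following Python sketch shows one plausible shape for it. The tree encoding is an assumption of the sketch (integers and strings are leaves, `("neg", e)` is unary minus, `("call", f, [args])` is a subroutine call, and `(op, left, right)` is a binary operation), as is the mapping of `-` to `sub`; the mappings of `+` to `add` and `*` to `call mult` follow the generated code shown above.

```python
BINARY_OPS = {"+": "add", "-": "sub", "*": "call mult"}


def code_write(exp):
    """Post-order traversal: operands first, then the operation."""
    if isinstance(exp, (int, str)):              # constant or variable: a leaf
        return [f"push {exp}"]
    tag = exp[0]
    if tag == "neg":                             # unary: operand, then operator
        return code_write(exp[1]) + ["neg"]
    if tag == "call":                            # f(e1, e2, ...): arguments, then call
        _, name, args = exp
        code = []
        for arg in args:
            code += code_write(arg)
        return code + [f"call {name}"]
    _, left, right = exp                         # binary: left, right, then operator
    return code_write(left) + code_write(right) + [BINARY_OPS[tag]]


# x + g(2, y, -z) * 5, with the tree shape implied by the slide's output
tree = ("+", "x", ("*", ("call", "g", [2, "y", ("neg", "z")]), 5))
for line in code_write(tree):
    print(line)
```

The order of the emitted commands is fixed entirely by the shape of the tree, so any grouping decision (Jack defines no operator priority) has to be made by the parser that builds it.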
Final example

[Figure not preserved in the extracted text]

Perspective

• “Hard” Jack simplifications:
  – Primitive type system
  – No inheritance
  – No public class fields (e.g. must use r=c.getRadius() rather than r=c.radius)
• “Soft” Jack simplifications:
  – Limited control structures (no for, switch, …)
  – Cumbersome handling of char types (cannot use let x=‘c’)
• Optimization
  – For example, c++ will be translated into push c, push 1, add, pop c.
  – Parallel processing
  – Many other examples of possible improvements …
