Programming and Data Structures
The first part may seem dull, but we have found no way in other than starting from a
known domain, totem or taboo: mathematics. This part does not need to be learned by
heart, as the absence of any exercise may suggest, but it is useful for anyone who has
wondered, one day, 'but how does it work?'
This is not the case for the second part, which contains the main data structures used in
programming. All the exercises must be done and tested, if the reader has a
computer, before checking the answers, as a simple peek reveals the ‘key’ of the program.
The reader must do these ‘body building’ exercises before trying to understand the third
part.
Indeed, the third part contains problems that took us months or years to solve
efficiently. Some are not yet solved: they are listed on the first page but missing from the
solution part. Many of the solutions contain assertions without proof, which can lead to
incorrect programs. Some have been tested on typical or random data, which proves
nothing; as Dijkstra said, testing a program can show that it is wrong, never that it is
correct. Readers who have comments on these programs can write to
rakotoarisoatahiana@yahoo.fr. Replies are guaranteed for those who also have an e-mail
account and who are patient enough, as we rarely check our e-mail: time is money…
PART ONE. COMPUTER ARCHITECTURE
A. PHYSICAL ARCHITECTURE
Numeration
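The numeration problem is to write any number with a limited set of symbols, the digits 0, 1, 2,
…, 9, A, B, C, …. A number, written u_n u_{n-1} ... u_1 u_0 , u_{-1} u_{-2} ... u_{-m}, is a
polynomial in a, a being the numeration base
u_n u_{n-1} ... u_1 u_0 , u_{-1} u_{-2} ... u_{-m}
= u_n a^n + u_{n-1} a^{n-1} + ... + u_1 a + u_0 + u_{-1} a^{-1} + u_{-2} a^{-2} + ... + u_{-m} a^{-m}
The number 453,7 is written in decimal. To write it in base 7, we first search for the integer x such that
7^x ≤ 453,7 < 7^{x+1}. x is the integer part of the logarithm in base 7 of 453,7 (here x = 3); it
gives the position of the most significant digit.
453,7_{10} ≈ 1215,462_7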
If the number is an integer, we can divide it repeatedly by the base: the first remainder is the
least significant digit and the last remainder is the most significant digit, as the polynomial
above can be rewritten
x_n = (u_n a^{n-1} + u_{n-1} a^{n-2} + ... + u_1) a + u_0 = x_{n-1} a + u_0
x_{n-1} = (u_n a^{n-2} + u_{n-1} a^{n-3} + ... + u_2) a + u_1 = x_{n-2} a + u_1
...
x_3 = (u_n a^2 + u_{n-1} a + u_{n-2}) a + u_{n-3} = x_2 a + u_{n-3}
x_2 = (u_n a + u_{n-1}) a + u_{n-2} = x_1 a + u_{n-2}
x_1 = u_n a + u_{n-1}
453 ÷ 7 = 64 remainder 5
64 ÷ 7 = 9 remainder 1
9 ÷ 7 = 1 remainder 2
1 ÷ 7 = 0 remainder 1
453_{10} = 1215_7
The base is always written in decimal. When not mentioned, the base is 10.
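The repeated division above can be sketched in Python. This is only an illustration: the function name to_base and the DIGITS string are our own choices, not part of the notation used in this document.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols, enough for bases up to 16

def to_base(n, base):
    """Write the natural integer n in the given base by repeated division."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        # The first remainder found is the least significant digit.
        digits.append(DIGITS[remainder])
    # The last remainder is the most significant digit, so reverse the list.
    return "".join(reversed(digits))

print(to_base(453, 7))  # 1215
```

The remainders come out least significant first, which is why the list is reversed at the end.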
Bit
A bit (binary digit) can take only two values, noted 0 and 1 or false and true. There are
four functions having the form
f : B → B where B = { 0, 1 }
x ↦ y
x       0  1
f0(x)   0  0   constant 0
f1(x)   0  1   identity function
f2(x)   1  0   ¬ x (not x)
f3(x)   1  1   constant 1
f : A B
where A × B is the cartesian product of the two sets. Then, there are 16 (2^(2^2)) functions
of the type
f : B × B → B
x1          0 0 1 1
x2          0 1 0 1
f0(x1,x2)   0 0 0 0   constant 0
f1(x1,x2)   0 0 0 1   x1 ∧ x2 (x1*x2)
f2(x1,x2)   0 0 1 0   x1 > x2 (0 = no)
f3(x1,x2)   0 0 1 1   x1
f4(x1,x2)   0 1 0 0   x1 < x2
f5(x1,x2)   0 1 0 1   x2
f6(x1,x2)   0 1 1 0   x1 ≠ x2
f7(x1,x2)   0 1 1 1   x1 ∨ x2 (x1+x2)
f8(x1,x2)   1 0 0 0   ¬(x1 ∨ x2)
f9(x1,x2)   1 0 0 1   x1 = x2 (x1 ⇔ x2)
fA(x1,x2)   1 0 1 0   ¬ x2
fB(x1,x2)   1 0 1 1   x1 ≥ x2
fC(x1,x2)   1 1 0 0   ¬ x1
fD(x1,x2)   1 1 0 1   x1 ≤ x2 (x1 ⇒ x2)
fE(x1,x2)   1 1 1 0   ¬(x1 ∧ x2)
fF(x1,x2)   1 1 1 1   constant 1
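As a rough check of this table, the sixteen functions can be enumerated in Python by reading the function number as its row of four results. The names truth_table and tables are ours, chosen for the illustration.

```python
from itertools import product

# The four possible inputs (x1, x2), in the column order of the table: 00, 01, 10, 11.
inputs = list(product((0, 1), repeat=2))

def truth_table(k):
    """Results of function number k (0 to 15) for the inputs 00, 01, 10, 11."""
    return tuple((k >> (3 - i)) & 1 for i in range(4))

tables = {k: truth_table(k) for k in range(16)}

# Compare a few rows with Python's own bit operators.
and_table = tuple(x1 & x2 for x1, x2 in inputs)
or_table = tuple(x1 | x2 for x1, x2 in inputs)
xor_table = tuple(x1 ^ x2 for x1, x2 in inputs)

print(tables[0x1] == and_table)  # True
print(tables[0x7] == or_table)   # True
print(tables[0x6] == xor_table)  # True
```

Each function number, written on four bits, is exactly its result row, which is why there are 2^4 = 16 functions.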
bit : type ;
cBit = 0 | 1 ;
The program
x : bit ;
x := 1 ;
Natural integer
Most electronic computers use binary numeration, as it is easy to distinguish only two
values. For example, 0 can mean no current and 1 that a current is passing. As integrated
circuits are etched once and for all on a limited space, we must also use a limited number of
digits for a number. With three bits we can write eight numbers, from 0 to 7
0 0 0 0
0 0 1 1
0 1 0 2
0 1 1 3
1 0 0 4
1 0 1 5
1 1 0 6
1 1 1 7
With n bits, we can write the numbers from 0 to 2^n - 1. Generally, we use 8, 16, 32 or 64
bits. We suppose that there is the type
cNat = cDigit,*(_|cDigit) ;
cDigit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
A | B | C | D | E | F ;
The program
copies the value of a natural integer into another natural integer. The virtual type
nats : vtype ;
// n ∈ [1..64], nats := nats | nat n ;
allows the above program to accept a natural integer with any number of bits.
n : nat ;
n := 123_456 ;
We use the same operator := to copy a bit or a natural integer. But the type of the
argument allows the compiler to know which program to use. We say that the := operator
is overloaded. We can copy a natural integer into another natural integer with a different
number of digits
n64 : nat 64 ;
n32 : nat 32 ;
n64 := 1 ;
n32 := n64 ;
An error is raised if the value of n64 exceeds 2^32 - 1. To enter a number in a base other than 10,
we use the modes
A mode is a program like any other, but it can be called anywhere, between a literal and
an operator. Our goal is to behave like some pocket calculators, where you can press the
bin or hex key at any time to change the current numeration base
n64 := bin 1001_0011 ;
Relative integer
0    | 1 1 0 1 0 0 0
sign | absolute value
If the circuits for natural integers are also used for relative integers, the result is
false
carry    1 1 1
         0 1 1 0      6
       + 1 0 1 0    + (-2)
       = 0 0 0 0      ?
We see that we need to add the carry to the result. We prefer to add 1 to the negation
2          0 0 1 0
¬2         1 1 0 1
(¬2)+1     1 1 1 0    -2

           0 1 1 0      6
         + 1 1 1 0    + (-2)
         = 0 1 0 0      4    (the carry out of the leftmost bit is dropped)
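The "negate, then add 1" rule can be tried out with a short Python sketch on 4 bits. The names BITS, MASK, negate and add are assumptions made for the illustration.

```python
BITS = 4
MASK = (1 << BITS) - 1  # 0b1111, keeps only the 4 lowest bits

def negate(x):
    """Opposite of x: invert every bit, then add 1."""
    return ((x ^ MASK) + 1) & MASK

def add(x, y):
    """Addition on BITS bits; the carry out of the leftmost bit is dropped."""
    return (x + y) & MASK

minus_two = negate(0b0010)
print(format(minus_two, "04b"))               # 1110
print(format(add(0b0110, minus_two), "04b"))  # 0100
```

With this encoding, the ordinary adder for natural integers gives the right answer for 6 + (-2).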
We can no longer use the circuits for natural integers to compare relative integers. We
consider that there is the type
Example
i : int 63 ;
i := hex -123_456 ; // i = -1_193_046
Any expression whose arguments are all constants is evaluated at compile time.
Fixed real
We can still use the circuits for integers to compute with real numbers. We need only to
multiply the number by a fixed power of two (e.g. 2^16) before storing it,
and divide by the same power of two before displaying it.
0 1 1 0 1 0 0 1 1 0 1 0 0 0 0 0
sign, integer part, fractional part (from left to right)
fixe(nat8,nat8) : type ;
// fixe(n1,n2) = bit + (n1-n2)*bit + n2*bit
fixes : vtype ;
// n1 ∈ [1..63], n2 ∈ [1..n1[, fixes := fixes | fixe(n1,n2) ;
// fixes := fixes | ints ;
A fixed literal is formed by a natural integer literal, a point, and a natural integer literal. If
the digits before or after the point are zeros, they can be omitted
Americans use a point where the French use a comma, and they tend not to put a zero before
the point. The program
f64 : fixe(31,32) ;
f64 := hex 123.456_789 ; // f64 ≈ 291,271_111_071_109_772
We can use an integer to store a fixed real: we need only to multiply the number by a
power of the numeration base before storing it, and divide it by the same power before
displaying it.
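A minimal Python sketch of this scaling, assuming 16 fractional bits; the names FRACTION_BITS, SCALE, to_fixed and from_fixed are ours.

```python
FRACTION_BITS = 16
SCALE = 1 << FRACTION_BITS  # 2^16

def to_fixed(x):
    """Multiply by the power of two before storing the number as an integer."""
    return round(x * SCALE)

def from_fixed(n):
    """Divide by the same power of two before displaying."""
    return n / SCALE

a = to_fixed(1.5)    # stored as the integer 98304
b = to_fixed(2.25)   # stored as the integer 147456
total = a + b        # plain integer addition
print(from_fixed(total))  # 3.75
```

Addition and subtraction work directly on the stored integers; a product of two stored values would carry the scale twice and would have to be divided by SCALE once.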
Floating real
Fixed reals cannot represent very large numbers. As we must always use a limited
number of bits, we keep only the most significant bits
These most significant digits are called the mantissa and the power of two (which must
also be stored) is the exponent. If we keep only one digit before the point, the number is
called a normalized floating-point number.
0 1 1 0 1 0 0 1 0 1 1 1 0
mantissa, then exponent (from left to right)
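Python's standard math.frexp splits a number into a mantissa and a power of two, which gives a quick way to look at this decomposition. The helper normalize is our own; it rescales the result so that there is one binary digit before the point, and it is only meant for positive, non-zero numbers.

```python
import math

def normalize(x):
    """Return (mantissa, exponent) with x = mantissa * 2**exponent and 1 <= mantissa < 2."""
    m, e = math.frexp(x)  # frexp returns a mantissa m with 0.5 <= m < 1
    return m * 2, e - 1

mantissa, exponent = normalize(291.25)
print(mantissa, exponent)  # 1.1376953125 8
```

Here 291.25 = 1.1376953125 * 2^8, so only the mantissa bits and the small integer 8 need to be stored.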
The program
( reals := reals ) : program ;
x : real ;
x := hex 123.456_789*10^A ; // x ≈ 3.202_559_735*10^14
Unary functions are evaluated first, then exponentiation, division and multiplication,
addition and subtraction, then comparison. The following instructions are equivalent
x := 1 < 2 + 3 * 4 ^ ln 2 ;
x := ( 1 < ( 2 + ( 3 * ( 4 ^ ( ln 2 ) ) ) ) ) ;
Character
A character is used to store the 26 letters (52 if we want both lowercase and uppercase), the 10
digits, and other symbols (+, -, *, /, =, <, >, :, ., !, ?, etc.). We need at least 6
bits, since 26 + 10 = 36 exceeds 2^5 = 32 while 2^6 = 64. ASCII (American Standard Code for
Information Interchange) is usually stored on 8 bits; its 128 characters (a 7-bit code) are
    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1   DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL
Unicode, which originally used 16 bits per character, extends this set: its first 128
characters are the same as ASCII. We suppose that there is the type
The program
c : char ;
c := "T" ;
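The code of a character can be inspected with Python's built-in ord and chr, which makes it easy to locate a character in the ASCII table. The variable names are ours.

```python
code = ord("T")                 # numeric code of the character
print(hex(code))                # 0x54

# The first hexadecimal digit is the row of the table, the second is the column.
row, column = divmod(code, 16)
print(row, column)              # 5 4

print(chr(0x54))                # T
```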
And/or operator
There is a simple way to convert any boolean function into an expression using only the
and, or and not operators. First, we consider the results of the function
           0 1 2 3 4 5 6 7
a          0 0 0 0 1 1 1 1
b          0 0 1 1 0 0 1 1
c          0 1 0 1 0 1 0 1
f(a,b,c)   1 0 1 0 0 0 1 0
We consider only the columns where the result is 1 (if there are more 1 than 0, we
consider the columns where the result is 0 and negate the final expression). Column 2
means, for example, that f(a,b,c) is true if a is false, b is true and c is false. In
other words, f(a,b,c) is true if (¬a) is true, b is true and (¬c) is true. Column 2 is
equal to
(¬a)*b*(¬c) = (¬0)*1*(¬0) = 1
In the same way, columns 0 and 6 are equal to
(¬a)*(¬b)*(¬c) = (¬0)*(¬0)*(¬0) = 1
a*b*(¬c) = 1*1*(¬0) = 1
The function is the "or" of these three terms
f(a,b,c) = (¬a)*(¬b)*(¬c) + (¬a)*b*(¬c) + a*b*(¬c)
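The column-by-column construction can be sketched in Python. The function sum_of_products is a hypothetical helper of ours; it writes "and" as *, "or" as + and "not" as ¬, and it only handles the case where the columns with result 1 are used.

```python
from itertools import product

def sum_of_products(names, results):
    """Build an and/or/not expression from the result row of a truth table.

    names   -- the variable names, most significant first (e.g. "abc")
    results -- the value of the function for each column 0, 1, 2, ...
    """
    terms = []
    for values, result in zip(product((0, 1), repeat=len(names)), results):
        if result == 1:
            # A variable at 1 appears as is, a variable at 0 appears negated.
            factors = [n if v else f"(¬{n})" for n, v in zip(names, values)]
            terms.append("*".join(factors))
    return " + ".join(terms)

expr = sum_of_products("abc", [1, 0, 1, 0, 0, 0, 1, 0])
print(expr)  # (¬a)*(¬b)*(¬c) + (¬a)*b*(¬c) + a*b*(¬c)
```

Each column with result 1 contributes one product, so a function with few 1 in its table gives a short expression.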
Cabled computer
In a cabled (hard-wired) binary computer, each variable is carried by a wire and can have only two
values. 1 is represented by a voltage of 5 V and 0 by 0 V. To build the not operator, we
suppose that an electromagnet can push a switch if there is a current
[Figure: switches pushed by electromagnets, wired to build the operators not a, a and b, a or b]
By combining these three operators, we can build complex functions. If the functions must be
modified, we need to rebuild all the circuits.
Programmed computer
With a programmed computer, the circuits are etched once and for all. As we have a limited
space on the chips, we also have a limited number of operators. Each operator has a code
(for example, 0 = addition, 1 = multiplication, etc.). By sending these codes on the
instruction bus, we can select the right operator
[Figure: the instruction bus selects one of the operators +, *, -, / (codes 0000, 0001, 0010, 0011), which is applied to the data to give the result]
Generally, the instruction code has 8 or 16 bits. The data and result are stored in a
“memory”, a set of bits side by side. Each bit (or byte, 8 bits) has an identifier (an
address), which is a simple natural integer
[Figure: the processor connected to the memory by an address bus and a data bus]
Generally, an address is formed by 16, 24, 32 or 64 bits and the data bus can have 8, 16,
24, 32 or 64 bits. An instruction or data greater than the data bus must be read in two or
more steps.
Assembler instruction
Some processors have a special memory called the accumulator register, where a data item and the
result are stored. Such a processor can execute the following instructions
– load x: load the data x in the accumulator
– load_at a: load the data at the address a to the accumulator
– add x: add the data x with the accumulator and put the result in the accumulator
– add_at a: add the data at the address a with the accumulator and put the result in
the accumulator
– store a: copy the accumulator to the memory at the address a
We need also to indicate the number of bits to load or to store
– load_int8 x: load an integer of 8 bits into the accumulator
– load_int8_at a: load an integer of 8 bits at the address a into the accumulator
– load_int16 x: load an integer of 16 bits into the accumulator
– load_int16_at a: load an integer of 16 bits at the address a into the
accumulator, etc.
As a binary computer uses only 0 and 1, the instruction codes must also be made of 0 and 1.
In practice, we write them in hexadecimal, and a program called an “assembler” transforms these
hexadecimal codes into binary code. For example, load_int16_at 275 becomes
A1 0000 0113: this instruction has 5 bytes, 1 for the code and 4 for the address
A1      00 00 01 13
code    parameter
We can even use words for the instruction code and decimal notation for the data.
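The translation done by the assembler can be illustrated by a small Python sketch. The table OPCODES contains only the code quoted in the text, the function assemble is a hypothetical helper, and we assume that the address is written with its most significant byte first, as in the example.

```python
OPCODES = {"load_int16_at": 0xA1}  # only the instruction code quoted in the text

def assemble(mnemonic, address):
    """Return the machine code: 1 byte for the code, then 4 bytes for the address."""
    # Assumption: the most significant byte of the address comes first.
    return bytes([OPCODES[mnemonic]]) + address.to_bytes(4, "big")

machine_code = assemble("load_int16_at", 275)
print(machine_code.hex(" ").upper())  # A1 00 00 01 13
print(len(machine_code))              # 5
```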
Assembler program
A program is a set of instructions placed side by side in the memory. At boot time,
the processor executes the instruction at address 0. After that, it increments a special
register called the “instruction pointer” by a number depending on the length of the last
instruction. Then the processor loads and executes this new instruction, and so on. Example:
compute the sum of the 16-bit integer at address 966 and the number 4205, and put
the result at address 747
0 load_int16_at ( 966 ) ;
5 add_int16 ( 4205 ) ;
10 store_int16 ( 747 ) ;
The numbers on the left are the addresses of the instructions. The processor also has a
special memory called the “state register”, with 8 or 16 bits (flags), where it stores the result of
comparisons and of multiplications or divisions by two (shifts)
– comp x: compare x with the accumulator and put the result in two flags
– comp_at a: compare the data at the address a with the accumulator and put the
result in two flags
– shift: shift the bits of the accumulator to the left and put the most significant bit
in a flag
– goto a: go to the instruction at the address a unconditionally
– jump_eq a: jump to the address a if the last comparison gave an equal result, else
continue with the following instruction, etc.
Example: compute the absolute value of the integer at the address 966 and put the result
at the address 747
0 load_int16_at ( 966 ) ;
5 comp_int16 ( 0 ) ;
10 jump_lt ( 16 ) ;
15 neg ;
16 store_int16 ( 747 ) ;
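To follow such programs step by step, here is a rough Python simulation of an accumulator machine. It is a sketch under our own assumptions: the program is given as a dictionary from instruction address to instruction, moving to the next listed address stands in for "incrementing the instruction pointer by the instruction length", jump_lt is taken when the compared value is less than the accumulator, and 16-bit overflow is ignored.

```python
def run(program, memory):
    """Run {address: (instruction, argument)} on a one-accumulator machine."""
    addresses = sorted(program)
    acc = 0            # accumulator register
    flag_lt = False    # state register: compared value < accumulator?
    ip = addresses[0]  # instruction pointer
    while ip is not None:
        op, arg = program[ip]
        # Address of the following instruction, unless a jump changes it.
        index = addresses.index(ip) + 1
        next_ip = addresses[index] if index < len(addresses) else None
        if op == "load_int16_at":
            acc = memory[arg]
        elif op == "add_int16":
            acc += arg
        elif op == "comp_int16":
            flag_lt = arg < acc
        elif op == "jump_lt":
            if flag_lt:
                next_ip = arg
        elif op == "neg":
            acc = -acc
        elif op == "store_int16":
            memory[arg] = acc
        else:
            raise ValueError(f"unknown instruction {op}")
        ip = next_ip
    return memory

absolute = {
    0: ("load_int16_at", 966),
    5: ("comp_int16", 0),
    10: ("jump_lt", 16),
    15: ("neg", None),
    16: ("store_int16", 747),
}
print(run(absolute, {966: -12})[747])  # 12
print(run(absolute, {966: 7})[747])    # 7
```

With a positive value, the jump skips the neg instruction; with a negative value, the comparison fails and neg is executed before the store.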
As we can see, giving the address of each data item or instruction by hand is tedious and
error-prone. We use names and labels to identify them
x = 966 ;
y = 747 ;
0 load_int16_at ( x ) ;
5 comp_int16 ( 0 ) ;
10 jump_lt ( lbl2 ) ;
15 neg ;
16 lbl2 : store_int16 ( y ) ;
Generally, a “memory manager” can find a free area to store the data and instructions. Our
goal is to write instructions like those in a mathematical formula; a program called a
“translator” transforms these formulas into a set of 0 and 1. There are two types of
translator
– an interpreter translates one line of instructions and executes it without storing the
executable code
– a compiler translates a whole “source” program and stores the result in an
“executable file”
An interpreter is convenient if the program must be modified frequently, as there is nothing
to rebuild, but compiled code runs faster if a set of instructions must be repeated, as it is
translated only once.
B. LOGICAL STRUCTURE
Control structures
x : real ;
x := 1 ;
(bit then *it *(if bit then *it) else *it end) : program ;
evaluates the first bit. If it is true, the first instruction list is executed and the
program ends. Otherwise the second condition is evaluated (if there is one). If none
of the conditions is true, the last instruction list is executed
d := b^2 - 4*a*c ;
d < 0 then
n := 0
if d = 0 then
n := 1 ;
x1 := -b/(2*a)
else
n := 2 ;
x1 := (-b+sqr d)/(2*a) ;
x2 := (-b-sqr d)/(2*a)
end ;
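For comparison, the same three-way test can be written in Python. This is a sketch: the function real_roots is our own, and we assume that sqr in the program above denotes the square root.

```python
import math

def real_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a must not be zero)."""
    d = b**2 - 4*a*c
    if d < 0:
        return []                       # n = 0
    elif d == 0:
        return [-b / (2*a)]             # n = 1
    else:
        root = math.sqrt(d)             # assumption: "sqr" is the square root
        return [(-b + root) / (2*a), (-b - root) / (2*a)]  # n = 2

print(real_roots(1, -3, 2))  # [2.0, 1.0]
print(real_roots(1, 2, 1))   # [-1.0]
print(real_roots(1, 0, 1))   # []
```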
x < 0 then
y := - x
else
y := x
end ;
y := x ;
x < 0 then
y := - x
end ;
y := 1 ; /* y = a^b */
i := 1 ;
do i > b exit
y := y * a ;
i := i + 1 ;
end ;
y := 1 ;
i :: [1..b]/1 do
y := y * a
end ;
As the step is equal to 1, we can also write
y := 1 ;
i :: [1..b] do
y := y * a
end ;
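The two ways of looping can be compared in Python. Both functions are our own sketches; the first one assumes that the loop stops once i has passed b, so that exactly b multiplications are done, and both assume that b is a natural integer.

```python
def power_with_exit(a, b):
    """y = a^b with an explicit counter and an exit test at the top of the loop."""
    y = 1
    i = 1
    while True:
        if i > b:
            break
        y = y * a
        i = i + 1
    return y

def power_with_range(a, b):
    """y = a^b with a loop over the range [1..b]."""
    y = 1
    for i in range(1, b + 1):  # i takes the values 1, 2, ..., b
        y = y * a
    return y

print(power_with_exit(2, 10))   # 1024
print(power_with_range(2, 10))  # 1024
```

The range form hides the counter and the exit test, which removes the two places where an off-by-one error is most likely.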
Expression
A literal expression is
– a real literal
1.234
– a string literal
" alpha "
– a record literal
( "+" ; 123 )
– a range literal
[1..5[
– an identifier
c.x
– an identifier, a space (optionally) and an expression (prefixed notation)
ln x
– an expression, a point and a word (postfixed notation)
x.ln
– an operator and an expression
- x
– an expression, an operator and an expression (infixed notation)
1 + 2
– expressions separated by words and (optionally) enclosed by two words
c then a else b end
cExpr = cFixe |
cString |
cRec |
cId |
cId,( ),cExpr |
cExpr,.,cId |
cOp,cExpr |
cExpr,cOp,cExpr |
(cWord),cExpr,(cWord,cExpr),(cWord) ;
The program
y : real ;
y := e ;
Function
tan : R → R
x ↦ sin x / cos x
becomes
x :/ real ;
tan real : fn real ;
tan x ::= sin x / cos x ;
This function works on every type that has the sin, cos and / functions, so we can write
t :/ type ;
x :/ t ;
tan x ::= sin x / cos x ;
Such functions are called generic. For a function that cannot be defined by a simple
expression, we must define the assignment. Example: the absolute value
t :/ type ; x :/ t ; y :/ t ;
y = abs x :- (
x < 0 then
y := - x
else
y := x
end
) ;
If the function has only one parameter, we can use an operator rather than a word
t :/ type ; x :/ t ;
°x ::= x*180/pi ;
If the function has only two parameters, we can put the operator between them. Example:
addition of two complex numbers
t :/ type ;
xy t ::= sum ( x : t ; y : t ) ;
(x1, x2) :/ 2*xy t ;
(x1 + x2) ::= (x1.x+x2.x, x1.y+x2.y) ;
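Python offers a comparable mechanism: a class can define what the + operator means for its values. The sketch below is only an analogy with the definition above; the class name XY is ours and the components are plain Python numbers rather than a generic type t.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class XY:
    """A pair (x, y), used here as a complex number."""
    x: float
    y: float

    def __add__(self, other):
        # Called when the + operator is written between two XY values.
        return XY(self.x + other.x, self.y + other.y)

print(XY(1, 2) + XY(3, 4))  # XY(x=4, y=6)
```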
Program
When a result is also an input data item, or when there are no parameters at all, we use a program
program : type ;
cProg = *it ;
The program
copies the value of a program into another program. If a program has parameters, they
must be declared before initializing the program. There is no way to distinguish a data item, a
result or a data–result. Example:
t :/ type ; x :/ t ;
a :/ t ;
add ( a ; x ) ::= (
x := x + a
) ;
If the program has only two parameters, we can use an operator between them
t :/ type ; x :/ t ;
a :/ t ;
( a :=+ x ) ::= (
x := x + a
) ;
Instruction
it : type ;
An instruction literal is
– an identifier
stop
– an identifier and an expression
new p
– an expression, an operator and an expression
x := 1
– expressions, separated by words and (optionally) between two words
x then a1 else a2 ; a3 end ;
cIt = cId |
cId,cExpr |
cExpr,cOp,cExpr |
(cWord),cExpr,(cWord,cExpr),(cWord) ;
Address
An address (or a pointer) is a natural integer identifying a bit (or a byte) of the memory.
If the memory has n bytes, an address must have at least log_2 n bits (rounded up). Generally, we use
16, 24, 32 or 64 bits. The program
searches for a free part of the memory able to hold the given type. The pseudo–function