Revision Questions
David Smith
Question 1:
a) The Machine Abacus represents a Virtual machine.
i. What is meant by a Virtual Machine?
ii. Write a program in the language of Abacus to add two numbers that are found in memory
locations 1 and 2; the result of the computation should be found in location 0.
iii. Trace the execution of your program for memory initialised as [0,2,1,0,0,...].
b) Using the language Haskell,
i. Write a recursive definition for the function add.
ii. Using your definition evaluate the expression add 2 3.
c) With the aid of the programs that you have written in parts a) and b), compare and contrast the
virtual machines presented by the imperative language Abacus with that of the declarative language
Haskell.
d) By choosing suitable predicates, represent the following assertions in Prolog.
i. Tom is Susan's father. Jenny is Tom's mother. All fathers are male and all mothers are
female.
ii. By using the predicates introduced above, write down the definitions for the predicates
parentOf(X,Y), read as "X is the parent of Y", and ancestorOf(X,Y), read as "X is the ancestor
of Y".
iii. Write down the goal clauses to determine
- all the females
- all the parents
Question 2:
a)
i. Write down a Regular grammar to generate the language of strings of alternating 0s and 1s. For
example 0, 1, 01, 10, 010, 101, 0101 and 1010 all belong to the language, whereas 00, 10100 and 11010
do not.
ii. Construct a FSM to accept such a language.
iii. Trace your FSM for the string 10101.
b)
i. Write down a grammar to generate the language L = {0^n 1^n : n > 0}. Explain why a FSM cannot
accept this language.
ii. Construct a program for a suitable Abstract machine that accepts L, clearly explaining the
machine's architectural features.
iii. Trace your program for the string 00011.
Question 3:
a)
A game consists of two towers, tower A and tower B and a stock of eight discs. Tower A may
hold at most 5 discs whereas tower B may hold at most 3 discs. Starting with both towers empty, the
object of the game is to end up with exactly 2 discs on tower B.
The following diagram depicts a state of a game with two discs on Tower A and no discs on tower B.
[Figure: Tower A holding two discs and Tower B holding none]
The rules of the game are described as follows
Stacking Rules
You may completely fill up either tower by adding a suitable number of discs from your stock.
For example if tower A contains 2 discs (as shown) then you can add exactly three discs from your
stock to tower A.
Emptying Rules
You may remove all the discs from either tower.
For example you can remove the two discs from tower A.
Transfer Rules
You may transfer discs from one tower to another with the following proviso:
you must move as many discs as possible from the source tower to the destination tower, leaving any
remaining discs on the source tower.
For example if tower A holds 5 discs and tower B holds 0 discs then we can move three of the discs
from tower A (leaving two) and place these on tower B.
If tower A holds two discs then both of these can be moved to tower B.
a)
By representing the state of the game as a pair (x,y) where x denotes the number of discs on
tower A and y denotes the number of discs on tower B, write down the rules of the game as a
production (state transition) system.
b)
Draw a complete search graph to depict all the states that may be reached from the initial state
(0,0) by applying the rules.
c)
Write down a breadth first algorithm to search the state space from a given initial state to a
goal state. The algorithm should return true if the goal state may be reached by applying the rules of the
game and false otherwise.
d)
Modify your answers to parts a), b) and c) above to determine the sequence of moves to arrive
at the goal state from the initial state.
Question 4:
a) The following chip takes as input three binary values x1, x2 and x3 and outputs (y) 1 if two or more
of the inputs are set to 1 and 0 otherwise.
[Figure: a chip with inputs x1, x2 and x3 and output y]
Question 5:
i. Describe the Architectural features of a Turing Machine.
ii. Write a program for a Turing Machine that adds 1 to a number. Describe how such numbers are
to be represented in your machine.
iii. Explain what it is meant by the Church-Turing Thesis.
iv. Explain what is meant by the Halting Problem.
v. Show that if the halting problem is solvable then an algorithm can be written to determine
whether the equation xn = nx+1 has a solution for a given value of x.
For example
when x = 0 a solution to the equation is n = 1
when x = 1 a solution to the equation is n = 1
when x = 2 there is no solution to the equation
That is given that you have an algorithm to solve the halting problem, you are required to write
an algorithm for the function f(x) such that
f(x) = 1 if there is an n such that xn = nx+1
f(x) = 0 otherwise
Question 6:
a)
The polymorphic data type Tree may be defined recursively as follows
data Tree a = Leaf | Node (Tree a) a (Tree a)
i. Write down the constructors of the data type Tree.
ii. Explain what is meant by the term polymorphic in the above context.
iii. A form of polymorphism is implemented in OO languages such as Java by overloading. Explain what is
meant by overloading and give an example that you have met.
iv. Express the following tree using the data structure defined above.
[Figure: tree diagram; only the node labels 12 and 11 and the Leaf terminals survive]
Write algorithms to
v. Determine whether a given item isOn a given tree.
vi. Determine the sum of the elements on a tree of numbers.
vii. Determine the greatest element on an ordered Tree
b)
A stack is a very important data structure found in computer science.
i. Write down the four operations (methods) that are associated with a Stack, identifying their
nature. (Strictly there are five if you include the constant empty stack.)
ii. Write a Java class to realise a polymorphic Stack (the data type is Object).
iii. Write a Recursive definition in Haskell of a polymorphic Stack and implement the four
operations that you have identified earlier.
Question 7:
a) Write down the time-complexity of each of the following algorithms and depict each complexity by
way of a graph, stating any assumptions that you make.
i)
isIn x [ ] = False
isIn x (y:ys)
  | x == y    = True
  | otherwise = isIn x ys
ii)
data Tree a = Leaf | Node (Tree a) a (Tree a)
isOn x Leaf = False
isOn x (Node lt y rt)
  | x == y    = True
  | x < y     = isOn x lt
  | otherwise = isOn x rt
b)
Describe what is meant by the Knapsack problem and with the aid of a suitable example show how the
complexity of this problem may be used to advantage when transmitting encrypted data.
You should describe the coding and decoding techniques for a simple example of your own choice.
Question 8:
a)
With the aid of a suitable diagram, describe the architecture of a simple perceptron (threshold
logic unit).
b)
With the aid of a simple example describe the type of problems that simple perceptrons may
be trained to solve.
c)
For example, the representation of the digit 1 is given in the diagram below.
State a suitable data structure to represent such a grid and specify the value of such a structure for the
representation of the digit 1 (as shown).
d)
With the aid of a diagram, design a suitable neural network that may be trained to recognise
the ten digits 0,1,2,3,4,5,6,7,8,9.
e)
Well trained neural networks are resilient to noise. Explain what this means in the context of
this example.
Question 9:
a) You are given a bag containing at least one coin. One coin in the bag is counterfeit. All other coins,
if there are any, are good and identical. The counterfeit coin weighs less than the good coins.
You have a balance which allows you to compare the weights of coins.
Write an O(n) algorithm which uses the balance to identify the counterfeit coin. Your algorithm must
be correct, it need not be efficient, it can be recursive or iterative. Remember to include any
preconditions.
b) Now devise another algorithm to do the same thing, but this time the algorithm must have a time
complexity better than O(n). It is not acceptable to give a second O(n) algorithm. Assume that the
balance can hold any number of coins on either side.
c) For both your algorithms, what is the worst case (greatest number of weighings necessary) if there
are 8, 256 or 1024 coins in the bag?
Question 10:
A travelling salesman wishes to make a journey from his home town (H) to visit three towns, Andover
(A), Basingstoke (B) and Croydon (C) and finally returning home. He visits each town only once. His
journey (J) may be represented as a sequence (list) of towns visited. In particular, [H,A,B,C,H]
represents the journey starting from his Home town, calling first at Andover then Basingstoke then
Croydon and finally returning home. The total distance (d) that he travels is simply the sum of the
distances between the towns that he visits (see the table below).
We may represent the state of his journey by the 3-tuple (J,d,T) where T represents the set of towns to
be visited.
For example, the 3-tuple ([H],0,{A,B,C}) represents the initial state and the 3-tuple
([H,A,B,C,H],125,{}) represents one of many final states (in this case the distance travelled for this
completed journey is 125 miles).
a)
Based on the above representation, write an uninformed (blind) search algorithm to generate
the different possible journeys the salesman may take.
b)
Construct the search tree for all the possible different journeys that the salesman can make.
c)
d)
i) Define a suitable heuristic that will lead to a more informed search algorithm.
ii) Write down the complexity of the search algorithm using your heuristic.
iii) Comment on the suitability of your heuristic to solve the problem.
Distance in miles between towns
      H    A    B    C
H     -   10   15   45
A    10    -   20   35
B    15   20    -   50
C    45   35   50    -
Question 11:
The language L() may be defined over the alphabet {(,)} by the following phrase structured grammar
<S> ::= ( ) | (<S>) | <S><S>
a) Draw a parse tree for the sentence ( )(( )) of L().
b)
i)
ii)
c) Tabulate a trace of your program to show how it accepts the sentence ( )(( )).
Question 12:
a) Describe the main architectural features of the 8-bit machine Abacus that you have studied. Your
answer should include how programs and data are stored in the machine and the constraints imposed by
a fixed word length.
Using the instruction set, write a program to output the larger of two numbers stored in location 1 and
location 2. Assume that the memory space is initially [0,2,5,0,0,...].
b) Languages can be classified according to the Chomsky hierarchy, the most important types of which in
computer science are type 3 and type 2 grammars. Give an example of each of these two grammars and
describe the type of machines that can act as acceptors for each of the grammar types.
c) Explain the purpose of compilation and outline the four major steps in this process.
Answers:
Question 1:
a)
i. A Virtual Machine is a machine that is simulated by another (virtual or real). In particular, the
virtual Machine Abacus is implemented in the programming language Java that runs on your computer
(a real machine).
ii. It is often instructive to write comments to describe the function of each program segment.
clr 0
dec 1
beq 3
inc 0
jmp -3
dec 2
beq 3
inc 0
jmp -3
stop
iii.
The trace is for the above program running on Abacus with the memory initially set as [*,2,1,...]. The
pc and sr are initially set to 0. I have included the step number and the next instruction to be executed
(as given by the value of the pc). Note that sr is only set if there is an attempt to decrement a memory
location that contains the value 0.
step  pc  sr  memory address   instruction to be executed
              0    1    2
0     0   0   *    2    1      clr 0
1     1   0   0    2    1      dec 1
2     2   0   0    1    1      beq 3
3     3   0   0    1    1      inc 0
4     4   0   1    1    1      jmp -3
5     1   0   1    1    1      dec 1
6     2   0   1    0    1      beq 3
7     3   0   1    0    1      inc 0
8     4   0   2    0    1      jmp -3
9     1   0   2    0    1      dec 1
10    2   1   2    0    1      beq 3
11    5   0   2    0    1      dec 2
12    6   0   2    0    0      beq 3
13    7   0   2    0    0      inc 0
14    8   0   3    0    0      jmp -3
15    5   0   3    0    0      dec 2
16    6   1   3    0    0      beq 3
17    9   0   3    0    0      stop
18    10  0   3*   0    0
* 3 output by virtual machine
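The trace can be checked mechanically. The following is a minimal sketch of a simulator, not the real Abacus implementation: the class name AbacusSketch and the instruction semantics (relative jmp and beq, dec leaving a zero location unchanged and setting sr, beq clearing sr) are assumptions inferred from the trace above.

```java
import java.util.Arrays;

// Hypothetical simulator; instruction semantics are inferred from the trace,
// not taken from an Abacus specification.
final class AbacusSketch {
    // Runs a program of {opcode, operand} pairs and returns the final memory.
    static int[] run(String[][] program, int[] initialMemory) {
        int[] mem = Arrays.copyOf(initialMemory, initialMemory.length);
        int pc = 0;          // program counter
        boolean sr = false;  // status register, set by dec on a zero location
        while (pc < program.length) {
            String op = program[pc][0];
            if (op.equals("stop")) {
                break;
            }
            int arg = Integer.parseInt(program[pc][1]);
            switch (op) {
                case "clr": mem[arg] = 0; pc++; break;
                case "inc": mem[arg]++; pc++; break;
                case "dec":
                    sr = (mem[arg] == 0);      // assumed: zero stays zero, sr is set
                    if (!sr) { mem[arg]--; }
                    pc++;
                    break;
                case "beq":                    // assumed: relative branch when sr is set
                    pc += sr ? arg : 1;
                    sr = false;
                    break;
                case "jmp": pc += arg; break;  // assumed: relative jump
                default: throw new IllegalArgumentException("unknown instruction: " + op);
            }
        }
        return mem;
    }

    // The addition program from part ii, run on memory [7,2,1,0,0].
    static int[] addExample() {
        String[][] program = {
            {"clr", "0"}, {"dec", "1"}, {"beq", "3"}, {"inc", "0"}, {"jmp", "-3"},
            {"dec", "2"}, {"beq", "3"}, {"inc", "0"}, {"jmp", "-3"}, {"stop"}
        };
        return run(program, new int[] {7, 2, 1, 0, 0});
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(addExample()));
    }
}
```

Location 0 starts with an arbitrary value (7 here) to show that clr 0 makes the result independent of it.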
b)
i. There are many such definitions; here is one of them. The recursion is on the first argument.
add 0 n = n                   -- rule 1
add m n = 1 + (add (m-1) n)   -- rule 2
ii.
add 2 3
  →(rule 2)      1 + (add 1 3)
  →(rule 2)      1 + (1 + add 0 3)
  →(rule 1)      1 + 1 + 3
  →(arithmetic)  5
Note that I have shown the application of each rule by an arrow; the arrow labelled "rule 2" should be read as
"evaluates to, by rule 2".
Note: I have included additional material in these answers. Such complete answers (indicated by green)
would not be required to obtain full credit in an examination.
c)
There are a number of points that you should make.
Abacus programming is based upon the underlying computer architecture presented by the virtual
machine. In particular, the programmer has to be concerned with memory addresses and memory
addressing (there is a difference!) and the constraints imposed by a fixed word length. There are no
high level features such as mnemonics for addresses or names for methods
(called macros in this context). For example, it would be nice to be able to define a macro copy(n,m)
that copies memory location n to m and a macro add(n,m) that adds location n to m. Assembly
languages, in general, do support such features and contain code similar to the following
copy(1,0)
add(2,0)
stop
macro add(n,m)
loop: dec n
beq end
inc m
jmp loop
end:
macro copy(n,m)
clr m
add(n,m)
An Assembler (the compiler for an assembly language) would then perform textual substitutions for the
code found in each macro giving
clr 0
add(1,0)
add(2,0)
stop
which in turn expands further to
clr 0
loop1: dec 1
beq end1
inc 0
jmp loop1
end1:
loop2: dec 2
beq end2
inc 0
jmp loop2
end2:
stop
and finally replacing the loop mnemonics in the jmp instructions by -3 and the end mnemonics in the beq
instructions by 3, resulting in our original program.
In contrast, a language such as Haskell does not require knowledge of the underlying computer
architecture. Its computational semantics is based on a production system; that is, expressions are
evaluated according to the rules given by the definitions of the functions. Functional languages rely
heavily on the use of lists and recursion.
d)
We introduce the named predicates
father(X,Y)   read as "X is the father of Y"
mother(X,Y)   read as "X is the mother of Y"
male(X)       read as "X is male"
female(X)     read as "X is female"
i)
father(tom,susan).
mother(jenny,tom).
male(X) :- father(X,_).
female(X) :- mother(X,_).
ii)
parentOf(X,Y) :- father(X,Y);mother(X,Y).
ancestorOf(X,Y) :- parentOf(X,Y).
ancestorOf(X,Y) :- parentOf(X,Z),ancestorOf(Z,Y).
iii)
Goal clauses are
female(X).
parentOf(X,_).
Notes:
The underscore is an unnamed variable; it is required when we write the definitions of (some
of) the predicates or when expressing (some of) the goal clauses.
:-  should be read as "provided that"
;   should be read as "or"
,   should be read as "and"
All clauses must be terminated by the period (.)
All named variables start with an upper case letter (X and Y are used in the above program);
all other names must start with a lower case letter.
The ancestor clause is analogous to the above clause in Blocks World.
There are other features supported by Prolog these will not be examined.
Question 2:
a)
i. There are many ways to define this language; here is one of them.
<S> ::= 0 | 1 | 0<W> | 1<Z>
<W> ::= 1 | 1<Z>
<Z> ::= 0 | 0<W>
ii. You can either draw a table or draw a diagram. The following table reflects the grammar
given above.
State   Input   Next State
start   0       accept*
start   0       one
start   1       accept*
start   1       zero
one     1       accept*
one     1       zero
zero    0       accept*
zero    0       one
Folding the accept state into the states one and zero (both marked * as accepting) gives the
deterministic table
State   Input   Next State
start   0       one*
start   1       zero*
one     1       zero*
zero    0       one*
iii. The trace for the string 10101 is
State   Input   Next State
start   1       zero
zero    0       one
one     1       zero
zero    0       one
one     1       zero
The machine has read the last character and is in the accepting state zero, hence it accepts the string 10101.
Now for the string 110
State   Input   Next State
start   1       zero
zero    1       ???
The transition is not defined from state zero if the input is a 1, hence the FSM fails and thus 110 is not
accepted as a string of the language.
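The two traces above can be reproduced in code. The following is a minimal sketch of the deterministic machine, assuming that an undefined transition leads to an explicit FAIL state; the class, enum and method names are illustrative and not part of the course material.

```java
// Hypothetical acceptor for strings of alternating 0s and 1s.
final class AlternatingFsm {
    enum State { START, ONE, ZERO, FAIL }

    // ONE: the last symbol read was 0, so only a 1 may follow; ZERO is the mirror image.
    static State step(State s, char c) {
        switch (s) {
            case START: return c == '0' ? State.ONE : c == '1' ? State.ZERO : State.FAIL;
            case ONE:   return c == '1' ? State.ZERO : State.FAIL;
            case ZERO:  return c == '0' ? State.ONE : State.FAIL;
            default:    return State.FAIL;  // FAIL is a trap state
        }
    }

    // Accepts when the whole input has been read and the machine is in ONE or ZERO.
    static boolean accepts(String input) {
        State s = State.START;
        for (char c : input.toCharArray()) {
            s = step(s, c);
        }
        return s == State.ONE || s == State.ZERO;
    }

    public static void main(String[] args) {
        System.out.println(accepts("10101"));
        System.out.println(accepts("110"));
    }
}
```

The empty string is rejected because START is not an accepting state, which matches the grammar: every derivation produces at least one symbol.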
b)
i.
<X> ::= 0<X>1 | 01
This language cannot be accepted by a FSM since the machine needs to count the number of leading zeros in the
input string. This may only be achieved by associating a state with each number; this number is not
bounded, and hence counting is not achievable in a Finite State Machine.
ii. The grammar is Type 2 (or context free). Thus the language may be accepted by a PDA. A
program for this is
State   Input   Current Stack   New State   New Stack
start   0       *               start       0:*
start   1       0:*             last        *
last    1       0:*             last        *
The PDA accepts by empty stack; that is, the PDA accepts a string of tokens provided that there is a
sequence of transitions such that, on reading the last token, the PDA's stack becomes empty.
The question also asks you to describe a PDA: you should emphasise that it consists of a FSM and an
unbounded Stack. You should describe the methods associated with a Stack. See the lecture notes for a
fuller description.
iii. The trace for the string 00011 is as follows
State   Input   Current Stack   New State   New Stack
start   0       []              start       0:[]
start   0       0:[]            start       0:0:[]
start   0       0:0:[]          start       0:0:0:[]
start   1       0:0:0:[]        last        0:0:[]
last    1       0:0:[]          last        0:[]
We have reached the end of the input string and the machine's stack is not empty; this string is
therefore not accepted by the PDA and hence does not belong to the language.
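The PDA program can be sketched in Java as follows. This is an illustration under stated assumptions, not the lecture notes' machine: the class name ZeroOnePda is hypothetical, the two states are encoded as a boolean, and an undefined transition is treated as immediate rejection.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical PDA for {0^n 1^n : n > 0}, accepting by empty stack.
final class ZeroOnePda {
    static boolean accepts(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        boolean last = false;  // false = state "start", true = state "last"
        for (char c : input.toCharArray()) {
            if (!last && c == '0') {
                stack.push('0');   // start, 0: push a 0 and stay in start
            } else if (c == '1' && !stack.isEmpty()) {
                stack.pop();       // start or last, 1: pop a 0 and move to last
                last = true;
            } else {
                return false;      // no transition is defined
            }
        }
        // At least one token must have been read and the stack must be empty.
        return last && stack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(accepts("00011"));
        System.out.println(accepts("000111"));
    }
}
```

For 00011 the loop finishes with one 0 still on the stack, so the string is rejected, in agreement with the trace.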
Note: you won't get any credit by referring the examiner to the lecture notes, sorry.
Question 3:
a)
Stacking Rules:
s.1  (x,y) → (5,y)
s.2  (x,y) → (x,3)
Emptying Rules:
e.1  (x,y) → (0,y)
e.2  (x,y) → (x,0)
Transfer Rules:
t.1  (x,y) → (x+y,0)      if x+y ≤ 5
t.1  (x,y) → (5,x+y-5)    if x+y > 5
t.2  (x,y) → (0,x+y)      if x+y ≤ 3
t.2  (x,y) → (x+y-3,3)    if x+y > 3
The transfer rules are the most difficult to formalise; however, it helps to observe that the total number
of discs (x+y) always remains the same before and after each move. This is often referred to as an
invariant. The preconditions ensure that the destination tower is filled up as much as possible.
b) The following is a trace of a breadth first search. This is not what was asked in the question,
namely a complete search graph irrespective of the mode of search; however, it is instructive to draw
this tree and it is simpler than the complete search graph. Note that the tree generated by the breadth
first search algorithm does not admit the expansion of nodes previously visited. I have highlighted all
the new nodes generated.
Parent node   Children
              Rule s.1   Rule s.2   Rule t.1
[0,0]         (5,0)      (0,3)      (0,0)
[5,0]         (5,0)      (5,3)      (5,0)
[0,3]         (5,3)      (0,3)      (3,0)
[5,3]         (5,3)      (5,3)      (5,3)
[2,3]         (5,3)      (2,3)      (5,0)
[3,0]         (5,0)      (3,3)      (3,0)
[2,0]         (5,0)      (2,3)      (2,0)
[3,3]         (5,3)      (3,3)      (5,1)
[0,2]         (5,2)      (0,3)      (2,0)
[5,1]         (5,1)      (5,3)      (5,1)
[5,2]         (5,2)      (5,3)      (5,2)
[0,1]         (5,1)      (0,3)      (1,0)
[4,3]         (5,3)      (4,3)      (5,2)
[1,0]         (5,0)      (1,3)      (1,0)
[4,0]         (5,0)      (4,3)      (4,0)
[1,3]         (5,3)      (1,3)      (4,0)
[Figure: breadth first search tree rooted at (0,0), containing the nodes (0,0), (5,0), (5,3), (0,3), (2,3), (2,0), (3,0), (3,3), (0,2), (5,1), (5,2), (0,1), (4,3), (1,0), (4,0) and (1,3)]
Stacking Rules:
s.1  ((x,y),M) → ((5,y),M++[s1])
s.2  ((x,y),M) → ((x,3),M++[s2])
Emptying Rules:
e.1  ((x,y),M) → ((0,y),M++[e1])
e.2  ((x,y),M) → ((x,0),M++[e2])
Transfer Rules:
t.1  ((x,y),M) → ((x+y,0),M++[t1])       if x+y ≤ 5
t.1  ((x,y),M) → ((5,x+y-5),M++[t1])     if x+y > 5
t.2  ((x,y),M) → ((0,x+y),M++[t2])       if x+y ≤ 3
t.2  ((x,y),M) → ((x+y-3,3),M++[t2])     if x+y > 3
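A breadth first search over these rules, carrying the move list M with each state, might be sketched in Java as below. This is an illustration rather than a model answer: the names TowerSearch, apply and search are hypothetical, and the transfer preconditions (x+y compared with the capacity of the destination tower) are assumptions read off from the rule bodies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical breadth first search for the two-tower game.
final class TowerSearch {
    static final int CAP_A = 5, CAP_B = 3;
    static final String[] RULES = {"s1", "s2", "e1", "e2", "t1", "t2"};

    // Applies one rule to the state (x, y) and returns the successor state.
    static int[] apply(String rule, int x, int y) {
        int total = x + y;  // invariant under the transfer rules
        switch (rule) {
            case "s1": return new int[] {CAP_A, y};
            case "s2": return new int[] {x, CAP_B};
            case "e1": return new int[] {0, y};
            case "e2": return new int[] {x, 0};
            case "t1": return total <= CAP_A ? new int[] {total, 0} : new int[] {CAP_A, total - CAP_A};
            case "t2": return total <= CAP_B ? new int[] {0, total} : new int[] {total - CAP_B, CAP_B};
            default: throw new IllegalArgumentException(rule);
        }
    }

    private static final class Node {
        final int x, y;
        final List<String> moves;  // the list M of moves made so far
        Node(int x, int y, List<String> moves) { this.x = x; this.y = y; this.moves = moves; }
    }

    // Returns a shortest move list that leaves goalB discs on tower B, or null if none exists.
    static List<String> search(int startX, int startY, int goalB) {
        Queue<Node> frontier = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();  // a state (x, y) is encoded as 10*x + y
        frontier.add(new Node(startX, startY, new ArrayList<>()));
        visited.add(startX * 10 + startY);
        while (!frontier.isEmpty()) {
            Node n = frontier.remove();
            if (n.y == goalB) {
                return n.moves;
            }
            for (String rule : RULES) {
                int[] next = apply(rule, n.x, n.y);
                if (visited.add(next[0] * 10 + next[1])) {  // expand only unvisited states
                    List<String> moves = new ArrayList<>(n.moves);
                    moves.add(rule);
                    frontier.add(new Node(next[0], next[1], moves));
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(search(0, 0, 2));
    }
}
```

From (0,0) with a goal of two discs on tower B this returns [s1, t2, e2, t2], that is (0,0), (5,0), (2,3), (2,0), (0,2). Returning a non-null list corresponds to the "true" answer asked for in part c).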
Question 4.
a)
i. The truth table is as follows
x1  x2  x3  y
0   0   0   0
0   0   1   0
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   1
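[Figure: logic circuit for the chip; only the labels x2, x3 and OR survive]
b) i.
AND
[Figure: TLU with inputs x1 and x2, threshold T = 1.5 and output y]
x1  x2  y
0   0   0
0   1   0
1   0   0
1   1   1
OR
[Figure: TLU with inputs x1 and x2, threshold T = 0.5 and output y]
x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   1
NOT
[Figure: TLU with input x (weight -1), threshold T = -0.5 and output y]
x   y
0   1
1   0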
ii. A discussion of the non-linear separability of the XOR space is required here, as follows.
Perceptrons or TLUs are linear classifiers; that is, they divide the problem space into two regions
separated by a straight line (or, generally, a hyperplane). For example, the problem spaces for the AND,
OR and XOR gates are as follows
[Figure: three plots on x1-x2 axes, labelled AND, OR and XOR, each marking the four input points]
We can separate the output 1 (in blue) from that of 0 (in black) by a straight line for both the AND and
OR gates. We require two lines to separate the 1s from the 0s in the XOR graph.
iii. The architectural feature that allows us to do this is a third (hidden) layer in
the network, depicted as
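[Figure: XOR gate network. Input layer: x1 and x2. Hidden layer: one node with T = 0.5 and one node with T = 1.5, each receiving both inputs with weight 1. Output layer: node y with T = 0.5, receiving the first hidden node with weight 1 and the second with weight -1.]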
I have added weights to the diagram. The question does not ask for this, but it is quite straightforward
to do. The 1st node in the hidden layer is an OR gate and the 2nd node is an AND gate (the nodes in the
first layer simply propagate the inputs through to the hidden layer). The weights on the inputs to the
node in the output layer ensure that if both hidden nodes output a 1 then this is not propagated to the
output.
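The gates above can be checked with a few lines of Java. This is a sketch under stated assumptions: the class name Tlu is hypothetical, a node fires when its weighted sum strictly exceeds the threshold, and the input weights of 1 for the AND, OR and majority gates are assumed rather than taken from the diagrams.

```java
// Hypothetical threshold logic unit with the thresholds used in the answers above.
final class Tlu {
    // Outputs 1 when the weighted sum of the inputs exceeds the threshold, otherwise 0.
    static int fire(double[] weights, double threshold, int... inputs) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return sum > threshold ? 1 : 0;
    }

    static int and(int a, int b) { return fire(new double[] {1, 1}, 1.5, a, b); }
    static int or(int a, int b)  { return fire(new double[] {1, 1}, 0.5, a, b); }
    static int not(int a)        { return fire(new double[] {-1}, -0.5, a); }

    // The chip of part a): fires when two or more of the three inputs are 1.
    static int majority(int a, int b, int c) {
        return fire(new double[] {1, 1, 1}, 1.5, a, b, c);
    }

    // Hidden layer: an OR node and an AND node; output node weights 1 and -1, T = 0.5.
    static int xor(int a, int b) {
        return fire(new double[] {1, -1}, 0.5, or(a, b), and(a, b));
    }

    public static void main(String[] args) {
        for (int a = 0; a <= 1; a++) {
            for (int b = 0; b <= 1; b++) {
                System.out.println(a + " XOR " + b + " = " + xor(a, b));
            }
        }
    }
}
```

When both inputs are 1 the output node receives 1 - 1 = 0, which does not exceed 0.5, so the AND node cancels the OR node exactly as described.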
Question 5:
i. A Turing machine consists of an infinite tape that acts as an unbounded linear storage device
containing the program data. Only a finite part of the tape is ever used during a computation.
Tokens on the tape may be read and written one at a time by a read/write device that then moves either
one place to the Left or one place to the Right.
The machine's program is represented by a Finite State Machine.
For a given program F (represented in the Turing machine by the FSM) and input d (as represented by
a sequence of tokens on the tape), we denote by F(d) the result (the contents of the tape suitably
interpreted) of running the Turing machine with program F for the data d. If the computation does not
terminate, F(d) is taken to be undefined (z).
Note: It is always informative to draw a diagram (see the lecture notes).
Note that a Universal Turing Machine is a Turing machine consisting of the above three components.
However, the tape now holds both the code, p, for a Turing program, P, and its data, d; the FSM is
a program, U, that for the given input applies the program with code p to the input d. We can write this as
U(p,d) = P(d). All modern computers are essentially Universal Turing machines: the programs, P, that
we write are compiled, giving a (binary) code p. The machine's hardware applies this program to its
data. Such hardware constitutes a Universal Turing Machine.
The program H defined as
H(p) = 1   if P(p) terminates
     = 0   otherwise
is a particular Universal Turing machine.
ii. When writing programs for Turing machines you should state the assumptions that you make, in
particular
- how you are going to represent and interpret the data
- the position of the read/write device before and after the computation.
You may write your algorithm either in tabular form or as an equivalent state transition diagram (which is
often easier).
The simplest way to solve this problem is to make the following assumptions:
- The read/write device will start to the right of the data.
- The result of the computation is the number of consecutive 1s on the tape interpreted as a unary
number (see below).
- The contents of the tape will be blank (b) except for the number to be represented.
We represent zero as 1, one as 11, two as 111, three as 1111, etc. Note that there would be no
representation for zero if we used the conventional representation of unary numbers.
The program may be simply written as
[Figure: state transition diagram with states 1, 2 and 0 and transitions labelled <1,b,b,L>, <2,1,1,L>, <1,1,1,L> and <1,b,1,L>]
Note: the Abacus control unit acts as a Universal Machine for a program stored in the program space and
its data stored in the data space.
iv. You should know the halting problem (see the lecture notes). There are many ways to show that the
halting problem is not computable; I prefer the diagonalisation argument as this does not explicitly
refer to any particular machine or language. The only assumption that it makes is that any
algorithm and its corresponding data may be coded by the natural numbers. This is not a particularly
strong assumption to make. All computers to date exhibit this characteristic. See the lecture notes for a
full discussion of the halting problem.
v. This question may appear a little strange at first glance but it has major significance.
Let us assume that the halting problem is computable.
We have just downloaded the code off the internet and to our surprise we have found that it is written
in Java (in the class HP). We can now write the following program.
/**
*
* class Equation.
*
*/
public class Equation {
private static int x;
My program successively searches the values n = 0,1,2,3, etc. for a solution to the equation
xn = nx+1 for a given value of x. If my program terminates then we know there is a solution; however, if
it does not terminate then unfortunately we cannot deduce that there is no solution, as it cannot search all
the natural numbers in our or anyone else's lifetime.
However, with the halting program at our disposal, we need only apply the halting program (in class
HP) to my program. It will terminate (this it is guaranteed to do), outputting YES if my program
terminates and hence there is a solution, or outputting NO if my program does not
terminate, saying that there is no solution.
This argument can be applied to a whole class of problems including Fermat's last theorem; see
http://www.pbs.org/wgbh/nova/proof/ if you are interested in what this is.
You are unlikely (or unlucky) to get a question similar to part v. in an examination; however, it is worth
being aware of the significance of the halting problem.
Question 6:
i. The constructors of the data type Tree are Leaf and Node.
Constructors, as their name suggests, build members of the data type.
For example, Leaf, Node Leaf 4 Leaf, Node (Node Leaf 4 Leaf) 7 (Node Leaf 8 Leaf) are all Trees of
integers.
ii. The type Tree is said to be a polytype that is many typed. This means that we may construct trees
of integers, trees of characters, trees of strings etc. using this definition. Functions defined over
polytypes are said to be polymorphic (many shaped). Polymorphic functions are such that the definition
of the function does not depend upon the type variable, a.
For example, the function isOn, defined below, is polymorphic.
-- isOn x t
-- precondition: the Tree t is ordered
isOn x Leaf = False
isOn x (Node lt y rt)
  | x == y    = True
  | x < y     = isOn x lt
  | otherwise = isOn x rt
In Haskell, the type of isOn is Ord a => a -> Tree a -> Bool
The type constraint Ord a dictates that the function may only be applied to types that are ordered
(ordinal types); the polymorphic functions == and < have been used in the definition, and these only
apply to ordered types. There is no such constraint for the definition of the head of a list, given by
-- head L
-- precondition: the list L is not empty
head (x:xs) = x
The polymorphic function head is defined over the polytype List and has type [a] -> a
iii. Overloading is used extensively in procedural languages. For example, the equality operator == is
overloaded: this means that it may be applied to objects of different types. However, unlike
polymorphism, there are multiple definitions for == (one for each type). The compiler chooses the
appropriate definition based upon context using its type inference system. For example, in the program
fragment
int x;
char y;
if (x == 3){
if (y == 'a'){
} // if
} // if
the Boolean operator == is applied first to integers and then to characters. The compiler chooses the
appropriate function to use.
Overloading also occurs when defining constructors with different arguments; once again the
compiler will choose the appropriate method based upon context.
As you will see in Programming, inheritance is also a very powerful technique to implement
polymorphism.
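Overloading of user-defined methods works in the same way as for ==. The following minimal sketch uses a hypothetical class OverloadDemo with three methods of the same name; the compiler selects one of them from the static type of the argument.

```java
// Hypothetical example: one method name, three definitions distinguished by parameter type.
final class OverloadDemo {
    static String describe(int x)    { return "int " + x; }
    static String describe(char c)   { return "char " + c; }
    static String describe(String s) { return "String " + s; }

    public static void main(String[] args) {
        System.out.println(describe(3));    // resolved to describe(int)
        System.out.println(describe('a'));  // resolved to describe(char)
        System.out.println(describe("a"));  // resolved to describe(String)
    }
}
```

Unlike the polymorphic Haskell functions above, each type needs its own definition here; nothing is shared between the three method bodies.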
iv.
v.
see above
vi.
sumOf Leaf = 0
sumOf (Node lt n rt) = n + (sumOf lt) + (sumOf rt)
Note that the type of sumOf is Num a => Tree a -> a. The constraint on the polytype a is that it
should be numeric.
vii. Since the tree is ordered, the greatest element is the rightmost; hence the following
definition
-- greatest t
-- precondition: the tree t is not empty
greatest (Node lt n Leaf) = n
greatest (Node lt _ rt) = greatest rt
However, if the tree was not ordered then the greatest element would be given by
-- greatest element of a tree
-- precondition: the tree is not empty
greatest (Node Leaf n Leaf) = n
greatest (Node lt n Leaf) = max2 (greatest lt) n
greatest (Node Leaf n rt) = max2 n (greatest rt)
greatest (Node lt n rt) = max3 (greatest lt) n (greatest rt)
max2 a b
  | a < b     = b
  | otherwise = a
max3 a b c = max2 (max2 a b) c
This definition is fairly tough; you would be very unlucky to have to do something like this. The
difficulty arises from the case analysis and the definition of max2 and max3, the maximum of two and
three numbers respectively.
b)
i. The four operations that, together with the constant emptyStack, characterise a stack are
push an element onto the top of a stack
pop an element off of a stack
top return the top value of a stack (the last element to be pushed)
empty return true if the stack is empty
ii.
The following is the code for the class Stack written in Java
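/**
 * class Stack: a fixed-capacity stack of Objects
 */
public class Stack{
    private final int size = 100;  // capacity of the stack
    private Object[] stack;
    private int top;               // index of the top element, -1 when empty
    public Stack(){
        stack = new Object[size];
        top = -1;
    } // Stack()
    public void push(Object obj){
        if (top == size - 1) throw new IllegalStateException("stack is full");
        top++;
        stack[top] = obj;
    } // push
    public void pop(){
        if (isEmpty()) throw new IllegalStateException("stack is empty");
        stack[top] = null;         // release the reference
        top--;
    } // pop
    public Object top(){
        if (isEmpty()) throw new IllegalStateException("stack is empty");
        return stack[top];
    } // top
    public boolean isEmpty(){
        return (top == -1);
    } // isEmpty
} // Stack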
iii. The following is the code for the data type Stack written in Haskell
data Stack a = Null | Push a (Stack a)
empty = Null
push x s = Push x s
pop (Push _ s) = s
top (Push x _) = x
isEmptyStack Null = True
isEmptyStack (Push _ _) = False
Notice that the polymorphism in the Java Stack is realised using inheritance: the class is a stack of Objects. The
polymorphism in Haskell is realised by the type variable a.
Question 7:
i. This is a sequential search for an item in a list of items; the complexity is therefore linear, written
O(n).
[Figure: graph of time showing linear growth]
ii. On the assumption that the tree is ordered and well-balanced, the complexity of this algorithm is
O(log2 n). It does not get a lot better than this since it is the inverse of the exponential function 2^n. This
means that (log2 n) stays almost constant for increases in the value of n as n gets large. Its graph is
[Figure: graph of time showing logarithmic growth]
Question 8:
a) A perceptron or threshold logic unit (TLU) is a simple processing device that is modelled on a
component of the animal brain called a neuron. It consists of a number of inputs (dendrites), a single
output (axon) and a summing unit (soma) that in this context we shall call a node. Neural Networks are
simply one or more interconnected TLUs. (The axon of each neuron connects to one or more dendrites,
each at a synapse.) The strength of the connections may vary and is given by the weights on the neural
network. Each node (neuron) has a threshold value; if the total input to the node (neuron) exceeds this
value then the node (neuron) will output a logical 1, otherwise it will output a logical 0.
[Figure: TLU with inputs x1 and x2, weights w1 and w2, and output y]
y = step(x1*w1 + x2*w2 - θ)
b) TLUs can be trained (using an algorithm that adjusts the weights) to solve linearly separable
problems. For example, suppose we wish to classify jockeys and basketball players. We may use the
attributes height and weight to do so. In general, jockeys will be short and light whereas
basketball players are tall and heavy. We may plot each person based on these attributes on a
graph as follows
[Figure: scatter plot of height (x2) against weight (x1), where B = basketball player and J = jockey; the B points lie at greater heights and weights than the J points]
The Basketball players can be separated from the Jockeys by a straight line this classification
problem is linearly separable and may be solved with the use of a TLU. An appropriately weighted
TLU would output 1 (say) for the inputs given by the weight and height of each basketball player and 0
for each jockey.
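For instance, if the straight line crosses the weight axis at 2 units and the height axis at 1.5
units, then in intercept form it is x1/2 + x2/1.5 = 1, which multiplied through by 3 gives
1.5x1 + 2x2 - 3 = 0. We can now construct a TLU as
[Figure: a TLU with input x1 weighted 1.5, input x2 weighted 2, threshold θ = 3 and output y]
y = step(1.5x1 + 2x2 - 3)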
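The construction of a TLU from a separating line can be sketched in Java. This is only an illustrative sketch: the class name Tlu, the helper fromIntercepts and the two sample people are assumptions made for the example and do not come from the course material. It uses the fact that a line crossing the x1 axis at a and the x2 axis at b can be written as b*x1 + a*x2 - a*b = 0.

```java
// Minimal two-input threshold logic unit (illustrative sketch).
class Tlu {
    private final double w1;
    private final double w2;
    private final double threshold;

    Tlu(double w1, double w2, double threshold) {
        this.w1 = w1;
        this.w2 = w2;
        this.threshold = threshold;
    }

    // Hypothetical helper: builds the TLU whose decision line crosses the
    // x1 axis at a and the x2 axis at b, i.e. b*x1 + a*x2 - a*b = 0.
    static Tlu fromIntercepts(double a, double b) {
        return new Tlu(b, a, a * b);
    }

    // step(x1*w1 + x2*w2 - threshold): 1 above the line, 0 otherwise.
    int output(double x1, double x2) {
        return (x1 * w1 + x2 * w2 - threshold) > 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        Tlu tlu = Tlu.fromIntercepts(2.0, 1.5);   // weights 1.5 and 2, threshold 3
        // Made-up sample points (weight, height) in the same arbitrary units.
        System.out.println("heavy, tall person:  " + tlu.output(3.0, 3.0));
        System.out.println("light, short person: " + tlu.output(0.5, 0.5));
    }
}
```

With intercepts 2 and 1.5 the helper yields weights 1.5 and 2 and a threshold of 3, so points above the line produce 1 and points below it produce 0.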
However, during the learning process we start with the data and adjust the weights until the TLU
classifies the data correctly.
c) A suitable data structure to represent the digits on such a grid is a 2-dimensional array (5x5); see
Brookshear for a discussion of arrays.
In Java a 2-D array of integers may be declared as
public int[][] one;
and may be created and initialised as
one = new int[][] { {0,0,1,0,0},
                    {0,1,1,0,0},
                    {0,0,1,0,0},
                    {0,0,1,0,0},
                    {0,1,1,1,0}
                  };
(or created without initial values as one = new int[5][5];)
where one is the name of the array and for each i (0 ≤ i ≤ 4) one[i] represents the ith row (a 1-D array),
and for each j (0 ≤ j ≤ 4) one[i][j] represents the value in the ith row and the jth column. In particular
one[0][0] has the value 0 and one[2][2] has the value 1.
Arrays are the most important data structure in procedural languages. They play a role analogous to that
of lists; however, there are very important differences [7]:
Lists are linear structures that may grow and shrink according to need.
Array items may be referenced by an index, but arrays are of fixed length.
d) Pattern recognition problems of this kind are linearly separable. We will demonstrate this by way of a
simpler example than the one above. For a two-pixel display there are 4 (= 2^2) possible patterns. A
neural network architecture that recognises each of these patterns has two input nodes (each one
connected to a single pixel) and 4 output nodes (one for each pattern). The network is
fully connected as shown.
[Figure: the four possible patterns for 2 pixels, and the neural network to recognise each pattern]
The network can be trained to recognise each of the patterns. The output of a program to do such
training is given below [8].
[7] Java supports a class called ArrayList that is a hybrid between arrays and lists.
Before training
Input nodes      Output nodes
1      2         1       2       3       4
0.0    0.0       0.31    0.49    0.39    0.71
0.0    1.0       0.3     0.52    0.22    0.83
1.0    0.0       0.33    0.5     0.23    0.81
1.0    1.0       0.32    0.52    0.11    0.9

After training
Input nodes      Output nodes
1      2         1       2       3       4
0.0    0.0       0.99    0.01    0.01    0.0
0.0    1.0       0.01    0.99    0.0     0.01
1.0    0.0       0.01    0.0     0.99    0.01
1.0    1.0       0.0     0.01    0.01    0.99

Weights and thresholds of the output nodes
(rows: weight from input node 1, weight from input node 2, threshold)
Output node      2        3        4
                -4.74     4.68     4.53
                 4.67    -4.74     4.53
                 2.39     2.39     6.85

Further values from the simulation output
Output node      3        4
                 0.02     0.97
                 0.46     0.47
                 0.0      0.0
                 0.0      0.47
                 0.01     0.01
And now for the question: the architecture for our ANN has 25 input nodes (one for each pixel) and 10
output nodes (one for each digit). It is a fully connected feed-forward network.
d)
ANNs are resilient to noise. Suppose the input from the second pixel (on input node 2) takes
the value 0.9 (this is a noisy 1). For an input of 1 from the first pixel the outputs of the output
nodes are 0.0, 0.0, 0.02, 0.97 respectively. The network still thinks that the input pattern is
(1,1).
ANNs are also resilient to changes in the weights: the ANN will still function if we change
one or more of the weights within the network.
It is these two properties that make neural networks behave similarly to our brains: after a
heavy night's drinking, zapping some neurons in the local bar, we are still able to stagger home and
function fairly normally the next day.
TLUs use the step function as output; my simulation uses a continuous version of the step function
called the sigmoid function. Its equation is f(x) = 1/(1 + e^(-λx)) where e is Euler's number
(approximately 2.718) and λ is a positive constant.
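The noise example can be checked with a short Java sketch. Several things here are assumptions rather than facts from the simulation: the three numbers quoted for each of output nodes 2, 3 and 4 are read as (weight from pixel 1, weight from pixel 2, threshold), the node input is taken to be x1*w1 + x2*w2 - threshold, and the sigmoid's positive constant is set to 2 because that value approximately reproduces the quoted outputs. The class name TwoPixelNet is hypothetical.

```java
// Sketch of the trained two-pixel network for output nodes 2, 3 and 4.
class TwoPixelNet {
    // Assumed layout of each row: {weight from pixel 1, weight from pixel 2, threshold}.
    static final double[][] NODES = {
        {-4.74,  4.67, 2.39},   // output node 2, recognises (0,1)
        { 4.68, -4.74, 2.39},   // output node 3, recognises (1,0)
        { 4.53,  4.53, 6.85}    // output node 4, recognises (1,1)
    };

    // Assumed value of the sigmoid's positive constant.
    static final double STEEPNESS = 2.0;

    // Continuous version of the step function.
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-STEEPNESS * x));
    }

    // Outputs of nodes 2, 3 and 4 for the pixel values x1 and x2.
    static double[] outputs(double x1, double x2) {
        double[] result = new double[NODES.length];
        for (int i = 0; i < NODES.length; i++) {
            double net = x1 * NODES[i][0] + x2 * NODES[i][1] - NODES[i][2];
            result[i] = sigmoid(net);
        }
        return result;
    }

    public static void main(String[] args) {
        // Pixel 1 is a clean 1, pixel 2 is a noisy 1.
        for (double value : outputs(1.0, 0.9)) {
            System.out.printf("%.2f%n", value);
        }
    }
}
```

Under these assumptions node 4 still responds with roughly 0.97 and node 3 with roughly 0.02 for the noisy input (1, 0.9), in line with the figures quoted above.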
Question 9:
a) There are lots of ways of doing this question. Probably the quickest algorithm of linear
complexity is to repeatedly select two coins at a time (where possible) until the counterfeit coin is
found, place one coin on each side of the scales and select the lighter (if the scales are not balanced). If
there is only one coin left then this must be the lightest.
precondition: bag not empty
while not found
    if contents of bag > 1 then
        select and remove two coins from bag and place on scales
        if scales not balanced then
            return lightest coin
        // if
    else
        return last coin in bag
    // if
// while
We could realise this algorithm in Haskell as follows:
bag = [3,3,3,3,2,3,3,3,3]

lightest [x] = x
lightest (x:y:xs)
    | x == y    = lightest xs
    | x < y     = x
    | otherwise = y
Notice that we have represented the bag as a number of coins, all of which except one weigh 3 grams.
b) The second algorithm has logarithmic complexity, O(log2 n).
In this case we repeatedly divide the coins into two equal parts (there may be one left over). We place
one half on each side of the scales. If the scales are balanced, the counterfeit coin must be the one
left over; if there is only one coin on each side of the scales then we select the lighter; otherwise we
repeat the process with the lighter half. This algorithm constitutes what is known as a binary search.
while not found
    if bag contains one coin then
        return this coin as the counterfeit
    else
        where possible divide the bag into two equal halves
        place each half on the scales
        if the scales are in balance then
            return the coin left over as the counterfeit
        else
            repeat with the lightest half
        // if
    // if
// while
Below is what this algorithm looks like in Java; it also uses 1-D arrays.
/**
 * class Search
 *
 * D.Smith
 */
public class Search {

    // a bag of coins each weighing 3 units except the counterfeit
    public static int[] bag = {3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3};

    public static void main(String[] args) {
        System.out.println("The counterfeit coin is at position " + search(bag, 0, bag.length - 1));
    } // main

    /*
     * precondition
     *   a contains at least one coin, exactly one of which is counterfeit (weighs less)
     *
     * search(a,lb,ub)
     *   returns the index of the counterfeit coin
     *   lb = lower bound of part of the array a to be searched
     *   ub = upper bound of part of the array a to be searched
     *   m is the mid point of the array segment
     */
    private static int search(int[] a, int lb, int ub) {
        int m = (lb + ub) / 2;
        if (lb == ub) {
            // one coin in bag at the mid point
            return m;
        } else {
            if ((lb + ub) % 2 == 0) {
                // odd number of coins in bag portion (>1): the coin at m is left over
                if (weight(a, lb, m - 1) == weight(a, m + 1, ub)) {
                    return m;
                } else if (weight(a, lb, m - 1) < weight(a, m + 1, ub)) {
                    return search(a, lb, m - 1);
                } else {
                    return search(a, m + 1, ub);
                } // if
            } else {
                // even number of coins in bag portion (>0)
                if (weight(a, lb, m) < weight(a, m + 1, ub)) {
                    return search(a, lb, m);
                } else {
                    return search(a, m + 1, ub);
                } // if
            } // if
        } // if
    } // search

    // total weight of the coins a[lb] .. a[ub]
    private static int weight(int[] a, int lb, int ub) {
        int result = 0;
        for (int i = lb; i <= ub; i++) {
            result += a[i];
        } // for
        return result;
    } // weight

} // Search
c) For the linear search it is 4, 128, 512 respectively.
For the logarithmic search it is 3, 8, 10 respectively.
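These figures can be reproduced with a small Java sketch. It assumes that the quantity asked for is the worst-case number of weighings and that the bag sizes are 8, 256 and 1024 coins; neither is stated in this answer, but n/2 and log2 n give exactly the quoted values for those sizes. The class and method names are hypothetical.

```java
// Worst-case number of weighings for the two coin-search algorithms (sketch).
class Weighings {
    // Linear search: coins are weighed in pairs, so at most n/2 weighings.
    static int linear(int coins) {
        return coins / 2;
    }

    // Logarithmic search: each weighing halves the bag until one coin is left.
    static int logarithmic(int coins) {
        int weighings = 0;
        while (coins > 1) {
            coins = coins / 2;
            weighings++;
        }
        return weighings;
    }

    public static void main(String[] args) {
        int[] sizes = {8, 256, 1024};   // assumed bag sizes
        for (int n : sizes) {
            System.out.println(n + " coins: linear " + linear(n)
                    + ", logarithmic " + logarithmic(n));
        }
    }
}
```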
Question 10:
a) An uninformed or blind search is given by the following algorithm:
Let Live States = [initial state(s)]
Search( ) is
    select and remove a state, called the current State, from the Live States
    if the current State is a goal state then
        the result is the current state
    else
        generate new state(s) (the children) from the current State by the application of the rule and
        add those to the Live States
        Search( )
    end if
end Search
where
initial state = ([H],0,{A,B,C})
rule
(J,d,T)
if T ≠ {}
if T = {} and last(J) ≠ H
([H,A],?,{B,C})
([H,A,B],?,{C})
([H,A,B,C],?,{})
([H,B],?,{A,C})
([H,C],?,{A,B})
([H,C,B],?,{A})
([H,C,B,A],?,{})
Note: each ? should hold the cumulative distance travelled by the salesman; these have been omitted
for simplicity.
c) The complexity of the algorithm for a breadth-first search is factorial (written O(n!)), where n is the
number of towns to be visited. This is very bad news for finding a solution [9]. Moreover, this solution is
not guaranteed to be optimal.
d)
i. We could perform a depth-first search. This is not informed, but in this example it would lead to a
solution in quadratic time [10]; better would be to go to the nearest town from our current position. This is
an informed search and has linear complexity. It is probably the most sensible option that we have at
our disposal.
ii. The complexity of our algorithm is now linear.
iii. Unfortunately there is no known heuristic that will guarantee the best solution to this problem.
The following are the results from a simulation.
H is represented by 0
A is represented by 1
B is represented by 2
C is represented by 3
The tree is as follows:
Selected Path 0 1 2 3 0    Distance = 125
Selected Path 0 1 3 2 0    Distance = 110
Selected Path 0 2 1 3 0    Distance = 115
Selected Path 0 2 3 1 0    Distance = 110
Selected Path 0 3 1 2 0    Distance = 115
Selected Path 0 3 2 1 0    Distance = 125
For 10 towns there are 3,628,800 possible journeys, and for 20 towns the total number of possible
journeys is 2,432,902,008,176,640,000. This is a big number: if we could explore
1,000,000 paths each second it would still take us over 77,000 years to examine all possible journeys,
a lot longer than the travelling time!
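The arithmetic behind these figures can be checked with a few lines of Java. This is a sketch only: the class and method names are hypothetical, and a year is taken to be 365.25 days.

```java
// Factorial growth of the number of journeys (sketch).
class Journeys {
    // n! for 0 <= n <= 20; 20! is the largest factorial that fits in a long.
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // Years needed to examine the given number of paths at a fixed rate.
    static double yearsToExplore(long paths, long pathsPerSecond) {
        double secondsPerYear = 365.25 * 24 * 60 * 60;   // assumed length of a year
        return paths / (double) pathsPerSecond / secondsPerYear;
    }

    public static void main(String[] args) {
        System.out.println("10! = " + factorial(10));
        System.out.println("20! = " + factorial(20));
        System.out.println("years at 1,000,000 paths per second: "
                + yearsToExplore(factorial(20), 1_000_000L));
    }
}
```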
[10] The number of nodes expanded is n + (n-1) + ... + 2 + 1 = n(n+1)/2 = O(n^2).
[11] It is purely coincidental that our three search techniques give the same overall distance and that the
breadth-first and nearest-path techniques give the same path.
The distances between nine towns used in a further run are given by the following matrix. Nine towns
is about the upper limit for this algorithm on my PC using a breadth-first search method (the machine
runs out of memory and I run out of patience).
public static int towns = 9;
public static int[][] map =
{
{0,10,15,14,2,16,5,21,40},
{10,0,20,35,33,14,22,20,22},
{15,20,0,50,5,33,17,37,17},
{14,35,50,0,17,21,3,13,41},
{2,33,5,17,0,45,29,18,12},
{16,14,33,21,45,0,18,32,16},
{5,22,17,3,29,18,0,12,23},
{21,20,37,13,18,32,12,0,17},
{40,22,17,41,12,16,23,17,0}
};
For example, there are 45 miles between town 4 and town 5.
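The "go to the nearest town" heuristic from part d) can be sketched in Java on this nine-town matrix. The sketch is not the simulation program used in these notes: the class name, the method names and the rule for breaking ties (the lowest-numbered town wins) are all assumptions.

```java
// Nearest-town heuristic for the travelling salesman (sketch).
class NearestTown {
    static final int[][] MAP = {
        {0,10,15,14,2,16,5,21,40},
        {10,0,20,35,33,14,22,20,22},
        {15,20,0,50,5,33,17,37,17},
        {14,35,50,0,17,21,3,13,41},
        {2,33,5,17,0,45,29,18,12},
        {16,14,33,21,45,0,18,32,16},
        {5,22,17,3,29,18,0,12,23},
        {21,20,37,13,18,32,12,0,17},
        {40,22,17,41,12,16,23,17,0}
    };

    // Returns the tour as town numbers, starting and finishing at home (town 0).
    static int[] tour(int[][] map) {
        int n = map.length;
        boolean[] visited = new boolean[n];
        int[] path = new int[n + 1];
        int current = 0;
        visited[0] = true;
        for (int step = 1; step < n; step++) {
            int next = -1;
            for (int town = 0; town < n; town++) {
                // strict < keeps the lowest-numbered town when distances tie
                if (!visited[town] && (next == -1 || map[current][town] < map[current][next])) {
                    next = town;
                }
            }
            visited[next] = true;
            path[step] = next;
            current = next;
        }
        path[n] = 0;   // return home
        return path;
    }

    // Total distance travelled along a path.
    static int length(int[][] map, int[] path) {
        int total = 0;
        for (int i = 0; i + 1 < path.length; i++) {
            total += map[path[i]][path[i + 1]];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] path = tour(MAP);
        System.out.println(java.util.Arrays.toString(path) + " distance = " + length(MAP, path));
    }
}
```

For this matrix the heuristic visits the towns in the order 0 4 2 6 3 7 8 5 1 0 for a total distance of 97. Each step only scans the unvisited towns once, so nine towns pose no difficulty, but the tour is not guaranteed to be the shortest.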
Question 11:
a)
<S>
<S>
<S>
<S>
Input    Current Stack    New State    New Stack
(        *                start        (:*
)        (:*              start        *
)        (:*              start        *
c) Tabulate a trace of your program to show how it accepts the sentence ( )(( )).
Current State    Input    Current Stack    New State    New Stack
start            (        []               start        (:[]
start            )        (:[]             start        []
start            (        []               start        (:[]
start            (        (:[]             start        (:(:[]
start            )        (:(:[]           start        (:[]
start            )        (:[]             start        []
The trace has terminated with all the characters of the input stream read AND the stack empty; thus
the sentence ( )(( )) is accepted.
Note that the trace for the string ( )(( ) would terminate at the line
start            )        (:(:[]           start        (:[]
This would not be accepted, since all of the string has been read and the stack is not empty.
Moreover, the trace for the string ( )(( ))) would terminate at the line
start            )        []               ?            ?
The machine has arrived at the point where it has no rules to tell it what to do on reading the token )
with the stack empty. The machine therefore stops and the string is not accepted.
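The behaviour traced above can be sketched in Java as a one-state machine with a stack. This is an illustrative sketch, not the program asked for in the question: the class name BracketMachine is hypothetical, and spaces in the input are skipped on the assumption that the spacing in the example sentences is not significant.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// One-state stack machine for balanced brackets (sketch).
class BracketMachine {
    static boolean accepts(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char token : input.toCharArray()) {
            if (token == ' ') {
                continue;               // assumption: spacing is not significant
            }
            if (token == '(') {
                stack.push('(');        // rule: reading ( pushes it onto the stack
            } else if (token == ')' && !stack.isEmpty()) {
                stack.pop();            // rule: reading ) with ( on top pops it
            } else {
                return false;           // no rule applies, so the machine stops
            }
        }
        // accept only if all input has been read AND the stack is empty
        return stack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(accepts("( )(( ))"));
        System.out.println(accepts("( )(( )"));
        System.out.println(accepts("( )(( )))"));
    }
}
```

The three calls in main correspond to the accepted sentence and the two rejected strings discussed above: the second fails because the stack is not empty at the end, the third because no rule applies to ) on an empty stack.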
Question 12:
The first part of this question is standard bookwork; however, the important aspects to emphasise are
the constraints on the machine architecture imposed by a finite length word [12]. See the lecture notes on
the machine Abacus for the details.
An algorithm to determine the larger of two numbers found in locations 1 and 2. The result will be
found in location 0.
clr 0
dec 1
beq 5
dec 2
beq 8
inc 0
jmp -5
**
dec 2
beq 3
inc 0
jmp -3
stop
****
inc 0
dec 1
beq 3
inc 0
jmp -3
stop
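The program can be exercised with a small Java interpreter. This is a sketch under loudly stated assumptions, because the Abacus instruction set is defined in the lecture notes and not here: branch operands are taken to be relative to the current instruction, dec n is assumed to leave a location that already holds 0 unchanged and to record that fact, and beq is assumed to branch exactly when the preceding dec found 0. The class name and the encoding of the program as two arrays are hypothetical.

```java
// Interpreter sketch for the "larger of two numbers" Abacus program.
class AbacusSketch {
    static final String[] OPS = {
        "clr", "dec", "beq", "dec", "beq", "inc", "jmp",
        "dec", "beq", "inc", "jmp", "stop",
        "inc", "dec", "beq", "inc", "jmp", "stop"
    };
    // Operand of each instruction (0 is a placeholder for stop).
    static final int[] ARGS = {
        0, 1, 5, 2, 8, 0, -5,
        2, 3, 0, -3, 0,
        0, 1, 3, 0, -3, 0
    };

    // Runs the program on a copy of memory and returns the final memory.
    static int[] run(int[] memory) {
        int[] m = memory.clone();
        int pc = 0;
        boolean foundZero = false;      // set by dec, tested by beq (assumed semantics)
        while (!OPS[pc].equals("stop")) {
            int arg = ARGS[pc];
            switch (OPS[pc]) {
                case "clr": m[arg] = 0; pc++; break;
                case "inc": m[arg]++; pc++; break;
                case "dec":
                    foundZero = (m[arg] == 0);
                    if (!foundZero) {
                        m[arg]--;
                    }
                    pc++;
                    break;
                case "beq": pc += foundZero ? arg : 1; break;
                case "jmp": pc += arg; break;
                default: throw new IllegalStateException("unknown instruction " + OPS[pc]);
            }
        }
        return m;
    }

    public static void main(String[] args) {
        int[] result = run(new int[] {0, 3, 1});
        System.out.println("larger value = " + result[0]);
    }
}
```

Under these assumptions the program counts both locations down together and then adds whatever remains of the larger one, so location 0 ends up holding the larger value while locations 1 and 2 are consumed.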
b) This is again standard bookwork. We have seen numerous examples in the course; see the lecture
notes on language and automata.
c) Please read the relevant section of the language and automata notes.
12