An Introduction to Artificial Intelligence

Author(s): Staffan Persson


Source: Ekonomisk Tidskrift, Årg. 66, n:r 2 (Jun., 1964), pp. 88-112
Published by: Wiley on behalf of The Scandinavian Journal of Economics

Stable URL: https://www.jstor.org/stable/3438581

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms

Wiley and The Scandinavian Journal of Economics are collaborating with JSTOR to digitize,
preserve and extend access to Ekonomisk Tidskrift

This content downloaded from 128.130.251.200 on Wed, 03 Jan 2024 15:39:12 +00:00
All use subject to https://about.jstor.org/terms
AN INTRODUCTION TO ARTIFICIAL INTELLIGENCE*
By STAFFAN PERSSON¹

What is artificial intelligence? Attempts to answer this question often give rise to semantic controversies. To avoid this let us intuitively define the area of research in artificial intelligence as the design of machines² the behavior of which in a given situation would be labeled intelligent if observed in human activity. The nature of the problem of definition may be illustrated as follows: a person who scores high in an intelligence-test is usually considered intelligent, but is a machine which performs equally well on the same test also intelligent? As we know of no generally accepted way of resolving this question we have to leave the reader with his own conclusion.³ A question closely related to the discussion of intelligence in machines is the following: "can machines think?". Turing (ref. 30) made this question operational by defining it as an "imitation game" played by two people (A and B) and one machine (C). One of the participants, say A, is interrogator, with the task to decide whether B or C is the machine. A performs his job by evaluating answers given by B and C to his questions; as all communication is transmitted mechanically, such clues as handwriting and voice are eliminated, leaving him with the actual content of the answers as the basis for his decision. The machine is said to have

* To justify the presentation of artificial intelligence in this journal we should exemplify its potential relevance for the field of business administration; as we, however, in this context want to avoid speculations, let us for the moment accept the fact that courses in artificial intelligence are now offered at some of the leading American Schools of Business Administration as a tentative justification.
¹ The author is greatly indebted to Dr. Edward A. Feigenbaum of the University of California for valuable comments and suggestions.
² Machine is here synonymous with a program for a general purpose computer or with a special purpose computer.
³ For a discussion of this kind of question see references 1, 27, and 30.


passed the "Turing test" if A can not identify it more often than pure chance would predict. At present there are no machines which can pass the Turing test for arbitrary questions, but as an upper limit for the area of exploration there actually exist machines efficient enough to pass a partial Turing test.

The general area of artificial intelligence can be divided into specialized branches. At a "micro-level" we find the research on neural-net analogies,⁴ an area concerned with models composed of several simple and often randomly connected components. The allegedly "brain-like" behavior of these models is obtained by application of simple learning-rules which after a trial-and-error procedure cause the initially unorganized network of components to assume a logical configuration which actually is capable of recognizing patterns or solving simple problems. These models will not be discussed further here because their problemsolving capacity is not yet sufficiently developed to be of interest for our purposes.

At the "macro-level" of artificial intelligence we combine the simple basic components into complex but specialized problemsolving units. This approach sacrifices some generality but permits design of machines with "interesting" capability using equipment available to-day. We can at this level distinguish two lines of research, namely Simulation of Human Cognitive Processes and Artificial Intelligence Proper.
Simulation of Human Cognitive Processes is concerned with the exploration of human behavior at a level between the explicitly observable and the basic neural processes. The assumption behind this research is that it is possible to analyze and precisely explain the basic problemsolving and symbol-manipulation processes underlying human thought. As the computer is a general information-processing device of very high capacity it has been employed as a powerful tool for the construction and testing of models of human behavior. The task of the researchers is thus to formulate a precise theory in the form of a computer-program and then to test the theory by comparing the performance of the program (i.e. the predictions of the theory) with available data from psychological experiments. It should
⁴ For a discussion of this area see references nos. 16 and 21.


be stressed that not the solution given to specific problems but the process that generates the result is of interest.

Artificial Intelligence Proper is concerned mainly with the results generated by the problemsolving process and therefore with the most efficient method to be used for reaching the goal. In spite of this freedom of choice among methods it has turned out that this line of research depends heavily upon results obtained from the simulation of human problemsolving. This is not surprising because the designers of the machines necessarily must find it difficult to detach themselves from their own experience when looking for appropriate methods to include in their models.

Having briefly presented our general area of discourse we now proceed to survey some results and methods of artificial intelligence. The next few sections of this article are devoted to brief descriptions of some major achievements of research (unfortunately the limited space forces us to simplifications and generalizations and we therefore recommend the interested reader to retrieve the original reports for more accurate accounts). One section is devoted to a rather detailed description of SEP 1, a program for extrapolation of sequences of numbers and letters. SEP 1 has got a preferential treatment not because it is especially important, but due to the fact that this program will provide several examples to illustrate methods of artificial intelligence discussed in the last few sections of the paper.

Some Important Applications of Artificial Intelligence


1. Game-Playing Programs

It may seem rather wasteful to spend time and money upon the design of game-playing programs for digital computers, but in addition to the value of pure entertainment the parlor-games provide good examples of decision-making in complex and changing environments.

The most successful game-playing program so far is Samuel's checkers-playing program (ref. 22). This program does not explicitly try to imitate the methods of the human checkers-player, but utilizes interesting methods of learning which will be discussed in some more


detail in a later section. The performance of this program can be compared to that of a very good human player.

The game of chess has for a long time been considered the undisputable queen of games, a situation which of course has attracted the attention of people who want to prove that machines can be programmed to perform well even in the most sophisticated environments. In spite of large efforts there does however not yet exist a machine that can consistently beat even an average player. This "failure" on the part of the programmers has however contributed to an increased understanding of how a chess-playing program should be designed. As the history of these programs is of fundamental interest for the understanding of the development of research in artificial intelligence, a brief survey of the evolution of chess-playing programs will be given.
The game of chess is from a theoretical point of view "uninteresting" because there exists an optimal strategy,⁵ which always guarantees at least a draw. The problem of identifying this strategy among the about 10¹²⁰ available is however unsolvable: even an evaluation of as many as one million strategies per second would only produce about 10¹⁶ evaluations per century, which clearly shows that no exhaustive technique can compute the optimal strategy.
Shannon (ref. 23) proposed an exhaustive evaluation to a limited depth (say 2 moves⁶), basing the choice of move upon the maximization of a numerical value assigned to every possible board-position. In 1956 a Los Alamos group (ref. 12) implemented a Shannon-type chess-playing program on the MANIAC 1 computer.⁷ Because of the enormous number of computations required for each move the evaluation-function had to be very simple. The performance of this program was very poor due to the very limited amount of consideration given
to each alternative. Bernstein (ref. 2) in 1957 presented a chess-playing program for the IBM 704, which for each stage only evaluated up to seven plausible alternatives to a depth of two moves.⁸ The reduced number of evaluations permitted a more complex evaluation-function but the performance of the program was still mediocre.

⁵ For game-theoretical concepts see ref. 14.
⁶ This gives about 800,000 alternatives to explore for each move.
⁷ To reduce the time necessary for evaluation of each move a chess-board of only 6 × 6 was used; some kinds of moves were also eliminated.
⁸ This approach requires about 2500 alternatives to be evaluated for each move.
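The Shannon scheme, exhaustive evaluation to a limited depth combined with a numerical board-evaluation, is in modern terms depth-limited minimax. A minimal sketch, with a toy "game" standing in for chess (all names are illustrative assumptions, not taken from the cited programs):

```python
class ToyGame:
    """Hypothetical stand-in for a chess interface: positions are integers,
    a "move" adds its value to the position, evaluation is the position itself."""
    def legal_moves(self, position):
        return [1, -2, 3] if position < 10 else []
    def apply(self, position, move):
        return position + move
    def evaluate(self, position):
        return position

def minimax(game, position, depth, maximizing):
    """Value of `position` after exhaustive search `depth` plies ahead."""
    moves = game.legal_moves(position)
    if depth == 0 or not moves:
        return game.evaluate(position)   # static evaluation at the horizon
    children = [minimax(game, game.apply(position, m), depth - 1, not maximizing)
                for m in moves]
    return max(children) if maximizing else min(children)

def choose_move(game, position, depth):
    """Pick the move leading to the best minimax value (opponent moves next)."""
    return max(game.legal_moves(position),
               key=lambda m: minimax(game, game.apply(position, m), depth - 1, False))
```

With the toy game above, `choose_move(ToyGame(), 0, 2)` looks two plies ahead and picks the move 3. The Los Alamos program corresponds to this scheme with a very cheap `evaluate`; Bernstein's program in effect replaces `legal_moves` by a generator of a few plausible moves.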

At this stage it may be of interest to see how the human chess-player can use his limited computational ability to play a far better game of chess than the machines. The key to his excellence is selective search; he at the most evaluates 100 alternatives at each step, using previous experience, rules of thumb, etc. to guide his choice. Newell, Shaw, and Simon (ref. 19) have tried to capture the essential features of human chess-playing in their chess-playing program. As their ideas are relevant for organization-theory as well as for artificial intelligence we want to give at least a rudimentary feeling for their method of reducing the search for alternatives. One basic idea behind their program is the well-known "aspiration-level" model of decision-making (ref. 15). A simplified step by step account of the decisions preceding one move runs as follows:

1. On the basis of the current stage of the game a set of goals (such
as Center Control, Material Balance, King-Safety, etc.) are ordered
in decreasing priority.
2. A move-generator proposes a few moves considered relevant for
the currently highest priority goal.
3. The set of proposed moves is evaluated in terms of the value for
each goal, giving for each move a list of values, which may be
numerical or just yes or no. The evaluation is performed to a
depth which is decided by an analysis-generator.
4. The list of values is now used to choose the most appropriate move.
   The first entry decides the choice unless several moves have equal
   value at this position, in which case the second entry is compared,
   etc. Up to now we have a case of simple maximization, but the
   proposed move must also satisfy the requirement that all entries
   in its list of values exceed the aspiration-level of the corresponding
   goal; if this is not the case the next move in priority is tested,
   etc. If no move is accepted the move-generator for the second-
   priority goal is initiated and the analysis starts again at point 2.
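The four points above can be condensed into a small sketch. The function names and the numeric aspiration levels are illustrative assumptions; the published program is of course far richer:

```python
def select_move(goals, propose, evaluate, aspiration):
    """goals: list in decreasing priority; propose(goal) -> candidate moves;
    evaluate(move) -> one numeric value per goal; aspiration: minimum
    acceptable value per goal. Returns the first acceptable move, or None."""
    for goal in goals:
        # point 2: the move-generator for the current goal proposes a few moves
        candidates = propose(goal)
        # point 4: compare value lists lexicographically, best first
        for move in sorted(candidates, key=evaluate, reverse=True):
            values = evaluate(move)
            # a move is accepted only if every entry meets its aspiration-level
            if all(v >= a for v, a in zip(values, aspiration)):
                return move
        # no proposed move was acceptable: fall through to the next goal
    return None
```

For example, with goals `["center", "material"]`, aspirations `[1, 1]`, and two candidate moves valued `[2, 0]` and `[1, 2]`, the first is rejected (its material entry is below aspiration) and the second is chosen despite its lower first entry.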


2. Problem-Solving Machines

The Logic Theory Machine (LT) by Newell, Shaw, and Simon (references 17 and 18) is the first ancestor of several artificial intelligence projects. Although LT was explicitly designed to prove a set of theorems from Whitehead and Russell's Principia Mathematica, the basic methods employed were of far more general applicability. LT utilized a set of five basic axioms and three rules of inference (methods) which could be applied in a recursive fashion. The main heuristic⁹ of LT was the "working backwards" technique, which was employed to identify subgoals which could bridge the "gap" between the basic axioms and the theorem. A very abbreviated account of the procedure is given below:

1. Identify a set of subtheorems which can be transformed into the
   theorem which we intend to prove in one step by application of one
   of the methods to the axioms or previously proven and stored
   theorems. If one of our subtheorems is identical to an axiom or
   a previously proven theorem our task is completed, otherwise we
   go to point 2.
2. Reduce and organize the set of subtheorems generated in 1 by
a. Excluding subtheorems believed to be unprovable,
b. Testing for similarities in order to avoid double work,
c. Ranking the remaining subtheorems in increasing order of
difficulty.
3. We now have a list of subtheorems, all of which we know how to
   transform to the initial theorem; therefore if we could find a bridge
   between the axioms (and proven theorems) and any of our sub-
   theorems we have completed our task. So we go to 1 again, using
   the highest-priority subtheorem instead of the original theorem.
   If unsuccessful LT then tries other subtheorems, until a solution is
   found or the memory-space is exhausted.
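The working-backwards loop of points 1-3 amounts to a search from the theorem toward the axioms. A compact sketch, in which `reductions` stands in for the rules of inference (illustrative names, not LT's actual representation):

```python
def prove(theorem, axioms, reductions, limit=1000):
    """Return a chain of subtheorems leading from an axiom to `theorem`,
    or None if none is found within `limit` expansions."""
    frontier = [[theorem]]        # each entry is a chain, newest subgoal last
    seen = {theorem}              # avoid duplicate work (point 2b)
    while frontier and limit > 0:
        limit -= 1
        chain = frontier.pop(0)
        goal = chain[-1]
        if goal in axioms:
            return list(reversed(chain))      # axiom ... -> theorem: a proof
        for sub in reductions.get(goal, []):  # one backward inference step
            if sub not in seen:
                seen.add(sub)
                frontier.append(chain + [sub])
    return None   # unprovable within the limit, or effort exhausted
```

With axioms `{"A"}` and reductions `{"T": ["S1", "S2"], "S2": ["A"]}` the call `prove("T", ...)` yields the chain `["A", "S2", "T"]`. LT additionally orders the frontier by estimated difficulty (point 2c), which this sketch omits.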

As we see, LT, if successful, generates a string of successive subtheorems, each of which can be derived from its predecessor using one

⁹ Heuristic = any method that often (not necessarily always) can lead to a solution by reducing the necessary amount of search.


application of some method. The string of subtheorems thus obtained obviously constitutes a proof of the theorem.
The Symbolic Automatic INTegrator (SAINT) by Slagle (ref. 26)
performs symbolic integration at the level of freshman calculus. The
basic organization of SAINT is largely inherited from LT. The tools
available are a list of standard-integrals, memorized results and a set
of algorithmic¹⁰ and heuristic procedures. In order to reduce search
a crude evaluation of the relative difficulty of different lines of attack
is made before a string of transformations is applied.
Another successful relative of LT is the Geometry Theorem Machine
by Gelernter (ref. 9). This machine will be discussed in the section
entitled "Particular Heuristics".

The designers of LT have utilized their experience from this program in their General Problem Solver (GPS) (ref. 20), a program the structure of which is based upon the analysis of "thinking-aloud" protocols from experiments in human problemsolving. The name General Problem Solver does not mean that all problems can be solved by GPS but that the problemsolving methods employed are problem-independent and general in the sense that problems which can be cast into a certain general form can be solved. GPS will in a later section be discussed in some detail as an example of means-end analysis.

3. Inductive Machines

The important area of induction has not yet produced many results, but some achievements of interest have been reported. The work within this area aims toward the design of machines which can actually build and utilize models of their environment. Besides the ability to predict and answer questions these machines should have the capacity to ask "intelligent" questions when additional information is required for their work.

Lindsay's SAD SAM (ref. 13) is a machine which can answer questions concerning kinship-relations. The information is given in form
¹⁰ Algorithm = (in this context) a method which guarantees a solution, but which may be uneconomical to use because of the amount of computation-time required to reach the solution.


of a set of sentences¹¹ which describe family-relations. "Joey was playing with his brother Bobby" and "Bobby's sister Susie asked their mother Anna for help" are simple examples of input-sentences. SAD, which is a sentence-diagrammer, encodes the syntactical information given by any sentence into a map, the general form of which is context-independent. SAM is a semantic analyser which extracts the relational information from SAD's map and arranges it in a family-tree. Given a question about family-relations explicitly given in or derivable from the input-sentences SAD SAM can now find the answer by consulting the tree-model. In our example we can for instance ask for the name of Joey's mother and get the answer "Anna".
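The tree-model idea can be illustrated with a toy version of the example above. The parsing done by SAD is omitted; the three relation-triples are entered by hand, and the inference rule used (siblings share a mother) is an assumption of this sketch, not necessarily Lindsay's:

```python
class FamilyModel:
    def __init__(self):
        self.mother = {}       # child -> mother's name
        self.siblings = {}     # person -> set of known siblings

    def add_sibling(self, a, b):
        # siblinghood is symmetric
        self.siblings.setdefault(a, set()).add(b)
        self.siblings.setdefault(b, set()).add(a)

    def add_mother(self, child, mother):
        # propagate through the whole sibling group: siblings share a mother
        group, frontier = {child}, [child]
        while frontier:
            person = frontier.pop()
            for s in self.siblings.get(person, ()):
                if s not in group:
                    group.add(s)
                    frontier.append(s)
        for member in group:
            self.mother.setdefault(member, mother)

model = FamilyModel()
model.add_sibling("Joey", "Bobby")    # "Joey was playing with his brother Bobby"
model.add_sibling("Bobby", "Susie")   # "Bobby's sister Susie ..."
model.add_mother("Susie", "Anna")     # "... asked their mother Anna for help"
```

After the three statements, `model.mother["Joey"]` yields "Anna", the answer in the article's example, even though Joey's mother was never stated directly.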
A program by Simon and Kotovsky (ref. 25) extrapolates letter-sequences used in intelligence-tests. The conclusion of the program is reached in two steps: first an explicit model of the sequence is described in an appropriate language, then this model generates the continuation of the sequence. It may be noted that this program is intended to illustrate a theory about human concept formation; a more efficient but purely artificial intelligence program could, within a subset of its problemenvironment, solve the same kind of intelligence-tests as the Simon and Kotovsky program.
An interesting example of a machine that can adapt its model to a changing environment is Ernst's Mechanical Hand.¹² During its performance of such simple tasks as building towers of blocks or putting blocks into a box, the machine builds a model of its environment. This is accomplished by recording the positions of a set of potentiometers (one for each degree of freedom of the hand) every time the hand locates an object. The inductive ability comes into use when changes in the environment are arranged by the experimenter (for instance by moving some previously located object); the machine in this case is guided by a heuristic program which on the basis of preceding events decides the most appropriate revision of the model.

It may be noted that all programs discussed in this section deal with environments for which we know how to build efficient models

¹¹ Formulated in Basic English.
¹² Unpublished.


(such as family-trees); for more general cases we have to await further research into human information-storage and -retrieval to provide clues about efficient methods.

4. Question-Answering-Machines
One of the goals of artificial intelligence is the design of machines which on questions presented in spoken or written natural language rapidly can produce answers in an easily understandable form. Before we can actually implement such machines we have to know how to mechanically extract meaning from sentences. This problem, the difficulty of which is due to the fact that meaning can not be derived from separate words but is a function of the context, is discussed in the literature on Machine Translation of Languages, to which we refer for further information.

The previously discussed SAD SAM by Lindsay can be classified as one attempt to build a question-answering-machine for a rather narrow environment.

BASEBALL by Green et al. (ref. 10) does not build its own model of the data-universe but is instead from the beginning given an efficiently stored handbook of baseball. Questions (formulated in a somewhat restricted English) about baseball-statistics can be answered.

There also are some restrictions in the kind of questions which can be answered by BASEBALL, but the program permits rather complex inquiries as for instance "Did every team play at least once in each park in each month?". The flexibility of the program has been proven by presenting several differently formulated questions to which the machine gave the same answer.
SYNTHEX by Simmons (ref. 24) has a very wide data-universe, namely the Children's Golden Encyclopedia. As answers to questions formulated in natural English the program presents quotations from the above mentioned text.

5. Some Applications
Clarkson (ref. 3) has analyzed the work of a trust-investment officer at a medium-sized bank and expressed his findings in a program which has turned out to be a very successful attempt to simulate human behavior in a complex decision-making environment. Our account of his model must by necessity be very brief; we hope that it will induce the reader to study some of the complete descriptions of this important achievement.

Clarkson divides the trust-investment process into three phases:
i. Analysis and selection of a list of suitable stocks (the "A" list).
ii. Formulation of an investment-policy.
iii. Selection of a portfolio.

In stage i no attempt to simulate the actual behavior of the trust-officer is made; instead an approximation of his experience as gained over the years is formulated. This approach is necessitated because the analysis is made continuously over the years and thus can not be simulated for any particular point of time. The "A" list is produced by choosing those entries from a list of stocks considered as suitable (the "B" list) which, according to expectations derived from information given about the general economy, the industries, and the companies, are expected to produce the best performance.

In stage ii the program utilizes information about the client (such as economic position, profession, desired amount of growth, etc.) to choose a suitable investment-policy. The policies, which are characterized by expected percentages of growth and income, are: "Growth Account", "Growth and Income Account", "Income and Growth Account", and "Income Account".

For stage iii the following information is available: the "A" list together with information about its members, information about the client, and the chosen investment-policy. The stocks of the "A" list are now ranked according to the relative performance in the dimension of the main attribute of the investment-policy (i.e. for a growth policy the prospects are ranked according to growth potential). Some stocks can be eliminated because of the tax-position of the client or of other legal reasons, leaving a reduced set for further tests. Depending upon the investment-policy a set of tests is chosen and applied to the remaining list of stocks, leaving a further reduced list of stocks to consider in the final choice. From the accepted stocks an investment


portfolio is composed by applying rules for diversification and rate of participation in each stock.

Clarkson's model has been tested against the performance of the human trust-investor. Given the same information as he, it has chosen almost exactly the same portfolio as he did. The different stages of the model have also been tested separately and have shown a behavior very close to that of the officer.

Tonge (ref. 28) uses heuristic methods for balancing assembly-lines, a complex combinatorial problem which only in a few simple cases has been found accessible to mathematical solutions. The problem involves the assignment of men and jobs to work-stations, and the object is to maximize the rate of assembly given certain constraints specifying order between certain jobs etc. Tonge uses methods inherited from industrial engineers, which by using rules of thumb can make an efficient balancing without using optimal procedures. Tonge has also shown how GPS can be arranged for balancing assembly-lines, which provides us with a further indication of the possibility to use general problemsolving methods for a wide variety of problems.

6. Simulation of Cognitive Processes

Most of the results up to now have at least to some extent been based upon observations of human behavior. In this section we will briefly discuss some programs explicitly designed as simulations of cognitive processes. Although our presentation will be rather brief we want to emphasize that this area of research is of fundamental importance for the future development of artificial intelligence.

The Elementary Perceiver and Memorizer (EPAM) by Feigenbaum & Simon (ref. 6) is a precise formulation of a theory concerning the basic processes that are involved in human cognition. EPAM has been tested as an artificial subject in psychological experiments and has given a very close approximation of the behavior of human subjects. We will in a brief account of EPAM's learning mechanism discuss its performance in association-learning tests. In these tests the subject is presented a stimulus for which it is supposed to answer with the correct response.

A basic feature of EPAM's learning is that only a minimal amount


of information is memorized. Stimuli together with clues to their responses are represented by the smallest amounts of information which at the time of their learning can discriminate them from each other and from previously learned items; responses however are stored in full. Suppose that EPAM has learned two items, a stimulus S1 together with its response R1. At this stage EPAM's associative memory (the discrimination-net) consists of a tree with one node and two branches with corresponding terminals. The node contains a test T1 which can discriminate between S1 and R1 by checking for some specific information. The terminals are access points for storage locations containing the previously mentioned minimum amounts of information for S1 and R1 respectively. Now suppose that EPAM is required to learn a new stimulus-response pair S2-R2. This is achieved by growing the discrimination-net until it can hold every item of information (new or old) at a unique terminal. First S2 is tested by T1 and directed to one of the terminals (let us assume to the one which holds information about S1) where a discrepancy between the stored information and S2 is detected. A new test T2 is placed at the point of discrepancy. T2 as previously T1 can direct stimuli to one of two branches, the terminals of which contain information about S1 and S2 respectively. R2 is sorted in the same manner and we end up with a tree consisting of three nodes containing tests (T1, T2, and T3) and four terminals containing information. Figure 1 shows two possible configurations for this tree.

[Figure 1. Two possible configurations, a and b, of the discrimination-net: three nodes containing tests (T1, T2, T3) and four terminals containing the information for S1, S2, R1, and R2.]


Given a stimulus, say S2, EPAM can retrieve the corresponding response (R2) by sorting S2 via test T1 and test T2 (Fig. 1 b) to a terminal. The minimum clue for retrieving R2 has previously been stored at this terminal and can thus be retrieved. The clue for R2 is thereafter sorted via tests T1 and T3 down to the terminal containing the complete response R2, which is presented as the solution of the task.

As long as new information is presented the discrimination-net continues to grow. Remembering that associations (clues) formed at any point in time are just adequate to retrieve associations at that time, we see that at a later time when the discrimination-net has grown, some of these associations may become inadequate, thus making certain items irretrievable. This means that forgetting can occur in the model without any destruction of information. EPAM's forgetting, which is not explicitly present in the hypothesis behind the model, has a very close connection to forgetting as experienced by people.
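The growing and sorting of the discrimination-net can be sketched in a much simplified form: items are character-strings, and each test simply checks one character position. This is an illustration of the mechanism only, not Feigenbaum and Simon's program:

```python
class Net:
    def __init__(self):
        self.root = {"item": None}          # start with a single empty terminal

    def _sort(self, item):
        # follow the tests down to a terminal
        node = self.root
        while "index" in node:
            i, c = node["index"], node["char"]
            node = node["yes"] if i < len(item) and item[i] == c else node["no"]
        return node

    def learn(self, item):
        node = self._sort(item)
        old = node["item"]
        if old is None:
            node["item"] = item             # free terminal: just store the item
        elif old != item:
            # grow a new test at the first point of discrepancy
            i = next(k for k in range(max(len(old), len(item)))
                     if k >= len(old) or k >= len(item) or old[k] != item[k])
            c = item[i] if i < len(item) else old[i]
            goes_yes = i < len(item) and item[i] == c
            yes_item, no_item = (item, old) if goes_yes else (old, item)
            node.clear()
            node.update(index=i, char=c,
                        yes={"item": yes_item}, no={"item": no_item})

    def retrieve(self, item):
        return self._sort(item)["item"]
```

After `net.learn("DAX")` and `net.learn("DUX")` the net holds one test (on position 1) and two terminals, and `retrieve` finds each item again; learning further items keeps splitting terminals, just as T2 and T3 arise in the text.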
Hunt uses discrimination methods of the same kind as used in EPAM for his model of human concept formation (ref. 11). The program works with experiments of the following kind: a subject is shown a set of stimuli and is also told which stimuli are examples of certain concepts. The task of the subject (and the model) is to identify the concept.

Feldman in his binary choice experiment (ref. 8) uses a computer program to formulate hypotheses about the next event to occur in a sequence of binary events. The model not only predicts the event but also gives the reason for the prediction. It must be observed that the goal of Feldman's research is not to reach a high percentage of correct predictions but instead to closely reproduce the decisions made by specific subjects.

SEP 1, a Sequence Extrapolator

This section will give a rather detailed description of the basic organization of SEP 1, a computer program for extrapolation of sequences of numbers or letters, which has been explicitly designed to illustrate applications of basic problemsolving heuristics. We justify this presentation by our intent to illustrate our survey of methods of


artificial intelligence by examples taken mainly from this program (and from some other programs).
The problem-environment of SEP 1 is:

1. The identification and extrapolation of sequences of numbers generated by members of the following class of equations:

(a·X⁴ + b·X³ + c·X² + d·X + e)·[(f·X + g)^h]^(i·X + j)

where a, b, c, d, e, f, g, h, i, and j are integers, (f·X + g) and (i·X + j) are zero or positive, and h = -1 or +1.

Several exceptions from the general formula are accepted:
a. One or more of the entries in the given input-sequence may be erroneous or even left out.
b. Two or more sequences may be mixed, in which case the sub-sequences are identified and extrapolated separately.
c. The input-sequence may be "accidentally" scrambled, in which case it will be unscrambled, extrapolated and printed out in general form.
d. A strictly exponential sequence can be separated from any noise of bounded amplitude.

2. Extrapolation of certain kinds of sequences of letters.¹³
3. Recognition of previously encountered sequences using a minimal amount of clues.

For extrapolation of number-sequences SEP 1 uses the following processes:

Name  Task                         Problem type
A     Extrapolate polynomials      a
B     Extrapolate exponentials     b
C     Separate mixed sequences     c
D     Recognition of sequences     d

The problem-type a, b, etc. is assigned to a sequence on the basis of which process was used for its extrapolation (a sequence extrapolated by process A is said to be of type a, etc.). It should be noted that any sequence of type a can be extrapolated by more than one of the processes; the process actually used depends upon the values of certain parameters.
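One classical way to realize a process like A is the finite-difference technique: a sequence generated by a polynomial of degree k has constant k-th differences, so repeated differencing until the row is constant lets us extend the sequence. This is a standard method, offered here only as a sketch; the article does not specify SEP 1's actual implementation:

```python
def extrapolate_polynomial(seq, n_more=1):
    """Extend `seq` by `n_more` terms, assuming a polynomial generator."""
    rows = [list(seq)]
    while len(set(rows[-1])) > 1:      # difference until the row is constant
        last = rows[-1]
        rows.append([b - a for a, b in zip(last, last[1:])])
    for _ in range(n_more):            # rebuild upward from the constant row
        rows[-1].append(rows[-1][-1])
        for r in range(len(rows) - 2, -1, -1):
            rows[r].append(rows[r][-1] + rows[r + 1][-1])
    return rows[0]
```

For the squares 1, 4, 9, 16 the difference rows are 3, 5, 7 and 2, 2; extending from the constant row gives 25 as the next term. Note that this plain version has no tolerance for the erroneous, missing, or mixed entries SEP 1 accepts.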
SEP 1 is equipped with the following basic goals:
1. Solve the problem.
2. Use as few inputs as possible.
3. Minimize the amount of time necessary for the solution.

¹³ Among others those used in Simon & Kotovsky (ref. 25).


Goal 1 causes no difficulty, but goals 2 and 3 are in general contradictory and therefore require some judgment about priorities.

Heuristics for Reduction of Search in the Problem-Solving Process

In our survey of artificial intelligence we have seen that one of the most difficult problems of artificial intelligence is to find methods
to reduce the number of alternatives to consider and evaluate in
decision-making situations. In some cases there exist efficient procedures which can guarantee a solution (these methods are here called algorithms); in other cases some heuristic method may be applicable. The nature of artificial intelligence makes the use of algorithms uninteresting unless they are parts of basically heuristic procedures. We can recognize two groups of heuristic methods, namely general heuristics and particular heuristics. The former group is applicable to a wide variety of problems; as examples we will discuss: The Basic Learning Heuristic, The Means-End Analysis, and Planning. Particular heuristics are methods which take advantage of the structure of specific problem-environments to reduce the amount of search for a solution.

The Basic Learning Heuristic

For our purpose learning will be defined as the utilization of previous experience for the improvement of present performance. Learning may be implemented in two basically different ways, namely generalization-learning and rote-learning. Generalization-learning utilizes experience from previously encountered situations under new but similar circumstances; rote-learning on the other hand only allows specific information to be retrieved from the memory.

Examples of Generalization-Learning

SEP 1 does not need any learning at all to achieve its goal 1. Goals 2 and 3 however are more efficiently satisfied if a crude form of learning is utilized. Let us step by step follow a simple example.
Information about the problem:
Input-sequence: 1 2 3 4 5 8 7 16 9 ...
One error is accepted in the input-sequence.

The most recently solved problems have been of the following types:
a a a a c a a c a a a d a a b b b a c c c

Decision 1: Which procedure to use?
It should be observed that whatever procedure is chosen the same final solution will be given; the decision therefore only will affect the computation time.
SEP 1 can use two different methods for this decision, one is to
study the pattern of the sequence of problem-types,14 the other to
check for the distribution of problem-types during the last few ex-
periments.
Method A. Pattern-recognition: By applying its letter sequence
extrapolation routine to the sequence of encountered problem-types
SEP 1 can identify a fairly large set of different patterns. A sequence
like a b d a a a b d a a a b d a a a would suggest that a problem of
type b is likely to occur next time. In our example no such pattern
can be identified, so method B must be used.
Method B. In this method the decision rule is to look back a certain

number of problems (the horizon), determine which type of problem


occurred most frequently, and to choose the procedure corresponding
to this type.
The length of the horizon determines the sensitivity and stability
of the decision-rule. SEP 1 chooses the horizon which would minimize

the number of wrong decisions for the set of problems already en-
countered, thus implicitly assuming that a similar distribution of
problem-types will occur also in the future. The optimal horizon is
related to the degree of randomness of the occurrence of different
problem-types. A long horizon is required for purely random oc-
currences, but an horizon of length I is optimal when the different
problem-types occur in groups within which all entries are equal, as
for instance in the sequence: a a a a a a a a a c cc c c c c c c c c
b b b b ... Assuming the present horizon to be 5 we by this method
will choose procedure C.
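Method B amounts to a majority vote over a sliding window of recent problem-types. A minimal sketch in Python (the function name and data representation are our own, not SEP 1's):

```python
from collections import Counter

def choose_procedure(history, horizon):
    """Method B sketch: pick the procedure matching the problem-type
    that occurred most frequently within the last `horizon` problems."""
    window = history[-horizon:]
    problem_type, _ = Counter(window).most_common(1)[0]
    return problem_type.upper()  # process A handles type a, etc.

# The history from the text, with horizon 5: the last five types are b a c c c.
history = list("aaaacaacaaadaabbbaccc")
print(choose_procedure(history, 5))  # -> C
```

The horizon itself would then be tuned, as the text describes, by replaying the stored history and keeping the window length that would have minimized wrong choices.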

Decision 2: How many inputs to ask for?


A choice of too few entries means that goal 2 but not goal 3 will be satisfied. On the other hand too many entries will satisfy goal 3 but will not achieve goal 2. To satisfy each goal the optimal number of entries (i.e. the minimum number of entries which defines the sequence) must be chosen. The decision-rule used is very simple: SEP 1 asks for the number of entries which in most cases has been sufficient to solve problems of the type chosen by decision 1 without requiring further additions. Let us assume that a sequence of 6 entries is chosen.
14 This method was initially suggested by Dr. Edward A. Feigenbaum.

Current input-list: 1 2 3 4 5 8
Use process C. Split the sequence into two sub-sequences:
1 3 5
2 4 8
The sub-sequences are:
1 3 5 7 9 11 ...
2 4 8 16 32 64 ...
And the result is printed out as:
Odd sequence is: (2X-1)
Even sequence is: 2^X

It must be noted that some trial and error is often necessary before
a solution can be found, our example was chosen to give the solution
in a minimum number of steps.
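The splitting step of process C can be sketched as follows; the two sub-sequence tests shown (constant differences, constant ratios) are simplified stand-ins for SEP 1's full polynomial and exponential machinery:

```python
def split_mixed(seq):
    """Split an interleaved sequence into its odd- and even-position parts."""
    return seq[0::2], seq[1::2]

def constant_differences(sub):
    """True for linear sequences such as 1 3 5 (first differences all equal)."""
    return len({b - a for a, b in zip(sub, sub[1:])}) == 1

def constant_ratios(sub):
    """True for strict exponentials such as 2 4 8 (consecutive ratios all
    equal); cross-multiplication avoids division."""
    return all(sub[i + 1] * sub[1] == sub[i] * sub[2] for i in range(len(sub) - 1))

odd, even = split_mixed([1, 2, 3, 4, 5, 8])
print(odd, even)                                      # -> [1, 3, 5] [2, 4, 8]
print(constant_differences(odd), constant_ratios(even))  # -> True True
```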
Samuel has designed a sophisticated learning-procedure for his
checkers-playing machine. Checkers (as chess) has a huge game-tree15
which forces any player of the game to utilize search-reducing methods
which include heuristics for selection of the most "promising" alterna-
tives and for determination of the depth of exploration.
We know that any method which does not examine the alternatives
down to an end-position of the game requires a static evaluation-
method for determining the relative merits of different alternatives.
In Samuel's machine this evaluation is performed by a polynomial,
which can be revised and improved for every move of the game thus
allowing a very fast rate of learning. Before any move is chosen a
set of different alternatives are examined for possible consequences.
This examination may include several moves ahead but usually ends
in a non-terminal position, the value of which is computed by using
15 The game-tree of checkers contains some 10^40 different paths.

the evaluation polynomial. A minimax-procedure then selects the alternative with the highest attainable of the computed values; this value is also assigned to the current board-position. A comparison is made between the just computed "backed up" value and the previously computed value for the same board-position. If a difference exists the coefficients of the evaluation-polynomial are modified in a direction which will reduce it. This learning method leads to an improved evaluation of board-positions; inclusion of a good piece-advantage term also assures improvement in the play of the machine.
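The coefficient-revision idea can be illustrated with a much-simplified delta-rule update. Samuel's actual procedure uses correlation-based adjustments over binary term ranges, so the weights, features, and learning rate below are illustrative assumptions only:

```python
def static_value(weights, features):
    """Evaluation polynomial: a weighted sum of board features."""
    return sum(w * f for w, f in zip(weights, features))

def revise(weights, features, backed_up, rate=0.05):
    """Nudge the coefficients so that the static value moves toward the
    minimax 'backed up' value computed for the same position."""
    error = backed_up - static_value(weights, features)
    return [w + rate * error * f for w, f in zip(weights, features)]

w = [0.5, -0.2]   # hypothetical coefficients
f = [1.0, 2.0]    # hypothetical feature values for one board-position
w2 = revise(w, f, backed_up=1.1)
# After the revision the static value lies closer to the backed-up value:
print(abs(1.1 - static_value(w2, f)) < abs(1.1 - static_value(w, f)))  # -> True
```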

The 16 parameters of the evaluation-polynomial are selected by the machine from a list of 38 candidates. For each move the parameter which gives the smallest contribution to the total value of the polynomial is recorded; when this has happened 8 times for the same parameter it is replaced by one of the 22 parameters currently not used in the evaluation-polynomial. The experience of this generalization learning-procedure has shown that after initially violent changes in the polynomial a stabilization of its coefficients and parameters has occurred after completion of some 40 games.

Rote-Learning

Current experience can often be utilized at a later point in time by memorization of pertinent information. SEP 1 simply memorizes all previously encountered solutions, a feature which in many cases has produced interesting results. LT memorizes solved sub-problems for use in later search for proofs of theorems. Samuel in his checkers-player uses an advanced rote-learning technique where the machine, in order to save memory-capacity, stores all board-positions in a normalized form. His program also forgets positions which are seldom encountered.

The utilization of rote-learning is mainly limited by the availability of memory-capacity, but another difficulty (encountered by LT) should also be mentioned. LT has the ability to store sub-problems proved during its work. The availability of these however sometimes decreases the selective power of the machine by increasing the number of alternatives available at later stages.
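Rote-learning with forgetting of seldom-used entries can be sketched as a use-counted store; this minimal rendering is our own and omits Samuel's normalized board encoding:

```python
class RoteMemory:
    """Remember solutions verbatim; forget entries recalled too rarely."""

    def __init__(self):
        self._store = {}  # problem -> [solution, recall count]

    def memorize(self, problem, solution):
        self._store[problem] = [solution, 0]

    def recall(self, problem):
        entry = self._store.get(problem)
        if entry is None:
            return None
        entry[1] += 1          # count each successful retrieval
        return entry[0]

    def forget_rare(self, min_recalls):
        """Drop entries recalled fewer than min_recalls times."""
        self._store = {p: e for p, e in self._store.items()
                       if e[1] >= min_recalls}

mem = RoteMemory()
mem.memorize("1 2 3 4", "X")
mem.memorize("2 4 8", "2^X")
mem.recall("2 4 8")
mem.forget_rare(1)          # "1 2 3 4" was never recalled and is dropped
print(sorted(mem._store))   # -> ['2 4 8']
```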

The Means-Ends Analysis Heuristic

Means-ends analysis is a powerful heuristic which is applicable in situations where we want to transform an initial state to a final state by employing a limited set of tools. As GPS gives a very clear formulation of means-ends analysis we find it convenient to use a description of this program to convey the idea behind this method. GPS consists of two basic parts, namely the Task-Environment which contains all specific information about particular problems and applicable rules, and the Core which contains context-independent methods for problem-solving. GPS can solve problems of the following general form: "Transform the initial state X to the final state Y by applying rules Ri belonging to a given set R."
The means-ends procedure is generated by recursive application of the following goals:
1. Transform state A to state B.
2. Reduce the difference between states A and B.
3. Apply rule Ri to state A.

The complexity of the resulting chain of subgoals may be illustrated by a brief description of the actions initiated by the different goals.
Goal 1. "Transform A to B" requires knowledge of how the two states differ. If there is no difference we can exit with success, but if a difference D exists it must be reduced so goal 2 is evoked.
Goal 2. "Reduce difference D" requires that a suitable operator (rule) is found. A table look-up indicates which operators are most likely to be successful for differences of the kind represented by D. An operator Ri is chosen. Ri must now be applied to A so goal 3 is evoked.

Goal 3. "Apply rule Ri to A" requires that the form of A conforms to the requirements of Ri. If Ri can be applied directly to A we get a new state A' and goal 1 is evoked to transform A' to B. Otherwise goal 1 must be evoked to transform A to a form acceptable by Ri.

We see how the different goals can call each other and (indirectly) themselves in a recursive fashion, which in many cases may generate very long chains of goals. Many of these chains terminate in "impossible" situations necessitating several repetitions of the procedure in part or total. In order to detect unfruitful attempts at an early stage GPS contains some heuristic "indicators". One of these is the

previously mentioned table of recommended operators; others are criteria for refusing impossible or trivial problems. A further heuristic is that no subgoal may be attempted if it appears to be more difficult than any goal at a higher level. This reflects that GPS expects to partition its task into successively easier subtasks.
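The three goals can be rendered as mutually recursive functions. The sketch below is a toy, not GPS itself: the difference table is collapsed into simple trial of the operators in order, and the state representation and operators are our own:

```python
def transform(state, goal, operators, depth=6):        # Goal 1
    """Return a list of operator names leading from state to goal,
    or None if no chain is found within the depth bound."""
    if state == goal:
        return []                                      # no difference: success
    if depth == 0:
        return None                                    # cut off long goal chains
    return reduce_difference(state, goal, operators, depth)

def reduce_difference(state, goal, operators, depth):  # Goal 2
    # Stand-in for the table look-up: try the operators in listed order.
    for name, op in operators:
        new_state = op(state)                          # Goal 3: apply the rule
        if new_state == state:
            continue
        rest = transform(new_state, goal, operators, depth - 1)
        if rest is not None:
            return [name] + rest
    return None

# Toy task: transform the number 2 into 11 with two rules.
ops = [("add3", lambda x: x + 3), ("double", lambda x: x * 2)]
print(transform(2, 11, ops))  # -> ['add3', 'add3', 'add3']
```

Even this toy shows the "impossible situation" behavior: when no operator chain within the depth bound reaches the goal, the recursion unwinds and returns None.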

The Planning Heuristic


In many cases it is possible to reduce the number
an initial state A and a goal B by insertion of a num
Let us assume that our problem requires a choice b
natives at each stage and that the expected numbe
thus giving a total of MN possible alternatives. Th
sequential and equidistant subgoals will reduce the
sibilities to S .M-/ (for M= 10, N= 15, and S= 5 initi
tives are reduced to 5.103). This means that any m
such subgoals however crude it is may prove to be
method which in many cases can offer some h
heuristic proposed by Newell, Shaw, and Simon fo
idea behind this heuristic is to strip the original p
detail, solve the thus simplified problem, and then
a plan for the solution of the original problem.
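The arithmetic of the reduction, using the figures from the text:

```python
# M alternatives per stage, N stages, S equidistant subgoals (from the text).
M, N, S = 10, 15, 5
unplanned = M ** N           # 10^15 paths without subgoals
planned = S * M ** (N // S)  # S segments, each searched to depth N/S
print(unplanned, planned)    # -> 1000000000000000 5000
```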

Particular Heuristics
In several problem-environments some particular structure may be
utilized in order to reduce the number of possible alternatives.
Methods which are efficient in such special environments are here
called particular heuristics.
The basic ability of SEP 1 is to extrapolate polynomials. As the
previously described general form can generate sequences which are
not necessarily polynomial in structure SEP 1 must utilize methods
which break down the input-sequence to polynomial sub-sequences.
Let us study a particular example.
"Analyze the sequence 5 56 729 10240 203125 in which one of the
entries may be erroneous, and print out its general form."
Subgoal 1:16 Find the general expression of the exponent. We know that
16 The procedures discussed can easily be generalized for more complicated
general expressions.

this is a polynomial of at most the order 1. Our problem is then to isolate the exponent. The only information available is that the exponent cannot assume a value larger than the number of occurrences of the most frequent prime-factor indicates (there are three exceptions from this rule17). Let us compute these numbers; they are:
1 3 6 11 6
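The "number of occurrences of the most frequent prime-factor" for each entry can be computed directly; a small sketch (trial division is our own implementation choice):

```python
def max_prime_multiplicity(n):
    """Highest multiplicity among n's prime factors, e.g. 56 = 2^3 * 7 -> 3."""
    best, p = 0, 2
    while p * p <= n:
        count = 0
        while n % p == 0:   # strip factor p and count how often it divides n
            n //= p
            count += 1
        best = max(best, count)
        p += 1
    if n > 1:               # a remaining prime factor has multiplicity 1
        best = max(best, 1)
    return best

print([max_prime_multiplicity(n) for n in [5, 56, 729, 10240, 203125]])
# -> [1, 3, 6, 11, 6]
```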

This sequence is certainly not linear, so a purification is necessary. SEP 1 therefore generates a diagram (a genuinely heuristic device) and plots the above sequence against the entry number.

Figure 2. The number of occurrences plotted against the entry number, with the fitted line.

A line which satisfies the following conditions is fitted to these points:
1. It may not pass above more than 3 points.
2. It shall pass as many points as possible.
3. Its slope must be 0 or a positive integer.
In most cases several such lines exist but by using heuristics for choos-
ing a candidate SEP 1 usually finds the correct one rather soon. In our
case the line passes the following points:
2 3 4 5 6 Giving us the line (X+l).
Subgoal 2: Assuming the above hypothesis as correct we now proceed to identify the basis of the exponential part of the general expression. We know that the exponent indicates the minimum number of occurrences of the prime-factors of the basis. Therefore take the product of all factors which occur at least as many times as the exponent tells. We get the following result:
1 2 3 4 5 Giving us the line (X).
(The same diagrammatic technique as in subgoal 1 is used to produce the linear sequence.)18
17 When the basis is 0 or 1 any exponent gives only one prime-factor, as we
also allow for one error. There are 3 possible exceptions.
18 Note that the polynomial part may contribute with some factors, which
usually disturb the linearity of the assumed basis.

Subgoal 3: Identify the polynomial part. The procedure here is straight-forward. First compute the exponential part:
1 8 81 1024 15625 ... = X^(X+1)
Divide the input by the exponential part:
5 7 9 10 13
An algorithm gives the polynomial part (2X+3).
(NB one error is allowed.)
Test: The general expression (2X+3)·X^(X+1) is now used to generate a sequence which is tested against the input; if at most one dissimilarity is found the general expression is printed out.
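The final test step is easy to reproduce: generate the sequence from the found expression and count the dissimilarities against the input.

```python
def candidate(x):
    # The general expression found by SEP 1: (2X+3) * X^(X+1)
    return (2 * x + 3) * x ** (x + 1)

observed = [5, 56, 729, 10240, 203125]
generated = [candidate(x) for x in range(1, 6)]
errors = sum(g != o for g, o in zip(generated, observed))
print(errors)  # -> 1  (entry 4 is the erroneous one, so the form is accepted)
```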

Among particular heuristics semantic reformulation of the problem


is a method which in many cases can provide better insight than the
original formulation. As an example of such reformulation we will
briefly describe Gelernter's Geometry Proving Machine, which uses
a coded graph to eliminate infeasible solutions. The idea behind this
semantic model is obvious: we know from experience how much easier
it is to prove theorems in geometry when we are allowed to look at
a graph (on paper or in the "mind's eye") than when we have to
work only with the formulation of the theorem in words. The geo-
metry machine consists of three parts, a "Heuristic Computer" acts
as an executive for a "Syntactic Computer" and for a "Diagram Com-
puter". Problems given to the "Syntactic Computer" by the "Heuristic
Computer" give rise to strings of proof-components generated in a
rather straightforward way. These strings are tested for feasibility by
the "Diagram Computer" which studies a suitably coded graph. Ex-
perience has shown that the use of the "Diagram-Computer" efficiently
reduces the exploration of unfruitful attempts. In addition to the dia-
gram the machine contains several other heuristics for determination
of the relative difficulty of problems, for determination of which
subgoal to attack first etc.
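The feasibility test performed by the "Diagram Computer" can be sketched as a numeric filter: candidate assertions are checked in one concrete diagram before any proof effort is spent on them. The coordinates, assertion encoding, and names below are our own toy choices, not Gelernter's:

```python
import math

# One concrete diagram: triangle ABC with M the midpoint of AB.
diagram = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (2.0, 3.0), "M": (2.0, 0.0)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def plausible(assertion):
    """A subgoal false in the diagram cannot be a theorem: prune it."""
    kind, x, y, u, v = assertion
    if kind == "equal_segments":   # does |xy| = |uv| hold in the diagram?
        return math.isclose(dist(diagram[x], diagram[y]),
                            dist(diagram[u], diagram[v]))
    return True                    # unknown assertion kinds are not filtered

candidates = [("equal_segments", "A", "M", "M", "B"),   # true for a midpoint: keep
              ("equal_segments", "A", "C", "A", "B")]   # false in the diagram: prune
print([c for c in candidates if plausible(c)])
```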

Artificial Intelligence: Past, Present, and Future

During the first few years of its existence research in artificial in-
telligence was mainly directed toward particularly interesting pro-
blems, resulting in the development of some powerful special-purpose-

programs. This kind of work certainly proved the possibility of intelligent problem-solving by machine, but did not increase very much our knowledge about how to solve problems in general.
Current research seems to be more concerned with the development of general methods, hopefully leading toward a general theory of artificial intelligence. If this research turns out to be successful we may in the future be able to tie together a couple of general problemsolving processes and include the resulting "machine" as a new member of our organization. GPS must be credited as one of the forerunners of this approach; still more powerful methods, especially when it comes to inductive capability, are however needed. It must here be stressed that the most important area in the search for general problemsolving methods is the study of human thought-processes. When we know more about how people process information the next step is to include our findings in our programs, a task which at that time may be simplified by the development of new computer-languages and/or computer-hardware.
Future uses of artificial intelligence may easily be speculated upon; we feel that given the information presented in this paper the reader himself is in a position to judge and guess. For further information about the field of artificial intelligence we recommend "Computers and Thought" edited and commented by Feigenbaum and Feldman (ref. 7). This volume contains research-reports, discussions, and a very comprehensive bibliography; ingredients which make this book the best possible introduction to the field.

References
1. Armer, P., "Attitudes toward Intelligent Machines" in reference 7.
2. Bernstein, A. et al., "A Chess-Playing Program for the IBM 704 Computer", Proceedings of the Western Joint Computer Conference, pp. 157-159, 1958.
3. Clarkson, G. P. E., "Portfolio Selection: A Simulation of Trust Invest-
ment", Prentice-Hall, Inc., Englewood Cliffs, N.J.; 1962.
4. Clarkson, G. P. E., "A Model of the Trust Investment Process" in refer-
ence 7.

5. Ernst, H. A., "MH-1. A Computer-operated Mechanical Hand", Ph.D. dissertation, MIT, presented at the Western Joint Computer Conference, 1962.

6. Feigenbaum, E. A., "The Simulation of Verbal Learning Behavior",


Proceedings of the Western Joint Computer Conference, pp. 121-132,
1961. Reprinted in ref. 7.
7. Feigenbaum & Feldman eds. "Computers and Thought" McGraw-Hill
Book Company, Inc., New York; 1963.
8. Feldman, J., "Simulation of Behavior in the Binary Choice Experiment",
Proceedings of the Western Joint Computer Conference, pp. 133-
144; 1961. Reprinted in ref. 7.
9. Gelernter, H., "Realization of a Geometry Theorem-Proving Machine",
Proc. International Conf. on Information Processing UNESCO House,
Paris; 1959. Reprinted in ref. 7.
10. Green, B. F., Wolf, A. K., Chomsky, C., and Laughery, K., "Baseball: An
Automatic Question Answerer", Proc. of the Western Joint Computer
Conference, pp. 60-68; 1961. Reprinted in ref. 7.
11. Hunt, E. B., "Concept Formation: An Information Processing Problem",
John Wiley & Sons, Inc., New York; 1962.
12. Kister, J., Stein, P., Ulam, S., Walden, W., and Wells, M., "Experiments
in Chess", Journal of the Association for Computing Machinery, April,
1957, pp. 174-177.
13. Lindsay, R. K., "A Program for Parsing Sentences and Making In-
ferences about Kinship Relations", in Symposium on Simulation
Models: Methodology and Applications to the Behavioral Sciences,
Eds. Hoggatt A. C. and Balderston F. E. South-Western Publishing Co.
Cincinnati, Ohio; 1963.
14. Luce, R. D., and Raiffa, H., "Games and Decisions; Introduction and
Critical Survey", John Wiley & Sons, Inc. New York; 1957.
15. March, J. G., and Simon, H. A., "Organizations", John Wiley & Sons, Inc.
New York; 1958.
16. McCulloch, W. S., and Pitts, W., "A Logical Calculus of the Ideas Im-
manent in Nervous Activity", Bulletin of Mathematical Biophysics;
1943; pp. 115-137.
17. Newell, A., and Simon, H. A., "The Logic Theory Machine-a Complex
Information Processing System", IRE Trans. on Information Theory,
vol. IT-2, pp. 61-79; September, 1956.
18. Newell, A., Shaw, J. C., and Simon, H. A., "Empirical Explorations of
the Logic Theory Machine: a Case Study in Heuristics", Proc. of the
Western Joint Computer Conference, 1957, pp. 218-230. Reprinted in
ref. 7.

19. Newell, A., Shaw, J. C., and Simon, H. A., "Chess-playing Programs and
the Problem of Complexity", IBM Journal of Research and Develop-
ment, vol. 2, No 4, 1958, pp. 320-335. Reprinted in ref. 7.
20. Newell, A., and Simon, H. A., "GPS a Program that Simulates Human
Thought", in Lernende Automaten, R. Oldenburg KG, Munich 1961.
Reprinted in ref. 7.
21. Rosenblatt, F., "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain", Psychological Review, November 1958, pp. 386-407.
22. Samuel, A. L., "Some Studies in Machine Learning using the Game of
Checkers", IBM Journal of Research and Development, July, 1959,
pp. 211-229. Reprinted in ref. 7.

23. Shannon, C. E., "Programming a Computer for Playing Chess", Philosophical Magazine, March 1950, pp. 256-275.
24. Simmons, R. F., "Synthex: Toward Computer Synthesis of Human Language Behavior", in "Computer Applications in the Behavioral Sciences", H. Borko ed., Prentice-Hall Inc., Englewood Cliffs, N.J.; 1962.
25. Simon, H. A., and Kotovsky, K., "Human Acquisition of Concepts for Sequential Patterns", Psychological Review, 1963, no. 6, pp. 534-546.
26. Slagle, J., "A Heuristic Program that Solves Symbolic Integration
Problems in Freshman Calculus", in reference 7.
27. Taube, M., "Computers and Common Sense", Columbia University Press,
New York, 1961.
28. Tonge, F., "A Heuristic Program for Assembly Line Balancing", Prentice-
Hall Inc.; Englewood Cliffs, N.J.; 1962.
29. Tonge, F., "Summary of a Heuristic Line Balancing Procedure", Manage-
ment Science, 1960, 7, pp. 21-42. Reprinted in ref. 7.
30. Turing, A. M., "Computing Machinery and Intelligence", Mind, October
1950, pp. 433-460. Reprinted in ref. 7.
