Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

AI Magazine Volume 8 Number 4 (1987) (© AAAI)

The Yale Artificial


Intelligence Project:
A Brief Historv J

Stephen Slade

This overview of the Yale Artificial Intelli-


gence Project serves as an introduction to
A n AI lab is like a greenhouse.
Researchers develop new ideas
and plant them in programs. The pro-
have considered a range of topics
related to reasoning including non-
monotonic logic, planning with
Scientific Datalink’s microfiche publica- grams are cultivated, hybridized, nur- incomplete knowledge, and reasoning
tion of Yale AI Technical Reports tured. The weaker ideas die out. The about space and time.
stronger ideas are grafted onto new l Cognition and Programming. Elliot
stock and serve as the basis of hearty Soloway, and David Barstow before
new strains. him, have undertaken a scientific
At Yale, there has been a traditional inquiry focused on the task of pro-
summer seminar series at which grad- gramming itself. How could a com-
uate students present their unprepos- puter automatically write programs?
sessing theories to the vocal and criti- How do people learn to program?
cal review of their colleagues. The How can a computer program help
tenor of these discussions is conveyed people learn to write better programs?
by their title: “The Friday Fights.” l Cognitive Science All the work at
Such occasions provide the Yale the Yale AI Project could be termed
researcher opportunities for both part of cognitive science. We have had
pruning and growth. Cultivation by a long association with members of
candor is the standard. This level of the Yale Psychology Department,
peer review has also been experienced especially Robert Abelson and John
by colloquium speakers. Many visi- Black. Psychologists-faculty, visi-
tors to the lab were unprepared for the tors, graduate students-have actively
onslaught. By now though, Yale’s rep- contributed to the AI research efforts,
utation for open debate has led many testing and refining theories of human
speakers to agree to disagree. cognitive processing.
Ideas and theories grow through
this process of natural selection. The
Cognitive Modelling
best are represented here, in this col-
lection of technical reports from the The Yale AI Project began in 1974
past dozen years. These reports are when Roger Schank and Chris Ries-
our harvest-the results of twelve beck came from the Stanford AI Lab,
intense years of work by a large crop via the Istituto per gli Studi Semantici
of researchers. e Cognitivi in Castagnola, Switzer-
In the present article, we survey land, to join the Yale Computer Sci-
this collection, which is now made ence Department.
available on microfiche through Sci- The faculty at Yale, especially Alan
entific Datalink. The work falls into Perlis and Martin Schultz, were very
several areas. supportive of the establishment of an
l Cognitive Modelling This category AI lab. Robert Abelson, in the Psy-
is quite broad, and is used here to chology Department, had a long-
describe the work of Roger Schank standing interest in AI and had
and his students. It includes natural already begun collaboration with
language processing, models of human Schank Graduate students were
memory organization, learning, and quickly drawn into the work. Jim
explanation. Meehan, Wendy Lehnert, Rich
Cullingford, and Gerry DeJong were
l Spatial and Temporal Reasoning.
among the first students involved.
Drew McDermott and his students

WINTER 1987 67
The main research tool was a DEC In the latter case, even the most Inference.CD included a process
PDP-10, equipped with custom-built objective researcher may find himself model for determining the implicit
Sugarman CRT’s and Yale’s own full- tailoring the examples to just those meaning of a sentence.
screen “E” editor, courtesy of Ned cases which he knows the program Ambiguity. The same word can
Irons. can handle. Finally, the experimental have a variety of meanings. (“John
The focus of the initial research was paradigm implicit in this approach gave Mary a kiss.” versus “John gave
natural language processing. At Stan- requires the researcher to build an Mary a book.“) The CD paradigm
ford, Schank had developed the actual computer program. The pro- provided a means of distinguishing
MARGIE system (Schank 1975) with gram is the crucible in which theories among multiple word senses.
his students Goldman, Rieger, and are tested and molded. Without a pro- As the domain of concepts expand-
Riesbeck. MARGIE was used to gram, many of the unstated supposi- ed in the subsequent years, new types
demonstrate the effectiveness of con- tions in a theory are never revealed or of knowledge representations were
ceptual dependency (CD) as a lan- examined. In writing a program, the developed These included primitives
guage-free, canonical meaning repre- researcher must confront these for social acts, attitudes, and objects,
sentation MARGIE would read an assumptions. as well as larger knowledge structures
English sentence, using Riesbeck’s . Psychological Process Model. The built from these primitives, such as
expectation-based parser to build a MARGIE program was a cognitive scripts, plans, goals, memory organi-
CD form which represented the simulation. Not only did it try to per- zation packets (MOPS), thematic orga-
meaning of the sentence MARGIE form tasks that people perform, but it nization packets (TOPS), and explana-
would then make inferences based on tried to simulate the manner in which tion patterns (XPs).
the meaning of the input sentence the human mind works. By compari- One additional feature of the
using Rieger’s inferencing program. son, a computer chess program which MARGIE system that carried over to
Finally, the results of the initial parse, exhaustively searches ahead several Yale researchers was the habit of giv-
as well as the inferences, could be moves may be able to play a fine game ing programs names of people. From
converted back into natural language of chess, but it is unrealistic to con- SAM and ELI through BORIS and
with Goldman’s generator program, sider such a program a model of the CYRUS, this convention has been fre-
producing paraphrases in either way in which a person plays chess. quently adopted.
English or German. The underlying process model in
The MARGIE program, though con- MARGIE comprised three stages: Scripts, Plans, Goals,
sidered a toy system, was an effective parsing, inferencing, and generation. and Understanding
and productive model for the ensuing This basic triad has been the founda- The Yale AI research reports begin
research projects. There were several tion for the subsequent generations of with SAM, the Script Applier Mecha-
salient characteristics of MARGIE programs. The primary focus has been nism, the first computer program to
that have developed as themes in Yale on integrating the three processes to understand stories in context.
cognitive modelling research through allow more interaction with memory. MARGIE had been able to understand
to the present. In recent years, the role of memory simple sentences in terms of the
l Task Orientation. An AI program has transcended the specific natural actions or states which they repre-
should address a specific, real-world language concerns and has come to sented. However, connected text-a
task. The program should model encompass learning and explanation story-could cause a problem if the
something that a person actually does, processes. This development can be series of actions and states could not
rather than an artificial abstraction of viewed as a natural evolution of the be connected through simple infer-
intelligent behavior. MARGIE’s tasks original central inference process. ences. How can a program infer con-
included reading, paraphrase, and . Canonical Representation of text? Two brief stories illustrate the
translation. Subsequent programs Knowledge. The heart of the MARGIE problem.
have created stories, answered ques- system was the conceptual dependen- John picked up a rock. John threw
tions, summarized stories, skimmed cy knowledge representation system. the rock at Mary. The rock hit
stories, professed opinions, related old CD provided a means of representing Mary.
stories to new ones, and engaged in actions and states in a canonical, lan-
conversations. There are several rea- guage independent fashion. A concept In this first case, the actions are
sons for choosing real tasks. First, causally linked. A reader can infer the
represented using the dozen CD prim-
these tasks are in the realm of the connections from one action to anoth-
itives might be expressed in any num-
possible-people provide an existence er and build a causal chain based on
ber of ways in any number of lan-
proof. Second, the researcher has tan- the inferences associated with each
guages. More specifically, CD
gible ways of assessing the results of action. For example, picking up an
addressed a broad range of problems
the program through comparisons object can enable a person to propel
associated with meaning in language.
with human performance Third, the that object.
Translation, Synonymy and Para-
researcher has a ready supply of data. phrase CD insured an identical rep-
John went to a restaurant. He
It is preferable for a program to use resentation for two different sen- ordered a lobster. He paid the
real data instead of canned examples. tences having the same meaning. check and left.

68 AI MAGAZINE
This second story demonstrates Granger built a parsing module that was not about earthquakes at all, but
that some causal chains are derived could infer the meaning of unknown concerned the tragic shooting death of
from context. In this case, the reader words from context. the Mayor of San Francisco. FRUMP
can infer a lot about what John did at SAM had a small repertoire of had taken the lead sentence literally:
the restaurant simply because the scripts, and would read news stories in “San Francisco was shaken by the
reader knows a lot about restaurants. great detail, looking for every bit of death of Mayor Moscone and City
For example, we presume that John meaning it could find. Another Councilman Harvey Milk.” The figu-
ate something-probably a lobster. approach to reading news stories was rative use of language remains an open
The story, though, does not mention explored by Gerald DeJong in FRUMP question for researchers in natural lan-
eating at all. However, the reader [Fast Reading Understanding Memory guage processing.
knows that people usually go to Program). FRUMP was connected In addition to looking at scripts,
restaurants to eat. This type of knowl-
edge about stereotypical events has
been termed a script Scripts were
originally proposed by Schank and
Abelson (1975, 1977) for story under- An AIprogram should address a specific, real-world
standing and are similar to the frame task. The program should model something that a
knowledge structure proposed by
Minsky (1975). person actually does, rather than an artificial
Scripts comprise a set of roles, abstraction of intelligent behavior.
props, goals, locations, and events. In
the restaurant script, notated as
$RESTAURANT, the roles might
include customer, waitress, and cook; directly to the United Press Interna- Yale researchers explored intentionali-
the props could be a menu, table, and tional news wire and could skim news ty One of the earliest programs to
silverware; the locations could be the stories in dozens of different domains, embody goals and plans within the
bar, dining area, and kitchen; and the and produce summaries in several lan- CD paradigm was Jim Meehan’s
events would include arriving, seat- guages. On the DEC-20 (which by TALESPIN, which made up stories
ing, ordering, and so forth. Other 1978 had replaced the PDP-101, similar to the fables of Aesop. The
script-based situations would be tak- FRUMP could process an average program would start with a set of
ing a plane, attending a concert, buy- news story in under ten seconds. characters who wanted to achieve cer-
ing a car, or going to the dentist. FRUMP’s domain knowledge was rep- tain goals. The story would be a narra-
These are situations in which people resented in sketchy scripts that lacked tion of the characters’ attempts at exe-
have sets of expectations based on the detail of SAM’s scripts, but provid- cuting plans to satisfy their goals.
numerous previous episodes. The con- ed a feasible method for capturing the Many of the unsuccessful stories are
stituent events of these episodes are salient details of news stories. striking in exposing inferences that
sufficiently similar to allow a person FRUMP’s world knowledge comprised the program had been unable to make.
[or a program) to make inferences a range of news events including natu- One day Joe Bear was hungry. He
when details are not explicitly stated. ral disasters, such as floods and earth- asked his friend Irving Bird where
SAM was a test of the script knowl- quakes, international incidents, such some honey was. Irving told him
edge structure. After development as breaking diplomatic relations or there was a beehive in the oak
using simple stories about restau- armed conflict, and deaths of famous tree. Joe threatened to hit Irving if
rants, SAM was applied to real news- people. he didn’t tell him where some
paper stories using scripts describing It is important to note that the honey was.
events such as automobile accidents. scripts in FRUMP and SAM were not Here, the program did not know
Like MARGIE, the SAM project was a triggered by keywords like “earth- that it could infer the location of an
group effort. SAM’s parser was ELI quake” or “death,” but by the concepts object from the location of the con-
[English Language Interpreter], which that embody the underlying meaning. tainer of that object. Once this piece
was a revised version of Riesbeck’s This point can be illustrated by an of knowledge was added, the program
original parser. Anatole Gershman error FRUMP made in processing a tried again.
developed a method for parsing com- certain story. From a UP1 wire dis-
One day Joe Bear was hungry. He
plex noun-groups (for example, “Frank patch, FRUMP produced the summary
asked his friend Irving Bird where
Miller, 32, of 593 Foxon Rd., the driv- “There was an earthquake in San Fran-
some honey was. Irving told him
er,“). Rich Cullingford built the actual cisco. Two people were killed.” This
summary initially appeared plausible, there was a beehive in the oak
script applier which was the core of tree. Joe walked to the oak tree.
the system. Wendy Lehnert developed but FRUMP usually provided more
details for an earthquake story, such He ate the beehive.
a conceptually-based question- Examples such as these serve to
answering system that allowed the as the severity of the earthquake and
the amount of damage reported. As it increase an AI researcher’s respect for
program to respond to queries about the complexity and diversity of com-
the content of the stories. Rick happened, the original news report

WINTER 1987 69
monsense knowledge. had produced a number of theories world knowledge must play in lan-
Goals and plans received a different and techniques that could be adopted guage tasks, pointing out the problems
treatment in Jaime Carbonell’s POLI- by others outside Yale. Within the associated with the “syntax module”
TICS program, which modelled sub- Yale AI lab, the new challenge was to approach to language processing. Car-
jective beliefs using a hierarchy of combine the various knowledge struc- bonell, Cullingford, and Gershman
goals. The program would read a news tures and theories in an integrated sys- discussed the crucial role of meaning
headline and interpret it from one of tem. in machine translation. Later, Steve
two opposing perspectives: conserva- In the summer of 1978, Yale was the Lytinen’s MOPTRANS program
tive and liberal. The program would site for a workshop in cognitive sci- embodied an approach to machine
reason about its opponent’s behavior ence. Psychologists, linguists, philoso- translation which integrates seman-
(in this case, the Soviet Union) based phers, and anthropologists convened tics and syntax in a psychologically
on a model of the goals of the oppo- in New Haven for four weeks of AI, motivated fashion.
nent. In the event of competing goals, programming, CD, scripts, and theo-
the program would use counterplan- retical cross-pollination. Most of these Learning, Memory, and Explanation
ning to block an opponent’s goal, or to researchers had little programming The work on conceptual dependency,
make sure that its own goals were not background Chris Riesbeck and Gene scripts, plans, and goals demonstrated
blocked by the opponent. Charniak (a frequent visitor at Yale) the importance of knowledge repre-
The domain of political judgement developed “micro” versions of SAM sentation for cognitive modelling
in POLITICS led to the development and ELI that made those programs tasks. It was clear that people relied
of a new set of semantic primitives to more accessible to neophyte program on a considerable amount of knowl-
represent institutional actions. These edge about the world. It was also
social acts could capture the differ- apparent that much of this knowledge
ence between “John gave Mary a One day Joe Bear was was not innate. People acquire knowl-
book” and “The policeman gave Mary hungry. He asked his edge. To get computers to simulate
a ticket.” The first case is a simple friend Irving Bird where human cognitive behavior, the pro-
transfer of possession, while the sec- grams would have to learn as well.
ond embodies an institutional act some honey was. Irving One of the first Yale learning pro-
based on the implicit authority of the told him there was a bee- grams was developed by Mallory Self-
polity for which the policeman is an ridge. His program modelled the learn-
agent. The social acts were designed
hive in the oak tree. Joe ing of language during a child’s second
to represent institutional actions and walked to the oak tree. I2 months, that is, between the ages
mediated disputes. He ate the beehive. of one and two. At this stage, a child
Two other domains were the sub- presumably understands certain basic
ject of new semantic primitives: This pedagogical approach was later concepts such as specific physical
objects and attitudes Wendy Lehnert expanded into a book which included objects, movement, eating, and so
with Mark Burstein developed a set of five Yale AI programs (Schank and forth. In learning language, the child
primitives to capture inferences about Riesbeck 1981). develops a mapping between
physical objects, such as bottles, pens, Meanwhile, Rick Granger, and later sounds-the words-and these con-
sponges, umbrellas, and shopping Mike Dyer, addressed the synthesis cepts. Selfridge’s program was based
carts. Object representations allow problem. BORIS (Better Organized on extensive protocols, and concen-
the inference of default information Reading and Inference System) was an trated on learning the meaning of
such as location, associated scripts, effort to combine disparate knowledge words and word sequences, rather
and relations. Object primitives types including CD actions, scripts, than syntax rules. Selfridge demon-
embody a psychological approach to plans, goals, interpersonal relations, strated the generality of his system by
the problems of naive physics. role themes, and affect. Dyer’s pro- having it learn not only English, but
The attitudes primitives attempted gram could read stories in great Japanese as well.
to represent a number of dimensions depth-making numerous inferences Beyond the specific task of language
of attitudes including fondness- and tying together various pieces of acquisition, there lies the general
antipathy, fascination-disinterest, information The domain of Dyer’s problem of learning. The issue of
fear-security, attraction-repulsion, BORIS was melodramatic divorce sto- learning can be seen from many per-
jealousy-concern, irritation-comfort, ries, such as one might encounter on a spectives. The computational
respect-disdain, and trust-distrust. soap opera. metaphor leads one to view learning
These semantic primitives provided a These various programs reflected as a matter of updating memory. Sev-
basis for making inferences about the natural language processing eral questions arise.
interpersonal behavior. paradigm that began with MARGIE. l How is human memory organized?
At this point, it should be apparent The role of syntax was secondary to l How are new events assimilated
that a wide range of problems were the problems of meaning representa- into memory?
under investigation. The next stage of tion. Larry Birnbaum provided argu- l How are memories indexed for
development involved both exposi- ments for the roles that meaning and
tion and synthesis. Yale researchers retrieval?

70 AI MAGAZINE
Given the psychological claims ory CYRUS stored and retrieved numerous cases in the domain, and
made for the AI knowledge structures, episodes in the lives of Secretaries of who has generalized this experience
it was important that the predictions State Cyrus Vance and Edmund to apply it to new situations When
made by the theories reflect empirical Muskie. When new events were added confronted with a problem, the expert
findings In an experiment to explore to its memory, CYRUS integrated is reminded of previous, similar prob-
scripts, the psychologists Bower, them with the events it already lems and their respective resolutions.
Black, and Turner (1979) tested sub- knows about. CYRUS answered ques- It may be that the expert has so many
jects who had read script-based sto- tions using memory search strategies exemplars for a given problem that
ries. The results demonstrated that based on reconstructive memory pro- the experiences have been distilled
subjects would confuse elements of cesses into a general rule to be applied.
different stories if the underlying How could story understanding pro- In the production system paradigm,
events were similar For example, a grams benefit from these theories? the rule is hard-wired into the system.
visit to a doctor’s office is similar to The FRUMP program provides an If a rule fails, the system generally
going to the dentist This confusion illustration If FRUMP were given the requires human intervention to revise
suggested that there were not in fact same story to read a dozen times in a the rule. This makes the system more
distinct, mutually exclusive scripts row, it would process the story exact- fragile and less robust
$DOCTOR-VISIT and $DENTIST- ly the same way each time, and pro- Furthermore, most of the knowl-
VISIT. duce the same set of summaries This edge in the system is written strictly
The theory needed revision to repetitive behavior should not be sur- in terms of the domain. While it is
account for this data. Schank pro- prising from a computer, but it would certainly important for the system to
posed a generalization of the script be most unusual for a person. We have much domain-specific knowl-
notion: a higher-level knowledge would expect the person’s reaction to edge, it is also true that people have
structure that organized smaller com- change over time In particular, the the valuable ability to generalize their
ponents. This higher-order script was person should remember having knowledge across domains. That is,
termed a Memory Organization Pack- already seen the story FRUMP’s prob- they can apply general principles
et, or MOP. Thus, there could be a lem was that it could not remember acquired in one setting to a situation
MOP for IJROFESSIONAL-OFFICE- what it had read. involving quite different specific
VISIT which could have complex and Michael Lebowitz’ IPP program knowledge While the production-sys-
variable sets and orders of scenes applied MOPS to the problem of read- tem approach offers a general
This hierarchical view allowed greater ing newspaper stories IPP (Integrated paradigm for expert systems, it does
flexibility. Each scene component Partial Parsing) had two thrusts. The not provide a general mechanism for
would have less variability, but might first was a parsing strategy that applying knowledge from one domain
be shared among several higher-level allowed the program to focus its to another area of expertise.
knowledge structures In the PROFES- attention on the interesting words or A memory-based model of human
SIONAL-OFFICE-VISIT MOP, there phrases, and skip the dull ones. IPP’s expertise can be contrasted with the
would be common components such domain of terrorism had a great deal rule-based approach.
as making an appointment, going to of intrinsic interest 0 Psychologically valid. People don’t
the office, sitting in the waiting room, The second and most significant become experts simply by being told
and paying the bill. Only the dentist aspect of IPP was its ability to remem- the rules. They become experts by
version of this MOP would have a ber stories that it had read and relate extracting the rules from experience
scene for tooth extraction. them to new episodes Using this l Tolerant of rule-failure. Since the
MOPS provide an architecture for MOP-based mechanism, IPP could rules of the system are derived from
human memory of episodes, and sug- detect similarities among stories and experience, a new experience which
gest an explicit mechanism for the arrive at generalizations. For example, violates a rule would be assimilated to
process of reminding. However, it was IPP would notice that the victims of modify the rule.
apparent that people often relate terrorist acts in Northern Ireland were
l Generalizable across domains. The
events that have little surface similar- establishment, authority figures, such
general mechanism of developing
ity, but share an underlying goal as policemen or soldiers, and that the
expertise through experience allows
structure. Schank proposed Thematic terrorists were members of the IRA.
the system to be applied directly to
Organizational Packets or TOPS to When IPP subsequently read a story
new domains.
provide a goal-based indexing mecha- about a policeman being shot in
An expert system that can extract
nism for human memory. These top- Northern Ireland by an unidentified
information from its experience will
ics are explored in depth in Schank gunman, IPP could infer that the gun-
be able to grow and acquire knowl-
(1982). man was a member of the IRA.
edge on its own This is a crucial step
An early application of MOPS was Another broad area of application
for the long-range success of the
Janet Kolodner’s program CYRUS for episodic memory is the area of
expert system concept in AI There
[Computerized Yale Retrieval and expert systems The central feature of
are so many tasks to which automat-
Updating System). CYRUS represents expertise is experience An expert is
ed reasoning power might be applied,
one of the first attempts to model the someone who has vast, specialized
that it is absolutely necessary to
organization of episodic human mem- experience, who has witnessed

WINTER 1987 71
develop a mechanism that can assimi- tence to mete out, based on the results be applied to primary and sec-
late new knowledge directly from judge’s experience. ondary education? How can micro-
experience. Kristian Hammond’s CHEF program computers best be deployed in our
There are several complex research has the task of solving problems by schools?
issues involved in developing this reusing plans for similar problems.
novel approach to expertise. These The plans are recipes, and the prob- Spatial and
include the familiar questions of lems are the creation of new dishes Temporal Reasoning
memory organization and indexing, with specific ingredients. CHEF is
and the underlying learning mecha- able to reason about multiple, inter- Drew McDermott came to Yale from
nism IPP can be viewed as a proto- acting goals, and furthermore, can MIT in 1976. At MIT, McDermott had
type of this approach. learn more general planning heuristics worked with Gerald Sussman in
IPP’s learning was based primarily in the process. When CHEF detects a developing the CONNIVER AI pro-
on noticing similarities between failure in one of its recipes, it can ask gramming language (McDermott and
events. This similarity-based learning itself questions to reason about the Sussman I973), and applied deductive
is clearly part of human cognitive pro- cause of the failure. This process of techniques to problem solving and
planning. Three themes of McDer-
mott’s early work have continued in
The central feature of expertise is experience. his past ten years at Yale.
l Deduction McDermott has based
much of his research on the underly-
cessing, but it does not account for examining failures can generally lead ing paradigm of logical deduction. His
the human ability to reject some simi- to an explanation of the failure. work with Doyle on non-monotonic
larities as mere coincidences while Computer models of human learn- logics is one extension to classical
labelling others as significant. ing provide insights into how people logic to address particular AI problems
Schank proposed that learning is learn. Schank recognized that these of plausible reasoning.
triggered by expectation-failure That theories can be applied to the teaching l Embedded AI Languages. The origi-
is, when we observe a discrepancy of children (Schank 1981). One major nal work on CONNIVER has been
between our predictions and some lesson learned from building comput- extended with the Duck language.
event, we then have something to er programs that read stories is that Practical issues of portability were the
learn. We need to revise our knowl- children, even at the age of 3 or 4, motivation for the NISP language used
edge structure. The mechanism for have a tremendous amount of knowl- in implementing Duck.
updating our knowledge often edge about the world. Computer pro- l Planning. The intellectual testbed
requires explanation. Schank suggests grams have to be fed that knowledge for McDermott’s theories is problem
that explanation plays a central role in order to read. With children, that solving in the physical world. McDer-
in learning and intelligence (Schank knowledge is a rich asset that the edu- mott and his students have explored a
1986). He proposes an explicit knowl- cation process should exploit. In range of issues involved in reasoning
edge structure, explanation patterns recent years, Schank and the author about space and time including
(XI%), that are used to generate, index, have applied AI perspectives to the knowledge representations, planning
and test explanations. construction of educational software strategies, and scheduling heuristics.
The theories of failure-driven learn- for microcomputers. Together with McDermott has termed his research
ing and explanation are being explored Riesbeck and Soloway, we have begun domain theoretical robotics Just as
in several domains, including eco- to look at a wide range of issues to Schank’s work explores the implicit
nomics, law, and cooking. Chris Ries- create effective educational programs knowledge that people have about
beck, together with Jim Spohrer and that address the pressing needs of the intentionality and social domains,
Charles Martin, have developed a pro- schools. McDermott tries to make explicit the
gram in the domain of political eco- In summary, Schank’s work in cog- knowledge that people have about
nomics. The program begins with a nitive modelling has three broad agen- space and time. A robot must be able
novice view of economic questions, das. The first is scientific. What is the to solve problems which require
and becomes increasingly knowledge- nature of the human mind? How do knowledge about spatial and temporal
able through reading the opinions of people remember, learn, and under- phenomena, and the ability to act in
experts, such as Milton Friedman and stand? the absence of complete information.
Lester Thurow. The second is technological. How
William Bain has applied the mem- can we build intelligent machines? Deduction and Duck
ory-based approach to legal reasoning. How can we program computers to Many early AI programs adopted the
Bain observed several lawyers and communicate in natural language, robot planning metaphor, but usually
judges in the context of sentencing learn from experience, and explain in a restricted domain, such as the
convicted criminals. His program, anomalies? blocks world. In these cases, it was
JUDGE, simulates the process of a The final goal is educational. How feasible to represent the complete
judge deciding the appropriate sen- can the scientific and technological world. That is, the robot would have

72 AI MAGAZINE
exact knowledge of the location, size, Non-monotonic logics comprise a substantiated.
and shape of all relevant objects. It is range of logical assumptions, from the l Duck is a general-purpose AI pro-
now clear to most AI researchers that conceivable to the provable, from the gramming language which is particu-
such complete world knowledge is arguable to the doubtless. The point is larly suited to rule-based applications.
not merely an experimental abstrac- that people must rely on inferred Duck provides an interactive debug-
tion-it is delusional. Rarely if ever information to understand the world ging environment, including a conve-
does a person have exact knowledge around them Inference is the rule, not nient way to integrate English lan-
The blocks world thus finessed one of the exception. The problem of reason- guage templates into the program.
the toughest AI problems. reasoning ing from incomplete information per- This means that Duck can explain its
under uncertainty. vades disparate AI domains actions in a pseudo-English, instead of
The typical reasoning paradigm for McDermott implemented default regurgitated code fragments or stack
the blocks world was deductive reasoning in the context of a deduc- frames
retrieval. In a traditional deductive tive retrieval language: Duck Duck Duck has been around for nearly a
logic system, every item in the
database is true-either as an axiom
or as a theorem derived through a
valid inference process. The addition When faced with uncertainty, a person or robot
of facts always increases the number
of facts in the database The size of must make assumptions.
the database is a monotonic increas-
ing function. Deduction provides a
tidy way to keep track of the state of a
closed world.
When faced with uncertainty, a per- has been the primary vehicle for decade. It is a mature programming
son or robot must make assumptions. McDermott’s research, and has also environment for both teaching and
These assumptions become part of been made available to researchers research-especially in domains that
the reasoning process. Non-monoton- outside Yale, in both universities and are rife with uncertainty. In recent
ic logic derives its name from the sig- industry. years, Duck has been recognized out-
nificant property that the addition of Duck has several facets side academia as an effective tool for
a new fact to a database can result in a l Duck is a logic programming lan- building expert systems.
decrease in the size of the database. guage, similar to Prolog. However,
The Realms of the
This curious state is due to the inclu- Duck is a descendent of Hewitt’s orig-
sion of assumptions in the database inal PLANNER language (Hewitt Unknown: Space and Time
along with observed and proven facts. 1969), which pioneered logic-based The domains on which McDermott
The addition of a new fact could programming Duck maintains a rela- and his students have concentrated
result in previous assumptions tional database that supports predicate are space and time-an abundant
becoming invalid and then deleted calculus deductions, using both for- source of reasoning problems. Spatial
from the database. For example, if we ward and backward chaining. reasoning encompasses a range of
are given the statement “John is mar- l Duck is built on top of NISI’ (Nifty problems. These include the follow-
ried to Mary,” we might infer a range LISP)-McDermott’s portable dialect ing:
of things about John including the fol- of LISP that itself runs on top of other l Naive physics. How do physical
lowing. dialects of LISP, including Common forces affect physical objects? For
1. John is an adult male. LISP, Interlisp, Franz LISP, ZetaLISP, example, suppose a robot accidentally
2 John has the usual anatomical and Yale’s own T (Slade 1987). NISP drops a computer out of a third floor
structure. provides a range of features including office window. Where will the com-
3 John can walk, talk, and chew a structure package, type declarations, puter end up and in what condition
gum stream-oriented I/O, closures, and a will it be?
4. John has a primary interest in workspace manager. Duck has access l Navigation. How could a robot rep-
the welfare of Mary and himself. to all the features of NISP. resent knowledge about a familiar ter-
These inferences are based on l Duck provides a reason mainte- rain to allow it to plan a route? For
default reasoning, and they are not nance system which is a mechanism example, in the previous situation,
facts. In particular, if we later are told for plausible reasoning. In particular, what is a reasonable path for the robot
“John died, I’ we must revise our Duck maintains justification links, or to take to get to the new location of
beliefs to remove any conclusions data-dependencies, for assertion-infer- the computer? Presumably, the robot
that we had inferred based on the ence chains. If some fact is removed should try a route other than that
assumption, directly or indirectly, from the Duck database, then Duck taken by the computer itself.
that John was alive. Thus, learning of can automatically remove all related Design. What physical properties of
l

John’s demise results in our removing beliefs that were derived based solely objects can be exploited to solve prob-
conclusions from the database. on that fact, and thus are no longer lems? What available containers could

WINTER 1987 73
our forlorn robot use to help carry the a state change that occurs over time, revised assumptions. Dean’s program
pieces of the computer back upstairs? we may invoke a rule such as provides a methodology for reasoning
An envelope? A Coke bottle? A desk Rule: If a person is eating, then his about these kinds of temporal dimen-
drawer? A trash can? hunger decreases. sions to planning.
Ernie Davis’ program MERCATOR We might then read the story. Dean’s temporal database is used by
was a computational model of memo- John was hungry. He started eat- David Miller’s task scheduler. A given
ry for spatial relations McDermott plan may have a large number of steps
which can be arranged in a number of
orders. The question for the planner
is What is the best order? The answer
Reading a computer program is like reading a to that question depends on a variety
story-almost. of factors.
l Preemption. Can a task be broken
up into parts that may be executed
and Davis demonstrated that a simple ing a hamburger. He finished
noncontiguously?
Cartesian representation of space is lunch.
l Resources Do plan steps have over-
not appropriate. In general, people do When we observe John eating, we
lapping resource requirements?
not know the precise location and infer that his hunger subsides. That is,
extent of objects in the world Yet, John’s diminished hunger is an l Precedence Is a given step a prereq-
people can nevertheless perform a assumption based on the fact that he uisite to another?
range of spatial reasoning tasks based is eating. However, once John stops l Delays How soon can a given step
on a cognitive map, a conceptual rep- eating, we must remove this assertion be initiated?
resentation of spatial relations from the database. At this point, we l Deadlines How soon must a given
MERCATOR addressed a number of can no longer infer a decrease in his step be completed?
problems, including the following. hunger, so this fact is erased The sys- l Execution time How long will it
l Inexact knowledge. Many geograph- tem then concludes that John is still take to complete a given step?
ic facts are imprecise or incomplete hungry after eating lunch. Miller developed a representation
MERCATOR used a fuzzy representa- What is needed to rectify this situa- and heuristics for scheduling combi-
tion scheme to capture partial knowl- tion is some notion of effects that per- nations of plans, such as preparing
edge. Locations could be specified sist. Thorny issues of causality often dinner while washing clothes. The
with great imprecision. appear in the guise of temporal prob- program recognizes which steps could
l Retrieval The program accessed a lems. McDermott and his students be reordered or interleaved.
database of geographic knowledge to have addressed these problems from a The work of McDermott and his
accomplish tasks such as finding variety of perspectives. students represents a long-term
routes and answering geographic McDermott devised a temporal research program. These research
queries. A question like “Is the Chi- logic for planning and problem solv- issues are broad in scope. They can
nese restaurant within walking dis- ing. A computer program, the Forbin provide an illuminating perspective to
tance?” does not require calculating planner, was developed to explore many areas of artificial intelligence.
the exact distance involved. issues of temporal reasoning in plan- McDermott has applied his vision and
l Assimilation. The program could ning. Forbin builds plans top-down, methodology in his introductory AI
learn new geographic facts. That is, expanding and analyzing the sub- textbook, written with Charniak
the program would revise its cognitive tasks. Forbin may have many alterna- (Charniak and McDermott 1985).
map to be congruent with new senso- tive plans in its library for accom-
ry data This learning component was plishing a given task It must decide Cognition and Programming
the primary focus of Davis’ thesis which plan best achieves its goal. In
The domain of spatial relations is this process, Forbin relies on two key The one knowledge domain in which
enormous. Davis argues that almost components: a temporal database and AI researchers feel most at home is
any physical problem includes spatial a task scheduler. computer programming. Many fields
reasoning. Furthermore, many The ordering of events in a plan can of computer science develop tools and
abstract problems benefit from spatial be represented using a time map, anal- techniques to make the construction
analogies. Space invades our thoughts ogous to Davis’ cognitive map, but of computer programs easier to learn,
The companion of space is time with distinct properties. The cognitive more accurate, quicker, and generally
Like space, time is pervasive and peo- map was a geographical database; the more efficient. AI research efforts
ple reason about time without com- time map is a temporal database Tom have attempted to automate and
plete knowledge Dean implemented a time map main- model the process of programming.
Using default logic to reason about tenance system to allow a planner to David Barstow’s work represents
time can be revealing and problemat- reason about persistent and transient the automatic programming paradigm.
ic. Data-dependencies and temporal states. A planner faces a dynamic Barstow’s PECOS system is a knowl-
reasoning can have anomalous inter- world of shifting goals, deadlines, edge-based coding expert. Developed
actions. For example, in representing changes in the environment, and at Stanford, PECOS takes a high-level

74 AI MAGAZINE
representation of an algorithm and tain, and so forth. Soloway’s approach, generate programs. In developing
converts it into a concrete implemen- however, is to look less at software these models, CAPP has carried out
tation through a process of gradual per se, but rather, focus on designers, traditional controlled-experiments,
refinement. PECOS’s programming programmers, and maintainers as they analyzed talking-aloud protocols, and
knowledge comprises a set of transfor- engage actively in building and main- constructed computer-based simula-
mation rules. Barstow came to Yale taining software The claim, then, is tions. In sum, the theoretical aspects
after Stanford. He explored both the that by understanding how program- of this research push the frontiers of
applicability of algorithm refinement mers go about comprehending a pro- problem solving and text comprehen-
to new problems and the contrasting gram, one can pinpoint where they are sion research, while practical payoffs
paradigm of deduction-based theorem having difficulties, and thus be in a for software engineering follow direct-
proving for program synthesis, before position to design methodologies, lan- ly from the principled study of the
moving on to Schlumberger. guages, or tools that address the pro- programming process.
Following Barstow, Elliot Soloway grammers’ problems from a princi-
arrived at Yale with a different pled, cognitive viewpoint. AI and Education
approach to studying programming. Soloway and his group have applied A commonly held belief is that teach-
Instead of a rule-based or logical this cognitive approach to a number ing kids computer programming really
methodology, Soloway’s research is of issues in software. teaches them some important prob-
goal-based and psychological. Soloway l Language Design. CAPP has found lem solving strategies that transfer to
established the Cognition and Pro- empirical evidence that in a wide other subject domains. Unfortunately,
gramming Project (CAPP) which class of situations, Pascal’s while con- there is precious little evidence that
focuses on two themes: struct is more difficult to use correct- supports this claim. Soloway’s group
l AI and Software Engineering ly than is Ada’s loop construct. is carrying out a number of studies in
Rather than prescribe how software l Program Documentation. Soloway search of this elusive transfer effect.
should be designed and maintained, and his students have identified spe- For example, they are looking to see if
CAPP’s approach is first to under- cific types of knowledge (for example, learning programming helps students
stand how experts (and non-experts) static and causal knowledge] that to solve certain types of algebra word
design, develop, debug, and maintain maintainers need to abstract from problems more effectively. Unless
programs, Based on this understand- documentation in order to carry out transfer can be shown, then program-
ing, one then is in a better position to effective enhancements. CAPP then ming might best be viewed as simply
design tools and make prescriptions. has gone on to prescribe a documenta- another job skill, along with drafting.
l AI and Education. What should tion format that enables maintainers In fact, this view is gaining accep-
children be taught about program- easy access to these key types of tance in educational circles now.
ming? How should they be taught? AI knowledge. Soloway disputes that vocational view
offers some answers to these key edu- l Programming Instruction Lewis of programming, and is attempting to
cational questions. In particular, Johnson of CAPP has constructed a provide quantitative evidence of the
Soloway has focussed on the develop- system, PROUST, that can identify, elusive transfer effect.
ment of a curriculum for introductory for a class of moderately complex One main reason why transfer has-
programming, which should teach introductory programming assign- n’t been found can be traced to the
more than just the syntax and seman- ments, the non-syntactic bugs in stu- current content of the vast majority of
tics of Pascal. More generally, dents’ programs PROUST’s perfor- introductory programming courses. By
Soloway is exploring the development mance is comparable to a human and large, students are taught the syn-
of computer-based instructional sys- teaching assistant. Currently, tax and semantics of a programming
tems than can deliver high-quality, Soloway’s group is designing a cur- language. A quick look at the intro-
individualized instruction. riculum for an introductory program- ductory programming textbooks will
ming course, as described in the next convince the reader of this claim. The
AI and Software Engineering section. tables of contents are typically orga-
The design, development, and mainte- The particular theoretical perspec- nized in terms of the constructs of the
nance of significant pieces of software tive Soloway brings to the study of language being taught. Based on
programmers can be summed up by studying thousands of programs-and
are complex tasks. One aim of soft-
ware engineering is the development saying. “Reading a computer program bugs-generated by novice program-
of methodologies, languages, and tools is like reading a story-almost.” That mers, Soloway feels that language
that facilitate these tasks. There are is, Soloway has explored the notions constructs are not the problem, but
several approaches to developing such of schema, goal, and plan that were rather that students are having signifi-
developed in the story understanding cant difficulty in “putting the pieces
methodologies or tools. For example,
some researchers advocate a more for- world, and applied them to the pro- together,” that is, composing and coor-
mal approach to software: by provid- gramming world. Building on these dinating constructs, plans, and goals
ing a mathematical grounding for soft- notions, Soloway has developed fine- into a coherent whole. Moreover,
ware, the claim is that the product grained cognitive models of how pro- based on what CAPP researchers have
will be more reliable, easier to main- grammers design, comprehend, and learned about what expert program-

WINTER 1987 75
mers know and about the strategies on the psychological implications of References
they employ, Soloway has identified a scripts, and subsequently, MOPS. Bower, G. H.; Black, J B ; and Turner,
set of concepts that need to be taught Story understanding tasks explored T. J. 1979 Scripts in Memory for Text.
explicitly. In teaching introductory issues in causal coherence of texts, Cognitive Psychology 11.177-220.
programming, Soloway introduces hierarchical memory organization, Charniak, E , and McDermott, D.
concepts such as goals, plans, rules of and the thematic organization of the 1985. Introduction to Artificial Intel-
programming discourse, problem sim- story itself. ligence. Reading, Mass . Addison-Wes-
plification, simulation, reflection on This research program provided per- ley.
past problem solving efforts, and top- spective on the process of psychologi- Hewitt, C 1969. PLANNER: A Lan-
down design. Studies are now under- cal experimentation. Researchers guage for Proving Theorems in
way to assess the effectiveness of this noted a marked contrast between dis- Robots. In Proceedings of the Interna-
new curriculum. covery and verification research. AI tional Joint Conference on Artificial
research tended to generate new Intelligence, 295301. Bedford, Mass :
Cognitive Science hypotheses, while cognitive psycholo- Mitre Corporation.
gy served to test existing ones. While McDermott, D. V., and Sussman, G. J.
Many of the questions posed by artifi-
both roles were important and sug- 1973. The CONNIVER Reference
cial intelligence researchers are not
gested synergism, the Yale psycholo- Manual, Technical Report, 259, AI
new or unique to AI. Other disciplines
gists argued that cognitive psychology Laboratory, Massachusetts Institute of
have examined the problems of lan-
should develop more hypothesis gen- Technology.
guage, mind, and cognition. In recent
eration techniques. Minsky, M 1975. A Framework for
years, researchers from many circles
This dichotomv between verifica- Representing Knowledge. In The Psy-
have reached out across traditional
tion and discovery paradigms is chology of Computer Vision, ed. P.
academic boundaries to explore con-
reflected in another distinction adopt- Winston, 211-277. New York.
trasting perspectives and paradigms.
ed by Abelson and Schank: neat ver- McGraw-Hill.
The intersection of these fields-psy-
sus scruffy In this view, a theory or Schank, R. C. 1986. Explanation Pat-
chology, philosophy, linguistics,
discipline or hypothesis (or researcher) terns. Understanding Mechanically
anthropology, neuroscience, and
can be neat or scruffy. Neat implies and Creatively Hillsdale, N J..
AI-has become known as cognitive
formal, quantitative, and observable Lawrence Erlbaum Associates.
science
Scruffy suggests the antithesis: infor- Schank, R. C. 1982. Dynamic Memo-
Yale researchers embrace this inter-
mal, qualitative, and intuitive ry. A Theory of Learning in Comput-
disciplinary approach. Over the years,
Broadly speaking, AI is scruffy and ers and People Cambridge, England:
we have invited visitors from dis-
cognitive psychology is neat. There Cambridge University Press
parate fields to spend time at Yale to
are clear exceptions, such as automat- Schank, R. C. 1981. Reading and
share their views. They represent a
ic theorem proving in AI and Freudian Understanding Teaching from the
wide spectrum, including George
mentalism in psychology. More Perspective of Artificial Intelligence.
Lakoff, John Ross, James McCawley,
importantly, there is a symbiosis Hillsdale, N. J.: Lawrence Erlbaum
Donald Norman, Joseph Weizenbaum,
between the neats and scruffies of the Associates.
Hubert Dreyfus, John Searle, and Jer-
world. They each need the other, Schank, R. C. 1975. Conceptual Infor-
rold Katz. though they may not be willing to mation Processing. Amsterdam.
The primary source of ongoing
admit it. The scruffies generate boun- North-Holland.
interdisciplinary research has been
tiful harvests of ideas, while the neats Schank, R. C., and Abelson, R. 1977.
Yale psychologists, in particular,
painstakingly sift the grains of truth Scripts, Plans, Goals and Under-
Robert Abelson. In 1979, Schank and
from the pounds of chaff standing Hillsdale, N. J.: Lawrence
Abelson established a formal research
The process continues. AI Erlbaum Associates
and training program at Yale in cogni-
researchers plant the seeds that sur- Schank, R. C., and Abelson, R. 1975.
tive science. Under that aegis, they
vived previous harvests. The cultiva- Scripts, Plans, and Knowledge. In Pro-
attracted many young psychologists
tion proceeds under various guises: ceedings of the Fourth International
including John Black, James Galam-
scientific, technological, and educa- Joint Conference on Artificial Intelli-
bos, Noel Sharkey, Ray Gibbs, Steven
tional gence, 151-157. Los Altos, Calif:
Shwartz, and Steven Read Psychology
William Kaufmann.
graduate students were likewise Acknowledgments Schank, R. C., and Riesbeck, C. 1981.
attracted to the program They includ- This work was supported in part by the Inside Computer Understanding: Five
ed Brian Reiser, Scott Robertson, Defense Advanced Research Projects Agen-
Programs with Miniatures. Hillsdale,
Colleen Seifert, and Valerie Abbott. cy and the Office of Naval Research under
contract N00014-85-K-0108, by the Air
N.J.. Lawrence Erlbaum Associates
The experimental paradigm of psy-
Force Office of Scientific Research under Slade, S 1987. The T Programming
chology provided a vehicle for explor-
contract AFOSR-85-0343, and by the Language: A Dialect of LISP Engle-
ing the AI theories of cognitive pro-
National Library of Medicine under con- wood Cliffs, N.J.: Prentice-Hall.
cessing, and the results of the experi-
tract 1-RO 1-LM0425 1
ments stimulated new AI theories
Much of Black’s early work focussed

76 AI MAGAZINE

You might also like