
Gateways: An approach to parsing spatial domains

Eric Chown echown@bowdoin.edu

Department of Computer Science, Bowdoin College, 8650 College Station, Brunswick, ME 04011 USA

Abstract

Cognitive maps are made up of two types of spatial units: landmarks and gateways. While most research has focused upon landmarks, gateways offer an attractive alternative that can open up new possibilities for building spatial representations. One example of this is the crucial role that gateways play in helping to parse environments. Parsing is a crucial step in reasoning about complex environments since it works to break large regions into smaller spaces. Further, since gateways are less directly tied to object recognition they afford a number of advantages for computational models. This paper examines the possibilities of using gateways, structures first proposed as part of the PLAN architecture, as the basis for spatial modeling. Two systems are reviewed which have implemented this methodology and extensions to the research are proposed, including spatial representations of abstract environments.

1. Introduction

It is far from obvious that researchers interested in machine learning should concern themselves with human spatial representation. At its core, the human spatial system relies on a highly developed object recognition system that is far more advanced than its most sophisticated machine counterpart. This object recognition system makes constructing spatial representations (usually called "cognitive maps") relatively simple: a reasonable map of an environment consists of a topologically organized collection of landmarks. Because machines do not have the same capabilities, it can be very reasonably argued that spatial representations should be developed which are more tailored to their abilities. Robots, for example, can use sonar, can efficiently dead reckon, etc. There is a difference, however, between an architectural level description of a structure and the implementation of that same structure. It is our belief that the architecture of the human spatial system is adaptive enough that it can function effectively even with systems such as those typically found in today's robots. Human spatial structures have been tested by hundreds of thousands of years of evolution and have adapted to the point where they can function even in the face of a breakdown in the object recognition system; this happens at night, during fog, and under a variety of other circumstances. Visual object recognition is merely an implementation detail (albeit an extraordinarily complex and useful one) in the human spatial system. For a robot, or even for a human in an abstract environment, this detail can be replaced by alternative systems. The human spatial system can be usefully studied as a system even with many of its details abstracted away. The reasonableness of this approach can be found in the wide range of tasks to which humans apply spatial thinking.

Abstractly it is the task of the spatial system to be able to represent and provide reasoning capacity about environments, physical or otherwise. In many environments this task is challenging because of the tremendous amount of information available. In order to effectively manage complex environments the spatial system must be capable of discarding useless information, and breaking large environments into collections of smaller environments. This process involves a kind of parsing, roughly analogous to how languages are analyzed. Whereas the spoken word might be broken down into individual sentences, then words, syllables, etc., a city can be broken into districts, neighborhoods, blocks, houses, and so on.

The most obvious, and most well studied, spatial unit is the landmark. Landmarks are the unique objects which populate an environment. A system simply armed with a good object recognition system could navigate fairly effectively just by relying upon landmarks (Chown et al., 1995). Of course humans have a superb object recognition system far beyond any computer-based system. Fortunately landmarks are only one way that humans parse environments. Further, with large environments, landmarks are not even
necessarily a reasonable basis on which to build the kind of hierarchical structure necessary for effective reasoning. An alternative first proposed as part of the PLAN architecture (Chown et al., 1995) is called a gateway.

As the name suggests, gateways mark the transitions between different regions of space. In indoor environments doors are the most obvious gateways; in outdoor environments gateways might include mountain passes, bridges, or openings in a forest. Gateways are characteristically places where people stop to pause and look around. This may be because they occur at choice points, such as intersections, or because new information is afforded, such as when arriving at the top of a hill. Those reasons alone strongly argue for the importance of gateways as a place to anchor a cognitive structure; choice points should clearly be central to a structure used for planning. The fact that gateways are also locations which are frequently visited means that the learning process will benefit from repetition.

In the PLAN architecture cognitive maps consist of two integrated halves, one based on landmarks, the other on gateways. As befits an adaptive system, navigation is possible with either alone, but works optimally if both are functioning efficiently. In both cases the spatial units are connected in a network structure. This is a good example of a basic architecture, in this case a topological network, working efficiently with two different implementations. Each implementation results in a subsystem with different strengths and weaknesses, ideally with one complementing the other in that regard. For example indoor environments are often rich with gateways, but lacking in landmarks, while the reverse might be true in many outdoor environments. Given these differences one might predict that people use different navigational strategies in each case. A number of unresolved research issues revolve around how the two systems interact, function separately, etc., as well as how different environments resonate differently to each of the structures.

The rest of this paper consists of a brief review of several salient aspects of the PLAN architecture. That is followed by a description of systems which implement significant portions of PLAN. Finally there is a discussion of extending the architecture to domains other than navigation, including abstract domains.

2. Cognitive Structures

Within the spatial representation community there is one widely agreed upon feature of cognitive maps: a topologically organized collection of landmarks usually called a "route map" (Siegel & White, 1975; Chown et al., 1995). For a system with a highly developed object recognition system route maps are easy to construct, represent space relatively efficiently, and are simple to use in navigation. It is not surprising that roboticists have extensively researched route maps as well. As a biologically-based model, route maps are implemented as connectionist spreading activation networks in PLAN. Planning with this type of system consists of activating the network node corresponding to the current location as well as the one corresponding to the goal. Activity spreads from each node and when the waves of activity meet a path has been found. Tests comparing systems employing this style of planning against human data have found the results to be closely comparable (O'Neill, 1990).

Topological structures, such as networks, are attractive because they capture sequence in a natural way. Since numerous tasks, including navigation, are inherently sequential this serves as an important way of encoding spatial information. Because this also happens to be perhaps the most fundamental way that humans encode spatial information, and also the fastest to learn, it has been the focus of the most research on cognitive mapping.

Gateway-based mapping develops in parallel with the landmark network, but probably at a slower pace (though this is undoubtedly heavily influenced by environmental factors). Each of these systems arises out of one of the two main visual pathways. The landmark network arises from the "what" pathway, and the gateway network (in PLAN gateways are called local maps and networks of them are called regional maps) from the "where" pathway. Again these are different implementations of the same architectural idea. Landmarks stand out to the object recognition system because they are perceptually distinct from their surroundings. Gateways are defined by perceptual distinctiveness as well. A simple way to detect a gateway is to look for places where the perceived background distance suddenly changes, for example when a vista that had been occluded suddenly opens up. In their own way gateways are landmarks defined by structural properties.

Topological networks do not naturally scale. Searching for paths in a structure consisting of a small number of landmarks is fairly simple, but the search grows more difficult as the number of landmarks increases. This is a problem currently being addressed in reinforcement learning systems which also process networks of states (Dietterich, 2000). Unfortunately there is still only a rudimentary understanding of hierarchy in reinforcement learning, and it is clearly central to human spatial representation. The PLAN proposal is that spatial hierarchies arise most naturally out of the structural properties of a given environment. Physical boundaries, for example, can serve as an intrinsic mechanism for separating neighborhoods. It is worth noting that since reinforcement learning systems completely ignore such properties it is difficult to see how they will ever build the same sorts of hierarchical structures as humans. In the PLAN architecture, on the other hand, it is postulated that gateways serve to differentiate regions from one another. Passing through a gateway, as the name implies, results in moving from one region to another. In PLAN gateways are defined according to a set of highly visual criteria; this is not a necessary condition however, and indeed it has been proposed that gateways can serve as the basis of building spatial structure in abstract and nonvisual environments as well (Chown, 1999b).

Gateways allow for natural hierarchies and scaling in several important ways. First, although the structure of a regional map is topological its content is spatially rich. Corresponding to each gateway is a stored collection of scenes. Scenes provide information on relative distance and direction that is otherwise lacking in a purely topological structure. This renders concepts like one object being to the left of another meaningful. Second, gateways provide a useful and simple heuristic for parsing space: the things that can be seen from the current gateway are part of the same region of space, while things which cannot be seen are part of a different region. This heuristic can even be applied in a recursive fashion to imagined views to create higher levels in the hierarchy (Chown et al., 1995).

Implicit in this representation is the fact that a spatial system built on this scheme is egocentrically organized. There is substantial evidence that humans organize space this way (Shelton & McNamara, 1997) and there is a great deal of payoff for such a scheme. Since gateways occur at critical, often visited, environmental locations they are exactly at the places where decisions need to be made, turns taken, etc. The information is stored in a semantically transparent fashion (Smolensky, 1988) and is therefore simply extracted. It may be possible to construct an allocentric representation from a collection of gateways but there is no obvious payoff and potentially a great deal of cost in terms of extra processing and the introduction of errors.

Because gateways serve as a natural way to parse environments and because computers and robots can recognize them much more simply than they can identify landmarks, it is reasonable to focus on gateways as the basis for navigation systems.

3. R-PLAN and A-PLAN

We have built two systems which are based more heavily upon gateways than on landmarks. One system is a simulator, the other was implemented on a mobile robot, and neither requires sophisticated object recognition. The first, called A-PLAN (for Agent-PLAN), was implemented on top of the Quake III software engine and has been tested on simulated environments. The second, called R-PLAN (for Robot-PLAN), has been tested on indoor environments (Kortenkamp, 1993).

In indoor environments recognizing gateways is a relatively simple task. A robot moving along a hallway can simply track sonar readings looking for a large change in distance. When one occurs the most likely reason is that an intersection (or a doorway) has been reached, and therefore a gateway. In tests on a mobile robot it was found that a robot could repeatably return to the same gateway location within 70 millimeters on successive trips (Kortenkamp et al., 1992). The simulator, which had the advantage of noise-free sonar, could achieve essentially arbitrary accuracy.

We have experimented with several variations of what to store corresponding to each gateway. In PLAN it was proposed that people store a collection of scenes at each gateway. The resulting structure can be organized in terms of a combination of body, head and eye positions. Each scene in this scheme consists of a few salient landmarks and the contents of the scenes are useful in accessing and distinguishing individual gateways. Since landmark recognition is difficult for robots we have looked towards other representations for these tasks. In both R-PLAN and A-PLAN we have tried fairly simplistic strategies and have found that they are sufficient to achieve excellent results. In A-PLAN we took the snapshot idea literally and stored a series of bitmaps corresponding to what can be seen from each gateway. In a simulated environment with controlled lighting and minimal environmental changes this can be an extremely effective method for doing place recognition. Such a strategy might even be more generally plausible, but would require more extensive computations involving normalizing what is viewed. For A-PLAN to determine which gateway it is at, it merely needs to center itself within the gateway, then take a series of snapshots at predetermined angles (based upon lining up along hallways) and compare those snapshots to stored gateways. The process can be done exceptionally quickly as it involves a simple vector subtraction. Most of the time A-PLAN will not need to do
this as it will be able to predict where it is according to its internal map; this is mainly useful when exploring or trying out new routes. One of the intriguing things about this scheme is that it was achieved without any vision algorithms whatsoever. Gateways were derived purely using sonar, and the identification task was a simple arithmetic operation.

In R-PLAN, facing a real environment, a different approach was used (Kortenkamp, 1993). In R-PLAN, the snapshot of "landmarks" stored with a gateway consisted of a visual scene stripped of everything but vertical lines. Landmark recognition consisted of using Bayes-nets to compare the relative probabilities that the current pattern of vertical lines corresponded to stored patterns. This computationally simple trick proved to be extremely successful in allowing the robot to distinguish where it was in an environment (Kortenkamp, 1993).

R-PLAN and A-PLAN work in similar fashions. As they explore an environment they naturally will come upon gateways. When this happens they "look around" and compare what they see to any gateways they have stored previously. If the gateway is new it too is stored and connected into the cognitive map. As an agent continues to explore an environment it will build up a network of gateways which it can later use to make navigational plans. A-PLAN is actually a more recent system than R-PLAN despite the fact that it was simpler to construct. The reason for building it is that it can serve as a more general purpose simulator. For example, to date both A-PLAN and R-PLAN have only been tested on indoor environments, environments naturally rich with gateways. It is anticipated that making these systems work in an outdoor setting will be much more difficult. It is simple to use sonar to pinpoint a hallway intersection, but it is another matter altogether to determine where on a hilltop a gateway should be anchored.

Both A-PLAN and R-PLAN effectively parse indoor environments without the benefit of a strong object recognition system. The representations they construct do not take direct advantage of the precision possible in computer systems, but it is altogether possible that this is one of the reasons why they are so successful (Chown, 1999a). The gateway construct gives rise to efficient representations which store knowledge in a natural way. This is undoubtedly one of the reasons that humans have been able to use their spatial systems for such a wide variety of tasks other than navigation.

4. Other domains

A-PLAN and R-PLAN represent a proof of concept of the general PLAN architecture. While both systems are true to the general spirit of the architecture, neither is a true model in the sense of functioning in exactly the same way that humans do. That these systems are effective should not be a complete surprise; after all, the human spatial system is used for a wide variety of tasks including many which are not visual at all in character. R-PLAN and A-PLAN abstract away from implementation details and allow an examination of the basic structure of the architecture. This provides credible evidence that this structure could be used on the same sorts of tasks that people process spatially in everyday life. The next step for this research is to apply these same structures to some of these tasks.

We have proposed (Chown, 1999b) that the spatial structures of PLAN are suited for use with tasks that share three characteristic properties:

1. There should be uniquely identifiable objects.

2. Information flow should be sequential.

3. There should be easily identifiable transitions marking context shifts.

For example a website is populated with uniquely identifiable objects (the individual pages) and information normally flows sequentially both within a page and between pages. Transitions from one page to the next are easily located. It is easy to imagine using a PLAN-style system to represent the information on a website. The general possibilities would seem to be virtually limitless. Music, for example, consists of sequences of notes with transitional markers like pauses, crescendos and the like. The regions between these markers consist of various themes and melodies which the composer is free to navigate back to later in a piece. Different musical formats even consist of fairly generic spaces where the paths are fixed, but the content changes. Lakoff and Johnson have done extensive research on the pervasiveness of spatial metaphors in language, providing insight into just how much of cognition is organized on spatial principles (Lakoff & Johnson, 1980).

The next step in the development of PLAN is to pursue these avenues, applying the architecture to disparate domains. There will be a number of pertinent issues to explore along the way. With navigation we have been working on a task which is fairly well understood and for which there is a great deal of empirical data. As
we shift away from navigation to other domains the mapping onto the spatial system will not be so clear cut. While it has been relatively simple to think of domains, like music, for which a spatial representation seems natural, we have not systematically explored this idea. Indeed one area of research we plan to pursue involves making predictions of how hard various environments (abstract and otherwise) would be for a human to learn. Such a system would have clear value in helping guide the design process for environments (for example software systems) in order to make them user friendly. Ultimately the goal should be to develop an understanding of how people use their spatial systems for such a variety of tasks. This knowledge should bring the added benefit of a better understanding of how to build spatial representations on computers.

So far our approach has focused on how people parse environments. The gateway construct is crucial to this process because it suggests how people break up large continuous environments into smaller, more manageable chunks. It is also useful in a machine setting because recognizing gateways appears to be less challenging computationally than recognizing objects. Recognizing gateways is just one step towards a full-scale spatial representation however. There is still the question of how people organize information around gateways. In visual environments scenes are an obvious choice, but in abstract environments this may not be the case. What is a "snapshot" of a musical theme, for example? Should it be stored roughly as it is heard, or translated into a more spatial format? There are a number of these questions to be researched.

5. Concluding Remarks

In the past computational modelers of cognitive maps have focused mainly upon route maps. This has been a useful exercise which has led to the development of a number of important systems, but it has probably also created built-in limitations as to what these systems can achieve; this is both because such systems do not capture the entire range of spatial information and because of the still developing state of computer vision research. Shifting from an object approach to a more purely locational approach opens up a number of interesting new possibilities. In navigation systems it is possible to build effective systems which do not need to rely upon computationally costly and relatively inaccurate vision systems. Further, it provides a much more natural way to approach the issue of building hierarchical structure. Finally, moving away from the focus on object recognition opens up the possibilities of translating spatial systems for use in nonvisual domains. This research is still at a relatively early stage of development, but the results achieved so far, in systems like R-PLAN and A-PLAN, are extremely promising.

Acknowledgements

The author would like to thank Doug Vail, David Kortenkamp and Steve Kaplan for their invaluable help and guidance in this research.

References

Chown, E. (1999a). Error tolerance and generalization in cognitive maps: Performance without precision. In R. Golledge (Ed.), Wayfinding behavior: Cognitive mapping and spatial behavior, 349-369. The Johns Hopkins University Press.

Chown, E. (1999b). Making predictions in an uncertain world: Environmental structure and cognitive maps. Adaptive Behavior, 1-17.

Chown, E., Kaplan, S., & Kortenkamp, D. (1995). Prototypes, location and associative networks: Towards a unified theory of cognitive mapping. Cognitive Science, 19, 1-52.

Dietterich, T. (2000). State abstraction in MAXQ hierarchical reinforcement learning. In S. A. Solla, T. K. Leen and K.-R. Müller (Eds.), Advances in neural information processing systems, 994-1000. MIT Press.

Kortenkamp, D. (1993). Cognitive maps for mobile robots: A representation for mapping and navigation. Doctoral dissertation, The University of Michigan.

Kortenkamp, D., Baker, L., & Weymouth, T. (1992). Using gateways to build a route map. IEEE/RSJ International Conference on Intelligent Robots and Systems.

Lakoff, G., & Johnson, M. (1980). Metaphors we live by. University of Chicago Press.

O'Neill, M. J. (1990). Computer simulation of the cognitive map: A validation study. In R. I. Selby, K. H. Anthony, J. Choi and B. Orland (Eds.), Coming of age: Proceedings of the twenty-first annual conference of the environmental design research association. EDRA.

Shelton, A. L., & McNamara, T. P. (1997). Multiple views of spatial memory. Psychonomic Bulletin & Review, 4, 102-106.

Siegel, A. W., & White, S. H. (1975). The development of spatial representations of large-scale environments. In H. W. Reese (Ed.), Advances in child development and behavior, vol. 10. Academic Press.

Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-74.
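
As an illustration of the planning scheme described in Section 2, where activity spreads from both the current-location node and the goal node until the two waves meet, the following is a minimal sketch. It is not the PLAN implementation: it replaces the connectionist network with a plain breadth-first expansion from both ends, and the adjacency-dictionary format, the function names `plan_route` and `_join`, and the place names are all hypothetical.

```python
from collections import deque


def plan_route(graph, start, goal):
    """Spread outward from start and goal in turn; return a path once the waves meet.

    `graph` is an assumed adjacency dict {node: [neighbour, ...]} with symmetric links.
    Returns a list of nodes from start to goal, or None if no path exists.
    """
    if start == goal:
        return [start]
    from_start = {start: None}  # node -> predecessor on the start-side wave
    from_goal = {goal: None}    # node -> predecessor on the goal-side wave
    frontier_start = deque([start])
    frontier_goal = deque([goal])

    def spread(frontier, seen, other_seen):
        # Advance one wave by a single step; report the node where the waves meet.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in graph.get(node, ()):
                if nxt in seen:
                    continue
                seen[nxt] = node
                if nxt in other_seen:
                    return nxt
                frontier.append(nxt)
        return None

    while frontier_start and frontier_goal:
        meet = spread(frontier_start, from_start, from_goal)
        if meet is None:
            meet = spread(frontier_goal, from_goal, from_start)
        if meet is not None:
            return _join(meet, from_start, from_goal)
    return None  # one wave died out without reaching the other


def _join(meet, from_start, from_goal):
    # Walk back to the start, reverse, then walk forward to the goal.
    path = []
    node = meet
    while node is not None:
        path.append(node)
        node = from_start[node]
    path.reverse()
    node = from_goal[meet]
    while node is not None:
        path.append(node)
        node = from_goal[node]
    return path


# Hypothetical indoor map: nodes are places, edges are traversable links.
building = {
    "lobby": ["hall"],
    "hall": ["lobby", "lab", "stairs"],
    "lab": ["hall"],
    "stairs": ["hall", "roof"],
    "roof": ["stairs"],
}
route = plan_route(building, "lobby", "roof")
```

The sketch returns a valid path where the two waves first touch; it makes no claim about matching the activation dynamics or the human data discussed in the text.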

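
Section 3 describes two simple operations: detecting a gateway as a large change in sonar distance, and identifying a gateway by subtracting stored snapshots from the current view. The sketch below illustrates both under stated assumptions. Range readings are taken to be a list of distances in meters, snapshots are equal-length numeric vectors, and the names `find_gateways` and `match_snapshot` as well as the `jump` and `tolerance` thresholds are hypothetical values chosen for the example, not parameters of A-PLAN or R-PLAN.

```python
def find_gateways(distances, jump=1.0):
    """Return indices where consecutive range readings differ by more than `jump`.

    A large jump is treated as a candidate gateway (for example a doorway
    or hallway intersection opening up beside the robot).
    """
    return [
        i
        for i in range(1, len(distances))
        if abs(distances[i] - distances[i - 1]) > jump
    ]


def match_snapshot(view, stored, tolerance=10.0):
    """Return the name of the stored snapshot closest to `view`.

    Closeness is the sum of absolute element-wise differences (a plain
    vector subtraction). Returns None if no snapshot is within `tolerance`,
    which an exploring agent could treat as "this gateway is new".
    Assumes every snapshot has the same length as `view`.
    """
    best_name, best_error = None, tolerance
    for name, snapshot in stored.items():
        error = sum(abs(a - b) for a, b in zip(view, snapshot))
        if error <= best_error:
            best_name, best_error = name, error
    return best_name


# Hypothetical side-sonar trace along a hallway with one opening.
readings = [0.5, 0.5, 0.6, 3.2, 3.1, 0.5]
candidates = find_gateways(readings)

# Hypothetical stored snapshots, flattened to short vectors.
known = {"g1": [10, 21, 29], "g2": [90, 80, 70]}
here = match_snapshot([10, 20, 30], known)
```

With real sensors the readings would be noisy, so the thresholds would need tuning and probably some smoothing first; the simulator case described in the text avoids this because its sonar is noise free.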