Professional Documents
Culture Documents
Digital Mapping and Image Understanding
Digital Mapping and Image Understanding
Digital Mapping and Image Understanding
Abstract
Emerging requirements associated with digital mapping pose a broad set of challenging
problems in image understanding research. Currently several leading research centers are
pursuing the development of new techniques for automated feature extraction; for example,
road tracking, urban scene generation, and edge-based stereo compilation. Concepts for
map-guided scene analysis are being defined which will lead to further work in automated
This paper seeks to describe on-going activity in this field and suggest areas for future
research. Research problems range from the organization of large-scale digital image/map
databases for tasks such as screening and assessment, to structuring spatial knowledge for
image analysis tasks, and the development of specialized "expert" analysis components and
workstations have been configured for both film-based and digital image exploitation which
interface conventional image analysts and extracted spatial data in computer assisted systems.
However, the state-of-the-art research capabilities are fragile, and successful concept
demonstrations require thoughtful analysis from both the mapping and image understanding
communities.
1. Introduction
The goal of this paper is to give a broad description of current research and development in
the areas of digital mapping and image understanding. This paper should be viewed as as
position paper, reflecting the authors opinions and observations on directions for current
research and prospects for the; future.
There are many research and development opportunities at the interface between digital
mapping and image understanding. This paper identifies several tasks that we believe require
cooperative efforts, both in the precise identification of the problem and in designing and
evaluating particular concept demonstrations. In Sections 2 and 3 we present a brief overview
of Digital Mapping and Image Understanding. Section 4 describes three prototype research
systems developed by the authors. Section 5 suggests several mapping and feature extraction
tasks that could benefit from cooperative research efforts. In summary, Section 6 outlines
some prospects for the future and the possible impact of image understanding on digital
mapping.
2. Digital Mapping
Conventional mapping, based on processing aerial photography with analog and analytical
instrumentation, requires film input much of the contemporary mapping research, however,
addresses issues of processing digital imagery with digital systems to produce digital map
products. The objective of this section is to provide a frame of reference for current initiatives
in digital mapping research.
The advent of the digital computer led to the analytical characterization of photogrammetric
imaging geometry and to the subsequent development of the real-time analytical stereo
plotter. Rigorous solutions have been implemented for metric camera systems, panoramic
camera systems [12] and non-optical systems such as synthetic aperture radar [27]. Analytical
plotters were designed to generate gridded elevation data directly with interactive profiling
capabilities. Automated stereo plotters based on real-time area correlation have been
developed building on this analytical photogrammetric foundation.
The compilation of feature data generally starts with the identification and delineation of
map features in the stereo model by human photo interpreters skilled in a relevant discipline
such as cartography, forestry, or geology. Then, delineated features are transferred from the
annotated photos or photo overlays to an orthographic manuscript using rigorous
photogrammetric procedures or less precise feature transfer methods. Finally, analog maps
and map overlays are produced from these orthographic manuscripts using conventional
cartographic methods.
The major thrust of digital cartography has been to produce machine readable
representations of spatial data from existing orthographic manuscripts and map sheets. The
initial approach, based on interactive use of a digitizing tablet to manually trace and digitally
encode individual features, was relatively successful and dominates current practice despite
labor intensive requirements. Recently, capital intensive raster scanning systems have
emerged as effective products in high volume applications while other developments pursue
automated and semi-automated line-following techniques.
Much of the mapping research community has transitioned rom film-based to digital
media. There are several reasons for this trend. First, there is an increase in digital
electro-optical and non-optical sensors such as multi-spectral scanners and synthetic aperture
radars. Second, there is the expectation of automated image analysis and feature extraction
using digital image processing technology. Finally, requirements are growing for digital
products in the military and civilian communities, such as digital elevation models (DEM),
orthophoto maps, and geographic information systems. Major problems, include the
maintenance of traditional measurement accuracy, storage requirements for digital images at
the resolution necessary for digital mapping, as well as the transfer of photogrammetric tools
to digital image processing environments.
The growing interest in digital map data involves far more than demands for increased
production quantity and higher metric accuracy. Critical applications such as advanced
computer image generation [7, 8, 28]for flight and sensor simulation can effectively exploit
far more detailed feature data and higher resolution elevation data than currently produced.
Expanding capabilities in digital terrain and: environmental analysis suggests similar needs.
To date, expectations for significant increases in productivity based on digital data extraction
remain largely unfulfilled. A major thrust has been to develop digital system components
that perform the function of traditional analog instruments: digital rectifiers; digital
correlators of elevation compilation; digital light tables for photo interpretation. These
systems and subsystems are largely experimental, and many significant obstacles must be
overcome to realize adequate throughput for operational use.
3. Image Understanding
There are several' partially overlapping disciplines that have historically had application
to feature extraction from aerial imagery. These disciplines can be grouped loosely into
two groups: pattern recognition and image processing, and computer vision and image
understanding. One major difference is that pattern recognition has had its roots in the
interpretation of 2 dimensional shapes in a 2d image, whereas image understanding
explicitly performs interpretation of 2d images as projection of 3d world [1, 4]. The
research goal in image understanding is to ultimately build general vision systems that
approach human level performance in a wide variety of application domains. On the way to
that goal, various application areas such, as natural scene analysis, robotics and manufacturing
assembly, vehicle navigation, and cartography have been used to evaluate the strengths and
limitations of various research approaches. Most research in pattern recognition and image
processing is performed on monocular views of the world, whereas a relatively large body of
work has been done in the area of computational stereo within image understanding.
Computational stereo, the recovery of the three-dimensional characteristics of a scene from
multiple images taken from different points of view [2], has been used in most of the
application areas previously mentioned.
patterns can be used. Broad applications have included linear feature extraction [23,14],
road detection and tracking [26,10,13], symbolic registration and change detection [25],
recursive region segmentation [24, 30]. A good survey and evaluation of model based
vision systems can be found in Minford [3].
In this section we briefly describe three systems whose research goals and capabilities
overlap between digital mapping and image understanding. A common link among these
systems is the interaction between image analysis and spatial databases. The explosive
increase in the availability of digital imagery and image related information makes ending
some small piece of relevant information increasingly difficult. On-line storage of tens of
thousands of images does not help unless the user can quickly locate a feature or landmark
of interest in many different images simultaneously. The same underlying problem exists
from an automated image interpretation standpoint. Symbolic indexing and addressing of
images for a priori knowledge in automated analysis requires many of the same techniques
as in interactive analysis, except that the human image analyst provides the guidance.
Facilities such as on-line image/map databases, signal and symbolic indexing of natural and
man-made features, and spatial reasoning can be viewed in the short term as a semi-automated
tool for increasing productivity of human photo interpreters and analysts, and in the long term,
as the knowledge base for automated rule-based systems capable of detailed analysis including
change detection and the update of map descriptions. We feel that research in building spatial
databases has immediate payoff for both cartographic production environments and as a
component of an automated feature extraction system. The CAPIR system applies rigorous
photogrammetric principles for the generation of spatial data using film-based media with
computer generated superposition. MAPS is a digital system, with a less rigorous
photogrammetric front-end, providing detailed descriptive and display capabilities for
extracted spatial data. It is also an interface to image analysis tools that use the spatial data to
provide knowledge and constraints for feature extraction.
4.1.CAPIR
A program in Computer Assisted Photo Interpretation Research (capir), underway at the
U.S. Army Engineer Topographic Laboratories [15], reflects operational mapping
considerations. A tested CAPIR system has been developed using an APPS-IV analytical
plotter as a computer-interfaced stereoscope to support interactive feature extraction.
Computer graphic displays have been introduced into the optical. Train of the instrument to
provide direct display of encoded spatial data in the stereo model. This capability for
stereoscopic graphic superposition provides, for the first time, a closed-loop system for the
compilation and management of digital spatial data using high resolution, stereo mapping and
reconnaissance imagery. Initial studies have focused on terrain analysis [6] and management
of previously compiled digital map data [9].
The capir system has demonstrated the implementation of rigorous metric procedures to
interface a photo interpreter to a geographic information system. In one sense, it represents
a prototypical digital mapping system. The system serves a second role as a research tested
available for the introduction of other forms of computer-assistance to support spatial data
extraction and management tasks.
world model is represented from multiple images using camera models. Information in this
model may then be used to interpret new images of the area. An important aspect of the
system is that the world model need not be complete in order to be useful; it is usable as it is
developed. Three-dimensional information can be extracted by monocular analysis
(shadow, contour, texture, shading), by stereo analysis and through the combination of
information from these sources into a coherent description. Since, the shape information we
can obtain from images is partial, noisy, and imprecise, having a variety of representations, 3D
image analysis requires versatile and robust geometric reasoning methods. The image/map
database provides a framework in which various three-dimensional shape-extraction
techniques can be developed, evaluated, and integrated.
One near-term task for rule-based systems in digital mapping is to provide constraints so
that image analysis and interpretation tools can be used in spite of their inherently errorful
performance. The goal, then, is to integrate rule-based systems with image analysis
techniques to constrain the search space of possible scene interpretations using domain
and general knowledge. These constraints can be characterized as "what to look for" and
"where to look for it". SPAM is an experimental system that is exploring these ideas in the
context of airport scene analysis [19, 21]. For example, given an image segment that is
tentatively labeled as "linear", particular rules are invoked for segment extension or
evaluation against possible interpretations such as runway, taxiway, or access road. At any
time during the interpretation there are a large number of different partial combinations of
cues that can trigger continually more reined segmentation hypotheses. These partial
interpretations interact with the image processing component by providing simple low-level
segmentation goals. What forces the system to am interpretation is that although large
numbers of hypotheses exist, only a small subset is consistent. Each ailed recognizes
inconsistency or is part of a reasoning chain that may identify an inconsistency.
Inconsistencies can be detected by analysis of geometric relationships between local
features and the application of world knowledge such as the typical length of runways, sizes
of hangars and maintenance buildings, and their relative spatial organization within a general
airport scene, spam makes use of the facilities provided by the maps system to tie feature
descriptions to a geodetic coordinate system <latitude,longitude,elevation>, and to use
camera models (image-to-map correspondence) to predict their location and appearance in the
aerial photography, maps also provides facilities to compute geometric properties and
relationships such as containment, adjacency, subsumed by, intersection, and closest point.
Experience with operational "expert" systems in other domains gives insight into scaling
systems for greater performance. Simply speaking, achieving higher performance is
non-linear. If one can capture 80% of the situations for a small, well-denied task with 500
rules, 90% performance may require 1000 rules, and 95% performance 2000 rules. In an
application area such as digital mapping, any small task, such as reliable extraction of road
networks, may exhibit such behavior. Therefore, it is best not to set requirements such as
90% performance without understanding the complexity of the technology for achieving
simpler goals.
7. Bibliography
[1] Ballard, D. H. and Brown, C. M.. Computer Vision. Prentice-Hall, Englewood Cliffs, NJ,
1982.
[2] Barnard, S. T., and Fischler, M. A. "Computational Stereo." Computing Surveys 14,4
(December 1982), 553-572.
[3] Binford, T. O., "Survey of Model-Based Image Analysis Systems." International Journal of
Robotics Research 1,1 (MIT Press, Spring 1982), 18-64.
[5] Brooks, R. A. "Symbolic Reasoning among 3-D Models and 2-D Images." Artficial
Intelligence 17 (1981), 285-349.
[7] Fainich, M. B. Digital Sensor Simulation at the Defense Mapping Agency Aerospace Center.
Proceedings of the National Aerospace and Electronics Conference (NAECON), May, 1979,
pp. 1242-1246.
[9] Federhen, H. A Spatial Analysis System for Digital Feature Data. Technical Papers of the
50th Annual Meeting, American Society of Photogrammetry, Washington, D.C., March,
1984.