Digital Mapping and Image Understanding

Digital Mapping and Image Understanding
Abstract
Emerging requirements associated with digital mapping pose a broad set of challenging
problems in image understanding research. Currently several leading research centers are
pursuing the development of new techniques for automated feature extraction; for example,
road tracking, urban scene generation, and edge-based stereo compilation. Concepts for
map-guided scene analysis are being defined which will lead to further work in automated
techniques for spatial database validation, revision and intensification.
This paper seeks to describe on-going activity in this field and suggest areas for future
research. Research problems range from the organization of large-scale digital image/map
databases for tasks such as screening and assessment, to structuring spatial knowledge for
image analysis tasks, and the development of specialized "expert" analysis components and
their integration into automated systems. Significantly, prototype image analysis
workstations have been configured for both film-based and digital image exploitation which
interface conventional image analysts and extracted spatial data in computer assisted systems.
However, the state-of-the-art research capabilities are fragile, and successful concept
demonstrations require thoughtful analysis from both the mapping and image understanding
communities.
Seminar Report Page 1

1. Introduction
The goal of this paper is to give a broad description of current research and development in
the areas of digital mapping and image understanding. This paper should be viewed as as
position paper, reflecting the authors opinions and observations on directions for current
research and prospects for the; future.
There are many research and development opportunities at the interface between digital
mapping and image understanding. This paper identifies several tasks that we believe require
cooperative efforts, both in the precise identification of the problem and in designing and
evaluating particular concept demonstrations. In Sections 2 and 3 we present a brief overview
of Digital Mapping and Image Understanding. Section 4 describes three prototype research
systems developed by the authors. Section 5 suggests several mapping and feature extraction
tasks that could benefit from cooperative research efforts. In summary, Section 6 outlines
some prospects for the future and the possible impact of image understanding on digital
mapping.

2. Digital Mapping
Conventional mapping, based on processing aerial photography with analog and analytical
instrumentation, requires film input much of the contemporary mapping research, however,
addresses issues of processing digital imagery with digital systems to produce digital map
products. The objective of this section is to provide a frame of reference for current initiatives
in digital mapping research.
2.1 Conventional Data Extraction Processes
Reconstruction of a three-dimensional stereo model, recorded by overlapping aerial

photography, represents a fundamental photogrammetric objective which is addressed by a
variety of well-known instrumentation. Classically, elevation data have been compiled
manually from metric stereo models using analog stereo plotters and recorded as contour lines
in orthographic manuscripts. Such equipment and procedures have dominated
photogrammetric practice and continue to serve as the workhorses of modern commercial
practice.
The advent of the digital computer led to the analytical characterization of photogrammetric
imaging geometry and to the subsequent development of the real-time analytical stereo
plotter. Rigorous solutions have been implemented for metric camera systems, panoramic
camera systems [12] and non-optical systems such as synthetic aperture radar [27]. Analytical
plotters were designed to generate gridded elevation data directly with interactive profiling
capabilities. Automated stereo plotters based on real-time area correlation have been
developed building on this analytical photogrammetric foundation.
The compilation of feature data generally starts with the identification and delineation of
map features in the stereo model by human photo interpreters skilled in a relevant discipline
such as cartography, forestry, or geology. Then, delineated features are transferred from the
annotated photos or photo overlays to an orthographic manuscript using rigorous
photogrammetric procedures or less precise feature transfer methods. Finally, analog maps
and map overlays are produced from these orthographic manuscripts using conventional

cartographic methods.
The major thrust of digital cartography has been to produce machine readable
representations of spatial data from existing orthographic manuscripts and map sheets. The
initial approach, based on interactive use of a digitizing tablet to manually trace and digitally
encode individual features, was relatively successful and dominates current practice despite
labor intensive requirements. Recently, capital intensive raster scanning systems have
emerged as effective products in high volume applications while other developments pursue
automated and semi-automated line-following techniques.
2.2 Transition from Film to Digital Media
Much of the mapping research community has transitioned rom film-based to digital
media. There are several reasons for this trend. First, there is an increase in digital
electro-optical and non-optical sensors such as multi-spectral scanners and synthetic aperture
radars. Second, there is the expectation of automated image analysis and feature extraction
using digital image processing technology. Finally, requirements are growing for digital
products in the military and civilian communities, such as digital elevation models (DEM),
orthophoto maps, and geographic information systems. Major problems, include the
maintenance of traditional measurement accuracy, storage requirements for digital images at
the resolution necessary for digital mapping, as well as the transfer of photogrammetric tools
to digital image processing environments.
2.3 Expectations and Capabilities
The growing interest in digital map data involves far more than demands for increased
production quantity and higher metric accuracy. Critical applications such as advanced
computer image generation [7, 8, 28]for flight and sensor simulation can effectively exploit
far more detailed feature data and higher resolution elevation data than currently produced.
Expanding capabilities in digital terrain and: environmental analysis suggests similar needs.
To date, expectations for significant increases in productivity based on digital data extraction
remain largely unfulfilled. A major thrust has been to develop digital system components

that perform the function of traditional analog instruments: digital rectifiers; digital
correlators of elevation compilation; digital light tables for photo interpretation. These
systems and subsystems are largely experimental, and many significant obstacles must be
overcome to realize adequate throughput for operational use.
We believe that the numerous experimental and commercial initiatives to develop

computer-assisted systems, including less expensive analytical plotters, tailored to the
production of digital map data are highly significant. In the long term, highly integrated
digital map production operations based on interactive systems and subsystems should offer
numerous opportunities for the introduction of advanced automated modules leading to
evolutionary increases in throughput and productivity.

3. Image Understanding
There are several' partially overlapping disciplines that have historically had application
to feature extraction from aerial imagery. These disciplines can be grouped loosely into
two groups: pattern recognition and image processing, and computer vision and image
understanding. One major difference is that pattern recognition has had its roots in the
interpretation of 2 dimensional shapes in a 2d image, whereas image understanding
explicitly performs interpretation of 2d images as projection of 3d world [1, 4]. The
research goal in image understanding is to ultimately build general vision systems that
approach human level performance in a wide variety of application domains. On the way to
that goal, various application areas such, as natural scene analysis, robotics and manufacturing
assembly, vehicle navigation, and cartography have been used to evaluate the strengths and
limitations of various research approaches. Most research in pattern recognition and image
processing is performed on monocular views of the world, whereas a relatively large body of
work has been done in the area of computational stereo within image understanding.
Computational stereo, the recovery of the three-dimensional characteristics of a scene from
multiple images taken from different points of view [2], has been used in most of the
application areas previously mentioned.
3.1 Model Based Systems

A second thrust in image understanding research is the application of domain knowledge
to guide scene interpretation. This knowledge can be represented in terms of 3 dimensional
models. Recognition and interpretation requires the extraction and matching of features in
the image to the model. Image features include edges, lines, line junctions, surfaces, and
regions with particular shape, spectral, and texture properties. In order to perform feature
extraction and model building, we must have adequate spatial resolution in the image,
probably higher than is needed by a human to identify the same feature. In addition to the
feature extraction aid model building, we must have adequate constraints on the scene in
order to interpret it. Some common constraints that have been used in several systems[31,
22, 5,11, 20, 21]are camera geometry, knowledge about the scene such as urban
environment vs. countryside or forested area, and spectral properties of generic features
such as buildings, roads, water, etc. Once a general model is constructed, more specific
knowledge such as agricultural and farming styles and building construction methods and
patterns can be used. Broad applications have included linear feature extraction [23,14],
road detection and tracking [26,10,13], symbolic registration and change detection [25],
recursive region segmentation [24, 30]. A good survey and evaluation of model based
vision systems can be found in Minford [3].
3.2 Spatial Knowledge
If a world model is represented explicitly in 3-dimensional geodesic coordinates,

specific spatial knowledge can be applied to the extraction of cartographic features. For
example, it is possible to represent what a geographic area looked like the last time we
updated the map and to predict where to attempt to extract new information. The application
of spatial knowledge, when applied using map-based models projected using a known
camera model, is a relatively new result in image understanding. Previous work in pattern
recognition and image processing has relied on the generation of image-based descriptions
of particular scene features. These descriptions were generally not robust across images
showing differences in scale, camera orientation, and sensor response. The integration of
digital spaial databases with photogrammetry and image processing tools and techniques is
a promising area for research and development in digital mapping.

4. Systems At the Interface
In this section we briefly describe three systems whose research goals and capabilities
overlap between digital mapping and image understanding. A common link among these
systems is the interaction between image analysis and spatial databases. The explosive
increase in the availability of digital imagery and image related information makes ending
some small piece of relevant information increasingly difficult. On-line storage of tens of
thousands of images does not help unless the user can quickly locate a feature or landmark
of interest in many different images simultaneously. The same underlying problem exists
from an automated image interpretation standpoint. Symbolic indexing and addressing of
images for a priori knowledge in automated analysis requires many of the same techniques
as in interactive analysis, except that the human image analyst provides the guidance.
Facilities such as on-line image/map databases, signal and symbolic indexing of natural and
man-made features, and spatial reasoning can be viewed in the short term as a semi-automated
tool for increasing productivity of human photo interpreters and analysts, and in the long term,
as the knowledge base for automated rule-based systems capable of detailed analysis including
change detection and the update of map descriptions. We feel that research in building spatial
databases has immediate payoff for both cartographic production environments and as a
component of an automated feature extraction system. The CAPIR system applies rigorous
photogrammetric principles for the generation of spatial data using film-based media with
computer generated superposition. MAPS is a digital system, with a less rigorous
photogrammetric front-end, providing detailed descriptive and display capabilities for
extracted spatial data. It is also an interface to image analysis tools that use the spatial data to
provide knowledge and constraints for feature extraction.
4.1.CAPIR
A program in Computer Assisted Photo Interpretation Research (capir), underway at the
U.S. Army Engineer Topographic Laboratories [15], reflects operational mapping
considerations. A tested CAPIR system has been developed using an APPS-IV analytical
plotter as a computer-interfaced stereoscope to support interactive feature extraction.

Computer graphic displays have been introduced into the optical. Train of the instrument to
provide direct display of encoded spatial data in the stereo model. This capability for
stereoscopic graphic superposition provides, for the first time, a closed-loop system for the
compilation and management of digital spatial data using high resolution, stereo mapping and
reconnaissance imagery. Initial studies have focused on terrain analysis [6] and management
of previously compiled digital map data [9].
The capir system has demonstrated the implementation of rigorous metric procedures to
interface a photo interpreter to a geographic information system. In one sense, it represents
a prototypical digital mapping system. The system serves a second role as a research tested
available for the introduction of other forms of computer-assistance to support spatial data
extraction and management tasks.
4.2. maps: An Image/Map Database

Over the past three years we have developed the maps system (Map Assisted
Photo-interpretation System) as part of an Image Understanding Systems project at
Carnegie-Mellon University. maps [16,17,18] is a large-scale image/map database
system for the Washington D.C. area which contains approximately 100 high resolution
aerial mapping images, a digital terrain (elevation) database, and map databases rom USGS
and Defense Mapping Agency (DMA). The current database covers approximately a 150
square mile area centered over the District of Columbia. This system is the first to deal with
a complex urban environment, integrating digital image, terrain, and map databases, using
large-scale imagery with high ground resolution. Users are able to interact with a high
resolution image display and query the database for the names, descriptions, and spatial
relationships between natural and man-made map entities. Current research involves the
evaluation of a hierarchical spatial representation to constrain search in large databases, the
application of spatial knowledge to navigate through a map database, and support for
complex geometric and factual queries that arise in cartography. Dynamic symbolic and
signal access to the image/map database, detailed semantic descriptions of man-made, and
natural features, generalized geometric computations of map feature relationships, and an
intelligent window-based image display manager distinguish maps from other work in this
area. Maps represent a step toward the goal of an integrated spaial database, since a 3D
world model is represented from multiple images using camera models. Information in this
model may then be used to interpret new images of the area. An important aspect of the
system is that the world model need not be complete in order to be useful; it is usable as it is
developed. Three-dimensional information can be extracted by monocular analysis
(shadow, contour, texture, shading), by stereo analysis and through the combination of
information from these sources into a coherent description. Since, the shape information we
can obtain from images is partial, noisy, and imprecise, having a variety of representations, 3D
image analysis requires versatile and robust geometric reasoning methods. The image/map
database provides a framework in which various three-dimensional shape-extraction
techniques can be developed, evaluated, and integrated.
4.2. spam: A Rule-Based System

Current know ledge-based systems typically exhibit a rather narrow character ~ often
described-as shallow. Though substantial knowledge is collected in the rules, capable of
being applied appropriately to perform a task, there is no ability to reason further with that
knowledge. The basic semantics of the task domain is not understood by the system. New
rules are not learned from experience nor does behavior with the existing rules become
automatically tuned. To point out such limitations is not to be hypercritical of the current
state-of-the-art. Indeed, the important scientific discovery behind the success of artificial
intelligence knowledge-based expert systems is precisely that sufficient bodies of such
shallow knowledge could be assembled, without any of the supporting reasoning and
understanding ability, and still proves adequate to perform real consultation tasks in the
medical and industrial world.
However, even without general problem solving capabilities, there is much to be

gained by the development of "existence proof rule-based interpretation systems. Current
image processing systems are incapable of high-level description of the results of image
interpretation. We believe that the combination of a map-based world model and rule-based
systems which have "expert" level site-speciic or task-speciic knowledge, can be used to
bridge the gap between users and current state-of-the-art image interpretaion systems. The
long-term goal of this research is to develop systems that can interact with a human
photo-interpreter at a highly symbolic level, that maintain a world database of previous

events, use expert level knowledge to predict areas for fruitful analysis, and integrate the
results of the analysis into a coherent model. In the near-term, useful knowledge about the
organization, strengths, and limitations of this approach will be gained.
One near-term task for rule-based systems in digital mapping is to provide constraints so
that image analysis and interpretation tools can be used in spite of their inherently errorful
performance. The goal, then, is to integrate rule-based systems with image analysis
techniques to constrain the search space of possible scene interpretations using domain
and general knowledge. These constraints can be characterized as "what to look for" and
"where to look for it". SPAM is an experimental system that is exploring these ideas in the
context of airport scene analysis [19, 21]. For example, given an image segment that is
tentatively labeled as "linear", particular rules are invoked for segment extension or
evaluation against possible interpretations such as runway, taxiway, or access road. At any
time during the interpretation there are a large number of different partial combinations of
cues that can trigger continually more reined segmentation hypotheses. These partial
interpretations interact with the image processing component by providing simple low-level
segmentation goals. What forces the system to am interpretation is that although large
numbers of hypotheses exist, only a small subset is consistent. Each ailed recognizes
inconsistency or is part of a reasoning chain that may identify an inconsistency.
Inconsistencies can be detected by analysis of geometric relationships between local
features and the application of world knowledge such as the typical length of runways, sizes
of hangars and maintenance buildings, and their relative spatial organization within a general
airport scene, spam makes use of the facilities provided by the maps system to tie feature
descriptions to a geodetic coordinate system <latitude,longitude,elevation>, and to use
camera models (image-to-map correspondence) to predict their location and appearance in the
aerial photography, maps also provides facilities to compute geometric properties and
relationships such as containment, adjacency, subsumed by, intersection, and closest point.

5. Tasks at the interface

It is interesting to note differences in emphasis and maturity between aerial photogrammetry
and image understanding. First, photogrammetry has 100 years of experience and
refinement, while image understanding is in its early development. Photogrammetry uses
rigorous camera models, both vertical and oblique. In the context of digital mapping, image
understanding has mostly worked with images taken vertically with high scale, and without
the benefit of camera models or ground control. In aerial photogrammetry, accurate 3D
geodetic positions of ground control, which tie the model to the real world, are the sine qua
non. Image understanding, however, makes use of many tools and techniques from
rule-based systems, computer graphics, large-scale databases, the physics and
geometry of image formation, and computer systems. We believe that these differences in
emphasis point to opportunities at the interface. In the following sections we describe
some mapping tasks that could benefit from cooperative research.
5.1. Road Tracking

Work on detecting and tracking roads has been performed using monocular views. Most
techniques rely on image correlation along the road surface or finding significant points on
the edges of the road. Due to a variety of factors such as shadows and occlusions, changes in
road width, cars and trucks on the road, and the inherent noisiness in the image data, the
success of such experimental systems is poor. However, another piece of information that we
believe would greatly improve performance for detection, and especially tracking, is the use of
elevation and slope information. If the grain of resolution for terrain elevation was fine enough,
knowledge about constraints for changes in road slope, as well as road profiles could be used as
another source of information in an overall system. This requires either a detailed digital
elevation database or the ability to recover elevation dynamically using digital stereo.
5.2. Shadow Geometry

The use of shadows has been a traditional tool in photogrammetry. Techniques in
computer graphics for the generation of shadows and indirect illumination could be applied
to predict and detect 3D discontinuities in an image. Ray casting, the projection of light rom
possibly multiple sources, has been used to produce realistic computer generated scenes and
could be applied to prediction of shadows, and the inference of shapes from shadows [29].
5.3. Stereo Compilation

Area correlation has been the basis for most automated stereo compilation
devices, and best performance is achieved in areas with smoothly changing terrain, having
sufficient texture for correlation, and without large jumps in elevation. Emerging techniques
in stereo compilation use a feature-based approach using unipolar constraints. The
feature-based approach shows promise in areas with man-made structures, since these
structures provide a variety of edges and 3D vertices that correspond to the corners, surface
markings, or boundaries of cultural features. The combination of feature-based and
area correlation stereo methods provide complementary disparity information to a
stereo reconstruction system [2]. Further, investigation of the use of coarse terrain models
to constrain high resolution stereo matching shows promise.
5.4. Spatial Databases

As traditional geographic information systems are augmented with
knowledge-based image interpretation modules, we will begin to close the loop
between map feature extraction and its representation for a variety of production tasks.
Such systems require the capability to represent detailed knowledge of foliage and
vegetation, surface material and composition, and hydrology and drainage as well as
man-made structures. There are subtle, and important differences between this type of
spatial knowledge representation and those found in current remote sensing systems.
Remote sensing uses known or modeled spectral properties of various terrain classes to
perform statistical classification. Spectral information captures only one component of
knowledge about the terrain and there are a variety of errors introduced by sensors, such as
atmospherics, and temporal changes that are difficult to model. The construction of spatial
databases containing knowledge from a variety of sources can be achieved as a result of
cooperation and joint projects between the mapping and image understanding communities.
To effectively capture general rules used by photo analysts requires interaction between
domain experts and computer scientists focusing on small, well denied problems with
careful analysis of the strengths and limitations of the approach.
5.5. Workstation Environments

The development of workstation environments for feature extraction will greatly aid the

application of image understanding to digital mapping problems. In order to achieve

increased performance, accuracy, and productivity, we must find ways to combine many of
the steps of traditional feature extraction into a workstation environment. Within this
environment it should be transparent to the user whether the various subtasks are being
performed locally at his workstation or arc performed remotely at specialized processing
sites. This allows for modular development and integration of new capabilities without
adverse impact to the user. The user need not be an expert at each stage of the feature
extraction process, but can be assured that the portions not locally done are being reliably
performed. In order to achieve a level of performance suitable for production requirements,
automated feature extraction programs must have access to a wide variety of information.
It is clear that image intensity alone is not sufficient to perform complex analysis. Other
components of the information model are a priori map knowledge, knowledge of terrain,
hydrography, etc. The workstation should provide the focus for this knowledge; it must be
carefully organized so that information is relevant and specific to the analysis task at hand.

6. Prospects for the Future

It is interesting to speculate about ways in which image understanding will contribute to
automated mapping. First, the contribution will be evolutionary rather than
revolutionary. Some tasks are inherently harder than others. Further, there is a major
difference between capabilities needed to perform interactive or guided scene analysis,
and a truly automated system. The required level of detail of the analysis also affects the time
frame for success. In the near term, interactive tools are most likely, with various degrees of
automation. As these techniques mature, automated systems will evolve.
Experience with operational "expert" systems in other domains gives insight into scaling
systems for greater performance. Simply speaking, achieving higher performance is
non-linear. If one can capture 80% of the situations for a small, well-denied task with 500
rules, 90% performance may require 1000 rules, and 95% performance 2000 rules. In an
application area such as digital mapping, any small task, such as reliable extraction of road
networks, may exhibit such behavior. Therefore, it is best not to set requirements such as
90% performance without understanding the complexity of the technology for achieving
simpler goals.
Secondly, techniques related to image understanding may improve productivity in

unexpected ways. For example, the introduction of powerful image analysis workstations
communicating via local area networks, the use of integrated image databases, and the
ability to reference and display known geographic informaion while compiling new data
will create digital mapping environments where feature extraction and map compilation can
be aided without major breakthroughs in artificial intelligence and image understanding.
As the cost of embedding more powerful processors in local workstations decreases,
work that was partitioned due to organizational artifacts will become more integrated.

7. Bibliography
[1] Ballard, D. H. and Brown, C. M.. Computer Vision. Prentice-Hall, Englewood Cliffs, NJ,
1982.
[2] Barnard, S. T., and Fischler, M. A. "Computational Stereo." Computing Surveys 14,4
(December 1982), 553-572.
[3] Binford, T. O., "Survey of Model-Based Image Analysis Systems." International Journal of
Robotics Research 1,1 (MIT Press, Spring 1982), 18-64.
[4] Brady, M. "Computational Approaches to Image Understanding." Computing Surveys 14,1

(March 1982), 3-71.
[5] Brooks, R. A. "Symbolic Reasoning among 3-D Models and 2-D Images." Artficial
Intelligence 17 (1981), 285-349.
[6] Edwards, D. Terrain Analysis Generation Through Computer-Assisted Photo Interpretation.

Technical Papers of the 49th Annual Meeting, American Society of Photogrammetry,
Washington, D.C., March, 1983.
[7] Fainich, M. B. Digital Sensor Simulation at the Defense Mapping Agency Aerospace Center.
Proceedings of the National Aerospace and Electronics Conference (NAECON), May, 1979,
pp. 1242-1246.
[8] Fainich, M. B. Sensor Image Simulator Application Studies. Proceedings: International

Conference on Simulators, Sept., 1983.
[9] Federhen, H. A Spatial Analysis System for Digital Feature Data. Technical Papers of the
50th Annual Meeting, American Society of Photogrammetry, Washington, D.C., March,
1984.

Digital Mapping and Image Understanding

Uploaded by

Copyright:

Available Formats

You might also like

Digital Mapping and Image Understanding

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Digital Mapping and Image Understanding

Uploaded by

Copyright:

Available Formats

Digital Mapping and Image Understanding

techniques for spatial database validation, revision and intensification.

their integration into automated systems. Significantly, prototype image analysis

Seminar Report Page 1

Seminar Report Page 2

2.1 Conventional Data Extraction Processes

Reconstruction of a three-dimensional stereo model, recorded by overlapping aerial

Seminar Report Page 3

2.2 Transition from Film to Digital Media

2.3 Expectations and Capabilities

Seminar Report Page 4

We believe that the numerous experimental and commercial initiatives to develop

Seminar Report Page 5

3.1 Model Based Systems

3.2 Spatial Knowledge

If a world model is represented explicitly in 3-dimensional geodesic coordinates,

Seminar Report Page 7

4. Systems At the Interface

Seminar Report Page 8

4.2. maps: An Image/Map Database

4.2. spam: A Rule-Based System

However, even without general problem solving capabilities, there is much to be

photo-interpreter at a highly symbolic level, that maintain a world database of previous

Seminar Report Page 11

5. Tasks at the interface

5.1. Road Tracking

5.2. Shadow Geometry

5.3. Stereo Compilation

5.4. Spatial Databases

5.5. Workstation Environments

Seminar Report Page 13

application of image understanding to digital mapping problems. In order to achieve

Seminar Report Page 14

6. Prospects for the Future

Secondly, techniques related to image understanding may improve productivity in

Seminar Report Page 15

[4] Brady, M. "Computational Approaches to Image Understanding." Computing Surveys 14,1

[6] Edwards, D. Terrain Analysis Generation Through Computer-Assisted Photo Interpretation.

[8] Fainich, M. B. Sensor Image Simulator Application Studies. Proceedings: International

Seminar Report Page 16

You might also like