Dept of Physical Geography and Ecosystems Analysis, and
GIS Centre at Lund University
GIS402: Advanced Course in Geographical Information Systems (GIS)
An Internet-based university course

given within Lund University Master’s Programme in GIS (LUMA-GIS)
Digitalisation and
Image Rectification
Copyright 2004. Lund University GIS Centre, Sweden. All rights reserved.
Slide 1:
Digitalisation is the process of converting analogue data to a digital form, or creating new
data directly in digital form. Digitalisation can be done in a number of ways, for instance, by
scanning, using a digitizing table, or digitizing on the computer screen in a GIS environment.
Scanning always creates raster data, while digitizing with a table or on the screen creates
vector data (points, lines and polygons).
Slide 2:
Paper maps can be digitized by using a digitising table, but nowadays it is often quicker and
more cost-efficient to scan a paper map into a digital raster image and then digitize vector
features from it in a GIS. Aerial photographs and satellite images are commonly used as
sources of information from which features are digitized.
Slide 3:
To digitize a paper map, one needs a digitising table (or tablet), and also special software to
communicate between the digitising table and the computer. When using a digitising table, there are
two ways of collecting the coordinates. In “point mode”, the user has full control over the coordinate
collection, as each point must be individually selected. In “stream mode”, points are automatically
collected and transferred to the computer either at specified time intervals (e.g. 1/10 sec) or at specific
distances from each other. Stream mode can only be used to digitise lines and polygons.
Slide 4:
One advantage of on-screen digitising is that, apart from the GIS software, no extra software or
hardware is needed. Users can use, for example, scanned paper maps or photographs as backgrounds,
from which they digitise only the features they want.
Slide 5:
When digitising a linear network, one should be careful that lines that should be connected together
are indeed connected together. Polygons must always be closed, and common borders should never be
digitised twice.
Slide 6:
This slide presents some common terms related to digitalisation. Snapping is an automated editing
operation in which points that are near to other points or lines will be moved slightly so that their
coordinates correspond (e.g. used to make sure that lines really connect to each other). These terms are
explained in more detail in the next slides.
Slide 7:
In the ArcGIS software, one can choose to snap to the start or end nodes of features, to vertices
defined by a coordinate pair, or to the closest edge of a feature. The snapping option is always
activated by first setting the tolerance value at which the snapping will be allowed. The tolerance can
be set either in map units (e.g. metres) or in screen pixels. In most cases, setting the tolerance in
real-world distance units (e.g. metres) is better than using the artificial screen pixel units. In general,
the default value suggested by the software will be appropriate. However, if you experience problems
during digitising, e.g. snapping does not work properly, you should try increasing or decreasing the
tolerance.
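The effect of a snapping tolerance can be illustrated with a minimal sketch. It only covers snapping to existing nodes or vertices (not to edges), and it is not how ArcGIS implements snapping; the function name `snap_point` and the sample coordinates are hypothetical.

```python
import math

def snap_point(point, targets, tolerance):
    """Return the nearest target within tolerance, else the point unchanged."""
    best = None
    best_dist = tolerance
    for target in targets:
        dist = math.dist(point, target)
        if dist <= best_dist:
            best, best_dist = target, dist
    return best if best is not None else point

# Hypothetical existing nodes, in map units (e.g. metres).
existing_nodes = [(100.0, 200.0), (150.0, 250.0)]

print(snap_point((100.4, 199.7), existing_nodes, tolerance=1.0))  # (100.0, 200.0)
print(snap_point((120.0, 220.0), existing_nodes, tolerance=1.0))  # (120.0, 220.0)
```

The first click is 0.5 units from an existing node and is moved onto it; the second is outside the tolerance and keeps its own coordinates.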
Slide 8:
An over-shoot is created when a line that should stop at the intersection point with another line
continues after the intersection point, creating what is known as a “dangling node”. Dangling nodes
can be automatically removed by special algorithms based on a specified “dangling length”, which is
the threshold below which lines ending in a dangling node will be automatically removed. For example,
all over-shoots shorter than 10 meters (dangling length) can be removed. One has to be careful when
using automatic dangling node removal, e.g. if a road database has small “dead-end” roads these should
be kept in the database. Conversely, an under-shoot is created when a line is not quite long enough to
intersect another line (when in fact, it should). In the normal case it is better to over-shoot than
under-shoot when digitising.
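A minimal sketch of over-shoot removal by dangling length follows. It assumes straight two-point lines that are already split at intersections and whose shared endpoints coincide exactly; the function name `remove_short_dangles` and the sample network are hypothetical, and real GIS tools may work differently.

```python
import math
from collections import Counter

def remove_short_dangles(lines, dangle_length):
    """Drop lines with an unconnected end that are shorter than dangle_length."""
    # Count how many line ends meet at each node.
    node_use = Counter()
    for start, end in lines:
        node_use[start] += 1
        node_use[end] += 1

    kept = []
    for start, end in lines:
        # A node used by only one line end is a dangling node.
        is_dangle = node_use[start] == 1 or node_use[end] == 1
        if is_dangle and math.dist(start, end) < dangle_length:
            continue  # treated as an over-shoot and removed
        kept.append((start, end))
    return kept

# Hypothetical network: a road split at (50, 0), a side road, and a 3 m over-shoot.
network = [
    ((0.0, 0.0), (50.0, 0.0)),
    ((50.0, 0.0), (100.0, 0.0)),
    ((50.0, -40.0), (50.0, 0.0)),
    ((50.0, 0.0), (50.0, 3.0)),
]
print(remove_short_dangles(network, dangle_length=10.0))
```

Only the 3 m stub is dropped here. A genuine dead-end road shorter than the dangling length would be dropped in the same way, which is why the threshold has to be chosen with care.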
Slide 9:
Snapping is used during the actual digitalisation process, while fuzzy tolerance is used in a
post-digitizing process to merge features that fall within a defined tolerance of each other.
Identifying an optimal fuzzy tolerance is usually done through “trial and error”. It is
recommended to start with a very small fuzzy tolerance, increasing it gradually if necessary.
This is a useful method to remove “sliver polygons”.
Slide 10:
The weed distance is the minimum distance allowed between coordinates (vertices) on the same line.
Any vertices closer than the weed tolerance will be merged together. In a similar manner, any start or
stop nodes closer than the node match tolerance will be merged together. With a normal cursor, it is
often very difficult to click precisely on a point or a line on the screen, as they do not really have a
spatial dimension. Therefore, to facilitate the selection of such objects, the edit distance defines a
“virtual buffer” making it easier to select features. Features inside the edit distance “buffer” will be
selected.
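The weed distance can be sketched as a simple filter along one line. This is an illustrative assumption rather than a specific software's algorithm: vertices that are too close to the previously kept vertex are dropped instead of averaged, and the start and end nodes are always kept. The function name `weed_vertices` is hypothetical.

```python
import math

def weed_vertices(vertices, weed_tolerance):
    """Drop vertices closer than weed_tolerance to the last kept vertex."""
    if not vertices:
        return []
    kept = [vertices[0]]  # always keep the start node
    for vertex in vertices[1:-1]:
        if math.dist(kept[-1], vertex) >= weed_tolerance:
            kept.append(vertex)
    if len(vertices) > 1:
        kept.append(vertices[-1])  # always keep the end node
    return kept

line = [(0.0, 0.0), (0.2, 0.0), (1.5, 0.0), (1.6, 0.1), (3.0, 0.0)]
print(weed_vertices(line, weed_tolerance=1.0))
# [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)]
```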
Slide 11:
Digitizing points is very straightforward because no topology is needed. One problem to think about
is that with zero (0) snapping tolerance, it can happen that many points are digitized on top of
each other.
Slide 12:
When lines intersect each other, a node is usually automatically created at the intersection point (but
this is software specific). If digitising straight lines, the “stream mode” should be avoided because it
will only create unnecessary data points (vertices). It is always quite tricky to trace complex lines
perfectly. Using the zoom capabilities of the software is one way of improving digitising
results (when digitising on the computer screen).
Slide 13:
When digitizing individual polygons that are not connected to each other, they can be directly
digitized by creating new polygons each time. But when making, for example, a land use map, where
an entire study area must be covered by polygons, the easiest and most appropriate way is to create a
first polygon to start with and then add other polygons to it by “snapping” them together. Most GIS
software packages have two digitising tools – one for individual polygons and one for adding or
attaching polygons to the first one. Consequently, common borders will not be digitised twice.
Sometimes, unique situations (like when polygons are drawn inside other polygons) require the use of
overlay operations such as merge, union, intersect and clip. In such cases, one must experiment to see
which of these four operations is the most appropriate for the specific case.
Slide 14:
The next few slides describe briefly recommended steps when digitising. The first step is to study
carefully the original data from which features will be digitised. It is important to know and document
everything (the metadata of the original map), such as quality indicators, original scale, year of
production, producers, how attributes have been collected, etc. When using a digitizing table to
digitise from a paper map, reference marks (also known as tic marks) with known coordinates on the
map must be used to relate the coordinate system of the digitising table to the real-world coordinate
system. It is also extremely important to know and document all the parameters of the coordinate
system of the original data.
Slide 15:
One advantage of digitizing on the computer screen, compared to digitizing from a paper map, is the
ability to zoom, which will increase precision of the digitised features. It is strongly recommended to
verify the progress regularly on screen and save often to avoid unnecessary work if something goes
wrong. Digitising can be quite tiring and demanding, so it is also recommended to rest often to stay
alert and focused, otherwise errors will be created in the database and/or the quality of the work will
decrease. Depending on the complexity of the database, one can either add attributes during the
digitising process or afterwards.
Slide 16:
The more sophisticated the GIS software, the more options are available to check and edit the
digitised material. Nevertheless, one should zoom in randomly to check connectivity and
evaluate whether the snapping and fuzzy tolerance values were appropriately defined. To check
whether polygons accidentally overlap in the database, a “quick and dirty” method is to randomly
delete a polygon to inspect the “hole” in the database and the borders to surrounding
polygons. The deleted polygon can be restored using the Edit Undo command. It may also be
useful to display map and table side-by-side on screen and click to see that each object has a
corresponding row in the table and vice versa.
Slide 17:
Attributes can be added individually for each feature, or if many features should have the same
attribute (like polygons classified as “coniferous forest”), they can be selected all at once and given the
same attribute.
Slide 18:
In order to collect/digitize data correctly, the original data must of course be correct itself. Aerial
photographs and satellite images in “raw” format are not geometrically correct from the beginning,
and geometric corrections must be performed before they are used as data input. When multiple layers
are used together in a GIS, it is also very important that they match geometrically (have the same
coordinate/reference system). Remotely sensed data have geometric characteristics that require special
software to be corrected in the best way, but it is often possible to obtain a fairly good fit using
standard GIS software.
Slide 19:
Correcting remotely sensed images, or other data, is often done by using Ground Control Points
(GCPs), which are points (on earth) with known coordinates. Each GCP must be identifiable in the
data set to be corrected, so that the same location can be found in both. Coordinates are then compared
and a relationship established (see next slide). The quality of the image correction will normally
increase with the number of GCPs used.
Slide 20:
The coordinates of each GCP and of the same location in the image are compared, and a model
(equation) is applied to describe the “best fit” between the two. A high-quality image correction
requires quite advanced mathematics (non-linear transformation), and such transformations are not
necessarily available in simpler software. Errors are quantified by calculating the Root Mean Square
Error (RMSE).
Slide 21:
Here is an example of the RMSE calculation. ∆Xi equals the difference in the X direction between a
measured point (GCP) and the point at the same location in the map. The difference between the
same points in the Y direction equals ∆Yi. With the Pythagorean theorem, the differences in the X and
the Y direction are combined (∆Li = √(∆Xi² + ∆Yi²)), and calculated for each pair of points (GCP versus
the map). The RMSE is thereafter calculated by summing all ∆Li squared, dividing the sum by the number
of points, and taking the square root of the result.
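The RMSE calculation can be written out as a short sketch. The function name `rmse` and the sample coordinates are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def rmse(gcp_coords, map_coords):
    """Root Mean Square Error between GCPs and the matching map points."""
    if not gcp_coords or len(gcp_coords) != len(map_coords):
        raise ValueError("need two non-empty point lists of equal length")
    squared_lengths = []
    for (xg, yg), (xm, ym) in zip(gcp_coords, map_coords):
        dx = xg - xm  # difference in the X direction
        dy = yg - ym  # difference in the Y direction
        squared_lengths.append(dx * dx + dy * dy)  # squared combined difference
    return math.sqrt(sum(squared_lengths) / len(squared_lengths))

gcps = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
on_map = [(3.0, 4.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(rmse(gcps, on_map))  # 2.5
```

Here one point is off by 3 in X and 4 in Y (a combined difference of 5) and the other three match exactly, so the RMSE is the square root of 25 / 4, i.e. 2.5.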
Slide 22:
Once the GCPs are identified in both maps, a model (equation) is used to make them “fit” with each
other. Linear or non-linear transformations can be applied. Non-linear transformations are more
complex, and might fit better at the actual points, but users should be aware that the fit in between the
points might not be very good. Therefore, it is always safer to use a polynomial equation of low order
for such a transformation. The linear transformation might not give as good results at the actual
points, but compared to high-order polynomial transformations, could give better results between the
points.
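As an illustration of the low-order case, the sketch below fits a first-order polynomial (affine) transformation to control points by least squares. It assumes the model X = a0 + a1·x + a2·y and Y = b0 + b1·x + b2·y; the function names and the sample coordinates are hypothetical, and dedicated software may use other models or solvers.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are too few or collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]

def fit_affine(image_pts, ground_pts):
    """Least-squares fit of X = a0 + a1*x + a2*y and Y = b0 + b1*x + b2*y."""
    rows = [(1.0, x, y) for x, y in image_pts]
    # Normal equations: (A^T A) p = A^T t, solved once for X and once for Y.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs_x = [sum(r[i] * gx for r, (gx, _) in zip(rows, ground_pts)) for i in range(3)]
    rhs_y = [sum(r[i] * gy for r, (_, gy) in zip(rows, ground_pts)) for i in range(3)]
    return solve3(normal, rhs_x), solve3(normal, rhs_y)

def apply_affine(coeffs, point):
    """Transform one image point with the fitted coefficients."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    x, y = point
    return (a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y)

# Hypothetical GCPs: image (column, row) positions and their ground coordinates.
image_pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
ground_pts = [(1000.0, 5000.0), (1200.0, 5000.0), (1000.0, 4800.0), (1200.0, 4800.0)]

coeffs = fit_affine(image_pts, ground_pts)
print(apply_affine(coeffs, (50.0, 50.0)))  # approximately (1100.0, 4900.0)
```

With more than three GCPs the system is overdetermined, so the fitted model generally does not pass exactly through every point; the remaining differences are what the RMSE summarises.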
Slide 23:
This is an example of a “world file”, used to define the location of raster files in a GIS. With such a
file, it is even possible to rotate an image, thereby making a linear transformation, simply by opening
the file and changing the values.
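A sketch of how the six values of a world file map pixel positions to map coordinates follows. It assumes the common six-line layout (A, D, B, E, C, F), where A and E are the pixel sizes in X and Y, D and B are the rotation terms, and C and F are the coordinates of the centre of the upper-left pixel. The values below are hypothetical and are not the ones shown on the slide.

```python
def parse_world_file(text):
    """Read the six world file values in file order: A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six values")
    return tuple(values)

def pixel_to_world(params, col, row):
    """Map a pixel (column, row) to map coordinates."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return (x, y)

# Hypothetical world file: 2 m pixels, no rotation, north-up image.
world_text = """2.0
0.0
0.0
-2.0
500001.0
6200999.0
"""

params = parse_world_file(world_text)
print(pixel_to_world(params, 0, 0))   # (500001.0, 6200999.0)
print(pixel_to_world(params, 10, 5))  # (500021.0, 6200989.0)
```

Setting the second and third values (D and B) to non-zero numbers rotates or shears the image, which is why editing this file amounts to a linear transformation.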
