HGES408 Module

Bachelor of Science Honours in Geography and Environmental Studies
Geographic Information Systems
Module HGES 408

Authors: Matawa Farai
Masters in Environmental Policy and Planning (UZ)
Bachelor of Arts Honours in Geography (UZ)
Post-graduate Certificate in Multi-hazard Risk Assessment
and Strategic Environmental Assessment for Spatial
Planning, Faculty of Geo-Information Science and Earth
Observation of the University of Twente (ITC),
Netherlands, 2010.
Sigauke Esther
MSc Environmental Policy and Planning (UZ)
BA Humanities and Social Science (Africa University)

Content Reviewer: Rwasoka Donald T.
Masters of Science in Geo-Information Science and Earth
Observation in Water Resources and Environmental
Management, International Institute for Geo-Information
Science and Earth Observation (ITC), University of Twente
(Netherlands)
BSc Hons in Geography and Environmental Studies
(Midlands State University)

Editor: Mupunga Diana
MA Distance Education (Indira Gandhi National Open
University)
Post Graduate Diploma in Distance Education (IGNOU)
Graduate Certificate in Education (UZ)
BA General (UZ)

Published by: Zimbabwe Open University
P.O. Box MP1119
Mount Pleasant
Harare, ZIMBABWE
The Zimbabwe Open University is a distance teaching and open
learning institution.
Year: October 2013
Cover design: T. Ndhlovu
Layout and design: D. Satumba Nyandowe
Printed by: ZOU Press
Typeset in Times New Roman, 12 point on auto leading
© Zimbabwe Open University. All rights reserved. No part of this
publication may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without the prior permission of the Zimbabwe Open
University.
To the student
The demand for skills and knowledge and the requirement to adjust and
change with changing technology, places on us a need to learn continually
throughout life. As all people need an education of one form or another, it
has been found that conventional education institutions cannot cope with
the demand for education of this magnitude. It has, however, been
discovered that distance education and open learning, now also exploiting
e-learning technology, itself an offshoot of e-commerce, has become the
most effective way of transmitting these appropriate skills and knowledge
required for national and international development.

Since attainment of independence in 1980, the Zimbabwe Government has
spearheaded the development of distance education and open learning at
tertiary level, resulting in the establishment of the Zimbabwe Open
University (ZOU) on 1 March, 1999.

ZOU is the first, leading, and currently the only university in Zimbabwe
entirely dedicated to teaching by distance education and open learning.
We are determined to maintain our leading position by both satisfying our
clients and maintaining high academic standards. To achieve the leading
position, we have adopted the course team approach to producing the varied
learning materials that will holistically shape you, the learner to be an
all-round performer in the field of your own choice. Our course teams
comprise academics, technologists and administrators of varied
backgrounds, training, skills, experiences and personal interests. The
combination of all these qualities inevitably facilitates the production
of learning materials that teach successfully any student, anywhere and
far removed from the tutor in space and time. We emphasize that our
learning materials should enable you to solve both work-related problems
and other life challenges.

To avoid stereotyping and professional narrowness, our teams of learning
materials producers come from different universities in and outside
Zimbabwe, and from Commerce and Industry. This openness enables ZOU to
produce materials that have a long shelf life and are sufficiently
comprehensive to cater for the needs of all of you, our learners in
different walks of life. You, the learner, have a large number of optional
courses to choose from so that the knowledge and skills developed suit the
career path that you choose. Thus, we strive to tailor-make the learning
materials so that they can suit your personal and professional needs. In
developing the ZOU learning materials, we are guided by the desire to
provide you, the learner, with all the knowledge and skill that will make
you a better performer all round, be this at certificate, diploma,
undergraduate or postgraduate level. We aim for products that will settle
comfortably in the global village and competing successfully with anyone.
Our target is, therefore, to satisfy your quest for knowledge and skills
through distance education and open learning.

Any course or programme launched by ZOU is conceived from the
cross-pollination of ideas from consumers of the product, chief among whom
are you, the students and your employers. We consult you and listen to
your critical analysis of the concepts and how they are presented. We also
consult other academics from universities the world over and other
international bodies whose reputation in distance education and open
learning is of a very high calibre. We carry out pilot studies of the
course outlines, the content and the programme component. We are only too
glad to subject our learning materials to academic and professional
criticism with the hope of improving them all the time. We are determined
to continue improving by changing the learning materials to suit the
idiosyncratic needs of our learners, their employers, research, economic
circumstances, technological development, changing times and geographic
location, in order to maintain our leading position. We aim at giving you
an education that will work for you at any time anywhere and in varying
circumstances and that your performance should be second to none.

As a progressive university that is forward looking and determined to be a
successful part of the twenty-first century, ZOU has started to introduce
e-learning materials that will enable you, our students, to access any
source of information, anywhere in the world through internet and to
communicate, converse, discuss and collaborate synchronously and
asynchronously, with peers and tutors whom you may never meet in life. It
is our intention to bring the computer, email, internet chat-rooms,
whiteboards and other modern methods of delivering learning to all the
doorsteps of our learners, wherever they may be. For all these
developments and for the latest information on what is taking place at
ZOU, visit the ZOU website at www.zou.ac.zw

Having worked as best we can to prepare your learning path, hopefully like
John the Baptist prepared for the coming of Jesus Christ, it is my hope as
your Vice Chancellor that all of you, will experience unimpeded success in
your educational endeavours. We, on our part, shall continually strive to
improve the learning materials through evaluation, transformation of
delivery methodologies, adjustments and sometimes complete overhauls of
both the materials and organizational structures and culture that are
central to providing you with the high quality education that you deserve.
Note that your needs, the learner's needs, occupy a central position
within ZOU's core activities.

Best wishes and success in your studies.

_____________________
Prof. Primrose Kurasha
Vice Chancellor

The Six Hour Tutorial Session At
The Zimbabwe Open University
As you embark on your studies with the Zimbabwe Open University (ZOU) by
open and distance learning, we need to advise you so that you can make the
best use of the learning materials, your time and the tutors who are based
at your regional office.

The most important point that you need to note is that in distance
education and open learning, there are no lectures like those found in
conventional universities. Instead, you have learning packages that may
comprise written modules, tapes, CDs, DVDs and other referral materials
for extra reading. All these including radio, television, telephone, fax
and email can be used to deliver learning to you. As such, at the ZOU, we
do not expect the tutor to lecture you when you meet him/her. We believe
that that task is accomplished by the learning package that you receive at
registration. What then is the purpose of the six hour tutorial for each
course on offer?

At the ZOU, as at any other distance and open learning university, you the
student are at the centre of learning. After you receive the learning
package, you study the tutorial letter and other guiding documents before
using the learning materials. During the study, it is obvious that you
will come across concepts/ideas that may not be that easy to understand or
that are not so clearly explained. You may also come across issues that
you do not agree with, that actually conflict with the practice that you
are familiar with. In your discussion groups, your friends can bring ideas
that are totally different from yours and arguments may begin. You may
also find that an idea is not clearly explained and you remain with more
questions than answers. You need someone to help you in such matters.

This is where the six hour tutorial comes in. For it to work, you need to
know that:
· There is insufficient time for the tutor to lecture you
· Any ideas that you discuss in the tutorial, originate from your
  experience as you work on the materials. All the issues raised above are
  a good source of topics (as they pertain to your learning) for
  discussion during the tutorial
· The answers come from you while the tutor's task is to confirm, spur
  further discussion, clarify, explain, give additional information, guide
  the discussion and help you put together full answers for each question
  that you bring
· You must prepare for the tutorial by bringing all the questions and
  answers that you have found out on the topics to the discussion
· For the tutor to help you effectively, give him/her the topics
  beforehand so that in cases where information has to be gathered, there
  is sufficient time to do so. If the questions can get to the tutor at
  least two weeks before the tutorial, that will create enough time for
  thorough preparation.

In the tutorial, you are expected and required to take part all the time
through contributing in every way possible. You can give your views, even
if they are wrong, (many students may hold the same wrong views and the
discussion will help correct the errors), they still help you learn the
correct thing as much as the correct ideas.
You also need to be open-minded, frank, inquisitive and should leave no
stone unturned as you analyze ideas and seek clarification on any issues.
It has been found that those who take part in tutorials actively, do
better in assignments and examinations because their ideas are
streamlined. Taking part properly means that you prepare for the tutorial
beforehand by putting together relevant questions and their possible
answers and those areas that cause you confusion.

Only in cases where the information being discussed is not found in the
learning package can the tutor provide extra learning materials, but this
should not be the dominant feature of the six hour tutorial. As stated, it
should be rare because the information needed for the course is found in
the learning package together with the sources to which you are referred.
Fully-fledged lectures can, therefore, be misleading as the tutor may
dwell on matters irrelevant to the ZOU course.

Distance education, by its nature, keeps the tutor and student separate.
By introducing the six hour tutorial, ZOU hopes to help you come in touch
with the physical being, who marks your assignments, assesses them, guides
you on preparing for writing examinations and assignments and who runs
your general academic affairs. This helps you to settle down in your
course having been advised on how to go about your learning. Personal
human contact is, therefore, upheld by the ZOU.

The six hour tutorials should be so structured that the
tasks for each session are very clear. Work for each
session, as much as possible, follows the structure given
below.
Session I (Two Hours)

Session I should be held at the beginning of the semester. The
main aim of this session is to guide you, the student, on how
you are going to approach the course. During the session, you
will be given the overview of the course, how to tackle the
assignments, how to organize the logistics of the course and
formation of study groups that you will belong to. It is also during
this session that you will be advised on how to use your learning
materials effectively.
Session II (Two Hours)

This session comes in the middle of the semester to respond
to the challenges, queries, experiences, uncertainties, and
ideas that you are facing as you go through the course. In this
session, difficult areas in the module are explained through the
combined effort of the students and the tutor. It should also give
direction and feedback where you have not done well in the
first assignment as well as reinforce those areas where
performance in the first assignment is good.
Session III (Two Hours)

The final session, Session III, comes towards the end of the
semester. In this session, you polish up any areas that you still
need clarification on. Your tutor gives you feedback on the
assignments so that you can use the experience for preparation
for the end of semester examination.
Note that in all three sessions, you identify the areas
in which your tutor should give help. You also take a very
important part in finding answers to the problems posed.
You are the most important part of the solutions to your
learning challenges.
Conclusion

In conclusion, we should be very clear that six hours is too little for
lectures and it is not necessary, in view of the provision of fully
self-contained learning materials in the package, to turn the little time
into lectures. We, therefore, urge you not only to attend the six hour
tutorials for this course, but also to prepare yourself to contribute in
the best way possible so that you can maximally benefit from it. We also
urge you to avoid forcing the tutor to lecture you.

BEST WISHES IN YOUR STUDIES.

ZOU

Contents
Module Overview
Unit One: Introduction to Geographical Information Systems

1.1 Introduction
1.2 Objectives
1.3 Definition of Terms
    1.3.1 What is Geographical Information Systems (GIS)?
    Activity 1.1
    1.3.2 What is data?
    1.3.3 What is information?
    1.3.4 What is a model?
    Activity 1.2
1.4 Components of GIS
    1.4.1 GIS hardware
    1.4.2 Data
    1.4.3 People
    1.4.4 Methods
    Activity 1.3
1.5 Functions of GIS
    1.5.1 Data capture/input
    1.5.2 Data compilation
    1.5.3 Data storage (GIS data models)
    1.5.4 Manipulation
    1.5.5 Analysis
    Activity 1.4
1.6 Advantages and Disadvantages of GIS
    1.6.1 Advantages of GIS
    1.6.2 Disadvantages of GIS
    Activity 1.5
1.7 Summary
    References
Unit Two: Conceptual Models of Real World Phenomena

2.1 Introduction
2.2 Objectives
2.3 Geographic Phenomena
    2.3.1 Fields
    2.3.2 Objects
    Activity 2.1
2.4 Models and Representations of the Real World
    2.4.1 Model
    2.4.2 Types of entities
    Activity 2.2
2.5 Spatial Data Models
    2.5.1 The Raster Data Model
    2.5.2 The Vector Data Model
2.6 Advantages and Disadvantages of Vector and Raster Data
    Activity 2.3
2.7 Summary
    References
Unit Three: Spatial Referencing I

3.1 Introduction
3.2 Objectives
3.3 Reference Surfaces
    3.3.1 Geoid
    3.3.2 The ellipsoid
    Activity 3.1
3.4 Datum
    Activity 3.2
3.5 Coordinate System
    3.5.1 Geographic coordinate system
    3.5.2 2D cartesian coordinates
    3.5.3 2D polar coordinates
    Activity 3.3
3.6 Georeferencing
    Activity 3.4
3.7 Summary
    References
Unit Four: Spatial Referencing II

4.1 Introduction
4.2 Objectives
4.3 What is a Map Projection?
4.4 Types of Map Projections
    4.4.1 Types of normal projections
    Activity 4.1
4.5 Developable Surface Projection Types
    4.5.1 Tangent projections
    4.5.2 Secant projections
    4.5.3 Transverse projections
    4.5.4 Oblique projections
    Activity 4.2
4.6 Distortion Properties of Map Projections
    4.6.1 Conformal map projection
    4.6.2 Equal area map projections
    4.6.3 Equidistant map projection
    4.6.4 Direction
    Activity 4.3
4.7 How Do We Change From One Co-Ordinate System To Another?
    4.7.1 Forward mapping equations
    4.7.2 Inverse mapping equations
    Activity 4.4
4.8 Coordinate Transformation
    Activity 4.5
4.9 Changing Map Projections
    Activity 4.6
4.10 Summary
    References
Unit Five: Spatial Data Capture and Preparation

5.1 Introduction
5.2 Objectives
5.3 Definitions and Concepts
5.4 Methods of Spatial Data Capture
    5.4.1 Digitising
    5.4.2 Keyboard entry
    5.4.3 The Global Positioning System (GPS)
    5.4.4 Interpreting and classifying remotely sensed images
    5.4.5 Importing data from other sources/data file into a GIS
    Activity 5.1
5.5 Sources of Geographic Data
    5.5.1 Primary sources
    5.5.2 Secondary data sources
    5.5.3 Other sources
    Activity 5.2
5.6 Data Quality
    5.6.1 Positional accuracy
    5.6.2 Completeness
    5.6.3 Temporal accuracy
    5.6.4 Lineage
    5.6.5 Logical consistency
    Activity 5.3
5.7 Data Checks and Repairs
    Activity 5.4
5.8 Combining Data from Multiple Sources
    Activity 5.5
5.9 Summary
    References
Unit Six: Spatial Data Management and Processing Systems

6.1 Introduction
6.2 Objectives
6.3 Hardware Trends
6.4 Software Trends
    Activity 6.1
6.5 GIS and Stages of Data Handling
    6.5.1 Spatial data capture
    6.5.2 Spatial data storage and maintenance
    6.5.3 Spatial query and analysis
    6.5.4 Spatial data presentation
    Activity 6.2
6.6 Database Management Systems (DBMS)
    6.6.1 Definitions and concepts
6.7 The Importance of DBMS?
    Activity 6.3
6.8 The Relational Data Model
6.9 Databases
    6.9.1 Steps in creating a database
    Activity 6.4
    6.9.2 Primary key
    Activity 6.5
    6.9.3 Querying a relational database
    6.9.4 Spatial database functionality
    Activity 6.6
6.10 Summary
    References
Unit Seven: Spatial Data Transformations

7.1 Introduction
7.2 Objectives
7.3 Interpolation
    Activity 7.1
    7.3.1 Interpolating discrete data
    Activity 7.2
    7.3.2 Interpolating continuous data
    Activity 7.3
7.4 Summary
    References
Unit Eight: Spatial Data Analysis I

8.1 Introduction
8.2 Objectives
8.3 Spatial Data Analysis
    8.3.1 Logical operators
    8.3.2 Mathematical operators
    Activity 8.1
8.4 Measurement of Vector Data
8.5 Measurements on Raster Data
    Activity 8.2
8.6 Spatial Queries
    Activity 8.3
8.7 Classifications
    8.7.1 Reclassification
    8.7.2 Automatic classification
    Activity 8.4
8.8 Overlay Functions
    8.8.1 Vector based overlay
    8.8.2 Raster based overlay
    Activity 8.5
8.9 Summary
    References
Unit Nine: Spatial Data Analysis II: Neighbourhood Analysis and Network Analysis

9.1 Introduction
9.2 Objectives
9.3 Neighbourhood Analysis
    9.3.1 Proximity computations
    Activity 9.1
    9.3.2 Spread and diffuse computations
    Activity 9.2
    9.3.3 Seek computations
    Activity 9.3
    9.3.4 Spatial operations on continuous surface
    Activity 9.4
    9.3.5 Applications of neighbourhood analysis
9.4 Network Analysis
    Activity 9.5
9.5 Optimal Path Finding
9.6 Trace Analysis
    Activity 9.6
9.7 Summary
    References
Unit Ten: GIS Applications

10.1 Introduction
10.2 Objectives
10.3 Application of GIS in Disaster Risk Management
10.4 Application of GIS in Habitat Mapping (Species Distribution Modelling)
10.5 Application of GIS in Waste Management
10.6 Vector and Disease Management (Epidemiology)
10.7 Fleet Management and Route Planning
10.8 Agricultural Activities
10.9 Governance
    Activity 10.1
10.10 Summary
    References
Module Overview
Geographical Information Systems (GIS) is a set of computer-based
procedures that enables capture, modelling, storage, retrieval,
sharing, manipulation, analysis, and presentation of geographically
referenced data. The capabilities and functions of GIS in capturing, modelling,
storing, retrieving, sharing, manipulating and analysing data have resulted in a
growing trend in the use of GIS in organisations. Thus, all these capabilities
and functions of GIS are covered in the module.
Unit One introduces you to the basic concepts of Geographical Information
Systems. These concepts help us to understand what GIS is all about. The
definition of GIS is captured as well as functions and components of GIS.
The advantages and disadvantages of GIS are also explored. In the unit we
also provide an understanding of the Global Positioning Systems (GPS)
including its functions and associated errors and how they can be corrected.
In Unit Two we focus on the conceptual models of the real world phenomena.
This unit is important to understand because we do not store the real world
into a computer but we use models to represent the real world. The key
concepts covered in this unit include raster and vector data models.
In addition, the unit covers geographic data types, namely nominal, ordinal,
interval and ratio data. These data types determine how a phenomenon should
be represented in the computer.
Units Three and Four focus on spatial referencing in GIS. Unit Three covers
concepts such as ellipsoids, geoids and datums. Coordinate systems
such as geographic, polar and Cartesian coordinates are explained in this
unit. Unit Four goes deeper into the concept of spatial referencing by
introducing map projections. The thrust of Unit Four is the different types of
map projections.
Unit Five covers spatial data capture and preparation methods. The methods
of spatial data capture include digitising, Global Positioning Systems (GPS)
and keyboard entry. In this unit, we also discuss the sources of GIS data
(primary, secondary and others), the sources of error and their possible
remedies. We further discuss data quality, that is, how we can check for
accuracy and precision.
Unit Six focuses on spatial data management and processing systems, that is,
systems that facilitate the management and processing of geo-information. It
also covers hardware and software trends as well as data handling procedures.
In addition, databases, including geographic databases, are covered.
Unit Seven covers the methods of transforming data in GIS. These techniques
include the interpolation of both continuous and discrete data. The interpolation
techniques are nearest neighbour interpolation, Thiessen polygons, trend surface
fitting, triangulation, moving window averaging and kriging.
Units Eight and Nine present the various techniques of spatial data analysis.
These include querying, buffering, classification, reclassification and
neighbourhood functions. In Unit Nine we go further in discussing
neighbourhood analysis, covering proximity computations (buffer zone
generation and Thiessen polygon generation), spread and diffusion computations,
seek computations, spatial operations on continuous surfaces and
network analysis.
Unit Ten covers applications of Geographic Information Systems. These
include disaster risk assessment, species distribution (habitat) modelling,
agriculture management, vector and disease management, fleet management
and route planning, and waste management.

Unit One
Introduction to Geographical
Information Systems
1.1 Introduction
In this unit, you will be introduced to some basic concepts of Geographical
Information Systems (GIS). These concepts will help us to understand
what GIS is all about. We will learn the definition and functions of
GIS. In the unit we also provide an understanding of the Global Positioning
System (GPS), including its functions, its associated errors and how they
can be corrected.
1.2 Objectives
By the end of this unit you should be able to:
• define the terms Geographical Information Systems, data, information,
spatial data and model
• explain the functions of Geographical Information Systems in Geography
and Environmental Studies
• state the advantages and disadvantages of Geographical Information
Systems
• describe how the Global Positioning System works and its functions
• discuss the sources of error in the Global Positioning System
1.3 Definition of Terms

In this section, we shall define the following terms: Geographical Information
Systems, data, spatial data, information and model.
1.3.1 What is Geographical Information Systems (GIS)?

GIS can be defined in several ways. Here are some of the definitions:
• Burrough and McDonnell (1998:6) defined GIS as "a powerful set of
tools for storing and retrieving at will, transforming and displaying spatial
data from the real world for a particular set of purposes".
• It is a special class of information systems that keep track not only of
events, activities and things, but also of where these events, activities and
things happen or exist (Longley et al., 2004:4).
• It is a manual or computer-based set of procedures used to store and
manipulate geographically referenced data (Aronoff, 1989:39).
• It is a special case of information systems where the database consists of
observations on spatially distributed features, activities or events, which
are definable in space as points, lines or polygons (Dueker, 1979:106).
From the given definitions we can grasp that GIS is a computer-based system
that handles geo-referenced data through:
• data input,
• data management (storage and retrieval),
• manipulation and analysis,
• data output/presentation.

From the definitions you can also deduce that GIS, as a tool for spatial analysis,
enables us to answer questions on location, condition, trends, patterns and
modelling. The following are examples of the five main types of questions that
a GIS can answer:
1. Location: What is at…? This question asks what exists at a particular
location. A location can be described in many ways, such as a place name,
post code or geographic reference such as latitude and longitude or x
and y. Examples include:
• What is at the corner of 3rd Street and Kwame Nkrumah Avenue?
• What feature is located at geographic position 165° E, 5° N?
2. Condition: Where is it? This question asks where certain conditions are
satisfied and requires spatial data to answer. For example:
• Where is arable land within 100 m of the main road and with soils suitable
for soya bean production?
• Which hotels are within 600 m of the main road and are classified as
three star?
3. Trends: What has changed since…? This question asks how
conditions have changed over time across the earth's surface. Examples
are:
• Have forest areas in Mashonaland West Province decreased in size
over the past 15 years?
• Have land use activities in Gokwe changed over the past decade?
4. Patterns: What spatial patterns exist? This question describes and
compares spatial patterns at different locations through the process of
spatial analysis. In this case, we need to know the relationship between
two or more datasets that occupy the same location. For instance:
• Is there a relationship between a region's varying elevation and the
amount of rainfall that falls across it?
• Do seasonal variations affect flood and drought occurrence?
5. Modelling: What if…? Such questions involve scenarios that differ
when you change the model's parameters. Examples include:
• What happens to guinea fowl habitat when a road is constructed through
a forest area?
• Which areas are at most risk (worst affected) if a flood occurs?
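A condition question, such as the hotel example under item 2, can be sketched in a few lines of Python. This is only an illustration: the hotel names, star ratings and distances are invented, and the distance to the main road is assumed to have been measured already, whereas a GIS would derive it from the stored geometry.

```python
# Hypothetical hotel records: (name, star rating, distance to main road in metres).
hotels = [
    ("Hotel A", 3, 250.0),
    ("Hotel B", 3, 900.0),
    ("Hotel C", 5, 400.0),
    ("Hotel D", 3, 600.0),
]


def three_star_near_road(records, max_distance_m=600.0):
    """Return the names of three-star hotels within max_distance_m of the road."""
    return [
        name
        for name, stars, distance in records
        if stars == 3 and distance <= max_distance_m
    ]


selected = three_star_near_road(hotels)
print(selected)
```

The query combines an attribute condition (star rating) with a spatial condition (distance), which is what distinguishes a condition question from a simple location question.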
Activity 1.1
1. Define the term GIS.
2. Explain the five main types of questions that GIS can answer.
3. GIS enables you to perform trend and pattern analysis. Discuss.

1.3.2 What is data?

Data can be defined as:
• Facts or statistics used for reference or analysis
(http://teaching.ust.hk/~gned008/socioecondata09.pdf).
• Numbers, characters, symbols or images that can be processed by a
computer for analysis
(http://www.differencebetween.net/language/difference-between-data-and-information/).
Spatial or geographic data can be defined as:
• data that define the specific location of features and boundaries on earth
• data that are spatially referenced, meaning that the data are identified
according to their locations
(http://www.tiger.esa.int/TrainingCds/cd_01/content_2/sez_2_3/Unit-III-GIS.pdf).
Temporal data can be defined as:
• data where the variations with time are indicated
(http://www.tiger.esa.int/TrainingCds/cd_01/content_2/sez_2_3/Unit-III-GIS.pdf).
1.3.3 What is information?

The words data and information are often used interchangeably, so we need
to distinguish the two terms. We have learnt the definition of data;
we now turn to the definition of information.
Information is any kind of knowledge about facts or concepts that is
exchangeable amongst people. It is knowledge derived from experience or study.
Information can also be referred to as interpreted data (i.e. when we try to give
meaning to the data).
information/
Thus, we can say that, when facts (data) are interpreted, they become
knowledge (information). In this case, we can also note that it is difficult to
separate the two terms.

1.3.4 What is a model?

• A model is a representation of one or more processes that are believed
to occur in the real world; in other words, of how the world works.
• A model is a computer program that takes a digital representation of
one or more aspects of the real world and transforms them to create a
new representation.
Thus, modelling can be defined in the context of geographic information
systems (GIS) as occurring whenever operations of the GIS attempt to emulate
processes in the real world (Goodchild, 2005:2).
Activity 1.2
1. Define the terms: data, spatial data, temporal data and model.
2. Give at least 5 examples of spatial data.
3. Distinguish between data and information.
1.4 Components of GIS

Now that we have defined GIS, let us look at the key components of GIS.
GIS integrates five components namely: hardware, software, data, people,
and methods.
1.4.1 GIS hardware

This can be defined as a computer and associated peripheral devices essential
for handling spatial data in a GIS. Examples of hardware used in GIS include:
• Computer
• High-resolution colour monitor
• Digitizer: for inputting analogue map data
• Scanner: for inputting image data
• Printer: for printing maps and images
• Plotter: for plotting maps
GIS software is a collection of computer programs and related data that provide
the instructions for telling a computer what to do and how to do it. In other
words, software is a set of computer programs, procedures and associated
documentation concerned with the operation of a data processing system.
Examples of GIS software available on the market include the Integrated Land
and Water Information System (ILWIS), ArcView and Earth Resources Data
Analysis System (ERDAS).
1.4.2 Data
Data include any spatial or tabular information that relates to geography
and speciality fields. Examples include road names, crime
statistics, borehole points and illegal dump sites.
1.4.3 People
GIS technology requires people who manage and develop the system. GIS
users include technical specialists who design and maintain the system and
those who use it in their everyday work.
1.4.4 Methods
GIS works with models and operating practices.
Activity 1.3
1. State the 5 components of GIS.
2. Why are the 5 components important in GIS?
3. Define the terms software and hardware.
4. Give examples of GIS software and hardware.
1.5 Functions of GIS

Now that we have understood the definition of GIS, it is important to look at
the functions of GIS. The functions of GIS describe the steps that have to be
taken to implement a GIS. These steps have to be followed in order to obtain
a systematic and efficient system. The five major functions of GIS are
data capture, compilation, storage, manipulation and analysis.
1.5.1 Data capture/input

Data input refers to the procedure of automating the data and converting them
into forms that can be stored and analysed in computers. There
are two methods of data input, namely primary and secondary methods:
a) Primary methods include surveying, photogrammetry, GPS and
remote sensing.
b) Secondary methods include digitising and scanning.
These data input techniques are discussed in detail in Unit Five.
1.5.2 Data compilation

At this stage the user completes the compilation phase by relating all spatial
features to their respective attributes, and by cleaning up and correcting errors
introduced as a result of the data conversion process (De By, 2001). The end
result of compilation is a set of digital files representing all of the spatial
and attribute data of interest contained on the original map manuscripts. These
digital files contain geographic coordinates for spatial objects (points, lines,
polygons, and cells) that represent mapped features.
1.5.3 Data storage (GIS data models)

When the data has been compiled, digital map files in the GIS are stored on
magnetic or other digital media. Two data models, namely raster and vector,
are used in the storage of data in digital format. These models are used to
simplify the data shown on a map into a more basic form that can be easily
stored in the computer. Vector and raster data models are discussed in detail
in the next unit.
1.5.4 Manipulation
When data are stored in a GIS, many manipulation options are available to
users. These functions are often available in the form of "toolkits." A toolkit is
a set of generic functions that a GIS user can employ to manipulate and analyse
geographic data. Toolkits provide processing functions such as data retrieval,
measuring area and perimeter, overlaying maps, performing map algebra, and
reclassifying map data (De By, 2001). Data manipulation tools include
coordinate change, projections, and edge matching, which allow a GIS to
reconcile irregularities between map layers or adjacent map sheets called
tiles.
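Two of the toolkit functions mentioned above, measuring area and perimeter, can be sketched as follows. This is a minimal illustration rather than the method of any particular GIS package: it assumes a simple (non-self-intersecting) polygon whose vertices are given as planar (x, y) coordinates in metres, and it uses the standard shoelace formula for the area. The function name and the sample field are hypothetical.

```python
import math


def polygon_area_perimeter(vertices):
    """Return (area, perimeter) of a simple polygon.

    vertices: list of (x, y) tuples in boundary order, first point not repeated.
    The area is computed with the shoelace formula.
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


# Hypothetical rectangular field, 100 m by 50 m.
field = [(0, 0), (100, 0), (100, 50), (0, 50)]
area, perimeter = polygon_area_perimeter(field)
print(area, perimeter)
```

The calculation is only valid for projected (planar) coordinates; latitude and longitude values would first have to be transformed with a suitable map projection.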
1.5.5 Analysis
The analysis functions in GIS use the spatial and non-spatial attributes in the
database to answer questions about the real world. Geographic analysis
facilitates the study of real-world processes by developing and applying models.

Activity 1.4
1. What are the functions of GIS?
2. List some data input techniques used in GIS.
3. Explain why data analysis is an important function of GIS.
1.6 Advantages and Disadvantages of GIS

In this section, we focus on the advantages and the disadvantages of GIS.
1.6.1 Advantages of GIS

GIS has several advantages:
• It allows detailed planning of projects that have a large spatial component,
where analysis of the problem is a prerequisite to starting the project.
• It enables better decision making, since it is a tool to query, analyse and map
data for the decision-making process.
• It improves the visualisation of landscapes, leading to a better
understanding of certain relations in the landscape. This can be done
through the use of Digital Terrain Model (DTM) or Digital Elevation
Model (DEM) utilities.
• It helps you to make relevant calculations, such as soil erosion and water
volumes.
• It can link data sets together, hence it facilitates interdepartmental
information sharing and communication.
• It enables handling of large amounts of data.
1.6.2 Disadvantages of GIS

There are a number of disadvantages of GIS:
• GIS software and hardware can be expensive to purchase.
• Training and skills are needed for one to be able to use a GIS.
• Decision makers may fail to understand the applications of GIS, hence
it becomes irrelevant to use if society is not benefiting from it.
Activity 1.5
1. What are the advantages of using GIS?
2. What are the disadvantages of using GIS?

1.7 Summary
In this unit, we discussed the basic concepts of Geographical Information
Systems. These concepts help us to understand what GIS is all about. We
learnt the definition of GIS, functions and the components of GIS. The unit
also provided an understanding of the Global Positioning Systems (GPS)
including its functions and associated errors and how they can be corrected.
In the next unit we focus on conceptual models of real world phenomena.

References
Aronoff, S. (1989). Geographic Information Systems: A Management
Perspective. Ottawa, WDL Publications.
Burrough, P., and McDonnell, R. (1998). Principles of Geographical
Information Systems. New York, Oxford University Press.
Clarke, K.C. (1997). Getting Started with Geographic Information
Systems. London, Prentice Hall.
De By. (2001). Principles of Geographic Information Systems: An
Introductory Textbook, International Institute of Geo-Information and
Earth Observation, Enschede, De By et al. Educational Textbook Series.
Dueker, K.J. (1979). Land Resource and Information Systems: A review
of 13 years experience. Geo-processing 1.
Goodchild, M.F. (2005). GIS and Modeling Overview. National Center for
Geographic Information and Analysis, University of California, Santa
Barbara, California.
Longley, P., Goodchild, M., Maguire, D., and Rhind, D. (2004). Geographic
Information Systems and Science. New York, Wiley.
http://www.ncgia.ucsb.edu/giscc/units/u002/u002.html: Accessed 3/09/2012.
http://dusk.geo.orst.edu/gis/Chapter9_notes.pdf: Accessed 28/08/2012.
http://igis.nust.edu.pk/compogis.htm: Accessed 12/09/2012.
http://www.westminster.edu/staff/athrock/GIS/GIS.pdf: Accessed 18/09/2012.
http://teaching.ust.hk/~gned008/socioecondata09.pdf: Accessed 06/11/2012.
information: Accessed 06/11/2012.
http://www.tiger.esa.int/TrainingCds/cd_01/content_2/sez_2_3/Unit-III-GIS.pdf: Accessed 21/11/2012.

Unit Two
Conceptual Models of Real
World Phenomena
2.1 Introduction
In this unit we introduce you to some conceptual models of real world
phenomena. It is important to understand these models since we do not
store the real world in a computer but use models to represent the
real world. We introduce you to geographic data types, namely nominal,
ordinal, interval and ratio data. We learn about raster and vector data models
and structures. We also outline the advantages and disadvantages of these
models.
2.2 Objectives
By the end of this unit, you should be able to:
• define geographic phenomena, objects and fields
• discuss geographic data types (nominal, ordinal, interval and ratio)
• represent real world phenomena as points, lines or polygons
• describe vector and raster data models and structures
• explain the advantages and disadvantages of raster and vector data
2.3 Geographic Phenomena

A geographic phenomenon can be described as a manifestation of an entity or
process of interest that can be:
• named/described
• geo-referenced
• assigned a time at which it is or was present
For example, in water management, river basins, agro-ecological units and
irrigation levels can be named/described, georeferenced and assigned a time
at which each exists. In waste management, sewage canals and landfills can
also be named/described, georeferenced and assigned a time at which each
exists. Some phenomena, such as temperature, pressure and elevation, manifest
themselves everywhere in the study area. These are examples of geographic
fields.
A geographic field is a geographic phenomenon for which a value can be
determined at every point in the study area. Fields such as temperature and
elevation are continuous in nature. We also have discrete fields, examples of
which are crop type, vegetation type, land use and soil classifications.
Some phenomena do not manifest themselves everywhere but only at certain
localities, for example buoys. These are called geographic objects and for
these, we know exactly where they are located. Geographic objects populate
the study area and are usually well distinguishable, discrete and bounded entities.
2.3.1 Fields
Continuous surfaces form the basis of geographic phenomena, known as the
field view. The field view represents the real world as a finite number of
variables, each one defined at every possible position. Examples of continuous
field data are elevation, air pressure, temperature, or clay content of the soil
(Burrough and McDonnell 1998).
Fields can be distinguished by what varies and how smoothly. A field of
elevation, for example, varies much more smoothly in a landscape that has
been worn down by glaciation or flattened by blowing sand than one recently
created by cooling lava. Cliffs are places in fields where elevation changes
suddenly rather than smoothly.
Fields can also be created from classifications of land, into categories of land
use or soil type. Such fields change suddenly at the boundaries between
different classes. Other types of fields can be defined by continuous variation
along lines rather than across space. Traffic density, for example, can be defined
everywhere on a road network and flow volume can be defined everywhere
on a river (De By, 2001).
2.3.2 Objects
Geographic objects are identified by their dimensionality. Objects that occupy
area, including lakes and forest stands, are termed two-dimensional and
generally referred to as areas or polygons. Other objects that are linear,
including roads, railways, and rivers, are termed one-dimensional and generally
referred to as lines. Objects that are single locations, including individual animals
and buildings, are termed zero-dimensional and generally referred to as points.
The discrete object view leads to a powerful way of representing geographic
information about objects. Consider a class of objects of the same
dimensionality, for example, all the elephants in Kariba. We would naturally
think of these objects as points. We might want to know the sex of each
elephant and its date of birth if our interests were in monitoring the elephant
population. All of this information could be expressed in a table (see Table
2.1). Each row corresponds to a different discrete object and each column to
an attribute of the object. To reinforce a point made earlier, this is a very
efficient way of capturing raw geographic information on elephants.

Table 2.1: Example of representation of geographic information as a table.

Elephant ID   Sex      Estimated date of birth   Date of collar installation
010           Male     2009                      15/11/2010
020           Female   2010                      05/05/2011
030           Female   2009                      13/12/2010
040           Male     2010                      05/05/2011

Adapted from De By, 2001
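The row-and-column idea behind Table 2.1 can be sketched in Python, with each row stored as one record and each column as a named attribute. The class and field names below are hypothetical choices made for illustration; the values are those of Table 2.1.

```python
from dataclasses import dataclass


@dataclass
class Elephant:
    """One row of the attribute table: a discrete object and its attributes."""
    elephant_id: str
    sex: str
    birth_year: int    # estimated year of birth
    collar_date: str   # date of collar installation, dd/mm/yyyy


elephants = [
    Elephant("010", "Male", 2009, "15/11/2010"),
    Elephant("020", "Female", 2010, "05/05/2011"),
    Elephant("030", "Female", 2009, "13/12/2010"),
    Elephant("040", "Male", 2010, "05/05/2011"),
]

# Simple attribute queries on the table.
females = [e.elephant_id for e in elephants if e.sex == "Female"]
born_2009 = [e.elephant_id for e in elephants if e.birth_year == 2009]
print(females, born_2009)
```

In a GIS each record would also carry a location, such as an (x, y) coordinate pair, which is what turns an ordinary table into spatial data.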
NB: As a rule of thumb, natural geographic phenomena are usually fields and
man-made phenomena are usually objects.
Since we have distinguished between discrete and continuous data, now let
us look at the data types that we can use to represent our phenomena.
Specifically, we look at nominal, ordinal, interval and ratio data.
1. Nominal data
Nominal data values merely establish identity. No mathematical operations
can sensibly be carried out on this data. For example, rain gauges within a
study area may be given a numerical identity code. The identity numbers do
not indicate any order in terms of rainfall at the site.
2. Ordinal data
Ordinal data values establish order only. Comparisons of rank can be made,
but no other mathematical operation. For example, air pollution monitoring
equipment situated in different suburbs enables the suburbs to be rated 1st,
2nd, 3rd, etc., according to their air quality. This information does not tell us
how much worse the 5th and 8th suburbs are compared to the 1st. Another
example is household income, which can be classified as low, average or high.

3. Interval data
Interval data values are measured along a scale in which the positions are
equidistant from one another, but the scale has no absolute zero; its zero
point is arbitrary.
http://changingminds.org/explanations/research/measurement/types data.htm
An example of interval scale measurement is temperature on the Celsius
scale. If temperatures are measured at various locations, then it is sensible to
say that 20 °C is 10 °C warmer than 10 °C, but not that it is twice as warm as
10 °C.
4. Ratio data
The ratio scale of measurement has an absolute zero, so both differences
and ratios between values are meaningful. Mathematical operations such as
addition, subtraction and division make sense. For example, the population
data coded for census districts can be manipulated in many ways; in particular,
the population can be divided by area (another ratio scale measurement) to
obtain population density.
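The difference between interval and ratio data can be checked with a short calculation. The sketch below uses invented district figures; only the Celsius values come from the text. It shows that dividing two ratio-scale values gives a meaningful result, whereas the ratio of two Celsius readings changes when the same temperatures are expressed in kelvin, so "twice as warm" has no physical meaning.

```python
# Ratio data: population and area both have a true zero, so division is meaningful.
# Hypothetical census districts: name -> (population, area in square kilometres).
districts = {
    "District A": (12000, 40.0),
    "District B": (4500, 90.0),
}
density = {name: pop / area for name, (pop, area) in districts.items()}

# Interval data: Celsius has an arbitrary zero, so only differences are meaningful.
difference_c = 20.0 - 10.0   # a genuine 10-degree difference
ratio_c = 20.0 / 10.0        # numerically 2.0, but not "twice as warm"
ratio_k = (20.0 + 273.15) / (10.0 + 273.15)  # same temperatures in kelvin

print(density)
print(difference_c, ratio_c, round(ratio_k, 3))
```

Kelvin is a ratio scale because its zero is absolute, which is why the second ratio (about 1.035) is the physically meaningful one.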
Activity 2.1
1. Define the following terms:
a. Geographic Phenomena
b. Geographic Fields
c. Geographic Objects
2. Distinguish between discrete and continuous data.
3. Explain the four data types that we can use to represent our phenomena.
2.4 Models and Representations of the Real World

In this section we are going to learn about models and modelling. We will also
learn about different types of entities, namely points, lines, polygons, surfaces
and networks.
2.4.1 Model
We can use GIS to help analyse and understand more about processes and
phenomena in the real world. Models can be used to represent
the real world. Modelling is a process of building a representation which has
certain characteristics in common with the real world. It can also be defined
as a process of producing an abstraction of the real world so that some part
of it can easily be handled (De By et al., 2001). Practically, it refers to the
process of representing key aspects of the real world digitally on a computer.
Thus, it can be said that aspects of the real world are translated into a computer
representation.
2.4.2 Types of entities

All geographic phenomena can be divided into five entities namely points,
lines, polygons, surfaces and networks.
Figure 2.1: Map of Zimbabwe (source: http://wikitravel.org/en/Zimbabwe)
Figure 2.1 shows various entity types. For example, towns such as Kadoma,
Kwekwe, Chinhoyi and Marondera are represented as points. National Parks
such as Hwange, Matusadona and Gonarezhou are represented as polygons.
Roads and rivers are represented by lines.
a. Point representations
Points are defined as single coordinate pairs (x, y) when working in a 2D
coordinate system. Points are used to represent objects that are best described
as single-locality features. However, this depends on the application and on the
spatial extent of the objects compared to the map scale. For example, on a tourist
city map, museums and phone booths can be represented as point features.
Figure 2.2(a) illustrates point features.
Figure 2.2(a): Point features (Source: Longley et al., 2004:184)
b. Line representations
Line data are used to represent one-dimensional objects such as roads,
rivers and power lines. Again, in this case, there is need to consider relevance
for application and the scale that the application requires. For example, in a
tourist map, subways and streetcar routes can be line features. Connected
lines may represent a phenomenon that is viewed as networks. Figure 2.2(b)
shows line features.
Figure 2.2(b): Line features (Source: Longley et al., 2004:184)
c. Polygon/area representations
An area fully enclosed by a series of connected lines is a polygon/area
representation. Because lines have direction, the system can determine the
area that falls within the lines comprising the polygon. Polygons are often
irregular in shape. Each polygon contains one type of data. Examples of polygon/
area features are forests and national parks. All of the data points that form
the perimeter of the polygon must connect to form an unbroken line. Figure
2.2(c) illustrates polygon/area features.
Figure 2.2(c): Polygon features (Source: Longley et al., 2004:184)
d. Network representations
A network is a series of interconnecting lines along which there is a flow of
data, objects or materials. An example is a road network, along which
there is a flow of traffic from one point to another. Another example is a river,
along which there is a flow of water. Other examples include sewage and
telephone systems. Figure 2.2(d) illustrates a network representation.
Figure 2.2(d): Network features (Adapted from: Longley et al., 2004:184)
e. Surface representations
A surface entity is used to represent continuous features or phenomena.
For these features, there is a measurement or value at every location as is the
case with elevation, temperature and pressure. The continuous nature of
surface entities distinguishes them from other entity types (points, lines, polygons
and networks), which are discrete, that is, either present or absent at a particular
location.
Activity 2.2
1. Define the term model.
2. Giving examples, describe the five types of entities used to represent
geographic phenomena.

2.5 Spatial Data Models

We have two main ways in which a computer can handle and display spatial
entities. These approaches are raster and vector approaches. All spatial data
models are approaches for storing the spatial location of geographic features
in a database.
2.5.1 The Raster Data Model

In a Raster Model individual cells are used as building blocks for creating
images of points, lines, areas, surface and network entities. Raster data models
incorporate the use of a grid-cell data structure where the geographic area is
divided into cells identified by row and column. With each cell, some value is
associated to characterise that part of space. This data structure is commonly
called raster. The raster is a regular tessellation with square cells. What then
do we mean by tessellation? A tessellation is a partition of space into mutually
exclusive cells that together make up the complete space. In a regular
tessellation the cells are of the same shape and size (See Figure 2.3) and the
field attribute assigned to a cell is associated with the entire area occupied by
the cell (De By, 2001:85).
Figure 2.3: Regular Tessellation (De By et al., 2001:85)
The size of cells in a tessellated data structure is selected on the basis of the
data accuracy and the resolution needed by the user. A raster data structure is
in fact a matrix where any coordinate can be quickly calculated if the origin
point is known, and the size of the grid cells is known. Since grid-cells can be
handled as two-dimensional arrays in computer encoding, many analytical
operations are easy to program. This makes tessellated data structures a
popular choice for many GIS software packages (De By, 2001:86).
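The statement that any coordinate can be calculated from the origin and the cell size can be sketched as follows. The sketch assumes square cells, an origin at the upper-left corner of the raster and rows numbered downwards from zero; this is a common convention but not the only one. The function names and the sample origin are hypothetical.

```python
def cell_centre(row, col, origin_x, origin_y, cell_size):
    """Return the (x, y) coordinate of the centre of cell (row, col).

    Assumes the origin is the upper-left corner, so x grows with the column
    number and y decreases with the row number.
    """
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return x, y


def cell_index(x, y, origin_x, origin_y, cell_size):
    """Return the (row, col) of the cell that contains coordinate (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    return row, col


# Hypothetical raster: upper-left corner at (500000, 8000000), 30 m cells.
centre = cell_centre(2, 3, 500000.0, 8000000.0, 30.0)
index = cell_index(centre[0], centre[1], 500000.0, 8000000.0, 30.0)
print(centre, index)
```

Because the position of every cell follows from its row and column, the raster only needs to store the cell values, the origin and the cell size.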
Figure 2.4 shows how a range of different features represented by five different
entities can be modelled using the raster approach. You can see that the
hotel is modelled by a single, discrete cell, that is, a single pixel. Rivers are
modelled by grouping cells into lines, that is, strings of pixels, and forests by
grouping cells into blocks, that is, groups of pixels. The road network is
modelled by linking cells into networks. The relief of the area has been modelled
by giving every cell in the raster image an altitude value.
The pixel, a term derived from "picture element", is the basic element of raster
data. It is a cell representing a certain terrain property such as soil type
or altitude. These properties are always represented in a pixel as a numerical
value.
2.5.2 The Vector Data Model

The Vector Data Model uses two-dimensional Cartesian (x, y) coordinates
to store the shape of a spatial entity. In the vector world, the point is the basic
building block from which all spatial entities are constructed. The simplest
spatial entity, the point, is represented by a single (x, y) coordinate pair. Line
and area entities are constructed by connecting a series of points into chains
and polygons. Figure 2.4 shows how vector data models have been used to
represent various features. In vector representations, an attempt is made to
associate georeferences with the geographic phenomena explicitly. Vector lines
are often referred to as arcs and consist of a string of vertices terminated by a
node. A node is defined as a vertex that starts or ends an arc segment (De By
et al., 2001:91).
Vector data structures
Several data structures have been developed for the storage of vector data (points,
lines and polygons). The most popular ones are the spaghetti model and the
topological model.
a. The Spaghetti Model
The Spaghetti Model stores data as strings of coordinate pairs, without any
indication of their spatial relationships. This means that the coordinate
pairs are held in a relatively unstructured form. It is a very simple model
and does not require complex file structures. The model does not pose any
problems with the storage of points (a single x, y coordinate pair). However,
for the storage of lines and polygons, this model is less effective for the
following reasons:
 No direct relation exists between different lines, so network analysis
is very time consuming.
 The coordinates which form part of the boundary line between two
polygons have to be stored twice, which may lead to problems when
the file is edited.
 In order to know which polygons are adjacent to a certain polygon, it is
necessary to look in the database at all other polygons in order to find
those that have common boundaries.
 The model is suitable for graphic display but not for analysis.
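The weaknesses listed above can be seen in a minimal Python sketch of spaghetti storage. The polygon names and coordinates are hypothetical, and the functions are illustrative only, not part of any GIS package.

```python
# Spaghetti model: every polygon is an independent list of (x, y) pairs.
# The shared boundary between A and B, (2, 0)-(2, 2), is stored twice.
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)],
    "C": [(5, 0), (6, 0), (6, 1), (5, 1), (5, 0)],
}


def edges(ring):
    """Return the boundary segments of a closed ring, ignoring direction."""
    return {frozenset(pair) for pair in zip(ring, ring[1:])}


def neighbours(name):
    """Find adjacent polygons by comparing against every other polygon."""
    own = edges(polygons[name])
    return sorted(other for other, ring in polygons.items()
                  if other != name and own & edges(ring))


print(neighbours("A"))
```

Nothing in the stored data says that A and B are neighbours; the relationship has to be rediscovered by comparing coordinates each time it is needed.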
b. The Topological Model
Topological data structures provide the information that the computer
requires to recognise line networks, adjacency and polygons. A set of
instructions is required which informs the computer where one polygon or
line lies with respect to its neighbours. Topology can be defined as the
geometric characteristics of objects which do not change under transformations
such as bending and stretching and are independent of any coordinate system.
The elements of topology are:
 Adjacency - share a common boundary
 Containment - features wholly contained within another feature
 Connectivity - describes linkages between the line features
Topology is concerned with connectivity between entities and not their physical
shape. A point is the simplest spatial entity that can be represented in the
vector world with topology. In order for a line entity to have topology it must
consist of an ordered set of points, known as an arc, segment or chain, with
defined start and end points (nodes).
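One common way of storing such relationships is an arc-node table in which each arc records its start node, end node and the polygons on its left and right. The Python sketch below assumes that layout; all identifiers are hypothetical and `None` stands for "no polygon" (the outside area).

```python
# Arc-node topology: polygons A and B share arc a1; a4 is a road leaving n2.
arcs = {
    "a1": {"from": "n1", "to": "n2", "left": "A", "right": "B"},
    "a2": {"from": "n2", "to": "n1", "left": "A", "right": None},
    "a3": {"from": "n1", "to": "n2", "left": "B", "right": None},
    "a4": {"from": "n2", "to": "n3", "left": None, "right": None},
}


def adjacent_polygons(name):
    """Adjacency: polygons that share an arc with the named polygon."""
    result = set()
    for arc in arcs.values():
        sides = {arc["left"], arc["right"]}
        if name in sides:
            result |= sides - {name, None}
    return sorted(result)


def connected_arcs(node):
    """Connectivity: arcs that start or end at the given node."""
    return sorted(arc_id for arc_id, arc in arcs.items()
                  if node in (arc["from"], arc["to"]))


print(adjacent_polygons("A"))
print(connected_arcs("n2"))
```

Adjacency and connectivity are answered by reading the table, without comparing any coordinates, and the shared boundary is stored only once.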
Figure 2.4 shows the raster and vector data models that we have discussed in
previous sections. It illustrates how point, line, area/polygon, network and
surface entities have been represented using the raster and the vector models.

Figure 2.4: Raster and Vector Data Models (Source: http://www.indiana.edu/~gisci/courses/g338/lectures/introduction_vector.html).

2.6 Advantages and Disadvantages of Vector and Raster Data
There are advantages and disadvantages associated with the use of vector
and raster data.
Advantages of vector data
The following are the advantages of vector data:

 Data can be represented at its original resolution and form with minimal
generalisation.
 Graphic output is usually more aesthetically pleasing (traditional
cartographic representation).
 Since most data, e.g. hard copy maps, is in vector form, no data
conversion is required.
 Accurate geographic location of data is maintained.
 Requires relatively small data capacity.
 Allows for efficient encoding of topology, and as a result, more efficient
operations that require topological information, e.g. proximity, network
analysis.
Disadvantages of vector data
The following are disadvantages of vector data:

 The location of each vertex needs to be stored explicitly
 For effective analysis, vector data must be converted into a topological
structure. This is often processing intensive and usually requires extensive
data cleaning.
 Algorithms for manipulation and analysis functions are complex, and
this inherently limits the functionality for large data sets.
 Continuous data, such as elevation data, is not effectively represented
in vector form.
Advantages of raster data
The following are advantages of raster data:

 Each cell can easily be referenced to its geographic location.
 Due to the nature of the data storage technique, data analysis is usually
easy to program and quick to perform.
 Raster maps (for example, attribute maps) are ideally suited for
mathematical modelling and quantitative analysis.
 Discrete data, for example, forestry stands, is accommodated equally
well as continuous data, for example, elevation data, which facilitates the
integration of the two data types.
 Neighbourhood analysis can be done easily
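To illustrate why neighbourhood analysis is simple on raster data, the sketch below computes the mean of a cell and its immediate neighbours in a 3 x 3 window. The elevation values and the function name are hypothetical.

```python
def focal_mean(grid, row, col):
    """Mean of a cell and its neighbours in a 3 x 3 window.

    Cells outside the grid are ignored, so edge cells use fewer values.
    """
    values = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                values.append(grid[r][c])
    return sum(values) / len(values)


elevation = [
    [10, 12, 14],
    [11, 13, 15],
    [12, 14, 16],
]
print(focal_mean(elevation, 1, 1))  # centre cell, nine values
print(focal_mean(elevation, 0, 0))  # corner cell, four values
```

The neighbours of a cell are found by simple row and column arithmetic, which is why such operations are easy to program.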
Disadvantages of raster data
The following are disadvantages of raster data:

 Lack of explicit topology, therefore network analysis is complex.
 Processing of associated attribute data may be cumbersome if large
amounts of data exist.
 Vector-to-raster conversion leads to increased processing requirements
and may introduce data integrity concerns due to generalisation.
 The cell size determines the resolution at which the data is represented.
 Difficult to adequately represent linear features depending on the cell
resolution and network linkages.
 Most output maps from grid-cell systems do not conform to high-quality
cartographic needs.
Activity 2.3
1. Distinguish between a raster model and a vector model.
2. Explain the term tessellation.
3. Distinguish between a spaghetti model and a topological model.
4. Describe the term topology and its importance in GIS.
5. What are the advantages and disadvantages of raster and vector models?
2.7 Summary
In this unit, we discussed some conceptual models of real world
phenomena. It is important to understand these models since we do
not store the real world in a computer but use models to represent the
real world. We also discussed geographic data types, namely nominal, ordinal,
interval and ratio data. We also learnt about raster and vector data models
and their advantages and disadvantages. In the next unit we will focus on
spatial referencing.

References
Burrough, P., and McDonnell, R. (1998). Principles of Geographical
Information Systems. (New York: Oxford University Press).
Cornelius, S. and Heywood, I. (2000). An Introduction to Geographic
Information Systems, London, Prentice Hall.
De By. (2001). International Institute of Geo-Information and Earth
Observation. Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, De By Educational Textbook
Series.
Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W. (2004).
Geographical Information Sciences Volume 1: Principles and
Technical Issues, New York, John Wiley and Sons Ltd.
Geographical Information Systems and Science, Chichester, John
Wiley and Sons Ltd.
http://www.eng.auburn.edu/users/doughmp/webfiles/spatial Accessed 15/02/2012.
http://bgis.sanbi.org/gis-primer/page Accessed 19/04/2012.
http://www.verlter.virginia.edu Accessed 19/09/2012.
http://adlib.itc.nl Accessed 19/09/2012.
http://wikitravel.org/en/Zimbabwe: Accessed 19/09/2012.
http://www.indiana.edu/~gisci/courses/g338/lectures/introduction_vector.html:
Accessed 10/11/2012.


Unit Three
Spatial Referencing I
3.1 Introduction
In this unit we focus on spatial referencing in GIS. The thrust of this unit is
to introduce you to coordinate systems such as geographic coordinates,
polar coordinates and Cartesian coordinates. Other concepts to be learnt
in this unit are ellipsoids, geoids and datums. These are all important concepts
in locating one's position on the earth's surface.
3.2 Objectives
By the end of the unit, you should be able to:
 define the terms: ellipsoid, datum and geoid
 describe the geographic coordinate systems, polar coordinates and
cartesian coordinates
 discuss the process of converting polar to cartesian coordinates and
vice versa
 explain the term georeferencing
3.3 Reference Surfaces

It is well known that the earth is not a perfect sphere. We need to define the
shape of the earth so that we can use maps accurately. This can be done through
the use of a spatial reference system. A spatial reference system defines the
way that any landmark (trees, houses, roads, buildings etc.) can have its own
unique address (Zhang, 2005). To have a good spatial reference system,
engineers need to know about the shape of the earth, that is, geodesy.
Two main reference surfaces have been established to approximate the shape
of the earth. The reference surfaces are the Geoid and the Ellipsoid.
Figure 3.1: The geoid and ellipsoid reference surfaces (De By, 2001:192)
3.3.1 Geoid
The geoid is shown in figure 3.2. The earth's geoid is a surface which is
complex to describe accurately in mathematical terms, but it can be identified
by measuring gravity. The earth's geoid is regarded as being equal to mean sea
level. Over open oceans, the geoid and mean sea level are approximately the
same, but in continental areas they can differ significantly. However, it must
be noted that this difference is not of any practical consequence for most
people and it is considered reasonable that they are regarded as the same.

The geoid is used to describe heights. In order to establish the geoid as a
reference for heights, the ocean's water level is registered at coastal places
over several years using tide gauges. Averaging the registrations largely
eliminates variations of sea level with time. The resulting water level
represents an approximation to the geoid and is called the mean sea level
(De By, 2001:195).
Figure 3.2: Geoid (Source: http://www.icsm.gov.au.mapping/datums1.html)
3.3.2 The ellipsoid

The ellipsoid is formed when an ellipse is rotated about its minor axis. The
ellipse which defines the ellipsoid or spheroid is called a meridian ellipse
(De By, 2001). The most convenient geometric reference is the oblate ellipsoid
(see figure 3.3). It has two different axes, namely an equatorial radius
(the semi-major axis) and a polar radius (the semi-minor axis).
Figure 3.3: An oblate ellipse (source: De By, 2001:196)
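How far an oblate ellipsoid departs from a sphere is commonly expressed by its flattening, f = (a - b) / a, where a is the semi-major axis and b the semi-minor axis. The short sketch below uses the WGS84 values as an example; the semi-minor axis value is not given in this module and is quoted here as an assumption from the published WGS84 parameters.

```python
# WGS84 ellipsoid: semi-major axis a and semi-minor axis b in metres.
a = 6378137.0
b = 6356752.314245  # assumed value, not stated in this module

flattening = (a - b) / a
print(a - b)           # difference between equatorial and polar radius
print(1 / flattening)  # inverse flattening
```

The polar radius is only about 21 km shorter than the equatorial radius, which is why the ellipsoid looks almost spherical when drawn to scale.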

Activity 3.1
1. Define spatial referencing.
2. Explain the terms geoid and ellipsoid.
3. Explain the importance of geoids and ellipsoids.
3.4 Datum
A datum is a reference surface which is defined mathematically and
approximates the shape of the earth in particular areas. In different parts
of the world, different map datums were (and some still are) used because
the earth's general surface shape differs from place to place. Specific
map datums are more applicable to particular areas or regions than others
(http://www.gpswaypoints.co.za).
A datum enables us to calculate the position of a specific location accurately
and consistently. An example is the Arc 1950 geodetic datum, first defined in
1950. It is suitable for use in countries such as Botswana, Malawi, Zambia
and Zimbabwe. Arc 1950 references the Clarke 1880 (Arc) ellipsoid
and the Greenwich prime meridian. It is a geodetic datum for
topographic mapping and geodetic survey (http://georepository.com/datum_6209/Arc-1950.html).
Ellipsoids vary in position and orientation. A local horizontal datum is
defined by an ellipsoid that is positioned and oriented with respect to the
local mean sea level by adopting a latitude, a longitude and an ellipsoidal
height for a fundamental point, together with an azimuth to an additional
point.
The motivation to make geodetic results mutually comparable across the globe
has resulted in the need for global horizontal datums. An example is the
World Geodetic System 1984 (WGS84) used by the GPS. We also call this
reference system a geocentric datum since it is positioned with respect to
the centre of mass of the earth (De By, 2001: 203).
Activity 3.2
1. Define map datum.
2. Explain the importance of map datums.
3. Distinguish between a local horizontal datum and a global horizontal
datum.

3.5 Coordinate System

A coordinate system is the specified units and origin point used to locate
features on a two-dimensional map. More generally, a coordinate system is a
reference system consisting of a set of points, lines and/or surfaces, and a
set of rules, used to define the positions of points in space in either two or
three dimensions (http://www.biology.ualberta.ca). The major types of
coordinate systems are geographic coordinates, cartesian coordinates and polar
coordinates.
3.5.1 Geographic coordinate system

One of the most common coordinate systems in use is the geographic
coordinate system. Geographic coordinates are often used to store,
manage and interchange spatial data. The data are then projected onto a local
map coordinate system for editing, analysis and mapping.
Figure 3.4: The latitude and longitude angles represent the 2D
geographic coordinate system (Source: http://kartoweb.itc.nl/geometrics/Coordinate%20systems/coordsys.html)
The 2D geographic coordinate system consists of lines of geographic
latitude and longitude (see figure 3.4). Lines of equal latitude are called
parallels. They form circles on the surface of the ellipsoid. Lines of equal
longitude are called meridians and they form ellipses (meridian ellipses) on
the ellipsoid. Both sets of lines form the graticule when projected onto a map
plane (http://kartoweb.itc.nl/geometrics/Coordinate%20systems/coordsys.html).
Lines of latitude run parallel to the equator and divide the earth into 180 equal
portions from north to south (or south to north). The reference latitude is the
equator and each hemisphere is divided into ninety equal portions, each
representing one degree of latitude.
In the northern hemisphere degrees of latitude are measured from zero at the
equator to ninety at the North Pole. In the southern hemisphere degrees of
latitude are measured from zero at the equator to ninety degrees at the South
Pole. To simplify the digitisation of maps, degrees of latitude in the southern
hemisphere are often assigned negative values (0 to -90°). Wherever you are
on the earth's surface, the distance between lines of latitude is the same, so
they conform to the uniform grid criterion assigned to a useful grid system.
Lines of longitude, on the other hand, run perpendicular to the equator and
converge at the poles. The reference line of longitude (the prime meridian)
runs from the North Pole to the South Pole through Greenwich, England.
Subsequent lines of longitude are measured from zero to 180 degrees east or
west of the prime meridian. Values west of the prime meridian are assigned
negative values for use in digital mapping applications
(http://geology.isu.edu/geostac/Field_Exercises/toppomaps/grid_assign.htm).
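The sign convention described above (negative values for southern latitudes and western longitudes) can be sketched as a small conversion from degrees, minutes and seconds to signed decimal degrees. The function name is hypothetical and the coordinates are only an approximate, illustrative position for Harare.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds to signed decimal degrees.

    South latitudes and west longitudes are returned as negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Approximate position of Harare (illustrative values only).
lat = dms_to_decimal(17, 49, 30, "S")
lon = dms_to_decimal(31, 3, 0, "E")
print(lat, lon)
```

Storing the hemisphere in the sign means that a single pair of numbers is enough to identify a position anywhere on the globe.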
We also have a 3D geographic coordinate system, which is obtained by
introducing the ellipsoidal height (h). The ellipsoidal height of a point is the
vertical distance of the point in question above the ellipsoid. 3D geographic
coordinates can be used to define a position on the earth's surface (point p
in figure 3.5) (De By, 2001: 209).
Figure 3.5: 3D geographic coordinate system (De By, 2001: 209)

3.5.2 2D cartesian coordinates

A 2D cartesian coordinate system specifies each point uniquely in a plane
by a pair of numerical coordinates, which are the signed distances from the
point to two fixed perpendicular directed lines, measured in the same unit of
length. Each reference line is called a coordinate axis or just axis of the
system, and the point where the X and Y axes meet is its origin (see figure 3.6).
The coordinates can also be defined as the positions of the perpendicular
projections of the point onto the two axes, expressed as signed distances
from the origin (http://en.wikipedia.org/wiki/ca_cartesian_coordinate_system).
Thus, given two numerical coordinates x and y for point P, one can specify the
location of P on a map.
Figure 3.6: 2D cartesian coordinate system (De By, 2001:212)
3.5.3 2D polar coordinates

The 2D polar coordinate system is a two-dimensional coordinate system in
which each point on a plane is determined by a distance from a fixed point
and an angle from a fixed direction (http://en.wikipedia.org/wiki/polar_coordinate_system).
These are the distance d from the origin to the point concerned and the
angle α between a fixed (or zero) direction and the direction to the point.
The angle α is called the azimuth or bearing and is measured in a
clockwise direction (De By, 2001:215). See figure 3.7 for illustrations.

Figure 3.7: 2D Polar Coordinates (De By et al., 2001:215)
Activity 3.3
1. Define the term coordinate system.
2. Explain the following coordinate systems:
a) 2D and 3D geographic coordinates
b) 2D polar coordinates
c) 2D cartesian coordinates
3.6 Georeferencing
Georeferencing is the process of aligning geographic data to a known
coordinate system so that it can be queried and analysed with other geographic
data (http://resources.esri.com). This is usually done in one of two ways. In
the first, you have another layer of information (e.g. the road network of the
area of interest, with a coordinate system) and you match your raster image
to that layer. In the second, you have a series of control points with known
coordinates and you know where on the map they are located. The
points on the GIS dataset features that correspond to points on the image are
called control points. Since the coordinates of a GIS dataset reflect the map
projection, by assigning coordinates we are also declaring the image to be
displayed in a particular map projection
(www.victoria.ac.nz/sgees/pdf/georeference-image.pdf). In this case, the issue
of map projections becomes imperative.
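The module does not prescribe a particular method for using control points. A common choice, assumed here, is a first-order (affine) transformation, x = a*col + b*row + c and y = d*col + e*row + f, whose six coefficients can be found from three control points. The sketch below is a minimal illustration with hypothetical control points and function names; GIS software normally uses more points and a least-squares fit.

```python
def solve3(m, v):
    """Solve a 3 x 3 linear system m * p = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if d == 0:
        raise ValueError("control points must not lie on one line")
    solution = []
    for k in range(3):
        # Replace column k of m with the right-hand side v.
        mk = [[v[i] if j == k else m[i][j] for j in range(3)]
              for i in range(3)]
        solution.append(det(mk) / d)
    return solution


def fit_affine(control_points):
    """Fit the affine coefficients from three ((col, row), (x, y)) pairs."""
    m = [[col, row, 1.0] for (col, row), _ in control_points]
    xs = [xy[0] for _, xy in control_points]
    ys = [xy[1] for _, xy in control_points]
    return solve3(m, xs), solve3(m, ys)


def to_map(coeffs, col, row):
    """Convert an image position (col, row) to map coordinates (x, y)."""
    (a, b, c), (d, e, f) = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical control points: image position -> map coordinates (metres).
gcps = [((0, 0), (300000.0, 8000000.0)),
        ((100, 0), (303000.0, 8000000.0)),
        ((0, 100), (300000.0, 7997000.0))]
coeffs = fit_affine(gcps)
print(to_map(coeffs, 50, 50))
```

Once the coefficients are known, every pixel of the image can be assigned map coordinates in the coordinate system of the control points.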

Activity 3.4
1. Define georeferencing.
2. Explain the steps taken in georeferencing.
3.7 Summary
In this unit we focused on spatial referencing in GIS. We introduced you to
coordinate systems such as geographic coordinates, polar coordinates and
Cartesian coordinates. Other concepts that were learnt in this unit are
ellipsoids, geoids and datums. These are all important concepts in locating
one's position on the earth's surface. In the next unit you will learn about
map projections.

References
De By et al. (2001). International Institute of Geo-Information and Earth
Observation, Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, De By Educational Textbook
Series.
Schwartz, C.R. (1989). North American Datum of 1983, NOAA
Professional Paper NOS 2, Rockville: National Geodetic Survey.
Smith, J. (1997). Introduction to Geodesy: The History and Concepts of
Modern Geodesy, New York: Wiley.
Zhang, L. (2005). Implications of Geodesy, Spatial Reference Systems,
and Map Projections in Processing, Conversion, Integration, and
Management of GIS Data, eMap Div., Integrated Solution Services
Dept, Saudi Aramco
http://www.icsm.gov.au.mapping/datums1.html: Accessed 12/03/2012.
http://www.gpswaypoints.co.za: Accessed 12/03/2012.
http://www.biology.ualberta.ca: Accessed 12/03/2012.
http://geology.isu.edu/geostac/Field_Exercises/toppomaps/grid_assign.htm:
http://en.wikipedia.org/wiki/ca_cartesian_coordinate_system: Accessed 14/03/2012.
http://en.wikipedia.org/wiki/polar_coordinate_system: Accessed 14/03/2012.
http://kartoweb.itc.nl/geometrics/Coordinate%20systems/coordsys.html: Accessed 27/11/2012.
http://resources.esri.com Accessed 10/11/2012.
www.victoria.ac.nz/sgees/pdf/georeference-image.pdf: Accessed 14/10/2012.

Unit Four
Spatial Referencing II
4.1 Introduction
In this unit we discuss map projections. We define map projections, name
and describe their types, and explain methods of changing from one
projection type to another.
4.2 Objectives
By the end of the unit, you should be able to:
 define map projections
 describe different types of map projections
 explain methods of changing from one projection to another
4.3 What is a Map Projection?

A map projection is a mathematically described technique for representing
the Earth's curved surface on a flat map (De By, 2001:217; Longley et al.,
2004: 118). The Earth is a geoid; it is not flat. The question, therefore, is
how we can represent it on a flat map. Mathematical models have
been developed to represent the Earth on developable surfaces. These are
called map projections.
A GIS application needs a consistent projection framework. The size of the
area and the type of application determine an appropriate projection. The
reference surface for large-scale mapping is an oblate ellipsoid, whilst the
reference surface for small-scale mapping is a sphere (De By, 2001:217).
A map projection involves transforming each point on the reference surface
with geographic coordinates (φ, λ) to a set of Cartesian coordinates (X, Y)
representing positions on the map plane (De By, 2001:217; Longley et al.,
2004: 118). Map projections are associated with scale distortions. When
flattening an ellipsoidal or spherical surface, some parts are stretched more
than others.
4.4 Types of Map Projections

Projections are classified according to the positioning of the developable
surface on the datum surface (ellipsoid). These are classified into tangent and
secant projections. In addition, projections are classified by the orientation
of the developable surface (plane) or its axis (cone and cylinder) relative to
the rotational (minor) axis of the ellipsoid (De By, 2001:217; Longley et al.,
2004: 118). These are classified into normal, transverse and oblique
projections.

4.4.1 Types of normal projections

There are three types of normal map projections. These are classified
according to the type of developable surface. The developable surfaces are
the cylinder, the cone and the plane, and the resultant projections are
cylindrical, conical and azimuthal respectively (see figure 4.1). The cylinder
is used to represent the curved earth surface on flat paper or a computer
screen. The horizontal reference surface must be mapped onto a 2D mapping
plane. The cylinder is tangent to the sphere along a great circle (the circle
formed on the surface of the Earth by a plane passing through the centre of
the Earth).
Figure 4.1: Normal Projections (Source: De By, 2001:221)
Activity 4.1
1. Define map projections.
2. What are the functions of a map projection?
3. Name types of normal projections.
4. Describe the characteristics of normal projections.
4.5 Developable Surface Projection Types

The developable surfaces are classified into tangent and secant projections.
4.5.1 Tangent projections

Tangent surfaces include the cone, cylinder and plane. They touch the
horizontal reference surface at a point (plane) or along a closed line (cone
and cylinder) (De By, 2001: 225; Longley et al., 2004: 119). A cone is used to
map different continents. The cylinder is used to map the entire world. The
cylinder is tangent to the sphere along a great circle (the circle formed on
the surface of the Earth by a plane passing through the centre of the Earth).
4.5.2 Secant projections

Secant projections intersect the horizontal reference surface along one closed
line (plane) or two closed lines (cone and cylinder) (De By, 2001: 225; Longley
et al., 2004: 119). In the secant case, the cylinder intersects the sphere along
two lines, both small circles (a circle formed on the surface of the Earth by a
plane not passing through the centre of the Earth).
Secant projections are used to reduce or average scale errors because the
line(s) of intersection are not distorted on the map. The symmetry axes of the
plane, cone and cylinder coincide with the rotation axis of the ellipsoid or
sphere, i.e. the line through the North and South Poles (see figure 4.2).
Figure 4.2: Classes of secant projections (Source: De By, 2001:221)
4.5.3 Transverse projections

An example of a transverse cylindrical projection is the Universal Transverse
Mercator (UTM) projection. In a transverse projection the symmetry axis lies
in the plane of the equator. UTM is used worldwide. It is derived from the
Transverse Mercator projection (also known as the Gauss-Kruger or Gauss
conformal projection). UTM is a conformal projection.

Figure 4.3: Transverse cylindrical projection (Source: De By, 2001: 222).
The UTM projection defines horizontal positions world-wide by dividing the
surface of the Earth into 60 narrow longitudinal zones, each 6° wide and
numbered from 1 to 60. Zone numbers designate 6° longitudinal strips extending
from 80° South latitude to 84° North latitude. UTM zone characters designate 8°
latitude bands extending north and south from the equator. Each zone has a
central meridian in the centre of the zone. The maximum linear error is 1 in
2,500, which is equivalent to a maximum error of 4 in 10,000. The secant map
surface makes distortions small enough for large-scale topographic mapping
(De By, 2001: 225; Longley et al., 2004: 119).
In Zimbabwe, topographic maps use the Universal Transverse Mercator system.
Zone 35 South and Zone 36 South cover Zimbabwe.
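Since the 60 zones are each 6° wide and numbering starts at 180° West, the zone number and central meridian for a given longitude can be worked out with simple arithmetic. The sketch below is illustrative: the function names are hypothetical, the sample longitudes are assumed, and the special zone boundaries used around Norway and Svalbard are ignored.

```python
def utm_zone(longitude):
    """Return the UTM zone number (1 to 60) for a longitude in degrees."""
    return int((longitude + 180) // 6) % 60 + 1


def central_meridian(zone):
    """Return the central meridian of a UTM zone in degrees."""
    return zone * 6 - 183


# Two sample longitudes in Zimbabwe (illustrative values).
for lon in (26.0, 31.05):
    zone = utm_zone(lon)
    print(lon, zone, central_meridian(zone))
```

The two sample longitudes fall in zones 35 and 36, whose central meridians are 27° East and 33° East.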
4.5.4 Oblique projections

In oblique projections, the symmetry axis is somewhere between the rotation
axis and the equator of the ellipsoid or sphere (De By, 2001: 222). It can be
an oblique conical projection or an oblique cylindrical projection. Refer to
figure 4.4a and 4.4b for illustrations of oblique projections.
Figure 4.4a: Oblique conical projection (Source: De By et al. 2001: 222)
Figure 4.4b: Oblique Cylindrical Projection (Source: http://mathworld.wolfram.com/topics/MapProjections.html)
Activity 4.2
1. Name developable surfaces that are used for map projections.
2. Describe the secant, transverse and oblique map projections.
3. What are the differences between secant, transverse and oblique map
projections?
4.6 Distortion Properties of Map Projections

All map projections contain some distortion. Projections preserve some
properties at the expense of others. Distortion increases away from the places
of tangency and is a function of certain mathematical relationships of a
projection. The most commonly described mathematical relationships are
conformality, equivalence (equal area) and equidistance. Distortions can occur
in shape, area, distance and direction (De By, 2001: 223; Longley et al.,
2004: 118).
4.6.1 Conformal map projection

A conformal projection locally preserves angles. Any two lines in the map
meet at the same angle as the corresponding original lines on the Earth, and
projected graticule lines always cross at right angles. In other words, the
angles between lines in the map are identical to the angles between the
original lines on the curved reference surface. Also, at any particular point,
scale is the same in all directions. Therefore, angles with short sides and
shapes of small areas are shown correctly on the map (De By, 2001: 223;
Longley et al., 2004: 118).
Shape is preserved locally on conformal maps and angles at any point are
correct, although sizes change. Even though conformal projections preserve
the angles, the directions of lines can be changed in the process of projecting
them.
The Mercator projection is a conformal map projection that has straight
meridians and parallels that intersect at right angles (De By, 2001: 223;
Longley et al., 2004: 118). Scale is true at the equator or at two standard
parallels equidistant from the equator. The projection is often used for
marine navigation because all straight lines on the map are lines of constant
azimuth.
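The statement that sizes change while local shape is preserved can be made concrete for the Mercator projection on a sphere with true scale at the equator. In that case the local scale factor at latitude φ is 1 / cos φ in every direction, so areas are exaggerated by its square. The sketch below is only an illustration of this relationship; the function name is hypothetical.

```python
import math


def mercator_scale_factor(latitude_deg):
    """Local scale factor of the spherical Mercator projection."""
    return 1 / math.cos(math.radians(latitude_deg))


for lat in (0, 20, 60):
    k = mercator_scale_factor(lat)
    # k is the linear exaggeration, k * k the areal exaggeration.
    print(lat, round(k, 3), round(k * k, 3))
```

At 60° latitude distances are stretched by a factor of about two and areas by a factor of about four, which is why high-latitude regions look too large on a Mercator map.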
4.6.2 Equal area map projections

The areas in the map are identical to the areas on the curved reference surface
taking into account the map scale. Areas are represented correctly on the
map. When a map portrays areas over the entire map so that all mapped
areas have the same proportional relationship to the areas on the Earth that
they represent, the map is an equal-area map (De By, 2001: 223; Longley et
al., 2004: 118).
Cylindrical equal-area projections have straight meridians and parallels; the
meridians are equally spaced and the parallels unequally spaced. There are
normal, transverse and oblique cylindrical equal-area projections. Scale is
true along the central line (the equator for normal, the central meridian for
transverse, and a selected line for oblique) and along two lines equidistant
from the central line. Shape and scale distortions increase near points 90
degrees from the central line (De By, 2001: 225).
4.6.3 Equidistant map projection

In an equidistant projection, the length of particular lines in the map is the
same as the length of the original lines on the curved reference surface,
taking into account the map scale. A map is equidistant when the distances
between one or two points and every other point on the map differ from the
corresponding distances on the sphere by only a constant scaling factor
(De By, 2001: 223; Longley et al., 2004: 119).

4.6.4 Direction
A map preserves direction when azimuths (angles from a point on a line to
another point) are portrayed correctly in all directions (De By, 2001: 223).
A particular map can have one of these three properties (conformality,
equivalence (equal area) and equidistance), but no map can, for example,
be both conformal and equal area (De By, 2001: 223).
The choice of projection depends on use. Most GIS applications warrant an equal
area projection. Usually, when doing analysis that requires areal calculations,
an equal area projection should be used. If the area is small enough, the
difference between conformal and equal area projections is not significant.
Activity 4.3
1. Identify types of map distortions.
2. Describe types of map distortions and their properties.
4.7 How Do We Change From One Co-Ordinate System To Another?
You can change from one coordinate system to another using forward
mapping equations and inverse mapping equations.
4.7.1 Forward mapping equations

When changing from geographic coordinates to map projection coordinates
we use forward mapping equations. A forward mapping equation transforms
the geographic coordinates (φ, λ) of a point on the curved reference surface to
a set of planar Cartesian coordinates (x, y), representing the position of the
same point on the map plane (De By, 2001: 217).
The forward mapping equation is formulated as follows:
$$(x, y) = f(\varphi, \lambda)$$
For example, for the Mercator projection on a sphere:
$$x = R(\lambda - \lambda_0)$$
$$y = R \ln \tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$
Source: De By, 2001: 219
where φ and λ are given in radians, R is the radius of the spherical reference
surface (for the WGS84 ellipsoid the equatorial radius is 6378137 m), λ0 is
the central meridian of the projection, e = 2.7182818 and ln is the natural
logarithm (De By, 2001: 219; Longley et al., 2004: 118).
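As a minimal sketch, the forward mapping for the Mercator projection on a sphere, x = R(λ - λ0) and y = R ln tan(π/4 + φ/2), can be written in Python as follows. The function name and the sample coordinates are hypothetical, and the inputs are taken in degrees and converted to radians inside the function.

```python
import math

R = 6378137.0  # radius of the spherical reference surface in metres


def mercator_forward(lat_deg, lon_deg, central_meridian_deg=0.0):
    """Geographic coordinates in degrees to Mercator (x, y) in metres."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    lam0 = math.radians(central_meridian_deg)
    x = R * (lam - lam0)
    y = R * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


# A point at 17° South, 29° East with the central meridian at 27° East.
x, y = mercator_forward(-17.0, 29.0, central_meridian_deg=27.0)
print(round(x), round(y))
```

Southern latitudes are entered as negative values, so y is negative for points south of the equator.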
4.7.2 Inverse mapping equations

When changing from map projection (Cartesian) coordinates to geographic
coordinates, we use inverse mapping equations. Inverse mapping equations
mathematically transform the Cartesian coordinates (x, y) of a point on the map
plane to a set of geographic coordinates (φ, λ).
An inverse mapping equation is formulated as follows:
$$(\varphi, \lambda) = f^{-1}(x, y)$$
For the Mercator projection on a sphere:
$$\varphi = \frac{\pi}{2} - 2\arctan\left(e^{-y/R}\right)$$
$$\lambda = \frac{x}{R} + \lambda_0$$
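The inverse mapping for the Mercator projection on a sphere, φ = π/2 - 2 arctan(e^(-y/R)) and λ = x/R + λ0, can be sketched in the same way. The function name is hypothetical, and the central meridian defaults to 0° because it has to be supplied separately from x and y.

```python
import math

R = 6378137.0  # radius of the spherical reference surface in metres


def mercator_inverse(x, y, central_meridian_deg=0.0):
    """Mercator (x, y) in metres to (latitude, longitude) in degrees."""
    phi = math.pi / 2 - 2 * math.atan(math.exp(-y / R))
    lam = x / R + math.radians(central_meridian_deg)
    # The equations give radians, so convert to degrees at the end.
    return math.degrees(phi), math.degrees(lam)


lat, lon = mercator_inverse(600000.0, 7000000.0)
print(round(lat, 4), round(lon, 4))
```

Note that the equations themselves return radians; the conversion to degrees is a separate final step.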
Activity 4.4
1. Given that the central meridian λ0 = 27°, λ = 29° and R = 6378137 m, what
is the value of x?
2. Given that φ = 17° and R = 6378137 m, what is the value of y?
The answers will give you the x, y Cartesian coordinates for a particular
location.
3. Given that x = 600000, y = 7000000 and R = 6378137 m, calculate the
values of φ and λ.
The result will be geographic coordinates in degrees.
4. Describe how you change from geographic to projection coordinates.
5. Describe how you change from projection coordinates to geographic
coordinates.

4.8 Coordinate Transformation

Coordinate transformation may involve transforming polar coordinates into
Cartesian map coordinates, or transforming from the 2D Cartesian (x, y) system
of a specific map projection to the 2D Cartesian (x', y') system of another
defined map projection.
For example, consider the 2D polar to 2D Cartesian transformation.
Polar coordinates (α, d) are transformed into Cartesian map coordinates (x, y)
as follows:
$$x = d \sin\alpha$$
$$y = d \cos\alpha$$
The inverse equations (Cartesian coordinates into polar coordinates) are
formulated as follows:
$$\alpha = \arctan\left(\frac{x}{y}\right)$$
$$d = \sqrt{x^2 + y^2}$$
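A minimal Python sketch of both directions is given below. It assumes that the azimuth α is measured clockwise from the positive y-axis (north), so that x = d sin α and y = d cos α. For the inverse it uses `math.atan2(x, y)` rather than a plain arctan(x / y), a substitution made here so that all four quadrants and the case y = 0 are handled. The function names are hypothetical.

```python
import math


def polar_to_cartesian(azimuth_deg, distance):
    """Azimuth is measured clockwise from the positive y-axis (north)."""
    alpha = math.radians(azimuth_deg)
    return distance * math.sin(alpha), distance * math.cos(alpha)


def cartesian_to_polar(x, y):
    """Return (azimuth in degrees from 0 to 360, distance)."""
    azimuth = math.degrees(math.atan2(x, y)) % 360
    return azimuth, math.hypot(x, y)


x, y = polar_to_cartesian(90.0, 100.0)   # a point 100 units due east
print(round(x, 6), round(y, 6))
print(cartesian_to_polar(-50.0, 0.0))    # a point 50 units due west
```

A point due east of the origin has an azimuth of 90° and a point due west has an azimuth of 270°.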
Activity 4.5
1. Describe how you transform polar coordinates into Cartesian
coordinates.
2. Describe how you can transform Cartesian coordinates into polar
coordinates.
4.9 Changing Map Projections

The following steps are followed:
Firstly, the inverse mapping equation of the source projection is used to
transform source projection coordinates (X, Y) to geographic coordinates
(φ, λ).
Secondly, the forward mapping equation of the target projection is used to
transform the geographic coordinates (φ, λ) into target projection coordinates
(X', Y'). In a GIS, you first have to convert a map in projection coordinates
to geographic coordinates, and then convert the resultant map in geographic
coordinates to the new projection coordinates. Refer to figure 4.5. In order to
convert from projection A to projection B, one has to convert from Projection A
to Geographic, then to Projection B.
Figure 4.5: Changing from one projection coordinate system to another

(Source: De By, 2001:232)
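The two steps can be sketched in code. The example below is a simplified illustration that assumes a spherical earth, takes Mercator as the source projection and the sinusoidal projection as the target (both chosen only for illustration), and ignores datum differences. The function names are hypothetical.

```python
import math

R = 6378137.0  # sphere radius in metres (assumed)


def mercator_inverse(x, y, lon0=0.0):
    """Step 1: Mercator (X, Y) in metres -> geographic (phi, lam) in radians."""
    phi = 2.0 * math.atan(math.exp(y / R)) - math.pi / 2.0
    lam = lon0 + x / R
    return phi, lam


def sinusoidal_forward(phi, lam, lon0=0.0):
    """Step 2: geographic (phi, lam) in radians -> sinusoidal (X', Y') in metres."""
    x = R * (lam - lon0) * math.cos(phi)
    y = R * phi
    return x, y


def mercator_to_sinusoidal(x, y, lon0=0.0):
    """Projection A -> geographic -> Projection B."""
    phi, lam = mercator_inverse(x, y, lon0)
    return sinusoidal_forward(phi, lam, lon0)


print(mercator_to_sinusoidal(600000.0, 7000000.0))
```

The intermediate geographic coordinates are what make the change possible: neither projection needs to know anything about the other.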
Activity 4.6
1. Identify and describe types of projections.
2. Describe how you would change from one projection coordinate system
to another.
4.10 Summary
In this unit we covered the definition of map projections, types of map
projections and distortion properties of map projections as well as how to
change from one coordinate system to another. Now you have to read widely
on the concept of map projections.

References
De By (2001). Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, International Institute of
Geo-Information and Earth Observation, Educational Textbook Series.
Geographical Information Systems and Science, Second Edition,
New York, John Wiley and Sons Ltd.
http://mathworld.wolfram.com/topics/MapProjections.html/ accessed 26
November 2012.

Unit Five
Spatial Data Capture and Preparation
5.1 Introduction
In this unit, we cover spatial data capture and preparation methods, define
important terms and concepts, and discuss types and sources of GIS data,
sources of error and possible remedies, as well as data quality. Analysis and
modelling in a GIS require input of relevant data. Therefore, we need to
learn how we can acquire and input relevant data into our GIS.
5.2 Objectives
• identify types and sources of GIS data
• discuss the methods of capturing and preparing spatial data
• describe spatial data capture techniques
• discuss the sources of error in spatial data capture and preparation
• explain possible remedies to errors in spatial data capture and
preparation
• evaluate the concept of data quality
5.3 Definitions and Concepts

Data refers to verifiable facts about the real world.
Metadata refers to background information that describes all necessary

information about the data itself. For example, identification information such
as data source(s) and time of acquisition; data quality information that includes
positional, attribute and temporal accuracy, or lineage; and entity and attribute
information such as related attributes and units of measure (De By, 2001: 46;
282; Longley et al., 2004: 152).
Data input is the process of encoding data into a computer-readable format
and making it compatible with a GIS. There are two types of GIS data, namely
spatial data and attribute data (De By, 2001: 149).
Spatial data is data whose X and Y coordinates are known (that is, data whose
physical or geographic location is known). It can be points, lines (segments)
or polygons (areas) (De By, 2001: 27).
Attribute data is data that provides descriptive information for spatial
features, e.g. names and lengths.
Thematic layers refer to maps representing different phenomena, for
example, an elevation map representing the altitude of an area.

Table 4.1: Attribute and Spatial Data

Spatial Data          Attribute Data
X        Y            Farm ID   Farmer Name      Farm Size (ha)
609000   8080000      58        Peter Mapanga    5000
609001   8080001      59        Sam Muti         6000
609002   8080002      60        John Gudo        80000
5.4 Methods of Spatial Data Capture

There are various methods of spatial data capture. These include digitising,
keyboard entry, and Global Positioning System (GPS) data as well as
interpreting and classifying remotely sensed satellite imagery and importing
data from other sources and data files into a GIS (De By, 2001: 149; Longley
et al., 2004: 200).
5.4.1 Digitising
This refers to the process of converting analogue data into digital data (De
By, 2001: 149). It also includes the process of creating vector data from
raster data (for example, a scanned topographical map). In this process, we
trace features like roads, rivers, farm boundaries, homesteads and schools
from scanned maps. Scanning hardcopy topographic and topological maps
and aerial photography is a method of converting analogue data into digital
format (De By, 2001: 149; Longley et al., 2004: 201). Digitising methods
include on-tablet digitising, on-screen digitising, scanning and vectorisation.
In digitising, we have to consider whether the data we want to extract is point,
line or polygon. Points, lines and areas (polygons) are geographic primitives
(Clarke, 1997: 54). Point objects represent single locations of data, e.g. a mine
or a soil sample. They are represented by a symbol, e.g. a circle, triangle or
diamond, and each point has attributes attached to it. Line/segment objects
represent the generalised shape of a geographic feature, e.g. roads, rivers and
fault lines, and are defined by a set of sequential coordinates (vertices and
nodes). A line is defined by two or more nodes. Polygon (area) objects are
defined by a line describing the location of their boundary. The first and last
points of the boundary line have the same coordinate pair.
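The three primitives can be sketched as plain coordinate lists. The example below is only an illustration, with made-up coordinates and hypothetical function names. It shows that a polygon boundary is closed when its first and last coordinate pairs are equal, and how the enclosed area follows from the boundary coordinates (the shoelace formula).

```python
# A point is one coordinate pair, a line is an ordered list of pairs,
# and a polygon is a closed ring of pairs (illustrative values).
point = (609000.0, 8080000.0)
line = [(609000.0, 8080000.0), (609050.0, 8080020.0), (609120.0, 8080010.0)]
polygon = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0), (0.0, 0.0)]


def is_closed(ring):
    """A ring is closed when the first and last coordinate pairs are equal."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_area(ring):
    """Area enclosed by a closed ring, using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(is_closed(line), is_closed(polygon), ring_area(polygon))
```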

Now let us define the various methods of digitisation.

a. In on-tablet digitising the original map is fitted on a special surface
(the Tablet) and features are traced from it with a mouse device (De
By, 2001: 275; Longley et al., 2004: 207).
b. In on-screen digitising a scanned image of the map or a high resolution
satellite image or aerial photography is shown on the computer screen.
Features such as points, lines, and polygons are traced using the mouse
(De By, 2001: 279; Longley et al., 2004: 207).
c. In scanning, a scanner illuminates a document and measures the intensity
of light reflected with a CCD array. This results in an image made up
of rows and columns of pixels. A resolution of between 200 and 300
dots per inch (dpi) is recommended (De By, 2001: 276).
d. Vectorisation is the process of distilling points, lines and polygons
from a scanned image (De By, 2001: 277).
5.4.2 Keyboard entry

Keyboard entry is another form of spatial data capture. It involves typing in
data using the keyboard, for example, entering attribute data into attribute
tables and databases. A GIS user can enter and edit data and update
information in the existing database (De By, 2001: 150; Longley et al., 2004:
215).
5.4.3 The Global Positioning System (GPS)

This is also used in spatial data capture (Longley et al., 2004: 205). Data can
be downloaded directly from a GPS receiver into a computer and eventually
used in a GIS. The data can be points or lines. GPS is a satellite-based
navigation system that uses satellite signals to provide location and time
information on or near the earth's surface. It is made up of a network of 24
satellites placed into orbit by the United States Department of Defense. GPS
was originally intended for military applications, but in the 1980s the U.S.
government made the system available for civilian use. There are many uses
of GPS, which include:
• Disaster relief/emergency services: These depend upon GPS for location
and timing capabilities.
• Map-making: GPS can be used in the making of maps.
• Surveying: Surveyors use absolute locations to make maps and determine
property boundaries.
• Tectonics: GPS enables direct fault motion measurement in earthquakes.
• Locating natural resources and wildlife.

A GPS consists of three major components which are:

• A group of satellites orbiting the earth
• A portable GPS called the rover unit
• A stationary GPS unit at a fixed location called a base station
a. The satellites
GPS technology is based on a constellation of 24 satellites in orbit around the
earth. By locating and receiving signals from four or more of these satellites, a
GPS unit can locate itself on the surface of the planet using a technique
commonly called triangulation (see Figure 5.1). Strictly speaking, the technique
is trilateration: the unit measures its distance from each of the satellites
involved, not its direction. By recording the time taken for each satellite's
signal to reach the device, its position can be accurately isolated.
The signals can be affected by obstructions such as sizeable buildings,
trees, towers and even large hills. For this reason, the more satellites that are
involved, the higher the chance of accurate GPS readings.
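How a signal's travel time becomes a distance can be sketched as follows. This is a simplified illustration that assumes the signal travels at the speed of light in a vacuum and that the clocks are perfect. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in metres per second


def range_from_travel_time(t_transmit, t_receive):
    """Distance to a satellite from the signal's travel time (in seconds)."""
    return C * (t_receive - t_transmit)


def range_error_from_clock_error(clock_error_s):
    """Distance error caused by a given timing error (in seconds)."""
    return C * clock_error_s


# A signal that takes 0.07 s to arrive has travelled about 21 000 km.
print(range_from_travel_time(0.0, 0.07) / 1000.0)
# A timing error of one millionth of a second shifts the range by about 300 m.
print(range_error_from_clock_error(1e-6))
```

The second function shows why very small timing errors matter so much in GPS positioning.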
GPS can be used to determine geographic location with an accuracy that is
much higher than that achieved with a map or a compass. GPS satellites
circle the earth twice a day in a very precise orbit and transmit signal information
to earth. GPS receivers take this information and use it to calculate the user's
location. Essentially, the GPS receiver compares the time a signal was
transmitted by a satellite with the time it was received. The accuracy of
positions determined with a GPS ranges between +/- 1m and +/- 300m and is
dependent on:
• The number of satellites that can be located by the GPS rover unit
• The strength of the GPS receiver
• The constellation of the satellites in the sky
• The topography of the terrain
• Atmospheric conditions
Figure 5.1: GPS location with respect to multiple satellites (Adapted
from: http://www.techrepublic.com)

b. The rover unit
The rover unit is a portable GPS that is carried in the field and calculates the
location of the user. The rover unit connects to the satellites, calculates the
location of the unit and saves the location as a rover file. In some rover units,
the data can be stored as one of three different feature types: points, lines or
polygons. The rover unit consists of three distinct parts, namely the antenna,
the GPS receiver and the data logger.
• The antenna acquires signals from the satellites.
• The GPS receiver locates and maintains contact with satellites.
• The data logger calculates and stores geographic position data for the
user.
c. The base station
The base station is a stationary GPS unit at a known geographic location. Its
position is predetermined with the highest accuracy possible. A base station is
not a requirement of the GPS system, but it does increase accuracy.
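The way a base station improves accuracy can be sketched as follows. This is a deliberately simplified illustration: it assumes the base station and the rover are affected by the same error at the same moment and corrects the final position directly, whereas real differential systems usually correct the measured satellite ranges. The coordinates and the function name are made up.

```python
def differential_correction(base_known, base_measured, rover_measured):
    """Shift a rover position by the error observed at the base station."""
    dx = base_known[0] - base_measured[0]
    dy = base_known[1] - base_measured[1]
    return rover_measured[0] + dx, rover_measured[1] + dy


base_known = (609000, 8080000)      # surveyed position of the base station
base_measured = (609004, 8079997)   # what the base station's GPS reports
rover_measured = (609504, 8080497)  # what the rover's GPS reports

print(differential_correction(base_known, base_measured, rover_measured))
```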
GPS accuracy
The accuracy of a position determined with GPS depends on the type of
receiver. Most consumer GPS units have an accuracy of about +/-10m. Other
types of receivers use a method called Differential GPS (DGPS) to obtain
much higher accuracy. DGPS requires an additional receiver fixed at a known
location nearby. Observations made by the stationary receiver are used to
correct positions recorded by the roving units, producing an accuracy better
than 1 metre. There are many causes of position errors or low signal. The
factors that can degrade the GPS signal and thus affect accuracy include the
following:
Ionosphere and troposphere delays - The satellite signal slows as it passes
through the atmosphere. The GPS system uses a built-in model that calculates
an average amount of delay to partially correct for this type of error.
Signal multipath - This occurs when the GPS signal is reflected off objects
such as tall buildings or large rock surfaces before it reaches the receiver. This
increases the travel time of the signal, thereby causing errors.
Receiver clock errors - A receiver's built-in clock is not as accurate as the
atomic clocks on board the GPS satellites. Therefore, it may have very slight
timing errors.
Orbital errors - Also known as ephemeris errors, these are inaccuracies of
the satellite's reported location.
Number of satellites visible - The more satellites a GPS receiver can "see",
the better the accuracy.
Positional errors - Buildings, terrain, electronic interference, or sometimes
even dense foliage can block signal reception, causing position errors or possibly
no position reading at all. GPS units typically will not work indoors, underwater
or underground.
Satellite geometry - This refers to the relative position of the satellites at
any given time. Ideal satellite geometry exists when the satellites are located at
wide angles relative to each other. Poor geometry results when the satellites
are located in a line or in a tight grouping.
5.4.4 Interpreting and classifying remotely sensed images

Remotely sensed satellite imagery is a source of geographic or spatial data.
The data is spatially referenced or can be given a spatial reference. Once
the imagery is interpreted or classified, it becomes valuable spatial data that
can be readily manipulated in a GIS (De By, 2001: 272; Longley et al., 2004:
202).
5.4.5 Importing data from other sources/data file into a GIS

A GIS is capable of acquiring data from other sources and data files and
converting it into its native file formats. For example, TIF images, Excel
spreadsheets and JPEG images can be imported into a GIS and converted to
the file formats of that particular GIS software. In addition, file formats from
one GIS software package can be imported into another, for example, an
ArcView file can be converted into an ILWIS readable file format. A GIS must
be able to import the most common data formats for both image-type (raster)
and vector-type maps.
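As a minimal sketch of importing data from another source, the example below reads a spreadsheet table that has been saved as comma-separated text and turns each row into a point feature with its attributes. The column names (X, Y, FarmID, Farmer, Size_ha) and the dictionary layout are assumptions made for illustration; the values are those of the farm table in Section 5.3.

```python
import csv
import io

# Text as it might be exported from a spreadsheet (hypothetical column names).
CSV_TEXT = """X,Y,FarmID,Farmer,Size_ha
609000,8080000,58,Peter Mapanga,5000
609001,8080001,59,Sam Muti,6000
609002,8080002,60,John Gudo,80000
"""


def load_points(text):
    """Convert CSV rows into point features: spatial data plus attribute data."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({
            "geometry": (float(row["X"]), float(row["Y"])),
            "attributes": {
                "farm_id": int(row["FarmID"]),
                "farmer": row["Farmer"],
                "size_ha": float(row["Size_ha"]),
            },
        })
    return features


for feature in load_points(CSV_TEXT):
    print(feature["geometry"], feature["attributes"]["farmer"])
```

In practice the text would be read from a file on disk; a string is used here only so that the sketch is self-contained.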

Activity 5.1
1. Define the following terms:
a. data entry
b. spatial data
c. attribute data
2. Explain the five methods of spatial data capture.
3. Describe the major components of the GPS.
4. Explain the major sources of GPS error.
5. Describe how these errors can be corrected.
5.5 Sources of Geographic Data

Sources of geographic or spatial data can be classified into primary and
secondary sources. Data which is captured directly from the environment is
known as primary data (De By, 2001: 272; Longley et al., 2004: 200). Any
data which is not captured directly from the environment is known as secondary
data (De By, 2001: 272; Longley et al., 2004: 200).
5.5.1 Primary sources

Primary sources for the vector data model are GPS measurements and survey
measurements. The primary sources for the raster data model are digital
remotely sensed images and digital aerial photographs (De By, 2001: 272;
Longley et al., 2004: 200).
5.5.2 Secondary data sources

Secondary data sources for the vector data model are topographic maps.
Secondary data sources for the raster data model are scanned topographic
maps and Digital Elevation Models (DEMs). These include external digital
sources, other studies, public domain datasets and images (De By, 2001:
272; Longley et al., 2004: 200).
5.5.3 Other sources

Other sources of geographic data include clearinghouses and web portals,
metadata, and data formats and standards.
1. Clearinghouses and web portals are centralised repositories from which
data can be obtained. Public domain digital data is available in the
form of maps. There are various web portals where geographic data
can be obtained. Examples include Worldclim (www.worldclim.org)
for climate data, http://glovis.usgs.gov/ and http://earthexplorer.usgs.gov/
for satellite imagery (for example, Landsat TM), as well as
http://www.fao.org/geonetwork, where georeferenced data can be obtained.
2. Metadata refers to background information that describes all necessary
information about the data itself. For example, identification information
such as data source(s) and time of acquisition; data quality information
that includes positional, attribute and temporal accuracy, or lineage;
and entity and attribute information such as related attributes and units
of measure (De By, 2001: 46; 282; Longley et al., 2004: 152).
3. Data formats and standards refer to information about the content of the
map and the type and format of the data, for example:
Content     Type      Format
Rainfall    raster    ASCII
Activity 5.2
1. Identify sources of geographic data.
2. Describe each of the sources of GIS data that you identified.
5.6 Data Quality

As GIS specialists, we should consider the quality of the data we produce or
acquire. The data should be checked for error or accuracy and precision.
Error/accuracy is defined as the closeness of observations, computations
or estimates to the true value or the values perceived to be true (De By, 2001:
284). Accuracy can be relative or absolute. Precision is the smallest unit of
measurement to which data can be recorded (De By, 2001: 284). In terms of
error/accuracy, we consider positional accuracy, completeness and temporal
accuracy, as well as lineage and logical consistency.
5.6.1 Positional accuracy

In terms of positional accuracy, we consider human errors, instrumental or
systematic errors and random errors. There may be human errors in
measurement, for example, reading errors. We may encounter instrumental or
systematic errors, e.g. due to poorly adjusted instruments. In addition, random
errors caused by natural variations in the quantity being measured may be
encountered.
5.6.2 Completeness
Completeness refers to whether there are data lacking in the database
compared to what exists in the real world. It can be spatial, temporal or
thematic aspects of the data. For example, a property boundary may contain
10 houses instead of 12 houses that are on the ground.
5.6.3 Temporal accuracy

Temporal accuracy refers to a situation whereby there is a difference in the
values of coordinates of a particular feature at two different times. This occurs
when one is recording coordinates of a feature at two different times and
coming up with two different coordinates.
5.6.4 Lineage
Lineage describes the history of the data set, for example, date and scale of
aerial photography. Most datasets can have historical information on them
such as date of acquisition and scale. There may be errors in these recordings.
5.6.5 Logical consistency

For logical consistency, the data has to be checked for compatibility with
other data in a data set, e.g. in terms of data format. We also check for the
absence or presence of contradictions within a data set and the topological
consistency of the data set. GIS users should also check whether or not the
data shows allowed attribute value ranges and combinations of attributes,
e.g. attribute values for population, area and population density must agree
for all entries in the database.
NB: The absence of inconsistencies does not necessarily imply that data is
accurate.
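A check of this kind can be sketched in code. The example below is an illustration only: the field names, the tolerance of 1% and the function name are assumptions. It tests whether each value lies in an allowed range and whether population, area and population density agree with one another.

```python
def check_record(rec, rel_tol=0.01):
    """Return a list of logical inconsistencies found in one record."""
    problems = []
    if rec["population"] < 0:
        problems.append("population must not be negative")
    if rec["area_km2"] <= 0:
        problems.append("area must be positive")
    else:
        # Density must agree with population divided by area.
        expected = rec["population"] / rec["area_km2"]
        if abs(expected - rec["density"]) > rel_tol * max(expected, 1.0):
            problems.append("density does not equal population / area")
    return problems


consistent = {"population": 1200, "area_km2": 40.0, "density": 30.0}
contradictory = {"population": 1200, "area_km2": 40.0, "density": 45.0}

print(check_record(consistent))
print(check_record(contradictory))
```

A record that passes such a check is consistent, but as noted above, it is not necessarily accurate: all three values could be wrong in a way that still agrees.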
Activity 5.3
1. Identify sources of error in acquiring GIS data.
2. Describe the sources of error in GIS data acquisition.

5.7 Data Checks and Repairs

We need to check data for accuracy, consistency and completeness and apply
appropriate remedies. The diagram below shows some possible errors in
digitising and their remedies.
Figure 5.2: Digitising errors and their remedies (Source: De By, 2001:
306).
Activity 5.4
1. Identify common errors in digitising GIS data.
2. What are the remedies for each particular digitising error?
5.8 Combining Data from Multiple Sources

In GIS, there may be a need to combine data from different sources. The
following situations can be encountered:
1. The data may be about the same area but differ in accuracy.
2. The data may be about the same area but differ in choice of representation.
3. The data may be about adjacent areas and have to be merged into a single
data set.
4. The data may be about the same or adjacent areas but be referenced in
different coordinate systems.
5. The data may be about the same area but differ in scale or resolution.
Given such situations, we have to come up with solutions. The following
solutions can be implemented:
1. Improving the accuracy of the low accuracy data.
2. Harmonising the representation of the data.
3. Addressing issues of spatial references and putting the data into a common
coordinate system.
4. Converting the data to the same scale or resolution.
Activity 5.5
1. Identify constraints that can be encountered when combining data from
different sources.
2. What could be possible remedies when you are faced with such
scenarios?
5.9 Summary
In this unit, we discussed spatial data capture and preparation methods, defined
important terms and concepts, and covered types and sources of GIS data,
sources of error and possible remedies, as well as data quality. Now you
should read widely on spatial data capture. The methods change and
improve over time, so keep searching and reading widely.

References
Clarke, K.C. (1997). Getting Started with Geographic Information
Systems, London, Prentice Hall.
De By (2001). Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, International Institute of
Geo-Information and Earth Observation, Educational Textbook Series.
http://www8.garmin.com/aboutGPS/applications.html: Accessed 7/08/2012.
www.worldclim.org.
http://glovis.usgs.gov/
http://earthexplorer.usgs.gov/
http://www.fao.org/geonetwork.


Unit Six
Spatial Data Management and Processing Systems
6.1 Introduction
A GIS should be able to manage and process spatial data. In this unit,
we will discuss the systems that facilitate the management and
processing of geo-information, hardware and software trends as well
as stages in data handling. We also cover databases, including geographic
databases.
6.2 Objectives
• describe hardware trends that aided the development of GIS
• discuss software trends in GIS
• explain the components of a GIS
• discuss the stages of spatial data handling
• analyse reasons for using databases
• describe the components of a relational database
• explain the steps taken in database creation
• evaluate the methods of querying relational databases
6.3 Hardware Trends

There have been advances in computer hardware. For example, faster and
more powerful processors are being developed. In the 1980s, a Personal
Computer (PC) had a 2MHz Central Processing Unit (CPU), 128KB memory
and 10MB hard disk. Today the processing speed, memory and capacity of
the hard drive have been expanded and some have as much as one Terabyte
hard disk and over 4GB memory as well as over 2.27 GHz CPU (De By,
2001: 137).
In the 1970s, computers became more accessible and affordable. Mainframe
computers gave way to minicomputers and then workstations. Computers
are now more affordable and are widely used for business and personal
purposes. There have been significant developments in computer
networks and linkages. Workstations gave greater power to the user and access
to networks (De By, 2001: 137). Therefore, there have been, and there continue
to be, improvements in computer hardware that are aiding the development of
GIS.
6.4 Software Trends

In line with hardware trends, there have been significant software
improvements. Software developers continue to produce application programs
and operating systems that provide more functionality. The type of interface
required to operate technical software has changed from batch, command-line
and remote access to windowing systems and "point and click" graphic
interaction. These tend to consume significantly more memory.

Existing software performs better when run on faster computers. Software
such as ArcView, ArcMap and ILWIS is continuously being upgraded, and
new versions of the software are being developed (De By, 2001: 142).
Activity 6.1
1. Identify software changes that have taken place over time.
2. Describe software trends that have taken place over time and how
they have aided GIS development.
3. Explain hardware changes that have taken place over time.
4. Describe hardware trends that have taken place over time and how
they have aided GIS development.
6.5 GIS and Stages of Data Handling

A GIS has a range of capabilities that include:
• Data capture and preparation
• Data management (storage and maintenance)
• Data manipulation and analysis
• Data presentation
Paperwork and manual data processing have been replaced by the use of digital
information and computers. A GIS helps store spatial data in digital form in
world coordinates. All available GIS packages have their strengths and
weaknesses. These have resulted from the development history and/or intended
application domain(s) of each package.
• Some are more raster based
• Some are more vector based
Fully fledged GIS packages that support vector and raster data structures
include ILWIS, Intergraph's GeoMedia, ESRI's ArcGIS and MapInfo. The
choice of software depends on the intended application and the expertise of
its user. No software is better than another.
6.5.1 Spatial data capture

Spatial data capture methods include manual digitising, scanning, vectorisation
and data conversion, i.e. changing from the original format to a particular GIS
readable format.
Manual digitising covers coordinate and attribute entry via the keyboard, a
digitising tablet with a mouse cursor, a mouse cursor on the computer screen
(heads-up digitising) and digital photogrammetry. Automatic digitising involves
scanning documents with flatbed or drum scanners. Semi-automatic digitising
covers vectorisation of data using line-following software or point-extracting
software. We can also input available digital data from CD-ROM, DVD-ROM,
via computer networks or the internet (including geo-web services) (De By,
2001: 149; Longley et al., 2004: 201).
6.5.2 Spatial data storage and maintenance

The data has to be organised in order to meet the objectives of the GIS. The
files have to be properly managed for easier retrieval and analysis of data. We
need to use a DBMS if the data sets are large.
6.5.3 Spatial query and analysis

A GIS has to answer user questions. A well organised and developed GIS
can be used to carry out queries and analyses that answer the user's needs,
for example, overlay analysis, neighbourhood computations and buffering.
See Units 1, 7, 8 and 9.
6.5.4 Spatial data presentation

After doing some analysis, the user has to present some results. The presentation
may either be an end product, such as a printed map, or an intermediate product,
for example, spatial data made available through the internet. The outputs and
the devices needed to produce them are listed below (De By, 2001: 157).
Output                         Devices
Hardcopy                       Printer
                               Plotter
                               Film writer
Soft copy                      Computer screen
Output of digital data sets    Magnetic tapes
                               CD-ROM / DVD-ROM
                               The internet

Activity 6.2
1. Identify the components of a GIS.
2. Describe the components of a GIS.
3. Discuss the stages in spatial data handling.
6.6 Database Management Systems (DBMS)

The choice of whether or not to use a DBMS depends on:
1. How much data is there and will be there?
2. What type of use will be made of it?
3. How many users might be involved?
6.6.1 Definitions and concepts

In this section, we define the terms Database Management System, database,
query, field, form, primary key, record, table, relationship and reports.
A Database Management System (DBMS) is a software package that
allows the user to set up, use and maintain a database, for example, MS Access.
A DBMS is designed to organise the efficient and effective storage and access
of data, maintain data and make that data available on demand (De By, 2001:
158; Longley et al., 2004: 218).
A database is a large, computerised collection of structured data on a particular
subject (for example, bank account administration) (De By, 2001: 158;
Longley et al., 2004: 218). It is a collection of information organised in such a
way that a computer program can quickly select desired pieces of data. Data
is stored in one or more tables. Geographic databases are databases containing
geographic data for a particular area and purpose.
A query is a computer program that extracts data from the database that meets
the conditions indicated in the query. A query uses a set of rules (criteria) to
select specific records from the database (De By, 2001: 161).
A field is a category of information, for example, first name or last name.
A form is used in databases to make it easier to enter and modify data.
A primary key is a field that can uniquely identify each record (De By, 2001:
167; Longley et al., 2004: 225).
A record is a complete set of information in a database, for example, details
about a particular person.
A table refers to data arranged in rows and columns. All database information
is stored in tables.
A relationship is a link between two database tables that have related
information, for example, a link between student details and the results for
each student.
Reports are used to present data in a neat, ready to print format.
6.7 The Importance of DBMS

There are several reasons that explain the importance of databases. These
include the following (Longley et al., 2004: 225; Mather, 1993: 93):
1. Databases support the storage and manipulation of very large datasets.
2. Databases can be instructed to guard data correctness and eliminate
obvious errors, e.g. where we know the study area we are working in,
we also know the range of possible geographic coordinates and
therefore, we can ensure that the DBMS checks them. These rules are
known as integrity constraints.
3. Databases support concurrent use of the same dataset by many users.
The functions that allow multiple users are called concurrency control.
4. Databases provide a high level declarative query language. The user
can define queries for selecting answers to particular questions, for
example, instructing a database to show all male participants or all
child-headed households in a household database.
5. Databases support the use of a data model. A data model is a language
with which one can define a database structure and manipulate the data
stored in it.
6. Databases include data backup and recovery functions to ensure data
availability at all times.
7. Databases allow control of data redundancy.
8. Databases decrease maintenance costs because of better organisation
and reduced data duplication.
9. Databases provide security, as a DBMS provides controlled access to data.
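The integrity constraints mentioned in point 2 can be sketched with Python's built-in sqlite3 module and an in-memory database. The table name, column names and coordinate bounds below are made up for illustration; the idea is only that the DBMS itself rejects coordinates that fall outside the study area.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CHECK clauses act as integrity constraints on the coordinate ranges.
conn.execute("""
    CREATE TABLE farm (
        farm_id INTEGER PRIMARY KEY,
        x REAL NOT NULL CHECK (x BETWEEN 600000 AND 620000),
        y REAL NOT NULL CHECK (y BETWEEN 8070000 AND 8090000)
    )
""")


def try_insert(farm_id, x, y):
    """Insert a record; return False if the DBMS rejects it."""
    try:
        conn.execute("INSERT INTO farm VALUES (?, ?, ?)", (farm_id, x, y))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(58, 609000, 8080000))  # inside the allowed ranges
print(try_insert(59, 60900, 8080001))   # x is missing a digit, so it is rejected
```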

Activity 6.3
1. Define the terms: Database Management System, database, query,
field, form, primary key, record, table, relationship and reports.
2. Explain the importance of using a database in GIS.
6.8 The Relational Data Model

A data model is a language that allows the definition of the structures that will
be used to store the data, the integrity constraints the stored data has to obey
at all moments in time and the computer programs used to manipulate data.
The relational data model is the most widely used data model. In it, a database
is a collection of relations (tables). A table is a collection of records. A table
contains attribute columns; an attribute is a named field of a record. All values
for the same attribute have a single domain, for example, string, date, yes/no
or number. Figure 6.1 illustrates a relational data model. The three tables are
linked to each other.
Figure 6.1: An example of a database consisting of three relations
(tables) and their tuples and attributes (Source: De By, 2001: 165)
The relational data model consists of a relation schema. A relation schema
refers to the definition of a relation. When a relation is created, we have to:
1. Provide a name for the relation.
2. Indicate which attributes it will have.
3. Define what the domain of each attribute is.
In addition, there will be a database schema. The relation schemas of all
tables (relations) together make up the database schema.
Each attribute has a domain. This is referred to as the attribute domain. An
attribute domain describes the type of data that can be stored in an attribute.
Examples of attribute domains include:
• String      E12-18-07
• Numerical
    integer   255
    real      23.457
• Date        18/07/2000
• Memo        At this location we see ….
• Location    -21.5280502, 29.2430274 (lat, long)
• Boolean     yes/no
A domain constrains the attribute value, i.e. attaching a domain to an
attribute means that "all values for this attribute must be an element of
the specified set".
6.9 Databases
In this section we focus on the eight steps taken in the creation of a database.
6.9.1 Steps in creating a database

1. Clearly define the purpose of the database.
2. Decide what information you want to get out of the database, for
example, crop type per farm or field.
3. Decide what information the database needs to store in order to generate
the desired output.
4. Plan tables to store the information in the database and decide what
fields will be in each table.
5. Create a database file in Access and create the tables in the database.
6. Create forms to assist in entering information.
7. Create queries to generate the required output.
8. Create reports to present the output neatly, ready for printing.

Activity 6.4
1. Define:
a. The relational data model
b. The relation schema
c. The database schema
d. The attribute domain
2. Describe the steps taken in creating a database. Illustrate your answer
with an example.
6.9.2 Primary key

A key of a relation comprises one or more attributes (columns) whose values
uniquely identify a record. The key chosen to identify records is the primary
key. There are rules that should be followed when defining keys. These
include:
• key uniqueness: the key is a unique identifier, with no duplicates within the
relation
• key integrity: a primary key value is never null (unknown)
• referential integrity: the value of a foreign key refers to an existing primary
key value in another relation; that is, foreign key values must have current
matching primary key values (De By et al., 2001: 167; Longley et al., 2004: 225).
A foreign key is used to refer to the primary key of another relation. It is not
the primary key of the relation in which it appears but is the primary key of
another relation (De By et al., 2001: 170).
Figure 6.2: The table Title Deed has a foreign key in its attribute Plot
(De By, 2001: 170)
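The key rules can be tried out in a few lines of code. The following is a minimal sketch using Python's built-in sqlite3 module rather than Access. The relation names TitleDeed and Parcel and the attributes Plot and Pid follow Figure 6.2; the remaining column names and all the values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Parcel.Pid is the primary key; TitleDeed.Plot is a foreign key that refers to it.
# Column names other than Pid and Plot are hypothetical.
conn.execute("CREATE TABLE Parcel (Pid INTEGER PRIMARY KEY, Location TEXT, AreaSize REAL)")
conn.execute(
    "CREATE TABLE TitleDeed ("
    "Plot INTEGER NOT NULL REFERENCES Parcel(Pid), Owner TEXT, DeedDate TEXT)"
)
conn.execute("INSERT INTO Parcel VALUES (3421, 'Block A', 435)")
conn.execute("INSERT INTO TitleDeed VALUES (3421, 'Owner 1', '2000-07-18')")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected ({err})"


# Key uniqueness: a second parcel with Pid 3421 is refused.
duplicate_key = try_insert("INSERT INTO Parcel VALUES (3421, 'Block B', 550)")
# Referential integrity: a title deed for a parcel that does not exist is refused.
dangling_reference = try_insert("INSERT INTO TitleDeed VALUES (9999, 'Owner 2', '2001-01-05')")
# A new parcel with an unused key value is accepted.
new_parcel = try_insert("INSERT INTO Parcel VALUES (8871, 'Block B', 550)")
```

The database itself, not the user, enforces the rules: the first two inserts are rejected and only the third is stored.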

Activity 6.5
1. What attribute is the primary key of the relation Title Deed?
2. Can Deed Date be the primary key of the relation? Explain your
answer.
6.9.3 Querying a relational database

To define queries, we need SQL, which is the standard language used to access
and manipulate databases. SQL stands for Structured Query Language (De
By, 2001: 172; Longley et al., 2004: 219). The basic syntax of an SQL query is
as follows:
SELECT attribute name(s)
FROM relation name
WHERE selection condition
There are three basic query operators used in performing queries. These are
record selection, attribute projection and joining.
Record selection works like a filter. It allows only records that meet the
selection condition to pass and disallows records that do not meet the
condition. Figure 6.3 illustrates record selection: the query returns fewer
records than the table it is applied to.
Figure 6.3: Record selection (De By, 2001: 175)

Attribute projection passes through all records of the input but reshapes each
of them in the same way. The query produces fewer attributes. Figure 6.4
illustrates attribute projection: the table at the top shows three attributes
before attribute projection and the one at the bottom shows two attributes
after attribute projection.
Figure 6.4: Attribute projection (De By, 2001: 175)
The join operator is another query operation. It can be used to join tables in
a database (see Figure 6.5). The table at the bottom in Figure 6.5 shows the
result of joining the two tables at the top.
Figure 6.5: The join operator (De By, 2001: 176)

The join condition is TitleDeed.Plot = Parcel.Pid, which expresses a foreign
key/key link between TitleDeed and Parcel. The result has 3 + 3 = 6 attributes.
Record selection, attribute projection and the join can be combined in a
single selection, projection and join query. The combined query is
illustrated in Figure 6.6.
Figure 6.6: The combined query (De By, 2001: 177)
Figure 6.6 shows a combined selection/projection/join query that selects
owners and deed dates for parcels with a size larger than 1000. The join is
carried out first, followed by tuple selection on the tuples that result from
the join. Finally, an attribute projection is carried out (De By, 2001: 177).
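The three query operators and their combination can be sketched with Python's built-in sqlite3 module. The relation names and the join condition TitleDeed.Plot = Parcel.Pid come from the text; the column names Owner, DeedDate and AreaSize and all the rows are assumptions chosen only to make the example run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parcel (Pid INTEGER PRIMARY KEY, AreaSize REAL);
    CREATE TABLE TitleDeed (Plot INTEGER REFERENCES Parcel(Pid), Owner TEXT, DeedDate TEXT);
    INSERT INTO Parcel VALUES (1, 435), (2, 1550), (3, 1040);
    INSERT INTO TitleDeed VALUES
        (1, 'Owner A', '1998-10-10'),
        (2, 'Owner B', '1995-08-10'),
        (3, 'Owner C', '1999-09-29');
""")

# Record selection: all attributes, but only the records that meet the condition.
selection = conn.execute(
    "SELECT * FROM Parcel WHERE AreaSize > 1000 ORDER BY Pid"
).fetchall()

# Attribute projection: all records, but only the named attributes.
projection = conn.execute(
    "SELECT Owner, DeedDate FROM TitleDeed ORDER BY Plot"
).fetchall()

# Combined query: join on the foreign key/key link, select large parcels,
# then project onto owner and deed date.
combined = conn.execute(
    "SELECT Owner, DeedDate FROM TitleDeed, Parcel "
    "WHERE TitleDeed.Plot = Parcel.Pid AND AreaSize > 1000 "
    "ORDER BY Owner"
).fetchall()
```

With these sample rows, the combined query keeps the owners of parcels 2 and 3 only, since parcel 1 is smaller than 1000.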
6.9.4 Spatial database functionality

A spatial DBMS aids the user in managing spatial data, including the storage,
retrieval, update and query of collections of spatial data types. It can handle
spatial data types (for example point, line and polygon) in its data model.
Spatial databases offer an extended form of standard SQL for performing
spatial queries (for example intersects, inside and disjoint). A spatial
database is illustrated in Figure 6.7.
Figure 6.7: A spatial database (De By, 2001)
Figure 6.8: Linking raster spatial data and attribute data
Figure 6.9: Linking vector spatial data and attribute data

The illustrations in Figures 6.8 and 6.9 show that maps (spatial data) can be
linked with attribute data in a spatially enabled database in order to perform
spatial queries and to retrieve, analyse and present information on demand.
Activity 6.6
1. Name methods used to query relational databases.
2. Describe the methods of querying relational databases.
6.10 Summary
In this unit, we covered the components of a GIS that facilitate the management
and processing of geo-information, hardware and software trends as well as
stages in data handling. In addition, we covered databases including geographic
databases. Now, you have to read widely on the functional components of a
GIS, stages in spatial data handling and databases.
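References
De By, R.A. (Ed.) (2001). Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, ITC Educational Textbook Series.
Mather, P.M. (Ed.) (1993). Geographical Information Handling: Research
and Applications, Chichester, John Wiley and Sons Ltd.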

Unit Seven
Spatial Data Transformations
7.1 Introduction
In this unit, we cover the methods of transforming data in a GIS. For
example, a sample of points may be available, but we may need to derive
a value for the phenomenon at another location or for the whole study
area. There may be a need to transform points into other representations in
order to facilitate interpretation or integration with other data. Examples
include defining homogeneous areas (polygons) from point data and deriving
contour lines. We cover interpolation of continuous and discrete data. The
interpolation techniques covered are nearest neighbour interpolation,
Thiessen polygons, trend surface fitting and triangulation, as well as moving
window averaging and kriging.
7.2 Objectives
By the end of this unit, you should be able to:
• define interpolation
• explain the parameters taken into consideration in interpolation
• describe methods of interpolation of discrete data
• discuss the methods of interpolation of continuous data
7.3 Interpolation
Interpolation is the calculation of a value from "surrounding" observations
(Burrough and McDonnell, 1989: 203; De By, 2001: 320; Longley et al., 2004:
333; Mather, 1993: 109). There are various methods of interpolation. Some
are suitable for continuous data and some for discrete data, so we need to
consider the type of data. Basically, we have two data types, namely discrete
data and continuous data. We also consider the nature of the surface, which
can be simple or complex, and the scale and resolution of the data.
Discrete data refers to qualitative (categorical) data, for example geological
units. Discrete data can be represented as a classified raster or as a polygon
data layer (vector), in which each polygon has a constant value.
Continuous data refers to data represented as continuous measurements.
This may include elevation, temperature and salinity. The data is quantitative
and can be represented as an unclassified raster, as isolines (vector layer) or
as a Triangular Irregular Network (TIN).
Activity 7.1
1. Define the term interpolation.
2. Identify parameters taken into consideration for interpolation.
3. Describe how the parameters considered for interpolation are taken
into consideration in interpolating data.
7.3.1 Interpolating discrete data

Interpolating discrete data refers to interpolation of nominal, categorical
and ordinal data. The techniques include nearest neighbour interpolation and
Thiessen's polygons.

Nearest neighbour interpolation
The value for a given location (X, Y) is taken from the nearest point of
measurement: each location is assigned the value of the closest measured
point. The nearest neighbour technique therefore constructs "zones" around
the points of measurement. Each location belonging to a zone is
assigned the same value (De By, 2001: 323).
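A minimal sketch of nearest neighbour interpolation in Python follows. The sample points, their class labels and the 10 by 10 raster are hypothetical; the only rule applied is the one described above, namely that each location takes the value of the closest measured point.

```python
from math import hypot

# Hypothetical sample points: (x, y, class observed at that point)
samples = [
    (2.0, 2.0, "sandstone"),
    (8.0, 3.0, "granite"),
    (5.0, 9.0, "shale"),
]


def nearest_neighbour(x, y, points):
    """Return the value of the measured point closest to location (x, y)."""
    nearest = min(points, key=lambda p: hypot(p[0] - x, p[1] - y))
    return nearest[2]


# Fill a 10 x 10 raster cell by cell, using each cell centre as the location.
# grid[row][col] holds the value for the cell centred on (col + 0.5, row + 0.5).
grid = [
    [nearest_neighbour(col + 0.5, row + 0.5, samples) for col in range(10)]
    for row in range(10)
]
```

Cells that share the same nearest sample point receive the same value, so the filled raster shows the "zones" around the points of measurement.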
Thiessen polygons
Thiessen polygons are constructed if the desired output is a polygon layer.
The boundaries of the polygons are the locations for which more than one
point of measurement is the closest point. If the desired output is a raster
layer, we rasterise the polygons (De By, 2001: 323; Longley et al., 2004: 333).
Thiessen polygon partitions make use of geometric distance for determining
neighbourhoods. A polygon is generated around each target location that
identifies all those locations that "belong to" that target.
Figure 7.1: Thiessen's Polygon (Source: De By, 2001: 324)
Activity 7.2
1. Identify methods of interpolating discrete data.
2. Describe the methods of interpolating discrete data.
7.3.2 Interpolating continuous data

Methods for interpolating continuous data include trend surface fitting using
regression, triangulation, spatial moving averages using inverse distance
weighting and kriging.

Trend surface fitting using regression
The entire study area can be represented by a formula f(X, Y). For a given
location with coordinates (X, Y), we use the formula to approximate the value
of the field at that location. We have to derive a formula that best describes
the field. Regression techniques can be used to determine the coefficients of
the equation (De By et al., 2001: 326).
Triangulation
This refers to the construction of a triangular tessellation of the study area
from the known measurement points (De By, 2001: 93; 331; Longley et al., 2004:
189). We then define the values of the field for which we want to construct
isolines. For example, for elevation we may need a 100m isoline, a 200m isoline
and so on. Figure 7.2 shows known point measurements, the triangulation
constructed on the known points and the isolines constructed from the
triangulation.
Figure 7.2: Triangulation (De By, 2001: 331)
Moving average using inverse distance weighting (IDW)
Moving window averaging derives raster data from a set of sample points.
Cell values in the output raster are computed one by one. A window, known
as a kernel, has to be defined.
Measurement points falling inside the kernel contribute to the averaging
computation. Measurement points falling outside the kernel do not contribute
to the averaging computation. After the cell value is computed and assigned
to the cell, the window is moved one cell to the right and the computation is
performed for that cell. All cells in the raster are assigned values in this way
(De By, 2001: 332; Longley et al., 2004: 333).

Figure 7.3: Moving window averaging (De By et al., 2001: 332)
The averaging function computes the arithmetic mean, treating all measurements
equally, through the following formula:
$$\frac{1}{n} \sum_{i=1}^{n} m_i$$
where n is the number of measurements selected in the kernel and m_i is the
value of measurement i (De By, 2001: 333). However, the measurements closer
to the cell centre should have greater influence on the predicted value than
those further away. This principle is called spatial autocorrelation. A
distance factor is therefore applied: a distance-weighted function called
inverse distance weighting (IDW) is used to interpolate spatial data. A common
form weights each measurement by the inverse of its squared distance:
$$\frac{\sum_{i=1}^{n} m_i / d_i^2}{\sum_{i=1}^{n} 1 / d_i^2}$$
where d_i is the distance from measurement point i to the centre of the cell
(De By, 2001: 333). The squares in Figure 7.3 show the moving window and its
centre, whilst the crosses (+) show the measurement points with their values
and distances to the centre. Some measurement points are inside and some are
outside the window (see Figure 7.4).
Figure 7.4: The inverse distance weighting technique (De By, 2001:
334).
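The two averaging functions can be compared for a single cell with a short Python sketch. The measurement points, the circular kernel and its radius are hypothetical, and the exponent of 2 on the distance is an assumption (a common choice) rather than a value stated in this unit.

```python
from math import hypot

# Hypothetical measurement points: (x, y, measured value)
measurements = [(1.0, 1.0, 10.0), (3.0, 1.0, 20.0), (2.0, 4.0, 40.0), (9.0, 9.0, 100.0)]


def in_kernel(cx, cy, points, radius):
    """Return (distance, value) pairs for points inside a circular kernel at (cx, cy)."""
    pairs = [(hypot(x - cx, y - cy), m) for x, y, m in points]
    return [(d, m) for d, m in pairs if d <= radius]


def arithmetic_mean(selected):
    """Plain average: every selected measurement counts equally."""
    return sum(m for _, m in selected) / len(selected)


def idw(selected, power=2):
    """Inverse distance weighted average; power=2 is an assumed, common choice."""
    for d, m in selected:
        if d == 0:          # a measurement exactly at the cell centre decides the value
            return m
    weights = [1.0 / d ** power for d, _ in selected]
    return sum(w * m for w, (_, m) in zip(weights, selected)) / sum(weights)


# One cell centre at (2, 2) with a kernel radius of 3 map units.
selected = in_kernel(2.0, 2.0, measurements, radius=3.0)
mean_value = arithmetic_mean(selected)
idw_value = idw(selected)
```

Three of the four points fall inside the kernel. The plain mean is about 23.3, while the IDW estimate is 20, because the point with value 40 lies further from the cell centre and therefore carries less weight. A full interpolation would repeat this computation for every cell and would also need a rule for cells whose kernel contains no measurements.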

In the moving window averaging technique, parameters such as raster
resolution, shape and size of the kernel, selection criteria and averaging
functions are considered. In terms of raster resolution, too large a cell size
will smooth the function too much, which removes local variations. Too small a
cell size results in large clusters of equally valued cells, which adds little
value. A kernel shape that ensures that each raster cell will have its window
include the same number of measurement points is ideal. The shape can be
rectangular, square, circular or elliptical. Small kernels tend to exaggerate
local extreme values, whereas large windows have a smoothing effect on the
predicted field values. In terms of selection criteria, one may select at most
the five nearest measurements, or use all measurements only where there are,
for example, at least three measurements in the window. Averaging functions
also have an implication on the result: different weighting functions may
result in different values in the output maps.
Kriging
Kriging is an advanced interpolation technique. It belongs to the field of
geostatistics and it is used to estimate values from limited sample measurements.
It is similar to IDW interpolation in that both weight surrounding values to
derive a value for an unmeasured location. Unlike IDW, however, kriging also
looks at the overall spatial arrangement of the measured points. In addition,
it considers the spatial variation between their values to derive values for an
unmeasured location.
There are two important steps in kriging. The first step is to create a
semi-variogram, which compares successive pairs of point measurements.
Secondly, weights are calculated: the semi-variogram is used to calculate the
weights used in interpolation (De By, 2001: 93; 337; Longley et al.,
2004: 336).
Activity 7.3
1. Discuss the methods of interpolating continuous data.
2. Describe the methods of interpolating continuous data.

7.4 Summary
In this unit we covered the methods of transforming data in a GIS. Here we
covered interpolation of continuous and discrete data. Under interpolation
of discrete data we covered nearest neighbour interpolation and Thiessen
polygons, whilst under interpolation of continuous data we covered
techniques such as trend surface fitting, triangulation, moving window
averaging and kriging. Now you are required to read widely on these spatial
data transformation techniques.

References
Burrough, P.A. and McDonnell, R.A. (1989). Principles of Geographic
Information Systems, London, Oxford University Press.
De By, R.A. (Ed.) (2001). Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, ITC Educational Textbook Series.
Mather, P.M. (Ed.) (1993). Geographical Information Handling- Research
and Applications, Chichester, John Wiley and Sons Ltd.

Unit Eight
Spatial Data Analysis I
8.1 Introduction
In this unit we introduce you to spatial data analysis. Remember, in Unit
One we learnt that data analysis is one of the functions of GIS. In this
unit, we shall learn the various techniques of spatial data analysis. These
techniques include querying, buffering, classification, reclassification and
neighbourhood functions.
8.2 Objectives
By the end of this unit, you should be able to:
• explain data analysis techniques in GIS
• describe how features are measured in GIS
• explain how spatial and attribute data are queried in GIS
• perform spatial data analysis through the use of operators
• describe how networks are analysed in GIS
8.3 Spatial Data Analysis

Once the data input process is complete and your GIS layers are pre-
processed, you can begin the analysis stage. Analysing geographic data requires
critical thinking and reasoning. You look for patterns, associations, connections,
interactions, and evidence of change through time and over space. GIS helps
you analyse the data sets and test for spatial relationships, but it does not
replace the necessity for you to think spatially.
By integrating GIS layers, you can ask the spatial questions outlined in Unit 1:
"What is at…?", "Where is it…?", "What has changed since?", "What spatial
patterns exist?", and "What if…?" The first two of these questions inventory
features and minimally examine feature location and relationships. The last
three questions are more complex. To answer these questions, you must use
or string together some of the analytical functions that you will learn about in
this unit. There is a wide range of functions for data analysis available in
most GIS packages, including measurement techniques, attribute queries,
proximity analysis, overlay techniques and the modelling of surfaces and
networks. Spatial data analysis can be done through the use of operators,
namely logical operators and mathematical operators.
8.3.1 Logical operators

The operators that fall under the category of logical operators are: Boolean,
conditional and relational operators.

a) Boolean Operators
Figure 8.1 Boolean Operators (Source: www.geo.hunter.cuny.edu)
Figure 8.1 shows Venn diagrams illustrating the Boolean operators AND, OR,
XOR and NOT, which can be used in carrying out spatial analysis. Table
8.1 gives some examples of how these operators can be used in spatial data
analysis. For example, in Table 8.1 the Boolean operator AND requires you to
find hotels where both conditions A and B are met, condition A being hotels in
the luxury category and condition B being hotels with more than 20 bedrooms.

Table 8.1: Using Boolean Operators

Boolean operator    Example
A AND B             Which hotels are in the luxury category and have more than 20 bedrooms?
A OR B              Which hotels are in the luxury category or have more than 20 bedrooms?
A NOT B             Which hotels are in the luxury category but do not have more than 20 bedrooms?
A XOR B             Which hotels are either in the luxury category or have more than 20 bedrooms, but not both?
(A AND B) OR C      Which hotels are in the luxury category and have more than 20 bedrooms, or have more than 5 swimming pools?
A AND (B OR C)      Which hotels are in the luxury category and have either more than 20 bedrooms or more than 5 swimming pools?
(Adapted from: www.geo.hunter.cuny.edu)
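The hotel questions can be expressed directly as Boolean tests on attribute values. The sketch below uses Python with four hypothetical hotel records; condition A is "luxury category" and condition B is "more than 20 bedrooms", as in the table.

```python
# Hypothetical hotel records used only to illustrate the operators.
hotels = [
    {"name": "Hotel 1", "category": "luxury", "bedrooms": 35},
    {"name": "Hotel 2", "category": "luxury", "bedrooms": 12},
    {"name": "Hotel 3", "category": "budget", "bedrooms": 48},
    {"name": "Hotel 4", "category": "budget", "bedrooms": 8},
]


def is_luxury(hotel):
    """Condition A: the hotel is in the luxury category."""
    return hotel["category"] == "luxury"


def is_large(hotel):
    """Condition B: the hotel has more than 20 bedrooms."""
    return hotel["bedrooms"] > 20


def select(condition):
    """Return the names of all hotels for which the condition is true."""
    return [h["name"] for h in hotels if condition(h)]


a_and_b = select(lambda h: is_luxury(h) and is_large(h))
a_or_b = select(lambda h: is_luxury(h) or is_large(h))
a_not_b = select(lambda h: is_luxury(h) and not is_large(h))
a_xor_b = select(lambda h: is_luxury(h) != is_large(h))  # exactly one condition holds
```

For these records, AND returns only Hotel 1, OR returns Hotels 1 to 3, NOT returns Hotel 2 and XOR returns Hotels 2 and 3.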
b) Conditional Operators (IF)
A conditional operator evaluates an expression and returns option1 if the
result is true, or else returns option2. An example of an iff statement is:
• Iff (A='forest', true?)
Figure 8.2 illustrates the areas that have forest on map C1; where there
are zeros, the area is not covered by forest.
Figure 8.2: Conditional Expressions (Source: De By., 2001:389)
c) Relational Operators:
> Greater than
< Less than
>= Greater than or equal to
<=Less than or equal to

Figure 8.3: Use of Relational and Boolean Operators (Source: De By,
2001: 388)
Figure 8.3 illustrates the following:
Map D1: areas with forest (A) and less than 500m (B)
Map D2: areas with forest (A) or less than 500m (B)
Map D3: areas with either forest (A) or less than 500m (B), but not both
Map D4: areas with forest (A) and not less than 500m (B)
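The conditional, relational and Boolean operators can be applied cell by cell to rasters. The following Python sketch uses two small hypothetical rasters, A (land cover) and B (a numeric layer in metres); the cell values are not those of the figures, and returning 1 and 0 from the conditional operator is an assumption made for illustration.

```python
# Hypothetical 3 x 3 input rasters: land cover (A) and a numeric layer in metres (B).
A = [["forest", "forest", "grass"],
     ["forest", "grass", "grass"],
     ["lake", "forest", "grass"]]
B = [[400, 600, 300],
     [450, 700, 520],
     [480, 490, 800]]


def cellwise(op):
    """Apply op(a, b) to each pair of corresponding cells; store 1 for true, 0 for false."""
    return [[int(op(a, b)) for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Conditional operator: 1 where the cell is forest, otherwise 0.
C1 = [[1 if a == "forest" else 0 for a in row] for row in A]

# Relational (<) and Boolean operators combined, in the spirit of maps D1 to D4.
D1 = cellwise(lambda a, b: a == "forest" and b < 500)        # AND
D2 = cellwise(lambda a, b: a == "forest" or b < 500)         # OR
D3 = cellwise(lambda a, b: (a == "forest") != (b < 500))     # XOR
D4 = cellwise(lambda a, b: a == "forest" and not b < 500)    # AND NOT
```

Each output raster has the same dimensions as the inputs, because every output cell depends only on the cells at the same position in A and B.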
8.3.2 Mathematical operators

a) Arithmetic (+, -, ×, ÷)
b) Trigonometric (sin, cos, tan, asin, atan)
c) Logarithmic and exponential (log, log2, log10, exp, exp2)
d) Powers (x², x³, etc.)

Figure 8.4: Use of mathematical operators (Source: De By, 2001:383)
Figure 8.4 shows how mathematical operators can be used in a GIS. For
example, in maps C1 and C2 addition has been done. In C3, subtraction,
addition and multiplication have been used.
Activity 8.1
1. Define the term spatial data analysis.
2. Discuss the three types of logical operators used in GIS.
3. Explain the four types of mathematical operators used in data analysis.
8.4 Measurement of Vector Data

Geometric measurements on vector data include distance and area size.
Measuring the distance between features is an important function. If both
features are points, say p and q, the computation in a Cartesian spatial
reference system is given by the well-known Pythagorean distance function:
$$\mathrm{Distance}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$
Area size measurements are used when one wants to sum up the area sizes of
all polygons belonging to the same class. The class could be a crop type; for
example, what is the size of the area covered by potatoes? If our crop
classification is in a stored data layer, the computation could include:
• selecting the potato areas
• summing up their (stored) area sizes
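Both measurements can be sketched in a few lines of Python. The coordinates, the field records and the area unit (hectares) are hypothetical and serve only to show the two computations described above.

```python
from math import hypot


def distance(p, q):
    """Pythagorean distance between two points given as (x, y) tuples."""
    return hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical polygon records with stored area sizes in hectares.
fields = [
    {"id": 1, "crop": "potatoes", "area_ha": 12.5},
    {"id": 2, "crop": "maize", "area_ha": 30.0},
    {"id": 3, "crop": "potatoes", "area_ha": 7.5},
]

# Step 1: select the potato areas. Step 2: sum up their stored area sizes.
potato_fields = [f for f in fields if f["crop"] == "potatoes"]
potato_area = sum(f["area_ha"] for f in potato_fields)

example_distance = distance((0.0, 0.0), (3.0, 4.0))
```

Here the two potato fields add up to 20 hectares, and the distance between (0, 0) and (3, 4) is 5 map units.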

8.5 Measurements on Raster Data

Measurements on raster data are simpler because of the regularity of the
cells. The area size of a cell is constant and is determined by the cell
resolution. The location of an individual cell is derived from the raster's
anchor point, the cell resolution and the position of the cell in the raster.
The area size of a selected part of the raster (a group of cells) is calculated
as the number of cells multiplied by the cell area size. The distance between
two raster cells is the standard distance function applied to the locations of
their respective midpoints, taking into account the cell resolution.
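These raster measurements follow directly from the anchor point and the cell resolution. The sketch below is in Python; the anchor coordinates, the 30m cell size and the convention that the anchor is the upper-left corner with rows counted downwards are all assumptions made for illustration.

```python
from math import hypot

# Hypothetical raster definition: the anchor point is the upper-left corner.
ANCHOR_X, ANCHOR_Y = 300000.0, 7800000.0
CELL_SIZE = 30.0  # cell resolution in metres


def cell_centre(row, col):
    """Location of a cell midpoint, derived from anchor, resolution and position."""
    x = ANCHOR_X + (col + 0.5) * CELL_SIZE
    y = ANCHOR_Y - (row + 0.5) * CELL_SIZE  # rows are counted downwards from the anchor
    return x, y


def area_of_cells(cell_count):
    """Area of a group of cells: number of cells multiplied by the cell area."""
    return cell_count * CELL_SIZE ** 2


def cell_distance(cell_a, cell_b):
    """Distance between the midpoints of two cells given as (row, col) tuples."""
    (xa, ya), (xb, yb) = cell_centre(*cell_a), cell_centre(*cell_b)
    return hypot(xa - xb, ya - yb)


first_centre = cell_centre(0, 0)
group_area = area_of_cells(10)                  # ten selected cells
separation = cell_distance((0, 0), (3, 4))      # 3 rows and 4 columns apart
```

Ten cells of 30m by 30m cover 9000 square metres, and two cells that are 3 rows and 4 columns apart are 150m from each other.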
Activity 8.2
1. Explain how measurements are done in:
a) Vector data
b) Raster data
8.6 Spatial Queries

Performing queries on a GIS database to retrieve data is essential in most
GIS projects. Queries offer a better method of data retrieval and can be
performed on data that are part of the GIS database or on new data produced
as a result of analysis. There are two types of queries that can be performed
in a GIS: spatial and attribute queries. A spatial query is used for the
retrieval of information without changing the original data or creating new
data. A spatial query can be used to analyse spatial relationships by analysing
containment, proximity, adjacency and intersection.
Intersection is when features share the same geographical space. It refers to
features that are not disjoint. Examples are:
a) Lines that intersect other lines, such as pipelines that intersect faults.
b) Polygons that intersect lines such as faults, for example, to find out
which businesses are vulnerable to earthquakes.
c) Polygons that intersect other polygons.
Adjacency is the meet relationship. It expresses that features share a common
boundary and therefore applies only to line and polygon features.
Proximity concerns features that lie within a given distance of other features,
for example, which features are within a distance of 100m of the main road?
Containment expresses that a feature is wholly contained within another
feature.

Query of attribute data
An attribute query selects features based on their attribute values. It involves
picking features based on query expressions, which use Boolean algebra (and,
or, not), set algebra (>, <, =, >=, <=), arithmetic operators (+, -, *, /) and
user-defined values. Simply put, the GIS compares the values in an attribute
field with a query expression that you define.
Activity 8.3
1. Define the terms: attribute query and spatial query.
2. Explain the following terms:
a) intersection
b) proximity
c) adjacency
d) containment
8.7 Classifications
Classification is a technique of purposefully removing detail from an input
data set, in the hope of revealing important patterns (of spatial
distribution). In the process, we produce an output data set, so that the input
set can be left intact. We do so by assigning a characteristic value to each
element in the input set, which is usually a collection of spatial features
that can be raster cells, points, lines or polygons.
8.7.1 Reclassification
The input data set may itself have been the result of a classification; in such
a case, we call the operation reclassification. For example, we
may have a soil map that shows different soil type units and we would like to
show suitability for a specific crop. In this case, it is better to assign to
the soil units an attribute of suitability for the crop.
Another reason to reclassify is to assign values of preference, sensitivity,
priority or some similar criterion. For example, a certain soil type may be
good for a building suitability model. But for erosion, animal habitat or
identifying farm land, that same soil type will have a different suitability
weighting based on the problem at hand.

When identifying areas most at risk of flooding, the input rasters might be
slope, soil type and vegetation. Each of these rasters might be reclassified on
a scale of 1 to 10 depending on the susceptibility of each attribute in each
raster to flooding; that is, steep slopes in the slope raster might be given a
value of 10 because they are most susceptible to flooding. There are usually
four steps in producing a suitability map (see the illustration in Figure 8.5):
1. Input datasets. Decide which datasets you need as inputs.
2. Derive datasets. When applicable, create the datasets that you can
derive from your base input datasets, for example, slope and aspect
can be derived from the elevation raster. Create data from existing data
to gain new information.
3. Reclassify datasets. Reclassify each dataset to a common scale (for
example, 1 to 10), giving higher values to more suitable attributes.
4. Weight and combine datasets. Weight datasets that should have more
influence in the suitability model if necessary then combine them to find
the suitable locations.
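Steps 3 and 4 can be sketched in Python as follows. The two input rasters, the class breaks, the scores and the weights are all hypothetical; they only show how reclassifying to a common scale and then weighting and combining might look in code.

```python
# Hypothetical 2 x 3 input rasters: slope in degrees and distance to recreation sites in metres.
slope_deg = [[2, 8, 25], [5, 15, 40]]
dist_to_rec_m = [[100, 900, 2500], [400, 1600, 3000]]


def reclassify(raster, breaks, scores):
    """Replace each cell value by a score; scores has one more entry than breaks.

    A value gets the score of the first break it does not exceed, or the last
    score if it exceeds every break.
    """
    def score(value):
        for upper, s in zip(breaks, scores):
            if value <= upper:
                return s
        return scores[-1]
    return [[score(v) for v in row] for row in raster]


def weighted_sum(rasters, weights):
    """Combine rasters cell by cell, multiplying each by its weight."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    return [[sum(w * r[i][j] for r, w in zip(rasters, weights)) for j in range(cols)]
            for i in range(rows)]


# Step 3: reclassify both rasters to a common 1 to 10 scale (higher is more suitable).
slope_score = reclassify(slope_deg, breaks=[5, 15, 30], scores=[10, 7, 3, 1])
rec_score = reclassify(dist_to_rec_m, breaks=[500, 1500, 2500], scores=[10, 7, 4, 1])

# Step 4: give distance to recreation sites twice the influence of slope, then combine.
suitability = weighted_sum([slope_score, rec_score], weights=[1, 2])
```

Cells with the highest combined value are the most suitable locations under these assumed scores and weights.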
Figure 8.5 is a flow diagram of a sample model for finding the best locations
for a school, built by following the four steps above. The input base layers
are land use, elevation, recreation sites and existing schools. The derived
datasets are slope, distance to recreation sites and distance to existing
schools. Each raster is then reclassified on a scale of 1 to 10. The
reclassified rasters are added together, with distance from recreation sites
and distance from other schools having a higher weight.

Figure 8.5: User Controlled Classification
(http://maps.unomaha.edu/Peterson/gisII/Labs/NewSchool/NewSchool.htm)
In user controlled classification, the user selects the attribute(s) to be
used as the classification parameter(s) and defines the classification method.
The latter involves declaring the number of classes as well as the
correspondence between old attribute values and new classes. This can be done
through a classification table.
8.7.2 Automatic classification

GIS software can perform automatic classification, in which the user only
specifies the number of classes in the output data set and the system
automatically determines the class breaks. This can be done through the equal
interval and equal frequency techniques.
a) In the equal interval technique, the minimum and maximum values of
the classification parameter are determined and the (constant) interval
size for each category is calculated as (Vmax - Vmin)/n, where n is the
number of classes chosen by the user.
b) In the equal frequency technique, or quantile classification, the
objective is to create categories with roughly equal numbers of features
per category. The total number of features is determined first and, from
the required number of categories, the number of features per category is
calculated. The class break points are then determined by counting off
the features in order of classification parameter values.
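The difference between the two techniques shows up clearly on skewed data. The following Python sketch computes the class breaks for both; the twelve attribute values are hypothetical, and treating a break as the upper limit of a class is an assumption about how the boundaries are handled.

```python
def equal_interval_breaks(values, n):
    """Upper limits of the first n - 1 classes, each (Vmax - Vmin) / n wide."""
    vmin, vmax = min(values), max(values)
    width = (vmax - vmin) / n
    return [vmin + i * width for i in range(1, n)]


def equal_frequency_breaks(values, n):
    """Upper limits of the first n - 1 classes when features are counted off in order."""
    ordered = sorted(values)
    per_class = len(ordered) / n
    return [ordered[round(i * per_class) - 1] for i in range(1, n)]


def class_counts(values, breaks):
    """Number of features per class; a value equal to a break falls in the lower class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[sum(v > b for b in breaks)] += 1
    return counts


# Hypothetical, strongly skewed attribute values for twelve features.
values = [1, 2, 2, 3, 4, 5, 9, 10, 20, 40, 60, 100]

interval_breaks = equal_interval_breaks(values, 3)
frequency_breaks = equal_frequency_breaks(values, 3)
interval_counts = class_counts(values, interval_breaks)
frequency_counts = class_counts(values, frequency_breaks)
```

With three classes, the equal interval breaks at 34 and 67 put nine of the twelve features in the first class, whereas the equal frequency breaks at 3 and 10 put four features in each class.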
Activity 8.4
1. Differentiate between classification and reclassification.
2. Distinguish between user controlled classification and automatic
classification.
8.8 Overlay Functions

One of the key functions of a GIS is its ability to integrate data from two
sources using map overlays. Using GIS makes it possible to take two different
thematic map layers of the same area and overlay them one on top of the
other to form a new layer. Standard overlay operators take two input data
layers and assume that they are georeferenced in the same system and overlap
in the study area. The principle of spatial overlay is to compare the
characteristics of the same location in both data layers and to produce a
result for each location in the output data layer. The specific result to
produce is determined by the user. It could involve calculations and logical
operations. There are two types of overlays, namely vector based overlay and
raster based overlay.
8.8.1 Vector based overlay

Vector based overlays are more complicated than raster based overlays because
the data is stored as points, lines and polygons with topology, and overlaying
them requires complex geometrical operations. There are three main types of
vector overlay, namely point in polygon overlay, line in polygon overlay and
polygon on polygon overlay.

Point in polygon overlay is used to find out the polygon in which a point
falls. For example, one would like to identify the boreholes in a village.
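A point in polygon test can be sketched with the ray casting method, which is one common way of doing it and is a swapped-in technique here rather than one named in this module. The Python example below counts how many polygon edges are crossed by a horizontal ray drawn from the point; the village boundary and borehole coordinates are hypothetical, and points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: a point is inside if a ray to the right crosses an odd number of edges."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]      # next vertex, wrapping around to the first
        if (y1 > y) != (y2 > y):           # the edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                # the crossing lies to the right of the point
                inside = not inside
    return inside


# Hypothetical village boundary (vertices in order) and borehole locations.
village = [(0, 0), (10, 0), (10, 6), (4, 10), (0, 6)]
boreholes = {"BH1": (2, 3), "BH2": (12, 4), "BH3": (5, 8), "BH4": (9, 9)}

boreholes_in_village = sorted(
    name for name, (x, y) in boreholes.items() if point_in_polygon(x, y, village)
)
```

For this boundary, boreholes BH1 and BH3 fall inside the village, while BH2 and BH4 fall outside it.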
Line in polygon overlay is more complicated than point in polygon overlay.
Imagine we want to know where roads pass through forest areas to plan for
a scenic drive. To do this, we need to overlay the road data on a data layer
containing forest polygons. The output contains roads split into smaller
segments representing roads in the forest areas and roads outside forest areas.
Topological information must be retained, which makes this overlay more
complex.
Polygon on polygon overlay can be used, for example, to examine the areas of
forestry in the Victoria Falls resort area. In this case, we have two input
data layers, namely the forest data layer and the resort boundary layer. Both
data layers are polygons. This technique combines not only the spatial
characteristics of the polygons but their attributes as well.
8.8.2 Raster based overlay

Raster based overlay is performed pixel by pixel on the maps being worked
on. Several maps can be combined at the same time using arithmetic, relational
and conditional operators, as shown in Figure 8.6. One can produce an output
raster using this expression:
Suitability := iff((Landuse = "forest" AND Geology = "alluvial") OR
(Landuse = "grass" AND Geology = "shale"), "suitable", "unsuitable")
From Figure 8.6, you can observe that the output raster shows the areas that
are suitable for the desired model. Forests on alluvial terrain and grassland
on shale are considered suitable.
Figure 8.6: Raster Based Overlay (Source: De By, 2001: 391)
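The suitability expression can be evaluated pixel by pixel in a few lines of Python. The two small rasters below are hypothetical and are not the rasters of Figure 8.6; the rule applied is the one stated in the text, namely that forest on alluvial terrain and grassland on shale are suitable.

```python
# Hypothetical 2 x 3 input rasters covering the same area.
landuse = [["forest", "grass", "grass"],
           ["forest", "forest", "urban"]]
geology = [["alluvial", "shale", "alluvial"],
           ["shale", "alluvial", "shale"]]


def suitability_cell(lu, geo):
    """Evaluate the overlay rule for one pair of corresponding cells."""
    ok = (lu == "forest" and geo == "alluvial") or (lu == "grass" and geo == "shale")
    return "suitable" if ok else "unsuitable"


# Compare the two layers cell by cell and write the result to the output raster.
suitability = [
    [suitability_cell(lu, geo) for lu, geo in zip(lu_row, geo_row)]
    for lu_row, geo_row in zip(landuse, geology)
]
suitable_count = sum(row.count("suitable") for row in suitability)
```

Three of the six cells satisfy the rule in this small example.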

Activity 8.5
1. Explain how you would carry out overlay functions.
2. Differentiate between vector based overlays and raster based overlay.
3. Describe how you would use the overlay techniques in selecting a
suitable nuclear waste repository.
4. Using examples, explain the importance of spatial data analysis in GIS.
8.9 Summary
In this unit, we discussed spatial data analysis. We learnt the various
techniques of data analysis, namely querying, buffering, classification and
reclassification. In the next unit, we learn about
neighbourhood analysis as another data analysis method.

References
De By, R.A. (Ed.) (2001). Principles of Geographic Information Systems: An
Introductory Textbook, Enschede, ITC Educational Textbook Series.
Geographical Information Systems Volume 1: Principles and
www.geo.hunter.cuny.edu, Accessed 20/11/2012.
http://maps.unomaha.edu/Peterson/gisII/Labs/NewSchool/NewSchool.htm,

Unit Nine
Spatial Data Analysis II:

Neighbourhood Analysis and
Network Analysis
9.1 Introduction
In this unit, we discuss neighbourhood analysis. Under neighbourhood
analysis, we cover proximity computations (buffer zone generation and
Thiessen polygon generation), spread and diffuse computations, seek
computations, spatial operations on continuous surfaces and
applications of neighbourhood analysis. In this unit we also introduce you to
another method of analysing data. This method is called network analysis.
We focus on optimal path finding, trace analysis, network partitioning and
network allocation.
9.2 Objectives
By the end of this unit, you should be able to:
• define neighbourhood analysis
• discuss the methods of neighbourhood analysis
• explain the applications of neighbourhood analysis
• analyse the optimal path finding concept
• describe trace analysis
• distinguish between network partitioning and network allocation
9.3 Neighbourhood Analysis

Neighbourhood operations, also called proximity analysis, consider the
characteristics of neighbouring areas around a specific location. These functions
either modify existing features or create new feature layers, which are
influenced, to some degree, by the distance from existing features. All GIS
programs provide some neighbourhood analysis, which includes buffering,
interpolation, Thiessen polygons and various topographic functions. The
principle in neighbourhood analysis is to find out the characteristics of
the vicinity of a particular object.
In order to perform neighbourhood analysis, we must:
• state which target locations are of interest to us and define their spatial
extent
• define how to determine the neighbourhood for each target
• define the characteristics that must be computed for each neighbourhood,
for example, the area within 2km travel distance, all schools within 10km
of the city centre, or all hospitals within 20km of your homestead (De
By et al., 2001: 392).
These are typical questions in an urban setting. When our interest is more in
natural phenomena, different examples of location and neighbourhood
characteristics arise. In raster data, neighbourhood characteristics are often
obtained via statistical summary functions that compute values such as the
average, minimum, maximum and standard deviation of the cells in the
identified neighbourhood. The neighbourhood can also be determined by making
use of a geometric distance function. Geometric distance does not take
direction into account, although certain phenomena can only be studied by
doing so.
Neighbourhood analysis also covers interactive spatial selection techniques,
for example, selecting features that overlap (intersect, meet, contain or are
contained). In this section, we cover proximity computations, spread
computations and seek computations.
9.3.1 Proximity computations

In proximity computations, we make use of geometric distance in order to
determine the neighbourhood of one or more target locations. Geometric
distance does not take direction into account, so phenomena that depend on
direction, such as pollution spread by rivers, groundwater flow or prevailing
weather systems, are better studied with the spread computations in Section
9.3.2. The most common proximity computations are buffer zone generation and
Thiessen polygon generation.
Buffer zone generation
Buffering creates physical zones around features. These "buffers" are usually
based on specific straight-line distances from selected features. Buffers,
common to both raster and vector systems, are created around point, line or
polygon features.
In buffer zone generation, we select one or more target locations and then
determine the area around them within a certain distance. For example, roads
and rivers can be selected as targets and a certain distance around them can
be defined as the buffer. For instance, in the case of stream bank
cultivation, one may define a 30 m buffer as per the laws of Zimbabwe. Figure
9.1 is an example of buffer zone generation (De By et al., 2001:396). In
Figure 9.1a, distances of 25 m and 75 m were selected for minor and major
roads respectively. In Figure 9.1b, zoned buffers of 100 m, 200 m and 300 m
from roads were selected.
Figure 9.1: Buffer zones around streets (roads) (De By et al., 2001:396)

In vector-based buffer generation, the buffers themselves become polygon
features. Buffers are usually separate data layers that can be used for further
analysis. On rasters, the target locations are always represented by a
selection of the raster's cells, and geometric distance is defined using the
cell resolution as the unit. The distance function applied is the Pythagorean
distance between the cell centres.
Once complete, buffer layers are used to determine which features (in other
layers) occur either within or outside the buffers (spatial queries), to perform
overlay, or to measure the area of the buffer zone. In some instances zoned
buffers must be determined, for instance in assessments of traffic noise effects.
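The raster case described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular GIS package: the function name `raster_buffer`, the 5 x 5 raster, the 10 m cell size and the 15 m buffer distance are all assumptions made for the example.

```python
import math

def raster_buffer(targets, n_rows, n_cols, cell_size, distance):
    """Return a 0/1 grid marking cells whose centre lies within `distance`
    of the centre of any target cell (Pythagorean distance)."""
    buffer_grid = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            for tr, tc in targets:
                # distance between cell centres, converted to map units
                d = math.hypot(r - tr, c - tc) * cell_size
                if d <= distance:
                    buffer_grid[r][c] = 1
                    break
    return buffer_grid

# Hypothetical example: one target cell in the middle of a 5 x 5 raster
# of 10 m cells, buffered by 15 m.
grid = raster_buffer([(2, 2)], 5, 5, cell_size=10, distance=15)
for row in grid:
    print(row)
```

Cells marked 1 form the buffer zone. A zoned buffer could be produced in the same way by running the function once for each distance.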
Thiessen polygon generation
Thiessen polygons are boundaries created around point objects within a point
layer. The resultant polygons form around each of the points, and they
delineate territories in which any location inside the polygon is closer to
the internal point (that created it) than to any other point in the layer.
Attributes associated with each point are assigned to the resultant polygon.
It is both a vector and a raster process, but for more than one attribute,
raster systems must use multiple layers.
Thiessen polygons make use of geometric distance for determining
neighbourhoods. This is useful if we have a spatially distributed set of
points as target locations and we want to know, for each location in the study
area, which target it is closest to. The technique generates a polygon around
each target location that identifies all those locations that belong to that
target. Figure 9.2 is an illustration of a Thiessen polygon (De By et al.,
2001:398). Thiessen polygons are constructed as follows:
 All points are triangulated into a triangulated irregular network (TIN)
that meets the Delaunay criterion. The Delaunay criterion ensures that
no vertex lies within the interior of any of the circumcircles of the triangles
in the network.
 The perpendicular bisectors of each triangle edge are generated,
forming the edges of the Thiessen polygons. The locations at which the
bisectors intersect determine the locations of the Thiessen polygon
vertices.
 The Thiessen polygons are built to generate polygon topology. The
locations of the points are used as the label points for the Thiessen
polygons.

Figure 9.2: Thiessen Polygon Construction (De By, 2001:399).
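The defining property of Thiessen polygons (every location belongs to its nearest point) can be illustrated on a raster without building the Delaunay triangulation. The sketch below is a brute-force raster allocation, not the vector construction listed above; the function name `thiessen_allocation` and the three stations are hypothetical.

```python
import math

def thiessen_allocation(points, n_rows, n_cols):
    """Assign each raster cell the id of its nearest point.
    points: dict mapping id -> (row, col) of the point."""
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(
                points,
                key=lambda k: math.hypot(r - points[k][0], c - points[k][1]),
            )
            row.append(nearest)
        result.append(row)
    return result

# Hypothetical rainfall stations on a 5 x 5 raster
stations = {"A": (0, 0), "B": (0, 4), "C": (4, 2)}
zones = thiessen_allocation(stations, 5, 5)
for row in zones:
    print(" ".join(row))
```

Each group of cells carrying the same letter is the raster equivalent of one Thiessen polygon. Cells that are equally close to two points are given to whichever point is listed first, which is a simplification.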
Activity 9.1
1. Define neighbourhood analysis.
2. Define proximity.
3. Describe buffer zone generation.
4. What is the importance of buffer zone generation?
5. Describe how Thiessen's polygons are generated.
9.3.2 Spread and diffuse computations

Spread computations depend on distance, direction and/or terrain
characteristics in different directions. They involve the selection of one or
more target locations. Spread computations apply, for example, when the target
location contains source material that spreads over time. The source material
may be air, water or soil pollution, radio waves emitted from a radio relay
station, commuters exiting a train station or people leaving a refugee camp
that has been opened up. In all these cases, one would not expect the spread
to occur evenly in all directions. A local resistance raster map is used to
calculate how much resistance the spreading material encounters. Each cell in
the local resistance raster indicates how difficult it is for the source
material to pass through that cell. From the source location and the local
resistance raster, the GIS computes a new raster map showing the minimal total
resistance the spread has experienced in reaching each cell. While computing
total resistances, the GIS takes proper care of path lengths. In the
illustration in Figure 9.3, the lower shaded cell in (a) is the source
location and (b) is the minimal total resistance raster computed by the GIS.
The GIS calculates the minimal cost, that is, the minimal resistance path (De
By et al., 2001:400).

Figure 9.3: Spread computation (Source: De By, 2001:401)
For example, from the shaded cell in Figure 9.3 there are three paths to its
northeast neighbour cell (with local resistance 5), named by compass
direction from the shaded cell: Path 1 = N-E, Path 2 = E-N and Path 3 = NE.
Path 1 =
Path 2 =
(Source: De By, 2001:402)
Thus, path 2 is the minimal cost path. Spread computations can be used, for
example, in siting base stations for cellphones, planning missile launches and
assessing nuclear spillage.
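A minimal total resistance raster of the kind described above can be sketched with a least-cost search over the cells. The sketch rests on two assumptions that are not taken from the module: the cost of a step between two neighbouring cells is the average of their local resistances multiplied by the step length (1 for N, S, E and W steps, the square root of 2 for diagonal steps), and the source cell itself has a total resistance of 0. The 3 x 3 resistance raster is hypothetical and is not the raster in Figure 9.3.

```python
import heapq
import math

def minimal_total_resistance(resistance, source):
    """Least accumulated resistance from `source` to every cell (8 neighbours)."""
    n_rows, n_cols = len(resistance), len(resistance[0])
    total = [[math.inf] * n_cols for _ in range(n_rows)]
    total[source[0]][source[1]] = 0.0
    queue = [(0.0, source)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if cost > total[r][c]:
            continue  # a cheaper route to this cell was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < n_rows and 0 <= nc < n_cols):
                    continue
                step = math.hypot(dr, dc)  # 1 or sqrt(2), so path length counts
                new_cost = cost + step * (resistance[r][c] + resistance[nr][nc]) / 2
                if new_cost < total[nr][nc]:
                    total[nr][nc] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return total

# Hypothetical local resistance raster, source in the lower left cell
local_resistance = [[1, 1, 1],
                    [4, 9, 1],
                    [1, 4, 1]]
total = minimal_total_resistance(local_resistance, (2, 0))
for row in total:
    print([round(v, 2) for v in row])
```

Because the search always expands the cheapest cell reached so far, each cell ends up with the resistance of its cheapest path, whether that path is direct or goes around a high-resistance cell.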
Activity 9.2
1. Define spread computations.
2. Describe the concept of spread and diffuse computations.
9.3.3 Seek computations

Spread computations determine how a phenomenon spreads over an area, in
principle in all directions, though with varying difficulty or resistance.
Seek computations deal with phenomena that do not spread in all directions but
instead choose a least-cost path in a specific direction. An example is a
drainage pattern, where water chooses a way to leave the area.
For each cell in the raster, the steepest downward slope to a neighbour cell
is calculated and its direction is stored in a new raster map. The computation
determines the elevation difference between the cell and a neighbour cell and
takes into account cell distance: 1 for neighbour cells in the N-S and W-E
directions and the square root of 2 for neighbour cells in the NE-SW and NW-SE
directions (refer to Figure 9.4). Amongst the neighbour cells, the GIS picks
the one with the steepest path. From the flow direction raster, the GIS can
compute an accumulated flow count raster map, which indicates how many cells
have water flowing into a particular cell. Cells with high flow counts have
concentrated flow and thus belong to a stream, while cells with a count of
zero are local topographic highs (De By et al., 2001:403).
Seek functions are used to delimit streams. In delimiting streams they
calculate the steepest slope, the flow direction raster map and the
accumulated flow count raster map.
Figure 9.4: Flow computations (a) the original elevation raster (b) the
flow direction raster computed from it (c) accumulated flow count raster
map (De By, 2001:405).
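The flow direction and accumulated flow count computations can be sketched as follows. This is a simplified illustration under stated assumptions: each cell drains to a single neighbour (the one with the steepest downward slope), cells with no lower neighbour drain nowhere, and the 3 x 3 elevation raster is hypothetical rather than the one in Figure 9.4.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def flow_direction(elev):
    """For each cell, the (row, col) of its steepest downhill neighbour, or None."""
    n_rows, n_cols = len(elev), len(elev[0])
    target = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < n_rows and 0 <= nc < n_cols:
                    # elevation difference divided by cell distance (1 or sqrt(2))
                    slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                    if slope > best:
                        best, target[r][c] = slope, (nr, nc)
    return target

def flow_accumulation(elev):
    """Number of cells whose water flows into each cell."""
    target = flow_direction(elev)
    n_rows, n_cols = len(elev), len(elev[0])
    count = [[0] * n_cols for _ in range(n_rows)]
    # Visit cells from highest to lowest so that upstream counts are complete
    cells = sorted(((r, c) for r in range(n_rows) for c in range(n_cols)),
                   key=lambda rc: elev[rc[0]][rc[1]], reverse=True)
    for r, c in cells:
        if target[r][c] is not None:
            tr, tc = target[r][c]
            count[tr][tc] += count[r][c] + 1
    return count

# Hypothetical elevation raster sloping towards the lower right corner
elevation = [[9, 8, 7],
             [8, 6, 4],
             [7, 4, 1]]
count = flow_accumulation(elevation)
for row in count:
    print(row)
```

In the output, cells with a count of 0 are local highs, while the lower right cell receives water from all eight other cells and would be classed as part of a stream.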
Activity 9.3
1. Define seek computation.
2. Describe the concept behind seek computations.
3. Differentiate between spread and seek computations.
4. Differentiate between overlay functions and neighbourhood functions.
9.3.4 Spatial operations on continuous surfaces

a) Slope percent - To calculate the percentage slope, divide the difference
between the elevations of two points (the rise, b) by the horizontal distance
between them (the run, a), then multiply the quotient by 100.

% of slope = (b/a) × 100
Example 1: b = 100, a = 100: % of slope = (100 ÷ 100) × 100 = 100
Example 2: b = 50, a = 100: % of slope = (50 ÷ 100) × 100 = 50
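b) Slope angle/degree - The slope angle is the angle whose tangent is the
ratio b/a.

slope angle = arctan(b/a)
Example 1: b = 100, a = 100: slope angle = arctan(100 ÷ 100) = 45°
Example 2: b = 50, a = 100: slope angle = arctan(50 ÷ 100) = 26.6°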
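The two slope measures can be checked with a short Python sketch. The function names are chosen for illustration only, with `rise` standing for b and `run` for a.

```python
import math

def slope_percent(rise, run):
    """Slope as a percentage: rise divided by run, times 100."""
    return rise / run * 100

def slope_degrees(rise, run):
    """Slope as an angle in degrees: the arctangent of rise over run."""
    return math.degrees(math.atan(rise / run))

print(slope_percent(100, 100), slope_percent(50, 100))
print(round(slope_degrees(100, 100), 1), round(slope_degrees(50, 100), 1))
```

Note that a 100 percent slope corresponds to 45 degrees, not 90 degrees, so the two measures must not be confused.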
c) Slope aspect calculation - the calculation of the aspect (orientation) of
the slope in degrees, between 0° and 360°, for any or all locations.
d) Slope convexity/concavity calculation - slope convexity is defined as the
change of the slope (negative when the slope is concave and positive when the
slope is convex). It can be defined as the second derivative of the field.
e) Slope length calculation - With the use of neighbourhood operations, it is
possible to calculate, for each cell, the nearest distance to the watershed
boundary (the upslope length) and to the nearest stream (the downslope
length). This information is useful for hydrological modelling.
f) Hill shading - Hill shading, which is cartographically called shaded
relief, is a lighting effect that mimics the sun to highlight hills and
valleys. Some areas appear to be illuminated while others lie in shadow.
Applying a special filter to a Digital Elevation Model (DEM) produces hill
shading (see Figure 9.7).
Figure 9.7: Topographic Functions (http://giscommons.org/)
Interpolation
Interpolation is a method of predicting or estimating pixel values at
unsampled locations based on the known values of neighbouring pixels. In
Figure 9.8, the dots are the points where values are known and the grey cells
are the values estimated from them. Since it is impractical to take
measurements at all locations across your study area due to money, time, legal
and physical constraints, you interpolate between known pixel values (sampled
locations). With interpolation, you create a continuous surface for phenomena
such as elevation, temperature and soil characteristics that occur
everywhere. Because of its continuous nature, interpolation is only available
within raster-based systems (http://giscommons.org).

Figure 9.8: Interpolating between point features (http://giscommons.org)
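The module does not name a particular interpolation method. As one possible illustration, the sketch below uses inverse distance weighting (IDW), in which nearer sample points count more than distant ones. The function name `idw`, the `power` parameter and the sample values are assumptions made for the example.

```python
import math

def idw(samples, x, y, power=2):
    """Estimate the value at (x, y) from samples given as (x, y, value) tuples,
    weighting each sample by 1 / distance**power."""
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # the location coincides with a sampled point
        weight = 1 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical temperature readings at two sampled locations
samples = [(0, 0, 10.0), (2, 0, 20.0)]
print(idw(samples, 1, 0))    # halfway between the two samples
print(idw(samples, 0.5, 0))  # closer to the first sample
```

Calling the function for the centre of every cell in a raster would produce the kind of continuous surface shown in Figure 9.8.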
g) Filtering
A filter operates by moving a "window" across the entire raster. In some
cases, the filter is used to compare values in the window with those on the
raster layer. Filters are frequently used in image processing, especially with
remotely sensed data, but also in GIS raster applications. By changing the
weights of the filter values, we can produce two major effects:
 smoothing (a "low pass" filter removes or reduces local detail)
 edge enhancement (a "high pass" filter exaggerates local detail)
In some cases, the new value for the cell at the middle of the window is a
weighted average of the values in the window.
High pass filters
High pass filters have the following characteristics:

 They highlight detail in a raster coverage.
 A 3 x 3 cell filter is created with a weighting value of 9 at the centre
cell and a weight of -1 for all the remaining cells.
 The window is moved over the raster coverage, one cell at a time, from
left to right, and the values of corresponding cells in the window and the
raster are multiplied.
 These nine new cell values are then summed to get the final filter value
for the centre cell of the window.

Low pass filters
Low pass filters have the following characteristics:

 They smooth the spatial variation in the layer.
 The 3 x 3 'roving window' often contains the value 1/9 in each of the nine
cells.
 With each pass (movement of the window by one cell), the raster values
under the filter are multiplied by these weights.
 The nine new values are then summed, which averages the cells under the
filter, and the result is assigned to the cell in the centre of the window.
 The idea is to make all the numbers closer together in value.
Figure 9.10: High pass filter and low pass filter (Adapted from: http://courses.washington.edu/gis250/lessons/raster_analysis2/index.html)
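Both filters can be tried out with the same moving-window function, changing only the weights. This is a minimal sketch: the function name `apply_filter` is hypothetical, and leaving the border cells unchanged is a simplifying assumption, since the module does not say how edges are handled.

```python
def apply_filter(raster, kernel):
    """Move a 3 x 3 window over the raster; multiply each window weight by the
    cell beneath it and sum the nine products for the centre cell."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = [row[:] for row in raster]  # border cells keep their original values
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            out[r][c] = sum(kernel[i][j] * raster[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out

LOW_PASS = [[1 / 9] * 3 for _ in range(3)]
HIGH_PASS = [[-1, -1, -1],
             [-1, 9, -1],
             [-1, -1, -1]]

# Hypothetical raster with one high value surrounded by low values
raster = [[1, 1, 1],
          [1, 10, 1],
          [1, 1, 1]]
smoothed = apply_filter(raster, LOW_PASS)
sharpened = apply_filter(raster, HIGH_PASS)
print(round(smoothed[1][1], 2), sharpened[1][1])
```

The low pass filter pulls the centre value down towards its neighbours (the average of the nine cells), while the high pass filter exaggerates the difference between the centre cell and its surroundings.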
Activity 9.4
1. What is the importance of knowing the slope percent, degree and aspect
in a crop production study?
2. Distinguish between high pass filters and low pass filters.
9.3.5 Applications of neighbourhood analysis

Products of neighbourhood analysis can be applied in:
 3-dimensional map display
 determining change of elevation through time
 automatic catchment delineation
 dynamic modelling
 visibility analysis
9.4 Network Analysis

Network analysis involves analysing flow through networks, that is, connected
sets of lines and point nodes. These linear networks most often represent
features such as rivers, transportation corridors (roads, railroads and even
flight paths) and utilities (electricity, telephone, television, sewer, water
and gas). Point nodes usually represent pickup or destination sites, clients,
transformers, valves and intersections. People, water, consumer packages,
kilowatts and many other resources flow to and from nodes along linear
features.
Each linear feature affects the resource flow. For example, a street segment
might only provide flow in one direction (a one-way street) and at a certain
speed. Nodes can also affect flow. A stuck valve might allow too much of a
resource to stream out and away from its intended destination. Network
analysis tools help you analyse the "cost" of moving through the network.
As with spread functions, "cost" can represent money, time, distance or
effort. Network analyses are vector-based applications, but there are
similarities with raster-based spread functions.
Activity 9.5
1. Explain the term network analysis.
2. Describe the importance of network analysis.
9.5 Optimal Path Finding

Optimal path finding techniques are used when a least cost path between two
nodes in a network must be found. The two nodes are called the origin and the
destination respectively. The aim is to find a sequence of connected lines to
traverse from the origin to the destination at the lowest possible cost. The
shortest path between one point and another on a network may be based on the
shortest distance. In a raster grid, impediments to travel can be added by
increasing the value of cells that are barriers to travelling and then finding
the least cost route through the grid. Networks structured in a vector GIS
offer more flexibility and a more thorough analysis of impediments such as
traffic restrictions and congestion. However, the shortest path may not be
defined simply in terms of distance. For example, for an emergency vehicle to
reach an accident, the quickest route may be needed, and this may require
traversing less congested minor roads. Several paths may be considered before
the route with the least cumulative impedance is constructed from the
intervening network.
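The emergency vehicle example can be sketched with Dijkstra's algorithm, a standard least cost path method. The module does not say which algorithm a GIS uses, so this is only one possible approach. The function name `least_cost_path`, the node names and the travel times (in minutes) are hypothetical.

```python
import heapq

def least_cost_path(network, origin, destination):
    """network: dict mapping node -> {neighbour: impedance}.
    Returns (total impedance, list of nodes), or None if no path exists."""
    queue = [(0, origin, [origin])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, impedance in network.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + impedance, neighbour, path + [neighbour]))
    return None

# Hypothetical road network: the direct main road from the station to the
# accident is congested (10 minutes), the minor roads via a junction are not.
roads = {
    "station": {"accident": 10, "junction": 3},
    "junction": {"accident": 4},
    "accident": {},
}
print(least_cost_path(roads, "station", "accident"))
```

Because each link is stored in one direction only, a one-way street is represented simply by leaving out the link in the opposite direction.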
Another example is a waste collection vehicle, which needs to visit a specific
set of clients in a day and requires the best route to do so. So the question
is: in which order should the stops be visited, and which path should be
taken between them? In GIS network analysis, the ordering of the stops can be
determined by calculating the minimum path between each stop and every other
stop in the list, based on the impedance met in the network. A trial and error
method can then be used to order the visits so that the impedance from the
first stop to the last is minimised.
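The trial and error ordering of stops can be sketched by testing every possible order, which is practical only for a handful of stops. The sketch assumes that the minimum impedance between each pair of locations has already been calculated; the depot, the three stops and the impedance values are hypothetical.

```python
from itertools import permutations

def best_stop_order(start, stops, impedance):
    """Try every order of the stops and keep the one with the lowest total
    impedance. impedance[(a, b)] is the least-cost impedance from a to b."""
    best_route, best_cost = None, float("inf")
    for order in permutations(stops):
        route = (start,) + order
        cost = sum(impedance[(a, b)] for a, b in zip(route, route[1:]))
        if cost < best_cost:
            best_route, best_cost = route, cost
    return best_route, best_cost

# Hypothetical minimum impedances between a depot (D) and three clients
pairs = {("D", "P"): 4, ("D", "Q"): 7, ("D", "R"): 9,
         ("P", "Q"): 2, ("P", "R"): 6, ("Q", "R"): 3}
# Assume the same impedance in both directions
impedance = {**pairs, **{(b, a): v for (a, b), v in pairs.items()}}

print(best_stop_order("D", ["P", "Q", "R"], impedance))
```

The number of possible orders grows very quickly with the number of stops, so for long collection rounds some shortcut method is needed instead of testing every order.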
9.6 Trace Analysis

The ability to trace flows of goods, people, services and information through
a network is another useful function of network analysis. Route tracing is
particularly useful for networks such as stream networks, sewage systems and
cable TV networks. In hydrological applications, route tracing can be used to
determine the streams contributing to a reservoir or to trace pollutants
downstream from the site of a spillage. Route tracing can also be used to find
all the customers serviced by a particular sewer main or to find those
affected by a broken cable.
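Both kinds of trace mentioned above (downstream from a spillage, and upstream from a reservoir) can be sketched as a simple walk through a directed network. The function names and the small stream network are hypothetical.

```python
from collections import deque

def trace(network, start):
    """Return every node that can be reached from `start` by following links.
    network: dict mapping node -> list of directly connected next nodes."""
    reached, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in network.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                reached.append(nxt)
                queue.append(nxt)
    return reached

def reverse_network(network):
    """Flip every link so that a trace runs upstream instead of downstream."""
    reversed_net = {}
    for node, downstream in network.items():
        for nxt in downstream:
            reversed_net.setdefault(nxt, []).append(node)
    return reversed_net

# Hypothetical stream network: each node lists what lies directly downstream
streams = {
    "mine": ["stream_1"],
    "farm": ["stream_2"],
    "stream_1": ["river"],
    "stream_2": ["river"],
    "river": ["reservoir"],
}
print(trace(streams, "mine"))                        # where a spill at the mine goes
print(trace(reverse_network(streams), "reservoir"))  # what contributes to the reservoir
```

The same approach would apply to a sewer or cable network, with customers as the nodes reached from a particular main or cable.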
Activity 9.6
1. What is trace analysis?
2. Explain how trace analysis can be used to trace pollutants from a spillage
site.
3. Explain the concept of optimal path finding.
4. Describe how you would apply optimal path finding in waste
management.

9.7 Summary
In this unit, we discussed neighbourhood analysis. Under neighbourhood
analysis, we covered proximity computations (buffer zone generation and
Thiessen's polygon generation), and spread and diffuse computations as well
as seek computations, spatial operations on continuous surfaces and
applications of neighbourhood analysis. Now you are required to read widely
on neighbourhood analysis and explore more literature and examples. In this
unit we introduced you to another method of analysing data called network
analysis. We focused on optimal path finding and trace analysis.
References
Burrough, P.A. (1986). Principles of GIS for Land Resources Assessment,
Oxford, Clarendon Press.
Burrough, P.A. and McDonnell, R.A. (1989). Principles of Geographic
Information Systems, London, Oxford University Press.
Series.
Geographical Information Systems Volume 1: Principles and
http://giscommons.org : Accessed (11/07/2012).

Unit Ten
GIS Applications
10.1 Introduction
In this unit we focus on Geographic Information Systems applications.
These include disaster risk assessment, species distribution (habitat)
modelling, vector and disease management, fleet management and route
planning, agricultural activities and waste management.
10.2 Objectives
 state the application of GIS in disaster risk assessment
 explain the application of GIS in species distribution (habitat) modelling
 discuss the application of GIS in waste management
 analyse the application of GIS in vector and disease management
 discuss the application of GIS in fleet management and route planning
 describe the application of GIS in agricultural activities
10.3 Application of GIS in Disaster Risk Management

A Geographic Information System (GIS) analytical model can be developed
to evaluate the characteristics of infrastructure damage incurred during a
certain period in a certain area. For example, GIS helps to determine whether
any spatial similarities exist that may indicate areas in which future flood
events may occur. The model can use soil type, land use, slope and stream
data. Each criterion can then be ranked as best (least likely to experience
flooding), moderate or worst (most likely to experience flash flooding). Thus,
areas with the highest risk factors (most likely to flood) can be identified.
GIS is also useful in vulnerability mapping, situation mapping and risk
mapping, and is used to identify hazards and map elements at risk. Figure 10.1
shows a lava flow hazard zone map of the island of Hawaii. The island is
divided into zones according to the degree of hazard from lava flows. Zone 1
is the area of greatest hazard and Zone 9 the area of least hazard
(http://pubs.usgs.gov/gip/hazards/maps.html).
Figure 10.1: Hazard zone areas in Hawaii (Source: http://pubs.usgs.gov/gip/hazards/maps.html)
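The flood ranking model described in this section can be sketched as a simple overlay. The sketch assumes that each criterion has already been ranked per cell as 1 (best), 2 (moderate) or 3 (worst) and that the ranks are simply added together; the module does not state how the ranks are combined, and the four 2 x 2 rank layers are hypothetical.

```python
def combined_risk(*rank_layers):
    """Add the ranks of all criteria cell by cell (a higher sum means higher risk)."""
    n_rows, n_cols = len(rank_layers[0]), len(rank_layers[0][0])
    return [[sum(layer[r][c] for layer in rank_layers) for c in range(n_cols)]
            for r in range(n_rows)]

# Hypothetical rank layers: 1 = best, 2 = moderate, 3 = worst
soil     = [[1, 3], [2, 3]]
land_use = [[1, 2], [2, 3]]
slope    = [[2, 3], [1, 3]]
stream   = [[1, 3], [1, 3]]

risk = combined_risk(soil, land_use, slope, stream)
highest = max(((r, c) for r in range(2) for c in range(2)),
              key=lambda rc: risk[rc[0]][rc[1]])
print(risk, highest)
```

The cell with the highest total is ranked worst on every criterion and would be mapped as the area most likely to flood.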

10.4 Application of GIS in Habitat Mapping (Species Distribution Modelling)
GIS can be used for habitat suitability modelling. For example, elephant
habitat can be mapped. A wildlife habitat suitability study involves an
analysis of the complex interrelationships among the various environmental
factors that exist over a geographical area. These factors include forest
type, topography, water resources and distance from centres of human
activity. Each model can be applied in a GIS in order to identify the most
suitable and moderately suitable habitats. Landsat imagery, field
investigation with GPS and topographic maps can be employed to generate the
thematic layers relevant to each model in the GIS database. Overlay
operations on these layers lead to the generation of habitat suitability
maps. Geostatistical methods, machine learning techniques, generalised linear
models and generalised additive models, to mention but a few, can be used to
map species habitats and plan conservation parks. Figure 10.2 shows an example
of a modelled buffalo habitat map that can be used for planning in wildlife
management. It shows the probability distribution of buffalo developed using
the maximum entropy technique (MAXENT), a presence-only machine learning
tool (Matawa et al., 2012: 192). On the map, dark shading means highly
suitable whilst lighter shading means least suitable.
Figure 10.2: Buffalo habitat suitability map (Matawa et al., 2012: 195)

10.5 Application of GIS in Waste Management

In waste management, GIS can be used in the following areas:
Quantity of waste generated - A map showing the waste currently generated
in different wards and sectors and along roads, streets and junctions can be
created.
Location of waste bins - Waste bins can be located using GPS and marked on
the base map. The existing locations of the waste bins, together with the
street maps, show the proximity of the bins to the waste collection service
routes. Bins that are inconvenient for the waste collection crew can be
relocated. In addition, a unique number can be allocated to each waste bin so
that it can be located easily and quickly when a complaint is registered or
for planning and maintenance.
Waste collection - The optimal path for the waste collection vehicle can be
generated. A GIS-based optimal routing model can be created using parameters
such as population density, waste generation capacity, road network, waste
storage bins and waste collection vehicles. Such a model is intended to plan a
cost-efficient waste collection route for transporting waste to the
landfills. It can be a good decision support tool for waste transport, fuel
consumption, distribution of work amongst the vehicles for load balance and
generation of work schedules for both employees and vehicles.
Location of the waste dumping ground/landfill site - A Geographical
Information System (GIS) provides an opportunity to integrate field
parameters with population and other relevant data or associated features,
which helps in the selection of suitable disposal sites. Though numerous
criteria can be used for evaluation, the ones used here represent local
factors. In a GIS, factors such as soils, topography, distance from roads,
distance from settlements and proximity to wetlands are entered as data
layers and overlaid to establish a suitable location. In order to select the
most appropriate sites for waste disposal from a set of alternatives,
Multi-criteria Evaluation (MCE) has to be incorporated. MCE is a decision
support technique in which a decision is a choice between alternatives (such
as alternative actions or land allocations). In MCE, an attempt is made to
combine a set of criteria into a single composite basis for a decision, a
score function, according to a specific objective. Weights are established
based on the degree of importance of the factors. Techniques such as
overlaying, buffering and reclassification are also used to select the most
suitable site.
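The score function used in MCE can be sketched as a weighted sum, which is one common way of combining criteria but not the only one. In the sketch below, each candidate site has already been given a standardised score between 0 (unsuitable) and 1 (highly suitable) for every factor. The weights, the site names and the scores are all hypothetical.

```python
def mce_score(scores, weights):
    """Weighted sum of standardised criterion scores for one candidate site."""
    return sum(weights[name] * scores[name] for name in weights)

# Hypothetical weights reflecting the importance of each factor (they sum to 1)
weights = {"soil": 0.3, "slope": 0.2, "road_distance": 0.2,
           "settlement_distance": 0.2, "wetland_distance": 0.1}

# Hypothetical standardised scores for three candidate landfill sites
sites = {
    "A": {"soil": 0.8, "slope": 0.6, "road_distance": 0.9,
          "settlement_distance": 0.4, "wetland_distance": 1.0},
    "B": {"soil": 0.5, "slope": 0.9, "road_distance": 0.5,
          "settlement_distance": 0.9, "wetland_distance": 0.5},
    "C": {"soil": 0.4, "slope": 0.5, "road_distance": 0.7,
          "settlement_distance": 0.6, "wetland_distance": 0.2},
}

totals = {name: mce_score(scores, weights) for name, scores in sites.items()}
best = max(totals, key=totals.get)
print({name: round(total, 2) for name, total in totals.items()}, best)
```

Changing the weights changes the ranking of the sites, which is why the weights should be agreed on with the decision makers before the overlay is run.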

10.6 Vector and Disease Management (Epidemiology)

Geographic Information Systems (GIS) are increasingly applied to studies of
vector-borne diseases for data management and analysis (Brownstein et al.,
2002: 158; Cecchi et al., 2008: 365; Kitron et al., 1996: 372), as species
distribution models facilitate interventions to eliminate vector-borne
diseases (Brownstein et al., 2002: 158) by identifying reservoirs or sources
of disease (Cecchi et al., 2008: 365; Drake et al., 2006: 424). GIS and RS
based species distribution models can be used to estimate the risk of
infection, increase the efficiency of protective measures and help understand
the factors that influence the geographic extent of disease (Rácz et al.,
2006: 370), as well as to understand the spatial epidemiology of vector-borne
human, animal and zoonotic diseases, including trypanosomiasis and malaria
(Cecchi et al., 2008: 365). Therefore, spatial distribution models of disease
vectors make it possible to implement eradication programmes effectively. GIS
offers new and expanding opportunities for epidemiology because it allows an
informed user to choose between options when geographic distributions are
part of the problem (Clarke et al., 1996: 88). Figure 10.3 is a vector habitat
map showing the probability distribution of Glossina morsitans, a tsetse fly
that transmits the trypanosomes that cause sleeping sickness in humans and
trypanosomiasis in livestock such as cattle, sheep and goats, in north-western
Zimbabwe (Matawa et al., unpublished).
Figure 10.3: Vector habitat map (Matawa et al., unpublished).

10.7 Fleet Management and Route Planning

GPS and GIS tools, including software, can be used for fleet management and
route planning by transport managers, logistics companies and emergency
services. GPS-based transport management systems make it possible to monitor
the movement of vehicles. They integrate the Global Positioning System (GPS)
and the Geographical Information System (GIS) to monitor the movement of the
fleet.
GIS is used by highway authorities, who need to decide on the location of new
routes and to keep track of highway condition. It is also used by logistics
companies (e.g., parcel delivery companies, shipping companies) to organise
their operations, to decide where to place their central sorting warehouses
and the facilities that transfer goods from one mode to another (for example,
from truck to ship), how to route parcels from origins to destinations, and
how to route delivery trucks.
Transit authorities also use GIS. They use it to plan routes and schedules, to
keep track of vehicles and to deal with incidents that delay them, and to
provide information on the system to the travelling public.
10.8 Agricultural Activities
GIS has been used to show the spatial distribution of soil properties in
agricultural fields, as shown in Figure 10.4. The conventional tillage field
has the highest average bulk density compared with the basin tillage and mulch
ripping fields. The maps in Figure 10.4 show that the conventional tillage
field has darker shading (high bulk density), while the basin tillage and
mulch ripping fields have lighter shading (low bulk density).

Figure 10.4: Spatial Distribution of bulk density in fields (a) basin tillage
(b) mulch ripping (c) conventional tillage in Ward 1, Insiza District,
Zimbabwe (Sigauke et al., 2009 unpublished)
10.9 Governance
A GIS can be used to manage populations, develop administrative boundaries,
manage the supply of public services and utilities and support development
planning. Under governance, the list is endless. For example, GIS can be used
to define census blocks for enumerating people.
Other applications include:

 inventorying and managing resources and infrastructure
 supporting government decision making at all levels
 planning transportation routing
 improving public service delivery
 managing land development
 generating revenue by increasing economic activity
 monitoring public health risk
 managing public housing stock
 allocating welfare assistance funds
 tracking crime
 supporting operational, tactical and strategic decision making in law
enforcement, health care planning and the management of education systems.
Activity 10.1
1. Explain the importance of GIS to environmentalists.
2. Discuss the application of GIS in:
a) habitat mapping
b) vector and disease management
c) fleet management and route planning
d) waste management
e) agriculture
3. Suppose you work for a city or town council. Describe how you
would use GIS in the selection of a suitable waste disposal site for that
city or town.
10.10 Summary
In this unit we focused on the applications of GIS. These included disaster
risk assessment; species distribution (habitat) modelling; waste management;
vector and disease management; fleet management and route planning;
agricultural activities; and governance.

References
Brownstein, J.S., Rosen, H., Purdy, D., Miller, J.R., Merlino, M., Mostashari,
M., Fish, D. (2002). Spatial Analysis of West Nile Virus: Rapid Risk
Assessment of an Introduced Vector-borne Zoonosis. Vector-Borne
and Zoonotic Diseases, 2(3), pp. 157-164.
Cecchi, G., Mattioli, R.C., Slingenbergh, J., de la Rocque, S. (2008). Land
cover and tsetse fly distributions in sub-Saharan Africa. Medical and
Veterinary Entomology, 22, pp. 364-373.
Clarke, K.C., McLafferty, S.L. and Tempalski, B.J. (1996). On Epidemiology
and Geographic Information Systems: A Review and Discussion of
Future Directions. Emerging Infectious Diseases, 2(2), pp. 85-92.
Drake, J.M., Randin, C., Guisan, A. (2006). Modelling ecological niches
with support vector machines. Journal of Applied Ecology, 43, pp.
424-432.
Kitron, U., Otieno, L.H., Hungerford, L.L., Odulaja, A., Brigham, W.U.,
Okello, O.O., Joselyn, M., Mohamed-Ahmed, M.M., Cook, E.
(1996). Spatial analysis of the distribution of tsetse flies in the Lambwe
Valley, Kenya, using Landsat TM satellite imagery and GIS. Journal
of Animal Ecology, 65, pp. 371-380.
Matawa, F., Murwira, A., Schmidt, K.S. (2012). Explaining elephant
(Loxodonta africana) and buffalo (Syncerus caffer) spatial distribution
in the Zambezi Valley using maximum entropy modelling. Ecological
Modelling, 242, pp. 189-197.
Matawa, F., Murwira, K.S., Shereni, W. Modelling the distribution of Glossina
spp. in the north western parts of Zimbabwe using remote sensing and
climate data, unpublished.
Rácz, G.R., Bán, E., Ferenczi, E., Berencsi, G. (2006). A simple spatial model
to explain the distribution of human tick-borne encephalitis cases in
Hungary. Vector-Borne and Zoonotic Diseases, 6(4), pp. 369-378.
Sigauke, E., Mujere, N., Mabiza, C. (2009). Effects of conventional and
conservation tillage practices on soil properties, in Ward 1, Insiza District
(unpublished).
