
Training on GIS/Lecture Notes on Geographic Information System – Basics by Dr. S.K. Pathan

GEOGRAPHIC INFORMATION SYSTEM (GIS) - BASICS

by

S.K. Pathan
Group Director, UPDG / EPSA
Space Applications Centre (ISRO), Govt. of India
Ahmedabad – 380 015.

1.0 INTRODUCTION

Land is one of the economic entities that cannot be reproduced. The pressure of a growing
population and increased demands for food, fodder and fuel wood, combined with intensive industrial
activity, have led to large scale environmental degradation and ecological imbalance. Rapid deformation
of land is another major problem. Therefore, conservation of renewable natural resources such as land and
water at micro level is a must. This essentially means that one needs a comprehensive development plan
to make productive and optimal use of land and all its natural resources. Such development
requires systematic, detailed, reliable, accurate and timely information on the extent and spatial
distribution of various natural resources and on the socio-economic, demographic and cultural patterns of
the inhabitants. The data collected on different aspects of the natural resources has to be translated into
useful information and converted into user defined formats. This information then needs to be aggregated
according to administrative and natural resource units. Experience with existing natural resource
information systems in different fields of development clearly brings out several shortcomings in
the acquisition of statistics, processing, generation of graphic outputs and their storage. These
shortcomings are a serious impediment to efficient and meaningful planning, including
implementation of programmes and monitoring of development. The latest
developments in computer hardware and software are capable of meeting these needs, i.e. handling
both alpha-numeric and graphic databases. The computer software which meets this need is the ‘Geographic
Information System (GIS)’. It is in this context that GIS plays a major role, by providing the linkage
between the spatial information domain (land parcels) and non-spatial information (attributes of land
parcels). GIS is thus a particular form of information system that is applied to geographical data.
Accordingly, a GIS has a full range of functions to achieve its purpose, covering observation, measurement,
description, explanation, forecasting and decision-making. The objective of these lecture notes is
to introduce the full range of capabilities of GIS, with specific reference to a ‘Natural Resources
Information System’.

2.0 GEOGRAPHICAL INFORMATION SYSTEM (GIS)

GIS basically refers to the science and technology dealing with the character and structure of
spatial information, its methods of capture, organisation, classification, qualification, analysis,
management, display and dissemination, as well as the infrastructure necessary for the optimal use of
the information. Technically, geographic information systems include mapping software
and its application with remote sensing, land surveying, aerial photography, mathematics,
photogrammetry, geography, and tools that can be implemented with GIS software. In the strictest
sense, the term describes any information system that integrates, stores, edits, analyzes, shares, and
displays geographic information. In a more generic sense, GIS applications are tools that allow users to
create interactive queries (user created searches), analyze spatial information, edit data and maps, and
present the results of all these operations. Geographic Information Science is the science
underlying the geographic concepts, applications and systems. In simplest terms, GIS is the merging of
graphic map entities and databases. Consumers are already familiar with one application: finding
driving directions with a GPS program on a hand-held device. In these notes, GIS is defined as "an
automated tool for the capture, storage, retrieval, manipulation, display and querying of both spatial and
non-spatial data to generate various planning scenarios for decision making".
UPDG/EPSA/Training/GIS/Lecture No.1/October 2010 Page 1

2.1 Components of GIS

The environment in which a GIS operates is defined by the hardware (the machinery, including a host
computer, a digitizer or scanner for converting the input data, a plotter for presenting processed
outputs and a video display unit through which the user commands the system), the software (programs
that tell the computer what to do) and the data. In this context, GIS can be seen as a system of hardware,
software and procedures designed to support the capture, management, manipulation, analysis, modeling and
display of spatially-referenced data for solving complex planning and management problems. In all, a GIS
consists of five key components, viz. i) hardware, ii) software, iii) data, iv) people and v) methods
(Figure-1).

2.1.1. Hardware

The hardware comprises the computer system on which the GIS software runs. The choice of
hardware ranges from personal computers (desktop/laptop) to supercomputers with teraFLOPS
capability. The hardware forms the backbone of the GIS, which gets its input from a digitizer board
(manual digitization) or from a scanner (automatic scanning of a map). A digitizer board is a flat board
used for vectorizing the objects on a given map. A scanner converts a map into a raster image for further
processing. The output of a scanner can be stored in many formats, viz. .tiff, .bmp, .jpg etc. Printers and
plotters are the most common output devices in a GIS hardware setup.

Figure-1: Components of GIS

2.1.2 Software

GIS software provides the functions and tools needed to store, analyze, and display geographic
information. Many GIS software packages are in use, viz. Arc/Info (ESRI), MapInfo, Geomedia (Intergraph),
AutoCAD Map, Smallworld, Bentley etc. The latest in the market is IGIS, which has adopted the
OGC open standard geo data model. It is a low cost, single package that supports GIS as well as image
processing functions. It is fully customized and easy to operate, with functions executed at the click of
a button. However, the user is free to choose GIS software depending upon the requirements at hand.

2.1.3 Data

Geographic data can be either collected in-house or can be purchased from a commercial vendor.
The digital input of the map forms the basic data for GIS operations. Tabular data can be attached to
each geographic feature using the unique identification number of the feature.

2.1.4 People

GIS users range from resource specialists who design and maintain the system to
administrators and planners who use the database and information system to carry out their everyday work.

2.1.5 Method

Above all, a successful GIS operates according to a systematic, well-designed plan: the models
and operating procedures unique to each research and planning organization. Various techniques
are used to create the database, integrate it, analyze it with different multivariate index models,
query and retrieve data, and present the results as outputs of good cartographic quality.

3.0 GEOGRAPHIC DATA

Geographic data represents real world objects (buildings, roads, agriculture, forest, water bodies
etc.) in digital form. Real world objects can be divided into two abstractions: discrete objects (e.g. a
building) and continuous fields (agriculture, forest, terrain etc.). Data about real world objects is
normally available in the form of i) maps/co-ordinates, ii) attributes and iii) images.

3.1 Map/Co-ordinate data: Map/coordinate data contains the location and shape of geographic
features. Maps use three basic shapes to present real-world features in the form of points, lines, and
areas (called polygons).

3.2 Attribute data: Attribute (tabular) data is the descriptive data that GIS links to map features.
Attribute data is collected and compiled for specific areas like states, census tracts, cities, and so on and
often comes packaged with map data. When implementing a GIS, the most common sources of attribute
data are your own organization's databases combined with data sets you buy or acquire from other
sources to fill in gaps.

3.3 Image data: Image data ranges from satellite images and aerial photographs to scanned
maps (maps that have been converted from printed to digital format).

4.0 DATA CAPTURE:

The entire geographic variation on the earth, as mentioned above, can be captured into GIS using
points, lines and polygons (areas). Symbols and labels are used to describe these features. Points define
the discrete locations of geographic features that are too small to be depicted as lines or areas, e.g.
telephone poles, wells, electric poles etc. Lines represent the shape of geographic objects too narrow to
depict as areas, such as roads, canals, rivers, and contours, which have length but no area. Areas are closed
polygons that represent the shape and location of homogeneous real world features such as
administrative units (state, district, taluka etc.), land use types, soil classes, vegetation categories
etc. However, the map scale dictates the type of map feature used to represent a real-world geographic
feature. For example, a feature drawn as a polygon at large scale may appear as a point at small scale,
because small scale maps depict large ground areas with low spatial resolution and show little detail. On
the other hand, large scale maps depict small ground areas with high spatial resolution and show more
detail.
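
The effect of scale on the choice of feature type can be checked with simple arithmetic. The sketch below is illustrative only: the 1 mm minimum drawable size and the function names are assumptions made for this example, not values taken from these notes.

```python
# Assumption for illustration: anything narrower than 1 mm on paper
# cannot be drawn as an area and is shown as a point instead.
MIN_DRAWABLE_MM = 1.0


def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in metres for a distance measured on the map."""
    return map_distance_mm * scale_denominator / 1000.0


def feature_type(ground_size_m, scale_denominator):
    """Return 'polygon' if the feature is wide enough to draw at this scale."""
    map_size_mm = ground_size_m * 1000.0 / scale_denominator
    return "polygon" if map_size_mm >= MIN_DRAWABLE_MM else "point"


# A water tank 200 m across, on a large scale and a small scale map.
print(feature_type(200, 10_000))      # 1:10,000   -> 20 mm on paper
print(feature_type(200, 1_000_000))   # 1:1,000,000 -> 0.2 mm on paper
```

Under these assumptions the same tank is a polygon at 1:10,000 and a point at 1:1,000,000.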

All geographic data is captured in the GIS environment either by digitizing or by
scanning the map. Data can also be captured by converting an ASCII formatted file (GPS co-ordinates,
ground survey co-ordinates etc.) and by converting digital data from different sources (Figure-2).

Figure-2 : Data Capture in GIS environment

The co-ordinates after digitization are in default digitizer units, i.e. inches. These co-ordinates have
to be transformed to real world geographic co-ordinates using the transformation functions available in the
computer system. There are about two hundred map projections available, each designed to preserve some of
the basic properties (shape, size, direction and distance) of objects when a 3D surface (the globe) is
projected on to a 2D surface (a sheet of paper); no single projection preserves all of them. The two map
projections commonly used in the country are: i) Polyconic and ii) Universal Transverse Mercator (UTM).
Latitude and longitude are commonly used as the geographic reference system. The height of an object is
measured with reference to a standard datum. The most commonly used datums with reference to topographic
maps in the country
are i) Mean sea level (Defense Series Maps) and ii) WGS84 (Open Series Maps).
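
The conversion from digitizer inches to map co-ordinates can be sketched as an affine transformation fitted to control points (tics). This is a minimal illustration, not the method of any particular GIS package: it assumes exactly three non-collinear control points, and the function names and co-ordinate values are hypothetical. Production systems typically use four or more tics with a least-squares fit and report the residual error.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det3(mc) / d)
    return solution


def fit_affine(digitizer_pts, map_pts):
    """Fit X = a*x + b*y + c, Y = d*x + e*y + f from three control points."""
    m = [[x, y, 1.0] for x, y in digitizer_pts]
    a, b, c = solve3(m, [X for X, _ in map_pts])
    d, e, f = solve3(m, [Y for _, Y in map_pts])
    return (a, b, c, d, e, f)


def apply_affine(params, x, y):
    """Transform one digitizer co-ordinate into map co-ordinates."""
    a, b, c, d, e, f = params
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical tics: digitizer inches -> map metres (500 m per inch).
tics_inches = [(1.0, 1.0), (11.0, 1.0), (1.0, 9.0)]
tics_metres = [(500000.0, 2500000.0), (505000.0, 2500000.0), (500000.0, 2504000.0)]

params = fit_affine(tics_inches, tics_metres)
print(apply_affine(params, 6.0, 5.0))
```

An affine fit only handles shift, scale, rotation and skew of the map sheet on the digitizer; converting between map projections is a separate step.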

5.0 DATA STORAGE:

As discussed earlier, geographic variation on the earth's surface is primarily represented on a
map as points (wells, telephone poles etc.), lines (roads, canals, drainage etc.) and polygons (areas under
agriculture, forest etc.). Current GIS differ in the way they organize reality through
the data model. Each model tends to fit certain types of data and applications better than others. The
data model chosen for a particular project or application is also influenced by the software available, the
training of the key individuals and historical precedent.

5.1 Data models:

The procedure used to convert geographic variation into discrete objects in a GIS environment
is called a data model. Data models are the rules the GIS follows, such as "contour lines do not overlap,"
and are essential for defining what is in the GIS as well as supporting the use of GIS software. All spatial
data models fall into two basic categories: i) the vector data model and ii) the raster data model. Discrete
features, such as customer locations and data summarized by area, are usually represented using the
vector model. Continuous numeric values, such as elevation, and continuous categories, such as
vegetation types, are represented using the raster model. In principle, however, any feature type can be
represented using either model. In a GIS, geographical features are often expressed as vectors, by
considering those features as geometrical shapes. Different geographical features are expressed by
different types of geometry:

• Points
Zero-dimensional points are used for geographical features that can best be expressed by a
single point reference; in other words, simple location. Examples are the locations of dug wells,
electrical poles etc. Points convey the least information of the three geometry types. Points can
also be used to represent areas when displayed at a small scale. For example, cities on a map of
the world would be represented by points rather than polygons. No measurements are possible
on point features.
• Lines or polylines
One-dimensional lines or polylines are used for linear features such as rivers, roads, canals,
drainage etc. As with point features, features that have width on the ground are, at a small scale,
represented as lines rather than as polygons. Length (distance) can be measured on line features.
• Polygons
Two-dimensional polygons are used for geographical features that cover a particular area of the
earth's surface. Such features may include agricultural areas, forest areas, water bodies etc.
Perimeter and area can be measured on polygon features.
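
The measurements available for each geometry type can be illustrated with a short sketch. It assumes planar (projected) co-ordinates in metres and simple polygons without holes; the function names and sample co-ordinates are made up for illustration. Area is computed with the standard shoelace formula.

```python
import math


def polyline_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


def polygon_perimeter(ring):
    """Perimeter of a polygon whose ring is given without a repeated end point."""
    return polyline_length(ring + ring[:1])


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula."""
    closed = ring + ring[:1]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    return abs(twice_area) / 2.0


canal = [(0, 0), (3, 4), (3, 10)]          # a polyline
field = [(0, 0), (4, 0), (4, 3), (0, 3)]   # a rectangular polygon

print(polyline_length(canal))    # length of the line feature
print(polygon_perimeter(field))  # perimeter of the polygon feature
print(polygon_area(field))       # area of the polygon feature
```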

Each of these geometries is linked to a row in a database that describes its attributes. For
example, a database that describes water bodies may contain their depth, water quality and pollution
level. This information can be used to make a map that shows a particular attribute of the dataset. For
example, water bodies could be colored according to their level of pollution. Different geometries can
also be compared. For example, the GIS could be used to identify all wells (point geometry) that are within
a radius of 2 km of a lake (polygon geometry) that has a high level of pollution. Vector features can be
made to respect spatial integrity through the application of topology rules such as 'polygons must not
overlap'. Vector data can also be used to represent continuously varying phenomena. Contour lines and
triangulated irregular networks (TIN) are used to represent elevation or other continuously changing
values. TINs record values at point locations, which are connected by lines to form an irregular mesh of
triangles. The faces of the triangles represent the terrain surface.
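
The wells-near-a-polluted-lake query above can be sketched in plain Python. This is an illustration under stated assumptions, not how a GIS package implements it: co-ordinates are planar metres, lakes are simple polygons without holes, and all names and data values are hypothetical. A real GIS would normally use a spatial index and library geometry routines.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_polygon(p, ring):
    """Ray-casting test for a simple polygon ring."""
    x, y = p
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_polygon(p, ring):
    """Zero inside the polygon, otherwise distance to the nearest edge."""
    if point_in_polygon(p, ring):
        return 0.0
    n = len(ring)
    return min(point_segment_distance(p, ring[i], ring[(i + 1) % n])
               for i in range(n))


# Hypothetical data: polygons with attributes, and point features.
lakes = {
    "L1": {"ring": [(0, 0), (1000, 0), (1000, 1000), (0, 1000)], "pollution": "high"},
    "L2": {"ring": [(10000, 0), (11000, 0), (11000, 1000), (10000, 1000)], "pollution": "low"},
}
wells = {"W1": (2500, 500), "W2": (4000, 500), "W3": (12000, 500)}


def wells_near_polluted_lakes(wells, lakes, radius_m=2000):
    """Ids of wells within radius_m of any lake with high pollution."""
    hits = []
    for well_id, location in wells.items():
        for lake in lakes.values():
            if (lake["pollution"] == "high"
                    and distance_to_polygon(location, lake["ring"]) <= radius_m):
                hits.append(well_id)
                break
    return sorted(hits)


print(wells_near_polluted_lakes(wells, lakes))
```

The query combines a geometric condition (distance between a point and a polygon) with an attribute condition (pollution level), which is the kind of comparison described in the paragraph above.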

5.1.1 The vector or object GIS: The vector model uses discrete line segments or points to
identify locations. Discrete objects (boundaries, streams, cities) are formed by connecting line segments.
Vector objects do not necessarily fill space; not all locations in space need to be referenced in the
model. Thus, the vector data model is based on vectors (as opposed to space-occupancy raster
structures). The fundamental primitive is a point, represented using an x, y (Cartesian) coordinate
system. Each point is recorded as a single location. Lines (arcs) are recorded as a series of x, y
coordinates. Areas are defined by sets of lines and are recorded as the series of x, y coordinates of the
lines that enclose the area. It is important to note that the points, i.e. x, y pairs, along an arc are called
vertices and the end points of the arc are called nodes in GIS terminology (Figure-3). Arcs join only at
nodes. The term polygon is synonymous with area in vector databases because of the use of straight-line
connections between points. For this reason, the vector model tends to dominate in transportation,
utility and marketing applications. Both raster and vector data models are used in resource
management applications.

Figure-3 : Structure of vector coverage – Polygons, lines and points


A coverage is the basic unit of storage in vector GIS. It is a digital version of a single map sheet
layer and generally contains one type of map feature, such as roads, vegetation types, soil types, land use
patterns etc. A coverage contains both the locational data and the descriptive data for features in a given
geographic area. It is stored as a directory that contains related files describing the location and
attributes of the features.

5.1.2 Raster data model: In the raster data model, the map is divided into a regular grid of square or
rectangular cells which are individually coded. The conventional sequence is row by row from the top left
corner, and each cell contains a single value. A simple raster is limited in the area it can represent by
the available storage, and the fineness of the data is limited by the cell size. The raster GIS is
space-filling since every location in the study area corresponds to a cell in the raster. One set of cells
and associated values is a layer. There may be many layers in a database, e.g. soil type, elevation, land
use, land cover etc. Conceptually, raster models are the simplest of the available data models. However,
they occupy more storage space. This problem is handled by compact encodings such as run-length coding,
chain coding, block coding etc. A more elegant structure in the raster data model is the quad tree. In a
quad tree, the area of interest is recursively decomposed into four smaller quadrants, and the
decomposition continues until each of the smallest quadrants represents a homogeneous area. A quad
tree is therefore a hierarchical data structure. The resolution of the decomposition depends upon the
number of times the decomposition process is applied. The storage requirements of a quad tree are usually
lower than those of the simple raster. Hence, much attention is now given to quad tree structures. Recent
advancements in this area suggest a variety of quad tree data structures. Among them, the Region quad
tree and the Polygon Map (PM) quad tree are well known.
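
Run-length coding and the region quad tree can both be sketched in a few lines. The example below is a simplified illustration: it assumes a square grid whose side is a power of two, stores the tree as nested Python lists, and uses a made-up 4 x 4 land cover grid. It is not the storage format of any particular GIS.

```python
def run_length_encode(row):
    """Encode one raster row as (value, run length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def build_quadtree(grid, row=0, col=0, size=None):
    """Region quad tree: a leaf is a cell value, a node is [NW, NE, SW, SE]."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    homogeneous = all(grid[r][c] == first
                      for r in range(row, row + size)
                      for c in range(col, col + size))
    if homogeneous:
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),                # north-west
        build_quadtree(grid, row, col + half, half),         # north-east
        build_quadtree(grid, row + half, col, half),         # south-west
        build_quadtree(grid, row + half, col + half, half),  # south-east
    ]


def count_leaves(node):
    """Number of stored leaf values in the quad tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


# Hypothetical land cover grid: F = forest, A = agriculture, W = water.
grid = [
    ["F", "F", "A", "A"],
    ["F", "F", "A", "A"],
    ["F", "F", "W", "A"],
    ["F", "F", "A", "A"],
]

print(run_length_encode(grid[2]))
tree = build_quadtree(grid)
print(tree)
print(count_leaves(tree), "leaves instead of", len(grid) ** 2, "cells")
```

Only the south-east quadrant, which contains the single water cell, has to be subdivided; the three homogeneous quadrants are each stored as one value.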

5.2 Raster versus Vector data models

A raster model tells what occurs everywhere, at each place in the area. A vector model tells
where everything occurs, giving a location to every object. Vector data is precise and introduces no
approximation errors into measured quantities like area, length, perimeter etc. Raster data cannot
represent such quantities precisely because of discretization. Generally, raster data has higher storage
requirements. However, overlay and spatial analysis operations are computationally faster on raster than
on vector data. Raster data is not easily amenable to association of attribute data with spatial features
such as points, lines and polygons. This is primarily because the basic entity in raster data is the
grid cell, and entities such as points, lines or polygons are not recognized as objects in their own
right. Most GIS packages in the market choose one of these two data models as the primary method of
representing spatial data and provide conversion utilities from one form to the other.

5.2.1 Advantages and disadvantages

There are advantages and disadvantages to using a raster or vector data model to represent
reality. Raster datasets record a value for all points in the area covered which may require more storage
space than representing data in a vector format that can store data only where needed. Raster data also
allows easy implementation of overlay operations, which are more difficult with vector data. Vector data
can be displayed as vector graphics used on traditional maps, whereas raster data will appear as an
image that, depending on the resolution of the raster file, may have a blocky appearance for object
boundaries. Vector data can be easier to register, scale, and re-project. This can simplify combining
vector layers from different sources. Vector data is more compatible with relational database
environments: vector geometries can be stored in a relational table as a normal column and processed
using a multitude of operators.
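
The ease of raster overlay comes from the fact that co-registered layers can be combined cell by cell. A minimal sketch, assuming two layers with identical extent and cell size; the layer names, values and the suitability rule are hypothetical.

```python
def overlay(layer_a, layer_b, combine):
    """Combine two co-registered raster layers cell by cell."""
    return [[combine(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


# Hypothetical layers covering the same 2 x 3 grid.
land_use = [
    ["forest", "forest", "crop"],
    ["forest", "crop", "crop"],
]
slope_deg = [
    [5, 20, 3],
    [25, 2, 12],
]

# Illustrative rule: cropland on slopes gentler than 10 degrees.
suitable = overlay(land_use, slope_deg,
                   lambda use, slope: use == "crop" and slope < 10)
for row in suitable:
    print(row)
```

The equivalent vector operation has to intersect polygon boundaries and rebuild the resulting polygons, which is why it is harder to implement.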

The file size of vector data is usually much smaller for storage and sharing than that of raster data.
Image or raster data can be 10 to 100 times larger than vector data, depending on the resolution. Another
advantage of vector data is that it is easy to update and maintain. For example, when a new highway is
added, the raster image has to be completely reproduced, but the vector data, "roads," can be easily
updated by adding the missing road segment. In addition, vector data allows much more analysis
capability, especially for "networks" such as roads, power, rail, telecommunications, etc. For example,
vector data attributed with the characteristics of roads, ports, and airfields allows the analyst to
query for the best route or method of transportation. The analyst can also query the vector data for
the largest port with an airfield within 60 miles and a connecting road that is at least a two lane highway.
Raster data does not carry all the characteristics of the features it displays.

5.3 Defining spatial relationships

Spatial relationships in GIS are defined by topology. Topology is a mathematical procedure for
explicitly defining the properties and spatial relationships of geographic features, which include
connectivity of lines, direction of a line, length of a line, adjacency (contiguity) of areas, relative
position, intersection, elevation difference, definition of areas etc. Two data models exist to define
these relationships: a) the topologic model and b) the non-topologic model. A topologic data model stores
data efficiently and provides the framework for advanced geographic analysis. The model builds areas from
the list of individual lines that define area borders. The system stores the co-ordinates of a line only
once, because two adjacent areas share the common line between them. In contrast, the non-topologic data
model stores each closed area as a single entity. The line shared by adjacent areas must be entered and
stored twice, either by double digitizing or by copying the line. This duplicate data makes geographic
analysis difficult because the system cannot observe topologic relationships between areas that
share a common border. The non-topologic model is a common data model supported by many computer
aided drafting (CAD), mapping and graphic systems.

Topology helps to perform various types of overlay analysis and modelling. The
topological model is further divided into two types, viz. i) planar topology and ii) network topology. In
planar topology, the boundaries of categories must adjoin each other exactly, with no
gap or overlap between them. Network topology, in contrast, contains edges that have nodes at their
end points. A node can be connected to one or more edges. This assemblage of nodes and edges is
called ‘a geometric network’. Therefore, a coverage must have its topology built before any spatial
analysis in GIS; the relationships cannot be established without building the topology.
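
A topologic (arc-node) structure can be sketched as a table of arcs, each recording its end nodes and the polygons on its left and right. The layout below is a simplified illustration with hypothetical identifiers, not the file format of any specific GIS; co-ordinates are omitted because the relationships are derived from the table alone.

```python
OUTSIDE = None  # marker for the area outside all polygons

# Two polygons, A (west) and B (east), share one border (arc 1).
# The shared border is stored once, with A on its left and B on its right.
arcs = {
    1: {"from": "n1", "to": "n2", "left": "A", "right": "B"},
    2: {"from": "n2", "to": "n1", "left": "A", "right": OUTSIDE},
    3: {"from": "n1", "to": "n2", "left": "B", "right": OUTSIDE},
}


def adjacent_polygons(arcs):
    """Adjacency (contiguity): pairs of polygons that share an arc."""
    pairs = set()
    for arc in arcs.values():
        left, right = arc["left"], arc["right"]
        if left is not OUTSIDE and right is not OUTSIDE and left != right:
            pairs.add(frozenset((left, right)))
    return pairs


def arcs_of_polygon(arcs, polygon):
    """Area definition: the arcs that bound a polygon."""
    return sorted(arc_id for arc_id, arc in arcs.items()
                  if polygon in (arc["left"], arc["right"]))


def arcs_at_node(arcs, node):
    """Connectivity: the arcs that meet at a node."""
    return sorted(arc_id for arc_id, arc in arcs.items()
                  if node in (arc["from"], arc["to"]))


print(adjacent_polygons(arcs))
print(arcs_of_polygon(arcs, "A"))
print(arcs_at_node(arcs, "n1"))
```

In a non-topologic model each polygon would carry its own copy of the shared border, and adjacency could only be found by comparing co-ordinates.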

5.4 Attribute data in GIS

A separate data model is used to store and maintain attribute data for GIS software. These data
models may exist internally within the GIS software, or may be implemented in an external commercial
‘Database Management System (DBMS)’. A variety of data models exist for the storage and
management of attribute data. The most common attribute data models used in GIS are i) tabular, ii)
hierarchical, iii) network, iv) relational and v) object-oriented (Figure-4).

The tabular model is the manner in which most early GIS software packages stored their attribute
data. The next three models are those most commonly implemented in database management systems
(DBMS). The object-oriented model is newer but is rapidly gaining in popularity for some applications. A
brief review of each model is provided below.

5.4.1 Tabular Model

The simple tabular model stores attribute data as sequential data files with fixed formats (or
comma delimited for ASCII data), in which attribute values are located by a predefined record structure.
This type of data model is outdated in the GIS arena. It lacks any method of checking data integrity and is
inefficient with respect to data storage, e.g. it has limited indexing capability for attributes or
records.

Figure-4 : Attribute Data Models

5.4.2 Hierarchical Model

The hierarchical database organizes data in a tree structure. Data is structured downward in a
hierarchy of tables. Any level in the hierarchy can have unlimited children, but any child can have only
one parent. Though it is a good model for representing attribute data, it has not gained any noticeable
acceptance for use within GIS. Hierarchical databases are oriented towards data sets that are very stable,
where the primary relationships among the data change infrequently or never at all. Also, the limitation
on the number of parents that an element may have does not always suit actual geographic phenomena.

5.4.3 Network Model

The network database organizes data in a network or plex structure. Any column in a plex
structure can be linked to any other. Like a tree structure, a plex structure can be described in terms of
parents and children. This model allows children to have more than one parent.

Network DBMS have not found much more acceptance in GIS than hierarchical DBMS. They
have the same flexibility limitations as hierarchical databases, although their more powerful structure
for representing data relationships allows a more realistic modelling of geographic phenomena. However,
network databases tend to become overly complex too easily, so that it is easy to lose control and
understanding of the relationships between elements.
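
The difference between the one-parent rule of the hierarchical model and the many-parent rule of the network model can be shown with a small sketch. The record names are hypothetical and plain dictionaries stand in for the database; this is only an illustration of the two structures.

```python
# Hierarchical model: every child record names exactly one parent.
hierarchy_parent = {
    "District-1": "State",
    "District-2": "State",
    "Taluka-1": "District-1",
    "Village-1": "Taluka-1",
}


def ancestors(record, parent_of):
    """Walk up the tree from a record to the root."""
    chain = []
    while record in parent_of:
        record = parent_of[record]
        chain.append(record)
    return chain


# Network (plex) model: a child may be linked to several parents.
# A canal that crosses two districts belongs to both of them.
network_parents = {
    "Canal-7": ["District-1", "District-2"],
    "Village-1": ["Taluka-1"],
}


def parents_of(record, parents):
    """All parents of a record in the network structure."""
    return list(parents.get(record, []))


print(ancestors("Village-1", hierarchy_parent))
print(parents_of("Canal-7", network_parents))
```

In the hierarchical structure the canal could be attached to only one district, which is the kind of limitation noted above for geographic phenomena.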

5.4.4 Relational Model

The relational database management system (RDBMS) model organizes data in tables. Each
table is identified by a unique table name and is organized by rows and columns. Each column within a
table also has a unique name. Columns store the values for a specific attribute, e.g. cover group, tree
height. Rows represent one record in the table. In a GIS each row is usually linked to a separate spatial
feature, e.g. a residential area, agricultural cropped area, forest type etc. Accordingly, each row would be
comprised of several columns, each column containing a specific value for that geographic feature. Data
is often stored in several tables. Tables can be joined or referenced to each other by common columns
(relational fields). Usually the common column is an identification number for a selected geographic
feature, e.g. a residence number for residential area, parcel number for cropped area, division number for
forest type etc. This identification number acts as the primary key for the table. The ability to join tables
through use of a common column is the essence of the relational model. Such relational joins are usually
ad hoc in nature and form the basis for querying in a relational GIS product. Unlike the previously
discussed database types, relationships are implicit in the character of the data rather than explicit
characteristics of the database set-up.

The relational database model is the most widely accepted for managing the attributes of
geographic data. There are many different designs of DBMSs, but in GIS the relational design has been
the most useful. In the relational design, data are stored conceptually as a collection of tables. Common
fields in different tables are used to link them together. This simple design has been so widely used
primarily because of its flexibility and very wide deployment in applications both within and without GIS.

In fact, most GIS software provides an internal relational data model, as well as support for
commercial off-the-shelf (COTS) relational DBMSs. COTS DBMSs are referred to as external DBMSs. With an
external DBMS the GIS software can simply connect to the database, and the user can make use of the
inherent capabilities of the DBMS. External DBMSs tend to have much more extensive querying and data
integrity capabilities than the internal relational model of the GIS.

The relational DBMS is attractive because of i) its simplicity in organisation and data modeling,
ii) its flexibility, as data can be manipulated in an ad hoc manner by joining tables, iii) its efficiency
of storage, as redundant data can be minimized with a proper design of data tables and iv) its
non-procedural nature, i.e. queries on a relational database do not need to take into account the internal
organisation of the data.
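
A relational join on a common column can be tried with the sqlite3 module from the Python standard library. The table names, column names and records below are hypothetical; the sketch shows an ad hoc, non-procedural query that links two attribute tables through the parcel number used as the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,   -- unique feature identifier
        land_use  TEXT,
        area_ha   REAL
    );
    CREATE TABLE owner (
        parcel_id  INTEGER REFERENCES parcel(parcel_id),  -- common column
        owner_name TEXT
    );
    INSERT INTO parcel VALUES (101, 'crop', 2.5), (102, 'forest', 10.0), (103, 'crop', 1.2);
    INSERT INTO owner  VALUES (101, 'Owner A'), (102, 'Owner B'), (103, 'Owner C');
""")

# Ad hoc join: owners of cropped parcels larger than 2 hectares.
rows = conn.execute("""
    SELECT p.parcel_id, o.owner_name, p.area_ha
    FROM parcel AS p
    JOIN owner AS o ON o.parcel_id = p.parcel_id
    WHERE p.land_use = 'crop' AND p.area_ha > 2.0
    ORDER BY p.parcel_id
""").fetchall()

print(rows)
```

The query states what is wanted, not how the tables are stored or scanned, which is the non-procedural nature listed under iv) above.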

5.4.5 Object-Oriented Model

The object-oriented database model manages data through objects. An object is a collection of
data elements and operations that together are considered a single entity. The object model describes
both the state of an object and its behavior. The object-oriented database is a relatively new model. This
approach has the attraction that querying is very natural, as features can be bundled together with
attributes at the database administrator's discretion. To date, only a few GIS packages promote the use of
this attribute data model. However, initial impressions indicate that this approach may hold many
operational benefits with respect to geographic data processing.
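
The idea of an object that bundles state (data elements) with behavior (operations) can be sketched with a small Python class. The class name, fields and sample values are hypothetical; this illustrates the concept only, not an object-oriented DBMS.

```python
from dataclasses import dataclass


@dataclass
class WaterBody:
    # State: geometry and attributes kept together in one object.
    name: str
    ring: list            # polygon vertices as (x, y) pairs, planar metres
    depth_m: float
    pollution_level: str

    # Behavior: operations that belong to the object itself.
    def area(self):
        """Polygon area by the shoelace formula."""
        closed = self.ring + self.ring[:1]
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
        return abs(twice_area) / 2.0

    def is_polluted(self):
        return self.pollution_level == "high"


lake = WaterBody("Lake-1", [(0, 0), (4, 0), (4, 3), (0, 3)], 6.0, "high")
print(lake.area(), lake.is_polluted())
```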

5.5 Connecting features and attributes

The importance of GIS lies in its link between the graphic (spatial) and the tabular (aspatial) data.
This connection has three basic characteristics: i) there is a one-to-one relationship
between features on the digital map and the records in the feature attribute table, ii) the link between
a feature and its record is maintained through a unique numerical identifier assigned to each feature
(label point) and iii) the unique identifier is physically stored in two places, i.e. in the file that
contains the x, y coordinates and with the corresponding record in the feature attribute table. Once this
connection is established, one can query the digital map to display attribute information or create a map
based on the attributes stored in the feature attribute table.
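
The three characteristics of the feature-attribute link can be sketched with two dictionaries keyed by the same identifier, one standing in for the co-ordinate file and one for the feature attribute table. All identifiers, field names and values are hypothetical.

```python
# Co-ordinate file: feature identifier -> polygon vertices.
geometry = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],
    2: [(4, 0), (8, 0), (8, 3), (4, 3)],
    3: [(0, 3), (8, 3), (8, 6), (0, 6)],
}

# Feature attribute table: the same identifier -> attribute record.
attribute_table = {
    1: {"land_use": "crop", "soil": "loam"},
    2: {"land_use": "forest", "soil": "clay"},
    3: {"land_use": "crop", "soil": "sand"},
}


def identify(feature_id):
    """Map -> table: attributes of a feature picked on the map."""
    return attribute_table[feature_id]


def select_by_attribute(field, value):
    """Table -> map: identifiers of features whose record matches."""
    return sorted(fid for fid, record in attribute_table.items()
                  if record.get(field) == value)


def features_for_map(field, value):
    """Geometries to draw for a thematic map of the selected features."""
    return {fid: geometry[fid] for fid in select_by_attribute(field, value)}


print(identify(2))
print(select_by_attribute("land_use", "crop"))
print(sorted(features_for_map("land_use", "crop")))
```

Because both stores use the same key, the link works in either direction: from a feature to its record, or from a set of records to the features to be drawn.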

6.0 SPATIAL ANALYSIS WITH GIS

Over the years, a vast range of spatial analysis techniques has been developed, so any summary or
review can only cover the subject to a limited depth. This is a rapidly changing field, and GIS packages
are increasingly including analytical tools as standard built-in facilities or as optional toolsets, add-ins or
'analysts'. In many instances such facilities are provided by the original software suppliers (commercial
vendors or collaborative non-commercial development teams), whilst in other cases facilities have been
developed and are provided by third parties. Furthermore, many products offer software development
kits (SDKs), programming languages and language support, scripting facilities and/or special interfaces
for developing one’s own analytical tools or variants. The website Geospatial Analysis and associated
book/ebook attempt to provide a reasonably comprehensive guide to the subject. The impact of these
myriad paths to perform spatial analysis create a new dimension to business intelligence termed "spatial
intelligence" which, when delivered via intranet, democratizes access to operational sorts not usually
privy to this type of information.

6.1 Spatial analysis operations

The combination of several spatial datasets (points, lines or polygons) creates a new output
vector dataset, visually similar to stacking several maps of the same region. These overlays are similar to
mathematical Venn diagram overlays. A union overlay combines the geographic features and attribute
tables of both inputs into a single new output. An intersect overlay defines the area where both inputs
overlap and retains a set of attribute fields for each. A symmetric difference overlay defines an output
area that includes the total area of both inputs except for the overlapping area. Data extraction is a GIS
process similar to vector overlay, though it can be used in either vector or raster data analysis. Rather
than combining the properties and features of both datasets, data extraction involves using a "clip" or
"mask" to extract the features of one data set that fall within the spatial extent of another dataset. Like
this, GIS provides a number of analysis capabilities that operate on the topology or spatial aspects of
geographical data as well as on aspatial (attribute) data. These spatial analysis functions
distinguish GIS from other information systems (P.A. Burrough, 1996). The integrated
information is useful for answering queries on the parameters defined in a particular theme. For
example, if a thematic map represents land use information, the associated tabular information will answer
queries related to the extent and spatial distribution of the various land uses. Integrated analysis
helps to know not only the land use details but also the soil depth, soil texture, slope etc.
pertaining to a land use category. Thus GIS has many capabilities for spatial analysis and for identifying
spatial relationships between the parameters. The details of some of the operations are as follows.

In vector based GIS, these operations are performed on two layers (maps) at a time to form a new
composite map through the geometric intersection of the features. The layer on which manipulation is
performed is called the input layer and the layer that controls the area of operation is called the analysis
layer. These operations are classified into three categories: a) Feature combination, b) Feature
extraction and c) Feature combination and extraction.

6.1.1 Feature Combination: Two operations fall under this category, UNION and
INTERSECT (Figure-5).

The UNION process is used only on maps having polygon features. It performs the geometric
combination of the features of the two themes and, like the mathematical union of sets, keeps all the
areas from both coverages. (Example: if A is the set of natural numbers 1, 2, 3, 4, 5 and B is the set of
natural numbers 4, 5, 6, 7, the union of A and B is A ∪ B = {1, 2, 3, 4, 5, 6, 7}.) New polygons in the output layer
are created by splitting the arcs of the input layers at their intersections. Therefore the number of polygons
in the output layer is greater than the number of polygons in either of the input layers. This
operation is performed when the application requires the combined region and the combined attributes
of the two input thematic layers for querying or for further analysis.

The INTERSECT operation is
performed to overlay points, lines or polygons on polygons while keeping only those portions of the input
coverage features that fall within the overlay coverage features, i.e. A ∩ B = {4, 5} in the above example.
The input layer can have points, lines or polygons, but the analysis layer must have polygon topology.
This operation is normally performed to find, for example, the number of tube wells and dug wells in a
watershed, the roads crossing human settlements or the land use distribution within a defined administrative
boundary.
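
The set analogy used above can be tried directly in Python. The sketch below first repeats the numeric example and then applies the same operators to two layers. As a simplifying assumption, each layer is represented by the set of unit grid cells it covers, which is only a crude stand-in for true polygon overlay; the layer names and extents are hypothetical.

```python
# The numeric example from the text.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
union_ab = A | B          # all elements of both sets
intersect_ab = A & B      # elements common to both sets
sym_diff_ab = A ^ B       # elements in exactly one of the sets


def cells(x0, y0, x1, y1):
    """Unit grid cells covered by an axis-aligned rectangle (stand-in for a polygon)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


forest = cells(0, 0, 4, 4)      # hypothetical input layer, 16 cells
watershed = cells(2, 2, 6, 6)   # hypothetical analysis layer, 16 cells

union_area = len(forest | watershed)       # area covered by either layer
intersect_area = len(forest & watershed)   # area covered by both layers
sym_diff_area = len(forest ^ watershed)    # area covered by exactly one layer
```

A real vector overlay also splits arcs at their intersections and carries the attribute tables across, which this cell-based sketch does not attempt.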

Figure-5 : Union and Intersect analysis in Vector GIS (panels: Union operation, Intersect operation)

6.1.2 Feature extraction : In this operation, two input layers are overlaid as in feature
combination analysis. However, the extent of the analysis layer defines the area of interest to be retained or
removed. Three operations fall under this category, viz. CLIP, ERASE and SPLIT. In the CLIP operation, the
features that fall within the boundary of the analysis layer are retained in the resultant layer with the
attributes of the input layer only. The input layer can contain points, lines or polygons, but the
analysis layer should contain polygons. This function is performed to extract a smaller data set from a
larger data set. ERASE is the reverse of CLIP: the features of the input layer within the
boundary of the analysis layer are erased and the features that fall outside the boundary of the analysis layer
are retained. It is used to create the complement of a CLIP operation. The SPLIT operation is an enhancement
of the CLIP operation. It is normally carried out to produce the output of several CLIP
operations, based on certain criteria, and is meant for advanced users who wish to perform a series of CLIP
operations through a single command.
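
As an illustration of CLIP and ERASE, the following Python sketch applies both to a point layer, using a standard ray-casting point-in-polygon test. The well names, coordinates and watershed boundary are hypothetical, and points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: True if pt lies inside polygon (a list of (x, y) vertices)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip(points, polygon):
    """CLIP: keep the input features that fall inside the analysis polygon."""
    return {name: pt for name, pt in points.items() if point_in_polygon(pt, polygon)}


def erase(points, polygon):
    """ERASE: keep the input features that fall outside the analysis polygon."""
    return {name: pt for name, pt in points.items() if not point_in_polygon(pt, polygon)}


watershed = [(0, 0), (10, 0), (10, 10), (0, 10)]                     # analysis layer
wells = {"W1": (2, 3), "W2": (12, 5), "W3": (9, 9), "W4": (-1, -1)}  # input layer

inside_wells = clip(wells, watershed)
outside_wells = erase(wells, watershed)
```

Together the two outputs contain every input feature exactly once, which is what makes ERASE the complement of CLIP.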

6.1.3 Feature Combination and Extraction : Two spatial operations fall under this category,
UPDATE and IDENTITY. The UPDATE operation is used to perform cut and paste analysis. As in the
earlier operations, the analysis layer defines the area of the input layer that needs to be
updated. Thus, the output layer will have features from the input layer in the non-overlapping area and
features from the analysis layer in the overlapping area. It is used to generate a time series
of maps showing the changes in thematic information through time, for example the change in
land use pattern or urban sprawl through time. The IDENTITY operation is performed to create
an output layer by combining the features of the overlapping areas of the input and analysis layers. It is
mostly used to preserve the boundaries of a thematic map precisely, for example to show the district
boundaries in a state map without any mismatch between the district boundaries and the
state boundary. This operation is also performed to overlay points, lines or polygons on polygons and
keep all input coverage features.
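
The difference between UPDATE and IDENTITY, as described above, can be sketched in Python. As a simplifying assumption, each layer is a dictionary that maps unit grid cells to an attribute value, which stands in for true polygon layers; the layer names, extents and attribute values are hypothetical.

```python
def cells(x0, y0, x1, y1, value):
    """Grid cells of an axis-aligned rectangle, each carrying one attribute value."""
    return {(x, y): value for x in range(x0, x1) for y in range(y0, y1)}


land_use_2000 = cells(0, 0, 4, 4, "agriculture")   # input layer
urban_2010 = cells(2, 2, 6, 6, "urban")            # analysis layer


def update(input_layer, analysis_layer):
    """UPDATE as described in the text: analysis values replace input values
    in the overlapping area; input values are kept elsewhere."""
    return {cell: analysis_layer.get(cell, value) for cell, value in input_layer.items()}


def identity(input_layer, analysis_layer):
    """IDENTITY: keep every input cell and attach the analysis value where the
    layers overlap (None where they do not)."""
    return {cell: (value, analysis_layer.get(cell)) for cell, value in input_layer.items()}


updated = update(land_use_2000, urban_2010)
combined = identity(land_use_2000, urban_2010)

urban_cells = sum(1 for v in updated.values() if v == "urban")
both_cells = sum(1 for v in combined.values() if v == ("agriculture", "urban"))
```

In this sketch UPDATE overwrites the attribute in the overlap, while IDENTITY retains the input attribute and adds the analysis attribute alongside it.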

In all the integration operations, a large number of small polygons are formed at the edges of the
boundary of the output layer due to the mismatch of the boundaries of the input layers. In GIS terminology
these small polygons are termed "SLIVERS". They have to be removed before the output layer
of an integrated operation is taken up for further use. Slivers are normally removed on the basis of the
area of the minimum mapping unit (mmu) at the thematic map scale.
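
A minimal Python sketch of area-based sliver detection follows. It computes each polygon's area with the shoelace formula and drops polygons smaller than a threshold. The polygon coordinates and the threshold value are hypothetical, and a GIS package would more likely merge a sliver into a neighbouring polygon than simply delete it.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given as (x, y) vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def drop_slivers(polygons, min_area):
    """Keep only the polygons whose area reaches the minimum mapping unit."""
    return {pid: v for pid, v in polygons.items() if polygon_area(v) >= min_area}


# Hypothetical overlay output: P2 is a thin polygon created by a boundary mismatch.
overlay_output = {
    "P1": [(0, 0), (100, 0), (100, 80), (0, 80)],
    "P2": [(100, 0), (100.5, 0), (100.2, 80), (100, 80)],
    "P3": [(100.5, 0), (200, 0), (200, 80), (100.2, 80)],
}

MMU_AREA = 50.0   # assumed minimum mapping unit area, in squared map units
cleaned = drop_slivers(overlay_output, MMU_AREA)
```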

GIS is also useful for constructing proximity boundaries around features at a distance specified by
the user. The resultant polygons are known as proximity polygons or buffer zones (Figure-6). The
distance used for the operation is called the "search radius" or "buffer distance". Buffers can be
generated on point, line and polygon (both inside and outside of the polygon) features. This operation is
basically carried out to find out the closeness or proximity between features. This function is
necessary in spatial data analysis to find out, for example, the distance of a road from a settlement or the
ownership details around a mining area, water bodies, flood hazard zones etc. The spatial analysis techniques
mentioned above provide the user with a variety of choices for carrying out spatial modelling exercises.
However, identifying the need for a particular function depends upon the technical understanding and
experience of the user.

Figure-6 : Buffer analysis
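
The proximity test behind a buffer can be sketched in Python without constructing the buffer polygon itself: a feature lies in the buffer zone of a line if its distance to the line is no more than the buffer distance. The road, settlement coordinates and buffer distance below are hypothetical, and planar (projected) coordinates are assumed.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Shortest distance from p to a polyline given as a list of vertices."""
    return min(distance_to_segment(p, a, b) for a, b in zip(line, line[1:]))


def within_buffer(points, line, buffer_distance):
    """Names of the points that fall inside the buffer zone of the line."""
    return sorted(name for name, p in points.items()
                  if distance_to_line(p, line) <= buffer_distance)


road = [(0, 0), (10, 0), (10, 10)]
settlements = {"A": (5, 2), "B": (5, 6), "C": (13, 5), "D": (12, 12)}

near_road = within_buffer(settlements, road, buffer_distance=3.5)
```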

6.1.4 Network analysis

If all the factories near a wetland were accidentally to release chemicals into the river at the same
time, how long would it take for a damaging amount of pollutant to enter the wetland reserve? A GIS can
simulate the routing of materials along a linear network. Values such as slope, speed limit, or pipe
diameter can be incorporated into network modeling in order to represent the flow of the phenomenon
more accurately. In addition, one can employ network analysis to generate optimum paths for
travelling within a city to reach a facility, using either distance or time as the criterion. Network modeling is
therefore commonly employed in transportation planning, hydrology modeling, and infrastructure modeling.
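
Optimum-path generation of this kind can be sketched with Dijkstra's shortest-path algorithm, shown here in Python as one common choice rather than the method of any particular GIS package. The road network, node names and edge values are hypothetical; each edge carries both a length in km and a travel time in minutes so that either criterion can be used.

```python
import heapq


def shortest_path(graph, source, target, weight):
    """Dijkstra's algorithm; weight names the edge field to minimise ('km' or 'min')."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, edge in graph.get(node, {}).items():
            new_cost = cost + edge[weight]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]


# Hypothetical one-way road network: the bypass is longer but faster.
roads = {
    "home": {"junction": {"km": 2.0, "min": 6.0}, "bypass": {"km": 5.0, "min": 5.0}},
    "junction": {"hospital": {"km": 2.0, "min": 8.0}},
    "bypass": {"hospital": {"km": 4.0, "min": 4.0}},
    "hospital": {},
}

by_distance = shortest_path(roads, "home", "hospital", "km")
by_time = shortest_path(roads, "home", "hospital", "min")
```

In this sample network the shortest route and the quickest route differ, which is why the criterion has to be chosen before the path is generated.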

7.0 DATA DISPLAY AND QUERYING

Both spatial and attribute data can be displayed on the computer screen. Since
geographic features are stored as points, lines and polygons in GIS, they can be displayed as separate
layers or in combination with each other. The displays can be sent to a hard copy output device, i.e. a
plotter or printer.

In GIS, one can determine what exists at a particular location. For this, one must first specify the
location of the object or region for which the information is needed. Commonly used methods of
specification are: pointing to the object or region, typing in a particular address or typing in a co-ordinate
location. After specifying the object or location, one can obtain either all or
some of its characteristics, viz. location address, current owner, existing land use, assessed value etc.
The second function in GIS is that one can obtain information on locations satisfying certain conditions.
For this, one can specify the condition by making selections from a set of predefined options, writing
logical expressions or filling out forms interactively on a terminal. After specifying the logical condition,
one can obtain a list of all objects meeting the specified condition and also a display highlighting all
features meeting it.
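
The two query functions described above can be sketched in Python. The parcel records, field names and function names are hypothetical, and each parcel is located by a simple bounding box (xmin, ymin, xmax, ymax) rather than by its true boundary.

```python
# Hypothetical parcel layer: a bounding box plus the attributes named in the text.
parcels = [
    {"id": 1, "bbox": (0, 0, 10, 10), "owner": "Owner A",
     "land_use": "residential", "value": 120000},
    {"id": 2, "bbox": (10, 0, 20, 10), "owner": "Owner B",
     "land_use": "commercial", "value": 300000},
    {"id": 3, "bbox": (0, 10, 10, 20), "owner": "Owner C",
     "land_use": "residential", "value": 90000},
]


def what_is_at(x, y):
    """Query by location: return the records of the parcels containing (x, y)."""
    return [p for p in parcels
            if p["bbox"][0] <= x < p["bbox"][2] and p["bbox"][1] <= y < p["bbox"][3]]


def where_is(condition):
    """Query by condition: return the IDs of the parcels satisfying a logical expression."""
    return [p["id"] for p in parcels if condition(p)]


found = what_is_at(15, 5)
costly_homes = where_is(lambda p: p["land_use"] == "residential" and p["value"] > 100000)
```

The list of IDs returned by the second query is what a GIS would use to highlight the matching features on the display.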

8.0 DATA OUTPUTS

Visual representations of spatial data in GIS are produced through map composition procedures
based on cartographic principles. The vast majority of modern cartography is done with the help of
computers, usually using a GIS, but production-quality cartography is also achieved by importing layers
into a design program to refine them. Most GIS software gives the user substantial control over the
appearance of the data. The cartographic work in GIS serves two major functions. First, it produces
graphics on the screen or on paper that convey the results of analysis to the people who make decisions
about resources. Wall maps and other graphics can be generated, allowing the viewer to visualize and
thereby understand the results of analyses or simulations of potential events. Web Map Servers facilitate
distribution of generated maps through web browsers using various implementations of web-based
application programming interfaces (AJAX, Java, Flash, etc). Second, other database information can be
generated for further analysis or use. An example would be a list of all addresses within 2 km of a
toxic spill.
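
The toxic spill example can be sketched in Python using the haversine great-circle distance between latitude/longitude pairs. The spill location, the addresses and their coordinates are hypothetical, and a spherical Earth with a mean radius of 6371 km is assumed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def addresses_within(centre, addresses, radius_km):
    """Sorted list of the addresses lying within radius_km of the centre point."""
    lat0, lon0 = centre
    return sorted(name for name, (lat, lon) in addresses.items()
                  if haversine_km(lat0, lon0, lat, lon) <= radius_km)


spill = (23.03, 72.58)   # hypothetical spill location (lat, lon)
addresses = {
    "12 Lake Road": (23.035, 72.585),
    "4 Mill Lane": (23.045, 72.58),
    "88 Ring Road": (23.06, 72.60),
    "7 River View": (23.03, 72.61),
}

affected = addresses_within(spill, addresses, 2.0)
```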

8.1 Graphic display techniques


Traditional maps are abstractions of the real world, a sampling of important elements portrayed
on a sheet of paper with symbols to represent physical objects. People who use maps must interpret
these symbols. Topographic maps show the shape of land surface with contour lines or with shaded
relief. Today, graphic display techniques such as shading based on altitude in a GIS can make
relationships among map elements visible, heightening one's ability to extract and analyze information.
For example, two types of data were combined in a GIS to produce a perspective view of a portion of
the Himalayas in the northern part of the country.

9.0 THE FUTURE OF GIS

Today, many sectors, viz. natural resources, socio-economic, infrastructure etc., are benefiting from
GIS technology. In addition, an active GIS market has resulted in lower costs and continual improvements in
the hardware and software components of GIS. These developments will, in turn, result in a much wider
use of the technology throughout science, government, business, and industry, with applications
including real estate, public health, crime mapping, national defense, sustainable development, natural
resources, landscape architecture, archaeology, regional and community planning, transportation and
logistics. GIS is also diverging into location-based services (LBS). LBS allows GPS enabled mobile devices
to display their location in relation to fixed assets (nearest restaurant, gas station, fire hydrant), mobile
assets (friends, children, police car) or to relay their position back to a central server for display or other
processing. These services continue to develop with the increased integration of GPS functionality with
increasingly powerful mobile electronics (cell phones, PDAs, laptops).

9.1 OGC standards

The Open Geospatial Consortium (OGC) is an international industry consortium of 384 companies,
government agencies, universities and individuals participating in a consensus process to develop
publicly available geo-processing specifications. Open interfaces and protocols defined by OpenGIS
Specifications support interoperable solutions that "geo-enable" the Web, wireless and location-based
services, and mainstream IT, and empower technology developers to make complex spatial information
and services accessible and useful with all kinds of applications. OGC
protocols include Web Map Service (WMS) and Web Feature Service (WFS).
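
To give a feel for how a WMS is used, the Python sketch below builds a GetMap request URL from the standard WMS 1.1.1 query parameters. The server address, layer name and bounding box are hypothetical placeholders; the sketch only constructs the URL and does not contact any server.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy) in the given SRS."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                       # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server and layer name, used only to show the shape of the request.
url = wms_getmap_url("https://example.org/wms", ["land_use"],
                     (72.4, 22.9, 72.7, 23.1), 600, 400)
```

Any client that can issue this request and display the returned image can use the map, which is the interoperability the OGC protocols aim for.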

9.1.1 Web mapping

In recent years there has been an explosion of mapping applications on the web, such as Google
Maps, Wikimapia (www.wikimapia.com), Bhuvan (www.nrsc.bhuvan.com) and the NRDB (www.nnrms.gov.in)
resource maps (India). These websites give the public access to huge amounts of geographic data. Some
of them, like Google Maps and OpenLayers, expose an API that enables users to create custom
applications. These toolkits commonly offer street maps, aerial/satellite imagery, geo-coding, searches,
and routing functionality.

Other applications for publishing geographic information on the web include MapInfo's
MapXtreme or PlanAcess or Stratus Connect, Cadcorp's GeognoSIS, Intergraph's GeoMedia WebMap
(TM), ESRI's ArcIMS, ArcGIS Server, Autodesk's Mapguide, SeaTrails' AtlasAlive, and the open source
MapServer or GeoServer. In recent years web mapping services have begun to adopt features more
common in GIS. Services such as Google Maps and Live Maps allow users to annotate maps and share
the maps with others.

9.1.2 Adding the dimension of time

The condition of the Earth's surface, atmosphere, and subsurface can be examined by feeding
satellite data into a GIS. GIS technology gives researchers the ability to examine the variations in Earth
processes over days, months, and years. As an example, the changes in vegetation vigor through a
growing season can be animated to determine when drought was most extensive in a particular region.
The resulting graphic, known as a normalized vegetation index, represents a rough measure of plant
health. Working with two variables over time would then allow researchers to detect regional differences
in the lag between a decline in rainfall and its effect on vegetation.
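
Assuming that the index mentioned above is the widely used Normalized Difference Vegetation Index (NDVI), it is computed for each pixel as (NIR - Red) / (NIR + Red), where NIR and Red are the reflectances in the near-infrared and red bands. The Python sketch below follows one pixel through a growing season; the monthly reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    if nir + red == 0:
        return 0.0   # avoid division by zero over no-data pixels
    return (nir - red) / (nir + red)


# Hypothetical (month, NIR reflectance, red reflectance) values for one pixel.
season = [
    ("Jun", 0.50, 0.10),
    ("Jul", 0.45, 0.15),
    ("Aug", 0.30, 0.20),
    ("Sep", 0.40, 0.12),
]

ndvi_by_month = {month: ndvi(nir, red) for month, nir, red in season}

# The month with the lowest index marks the weakest vegetation vigor.
weakest_month = min(ndvi_by_month, key=ndvi_by_month.get)
```

Repeating the calculation for every pixel and every date gives the image sequence that can be animated as described above.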

In addition, GIS and related technology will help greatly in the management and analysis of these
large volumes of data, allowing for better understanding of terrestrial processes and better management
of human activities to maintain world economic vitality and environmental quality. Using models to
project the data held by a GIS forward in time has enabled planners to test policy decisions. Such
systems are known as Spatial Decision Support Systems.

9.1.3 Semantics

Tools and technologies emerging from the W3C's Semantic Web Activity are proving useful for
data integration problems in information systems. Correspondingly, such technologies have been
proposed as a means to facilitate interoperability and data reuse among GIS applications and also to
enable new analysis mechanisms. Recent research results in this area can be seen in the International
Conference on Geospatial Semantics and the Terra Cognita -- Directions to the Geospatial Semantic Web
workshop at the International Semantic Web Conference.

9.1.4 Society

With the popularization of GIS in decision making, scholars have begun to scrutinize the social
implications of GIS. It has been argued that the production, distribution, utilization, and representation of
geographic information are largely related to the social context. Other related topics include discussion
on copyright, privacy, and censorship. A more optimistic social approach to GIS adoption is to use it as a
tool for public participation.

10.0 CONCLUSIONS

Although GIS originated in the mid 1960's and has had a continuous history since then,
many see GIS as a phenomenon of the late 1980's. This is primarily because of developments in software
and the cost-effectiveness of hardware. The expansion has also been fuelled by continuing advances in
computing technology, increasing awareness of digital databases, new application areas etc. The
fundamental question now is how long this growth can continue. The growth of GIS will continue if
institutions focus on programs, journals, magazines, books and conferences, and raise awareness
of GIS technology and its applications through education and training (the effective use of spatial information
requires a higher level of training than word processing software). The prospects for the future of GIS
lie in automated geography, i.e. in creating a "paperless map library" on similar lines to the "paperless office".
The potential of automated geography may lead to much greater levels of use if people have better access
to the database and easy procedures for using it, and if planners, administrators etc. use geographical
databases more frequently in their decision making processes.

REFERENCES:

• Bolstad, P. (2005) GIS Fundamentals: A first text on Geographic Information Systems, Second Edition.
White Bear Lake, MN: Eider Press, 543 pp.

• Burrough, P.A. and McDonnell, R.A. (1998) Principles of geographical information systems. Oxford
University Press, Oxford, 327 pp.

• Chang, K. (2007) Introduction to Geographic Information System, 4th Edition. McGraw Hill.

• de Smith, M.J., Goodchild, M.F. and Longley, P.A. (2007) Geospatial Analysis: A comprehensive guide to
principles, techniques and software tools, 2nd edition. Troubador, UK. Available free online.

• Elangovan, K. (2006) GIS: Fundamentals, Applications and Implementations. New India Publishing
Agency, New Delhi, 208 pp.

• Geographic Information System (GIS) Educational website: educational site with PDF lessons and
videos to accompany free GIS software.

• Geospatial Analysis - a comprehensive guide. 2nd edition © 2006-2008 de Smith, Goodchild, Longley.

• Harvey, F. (2008) A Primer of GIS: Fundamental geographic and cartographic concepts. The
Guilford Press, 31 pp.

• Heywood, I., Cornelius, S. and Carver, S. (2006) An Introduction to Geographical Information Systems,
3rd edition. Prentice Hall.

• Open Geospatial Consortium (OGC) members: http://www.opengeospatial.org/ogc/members

• Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W. (2005) Geographic Information Systems
and Science, 2nd edition. Chichester: Wiley.

• Maguire, D.J., Goodchild, M.F. and Rhind, D.W. (1997) Geographic Information Systems: principles and
applications. Longman Scientific and Technical, Harlow.

• Goodchild, M.F. and Kemp, K.K. (eds.) (1990) NCGIA Core Curriculum, Volumes 1, 2 and 3.
NCGIA, University of California, Santa Barbara, USA.

• Pathan, S.K. and Navalgund, R.R. (1996) The role of Geomatics in Natural Resources development and
planning. Proceedings of Indian Society of Geomatics.

• Thurston, J., Poiker, T.K. and Moore, J.P. (2003) Integrated Geospatial Technologies: A Guide to
GPS, GIS, and Data Logging. Hoboken, New Jersey: Wiley.

• Tutorial notes on Geographic Information System, GIS-The Basics, Indian Society of Geomatics, 1996,
Space Applications Centre (ISRO).
