Professional Documents
Culture Documents
Unit 3 GIS DATABASE SPATIL (Raster, Vecctor) AND NON-SPATIAL
Unit 3 GIS DATABASE SPATIL (Raster, Vecctor) AND NON-SPATIAL
and Non-spatial 2
2.1 Introduction
2.2 Types of GIS Database
The database is one of the most important com-
ponents in GIS and acts as a warehouse or The database plays a major role in providing input
storeroom as well as providing a backbone of any in the GIS environment. The analysis and
GIS analysis. It is a collection of data stored in a results totally depend upon these databases. In a
real-world system, all the components and fea- 2.3 Spatial Database
tures of the Earth are integrated by a cause-and-
effect relationship; for example, wheat crops All geographical or spatial features of the world
require particular climate zones, seasonal patterns, can be represented and stored in the form of a
elevations, soils with appropriate levels of water map. This is basically a set of points, lines and
and micro-nutrients —among many other similar polygons (area). These features correspond to a
criteria—for particular customers. But in the GIS uniquely defined location on the Earth’s surface.
environment, we segregate them as different Spatial data have also been described as ‘any
themes or features or as a layer for the generation data concerning phenomena spatially distributed’
of an individual database. In other words, the real in two or more dimensions. These spatial features
features of the Earth are stored as different layers are digitally stored in vector and raster data
according to different features (Fig. 2.1). Theo- structures.
retically, layers are individual parts or slices or
strata of a real-world system which is separately
represented in the legend in a map with different 2.3.1 Spatial Data Representation
symbology; for example, political boundaries, in Vector Format
road networks, land use, soil type, climate, water
bodies etc. can be considered as different layers in Vector data structures represent geographic data
the GIS environment. These layers may be in the by locating x and y coordinates (longitude and
form of vector or raster data formats. latitude) and are represented in point, line and
Basically, two types of database are used in a polygon features (Fig. 2.3). The vector repre-
GIS environment: one is a spatial database and sents the object as exactly as possible. The
the other is non-spatial (Fig. 2.2). The spatial coordinate space is assumed to be continuous,
database can have a vector data structure or raster allowing all positions, lengths and dimensions to
data structure. be defined precisely.
Point feature This is defined as a dimensionless,
single geometric position. It occupies a discrete
location and is located by a pair of x and y co-
ordinates such as latitude and longitude on the
Earth’s surface, to locate features such as eleva-
tion points, electric poles, houses, wells or towns
using a unique symbology.
Line, string or polyline features These are
obtained by connecting two or more points to
form a second type of spatial feature called a line
or polyline which is one dimension (the feature
has no width) and stretches in one direction only.
A line connects two points and a string or
polyline is a sequence of two or more connected
lines; alternatively we can call it a series of
connected x–y coordinate pairs. An arc, chain
and string comprise a set of x and y coordinate
pairs describing a continuous complex line.
A line consists of two ends known as nodes and
Fig. 2.1 Real-world system and layers in a GIS two lines are connected by these nodes only. The
environment straight line has fewer nodes or x–y coordinates,
2.3 Spatial Database 17
The resolution or scale of the raster data has a in the garden, but if it is 0.5 m (high resolution),
relation with the pixel or cell size and it can represent it very clearly in the images.
ground-feature size. If the pixel size is very The raster data model represents phenomena
small, known as high-resolution imagery, then it or features using a tessellation approach. This is
could be representing detailed features of the the partition of space into cells that together
Earth’s surface. For large-scale mapping, if the depict the geographical phenomena. There are
pixel size is larger than the feature, then it cannot regular and irregular tessellations. The regular
represent the small features; this is known as tessellations have small shapes and sizes of cells
coarse-resolution imagery (Fig. 2.7). For exam- and the values of the cells show their attributes.
ple, if the pixel size is 1000 m (coarse resolution) There are three regular tessellations—i.e.
then it would not be able to represent the fountain hexagonal, triangular and square (Fig. 2.8).
2.3 Spatial Database 19
The square cells are widely used to represent the other hand, overlay operations such as union and
Earth’s features. The regular pattern leads to fast intersection are difficult to perform without
algorithms for storage, retrieval and analysis but returning to a full grid representation. The
does not adapt to real spatial phenomena. boundaries between regions should be stored
There are four main ways to represent irreg- twice for data redundancy (Burrough 1986).
ular tessellations, such as chain codes, run-length In this example, the 69 cells have been com-
codes, block codes and quadtrees codes. These pletely coded by 22 numbers, thereby effecting a
are mainly used for compacting the data in order considerable reduction in the space needed to
to reduce the storage space. In the chain codes, store the data (many-to-one relation). On the
the boundary of the region in raster data can be other hand, too much data compression may lead
given in terms of its origin and a sequence of unit to increased processing requirements during
vectors in the cardinal directions (Fig. 2.9). The cartographic processing and manipulation.
directions can be numbered (East = 0, North = With block code, the idea of run-length codes
1, West = 2, South = 3); for example, if we start can be extended to two dimensions by using
at cell row = 10, column = 1, the boundary of square blocks to tile the area to be mapped. The
the region is coded clockwise. The run-length data structure consists of just three features: the
code allows the points in each mapping unit to be number, origin (the centre or bottom left) and
stored per row in terms, from left to right, of a radius of each square. This is called a medial axis
beginning cell and an end cell (Fig. 2.10). transformation (MAT) (Burrough et al. 2015).
Chain codes provide a very compact way of The region can be stored in 17 unit squares +
storing a regional representation and they allow 94-squares + 116-squares. Given that two coor-
the estimation of areas and perimeters. On the dinates are needed for each square, the region can
20 2 GIS Databases: Spatial and Non-spatial
Example of Chain-code
0, 1, 02, 3, 02, 1, 0, 3, 0, 1, 03, 32, 2, 33, 02, 1, 05, 32, 22, 3, 23, 3, 23, 1, 22, 1, 22, 1,
22, 1, 22, 13.
Table 2.1 Advantages and disadvantages of raster and vector data structures
Raster methods
Advantages Disadvantages
Simple data structure Volume of graphical data
The overlay and combination of mapped data The use of large cells to reduce data volumes means that
with remotely sensed data is easy phenomenologically recognisable structures can be lost and there
can be a serious loss of information
Various kinds of spatial analysis are easy Crude raster maps are considerably less elegant than maps drawn
with fine lines
Simulation is easy because each spatial unit Network linkages are difficult to establish
has the same size and shape
The technology is cheap and is being Projection transformation is time-consuming unless special
energetically developed algorithms or hardware are used
Vector methods
Good representation of phenomenological Complex data structures
data structure
Compact data structure Combination of several vector polygon maps or polygon and
raster maps through overlay creates difficulties
Topology can be completely described with Simulation is difficult because each unit has a different topological
network linkage form
Accurate graphical representation Display and plotting can be expensive, particularly for high
quality, colour and cross-hatching
Retrieval, updating and generalisation of The technology is expensive, particularly for the more
graphics and attributes are possible sophisticated software and hardware
Spatial analysis and filtering within polygons are impossible
sources such as Bhuvan Indian Geo-platform, in text format (ASCII—American Standard Code
USGS EarthExplorer, USGS Global Visualiza- for Information Interchange) for GIS input.
tion Viewer (GloVis), ESA’s Sentinel, Japan Field survey This is a collection of positional
Aerospace Exploration Agency (JAXA), Global information and its attributes, acquired in the
Land Cover Facility (GLCF) etc. (see Appendix). field. It can be cross-checked by ground valida-
Aerial photograph A photograph taken from an tion of spatial information.
aircraft or drone is known as an aerial photo-
graph. It can be digital or analogue. Very detailed
information is available in the photograph which 2.4 Non-spatial Database
is utilised for updating the topographical sheet
and providing detailed mapping. The non-spatial database represents the charac-
GPS data A global positioning system (GPS) is teristics, also called attribute data or descriptive
a constellation of 24 satellites that continuously information, of the spatial database. It describes
transmit coded information, which makes it regions and defines the characteristics of spatial
possible to precisely identify locations on Earth features within geographic regions. The
by measuring their distance from the satellites. It non-spatial data are usually alpha-numeric and
provides geographical location, altitude and time provide information such as the colour, texture,
2.4 Non-spatial Database 23
quantity, quality and value of features. For 2.4.2 Network Data Structure
example, a map of Delhi with its ‘attributes’ data
describes Delhi’s characteristics such as elevation, The network data structure has multiple con-
land use, populations, boundary information etc. nectivity with different hierarchies of data—in
This information is usually kept in tabular other words, a node may have more than one
form consisting of rows and columns, managed parent node. It provides fast linkages between
as a normal database. Each row represents a various levels of data. Network structures have a
feature of a map, and a column represents the structure, like a ring pointer. They are used in
desired characteristics of a particular row. navigating complex topological structures. They
Changes to any feature on that map, for example avoid data redundancy and make good use of
an urban boundary, can be generated by the available data (Fig. 2.14).
attribute data.
A separate data model is used to store and
maintain non-spatial or attribute (additional 2.4.3 Relational Data Structure
information) data for GIS. These data models
may exist internally within the GIS software, or The relational database stores data in simple
may be reflected externally through database records, containing an ordered set of attribute
management systems (DBMS). A variety of values grouped together in a 2-dimensional table
different data models exists. The most common known as a relation. Each row in the table is a
non-spatial database structures are hierarchical, record of a feature and each record has a set of
network and relational. attributes stored in columns. Different tables are
related through the use of a common identifier
called a primary key which should be unique in
2.4.1 Hierarchical Data Structure nature within the whole database. Relational
databases have a great advantage because their
When the data have family relations like a par- structure is very flexible and can meet the demands
ent–child relationship or tree structure, an ad- of all queries that can be formulated using the
ministrative structure represents the hierarchical rules. Adding or removing spatial and non-spatial
structure: i.e. as a state, district, tehsil (adminis- data is very easy because it involves just adding
trative division), block or village. The data are and removing a row and column. The disadvan-
stored in such a way that a hierarchy is main- tage of relational databases is that many of the
tained. Each node can be divided into one or operations involve sequential searches through the
more additional nodes. The hierarchical system file to find the right data to satisfy the specific
assumes that each part of the hierarchy can be relation (Fig. 2.15).
reached using a key that fully describes the data
structure, such as soil family. It is easy to
understand, update and expand, like the open 2.5 Conclusion
system of land-use classification. Data access via
keys is easy for key attributes but very difficult The database is an important part of the GIS and
for associated attributes. Travel within the data- plays a major role in providing inputs in a GIS
base is restricted to only up and down paths. It environment. The features of the real-world sys-
has a large index file and certain attribute values, tem are converted into different themes or layers of
repeated many times, leading to data redundancy, the database. These layers are composed of char-
which further increases storage and access costs acteristics of those features in the form of spatial
(Fig. 2.13). and non-spatial types of database. So, in GIS, two
24 2 GIS Databases: Spatial and Non-spatial
types of database are used: one is spatial and the vector data models have different utilities, repre-
other non-spatial. All the geographical features of sent geographical data, and are complementary to
the Earth’s surface or spatial features are repre- and inter-convertible with each other. There are
sented by point, line and polygon features. These different sources of spatial databases, in the form
features can be stored in vector and raster data of analogue maps, remote-sensing imagery, aerial
structures. The vector data structure is stored in a photographs, GPS data and field surveys.
pair of x–y coordinates, i.e. it is dimensionless. A non-spatial database includes the charac-
The spatial features in raster format are a group of teristics of a spatial database stored alphanu-
pixels or cells arranged in a row and column. Each merically and provides information in tabular
cell has a dimension of length and width. The form. The three most common non-spatial data-
raster data are represented in cells and stored in base structures are (1) hierarchical, where a tree
regular and irregular tessellations. Both raster and structure exists; (2) networked, where multiple
2.5 Conclusion 25
connectivity with all the data is there; and most 7. What are the differences between raster and
importantly, (3) relational, where the relation is vector data structures to represent the spatial
established while joining or appending the vari- database?
ous databases. 8. Discuss the non-spatial data structure in GIS.
9. Discuss the importance of the unique identi-
Questions
fication number and its relation with data
structure.
1. What do we mean by database in the GIS
environment?
2. Discuss the various types of databases and
their sources in GIS.
References
3. What are the differences between spatial and
Burrough PA (1986) Principles of geographical informa-
non-spatial databases in GIS? tion systems for land resources assessment. Oxford
4. Discuss the spatial data structures that repre- University Press, New York
sent the Earth’s features. Burrough PA, McDonnell RA, Lloyd CD (2015) Princi-
5. Discuss the storage of spatial features in ples of geographical information systems. Oxford
University Press, Oxford
vector format. Kumar D, Kaur R (2015) Remote sensing, 1st edn. A.K.
6. Discuss the methods of spatial data storage in Publications, New Delhi
raster format.