
Fundamental Operations in GIS

Data Structures

Alex Takura Kuhudzai


Data Models
We want to Model the Real World
and represent it digitally on the
computer
Representing Reality
Definitions

• Map / Hard Copy: a paper copy of a drawing of an area.
• Coverage: the digital form of a map. It is usually a single theme.
• Data Structure: the format (type of construction) of data that the GIS software understands and uses.
Data Models:
Entity - Object - Attribute
An entity is "a phenomenon of interest in reality".
An object is "a digital representation of all or part of an entity".
An attribute is a characteristic of an entity selected for representation.
Data Models:
Field - Object Based
The FIELD-based model
Partitions space into a regular grid and assigns attributes to the pixels; therefore, it encodes spatial relationships implicitly.

The OBJECT-based model
Identifies attributes and defines their location; therefore, it encodes spatial relationships explicitly.

[Figure: the same scene shown as a FIELD and as an OBJECT representation.]
Raster / Vector Representations
Raster / Vector
Raster
• A grid
• A 2D array of numbers arranged in rows and columns
• Cells / pixels

Vector
• Vector features are defined by coordinate points and chains / arcs / lines connecting these points
• Cartographic
• Very similar to original data
Raster Structure
[Figure: a georeferenced raster grid, labelled with its extent, its rows and columns (R1, C1, ... R3, C5), its pixels and the pixel resolution.]
Raster Values
• Each cell contains a SINGLE data value, i.e. (x, y, z): a position (x, y) plus one value (z).
• This number is the value of the attribute being mapped / coded (e.g. temperature, elevation, colour).
• The correct data type must be used so that the full value is stored accurately.
Common Raster Data Structures
No. of Bits    Numeric Type                       Signed / Unsigned   Possible Range
1 (2^0)        Integer                            Unsigned            0 to +1
8 (2^3)        Integer                            Unsigned            0 to +255
8              Integer                            Signed              -128 to +127
16 (2^4)       Integer                            Unsigned            0 to +65 535
16             Integer                            Signed              -32 768 to +32 767
24 (3 x 2^3)   Set of 3 integers indicating RGB   Unsigned            Red, green and blue values: 0 to 255 each
32 (2^5)       Integer                            Unsigned            0 to +4 294 967 295
32             Floating point                     Signed              Wide range of decimal numbers with 7 place accuracy
64 (2^6)       Double-precision floating point    Signed              Wide range of decimal numbers with 14 place accuracy
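
The integer ranges in the table follow directly from the number of bits. The sketch below is illustrative only: the function names `integer_range` and `smallest_integer_type` are hypothetical, and the list of candidate types is simply the integer rows of the table.

```python
def integer_range(bits, signed):
    """Return the (minimum, maximum) whole numbers that fit in `bits` bits."""
    if signed:
        # One bit is spent on the sign, so the range is split around zero.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def smallest_integer_type(lowest, highest):
    """Return (bits, signed) for the narrowest table row that holds both values.

    Returns None when none of the integer rows in the table is wide enough.
    """
    # Candidate types: the integer rows of the table (an assumption of this sketch).
    candidates = [(1, False), (8, False), (8, True),
                  (16, False), (16, True), (32, False)]
    for bits, signed in candidates:
        minimum, maximum = integer_range(bits, signed)
        if minimum <= lowest and highest <= maximum:
            return bits, signed
    return None


# Elevations of 0 to 3000 m do not fit in 8 bits (maximum 255).
elevation_type = smallest_integer_type(0, 3000)
# Temperatures of -40 to +55 need a sign but fit in 8 bits.
temperature_type = smallest_integer_type(-40, 55)
print(elevation_type, temperature_type)
```

Choosing the narrowest type that still covers the data keeps the raster small without truncating any cell value.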
Raster Histograms
A raster typically contains hundreds to millions of cells.
To investigate these values, a histogram of value vs. frequency is plotted to help analyze the data in the raster.
Statistics can also be calculated for the raster.
These include:
• Range (Minimum to Maximum)
• Mean
• Median
• Mode
• Standard deviation
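
As a rough illustration, the histogram and the statistics listed above can be computed with the Python standard library. The 3 x 4 `raster` below is made-up sample data, and the population standard deviation is used on the assumption that the raster holds every cell rather than a sample.

```python
from collections import Counter
import statistics

# Hypothetical raster: 3 rows x 4 columns of integer cell values.
raster = [
    [1, 2, 2, 3],
    [2, 2, 3, 4],
    [3, 3, 3, 5],
]

# Flatten the grid; position does not matter for a histogram.
cells = [value for row in raster for value in row]

# Histogram: cell value -> number of cells holding that value.
histogram = Counter(cells)

summary = {
    "minimum": min(cells),
    "maximum": max(cells),
    "mean": statistics.mean(cells),
    "median": statistics.median(cells),
    "mode": statistics.mode(cells),
    "std_dev": statistics.pstdev(cells),  # population standard deviation
}

for value in sorted(histogram):
    print(value, "#" * histogram[value])
print(summary)
```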
Raster Histogram

The image histogram is a graphical representation of the brightness values that comprise an image, with brightness (0 - 255) displayed on the x-axis and frequency on the y-axis.
Use of the Raster Structure
The Raster data structure is used to
represent:
 Continuous data
• elevation
• rainfall
• temperature
 Remotely Sensed Images
 Colour Composite Pictures
Digital Images

Each pixel comprises a measurement stored as a digital number in the matrix.
IRS-1: Indian Satellite
[Figures: two IRS - LISS scenes, captioned "Indian Harbour" and "Southern Iran", shown as 5.8 m Panchromatic and 23 m Colour imagery.]
Colour Composite Pictures

NOT images, but a colour PICTURE, e.g. a scanned photograph.

[Figure: a colour picture with one pixel sampled as R=50, G=158, B=236.]
Displaying Raster Images

A single channel (band) is displayed as a greyscale image.

Three bands may be displayed together as a colour image, with each band represented by a different primary colour - red, green, blue.
Landsat 8 Greyscale Image of
Gauteng
Landsat 8 RGB Image of Gauteng
Displaying Colour Raster Images

Display RGB
Rasters
Raster Resolution
The spatial resolution of a raster (i.e. the ground area covered by each pixel) is dependent upon numerous factors such as:
• data capture source (e.g. satellite sensor, field sampling strategy)
• accuracy of data capture
• output requirements and purpose of project
Increasing the number of cells in a coverage will increase its spatial accuracy BUT uses more computer storage space and takes longer to process.
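
The trade-off can be made concrete with a little arithmetic. This is a minimal sketch under stated assumptions: a square 10 km x 10 km extent, uncompressed storage, and 8 bits per cell; the function names are hypothetical.

```python
import math


def raster_cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover the extent at this cell size."""
    columns = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * columns


def raster_storage_bytes(width_m, height_m, cell_size_m, bits_per_cell=8):
    """Uncompressed storage, assuming every cell uses `bits_per_cell` bits."""
    cells = raster_cell_count(width_m, height_m, cell_size_m)
    return math.ceil(cells * bits_per_cell / 8)


# Assumed extent: 10 km x 10 km, stored at four different cell sizes.
for cell_size in (40, 23, 10, 5):
    cells = raster_cell_count(10_000, 10_000, cell_size)
    size = raster_storage_bytes(10_000, 10_000, cell_size)
    print(f"{cell_size:>2} m cells: {cells:>9} cells, {size:>9} bytes")
```

Halving the cell size quadruples the number of cells, which is why finer resolution quickly becomes expensive in storage and processing time.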
Spatial Resolution

Coarse / Low Resolution: only large features are visible.
Fine / High Resolution: small objects can be detected.

[Figure: example images at 5, 10, 23 and 40 metre resolution.]
Mixed Pixels

Spatial Resolution   Amount of Scene Covered by Mixed Pixels
A (Fine)             12%
B                    26%
C                    34%
D (Coarse)           47%
Rasters: Advantages
 Simple Data Structure
 Inherent Information About Position - every
cell knows where it is relative to the other cells
 Easy Analyses
 Easy Modelling
 Smooth representation of continuous data
(E.g. DEM)
 Remote Sensing Imagery
Rasters: Disadvantages
 Only one value per cell
 Storage Volume
 Data Redundancy (Compression)
 Area / Distance calculations inaccurate
 Lack of public understanding

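[Figure: two right-angled triangles with sides of 3 and 4. The hypotenuse is labelled 5 in one and "4 or 7???" in the other.]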
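
One plausible reading of the "5" versus "4 or 7???" labels (an assumption, since only the figure's labels survive) is that the straight-line distance across 4 columns and 3 rows is 5, while counting cell steps gives 7 if only edge moves are allowed and 4 if diagonal moves count as one step. The function names below are hypothetical.

```python
import math


def straight_line_distance(d_cols, d_rows):
    """True (Euclidean) distance between two cell centres, in cell widths."""
    return math.hypot(d_cols, d_rows)


def edge_step_distance(d_cols, d_rows):
    """Steps needed when moving only across cell edges (no diagonals)."""
    return abs(d_cols) + abs(d_rows)


def diagonal_step_count(d_cols, d_rows):
    """Steps needed when a diagonal move counts the same as an edge move."""
    return max(abs(d_cols), abs(d_rows))


# Two cells that are 4 columns and 3 rows apart.
print(straight_line_distance(4, 3))  # 5.0
print(edge_step_distance(4, 3))      # 7
print(diagonal_step_count(4, 3))     # 4
```

Neither step count matches the true distance, which is the sense in which raster distance calculations are inaccurate.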
Vector Structure
The Vector data structure is used to represent discontinuous (discrete) data:
• soil
• vegetation
• geology
Vector data features include:
• Point / Line / Polygon Elements
• Attached to attributes
• Topology
Vector Structure
• Vector features are defined by “coordinate points” - spots located precisely by X-Y coordinates.
• A vector is a line with direction and magnitude, running from a start point (x1, y1, z1) to an end point (x2, y2, z2).
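
A minimal sketch of how direction and magnitude follow from the two end points. The coordinates are made up, and measuring direction anticlockwise from the positive X axis is a convention chosen for this example, not something the slides specify.

```python
import math


def vector_between(start, end):
    """Component-wise difference end - start for (x, y, z) points."""
    return tuple(e - s for s, e in zip(start, end))


def magnitude(vector):
    """Length of the vector."""
    return math.sqrt(sum(component ** 2 for component in vector))


def direction_degrees(vector):
    """Planar direction in degrees, anticlockwise from the +X axis (assumed convention)."""
    return math.degrees(math.atan2(vector[1], vector[0])) % 360


# Hypothetical start (x1, y1, z1) and end (x2, y2, z2) points.
start = (1.0, 2.0, 0.0)
end = (4.0, 6.0, 0.0)

forward = vector_between(start, end)
backward = vector_between(end, start)

print(forward, magnitude(forward), round(direction_degrees(forward), 2))
print(backward, magnitude(backward), round(direction_degrees(backward), 2))
```

Swapping the start and end points keeps the magnitude but reverses the direction, which is why the order of the points matters.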
Types of Vector Elements
Nominal (Qualitative)
• Point: Airport, Town, Hospital (H)
• Arc / Line: River, Boundary
• Polygon: Forest, Ocean, Swamp

Ordinal (Rank)
• Point: City, Town, Village
• Arc / Line: Highway, Main Road, Gravel Road, Trail
• Polygon: Minor Flooding, Major Flooding

Interval (Quantitative)
• Point: 10 People, 100 People, 1000 People
• Arc / Line: Contour Lines (300, 200, 100)
• Polygon: Density
Attribute Data
An attribute is a description of a feature - a
characteristic of it. They are used to describe
spatial data and are linked to the spatial data.

Qualitative / Descriptive Attributes
• No measurement or magnitude
• Names, descriptions, labels
• Codes - numbers or letters
• Categories
• No statistical analysis

Quantitative Attributes
• Have mathematical meaning
• Measurements
• Statistical analysis
Attribute data

PROPERTY NUMBER   AREA (ha)   OWNER    TAX CODE   SOIL QUALITY
1                 100,000     TALATU   B          HIGH
2                 50,100      BRAUDO   A          MEDIUM
3                 90,900      BRAUDO   B          LOW
4                 40,800      ANUNKU   A          HIGH

Who?

A. Attribute Description
Q: What are the attributes of Property 2?
A: (Look at records)

B. Where do certain conditions exist? (Field)
Q: Who owns High soil properties?
A: Talatu and Anunku
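
The two kinds of question can be sketched as lookups on the attribute table. This is only an illustration: the field names and function names are hypothetical, and the areas are read as whole numbers with thousands separators, which is an assumption.

```python
# The attribute table from the slide, one dictionary per property record.
properties = [
    {"property_number": 1, "area_ha": 100000, "owner": "TALATU",
     "tax_code": "B", "soil_quality": "HIGH"},
    {"property_number": 2, "area_ha": 50100, "owner": "BRAUDO",
     "tax_code": "A", "soil_quality": "MEDIUM"},
    {"property_number": 3, "area_ha": 90900, "owner": "BRAUDO",
     "tax_code": "B", "soil_quality": "LOW"},
    {"property_number": 4, "area_ha": 40800, "owner": "ANUNKU",
     "tax_code": "A", "soil_quality": "HIGH"},
]


def attributes_of(property_number):
    """Question A: look up the full record of one feature."""
    for record in properties:
        if record["property_number"] == property_number:
            return record
    return None


def owners_with_soil_quality(quality):
    """Question B: find the owners of every feature meeting a condition."""
    return [record["owner"] for record in properties
            if record["soil_quality"] == quality]


print(attributes_of(2))
print(owners_with_soil_quality("HIGH"))
```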
Visualization of Attribute Data
NATION         DATA QUANTITY   DATA QUALITY
Algeria        45              High
Angola         20              Medium
Benin          40              High
Cameroon       18              Low
South Africa   50              High
Sudan          40              Medium
Swaziland      21              Low
Tunisia        11              Low

• NATION: qualitative attribute - description (Name)
• DATA QUANTITY: quantitative attribute
• DATA QUALITY: qualitative attribute

[Figure: Africa is mapped in colours reflecting qualitative data - Data Quality.]
Vector Structure
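[Figure: a POINT, a LINE and a POLYGON plotted on a coordinate grid (X from 1 to 10, Y from 1 to 8). The point is a single coordinate pair. The line and the polygon boundary are built from chains (Chain 1, Chain 2, ...) that start and end at nodes (NODE 1, NODE 2) and change direction at vertices (VERTEX 1, VER. 2, ...). Labelled coordinates include (1, 2), (10, 1), (6, 2.5), (5, 4), (2, 4), (7, 5), (9, 5), (1, 6) and (4, 8).]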
Storing Point Data

ID No.   X       Y       Z
1        10231   32000   541
2        X2      Y2      Z2

[Figure: two points labelled 1 and 2.]
Storing Line Data - Inefficient

LINE 1
X1 Y1 Z1
X2 Y2 Z2
LINE 2
X2 Y2 Z2
X3 Y3 Z3

[Figure: points 1, 2 and 3 joined by lines 1 and 2. The coordinates of point 2 are stored twice.]
Storing Line Data - Relational

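Line Table
Line ID   Point List
1         1, 2
2         2, 3

Point Table (Locational Table)
ID No.   X       Y       Z
1        10231   32000   541
2        X2      Y2      Z2

The point IDs in the Point List are the index field that links the Line Table to the Point Table.
Note: the order of the Point List gives the line its direction.

[Figure: points 1, 2 and 3 joined by lines 1 and 2.]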
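
A minimal sketch of the relational layout, using dictionaries as stand-ins for the two tables. Only point 1 has real coordinates in the slide; the values for points 2 and 3 are hypothetical placeholders for X2, Y2, Z2 and X3, Y3, Z3.

```python
# Point Table: point ID -> (X, Y, Z). Points 2 and 3 are made-up values.
points = {
    1: (10231.0, 32000.0, 541.0),
    2: (10250.0, 32040.0, 543.0),
    3: (10300.0, 32010.0, 548.0),
}

# Line Table: line ID -> ordered list of point IDs (the index field).
lines = {
    1: [1, 2],
    2: [2, 3],
}


def line_coordinates(line_id):
    """Follow the link from the Line Table to the Point Table."""
    return [points[point_id] for point_id in lines[line_id]]


def coordinate_triples_without_point_table():
    """How many coordinate triples the inefficient layout would store."""
    return sum(len(point_list) for point_list in lines.values())


print(line_coordinates(1))
print(line_coordinates(2))
print(len(points), "stored points instead of",
      coordinate_triples_without_point_table())
```

Because both lines refer to point 2 by ID, moving that point in the Point Table moves the end of line 1 and the start of line 2 together.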
Storing Polygon Data - Inefficient
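POLYGON 1
X1 Y1 Z1
X2 Y2 Z2
X3 Y3 Z3
X4 Y4 Z4
X5 Y5 Z5
X6 Y6 Z6
POLYGON 2
X2 Y2 Z2
X3 Y3 Z3
X7 Y7 Z7

[Figure: Polygon 1 (points 1 to 6, lines 1 to 6) and Polygon 2 (points 2, 3 and 7, lines 7, 8 and 2). The coordinates of points 2 and 3 are stored twice.]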
Storing Polygon Data - Relational
Polygon Table
Poly ID   Line List
1         1, 2, 3, 4, 5, 6
2         7, 8, 2

Line Table
Line ID   Point List
1         1, 2
2         2, 3
3         3, 4
...
8         7, 3

[Figure: Polygon 1 (lines 1 to 6) and Polygon 2 (lines 7, 8 and 2), with points 1 to 7 labelled.]
Observations
1. The sequence ordering of the Point List gives the line its DIRECTION.

Line ID   Point List
1         1, 2
2         2, 3

2. Line 2 is repeated in both Polygons, therefore they “know” that they are NEIGHBOURS.

Poly ID   Line List
1         1, 2, 3, 4, 5, 6
2         7, 8, 2
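
The second observation can be sketched as a set intersection on the Polygon Table: two polygons are neighbours when their line lists share at least one line ID. The function names are hypothetical.

```python
from itertools import combinations

# Polygon Table from the slide: polygon ID -> list of line IDs.
polygons = {
    1: [1, 2, 3, 4, 5, 6],
    2: [7, 8, 2],
}


def shared_lines(first_id, second_id):
    """Line IDs that appear in the line lists of both polygons."""
    return sorted(set(polygons[first_id]) & set(polygons[second_id]))


def neighbours():
    """Map each pair of polygons that share a line to the shared line IDs."""
    result = {}
    for first_id, second_id in combinations(sorted(polygons), 2):
        common = shared_lines(first_id, second_id)
        if common:
            result[(first_id, second_id)] = common
    return result


print(neighbours())
```

No coordinates are compared here; the neighbour relation falls out of the way the tables reference each other.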
Topology
 By building vector data in this manner,
allowing direction and neighbourhood
relations to be intrinsically incorporated
into the data structure, the data has
“Intelligence”.
 This is known as TOPOLOGY
Topology
 Topology is concerned with the spatial
relationships between elements
 Each element:
• Knows where it is (position)
• Knows what is around it
• Knows how to get around (from A to B)
Topological Properties: Adjacency

Poly A is next to Poly B.
Arc 1 knows it goes from Node 1 to Node 2, therefore the Left Poly is A and the Right Poly is B.

[Figure: Arc 1 running from Node 1 to Node 2 between Poly A and Poly B.]
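
One common way to hold this information is an arc record with from/to nodes and left/right polygons. The sketch below assumes that layout; the `Arc` class and its field names are hypothetical rather than taken from any particular GIS package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    arc_id: int
    from_node: int
    to_node: int
    left_poly: str
    right_poly: str

    def reversed(self):
        """Same arc travelled the other way: nodes and sides both swap."""
        return Arc(self.arc_id, self.to_node, self.from_node,
                   self.right_poly, self.left_poly)


def adjacent_polygons(arcs):
    """Pairs of polygons that lie on opposite sides of some arc."""
    return {frozenset((arc.left_poly, arc.right_poly))
            for arc in arcs if arc.left_poly != arc.right_poly}


# Arc 1 from the slide: Node 1 -> Node 2, Poly A on the left, Poly B on the right.
arc_1 = Arc(arc_id=1, from_node=1, to_node=2, left_poly="A", right_poly="B")

print(adjacent_polygons([arc_1]))
print(arc_1.reversed())
```

Left and right only make sense because the arc has a direction; reversing the arc swaps the two sides.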
Topological Properties: Connectivity
Node 1 has Arcs (Chains / Lines) A, B and C entering / leaving it.
Therefore it knows that Arc A is connected to Arcs B and C.
It also knows in which direction these connections are.

[Figure: Arcs A, B and C meeting at Node 1.]
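
Connectivity can be sketched from an arc table that records each arc's start and end node. The node numbers other than Node 1, and the direction of each arc, are assumptions made for this example; the slide only says that A, B and C meet at Node 1.

```python
# Arc table: arc ID -> (from_node, to_node). Nodes 0, 2 and 3 are hypothetical.
arcs = {
    "A": (0, 1),  # assumed to enter Node 1
    "B": (1, 2),  # assumed to leave Node 1
    "C": (1, 3),  # assumed to leave Node 1
}


def arcs_at_node(node):
    """All arcs that start or end at the node."""
    return sorted(arc_id for arc_id, ends in arcs.items() if node in ends)


def connected_arcs(arc_id):
    """Other arcs that share a node with the given arc."""
    found = set()
    for node in arcs[arc_id]:
        found.update(arcs_at_node(node))
    found.discard(arc_id)
    return sorted(found)


def direction_at_node(arc_id, node):
    """Whether the arc enters or leaves the node."""
    from_node, to_node = arcs[arc_id]
    if to_node == node:
        return "entering"
    if from_node == node:
        return "leaving"
    return None


print(arcs_at_node(1))
print(connected_arcs("A"))
print(direction_at_node("A", 1), direction_at_node("B", 1))
```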
Topological Properties: Containment

Polygon B is contained within Polygon A.
Polygon A contains Polygon B within it.
These are also referred to as Islands.

[Figure: Polygon B lying inside Polygon A.]
Topology Types: Strict / Polygonal
 No two nodes have the same X and Y
coordinates
 All lines start and end in nodes
 No lines may intersect unless there is a node
 Enclosed areas are defined as a polygon
 A point can only exist in one polygon
Polygonal topology is essential for accurate ground
measurements and many vector overlay operations.
However, it takes time and care to maintain.
Topology Types: Planar
 No two nodes have the same X and Y
coordinates
 All lines start and end in nodes
 No lines may intersect unless there is a node
 There are no polygons

Planar topology is used for hydrology (if there are no lakes) and for simple transport systems.
Topology Types: Network
 No two nodes have the same X and Y coordinates
 All lines start and end in nodes
 There are no polygons
 Lines may cross themselves or each other without a
node, although nodes can be entered if necessary

Network topology is used for network analysis - routing and allocation.
Raster & Vector Overlay
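Raster: the landscape is sampled at regular spacing and the input layers are stored as arrays of pixels.

Overlaying an input layer with zones A and B on an input layer with zones X and Y gives a result with zones AX, BX, AY and BY.

[Figure: the same overlay shown for Vector and for Raster layers.]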
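
For the raster case, overlay can be sketched as a cell-by-cell combination of two grids that share the same extent and cell size. The 2 x 2 grids below are made-up stand-ins for the A/B and X/Y layers, and joining the labels as strings is just one simple way to code the result.

```python
# Two hypothetical input layers covering the same 2 x 2 grid.
layer_ab = [
    ["A", "B"],
    ["A", "B"],
]
layer_xy = [
    ["X", "X"],
    ["Y", "Y"],
]


def overlay(first, second):
    """Combine two rasters cell by cell; both must have the same shape."""
    if len(first) != len(second) or any(
            len(row_1) != len(row_2) for row_1, row_2 in zip(first, second)):
        raise ValueError("rasters must have the same number of rows and columns")
    return [[cell_1 + cell_2 for cell_1, cell_2 in zip(row_1, row_2)]
            for row_1, row_2 in zip(first, second)]


result = overlay(layer_ab, layer_xy)
for row in result:
    print(row)
```

Because the cells already line up, no geometry has to be intersected; a vector overlay would instead have to compute new polygon boundaries.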
Vectors: Advantages
 Maplike / Easier to Read
 Spatially Accurate: Area / Distance
calculations precise
 Topology
 Compact Storage
 Can be linked to numerous attributes
Vectors: Disadvantages
 Complex digital structure
 Data Modelling complicated and time
consuming
 Requires “high-end” computers
 Requires more experience and knowledge
to manipulate and achieve answers
 More expensive
CAD Data
 CAD data may look like vector data but
is NOT the same!
 CAD has no Topology - it cannot perform
spatial or statistical analysis
 CAD supports layers
 CAD was developed for computerized
drafting and design (architecture,
engineering etc.)
CAD Elements
[Figure: examples of CAD elements - Point, Line, Text, Circle, Rectangle, Arc, Regular Polygon, Arc Chord, Polygon and 3D Elements.]
CAD Elements: Geometric Shapes
As seen previously, CAD elements are
geometric shapes - they store the geometric
description of the shape as an equation

For example: Circle = Centre point (x, y) + radius (r)

This means:
• faster to display
• less storage space
• elements can be resized easily
• elements can be moved easily
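
A minimal sketch of the idea, assuming a circle is held as just a centre and a radius. The `Circle` class and its method names are hypothetical; `as_vertices` shows what the same shape would cost if stored as a list of boundary points instead.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Circle:
    x: float  # centre point X
    y: float  # centre point Y
    r: float  # radius

    def moved(self, dx, dy):
        """Moving the element only changes the centre point."""
        return replace(self, x=self.x + dx, y=self.y + dy)

    def resized(self, factor):
        """Resizing the element only changes the radius."""
        return replace(self, r=self.r * factor)

    def as_vertices(self, count=36):
        """Approximate the circle by `count` boundary points."""
        return [(self.x + self.r * math.cos(2 * math.pi * i / count),
                 self.y + self.r * math.sin(2 * math.pi * i / count))
                for i in range(count)]


circle = Circle(0, 0, 1)
print(circle.moved(2, 3))
print(circle.resized(2))
print(len(circle.as_vertices()), "points versus 3 stored numbers")
```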
CAD Layers
Individual CAD elements
or groups form “layers”,
which can be moved
forward or backward
(behind / in-front-of)

Care needs to be taken in ordering elements to avoid hiding islands.
CAD / Vector Overlap

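[Figure: elements A and B overlapping, drawn once as CAD and once as Vector; the overlap area in the Vector drawing is marked "?".]

CAD supports overlapping elements. Vector data creates a new polygon - which can cause problems with attribute labelling BUT which allows for measurement and calculations related to the overlap area.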
Advantages / Disadvantages of CAD
Advantages
• Simple data structure
• Requires little storage space
• Quick to edit and display
• Good for simple drawings

Disadvantages
• No topology - no modelling and no analysis
• Cannot query the data
• Hundreds of Layers with poor labelling - confusing
• No attribute databases
CAD Data
Uses:
 Technical drawing
 Geometric drawings - E.g. Logos
 Very simple desktop cartography

CAD is NOT GIS - No Modelling or Analysis
