Week 2 Lecture Material

EL
PT
Geo-Spatial Analysis in Urban Planning
N
Saikat Paul (PhD) ASSOCIATE PROFESSOR
Department of Architecture & Regional Planning
IIT KHARAGPUR
Module 02: GIS functionality and spatial analysis in the urban planning
Lecture 06 : Spatial Data Analysis
EL
➢ Basic Concepts of Spatial Data Analysis
PT
o Types of spatial data analysis techniques
N
➢ Application areas
Spatial Analysis { Uses variety of statistical and other techniques
Spatial data is gathered from observations/questionnaire surveys
Spatial Statistics includes study of entities using their

* topological, * geometric, and/or * geographic properties.
Application areas Varieties of statistical techniques
EL
❑ Urban Planning, ❑ Descriptive - cell counts, means, variances,
PT
maxima, minima, cumulative values, frequencies
❑ Ecology, epidemiology, ❑ Exploratory – Graphical methods and
N
❑ Social science, dimensionality reduction
❑ Explanatory - analysing data sets to summarize
❑ Natural resources management, main characteristics
❑ Disaster risk reduction and management, ❑ Predictive - predict future events by building a
mathematical model that captures important
❑ Climate change adaptation etc. trends
1
Spatial Data Analysis
❑ Factor Analysis of Multivariate Data - measure of correlated distance between the observations
• Euclidean measure (PCA),
• Chi-square distance (Correspondence Analysis),
• Generalised Mahalanobis distance (Discriminant Analysis)
Advantage - information is contained in initial few components or factors - thus creating fewer and important maps
EL
Disadvantage - loss of small amount of information in the spatial context Factor Analysis reveals the spatial structure
of important, and often persistent, features within spatially distributed observations
PT
Application examples in Urban Planning Context - evaluation of country's economic rank
N
2
❑ Factor Analysis of Multivariate Data

❑ Spatial Autocorrelation - Calculates interdependance of observations in geographical space
- Moran's I, Geary's C, Getis Ord G, Standard deviational ellipse
▪ Measuring a spatial weights matrix - reflecting intensity of geographic relationship between observations in a
neighborhood
EL
▪ Classic spatial autocorrelation statistics compare the spatial weights to the covariance relationship at pairs of
locations
PT
▪ Positive values would indicate the clustering of similar values across geographic space, while significant negative
spatial autocorrelation indicates that neighboring values are more dissimilar than expected by chance
N
Global Measure - estimate the overall degree of spatial autocorrelation for a dataset
Moran's I and Geary's C are global measures
Local Measure - estimates disaggregated to the level of the spatial analysis units, allowing
assessment of the dependency relationships across space
Getis-Ord Gi and Anselin Local Moran's I
3

❑ Spatial Autocorrelation
❑ Spatial Heterogenity - used for ecological analysis to identify strata variance
Measured using
q Statistic which is the % variance for an attribute
EL
value of 1 indicates high spatial hetereogenity and vice versa
PT
N
4

❑ Spatial Heterogenity
❑ Spatial Interpolation - Interpolation of values in a spatial strata at unobserved locations
from the values of nearby spatial locations
EL
Measured using -
PT
Kriging - inverse distance weighing
a spatial lag relationship using systematic and random components
N
5

❑ Spatial Interpolation
❑ Spatial Regression - Spatial relationship among dependant and independant variables
EL
- Geographically Weighted Regression generates parameters
disaggregated by the spatial units of analysis
PT
Bayesian hierarchical modeling + Markov chain Monte Carlo (MCMC) methods – used to modeling
complex relationships using Poisson-Gamma-CAR, Poisson-lognormal-SAR, or Overdispersed logit
N
models.
Spatial stochastic processes - Gaussian processes used in spatial regression analysis.
Model based versions of GWR or spatially varying coefficient models - applied to conduct Bayesian
inference.
Spatial stochastic process can become computationally effective and scalable Gaussian process
models, such as Gaussian Predictive Processes and Nearest Neighbor Gaussian Processes (NNGP).
6

❑ Spatial Regression
EL
❑ Spatial Interaction - Generally a top down approach
- Model “SPATIAL INTERACTION”
PT
- Criteria used - physical distance or travel time E.g. flow of people
from one zone of the city to other zones
N
- based on attributes of origin and destination as total commuters in
a zone and attraction potential of destination zones such as offices,
school, industries, market
Measured using -
❑ Gravity Model
❑ ANN - can also handle noisy or qualitative data
7

❑ Spatial Regression • Complex Adaptive Systems Theory
EL
❑ Spatial Interaction • Assess the bottom-up interactions that would result in very complex
❑ Simulation and Modeling patterns of interactions at a city scale
PT
• Capture and code the interactions among proximal
features/objects/agents/entities
N
Geographic Automata System -
❑ Agent Based Modeling(ABM) - agents and goals(behaviour) modify the city structure or
the environment
❑ Cellular Automata (CA) - fixed spatial framework of grid cell with transition rules based on
state of a grid cell and the proximal interactions
8

EL
❑ Spatial Interaction
{
❑ Simulation and Modeling - (Honarkhah - MPS algorithm) Analyses the spatial statistics of a
PT
model - training set and generates realisation of the phenomenon
❑ Multi-point Geostatistics
based on those input multi-point statistics
N
- (Tahmasebi - CCSIM algorithm) uses a cross-correlation function to
improve the spatial pattern reproduction
9

EL
❑ Simulation and Modeling
{
PT
❑ Multi-point Geostatistics - Gradient, Aspect and Visibility
❑ Surface Analysis - Gradient Descent/Ascent Methods
N
• Convex/Concave Hull approach
9

EL
PT
❑ Surface Analysis
{
N
- Understand and Determine flow within and around networks
❑ Network Analysis - Application areas - facility location, transportation - route choice,
hydrological - flood forecasting, etc.
11

EL
PT
N
❑ Network Analysis
❑ Geographic Knowledge Discovery - Data mining of massive datasets involving
o data gathering,
o cleaning/pre-processing(data curation), and
o interpretation of results
12

EL
PT
N
❑ Geographic Knowledge Discovery
❑ Spatial Decision Support System - use of mathematical models for making future
predictions and
- check the impact of physical and policy
interventions on urban development
13
Recapitulation
EL
PT
N
❑ Geographic Knowledge Discovery
❑ Spatial Decision Support System
14
EL
1. Coordinate Systems and Map Projections, D.H. Maling, Pergamon;
PT
2. Map Projections: A Reference Manual, L. M. Bugayevskiy and John Snyder, CRC Press;
N
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 07 : Vector Data Analysis
EL
➢ Buffering
PT
➢ Overlay
N
➢ Distance Measurement
➢ Pattern Analysis
➢ Feature Management
Vector Data Analysis
❑ Vector data model uses points and their x-, y-coordinates to construct spatial features of points,
lines, and polygons
❑ These spatial features are used as inputs in vector data analysis.
EL
❑ Accuracy of data analysis depends on the accuracy of these features in terms of their location and
PT
shape and whether they are topological or not.
❑ Union and Intersect
N
o As overlay tools - work with both feature geometries and attributes
o As editing tools - work only with feature geometries
1
Vector Data Analysis Point
Buffer
Buffering
Features for buffering may be points, lines, or polygons
▪ Points - circular buffer zones
▪ Lines - elongated buffer zones Line
Buffer
▪ Polygons - buffer zones that extend outward from the polygon boundaries
Buffering creates two areas, based on the concept of proximity:

• buffer zone - one area that is within a specified distance of select features and
EL
• area beyond the buffer zone
ends of buffer zones can be either rounded or flat
PT
• GIS creates different Id’s to separate the buffer zone from the area beyond the buffer zone
N
Area
Buffer
• Besides the designation of the buffer zone, no other attribute data are added or combined
• It’s possible to create multiple buffer rings or dissolving overlapping boundaries
• Spatial data query - select features that are located within a certain distance of other features
but cannot create buffer zones.
2
Buffering
Variations in Buffering
❑ Buffer distance or buffer size can be variable
❑ Buffering around line features does not have to be on both sides of the lines; it can be on either the left side or the
right side of the line feature
❑ Buffer zones around polygons can be extended either outward or inward from the polygon boundaries of buffer
EL
zones may remain intact so that each buffer zone is a separate polygon for further analysis
PT
❑ Boundaries may be dissolved to create an aggregate zone, leaving no overlapped areas between individual buffer
N
zones
❑ Buffering uses distance measurements from select features to create the buffer zones
❑ Linear unit of select features, which is used as the default distance unit.
❑ Positional accuracy of spatial features in a data set also determines the accuracy of buffer zones.
3
Buffering
Applications of Buffering
❑ Buffer zone can be treated as a protection zone and is used for planning or regulatory purposes
❑ Buffer zones may represent the inclusion zones in GIS applications
❑ Buffer zones can also be used as indicators of the positional accuracy of point and line features
EL
❑ Finally, buffering with multiple rings can be useful as a sampling method
❑ One can also apply incremental banding to studies of land-use changes around urban areas
PT
Dissolved Buffer zones
Undissolved Buffer zones

N
4
Overlay
❑ An overlay operation combines the geometries and attributes of two feature layers to create the output.
❑ Each feature on the output contains a combination of attributes from the input layers, and this combination
differs from its neighbors.
❑ Feature layers to be overlaid must be spatially registered and based on the same coordinate system. E.g. the
Universal Transverse Mercator (UTM) coordinate system
1 1A 1B
2
+ A B =
2A 2B Overlay Operation
Feature Type Overlay
EL
❑ Overlay operations can take polygon, line, or point layers as the inputs and create an output of
PT
a lower-dimension feature type.
N
Input layer can be a point, line, or polygon layer
Overlay layer is a polygon layer
❑ Three common overlay operations: point-in-polygon, line-in-polygon, and polygon-on-polygon.
5
Overlay POINT-IN-POLYGON
Feature Type Overlay 2 2B

+ +
POINT-IN-POLYGON + A B =
▪ Same point features in the input layer are included in the output but each point 1 1A
+ +
is assigned with attributes of the polygon within which it falls
LINE-IN-POLYGON
LINE-IN-POLYGON
• Output contains same line features as in input layer but each line feature is
dissected by the polygon boundaries on the overlay layer.
EL
• Output has more line segments than does the input layer. 1B
• Each line segment on the output combines attributes from the input layer and 1
+ A B =
PT
1A
the underlying polygon.
N
POLYGON-ON-POLYGON
• Involves two polygon layers POLYGON-ON-POLYGON
• Output combines the polygon boundaries from the input
and overlay layers to create a new set of polygons
• Each new polygon carries attributes from both layers, and
these attributes differ from those of adjacent polygons.
6
Overlay
Overlay Methods
❑ Based on the Boolean connectors AND, OR, and XOR

o Intersect uses the AND connector.
EL
o Union uses the OR connector.
o Symmetrical Difference or Difference uses the XOR connector.
PT
o Identity or Minus uses the expression- [(input layer) AND (identity layer)] OR (input layer).
N
7
Union
Overlay
Overlay Methods
Union
❑ Preserves all features from the inputs.
❑ The area extent of the output combines the area extents of both input layers.
Intersect
Intersect
❑ Preserves only those features that fall within the area extent common to the inputs.
❑ Intersect is often a preferred method of overlay because any feature on its output has
EL
attribute data from both of its inputs.
❑ Intersect is a spatial relationship that can be used in a spatial join operation.
Symmetrical Difference
PT
Symmetrical Difference
❑ Preserves features that fall within the area extent that is common to only one of the
N
inputs.
❑ In other words, Symmetrical Difference is opposite to Intersect in terms of the output’s
area extent.
Identity Identity
❑ Preserves only features that fall within the area extent of the layer defined as the input
layer.
❑ The other layer is called the identity layer.
8
Distance Measurement
❑ Measurement of straight line (Euclidean) distances between features.

❑ Can be used directly for data analysis.
❑ Distance measures are stored in a field.
EL
❑ Distance measurements are carried out from –
▪ points in a layer to points in another layer, or
PT
▪ each point in a layer to its nearest point or line in another layer.
N
❑ Spatial interaction gravity model - uses distance measures between points as the input.
❑ Pattern analysis uses distance measures as inputs.
9
Pattern Analysis
❑ Study of the spatial arrangements of point or polygon features in 2-dimensional space.

❑ Reveals if a point distribution pattern is random, dispersed, or clustered.
❑ Uses distance measurements as inputs and statistics (spatial statistics) for describing the
distribution pattern.
EL
❑ Random pattern - presence of a point at a location does not encourage or inhibit the occurrence of
PT
neighbouring points.
❑ Spatial randomness separates a random pattern from a dispersed or clustered pattern.
N
❑ Local level - a pattern analysis can detect if a distribution pattern contains local clusters of high or
low values.
10
Pattern Analysis
Analysis of Random and Nonrandom Patterns
❑ Uses the distance between each point and its closest neighbouring point in a layer to determine if the point
pattern is random, regular, or clustered
Point Pattern Analysis Applications

Hot spot analysis is a standard tool for mapping and analysing health and crime data
EL
Nearest neighbour statistic
❑ Ratio (R) of the observed average distance between nearest neighbors (dobs) to the expected average for a
PT
hypothetical random distribution (dexp):
N
▪ R ratio < 1 - point pattern is more clustered and
▪ R ratio > 1 - point pattern is more dispersed than random.
▪ Also produces a Z score - indicates likelihood that pattern could be a result of random chance.
11
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 08 : Vector Data Analysis
EL
PT
N
Pattern Analysis
Point Pattern Analysis Application
analyzing the spatial distribution and structure of plant species
Ripley’s K-function
❑ Identifies clustering or dispersion over a range of distances.
❑ Standardized version of Ripley’s K-function(L function) - observed L function at distance d, without edge
correction:
EL
PT
where,
• A is the size of the study area, N is the number of points, and π is a mathematical constant.
N
• summation of k(i, j) measures the number of j points within distance d of all i points: k(i, j)
is 1 when the distance between I and j is less than or equal to d and 0 when the distance is
greater than d.
• The expected L(d) for a random point pattern is d.
• A point pattern is more clustered than random at distance d if the computed L(d) is higher than
expected, and a point pattern is more dispersed than random if the computed L(d) is less than
expected.
12
Pattern Analysis
Analysis of Random and Nonrandom Patterns Applications
• analysing temporal changes of spatial distributions
• quantifying the spatial dependency over distance classes
Spatial Autocorrelation Analysis • validating the use of standard statistical analysis such as
regression analysis
❑ Considers both the point locations and the variation of an attribute at the locations.
❑ Measures the relationship among values of a variable according to the spatial arrangement of the values
EL
❑ Relationship may be described as highly correlated if like values are spatially close to each other, and
PT
independent or random if no pattern can be discerned from arrangement of values.
N
❑ Spatial autocorrelation is also called spatial association or spatial dependence.
❑ Moran’s I and Geary’s C are global indices of spatial association that include all locations in data.
o A spatial contiguity matrix Wij , with a zero diagonal, and the off-diagonal nonzero elements
indicating contiguity of locations i and j are used to code proximities
13
Pattern Analysis
Spatial Autocorrelation Analysis
Moran’s I
Moran’s I, which can be computed by:
where xi is the value at point i, xj is the value at point i’s neighbor j, wij is a coefficient, n is the number of points, and s2
is the variance of x values with a mean of x–.
EL
The coefficient wij is the weight for measuring spatial autocorrelation.
Typically, wij is defined as the inverse of the distance (d) between points i and j, or 1/dij.
PT
Other weights such as the inverse of the distance squared can also be used.
Values of Moran’s I can be explained by expected value E(I) for a random pattern:
N
▪ E(I) approaches 0 when the number of points n is large.
▪ Moran’s I is close to E(I) if the pattern is random.
▪ When Moran’s I > E(I) adjacent points tend to have similar values (i.e., are spatially correlated) and
▪ When Moran’s I < E(I) adjacent points tend to have different values (i.e., are not spatially correlated).
▪ Similar to nearest neighbor analysis, we can compute the Z score associated with a Moran’s I.
▪ The Z score indicates the likelihood that the point pattern could be a result of random chance.
14
Pattern Analysis
Local Indicators of Spatial Association (LISA)
LISA – Anselin local Moran’s I
Calculates for each feature (point or polygon) an index value and a Z score
• high +ve Z score suggests that the feature is adjacent to features of similar values, either above the mean or
EL
below the mean
• high -ve Z score indicates that the feature is adjacent to features of dissimilar values
PT
N
15
Pattern Analysis
Getis – Ord Gi Statistic for Measuring High/Low Clustering
❑ Moran’s I - only detects presence of clustering of similar values.
❑ But Moran’s I cannot tell whether clustering is made of high values or low values.
❑ Getis – Ord Gi -statistic, which can separate clusters of high values from clusters of low values (Getis and Ord 1992).
❑ The general G-statistic based on a specified distance, d, is defined as:
EL
PT
where
xi is the value at location i,
N
xj is the value at location j if j is within d of i, and
wij(d) is the spatial weight.
The weight can be based on some weighted distance such as inverse distance or 1 and 0 (for
adjacent and nonadjacent polygons).
16
Pattern Analysis
G-Statistic for Measuring High/Low Clustering
Expected value of G(d) is:
▪ E(G) is typically a very small value when n is large.

▪ A high G(d) value suggests a clustering of high values, and a low G(d) value suggests a clustering of low values.
❑ A Z score can be computed for a G(d) to evaluate its statistical significance.
EL
❑ Similar to Moran’s I, the local version of the G-statistic is also available.
PT
❑ The local G-statistic, denoted by Gi(d), is often described as a tool for “hot spot” analysis.
❑ A cluster of high positive Z scores suggests the presence of a cluster of high values or a hot spot.
N
❑ A cluster of high negative Z scores, on the other hand, suggests the presence of a cluster of low
values or a cold spot.
❑ The local G-statistic also allows the use of a distance threshold d, defined as the distance beyond
which no discernible increase in clustering of high or low values exists.
17
Feature Management
❑ Input layers must be based on the same coordinate system.

❑ Used for data pre-processing and data analysis.
EL
❑ terms describing the various tools may differ between GIS packages.
PT
N
18
Feature Management
Dissolve
❑ Aggregates features in a feature layer that have the same attribute value or values, used to simplify a
classified polygon layer.
❑ Dissolve can aggregate both spatial and attribute data of the input layer.
EL
PT
Dissolve removes boundaries of polygons
that have the same attribute value
N
19
Feature Editing
Clip
❑ Creates a new layer that includes only those features of the input layer, including their attributes, that fall
within the area extent of the clip layer
❑ Input layer may be a point, line, or polygon layer, but the clip layer must be a polygon layer.
EL
❑ Output layer has the same feature type as the input.
PT
Clip creates an output that
N
contains only those features of
the input layer that fall within the
area extent of the clip layer
20
Feature Management
Append
❑ Creates a new layer by piecing together two or more layers, which represent the same feature and have the same
attributes
❑ E.g. Append can put together a layer from multiple input layers and the output can be used as a single layer for
data query or display.
EL
PT
Append pieces together two adjacent layers
into a single layer but does not remove the
N
shared boundary between the layers
21
Feature Management
Select
❑ Creates a new layer that contains features selected from a user-defined query expression
EL
Select creates a new layer
PT
with selected features
from the input layer
N
22
Feature Management
Eliminate
❑ Creates a new layer by removing features that meet a user-defined query expression
❑ Eliminate can implement minimum mapping unit concept by removing polygons that are smaller than the defined
unit in a layer.
EL
PT
Eliminate removes some small
slivers along the top boundary
N
23
Feature Management
Update
❑ Uses a “cut and paste” operation to replace the input layer with the update layer and its features.
❑ Useful for updating an existing layer with new features in limited areas instead of redigitizing the entire map.
EL
Update replaces the input layer with
PT
the update layer and its features
N
24
Feature Management
Erase
❑ Removes from the input layer those features that fall within the area extent of the erase layer
EL
Erase removes features from the
PT
input layer that fall within the
area extent of the erase layer
N
25
Feature Management
Split
❑ Divides the input layer into two or more layers.
EL
Split uses the geometry of the
split layer to divide the input
PT
layer into four separate layers
N
26
Recapitulation
➢ Buffering
➢ Overlay
EL
➢ Distance Measurement
PT
N
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 09 : Raster Data Analysis & Spatial Statistics
EL
➢ Comparison of Vector and Raster Analysis
PT
➢ Map Algebra
N
➢ Reclassification
➢ Local, Neighbourhood and Zonal Raster Operations
Raster Analysis
Comparison of Vector-and Raster-based Data Analysis
❑ Vector data analysis and raster data analysis represent the two basic types of GIS analyses.
❑ GIS package cannot run them together in the same operation.
EL
❑ some GIS packages allow the use of vector data in some raster data operations (e.g., extraction
PT
operations), the data are converted into raster data before the operation starts.
❑ Each GIS project is different in terms of data sources and objectives.
N
❑ Moreover, vector data can be easily converted into raster data and vice versa.
1
Raster Analysis
Introduction
❑ Can be performed at the level of individual cells, or groups of cells, or cells within an entire
raster
❑ Data operations may use a single raster or multiple raster
❑ Analysis mask limits analysis to its area coverage
EL
❑ Output cell size is set to be equal to, or larger than, the largest cell size among the input rasters
PT
❑ Statistics such as mean and standard deviation are designed for numeric values, whereas others
N
such as majority (the most frequent cell value) are designed for both numeric and categorical
values
❑ Raster data are typically used in geographic information system (GIS), involving heavy
computation such as building environmental models
1
Arithmetic +, −, /, *, absolute, integer, floating-point
Raster Analysis
Logarithmic exponentials, logarithms
Map Algebra
Trigonometric sin, cos, tan, arcsin, arccos, arctan
❑ Map algebra - an informal language with syntax similar to Power square, square root, power
algebra, which can be used to facilitate manipulation and Arithmetic, logarithmic, trigonometric, and
analysis of raster data. power functions for local operations
EL
❑ Uses an expression to link the input and the output.
PT
❑ Expression can be composed of GIS tools, mathematical
operators, and constants
N
❑ Can use basic tools of local, focal, zonal, and distance
measure operations as well as special tools as Slope,
Aspect etc.
1
Raster Analysis
Reclassification
❑ Local operation, reclassification creates a new raster by classification. Reclassification is also
referred to as recoding, or transforming, through lookup tables
❑ Two reclassification methods may be used.

o first method - one-to-one change, meaning that a cell value in the input raster is assigned a
EL
new value in the output raster.
o second method - assigns a new value to a range of cell values in the input raster.
PT
❑ Integer raster can be reclassified by either method, but a floating-point raster can only be
N
reclassified by the second method.
❑ Reclassification creates
o a simplified raster
o a new raster that contains a unique category or value
o reclassification can create a new raster that shows the
ranking of cell values in input raster
1
Raster Analysis
Reclassification Arithmetic +, −, /, *, absolute, integer, floating-point
Logarithmic exponentials, logarithms
Trigonometric sin, cos, tan, arcsin, arccos, arctan
Power square, square root, power
EL
Arithmetic, logarithmic, trigonometric, and
power functions for local operations
PT
N
1
Raster Analysis Types of operations
Local Neighborhood Zonal

• cell-by-cell operations • focal operation - involving a focal cell • works with groups of cells of
• can create a new raster from and a set of its surrounding cells same values or like features -
either a single input raster or • surrounding cells are chosen for their called zones - may be
multiple input rasters distance and/or directional contiguous or noncontiguous
EL
• cell values of new raster are relationship to focal cell • contiguous zone - cells that are
computed by a function relating • common neighborhoods include spatially connected e.g.
PT
input to output or are assigned rectangles, circles, annuluses, and watershed
by a classification table wedges • noncontiguous zone - separate
N
• general rule is to include a cell if regions of cells e.g. landuse
center of cell falls within
neighborhood
• irregular neighborhoods such as
asymmetric and discontinuous
neighborhoods cannot be used
1
Raster Analysis
Types of operations
Local
• Cell-by-cell operations
• Can create a new raster from either a single input raster or
EL
multiple input rasters
• Cell values of new raster are computed by a function relating
PT
input to output or are assigned by a classification table
N
1
Raster Analysis
Types of operations
Neighborhood
• Focal operation - involving a focal cell and a set of its surrounding cells.
• Surrounding cells are chosen for their distance and/or directional relationship to focal cell
EL
• Common neighborhoods include rectangles, circles, annuluses, and wedges
• Rectangle is defined by its width and height in cells, such as a 3-by-3 area centered at the focal cell.
PT
• Circle extends from the focal cell with a specified radius.
• Annulus or doughnut-shaped neighborhood consists of the ring area between a smaller circle and a
N
larger circle centered at the focal cell.
• Wedge consists of a piece of a circle centered at the focal cell.
• Cell is included in case center of cell falls within the neighborhood.
• Irregular neighborhoods - asymmetric and discontinuous neighborhoods cannot be used
1
Raster Analysis
Types of operations
❑ Performed at the level of individual cells, or groups of
Zonal Operations cells, or cells within an entire raster
• Resampling, which can build different pyramid levels (different resolutions) for a large raster Input Output
Aggregate operation(using mean
data set statistic) creates a lower-
• Aggregate is similar to a resampling technique in that it also creates an output raster that
EL
resolution raster
has a larger cell size (i.e., a lower resolution) than the input.
• Calculates each output cell value as the mean, median, sum, minimum, or maximum of the
PT
input cells that fall within the output cell
• Some data generalization operations are based on zones, or groups of cells of the same
N
value
• Generalizing or simplifying the cell values of a raster can be useful for certain applications.
Input Output
E.g. Aggregation or a resampling of satellite image or LiDAR (light detection and ranging) Generalization operations for groups of cells
(having high degree of local variations) of the same value
1
Raster Data Operations
Clip and Mosaic
Clip - an analysis mask or the minimum and maximum x-, y-coordinates of a rectangular area is specified for
analysisenvironmentand larger rasteras the input
Mosaic combinesmultiple input rastersinto a single raster
If input rasters overlap, a GIS package typically provides options for filling in cell values in the overlapping areas
ArcGIS, for example, lets the user choose the first input raster’s data or the blending of data from the input
EL
rastersfor the overlappingareas.
In case of small gaps between input rasters - neighborhood resampling - mean operation can be used to fill in
PT
missing values.
(a)
Input raster (a)
(b)
N
Analysis mask (b)
(c)
Clipped output raster (c)
1
Recapitulation
➢ Comparison of Vector and Raster Analysis
➢ Map Algebra
EL
➢ Reclassification
PT
N
➢ Local, Neighbourhood and Zonal Raster Operations
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 10 : Raster Operations, Terrain visualization & Classification
➢ Raster operations
EL
• Measure of Physical Distance
• Allocation and Direction
PT
• Data Extraction
• Bufferring
N
➢ Operations for Analysing and Visualising Terrain
• Shaded Relief
• Hillshading
• Slope and Aspect
➢ Raster Data Classification Approaches
Raster Data Operations (0, 0)
Measure of Physical Distance
• Distances may be expressed as physical distances or cost distances

• Physical distance measures the straight-line or Euclidean distance Straight-line distance measured from
a cell center to another cell center
• Cost distance measures the cost for traversing the physical distance – function of physical
distance, speed limit and road condition
EL
• Physical distance is measured by the formula:
Spatial resolution × {(X1 − x2)2 + (Y1 − Y2)2}1/2
PT
• Physical distance measure buffers source cells creates buffers over entire raster or to a
specified maximum distance - also called extended neighborhood operations or
N
global operations.
• Reclassification can convert a continuous distance raster into a raster with one or more
discrete distance zones. A variation of Reclassification is Slice, which can divide a
continuous distance raster into equal-interval or equal-area distance zones. Continuous distance measured
from a stream network
1
Allocation and Direction
❑ Physical distance measure operations can produce allocation and direction rasters
❑ Cell value in an allocation raster corresponds to the closest source cell for the cell.
❑ Cell value in a direction raster corresponds to the direction in degrees that the cell is from
closest source cell.
EL
❑ Direction values are based on the compass directions: 90° to the east, 180° to the south,
PT
270° to the west, and 360° to the north. (0° is reserved for the source cell.)
N
Source cells are denoted as 1 and 2,
1.0 2 1.0 2.0 2 2 2 2 90 2 270 270
(a) shows the physical distance measures in cell units from each cell to the closest source cell;
1.4 1.0 1.4 2.2 2 2 2 2 45 360 315 287
(b) shows the allocation of each cell to the closest source cell; and
1.0 1.4 2.2 2.8 1 1 1 2 180 225 243 315 (c) shows the direction in degrees from each cell to the closest source cell.
1 1.0 (a)2.0 3.0 1 1 (b) 1 1 1 270 (c)270 270
The cell in a dark shade (row 3, column 3) has the same distance to both source cells.
Therefore, the cell can be allocated to either source cell. The direction of 243° is to the source cell 1.
2
Raster Data Extraction
❑ Raster data extraction - creates a new raster by extracting data from an existing raster- similar to
raster data query
EL
❑ A data set, a graphic object, or a query expression can be used to define the area to be extracted.
❑ Extraction tool extracts values at point locations (e.g., using bilinear interpolation) and attaches
PT
values to a new field in the point feature attribute table.
N
❑ Data set is a raster or a polygon feature layer, extraction tool extracts the cell values within the
area defined by the raster or the polygon layer and assigns no data to cells outside the area.
3
Buffering
❑ Vector-based buffering operation and raster-based physical distance measure operation are
similar as both measure distances from select features
❑ Vector buffering operation
• Uses x- and y-coordinates in measuring distances
• Create more accurate buffer zones than a raster-based operation can
EL
• More flexible and offers more options – creation of multiple buffer zones
• Creates separate buffer zones for each select feature or a dissolved buffer zone for all
PT
select features
❑ Raster-based operation
N
• Uses cells in measuring physical distances
• Creates continuous distance measures
• Reclassification or Slice is required to define buffer zones from continuous distance
measures
• Difficult to create and manipulate separate distance measures using a raster-based
operation
4
Raster Data Operations for Analysing and Visualising Terrain
Shaded Relief { Ratio of the amount of direct solar radiation
received on a surface in terms of radiance value
❑ Hill shading - simulates shading due to sun on the terrain due to

changes in terrain elevations
EL
❑ Helps viewers recognize the shape of landform features
PT
❑ Four factors control hill shading:
• sun’s azimuth - direction of incoming light (0° (north) to
N
360° measured clockwise)
• sun’s altitude - angle of the incoming light above horizon
• surface’s slope - (0° to 90° )
• aspect - (0° to 360°)
5
Raster Data Operations for Analysing and Visualising Terrain
Hill Shading
The relative radiance value of a raster cell or a TIN triangle can be computed by the following equation:
Rf = cos (Af − As) sin (Hf) cos (Hs) + cos (Hf) sin (Hs)
EL
where,
PT
Rf is the relative radiance value of
a facet (a raster cell or a triangle),
N
Af is the facet’s aspect,
As is the sun’s azimuth,
Hf is the facet’s slope, and
Hs is the sun’s altitude.
6
Raster Data Operations for Analysing and Visualising Terrain N
W E
Slope and Aspect

Aspect measures grouped into S
four or eight principal directions N

❑ Slope is the first derivative of elevation NW NE
❑ Measures the rate of change of elevation at a surface location W E
❑ Expressed as : SW SE
EL
• Percent slope is 100 times the ratio of rise (vertical distance) over run (horizontal distance) S
• Degree slope is the arc tangent of the ratio of rise over run
PT
• Aspect is the directional component of slope.
N
• Starts with 0° at north, moves clockwise, and ends with 360° also at north
• Aspect measures can be converted to linear measures by taking their sine or cosine values -
ranging from –1 to 1
• Application areas - studies of watershed units, landscape units, and morphometric measures
soil erosion, site suitability analysis
7
Image Classification
❑ A basic and important application of GIS in urban planning
❑ Remotely sensed satellite data is used to generate Land Use and Land Cover (LULC) maps.
❑ Classification assigns different classes ( vegetation, urban, water, etc.) to different pixel values.
❑ The assignment is based on different algorithms that analyses the spectral reflectance value in
EL
each pixel and choose a class which has the same reflectance property in those
PT
wavelengths/bands.
❑ There are different ways in which classification can be done:
N
▪ Supervised
▪ Unsupervised
▪ Hybrid or combined classification
8
Create Develop
Supervised Classification Training Signature Classify
Sets File
❑ The user guides the classification process.
❑ User selects areas in the image for which the land cover class are known, known as training
set. It should be representative of the whole image.
EL
❑ User assigns the classes to the these sample areas to create a signature file.
PT
❑ The training set is then used by the software to identify the classes of the rest of the pixels
N
by matching the spectral signatures of both.
❑ Softwares use many algorithms for this, like Parallelepiped Classifier, Gaussian Maximum
Likelihood Classifier, Minimum-Distance to the Mean, Principal component, etc.
9
Supervised Classification
Maximum Likelihood: Each pixel is assigned to the class to which it has the
highest probability of belonging to based on the statistics for each class in
each band. The data is assumed to be normally distributed.
Minimum Distance: The pixels are classified based on the nearest class. Uses
the mean vectors for each class and calculates the Euclidean distance from
EL
each unknown pixel to the mean vector for each class.
PT
Parallelepiped creates a box in feature space for each class, based on the mean and standard
deviation.
N
•If the pixel falls inside the parallelepiped, it is assigned to the class
• if it falls within more than one class, it is put in the overlap class.
•If the pixel does not fall inside any class, it is assigned to the null class.
The parallelepiped classifier is quick but has, in many cases, poor accuracy and many pixels
classified as ties.
10
Unsupervised Classification
Generate
iso-
The pixels are divided into a number of classes/clusters based on natural groupings present clusters
in the image values.
❑ The two basic steps for unsupervised classification are:

▪Generate clusters
EL
▪Assign classes Classify
iso-
PT
❑ Some of the common image clustering algorithms are K-means and ISODATA. clusters
❑ After picking a clustering algorithm, the number of groups to be generated is identified.
N
❑ The next step is to manually assign land cover classes to each cluster.
❑ For example, if you want to classify vegetation and non-vegetation, you can select those
clusters that represent them best.
11
Accuracy Assessment
❑ Since classification is guided by algorithms, its accuracy must be verified.
❑ The simplest way to assess it is the visual evaluation. Comparing the image with the results of its
interpretation, we can see errors and roughly estimate their size.
Other quantitative methods of evaluation are:
❑ Confusion matrix or error matrix: It is a table that shows correspondence between the classification result and
EL
a reference image such as maps, field work results etc
❑ Diagonal cells contain the number of correctly identified pixels. If we divide the sum of these pixels by the
PT
total number of pixels we will get classification’s overall accuracy (OvAc).
N
12
Accuracy Assessment
❑ Kappa coefficient(K): It is a measure of how the classification results
compare to values assigned by chance.
•The values range from 0 to 1.
•If K=0, there is no agreement between the classified image and
the reference image.
•If K=1, then the classified image and the ground truth image are
totally identical.
EL
Errors in Classification
PT
❑ There are two types of errors: omission error and commission error
❑ Errors of commission occur when a classification procedure assigns pixels to a certain class
N
that in fact don’t belong to it. Number of pixels mistakenly assigned to a class is found in
column cells of the class above and below the main diagonal.
❑ Errors of omission occur when pixels that in fact belong to one class, are included into
other classes. In the confusion matrix, the number of omitted pixels is found in the row
cells to the left and to the right from the main diagonal.
13
Recapitulation
➢ Raster operations
• Measure of Physical Distance
• Allocation and Direction
• Data Extraction
EL
• Bufferring
PT
➢ Operations for Analysing and Visualising Terrain
• Shaded Relief
N
• Hillshading
• Slope and Aspect
➢ Raster Data Classification Approaches
N
PT
EL

Week 2 Lecture Material

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Week 2 Lecture Material

Uploaded by

Copyright:

Available Formats

EL

Spatial Statistics includes study of entities using their

Application areas Varieties of statistical techniques

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

❑ Factor Analysis of Multivariate Data

Buffering creates two areas, based on the concept of proximity:

• It’s possible to create multiple buffer rings or dissolving overlapping boundaries

Undissolved Buﬀer zones

❑ Three common overlay operations: point-in-polygon, line-in-polygon, and polygon-on-polygon.

Feature Type Overlay 2 2B

the underlying polygon.

❑ Based on the Boolean connectors AND, OR, and XOR

❑ Measurement of straight line (Euclidean) distances between features.

❑ Study of the spatial arrangements of point or polygon features in 2-dimensional space.

Point Pattern Analysis Applications

▪ E(G) is typically a very small value when n is large.

❑ A Z score can be computed for a G(d) to evaluate its statistical significance.

❑ Input layers must be based on the same coordinate system.

Comparison of Vector-and Raster-based Data Analysis

❑ Two reclassiﬁcation methods may be used.

Local Neighborhood Zonal

➢ Comparison of Vector and Raster Analysis

Measure of Physical Distance

• Distances may be expressed as physical distances or cost distances

❑ Hill shading - simulates shading due to sun on the terrain due to

Slope and Aspect

four or eight principal directions N

❑ Measures the rate of change of elevation at a surface location W E

❑ A basic and important application of GIS in urban planning

❑ There are different ways in which classification can be done:

❑ The user guides the classification process.

❑ The two basic steps for unsupervised classification are:

Other quantitative methods of evaluation are:

You might also like