Week 3 Lecture Material

EL
PT
Geo-Spatial Analysis in Urban Planning
N
Saikat Paul (PhD) ASSOCIATE PROFESSOR
Department of Architecture & Regional Planning
IIT KHARAGPUR
Module 03: Modeling Geographical Space and spatial analysis
Lecture 11 : Attribute Data Management and Data Exploration
 Descriptive Statistics (Univariate)
EL
 Descriptive Statistics (Multivariate)
PT
 Inferential Statistics
 Types of Attribute Data
N
 Data Management
• Attribute Data Entry
• Database Design
• Relational database
• Data Operations
Descriptive Statistics (Univariate)
 Cannot be used for analyzing nominal or categorical variables

 Summarize observations – with reference to the distribution of a variable –
 histogram of classes - grouped values – showing frequency of occurrence
 Number of classes - determined as a function of number of observations and range of
EL
values
 Mean - measures of central tendency in a distribution
PT
N
 Median - middle value when all values are ordered from smallest to largest
 Mode - most frequently occurring value
 Mean is sensitive to outliers, median or mode reduces impact of outliers
1
Descriptive Statistics (Univariate)
 Standard deviation – degree of dispersion around the mean
EL
 Variance - expectation of squared deviation of a random variable from its mean - how far
a set of (random) numbers are spread out from their average value
PT
 Coefficient of skewness - measures lack of symmetry in data distribution – measured
N
using Pearson's moment coefficient of skewness
2
Descriptive Statistics (Multivariate)
 Cannot be used for analyzing nominal or categorical variables

 Explores relationship of two or more variables to each other - Scatter plot
 Include more than one independent variable allowing simultaneous assessment of
EL
interrelationships between several variables
PT
 Correlation and Regression - exploration of nature of relationship between two or more β0 and β1 are Slope and
intercept coefficients
variables and strength of the relationship between them
N
 Line of best fit (equation of the line – regression equation) - fit a line through the points
on a scatter plot
• line is as close as possible to all points
• represents the trend in the data
3
 Line of best fit (equation of the line – regression equation)
 Slope
EL
PT
β0 and β1 are Slope and
 Intercept
N
intercept coefficients
4
 Correlation Coefficient (r) - measure of the nature and strength of the relationship
between variables - degree to which points scatter around the regression line
EL
r ranges from -1 to +1
PT
+ve values of r indicate positive correlation
-ve values or r indicate negative correlation
N
 Coefficient of Determination (r2) - indicates the goodness of fit - squared correlation
coefficient - indicates that % of variation in data that can be explained by line of best fit
5
Inferential Statistics
 Inferences about a population from a sample
 Significance testing (testing significance of differences between groups) - degree of difference
between samples
 Inferential statistics helps us to:
 consider likelihood of statement about a given parameter (mean or standard deviation)
is true based on available data - hypothesis testing - null hypothesis and alternative
EL
hypothesis
PT
o H0 null hypothesis - there is no significant difference
N
o H1 alternate hypothesis - there is a significant difference
o significance level, α, defined as probability that null hypothesis is correct
o if statistically significant then regarded as unlikely to have occurred by chance
 Standard Error – confidence interval - small standard error gives greater confidence that
sample mean is close to population mean
6
Inferential Statistics
 Standard Error – confidence interval - small standard error gives greater confidence that
sample mean is close to population mean
µ is the mean average

s is the sample standard deviation
n is number of observations in the sample
EL
PT
 t- statistic - assesses differences between two different sample means
N
 Analysis of Variance (ANOVA) - compare variation within data columns to variation between
data columns
7
Spatial and Aspatial Data Analysis
 Spatial and Aspatial data should be treated differently

 Aspatial data analysis – assumes independence in observations of samples
EL
– value of an observation is not affected by another observation
PT
 Spatial data analysis – observations of spatial variables are often similar to
N
values at neighbouring locations
8
Data Management
 Define each field in the table –

 field name,
 field length - number of digits to be reserved for a field,
 data type – integer, float, text, date, currency etc
EL
 number of decimal digits – for float data type
PT
 GIS project may start with a database of flat files – upon further expansion - migrate to
relational database for easier addition of new variables and handling increased amount
N
of data
9
Types of Attribute Data
• Common data types are number, text (or string), date, and binary large object (BLOB).
• Numbers include integer (for numbers without decimal digits) and float or floating
points (for numbers with decimal digits).
• attribute data by measurement scale:
EL
• Nominal - describe different kinds or different categories of data such as land-use types
or soil types.
PT
• Ordinal - differentiate data by a ranking relationship. For example, soil erosion may be
ordered from severe to moderate to light.
N
• Interval - have known intervals between values. For example, a temperature reading
of 70°F is warmer than 60°F by 10°F.
• Ratio data - same as interval data except that ratio data are based on a meaningful, or
absolute, zero value. Population densities are an example of ratio data because a density
of 0 is an absolute zero.
10
Four types of database design
 Flat File - single, two-dimensional array of data elements

 Hierarchical - data is organized into a tree-like structure, implying a single parent for each
EL
record
 Network - many-to-many relationships in a tree-like structure that allows multiple
PT
parents
N
 Relational - based on predicate logic and set theory - named columns of relation are
called attributes, and domain is the set of values of attributes
11
Database Management System (DBMS)
 Manages data tables - build and handle a database
 Provides tools for data input, search, retrieval, data handling and output
 GIS can interact with data from various sources and has connection capabilities to access
EL
remote databases
 Relational database - collection of tables, relations, connected to each other by keys - feature ID
PT
 Primary Keys - attributes whose values can uniquely identify a record in a table
N
 Foreign key is one or more attributes that refer to a primary key in another table
 BLOB (Binary Large Object) - stores a large block of data such as coordinates (for feature
geometries), images, or multimedia as binary numbers
12
Types of Relationships
 One-to-one relationship - one and only one record in a table is related to one and only one record in another table
 One-to-many relationship - one record in a table may be related to many records in another table
 Many-to-one relationship - many records in a table may be related to one record in another table
EL
 Many-to-many relationship - many records in a table may be related to many records in another table
PT
N
13
Primary key Foreign key
Joining Attribute Data
 Non- spatial table to a feature attribute
EL
table - feature attribute table (has
primary key) is the origin and the other
PT
table is the destination (has foreign key)
N
Origin Destination
14
Merging Attribute Data
 Join - brings together two tables by
using keys or a field common to both
tables and appends all columns from one
EL
table to the other based on Key
 Relate - temporarily connects two tables
PT
by using keys or a field common to both
N
tables
 Spatial Join - uses a spatial relationship to
join two sets of spatial features and their
attribute data Spatial Join - Proportion of Urban Population in India
15
Recapitulation
 Descriptive Statistics (Univariate)

 Descriptive Statistics (Multivariate)
 Inferential Statistics
EL
 Types of Attribute Data
 Data Management
PT
• Attribute Data Entry
N
• Database Design
• Relational database
• Data Operations
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 12 : Spatial Interpolation
➢ Spatial interpolation
EL
o Global approaches
o Local approaches
➢ Spatial Interpolation Methods
PT
o Trend Surface
o Regression
N
o Thiessen Polygons
o Triangulated Irregular Networks (TIN) - vector representation of 3D
surface
o Kernel Density Estimation
o Inverse Distance Weighting (IDW)
o Thin Plate Splines (TPS)
Spatial interpolation
• Predicting values on a regular grid from a set of samples irregularly distributed in space
➢ Deterministic process - outcomes are precisely determined through known relationships among
states and events, without any room for random variation - a given input will always produce
same output
➢ Stochastic process - having a random probability distribution or pattern that may be analysed
statistically but may not be predicted precisely and would be an ensemble of different outputs
EL
➢ Global approaches make simultaneous use of all sample data
PT
• Deterministic – Trend Surface
• Stochastic – Regression
N
➢ Local approaches use only a subset of data (moving window)
• Deterministic - Thiessen polygons, Density estimation, Inverse distance weighted, Splines
• Stochastic – Kriging
1
Commonly used approaches of Spatial Interpolation
➢ Trend Surface
➢ Thiessen Polygons
EL
➢ Triangulated Irregular Networks (TIN) - vector representation of 3D surface
PT
➢ Kernel Density Estimation
N
➢ Inverse Distance Weighting (IDW)
➢ Thin Plate Splines (TPS)
➢ Ordinary Kriging
2
Trend Surface
➢ Trend Surface – Deterministic process
➢ Multiple Regression - dependent variable is the variable of interest (temperature, Humidity, Precipitation
etc.) and the independent variables are the data coordinates or some function of data coordinates
➢ Inexact interpolation method - approximates points with known values with a polynomial equation
EL
➢ Equation / interpolator can then be used to estimate values at other points
PT
➢ Equation of linear / first-order trend surface
N
where the attribute value z is a function of x and y coordinates.
b coefficients are estimated from the known points.
Source : http://www.kgs.ku.edu/Tis/surf3/s3trend2.html
3
Estimation of the unknown value at point 0 from five surrounding known points
Table below shows x, y coordinates of points, measured in row and column of a raster with
their known values.
E.g. How to use the equation or a linear trend surface, to interpolate unknown value at point 0.
Point x y Value
EL
1 X1 Y1 V1
2 X2 Y2 V2
PT
3 X3 Y3 V3
4 X4 Y4 V4
N
5 X5 Y5 V5
0 X0 Y0 ?
Least-squares method - to solve for coefficients of b0, b1, and b2
First step - set up three normal equations

Equations can be rewritten in matrix form
EL
PT
N
Deviation or residual between observed and estimated
values can be computed for each known point
Goodness of fit of model can be measured and tested

Trend Surface
➢ The distribution of many natural phenomena is usually more complex than an inclined plane surface from a
first-order model.
➢ Higher-order trend surface models can approximate complex surfaces E.g. hills and valleys can be modelled
using a cubic or a third-order model
EL
➢ A cubic trend surface is based on the equation:
PT
N
➢ A third-order trend surface requires estimation of 10 coefficients (i.e., bi), compared
to three coefficients for a first-order surface.
➢ GIS packages - up to 12th-order trend surface models.
4
Types of Trend Surface Analysis
➢ Logistic trend surface analysis uses known points with binary data (i.e., 0 and 1) and produces
a probability surface.
EL
➢ Local polynomial interpolation uses a sample of known points to estimate unknown value of a
cell – can be used for converting a triangulated irregular network (TIN) to a DEM and deriving
PT
topographic measures from a DEM
N
5
3D Surfaces – Raster & Vector
➢ 3D Surfaces can be created as Raster GRID or vector Triangulated Irregular Network surface
Grids
➢ Easy to store and operate and easy integration with raster databases
➢ Smoother and more natural appearance
EL
➢ Not possible to include varying grid sixes to represent areas of differing complexity of relief
PT
Triangulated Irregular Networks (TIN)
N
➢ Represents a surface as a set of contiguous, non-overlapping triangles
➢ Each triangular surface is represented by a plane
➢ Can describe surface at different level of spatial resolution
➢ Efficiency in storing data
➢ TIN is created using a process called Delaunay triangulation
6
Delaunay triangulation
➢ Created from Contours
➢ Vertices of the contour lines are used as mass points for triangulation
➢ Proximal method that satisfies the requirement that a circle drawn through the
three nodes of a triangle (equiangular) will contain no other node
EL
➢ Any point on the surface is as close as possible to a node
PT
➢ triangulation is independent of the order the points are processed
N
Triangulated Networks are stored in 2 ways
➢ Triangle by Triangle - better for storing attributes (slope, aspect)
➢ Points and Their Neighbours - better for generating contours and uses less storage space
and slope, aspect must be stored and calculated separately
7
Thiessen polygons (Voronoi polygons)
➢ Assume that any point within a polygon is closer to the polygon’s known point than any
other known points, thus one polygon for each known point.
➢ Estimate areal averages of precipitation - any point within a polygon closer to polygon’s
EL
weather station than any other station
➢ Used for service area analysis of public facilities such as hospitals
PT
➢ Uses Delaunay triangulation - each known point is connected to its nearest neighbours -
N
triangles are as equilateral as possible
➢ Constructed after Delaunay triangulation by connecting perpendicular lines to the sides of
each triangle at their midpoints.
8
Density estimation
• Density estimation measures cell densities in a raster by using a sample of known points.
• substitute of point pattern analysis - describes a pattern in terms of random, clustered,
and dispersed
Density Estimation Methods
EL
Simple Density Estimation
o Counting method
PT
o Based on a probability function and offers options in terms of how density estimation is made
o Place a raster over a point distribution, tabulate points that fall within each cell, sum up point
N
values and estimate cell’s density by dividing total point value by cell size
Kernel Density Estimation
o Associates each known point with a kernel function for the purpose of estimation
o Expressed as a bivariate probability density function
o Produces a smoother surface than the simple estimation method
o Applied to urban morphology, road accidents
9
Inverse Distance Weighted Interpolation
➢ Deterministic method for multivariate interpolation
➢ IDW implements the principle - estimated value of a point is influenced more by nearby
known points than by those farther away.
➢ IDW Interpolation is calculated using the following equation
EL
where z0 is the estimated value at point 0,
zi is the z value at known point i,
PT
di is the distance between point i and point 0,
s is the number of known points used in estimation,
k is the specified power.
N
➢ power k controls degree of local influence
➢ k = 1 - linear interpolation - constant rate of change in value between points
➢ k = 2 - rate of change in values is higher near a known point and levels off away from it
➢ Predicted values are within range of maximum and minimum values of known points.
➢ Small enclosed isolines are typical of IDW interpolation
10
Thin-Plate Splines
➢ Conceptually similar to splines for line smoothing
➢ Thin-plate splines create a surface that passes through control points and has least possible
change in slope at all points
➢ physical analogy involving bending of a thin sheet of metal
➢ metal has rigidity similarly TPS fit resists bending - a penalty involving smoothness of fitted
EL
surface.
➢ Deflection is in z direction, orthogonal to plane
PT
➢ In other words, thin-plate splines fit the control points with a minimum curvature surface. The
approximation of thin-plate splines is of the form:
N
where x and y are the x-, y-coordinates of the point to be interpolated, di2 = (x – xi)2 + (y – yi)2,
and xi and yi are the x-, y-coordinates of control point i.
2 components of Thin-plate splines :
local trend function (a + bx + cy) same form as a linear or first-order trend surface, di2
basis function log di designed to obtain minimum curvature surfaces
11
Thin-Plate Splines
➢ The coefficients Ai, and a, b, and c are determined by a linear system of equations
EL
PT
N
where n is the number of control points and fi is the known value at control point i.
The estimation of coefficients requires n + 3 simultaneous equations.
➢ Unlike the IDW method, predicted values from thin-plate splines are not limited within range
of maximum and minimum values of known points.
➢ Major problem with thin-plate splines is steep gradients in data-poor areas, often referred to
as overshoots – several methods exist for correction of overshoots
12
Thin-Plate Splines
➢ Thin-plate splines with tension, for example, allow user to control tension to be pulled on
edges of the surface
➢ Other methods include regularized splines and regularized splines with tension
➢ All these methods belong to a diverse group called radial basis functions (RBF)
➢ Recommended for smooth, continuous surfaces such as elevation and water table,
interpolating mean rainfall surface and land demand surface
EL
Radial basis functions
PT
➢ Refer to a large group of interpolation methods.
➢ They are exact interpolators - basis function or equation determines how surface will fit
N
between control points.
➢ ArcGIS, for example, offers five RBF methods: thin-plate spline, spline with tension,
completely regularized spline, multiquadric function, and inverse multiquadric function. Each
RBF method also has a parameter that controls the smoothness of the generated surface.
➢ Although each combination of an RBF method and a parameter value can create a new
surface, difference between surfaces is usually small.
13
Recapitulation
➢ Spatial interpolation
o Global approaches
o Local approaches
o Trend Surface
EL
o Regression
o Thiessen Polygons
PT
o Triangulated Irregular Networks (TIN) - vector representation
of 3D surface
N
o Kernel Density Estimation
o Inverse Distance Weighting (IDW)
o Thin Plate Splines (TPS)
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 13 : Spatial Interpolation
EL
o Kriging
PT
o Co-Kriging
o Semi-variogram
N
o Elements of Semi-variogram
o Ordinary Kriging
o Universal Kriging
o Comparison of Spatial Interpolation Methods
o Diagnostic Statistics for Error Estimation
Kriging
➢ Different from other local interpolation methods
➢ Geostatistical method for spatial interpolation
➢ Assess quality of prediction with estimated prediction errors
➢ Assumes that the spatial variation of an attribute is neither absolutely random (stochastic) nor
EL
deterministic.
PT
3 components of Spatial Variation
N
➢ A spatially correlated component - representing variation of regionalized variable
➢ A drift or structure - representing a trend
➢ Random error term
1
Semivariogram
➢ Semivariogram useful for investigating
➢ Spatial Dependance - spatial variability of event under study
➢ Directional Differences - semivariance values changes more rapidly in one direction than
another
EL
➢ Anisotropy - existence of directional differences in spatial dependence
➢ Isotropy - spatial dependence changes with the distance but not the direction
PT
N
Measures Spatial
Semivariance spatially dependence or
used in Kriging correlated spatial
component autocorrelation
1
Semivariogram
Semivariance is calculated using the following equation:
EL
where γ(h) is the semivariance between known points, xi and xj, separated by the distance h; and
z is the attribute value
PT
In case of spatial dependance
N
➢ Small semivariances in data points close to each other
➢ Semivariances increases for points that are farther apart
2
Semivariogram
➢ Binning process - used to average semivariance data by distance and direction in

kriging
➢ Semivariogram plots the average semivariance against the average distance
EL
➢ Average semivariance is computed using
PT
N
where γ(h) is the average semivariance between sample points separated by lag h;
n is the number of pairs of sample points sorted by direction in the bin;
z is the attribute value.
3
Models
➢ Semivariogram - measure of spatial autocorrelation in data set
➢ Semivariogram fitted with a mathematical function or model can then be used for estimating
semivariance at any given distance
Selecting a model is a difficult task
EL
o Since there are number of models to choose from
o Lack of a standardized procedure for comparing the models
PT
➢ Visual inspection and cross-validation method - uses statistics for comparing interpolation methods
N
Two models commonly used for fitting semivariograms
4
Two models commonly for fitting semivariograms
• Progressive decrease of spatial dependence

Spherical until some distance, beyond which spatial
EL
dependence levels off
PT
• Exhibits a less gradual pattern than a
N
spherical model
Exponential • Spatial dependence decreases exponentially
with increasing distance and disappears
completely at an infinite distance
5
3 Elements of Semivariogram
• Semivariance at a distance of 0
Nugget • Represents measurement error, or microscale
variation
EL
PT
• Range is the distance at which the semivariance
Range starts to level off
• Spatially correlated portion of the semi-variogram
N
Sill • Beyond range semivariance becomes a somewhat
constant value
6
Ordinary Kriging
➢ Assumes absence of a drift
➢ Uses spatially correlated component
➢ Applies fitted semi-variogram for interpolation.
EL
➢ Ordinary Kriging estimates z value at a point using the equation
➢ Produces a variance measure for each estimated point to indicate reliability of estimation
PT
N
where z0 is the estimated value, zx is the known value at point x, Wx is the weight associated with
point x, and s is the number of sample points used in estimation.
➢ Weights are calculated by solving a set of simultaneous equations
7
Universal kriging
➢ Assumption - spatial variation in z values has a drift or a trend in addition to spatial correlation between sample points
➢ Performed on residuals after trend is removed – also known as residual kriging
➢ Kriging process uses either
First-order (plane surface)
EL
where M is the drift, xi and yi are the x, ycoordinates of sampled point i, and b1 and b2 are the drift coefficients.
PT
Second-order (quadratic surface) polynomial
N
➢ higher order polynomials - not recommended as
➢ it leaves little variation in the residuals for assessing uncertainty
➢ Larger number of the bi coefficients, which must be estimated along with the weights, and a larger set of
simultaneous equations to be solved.
8
Kriging and cokriging

➢ Generalized forms of univariate and multivariate linear regression models to estimation at a point, over an area, or
within a volume
➢ Linear-weighted averaging methods, similar to other interpolation methods;
EL
➢ Their weights depend on distance, direction and orientation of neighboring data to unsampled location
PT
N
Other Kriging Methods
➢ Ordinary kriging and universal kriging are the most commonly used methods
➢ Few other kriging methods
9
Few other kriging methods
• Trend component is a constant and known mean, which is

Simple Kriging often unrealistic
EL
• Uses binary data (i.e., 0 and 1) rather than continuous data
Indicator Kriging • Interpolated values are therefore between 0 and 1, similar to
probabilities
PT
Disjunctive • Function of the attribute value for interpolation
N
• More complicated than other kriging methods
Kriging computationally
• Estimates the average value of a variable over some small area

Block Kriging or block rather than at a point
10
Cokriging
➢ Regression methods only use data available at target location and donot use existing spatial correlations
from secondary-data control points
➢ Multivariate variant of the Ordinary Kriging operation
EL
➢ Uses covariance between one or more secondary variables, which are correlated with the primary
PT
variable of interest, in interpolation
➢ Uses covariance between two or more regionalized variables that are related
N
➢ Appropriate when main attribute of interest is sparse and related secondary information is abundant
➢ Correlation between the variables can improve the prediction of the value of the primary variable
➢ E.g. Using elevation or other topographical variables with precipitation
11
Cokriging
Cokriging methods
EL
Ordinary Cokriging
PT
Universal
Cokriging
N
12
Comparison of Spatial Interpolation Methods
➢ Large number of spatial interpolation methods are available – giving different interpolation results using
same data
➢ Cross-validation - compares the interpolation methods by repeating the following procedure for each
EL
interpolation method to be compared
PT
Calculate Aggregate
Use the
N
predicted diagnostic
remaining
error of statistics to
Remove a points to
estimation by assess the
known point estimate
comparing accuracy of
from data set value at point
estimated the
previously
with known interpolation
removed
value method
13
Diagnostic Statistics
➢ One sample for developing the model for each interpolation method to be compared and other sample
for testing accuracy of models
EL
Root Mean Square
(RMS) error - quantifies • Trend Surface, Regression
differences between • Small RMS is desirable for an
PT
known and estimated
values at sample points optimal solution
N
Standardized RMS Error • Kriging
- requires variance of • Desirable Standardized RMS
estimated values
Error close to 1
14
Recapitulation

o Kriging
o Co-Kriging
o Semi-variogram
EL
o Elements of Semi-variogram
o Ordinary Kriging
PT
o Universal Kriging
o Comparison of Spatial Interpolation Methods
N
o Diagnostic Statistics for Error Estimation
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 14 : Network Analysis
➢ Networks
EL
➢ Network Connectivity
PT
➢ Summaries of Network Characteristics
N
➢ Identifying Shortest Paths
➢ Travelling Salesman Problem
Networks Analysis
Networks
➢ Constituent parts of a simple network

o arcs, vertices (representing a change in direction of the arc)
o nodes at the end of each arc and connecting different arcs
EL
➢ Connections between places within networks
PT
➢ Characterization of network complexity
➢ Finding shortest paths between start and end points
N
➢ Application areas - determination of optimal routes for emergency vehicles
1
Networks Analysis
Network connectivity
➢ The connectivity of a network - represented using connectivity matrix

➢ Square matrix that contains the arc labels as its column and row headings
➢ Matrix indicates those nodes that are connected by an arc (assigned a value of 1) and
EL
those that are not (given 0)
PT
o First order connectivity - nodes that are directly connected by arcs
o Second order connectivity - nodes that are connected by two arcs with a further
N
node in between
➢ Connectivity matrix is symmetric
2
Networks Analysis
Summaries of Network Characteristics
➢ Range of measures of network characteristics
o γ index - complexity of a network
o α index - degree of connectedness
EL
➢ A better connected network has larger values of g and a - provide a useful summary of
network complexity and connectivity.
PT
➢ A planar graph has no intersecting arcs
N
➢ Planar graph with n nodes has a maximum of 3(n - 2) number of links
➢ Non-planar graphs - three-dimensional air transport networks, the maximum number of
links is n(n - 1)/2
3
Networks Analysis
Summaries of Network Characteristics
➢ Measure of network connectivity - number of circuits, c, that exist within a network

➢ Circuit has a start node that is same as end node and it comprises a closed loop
➢ Minimally connected network has no circuits
EL
➢ Number of circuits is calculated by subtracting number of arcs required to form a minimally
PT
connected network from observed number of arcs in network
N
l-n+1
l is the number of arcs

n is the number of nodes
4
Networks Analysis
• Only source cell has a cell value other cells are assigned no data
Least-Cost Path Analysis source • Source cell is an end point of a path – origin or destination
raster
• For a cell we calculate least accumulated cost path to source cell or
to closest source cell if two or more source cells are present
cost
Least-Cost Path raster
EL
Analysis for deriving
least accumulative
PT
cost path requires cost
distance
measure
N
s
algorithm
5
Cost or Impedance to move through each cell
Networks Analysis 3 characteristics of cost raster :
❑ Cost for each cell is sum of different costs - construction and
Least-Cost Path Analysis operational costs and potential costs of environmental impacts
source ❑ Actual or relative cost
raster relative costs are expressed in ranked values involves a wide variety
of cost factors costs may be ranked from 1 to 5, with 5 being
highest cost value.
cost
relative costs - means of standardizing different cost factors for
EL
raster least-cost path analysis e.g. aesthetics, wildlife habitats, and
Least-Cost Path cultural resources (costs of intangible factors)
PT
❑ Cost factors may be weighted by relative importance of each factor
least accumulative
cost path requires
N
cost
distance ❑ Cost Raster – prepared by Compiling and
measures
evaluating a list of cost factors
❑ Make raster for each cost factor - multiply each
cost factor by its weight - use a local operation
algorithm to sum individual cost rasters.
❑ Local sum is total cost necessary to traverse
each cell.
6
❑ Based on node-link cell representation
Networks Analysis ❑ A node represents center of a cell, and a link— either a lateral
link or a diagonal line—connects node to its adjacent cells.
Least-Cost Path Analysis ❑ A lateral link connects a cell to one of its four immediate
source
neighbors, and a diagonal link connects cell to one of corner
raster neighbors.
❑ Distance is 1.0 cell for a lateral link and 1.414 cells for a
diagonal link.
cost
❑ Cost distance to travel from one cell to another through a
EL
Least-Cost Path
raster lateral link is 1.0 cell times average of two cost values:
PT
1 × [(Ci + Cj)/2]
least accumulative
cost path requires
N
cost
distance where,
measures
Ci is cost value at cell i, and Cj cost value at neighboring cell j
algorithm cost distance to travel from one cell to another through a

diagonal link, on or hand, is 1.414 cells times average of two
cost values
7
Networks Analysis
Least-Cost Path Analysis
source
raster
cost
EL
raster
Least-Cost Path
PT
least accumulative Costs distance in:
cost path requires Laterally linked cells : (1 + 2)/2 = 1.5
N
cost
distance
measures
diagonally linked cells average cost times 1.414 : 1.414 × [(1 + 5)/2] = 4.2
algorithm
8
Networks Analysis
Least-Cost Path Analysis
source
raster
cost Source Cells Cost Raster
EL
raster
Least-Cost Path
PT
least accumulative
cost path requires
N
cost
distance
measures
algorithm
Cost Distance for Each Link Least Accumulative Cost
Distance from Each Cell
9
Networks Analysis ❑ Different outputs of cost distance
measure operation
Least-Cost Path Analysis ▪ least accumulative cost raster
source
raster
▪ direction raster - showing
direction of least-cost path for
each cell
cost
EL
raster
Least-Cost Path
PT
▪ allocation raster - showing
least accumulative assignment of each cell to a
cost path requires
N
cost
distance
source cell on basis of cost
measures distance measures
algorithm
▪ shortest path raster - showing
least cost path from each cell to
a source cell
10
Dijkstra’s algorithm
Networks Analysis
Activating cells adjacent to source cell and by computing costs to cells
Least accumulative cost
distance from each cell Cell with lowest cost distance is chosen from active cell list, and its value
is assigned to output raster
❑ Obtained after each possible path
has been evaluated Cells adjacent to chosen cell are activated and added to active cell list -
❑ Derived using Dijkstra’s algorithm
EL
lowest cost cell is chosen from list and its neighboring cells are activated
- iterative process
PT
Each time a cell is reactivated - cell is accessible to source cell through a
different path - its accumulative cost is recomputed
N
Lowest accumulative cost is assigned to reactivated cell
Edge value in graph are weights Process continues until all cells in output raster are assigned with their least
Node Value in tree are total weights accumulative costs to source cell
11
Networks Analysis
Traveling Salesman Problem
➢ Routing problem - salesman must visit each of select stops only once, and salesman may start
from any stop but must return to original stop
➢ Objective is to identify a route that would minimize total impedance value
➢ solved using a heuristic method
EL
➢ Beginning with an initial random tour - runs a series of locally optimal solutions by swapping
PT
stops that yield reductions in cumulative impedance
N
➢ Iterative process ends when no improvement can be found by swapping stops
➢ Creating a tour with a minimum, or near minimum, cumulative impedance
➢ Tabu Search being best-known algorithm.
➢ Time window constraint can also be added to TSP so that tour must be completed with a
minimum amount of time delay.
12
Networks Analysis
Distance
Network
❑ System of linear features that has appropriate attributes for flow of objects
Delay at Assessing
intersection Commuting
❑ Road, railways, public transit lines, bicycle paths, and streams are examples (due to cost of a cost
signals)
of network link
EL
❑ Topology-based: lines meet at intersections, lines cannot have gaps, and lines
PT
have directions
Queue
N
❑ Road network – link and turn impedance and restrictions (one way street
etc) – these data are aggregated for creating a real-world street network
❑ Links are basic geometric features of a network - defined by two end points
13
Link Impedance
Networks Analysis Distance
Link Impedance in an Urban Context

❑ Cost of traversing a link Turn Impedance
❑ A simple measure of cost is physical length of link. Delay at Assessing
intersection Commuting
❑ length may not be a reliable measure of cost (due to cost of a cost
signals)
❑ link impedance is travel time estimated from length and speed on a link. link
❑
EL
there are variations in measuring link travel time.
❑ Travel time - directional - travel time in one direction different from other
PT
direction Queue
❑ Directional travel time - entered separately in two fields (e.g., from-to and to-
N
from).
❑ Travel time – depends on time of day and by day of week - setup of different
network attributes for different applications
❑ Create a turn table listing having turn impedance at intersections for an
urban network application
14
Recapitulation
➢ Networks
➢ Network Connectivity
EL
➢ Summaries of Network Characteristics
PT
➢ Identifying Shortest Paths
N
➢ Travelling Salesman Problem
N
PT
EL
EL
PT
N
IIT KHARAGPUR
Lecture 15 : Service or Trade Area Analysis in an Urban Area
 Service or Trade Area
EL
• Descriptive Trade Area
• Prescriptive Service Area
PT
 Mathematical Representation of Service / Trade Area
N
 Spatial Interaction
o Gravity Model
o Reilly Law
o Huff’s Law
Service or Trade Area Analysis
Measure of Physical Distance
 Estimate the market area or trade area of a business

 Determining service area
 Location choice of a service oriented establishment
EL
 Trade area may be viewed as descriptive
PT
 Customers of a business are described in spatial and aspatial ways
 Service area is prescriptive - company must prescribe how products are to be delivered to
N
customers
 Descriptive and prescriptive context in detailing with trade and service areas.
1
Descriptive Trade Area
 Several shops to choose from - range of potential choices
 An individual’s decision to shop at a store, say Grog4Less, represents a small increment in sales
to that store.
 Customer Card - information about its customers
o who its customers are.
EL
o where customers live - generate a map of the store’s trade area
o Analyse temporal changes in the customer base.
PT
 Size and extent of market area depends on
o location of the store
N
o location of its competitors
o distribution of potential customers
o products and services it sells
 Urban areas grow, decline, and change in various ways over time - Companies should be
sensitive to these changes for their survival
2
Descriptive Trade Area Delineation
 2 approaches for delineating a trade area
o Assumed or Ad hoc boundaries (2 methods)
• Customer-spotting method
 Generates boundaries based on observed customer locations
• Analog approach
EL
 Comparative data from existing stores to derive a trade area
 Data include store similarity, competition, expected market share, size/density of
PT
potential customers near a store etc.
 Regression models are then used to estimate expected sales, or other
N
performance metrics, for a defined trade area
o Use a model to derive a trade area - explicitly spatial model ling approaches
• spatial interaction model
• kernel functions and other geostatistical approaches
3
Descriptive Trade Area Delineation - Spatial Interaction Approach
 Advantage of this approach - geographic specificity, general applicability, and its modest input
requirements
 Proximity or distance is fundamental determinant of an individual visiting a particular store
 Gravity Model - formalizes interaction between two areas as a function of distance and
expected propensity to interact
EL
estimate of interaction between areas i and j
i = index of areas ( j an index as well)
PT
as a proportion of product of their
populations, pi pj, divided by the distance pi = population of area i
N
separating them raised to the power of λ dij = distance between areas i and j
k = proportionality constant
λ = distance decay factor
Iij = interaction between areas i and j
4
Descriptive Trade Area - Spatial Interaction Approach
 Interaction between two areas decreases as distance separating them increases, and increases
with increases in their populations.
 Gravity Model does not explain why interaction occurs - estimates general propensity for such
interaction
 Reilly (1929) first proposed using gravity model for dividing up market area between two
EL
competing towns
 Further suggested that ratio of interaction of settlement with towns A and B could be used to
PT
examine the market
N
Settlement at i situated between
two towns, A and B
IAi is amount of grocery sales at town A from intermediate settlement

IAi is amount of grocery sales at town B from intermediate settlement
5
Reilly’s law - ratio of interaction
 Ratio of Interaction of settlement with towns A and B
 ratio estimates how sales are split between two nearby towns, based on their respective sizes
and distances of the towns
 Trade Area Boundary Delineation - identify the point at which the trade influence between the
EL
two towns is equal
PT
Settlement at i situated
N
between two towns, A and B
IAi is amount of commodity sales at town A from intermediate settlement

IAi is amount of commodity sales at town B from intermediate settlement
does not require an estimate of the proportionality constant k
assumes a distance decay effect of λ = 2
6
Reilly’s law - ratio of interaction
 Trade Area Boundary Delineation
 Point j is defined by setting the ratio of interaction in the
equation equal to 1
EL
PT
Indifference location - Point j - by
dAB = dAj + dBj is the distance between the two towns setting ratio of interaction equal to 1
N
We can solve for distance dBj
if we know the distance from B to j along the line,
we therefore know its location
breakpoint formula
7
Descriptive Trade Area - Spatial Interaction Approach – Huff’s model
 Trade areas for multiple stores simultaneously
 Account for potential interactions between all pairs of stores
 Spatial interaction model for two or more stores
 Expected probability of interaction of a customer can be determined
 Commonly used measure of store attractiveness is store size
EL
numerator accounts for gravity-based
interaction of area i and store j
PT
denominator accounts for interaction with all
N
of stores
i = index of areas (i = 1, 2, 3, . . . , n)
j = index of stores (j = 1, 2, 3, . . . , m)
pi = population in area i
sj = attractiveness of store j Market boundaries between
αij = probability that a customer in area i will shop at store j competing stores
8
Recapitulation
 Service or Trade Area

EL
 Mathematical Representation of Service / Trade Area
PT
N
 Spatial Interaction
o Gravity Model
o Reilly Law
o Huff’s Law
N
PT
EL

Week 3 Lecture Material

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Week 3 Lecture Material

Uploaded by

Copyright:

Available Formats

EL

 Cannot be used for analyzing nominal or categorical variables

 Standard deviation – degree of dispersion around the mean

 Cannot be used for analyzing nominal or categorical variables

 Line of best fit (equation of the line – regression equation)

µ is the mean average

 Spatial and Aspatial data should be treated differently

 Deﬁne each ﬁeld in the table –

 Flat File - single, two-dimensional array of data elements

 Non- spatial table to a feature attribute

 Descriptive Statistics (Univariate)

First step - set up three normal equations

Goodness of ﬁt of model can be measured and tested

Density Estimation Methods

➢ Binning process - used to average semivariance data by distance and direction in

Selecting a model is a difficult task

• Progressive decrease of spatial dependence

Kriging and cokriging

• Trend component is a constant and known mean, which is

• Estimates the average value of a variable over some small area

➢ Spatial Interpolation Methods

➢ Constituent parts of a simple network

➢ The connectivity of a network - represented using connectivity matrix

➢ Measure of network connectivity - number of circuits, c, that exist within a network

l is the number of arcs

Ci is cost value at cell i, and Cj cost value at neighboring cell j

algorithm cost distance to travel from one cell to another through a

cost Source Cells Cost Raster

Networks Analysis Distance

Link Impedance in an Urban Context

 Estimate the market area or trade area of a business

IAi is amount of grocery sales at town A from intermediate settlement

IAi is amount of commodity sales at town A from intermediate settlement

 Service or Trade Area

You might also like