SpaceStat Chapter2

PART II
CHAPTER II
GENERAL CONCEPTS IN SPATIAL DATA ANALYSIS
Assist. Prof. Dr. Mahmut Çavur
2.1. Introduction
Spatial data analysis involves:
• Accurate description of data relating to a process in space
• Exploration of patterns and relationships in data
• Search for explanations of such patterns and relationships
These relate to:
• Visualizing spatial data
• Exploring spatial data
• Modeling spatial data
2.2. Visualizing Spatial Data
An essential requirement in any data analysis is the ability to “see” the data being analyzed. Plots of data and other graphical displays are fundamental tools for:
• Seeking patterns
• Generating hypotheses
• Assessing the fit of proposed models
• Determining the validity of predictions derived from models
Maps are the tools for visualizing spatial data. Hence a GIS can provide an environment in which to create maps of spatial data and to explore spatial patterns and relationships quickly and easily.
Cartographic considerations are important when using maps in spatial data analysis, because a bad choice of map type, or of the scaling used for data values, can:
• Lead to misleading conclusions being drawn from the display
• Suggest inappropriate models for the process under study
2.3. Exploring Spatial Data
Exploratory methods for spatial data may take the form of:
• Maps
or
• Conventional plots
E.g. Some exploratory techniques, when applied to point events, result in a contour map of the estimated intensity of occurrence of events over the whole study area; others, applied to the same set of events, result in a graph that throws light on the degree of spatial dependence between event locations.
Exploring spatial data:
• Provides good descriptions of the data
• Helps to develop hypotheses
• Helps to establish appropriate models
If many exploratory spatial techniques result in different forms of maps, then how do they really differ from visualization techniques?
2.3.1. Distinction between visualizing and exploring spatial data
The dividing line between visualization of spatial data and exploratory data analysis is somewhat artificial. The distinction is based on the degree of data manipulation.
E.g.
Suppose that we have age-standardized, cause-specific death rates for a number of administrative zones.
Visualizing spatial data involves:
• A map of the death rates
• A simple transformation of the rates
(No data manipulation)
Exploring spatial data involves:
• A map of a spatial moving average of the rates, which smooths out local variations so that global trends can be seen clearly (each rate is replaced by the average of itself and the rates of its neighboring zones)
(Data manipulation)
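The spatial moving average described above can be sketched in a few lines of Python. This is only an illustration: the zone names, the rates and the neighbor lists are hypothetical, and neighbors are assumed to be supplied by hand rather than derived from zone boundaries.

```python
# Hypothetical zones: each has a death rate and a list of neighboring zones.
rates = {"A": 12.0, "B": 8.0, "C": 10.0, "D": 30.0, "E": 9.0}
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}

def spatial_moving_average(rates, neighbors):
    """Replace each rate by the mean of itself and its neighbors' rates."""
    smoothed = {}
    for zone, rate in rates.items():
        window = [rate] + [rates[n] for n in neighbors[zone]]
        smoothed[zone] = sum(window) / len(window)
    return smoothed

smoothed = spatial_moving_average(rates, neighbors)
for zone in sorted(rates):
    print(zone, rates[zone], round(smoothed[zone], 2))
```

In this toy data set the isolated high rate in zone D is pulled towards the values of its neighbors, so the smoothed map shows less local variation than the raw map.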
2.3.2. Distinction between exploring and modeling spatial data
Exploratory methods do not involve any explicit model for the data. However, several exploratory techniques involve an informal comparison of some summary of the data with what a model would lead us to expect. Hence models do enter into exploratory techniques. The distinction is based on the extent to which such comparisons with a model are made, and on the assumptions on which the model depends.
E.g.
Stan Openshaw (a quantitative geographer) tried to detect clusters in point distributions of the incidence of childhood leukemia. For this purpose he used a technique that exhaustively examines the observed intensity of events in circles of varying radius, centered on a fine grid imposed over the study area, and compares it with the intensity expected if cases occurred at random. The circles with significant discrepancies are identified and retained for later display and investigation. The technique thus involves a model of a random pattern and performs repeated formal statistical comparisons with this model.
However, the validity of such comparisons does not depend on the assumption of any specific alternative model. The technique detects clusters; it does not search for an explanation of the process by which such clusters occur.
Therefore, this form of analysis makes few a priori assumptions about the data and is fully in line with exploratory methods.
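A much simplified sketch of this kind of circle scan is given below. It is not a reproduction of Openshaw's procedure: it assumes events in a unit square, a uniform population at risk (so the expected count in a circle is proportional to its area), and a Poisson test for an excess of events. The function names, grid step, radii and significance threshold are all hypothetical choices.

```python
import math
import random

def poisson_upper_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mean)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mean / i        # P(X = i)
        cdf += term
    return max(0.0, 1.0 - cdf)

def scan_circles(points, grid_step=0.1, radii=(0.05, 0.1), alpha=0.001):
    """Return (cx, cy, r, observed, expected) for circles with an excess of events."""
    n = len(points)
    flagged = []
    steps = int(round(1.0 / grid_step))
    for i in range(1, steps):
        for j in range(1, steps):
            cx, cy = i * grid_step, j * grid_step
            for r in radii:
                observed = sum(1 for (x, y) in points
                               if (x - cx) ** 2 + (y - cy) ** 2 <= r * r)
                # Expected count under a random pattern (unit square has area 1).
                expected = n * math.pi * r * r
                if poisson_upper_tail(observed, expected) < alpha:
                    flagged.append((cx, cy, r, observed, expected))
    return flagged

# Simulated data: a random background plus one artificial cluster near (0.7, 0.3).
random.seed(1)
points = [(random.random(), random.random()) for _ in range(200)]
points += [(random.gauss(0.7, 0.02), random.gauss(0.3, 0.02)) for _ in range(30)]
flagged = scan_circles(points)
print(len(flagged), "circles flagged")
```

The scan only reports where the observed counts depart from the random-pattern model; it says nothing about why a cluster occurs, which is the sense in which the technique remains exploratory. Because many circles are tested, a real application would also need to account for multiple testing.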
2.4. Modeling Spatial Data
Models are mathematical abstractions of reality, not reality itself. A statistical model involves using a combination of both:
• Data
• Reasonable assumptions
about the nature of the phenomenon being modeled. The assumptions arise from:
• Background theoretical knowledge about the behavior of the phenomenon
• The results of previous analyses of the same or a similar phenomenon
• The judgement and intuition of the modeler
A statistical model for a stochastic process consists of specifying a probability distribution for the random variable or variables that represent the phenomenon. Once a probability distribution is fully specified, there is effectively nothing further that can be said about the behavior of the process. A fitted model is evaluated, and the results may lead to modifying the assumptions, updating the existing model, or using a different one.
E.g.
Consider modeling levels of ozone in a large rural area R. The ozone level at each location s in R will vary during the day and from day to day. A linear regression model can be fitted to explain the distribution of ozone levels.
Figure 2.1. Ozone levels
Basic Assumptions:
1. The random variables {Y(s), s ∈ R} are independent.
2. The probability distributions of the random variables Y(s) differ only in their mean values.
3. The mean value is a simple linear function of location.
4. Y(s) has a normal distribution about this mean with the same constant variance, σ².
The model:
$$Y(s) = \beta_0 + \beta_1 s_1 + \beta_2 s_2 + \varepsilon(s), \qquad \varepsilon(s) \sim N(0, \sigma^2)$$
where s1 and s2 are the spatial coordinates of s.
The assumptions provide a framework under which the final model specification reduces to a problem of estimating unknown parameters. The βi can be estimated by the method of maximum likelihood.
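Under assumptions 1 to 4 (independent, normally distributed values with constant variance and a mean that is linear in the coordinates), the maximum likelihood estimates of the coefficients coincide with the ordinary least squares estimates, and the maximum likelihood estimate of σ² is the mean squared residual. The sketch below illustrates this on simulated data; the "true" coefficients, the number of sites and the function names are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_trend(sites, values):
    """ML estimates of (b0, b1, b2) and sigma^2 for Y = b0 + b1*s1 + b2*s2 + e."""
    rows = [(1.0, s1, s2) for (s1, s2) in sites]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    residuals = [y - sum(b * x for b, x in zip(beta, r))
                 for r, y in zip(rows, values)]
    sigma2 = sum(e * e for e in residuals) / len(values)  # ML estimate divides by n
    return beta, sigma2

# Simulated "ozone" data with a known linear trend plus normal noise.
random.seed(0)
sites = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]
values = [40.0 + 2.0 * s1 - 1.0 * s2 + random.gauss(0.0, 1.0) for (s1, s2) in sites]
beta, sigma2 = fit_linear_trend(sites, values)
print([round(b, 2) for b in beta], round(sigma2, 2))
```

The estimates should lie close to the values used in the simulation (40, 2, -1 and 1). With real spatial data the independence assumption is often doubtful, which is exactly why the fitted model then has to be evaluated.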
The next step is to test the reliability of the model, or its goodness of fit. This can be done using hypothesis-testing methods. Hypothesis testing, which involves comparing the fit of a hypothesized model with that of an alternative, is in fact one facet of statistical modeling. At this step the question is:
• Does a model in which certain parameters have pre-specified values fit the data significantly less well than a model in which those parameters are free?
Figure 2.2. Analysis of spatial data
2.5. Practical Problems of Spatial Data Analysis
There are basically four types of problem that an analyst can face:
1. Problem of geographical scale
2. Lack of spatial indexing
3. Problem of edge or boundary effects
4. Problem of modifiable areal units
Problem 1: Geographical scale at which analyses are performed.
Spatial data analysis is concerned with detecting and modeling spatial pattern. However, a pattern at one geographical scale may be simply random variation in another pattern at a different scale.
E.g. Local variations in disease rates may die out at the national scale.
The scale to which a spatial analysis relates depends on:
• The phenomenon under study
• The objective of the analysis
• The scale at which the data were collected

Problem 2: Lack of spatial indexing or ordering in space.
An indexing implies that we have a natural notion of what is next or previous. On a regular grid there is a reasonably natural ordering of locations. However, most spatial data are not indexed in this way. While some data (e.g. those from satellites) come in the form of a regular grid or lattice, much spatial data are provided for a patchwork quilt of areal units or for an irregularly distributed set of sites.
E.g. For areal units, we can speak of the neighborhood of a zone only in terms of the zones that share a common boundary with it.
Problem 3: Problem of edge or boundary effects.
In the middle of a study area, a site or zone is likely to be surrounded by others; i.e. the zone has neighbors on all sides. However, at the edge of the map or study region, the neighbors extend in one direction only. In the spatial domain there is potentially a much larger set of observations around the edge of the map than, for example, at the two ends of a time series. Therefore edge effects play a critical role. This problem can be overcome by leaving a guard area: a border strip whose observations are used only as neighbors of interior sites and are not analyzed themselves.
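A guard area can be sketched as follows. The example assumes a rectangular study region and point data; the coordinates, the guard width and the function names are hypothetical. Statistics are computed for interior points only, while points in the guard area still count as neighbors.

```python
def split_by_guard_area(points, bounds, guard_width):
    """Split points into (interior, guard) given bounds=(xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bounds
    interior, guard = [], []
    for (x, y) in points:
        inside = (xmin + guard_width <= x <= xmax - guard_width and
                  ymin + guard_width <= y <= ymax - guard_width)
        (interior if inside else guard).append((x, y))
    return interior, guard

def count_neighbors(point, others, radius):
    """Number of other points within `radius` of `point`."""
    px, py = point
    return sum(1 for (x, y) in others
               if (x, y) != (px, py)
               and (x - px) ** 2 + (y - py) ** 2 <= radius ** 2)

points = [(0.5, 0.5), (1.5, 5.0), (2.5, 5.0), (5.0, 5.0), (5.5, 5.5), (9.5, 9.0)]
interior, guard = split_by_guard_area(points, bounds=(0, 0, 10, 10), guard_width=2.0)

# Neighbors are searched among ALL points, so guard points still contribute.
neighbor_counts = {p: count_neighbors(p, points, radius=2.0) for p in interior}
print(interior)
print(guard)
print(neighbor_counts)
```

The interior point (2.5, 5.0) has one neighbor, and that neighbor lies in the guard area. Had the guard points been discarded altogether, this point would have appeared to have none, which is the kind of distortion the guard area is meant to avoid.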
Problem 4: Problem of modifiable areal units.
When data are measurements on a set of zones, they are often aggregates of measurements on the households or individuals living in each zone. For the sake of confidentiality, the data are released for arbitrary areal units. The important point is that any result from the analysis of these area aggregations is usually conditional on the set of zones used. With a different aggregation into areas, the result is subject to change.
[Figure: the same data under four aggregations. Panel captions: Mean = 8.88, areal units = 9; Mean = 8.33, areal units = 3; Mean = 8.47, areal units = 3; Mean = 9.33, areal units = 3]
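The dependence of results on the zoning can be illustrated with a small sketch. The nine cell values and the three groupings below are hypothetical and are not the data behind the figure; the summary computed for each zoning is the mean of the zone means and the range of the zone means.

```python
from statistics import mean

# Hypothetical values for nine basic cells of a 3 x 3 grid, row by row.
cells = [2, 4, 6,
         8, 10, 12,
         5, 15, 19]

# Three hypothetical ways of grouping the nine cells into three zones
# (each list gives the zone label of the corresponding cell).
zonings = {
    "rows":      [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "columns":   [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "irregular": [0, 0, 1, 0, 0, 1, 2, 1, 1],
}

def zone_means(cells, labels):
    """Mean cell value within each zone, ordered by zone label."""
    zones = {}
    for value, label in zip(cells, labels):
        zones.setdefault(label, []).append(value)
    return [mean(v) for _, v in sorted(zones.items())]

def summarize(cells, labels):
    """Return (mean of zone means, range of zone means) for one zoning."""
    zm = zone_means(cells, labels)
    return mean(zm), max(zm) - min(zm)

print("cells:", mean(cells))
for name, labels in zonings.items():
    avg, spread = summarize(cells, labels)
    print(name, round(avg, 2), round(spread, 2))
```

With equal-sized zones (rows or columns) the mean of the zone means equals the overall mean of 9, but the spread between zones differs; with the irregular zoning even the mean changes, to 8. Nothing about the underlying cells has changed, only the way they are aggregated.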

2.6. Computers and Spatial Data Analysis
Q: Given that some spatial analysis capabilities are available in widely used systems, is there a need for spatial analysis functions beyond those currently provided by GIS?
A: At present, yes!
E.g. A GIS can currently overlay a set of points (cases of childhood cancer) onto a set of polygons (buffer zones constructed along high-voltage power lines). The GIS can then count how many points lie within particular polygons by performing a “point-in-polygon” operation.
However, it is hard to find a system that evaluates the statistical significance of the association between the set of points and the set of polygons.
If we want to know whether there is a statistically significant association between the incidence of childhood cancer and proximity to high-voltage power lines, we cannot do this readily using a GIS.
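The “point-in-polygon” count mentioned above can be sketched with the standard ray-casting test. The polygons and case locations below are hypothetical, and the sketch makes no attempt to handle points lying exactly on a polygon boundary.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points_in_polygons(points, polygons):
    """Number of points falling inside each named polygon."""
    return {name: sum(point_in_polygon(p, poly) for p in points)
            for name, poly in polygons.items()}

# Hypothetical buffer zones and case locations.
buffers = {"buffer_1": [(0, 0), (4, 0), (4, 2), (0, 2)],
           "buffer_2": [(5, 1), (9, 1), (9, 3), (7, 5), (5, 3)]}
cases = [(1, 1), (3, 1.5), (6, 2), (7, 4), (10, 10), (4.5, 1)]
counts = count_points_in_polygons(cases, buffers)
print(counts)
```

The counts are purely descriptive. Deciding whether they are larger than would be expected by chance requires a statistical model of where cases would fall in the absence of any association, which is the step the text says a GIS does not readily provide.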
There are several ways to use computers in spatial data analysis. Most of the time, spatial analysis techniques are coupled with a GIS.
2.6.1. Methods of coupling GIS and spatial data analysis
There are four different methods of using spatial analysis techniques with GIS:
• Full integration
• Loose coupling
• Close coupling
• Special combinations
Full integration: Every method for exploratory spatial analysis and modeling is available within the GIS.
Loose coupling: Data are exported from the GIS for use within a spatial statistical framework (i.e. the GIS and separate spatial analysis software talk to each other).
Close coupling: Spatial analysis routines are called from within the GIS (which requires use of the macro-language capabilities of the GIS).
Special combinations: A self-contained spatial analysis system is developed for a specific purpose (Case I), or spatial analysis and GIS functions are added to a standard statistical package (Case II).
2.7. Stationarity and Isotropy (terminology)
A spatial phenomenon is represented within a spatial domain R, and each location in R is denoted by s. The collection of random variables Y(s) over the locations s in R is referred to as a spatial stochastic process, {Y(s), s ∈ R}.
$s \in R$: any data location in R
$s = (s_1, s_2)^T$: location vector of point s
$\{Y(s) : s \in R\}$: spatial stochastic process
Modeling real-life problems requires data and assumptions about the nature of the phenomena.
Figure 2.7. A spatial stochastic process

Spatial stochastic processes often exhibit a degree of spatial correlation, and this correlation somehow has to be incorporated into the analysis.
In general, the behavior of spatial phenomena is the result of a mixture of two types of effects:
• First order
• Second order
First-order effects: These relate to variation in the mean value of the process in space (a global or large-scale trend).
Second-order effects: These result from the spatial correlation structure, or spatial dependence, in the process. In other words, they arise from the tendency for deviations of the process from its mean to follow each other at neighboring sites (local or small-scale effects).
Behavior of Spatial Phenomena
First-order effects: variation of the mean value in space (global or large-scale trend)
Second-order effects: correlation in the deviations of the process values from the mean
E.g.
Suppose that iron particles are scattered onto a sheet of paper marked with a fine grid. The numbers of particles landing in the different grid squares represent a spatial stochastic process. As long as the mechanism by which we scatter the particles is purely random, the process should show neither first-order nor second-order effects (Case I).
Figure 2.7. Random scatter of iron particles (Case I)
Suppose that a small number of weak magnets are placed under the paper at different points and we scatter the iron particles again. The result will be a process with a spatial pattern arising from first-order effects: the counts in grid squares will be higher at and around the sites of the magnets (Case II).
Figure 2.8. Scatter of iron particles with magnets underneath the paper (Case II)
Now remove the magnets, weakly magnetize the iron particles instead, and scatter them again. The result is a process with a spatial pattern arising from a second-order effect: some degree of local clustering will occur due to the tendency of the particles to attract or repel each other (Case III).
If the magnets are now replaced under the paper and the magnetized particles are scattered again, we end up with a spatial pattern arising from both first-order and second-order effects.
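Cases I and II can be imitated with a small simulation. The sketch below is an illustration under stated assumptions: the sheet is a 10 x 10 square with a 10 x 10 grid, and the effect of a magnet is mimicked by drawing half of the particles from a normal distribution centered on a randomly chosen magnet. The magnet positions, the spread and the function names are hypothetical, and Case III (interacting particles) is not simulated.

```python
import random

def grid_counts(points, size=10.0, cells=10):
    """Count the points falling in each square of a cells x cells grid."""
    counts = [[0] * cells for _ in range(cells)]
    for (x, y) in points:
        col = min(int(x / size * cells), cells - 1)
        row = min(int(y / size * cells), cells - 1)
        counts[row][col] += 1
    return [c for row in counts for c in row]

def variance_to_mean_ratio(counts):
    """Sample variance of the counts divided by their mean."""
    m = sum(counts) / len(counts)
    var = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return var / m

def scatter_random(n, size=10.0):
    """Case I: every location on the sheet is equally likely."""
    return [(random.uniform(0, size), random.uniform(0, size)) for _ in range(n)]

def scatter_with_magnets(n, magnets, spread=0.6, size=10.0):
    """Case II: about half the particles are pulled towards a random magnet."""
    points = []
    while len(points) < n:
        if random.random() < 0.5:
            mx, my = random.choice(magnets)
            x, y = random.gauss(mx, spread), random.gauss(my, spread)
        else:
            x, y = random.uniform(0, size), random.uniform(0, size)
        if 0 <= x <= size and 0 <= y <= size:   # keep only particles on the sheet
            points.append((x, y))
    return points

random.seed(42)
case1 = grid_counts(scatter_random(2000))
case2 = grid_counts(scatter_with_magnets(2000, [(2.5, 2.5), (7.5, 3.5), (5.0, 7.5)]))
vmr1 = variance_to_mean_ratio(case1)
vmr2 = variance_to_mean_ratio(case2)
print(round(vmr1, 2), round(vmr2, 2))
```

For a purely random scatter the variance-to-mean ratio of the grid counts is close to 1, while with magnets it is much larger. Note that this ratio only signals a departure from randomness: a large value could equally come from a second-order effect, so it cannot by itself tell the two kinds of effect apart.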
Stationarity: A spatial process is stationary, or homogeneous, if its statistical properties are independent of absolute location in R. This implies that:
• E[Y(s)] and VAR[Y(s)] are constant over R and do not depend on location.
• COV[Y(si), Y(sj)] between the values at any two sites si and sj depends only on the relative locations of these sites (the distance and direction between them), not on their absolute locations in R.
If the mean, variance or covariance structure changes over R, the process exhibits non-stationarity, or heterogeneity.
Stationarity
Spatial data usually represent a single realization of a random process.
• Some degree of stationarity must be assumed in order to make inferences from the data.
• Stationarity is a form of location invariance (invariance in the mean and variance of the process).
• Stationarity is the quality of a process whose statistical parameters (mean and standard deviation) do not change over space or time.
Strict or Strong Stationarity
Requires the joint distribution of the process to be unchanged under translation, so that all moments, including the mean and variance, are constant over R.
Weak Stationarity
Requires a constant mean and a covariance that is independent of absolute location; the covariance depends only on the distance and direction between points.
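Non-stationary mean: e.g. a mean decreasing from west to east.
Stationarity: $E[Y(s)] = \mu$ (constant) for all $s \in R$, and (site labels refer to the figure):
$\mathrm{Cov}[Y(s_1), Y(s_2)] = \mathrm{Cov}[Y(s_3), Y(s_4)]$
$\mathrm{Cov}[Y(s_5), Y(s_6)] = \mathrm{Cov}[Y(s_9), Y(s_{10})]$
$\mathrm{Cov}[Y(s_1), Y(s_2)] \neq \mathrm{Cov}[Y(s_7), Y(s_8)]$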
Isotropy: A stationary spatial process is called isotropic if the covariance depends only on the distance between si and sj, not on the direction in which they are separated.
E.g. Weakly magnetized iron particles scattered onto paper with no magnets underneath represent an isotropic process.
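The definitions of weak stationarity and isotropy can be written compactly as follows. The separation vector h and the covariance function C(·) are notation introduced here for illustration; they do not appear in the original slides.

```latex
\begin{aligned}
\text{Weak stationarity:}\quad & E[Y(s)] = \mu, \qquad
  \mathrm{Cov}[Y(s_i), Y(s_j)] = C(h), \quad h = s_i - s_j \\
\text{Isotropy:}\quad & \mathrm{Cov}[Y(s_i), Y(s_j)] = C(\lVert h \rVert)
\end{aligned}
```

Under weak stationarity the covariance may still depend on the direction of h; isotropy is the further restriction that only its length matters.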
Figure 2.9. Stationary and isotropic spatial processes

Isotropic: a spatial process that behaves the same in all directions.
Anisotropic: a spatial process in which the correlation and covariance differ with direction.
Modeling Spatial Processes
Most methods assume that spatial correlation is isotropic. Typically:
• Heterogeneity is allowed in the mean
• Deviations from the mean are assumed to be stationary
