
Introduction to Spatial Data Analysis

Instructor: M. Reza Najafi

https://landsat.gsfc.nasa.gov/nasa-studies-details-of-a-greening-arctic/

Location matters, so…

• The spatial patterns in data are of interest
• The statistical distributions are of interest
• The relationship between the two is really what counts

http://www.statcan.gc.ca/pub/16-002-x/2007001/map/5008062-eng.htm

http://thamesriver.on.ca/watershed-health/watershed-report-cards/watershed-map/
http://thamesriver.on.ca/education-community/watershed-friends-of-projects/stoneycreek/stoney-creek-watershed/

local examples

Distance

• Calculated from two coordinates
• Use Pythagoras formula
• Trickier for the sphere
• Numerous variants

Manhattan (city-block) distance

Calculates distance by counting blocks

$D_{12} = |X_2 - X_1| + |Y_2 - Y_1|$
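A minimal sketch of the distance measures mentioned so far, assuming points are given as (x, y) tuples in a planar coordinate system. The haversine formula used for the sphere is one common choice and is an assumption here (the slides only note that the sphere is "trickier"); the function names and the mean Earth radius of 6371 km are likewise illustrative.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points (Pythagoras)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance on a sphere; inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

p1, p2 = (1.0, 2.0), (4.0, 6.0)   # hypothetical coordinates
print(euclidean(p1, p2))   # 5.0
print(manhattan(p1, p2))   # 7.0
```

The Manhattan distance is never shorter than the Euclidean distance between the same two points, which is why the two can rank neighbors differently.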

Other distance metrics

• Network distance (transportation, water)
• Travel time (flight times)
• Perceived distance (mental map)

Adjacency

• This is a sort of binary distance: two spatial objects are either adjacent or not
• Other ways…
✓ Fixed distance (1 km)
✓ Number of nearest entities
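The two alternative adjacency rules listed above can be sketched as follows. The point names and coordinates are hypothetical, distances are assumed to be planar and in kilometres, and the helper names are illustrative only.

```python
import math

# Hypothetical point locations (x, y) in kilometres.
points = {"A": (0.0, 0.0), "B": (0.6, 0.0), "C": (0.0, 0.9), "D": (3.0, 3.0)}

def dist(p, q):
    """Planar straight-line distance."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def fixed_distance_neighbors(points, threshold_km):
    """Objects are adjacent if they lie within threshold_km of each other."""
    return {
        a: sorted(b for b in points
                  if b != a and dist(points[a], points[b]) <= threshold_km)
        for a in points
    }

def k_nearest_neighbors(points, k):
    """Each object's neighbors are its k closest other objects, nearest first."""
    return {
        a: sorted((b for b in points if b != a),
                  key=lambda b: dist(points[a], points[b]))[:k]
        for a in points
    }

print(fixed_distance_neighbors(points, 1.0))
print(k_nearest_neighbors(points, 2))
```

Note the design difference: the fixed-distance rule is symmetric and can leave an isolated object with no neighbors, whereas the nearest-entities rule gives every object exactly k neighbors but is not necessarily symmetric.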

Neighborhood

• Less clear-cut concept

It can mean:
1. The region of space around some object
2. A set of objects considered to be neighbors of that object

Harris R. Quantitative geography: the basics. Sage

Hypothesis testing

Learning Objectives

Define confidence interval and significance level

Describe hypothesis testing

Define the null hypothesis and the alternative hypothesis

Describe type I and type II errors

“Some people hate the very name of statistics but I find them full of beauty and interest. Whenever they are not brutalized, but delicately handled by the higher methods, and are warily interpreted, their power of dealing with complicated phenomena is extraordinary”

Sir Francis Galton

Sample vs. population

                     Sample   Population
Size                 n        N
Mean                 X̄        µ
Standard deviation   s        σ
Variance             s²       σ²

POPULATION → Draw sample → Calculate sample statistics (X̄ = 78, sd = 7) → Infer population parameters

[Figure: a population of values with mean µ = 10 and a sample drawn from it with mean X̄ = 9]

How confident are you of this estimate?

• How likely is this possibility?
→ It would be useful to have a measure of the reliability of the estimate → confidence interval

• Confidence level vs. significance level

• Standard error of the mean

Scientific method in geography


The Central Limit Theorem


The sampling distribution of the mean for any population is
• Centered on the population mean
• Normally distributed about the mean

As n increases, the possible values of the sample mean become more numerous, and their distribution approaches a continuous normal distribution.

https://stats.libretexts.org/Textbook_Maps/Map%3A_Introductory_Statistics_(Shafer_and_Zhang)/06%3A_Sampling_Distributions/6.2%3A_The_Sampling_Distribution_of_the_Sample_Mean

Example: Central limit theorem

Sampling distribution of mean with increasing n

The Central Limit Theorem

The sampling distribution of the mean for any population is
• Centered on the population mean
• Normally distributed about the mean
• Variability is defined by the population standard deviation divided by the square root of the sample size (called standard error of the mean or sampling error):

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

The larger the sample size, the smaller the amount of sampling error (i.e., the closer the sample mean tends to be to the true population mean)
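A small simulation can illustrate how the spread of sample means shrinks with sample size. This is only a sketch: the uniform population, its size, the number of draws, and the chosen sample sizes are all assumptions made for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical, deliberately non-normal population: uniform on [0, 100).
population = [random.uniform(0, 100) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def sample_means(n, draws=1_000):
    """Means of `draws` random samples, each of size n."""
    return [statistics.fmean(random.sample(population, n)) for _ in range(draws)]

results = {}
for n in (5, 30, 100):
    means = sample_means(n)
    observed = statistics.stdev(means)   # spread of the sample means
    predicted = sigma / n ** 0.5         # standard error: sigma / sqrt(n)
    results[n] = (statistics.fmean(means), observed, predicted)
    print(f"n={n:3d}  mean of means={results[n][0]:6.2f}  "
          f"observed SE={observed:5.2f}  predicted SE={predicted:5.2f}")
```

In each row the mean of the sample means should sit near the population mean, and the observed spread should track the predicted standard error, shrinking as n grows.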

Sample Size, Standard Deviation and Sampling Error

[Figure: sampling error (small to large) plotted against sample size (small to large), for populations with small to large standard deviation]

Review of confidence intervals (CI)

Samples are variable in predictable ways (over the long run).

For example: estimates of population mean based on a sample are normally distributed, with standard deviation varying inversely with the square root of the sample size.

This can be used to construct a confidence interval for a population mean, by taking a
sample, calculating its mean and SD, and combining the information to state that with
(say) 95% confidence, the mean lies in the interval from A to B.
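As a rough sketch of that construction, the interval can be computed from the sample mean, the sample SD, and the sample size using a z-based critical value. The sample mean of 78 and SD of 7 reuse the values from the sampling diagram earlier in these slides; the sample size n = 49 and the function name are assumptions.

```python
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """z-based confidence interval for a population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    se = sd / n ** 0.5                        # standard error of the mean
    return mean - z * se, mean + z * se

# Sample mean 78 and sd 7 as in the earlier diagram; n = 49 is an assumed value.
low, high = mean_confidence_interval(78, 7, 49)
print(f"95% CI: {low:.2f} to {high:.2f}")   # 95% CI: 76.04 to 79.96
```

Here A = 76.04 and B = 79.96. Raising the confidence level to 99% would widen the interval, while a larger sample would narrow it.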

One other IMPORTANT thing: for the central limit theorem to hold, we generally assume that a sample size of at least 30 (n ≥ 30) is required.

Hypothesis testing is related to this idea, but starts to add in issues of establishing
theories by statistical testing.

Distribution of Sample Means (Nx = 10)

Sampling Distribution of Sample Means
1. Distribution is centered on the population mean
2. Unlikely to find
• very low sample means
• very high sample means
3. Most likely to find intermediate sample means
• These results derive directly from the concepts of probability and randomness being applied to a normal distribution

Z-scores are simply transformations of X-scores to a plus-or-minus scale centered on zero, where zero = the mean of the X-scores and each unit is one standard deviation.
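The transformation can be sketched in a few lines, assuming the usual definition z = (x − mean) / standard deviation. The scores below are hypothetical, and the population standard deviation is used for simplicity.

```python
import statistics

def z_scores(xs):
    """Convert raw X-scores to z-scores: (x - mean) / standard deviation."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)   # population SD; use statistics.stdev for a sample
    return [(x - mean) / sd for x in xs]

scores = [8, 9, 10, 11, 12]      # hypothetical X-scores
zs = z_scores(scores)
print([round(z, 2) for z in zs])  # [-1.41, -0.71, 0.0, 0.71, 1.41]
```

The mean score (10) maps to zero, scores below the mean become negative, and scores above it become positive.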


Confidence Intervals and Estimation

• Based on z-score
• Upper and lower bounds of confidence interval
• Confidence level (1 − α)
✓ Probability that the confidence interval includes the true population mean
✓ Examples: 0.90, 0.95, 0.99 (90%, 95%, 99%)
• Significance level (α)
✓ Probability that the confidence interval fails to include the true population mean
✓ Examples: 0.10, 0.05, 0.01 (10%, 5%, 1%)

• When determining probability, we are concerned with both tails of the normal distribution.
• The Normal Distribution table only provides values for one tail.
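One way to see the one-tail versus two-tail bookkeeping is to compute the values instead of reading them from a table. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a printed table; the helper names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def two_tailed_critical_z(confidence):
    """z such that the central area equals `confidence` (alpha split over both tails)."""
    alpha = 1 - confidence
    return std_normal.inv_cdf(1 - alpha / 2)

def two_tailed_p(z):
    """Two-tailed probability: double the one-tail area beyond |z|."""
    one_tail = 1 - std_normal.cdf(abs(z))
    return 2 * one_tail

for level in (0.90, 0.95, 0.99):
    print(level, round(two_tailed_critical_z(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```

For a 95% confidence level, α = 0.05 is split into 0.025 per tail, which is why the critical value is looked up at a one-tail area of 0.025 rather than 0.05.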

What is a hypothesis?
• Any statement about reality.

• More technically, it is a statement about the study population that we are interested in determining the 'truth-value' of. More accurately, we want to assign a likelihood to the hypothesis.

• To work as a hypothesis, it must be possible in principle for the statement to be false... this sounds obvious, but…

• Hypotheses arise from a theory (or theories). We formulate them as a step toward testing the theory that produced them.

The null hypothesis, H0

• This states what would be true if the theory we have developed were wrong.

• Usually, we want to reject the null hypothesis, thus establishing that the theory is not wrong.

• The terminology and the approach derive from the experimental method in the sciences, where we are looking for changes in a response variable as we vary the treatment or control variables. The null hypothesis describes what we would see if there was no effect.
The alternative hypothesis, HA

• This is usually what we want to 'prove', and is a statement that follows from our theories and ideas about the world.

• What we do is to disprove the opposite (i.e. the null).

• The reasons for this approach are a little obscure... basically it's a logical problem: you can never actually prove anything, because we could wake up tomorrow and everything might have changed. You will find that statisticians are generally very careful about the claims they make based on statistical evidence, and a lot of the care is due to this issue.

Steps in hypothesis testing

1. State null and alternative hypothesis
2. Select appropriate statistical test
3. Select level of significance (α = 0.05)
4. Delineate regions of rejection and nonrejection of null hypothesis
5. Calculate a test statistic
6. Make decision regarding null and alternate hypotheses
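The six steps can be traced in a short worked example. This is a sketch under stated assumptions: a two-tailed one-sample z-test is chosen as the "appropriate test", H0 is µ = 10 and the sample mean is 9 (echoing the earlier estimation example), and the population standard deviation σ = 2 and sample size n = 16 are made-up values.

```python
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test of H0: mu = mu0 against HA: mu != mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # step 4: reject if |z| > z_crit
    z = (sample_mean - mu0) / (sigma / n ** 0.5)    # step 5: test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # two-tailed probability
    reject = abs(z) > z_crit                        # step 6: decision
    return z, z_crit, p_value, reject

# mu0 = 10 and sample mean 9 echo the earlier example; sigma = 2 and n = 16 are assumed.
z, z_crit, p_value, reject = one_sample_z_test(9, 10, 2, 16)
print(f"z = {z:.2f}, critical = ±{z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

With these assumed numbers the standard error is 0.5, so z = −2.00 falls beyond the critical value of ±1.96 and H0 is rejected at α = 0.05. With a smaller sample the same difference in means might not be significant.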

Possible errors
• Two errors are possible:

❑ Type I error: Rejecting the null hypothesis when it is true

❑ Type II error: Not rejecting the null hypothesis when it is false

              H0 true   H0 false
Accept H0     1 − α     β
Reject H0     α         1 − β

Statistical power: (1 − β) = the probability of rejecting a false null hypothesis. It is the chance of finding an effect that truly exists in the population.

http://www.wadsworth.com/psychology_d/templates/student_resources/workshops/stat_workshp/statpower/statpower_23.html
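To make α, β, and power concrete, the sketch below computes the power of a two-tailed one-sample z-test for a given true mean. All numbers (µ0 = 10, true mean 9, σ = 2, the sample sizes) are hypothetical, and the function name is illustrative.

```python
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed one-sample z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / (sigma / n ** 0.5)   # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(-z_crit - shift) + (1 - std_normal.cdf(z_crit - shift))

# All numbers are hypothetical.
for n in (16, 32, 64):
    power = z_test_power(mu0=10, mu_true=9, sigma=2, n=n)
    print(f"n={n:2d}  power={power:.3f}  beta={1 - power:.3f}")
```

When the true mean equals µ0, the function returns α, the Type I error rate. When it differs, power grows with sample size, so β (the Type II error rate) shrinks.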

[Figure: distribution of sample means showing the 2-tail critical regions and the area of Type II error]
