Final Report

SPATIAL ANALYSIS OF CRIME
Bellingham, WA 2013

Kevin Ward
WWU 2014
Spatial analysis of crime requires defining what constitutes a crime. This changes from place to
place and requires the analyst to have correct and up-to-date knowledge about their area of interest.

Kevin Ward
ENVS 422
Draft of Final Report

Spatial Distribution of Crime in 2013
Bellingham, Washington

Abstract:

Few would argue that crime is not an important issue to consider. Spatial analysis can be an
incredibly useful method for managing and preventing crime: if you know where crime is occurring, you
can make more informed decisions about where to allocate resources to combat it. I analyzed crime that
occurred in Bellingham in 2013 using methods such as Hot-Spot Analysis and Directional Distribution to
locate significant clusters of crime. Local law enforcement could benefit from knowing where particular
crimes are most likely to occur. My results were not unexpected. Crime tends to occur most commonly in
downtown areas where human activity is most concentrated. However, after normalizing the data by total
population, the clusters of crime shifted outside of downtown Bellingham.
Introduction:

I analyzed crime that was recorded by the Bellingham Police Department for the year 2013.
Using the data they made freely available to the public, I was able to create point data for the reported
incidents. My original goal was to map all crime that had occurred. After examining the data, I decided
to create categories of what I believed to be the most relevant crimes. This ended up leaving out
crimes such as littering, trespass, traffic stops, and situations where officers responded but no crime had
been committed. In the end I produced 6 different maps: 5 for individual categories of crime and 1 for
an aggregate of all 5 categories that I called total crime.
Literature Review:

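Spatial analysis of crime requires defining what constitutes a crime. This changes from place to
place and requires the analyst to have correct and up-to-date knowledge about their area of interest
(Hollis, 2013). There are different sub-groups of spatial crime analysis. The main groups are Tactical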
Crime Analysis, Strategic Crime Analysis, Administrative/Academic Crime Analysis, Operations
Analysis, and Intelligence Analysis (Ahmadi, 2003). I will set aside Administrative/Academic Crime
Analysis and Intelligence Analysis, as they deal mainly with policy outside of law enforcement and with
forensic research. Tactical Crime Analysis is the day-to-day analysis that tries to identify patterns, hot
spots, and crime sprees. It is most useful to law enforcement by providing timely and accurate data about
the changing criminal landscape (Canter, 2000). Strategic Crime Analysis is more about analyzing crime
based on location and time. It is meant to forecast potential crime events (Ahmadi, 2003). Operations
Analysis looks at where law enforcement resources are being allocated relative to crime and is
sometimes grouped together with Strategic Crime Analysis (Boba, 2001).
I will now focus on the methods used to analyze crime spatially. These include geocoding, hot spot
analysis, local indicators of spatial association, and journey-to-crime analysis, among others. This
will serve as a basic overview of the most prevalent methods being used in spatial analysis of crime.
I will not be getting into the sociological aspects of criminal activity that are often associated with
particular methods of analysis.
One of the more popular methods for analyzing crime is hot spot analysis. Crime hot spots, at their
most minimal, are boundaries with criminal events within them (Anselin et al., 2000). The
boundaries can be either fixed or ad hoc, each with its own advantages and disadvantages
(Anselin et al., 2000). Fixed boundaries are the most commonly used for the purposes of
criminal analysis. Hot spot analysis is able to identify significant clusters of incidents, but it does not
in any way explain why the clustering occurs (Levine, 2006). Clustering can also occur by chance,
which is why it is necessary to compare your results to those expected under random chance (Levine,
2006).
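One way to make the comparison against random chance concrete is a small Monte Carlo test. The sketch
below is an illustration only, not the procedure used in this report or in CrimeStat. It assumes a
rectangular study area and uses the mean nearest neighbor distance as the clustering measure; the
function names, the number of runs, and the sample points are all hypothetical.

```python
import math
import random


def mean_nn_distance(points):
    """Average distance from each point to its nearest other point."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        total += min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
    return total / len(points)


def clustering_p_value(points, width, height, runs=199, seed=0):
    """Pseudo p-value: share of random patterns at least as clustered as the observed one.

    Random patterns are drawn uniformly from a width x height rectangle (an assumption;
    a real study area would use the city boundary instead).
    """
    rng = random.Random(seed)
    observed = mean_nn_distance(points)
    as_clustered = 0
    for _ in range(runs):
        simulated = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in points]
        if mean_nn_distance(simulated) <= observed:
            as_clustered += 1
    # Add one to both counts so the observed pattern is included in the comparison.
    return (as_clustered + 1) / (runs + 1)


# Hypothetical incidents packed tightly around one location in a 10 x 10 study area.
cluster_rng = random.Random(1)
incidents = [
    (5 + cluster_rng.uniform(-0.2, 0.2), 5 + cluster_rng.uniform(-0.2, 0.2))
    for _ in range(20)
]
p_value = clustering_p_value(incidents, width=10, height=10)
print(p_value)
```

A small pseudo p-value means that very few random patterns were as tightly packed as the observed
incidents, so the clustering is unlikely to be due to chance alone.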
There are several different ways hot spots can be calculated and visualized. One such method
uses ad hoc boundaries and fits standard deviational ellipses around each identified cluster (Chainey et
al., 2008). You can also create defined boundaries, join criminal point data to the polygons, count the
points within each polygon, and then create a thematic map based on the number of crimes within each
polygon. If you want your boundaries to be uniform you can lay a grid over the area of interest and have
each cell serve as a boundary (Chainey et al., 2008). One of the most popular methods of hot spot
analysis is the Gi or Gi* statistic of Getis and Ord (Anselin et al., 2000). Getis-Ord Gi* identifies
clusters of high and low values that are statistically significant as compared to random chance. It is an
example of local spatial autocorrelation, as opposed to global autocorrelation, which yields only one
statistic to summarize the study area and assumes homogeneity (Ord & Getis, 1995).
Interpolation is another area of spatial analysis useful in analyzing crime. Here a density estimate
is extrapolated from individual data points: an estimate of incident density for each grid cell is made
using a mathematical function (a kernel) that relates the density to distance (Levine, 2006). A method
that has become popular because it combines hot spot analysis with interpolation is kernel density
estimation. With kernel density estimation, point data is aggregated within a user-defined radius and a
continuous surface that represents the density or volume of crime events across the desired area is
calculated (Chainey et al., 2008).
Another useful method of analyzing crimes spatially involves variables associated with the length of
the journey to crime. This analysis seeks to determine where the perpetrator lives based on where the
crimes were committed. Basically, this is a criminal justice method for estimating the likely residence
location of a serial offender given the distribution of incidents and a model for travel distance (Bernasco
& Elffers, 2010). Another name for this is geographic profiling. For example, if 10 related crimes
occur in an area you could, using a journey-to-crime model, estimate several likely locations for the
residence of the perpetrator (Levine, 2006).
Another important tool used in criminal analyses is geocoding addresses. Simply put, geocoding
is turning an address into a point on a map (Ratcliffe, 2003). This is an important and necessary step
to creating points out of the incidents that can then be spatially analyzed. Geocoding is not a perfect
process for several reasons. The sheer volume of incidents that are often part and parcel of criminal
incident data sets can almost guarantee some error. Geocoding relies on relating known addresses with
specific coordinates to locations written down by law enforcement as part of an incident report. The
human errors associated with this, such as misspellings, using incorrect prefixes/suffixes, or writing down
an address not known to a geocoding database, are all common reasons for error (Ratcliffe, 2003). A
standard acceptable hit rate for geocoding addresses, when it comes to criminal analysis, is 85%.
This was shown to be acceptable for the purposes of analyses 95% of the time (Ratcliffe,
2003).

Methods:

In order to begin my crime analyses it was necessary for me to obtain the crime data. I went to
the City of Bellingham's website where the Bellingham Police Department has some publicly available
data in the form of a daily activity log. The log includes the type of offense committed, the time
recorded, the location, and case number. I copied and pasted this information from the website for every
day of 2013 into an Excel spreadsheet. In total I had a little over 24,000 reported crime incidents.
The other data I required for my analyses could be obtained from the U.S. Census Bureau. I
downloaded population, age, sex, and income data as well as the TIGER/Line boundaries, all at the block
group level for Whatcom County. I joined the spatial boundaries to the Census data. I clipped the block
groups so that I only had those that were in some way part of Bellingham's city limits, since this was my
area of interest.
In order to bring my incident data into a GIS workspace so that I could analyze them spatially, I
needed to geocode them so each incident became a spatial point. I downloaded a free geolocator style
from UCLA's GIS web portal (http://gis.ats.ucla.edu/). In order for my incident data to play nicely with
this geolocator I added several new fields: city, state, and country. I further edited that Excel
data so that any incidents occurring outside of Bellingham were removed. After some more careful
editing I was able to successfully geolocate 94% of my original 24,000 reported crime incidents.
My next step was to categorize the different crimes into groups such as violent, theft,
alcohol/drug related, vandalism, and sex crime. I then created separate feature classes for each one of
these categories and spatially joined each one to the block group polygons they were associated with. I
did this so that it would be easier to identify clusters of specific activity rather than just crime in general.
The actual analysis that I performed involved several different methods. The first, and the one I
relied on the most, was hot spot analysis. Specifically, I used a method known as Getis-Ord Gi* to
locate areas where crime incidents were clustered. My intention was to find where each type of crime
was most likely to be committed. In order to do this I normalized the categories by total population so
that my hot spots weren't all located in the most populous area. I did this after some preliminary results
showed that the most populous area had the highest rate of crime incidents. I created a model that
iterated through all of my categories and performed hot spot analysis on each one.

Figure 1 - Hot Spot Analysis Model

Another useful tool I used is called Directional Distribution (Standard Deviational Ellipse).
Basically, this tool was meant to show if there was a directional element to any of my crime incidents. I
wanted to know if there were any trends showing they were occurring along some defined route. I set
one of the parameters of the Directional Distribution tool so that 68% of all crime fell within the
ellipse. I also used a tool to determine the mean center for each category of crime points.
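To illustrate what the mean center and the directional ellipse summarize, here is a minimal sketch in
plain Python. It is not the GIS tool used in this study: it assumes a covariance-based ellipse, whose
scaling may differ from what a GIS package applies, so the share of points that falls inside it will
not be exactly 68%. The function names and the sample coordinates are hypothetical.

```python
import math


def mean_center(points):
    """Arithmetic mean of the x and y coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def deviational_ellipse(points):
    """Return (center, semi_major, semi_minor, angle_deg) from the point covariance.

    angle_deg is the direction of the long axis, counterclockwise from the x axis (east).
    """
    n = len(points)
    cx, cy = mean_center(points)
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of the 2 x 2 covariance matrix give the squared axis lengths.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    semi_major = math.sqrt(half_trace + spread)
    semi_minor = math.sqrt(max(half_trace - spread, 0.0))
    angle_deg = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return (cx, cy), semi_major, semi_minor, angle_deg


def share_inside(points, center, semi_major, semi_minor, angle_deg):
    """Fraction of points inside the ellipse (assumes both axes are longer than zero)."""
    t = math.radians(angle_deg)
    inside = 0
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        along = dx * math.cos(t) + dy * math.sin(t)     # position along the long axis
        across = -dx * math.sin(t) + dy * math.cos(t)   # position along the short axis
        if (along / semi_major) ** 2 + (across / semi_minor) ** 2 <= 1:
            inside += 1
    return inside / len(points)


# Hypothetical incident coordinates stretched from southwest to northeast.
incidents = [(0, 0), (2, 1), (4, 2), (6, 3), (8, 4), (4, 4), (4, 0)]
center, semi_major, semi_minor, angle_deg = deviational_ellipse(incidents)
share = share_inside(incidents, center, semi_major, semi_minor, angle_deg)
print(center, round(angle_deg, 1), round(share, 2))
```

A long, narrow ellipse indicates that incidents are spread along one dominant direction, while a
nearly circular ellipse indicates no strong directional trend.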
The final part of my study utilized a basic regression tool called Ordinary Least Squares. My
dependent variable was the number of crime incidents in a block group. My explanatory variables were
total population, median age, and household income. I did this to try to determine if the distribution of
crime was anything other than random and to see how much weight should be given to each explanatory
variable. I only performed regression analysis on the combined categories of crime rather than each
individual one.
Results:

Figure 2 - Total Crime

To begin I would like to examine the distribution of all the incidents of crime. My category of
total crime only includes the combination of all the categories I created, which are Violence, Theft,
Drugs/Alcohol, Sex, and Student related crimes. I left out a large number of reported incidents such as
trespassing, graffiti, interference with a custodian, and many other crimes I was unable to fit into one of
my categories. I will begin by showing a map visualizing total crime where the data has not been
normalized by total population. The Central Business District had more reported crimes than anywhere
else in Bellingham for the year 2013. This part of my analysis did not deviate from my expectations.
It makes sense that the area with the most concentrated human activity, and the highest concentration of
bars, would also have the highest incident rate of crime. You should also take note of the colored point
symbols located within my purple ellipse. Each point represents the mean center of a category of crime,
with one of them representing the mean center of total crime. The ellipse shown in purple was created
with the directional distribution tool. Of the total crimes recorded in 2013, 68% fall within that
ellipse.
Theft:

Figure 3 - Crimes Related to Theft

The total recorded crimes were not normalized by population. Each individual category was
normalized by total population before the hot spot analysis was run on each Census Block Group. The
category of Theft included auto theft, burglary, robbery, shoplifting, theft, and theft of a bicycle. It
became clear after looking at my results that the majority of theft related crimes were concentrated in the
west and central neighborhoods of Bellingham. This includes the neighborhoods of Birchwood,
Meridian, the Central Business District, Columbia, Cornwall, the Lettered Streets, and to a lesser extent
Cordata and Roosevelt. A strange extrusion of the hot spot in the Barkley neighborhood extends down
into the Alabama Hill neighborhood. The main significant cluster of theft was located in the Birchwood
neighborhood. To clarify, I'm not saying most theft occurs in Birchwood. I normalized the number of
thefts by the total population before I performed the Getis-Ord Gi* hot spot analysis, so the result
means that, relative to the number of people in Birchwood, there was a significant cluster of
theft related crimes occurring there in 2013.
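The effect of normalizing before running Gi* can be sketched in a few lines of Python. This is a
simplified illustration, not the GIS tool used in this study: it assumes binary neighbor weights along
a single row of block groups, and the counts, populations, and function name are hypothetical.

```python
import math


def gi_star(values, neighbors):
    """Getis-Ord Gi* z-score for each area using binary weights.

    neighbors[i] lists the areas adjacent to area i. Area i itself is always
    included in its own window, which is what the star in Gi* refers to.
    Assumes the values are not all equal and no window covers every area.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        window = set(neighbors[i]) | {i}
        w_sum = len(window)  # with 0/1 weights, the sum of weights equals the sum of squared weights
        local_sum = sum(values[j] for j in window)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical block groups laid out in a row, each adjacent to the next.
thefts = [40, 35, 10, 8, 6, 5]
population = [800, 700, 1000, 1600, 1200, 1000]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

# Normalize counts to thefts per 1,000 residents before looking for clusters.
rates = [1000 * t / p for t, p in zip(thefts, population)]
z_scores = gi_star(rates, neighbors)
print([round(z, 2) for z in z_scores])
```

Large positive z-scores mark areas whose rate, together with their neighbors' rates, is unusually
high relative to the whole study area. Large negative z-scores mark cool spots.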
Look at the purple ellipse on the map. Sixty-eight percent (68%) of all theft related crimes
occurred within that ellipse. By sheer numbers alone the majority of them took place in the Central
Business District, Meridian, and Cordata. Many of them occurred at the Bellis Fair Mall as instances of
shoplifting. The mean center of Theft related crime was in the Lettered Streets neighborhood. You
should be able to see the dark blue areas at the eastern edge of Bellingham. Those areas, which include
the neighborhoods of Silver Beach and Whatcom Falls, had significantly less theft related crime.
Also, I noticed that the mean center of crime related to Theft was the farthest north out of all the
categories of crime. This suggests that there is a concentration of theft related crime on the northern end
of Bellingham. This makes sense as I also recognized that a sizable amount of theft related crime
occurred at the Bellis Fair Mall which is located relatively far north in Bellingham.

Drugs and Alcohol:

Figure 4 - Crimes Related to Drugs and Alcohol

The category of Drugs & Alcohol included the following offenses: driving under the influence,
narcotics violations, drug equipment violations, drug house, drunk person, liquor law violation,
overdose, prescription fraud, and tobacco violation. There were two main hot spots associated with this
category of crime and three less severe hot spots. The main hot spots were situated in the Birchwood
neighborhood and the South Hill neighborhood. The hot spot associated with the South Hill
Neighborhood also includes a small section of the Central Business District. The less severe hot spots
were within the Happy Valley, Birchwood, and Meridian neighborhoods. As I noticed with the crimes
related to Theft the Barkley Hill neighborhood hot spot for crimes related to Drugs and Alcohol also
extends a finger into the Alabama Hill neighborhood.
The directional distribution tool showed that 68% of drug and alcohol related crimes took place
within the Central Business District, Sehome, the Lettered Streets, York, WWU, and Cornwall. The
mean center of drug and alcohol related crimes was in the Central Business District. So much of the
crime took place around the aforementioned hot spots that most of Bellingham, by area, was a relatively cool
spot for drug and alcohol related crimes.
Violence:

Figure 5 - Crimes Related to Violence

The category of Violence included the following offenses: hit and run, felony assault,
misdemeanor assault, fights, homicide, kidnap, rape, rape of a child, shots fired, suicide, and vehicular
assault. The hot spot analysis of violent crime produced an interesting pattern that I did not see for my
other categories. The neighborhoods of Birchwood, Meridian, Cornwall, Columbia, the Lettered Streets,
Sunnyland, Central Business District, Roosevelt, and Barkley could all be considered hot spots of the
violent crime I analyzed. The only exception to this is the large cool spot located in the Birchwood
neighborhood that I am at a loss to explain.
One thing I noticed about the ellipse produced by the Directional Distribution tool in this case was its
orientation. Compared to the ellipses produced for the other categories of crime, this one is oriented
slightly more east to west. I'm not sure why this is the case, but I thought it was important to
mention because it suggests spatial influences on violent crime that are different from the other
categories.
Sex:

Figure 6 - Crime Related to Sex

The category of crime related to sex included the following offenses: indecent exposure,
prostitution, sex crime (no rape), and voyeurism. The results I got from this analysis were interesting
because it appears as though the hot spots of sex related crime are very concentrated relative to the other
categories. The hot spots were located around the southern portion of the Birchwood neighborhood, half
of the Lettered Streets, a small part of the Central Business District, and some of the Barkley
neighborhood. The Directional Distribution (DD) ellipse was long and narrow relative to the other
categories. To me this suggests that there are two anchoring concentrations of sex related crimes: one
in the northern part of town and the other on the opposite end.
Student:

Figure 7 - Crime Related to Students

I decided to create my own category of crimes related to students because of my hypothesis that
WWU's students would have an influence on the spatial distribution of certain crimes. The category of
crime related to students was a unique one because it included crimes from at least one other
category. The crimes I decided to associate with students were: narcotics violations, drug equipment
violations, drunk person, liquor law violation, loud party, and noise ordinances. The pattern of hot spots
encompasses the Birchwood, Happy Valley, South Hill, Barkley, Roosevelt, and Alabama Hill
neighborhoods. The DD ellipse is very similar to the Drugs and Alcohol ellipse. This is because they
share some of the same crimes. The mean center of Student related crimes is the farthest south in
comparison to the other categories of crimes. I believe this is because of the high concentration of
students around Western: they pull the mean south because they are a significant source of the
crimes I attributed to them. However, I do not have anything beyond the map I presented to back that up
and cannot say my evidence is conclusive in any way.
Regression:
I ran an Ordinary Least Squares regression on my category of total crime.
Remember that I did not normalize this category by total population. My dependent variable was the number
of crimes committed while my explanatory variables were total population, median age, and average
income over a 12 year period. I will insert some images from the report generated by my analysis.

Figure 8 - Summary of Important OLS Statistics

The statistics I paid the most attention to are the coefficients, the adjusted R-squared, Koenker,
Jarque-Bera, and VIF. My coefficients showed that there was no strong positive or negative relationship
between my dependent variable and any of my explanatory variables. Out of all my explanatory variables
the only one that was statistically significant was the income variable. The adjusted R-squared value of
0.258897 suggests that my model explains only about 26% of the variation in the number of crimes. I feel
it is safe to say that is not an acceptable model. My Jarque-Bera statistic was statistically significant;
in fact its reported probability was zero, which I assume reflects a very high degree of confidence in
that statistic's significance. This, as I understand it, suggests that the model residuals are not
normally distributed and that I have put together a poor OLS model. My Koenker statistic is not
statistically significant. This suggests that my model does not have an issue with non-stationarity,
which means I would most likely not improve my results with Geographically Weighted Regression. As I
understand it, non-stationarity would mean that a variable might be a good predictor in one area but
weak in another. The VIF statistic is an indicator of whether or not the explanatory variables might be
redundant. If a VIF statistic were above 7.5 then redundancy might have been an issue. My VIF statistics
were all well below 7.5. Overall the OLS provided some
interesting results but did not produce an accurate model.
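For readers who want to see how two of these diagnostics are calculated, the sketch below computes the
adjusted R-squared and the VIF values in plain Python. It is not the regression tool used in this
study, and the block group numbers are made up for illustration; the function names are hypothetical.
The VIF of a variable is 1 / (1 - R-squared), where that R-squared comes from regressing the variable
on the other explanatory variables.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R-squared of an OLS fit of y on the given columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalize R-squared for using p explanatory variables on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def vif(columns):
    """Variance inflation factor for each explanatory variable."""
    result = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1 / (1 - r_squared(others, target)))
    return result


# Made-up values for eight block groups (population and income in thousands).
population = [1.2, 0.8, 1.5, 2.0, 0.9, 1.1, 1.7, 1.3]
median_age = [24, 35, 29, 41, 22, 38, 33, 45]
income = [28, 52, 40, 61, 25, 58, 47, 70]
crimes = [310, 120, 260, 150, 400, 90, 180, 100]

explanatory = [population, median_age, income]
r2 = r_squared(explanatory, crimes)
adj_r2 = adjusted_r_squared(r2, n=len(crimes), p=len(explanatory))
vifs = vif(explanatory)
print(round(r2, 3), round(adj_r2, 3), [round(v, 1) for v in vifs])
```

In this made-up data, median age and income rise together, so their VIF values come out much higher
than the one for population. That is the kind of redundancy the 7.5 guideline is meant to flag.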

Discussion:

As with the production of any map, I had to make a series of choices based on my own
knowledge as to how I was going to prepare, analyze, and present the data. One of those choices was to
remove any crime that occurred outside of Bellingham's city limits. The Bellingham Police force had
recorded a substantial amount of crime that was not within the city limits but would still have
had an impact on the residents of Bellingham. Also, while geocoding I made some editing decisions that
affected the data. I think one of the most important decisions I made was to pick specific crimes to focus
on and then group them by category. I chose my categories based on some examples I have seen of
other crime maps. However, the crimes that I included in each category were entirely up to me. I have no
idea if the crimes I chose for each category were valid. For instance, I chose to put the crimes involving
sexual assault under the violent crime category rather than the category of crime related to sex. This was
done purely on my own opinion. At one point in time I had been developing the data for a category of
crime related to vandalism. I dropped it during the analysis because the only crime related to vandalism
was graffiti and I didn't want a category that only had one crime attributed to it.

Furthermore, when I created the category of student related crime I merely thought about what
crimes I typically associated with students. This was not a scientific approach by any means. Also, my
original reason for thinking about student related crime was because I wanted to compare the spatial
distribution of crime during a month with classes in session and one with no classes in session. I would
still like to do that in the future.
If I had more time I would have invested more effort into my regression analysis. I have to admit that I
only learned enough to interpret my results, not to understand them beyond a superficial level. I
have almost no experience with statistics so it was interesting to try to understand the outputs from my
Ordinary Least Squares regression. I want to continue to develop a work flow for analyzing crime in an
urban environment. I plan on expanding my knowledge of spatial statistics in order to do that. I am also
eager to see if I can apply Tyler Black's technique of dasymetric mapping to my data set.
Conclusion:

The neighborhoods that appeared as hot spots the most often were Birchwood, the Central
Business District, South Hill, Meridian, Barkley, and the Lettered Streets. I found this to be meaningful
because that suggests that those neighborhoods may have a bigger problem with crime than the total
number of crimes occurring there would suggest. The Birchwood neighborhood was the only one to
appear as a hot spot for every category of crime. For every category of crime, 68% of it fell within a
similar area, with only minor differences shown in the directional ellipse. The mean centers for every
category, including total crime, all clustered around where the Central Business District, the Lettered
Streets, and the Sunnyland neighborhood meet.
I would also like to reiterate that the results of my regression analysis were interesting but
inconclusive. The model was biased and nowhere near adequate for predictive purposes.
I was satisfied with my results because I ended up with information that I could not have
predicted based on a cursory knowledge of the crime that neighborhoods in Bellingham face.

Works Cited:
Ahmadi, Mostafa. "Crime mapping and spatial analysis." Unpublished Master of Science in
Geoinformatics Thesis (2003): 27.

Anselin, Luc, et al. "Spatial analyses of crime." Criminal justice 4 (2000): 213-262.

Bernasco, Wim, and Henk Elffers. "Statistical analysis of spatial crime data." Handbook of quantitative
criminology. Springer New York, 2010. 699-724.

Canter, Philip. "Using a geographic information system for tactical crime analysis." Analyzing crime
patterns: Frontiers of practice (2000): 3-10.

Chainey, Spencer, Lisa Tompson, and Sebastian Uhlig. "The utility of hotspot mapping for predicting
spatial patterns of crime." Security Journal 21.1 (2008): 4-28.

Hollis, Meghan Elizabeth. "Defining crime: an analysis of organizational influences on police processing
of information." (2013).

Levine, Ned. "Crime mapping and the Crimestat program." Geographical Analysis 38.1 (2006): 41-56.

Ord, J. Keith, and Arthur Getis. "Local spatial autocorrelation statistics: distributional issues and an
application." Geographical Analysis 27.4 (1995): 286-306.

Ratcliffe, Jerry H. "Geocoding crime and a first estimate of a minimum acceptable hit rate."
