Intro To Geospatial Data and Maps in R
Overview
To try this tutorial out yourself, you will need a copy of R installed on your computer; the latest
version is available at http://cran.r-project.org/. We also recommend using RStudio, available
at http://www.rstudio.com
Here are some other helpful resources we came across while putting together this tutorial:
https://cran.r-project.org/doc/contrib/intro-spatial-rl.pdf
https://www.zevross.com/blog/2016/01/13/tips-for-reading-spatial-files-into-r-with-rgdal/
http://zevross.com/blog/2015/10/14/manipulating-and-mapping-us-census-data-in-r-using-the-acs-tigris-and-leaflet-packages-3/
https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/ggmap/ggmapCheatsheet.pdf
The goal of this tutorial is to provide introductory skills for reading in, manipulating/editing, and plotting
geospatial data in R. Geospatial data are used in diverse fields such as history, biology,
business, tech, public health, sociology, psychology, and many more.
Important considerations for using geospatial data include defining:
We will present examples of geospatial data at multiple levels of analysis including county-level flu
data within Pennsylvania, as well as individual-level data for a single person on a single day.
Outline
Section 0: Load Packages & Set Working Directory
Section 1: Reading in Geospatial Data
Section 2: Basic Geospatial Data Formatting
Section 3: Fetching a Map
Section 4: Plotting on a Static Map
Section 5: Plotting Dynamic Maps
Section 0: Load Packages & Set Working Directory
Check whether all packages are installed (install them if not), and then load all packages:
Note: We don’t use all of these packages in the tutorial, but these were the most common packages
we came across while writing the tutorial, and therefore may be helpful to dig into in your own work :)
library(maps)
library(mapdata)
library(maptools)
library(rgdal)
library(ggmap)
library(leaflet)
library(tigris)
library(sp)
library(ggplot2)
library(plyr)
library(animation)
library(gridExtra)
library(psych)
library(rstudioapi)
library(data.table)
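The installation check mentioned above is not shown in the code. One common base-R idiom is sketched here for a two-package subset; the object names pkgs, is_installed, and to_install are illustrative, and the install.packages() call is left commented out so that nothing is downloaded unintentionally.

```r
# Packages to check (a subset of the list above, for illustration)
pkgs <- c("ggplot2", "ggmap")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed
to_install <- pkgs[!is_installed]
print(to_install)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```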
Set working directory to source file location first, or choose different directory:
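The command for this step is not shown. A minimal sketch, assuming the script is saved to disk and open in RStudio (the rstudioapi package is loaded above), might look like the following. Outside RStudio the guard leaves the working directory unchanged, and the explicit path in the comment is purely hypothetical.

```r
# Assumption: this runs inside RStudio with the script saved to disk
if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
  script_path <- rstudioapi::getActiveDocumentContext()$path
  # The path is "" for an unsaved document, so only change directory when it is set
  if (nzchar(script_path)) setwd(dirname(script_path))
}

# Or choose a different directory explicitly (hypothetical path):
# setwd("path/to/your/data")

# Confirm where R will now look for files
print(getwd())
```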
The moves data contain the following variables:
* Date
* Name
* Start
* End
* Duration
* Latitude
* Longitude
These geospatial data are formatted as a standard .csv file. Other spatial data formats include GPX,
ESRI Shapefile, GeoJSON, and many more. Using the rgdal package, we can take a look at all the
different drivers (i.e., spatial data formats) that can be utilized:
ogrDrivers()$name
## [1] "AeronavFAA" "ARCGEN" "AVCBin" "AVCE00"
## [5] "BNA" "CSV" "DGN" "DXF"
## [9] "EDIGEO" "ESRI Shapefile" "Geoconcept" "GeoJSON"
## [13] "Geomedia" "GeoRSS" "GML" "GPKG"
## [17] "GPSBabel" "GPSTrackMaker" "GPX" "HTF"
## [21] "Idrisi" "JML" "KML" "MapInfo File"
## [25] "Memory" "MSSQLSpatial" "ODBC" "ODS"
## [29] "OGR_GMT" "OGR_PDS" "OGR_SDTS" "OGR_VRT"
## [33] "OpenAir" "OpenFileGDB" "OSM" "PCIDSK"
## [37] "PDF" "PGDUMP" "PGeo" "REC"
## [41] "S57" "SEGUKOOA" "SEGY" "Selafin"
## [45] "SQLite" "SUA" "SVG" "SXF"
## [49] "TIGER" "UK .NTF" "VFK" "Walk"
## [53] "WAsP" "XLSX" "XPlane"
Further, geospatial data can be obtained from many places on the internet as well as from other R
packages. For example, the cdcfluview package allows you to retrieve US flu season data
from the CDC. The US Census Bureau also has a wealth of data available for download
(https://factfinder.census.gov/faces/nav/jsf/pages/download_center.xhtml).
For now, we will read in the moves data:
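The read-in call itself is not shown, and the name of the original file is not given. As a self-contained sketch, the block below writes two rows copied from the output in this tutorial to a temporary .csv file and reads them back with read.csv(); demo_file and moves_demo are illustrative names. With the real file, the same read.csv() call would simply point at that file's path.

```r
# Write two rows (copied from the output in this tutorial) to a temporary file
demo_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Name,Start,End,Duration,Latitude,Longitude",
  paste0("1/21/2017,\"Place in Penn Quarter, Washington\",",
         "2017-01-21T08:49:27-05:00,2017-01-21T08:55:14-05:00,",
         "347,38.89821,-77.02807"),
  paste0("1/21/2017,\"Place in Federal Triangle, Washington\",",
         "2017-01-21T09:09:41-05:00,2017-01-21T09:11:41-05:00,",
         "120,38.89372,-77.02371")
), demo_file)

# Read the file; quoted fields may contain commas, as the Name column does here
moves_demo <- read.csv(demo_file, header = TRUE, stringsAsFactors = FALSE)
str(moves_demo)
```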
head(moves)
## Date Name
## 1 1/21/2017 Place in Penn Quarter, Washington
## 2 1/21/2017 Place in Federal Triangle, Washington
## 3 1/21/2017 Place in The Mall, Washington
## 4 1/21/2017 Place in The Mall, Washington
## 5 1/21/2017 Place in The Mall, Washington
## 6 1/21/2017 Place in The Mall, Washington
## Start End Duration Latitude
## 1 2017-01-21T08:49:27-05:00 2017-01-21T08:55:14-05:00 347 38.89821
## 2 2017-01-21T09:09:41-05:00 2017-01-21T09:11:41-05:00 120 38.89372
## 3 2017-01-21T09:22:10-05:00 2017-01-21T09:29:11-05:00 421 38.89126
## 4 2017-01-21T09:34:49-05:00 2017-01-21T09:50:34-05:00 945 38.88987
## 5 2017-01-21T09:50:35-05:00 2017-01-21T10:03:02-05:00 747 38.89070
## 6 2017-01-21T10:03:03-05:00 2017-01-21T11:28:17-05:00 5114 38.88939
## Longitude
## 1 -77.02807
## 2 -77.02371
## 3 -77.01739
## 4 -77.01638
## 5 -77.01578
## 6 -77.01737
If there were no duration variable, we would first need to format the start and end
times as date-time objects, and then we could calculate a duration variable ourselves:
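The code for this step is not shown. One possible approach in base R is sketched below, using the first two Start and End values from the output above. The helper strip_colon() is a hypothetical name; it removes the colon from the UTC offset ("-05:00" becomes "-0500") so that the %z format specifier can parse it.

```r
# First two Start/End timestamps from the moves output above
start_chr <- c("2017-01-21T08:49:27-05:00", "2017-01-21T09:09:41-05:00")
end_chr   <- c("2017-01-21T08:55:14-05:00", "2017-01-21T09:11:41-05:00")

# Hypothetical helper: drop the colon in the trailing UTC offset
strip_colon <- function(x) sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)

# Convert the character timestamps to date-time (POSIXct) objects
start_time <- as.POSIXct(strip_colon(start_chr),
                         format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
end_time   <- as.POSIXct(strip_colon(end_chr),
                         format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

# Duration in seconds between start and end
duration <- as.numeric(difftime(end_time, start_time, units = "secs"))
print(duration)
```

For these two rows the computed durations agree with the Duration column in the data (347 and 120 seconds).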
Another possibility is to format the dataset so that the rows are on a more uniform time metric, such
as seconds, minutes, or hours. Below we demonstrate how to transform the moves data given above
into a dataset that can be indexed by minutes or hours:
# Re-insert and expand the minutes variable into a sequential indicator for each "obs"
moves3 <- ddply(moves2, "obs", mutate, minutes = seq_along(obs))
# Insert a new row 1 - this is used for plotting purposes to help with size scaling
describe(moves3$minutes_scaled) # range = 0.1:8.5
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 487 2.21 2.14 1.3 1.86 1.33 0.1 8.5 8.4 1.24 0.52 0.1
# For later plotting purposes, add in a ghost row for scaling
moves3 <- rbind(c(0,0,0,85,8.5),moves3)
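The step that expands the data to one row per minute (producing moves2) is not shown above. The following is only a sketch of one way such an expansion could be done in base R, not necessarily the approach used for this tutorial. The toy data frame, its obs column, and the rounding rule (round up, at least one row) are assumptions made for illustration; the durations and coordinates are the first three rows of the moves output.

```r
# Toy stand-in for the first three rows of moves (Duration is in seconds)
toy <- data.frame(
  obs       = 1:3,
  Duration  = c(347, 120, 421),
  Latitude  = c(38.89821, 38.89372, 38.89126),
  Longitude = c(-77.02807, -77.02371, -77.01739)
)

# Number of one-minute rows per observation (rounded up, at least 1)
n_minutes <- pmax(1, ceiling(toy$Duration / 60))

# Repeat each row once per minute spent at that location
expanded <- toy[rep(seq_len(nrow(toy)), times = n_minutes), ]
rownames(expanded) <- NULL

# Sequential minute counter within each obs (1, 2, 3, ...)
expanded$minutes <- ave(seq_along(expanded$obs), expanded$obs, FUN = seq_along)

print(table(expanded$obs))
```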
PAcounties contains the following information:
* long: longitude
* lat: latitude
* group: a grouping variable for different regions, that is extra useful for plotting adjacent points (within a group).
* order: each element within a group is assigned an order number, which will tell ggplot the order to connect the dots within groups.
* region: describes what each group is (e.g., state name)
* subregion: describes what each sub-group is (e.g., counties in PA)
Next, we need to use the information we pulled from map_data to tie the county information
in fludat to geographic signifiers (latitude and longitude).
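The join itself is not shown. As a sketch of the general idea, the block below merges a toy county-level table onto a toy polygon table by county name using base R's merge(). All values are made up for illustration, the column names county and cases are assumptions about what a table like fludat might contain, and county_polygons only mimics the column layout described for PAcounties. Because map_data stores county names in lower case, the key is lower-cased before merging.

```r
# Toy stand-in for a few polygon rows with the same columns as PAcounties
county_polygons <- data.frame(
  long      = c(-77.9, -77.8, -75.2, -75.1),
  lat       = c(40.8, 40.9, 40.0, 40.1),
  group     = c(1, 1, 2, 2),
  order     = 1:4,
  region    = "pennsylvania",
  subregion = c("centre", "centre", "philadelphia", "philadelphia")
)

# Toy county-level data (hypothetical columns and values)
flu_toy <- data.frame(
  county = c("Centre", "Philadelphia"),
  cases  = c(12, 340)
)

# Build a lower-case key that matches the subregion column
flu_toy$subregion <- tolower(flu_toy$county)

# Attach the county-level values to every polygon point of that county
merged <- merge(county_polygons, flu_toy, by = "subregion")

# merge() may reorder rows; restore the drawing order needed for plotting
merged <- merged[order(merged$order), ]
print(merged)
```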
ggmap Package
The get_map() function in the ggmap package is a wrapper that can query Google Maps, OpenStreetMap,
Stamen Maps, or Naver Map servers.
For the location argument, you can input an address, a longitude/latitude pair, or a
left/bottom/right/top bounding box.
For the zoom argument, you input an integer, where 3 is at the continent level and 21 is at the
building level.
For the maptype argument, you can specify a character string such as terrain,
terrain-background, satellite, roadmap, hybrid, toner, watercolor, terrain-labels, and so forth
(see ?get_map).
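As a sketch of the bounding-box form of the location argument, the following builds a left/bottom/right/top vector in base R from the six coordinates printed by head(moves) earlier. The object names are illustrative, and no map is actually requested here, since that would need network access.

```r
# Coordinates of the first six rows of the moves data (from the output above)
lon <- c(-77.02807, -77.02371, -77.01739, -77.01638, -77.01578, -77.01737)
lat <- c(38.89821, 38.89372, 38.89126, 38.88987, 38.89070, 38.88939)

# Bounding box in the left/bottom/right/top order described above
bbox <- c(left   = min(lon),
          bottom = min(lat),
          right  = max(lon),
          top    = max(lat))
print(bbox)

# Midpoint of the box, in longitude/latitude order
center <- c(lon = mean(bbox[c("left", "right")]),
            lat = mean(bbox[c("bottom", "top")]))
print(center)
```

Either bbox or center could then be supplied as the location argument of get_map().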
?ggmap
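# Note: current versions of ggmap require a Google API key, registered with
# register_google(), before source = "google" will work
map1 <- get_map(location = "State College",
                maptype = "terrain",
                source = "google",
                crop = FALSE,
                zoom = 10)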
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=State+College&zoom=10&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=State%20College&sensor=false
plot_map1 <- ggmap(map1) +
  xlab("Longitude") + ylab("Latitude") +
  theme(legend.position = "none")
plot_map1
summary(moves3[,c("longitude","latitude")])
## longitude latitude
## Min. :-77.04 Min. : 0.00
## 1st Qu.:-77.03 1st Qu.:38.89
## Median :-77.02 Median :38.89
## Mean :-76.87 Mean :38.81
## 3rd Qu.:-77.02 3rd Qu.:38.89
## Max. : 0.00 Max. :38.90
map3 <- get_map(location = c(-77.025, 38.895),
                maptype = "terrain",
                source = "google",
                crop = FALSE,
                zoom = 15)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=38.895,-77.025&zoom=15&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
plot_map3 <- ggmap(map3) +
  geom_point(data = moves, aes(x = Longitude, y = Latitude)) +
  geom_path(data = moves, aes(x = Longitude, y = Latitude),
            size = 1, lineend = "round") +
  xlab("Longitude") + ylab("Latitude") +
  ggtitle("Plot of Women's March Time Series Data") +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5))
plot_map3
# Create gif
# Note: `map` and `palette` are assumed to have been created earlier
saveGIF({
  # This sets how many plots will be tied together (equal to number of rows)
  seq_list <- unique(moves3$seq_num)
  # Loop through the plots, adding one more time step to each frame
  for (i in seq_along(seq_list)) {
    print(paste("Now at", seq_list[i]))
    dat <- subset(moves3, seq_num <= seq_list[i])
    print(
      ggmap(map) +
        scale_size(range = c(0, 8.5)) +
        scale_colour_manual(values = palette) +
        geom_point(data = dat,
                   aes(x = longitude, y = latitude,
                       size = minutes_scaled, colour = factor(obs))) +
        geom_path(data = dat, aes(x = longitude, y = latitude),
                  size = .5, lineend = "round") +
        xlab("Longitude") + ylab("Latitude") +
        theme(legend.position = "none")
    )
  }
}, movie.name = "moves2.gif",
   interval = .001,
   ani.width = 600,
   ani.height = 600)