
Anything Arc can do, R can do better OR How to lose Arc in 10 days (part 1/n)


UA Summer R Workshop: Week 3
Nicholas M. Caruso
Christina L. Staudhammer
14 June 2016

Making Maps: Salamander Species in US


To demonstrate making maps, we will use IUCN range maps to map the number of salamander species in
the United States. These range map shapefiles, as well as many others, can be found on the IUCN website.
As with other plotting, we will use the ggplot2 package, which requires the data to be in a data frame, so we
must convert our spatial objects before plotting. We will also use the maptools, raster, rgeos, rgdal, sp, and maps
packages, which will likely need to be installed first, along with tidyr and dplyr for data handling.
# install.packages(c('maptools','raster','rgeos','rgdal','sp','maps'))
library(maptools)
library(raster)
library(rgeos)
library(rgdal)
library(sp)
library(tidyr)
library(dplyr)
library(maps)

Read in the Data


We will use the readShapePoly() function with the path to our range map shapefile. Next, we will use
map() to get a reference map of United States counties (not including Alaska or Hawaii). Because we
are dealing with spatial data, we need to give our maps a projection: we define a coordinate reference
system using CRS() and set it as the projection for the caudate shapefile. We could plot these range maps, but in
their current format they aren't very informative.
caudates <- readShapePoly('E:/IDrive-Sync/Documents/R workshop Summer 2016/CAUDATA.shp')
usa <- map("county", fill=TRUE)

albers.proj <- CRS("+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs")
projection(caudates) <- albers.proj
plot(caudates)

Convert Map to SpatialPolygon


Before we work with these two maps, we need to convert our USA county map to a SpatialPolygons object. We want
to preserve the identity of the county and state polygons (for plotting), so we'll keep the polygon names as an ID
variable and give the usa map the same projection as our range map.
IDs <- usa$names
usa <- map2SpatialPolygons(usa, IDs=IDs, proj4string=albers.proj)

Number of Salamanders in Each County


For this map, we want to know how many range polygons fall within the boundaries of each county. This can be
accomplished using sapply(), over(), and geometry(). We will retrieve the geometry from our composite range
map shapefile and overlay our county map with our range map. With returnList=TRUE, the overlay function returns a list with
one element per county: a vector of all of the species from our range map that occur within that county (each entry
of the vector corresponds to a species). We then use sapply() to find the length of each of those vectors,
convert the result to a dataframe, and use the rownames to create a new column of county ids.
counts <- sapply(over(usa, geometry(caudates), returnList=TRUE), length)
counts.df <- as.data.frame(counts)
counts.df$id <- rownames(counts.df)
rownames(counts.df) <- NULL
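
To see what the counting step does without loading any shapefiles, here is a minimal sketch in base R. The list `overlay` is a hypothetical stand-in for what over(..., returnList=TRUE) returns; the county names and species indices are made up for illustration.

```r
# Hypothetical stand-in for the list returned by over(..., returnList=TRUE):
# one element per county, holding the indices of the overlapping range polygons.
overlay <- list(
  "alabama,autauga" = c(3L, 17L, 42L),
  "alabama,baldwin" = c(3L, 8L),
  "alabama,bibb"    = integer(0)   # a county with no overlapping ranges
)

# The length of each element is the number of species in that county.
counts <- sapply(overlay, length)

# Same reshaping as in the workshop code: the names become an id column.
counts.df <- as.data.frame(counts)
counts.df$id <- rownames(counts.df)
rownames(counts.df) <- NULL
print(counts.df)
```

Note that a county with no overlapping ranges yields an empty vector, so it is counted as 0 rather than dropped.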

Combine Datasets
Now that we have a dataframe of the number of species per county, we can join these data back into
our county map data so that we can plot the county polygons. First we will convert our SpatialPolygons-class
US county map into a dataframe for joining and plotting. We will then use left_join() to join our count
dataframe into our county map dataframe, matching by the unique county id.
library(ggplot2)  # provides fortify()
states.df <- fortify(usa, region = "ID")
sallys.counties <- left_join(states.df, counts.df, by='id')
head(sallys.counties)
##        long      lat order  hole piece              id             group
## 1 -86.50517 32.34920     1 FALSE     1 alabama,autauga alabama,autauga.1
## 2 -86.53382 32.35493     2 FALSE     1 alabama,autauga alabama,autauga.1
## 3 -86.54527 32.36639     3 FALSE     1 alabama,autauga alabama,autauga.1
## 4 -86.55673 32.37785     4 FALSE     1 alabama,autauga alabama,autauga.1
## 5 -86.57966 32.38357     5 FALSE     1 alabama,autauga alabama,autauga.1
## 6 -86.59111 32.37785     6 FALSE     1 alabama,autauga alabama,autauga.1
##   counts
## 1     22
## 2     22
## 3     22
## 4     22
## 5     22
## 6     22
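
The join attaches one count to every vertex of a county's outline, which is why all six rows in the output show the same value. A base R sketch of the same lookup, using match() instead of dplyr's left_join(), is shown here. The two small dataframes and the Baldwin count are hypothetical. match() is used because it leaves the vertex rows in their original order, whereas merge() may reorder them.

```r
# Hypothetical vertex table standing in for the fortified county map.
vertices <- data.frame(
  long  = c(-86.5, -86.6, -86.7, -87.6, -87.7),
  lat   = c(32.3, 32.4, 32.3, 30.8, 30.9),
  order = 1:5,
  id    = c(rep("alabama,autauga", 3), rep("alabama,baldwin", 2)),
  stringsAsFactors = FALSE
)

# Hypothetical per-county counts: one row per county, in a different order.
county.counts <- data.frame(
  id     = c("alabama,baldwin", "alabama,autauga"),
  counts = c(25L, 22L),
  stringsAsFactors = FALSE
)

# For each vertex, find the row of its county in county.counts and copy the count.
vertices$counts <- county.counts$counts[match(vertices$id, county.counts$id)]
print(vertices)
```

Keeping the vertex order matters because geom_polygon() draws each outline by connecting rows in sequence.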

Plot the Map


We will use the ggmap, scales, and RColorBrewer packages along with ggplot2 to map our data. First, we'll
plot the new counties dataframe with counts of salamanders using geom_polygon(); our aesthetics are
the longitude (x), latitude (y), the county ids (group), and the number of salamander species within each
county (fill). The remaining changes to the map are cosmetic: adjusting the map projection,
changing the fill gradient, the theme, the legend position, the appearance of the x-axis, and the labels for
both axes.
# install.packages(c('ggmap','scales','RColorBrewer'))
library(ggplot2)
library(ggmap)
library(scales)
library(RColorBrewer)
ggplot() +
  geom_polygon(data=sallys.counties,
               aes(x=long, y=lat, group=id, fill=counts),
               color="black", size=0.25) +
  coord_equal() +
  scale_fill_gradient2(name="Number of\nSpecies", low="white", high='navyblue') +
  theme_bw() +
  theme(legend.position=c(0.92, 0.27),
        legend.background=element_rect('transparent')) +
  scale_x_continuous(name=expression(paste('Longitude (', degree, 'W)')),
                     breaks=c(-120,-100,-80),
                     labels=c(120,100,80)) +
  ylab(expression(paste('Latitude (', degree, 'N)')))

[Figure: county-level map of the contiguous United States shaded by number of salamander species (legend "Number of Species", 0 to 30); x-axis Longitude (°W), y-axis Latitude (°N).]

Inset Map: Alabama Salamander Richness


To demonstrate an inset map, we will visualize Alabama species richness compared to the rest of the US. First
we'll filter our county map down to the state of Alabama using grepl(), which searches for a pattern ('alabama,')
in our id column.

Filter Data
bama.rich <- sallys.counties[grepl('alabama,', sallys.counties$id),]
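
As a small illustration of the filter, the sketch below applies the same pattern to a short, hypothetical vector of ids in the "state,county" format used by the county map. The anchored variant with `^` is an optional extra, not part of the workshop code; it only matches ids that begin with the pattern.

```r
# Hypothetical ids in the "state,county" format used by the county map.
ids <- c("alabama,autauga", "georgia,fulton", "alabama,mobile", "florida,escambia")

# grepl() returns TRUE wherever the pattern occurs in the string.
keep <- grepl("alabama,", ids)

# Anchoring with ^ requires the id to start with the pattern.
keep.anchored <- grepl("^alabama,", ids)

print(ids[keep])
```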

Define Map
As with the US map, we will create a map of the number of salamander species in Alabama and store it.
Because the range of counts is much narrower, we will adjust the fill gradient. Just to make it look fancy, we'll add an
off-centered gray polygon to make it look like the plot has a shadow. To show a different style of x- and y-axis labels,
we adjust the x and y scales, pasting the N/W directly after the number.
bama.map <- ggplot() +
  geom_polygon(data=bama.rich,
               aes(x=long+0.05, y=lat-0.05, group=id), fill='gray80') +
  geom_polygon(data=bama.rich,
               aes(x=long, y=lat, group=id, fill=counts),
               color="black", size=0.25) +
  coord_equal() +
  scale_fill_gradient2(name="Number of\nSpecies", low="white", high='darkblue',
                       midpoint=22, mid='steelblue1') +
  theme_bw() +
  scale_x_continuous(breaks=c(-88,-87,-86,-85),
                     labels=c(paste(c(88:85), 'W', sep=''))) +
  scale_y_continuous(limits=c(29.8,35.25),
                     breaks=c(seq(30, 35, 1)),
                     labels=c(paste(seq(30, 35, 1), 'N', sep=''))) +
  labs(x='', y='')
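
The axis labels are plain character vectors built with paste(). The sketch below shows one way to derive them from the break values themselves, so the two cannot drift apart; the variable names are hypothetical and the result is the same set of labels as in the map code.

```r
# Break positions as used in the Alabama map.
x.breaks <- c(-88, -87, -86, -85)
y.breaks <- seq(30, 35, 1)

# Drop the sign on longitude and append the hemisphere letter.
x.labels <- paste0(abs(x.breaks), "W")
y.labels <- paste0(y.breaks, "N")

print(x.labels)
print(y.labels)
```

These vectors could then be passed as `breaks=x.breaks, labels=x.labels` in scale_x_continuous(), and likewise for the y scale.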

Define Inset
We will build a state-level dataframe (states.df2) to plot our inset map. To show where our zoomed-in map of
Alabama sits within the US, we outline the state of Alabama with thicker dark red lines.
states <- map("state", fill=TRUE, plot=FALSE)
states.df2 <- fortify(states, region='id')
inset.map <- ggplot() +
  geom_polygon(data=states.df2, aes(long, lat, group=group),
               color='black', fill='white', size=0.25) +
  geom_polygon(data=states.df2[states.df2$region=='alabama',], aes(long, lat, group=group),
               color='darkred', fill='white', size=1) +
  coord_equal() +
  theme_bw() +
  labs(x=NULL, y=NULL) +
  theme(axis.text.x=element_blank(), axis.text.y=element_blank(),
        axis.ticks=element_blank(), axis.title.x=element_blank())

Plot Map and Inset


We will use viewports from the grid package (loaded here along with gridExtra) to plot these two maps together.
Each viewport defines the area and placement for one plot. We will place the zoomed-in map in the middle of the
page and the inset map toward the lower right-hand corner.
# install.packages('gridExtra')
library(gridExtra)
library(grid)
library(GISTools)
grid.newpage()
map.viewport <- viewport(width=1, height=1, x=0.5, y=0.5)
inset.viewport <- viewport(width=0.45, height=0.55, x=0.57, y=0.3)
print(bama.map, vp=map.viewport)
print(inset.map, vp=inset.viewport)

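[Figure: map of Alabama counties shaded by number of salamander species (legend "Number of Species", 18 to 26), with the US inset map; x-axis 88W to 85W, y-axis 30N to 35N.]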
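
To make the viewport placement easier to reason about, here is a minimal sketch using only the grid package, with labelled rectangles standing in for the two ggplot objects. By default a viewport's x and y give its centre, so the inset's edges are the centre plus or minus half its width or height. The null PDF device and the names main.vp, inset.vp, and inset.edges are assumptions made so the sketch runs without opening a window or writing a file.

```r
library(grid)

pdf(NULL)        # off-screen device, so nothing is written to disk
grid.newpage()

# Same sizes and positions as the viewports used for the Alabama map and inset.
main.vp  <- viewport(width=1, height=1, x=0.5, y=0.5)
inset.vp <- viewport(width=0.45, height=0.55, x=0.57, y=0.3)

# Draw a labelled box in each viewport in place of the real maps.
pushViewport(main.vp)
grid.rect(gp=gpar(col="black"))
grid.text("main map", y=0.95)
popViewport()

pushViewport(inset.vp)
grid.rect(gp=gpar(col="darkred", fill="white"))
grid.text("inset")
popViewport()

# x and y are the viewport centre, so the edges are centre +/- half the size
# (in fractions of the page).
inset.edges <- c(left   = 0.57 - 0.45/2, right = 0.57 + 0.45/2,
                 bottom = 0.30 - 0.55/2, top   = 0.30 + 0.55/2)
print(inset.edges)

invisible(dev.off())
```

Adjusting x, y, width, and height by trial and error, then checking the edges this way, is a quick route to an inset that does not cover the counties of interest.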

Add Points
Let's say that we randomly sampled the state of Alabama, and we want to visualize the proportion of species
that we found at each site. We will use the dismo package to create random sampling locations throughout
the state; to do so, we need a raster object to use as the mask. We will also
create a column for the proportion of species found. Then, when we add these points to our previous plot, we
can scale the symbol size by the proportion of species we found at each site.

Random Sampling Sites


library(dismo)
r <- raster(ncol=200, nrow=200)
extent(r) <- c(-88,-85.5, 31, 35)
set.seed(8675309)
sampling.sites <- as.data.frame(randomPoints(r, 20))
sampling.sites$prop.spp.found <- runif(nrow(sampling.sites), 0.5, 1)
head(sampling.sites)
##           x     y prop.spp.found
## 1 -87.59375 31.81      0.6313152
## 2 -86.30625 33.85      0.7202442
## 3 -87.18125 33.99      0.7492431
## 4 -85.68125 31.15      0.9298679
## 5 -85.80625 34.01      0.7590958
## 6 -87.88125 34.35      0.5575507
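
If dismo is not available, a rough stand-in is to draw coordinates uniformly over the same bounding box with runif(). This is a swapped-in technique, not what randomPoints() does (it samples from the cells of the mask raster), so it will not reproduce the coordinates printed above. The name sites.alt is hypothetical.

```r
set.seed(8675309)
n.sites <- 20

# Uniform draws over the same extent as the raster mask:
# longitude -88 to -85.5, latitude 31 to 35.
sites.alt <- data.frame(
  x = runif(n.sites, -88, -85.5),
  y = runif(n.sites, 31, 35)
)

# Simulated proportion of species found at each site, as in the workshop code.
sites.alt$prop.spp.found <- runif(n.sites, 0.5, 1)
head(sites.alt)
```

Like the raster-based version, this samples a rectangle rather than the state outline, so some points may fall outside Alabama.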

Plot Points with Proportional Symbol Size


Now we can create a new map that includes our points and plot it with the inset, as above.
bama.map2 <- ggplot() +
  geom_polygon(data=bama.rich,
               aes(x=long+0.05, y=lat-0.05, group=id), fill='gray80') +
  geom_polygon(data=bama.rich,
               aes(x=long, y=lat, group=id, fill=counts),
               color="black", size=0.25) +
  geom_point(data=sampling.sites, aes(x=x, y=y, size=prop.spp.found)) +
  coord_equal() +
  scale_fill_gradient2(name="Number of\nSpecies", low="white", high='darkblue',
                       midpoint=22, mid='steelblue1') +
  theme_bw() +
  scale_x_continuous(breaks=c(-88,-87,-86,-85),
                     labels=c(paste(c(88:85), 'W', sep=''))) +
  scale_y_continuous(limits=c(29.8,35.25),
                     breaks=c(seq(30, 35, 1)),
                     labels=c(paste(seq(30, 35, 1), 'N', sep=''))) +
  labs(x='', y='') +
  scale_size(name='Proportion of\nSpecies Found',
             range=c(0.5,7))
library(grid)
grid.newpage()
map.viewport <- viewport(width=1, height=1, x=0.5, y=0.5)
inset.viewport <- viewport(width=0.45, height=0.65, x=0.56, y=0.22)
print(bama.map2, vp=map.viewport)
print(inset.map, vp=inset.viewport)

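[Figure: map of Alabama counties shaded by number of salamander species (legend "Number of Species", 18 to 26), with sampling sites drawn as points sized by proportion of species found (legend "Proportion of Species Found", 0.6 to 0.9) and the US inset map; x-axis 88W to 85W, y-axis 30N to 35N.]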
