Week 6
STAT240
You will use a lab computer to complete the midterm. IDs will be
checked, so please bring your SFU ID and a matching government ID
for comparison.
The exam is closed book. You cannot use the internet during the
exam, except to upload the exam to Crowdmark.
You can use RStudio's "help" tab. You cannot use previous labs
or slides or notes.
Set Zi = βXi + α
for product A)
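$$
\cdots = \sum_{i=1}^{n} \log\left[ \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2} \left( \frac{f(X_i) - Y_i}{\sigma} \right)^{2} \right) \right]
\propto -\sum_{i=1}^{n} \left( f(X_i) - Y_i \right)^{2} + K
$$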
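The step from the Gaussian log-likelihood to the sum of squares can be spelled out. The following is a sketch of the intermediate algebra, assuming independent observations that share one fixed σ:

$$
\sum_{i=1}^{n} \log\left[ \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2} \left( \frac{f(X_i) - Y_i}{\sigma} \right)^{2} \right) \right]
= -n \log\left( \sigma\sqrt{2\pi} \right) - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \left( f(X_i) - Y_i \right)^{2}
$$

The first term does not depend on f and the factor 1/(2σ²) is positive, so maximizing the log-likelihood over f is the same as minimizing the sum of squared differences between f(Xi) and Yi.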
— Wikipedia
Example: NASDAQ
For example, in the NASDAQ API, a command is run by going to
a URL (uniform resource locator). The content of the "website"
at that URL is the requested data.
library(httr)
# Build the URL in pieces so the string is not broken by line wrapping
url = paste0(
  'https://data.nasdaq.com/api/v3/datasets/WIKI/AAPL.json?',
  'start_date=1985-05-01&end_date=1997-07-01&order=asc',
  '&column_index=4&collapse=quarterly&transformation=rdiff'
)
data = GET(url)
print(data)
Response [https://data.nasdaq.com/api/v3/datasets/WIKI/AAPL.json?
start_date=1985-05-01&end_date=1997-07-
01&order=asc&column_index=4&collapse=quarterly&transformation=rdiff]
Date: 2024-02-14 08:56
Status: 200
Content-Type: application/json; charset=utf-8
Size: 2.55 kB
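The part of the URL after the "?" is a query string: name=value pairs joined by "&". As a minimal sketch, the same URL can be assembled from a named list of parameters in base R. The variable names base, params and query are illustrative, and this simple version assumes the values need no URL escaping.

```r
# Endpoint and parameters taken from the NASDAQ example above.
base = 'https://data.nasdaq.com/api/v3/datasets/WIKI/AAPL.json'
params = list(
  start_date = '1985-05-01',
  end_date = '1997-07-01',
  order = 'asc',
  column_index = '4',
  collapse = 'quarterly',
  transformation = 'rdiff'
)

# Join each name with its value, then join the pairs with "&".
query = paste(names(params), unlist(params), sep = '=', collapse = '&')
url = paste0(base, '?', query)
print(url)

# No request is sent here; the call itself would still be GET(url).
# httr's GET() can also take the list directly, as GET(base, query = params).
```

Keeping the parameters in a list makes it easier to change one of them (for example the date range) without editing a long string by hand.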
Retrieving URLs in R
We parse the JSON response into an R list, and explore it to find
out how to index the data in the list.
print(substr(as.character(data),1,50))
[1] "{\"dataset\":{\"id\":9775409,\"dataset_code\":\"AAPL\",\"da"
library(rjson)
parsed = fromJSON(as.character(data))
print(parsed$dataset$data[[1]])
[[1]]
[1] "1985-06-30"
[[2]]
[1] 18
Retrieving URLs in R
Once we understand how to index into the list, we can convert to
a dataframe:
library(knitr) # provides kable()
n = length(parsed$dataset$data)
dates = rep("", n); values = rep(NA, n)
for (i in seq_len(n)) {
  dates[i] = parsed$dataset$data[[i]][[1]]   # first element: date
  values[i] = parsed$dataset$data[[i]][[2]]  # second element: value
}
df = data.frame(date = dates, value = values)
kable(df[1:3,]) # Show the first 3 rows as a table in the rendered output
date value
1985-06-30 18.00
1985-09-30 15.75
1985-12-31 22.00
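The same conversion can be written without an explicit loop. This is a sketch only: rows is a small hypothetical stand-in for parsed$dataset$data, filled with the three rows shown in the table, and it assumes every row has both a date and a value.

```r
# Hypothetical stand-in for parsed$dataset$data: a list of (date, value) pairs.
rows = list(
  list("1985-06-30", 18),
  list("1985-09-30", 15.75),
  list("1985-12-31", 22)
)

# Extract the first and second element of every row.
dates = sapply(rows, function(r) r[[1]])
values = sapply(rows, function(r) r[[2]])

# Storing the dates as Date objects allows sorting and date arithmetic later.
df = data.frame(date = as.Date(dates), value = values)
print(df)
```

With the real response, rows would be replaced by parsed$dataset$data.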
R packages for APIs
While APIs are programming language agnostic, many APIs
have libraries in a variety of languages (these may construct
URLs and parse results). NASDAQ API R package: Quandl
library(Quandl)
# This call gets the quarterly percentage change in AAPL stock
# between 1985 and 1997, closing prices only
result = Quandl("WIKI/AAPL", transformation = "rdiff",
                start_date = "1985-05-01", end_date = "1997-07-01",
                column_index = 4, order = "asc", collapse = "quarterly")
kable(result[1:3,])
Date Close
1985-06-30 18.00
1985-09-30 15.75
1985-12-31 22.00
Example: Google Maps
Example: Apple Pay
Example: Discord
Example: eBird
eBird is a database of bird sightings maintained by the Cornell Lab
of Ornithology at Cornell University, Ithaca.
library(rebird)
#API key must be obtained. Set API_KEY = "your key".
subregions = ebirdsubregionlist(regionType = "subnational1",
parentRegionCode = "CA", key = API_KEY)
New names:
• value -> value...5
• value -> value...6
• value -> value...7
• value -> value...8
• value -> value...9
eBird API
print(subregions)
# A tibble: 13 × 2
code name
<chr> <chr>
1 CA-AB Alberta
2 CA-BC British Columbia
3 CA-MB Manitoba
4 CA-NB New Brunswick
5 CA-NL Newfoundland and Labrador
6 CA-NT Northwest Territories
7 CA-NS Nova Scotia
8 CA-NU Nunavut
9 CA-ON Ontario
10 CA-PE Prince Edward Island
11 CA-QC Quebec
12 CA-SK Saskatchewan
13 CA-YT Yukon Territory
"Tibble" to vector
Converting "A tibble: 13 × 2" to a vector:
as.vector(subregions[,1])
# A tibble: 13 × 1
code
<chr>
1 CA-AB
2 CA-BC
3 CA-MB
4 CA-NB
5 CA-NL
6 CA-NT
7 CA-NS
8 CA-NU
9 CA-ON
10 CA-PE
11 CA-QC
12 CA-SK
13 CA-YT
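As the output shows, single-bracket indexing followed by as.vector() still returns a tibble. One way to get a plain character vector is to extract the column with [[ ]] or $. The sketch below assumes the tibble package is installed and uses a three-row stand-in for subregions.

```r
library(tibble)

# Small stand-in for the subregions tibble (first three rows only).
subregions = tibble(
  code = c("CA-AB", "CA-BC", "CA-MB"),
  name = c("Alberta", "British Columbia", "Manitoba")
)

# Single-bracket indexing keeps the tibble wrapper ...
still_tibble = is.data.frame(subregions[, 1])

# ... while [[ ]] or $ extracts the column itself.
codes = subregions[[1]]
same = subregions$code
print(codes)
```

Here codes is an ordinary character vector, so it can be used directly in loops or in sprintf() when building URLs.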
eBird API: Species list
Again, from eBird API:
# sr and sp are assumed to hold region codes and species codes,
# matching the URL pattern below
n = length(sr); m = length(sp)
recent = matrix(NA, n, m)
for (j in seq_len(m)) {
  for (i in seq_len(n)) {
    if (is.na(recent[i, j])) {
      url = sprintf('https://api.ebird.org/v2/data/obs/%s/recent/%s',
                    sr[i], sp[j])
      result = content(GET(url, add_headers(
        'X-eBirdApiToken' = API_KEY
      )))
      ...
      for (k in seq_along(result)) {
        ...
        recent[i, j] = recent[i, j] + howMany
        ...
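The inner loop adds up a count over the observations in one response. The sketch below illustrates that kind of accumulation on made-up data and is not the elided code: result is a hypothetical parsed response, and it assumes that an observation may come without a howMany field, in which case it is skipped.

```r
# Hypothetical parsed response: one list per observation (values made up).
result = list(
  list(comName = "House Sparrow", howMany = 3),
  list(comName = "House Sparrow"),              # count not reported
  list(comName = "House Sparrow", howMany = 5)
)

total = 0  # start from 0, since NA + number stays NA
for (k in seq_along(result)) {
  howMany = result[[k]]$howMany
  if (!is.null(howMany)) {
    total = total + howMany
  }
}
print(total)
```

seq_along(result) is used instead of 1:length(result) so that an empty response runs the loop zero times.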
[1] 883 14
print(colnames(ebird30))
names = read.csv("names.csv")
print(dim(names))
[1] 883 3
print(colnames(names))
Drawing a map of Canada
source('canada.R')
heat = list("CA.MB" = 0, "CA.BC" = 1, "CA.AB" = 0, "CA.SK" = 0,
            "CA.ON" = 0.5, "CA.QC" = 0, "CA.NL" = 0, "CA.NS" = 0,
            "CA.NB" = 0, "CA.PE" = 0, "CA.NT" = 0, "CA.NU" = 0,
            "CA.YT" = 0)
title = 'Test'
canada(title, heat)
Heatmaps for birds
This can be combined with the eBird data to provide data
visualizations:
heat = list()
rownames(ebird30) = ebird30$Code
for (i in 2:14) {
  region = colnames(ebird30)[i]
  heat[[region]] = ebird30["houspa", region]
}
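The loop builds a named list with one entry per region column. A shorter sketch of the same idea selects the species row and converts it with as.list(). The data frame here is a hypothetical miniature of ebird30 with made-up counts and only three region columns.

```r
# Hypothetical miniature of ebird30: a Code column plus one column per region.
ebird30 = data.frame(
  Code = c("houspa", "amerob"),
  CA.BC = c(120, 80),
  CA.ON = c(300, 150),
  CA.QC = c(90, 60)
)
rownames(ebird30) = ebird30$Code

# Select the species row, drop the Code column, convert to a named list.
heat = as.list(ebird30["houspa", -1])
str(heat)
```

With the full data, -1 drops the first column in the same way that the loop over columns 2:14 skips it.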