Advanced joining
Joining Data in R with dplyr
> library(tibble)
> # noNames: name of table; var = "name": name of column to add
> rownames_to_column(noNames, var = "name")
  name   surname    band
1 Mick    Jagger    <NA>
2 John    Lennon Beatles
3 Paul McCartney Beatles
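The call above presumes a data frame whose row names hold the first names. A minimal sketch of that setup follows; the way noNames is constructed here is an assumption for illustration, with values copied from the output shown on the slide.

```r
library(tibble)

# Hypothetical reconstruction of noNames: first names stored as row names
noNames <- data.frame(
  surname = c("Jagger", "Lennon", "McCartney"),
  band = c(NA, "Beatles", "Beatles"),
  row.names = c("Mick", "John", "Paul"),
  stringsAsFactors = FALSE
)

# Move the row names into a regular column called "name",
# so the table can be joined on that column
withNames <- rownames_to_column(noNames, var = "name")
print(withNames)
```

Once the names live in an ordinary column, the table can be used as either side of a dplyr join with by = "name".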
Let’s practice!
Conflicting names
> playsWith
  name   plays
1 Mick  Stones
2 John Beatles
3 Paul Beatles
> plays
   name  plays
1  John Guitar
2  Paul   Bass
3 Keith Guitar
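Both tables carry a non-key column called plays, so a join on name has to rename one or both copies. The sketch below rebuilds the two tables from the slide and shows dplyr's default suffixes next to custom ones; the suffix labels "_band" and "_instrument" are arbitrary choices made for this illustration.

```r
library(dplyr)

playsWith <- data.frame(
  name = c("Mick", "John", "Paul"),
  plays = c("Stones", "Beatles", "Beatles"),
  stringsAsFactors = FALSE
)
plays <- data.frame(
  name = c("John", "Paul", "Keith"),
  plays = c("Guitar", "Bass", "Guitar"),
  stringsAsFactors = FALSE
)

# Default behaviour: the clashing columns become plays.x and plays.y
joined_default <- left_join(playsWith, plays, by = "name")

# Custom suffixes make the two columns self-describing
joined_named <- left_join(playsWith, plays, by = "name",
                          suffix = c("_band", "_instrument"))
print(joined_named)
```

An alternative is to rename one of the columns with rename() before joining, which avoids suffixes altogether.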
Let’s practice!
purrr R package
● Applies functions in efficient ways
● reduce()
● Works well with dplyr
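As a rough illustration of what reduce() does, the sketch below folds a short list with a two-argument function. The numbers are made up; the point is only the left-to-right, pairwise evaluation.

```r
library(purrr)

# reduce() combines list elements pairwise, left to right:
# ((1 + 2) + 3) + 4
total <- reduce(list(1, 2, 3, 4), `+`)
print(total)
```

The same pattern applies to any function that takes two inputs and returns one result of the same kind, which is why it pairs naturally with dplyr's two-table joins.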
Installing purrr
> install.packages("purrr")  # three r's
> library(purrr)
reduce()
> surnames
   name surname
1  Mick  Jagger
2  John  Lennon
3 Ringo   Starr
> names
  name    band
1 Mick  Stones
2 John Beatles
3 Paul Beatles
> plays
   name  plays
1  John Guitar
2  Paul   Bass
3 Keith Guitar
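One way to join the three tables above in a single step is to put them in a list and fold the list with a join function. This is a sketch rather than the only approach; the anonymous wrapper function is a choice made here so that by = "name" is passed explicitly at every step.

```r
library(dplyr)
library(purrr)

surnames <- data.frame(
  name = c("Mick", "John", "Ringo"),
  surname = c("Jagger", "Lennon", "Starr"),
  stringsAsFactors = FALSE
)
names <- data.frame(
  name = c("Mick", "John", "Paul"),
  band = c("Stones", "Beatles", "Beatles"),
  stringsAsFactors = FALSE
)
plays <- data.frame(
  name = c("John", "Paul", "Keith"),
  plays = c("Guitar", "Bass", "Guitar"),
  stringsAsFactors = FALSE
)

tables <- list(surnames, names, plays)

# Equivalent to:
# left_join(left_join(surnames, names, by = "name"), plays, by = "name")
all_left <- reduce(tables, function(x, y) left_join(x, y, by = "name"))
print(all_left)
```

Swapping left_join for full_join or inner_join in the wrapper changes which rows survive, while the folding logic stays the same.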
Let’s practice!
Other implementations
merge()
> merge(names, plays, by = "name", ...)
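Base R's merge() can reproduce the common join types through its all, all.x and all.y arguments. The sketch below uses the names and plays tables from the earlier slide; it illustrates those three arguments and is not a reconstruction of the arguments elided on the slide.

```r
names <- data.frame(
  name = c("Mick", "John", "Paul"),
  band = c("Stones", "Beatles", "Beatles"),
  stringsAsFactors = FALSE
)
plays <- data.frame(
  name = c("John", "Paul", "Keith"),
  plays = c("Guitar", "Bass", "Guitar"),
  stringsAsFactors = FALSE
)

# Default: inner join, only names found in both tables
inner <- merge(names, plays, by = "name")

# all.x = TRUE keeps every row of the first table (left join)
left <- merge(names, plays, by = "name", all.x = TRUE)

# all = TRUE keeps every row of both tables (full join)
full <- merge(names, plays, by = "name", all = TRUE)

print(full)
```

Note that merge() sorts the result by the key column by default, whereas dplyr joins keep the row order of the first table.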
src_sqlite()    SQLite
src_postgres()  PostgreSQL
> install.packages("DBI")
# Connect to a database
> air <- src_postgres(dbname = "airontime",
    host = "sol-eng-sparklyr.cyii7eabibhu.us-east-1.redshift.amazonaws.com",
    port = "5439", user = "redshift_user", password = "ABCd4321")
# Reference tables (assumes the database holds tables named "flights" and "planes")
> flights <- tbl(air, "flights")
> planes <- tbl(air, "planes")
# Manipulate tables
> flights <- left_join(flights, planes, by = "tailnum")
# Collect results
> flights <- collect(flights)
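The connect, manipulate, collect workflow can be tried without a remote server. The sketch below swaps in an in-memory SQLite database via DBI in place of the PostgreSQL connection above; it assumes the RSQLite and dbplyr packages are installed, and the toy flights and planes rows are invented for illustration.

```r
library(dplyr)
library(DBI)

# In-memory SQLite database standing in for a remote server
# (assumption: RSQLite and dbplyr are installed)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy tables, invented for this example
dbWriteTable(con, "flights", data.frame(
  tailnum = c("N1", "N2", "N3"),
  dest = c("JFK", "LAX", "ORD"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "planes", data.frame(
  tailnum = c("N1", "N2"),
  model = c("A320", "B737"),
  stringsAsFactors = FALSE
))

# Reference the tables lazily; no rows are pulled into R yet
flights <- tbl(con, "flights")
planes <- tbl(con, "planes")

# The join is translated to SQL and runs inside the database
joined <- left_join(flights, planes, by = "tailnum")

# collect() executes the query and returns a local data frame
result <- collect(joined)
print(result)

dbDisconnect(con)
```

Calling show_query(joined) before collect() prints the SQL that dplyr generates, which can help confirm that the join runs in the database rather than in R.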
Let’s practice!