
Introduction to R for Air Quality Research

Dr. Ross Edwards - Senior Scientist


Wisconsin State Laboratory of Hygiene, University of Wisconsin Madison
Acknowledgements and disclaimer
Disclaimer
This presentation was funded by a grant from the United States Department of
State to the University of Wisconsin. The opinions, findings and conclusions
stated herein are those of the presenter and do not necessarily reflect those of
the United States Department of State.

Funding
• U.S. Department of State award SMLAQM19CA2361
Partnerships
• The U.S. Embassy in Bangladesh and the U.S. Consulate General, Kolkata.

Academic collaborations
• Dhaka University and Bose Institute, Kolkata.
Introduction to R

Subjects to be covered

1. CRAN R installation, setup and navigation.
2. R Studio 1.
3. R Studio 2.
4. Data types and structures 1.
5. Data types and structures 2.
6. Data types and structures 3.
7. Importing and exporting data.
8. Datetime objects
9. Plotting.
10. Plotting.
12. Writing functions.
13. Writing functions.
Introduction to R

Experience needed
Experience with data in Excel or another spreadsheet program, and basic statistics.

Required
Access to an internet connected computer with R installed. See https://cran.r-project.org for
download and installation instructions.

Materials
Presentation slides and R code can be found at:
https://drive.google.com/drive/folders/1UmMo968xAO-cQiUCx6HMNEZtTtExUQP3?usp=sharing
R Language

Why R?
• One of the most popular languages for data science.
• Open source.
• Large community.
• More focused on statistics and data exploration than Python.
• Python is often a better fit for long-term production use and
deployment of data pipelines etc.
R Background

• R is a language and environment for statistical computing and graphics.

• It is similar to the S language and environment developed at Bell Laboratories
(formerly AT&T, now Lucent Technologies) by John Chambers and colleagues
in 1976.

• In many respects R is a different implementation of S. There are differences,
but some S code will run under R.

• The implementation was created in 1991 by statisticians Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand, and first
released as open source (GNU General Public License) in 1995.

• Named after the first initial of the R authors and as a play on the name of the S
language.
R - Some basic questions

How do I install R?
R IDE and R Studio?
Help?
Importing data into R?
Exporting data from R?
Plotting data?
Calculate statistics with R?
What is an R script?
Packages / libraries?
Objects?
Data frames?
Data types?
Write functions?
R Installation

• R can be installed on Windows, Mac and Linux computers.

• Download for free at https://cran.r-project.org

• NOTE: only download R installers from a CRAN server using HTTPS.

• Windows R FAQ is at
https://cran.r-project.org/bin/windows/base/rw-FAQ.R-4.2.1.html
R Windows Users

Important note to Windows users:

R gets confused if you use a path in your code like:
c:\mydocuments\myfile.txt

R sees "\" as an escape character.

Instead use: c:\\mydocuments\\myfile.txt
or c:/mydocuments/myfile.txt

Both should work.
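The two spellings can be compared directly in the console. This is a minimal sketch; the path is the hypothetical one from the slide and the file does not need to exist.

```r
# Hypothetical path, used only to show the string syntax.
p_escaped <- "c:\\mydocuments\\myfile.txt"   # each backslash is doubled
p_forward <- "c:/mydocuments/myfile.txt"     # forward slashes need no escaping

# cat() prints the string as R stores it: one backslash per separator.
cat(p_escaped, "\n")

# Replacing the backslashes gives exactly the forward-slash form.
p_converted <- gsub("\\", "/", p_escaped, fixed = TRUE)
identical(p_converted, p_forward)

# file.path() builds a path from its parts and joins them with "/".
p_built <- file.path("c:", "mydocuments", "myfile.txt")
identical(p_built, p_forward)
```

Building paths with file.path() avoids typing separators at all, which is convenient when the same script is shared between Windows, Mac and Linux users.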


R Commands

Entering Commands
R is a case-sensitive, command-line-driven program. The user enters commands at the
prompt (>), and each command is executed one at a time.

E.g. > mean(x)

This command runs the function “mean” on the data contained in object “x”.

Remember that R commands, object names etc. are case sensitive: “a” is
different to “A”.
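Case sensitivity is easy to check in the console. A short sketch with made-up values; the misspelled function name Mean is deliberate.

```r
x <- c(1, 2, 3, 4, 5, 6)   # example data
mean(x)                    # runs the function mean on the object x

# Names that differ only in case are different objects.
a <- 1
A <- 2
a == A

# There is no base R function called Mean, so this call fails.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(Mean(x), error = function(e) conditionMessage(e))
msg
```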
R Operators

Operators

Arithmetic                        Logical
Operator   Description            Operator   Description
+          addition               >          greater than
-          subtraction            >=         greater than or equal to
*          multiplication         ==         exactly equal to
/          division               !=         not equal to
^ or **    exponentiation
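The operators can be tried one at a time at the prompt. The pm25 values below are invented for illustration only.

```r
# Arithmetic operators
7 + 2    # addition
7 - 2    # subtraction
7 * 2    # multiplication
7 / 2    # division
7 ^ 2    # exponentiation
7 ** 2   # same result as 7 ^ 2

# Logical operators compare element by element and return TRUE or FALSE.
pm25 <- c(12, 35, 58, 20)   # made-up values
above    <- pm25 > 35
at_least <- pm25 >= 35
equal_to <- pm25 == 35
not_eq   <- pm25 != 35
above
```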
R Assignment Operator

Assignment “<-”

Used to create new objects.

e.g. x <- c(1,2,3,4,5,6) # creates a numeric vector called “x”

e.g. a <- mean(x) # applies the mean function to x and creates a new object called “a” to hold the
answer

The “#” character starts a comment; anything after it on the line is not run.

e.g. ##### R only supports single-line comments
# for multi-line comments, start every line with “#”
# (R Studio can comment out several selected lines at once)
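Put together as a small script, assignment and comments look like this. It is a sketch using the same x and a as above; the object b is added only for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6)   # creates a numeric vector called x
a <- mean(x)               # stores the mean of x in a new object called a
a                          # typing the name prints the value

# Assigning to an existing name replaces the old value without a warning.
b <- a
b <- b * 2

# A comment that runs over several lines
# needs a "#" at the start of every line.
b
```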
R Session

R session concepts

R session: Workspace, Working directory, Objects, Packages, R scripts
R GUI: Menu bar, Console, Document window, Graphic device
R command line: Terminal, Files, Graphic device
R Workspace / Working Environment

R session concepts
Workspace            Default workspace loads on startup.
Working directory    The starting directory in which R searches for folders, data etc.
Objects              User-defined objects (vectors, matrices, data frames, lists, functions).
Packages             Libraries of functions, code, data etc.
R scripts            Text files containing commands, functions etc.

An R session can be saved to the default workspace, saved as a new workspace,
or deleted.

A session includes the command history. Use the up and down arrow keys to scroll
through your command history.
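The same concepts can be explored from the console. A minimal sketch: the object name demo_pm and the file name are made up, and the file is written to a temporary folder so nothing in your own folders is changed.

```r
getwd()                       # print the working directory

demo_pm <- c(10, 20, 30)      # create an object in the workspace
ls()                          # list the objects in the workspace

# Save the object to a file (save.image() would save the whole workspace).
ws_file <- file.path(tempdir(), "demo_workspace.RData")
save(demo_pm, file = ws_file)

rm(demo_pm)                   # remove the object from the workspace
was_removed <- !exists("demo_pm")

load(ws_file)                 # bring the saved object back
demo_pm
```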
R GUI

R session concepts
R GUI
Menu bar           Preferences, graphics devices, package installation and management etc.
Console            Entering commands.
Document window    Text editor for building and running scripts (series of commands).
Graphic device     For displaying graphics.


R Command Line Editor (CLE)

R session concepts
R command line     R can be run without the GUI from a CLE terminal.
Terminal           Terminal for entering commands.
Files              Files or scripts to run from the CLE.
Graphic device     For displaying graphics.


R GUI

Using the GUI

At R app startup you should see:

❖ Menu bar along the top.

❖ R console window beneath the menu: the Command Line Interface (CLE).
Type or copy and paste text here.

❖ Quartz window: graphics etc. are displayed in a Quartz window.

❖ Document Editor: appears after opening or creating a file.

[Figure: R GUI showing the Document editor and Quartz windows for graphics]
R GUI
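Using the GUI
❖ The Command Line Interface (CLE) is under Console.

❖ Type in commands, text etc.

❖ Try the following:

getwd()       # print the current working directory
help()        # open the help page for the help function
help(getwd)   # open the help page for getwd
data(iris)    # load the built-in iris dataset
head(iris)    # show the first six rows
tail(iris)    # show the last six rows
ls()          # list the objects in the workspace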
R GUI
Packages and Data

Packages are libraries containing functions and data.

R comes with a base package and some other standard packages.

To see a list of packages go to
https://cran.r-project.org/web/packages/
R Package structure

[Figure: R package structure, showing package functions, data for demos, and descriptions of functions]
R GUI
Packages and Data
Install a package
• Go to the Packages & Data menu and select Package Installer.

1. Search for the openair package.
2. Select the package with the cursor.
3. Tick “Install Dependencies”.
4. Click on “Install Selected”.
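Packages can also be installed from the console with install.packages(). The sketch below wraps it in a helper, install_if_missing, which is a made-up name for illustration and not part of R; it skips the download when the package is already present.

```r
# Hypothetical helper: install a package only when it is not already installed.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)   # already installed, nothing to do
  }
  # dependencies = TRUE plays the role of the "Install Dependencies" tick box.
  install.packages(pkg, dependencies = TRUE)
  TRUE
}

# "stats" ships with R, so this call does not download anything.
install_if_missing("stats")

# Uncomment to install openair (needs an internet connection):
# install_if_missing("openair")
```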
R GUI
Install a package
• Menu > R
• Preferences
• Startup

• The default mirror repository for R packages can be changed in Preferences.

• In this example it is set to “https://cran.case.edu”. The “0-Cloud” mirror
(https://cloud.r-project.org) automatically redirects to a nearby
repository.
R GUI

Install a package
• The first time you install a package, a large number of other dependent
packages may also be installed.

• R may also ask you which repository mirror site to use.

• The repository URL should begin with “https”.

• I usually let R decide which mirror to use or select the mirror closest to me.
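The mirror can also be set for the current session from the console. This sketch assumes the CRAN “0-Cloud” address https://cloud.r-project.org, which redirects to a nearby server; any other https mirror could be used instead.

```r
old_repos <- getOption("repos")   # remember the current setting

# Use the CRAN cloud mirror for this session.
options(repos = c(CRAN = "https://cloud.r-project.org"))
current_mirror <- getOption("repos")[["CRAN"]]
current_mirror

# Check that the mirror URL begins with https.
uses_https <- startsWith(current_mirror, "https://")
uses_https

options(repos = old_repos)        # put the original setting back
```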
R GUI
Packages and Data
> R Package Manager
Attach a package
• Once a package is installed, its functions and data can be accessed by
attaching it.

• Packages can be attached using the R Package Manager or the console.

Package Manager

• 1. Open the Package Manager from the drop-down menu under Packages and Data.

• 2. Select the package to attach using the cursor.
R GUI
Packages and Data
> R Package Manager
Attach a package
Console using the library() command
• Type the command “library(openair)” and press return.

• Note that the name of the package used in the library command is usually
all lower case!
• If the package refuses to load, R may be missing a package which it is
dependent on.
• The missing package will need to be installed.

Commands
library(openair) - attaches the openair package.
library() - lists all available (installed) packages.
search() - lists all attached packages.
R GUI
Packages and Data
> R Package Manager
Detach a package
• Packages can be unloaded from the working environment by unselecting the
package in the Package Manager or from the R console.
• From the console use:

detach(package:openair, unload=TRUE)
search() # check that package is detached.
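The attach and detach cycle can be practised without installing anything. The sketch below uses splines, a package that ships with R, as a stand-in for openair.

```r
library(splines)                            # attach the package
attached <- "package:splines" %in% search() # is it on the search path?
attached

detach(package:splines, unload = TRUE)      # detach it again
still_attached <- "package:splines" %in% search()
still_attached
```

The same two commands work for openair once it has been installed.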
R GUI

Package Use

• Once a package is installed into R it can be attached and detached (to save
CPU time).
• Loading is used to access specific functions from a specific package.
More on loading later.

[Figure: install package to R library; attach package to, or detach package from, the working environment; use functions from package]
R GUI

Package Use

• Sometimes different packages use the same name for a function,
for example the function “filter(x)”.

• If multiple packages are attached, R will look for the function in the last
package attached, then the next package etc.

• Masking: a newly attached package's function masks a previously attached
package's function.

“Sometimes this causes problems”.

[Figure: search path for a function, from the workspace through the 10th, 9th and 8th packages attached]
R GUI

Package Use

• How to prevent problems with functions with the same name.

• Functions can be addressed directly from a named package
using the “loading” symbol “::”.

• For example, dplyr::filter() says use the filter
function from the dplyr package.

[Figure: search path from the workspace through the attached packages, with dplyr::filter() pointing directly at the function]
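Masking and “::” can be seen without installing dplyr. In this sketch a home-made function called filter, defined only for the demonstration, masks the filter function from the stats package that comes with R.

```r
# A function defined in the workspace is found first, so it masks stats::filter.
filter <- function(x) "my own filter"
own <- filter(1:5)
own

# "::" names the package explicitly, so the stats version is used.
# Here it computes a moving sum over three neighbouring values.
smoothed <- stats::filter(1:5, rep(1, 3))
as.numeric(smoothed)

rm(filter)   # remove the home-made function; filter() is stats::filter again
```

With dplyr attached the situation is the same: filter() finds the dplyr version first, and stats::filter() or dplyr::filter() states which one is meant.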
R GUI

Packages and Data – Load Data

• R packages often contain data for example use or for the package
functions.

• Package data can be brought into the working environment through the
menu:
• Packages and data
> Data Manager
Click on a dataset then
select “load Data”

[Figure: R package structure, showing functions, dependent packages and data]
R GUI

Packages and Data – Load Data

• Package datasets can also be accessed using the data() command.

• For example, “data(iris)” loads the iris dataset from R’s datasets package:

data(iris)
head(iris)

• Note that this is not the same as importing a data file into R.
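A short sketch of what data() does and what it returns. The name loaded is made up for the example.

```r
data(iris)        # makes the data frame iris available in the workspace
class(iris)       # iris is a data frame
dim(iris)         # number of rows and columns
head(iris, 3)     # first three rows

# data() returns only the NAME of the dataset, not the data itself,
# so its result should not be assigned back to iris.
loaded <- data(iris)
loaded
```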
End WK1.
WK 1 Exercises

Wk 1.1
a. Install the following packages using the package manager.

tidyverse
lubridate
dplyr
ggplot2
openair

b. Attach and detach the packages using the package manager and
commands.
WK 1 Exercises

Wk 1.2
a. Find your working directory using the Misc menu and the command
getwd()

b. Copy the folder at

https://drive.google.com/drive/folders/1K86en-JdVS5nHdMuGPlaChO1mKMvs0Rx?usp=sharing

to your computer and make the folder “tsi_concat_dat” your working
directory using the Misc menu. Check using getwd().
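The working directory can also be changed from the console with setwd(). In this sketch the folder is created inside a temporary directory as a stand-in; on your computer, use the location where you copied “tsi_concat_dat”.

```r
old_wd <- getwd()                                   # remember where we started

# Stand-in folder for the exercise; replace with your own path.
new_wd <- file.path(tempdir(), "tsi_concat_dat")
dir.create(new_wd, showWarnings = FALSE)

setwd(new_wd)                                       # change the working directory
in_new <- basename(getwd()) == "tsi_concat_dat"     # check using getwd()
in_new

setwd(old_wd)                                       # go back to the original folder
```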
WK 1 Exercises

Wk 1.3
a. In the console type command:
source("tsi_concat_dat.r")

This command will attach packages and load the tsi_concat_dat function.
If that fails, go to the R File menu, select Source File and find tsi_concat_dat.r.
WK 1 Exercises
Wk 1.4 R Commands

a. In the console list the objects using ls()

• Use the class function to identify different objects, e.g. class(TSI2)
• Get a summary of the data in TSI2: summary(TSI2)
• Get the attributes of the TSI2 date column: attributes(TSI2$date)
• Get the names of the data columns in TSI2: names(TSI2)
• Get the class of the TSI2 pm25 column: class(TSI2$pm25)
• Get summary stats for the TSI2 pm25 column: summary(TSI2$pm25)
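If TSI2 is not loaded yet, the same commands can be tried on a small stand-in. The data frame demo below is invented for practice; its values are made up and only the column names date and pm25 follow the exercise.

```r
# Made-up stand-in for TSI2: four hourly records with a date and a pm25 column.
demo <- data.frame(
  date = as.POSIXct("2022-01-01 00:00", tz = "UTC") + 3600 * (0:3),
  pm25 = c(12.1, 35.4, 58.0, 20.3)
)

class(demo)              # what kind of object is it?
names(demo)              # names of the data columns
summary(demo)            # summary of every column
attributes(demo$date)    # attributes of the date column (class, time zone)
class(demo$pm25)         # class of the pm25 column
summary(demo$pm25)       # summary statistics for pm25
```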


WK 1 Exercises
Wk 1.5

Install the free version of R Studio from

https://www.rstudio.com/products/rstudio/download/
