
Lecture 4
Xiaotong Suo

Homework 1
Question 3

Today's agenda
Data input/output
Graphics

Data input/output
R can write matrices and data frames to a file using the
function write.table, and read data from a file using
read.table.
If you have a tab-delimited file, use the function
read.delim instead. If the file is comma-separated,
use read.csv.
Year,Student,Major
2009, John Doe,Statistics
2009, Bart Simpson, Mathematics I
The above is an example of a comma-separated file.
A tab-delimited file is the same except that tabs are used as the
separator.
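As a minimal sketch of reading such a file, the example below writes the three lines shown above to a temporary file and loads them with read.csv. The variable names (csv_text, csv_file, students) and the use of a temporary file are assumptions made for illustration only.

```r
# File contents copied from the comma-separated example above.
csv_text <- "Year,Student,Major
2009, John Doe,Statistics
2009, Bart Simpson, Mathematics I"

# Write the text to a temporary file so the example is self-contained.
csv_file <- tempfile(fileext = ".csv")
writeLines(csv_text, csv_file)

# read.csv assumes a header row and a comma separator by default.
# strip.white = TRUE drops the blanks that follow the commas.
students <- read.csv(csv_file, strip.white = TRUE)

print(students)
str(students)
```

For a tab-delimited file, the same call with read.delim in place of read.csv should behave the same way.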

Data input/output continued


The data set airquality is available in R and
gives weather measurements in New York City
over some period of time. Load that data set
into a data frame and save it to a file.

Data input/output continued


dt=airquality
write.table(dt,"Airquality.dt",col.names=T,
row.names=F,sep=" ", na="Missing")

You could also use write.csv. See the help
documentation for details.
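A short sketch of the write.csv route, under the assumption that the output goes to a temporary file (out_file is a hypothetical name). It writes airquality, reads it back, and checks that the missing values survive the round trip.

```r
dt <- airquality

# Hypothetical output location; replace with a real path as needed.
out_file <- tempfile(fileext = ".csv")

# write.csv always uses a comma separator and writes a header row.
write.csv(dt, out_file, row.names = FALSE, na = "Missing")

# Read the file back, telling R which string marks missing data.
dt_back <- read.csv(out_file, na.strings = "Missing")

dim(dt_back)               # same dimensions as airquality
sum(is.na(dt_back$Ozone))  # same number of missing Ozone values
```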

Data input/output continued


Things to keep in mind when reading from or writing
to a file:
Header: whether the file has a first row giving the
names of the variables.
Separator: which field separator is used:
space, comma, or tab.
Missing data character string: which character
strings represent missing data.
Do you want to allow R to convert character
variables to factors? Use the options stringsAsFactors
and as.is.
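The points above can be sketched on a tiny made-up data set. The text, column names, and the "Missing" marker below are assumptions for illustration; the example uses the text argument of read.table so that no file is needed.

```r
# Made-up, space-separated data with a header row and one missing value.
txt <- "name score
ann 3.5
bob Missing
cat 4.0"

# Character columns become factors when stringsAsFactors = TRUE.
d1 <- read.table(text = txt, header = TRUE, na.strings = "Missing",
                 stringsAsFactors = TRUE)

# as.is = TRUE keeps character columns as plain character vectors.
d2 <- read.table(text = txt, header = TRUE, na.strings = "Missing",
                 as.is = TRUE)

class(d1$name)   # factor
class(d2$name)   # character
is.na(d1$score)  # the "Missing" entry is read as NA
```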

Data input/output continued


The general syntax of read.table:
mydata=read.table("filename.dat",header=F, sep="",
dec=".", col.names=c("V1","V2"),na.strings="NA")

Data input/output continued


Let's try it with the file just saved.
dtNew=read.table("Airquality.dt",header=T, sep=" ",
dec=".",na.strings="Missing")

Data input/output continued


As mentioned earlier, if you have a tab-delimited
file, use the function read.delim instead. If the
file is comma-separated, use read.csv.
Another function to read text data is read.fwf,
which works with fixed-width text data. See the
user manual for more detail.
Yet another function to read data from a file is
scan. It is more efficient when reading data of a
single mode. See the user manual.
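A brief sketch of scan and read.fwf on made-up data. The file contents, the column widths, and the column names code and value are assumptions chosen only to show the calls.

```r
# scan: read a file containing values of a single mode (numbers here).
num_file <- tempfile()
writeLines(c("1 2 3", "4 5 6"), num_file)
v <- scan(num_file, quiet = TRUE)  # returns a plain numeric vector
v

# read.fwf: each line is split at fixed column widths (2 then 3 characters).
fwf_file <- tempfile()
writeLines(c("AB123", "CD456"), fwf_file)
fw <- read.fwf(fwf_file, widths = c(2, 3), col.names = c("code", "value"))
fw
```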

Data input/output
Exercise: The file Earmarksbymember08.xls is
an Excel file available in coursework. Load this
file in R.

Graphics
R has a powerful graphical capability.
To plot a graph you need a graphical device. If you launch
your plot right away, R will automatically create a
graphical device for you.
On macOS, use the function
quartz()
to create a graphical device.

On Windows systems, use
windows()
A graphical device can also be a file. Your graphs are then
sent to that file. Use the functions
pdf()
postscript()
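A minimal sketch of using a file as the graphical device. The temporary file name is an assumption; the key step is calling dev.off() to close the device, since the file is only complete once the device is closed.

```r
# Hypothetical output file for the plot.
plot_file <- tempfile(fileext = ".pdf")

# Open a PDF device; subsequent plots are drawn into the file.
pdf(plot_file, width = 6, height = 4)
boxplot(airquality$Temp, main = "Temperature")

# Close the device so the file is written out completely.
dev.off()

file.exists(plot_file)
```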

Graphics continued
Example: the airquality data set.
dt=airquality
names(dt)
boxplot(dt$Temp)
plot(dt$Temp,type="l")
plot(dt$Temp,dt$Wind,type="p")
plot(dt$Temp,dt$Wind,type="p",xlab="Temperature", ylab="Wind",
main="Wind vs Temp. in NY city May-Sept. 73")

Graphics
Continuing with the airquality dataset,
suppose we want to do a boxplot of the data
from each month.
dt$Month=as.factor(dt$Month)
boxplot(Temp ~ Month,data=dt,
names=c("May","June","July","August","Sept."))

Graphics
What if we want to have multiple graphics on
the same graphical device? There are many
ways to do this.
One simple possibility is layout.

Graphics
Example: the airquality data set.
m=matrix(c(1,2),ncol=2)
layout(m)
layout.show(2)
boxplot(dt$Temp,main="Boxplot")
plot(dt$Temp,type="l",main="Time series plot")
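Since there are several ways to place multiple graphics on one device, here is a sketch of one alternative to layout: the mfrow graphical parameter, which splits the device into a regular grid. This is offered as another option, not as the approach used in the slides; old_par is a name introduced here to restore the previous settings.

```r
dt <- airquality

# Split the device into 1 row and 2 columns; keep the old settings.
old_par <- par(mfrow = c(1, 2))

boxplot(dt$Temp, main = "Boxplot")                    # left panel
plot(dt$Temp, type = "l", main = "Time series plot")  # right panel

# Restore the previous layout (a single panel by default).
par(old_par)
```

Compared with layout, mfrow only supports equal-sized cells, so layout remains the better fit when one plot should span several cells.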

Graphics
Example: the airquality data set.
m=matrix(c(1,3,2,3),2,2)
layout(m)
layout.show(3)
boxplot(dt$Temp,main="Boxplot Temp. in NY city")
plot(dt$Temp,type="l",main="Temp. in NY city")
plot(dt$Temp,dt$Wind,type="p",xlab="Temp",
ylab="Wind",main="xyplot")

Graphics continued
What if we want to put multiple graphs on the
same plot?
After drawing the first graph, issue
par(new=T)
before the next plotting call.

Graphics continued
A few plotting functions in R:
plot(x): plot the values of vector x.
plot(x,y): bivariate plot of y as a function of x.
boxplot(x): box-and-whiskers plot.
hist(x): produce a histogram of x.
... many others. See the R manual by typing
help.start().

Graphics continued
Example:
n=10000;
X=rnorm(n);
hist(X,breaks=200,prob=T,col="blue",
xlim=c(-4,4),ylim=c(0,0.4))
par(new=T)
curve(dnorm,xlim=c(-4,4),ylim=c(0,0.4),lwd=2,col="red",
xlab="",ylab="")
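As a hedged alternative to par(new=T), many plotting functions can draw onto the existing plot directly. The sketch below overlays the normal density with curve(..., add = TRUE), which reuses the axes of the histogram so the limits do not need to be repeated. The seed and the name h are assumptions added for reproducibility and checking.

```r
set.seed(1)  # assumed seed, only to make the example reproducible
n <- 10000
X <- rnorm(n)

# Histogram on the density scale, so it is comparable to dnorm.
h <- hist(X, breaks = 200, probability = TRUE, col = "blue",
          xlim = c(-4, 4), ylim = c(0, 0.4))

# add = TRUE draws on the current plot instead of starting a new one.
curve(dnorm, from = -4, to = 4, add = TRUE, lwd = 2, col = "red")
```

With add = TRUE the two graphs share one coordinate system, which avoids the misaligned axes that par(new=T) can produce when the limits differ.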

Graphics
Example:
X=rnorm(100);
Y=rnorm(100)
m=matrix(c(1,2),ncol=2)
layout(m)
plot(X,Y)
plot(X,Y,xlab="100 Normal rvs",ylab="100 Normal rvs",
col="blue",pch=4,main="Example of plot in R")

Graphics continued
Exercise: The Californian freeway
performance measurement system. The data
is flow-occ-table.txt in coursework.
Download the file to your computer and load
it in R using read.table. Practice with the
following code.

Graphics continued
dt=read.table("flow-occ-table.txt",header=T,sep=",")
names(dt)
Ind=complete.cases(dt)
sum(Ind);
length(dt[,1])
attach(dt)
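To see what complete.cases does in the code above, here is a small sketch on a made-up data frame (the data frame d and its columns are assumptions for illustration). Comparing sum(Ind) with the number of rows shows how many rows contain missing values.

```r
# Made-up data frame with missing values in two different rows.
d <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

# TRUE for rows with no missing value in any column.
ok <- complete.cases(d)
ok

sum(ok)   # number of complete rows
nrow(d)   # total number of rows

# Keep only the complete rows.
d[ok, ]
```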

Graphics continued
m=matrix(c(1,5,2,5,3,5,4,5),ncol=4)
layout(m)

boxplot(Flow1,Flow2,Flow3,names=c("Flow1","Flow2","Flow3"),
main="Boxplots flows")
boxplot(Occ1,Occ2,Occ3,names=c("Occ1","Occ2","Occ3"),
main="Boxplots Occup.")
plot(Occ2,Flow2,type="p",col="blue",
main="Flow vs Occup. for Lane 2")
plot(Occ3,Flow3,type="p",col="red",
main="Flow vs Occup. for Lane 3")

Graphics continued
plot(Occ1,type="l",xlim=c(0,1700),
ylim=c(0,0.5),col="green")
par(new=T)
plot(Occ2,type="l",xlim=c(0,1700),
ylim=c(0,0.5),col="blue")
par(new=T)
plot(Occ3,type="l",xlim=c(0,1700),
ylim=c(0,0.5),col="red",
main="Occup. for Lane 1,2 and 3")

Graphics
legend(x="top",legend=c("Lane 1","Lane 2","Lane 3"),
col=c("green","blue","red"),lty=c(1,1,1))

ggplot2
http://cran.r-project.org/web/packages/ggplot2/index.html
Produces much nicer plots.
Install the package first in R and type
library(ggplot2)
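As a rough sketch of what a ggplot2 plot looks like, the example below redraws the wind versus temperature scatter plot from the airquality data. It assumes the package has already been installed with install.packages("ggplot2"); the guard with requireNamespace is added here only so that the code does nothing when the package is missing.

```r
# Run the example only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Map Temp to the x axis and Wind to the y axis, then draw points.
  p <- ggplot(airquality, aes(x = Temp, y = Wind)) +
    geom_point() +
    labs(x = "Temperature", y = "Wind", title = "Wind vs Temp.")

  print(p)  # printing the object draws the plot
}
```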

Control structures
So far we have learned some of the basic
aspects of R: working with its basic objects,
input/output, graphics. Here, we learn the
more general task of writing computer
programs using R.

Control structures continued


An important component of a programming
language is control structures to implement
repetitive tasks.
The R programming language has control
structures similar to those of C.

For loops
Loops are used to carry out a sequence of
related operations without having to write the
code for each step explicitly. For instance,
suppose we want to calculate:

$$\sum_{i=1}^{10} i$$

For loops continued


x=0
for (i in 1:10) {
x=x+i
}

For loops
In the above program, x is an accumulator variable,
meaning that its value is repeatedly updated while the
program runs.
Always remember to initialize accumulator variables
(to zero in the example).
To clarify, we can add a print statement inside the loop
body.

x=0
for (i in 1:10) {

x=x+i

print(c(i,x))
}
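As a quick check of the loop, the sketch below compares its result with the built-in sum function and with the closed form n(n+1)/2. The variable n is introduced here for illustration.

```r
n <- 10

# Accumulate 1 + 2 + ... + n, starting from zero.
x <- 0
for (i in 1:n) {
  x <- x + i
}

x                # result of the loop
sum(1:n)         # same value from the vectorized built-in
n * (n + 1) / 2  # same value from the closed-form formula
```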

For loops
The general structure of for loops:
for (var in seq) expr
Or
for (var in seq){
expr
}
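In the general form, seq can be any vector, not only a range of integers. The sketch below loops over a character vector directly and then over its indices with seq_along; the vector months and the accumulator labels are made-up names for illustration.

```r
months <- c("May", "June", "July")

# Loop over the elements themselves.
labels <- character(0)
for (m in months) {
  labels <- c(labels, toupper(m))
}
labels

# Loop over the positions when the index is also needed.
for (k in seq_along(months)) {
  cat(k, months[k], "\n")
}
```

seq_along is generally safer than 1:length(months), because it yields an empty sequence when the vector is empty.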

For loops continued


Exercise: Given a matrix A, write a for loop
that calculates the sum of each row of A.

For loops continued


This is an example of a trivial for loop.
There is never a need to write such loops in R
because it provides a simple class of functions
to do just that: the apply functions.
Often the apply functions even lead to
faster code (but not always).
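A sketch comparing the explicit loop with the apply approach on a small made-up matrix. The matrix A and the name row_sums are assumptions for illustration; rowSums is a base R shortcut for this particular task.

```r
# Made-up 2 x 3 matrix; its rows are (1, 3, 5) and (2, 4, 6).
A <- matrix(1:6, nrow = 2)

# Explicit loop: one accumulator slot per row, initialized to zero.
row_sums <- numeric(nrow(A))
for (i in seq_len(nrow(A))) {
  row_sums[i] <- sum(A[i, ])
}
row_sums

# apply: the second argument 1 means "apply the function to each row".
apply(A, 1, sum)

# Dedicated built-in for row sums.
rowSums(A)
```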

Next lecture
More control structures
R in Statistics (linear regression, etc.)
