Professional Documents
Culture Documents
An Introduction To R - Contents and Sample Chapter
An Introduction To R - Contents and Sample Chapter
Dr Mark Gardener
Pelagic Publishing
Contents
1 A brief introduction to R 1
2 Basic math 14
3 Introduction to R objects 22
10 Tabulation 205
Appendix 371
Index 375
1. A brief introduction to R
1.1 Getting R
You can get R from the R Project website at https://www.r-project.org where you will find the
download page (Figure 1.1).
Click on the appropriate link for your operating system and you’ll be on your way. Once R
is installed on your computer you will be able to run the program. If you use a Linux system,
R runs via the terminal. On Windows or Macintosh you will find R like any other program.
1.2 The R interface
The main interface between R and you is the R console (Figure 1.2). When you run R this is
what you’ll see first. The console displays a “welcome” message and some additional “blurb”,
then waits for you.
The console is where you will type the commands required to “drive” R. You start typing
at the > symbol. In the GUI (for Windows or Macintosh) there are some menus, which can help
make using R a little easier (e.g. Figure 1.3).
The Packages menu for example (Figure 1.4) allows you to manage additional R command
packages, which allow you to extend the (already extensive) capabilities of R.
2 | An Introduction to R: Data Analysis and Visualization
If you produce any graphical output, this will appear in a separate window.
help("sum")
4 | An Introduction to R: Data Analysis and Visualization
As a short-cut you can use ?name, where name is the name of the function (you do not need
quotes).
?mean
If you do not recall the name of the function, but you think you know a bit of it, you can try
apropos("partname"), where "partname" is the part you can recall, for example:
apropos("qq")
[1] "qqline" "qqnorm" "qqplot"
The result of apropos() is to show you all the functions that contain the "partword" you typed.
You can “force” R to open the help system in your default web browser using help.start().
This will open the starting page for the help system (Figure 1.5), regardless of the operating sys-
tem you are using. Note that you do not need to type anything in the () brackets.
help.start()
Help entries from the terminal appear in the console by default. You can scroll up and down
using the arrow keys, and quit by typing q from the keyboard.
You can set the default help-display type for your R session by using the options() function
(note that R ignores lines starting with #, which are treated as comment lines):
# reset to default
options(help _ type = NULL)
R is loaded with analytical packages, not all of them are active (see section 1.3 Extending R for
more details). You can search for help entries in any package, even if it is not “active”, by using
help.search().
help.search(pattern = "mean")
If you know the name of the package you can access help on an individual command or view
the index:
Use ?? as a short-cut for help.search(), in which case quotes are not required.
??mean
R will search through all the R packages on your computer and show you matching entries.
You can search the online R-help mailing list using the command RSiteSearch(). This
opens your web browser.
RSiteSearch("mean")
Of course you can always use the Internet; adding “R” to your query will usually provide you
with plenty of hits.
1.2.2 Command history
R records the commands you type and stores them in memory. You can access the previous
commands using the ↑ and ↓ keys. You can edit previous commands (use the mouse, arrow
keys, delete and so on) and press ENTER to run them.
You can also get a list of past commands using the history() function. This shows you
the previously typed commands. If you are using the regular Windows GUI this opens in a
new “R History” window, other GUI and systems vary in their output.
Section/title Contents
Name {package} The name of the help entry, this is usually the same as the command
name but may be more generic. The package name appears in {}, giving
the location of the entry.
Title A general title.
Description A brief description of the function(s).
Usage This shows the general syntax (what you need to type into the console).
Arguments A list of the various arguments that may be used.
Details Additional information
Value A brief note about what the result of the command is.
References References to books or papers.
See Also Links to related functions.
Examples Examples that you can copy/paste into the console.
The history() function will show you the last 25 commands by default. However, you can
alter the parameters (Table 1.2) to change the number of items and their order. You can also
show entries that contain a pattern.
A brief introduction to R | 7
Parameter Explanation
max.show The maximum number of items to display (default = 25).
reverse Set reverse = TRUE to display in reverse order. The default shows the last
command at the bottom of the list.
pattern A text pattern to match, this must be in “quotes”.
... Additional arguments can be passed to the pattern-matching.
You can save the commands you type into a file using savehistory(), use a .txt file extension
for the filename. You can also load a history file with loadhistory().
savehistory(file = ".Rhistory")
loadhistory(file = ".Rhistory")
You need to give the filename in “quotes” and include the pathway to the file. The default
filename is a “hidden” file, in your working directory. You can see what this is currently set at
using getwd(). See section 4.7.1 Working directory for more details on how to view and change
the settings.
On the Windows GUI you can load or save the history from the File menu.
1.3 Extending R
R is powerful right “out of the box” but there are many additional “packages” you can add on
to R to increase its capabilities. One of the great strengths of R is that you can customize it to
suit your requirements. Many people have already done this and published their efforts as R
packages on the Comprehensive R Archive Network (CRAN) at https://www.r-project.org.
The CRAN website gives a lot of information about the available packages, as well as access
to the packages themselves. A particularly useful section is the Task Views, which is a link from
the main CRAN page. This is a topic-led information section and gives comprehensive infor-
mation about various topics and the R packages that are “useful” for those topics.
You can add and manage R packages quite easily using R, as you will see now.
1.3.1 Command packages
R is modular, although this is not generally apparent when you are using R. When you run R a
default “set” of command packages are loaded into the memory of your computer. You can see
what packages (or libraries) are currently loaded with the search() command.
8 | An Introduction to R: Data Analysis and Visualization
search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "org:r-lib"
[10] "package:base"
The list you get will vary, depending what GUI you are running and if you’ve loaded any com-
mand packages. The example here is a “typical” minimum you get when running R from the
terminal. You can get a “list” of packages that are on your computer system and “available”
using installed.packages().
installed.packages()
The list can be very long indeed, and gives a range of information such as the version of
the package and the location on your system. If you simply want a list of the names you
can use:
rownames(installed.packages())
You can also use a menu if you are using the Windows or Macintosh GUI:
These will show you what R command packages (libraries) are available on your system.
1.3.2 Download packages
Most R command packages are on CRAN at https://www.r-project.org. If you go to the Packages
section of the site, you can see the names and get other information about the packages that are
available. You can get the packages onto your system in two main ways:
Once you know the name of a package you want you can use it in the install.packages() com-
mand like so:
install.packages("name")
In the command you give the “name” (in quotes) of the package you want to install. You will
be asked to select a mirror site for the download; choose one local to you. The download will
begin and the package will be added to your system.
If you are using a GUI you can use a menu for installation:
• Windows – Packages.
• Macintosh – Packages & Data.
Once you have the file on your system you will still need to install it using R. On Windows or
Macintosh GUI you can try the following:
This will open a file browser and allow you to select the file you downloaded. The file will be
opened and the R command package added to your system.
The file
.choo se() command only works if you are using a GUI. If you are using a termi-
nal you need to use the full filename (in quotes) with its path.
Example:
Path names for different OS
# Windows
install.packages("C:/Users/Mark/Downloads/abc.zip")
# Macintosh
install.packages("Macintosh HD/Users/markgardener/Downloads/
abc.tgz")
10 | An Introduction to R: Data Analysis and Visualization
1.3.3 Prepare a package
Once an R command package has been “installed” it is on your system. However, it is not
“active” until you tell R that you want to use it. To activate a command package use library() or
require() commands, with the name of the package (quotes not required).
library(cluster)
require(cluster) # same thing
require(cluster)
search()
[1] ".GlobalEnv" "package:cluster" "package:stats"
[4] "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads"
[10] "org:r-lib" "package:base"
Now the package is ready and the commands within it are available to use. If you are using a
GUI you can use a menu command to “load” a package:
1.3.4 Remove a package
There are many command packages available and it might not be sensible to have them all
“running” at the same time. Usually you would load only the package(s) that you need for a
specific task. Once you are done you can “tidy up” and remove the package.
The search() command allows you to see what packages are loaded. When you type
a command, R “looks” along the search path as defined in search() for the command that
you’ve typed. The first entry on the search path is usually .GlobalEnv, which means your
computer memory (think of it at your workspace). With so many command packages available
it is inevitable that there will be duplication of command names. R will “select” the first entry it
encounters on the search path. Each package you load gets to the front of the queue.
To “unload” a package use the detach() function.
detach(package:name)
You replace the name part with the name of the package to detach. Removing a package does
not un-install it, it simply removes it from the current search path. The package is still on your
system and can be loaded and unloaded as you need.
search()
[1] ".GlobalEnv" "package:cluster" "package:stats"
[4] "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads"
[10] "org:r-lib" "package:base"
detach(package:cluster)
search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "org:r-lib"
[10] "package:base"
A brief introduction to R | 11
If you are using a Macintosh GUI you can use a menu to detach a package: Packages & Data
> Package Manager. On Windows you must use the detach() command but see section 1.4
Alternative ways to run R.
1.3.5 Updating packages
From time to time the R command packages are updated. New features are added and bugs
ironed out. You can check for newer versions of packages using old.packages(), which gives
a (potentially long) list of packages that have suitable newer versions.
More usually you’ll want to check and update; use the update.packages() command
to do this.
There are quite a lot of potential parameters but ask is the most useful. The default ask =
TRUE will let you choose package-by-package which to update. If you set ask = "graphics"
you’ll get an interactive selector. Setting ask = FALSE will attempt to update all your packages
without asking any further.
If you use a GUI you can also use a menu to help update your packages.
You can always run R directly, that is from a terminal or DOS window (Table 1.3), without using
a GUI. However, this is not usually convenient!
Most of the time you’ll want to use a GUI. The concept of an Integrated Development
Environment is becoming increasingly popular. An IDE is a “wrapper” for R (or other pro-
gram) that helps you manage your work-flow more easily. A good IDE will help you get
the most out of R by making “housekeeping” tasks more straightforward (such as manag-
ing files and add-on packages). You can then focus on working with R.
OS Terminal application
Linux Open the regular terminal and type R
Macintosh Open the Terminal.app or similar (e.g. iTerm) and type R
Windows Look for the R.exe file in "C:\Program Files\R\R-xxx\bin", where
R-xxx is a version of R. Then run the R.exe file.
12 | An Introduction to R: Data Analysis and Visualization
There are several IDE around and a quick Internet search (e.g. “IDE for R”) will reveal
the options. The most widely used is called RStudio (https://www.RStudio.com).
The panes display different kinds of information, some of the panes contain multiple items
as tabs. The panes are:
Console
The main console is where “R lives” and where commands are entered and basic output gener-
ated. The left side of RStudio contains the console pane (but you can alter the configuration). In
the default settings the pane above the Console holds the Script Editor window.
Data/History/Files
The top-right pane is generally configured to show some “housekeeping” items:
• Environment: Shows you the “contents” of various environments. The default is the console
itself, that is, the data currently in memory.
• History: Previous commands you’ve typed.
• Files: Access your computer file system.
Plots/Help/Packages/Help
The bottom-right pane contains some more “housekeeping” items, including:
Customizing RStudio
You can control the layout of the panes and various other options from the Tools Menu. Tools
> Global Options…. You can alter the layout, colours and contents of the various panes to suit
your needs.
EXERCISES:
Self-Assessment for Introduction to R
1. R is Open Source… TRUE or FALSE?
2. R is only available for Windows operating systems… TRUE or FALSE?
3. To open the help system in a web browser type ____ from the console.
4. To add a command package use the ____ function.
5. You can scroll back through previously typed commands by:
a. Using history() command
b. Using ↑ key
c. Using loadhistory() command
d. Using control & - keys
Answers to exercises are in the Appendix.