
Project 01 - Weighted Dice

Computers give you scientific superpowers!!

As a data scientist who knows how to program, you will improve your ability to:
• Memorize (store) entire data sets
• Recall data values on demand
• Perform complex calculations with large amounts of data
• Do repetitive tasks without becoming careless or bored

Computers can do all of these things quickly and error free, which lets your mind do the things it does well: make decisions and assign meaning.
Objectives of this project
• Use the R and RStudio interfaces
• Run R commands
• Create R objects
• Write your own R functions and scripts
• Load and use R packages
• Generate random samples
• Create quick plots
• Get help when you need it


The Very Basics
• The R User Interface
• Objects
• Functions
• Sample with Replacement
• Writing Your Own Functions
• The Function Constructor
• Arguments
• Scripts
The R User Interface
RStudio helps you communicate with the computer through the R programming language.

The RStudio interface is simple. You type R code into the bottom line of the RStudio console pane and then press Enter to run it. The code you type is called a command, because it will command your computer to do something for you. The line you type it into is called the command line.
Brief
• [1] - R is just letting you know that this line begins with the first value in your result.
> 1 + 1
[1] 2
>

• If you type a command that R doesn’t recognize, R will return an error message. If you ever see an error message
• The > at the start of a line is R’s prompt; it is not part of the code you type
• If you type an incomplete command and press Enter, R will display a "+" prompt, which means R is waiting for you to type the rest of your command. Either finish the command or hit Escape to start over.
> 5 -
+
+ 1
[1] 4
Objects
• The ‘:’ operator returns its results as a vector, a one-dimensional set of numbers
1:6
>> 1 2 3 4 5 6
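A short sketch of what the ‘:’ operator produces. The name die matches the later slides; the checks with is.vector() and length(), and the countdown example, are additions made here for illustration.

```r
# Build the integers 1 through 6 with the ':' operator
die <- 1:6

# The result is a one-dimensional vector of six values
is.vector(die)   # TRUE
length(die)      # 6

# ':' also counts downward when the first number is larger
countdown <- 6:1
countdown        # 6 5 4 3 2 1
```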
• R lets you save data by storing it inside an R object. What’s an object? Just a name that you can use to call up stored data.
a <- 1
a
>> 1
a + 2
>> 3
• When you create an object, it appears in the Environment pane of RStudio
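The same idea can be checked from the console instead of by looking at the RStudio panes. This is a minimal sketch; ls() and exists() are base R functions that the slides do not mention.

```r
# Store a value under the name 'a' (assignment itself prints nothing)
a <- 1

# Typing the name recalls the stored value
a          # 1
a + 2      # 3

# ls() lists the names of the objects in the current environment
ls()

# exists() checks whether a single name is defined
exists("a")   # TRUE
```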
Capitalization
• R is case-sensitive, so ‘name’ and ‘Name’ are two different objects
Name <- 1
name <- 0
Name + 1
>> 2
• R will overwrite any previous information stored in an object without asking you for permission.
my_number <- 1
my_number
>> 1
my_number <- 999
my_number
>> 999
R Multiplication
• When you use two or more vectors in an operation, R will line up the vectors and perform a sequence of individual operations.
• R does not always follow the rules of matrix multiplication. Instead, R uses element-wise execution.
die <- 1:6
die - 1
>> 0 1 2 3 4 5

die / 2
>> 0.5 1.0 1.5 2.0 2.5 3.0

die * die
>> 1 4 9 16 25 36
If you give R two vectors of unequal lengths, R will repeat the shorter vector until it is as long as the longer vector, and then do the math.

This isn’t a permanent change: the shorter vector will be its original size after R does the math. If the length of the short vector does not divide evenly into the length of the long vector, R will return a warning message. This behavior is known as vector recycling, and it helps R do element-wise operations:
1:2
>> 1 2

1:4
>> 1 2 3 4

die
>> 1 2 3 4 5 6

die + 1:2
>> 2 4 4 6 6 8

die + 1:4
>> 2 4 6 8 6 8
Warning message:
In die + 1:4 :
longer object length is not a multiple of
shorter object length
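One way to see what recycling does is to build the repeated vector by hand and compare the results. This is a sketch only; rep_len() is a base R helper that the slides do not use, and the names short and stretched are made up for the example.

```r
die <- 1:6

# Recycling repeats the shorter vector until it matches the longer one.
# rep_len() builds that repeated vector explicitly.
short <- 1:2
stretched <- rep_len(short, length.out = length(die))
stretched                       # 1 2 1 2 1 2

# Both forms give the same sums
recycled <- die + short
explicit <- die + stretched
identical(recycled, explicit)   # TRUE

# The shorter vector itself is unchanged afterwards
length(short)                   # 2
```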
Plus
• Element-wise operations are a very useful feature in R because they manipulate groups of values in an orderly way. When you start working with data sets, element-wise operations will ensure that values from one observation or case are only paired with values from the same observation or case. Element-wise operations also make it easier to write your own programs and functions in R.
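A small illustration of how element-wise operations keep each observation's values together. The data here are hypothetical (three made-up orders) and are not part of the project.

```r
# Hypothetical data: three orders, one value per order in each vector
price    <- c(2.50, 4.00, 1.25)
quantity <- c(4, 1, 8)

# Element-wise multiplication pairs the first price with the first
# quantity, the second with the second, and so on
order_total <- price * quantity
order_total        # 10 4 10

# The per-order totals can then be added up
sum(order_total)   # 24
```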
Traditional Matrix Multiplication
• You can do inner multiplication with the %*% operator and outer multiplication with the %o% (“o-letter”) operator
die %*% die
>> 91

die %o% die
>>      [,1] [,2] [,3] [,4] [,5] [,6]
>> [1,]    1    2    3    4    5    6
>> [2,]    2    4    6    8   10   12
>> [3,]    3    6    9   12   15   18
>> [4,]    4    8   12   16   20   24
>> [5,]    5   10   15   20   25   30
>> [6,]    6   12   18   24   30   36
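The two operators can be related back to element-wise arithmetic. The sketch below assumes die is 1:6 as in the rest of the project; the names inner and table6 are made up for the example.

```r
die <- 1:6

# Inner product: multiply element-wise, then add everything up
inner <- die %*% die
sum(die * die)                        # 91
as.numeric(inner) == sum(die * die)   # TRUE

# %*% returns a 1 x 1 matrix rather than a bare number
dim(inner)                            # 1 1

# Outer product: every pairing of elements; %o% is shorthand for outer()
table6 <- die %o% die
dim(table6)                           # 6 6
all(table6 == outer(die, die))        # TRUE
table6[2, 3]                          # 6, that is 2 * 3
```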
Function
• R comes with many functions that you can use to do sophisticated tasks like random sampling. For example, you can round a number with the round function, or calculate its factorial with the factorial function. Using a function is pretty simple. Just write the name of the function and then the data you want the function to operate on in parentheses:
round(3.1415)
>> 3

factorial(3)
>> 6
Calculating style of R
mean(1:6)
>> 3.5

mean(die)
>> 3.5

round(mean(die))
>> 4
• The data that you pass into the function is called the function’s argument
• The argument can be raw data, an R object, or even the results of another R function
• R will work from the innermost function to the outermost
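To make the innermost-to-outermost order concrete, a nested call can be unrolled into separate steps. This is only a sketch; the names nested, step1 and step2 are made up for the example.

```r
die <- 1:6

# Nested call: R evaluates mean(die) first, then rounds the result
nested <- round(mean(die))

# The same work written one step at a time
step1 <- mean(die)         # 3.5
step2 <- round(step1)      # 4
identical(nested, step2)   # TRUE
```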


Simulating a roll of the die with R’s sample function.
• To roll your die and get a number back, set x to die and sample one element from it. You’ll get a new (maybe different) number each time you roll it:
sample(x = die, size = 1)
>> 2

sample(x = die, size = 1)
>> 1

sample(x = die, size = 1)
>> 6
• Using names is optional. You will notice that R users do not often use the name of the first argument in a function.
sample(die, size = 1)
>> 2
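Because each roll is random, the numbers you see will differ from run to run. If a repeatable example is needed, one option is set.seed(), a base R function that the slides do not cover; the seed value 42 below is arbitrary.

```r
die <- 1:6

# set.seed() fixes the random number generator so the same
# "random" roll comes back each time the script is run
set.seed(42)
first_roll <- sample(x = die, size = 1)

set.seed(42)
second_roll <- sample(die, size = 1)   # same call, first argument unnamed

# Same seed, same roll; the value is always one of the six faces
first_roll == second_roll   # TRUE
first_roll %in% die         # TRUE
```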
Stuck with a Function
• If you are not sure which arguments a function takes, look them up with args:
args(round)
>> function (x, digits = 0)
>> NULL
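Once args shows that round has a digits argument, you can try it out. A minimal sketch follows; the exact text that args(round) prints may differ slightly between R versions, and the name two_places is made up for the example.

```r
# args() shows that round() takes x and an optional digits argument
args(round)

# digits defaults to 0, so these two calls are equivalent
round(3.1415)               # 3
round(3.1415, digits = 0)   # 3

# Supplying digits keeps that many decimal places
two_places <- round(3.1415, digits = 2)
two_places                  # 3.14
```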
Sample with Replacement
• The argument replace = TRUE causes sample to sample with replacement. Our jar example provides a good way to understand the difference between sampling with replacement and without. When sample uses replacement, it draws a value from the jar and records the value. Then it puts the value back into the jar. In other words, sample replaces each value after each draw. As a result, sample may select the same value on the second draw. Each value has a chance of being selected each time. It is as if every draw were the first draw.
sample(die, size = 2, replace = TRUE)
>> 2 4
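The difference between the two modes shows up clearly when you draw as many values as the jar holds, or more. This is a sketch under the assumption that die is 1:6; the seed is arbitrary, and tryCatch() is used only to keep the script running after the expected error.

```r
die <- 1:6
set.seed(1)   # arbitrary seed, only to make the run repeatable

# Without replacement (the default), drawing all six values
# returns each face exactly once, in shuffled order
shuffled <- sample(die, size = 6)
sort(shuffled)             # 1 2 3 4 5 6
anyDuplicated(shuffled)    # 0, meaning no repeats

# Without replacement you cannot draw more values than the jar holds
too_many <- tryCatch(sample(die, size = 7), error = function(e) "error")
too_many                   # "error"

# With replacement, seven draws are fine, and with only six faces
# at least one value has to repeat
seven <- sample(die, size = 7, replace = TRUE)
length(seven)              # 7
anyDuplicated(seven) > 0   # TRUE
```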
• You can feed your result straight into the sum function:
dice <- sample(die, size = 2, replace = TRUE)
dice
>> 2 4

sum(dice)
>> 6
dice multiple times
dice
>> 2 4

dice
>> 2 4

dice
>> 2 4
• Each time you call dice, R will show you the result of that one time you called sample and saved the output to dice. R won’t rerun sample(die, 2, replace = TRUE) to create a new roll of the dice. This is a relief in a way. Once you save a set of results to an R object, those results do not change. Programming would be quite hard if the values of your objects changed each time you called them.
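A quick check of this behaviour, written as a sketch. The names first_look, second_look and new_dice are made up for the example.

```r
die <- 1:6

# Roll once and save the result
dice <- sample(die, size = 2, replace = TRUE)

# Reading the object back never triggers a new roll
first_look  <- dice
second_look <- dice
identical(first_look, second_look)   # TRUE

# To get a fresh roll you must call sample() again
new_dice <- sample(die, size = 2, replace = TRUE)
new_dice
```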
Writing Your Own Functions
• We’re going to write a function named roll that you can use to roll your virtual dice
• Each time you call roll(), R will return the sum of rolling two dice
roll()
>> 8

roll()
>> 3

roll()
>> 7
The Function Constructor
Every function in R has three basic parts:
• a name
• a body of code
• a set of arguments
To Store Function Code
• Build your own function with the function function. To do this, call function() and follow it with a pair of braces, {}:
my_function <- function() {}
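The three parts of a function can be inspected directly. The sketch below uses a hypothetical function, add_one, which is not part of the project; formals() and body() are base R functions that the slides do not mention.

```r
# A hypothetical function, used only to show the three parts
add_one <- function(x) {
  x + 1
}

# 1. The name is the object the function is stored in: add_one
# 2. The arguments are listed by formals()
names(formals(add_one))   # "x"
# 3. The body is the code between the braces
body(add_one)

# Typing the name followed by parentheses runs the function
add_one(5)                # 6
```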
Example
roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}

You can think of the parentheses as the “trigger” that causes R to run the function
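Once roll exists, it can be called many times to see which sums come up. This is a sketch only: replicate() and table() are base R functions not used in these slides, the seed is arbitrary, and the function is defined again so the example runs on its own.

```r
roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)   # the last line of the body is the value returned
}

# replicate() calls roll() repeatedly and collects the sums
set.seed(123)                 # arbitrary seed for a repeatable run
rolls <- replicate(1000, roll())

length(rolls)                 # 1000
range(rolls)                  # every sum lies between 2 and 12
table(rolls)                  # how often each sum came up
```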
Arguments
