Professional Documents
Culture Documents
R - Lecture #2
R - Lecture #2
R
Lecture #2
Matrices
A matrix is a 2-dimensional array of data elements of the same type.
In statistics, matrices are useful as tables that hold data.
You can have a matrix of numbers:
Creating Matrices:
You want to combine the eye color vector in coded form, the eye color vector in
factor form, and the empathy score vector into one collection named
eyes_and_empathy.
You can use the list() function :
eyes_and_empathy <- list(eyes_code=eye_color, eyes=feye_color, empathy=empathy_score)
eyes_and_empathy
$eyes_code
[1] 2 2 4 1 5 5 5 6 1 3 6 3 1 4
$eyes
[1] blue blue gray amber green green green hazel amber brown hazel brown amber gray
Levels: amber blue brown gray green hazel
$empathy
[1] 15 21 45 32 61 74 53 92 83 22 67 55 42 44
Lists
R uses the dollar sign ($) to indicate each component of the
list. So, if you want to refer to a list component, you type the
name of the list, the dollar sign, and the component-name:
eyes_and_empathy$empathy
[1] 15 21 45 32 61 74 53 92 83 22 67 55 42 44
eyes_and_empathy$empathy [4]
[1] 32
Lists and Statistics
Lists are important because numerous statistical functions
return lists of objects.
One statistical function is t.test()
We use this test to see if the mean of the empathy scores differs
from an arbitrary number — 30, for example.
data: eyes_and_empathy$empathy
t = 3.2549, df = 13, p-value = 0.006269
alternative hypothesis: true mean is not equal to 30
95 percent confidence interval:
36.86936 63.98778
sample estimates:
mean of x
50.42857
Lists and Statistics
Note that this output, t.result, is a list.
To show this, you use $ to focus on some of the components:
> t.result$data.name
[1] "eyes_and_empathy$empathy”
> t.result$p.value
[1] 0.006269396
> t.result$statistic
t
3.254853
Data Frames
A data frame contains different data types instead of only one, like matrices and
vectors.
When you think of data for a group of individuals (14 in our previous example)
you typically think in terms of columns that represent the data variables (like
eyes_code, eyes, and empathy) and rows that represent the individuals that’s a
data frame.
> e <- data.frame(eye_color,feye_color,empathy_score)
>e
eye_color feye_color empathy_score
1 2 blue 15
2 2 blue 21
3 4 gray 45
4 1 amber 32
5 5 green 61
6 5 green 74
7 5 green 53
8 6 hazel 92
9 1 amber 83
10 3 brown 22
11 6 hazel 67
12 3 brown 55
13 1 amber 42
14 4 gray 44
Data Frames
Empathy score for the seventh person?
> e[7,3]
[1] 53
First, extract the empathy scores for each eye color and create
vectors:
> e.blue <- e$empathy_score[e$feye_color=="blue"]
> e.green <- e$empathy_score[e$feye_color=="green"]
> e.hazel <- e$empathy_score[e$feye_color=="hazel"]
You should run str() first when you get a new data set.
Selecting elements of a data frame:
Displaying column 3
> anorexia[,3]
[1] 80.2 80.1 86.4 86.3 76.1 78.1 75.1 86.7 73.5 84.6 77.4 79.5 89.6 81.4
81.8 77.3 84.2 75.4 79.5 73.0
[21] 88.3 84.7 81.4 81.2 88.2 78.8 82.2 85.6 81.4 81.9 76.4 103.6 98.4 93.4
73.4 82.1 96.7 95.3 82.4 72.5
[41] 90.9 71.3 85.4 81.6 89.1 83.9 82.7 75.7 82.6 100.4 85.2 83.6 84.6 96.2
86.7 95.2 94.3 91.5 91.9 100.3
[61] 76.7 76.8 101.6 94.9 75.2 77.8 95.5 90.7 92.5 93.8 91.7 98.0