Lab 5 - Shell
2023-02-17
Learning Objectives
By the end of this lab, you should have a grasp on the following concepts:
Instructions
To complete this worksheet, add code as needed into the R code chunks given below. Do not delete the
question text. All text should be in complete English sentences. Be sure to change the author of this file to
reflect your name and student number.
To properly see the questions, knit this .Rmd file to .pdf and view the output. You will have a link in your
email that takes you to the Crowdmark submission page. Once you have completed the worksheet, knit it
to .pdf and upload your output to Crowdmark.
Exercises
Load in the WeightLoss dataset. This dataset contains the results of a study of 40 individuals who
participated in a brief six-week weight loss study. Their weight (in kg) was measured at the beginning and
the end of the study, and we wish to determine whether there is a statistically significant weight loss.
Make a side-by-side boxplot, comparing the weights at the baseline to the weights at the end of the study.
boxplot(WeightLoss$Baseline, WeightLoss$End)
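[Figure: side-by-side boxplots of the baseline weights (box 1) and the end-of-study weights (box 2); vertical axis from about 90 to 130.]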
Make a vector containing the differences between the baseline and the end weights.
differences <- WeightLoss$End - WeightLoss$Baseline
mean(differences)
## [1] -2.7425
Use the t.test function to run a matched pairs test to determine whether there is a statistically significant
weight loss over the course of this study.
t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE, alternative = "less")
##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0002219
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
## -Inf -1.538383
## sample estimates:
## mean difference
## -2.7425
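The t statistic reported by a paired test is the mean difference divided by its standard error, compared against a t distribution with n - 1 degrees of freedom. The sketch below recomputes it by hand on a hypothetical vector `d` of made-up differences (not the WeightLoss data) and checks it against `t.test`; the names `d`, `t_obs`, `p_low`, and `check` are illustrative only.

```r
# Hypothetical differences (End - Baseline) in kg; made-up values, not the WeightLoss data
d <- c(-2.3, -1.1, -3.4, 0.5, -3.7, -2.3, -0.8, -1.9)

n     <- length(d)
t_obs <- mean(d) / (sd(d) / sqrt(n))  # t = mean difference / (sd of differences / sqrt(n))
p_low <- pt(t_obs, df = n - 1)        # lower-tail p-value, matching alternative = "less"

# The same test run through t.test on the differences
check <- t.test(d, mu = 0, alternative = "less")

print(t_obs)
print(p_low)
```

With the real data, `d` would be the `differences` vector built earlier and n would be 40, which is where df = 39 in the output comes from.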
Generate a 95% confidence interval for the true mean weight loss.
##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0004437
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -4.188041 -1.296959
## sample estimates:
## mean difference
## -2.7425
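The call that produced this output is not shown in the knitted document. The output is consistent with the same paired test run with the default two-sided alternative, which is what yields a two-sided interval rather than a one-sided bound. A minimal sketch, assuming hypothetical vectors `baseline` and `end` of made-up weights (not the WeightLoss data):

```r
# Hypothetical weights in kg for six participants (made-up values, not the WeightLoss data)
baseline <- c(102.4, 95.1, 110.3, 88.7, 120.5, 99.8)
end      <- c(100.1, 94.0, 106.9, 89.2, 116.8, 97.5)

# Two-sided paired test: leaving `alternative` at its default ("two.sided")
# gives a two-sided confidence interval for the mean of (end - baseline)
paired_res <- t.test(end, baseline, paired = TRUE, conf.level = 0.95)
paired_ci  <- paired_res$conf.int

# Equivalent one-sample test on the vector of differences
diff_res <- t.test(end - baseline, mu = 0)
diff_ci  <- diff_res$conf.int

print(paired_ci)
print(diff_ci)
```

On the lab data, the corresponding call would presumably be `t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE)`. Note that the two-sided p-value in the output above is twice the one-sided p-value from the previous test.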
Load in the React150.csv dataset. This dataset contains various measurements on a sample of 150 Grade
12 students across the United States, including their Gender, Handedness, Height, Foot Length, Armspan,
and Reaction Time.
Create a side-by-side boxplot to compare the heights of the Males and the Females in this sample, using the
tilde (~) notation. Use the data argument to make your code more concise.
boxplot(Height ~ Gender, data = React150)
[Figure: side-by-side boxplots of Height by Gender (Female, Male); vertical axis from about 150 to 190.]
Use the aggregate function to calculate the mean height for each gender.
## Gender Height
## 1 Female 165.4133
## 2 Male 176.9867
Use the aggregate function to compare the standard deviation of heights between the Males and Females
in this sample.
## Gender Height
## 1 Female 7.697127
## 2 Male 6.882201
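The `aggregate` calls behind the two tables above are hidden in the knitted output. One plausible form uses the formula interface with `FUN = mean` and `FUN = sd`. The sketch below runs on a small hypothetical data frame `heights` with made-up values standing in for React150:

```r
# Hypothetical stand-in for React150 (made-up heights, not the real data)
heights <- data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Male", "Male"),
  Height = c(160, 165, 170, 172, 178, 184)
)

# Mean height within each gender
mean_by_gender <- aggregate(Height ~ Gender, data = heights, FUN = mean)

# Standard deviation of height within each gender
sd_by_gender <- aggregate(Height ~ Gender, data = heights, FUN = sd)

print(mean_by_gender)
print(sd_by_gender)
```

On the lab data, the same pattern would read `aggregate(Height ~ Gender, data = React150, FUN = mean)`, with `FUN = sd` for the second table.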
Use t.test to determine whether there is a statistically significant difference in the mean heights between
the Males and Females in this sample.
t.test(Height ~ Gender, alternative = "less",
var.equal = TRUE, data = React150)
##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is less than 0
## 95 percent confidence interval:
## -Inf -9.599895
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867
Use t.test to generate a confidence interval for the true mean difference in average heights (Female - Male).
##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -13.929376 -9.217291
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867
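The call for this interval is also hidden. The output matches a pooled two-sample test with the default two-sided alternative. A minimal sketch on a hypothetical data frame `heights` (made-up values, not React150) shows how the interval and the degrees of freedom can be read off the result:

```r
# Hypothetical stand-in for React150 (made-up heights, not the real data)
heights <- data.frame(
  Gender = rep(c("Female", "Male"), each = 5),
  Height = c(158, 162, 165, 168, 172, 170, 174, 177, 181, 185)
)

# Pooled (equal-variance) two-sample test; the default two-sided alternative
# yields a two-sided interval for mean(Female) - mean(Male), because the
# groups are ordered alphabetically
pooled_res <- t.test(Height ~ Gender, data = heights,
                     var.equal = TRUE, conf.level = 0.95)

pooled_ci <- pooled_res$conf.int   # lower and upper limits
pooled_df <- pooled_res$parameter  # degrees of freedom: n1 + n2 - 2

print(pooled_ci)
print(pooled_df)
```

On the lab data, this would presumably be `t.test(Height ~ Gender, var.equal = TRUE, data = React150)`, where df = 148 corresponds to 150 - 2.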
Below we will conduct a two-sample t-test on the React150 dataset to test, at the 1% level of significance,
whether left-handed people have a different mean reaction time than right-handed people.
Exercise: Make a side-by-side boxplot, comparing the reaction times of left-handed and right-handed people.
[Figure: side-by-side boxplots of React by Hand (Left-Handed, Right-Handed); vertical axis from about 0.2 to 0.8.]
## Hand React
## 1 Left-Handed 0.05222913
## 2 Right-Handed 0.15896866
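The table above appears to list the standard deviation of reaction time in each group, since its values are much smaller than the group means reported by the test. The two values differ by roughly a factor of three, which is presumably why the test that follows sets `var.equal = FALSE`. The sketch below reproduces the plot and the comparison on a hypothetical data frame `reaction` of made-up values (not React150); `sd_ratio` is an illustrative name:

```r
# Hypothetical stand-in for React150 (made-up reaction times, not the real data)
reaction <- data.frame(
  Hand  = rep(c("Left-Handed", "Right-Handed"), each = 4),
  React = c(0.30, 0.32, 0.34, 0.36, 0.25, 0.40, 0.45, 0.60)
)

# Side-by-side boxplot of reaction time by handedness
boxplot(React ~ Hand, data = reaction)

# Standard deviation of reaction time within each group
sd_by_hand <- aggregate(React ~ Hand, data = reaction, FUN = sd)

# Ratio of the larger to the smaller standard deviation
sd_ratio <- max(sd_by_hand$React) / min(sd_by_hand$React)

print(sd_by_hand)
print(sd_ratio)
```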
$H_0: \mu_1 = \mu_2$ vs $H_a: \mu_1 \neq \mu_2$
t.test(React ~ Hand,
alternative="two.sided" , var.equal = FALSE,
data = React150, conf.level=0.99)
##
## Welch Two Sample t-test
##
## data: React by Hand
## t = -5.1436, df = 86.018, p-value = 1.674e-06
## alternative hypothesis: true difference in means between group Left-Handed and group Right-Handed is not equal to 0
## 99 percent confidence interval:
## -0.14145962 -0.04564038
## sample estimates:
## mean in group Left-Handed mean in group Right-Handed
## 0.33475 0.42830
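To finish a test at the 1% level, either compare the p-value with 0.01 or check whether the 99% confidence interval contains 0. For a two-sided t-test the two rules agree. A minimal sketch on a hypothetical data frame `reaction` (made-up values, not React150); the names `alpha`, `reject_by_p`, and `reject_by_ci` are illustrative:

```r
# Hypothetical stand-in for React150 (made-up reaction times, not the real data)
reaction <- data.frame(
  Hand  = rep(c("Left-Handed", "Right-Handed"), each = 5),
  React = c(0.31, 0.33, 0.34, 0.35, 0.37, 0.30, 0.38, 0.45, 0.52, 0.60)
)

alpha <- 0.01  # significance level stated in the lab

# Welch test with a confidence level matching alpha
welch_res <- t.test(React ~ Hand, data = reaction,
                    alternative = "two.sided", var.equal = FALSE,
                    conf.level = 1 - alpha)

# Decision rule 1: reject H0 when the p-value is below alpha
reject_by_p <- welch_res$p.value < alpha

# Decision rule 2: reject H0 when the interval excludes 0
ci <- welch_res$conf.int
reject_by_ci <- ci[1] > 0 || ci[2] < 0

print(welch_res$p.value)
print(reject_by_p)
print(reject_by_ci)
```

In the lab output above, the p-value of 1.674e-06 is below 0.01 and the 99% interval (-0.141, -0.046) excludes 0, so both rules point to rejecting H0.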