Lab 5 - Shell

Lab 5 - Two-Sample t-Testing
Mansi Kumari (007908159)
2023-02-17
Learning Objectives
By the end of this lab, you should have a grasp of the following concepts:
• How to create a side-by-side boxplot in R
• How to make a stripchart in R
• How to make a scatterplot with a y = x line through it in R
• How to perform a matched-pairs t-test in R
• How to use tilde notation in R
• How to use aggregate in R
• How to perform a two-sample t-test in R
• How to calculate a confidence interval for a two-sample t-test in R
Instructions
To complete this worksheet, add code as needed into the R code chunks given below. Do not delete the
question text. All text should be in complete English sentences. Be sure to change the author of this file to
reflect your name and student number.
To properly see the questions, knit this .Rmd file to .pdf and view the output. You will have a link in your
email that takes you to the Crowdmark submission page. Once you have completed the worksheet, knit it
to .pdf and upload your output to Crowdmark.
Exercises
Load in the WeightLoss dataset. This dataset contains the results of a study of 40 individuals who participated
in a brief six-week weight loss study. Their weight (in kg) was measured at the beginning and the end
of the study, and we wish to determine whether there is a statistically significant weight loss.
WeightLoss <- read.csv("~/Downloads/WeightLoss.csv")
Make a side-by-side boxplot, comparing the weights at the baseline to the weights at the end of the study.
boxplot(WeightLoss$Baseline, WeightLoss$End)
[Figure: side-by-side boxplots of weight (kg), axis from 90 to 130; box 1 is Baseline, box 2 is End.]
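The learning objectives also list a stripchart and a scatterplot with a y = x line, both of which suit matched-pairs data like this. The following is a minimal sketch on a small made-up data frame; `demo` and its values are hypothetical and are not the WeightLoss data.

```r
# Hypothetical paired weights (kg); NOT the WeightLoss data.
demo <- data.frame(
  Baseline = c(100, 95, 110, 120, 105),
  End      = c(97, 96, 108, 115, 105)
)

# Side-by-side boxplot with readable group labels instead of 1 and 2.
boxplot(demo$Baseline, demo$End, names = c("Baseline", "End"),
        ylab = "Weight (kg)")

# Stripchart of the same two groups, one row of points per group.
stripchart(list(Baseline = demo$Baseline, End = demo$End),
           method = "jitter", xlab = "Weight (kg)")

# Scatterplot of End against Baseline with the line y = x.
# Points below the line are people whose weight went down.
plot(demo$Baseline, demo$End, xlab = "Baseline (kg)", ylab = "End (kg)")
abline(a = 0, b = 1, lty = 2)

# Number of points strictly below the y = x line.
n_below <- sum(demo$End < demo$Baseline)
n_below
```

The `names` argument is what replaces the default box labels 1 and 2 with the group names.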
Make a vector containing the differences between the baseline and the end weights.
differences <- WeightLoss$End - WeightLoss$Baseline
Calculate the mean of the differences.
mean(differences)
## [1] -2.7425
Use the t.test function to run a matched pairs test to determine whether there is a statistically significant
weight loss over the course of this study.
t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE, alternative = "less")
##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0002219
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
## -Inf -1.538383
## sample estimates:
## mean difference
## -2.7425
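A matched-pairs t-test is a one-sample t-test on the differences, which is why the mean of the differences reappears as the sample estimate. The sketch below illustrates this on made-up numbers; the vectors `before` and `after` are hypothetical and are not the WeightLoss data.

```r
# Hypothetical paired measurements; NOT the WeightLoss data.
before <- c(100, 95, 110, 120, 105, 98)
after  <- c(97, 96, 108, 115, 105, 95)

d <- after - before          # differences in End - Baseline order
n <- length(d)

# Test statistic by hand: mean difference over its standard error.
t_by_hand <- mean(d) / (sd(d) / sqrt(n))

# One-sided p-value for H_a: true mean difference < 0.
p_by_hand <- pt(t_by_hand, df = n - 1)

# The same test through t.test, written two equivalent ways.
paired_fit     <- t.test(after, before, paired = TRUE, alternative = "less")
one_sample_fit <- t.test(d, mu = 0, alternative = "less")

c(by_hand    = t_by_hand,
  paired     = unname(paired_fit$statistic),
  one_sample = unname(one_sample_fit$statistic))
```

All three entries of the final vector should agree, since each one is the same statistic computed a different way.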
Generate a 95% confidence interval for the true mean weight loss.
t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE, alternative = "two.sided")
##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0004437
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -4.188041 -1.296959
## sample estimates:
## mean difference
## -2.7425
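The interval can be checked by hand from the numbers already printed, using estimate ± critical value × standard error. The sketch below recovers the standard error from the rounded t statistic, so it should agree with the t.test interval only to a few decimal places.

```r
# Values copied from the paired t-test output above.
mean_diff <- -2.7425
t_stat    <- -3.8375
df        <- 39

# Standard error recovered from t = mean_diff / se.
se <- mean_diff / t_stat

# Two-sided 95% interval: estimate +/- critical value * se.
t_crit <- qt(0.975, df = df)
ci <- mean_diff + c(-1, 1) * t_crit * se
ci
```

Because the differences are End - Baseline, a wholly negative interval corresponds to a weight loss of roughly 1.30 kg to 4.19 kg.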
Load in the React150.csv dataset. This dataset contains various measurements on a sample of 150 Grade
12 students across the United States, including their Gender, Handedness, Height, Foot Length, Armspan,
and Reaction Time.
React150 <- read.csv("~/Downloads/React150.csv")
Create a side-by-side boxplot to compare the heights of the Males and the Females in this sample, using the
tilde (~) notation. Use the data argument to make your code more concise.
boxplot(Height ~ Gender, data = React150)
[Figure: side-by-side boxplots of Height by Gender (Female, Male); height axis from 150 to 190.]
Use the aggregate function to calculate the mean height for each gender.
aggregate(Height ~ Gender, FUN = mean, data = React150)
## Gender Height
## 1 Female 165.4133
## 2 Male 176.9867
Use the aggregate function to compare the standard deviation of heights between the Males and Females
in this sample.
aggregate(Height ~ Gender, FUN = sd, data = React150)
## Gender Height
## 1 Female 7.697127
## 2 Male 6.882201
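To see what the formula `Height ~ Gender` does inside aggregate, it can help to run it on a data frame small enough to check by eye. This is a sketch on made-up values; `demo` is hypothetical and is not the React150 data.

```r
# Hypothetical data frame; NOT the React150 data.
demo <- data.frame(
  Gender = c("Female", "Female", "Female", "Male", "Male", "Male"),
  Height = c(160, 165, 170, 172, 178, 184)
)

# Formula interface: response ~ grouping variable.
group_means <- aggregate(Height ~ Gender, FUN = mean, data = demo)
group_sds   <- aggregate(Height ~ Gender, FUN = sd, data = demo)

# tapply computes the same group means, returned as a named vector.
tapply(demo$Height, demo$Gender, mean)

group_means
group_sds
```

aggregate returns a data frame with one row per group, while tapply returns a named vector; the numbers are the same.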
Use t.test to determine whether there is a statistically significant difference in the mean heights between
the Males and Females in this sample.
t.test(Height ~ Gender, alternative = "less",
       var.equal = TRUE, data = React150)
##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is less than 0
## 95 percent confidence interval:
## -Inf -9.599895
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867
Use t.test to generate a confidence interval for the true mean difference in average heights (Female - Male).
t.test(Height ~ Gender, alternative = "two.sided",
       var.equal = TRUE, data = React150)
##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -13.929376 -9.217291
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867
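The pooled (var.equal = TRUE) test can be rebuilt from the group means and standard deviations reported by aggregate. The output never states the group sizes; it only implies 150 students in total through df = 148. The sketch below therefore assumes 75 students per group, which is a guess and not something the source confirms. If that assumption holds, the results should land close to the t.test output above.

```r
# Summary statistics copied from the aggregate output above.
mean_f <- 165.4133; sd_f <- 7.697127
mean_m <- 176.9867; sd_m <- 6.882201

# ASSUMPTION: 75 students per group. The source only implies
# n_f + n_m = 150 (df = 148); the equal split is a guess.
n_f <- 75; n_m <- 75

# Pooled variance: weighted average of the two sample variances.
df_pooled <- n_f + n_m - 2
sp2 <- ((n_f - 1) * sd_f^2 + (n_m - 1) * sd_m^2) / df_pooled

# Standard error of the difference in means (Female - Male).
se <- sqrt(sp2 * (1 / n_f + 1 / n_m))

# Test statistic and two-sided 95% confidence interval.
t_stat <- (mean_f - mean_m) / se
ci <- (mean_f - mean_m) + c(-1, 1) * qt(0.975, df_pooled) * se

t_stat
ci
```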
Below we will conduct a two-sample t-test on the React150 dataset to test, at the 1% level of significance,
whether left-handed people have a different mean reaction time than right-handed people.
Exercise: Make a side-by-side boxplot, comparing the reaction times of left-handed and right-handed people.
boxplot(React ~ Hand, data = React150)
[Figure: side-by-side boxplots of React by Hand (Left-Handed, Right-Handed); reaction-time axis from 0.2 to 0.8.]
Exercise: Use aggregate to determine whether to use a pooled or unpooled test.
aggregate(React ~ Hand, FUN = sd, data = React150)
## Hand React
## 1 Left-Handed 0.05222913
## 2 Right-Handed 0.15896866
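One common way to turn these two standard deviations into a pooled-or-unpooled decision is to compare their ratio with 2. That threshold is a widely used rule of thumb and is an assumption here; the lab text does not say which rule it intends. A minimal sketch:

```r
# Standard deviations copied from the aggregate output above.
sd_left  <- 0.05222913
sd_right <- 0.15896866

# Ratio of the larger SD to the smaller SD.
sd_ratio <- max(sd_left, sd_right) / min(sd_left, sd_right)

# ASSUMPTION: rule of thumb "pool only if the ratio is below 2".
use_pooled <- sd_ratio < 2

sd_ratio
use_pooled
```

Here the right-handed standard deviation is about three times the left-handed one, so under this rule the unpooled (Welch) test is the appropriate choice.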
Exercise: Write the hypotheses of this test in TeX formatting below.
$H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$
Exercise: Use t.test to conduct the appropriate hypothesis test.
t.test(React ~ Hand,
       alternative = "two.sided", var.equal = FALSE,
       data = React150, conf.level = 0.99)
##
## Welch Two Sample t-test
##
## data: React by Hand
## t = -5.1436, df = 86.018, p-value = 1.674e-06
## alternative hypothesis: true difference in means between group Left-Handed and group Right-Handed is not equal to 0
## 99 percent confidence interval:
## -0.14145962 -0.04564038
## sample estimates:
## mean in group Left-Handed mean in group Right-Handed
## 0.33475 0.42830
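The non-integer degrees of freedom in the Welch output come from the Welch-Satterthwaite approximation. The sketch below computes the Welch statistic, degrees of freedom, and two-sided p-value by hand and cross-checks them against t.test. The vectors `left` and `right` are made-up values, not the React150 data, because the group sizes needed to rebuild the real result are not shown in the output.

```r
# Hypothetical reaction times; NOT the React150 data.
left  <- c(0.31, 0.35, 0.33, 0.36, 0.30, 0.34)
right <- c(0.25, 0.48, 0.61, 0.39, 0.52, 0.30, 0.44, 0.58)

# Variance of each sample mean.
v_left  <- var(left)  / length(left)
v_right <- var(right) / length(right)

# Welch t statistic: the two variances are not pooled.
t_welch <- (mean(left) - mean(right)) / sqrt(v_left + v_right)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v_left + v_right)^2 /
  (v_left^2 / (length(left) - 1) + v_right^2 / (length(right) - 1))

# Two-sided p-value and the decision at the 1% level.
p_welch <- 2 * pt(-abs(t_welch), df = df_welch)
reject_at_1pct <- p_welch < 0.01

# Cross-check against t.test with var.equal = FALSE.
fit <- t.test(left, right, var.equal = FALSE, conf.level = 0.99)
c(t = t_welch, df = df_welch, p = p_welch)
```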
Exercise: Provide a fully worded conclusion to this test.

Since the p-value (1.674e-06) is below the 1% level of significance, we reject the null hypothesis. That is, we
have sufficient evidence at the 1% level of significance that the mean reaction times of left-handed and
right-handed people differ.
