
Lab 5 - Two-Sample t-Testing

Mansi Kumari (007908159)

2023-02-17

Learning Objectives

By the end of this lab, you should have a grasp of the following concepts:

• How to create a side-by-side boxplot in R
• How to make a stripchart in R
• How to make a scatterplot with a y = x line through it in R
• How to perform a matched-pairs t-test in R
• How to use tilde notation in R
• How to use aggregate in R
• How to perform a two-sample t-test in R
• How to calculate a confidence interval for a two-sample t-test in R

Instructions

To complete this worksheet, add code as needed into the R code chunks given below. Do not delete the
question text. All text should be in complete English sentences. Be sure to change the author of this file to
reflect your name and student number.
To properly see the questions, knit this .Rmd file to .pdf and view the output. You will have a link in your
email that takes you to the Crowdmark submission page. Once you have completed the worksheet, knit it
to .pdf and upload your output to Crowdmark.

Exercises
Load in the WeightLoss dataset. This dataset contains the results of a brief six-week weight loss study with
40 participants. Their weight (in kg) was measured at the beginning and the end of the study, and we wish
to determine whether there is a statistically significant weight loss.

WeightLoss <- read.csv("~/Downloads/WeightLoss.csv")

Make a side-by-side boxplot, comparing the weights at the baseline to the weights at the end of the study.

boxplot(WeightLoss$Baseline, WeightLoss$End)
[Figure: side-by-side boxplot of the Baseline (1) and End (2) weights; y-axis ticks from 90 to 130.]
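
The default boxplot labels the two groups 1 and 2. The sketch below shows one way to label them, and also illustrates the stripchart and the scatterplot with a y = x line listed in the learning objectives. It is only an illustration under stated assumptions: the data frame `sim` and its values are simulated stand-ins, not the WeightLoss measurements, so that the code runs on its own.

```r
# Simulated stand-in for the WeightLoss data (values are made up).
set.seed(1)
baseline <- rnorm(40, mean = 105, sd = 8)
end      <- baseline + rnorm(40, mean = -2.5, sd = 4)
sim      <- data.frame(Baseline = baseline, End = end)

# Side-by-side boxplot with readable group labels instead of 1 and 2.
bp <- boxplot(sim$Baseline, sim$End,
              names = c("Baseline", "End"),
              ylab = "Weight (kg)")

# Stripchart of the paired differences; jitter separates overlapping points.
differences <- sim$End - sim$Baseline
stripchart(differences, method = "jitter", pch = 1,
           xlab = "End - Baseline (kg)")
abline(v = 0, lty = 2)  # reference line at "no change"

# Scatterplot of End against Baseline with the line y = x.
# Points below the line are participants whose weight went down.
plot(sim$Baseline, sim$End,
     xlab = "Baseline weight (kg)", ylab = "End weight (kg)")
abline(a = 0, b = 1)  # intercept 0, slope 1
```

For paired data the scatterplot is often the most informative of the three, because each point keeps a participant's two measurements together.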

Make a vector containing the differences between the baseline and the end weights.

differences <- WeightLoss$End - WeightLoss$Baseline

Calculate the mean of the differences.

mean(differences)

## [1] -2.7425

Use the t.test function to run a matched pairs test to determine whether there is a statistically significant
weight loss over the course of this study.

t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE, alternative = "less")

##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0002219
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
## -Inf -1.538383
## sample estimates:
## mean difference
## -2.7425
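
A matched-pairs test is a one-sample t-test on the differences. The sketch below checks this on simulated stand-in data (the values are made up, not the WeightLoss measurements): it computes the test statistic by hand and compares it with both forms of `t.test`.

```r
# Simulated stand-in data (made-up values, not the WeightLoss measurements).
set.seed(2)
baseline <- rnorm(40, mean = 105, sd = 8)
end      <- baseline + rnorm(40, mean = -2.5, sd = 4)
d        <- end - baseline

# Test statistic by hand: t = mean(d) / (sd(d) / sqrt(n)), with n - 1 df.
n      <- length(d)
t_stat <- mean(d) / (sd(d) / sqrt(n))
p_less <- pt(t_stat, df = n - 1)  # lower-tail p-value for alternative = "less"

# The same test run two ways with t.test.
paired     <- t.test(end, baseline, paired = TRUE, alternative = "less")
one_sample <- t.test(d, mu = 0, alternative = "less")

# All three t statistics, then all three p-values, should agree.
c(by_hand = t_stat, paired = unname(paired$statistic),
  one_sample = unname(one_sample$statistic))
c(by_hand = p_less, paired = paired$p.value, one_sample = one_sample$p.value)
```

Note that the order of the first two arguments matters: `t.test(end, baseline, paired = TRUE)` tests the differences `end - baseline`, which is why a weight loss corresponds to `alternative = "less"`.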

Generate a 95% confidence interval for the true mean weight loss.

t.test(WeightLoss$End, WeightLoss$Baseline, paired = TRUE, alternative = "two.sided")

##
## Paired t-test
##
## data: WeightLoss$End and WeightLoss$Baseline
## t = -3.8375, df = 39, p-value = 0.0004437
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -4.188041 -1.296959
## sample estimates:
## mean difference
## -2.7425
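
The intervals reported by `t.test` can be recovered from the printed summary alone. The sketch below uses only the numbers shown in the output above (mean difference, t statistic and degrees of freedom); the variable names are illustrative. Because the printed t statistic is rounded, the results should agree with the reported intervals only up to rounding.

```r
# Values copied from the t.test output above.
mean_diff <- -2.7425
t_stat    <- -3.8375
df        <- 39

# Standard error implied by t = mean_diff / se.
se <- mean_diff / t_stat

# Two-sided 95% interval: mean_diff +/- t* x se, with t* = qt(0.975, df).
ci_two_sided <- mean_diff + c(-1, 1) * qt(0.975, df) * se

# One-sided 95% upper bound, as reported for alternative = "less".
upper_bound <- mean_diff + qt(0.95, df) * se

round(ci_two_sided, 3)
round(upper_bound, 3)
```

The one-sided test uses `qt(0.95, df)` rather than `qt(0.975, df)`, which is why its upper bound differs from the upper end of the two-sided interval.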

Load in the React150.csv dataset. This dataset contains various measurements on a sample of 150 Grade
12 students across the United States, including their Gender, Handedness, Height, Foot Length, Armspan,
and Reaction Time.

React150 <- read.csv("~/Downloads/React150.csv")

Create a side-by-side boxplot to compare the heights of the Males and the Females in this sample, using the
tilde (~) notation. Use the data argument to make your code more concise.

boxplot(Height ~ Gender, data = React150)

[Figure: side-by-side boxplot of Height by Gender (Female, Male); y-axis ticks from 150 to 190.]

Use the aggregate function to calculate the mean height for each gender.

aggregate(Height ~ Gender, FUN = mean, data = React150)

## Gender Height
## 1 Female 165.4133
## 2 Male 176.9867

Use the aggregate function to compare the standard deviation of heights between the Males and Females
in this sample.

aggregate(Height ~ Gender, FUN = sd, data = React150)

## Gender Height
## 1 Female 7.697127
## 2 Male 6.882201
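
The formula `Height ~ Gender` can be read as "Height split by Gender", and the same formula works with any summary function passed to `FUN`. The sketch below collects the group size, mean and standard deviation into one table. It uses simulated stand-in data (made-up heights, not the React150 sample), and `summary_table` is just an illustrative name.

```r
# Simulated stand-in data (made-up heights, not the React150 sample).
set.seed(3)
sim <- data.frame(
  Gender = rep(c("Female", "Male"), each = 30),
  Height = c(rnorm(30, mean = 165, sd = 7), rnorm(30, mean = 177, sd = 7))
)

# One aggregate call per summary; only FUN changes.
means <- aggregate(Height ~ Gender, FUN = mean,   data = sim)
sds   <- aggregate(Height ~ Gender, FUN = sd,     data = sim)
sizes <- aggregate(Height ~ Gender, FUN = length, data = sim)

# Combine the three summaries into one small table, one row per group.
summary_table <- data.frame(
  Gender = means$Gender,
  n      = sizes$Height,
  mean   = means$Height,
  sd     = sds$Height
)
summary_table
```

Having the group sizes next to the means and standard deviations is convenient, since all three are needed to compute a two-sample t statistic by hand.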

Use t.test to determine whether there is a statistically significant difference in the mean heights between
the Males and Females in this sample.

t.test(Height ~ Gender, alternative = "less",
       var.equal = TRUE, data = React150)

##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is less than 0
## 95 percent confidence interval:
## -Inf -9.599895
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867

Use t.test to generate a confidence interval for the true mean difference in average heights (Female - Male).

t.test(Height ~ Gender, alternative = "two.sided",
       var.equal = TRUE, data = React150)

##
## Two Sample t-test
##
## data: Height by Gender
## t = -9.7071, df = 148, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -13.929376 -9.217291
## sample estimates:
## mean in group Female mean in group Male
## 165.4133 176.9867
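
The pooled (`var.equal = TRUE`) test can be worked through by hand from the group summaries. The sketch below does so using the means and standard deviations printed earlier. One input is an assumption: the lab does not state the group sizes, and df = 148 only implies that the two sizes add up to 150, so an even split of 75 and 75 is assumed here. Under that assumption the results should land close to the reported t statistic and interval.

```r
# Summary statistics copied from the aggregate output above.
mean_f <- 165.4133; sd_f <- 7.697127
mean_m <- 176.9867; sd_m <- 6.882201

# ASSUMPTION: 75 students per group. The lab does not state the group sizes;
# df = 148 only tells us that n_f + n_m = 150.
n_f <- 75
n_m <- 75

# Pooled variance: a weighted average of the two sample variances.
df  <- n_f + n_m - 2
sp2 <- ((n_f - 1) * sd_f^2 + (n_m - 1) * sd_m^2) / df

# Standard error of the difference in means (Female - Male).
se        <- sqrt(sp2 * (1 / n_f + 1 / n_m))
mean_diff <- mean_f - mean_m
t_stat    <- mean_diff / se

# 95% confidence interval for the difference.
ci <- mean_diff + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_stat, lower = ci[1], upper = ci[2]), 3)
```

The sign of the difference follows the alphabetical order of the group labels, so `Height ~ Gender` estimates Female minus Male.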

Below we will conduct a two-sample t-test on the React150 dataset to test, at the 1% level of significance,
whether left-handed people have a different mean reaction time than right-handed people.
Exercise: Make a side-by-side boxplot, comparing the reaction times of left-handed and right-
handed people.

boxplot(React ~ Hand, data = React150)

[Figure: side-by-side boxplot of React by Hand (Left-Handed, Right-Handed); y-axis ticks from 0.2 to 0.8.]

Exercise: Use aggregate to determine whether to use a pooled or unpooled test.

aggregate(React ~ Hand, FUN = sd, data = React150)

## Hand React
## 1 Left-Handed 0.05222913
## 2 Right-Handed 0.15896866
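
One common way to turn these two standard deviations into a decision is to compare their ratio with 2. That threshold is a rule of thumb assumed here for illustration (the lab itself does not state which rule to use), and the variable names are illustrative.

```r
# Standard deviations copied from the aggregate output above.
sd_left  <- 0.05222913
sd_right <- 0.15896866

# Rule-of-thumb check (an assumed convention, not stated in the lab):
# pool only if the larger SD is less than twice the smaller one.
sd_ratio   <- max(sd_left, sd_right) / min(sd_left, sd_right)
use_pooled <- sd_ratio < 2

sd_ratio
use_pooled
```

Here the right-handed group is roughly three times as variable as the left-handed group, which points to the unpooled (Welch) test, that is, `var.equal = FALSE`.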

Exercise: Write the hypotheses of this test in TeX formatting below.

$$H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_a: \mu_1 \neq \mu_2$$

Exercise: Use t.test to conduct the appropriate hypothesis test.

t.test(React ~ Hand,
       alternative = "two.sided", var.equal = FALSE,
       data = React150, conf.level = 0.99)

##
## Welch Two Sample t-test
##
## data: React by Hand
## t = -5.1436, df = 86.018, p-value = 1.674e-06
## alternative hypothesis: true difference in means between group Left-Handed and group Right-Handed is not equal to 0
## 99 percent confidence interval:
## -0.14145962 -0.04564038
## sample estimates:
## mean in group Left-Handed mean in group Right-Handed
## 0.33475 0.42830
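
The non-integer degrees of freedom in the Welch output come from the Welch-Satterthwaite approximation. The sketch below computes the statistic, degrees of freedom, p-value and 99% interval by hand and compares them with `t.test`. It uses simulated stand-in data: the reaction times, group sizes and spreads are made up for illustration and are not the React150 sample.

```r
# Simulated stand-in data (made-up reaction times, not the React150 sample);
# unequal group sizes and spreads are chosen only for illustration.
set.seed(4)
left  <- rnorm(25,  mean = 0.33, sd = 0.05)
right <- rnorm(125, mean = 0.43, sd = 0.16)

# Variance of each sample mean.
v_left  <- var(left)  / length(left)
v_right <- var(right) / length(right)

# Welch test statistic and Welch-Satterthwaite degrees of freedom.
se        <- sqrt(v_left + v_right)
mean_diff <- mean(left) - mean(right)
t_stat    <- mean_diff / se
df_welch  <- (v_left + v_right)^2 /
  (v_left^2 / (length(left) - 1) + v_right^2 / (length(right) - 1))

# Two-sided p-value and 99% confidence interval.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci_99   <- mean_diff + c(-1, 1) * qt(0.995, df_welch) * se

# Compare the hand calculation with t.test.
welch <- t.test(left, right, var.equal = FALSE, conf.level = 0.99)
c(by_hand = df_welch, t.test = unname(welch$parameter))
c(by_hand = t_stat,   t.test = unname(welch$statistic))
```

Unlike the pooled test, the two variances are never averaged together, so the more variable group dominates the standard error.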

Exercise: Provide a fully worded conclusion to this test.


Since the p-value (1.674e-06) is below the 1% level of significance, we reject the null hypothesis. That is,
we have sufficient evidence at the 1% level of significance that the mean reaction times differ between
left-handed and right-handed people.
