Bacs HW4


HW4

112078516

2024-03-15

Question 1)

The following problem’s data is real, but the scenario is strictly imaginary. The large American phone
company Verizon had a monopoly on phone services in many areas of the US. The New York Public Utilities
Commission (PUC) regularly monitors repair times with customers in New York to verify the quality of
Verizon’s services. The file verizon.csv has a recent sample of repair times collected by the PUC.

verizon <- read.csv("verizon.csv")


sample_size <- length(verizon$Time)
sample_mean <- mean(verizon$Time)
sample_sd <- sd(verizon$Time)
se <- sample_sd / sqrt(sample_size)

a. Imagine that Verizon claims that they take 7.6 minutes to repair phone services for its
customers on average. The PUC seeks to verify this claim at 99% confidence (i.e., significance
𝛼 = 1%) using traditional statistical methods.

i. Visualize the distribution of Verizon’s repair times, marking the mean with a vertical line
Plot the density function of Verizon’s repair times

plot(density(verizon$Time), col="blue", lwd=2,
     main = "Verizon's repair times")
# Add a vertical line marking the mean
abline(v=mean(verizon$Time), col="orange")

[Figure: density plot "Verizon's repair times" with a vertical line at the mean; N = 1687, Bandwidth = 1.003]

ii. Given what the PUC wishes to test, how would you write the hypothesis? (not graded)

Hnull: The average time for Verizon to repair phone services for its customers is 7.6 minutes
(𝜇 = 7.6).
Halt: The average time for Verizon to repair phone services for its customers is not 7.6 minutes
(𝜇 ≠ 7.6).

iii. Estimate the population mean, and the 99% confidence interval (CI) of this estimate.

# 99% CI (using z = 2.58 as the critical value)
c(sample_mean-2.58*se, sample_mean+2.58*se)

## [1] 7.593073 9.450946
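As an aside, the interval above uses z ≈ 2.58 as the critical value; the exact t critical value for df = 1686 is slightly larger, which is why t.test()'s interval later differs in the third decimal. A quick base-R check:

```r
# Exact two-sided 99% critical value of the t-distribution, df = 1687 - 1
qt(0.995, df = 1686)  # slightly above qnorm(0.995), which is about 2.576
```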

iv. Find the t-statistic and p-value of the test

# t-statistic
t <- (sample_mean - 7.6) / se
t

## [1] 2.560762

# degrees of freedom
df <- sample_size - 1
# one-tailed p-value (compare to alpha/2 = 0.005 for a two-sided test)
p <- 1 - pt(t, df)
p

## [1] 0.005265342

The value of t-statistic is 2.5607623.


The p-value is 0.0052653.

v. Briefly describe how these values relate to the Null distribution of t (not graded)
Under H0, the t-statistic follows a t-distribution with df = n − 1, and the p-value is the tail area of that null distribution beyond the observed t. As df increases, the shape of the t-distribution becomes more similar to a normal distribution, though it only reaches the normal in the limit of infinite df.
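A quick base-R illustration of this convergence (not part of the required output): the two-sided 99% critical value of t shrinks toward the normal critical value as df grows.

```r
# t critical values approach the normal critical value as df increases
sapply(c(5, 30, 1686), function(df) qt(0.995, df))
qnorm(0.995)  # the limiting normal value, about 2.576
```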

vi. What is your conclusion about the company’s claim from this t-statistic, and why?
Because this is a two-tailed test at 𝛼 = 1%, the threshold for each tail is 0.5%. The one-tailed p-value 0.0052653 (0.52653%) is bigger than 0.5%, so we cannot reject H0 at 99% confidence.
We can also run t.test() as a check.

t.test(verizon$Time,mu=7.6,alternative="two.sided",conf.level = 0.99)

##
## One Sample t-test
##
## data: verizon$Time
## t = 2.5608, df = 1686, p-value = 0.01053
## alternative hypothesis: true mean is not equal to 7.6
## 99 percent confidence interval:
## 7.593524 9.450495
## sample estimates:
## mean of x
## 8.522009

The two-sided p-value from t.test() is 0.01053 > 0.01, so we cannot reject H0 at 99% confidence.
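The one-tailed p-value computed earlier and t.test()'s two-sided p-value are consistent: doubling the tail area beyond |t| reproduces the reported value (a quick check, using the t and df printed above).

```r
t_stat <- 2.560762
df <- 1686
2 * (1 - pt(abs(t_stat), df))  # about 0.0105, twice the one-tailed 0.0053
```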

b. Let’s re-examine Verizon’s claim that they take no more than 7.6 minutes on average, but
this time using bootstrapped testing:

i. Bootstrapped Percentile: Estimate the bootstrapped 99% CI of the population mean

set.seed(23)
compute_sample_mean <- function(sample0) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
  mean(resample)
}
bootmean <- replicate(2000, compute_sample_mean(verizon$Time))
mean(bootmean)

## [1] 8.521639

quantile(bootmean, probs=c(0.005, 0.995))

## 0.5% 99.5%
## 7.663299 9.437621

The bootstrapped mean is 8.521639, and the 99% CI is 7.663299, 9.437621.

ii. Bootstrapped Difference of Means: What is the 99% CI of the bootstrapped difference
between the sample mean and the hypothesized mean?

set.seed(23)
bootsmeandiff <- replicate(2000, compute_sample_mean(verizon$Time-7.6))
mean(bootsmeandiff)

## [1] 0.9216395

quantile(bootsmeandiff, probs=c(0.005, 0.995))

## 0.5% 99.5%
## 0.06329926 1.83762072

The mean of the bootstrapped differences is 0.9216395, and the 99% CI is 0.06329926 to 1.83762072.

iii. Bootstrapped t-statistic: What is the 99% CI of the bootstrapped t-statistic of the sample
mean versus the hypothesized mean?

set.seed(23)
# Function to compute bootstrapped t-statistic
compute_t_statistic <- function(sample_arg, hypothesized_mean){
  reexamine_sample <- sample(sample_arg, length(sample_arg), replace = TRUE)
  t_statistic <- (mean(reexamine_sample) - hypothesized_mean) /
    (sd(reexamine_sample) / sqrt(length(reexamine_sample)))
  t_statistic
}

# Generate 2000 bootstrapped t-statistics based on the original dataset
bootstrap_t_stats <- replicate(2000, compute_t_statistic(verizon$Time, 7.6))

# Compute the mean of the bootstrapped t-statistics
# (centers near the observed t-statistic, not at 0)
mean(bootstrap_t_stats)

## [1] 2.531093

# Calculate the 99% confidence interval for bootstrapped t-statistics
quantile(bootstrap_t_stats, probs = c(0.005, 0.995))

## 0.5% 99.5%
## 0.2151776 4.6382164

The mean of the bootstrapped t-statistics is 2.531093.

The 99% confidence interval (CI) of the bootstrapped t-statistic of the sample mean versus the hypothesized mean is 0.2151776 to 4.6382164.

iv. Plot the distribution of the three bootstraps above on separate plots; draw vertical lines
showing the lower/upper bounds of their respective 99% confidence intervals.

Plot three distributions in density plot with lower/upper bounds of their respective 99% confidence intervals.

plot(density(bootmean), main = "bootstrapped mean")
abline(v=quantile(bootmean, probs=c(0.005, 0.995)), lty="dashed")

[Figure: density plot of the bootstrapped means with dashed lines at the 99% CI bounds; N = 2000, Bandwidth = 0.06954]

plot(density(bootsmeandiff), main = "bootstrapped difference of means")
abline(v=quantile(bootsmeandiff, probs = c(0.005, 0.995)), lty="dashed")

[Figure: density plot of the bootstrapped differences of means with dashed lines at the 99% CI bounds; N = 2000, Bandwidth = 0.06954]

plot(density(bootstrap_t_stats), main = "bootstrapped t-statistic of means")
abline(v=quantile(bootstrap_t_stats, probs = c(0.005, 0.995)), lty="dashed")

[Figure: density plot of the bootstrapped t-statistics with dashed lines at the 99% CI bounds; N = 2000, Bandwidth = 0.1671]

v. Does the bootstrapped approach agree with the traditional t-test in part [a]?

set.seed(23)
bootmean <- replicate(2000,compute_sample_mean(verizon$Time))
quantile(bootmean, probs=c(0.005, 0.995))

## 0.5% 99.5%
## 7.663299 9.437621

bootmeandiff <- replicate(2000, compute_sample_mean(verizon$Time-7.6))
quantile(bootmeandiff, probs=c(0.005, 0.995))

## 0.5% 99.5%
## 0.04212762 1.92928278

Both 7.6 and 0 fall outside their respective 99% confidence intervals in the bootstrapped approach, so the bootstrap rejects H0 even though the traditional t-test in part (a) could not (p = 0.01053 > 0.01). The two approaches therefore disagree here, though only narrowly. When testing right at a strict cutoff like 99% confidence, results this close to the boundary can differ between methods, and bootstrap intervals in particular can shift with the random seed, so it is possible for bootstrapping and a traditional t-test to reach different conclusions, as this borderline case demonstrates.
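The seed-sensitivity can be illustrated with simulated data (hypothetical: fake_times stands in for verizon$Time, and the seeds are arbitrary); near a borderline case, the percentile CI's bounds shift slightly from run to run.

```r
# Bootstrapped percentile CI under two different seeds (simulated skewed data)
set.seed(1)
fake_times <- rexp(1687, rate = 1/8.5)  # skewed, hypothetical repair times

boot_ci <- function(x, seed) {
  set.seed(seed)
  boot_means <- replicate(2000, mean(sample(x, length(x), replace = TRUE)))
  quantile(boot_means, probs = c(0.005, 0.995))
}

rbind(boot_ci(fake_times, 10), boot_ci(fake_times, 20))  # bounds differ slightly
```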

c. Finally, imagine that Verizon notes that the distribution of repair times is highly skewed by
outliers, and feels that testing the mean is not fair because the mean is sensitive to outliers.
They argue that the median is a fairer test, and claim that the median repair time is no
more than 3.5 minutes at 99% confidence (i.e., significance 𝛼 = 1%).

i. Bootstrapped Percentile: Estimate the bootstrapped 99% CI of the population median

set.seed(12)
compute_sample_median <- function(sample0) {
  resample <- sample(sample0, length(sample0), replace=TRUE)
  median(resample)
}
bootmedian <- replicate(2000,compute_sample_median(verizon$Time))
mean(bootmedian)

## [1] 3.615185

quantile(bootmedian, probs=c(0.005, 0.995))

## 0.5% 99.5%
## 3.15 3.95

ii. Bootstrapped Difference of Medians:

What is the 99% CI of the bootstrapped difference between the sample median and the
hypothesized median?

set.seed(12)
bootmediandiff <- replicate(2000,compute_sample_median(verizon$Time-3.5))
mean(bootmediandiff)

## [1] 0.115185

quantile(bootmediandiff, probs=c(0.005, 0.995))

## 0.5% 99.5%
## -0.35 0.45

iii. Plot the distribution of the two bootstraps above on two separate plots (note that we are
not using a test statistic for the median).

plot(density(bootmedian), main = "bootstrapped median")
abline(v=quantile(bootmedian, probs=c(0.005, 0.995)), lty="dashed")

[Figure: density plot of the bootstrapped medians with dashed lines at the 99% CI bounds; N = 2000, Bandwidth = 0.02791]

plot(density(bootmediandiff), main = "bootstrapped difference of medians")
abline(v=quantile(bootmediandiff, probs=c(0.005, 0.995)), lty="dashed")

[Figure: density plot of the bootstrapped median differences with dashed lines at the 99% CI bounds; N = 2000, Bandwidth = 0.02791]

iv. What is your conclusion about Verizon’s claim about the median, and why?

The hypothesized median of 3.5 falls inside the bootstrapped 99% CI of the median (3.15, 3.95), and 0 falls inside the 99% CI of the bootstrapped difference (−0.35, 0.45), so we fail to reject H0 at the 99% confidence level: the data are consistent with Verizon’s claim about the median.

Question 2) Load the compstatslib package and run interactive_t_test(). You
will see a simulation of null and alternative distributions of the t-statistic, along
with significance and power, and can activate an error matrix.

If you do not see interactive controls (slider bars), press the gears icon on the top-left of
the visualization.

library(compstatslib)
# interactive_t_test()

Recall the form of t-tests, where 𝑡 = (𝑥̄ − 𝜇0) / (𝑠 / √𝑛).
Let’s see how hypothesis tests are affected by various factors:
diff: the difference we wish to test (𝑥̄ − 𝜇0)
sd: the standard deviation of our sample data (𝑠)
n: the number of cases in our sample data
alpha: the significance level of our test (e.g., alpha is 5% for a 95% confidence level)
Your colleague, a data analyst in your organization, is working on a hypothesis test where he has sampled
product usage information from customers who are using a new smartwatch. He wishes to test whether the
mean (𝑥̄) usage time is higher than the usage time of the company’s previous smartwatch released two years
ago (𝜇0):
Hnull : The mean usage time of the new smartwatch is the same or less than for the previous smartwatch.
Halt : The mean usage time is greater than that of our previous smartwatch.
After collecting data from just n=50 customers, he informs you that he has found diff=0.3 and sd=2.9.
Your colleague believes that we cannot reject the null hypothesis at alpha of 5%.
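Before reaching for the simulation, the colleague's claim can be verified directly (a sketch; assumes a one-sided, one-sample test with df = n − 1):

```r
n <- 50; diff <- 0.3; s <- 2.9; alpha <- 0.05
t_stat <- diff / (s / sqrt(n))        # about 0.73
t_crit <- qt(1 - alpha, df = n - 1)   # critical value, about 1.68
t_stat > t_crit                       # FALSE: cannot reject the null at alpha = 5%
```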

Use the slider bars of the simulation to set the values your colleague found and confirm from
the visualization that we cannot reject the null hypothesis. Consider the scenarios (a – d)
independently using the simulation tool. For each scenario, start with the initial parameters
above, then adjust them to answer the following questions:

i. Would this scenario create systematic or random error (or both or neither)?

ii. Which part of the t-statistic or significance (diff, sd, n, alpha) would be affected?

iii. Will it increase or decrease our power to reject the null hypothesis?

iv. Which kind of error (Type I or Type II) becomes more likely because of this scenario?

a. You discover that your colleague wanted to target the general population of Taiwanese
users of the product. However, he only collected data from a pool of young consumers, and
missed many older customers who you suspect might use the product much less every day.

i.
A: This scenario would create systematic errors because the sample does not reflect the entire population
of Taiwanese product users.
ii.
A: If we include data from older customers in our sample, we anticipate that diff (the difference) will
likely decrease, on the assumption that older customers use the product less each day. We also expect
sd (the standard deviation) to increase when older customers are included. n (the sample size) is also
affected, as the biased sample lacks representation from older users.
iii.
A: Decreasing the sample’s diversity would decrease the power to reject the null hypothesis.
iv.
A: Power is 1 − 𝛽, so a decrease in power implies an increase in 𝛽: the error that becomes more likely is
a Type II error, because the biased sample may not accurately capture differences in usage time between
age groups. The Type I error rate remains unchanged in this scenario.

# interactive_t_test()
# n = 50, diff = 0.3, sd = 2.9, alpha = 0.05
knitr::include_graphics("a.png")

b. You find that 20 of the respondents are reporting data from the wrong wearable device,
so they should be removed from the data. These 20 people are just like the others in every
other respect.

i.
A: random error
ii.
A: The part of the t-statistic affected would be n (sample size), since only 30 people remain after
removing the 20 individuals.
iii.
A: The removal of respondents decreases the sample size (n), leading to decreased power to reject the null
hypothesis.
iv.
A: Power is 1 − 𝛽, so the decrease in power caused by the smaller sample size implies an increase in 𝛽:
the error that becomes more likely is a Type II error. The Type I error rate remains unchanged in this scenario.
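The power loss from shrinking n can be quantified with stats::power.t.test (a sketch; assumes a one-sample, one-sided test with the scenario's diff, sd, and alpha):

```r
power_at_n <- function(n) {
  power.t.test(n = n, delta = 0.3, sd = 2.9, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")$power
}
c(n50 = power_at_n(50), n30 = power_at_n(30))  # power drops as n falls
```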

# interactive_t_test()
# n = 30, diff = 0.3, sd = 2.9, alpha = 0.05
knitr::include_graphics("b.png")

c. A very annoying professor visiting your company has criticized your colleague’s “95% con-
fidence” criteria, and has suggested relaxing it to just 90%.

i.
A: neither
ii.
A: 𝛼 increases, because it changes from 5% to 10%.
iii.
A: It will increase our power to reject the null hypothesis.
iv.
A: Increasing 𝛼 directly raises the probability of a Type I error. Because the power of the test
increases, the probability of a Type II error decreases.
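The effect of relaxing 𝛼 can likewise be quantified with stats::power.t.test (a sketch; assumes a one-sample, one-sided test with the scenario's n = 30):

```r
power_at_alpha <- function(a) {
  power.t.test(n = 30, delta = 0.3, sd = 2.9, sig.level = a,
               type = "one.sample", alternative = "one.sided")$power
}
c(alpha05 = power_at_alpha(0.05), alpha10 = power_at_alpha(0.10))  # power rises with alpha
```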

# n = 30, diff = 0.3, sd = 2.9, alpha = 0.1


# interactive_t_test()
knitr::include_graphics("c.png")

d. Your colleague has measured usage times on five weekdays and taken a daily average. But
you feel this will underreport usage for younger people who are very active on weekends,
whereas it over-reports usage of older users.

i.
A: Both random error and systematic error.
ii.
A: We lack sufficient information to predict the direction of change (increase or decrease) for diff and sd.
Both n and 𝛼 remain constant.
iii.
A: It depends on how diff (the difference) changes, because the measurement method changes.
iv.
A: Type I error remains unchanged, while Type II error may either increase or decrease depending on the
result of the change in diff.
