
Practice of Statistics for Business and Economics 4th Edition Moore Test Bank
1. Which of the following is NOT a necessary condition that needs to be verified before
proceeding to inference for a regression model?
A) The sample is an SRS from the population.
B) There is a linear relationship between the response and explanatory variable.
C) The deviations of the observations about the least-squares line are constant across
all values of x.
D) The response variable is normally distributed.

2. “Students receiving a 4.0 in their first semester of college don't work as hard in future
semesters, explaining why the GPAs of that group of students fall over their college
career.” This statement is an example of:
A) Simpson's paradox.
B) the regression fallacy.
C) regression to mediocrity.
D) the gambler's fallacy.

3. If you reject the null hypothesis H₀: β₁ = 0 in favor of Hₐ: β₁ ≠ 0, what can you say
about the test of H₀: ρ = 0 versus Hₐ: ρ ≠ 0?
A) We will fail to reject H₀: ρ = 0.
B) We will reject H₀: ρ = 0 in favor of Hₐ: ρ ≠ 0.
C) Nothing. The test for β₁ reveals no information about ρ, so there is not enough
information.
D) None of the answers is correct.

Use the following to answer questions 4-11:

An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
winnings using the simple linear regression model

1993 winningsᵢ = β₀ + β₁(average number of putts per hole)ᵢ + εᵢ

where the deviations εᵢ are assumed to be independent and Normally distributed with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:

R² = 0.081
s = 281,777

Variable      Parameter Estimate    SE of Parameter Estimate
Constant      7,897,179             3,023,782
Avg. Putts    –4,139,198            1,698,371

4. The explanatory variable in this study is:


A) 1993 winnings.
B) average number of putts per hole.
C) the slope, β₁.
D) –4,139,198.

5. The quantity s = 281,777 is an estimate of the standard deviation σ of the deviations in
the simple linear regression model. The degrees of freedom for s are:
A) 69.
B) 68.
C) 67.
D) 281,777.

6. The intercept of the least-squares regression line is:


A) 7,897,179.
B) –4,139,198.
C) 3,023,782.
D) 1,698,371.

7. Suppose the researchers test the hypotheses H₀: β₁ = 0, Hₐ: β₁ < 0.

The value of the t statistic for this test is:


A) 2.61.
B) 2.44.
C) 0.081.
D) –2.44.
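
The t statistic for a slope test is the estimate divided by its standard error. A minimal sketch using the values from the software output above (the function name is illustrative, not part of the original output):

```python
# Values copied from the regression output for the golf data.
slope_estimate = -4_139_198
slope_se = 1_698_371


def slope_t_statistic(estimate: float, standard_error: float) -> float:
    """t statistic for H0: slope = 0 (estimate divided by its standard error)."""
    return estimate / standard_error


t_stat = slope_t_statistic(slope_estimate, slope_se)
print(round(t_stat, 2))
```

The sign of the statistic follows the sign of the estimated slope, which matters for a one-sided alternative.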

8. A 95% confidence interval for the slope β₁ in the simple linear regression model is
(approximately):
A) 7,897,179 ± 3,023,782.
B) 7,897,179 ± 6,047,564.
C) –4,139,198 ± 1,698,371.
D) –4,139,198 ± 3,396,742.
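
A confidence interval for the slope has the form estimate ± t* × SE. The sketch below assumes t* ≈ 2 for 67 degrees of freedom (a rounded table value, roughly 1.996), which is why the result is only approximate:

```python
# Assumption: rounded 95% critical value for t with n - 2 = 67 degrees of freedom.
T_STAR_95_DF67 = 2.0


def slope_confidence_interval(estimate, standard_error, t_star):
    """Return (low, high) for estimate +/- t_star * standard_error."""
    margin_of_error = t_star * standard_error
    return estimate - margin_of_error, estimate + margin_of_error


low, high = slope_confidence_interval(-4_139_198, 1_698_371, T_STAR_95_DF67)
margin = (high - low) / 2
print(margin, (low, high))
```

With these inputs the whole interval lies below zero, consistent with a negative slope.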

9. The correlation between 1993 winnings and average number of putts per hole is:
A) 0.081.
B) –0.081.
C) 0.285.
D) –0.285.
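
In simple linear regression the correlation is the square root of R², carrying the sign of the estimated slope. A short sketch with the reported values:

```python
import math

r_squared = 0.081          # R^2 from the software output
slope_estimate = -4_139_198  # estimated slope; only its sign is used here

# copysign attaches the slope's sign to the magnitude sqrt(R^2).
correlation = math.copysign(math.sqrt(r_squared), slope_estimate)
print(round(correlation, 3))
```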

10. Here is a scatterplot of the 1993 winnings versus the average number of putts per round
and a plot of the residuals versus the average number of putts per round.

Which of the following statements do these plots support?


A) There is no striking evidence in these plots that the assumptions for regression are
violated.
B) The abundance of outliers and influential observations in the plots means that the
assumptions for regression are clearly violated.
C) These plots contain evidence that the standard deviation of the response about the
true regression line increases as the average number of putts per round increases.
D) These plots contain many more points than were used to fit the least-squares
regression line in the previous problems. Obviously there is a major error present.

11. Which of the following conclusions seems MOST justified?


A) There is no evidence of a relationship between the average number of putts per
round and the 1993 winnings of PGA tour pros.
B) There is distinct evidence (P-value less than 0.05) that there is a positive
correlation between 1993 winnings and average number of putts per round.
C) There is some evidence that PGA tour pros who averaged fewer putts per round
had higher winnings in 1993.
D) The presence of strongly influential observations in these data makes it impossible
to draw any conclusions about the relationship between 1993 winnings and average
number of putts per round.

Use the following to answer questions 12-17:

Are the inflation rates of the United States and the United Kingdom associated? If so, can we
attempt to predict the U.S. inflation rate using the U.K. inflation rate? Suppose we fit the
following simple linear regression model

U.S. inflation rateᵢ = β₀ + β₁(U.K. inflation rate)ᵢ + εᵢ

where the deviations εᵢ were assumed to be independent and Normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares. A
random sample of 20 annual rates was selected from the rates of the past 110 years. The
following results were obtained from statistical software.

R² = 0.533
s = 3.88795

Variable               Parameter Estimate    S.E. of Parameter Estimate
Constant               0.2030                1.087
U.K. Inflation Rate    0.6652                0.1468

12. The intercept of the least-squares regression line is (approximately):


A) 0.203.
B) 1.087.
C) 0.665.
D) 3.888.

13. A 90% confidence interval for the slope β₁ in the simple linear regression model is
(approximately):
A) (0.4106, 0.9198).
B) (0.3568, 0.9736).
C) (0.3590, 0.9714).
D) (0.4120, 0.9184).
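
With n = 20 observations the slope interval uses a t distribution with 18 degrees of freedom. The sketch below assumes the table value t* = 1.734 for 90% confidence:

```python
# Assumption: upper 5% critical value of t with n - 2 = 18 degrees of freedom, from a t table.
T_STAR_90_DF18 = 1.734

slope_estimate = 0.6652
slope_se = 0.1468

margin = T_STAR_90_DF18 * slope_se
interval = (slope_estimate - margin, slope_estimate + margin)
print(tuple(round(bound, 4) for bound in interval))
```

Using the standard Normal value 1.645 instead of the t value would give a slightly narrower, and incorrect, interval.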

14. Suppose the researchers test the hypotheses H₀: β₁ = 0, Hₐ: β₁ ≠ 0.

The value of the t statistic for this test is:


A) 0.19.
B) 1.38.
C) 3.89.
D) 4.53.

15. The correlation between these two variables is:


A) 0.284.
B) 0.507.
C) 0.533.
D) 0.730.
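
Both the slope t statistic and the correlation can be read off the summary output. A minimal sketch using the reported estimate, standard error, and R²:

```python
import math

slope_estimate = 0.6652
slope_se = 0.1468
r_squared = 0.533

t_stat = slope_estimate / slope_se
# The slope is positive, so the correlation is the positive square root of R^2.
correlation = math.sqrt(r_squared)
print(round(t_stat, 2), round(correlation, 3))
```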

16. Is there strong evidence (and if so, why) that a straight line adequately describes the
relationship between the U.K. inflation rate and the U.S. inflation rate?
A) Yes, because the slope of the least-squares line is positive.
B) Yes, because the P-value for testing if the slope is 0 is quite small.
C) No, because the value of the square of the correlation is relatively small.
D) It is impossible to say, because we are not given the actual value of the correlation.

17. Here is a scatterplot of the two variables (both rates are in percentages):

Which of the following statements does the plot support?


A) There is no striking evidence in the plot that the assumptions for regression are
violated.
B) There appears to be an outlier and/or influential observations in the plot, suggesting
that our above results must be interpreted with caution.
C) The plot contains some evidence that the standard deviation of the response about
the true regression line is not the same everywhere.
D) There appears to be an outlier and/or influential observations in the plot, suggesting
that our above results must be interpreted with caution, and the plot contains some
evidence that the standard deviation of the response about the true regression line is
not the same everywhere.

Use the following to answer questions 18-22:

A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used

Profitsᵢ = β₀ + β₁(Sales)ᵢ + εᵢ

where the deviations εᵢ were assumed to be independent and Normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:

R² = 0.662
s = 466.2

Variable    Parameter Estimate    S.E. of Parameter Estimate
Constant    –176.644              61.16
Sales       0.092498              0.0075

18. The intercept of the least-squares regression line is (approximately):


A) 0.09.
B) 0.0075.
C) –176.64.
D) 61.16.

19. A 90% confidence interval for the slope β₁ in the simple linear regression model is
(approximately):
A) 0.09 ± 0.0075.
B) 0.09 ± 0.012.
C) –0.09 ± 0.0075.
D) –0.09 ± 0.012.

20. Suppose the researchers test the hypotheses H₀: β₁ = 0, Hₐ: β₁ > 0.

The P-value of the test is:


A) greater than 0.10.
B) between 0.10 and 0.05.
C) between 0.05 and 0.01.
D) less than 0.01.

21. Is there strong evidence (and if so, why) of a straight line relationship between sales and
profits?
A) Yes, because the slope of the least-squares line is positive.
B) Yes, because the P-value for testing if the slope is 0 is quite small.
C) No, because the value of the square of the correlation is relatively small.
D) It is impossible to say because we are not given the actual value of the correlation.

22. A scatterplot of sales versus profits is given below.

Which of the following statements is supported by the plot?


A) There is no striking evidence in the plot that the assumptions for regression are
violated and there is a clear straight line trend.
B) There are very influential observations in the plot suggesting that our results must
be interpreted with extreme caution.
C) The plot contains dramatic evidence that the standard deviation of the response
about the true regression line is not even approximately the same everywhere.
D) The plot contains many fewer points than were used to fit the least-squares
regression line in the previous problems. Obviously there is a major error present.

Use the following to answer questions 23-25:

The Union Bank of Switzerland (UBS) produces regular reports on the prices and earnings in
major cities throughout the world. Included in this report are the prices of basic commodities,
reported in minutes of labor, including 1 kg of rice, a 1 kg loaf of bread, and a Big Mac, for 54
major cities around the world. An analyst is interested in understanding how prices have changed
since the global financial crisis in 2007–2008. To do this, they wish to use the price of a Big Mac
in 2003 to predict the price of a Big Mac in 2009.

23. The response variable is the:


A) price of a Big Mac in 2003.
B) price of a Big Mac in 2009.
C) name of the city.
D) year.

24. A scatterplot of the price of a Big Mac in 2009 versus the price of a Big Mac in 2003 is
given below.

Which of the following statements does the plot support?


A) There is no striking evidence in the plot that the conditions necessary for regression
are violated.
B) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution.
C) The plot contains evidence that the standard deviation of the response about the
true regression line is not the same everywhere.
D) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution, and the plot contains evidence that
the standard deviation of the response about the true regression line is not the same
everywhere.

25. A scatterplot of the log price of a Big Mac in 2009 versus the log price of a Big Mac in
2003 is given below.

Which of the following statements does the plot support?


A) There is no striking evidence in the plot that the conditions necessary for regression
are violated.
B) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution.
C) The plot contains evidence that the standard deviation of the response about the
true regression line is not the same everywhere.
D) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution, and the plot contains evidence that
the standard deviation of the response about the true regression line is not the same
everywhere.

Use the following to answer questions 26-30:

The Union Bank of Switzerland (UBS) produces regular reports on the prices and earnings in
major cities throughout the world. Included in this report are the prices of basic commodities,
reported in minutes of labor, including 1 kg of rice, a 1 kg loaf of bread, and a Big Mac, for 54
major cities around the world. An analyst is interested in understanding how prices have changed
since the global financial crisis in 2007–2008. To do this, they wish to use the price of a Big Mac
in 2003 to predict the price of a Big Mac in 2009.

The regression output for the regression of log 2009 Big Mac prices on log 2003 Big Mac prices
is given below.

Term               Estimate    Std. Error    t value    Prob > |t|
Intercept          0.64031     0.22922       2.793      0.00728
log(bigmac2003)    0.80293     0.06709       11.967     < 0.0001

26. The correlation between the log price of a Big Mac in 2003 and the log price of a Big
Mac in 2009 is:
A) 0.7336.
B) –0.7336.
C) 0.8565.
D) –0.8565.
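
When only the coefficient table is available, r² can be recovered from the slope t statistic through r² = t² / (t² + df). The sketch assumes all 54 cities were used in the fit, so df = 54 − 2 = 52:

```python
import math

t_slope = 11.967      # t value for log(bigmac2003) from the output
n_cities = 54         # assumption: every city in the report entered the regression
df_error = n_cities - 2

r_squared = t_slope**2 / (t_slope**2 + df_error)
# The estimated slope is positive, so the correlation takes the positive root.
correlation = math.sqrt(r_squared)
print(round(r_squared, 4), round(correlation, 4))
```

Note that the two magnitudes differ: one is r² and the other is r.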

27. What is the average percent change in the price of a Big Mac in 2009 for a 1% increase
in the price of a Big Mac in 2003?
A) 64.03%
B) 80.3%
C) 22.9%
D) 6.7%

28. A 95% confidence interval for the slope of the regression line is (approximately):
A) .
B)
C)
D)
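
The interval itself can be computed directly from the regression output. This sketch assumes 52 error degrees of freedom (54 cities minus 2) and the table value t* ≈ 2.007 for 95% confidence:

```python
# Assumption: 95% critical value of t with 52 degrees of freedom, from a t table.
T_STAR_95_DF52 = 2.007

slope_estimate = 0.80293
slope_se = 0.06709

margin = T_STAR_95_DF52 * slope_se
ci_low = slope_estimate - margin
ci_high = slope_estimate + margin
print(round(ci_low, 3), round(ci_high, 3))
```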

29. Below is a residual plot for the fitted model.

Which of the following statements does the plot support?


A) There is no striking evidence in the plot that the conditions necessary for regression
are violated.
B) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution.
C) The plot contains evidence that the standard deviation of the response about the
true regression line is not the same everywhere.
D) There appears to be an outlier and/or influential observations in the plot, suggesting
that our results must be interpreted with caution, and the plot contains evidence that
the standard deviation of the response about the true regression line is not the same
everywhere.

30. Below is a Normal quantile plot of the regression residuals.

Which of the following statements does the plot support?


A) The distribution of the residuals appears to be Normal, so we can proceed to
inference with confidence.
B) The distribution of the residuals does not appear to be Normal, so we cannot trust
inference for the slope.
C) The distribution of the residuals does not appear to be Normal, but we can still trust
inference for the slope because regression inference is robust against moderate lack
of Normality.
D) A Normal quantile plot does not help us check the conditions for regression
inference.

Use the following to answer questions 31-37:

Is snowfall in the Sierra Nevada mountains associated with stream runoff in southern California?
If so the amount of snowfall can be used to predict the volume of stream runoff, one factor that is
known to affect the water supply in California. In this problem you are tasked with using a
regression model to explore the relationship between snowfall (in inches) in the Sierra Nevadas
and stream runoff volume (in acre-feet) near Bishop, California.

A scatterplot of snowfall versus stream runoff and regression output from statistical software are
given below, and should be used to answer the following questions. The data set consists of 42
years of precipitation measurements at a site near Owens Valley in the Sierra Nevadas and
stream runoff volume near Bishop, California.

Term         Estimate    Std. Error
Intercept    26184.4     3517.3
snowfall     3824.8      247.5

31. Which of the following BEST describes the association between stream runoff and
snowfall?
A) no association
B) positive nonlinear association
C) negative linear association
D) positive linear association

32. Approximately what percentage of the variation in stream runoff does the regression
model explain?
A) 68%
B) 85.7%
C) 92.6%
D) 100%

33. For each additional inch of snowfall, stream runoff:


A) increases by 3824 acre-feet, on average.
B) decreases by 3824 acre-feet, on average.
C) increases by 26,184 acre-feet, on average.
D) decreases by 26,184 acre-feet, on average.

34. What is the test statistic for a test of the slope?


A) –15.45
B) 15.45
C) 7.45
D) 0.926
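
The slope test statistic, and from it the share of variation explained, can be sketched from the coefficient table. The degrees of freedom assume all 42 years were used in the fit:

```python
slope_estimate = 3824.8
slope_se = 247.5
n_years = 42            # assumption: all 42 years entered the regression
df_error = n_years - 2

t_stat = slope_estimate / slope_se
# In simple linear regression, R^2 = t^2 / (t^2 + error df).
r_squared = t_stat**2 / (t_stat**2 + df_error)
print(round(t_stat, 2), round(r_squared, 3))
```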

35. Do the residual plots below make you question the appropriateness of the hypothesis
test?

A) No, none of the conditions necessary for inference appear to be violated.


B) Yes, the residuals are too large.
C) Yes, the residuals are not normally distributed.
D) Yes, the standard deviation of the residuals does not appear to be constant across
all levels of snowfall.

36. A 95% confidence interval for the slope of the regression line is (approximately):
A) .
B) .
C) .
D)

37. Based on the confidence interval for the slope β₁, can you conclude that the linear
association between stream runoff and snowfall is significant?
A) No, confidence intervals cannot be used to make statements about significance.
B) No, because 0 is not contained within the confidence interval.
C) Yes, because we are 95% confident in the interval.
D) Yes, because 0 is not contained within the confidence interval.

38. True or False. Prediction intervals are always narrower than confidence intervals.
A) True
B) False

Use the following to answer questions 39-40:

An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
winnings using the simple linear regression model

1993 winningsᵢ = β₀ + β₁(average number of putts per hole)ᵢ + εᵢ

where the deviations εᵢ are assumed to be independent and Normally distributed with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:

R² = 0.081
s = 281,777

Variable      Parameter Estimate    Std. Err. of Parameter Est.
Constant      7,897,179             3,023,782
Avg. Putts    –4,139,198            1,698,371

39. Suppose we use statistical software to predict the 1993 mean winnings for all PGA tour
pros who averaged 1.75 putts per hole and obtain the following output:

Fit        S.E. Fit    95.0% C.I.            95.0% P.I.
653,582    61,621      (530,559, 776,605)    (77,731, 1,229,433)

A 95% confidence interval for this prediction according to this output is:
A) (77,731, 1,229,433).
B) (530,559, 776,605).
C) (591,961, 715,203).
D) (530,340, 776,824).

40. Suppose we use statistical software to predict the 1993 winnings for PGA tour pros who
averaged 1.75 putts per hole and obtain the following output:

Fit        S.E. Fit    95.0% C.I.            95.0% P.I.
653,582    61,621      (530,559, 776,605)    (77,731, 1,229,433)

A 95% interval for this prediction according to this output is:


A) (77,731, 1,229,433).
B) (530,559, 776,605).
C) (591,961, 715,203).
D) (530,340, 776,824).
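
The two intervals in the output differ only in the standard error they use: the interval for a mean response uses S.E. Fit, while the interval for a single new observation uses sqrt(s² + (S.E. Fit)²). The sketch below recovers the software's critical value from its own C.I. (an inference, not something the output states) and rebuilds the P.I. from it:

```python
import math

s = 281_777                          # regression standard error
fit = 653_582                        # predicted winnings at 1.75 putts per hole
se_fit = 61_621                      # standard error of the estimated mean response
ci_low, ci_high = 530_559, 776_605   # reported 95% C.I.

# Critical value implied by the reported confidence interval.
t_star = (ci_high - fit) / se_fit

# Standard error for predicting one new observation.
se_prediction = math.sqrt(s**2 + se_fit**2)

pi_low = fit - t_star * se_prediction
pi_high = fit + t_star * se_prediction
print(round(pi_low), round(pi_high))
```

The reconstructed limits agree with the reported P.I. up to rounding in the printed output, and the prediction interval is much wider because s dominates S.E. Fit.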

Use the following to answer questions 41-42:

A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used

Profitsᵢ = β₀ + β₁(Sales)ᵢ + εᵢ

where the deviations εᵢ were assumed to be independent and Normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:

R² = 0.662
s = 466.2

Variable    Parameter Estimate    S.E. of Parameter Estimate
Constant    –176.644              61.16
Sales       0.092498              0.0075

41. Suppose we wish to predict the mean profits (in hundreds of thousands of dollars) for all
companies that had sales (in hundreds of thousands of dollars) of 500. We use
statistical software to do the prediction and obtain the following output.

Sales    Fit       S.E. Fit    95% C.I.           95% P.I.
500      –130.4    59.3        (–248.5, –12.3)    (–1066.4, 805.6)

A 95% confidence interval for this prediction is:


A) (–1066.4, 805.6).
B) (–248.5, –12.3).
C) –130.4 ± 59.3.
D) 500 ± 59.3.

42. Suppose we wish to predict the profits (in hundreds of thousands of dollars) for a
company that had sales (in hundreds of thousands of dollars) of 500. We use statistical
software to do the prediction and obtain the following output:

Sales    Fit       S.E. Fit    95% C.I.           95% P.I.
500      –130.4    59.3        (–248.5, –12.3)    (–1066.4, 805.6)

A 95% interval for this prediction is:


A) (–1066.4, 805.6).
B) (–248.5, –12.3).
C) –130.4 ± 59.3.
D) 500 ± 59.3.

Use the following to answer questions 43-44:

Is snowfall in the Sierra Nevada mountains associated with stream runoff in southern California?
If so the amount of snowfall can be used to predict the volume of stream runoff, one factor that is
known to affect the water supply in California. In this problem you are tasked with using a
regression model to explore the relationship between snowfall (in inches) in the Sierra Nevadas
and stream runoff volume (in acre-feet) near Bishop, California.

A scatterplot of snowfall versus stream runoff and regression output from statistical software are
given below, and should be used to answer the following questions. The data set consists of 42
years of precipitation measurements at a site near Owens Valley in the Sierra Nevadas and
stream runoff volume near Bishop, California.

Term         Estimate    Std. Error
Intercept    26184.4     3517.3
snowfall     3824.8      247.5

43. In the winter of 2013–2014, the site in the Sierra Nevadas only received 4.5 inches of
snowfall. Suppose that you use statistical software to predict the stream runoff for this
year. The output is displayed below.
Fit         SE Fit     95% C.I.               95% P.I.
43395.83    2532.08    (38278.3, 48513.35)    (24516.51, 62275.14)
A 95% confidence interval for this prediction is:
A) (38278.3, 48513.35).
B) (24516.51, 62275.14).
C) (40863.75, 45927.91).
D) (38331.67, 48459.99).

44. In a future year only 4.5 inches of snowfall are forecast. Suppose that you use
statistical software to predict the stream runoff for this year. The output is displayed
below.
Fit         SE Fit     95% C.I.               95% P.I.
43395.83    2532.08    (38278.3, 48513.35)    (24516.51, 62275.14)
A 95% interval for this prediction is:
A) (38278.3, 48513.35).
B) (24516.51, 62275.14).
C) (40863.75, 45927.91).
D) (38331.67, 48459.99).
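
The Fit column is the fitted line evaluated at the new x value. A minimal sketch using the rounded coefficients from the regression output (the function name is illustrative):

```python
intercept = 26184.4   # estimated intercept, acre-feet
slope = 3824.8        # estimated slope, acre-feet per inch of snowfall


def predict_runoff(snowfall_inches: float) -> float:
    """Predicted stream runoff (acre-feet) from the fitted least-squares line."""
    return intercept + slope * snowfall_inches


predicted = predict_runoff(4.5)
print(round(predicted, 1))
```

The small gap between this value and the software's 43395.83 comes from using coefficients rounded to one decimal place.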

Use the following to answer questions 45-46:

An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
winnings using the simple linear regression model

1993 winningsᵢ = β₀ + β₁(average number of putts per hole)ᵢ + εᵢ

where the deviations εᵢ are assumed to be independent and normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following ANOVA table was obtained from statistical software:

Source    df    Sum of Squares
Model     1     4.71605 × 10¹¹
Error     67    53.19690 × 10¹¹

45. Total SS, the total sum of squares, has value:
A) 57.91295 × 10¹¹.
B) 53.19690 × 10¹¹.
C) 5.51003 × 10¹¹.
D) 0.79398 × 10¹¹.

46. The value of the ANOVA F statistic for testing the hypotheses H₀: β₁ = 0, Hₐ: β₁ ≠ 0
is:
A) 0.081.
B) 0.794.
C) 4.716.
D) 5.940.
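
The ANOVA quantities follow from the two sums of squares and their degrees of freedom: total SS is their sum, and F is the ratio of the two mean squares. A short sketch with the table values:

```python
ss_model = 4.71605e11
ss_error = 53.19690e11
df_model = 1
df_error = 67

ss_total = ss_model + ss_error
ms_model = ss_model / df_model
ms_error = ss_error / df_error       # also equals s squared

f_statistic = ms_model / ms_error
r_squared = ss_model / ss_total
print(ss_total, round(f_statistic, 2), round(r_squared, 3))
```

The last line also shows that R² is the model share of total SS, which ties the ANOVA table back to the R² reported by the software.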

Use the following to answer questions 47-50:

Are the inflation rates of the United States and the United Kingdom associated? If so, can we
attempt to predict the U.S. inflation rate using the U.K. inflation rate? Suppose we fit the
following simple linear regression model

U.S. inflation rateᵢ = β₀ + β₁(U.K. inflation rate)ᵢ + εᵢ

where the deviations εᵢ were assumed to be independent and Normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares. A
random sample of 20 annual rates was selected from the rates of the past 110 years. The
following results were obtained from statistical software.

R² = 0.533
s = 3.88795

Variable               Parameter Estimate    S.E. of Parameter Est.
Constant               0.2030                1.087
U.K. Inflation Rate    0.6652                0.1468

47. The degrees of freedom for residual MS, the mean sum of squares for error, is:
A) 18.
B) 19.
C) 1.
D) not able to be determined from the information given.

48. The value of residual MS, the mean sum of squares for error, is:
A) 1.972.
B) 3.888.
C) 15.116.
D) not able to be determined from the information given.

49. The value of total SS, the total sum of squares, is:
A) 310.54.
B) 272.09.
C) 582.63.
D) not able to be determined from the information given.
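
The ANOVA entries can be rebuilt from s, n, and R² alone: residual MS is s², residual SS is s² times its degrees of freedom, and total SS follows from R² = 1 − (residual SS / total SS). A sketch with the reported values:

```python
s = 3.88795        # regression standard error
n = 20             # sample size
r_squared = 0.533  # reported to three decimals, so results are approximate

df_error = n - 2
ms_error = s**2
ss_error = ms_error * df_error
ss_total = ss_error / (1 - r_squared)
print(df_error, round(ms_error, 3), round(ss_error, 2), ss_total)
```

Because R² is rounded in the output, the total SS computed this way matches the tabulated value only to about two decimal places.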

50. Suppose you wish to test the hypotheses H₀: ρ = 0, Hₐ: ρ ≠ 0, where ρ is the
population correlation between the U.K. inflation rate and the U.S. inflation rate. The
value of the t statistic for testing this hypothesis is:
A) 0.19.
B) 2.12.
C) 4.53.
D) 6.89.
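
The test for zero population correlation uses t = r·sqrt(n − 2) / sqrt(1 − r²), which is algebraically the same statistic as the test for zero slope. A sketch computed from the reported R²:

```python
import math

r_squared = 0.533
n = 20

# t = r * sqrt(n - 2) / sqrt(1 - r^2), written in terms of r^2.
t_correlation = math.sqrt(r_squared * (n - 2) / (1 - r_squared))

# Slope-based statistic from the coefficient table, for comparison.
t_slope = 0.6652 / 0.1468
print(round(t_correlation, 2), round(t_slope, 2))
```

The two values agree to two decimals; any remaining difference is rounding in the printed R².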

Use the following to answer questions 51-53:

A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used:

Profitsᵢ = β₀ + β₁(Sales)ᵢ + εᵢ

where the deviations εᵢ were assumed to be independent and normally distributed, with mean 0
and standard deviation σ. This model was fit to the data using the method of least squares.
The following ANOVA table was obtained from statistical software.

Source    df    Sum of Squares
Model     1     32,809,212
Error           16,734,234

51. The degrees of freedom for the residual SS, the error sum of squares, is:
A) 1.
B) 2.
C) 77.
D) 78.

52. Total SS, the total sum of squares, has value:


A) 16,074,978.
B) 16,734,234.
C) 32,809,212.
D) 49,543,448.

53. The value of the ANOVA F statistic for testing the hypotheses H₀: β₁ = 0, Hₐ: β₁ ≠ 0
is:
A) 1.96.
B) 77.
C) 150.97.
D) 217,328.

54. Which of the following is TRUE about the width of confidence intervals for the mean
response μ_y?
A) The widths of the intervals do not depend on the value of x.
B) We have more information for estimating means that correspond to extreme values
of x, so the intervals get narrower as the value of x moves away from x̄.
C) We have less information for estimating means that correspond to extreme values
of x, so the intervals get wider as the value of x moves away from x̄.
D) None of the answers is correct.

55. The analysis of variance equation states that:


A) total SS = regression SS + residual SS.
B) regression SS = total SS + residual SS.
C) residual SS = total SS + regression SS.
D) total SS = regression SS – residual SS.

56. Which of the following statements about the regression standard error hold TRUE?
I. The regression standard error reflects the variation of the y-values about the
regression line.
II. The regression standard error is an estimate of the model standard deviation σ.
III. The larger the regression standard error is, the better the model fits the data and the
more precise inference about the regression model will be.
A) I
B) I and II
C) II and III
D) I, II, and III

57. Madeline fit a simple linear regression model and put the ANOVA table shown below in
her report. A week later, she realized that she needed the value of R² in the results
section of her report. Calculate the value of R² using the information contained in the
ANOVA table.

Source        df    Sum of Squares    Mean Square    F Value    Prob > F
Regression    1     2322              2322           8.71       0.032
Residual      5     1334              267
Total         6     3656
A) This cannot be determined from the information given.
B) 0.3649
C) 0.5745
D) 0.6351
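
The share of variation explained by a regression is regression SS divided by total SS, so it can be read straight from an ANOVA table. A sketch with the sums of squares shown above:

```python
ss_regression = 2322
ss_residual = 1334
ss_total = 3656

# Sanity check: the analysis of variance identity should hold for the table.
identity_holds = ss_regression + ss_residual == ss_total

explained_share = ss_regression / ss_total
print(identity_holds, round(explained_share, 4))
```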

Use the following to answer questions 58-61:

Max fit a simple linear regression model for his class project and saved the results in his report.
A week later, he realized that he needed to provide the ANOVA table in his report as well;
however, his computer crashed and he lost his original analysis. From the below information,
recreate the ANOVA table.

Term         Estimate    Std. Error    t value    Prob > |t|
Intercept    –335.50     99.35         –3.377     0.001
Size         400.76      42.85         9.354      < 0.0001

58. What are the degrees of freedom associated with the ANOVA F test?
A) 1 and 120
B) 1 and 118
C) 2 and 120
D) 2 and 118

59. Find the sum of squares error.


A) 522.5
B) 273,006.2
C) 32,214,738
D) This cannot be determined from the given information.

60. Find the sum of squares total.


A) 471,419.6
B) 32,214,738
C) 55,627,508
D) 56,098,928
E) This cannot be determined from the given information.

61. Find the value of the ANOVA F statistic.


A) 0.58
B) 0.74
C) 87.4
D) 205.5
E) This cannot be determined from the given information.
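
In simple linear regression with one explanatory variable, the ANOVA F statistic equals the square of the slope t statistic. A sketch using the t value from the coefficient table:

```python
t_slope = 9.354   # t value for the Size coefficient

# With a single explanatory variable, F = t^2.
f_statistic = t_slope**2
print(f_statistic)
```

The printed t value is itself rounded, so the result agrees with a tabulated F only to about one decimal place.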

Use the following to answer questions 62-78:

You are interested in starting a specialty coffee shop, but you need to establish how the volume
of production will affect your average total costs before you can complete your business plan.
You have a friend with connections in the industry, and she obtains the average total cost per cup
for a random sample of several establishments in your region. The final data set your friend
compiles contains information on 61 coffee shops including the average total cost of production
(in cents per cup) and the rate of output (in cups per hour).

62. What is the explanatory variable?


A) revenue
B) coffee shops
C) output rate
D) average total cost

63. What is the response variable?


A) revenue
B) coffee shops
C) output rate
D) average total cost

64. Which of the following BEST describes the association between average total cost and
output rate?
A) strong, positive, linear association
B) strong, negative, linear association
C) weak, negative, linear association
D) weak, positive, linear association
E) no linear association

65. Summary statistics for the data set are given below.

Calculate the slope of the least-squares regression line.


A) –2.662
B) –1.487
C) –1.25
D) This cannot be determined from the given information.

66. Summary statistics for the data set are given below.

Calculate the y-intercept of the least-squares regression line.


A) 248.28
B) 200.18
C) 181.15
D) 167.61
E) This cannot be determined from the given information.

67. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model.

What does the residual plot reveal about the conditions for inference?
A) Nothing, it just looks like random scatter.
B) The spread of the residuals is not constant across output rate.
C) The spread of the residuals is constant across output rate.
D) There are many troubling outliers.

68. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model.

What does the Normal quantile plot reveal about the conditions for inference?
A) Nothing, a Normal quantile plot is not useful in the assessment of the necessary
conditions for inference.
B) The Normal quantile plot reveals that the distribution of the residuals is not Normal.
C) The Normal quantile plot reveals that the distribution of the residuals is
(approximately) Normal.
D) The Normal quantile plot reveals that the distribution of the response variable is
(approximately) Normal.

69. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model.

Is the random sampling condition met?


A) Yes, we have a random sample of 61 coffee shops.
B) No, but this is not a necessary condition for regression inference.
C) No, we have a convenience sample of 61 coffee shops.
D) No, but this is not a necessary condition for regression inference, and we have a
convenience sample of 61 coffee shops.

70. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1                                            < 0.0001
Residual                  8492.2
Total        60

The degrees of freedom associated with the residual mean square are:
A) 1.
B) 59.
C) 60.
D) 61.

71. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1                                            < 0.0001
Residual                  8492.2
Total        60

Calculate the regression sum of squares.


A)
B)
C)
D)

72. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1                                            < 0.0001
Residual                  8492.2
Total        60

Calculate the mean square error.


A) 92.15
B) 8492.2
C) 143.94
D) None of the answers is correct.
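The mean square error is the residual sum of squares divided by the residual degrees of freedom. A minimal Python sketch of that arithmetic follows, using only the values visible in the partial ANOVA table. The variable names are illustrative.

```python
# Values read from the partial ANOVA table above.
df_total = 60        # total degrees of freedom (n - 1, so n = 61)
df_regression = 1    # one explanatory variable
sse = 8492.2         # residual (error) sum of squares

df_residual = df_total - df_regression   # n - 2 for simple linear regression
mse = sse / df_residual                  # mean square error
s = mse ** 0.5                           # regression standard error (estimate of sigma)

print(df_residual, round(mse, 2), round(s, 2))
```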

73. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1                                            < 0.0001
Residual                  8492.2
Total        60

Calculate the ANOVA F statistic for testing H0: β1 = 0.


A) 118.13
B) 2.002
C) 177.13
D) None of the answers is correct.
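The ANOVA F statistic is the regression mean square divided by the residual mean square. The sketch below assumes the regression sum of squares 17003.4, which is not in the partial table above and is taken from the full regression output given with question 75. The variable names are illustrative.

```python
ssr = 17003.4        # regression sum of squares (ASSUMED from the full output in question 75)
sse = 8492.2         # residual sum of squares from the partial table
df_regression = 1
df_residual = 59     # total DF (60) minus regression DF (1)

msr = ssr / df_regression   # regression mean square
mse = sse / df_residual     # residual mean square (MSE)
f_stat = msr / mse          # ANOVA F statistic

print(round(f_stat, 2))
```

In simple linear regression this F statistic equals the square of the t statistic for the slope, which gives a quick consistency check against the coefficient table.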

74. What can you conclude about the association between output rate and total cost based
on the ANOVA F test?
A) There is no evidence of an association between output rate and total cost.
B) There is strong evidence of a negative association between output rate and total
cost.
C) There is strong evidence of a positive association between output rate and total
cost.
D) There is strong evidence of an association between output rate and total cost.

75. The regression output for this simple linear regression model is given below:

Regression Statistics:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1          17003.4       17003.4    118.13   < 0.0001
Residual     59           8492.2         143.9
Total        60          25495.6

Term         Estimate   Std. Error   t value   Prob > |t|
Intercept    181.1443       7.9652     22.74     < 0.0001
OutputRate    –1.4869       0.1368    –10.87     < 0.0001

By how much do average total costs change if the output rate decreases by one cup per
hour? Compute a 90% confidence interval for this change.
A)
B)
C)
D) None of the answers is correct.
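A confidence interval for the slope has the form b1 ± t*·SE(b1), and a one-cup decrease in output rate changes mean cost by -b1. The sketch below is a hedged illustration. The critical value 1.671 is an assumption (roughly the 95th percentile of the t distribution with 59 degrees of freedom), and a printed table that rounds down to 50 degrees of freedom would give a slightly wider interval.

```python
b1 = -1.4869       # slope estimate from the regression output
se_b1 = 0.1368     # standard error of the slope
t_star = 1.671     # ASSUMED critical value for 90% confidence with 59 df

margin = t_star * se_b1

# A one-cup decrease in output rate changes mean total cost by -b1.
change = -b1
lower, upper = change - margin, change + margin

print(round(lower, 3), round(upper, 3))
```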

76. The regression output for this simple linear regression model is given below:

Regression Statistics:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1          17003.4       17003.4    118.13   < 0.0001
Residual     59           8492.2         143.9
Total        60          25495.6

Term         Estimate   Std. Error   t value   Prob > |t|
Intercept    181.1443       7.9652     22.74     < 0.0001
OutputRate    –1.4869       0.1368    –10.87     < 0.0001

Suppose that you plan to open a shop that will operate at a rate of 80 cups per hour.
What do you predict your total costs will be? (Round your answer to 2 decimal places.)
A) $62.19
B) $169.24
C) $300.10
D) Not enough information is given to determine the answer.
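The prediction comes from substituting the output rate into the fitted line, predicted cost = b0 + b1·x. A minimal sketch with the estimates from the coefficient table follows. The variable names are illustrative.

```python
b0 = 181.1443    # intercept estimate
b1 = -1.4869     # slope estimate (OutputRate)
x_star = 80      # planned output rate, cups per hour

predicted_cost = b0 + b1 * x_star
print(f"${predicted_cost:.2f}")
```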

77. The regression output for this simple linear regression model is given below:

Regression Statistics:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1          17003.4       17003.4    118.13   < 0.0001
Residual     59           8492.2         143.9
Total        60          25495.6

Term         Estimate   Std. Error   t value   Prob > |t|
Intercept    181.1443       7.9652     22.74     < 0.0001
OutputRate    –1.4869       0.1368    –10.87     < 0.0001

Suppose that you wish to calculate a 95% prediction interval for your prediction of what
total costs will be in a shop that operates at a rate of 80 cups per hour. Calculate the
appropriate standard error you would use in the calculation.
A)

B)

C)

D) None of the answers is correct.
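The standard error for predicting a single new response at x* is s·sqrt(1 + 1/n + (x* - xbar)²/Sxx), where Sxx is the sum of squared deviations of x and can be recovered as (s / SE(b1))². The sketch below wraps this in a hypothetical helper function. The mean output rate xbar is not shown in the output above, so the example sets xbar equal to x* purely as an assumption. That choice gives the smallest value the standard error can take, not the answer to the question.

```python
import math

def prediction_se(s, n, x_star, x_bar, se_b1):
    """Standard error for predicting one new response at x_star."""
    sxx = (s / se_b1) ** 2   # sum of squared deviations of x, since SE(b1) = s / sqrt(Sxx)
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)

s = math.sqrt(8492.2 / 59)   # regression standard error from the ANOVA table
n = 61                       # total DF + 1

# ASSUMPTION: x_bar is unknown here; x_bar = x_star gives a lower bound only.
se_lower_bound = prediction_se(s, n, x_star=80, x_bar=80, se_b1=0.1368)
print(round(se_lower_bound, 2))
```

The leading 1 under the square root is what separates a prediction interval from a confidence interval for the mean response, so this standard error is always larger than s.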

78. The regression output for this simple linear regression model is given below:

Regression Statistics:

ANOVA
Source       DF   Sum of Squares   Mean Square   F Value   Prob > F
Regression    1          17003.4       17003.4    118.13   < 0.0001
Residual     59           8492.2         143.9
Total        60          25495.6

Term         Estimate   Std. Error   t value   Prob > |t|
Intercept    181.1443       7.9652     22.74     < 0.0001
OutputRate    –1.4869       0.1368    –10.87     < 0.0001

Approximately what proportion of the variation in total cost does the regression model
explain?
A) 0.3331
B) 0.4994
C) 0.6613
D) 0.6669
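The proportion of variation explained is r² = SSR/SST, the regression sum of squares divided by the total sum of squares. A minimal sketch using the ANOVA table values follows. The variable names are illustrative.

```python
import math

ssr = 17003.4    # regression sum of squares
sst = 25495.6    # total sum of squares

r_squared = ssr / sst
# The correlation takes the sign of the slope, which is negative here.
r = -math.sqrt(r_squared)

print(round(r_squared, 4))
```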

Answer Key
1. D
2. B
3. B
4. B
5. C
6. A
7. D
8. D
9. D
10. A
11. C
12. A
13. A
14. D
15. D
16. B
17. D
18. C
19. B
20. D
21. B
22. B
23. B
24. D
25. B
26. C
27. B
28. D
29. B
30. C
31. D
32. B
33. A
34. B
35. A
36. D
37. D
38. B
39. B
40. A
41. B
42. A
43. A
44. B

45. A
46. D
47. A
48. C
49. C
50. C
51. C
52. D
53. C
54. C
55. A
56. B
57. D
58. B
59. C
60. D
61. C
62. C
63. D
64. B
65. B
66. C
67. C
68. C
69. A
70. B
71. B
72. C
73. A
74. D
75. B
76. A
77. C
78. D
