Practice of Statistics For Business and Economics 4th Edition Moore Test Bank
2. “Students receiving a 4.0 in their first semester of college don't work as hard in future
semesters, explaining why the GPAs of that group of students fall over their college
career.” This statement is an example of:
A) Simpson's paradox.
B) the regression fallacy.
C) regression to mediocrity.
D) the gambler's fallacy.
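3. If you reject the null hypothesis $H_0: \beta_1 = 0$ in favor of $H_a: \beta_1 \neq 0$, what can you say
about the test of $H_0: \rho = 0$ versus $H_a: \rho \neq 0$?
A) We will fail to reject $H_0: \rho = 0$.
B) We will reject $H_0: \rho = 0$ in favor of $H_a: \rho \neq 0$.
C) Nothing. The test for $\beta_1$ reveals no information about $\rho$, so there is not enough
information.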
D) None of the answers is correct.
Use the following to answer questions 4-11:
An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
winnings using the simple linear regression model
$\text{1993 winnings}_i = \beta_0 + \beta_1(\text{average number of putts per hole})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ are assumed to be independent and Normally distributed with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:
$R^2 = 0.081$
s = 281,777
7. Suppose the researchers test the hypotheses $H_0: \beta_1 = 0$, $H_a: \beta_1 < 0$.
8. A 95% confidence interval for the slope $\beta_1$ in the simple linear regression model is
(approximately):
A) 7,897,179 ± 3,023,782.
B) 7,897,179 ± 6,047,564.
C) –4,139,198 ± 1,698,371.
D) –4,139,198 ± 3,396,742.
9. The correlation between 1993 winnings and average number of putts per hole is:
A) 0.081.
B) –0.081.
C) 0.285.
D) –0.285.
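As a check on question 9: in simple linear regression the squared correlation equals $R^2$, and the correlation takes the sign of the fitted slope. The sketch below assumes the slope is negative, as suggested by the one-sided alternative in question 7; the variable names are illustrative only.

```python
import math

r_squared = 0.081   # R^2 reported in the software output
slope_sign = -1     # assumption: the fitted slope is negative

# In simple linear regression, r^2 = R^2 and r has the sign of the slope.
r = slope_sign * math.sqrt(r_squared)
print(round(r, 3))
```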
10. Here is a scatterplot of the 1993 winnings versus the average number of putts per round
and a plot of the residuals versus the average number of putts per round.
regression line in the previous problems. Obviously there is a major error present.
Are the inflation rates of the United States and the United Kingdom associated? If so, can we
attempt to predict the U.S. inflation rate using the U.K. inflation rate? Suppose we fit the
following simple linear regression model
$\text{U.S. inflation rate}_i = \beta_0 + \beta_1(\text{U.K. inflation rate})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ were assumed to be independent and Normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares. A
random sample of 20 annual rates was selected from the rates of the past 110 years. The
following results were obtained from statistical software.
$R^2 = 0.533$
s = 3.88795
13. A 90% confidence interval for the slope $\beta_1$ in the simple linear regression model is
(approximately):
A) (0.4106, 0.9198).
B) (0.3568, 0.9736).
C) (0.3590, 0.9714).
D) (0.4120, 0.9184).
16. Is there strong evidence (and if so, why) that a straight line adequately describes the
relationship between the U.K. inflation rate and the U.S. inflation rate?
A) Yes, because the slope of the least-squares line is positive.
B) Yes, because the P-value for testing if the slope is 0 is quite small.
C) No, because the value of the square of the correlation is relatively small.
D) It is impossible to say, because we are not given the actual value of the correlation.
17. Here is a scatterplot of the two variables (both rates are in percentages):
Use the following to answer questions 18-22:
A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used
$\text{Profits}_i = \beta_0 + \beta_1(\text{Sales})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ were assumed to be independent and Normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:
$R^2 = 0.662$
s = 466.2
19. A 90% confidence interval for the slope $\beta_1$ in the simple linear regression model is
(approximately):
A) 0.09 ± 0.0075.
B) 0.09 ± 0.012.
C) –0.09 ± 0.0075.
D) –0.09 ± 0.012.
20. Suppose the researchers test the hypotheses $H_0: \beta_1 = 0$, $H_a: \beta_1 > 0$.
21. Is there strong evidence (and if so, why) of a straight line relationship between sales and
profits?
A) Yes, because the slope of the least-squares line is positive.
B) Yes, because the P-value for testing if the slope is 0 is quite small.
C) No, because the value of the square of the correlation is relatively small.
D) It is impossible to say because we are not given the actual value of the correlation.
Use the following to answer questions 23-25:
The Union Bank of Switzerland (UBS) produces regular reports on the prices and earnings in
major cities throughout the world. Included in this report are the prices of basic commodities,
reported in minutes of labor, including 1 kg of rice, a 1 kg loaf of bread, and a Big Mac, for 54
major cities around the world. An analyst is interested in understanding how prices have changed
since the global financial crisis in 2007–2008. To do this, they wish to use the price of a Big Mac
in 2003 to predict the price of a Big Mac in 2009.
24. A scatterplot of the price of a Big Mac in 2009 versus the price of a Big Mac in 2003 is
given below.
25. A scatterplot of the log price of a Big Mac in 2009 versus the log price of a Big Mac in
2003 is given below.
Use the following to answer questions 26-30:
The Union Bank of Switzerland (UBS) produces regular reports on the prices and earnings in
major cities throughout the world. Included in this report are the prices of basic commodities,
reported in minutes of labor, including 1 kg of rice, a 1 kg loaf of bread, and a Big Mac, for 54
major cities around the world. An analyst is interested in understanding how prices have changed
since the global financial crisis in 2007–2008. To do this, they wish to use the price of a Big Mac
in 2003 to predict the price of a Big Mac in 2009.
The regression output for the regression of log 2009 Big Mac prices on log 2003 Big Mac prices
is given below.
26. The correlation between the log price of a Big Mac in 2003 and the log price of a Big
Mac in 2009 is:
A) 0.7336.
B) –0.7336.
C) 0.8565.
D) –0.8565.
27. What is the average percent change in the price of a Big Mac in 2003 for a 1% increase
in the price of a Big Mac in 2009?
A) 64.03%
B) 80.3%
C) 22.9%
D) 6.7%
28. A 95% confidence interval for the slope of the regression line is (approximately):
A) .
B)
C)
D)
29. Below is a residual plot for the fitted model.
30. Below is a Normal quantile plot of the regression residuals.
Use the following to answer questions 31-37:
Is snowfall in the Sierra Nevada mountains associated with stream runoff in southern California?
If so the amount of snowfall can be used to predict the volume of stream runoff, one factor that is
known to affect the water supply in California. In this problem you are tasked with using a
regression model to explore the relationship between snowfall (in inches) in the Sierra Nevadas
and stream runoff volume (in acre-feet) near Bishop, California.
A scatterplot of snowfall versus stream runoff and regression output from statistical software are
given below, and should be used to answer the following questions. The data set consists of 42
years of precipitation measurements at a site near Owens Valley in the Sierra Nevadas and
stream runoff volume near Bishop, California.
31. Which of the following BEST describes the association between stream runoff and
snowfall?
A) no association
B) positive nonlinear association
C) negative linear association
D) positive linear association
32. Approximately what percentage of the variation in stream runoff does the regression
model explain?
A) 68%
B) 85.7%
C) 92.6%
D) 100%
35. Do the residual plots below make you question the appropriateness of the hypothesis
test?
36. A 95% confidence interval for the slope of the regression line is (approximately):
A) .
B) .
C) .
D)
37. Based on the confidence interval for the slope $\beta_1$, can you conclude that the linear
association between stream runoff and snowfall is significant?
A) No, confidence intervals cannot be used to make statements about significance.
B) No, because 0 is not contained within the confidence interval.
C) Yes, because we are 95% confident in the interval.
D) Yes, because 0 is not contained within the confidence interval.
38. True or False. Prediction intervals are always narrower than confidence intervals.
A) True
B) False
Use the following to answer questions 39-40:
An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
winnings using the simple linear regression model
$\text{1993 winnings}_i = \beta_0 + \beta_1(\text{average number of putts per hole})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ are assumed to be independent and Normally distributed with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:
$R^2 = 0.081$
s = 281,777
39. Suppose we use statistical software to predict the 1993 mean winnings for all PGA tour
pros who averaged 1.75 putts per hole and obtain the following output:
A 95% confidence interval for this prediction according to this output is:
A) (77,731, 1,229,433).
B) (530,559, 776,605).
C) (591,961, 715,203).
D) (530,340, 776,824).
40. Suppose we use statistical software to predict the 1993 winnings for PGA tour pros who
averaged 1.75 putts per hole and obtain the following output:
A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used
$\text{Profits}_i = \beta_0 + \beta_1(\text{Sales})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ were assumed to be independent and Normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.
The following results were obtained from statistical software:
$R^2 = 0.662$
s = 466.2
41. Suppose we wish to predict the mean profits (in hundreds of thousands of dollars) for all
companies that had sales (in hundreds of thousands of dollars) of 500. We use
statistical software to do the prediction and obtain the following output.
42. Suppose we wish to predict the profits (in hundreds of thousands of dollars) for a
company that had sales (in hundreds of thousands of dollars) of 500. We use statistical
software to do the prediction and obtain the following output:
Use the following to answer questions 43-44:
Is snowfall in the Sierra Nevada mountains associated with stream runoff in southern California?
If so the amount of snowfall can be used to predict the volume of stream runoff, one factor that is
known to affect the water supply in California. In this problem you are tasked with using a
regression model to explore the relationship between snowfall (in inches) in the Sierra Nevadas
and stream runoff volume (in acre-feet) near Bishop, California.
A scatterplot of snowfall versus stream runoff and regression output from statistical software are
given below, and should be used to answer the following questions. The data set consists of 42
years of precipitation measurements at a site near Owens Valley in the Sierra Nevadas and
stream runoff volume near Bishop, California.
43. In the winter of 2013–2014, the site in the Sierra Nevadas only received 4.5 inches of
snowfall. Suppose that you use statistical software to predict the stream runoff for this
year. The output is displayed below.
Fit         SE Fit     95% C.I.               95% P.I.
43395.83    2532.08    (38278.3, 48513.35)    (24516.51, 62275.14)
A 95% confidence interval for this prediction is:
A) (38278.3, 48513.35).
B) (24516.51, 62275.14).
C) (40863.75, 45927.91).
D) (38331.67, 48459.99).
44. In a future year only 4.5 inches of snowfall are forecast. Suppose that you use
statistical software to predict the stream runoff for this year. The output is displayed
below.
Fit         SE Fit     95% C.I.               95% P.I.
43395.83    2532.08    (38278.3, 48513.35)    (24516.51, 62275.14)
A 95% interval for this prediction is:
A) (38278.3, 48513.35).
B) (24516.51, 62275.14).
C) (40863.75, 45927.91).
D) (38331.67, 48459.99).
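Questions 43 and 44 turn on the difference between the two intervals in the same output: the confidence interval (C.I.) is for the mean runoff at 4.5 inches of snowfall, while the prediction interval (P.I.) is for a single year. The sketch below back-calculates the quantities implied by the printed output. The critical value, the prediction standard error, and the regression standard error are derived here rather than given in the source, so treat them as approximate.

```python
import math

# Values copied from the software output for 4.5 inches of snowfall.
fit = 43395.83
se_fit = 2532.08
ci = (38278.3, 48513.35)     # 95% interval for the mean response
pi = (24516.51, 62275.14)    # 95% interval for an individual response

# Both intervals are fit +/- t* x (standard error), with the same t*.
t_star = (ci[1] - fit) / se_fit           # implied critical value
se_pred = (pi[1] - fit) / t_star          # implied SE for an individual prediction

# se_pred^2 = s^2 + se_fit^2, so the implied regression standard error is:
s_implied = math.sqrt(se_pred ** 2 - se_fit ** 2)

print(round(t_star, 3), round(se_pred, 1), round(s_implied, 1))
print("P.I. wider than C.I.:", (pi[1] - pi[0]) > (ci[1] - ci[0]))
```

Because the prediction standard error always adds $s^2$ to the squared standard error of the fit, the prediction interval is wider than the confidence interval at the same confidence level.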
An old saying in golf is “you drive for show and you putt for dough.” The point is that good
putting is more important than long driving for shooting low scores and hence winning money.
To see if this is the case, data on the top 69 money winners on the PGA tour in 1993 are
examined. The average number of putts per hole for each player is used to predict their total
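winnings using the simple linear regression model
$\text{1993 winnings}_i = \beta_0 + \beta_1(\text{average number of putts per hole})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ are assumed to be independent and normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.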
The following ANOVA table was obtained from statistical software:
45. Total SS, the total sum of squares, has value:
A) $57.91295 \times 10^{11}$.
B) $53.19690 \times 10^{11}$.
C) $5.51003 \times 10^{11}$.
D) $0.79398 \times 10^{11}$.
46. The value of the ANOVA F statistic for testing the hypotheses $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
is:
A) 0.081.
B) 0.794.
C) 4.716.
D) 5.940.
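The ANOVA table for these questions is not reproduced here, but its entries can be approximated from the summary values stated earlier for this data set (n = 69, s = 281,777, R² = 0.081). The sketch below is a rough cross-check under that assumption. Because R² is rounded to three decimals, the total sum of squares and the F statistic come out slightly different from the table-based answers.

```python
n = 69                # number of players
s = 281_777           # regression standard error from the output
r_squared = 0.081     # R^2 from the output (rounded)

df_resid = n - 2                       # residual degrees of freedom
mse = s ** 2                           # residual mean square
sse = mse * df_resid                   # residual (error) sum of squares
sst = sse / (1 - r_squared)            # total SS, using SSE = (1 - R^2) * SST
f_stat = r_squared / (1 - r_squared) * df_resid   # F = MSR / MSE with 1 numerator df

print(f"MSE = {mse / 1e11:.5f} x 10^11")
print(f"SSE = {sse / 1e11:.5f} x 10^11")
print(f"SST = {sst / 1e11:.3f} x 10^11 (approximate)")
print(f"F   = {f_stat:.2f} (approximate)")
```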
Are the inflation rates of the United States and the United Kingdom associated? If so, can we
attempt to predict the U.S. inflation rate using the U.K. inflation rate? Suppose we fit the
following simple linear regression model
$\text{U.S. inflation rate}_i = \beta_0 + \beta_1(\text{U.K. inflation rate})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ were assumed to be independent and Normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares. A
random sample of 20 annual rates was selected from the rates of the past 110 years. The
following results were obtained from statistical software.
$R^2 = 0.533$
s = 3.88795
47. The degrees of freedom for residual MS, the mean sum of squares for error, is:
A) 18.
B) 19.
C) 1.
D) not able to be determined from the information given.
48. The value of residual MS, the mean sum of squares for error, is:
A) 1.972.
B) 3.888.
C) 15.116.
D) not able to be determined from the information given.
49. The value of total SS, the total sum of squares, is:
A) 310.54.
B) 272.09.
C) 582.63.
D) not able to be determined from the information given.
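For questions 47 to 49, the ANOVA quantities follow from the three summary values given for the inflation data (n = 20, s = 3.88795, R² = 0.533). A minimal sketch of the arithmetic, with illustrative variable names:

```python
n = 20                # number of sampled annual rates
s = 3.88795           # regression standard error
r_squared = 0.533     # R^2 from the output

df_resid = n - 2                 # residual degrees of freedom
mse = s ** 2                     # residual mean square is s squared
sse = mse * df_resid             # residual sum of squares
sst = sse / (1 - r_squared)      # total SS, since SSE = (1 - R^2) * SST
ssr = sst - sse                  # regression sum of squares

print(df_resid, round(mse, 3), round(sse, 2), round(sst, 2), round(ssr, 2))
```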
50. Suppose you wish to test the hypotheses $H_0: \rho = 0$, $H_a: \rho \neq 0$, where $\rho$ is the
population correlation between the U.K. inflation rate and the U.S. inflation rate. The
value of the t statistic for testing this hypothesis is:
A) 0.19.
B) 2.12.
C) 4.53.
D) 6.89.
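The t statistic in question 50 can be computed from the sample correlation alone, using $t = r\sqrt{n-2}/\sqrt{1-r^2}$ with $n - 2$ degrees of freedom. The sketch below assumes a positive correlation, consistent with the positive slope intervals offered in question 13.

```python
import math

n = 20
r_squared = 0.533
r = math.sqrt(r_squared)    # assumption: the correlation is positive

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)
print(round(t_stat, 2))
```

This is the same value as the t statistic for testing that the slope is zero, which is why the two tests always agree.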
A random sample of 79 companies from the Forbes 500 list (which actually consists of nearly
800 companies) was selected, and the relationship between sales (in hundreds of thousands of
dollars) and profits (in hundreds of thousands of dollars) was investigated by regression. The
following simple linear regression model was used:
$\text{Profits}_i = \beta_0 + \beta_1(\text{Sales})_i + \varepsilon_i$
where the deviations $\varepsilon_i$ were assumed to be independent and normally distributed, with mean 0
and standard deviation $\sigma$. This model was fit to the data using the method of least squares.
The following ANOVA table was obtained from statistical software.
51. The degrees of freedom for the residual SS, the error sum of squares, is:
A) 1.
B) 2.
C) 77.
D) 78.
53. The value of the ANOVA F statistic for testing the hypotheses $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
is:
A) 1.96.
B) 77.
C) 150.97.
D) 217,328.
54. Which of the following is TRUE about the width of confidence intervals for the mean response $\mu_y$?
A) The widths of the intervals do not depend on the value of x.
B) We have more information for estimating means that correspond to extreme values
of x, so the intervals get narrower as the value of x moves away from $\bar{x}$.
C) We have less information for estimating means that correspond to extreme values
of x, so the intervals get wider as the value of x moves away from $\bar{x}$.
D) None of the answers is correct.
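The standard-error formulas behind question 54 are written out below in standard textbook notation. The symbol $x^*$, for the value of $x$ at which the estimate or prediction is made, is notation assumed here and does not come from the test bank.

```latex
SE_{\hat{\mu}} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},
\qquad
SE_{\hat{y}} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}
```

The term $(x^* - \bar{x})^2$ grows as $x^*$ moves away from $\bar{x}$, so both intervals widen at extreme values of $x$. The extra 1 under the second square root is why a prediction interval is wider than the confidence interval for the mean response at the same confidence level.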
56. Which of the following statements about the regression standard error hold TRUE?
I. The regression standard error reflects the variation of the y-values about the
regression line.
II. The regression standard error is an estimate of the model standard deviation $\sigma$.
III. The larger the regression standard error is, the better the model fits the data and the
more precise inference about the regression model will be.
A) I
B) I and II
C) II and III
D) I, II, and III
57. Madeline fit a simple linear regression model and put the ANOVA table shown below in
her report. A week later, she realized that she needed the value of in the results
section of her report. Calculate the value of using the information contained in the
ANOVA table.
Max fit a simple linear regression model for his class project and saved the results in his report.
A week later, he realized that he needed to provide the ANOVA table in his report as well;
however, his computer crashed and he lost his original analysis. From the below information,
recreate the ANOVA table.
58. What are the degrees of freedom associated with the ANOVA F test?
A) 1 and 120
B) 1 and 118
C) 2 and 120
D) 2 and 118
Use the following to answer questions 62-78:
You are interested in starting a specialty coffee shop, but you need to establish how the volume
of production will affect your average total costs before you can complete your business plan.
You have a friend with connections in the industry, and she obtains the average total cost per cup
for a random sample of several establishments in your region. The final data set your friend
compiles contains information on 61 coffee shops including the average total cost of production
(in cents per cup) and the rate of output (in cups per hour).
64. Which of the following BEST describes the association between average total cost and
output rate?
A) strong, positive, linear association
B) strong, negative, linear association
C) weak, negative, linear association
D) weak, positive, linear association
E) no linear association
65. Summary statistics for the data set are given below.
66. Summary statistics for the data set are given below.
67. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model.
What does the residual plot reveal about the conditions for inference?
A) Nothing, it just looks like random scatter.
B) The spread of the residuals is not constant across output rate.
C) The spread of the residuals is constant across output rate.
D) There are many troubling outliers.
68. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model.
What does the Normal quantile plot reveal about the conditions for inference?
A) Nothing, a Normal quantile plot is not useful in the assessment of the necessary
conditions for inference.
B) The Normal quantile plot reveals that the distribution of the residuals is not normal.
C) Normal quantile plot reveals that the distribution of the residuals is (approximately)
normal.
D) Normal quantile plot reveals that the distribution of the response variable is
(approximately) normal.
69. Before we can proceed with regression inference, we need to verify the necessary
regression assumptions. Below are residual plots from the fitted least-squares regression
model. Is the random sampling condition met?
70. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1                                                < 0.0001
Residual            8492.2
Total         60
The degrees of freedom associated with the residual mean squares is:
A) 1.
B) 59.
C) 60.
D) 61.
71. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1                                                < 0.0001
Residual            8492.2
Total         60
72. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1                                                < 0.0001
Residual            8492.2
Total         60
73. Proceed as if the following assumptions are verified. A partial ANOVA table and
summary statistics for this analysis are given below:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1                                                < 0.0001
Residual            8492.2
Total         60
74. What can you conclude about the association between output rate and total cost based
on the ANOVA F test?
A) There is no evidence of an association between output rate and total cost.
B) There is strong evidence of a negative association between output rate and total
cost.
C) There is strong evidence of a positive association between output rate and total
cost.
D) There is strong evidence of an association between output rate and total cost.
75. The regression output for this simple linear regression model is given below:
Regression Statistics:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1    17003.4           17003.4        118.13     < 0.0001
Residual      59     8492.2             143.9
Total         60    25495.6
By how much do average total costs change if the output rate decreases by one cup per
hour? Compute a 90% confidence interval for this change.
A)
B)
C)
D) None of the answers is correct.
76. The regression output for this simple linear regression model is given below:
Regression Statistics:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1    17003.4           17003.4        118.13     < 0.0001
Residual      59     8492.2             143.9
Total         60    25495.6
Suppose that you plan to open a shop that will operate at a rate of 80 cups per hour.
What do you predict your total costs will be? (Round your answer to 2 decimal places.)
A) $62.19
B) $169.24
C) $300.10
D) Not enough information is given to determine the answer.
77. The regression output for this simple linear regression model is given below:
Regression Statistics:
ANOVA
Source DF Sum of Mean Square F Value Prob > F
Squares
Regression 1 17003.4 17003.4 118.13 < 0.0001
Residual 59 8492.2 143.9
Total 60 25495.6
Suppose that you wish to calculate a 95% prediction interval for your prediction of what
total costs will be in a shop that operates at a rate of 80 cups per hour. Calculate the
appropriate standard error you would use in the calculation.
A)
B)
C)
78. The regression output for this simple linear regression model is given below:
Regression Statistics:
ANOVA
Source        DF    Sum of Squares    Mean Square    F Value    Prob > F
Regression     1    17003.4           17003.4        118.13     < 0.0001
Residual      59     8492.2             143.9
Total         60    25495.6
Approximately what proportion of the variation in total cost does the regression model
explain?
A) 0.3331
B) 0.4994
C) 0.6613
D) 0.6669
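The entries of the ANOVA table above are linked by a few identities, which also give the proportion of variation explained in question 78. The sketch below recomputes them from the three sums of squares and the degrees of freedom; the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_reg, ss_resid, ss_total = 17003.4, 8492.2, 25495.6
df_reg, df_resid = 1, 59

r_squared = ss_reg / ss_total            # proportion of variation explained
ms_resid = ss_resid / df_resid           # residual mean square
f_stat = (ss_reg / df_reg) / ms_resid    # ANOVA F statistic
s = math.sqrt(ms_resid)                  # regression standard error

print(round(r_squared, 4), round(ms_resid, 1), round(f_stat, 2), round(s, 2))
```

The F value uses the unrounded residual mean square, which is why it reproduces the table's 118.13 rather than the slightly larger value obtained by dividing by the rounded 143.9.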
Answer Key
1. D
2. B
3. B
4. B
5. C
6. A
7. D
8. D
9. D
10. A
11. C
12. A
13. A
14. D
15. D
16. B
17. D
18. C
19. B
20. D
21. B
22. B
23. B
24. D
25. B
26. C
27. B
28. D
29. B
30. C
31. D
32. B
33. A
34. B
35. A
36. D
37. D
38. B
39. B
40. A
41. B
42. A
43. A
44. B
45. A
46. D
47. A
48. C
49. C
50. C
51. C
52. D
53. C
54. C
55. A
56. B
57. D
58. B
59. C
60. D
61. C
62. C
63. D
64. B
65. B
66. C
67. C
68. C
69. A
70. B
71. B
72. C
73. A
74. D
75. B
76. A
77. C
78. D