Assignment 5
(A) I only (B) II only (C) I and II only (D) I and III only (E) I, II and III
2. We measure the distance (in km) of a sample of commercial airline flights, as well as the price
(in Canadian $) for a ticket on each of the flights. The correlation between the two variables is
calculated to be r = 0.53. What would be the value of the correlation if we had instead measured
distance in miles (1 mile = 1.61 km) and ticket price in U.S. dollars ($1 U.S. = $1.31 Canadian)?
(A) 0.53 (B) 0.84 (C) 0.62 (D) 0.32 (E) 0.43
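Correlation is computed from standardized values, so multiplying either variable by a positive constant should leave r unchanged. The sketch below checks this numerically. The five flights are made-up illustrative values, not data from the assignment, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical flights: distance in km, ticket price in Canadian dollars
km = [300, 850, 1200, 2100, 3400]
cad = [180, 260, 240, 410, 520]

# Same flights in miles and U.S. dollars, using the question's conversion rates
miles = [d / 1.61 for d in km]
usd = [price / 1.31 for price in cad]

r_metric = pearson_r(km, cad)
r_converted = pearson_r(miles, usd)
print(r_metric, r_converted)
```

The two printed values agree up to floating-point rounding, whatever data are used.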
3. We have gathered data from a sample of individuals for some explanatory variable X and some
response variable Y. We plot the data on a scatterplot and we see that a linear relationship is a
reasonable assumption. We fit the least squares regression line to the data. This is the line that:
(A) minimizes the sum of the residuals.
(B) maximizes the value of the correlation.
(C) minimizes the sum of the deviations from the points to the line in the horizontal direction.
(D) minimizes the sum of the squared residuals.
(E) minimizes the sum of the squared deviations from the points to the line in the horizontal
direction.
6. One student lives 5 kilometres from the university and takes 8 minutes to get there. What is the
value of the residual for this student?
(A) −2.8 (B) 2.8 (C) −10.9 (D) 10.9 (E) 10.8
7. If we had instead measured distance in miles (1 mile = 1.61 km), which of the following values
would change?
(I) slope
(II) intercept
(III) correlation
(A) I only (B) II only (C) I and II only (D) I and III only (E) II and III only
8. To study the relation between the output, y (in volts) of a windmill and the wind velocity, x (in
km per hour), a researcher collects 20 pairs of observations. Her scatterplot suggests a linear
association between x and y. She calculates
𝑥̅ = 6.8, 𝑦̅ = 1.8, sx = 2.3, sy = 0.04, and r = 0.96
The most appropriate statement describing this data set is:
A) there is strong negative linear association between x and y.
B) there is weak negative linear association between x and y.
C) there is strong positive linear association between x and y.
D) there is weak positive linear association between x and y.
E) none of the above.
9. In a game of chance, your chance of winning a game is 0.2. If you play the game five times and
outcomes are independent, then the probability that you win at most once is (show your work):
A) 0.3277 B) 0.2 C) 0.4096 D) 0.7373 E) 0.5904
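With independent plays and a fixed win probability, the number of wins is binomial, so "at most once" means zero wins or exactly one win. A short sketch of that arithmetic follows, with `binom_pmf` as a helper name introduced here.

```python
from math import comb

n, p = 5, 0.2  # five independent plays, win probability 0.2 each

def binom_pmf(k, n, p):
    """P(exactly k wins in n independent plays)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# "At most once" = zero wins or exactly one win
p_at_most_one = binom_pmf(0, n, p) + binom_pmf(1, n, p)
print(round(p_at_most_one, 4))
```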
10. In a study of 82 young drivers (under the age of 32), 39 were men who were ticketed, 11 were
men who were not ticketed, 8 were women who were ticketed, and 24 were women who were not
ticketed. If one of these subjects is randomly selected, use the general addition rule to find the
probability of getting a man, or someone who was ticketed (show your work).
A) 50% B) 27% C) 16% D) 100% E) 71%
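The general addition rule is P(A or B) = P(A) + P(B) − P(A and B). The subtraction removes the men who were ticketed, who would otherwise be counted twice. A minimal sketch using the counts from the question (variable names are introduced here):

```python
# Counts from the question (82 drivers in total)
men_ticketed, men_not_ticketed = 39, 11
women_ticketed, women_not_ticketed = 8, 24
total = men_ticketed + men_not_ticketed + women_ticketed + women_not_ticketed

p_man = (men_ticketed + men_not_ticketed) / total
p_ticketed = (men_ticketed + women_ticketed) / total
p_man_and_ticketed = men_ticketed / total

# General addition rule: subtract the overlap that was counted twice
p_man_or_ticketed = p_man + p_ticketed - p_man_and_ticketed
print(f"{p_man_or_ticketed:.0%}")
```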
11. Which of the following pairs of variables would be the most likely to have a correlation close
to r = 0.5?
(A) Select a sample of commercial airline flights leaving from the airport one day: X = flight
distance in kilometres; Y = flight distance in miles
(B) Select a sample of grocery stores: X = price of orange juice; Y = amount of orange juice sold
(C) Select a sample of STAT 1000 students: X = number of incorrect answers on the midterm test;
Y = score on the test
(D) Select a sample of adults in Winnipeg: X = IQ; Y = weight
(E) Select a sample of male students at the University of Manitoba: X = height; Y = shoe size
12. A small graduate class of three students writes a math test. The student who finished writing
the fastest got the highest score in the class. The student who finished second got the second highest
score, and the student who took the longest to write the test got the lowest score. If X is the time
it takes for a student to write the test and Y is the student’s test score, then what can be said about
the correlation r between X and Y for this class?
(A) There is a perfect negative linear relationship between X and Y, and so r = −1.
(B) The correlation between X and Y is negative, but not necessarily equal to −1.
(C) There is no linear relationship between X and Y, and so r = 0.
(D) There is a perfect positive linear relationship between X and Y, and so r = 1.
(E) The correlation between X and Y is positive, but not necessarily equal to 1.
13. Two quantitative variables X and Y are measured on a sample of five individuals. Consider
the following (incomplete) table of values for this data set.
14. Determine whether the correlation for each of the following pairs of variables is most likely
positive or negative:
(I) X = Speed of wind in a snowstorm Y = Visibility
(II) X = Global supply of oil Y = Price of gasoline
(III) X = Number of people in line at a bank when you arrive
Y = Time until you are served by a teller
15. We record the heights X (in cm) and weights Y (in kg) of a sample of individuals. We calculate
the correlation between X and Y to be r = 0.56. Now suppose that we reversed the roles of X and
Y, i.e., define weight as X and height as Y. The correlation between X and Y would now be:
(A) 0.56 (B) 0.44 (C) 0.65 (D) −0.44 (E) −0.56
16. A national consumer magazine obtained data for several variables measured on a random
sample of cars. The magazine reported the following correlations:
• The correlation between car weight and car reliability is −0.30.
• The correlation between car weight and annual maintenance cost is 0.20.
Which of the following statements is/are true?
(I) Lighter weight cars tend to be more reliable.
(II) Heavier cars tend to cost more to maintain.
(III) Car weight is related more strongly to maintenance cost than to reliability.
(A) I only
(B) II only
(C) I and II only
(D) II and III only
(E) I, II and III
17. Can the number of calories in breakfast cereal be predicted by the sugar content? Researchers
gathered data for 10 breakfast cereals, including sugar content and calories per serving (both in
grams). The data are as follows:
Cereal 1 2 3 4 5 6 7 8 9 10
Sugar 4.3 7.1 3.8 5.7 8.5 4.2 9.7 3.5 4.9 6.3
Calories 99 109 97 106 107 104 112 102 103 102
The correlation between sugar content and calories for this sample is calculated to be 0.84. The
equation of the least squares regression line is:
(A) 𝑦̂= 93.54 + 1.82x
(B) 𝑦̂= 101.84 + 0.39x
(C) 𝑦̂= 95.23 + 1.53x
(D) 𝑦̂= 114.29 + 1.82x
(E) 𝑦̂ =104.10 + 0.39x
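When the raw data are available, the slope and intercept can be computed directly from the sums of squares: b = Sxy / Sxx and a = 𝑦̅ − b𝑥̅. The sketch below applies this to the ten cereals in the table. `least_squares_line` is a helper name introduced here, not part of the assignment.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Data from the table in the question
sugar = [4.3, 7.1, 3.8, 5.7, 8.5, 4.2, 9.7, 3.5, 4.9, 6.3]
calories = [99, 109, 97, 106, 107, 104, 112, 102, 103, 102]

intercept, slope = least_squares_line(sugar, calories)
print(f"y-hat = {intercept:.2f} + {slope:.2f}x")
```

Small differences in the last decimal place relative to the listed options can come from rounding at intermediate steps.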
18. The next two questions (16 and 17) refer to the following: We would like to determine whether
a man’s shoe size can be used to predict his height. The shoe sizes and heights (in inches) of a
random sample of eight men are shown below:
Shoe Size 11 10 9.5 12 11 11.5 10.5 10
Height (inches) 69 70 67 74 72 70 71 68
The correlation between shoe size and height is calculated to be r = 0.78, and the equation of the
least squares regression line is calculated to be 𝑦̂ = 50 + 2x.
19. What is the correct interpretation of the slope of the least squares regression line?
(A) When a man’s shoe size increases by one, his height increases by two inches.
(B) When a man’s height increases by two inches, we predict his shoe size to increase by one.
(C) When a man’s shoe size increases by two, we predict his height to increase by one inch.
(D) When a man’s height increases by one inch, we predict his shoe size to increase by two.
(E) When a man’s shoe size increases by one, we predict his height to increase by two inches.
21. An economist would like to determine whether the amount of a country’s exports (in billions
of dollars) can be predicted by the country’s population (in millions). He collects data for a random
sample of 86 countries, and the least squares regression line is calculated to be 𝑦̂ = 2 + 1.5x. One
country in the sample has 24 billion dollars of exports and a population of 26 million. What is the
value of the residual for this country?
(A) −17 (B) 17 (C) −12 (D) 12 (E) 15
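A residual is the observed response minus the value predicted by the line, residual = y − 𝑦̂. A small sketch of that calculation with the numbers from the question (variable names are introduced here):

```python
# Fitted line from the question: y-hat = 2 + 1.5x
intercept, slope = 2.0, 1.5

population = 26        # x, in millions
observed_exports = 24  # y, in billions of dollars

predicted_exports = intercept + slope * population
residual = observed_exports - predicted_exports  # observed minus predicted
print(residual)
```

A negative residual means the point lies below the regression line.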
22. A class of fourth year statistics students is studying for their final exam, which will be marked
out of 50. Their midterm results (also out of 50) have already been posted. Consider predicting
their final exam scores (y) based on their midterm (x). The data from last year's class are:
Midterm (x) 16 23 27 29 34 35 37 41 43 48
Final Exam (y) 11 25 28 31 25 30 34 40 37 42
From these, it can be shown that 𝑥̅ = 33.3, 𝑦̅ = 30.3, sx = 9.7188, sy = 8.9697 and r = 0.9191. The
least squares regression line is:
A) 𝑦̂ = 2.0516 + 0.8483x
B) 𝑦̂ = –5.7806 + 1.0835x
C) 𝑦̂ = –4.0756 + 1.0323x
D) 𝑦̂ = 7.9565 + 0.8483x
E) 𝑦̂ = 0.4700 + 1.0835x
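When only summary statistics are given, the line follows from b = r · sy / sx and a = 𝑦̅ − b𝑥̅. The sketch below plugs in the values quoted in the question (variable names are introduced here).

```python
# Summary statistics quoted in the question
mean_x, mean_y = 33.3, 30.3
s_x, s_y = 9.7188, 8.9697
r = 0.9191

slope = r * s_y / s_x                 # b = r * sy / sx
intercept = mean_y - slope * mean_x   # the line passes through (x-bar, y-bar)
print(f"y-hat = {intercept:.4f} + {slope:.4f}x")
```

Because the inputs are already rounded, the printed intercept may differ from the listed option in the third decimal place.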
23. When we use the least-squares regression criterion to fit a straight line to a set of data, we are
choosing the line that minimizes:
A) the sum of the squares of the horizontal distances between the points and the line.
B) the sum of the squares of the perpendicular distances between the points and the line.
C) the sum of the perpendicular distances between the points and the line.
D) the sum of the horizontal distances between the points and the line.
E) the sum of the squares of the vertical distances between the points and the line.
d) a), b) and c) are all true.
e) None of a), b) and c) are true.
25. A researcher wishes to study how the height of children during early adolescence (ages 12-14)
is affected by milk consumption. She plots the height of children (in inches) versus their milk
consumption (in cups/day), and decides to fit a least squares regression line to the data with x as
the explanatory variable and y as the response variable. She computes the following quantities:
• The correlation between height and milk consumption is 0.9.
• The mean milk consumption is 6.5 cups/day.
• The mean height is 60 inches.
• The standard deviation of milk consumption is 3.6 cups/day.
• The standard deviation of height is 1.2 inches.
The equation of the least-squares regression line is:
27) The equation of the least-squares regression line of stopping distance (y, in feet) on speed (x,
in km/hr) is:
𝑦̂ = –36.22 + 0.94x
Problems
29. A financial analyst provides you with the following ex-ante data regarding returns of BCE and
the Market index in the following year:
State       Probability   S&P 500 (X)   BCE (Y)
Boom        0.4           14%           30%
Normal      0.5           8%            18%
Recession   0.1           4%            10%
In the space provided, please compute the least squares regression line for the data above.
a) The expected return to BCE stock is 22% with a standard deviation of returns of 6.93%, and the
expected return to the S&P 500 is 10% with a standard deviation of 3.46%. Calculate the correlation
coefficient of BCE and S&P 500 returns.
b) Compute the least squares regression line of BCE returns vs. S&P 500 returns.
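With ex-ante data, each state is weighted by its probability rather than counted once, so means, variances and the covariance become probability-weighted sums. The sketch below sets up that calculation for the table above. It assumes population-style (probability-weighted) moments, and the variable names are introduced here.

```python
import math

# (probability, S&P 500 return X in %, BCE return Y in %) for each state
states = [
    (0.4, 14, 30),  # Boom
    (0.5, 8, 18),   # Normal
    (0.1, 4, 10),   # Recession
]

# Probability-weighted means
mean_x = sum(p * x for p, x, _ in states)
mean_y = sum(p * y for p, _, y in states)

# Probability-weighted variances and covariance
var_x = sum(p * (x - mean_x) ** 2 for p, x, _ in states)
var_y = sum(p * (y - mean_y) ** 2 for p, _, y in states)
cov_xy = sum(p * (x - mean_x) * (y - mean_y) for p, x, y in states)

r = cov_xy / math.sqrt(var_x * var_y)
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x
print(f"r = {r:.2f}; y-hat = {intercept:.2f} + {slope:.2f}x")
```

The weighted means and standard deviations computed this way can be compared against the 22%, 10%, 6.93% and 3.46% quoted in part a).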
30. The Leaning Tower of Pisa is leaning more over time. Eventually, it will fall. The following is
selected data on the tower’s lean over a thirteen-year period. Lean, measured in tenths of a
millimeter, is the distance between where a point at the top of the tower was when the observation
was taken and where it would have been if the tower were straight:
Year (X) 1 3 5 7 9 11 13
Lean (Y) 642 656 673 696 713 725 757