Kang Eugene Bivariate Data Olympics Project
FIGURE 1 - SCATTERPLOT
1. Describe the relationship between the variables in terms of direction, form, and
strength.
a. There is a negative, weak linear relationship between Winter Olympic year
and winning time in seconds.
2. Is a linear model appropriate for the relationship between these two variables?
Discuss both the scatterplot and the residual plot to answer this question.
a. A linear model is appropriate for the relationship between Winter
Olympic year and winning time in seconds. The scatterplot displays a
weak linear trend between the two variables. In addition, the residual
plot shows no apparent pattern, which indicates that a linear model is a
reasonable fit.
3. Write the equation for the least squares regression line. Define variables in
context!
a. The LSRL is ŷ = 1165 − 0.506𝑥, where ŷ is the predicted winning time in
seconds and 𝑥 is the Winter Olympic year.
6. Use the least-squares regression line to predict the winning time/distance for the
Winter 2026 Olympics.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(2026) → ŷ = 139.844
b. The LSRL predicts a men's giant slalom winning time of 139.844
seconds for the 2026 Winter Olympics.
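The substitution above can be checked with a short Python sketch. It uses only the rounded coefficients reported in question 3; the function name `predict_time` is a hypothetical helper introduced for illustration.

```python
# Rounded LSRL coefficients as reported in question 3.
INTERCEPT = 1165.0
SLOPE = -0.506


def predict_time(year: float) -> float:
    """Predicted winning time in seconds for a given Winter Olympic year."""
    return INTERCEPT + SLOPE * year


prediction_2026 = predict_time(2026)
print(round(prediction_2026, 3))
```

Because the coefficients are rounded, the result is only as precise as those reported values.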
7. Find the residual for the 1998 winning time. Please show your work and interpret.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(1998) → ŷ = 154.012 →
residual = 𝑦 − ŷ = 126.37 − 154.012 = −27.642
b. The observed 1998 winning time is 27.642 seconds lower than the time
predicted by the LSRL, so the model overpredicts for that year.
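A minimal sketch of the residual calculation, assuming the rounded coefficients from question 3 and the observed 1998 time of 126.37 seconds used in the work above. The names `predict_time` and `residual` are hypothetical helpers, not part of the original project.

```python
# Rounded LSRL coefficients as reported in question 3.
INTERCEPT = 1165.0
SLOPE = -0.506


def predict_time(year: float) -> float:
    """Predicted winning time in seconds for a given Winter Olympic year."""
    return INTERCEPT + SLOPE * year


def residual(observed: float, predicted: float) -> float:
    """Residual = observed value minus predicted value."""
    return observed - predicted


observed_1998 = 126.37  # observed winning time used in the work above
predicted_1998 = predict_time(1998)
residual_1998 = residual(observed_1998, predicted_1998)

# A negative residual means the observed time is below the predicted time.
print(round(predicted_1998, 3), round(residual_1998, 3))
```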
8. Find and interpret the r value in the context of the problem.
a. 𝑟 = −0.32
b. There is a weak, negative correlation between Winter Olympic Year and
the respective winning time.
9. Find and interpret the 𝑟² value in the context of the problem.
a. 𝑟² = 0.103
b. About 10.3% of the variability in winning time is accounted for by the
LSRL relating winning time to Winter Olympic year.
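As a quick consistency check between questions 8 and 9, the reported 𝑟 can be squared directly. This sketch assumes only the rounded value 𝑟 = −0.32 from question 8. Squaring it does not reproduce 0.103 exactly, which is consistent with 𝑟 having been rounded before it was reported.

```python
r = -0.32          # rounded correlation reported in question 8
r_squared = r ** 2

# Squaring the rounded r gives about 0.1024, close to the reported 0.103.
print(round(r_squared, 4))
```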
10. Find the means and standard deviations of the explanatory and response
variables.
a. 𝑥̅ = 1993 → Mean of Winter Olympic years
11. Confirm that the slope of the regression line is given by the formula 𝑏 = 𝑟 × (𝑆𝑦 / 𝑆𝑥)
12. Confirm that the LSRL goes through the (𝑥̅, 𝑦̅) point. You must show your work
to receive credit.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(1993) → ŷ = 156.54 ≈ 156.7
b. The LSRL coefficients are rounded, so the value predicted at 𝑥̅ = 1993
(156.54) does not exactly equal 𝑦̅ (about 156.7). The small difference is
due to rounding.
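The two facts checked in questions 11 and 12 hold for any least-squares line, and a small sketch can illustrate them. The data below are hypothetical toy values, not the Olympic data set from this project, and every variable name is chosen for illustration only.

```python
from statistics import mean, stdev

# Hypothetical toy data, NOT the Olympic data from this project.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Sum of cross-deviations and sum of squared x-deviations.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)

# Sample correlation coefficient r.
r = sxy / ((n - 1) * s_x * s_y)

# Least-squares slope and intercept.
b = sxy / sxx
a = y_bar - b * x_bar

# Question 11: the slope equals r * (S_y / S_x).
slope_from_r = r * s_y / s_x
print(round(b, 6), round(slope_from_r, 6))

# Question 12: the line passes through (x_bar, y_bar).
print(round(a + b * x_bar, 6), round(y_bar, 6))
```

With unrounded coefficients the two checks agree up to floating-point error. The small gap seen in the worked answer comes from using the rounded values 1165 and 0.506.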