Kang Eugene Bivariate Data Olympics Project

Eugene Kang - Bivariate Data Olympics Project

EVENT - Alpine Skiing, Men’s Giant Slalom


AP Statistics 1st Block, Spring 2024

SECTION I: RAW & CONVERTED DATA

ALPINE SKIING - MEN’S GIANT SLALOM

Year   Winning Time (Original)   Year   Winning Time (Original)
1964   01:46.710                 1994   02:52.460
1968   03:29.280                 1998   02:38.510
1972   03:09.620                 2002   02:23.280
1976   03:26.970                 2006   02:35.000
1980   02:40.740                 2010   02:37.830
1984   02:41.180                 2014   02:45.290
1988   02:06.370                 2018   02:18.040
1992   02:06.980                 2022   02:09.350

ALPINE SKIING - MEN’S GIANT SLALOM (Converted - Seconds)

Year   Winning Time (Seconds)   Year   Winning Time (Seconds)
1964   106.710                  1994   172.460
1968   209.280                  1998   158.510
1972   189.620                  2002   143.280
1976   206.970                  2006   155.000
1980   160.740                  2010   157.830
1984   161.180                  2014   165.290
1988   126.370                  2018   138.040
1992   126.980                  2022   129.350
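
The conversion from the original MM:SS.sss format to seconds can be sketched as follows. This is a minimal illustration and not necessarily how the table above was produced; the helper name to_seconds is hypothetical.

```python
def to_seconds(stamp):
    """Convert a 'MM:SS.sss' time string to seconds (hypothetical helper)."""
    minutes, seconds = stamp.split(":")
    return round(int(minutes) * 60 + float(seconds), 3)

# Winning times copied from the original-format table above.
original = {
    1964: "01:46.710", 1968: "03:29.280", 1972: "03:09.620", 1976: "03:26.970",
    1980: "02:40.740", 1984: "02:41.180", 1988: "02:06.370", 1992: "02:06.980",
    1994: "02:52.460", 1998: "02:38.510", 2002: "02:23.280", 2006: "02:35.000",
    2010: "02:37.830", 2014: "02:45.290", 2018: "02:18.040", 2022: "02:09.350",
}

converted = {year: to_seconds(stamp) for year, stamp in original.items()}

for year, secs in converted.items():
    print(year, f"{secs:.3f}")
```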


SECTION II: GRAPHS

FIGURE 1 - SCATTERPLOT

FIGURE 2 - RESIDUAL PLOT


SECTION III: ANALYSIS

1. Describe the relationship between the variables in terms of direction, form, and
strength.
a. There is a negative, weak linear relationship between Winter Olympic year
and winning time in seconds.

2. Is a linear model appropriate for the relationship between these two variables?
Discuss both the scatterplot and the residual plot to answer this question.
a. A linear model is appropriate for the relationship between Winter
Olympic year and winning time in seconds. The scatterplot displays a
weak linear trend, and the residual plot shows no clear pattern, which
supports the use of a linear model.

3. Write the equation for the least squares regression line. Define variables in
context!
a. The LSRL is ŷ = 1165 − 0.506𝑥, where ŷ is the predicted winning time in
seconds and 𝑥 is the Winter Olympic year.
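
As a check on the reported coefficients, the least-squares slope and intercept can be recomputed from the converted data in Section I. The sketch below uses the standard formulas (slope = Sxy / Sxx, intercept = ȳ − slope × x̄) in plain Python; it is an illustration and may differ from the calculator or software actually used for the project. The variable names are assumptions.

```python
# Data from the converted (seconds) table in Section I.
years = [1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992,
         1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022]
times = [106.710, 209.280, 189.620, 206.970, 160.740, 161.180, 126.370, 126.980,
         172.460, 158.510, 143.280, 155.000, 157.830, 165.290, 138.040, 129.350]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(times) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))

slope = sxy / sxx                   # change in predicted seconds per year
intercept = y_bar - slope * x_bar   # predicted seconds at x = 0

print(f"y-hat = {intercept:.0f} + ({slope:.3f})x")
```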

4. Write a sentence interpreting the slope of the regression line in context.


a. For each additional year, the LSRL predicts a decrease of 0.506 seconds
in the winning time, or about 2.02 seconds per four-year Olympic cycle.

5. Write a sentence interpreting the y-intercept of the regression line in context.


a. The y-intercept is 1165, at the point (0, 1165). The LSRL predicts that in
year 0 the winning time for Men’s Giant Slalom would be 1165 seconds. This
prediction is not meaningful: it is an extreme extrapolation, since the
event was not held in year 0, and a winning time of 1165 seconds is
unrealistic.

6. Use the least-squares regression line to predict the winning time/distance for the
Winter 2026 Olympics.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(2026) → ŷ = 139.844
b. The LSRL predicts the Giant Slalom Men’s winning time to be 139.844
seconds.

7. Find the residual for the 1998 winning time. Please show your work and interpret.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(1998) → ŷ = 154.012 →
residual = 𝑦 − ŷ = 158.510 − 154.012 = 4.498
b. The observed 1998 winning time is 4.498 seconds higher than the winning
time predicted by the LSRL.
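
The prediction and residual arithmetic can be reproduced with a short sketch that uses the rounded LSRL, ŷ = 1165 − 0.506𝑥. The function names predict and residual are hypothetical, and the observed 1998 value (158.510 seconds) is taken from the converted table in Section I. A residual is always observed minus predicted.

```python
# Rounded LSRL coefficients as reported in the analysis.
INTERCEPT = 1165
SLOPE = -0.506

def predict(year):
    """Predicted winning time in seconds for a given Olympic year."""
    return INTERCEPT + SLOPE * year

def residual(year, observed):
    """Residual = observed value minus predicted value."""
    return observed - predict(year)

pred_2026 = predict(2026)
res_1998 = residual(1998, 158.510)  # 158.510 s is the 1998 entry in the converted table

print(f"{pred_2026:.3f}")  # 139.844
print(f"{res_1998:.3f}")   # 4.498
```

A positive residual means the observed time was slower (larger) than the line predicts; a negative residual means it was faster.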
8. Find and interpret the r value in the context of the problem.
a. 𝑟 = −0.32
b. There is a weak, negative correlation between Winter Olympic Year and
the respective winning time.

9. Find and interpret the 𝑟² value in the context of the problem.
a. 𝑟² = 0.103
b. The LSRL accounts for 10.3% of the variability in winning time.
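
To illustrate where these two values come from, the sketch below computes 𝑟 from the converted data and then computes 𝑟² a second way, as 1 − SSE/SST (the fraction of variation in winning time that the line accounts for). This is a plain-Python check with assumed variable names, not the method used in the original project.

```python
import math

# Data from the converted (seconds) table in Section I.
years = [1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992,
         1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022]
times = [106.710, 209.280, 189.620, 206.970, 160.740, 161.180, 126.370, 126.980,
         172.460, 158.510, 143.280, 155.000, 157.830, 165.290, 138.040, 129.350]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(times) / n

sxx = sum((x - x_bar) ** 2 for x in years)
syy = sum((y - y_bar) ** 2 for y in times)   # total variation in y (SST)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))

# Correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Unrounded least-squares line, used to get the residual sum of squares (SSE).
slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, times))

r_squared = 1 - sse / syy

print(round(r, 2), round(r_squared, 3))
```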

10. Find the means and standard deviations of the explanatory and response
variables.
a.
𝑥̅ = 1993 → Mean of Winter Olympic years

𝑆𝑥 = 18.2 → Standard deviation of Winter Olympic years

𝑦̅ = 156.7 → Mean of observed winning times

𝑆𝑦 = 28.7 → Standard deviation of observed winning times

11. Confirm that the slope of the regression line is given by the formula 𝑏 = 𝑟 × (𝑆𝑦 / 𝑆𝑥).
You must show your work to receive credit.

a. 𝑏 = 𝑟 × (𝑆𝑦 / 𝑆𝑥) → 𝑏 = −0.32 × (28.7 / 18.2) → 𝑏 = −0.505 ≅ −0.506
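
The small gap between −0.505 and −0.506 comes from using rounded values of 𝑟, 𝑆𝑦, and 𝑆𝑥. A minimal sketch with unrounded values, assuming sample standard deviations (dividing by n − 1) from Python's statistics module, shows the two slope calculations agreeing:

```python
import statistics

# Data from the converted (seconds) table in Section I.
years = [1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992,
         1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022]
times = [106.710, 209.280, 189.620, 206.970, 160.740, 161.180, 126.370, 126.980,
         172.460, 158.510, 143.280, 155.000, 157.830, 165.290, 138.040, 129.350]

n = len(years)
x_bar = statistics.mean(years)
y_bar = statistics.mean(times)
s_x = statistics.stdev(years)   # sample standard deviation of x
s_y = statistics.stdev(times)   # sample standard deviation of y

sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))

r = sxy / ((n - 1) * s_x * s_y)   # correlation from the sample covariance
slope_formula = r * s_y / s_x     # b = r * (Sy / Sx)
slope_direct = sxy / sxx          # least-squares slope computed directly

print(round(s_x, 1), round(s_y, 1))
print(round(slope_formula, 3), round(slope_direct, 3))
```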

12. Confirm that the LSRL goes through the (𝑥̅, 𝑦̅) point. You must show your work
to receive credit.
a. ŷ = 1165 − 0.506𝑥 → ŷ = 1165 − 0.506(1993) → ŷ = 156.54 ≅ 156.7
b. Due to rounding errors in the LSRL model, the calculations are not exact.
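
The rounding explanation can be checked numerically. In the sketch below (variable names are assumptions), the unrounded least-squares line evaluated at 𝑥̅ returns 𝑦̅ up to floating-point error, while the rounded line ŷ = 1165 − 0.506𝑥 gives about 156.54.

```python
# Data from the converted (seconds) table in Section I.
years = [1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992,
         1994, 1998, 2002, 2006, 2010, 2014, 2018, 2022]
times = [106.710, 209.280, 189.620, 206.970, 160.740, 161.180, 126.370, 126.980,
         172.460, 158.510, 143.280, 155.000, 157.830, 165.290, 138.040, 129.350]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(times) / n

sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

at_mean_exact = intercept + slope * x_bar   # unrounded line at x-bar
at_mean_rounded = 1165 - 0.506 * x_bar      # rounded line from the analysis

print(f"{y_bar:.4f} {at_mean_exact:.4f} {at_mean_rounded:.2f}")
```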
