
Chapter 3 Section 2a

Vocabulary Review

Response Variable - measures an _________ of a study

Explanatory Variable - may help ________ or ______ changes in a response variable

Scatterplot - shows the relationship between two _________ variables measured on the
same individuals.

Vocabulary
Regression Line - a line that models how a response variable (y) changes as an
explanatory variable (x) changes.

● ŷ - “Y-hat” is used to identify a regression line, its formula, and any predicted
values from the formula

Regression Line’s Equation: ŷ = a + bx

● y-intercept - ‘a’ value, and the predicted value for y when x=0
● Slope - ‘b’ value, and the amount by which the predicted value of y changes
when x increases by 1 unit

Extrapolation - the use of a regression line for prediction outside the interval of
x-values used to obtain the line.

● The further we extrapolate, the less reliable the predictions become
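
As a rough illustration of prediction versus extrapolation, the sketch below plugs an x-value into ŷ = a + bx and flags whether that x lies outside the interval of x-values used to obtain the line. The function names are made up for this illustration; the coefficients and the 5.05 to 7.25 second range are taken from the sprint time and long jump example later in these notes.

```python
def predict(a, b, x):
    """Return the predicted response y-hat = a + b*x."""
    return a + b * x


def is_extrapolation(x, x_min, x_max):
    """True when x falls outside the interval of x-values used to fit the line."""
    return x < x_min or x > x_max


# Coefficients and x-range from the sprint time / long jump example in these notes.
a, b = 414.79, -45.74
x_min, x_max = 5.05, 7.25

for sprint_time in (6.0, 9.0):
    y_hat = predict(a, b, sprint_time)
    label = "extrapolation" if is_extrapolation(sprint_time, x_min, x_max) else "within the data"
    print(f"sprint time {sprint_time} sec: predicted jump {y_hat:.2f} in ({label})")
```

A sprint time of 9.0 seconds is well outside the data, so its prediction should be treated with much less confidence than the one for 6.0 seconds.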


Regression Line and Extrapolation

● Regression requires that we have a distinct explanatory and response variable (unlike
correlation)
● When writing a regression equation we use context rather than x and y, and a “hat” over
the y variable, or response variable.
● Extrapolations are found by plugging in a value for x and finding the predicted response
variable (be sure to have the “hat” with your written solution)
Vocabulary
Least Squares Regression Line - the line that makes the sum of the squared
residuals as small as possible (the line that best fits the data)
● To find the Least Squares Regression Line equation:
○ Input your values into your lists (be consistent, ex. L1 is always explanatory and L2
is always response)
○ Stat - Calc - LinReg(a+bx) where ‘b’ will be your slope and ‘a’ will be your y-intercept
■ Note: it does not make a difference which LinReg you use (LinReg(ax+b) or
LinReg(a+bx)) as long as you input the slope and y-intercept into your equation correctly
○ Input the equation that you now have into your “y=” while leaving your stat plot of
the scatterplot on to see the Regression Line with the scatterplot
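
For anyone who wants to check the calculator outside of class, here is a minimal sketch of the same least squares calculation in Python. It uses the standard formulas b = Sxy / Sxx and a = ȳ - b·x̄ rather than the calculator's own routine, and the function name is just an illustrative choice. The data is the sprint time and long jump table from the example in these notes.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x, the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b


# L1 (explanatory): sprint time in seconds; L2 (response): long jump distance in inches.
sprint_times = [5.41, 5.05, 7.01, 7.17, 6.73, 5.68, 5.78, 6.31, 6.44, 6.50, 6.80, 7.25]
jump_distances = [171, 184, 90, 65, 78, 130, 173, 143, 92, 139, 120, 110]

a, b = least_squares_line(sprint_times, jump_distances)
print(f"y-hat = {a:.2f} + ({b:.2f})x")
```

The printed slope and y-intercept should agree, up to rounding, with what LinReg(a+bx) reports for the same lists.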
What’s the Difference between Regression Line and Least Squares Regression Line?

● Regression Line Equations will either be given or estimated
○ When you need to estimate this, you are not given the regression line equation or all
of the data
○ You need to pick two points that the line would most likely go through and calculate
the equation from there, using the slope formula (m=(y2-y1)/(x2-x1)) and then plugging
in one of the points to calculate the y-intercept (the ‘a’ value in ŷ = a + bx)
● Least Squares Regression Line is more commonly asked for (aka, you are required to
find it)
○ It is the most precise regression line and can be found using your calculator
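
The two-point estimate can be sketched as below. The two points are hypothetical values chosen only for illustration (they are not from the data set), and the function name is made up for this sketch.

```python
def line_through(p1, p2):
    """Estimate y-hat = a + b*x from two points the line appears to pass through."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points need different x-values")
    b = (y2 - y1) / (x2 - x1)  # slope formula
    a = y1 - b * x1            # plug one point back in to get the y-intercept
    return a, b


# Hypothetical points read off a scatterplot: (sprint time in sec, jump distance in in).
a, b = line_through((5.0, 185), (7.0, 95))
print(f"estimated line: y-hat = {a} + ({b})x")
```

Because the result depends on which two points are picked, two people can reasonably get slightly different estimated lines; the Least Squares Regression Line removes that choice.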
Vocabulary
Residual - the difference between the actual value of y and predicted value of y
by the regression line

● Residual = actual y - predicted y

=y-ŷ
Example:
Using the same data we inputted yesterday, let’s find if Sprint Time (sec) could be an
explanatory variable for Long Jump Distances!

Sprint Time (sec)    Long Jump Distance (in)
5.41                 171
5.05                 184
7.01                 90
7.17                 65
6.73                 78
5.68                 130
5.78                 173
6.31                 143
6.44                 92
6.50                 139
6.80                 120
7.25                 110

a) What is the regression line that is created from the data provided?
b) When comparing the regression line with the scatterplot, does it seem like a good fit?
c) Find the residual for the first and last value (the smallest and largest sprint times)
when looking at the scatterplot.

a) ŷ = -45.74x + 414.79 or ŷ = 414.79 - 45.74x

b) Yes, the data looks like it fits the line well with points scattered on both sides of
the line.

c) Sprint time of 5.05 sec: ŷ = -45.74(5.05) + 414.79 = 183.803 and y = 184, therefore the
residual at the sprint time of 5.05 sec for Long Jump Distance is 184 - 183.803 = 0.197

Sprint time of 7.25 sec: ŷ = -45.74(7.25) + 414.79 = 83.175 and y = 110, therefore the
residual at the sprint time of 7.25 sec for Long Jump Distance is 110 - 83.175 = 26.825
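
The arithmetic in part c) can be double-checked with a short sketch like the one below. It uses the rounded equation from part a) as given, and the function name is an illustrative choice.

```python
def residual(actual_y, x, a=414.79, b=-45.74):
    """Residual = actual y - predicted y, with y-hat = a + b*x from part a)."""
    predicted_y = a + b * x
    return actual_y - predicted_y


# Smallest and largest sprint times in the table.
for sprint_time, jump in ((5.05, 184), (7.25, 110)):
    print(f"sprint time {sprint_time} sec: residual = {residual(jump, sprint_time):.3f}")
```

A positive residual means the actual jump was longer than the line predicted; a negative residual would mean it was shorter.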
What about an Interpretation?
Looking at the example that we just completed, slope and the y-intercept need to be
interpreted as well.

ŷ = -45.74x + 414.79 where x is Sprint Time and y is Long Jump Distance.


The predicted Long Jump Distance decreases by 45.74 inches for every one-second increase
in Sprint Time.

If a person was able to run a Sprint Time of 0 seconds then they would have a predicted Long
Jump Distance of 414.79 inches.

The predicted “y-variable” increases(+)/decreases(-) by “slope value” for every one-unit
increase of “x-variable”.

When “x-variable” is zero then the predicted “y-variable” would be “y-intercept”.
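
As a small sketch, the two sentence templates can be filled in automatically. The function names and the unit arguments are assumptions made for this illustration; the wording simply follows the templates in these notes.

```python
def interpret_slope(b, x_name, y_name, y_units):
    """Fill in the slope template; the sign of b decides increases/decreases."""
    direction = "increases" if b >= 0 else "decreases"
    return (f"The predicted {y_name} {direction} by {abs(b):g} {y_units} "
            f"for every one-unit increase of {x_name}.")


def interpret_intercept(a, x_name, y_name, y_units):
    """Fill in the y-intercept template."""
    return f"When {x_name} is zero then the predicted {y_name} would be {a:g} {y_units}."


# Values from the example: y-hat = -45.74x + 414.79.
print(interpret_slope(-45.74, "Sprint Time", "Long Jump Distance", "inches"))
print(interpret_intercept(414.79, "Sprint Time", "Long Jump Distance", "inches"))
```

The y-intercept sentence is still an extrapolation here, since a sprint time of 0 seconds is far outside the data.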


Vocabulary
Residual Plot - a scatterplot that displays the residuals on the vertical axis and
the explanatory variable on the horizontal axis
Interpreting a Residual Plot

● We interpret residual plots to identify if a linear model is the best fit for a data set
○ If there is a pattern (it looks similar to some other kind of graph: exponential,
logarithmic, cubic, etc.) then the linear model is NOT a good fit
○ If there is no pattern then the linear model is a good fit or we can say that it is
appropriate for the data
● Let’s try this with the data that we have and see if it would be a good match for our
linear model!
● https://youtu.be/n-mpifTiPV4 - if we have time
Creating a Residual Plot on Your Calculator

○ Input your values into your lists (be consistent, ex. L1 is always explanatory and L2
is always response)
○ Graph
○ 2nd - Stat Plot
■ Type: Scatterplot
■ XList: L1 (explanatory variable list)
■ YList: 2nd - Stat - RESID
○ Select: Graph - Zoom - ZoomStat

Now let’s try it with our data!
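
For comparison with the calculator screen, the sketch below builds the points of the residual plot (explanatory value on the horizontal axis, residual on the vertical axis) from the sprint time and long jump table. It uses the rounded equation ŷ = 414.79 - 45.74x from the example, and the function name is an illustrative choice. It prints the points instead of drawing them.

```python
def residual_plot_points(xs, ys, a, b):
    """Return (x, residual) pairs sorted by x, where residual = y - (a + b*x)."""
    points = [(x, y - (a + b * x)) for x, y in zip(xs, ys)]
    return sorted(points)


sprint_times = [5.41, 5.05, 7.01, 7.17, 6.73, 5.68, 5.78, 6.31, 6.44, 6.50, 6.80, 7.25]
jump_distances = [171, 184, 90, 65, 78, 130, 173, 143, 92, 139, 120, 110]

points = residual_plot_points(sprint_times, jump_distances, a=414.79, b=-45.74)
for x, r in points:
    # The sign column makes it easier to spot runs of + or - (a possible pattern).
    sign = "+" if r >= 0 else "-"
    print(f"x = {x:5.2f}   residual = {r:8.3f}   {sign}")
```

If the signs flip back and forth with no obvious curve as x increases, that supports using a linear model; a long run of one sign followed by a long run of the other would hint at a pattern.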


Why are Regression Lines and Residuals Important?

Regression Lines: They can create predictions for collected data

● Used in manufacturing, sales, and many other areas of business as well as other areas
like medicine and agriculture

Residuals: They illustrate different trends and the models that best fit the data

● This is often used to check the accuracy for different models of data
