Ch4 Notes Transformations 2024
We have seen that sometimes a scatterplot/residual plot will show a non-linear relationship between
two variables (typically shown by a “curved” pattern).
In these cases, we must make an adjustment to the data before we can perform a linear regression.
This adjustment is known as a “transformation”. We will study three different types of transformation,
each of which can be applied to either the explanatory variable (x) or the response variable (y):
Logarithmic Transformations
    y against log10(x)
    log10(y) against x
Squared Transformations
    y against x^2
    y^2 against x
Reciprocal Transformations
    y against 1/x
    1/y against x
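For readers who want to check transformed values away from the CAS, the six options above can be sketched in Python. This is only an illustration: the function name `transform_options` and the sample values are assumptions and do not come from the notes.

```python
import math

def transform_options(x, y):
    """Return the six (horizontal, vertical) pairings listed above.

    x and y are equal-length lists of positive numbers, because logs
    and reciprocals are undefined at zero.
    """
    return {
        "y against log10(x)": ([math.log10(v) for v in x], list(y)),
        "log10(y) against x": (list(x), [math.log10(v) for v in y]),
        "y against x^2": ([v ** 2 for v in x], list(y)),
        "y^2 against x": (list(x), [v ** 2 for v in y]),
        "y against 1/x": ([1 / v for v in x], list(y)),
        "1/y against x": (list(x), [1 / v for v in y]),
    }

# Hypothetical sample values, chosen only to make the output easy to read.
options = transform_options([1, 10, 100], [2, 4, 5])
for name, (horizontal, vertical) in options.items():
    print(name, horizontal, vertical)
```

Only one variable is changed in each option; the other is passed through untouched.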
Using CAS to transform data
You can perform a transformation to a variable using the following steps:
1. Enter original data values into list1 (explanatory) and list2 (response).
2. Tap the cell at the bottom of list3, and use the keyboard to type your transformation:
    list1^2 or list2^2 for the squared transformations
    log(list1) or log(list2) for the logarithmic transformations
    1/list1 or 1/list2 for the reciprocal transformations
Once you have done this, you can perform a regression analysis using the transformed variable.
4. You can determine the equation of the regression line for this transformation using
“Calc > Regression > Linear Reg”.
    If you have transformed the y-variable, then set XList: list1 and YList: list3
    If you have transformed the x-variable, then set XList: list3 and YList: list2
5. Important note: when writing the regression line equation, you must include the
transformation. For example, if you have applied a log(y) transformation, then using the
values shown above your equation would be:
    log(y) = -1.31 + 0.99x
6. You may also be required to construct a scatterplot and/or residual plot to check that your
transformation has worked. (You should copy residuals to list4 when using Linear Reg).
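The CAS steps above can be mirrored in Python as a rough cross-check. This is a minimal sketch, not the CAS procedure itself: the helper `least_squares` and the data are hypothetical (y doubles each time x rises by 1), and `list1` to `list4` are named only to match the lists used in the steps.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line ys = a + b * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data (not from the notes).
list1 = [1, 2, 3, 4, 5]       # explanatory variable
list2 = [2, 4, 8, 16, 32]     # response variable

# Step 2: transform the response variable, as log(list2) does on the CAS.
list3 = [math.log10(y) for y in list2]

# Step 4: y was transformed, so regress list3 (YList) on list1 (XList).
intercept, slope = least_squares(list1, list3)

# Step 5: the equation must name the transformed variable.
print(f"log10(y) = {intercept:.2f} + {slope:.2f}x")

# Step 6: residual = actual transformed value - predicted transformed value.
list4 = [ly - (intercept + slope * x) for x, ly in zip(list1, list3)]
print(list4)
```

Because this made-up data is exactly exponential, the residuals are all essentially zero; real data would leave a random scatter of residuals if the transformation has worked.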
The x^2 transformation has the effect of “stretching out” the x-axis values.
The y^2 transformation has the effect of “stretching out” the y-axis values.
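The “stretching out” effect can be seen numerically with a short Python sketch. The values below are hypothetical and evenly spaced; they are not taken from the notes.

```python
# Hypothetical, evenly spaced values (not from the notes).
values = [1, 2, 3, 4, 5]
squared = [v ** 2 for v in values]

def gaps(seq):
    """Differences between neighbouring values."""
    return [b - a for a, b in zip(seq, seq[1:])]

print(gaps(values))   # every gap is 1
print(gaps(squared))  # gaps widen as the values get larger
```

Large values are pushed further apart than small ones, which is what straightens a curve that flattens out at the high end.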
27 2
32 5
100 7
197 10
280 12
b) It is decided that a y^2 transformation will be applied to linearise the data. Apply this
transformation using your calculator and fill in the third column of the table above.
c) Use your CAS to construct a new scatterplot of x versus y^2 and sketch this below. Does this
relationship appear to be linear?
d) Determine the least squares regression equation for your transformed data. Round the
values of the coefficients to two decimal places.
e) Use the equation to predict the value of y when x = 8, correct to 4 significant figures.
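For part e), remember that an equation fitted to y^2 predicts y^2, so the square root must be taken to get back to y. The sketch below shows that step in Python. The coefficients are hypothetical placeholders, not the answer to part d), and the helper names `round_sig` and `predict_y` are assumptions made for this example.

```python
import math

def round_sig(value, sig):
    """Round a non-zero value to `sig` significant figures."""
    digits = sig - 1 - math.floor(math.log10(abs(value)))
    return round(value, digits)

def predict_y(intercept, slope, x):
    """Predict y from an equation of the form y^2 = intercept + slope * x."""
    y_squared = intercept + slope * x
    # Assumes y is positive, so the positive square root is taken.
    return math.sqrt(y_squared)

# Hypothetical coefficients, NOT the answer to part d).
a, b = 1.5, 2.25
prediction = round_sig(predict_y(a, b, 8), 4)
print(prediction)
```

Rounding is left until the final step so that the intermediate value of y^2 is not rounded early.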
Level 1: Exercise 4A Q2, 3, 5-10
Level 2: Exercise 4A Q5, 6, 8-10, Exam Booklet 4A
Exercise 4B: The log transformation
The log(x) transformation has the effect of “compressing” the x-axis values.
The log(y) transformation has the effect of “compressing” the y-axis values.
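A short Python sketch makes the “compressing” effect concrete. The values are hypothetical and were chosen to span several powers of ten.

```python
import math

# Hypothetical values spread over several orders of magnitude.
values = [1, 10, 100, 1000]
logged = [math.log10(v) for v in values]

raw_gaps = [b - a for a, b in zip(values, values[1:])]
log_gaps = [b - a for a, b in zip(logged, logged[1:])]

print(raw_gaps)  # gaps grow: 9, 90, 900
print(log_gaps)  # gaps are all (approximately) 1
```

Large values are pulled in much more than small ones, which is why the log transformation suits data that climbs ever more steeply.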
b) Perform a log(x) transformation on the data. Complete the table above by filling in the
transformed values, along with the residual values for log(x) versus y.
c) Sketch the residual plot for log(x) versus y. Does this relationship appear to be linear?
Explain.
Example B2:
Level 2: Exercise 4B Q2, 4ac, 6, 8-10, Exam Booklet 4B
Exercise 4C: The reciprocal transformation
The 1/x transformation has the effect of “compressing” the x-axis values.
The 1/y transformation has the effect of “compressing” the y-axis values.
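The reciprocal also compresses large values, and for positive data it reverses their order as well. The Python sketch below illustrates both effects with hypothetical values that are not from the notes.

```python
# Hypothetical positive values (not from the notes).
values = [1, 2, 4, 8]
reciprocals = [1 / v for v in values]

print(reciprocals)  # [1.0, 0.5, 0.25, 0.125]

# The largest original value now has the smallest transformed value.
order_reversed = reciprocals == sorted(reciprocals, reverse=True)
print(order_reversed)
```

The reversal means an increasing trend in the original data appears as a decreasing trend after a reciprocal transformation.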
Example C1
Exercise 4D: Selecting the right transformation
So how do we know when to use which transformation?
The diagram below will help you choose. You can select a type of transformation depending on what
type of non-linear scatterplot you have.
*Note: sometimes you will be required to apply multiple transformations to see which one works best.
Other times, the question will tell you which one to use.
Example D2:
FINDING THE BEST TRANSFORMATION
Example extended response question:
In a circular petunia garden at the Melbourne Garden Show, the number of seeds planted in a row can
be predicted from the row number.
Row number     1   2   3   5    6    7    8    10
No. of seeds   3   5   8   20   31   50   79   200
1. Use CAS to plot this data on a scatterplot and sketch it in the space provided below. (Make sure
you label the axes.)
3. What types of transformations would be appropriate to use to linearise this data? (Answer in
terms of the variables row number and number of seeds).
r^2 =
r^2 =
r^2 =
5. Which transformation gives the best result? Explain your answer.
6. Calculate the least squares regression equation for the best transformation. Express your
answer in terms of the variables.
7. Use the equation to predict the number of seeds to be planted in the 9th row, to the nearest whole
number.
8. Predict the row number in which we would expect 350 seeds to be planted
(rounded to the nearest whole number).
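The comparison and prediction steps in this question can be cross-checked with a short Python sketch using the table data. Several things here are assumptions made for illustration: the helper `least_squares`, the choice of three candidate transformations (the worksheet does not name them), and the use of the log10(seeds) model for the two predictions. It is not a substitute for the CAS working the question asks for.

```python
import math

rows = [1, 2, 3, 5, 6, 7, 8, 10]
seeds = [3, 5, 8, 20, 31, 50, 79, 200]

def least_squares(xs, ys):
    """Return (intercept, slope, r_squared) for the line ys = a + b * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)

# Candidate transformations (an assumption made for this sketch).
candidates = {
    "seeds against (row number)^2": ([x ** 2 for x in rows], seeds),
    "log10(seeds) against row number": (rows, [math.log10(y) for y in seeds]),
    "1/seeds against row number": (rows, [1 / y for y in seeds]),
}

# Compare the coefficient of determination for each candidate.
r_squared = {name: least_squares(xs, ys)[2] for name, (xs, ys) in candidates.items()}
for name, value in r_squared.items():
    print(f"{name}: r^2 = {value:.4f}")
best = max(r_squared, key=r_squared.get)

# Fit the log10(seeds) model and undo the log to predict a seed count.
a, b, _ = least_squares(rows, [math.log10(y) for y in seeds])
seeds_row_9 = 10 ** (a + b * 9)

# Rearranged: row number = (log10(seeds) - a) / b.
row_for_350 = (math.log10(350) - a) / b
print(best, round(seeds_row_9), round(row_for_350))
```

Note that both predictions work with the transformed equation first and only then convert back to the original variable, either by raising 10 to the predicted value or by taking log10 of the given seed count.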