Ch4 Notes Transformations 2024


GENERAL MATHEMATICS 2024 – BIVARIATE DATA

Chapter 4: Data Transformation

We have seen that sometimes a scatterplot/residual plot will show a non-linear relationship between
two variables (typically shown by a “curved” pattern).

In these cases, we must make an adjustment to the data before we can perform a linear regression.

This adjustment is known as a “transformation”. We will study three different types of transformation,
each of which can be applied to either the explanatory variable (x) or the response variable (y):

Logarithmic Transformations
    y against log10(x)
    log10(y) against x

Squared Transformations
    y against x^2
    y^2 against x

Reciprocal Transformations
    y against 1/x
    1/y against x

We want to change scatterplots that look like this…

to scatterplots that look like this….

And we want to change residual plots that look like this…

to residual plots that look like this…

Using CAS to transform data
You can perform a transformation to a variable using the following steps:
1. Enter original data values into list 1 (explanatory) and list 2 (response).
2. Tap the cell at the bottom of list3, and use the keyboard to type your transformation:
   • list1^2 or list2^2 for the squared transformations
   • log(list1) or log(list2) for the logarithmic transformations
   • 1/list1 or 1/list2 for the reciprocal transformations

3. Push “EXE” and the transformed values will appear in list3

Once you have done this, you can perform a regression analysis using the transformed variable.

4. You can determine the equation of the regression line for this transformation using
“Calc → Regression → Linear Reg”
If you have transformed the y-variable, then set XList: list1 and YList: list3
If you have transformed the x-variable, then set XList: list3 and YList: list2

For example (note that residuals are copied to list4):

5. Important note: when writing the regression line equation, you must include the
transformation. For example, if you have applied a log(y) transformation, then using the
values shown above your equation would be:
log(y) = -1.31 + 0.99x
6. You may also be required to construct a scatterplot and/or residual plot to check that your
transformation has worked. (You should copy residuals to list4 when using Linear Reg).
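
The CAS steps above can also be mirrored in a few lines of Python, which may help when checking calculator output. This is only a sketch, not the CAS procedure itself: the data values are made up for illustration, the helper name least_squares is hypothetical, and the variables list1 to list4 simply echo the CAS list names.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line ys = a + b*xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Steps 1-3: illustrative (made-up) data, then a log(y) transformation.
list1 = [0, 2, 4, 6]                      # explanatory variable x
list2 = [1, 10, 100, 1000]                # response variable y
list3 = [math.log10(y) for y in list2]    # transformed response, log(y)

# Step 4: y was transformed, so list1 is the XList and list3 is the YList.
a, b = least_squares(list1, list3)

# Residuals (actual minus predicted), which the CAS copies to list4.
list4 = [ly - (a + b * x) for x, ly in zip(list1, list3)]

# Step 5: the written equation must name the transformed variable.
print(f"log(y) = {a:.2f} + {b:.2f}x")
```

Because the made-up data follow an exact pattern, the residuals in list4 are all (essentially) zero; with real data they should be small and randomly scattered if the transformation has worked.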

Exercise 4A: The squared transformation

The x^2 transformation has the effect of “stretching out” the x-axis values.
The y^2 transformation has the effect of “stretching out” the y-axis values.
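
One way to see the “stretching” effect is to compare the gaps between neighbouring values before and after squaring. A small sketch, using made-up values chosen only for illustration:

```python
# Evenly spaced illustrative values (not taken from the exercise data).
values = [1, 2, 3, 4, 5]
squared = [v ** 2 for v in values]

# Gaps between neighbouring values before and after squaring.
gaps_before = [b - a for a, b in zip(values, values[1:])]
gaps_after = [b - a for a, b in zip(squared, squared[1:])]

print(gaps_before)  # [1, 1, 1, 1]
print(gaps_after)   # [3, 5, 7, 9]
```

The gaps grow as the values get larger, so squaring spreads out the large values more than the small ones.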

Example A1: Consider the data shown below.

    x      y      y^2
    3      0
    27     2
    32     5
    100    7
    197    10
    280    12

a) Use your CAS to construct a scatterplot of x versus y and draw a rough sketch of this below.
Is the relationship between x and y linear?


b) It is decided that a y^2 transformation will be applied to linearise the data. Apply this
transformation using your calculator and fill in the third column of the table above.

c) Use your CAS to construct a new scatterplot of x versus y^2 and sketch this below. Does this
relationship appear to be linear?

d) Determine the least squares regression equation for your transformed data. Round the
values of the coefficients to two decimal places.

e) Use the equation to predict the value of y when x = 8, correct to 4 significant figures.

Level 1: Exercise 4A Q2, 3, 5-10
Level 2: Exercise 4A Q5, 6, 8-10, Exam Booklet 4A

Exercise 4B: The log transformation
The log(x) transformation has the effect of “compressing” the x-axis values.
The log(y) transformation has the effect of “compressing” the y-axis values.

Example B1: For the data shown in the table below:

    x       y      log(x)    Residual for log(x) vs y
    1       0
    10      10
    100     20
    400     25
    600     28
    1000    30

a) Sketch the residual plot for x versus y. Does this relationship appear to be linear? Explain.

b) Perform a log(x) transformation on the data. Complete the table above by filling in the
transformed values, along with the residual values for log(x) versus y.

c) Sketch the residual plot for log(x) versus y. Does this relationship appear to be linear?
Explain.
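
Parts b) and c) can be cross-checked in Python. This is a minimal sketch, assuming that log means base 10 (as in the list of transformations at the start of the chapter) and that a residual is the actual value minus the predicted value; least_squares is a hypothetical helper name.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line ys = a + b*xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Data from the Example B1 table.
x = [1, 10, 100, 400, 600, 1000]
y = [0, 10, 20, 25, 28, 30]

# Part b): transform x, then fit y = a + b*log(x).
log_x = [math.log10(v) for v in x]
a, b = least_squares(log_x, y)

# Residual = actual y minus the y predicted from log(x).
residuals = [yi - (a + b * lx) for lx, yi in zip(log_x, y)]

for xi, lx, r in zip(x, log_x, residuals):
    print(f"x = {xi:>4}  log(x) = {lx:.2f}  residual = {r:+.2f}")
```

For part c), the thing to look for is whether the signs and sizes of the residuals show any pattern when plotted against log(x).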

Example B2:

Level 1: Exercise 4B Q2, 3, 5-10

Level 2: Exercise 4B Q2, 4ac, 6, 8-10, Exam Booklet 4B

Exercise 4C: The reciprocal transformation
The 1/x transformation has the effect of “compressing” the x-axis values.

The 1/y transformation has the effect of “compressing” the y-axis values.

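
The reciprocal behaves a little differently from the other transformations, which a quick calculation can show. A small sketch, using made-up values chosen only for illustration:

```python
# Illustrative values (not taken from the exercise data).
values = [1, 2, 4, 10, 100]
reciprocals = [1 / v for v in values]

print(reciprocals)  # [1.0, 0.5, 0.25, 0.1, 0.01]
```

Large values are squeezed towards zero, and the order is reversed: the largest original value becomes the smallest transformed value. This is why the slope of the regression line often changes sign after a reciprocal transformation.
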
Example C1

Level 1: Exercise 4C Q2, 4, 5-9


Level 2: Exercise 4C Q3c, 4, 6, 7-9, Exam Booklet 4C

Exercise 4D: Selecting the right transformation
So how do we know when to use which transformation?
The diagram below will help you choose. You can select a type of transformation depending on what
type of non-linear scatterplot you have.

*Note: sometimes you will be required to apply multiple transformations to see which one works best.
Other times, the question will tell you which one to use.

How do you know if the transformation has worked?


If a transformation has been successful in linearising data, then we should see:
 • A higher value for the coefficient of determination (r^2)
 • A randomly scattered residual plot
 • A scatterplot that does not show a “curve”
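
The first of these checks can be done numerically by computing r^2 before and after a transformation. The sketch below reuses the Example B1 data and compares y against x with y against log(x). The helper name r_squared is hypothetical, and r^2 is calculated here as the square of Pearson's correlation coefficient.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination for a straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Data from the Example B1 table.
x = [1, 10, 100, 400, 600, 1000]
y = [0, 10, 20, 25, 28, 30]

r2_original = r_squared(x, y)
r2_log_x = r_squared([math.log10(v) for v in x], y)

print(f"y against x:      r^2 = {r2_original:.3f}")
print(f"y against log(x): r^2 = {r2_log_x:.3f}")
```

A higher r^2 on its own is not enough; the residual plot and scatterplot should still be checked as well.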

Example D1: State three possible transformations that could be used to linearise the scatterplot
shown at right.

Example D2:

FINDING THE BEST TRANSFORMATION
Example extended response question:
In a circular petunia garden at the Melbourne Garden Show the number of seeds planted in a row can
be predicted from the row number.

Row number      1    2    3    5    6    7    8    10
No. of seeds    3    5    8    20   31   50   79   200

1. Use CAS to plot this data on a scatterplot and sketch it in the space provided below. (make sure
you label the axes)

2. Is your data linear?

3. What types of transformations would be appropriate to use to linearise this data? (Answer in
terms of the variables row number and number of seeds).

4. Apply these transformations and for each:


a. sketch the scatterplot with labelled axes
b. sketch the residual plot
c. calculate r^2

r^2 =

r^2 =

r^2 =

5. Which transformation gives the best result? Explain your answer.

6. Calculate the least squares regression equation for the best transformation. Express your
answer in terms of the variables.

7. Use the equation to predict the number of seeds to be planted in the 9th row, to the nearest whole
number.

8. Predict the row number in which we would expect 350 seeds to be planted.
(round to the nearest whole number).
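
Questions 7 and 8 both involve undoing the transformation after using the regression equation. As a sketch, suppose the log(number of seeds) transformation turned out to be the best one (an assumption made here only for illustration; the comparison in question 5 decides this), with fitted intercept a and slope b. The symbols a and b are placeholders, not calculated values.

```latex
\begin{align*}
\log_{10}(\text{seeds}) &= a + b \times \text{row number} \\
\text{seeds} &= 10^{\,a + b \times \text{row number}}
  && \text{(question 7: substitute row number} = 9\text{)} \\
\text{row number} &= \frac{\log_{10}(\text{seeds}) - a}{b}
  && \text{(question 8: substitute seeds} = 350\text{)}
\end{align*}
```

The same idea applies to the other transformations: solve the fitted equation for the original variable first, and round only at the end.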

Level 1: Exercise 4D Q1, 2, 3


Level 2: Exercise 4D Q1, 3, Exam Booklet 4D
