
HALIFEAN RENTAP_2020836444

TPS552 PLANNING TECHNIQUES AND MODELLING/ AP221 A

SPSS Simple Linear Regression Tutorial

1. Create Scatterplot with Fit Line


The starting point for our analysis is the scatterplot, which will show us whether IQ and performance scores are related at all. We'll create our chart from Graphs_Legacy Dialogs_Scatter/Dot.

Result:
Scatterplot of performance by IQ (all respondents, N = 10)

There seems to be a moderate correlation between IQ and performance: on average, respondents with higher IQ scores seem to perform better. This relation looks roughly linear. Next, we add a regression line to our scatterplot. Right-clicking the chart and selecting Edit Content_In Separate Window opens up a Chart Editor window. Here we simply click the “Add Fit Line at
Total” icon as shown below.

By default, SPSS now adds a linear regression line to our scatterplot. The result is shown below.

1. R2 = 0.403 indicates that IQ accounts for some 40.3% of the variance in performance scores. That is, IQ predicts performance
fairly well in this sample.
2. We can best predict job performance from IQ through the regression equation y = b0 + b1 * x, where y is performance, shown on the y-axis, and x
is IQ, shown on the x-axis. So that'll be performance = 34.26 + 0.64 * IQ.
3. So for a job applicant with an IQ score of 115, we'll predict 34.26 + 0.64 * 115 = 107.86 as his/her most likely future performance
score.
4. This gives us a basic idea of the relationship between IQ and performance and presents it visually, although a lot of statistically important
information, such as confidence intervals, is still missing.
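The prediction step in point 3 amounts to plugging an IQ score into the fitted equation. A minimal Python sketch, using the coefficients SPSS reported for our fit line (the helper name `predict_performance` is ours, not an SPSS function):

```python
# Sketch of the fitted regression equation from the scatterplot:
# performance = 34.26 + 0.64 * IQ (coefficients as reported by SPSS).

def predict_performance(iq, b0=34.26, b1=0.64):
    """Return the predicted performance score: b0 + b1 * IQ."""
    return b0 + b1 * iq

# A job applicant with an IQ score of 115:
print(round(predict_performance(115), 2))  # 107.86
```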

SPSS Linear Regression Dialogs
Rerunning our minimal regression analysis from Analyze_Regression_Linear gives much more detailed output.

a) SPSS Regression Output I – Coefficients
Unfortunately, SPSS gives much more regression output than we need. We can safely ignore most of it. However, a table of major
importance is the coefficients table shown below.

1. This table shows the B coefficients we already saw in our scatterplot. As indicated, these imply the linear regression
equation that best estimates job performance from IQ in our sample.
2. Second, remember that we usually reject the null hypothesis if p < 0.05. The B coefficient for IQ has “Sig” or p = 0.049, so it's
statistically significantly different from zero.
3. However, its 95% confidence interval (roughly, a likely range for its population value) is [0.004, 1.281]. So B is probably not
zero, but it may well be very close to zero. This confidence interval is huge, meaning our estimate for B is not precise at all; this is
due to the minimal sample size on which the analysis is based.
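The interval in point 3 is built as B ± t·SE. As a rough back-of-the-envelope check (the standard error of 0.277 is an assumption, back-calculated from the reported interval, and 2.306 is the two-sided 5% critical t value for df = n − 2 = 8):

```python
# Reconstructing the 95% CI for the IQ coefficient as B +/- t_crit * SE.
# B = 0.64 is the rounded slope from the coefficients table; SE = 0.277 is
# an assumption, back-calculated from the reported interval [0.004, 1.281].
B = 0.64
se = 0.277
t_crit = 2.306  # two-sided 5% critical t value for df = 10 - 2 = 8

lo, hi = B - t_crit * se, B + t_crit * se
print(round(lo, 3), round(hi, 3))  # roughly [0.004, 1.281], up to rounding
```

The lower bound sits just above zero, which is exactly why p lands just under 0.05.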

b) SPSS Regression Output II - Model Summary
Apart from the coefficients table, we also need the Model Summary table for reporting our results.

1. R is the correlation between the regression predicted values and the actual values. For simple regression, R is equal to the
correlation between the predictor and dependent variable.
2. R Square, the squared correlation, indicates the proportion of variance in the dependent variable that's accounted for by
the predictor(s) in our sample data.
3. Adjusted R square estimates R-square when applying our sample-based regression equation to the entire population.
4. Adjusted R square therefore gives a more realistic estimate of predictive accuracy than R-square itself. In our example, the large
difference between them (generally referred to as shrinkage) is due to our very minimal sample size of only N = 10.
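The shrinkage in point 4 follows directly from the standard formula adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). A quick check with our numbers (R² = 0.403 from the Model Summary table, N = 10, one predictor):

```python
def adjusted_r_square(r2, n, k):
    """Adjusted R-square: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# R-square = 0.403, N = 10, k = 1 predictor (IQ):
print(round(adjusted_r_square(0.403, 10, 1), 3))  # 0.328
```

So predictive accuracy in the population is estimated closer to 33% than to 40% of variance explained.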

In any case, this is bad news for Company X: IQ doesn't really predict job performance so nicely after all.

c) Evaluating the Regression Assumptions
The main assumptions for regression are:
1. Independent observations;
2. Normality: errors must follow a normal distribution in population;
3. Linearity: the relation between each predictor and the dependent variable is linear;
4. Homoscedasticity: errors must have constant variance over all levels of predicted value.
• If each case (row of cells in data view in SPSS) represents a separate person, we usually assume that these are
“independent observations”. Next, assumptions 2-4 are best evaluated by inspecting the regression plots in our output.
• If normality holds, then our regression residuals should be roughly normally distributed. The histogram below doesn't
show a clear departure from normality.

The regression procedure can add these residuals as a new variable to your data. By doing so, you could run a Kolmogorov-Smirnov
test for normality on them. For the tiny sample at hand, however, this test will hardly have any statistical power.
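Computing residuals by hand is straightforward once the coefficients are known. The sketch below fits a simple regression and extracts the residuals; the IQ/performance numbers are made-up illustration data, not the tutorial's dataset, and the code also checks the textbook property that OLS residuals sum to (approximately) zero:

```python
# Sketch: obtain the residuals of a simple linear regression by hand.
# iq and perf are hypothetical illustration data, not the tutorial's data.
iq   = [90, 95, 100, 105, 110, 115, 120, 125, 130, 135]
perf = [88, 100, 96, 105, 99, 110, 115, 108, 120, 118]

n = len(iq)
mx, my = sum(iq) / n, sum(perf) / n
b1 = (sum((x - mx) * (y - my) for x, y in zip(iq, perf))
      / sum((x - mx) ** 2 for x in iq))       # least-squares slope
b0 = my - b1 * mx                             # least-squares intercept

residuals = [y - (b0 + b1 * x) for x, y in zip(iq, perf)]
print(abs(sum(residuals)) < 1e-8)  # OLS residuals sum to ~0
```

These are the same values SPSS would save as a new residual variable, ready for a normality test.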

The linearity (3) and homoscedasticity (4) assumptions are best evaluated from a residual plot. This is a scatterplot with predicted values
on the x-axis and residuals on the y-axis, as shown below. Both variables have been standardized, but this doesn't affect the shape of
the pattern of dots.

If the residual plot suggests that the model assumptions are violated, our options include:

1. Ignoring the assumptions altogether;
2. Pretending that the regression plots don't indicate any violations of the model assumptions;
3. Applying a non-linear transformation, such as a logarithmic one, to the dependent variable;
4. Fitting a curvilinear model, which we'll give a shot in a minute.

3. Non-Linear Regression Experiment
Our sample size is really too small to fit anything beyond a linear model, but we did so anyway, just out of curiosity. The easiest option
in SPSS is under Analyze_Regression_Curve Estimation.
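Among the models Curve Estimation offers is a quadratic, y = b0 + b1·x + b2·x². The same idea can be sketched in plain Python by solving the normal equations (X′X)β = X′y; the data below are made-up points lying exactly on a known quadratic, not the tutorial's data, so the fit should recover the true coefficients:

```python
# Sketch of quadratic curve fitting (one of the models offered by SPSS's
# Curve Estimation), via the normal equations (X'X) beta = X'y.

def quadratic_fit(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x^2; returns [b0, b1, b2]."""
    S = [sum(xi ** p for xi in x) for p in range(5)]   # sums of x^0 .. x^4
    A = [[S[0], S[1], S[2]],
         [S[1], S[2], S[3]],
         [S[2], S[3], S[4]]]                           # X'X
    b = [sum(yi * xi ** p for xi, yi in zip(x, y)) for p in range(3)]  # X'y
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        b[i], b[piv] = b[piv], b[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 3):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    beta = [0.0, 0.0, 0.0]
    for i in range(2, -1, -1):
        beta[i] = (b[i] - sum(A[i][c] * beta[c] for c in range(i + 1, 3))) / A[i][i]
    return beta

# Made-up data lying exactly on y = 2 + 3x + x^2, so the fit recovers it:
x = [0, 1, 2, 3, 4]
y = [2, 6, 12, 20, 30]
print([round(v, 6) for v in quadratic_fit(x, y)])  # [2.0, 3.0, 1.0]
```

With only N = 10 real observations, of course, such a model would be badly overfitted, which is the point the tutorial makes.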

Results:
