Simple Linear Regression 2023
Hans Burgerhof
Epidemiology
j.g.m.burgerhof@umcg.nl
Semester 1.2
• Three lectures
– Simple linear regression
– Multiple linear regression
– Building regression models and using SPSS
• Workshops/practicals
• Individual assignment
• Exam (multiple choice questions)
Dataset n = 277 patients with diabetes
Today: simple linear regression
What is it?
Why do we need it?
Pearson correlation is a
measure of the strength of
a linear relationship
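As a small illustration of the statement above, the sketch below computes the Pearson correlation by hand. The data are hypothetical (not the diabetes dataset) and `pearson_r` is a helper name chosen for this example. The second call shows that a perfect but non-linear relationship can still give a correlation of zero, because only the linear part is measured.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data: roughly linear relationship
height = [160, 165, 170, 175, 180, 185]
weight = [58, 63, 66, 74, 79, 83]
r_linear = pearson_r(height, weight)

# Perfect quadratic relationship, symmetric around zero: no linear component
x_sym = [-2, -1, 0, 1, 2]
r_quadratic = pearson_r(x_sym, [v ** 2 for v in x_sym])

print(round(r_linear, 3), r_quadratic)
```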
Y = a·X + b
a is the slope
b is the intercept
Residual = distance
from an observation to
the regression line (in
vertical direction)
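A minimal sketch of how the line and the residuals can be computed, using ordinary least squares and the notation Y = a·X + b. The data and the helper name `fit_line` are assumptions made for this example only.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for the line Y = a*X + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b

# Hypothetical illustration data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

a, b = fit_line(x, y)

# Residual = observed Y minus the value on the line (vertical distance, with sign)
residuals = [yi - (a * xi + b) for xi, yi in zip(x, y)]

print(a, b)
print(residuals)
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding).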
Explanatory variable(s)
or Predictor(s)
SPSS output of (simple) linear regression
Coefficients
Significant!
Intercept -59.58
Variance =
$F = \dfrac{MS_{regr}}{MS_{res}}$
$Y - \bar{Y} = (Y - \hat{Y}) + (\hat{Y} - \bar{Y})$
$Y - \hat{Y}$: unexplained (residual)
$\hat{Y} - \bar{Y}$: explained (regression)
$F = t^2$: always equal in simple linear regression
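The decomposition into an unexplained and an explained part, and the identity F = t², can be checked numerically. The sketch below uses hypothetical data; the variable names (`ss_res`, `ss_reg`, `t_slope`, and so on) are chosen for this example and are not SPSS output labels.

```python
import math

# Hypothetical illustration data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
a = sxy / sxx               # slope
b = mean_y - a * mean_x     # intercept
fitted = [a * xi + b for xi in x]

# Sums of squares: total = unexplained (residual) + explained (regression)
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_reg = sum((fi - mean_y) ** 2 for fi in fitted)

# Mean squares: regression has 1 df, residual has n - 2 df
ms_reg = ss_reg / 1
ms_res = ss_res / (n - 2)
F = ms_reg / ms_res

# t statistic for the slope
se_slope = math.sqrt(ms_res / sxx)
t_slope = a / se_slope

print(ss_total, ss_res + ss_reg)
print(F, t_slope ** 2)
```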
Assumptions for linear regression
To check the
homoscedasticity
To check normality
of the residuals
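A common way to check these assumptions, assumed here since the slides only show the resulting plots, is to plot standardized residuals against predicted values (homoscedasticity) and to look at a histogram or normal probability plot of the residuals (normality). The sketch below only computes the two quantities that go into such plots, on hypothetical data; the plotting itself is left out.

```python
import math

# Hypothetical illustration data
x = [1, 2, 3, 4, 5, 6]
y = [2.3, 3.8, 6.1, 8.2, 9.7, 12.1]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
a = sxy / sxx
b = mean_y - a * mean_x

predicted = [a * xi + b for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Residual standard deviation, with n - 2 degrees of freedom
s_res = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Standardized residuals: residual divided by the residual standard deviation
z_resid = [e / s_res for e in residuals]

# Pairs to put in a residual-versus-predicted scatterplot
for p, z in zip(predicted, z_resid):
    print(f"{p:6.2f} {z:6.2f}")
```

A roughly constant vertical spread across the predicted values is what one hopes to see for homoscedasticity.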
SPSS output for checking the assumptions
OK
Simple linear regression with a non-continuous predictor
[Plot: weight for males and females]
Linear regression
T-test
Weight = 87.43 - 8.71·sex
Males (0): 87.43
Females (1): 87.43 - 8.71 = 78.72
Conclusion
Performing a linear regression with a continuous Y and
a binary X is a valid analysis and is equivalent to
performing a t-test for independent groups (with equal
variances).
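The equivalence can be verified on a small example. The sketch below uses hypothetical weights (not the diabetes dataset) and codes males as 0 and females as 1. The intercept equals the mean of the group coded 0, the slope equals the difference in group means, and the t statistic of the slope equals the pooled-variance t statistic.

```python
import math

def mean(values):
    return sum(values) / len(values)

# Hypothetical illustration data
males = [82.0, 90.5, 88.0, 85.5, 91.0]    # sex = 0
females = [75.0, 80.5, 79.0, 77.5]        # sex = 1

sex = [0] * len(males) + [1] * len(females)
weight = males + females
n = len(weight)

# Linear regression: weight = b + a * sex
mean_x = mean(sex)
mean_y = mean(weight)
sxx = sum((xi - mean_x) ** 2 for xi in sex)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(sex, weight))
a = sxy / sxx
b = mean_y - a * mean_x
ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(sex, weight))
t_regression = a / math.sqrt((ss_res / (n - 2)) / sxx)

# Independent-samples t-test with pooled (equal) variances
ss_within = (sum((v - mean(males)) ** 2 for v in males)
             + sum((v - mean(females)) ** 2 for v in females))
pooled_var = ss_within / (n - 2)
t_test = (mean(females) - mean(males)) / math.sqrt(
    pooled_var * (1 / len(males) + 1 / len(females)))

print(b, mean(males))
print(a, mean(females) - mean(males))
print(t_regression, t_test)
```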
Oneway ANOVA:
SPSS results Oneway ANOVA
H0 is rejected.
Conclusion?
Linear regression?
Positive effect of
smoking on weight?
[Plot axis: stopped smoking, never, current]
Negative effect of
smoking?
Or another coding ...
[Plot axis: the same categories (stopped smoking, never, current) in a different order]
No effect of smoking?
Conclusion: we cannot use a categorical (nominal) variable with more than two categories as a predictor in linear regression ... unless we use dummy variables.
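Why an arbitrary numeric coding fails can be seen in a small sketch. The weights and the two codings below are hypothetical. The same data give a negative slope under one numbering of the categories and a positive slope under another, so the sign of the "effect" depends only on the coding.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical illustration data: two persons per smoking category
smoking = ["never", "never", "stopped", "stopped", "current", "current"]
weight = [79, 81, 84, 86, 77, 79]

# Two arbitrary numeric codings of the same nominal variable
coding_a = {"stopped": 1, "never": 2, "current": 3}
coding_b = {"current": 1, "never": 2, "stopped": 3}

slope_a = slope([coding_a[s] for s in smoking], weight)
slope_b = slope([coding_b[s] for s in smoking], weight)

print(slope_a, slope_b)
```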
Dummy coding
A dummy variable is a “helping variable”.
There are several ways to use dummy variables.
Most commonly used: the reference group coding.
Choose one of the groups as reference and make dummy variables for the
other groups to compare those groups to the reference group.
Stopped smoking 1 0
Current smoker 0 1
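Reference group coding can be sketched as follows. The choice of "never" as the reference group, the category labels, and the helper name `reference_dummies` are assumptions made for this example. Each non-reference group gets its own 0/1 variable, and the reference group scores 0 on all of them.

```python
def reference_dummies(values, levels, reference):
    """Return one 0/1 list per non-reference level (reference group coding)."""
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }

# Hypothetical illustration data
smoking = ["never", "stopped", "current", "never", "current"]

dummies = reference_dummies(
    smoking,
    levels=["never", "stopped", "current"],
    reference="never",  # assumed reference group for this sketch
)

# One row per person: (stopped dummy, current dummy)
rows = list(zip(dummies["stopped"], dummies["current"]))
for status, row in zip(smoking, rows):
    print(status, row)
```

In a regression with both dummy variables as predictors, each coefficient compares its group with the reference group.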