Lecture 9
Simple Linear Regression
• Simple linear regression involves one variable designated the
“dependent” or “outcome” variable, referred to as “Y”, and one
explanatory variable, referred to as “X”.
Y = α + βX
• It is used to predict the value of Y when X is given.
Simple Linear Regression
• Assumptions:
1) The observations for the outcome Y are assumed to be
independent from each other, continuous and normally
distributed.
Y = α + βX
Basic Biostat Lect 9 Dr. Jaffa 5
• Interpretation of β:
For each one-unit increase in X, Y changes by β units on
average: an increase when β is positive, a decrease when β is
negative.
Simple Linear Regression
• We need a formal test of whether the slope β equals 0. This
test is the “t test”, and the hypotheses to be tested are
Ho: β = 0 vs H1: β ≠ 0
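The test statistic behind this t test can be written out as follows. This is the standard textbook form, given here as a sketch: the symbols \hat{\beta} (the least-squares estimate of the slope), SE (its standard error) and n (the number of observations) are notation assumed for illustration and do not appear on the slide.

```latex
t = \frac{\hat{\beta}}{\operatorname{SE}(\hat{\beta})},
\qquad
t \sim t_{n-2} \quad \text{under } H_0 : \beta = 0
```

A large absolute value of t relative to the t distribution with n - 2 degrees of freedom gives a small P-value and leads to rejecting Ho.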
Simple Linear Regression
Example: OBGYN (continued)
• To quantify the relationship between estriol levels and
babies’ birthweights, use linear regression, since the outcome
(birthweight) is normally distributed.
Simple Linear Regression
Example: OBGYN
• Simple linear regression is used in the estriol and babies’
birthweights example.
Multiple Linear Regression
i    Birthweight in oz (X1)   Age in days (X2)   Systolic Blood Pressure (mm Hg) (Y)
1    135                      3                  89
2    120                      4                  90
3    100                      3                  83
4    105                      2                  77
5    130                      4                  92
6    125                      5                  98
7    125                      2                  82
8    105                      3                  85
9    120                      5                  96
10   90                       4                  95
11   120                      2                  80
12   95                       3                  79
13   120                      3                  86
14   150                      4                  97
15   160                      3                  92
16   125                      3                  88
Multiple Linear Regression
• SPSS output for the fitted model with SBP (Y) as the
dependent variable and babies’ birthweight (X1) and age (X2) as
the explanatory (or predictive) variables.
• Estimated α = 53.45,
• Estimated β1 = 0.126 (P-value = 0.003),
• Estimated β2 = 5.88 (P-value < 0.001; SPSS displays .000)
• Interpretations:
Each 1-ounce increase in birthweight corresponds to a 0.126
mm Hg increase in the baby’s SBP on average, adjusting for the
effect of age in the model.
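Written out with the reported estimates, the fitted model is the equation below. The hat on y (predicted SBP) is notation added for illustration; x_1 and x_2 are birthweight in ounces and age in days as defined for the table.

```latex
\hat{y} = 53.45 + 0.126\,x_1 + 5.88\,x_2
```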
Multiple Linear Regression
• Birthweight: t statistic = 3.657; P-value = 0.003; thus the t
test is significant. We reject the null hypothesis that the slope
for birthweight is zero and conclude that birthweight contributes
significantly to explaining the dependent variable SBP,
adjusting for the effect of age in the model.
Multiple Linear Regression
• Age: t statistic = 8.656; P-value < 0.001 (SPSS displays 0.000);
thus the t test is significant and we conclude that age
contributes significantly to explaining SBP, adjusting for the
effect of birthweight in the model.
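The estimates and t statistics quoted in this example can be checked against the 16 observations in the data table. The following is a minimal sketch in plain Python, not the SPSS procedure itself: it solves the least-squares normal equations for two predictors by hand, and the function and variable names are made up for illustration.

```python
import math

# Data from the table of 16 infants: birthweight (oz), age (days), SBP (mm Hg).
x1 = [135, 120, 100, 105, 130, 125, 125, 105, 120, 90, 120, 95, 120, 150, 160, 125]
x2 = [3, 4, 3, 2, 4, 5, 2, 3, 5, 4, 2, 3, 3, 4, 3, 3]
y = [89, 90, 83, 77, 92, 98, 82, 85, 96, 95, 80, 79, 86, 97, 92, 88]


def mean(v):
    return sum(v) / len(v)


def centered_cross(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b1*x1 + b2*x2 with t statistics for the slopes."""
    n = len(y)
    s11, s22, s12 = centered_cross(x1, x1), centered_cross(x2, x2), centered_cross(x1, x2)
    s1y, s2y, syy = centered_cross(x1, y), centered_cross(x2, y), centered_cross(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    # Residual mean square on n - 3 degrees of freedom (three estimated parameters).
    mse = (syy - b1 * s1y - b2 * s2y) / (n - 3)
    se_b1 = math.sqrt(mse * s22 / det)
    se_b2 = math.sqrt(mse * s11 / det)
    return {"a": a, "b1": b1, "b2": b2, "t_b1": b1 / se_b1, "t_b2": b2 / se_b2}


fit = fit_two_predictors(x1, x2, y)
for name, value in fit.items():
    print(f"{name}: {value:.3f}")
```

The printed intercept, slopes and t statistics should agree with the SPSS figures up to rounding in the last reported digit. The P-values are not computed here because the Python standard library has no t distribution.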
Multiple Linear Regression
• Thus babies who are 2 days old and weigh 135 ounces
are expected to have an average SBP of 82.099 mm Hg
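The prediction is the fitted equation evaluated at X1 = 135 and X2 = 2. The sketch below shows the arithmetic twice: once with the coefficients as reported (0.126 and 5.88), and once with slopes 0.125 and 5.887, which reproduces 82.099 exactly. That the slide's figure came from these truncated slopes is an assumption; `predict_sbp` is a hypothetical helper name.

```python
def predict_sbp(birthweight_oz, age_days, a, b1, b2):
    """Predicted mean SBP (mm Hg) from the fitted equation a + b1*X1 + b2*X2."""
    return a + b1 * birthweight_oz + b2 * age_days


# Coefficients as reported in the SPSS summary.
reported = predict_sbp(135, 2, a=53.45, b1=0.126, b2=5.88)

# Assumed truncated slopes that reproduce the slide's 82.099.
truncated = predict_sbp(135, 2, a=53.45, b1=0.125, b2=5.887)

print(f"{reported:.3f}")   # 82.220
print(f"{truncated:.3f}")  # 82.099
```

Either way the predicted mean SBP is about 82 mm Hg; the difference in the first decimal place comes only from how many digits of each slope are carried.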
Simple Logistic Regression
                Smoking Status
Lung Cancer     yes     no
yes             10      6
no              20      74
Simple Logistic Regression
• The SPSS output of logistic regression with
Y = lung cancer (1 for yes and 0 for no) as the dependent variable
X = smoking status (1 for yes and 0 for no) as the explanatory
variable
• Ho: OR = exp(β) = 1
H1: OR = exp(β) ≠ 1
Or equivalently:
• Ho: β = 0
H1: β ≠ 0
Simple Logistic Regression
• If the 95% CI for the OR contains 1, the association between X
and Y is not statistically significant; otherwise the association
is significant.
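With a single yes/no predictor, the logistic regression slope equals the log of the cross-product odds ratio from the 2x2 table, so the rule above can be tried directly on the lung cancer and smoking counts. This is a sketch under that assumption, using a Wald interval on the log scale. The numbers it produces are computed here and are not copied from the SPSS output, which is not reproduced in these notes; the function name is hypothetical.

```python
import math

# Counts from the 2x2 table (lung cancer by smoking status).
cancer_smoker, cancer_nonsmoker = 10, 6
no_cancer_smoker, no_cancer_nonsmoker = 20, 74


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Cross-product odds ratio (a*d)/(b*c) with a Wald 95% CI on the log scale."""
    beta = math.log((a * d) / (b * c))              # log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of the log odds ratio
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se), beta


or_hat, lower, upper, beta = odds_ratio_with_ci(
    cancer_smoker, cancer_nonsmoker, no_cancer_smoker, no_cancer_nonsmoker
)

# The association is significant at the 5% level when the interval excludes 1.
significant = not (lower <= 1 <= upper)

print(f"beta = {beta:.3f}, OR = {or_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("significant" if significant else "not significant")
```

Here the odds ratio is about 6.17 and the lower limit sits just below 2, so the interval excludes 1 and the association between smoking and lung cancer is significant in this table.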
Multiple Logistic Regression
Accident in past year?   Vision problem?   Driver education course?
1                        1                 1
1                        0                 0
1                        1                 0
1                        0                 0
1                        1                 1
0                        0                 1
0                        1                 1
0                        0                 0
0                        0                 1
….                       ….                ….
1                        0                 1
1                        1                 0
1                        1                 0
Multiple Logistic Regression
• The odds ratio of a car accident for a predictor is equal to
exp(β) for that predictor.
• Ho: OR=exp(βvision_problem) = 1 in a model that contains age and
education course
H1: OR=exp(βvision_problem) ≠ 1 in a model that contains age and
education course
Or equivalently:
• Ho: βvision_problem = 0 in a model that contains age and
education course
H1: βvision_problem ≠ 0 in a model that contains age and
education course
• The same test for the driver’s education course:
Ho: βdrivers_education = 0 in a model that contains age and vision
problem
H1: βdrivers_education ≠ 0 in a model that contains age and vision
problem
Multiple Logistic Regression
• The odds ratio of car accident is equal to exp(β)
• Ho: OR=exp(βage) = 1 in a model that contains education class
and vision problem
H1: OR=exp(βage) ≠ 1 in a model that contains education class
and vision problem
Or equivalently:
• Ho: βage = 0 in a model that contains education class and
vision problem
H1: βage ≠ 0 in a model that contains education class and
vision problem
Multiple Logistic Regression
• SPSS outcome for multiple logistic regression corresponding to
the accident example.
Variables in the Equation
                                                             95% CI for Exp(B)
Step 1            B        S.E.    Wald    df   Sig.  Exp(B)  Lower    Upper
vision            1.710    .706    5.872   1    .015  5.527   1.387    22.036
drivereducation   -1.494   .705    4.496   1    .034  .224    .056     .893
age               .007     .018    .129    1    .719  1.007   .971     1.043
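The columns of this output are linked: the odds ratio is exp(B), the Wald statistic is (B / S.E.) squared, and the 95% limits are exp(B ± 1.96 × S.E.). The sketch below recomputes them from the first two numbers in each row (the coefficient and its standard error). Small differences from the printed values are expected because the inputs are already rounded to three decimals; the function name is hypothetical.

```python
import math

# (B, S.E.) pairs read from the SPSS rows; every other quantity is recomputed.
rows = {
    "vision": (1.710, 0.706),
    "drivereducation": (-1.494, 0.705),
    "age": (0.007, 0.018),
}


def summarize(b, se, z=1.96):
    """Wald statistic, odds ratio and Wald 95% CI from a coefficient and its SE."""
    lower, upper = math.exp(b - z * se), math.exp(b + z * se)
    return {
        "wald": (b / se) ** 2,
        "odds_ratio": math.exp(b),
        "lower": lower,
        "upper": upper,
        "significant": not (lower <= 1 <= upper),  # CI excludes 1
    }


results = {name: summarize(b, se) for name, (b, se) in rows.items()}
for name, r in results.items():
    print(
        f"{name}: OR = {r['odds_ratio']:.3f}, "
        f"95% CI = ({r['lower']:.3f}, {r['upper']:.3f}), "
        f"significant = {r['significant']}"
    )
```

Vision problem and driver education have intervals that exclude 1, while the interval for age contains 1, which matches the Sig. column (.015, .034 and .719).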
Multiple Logistic Regression
• The odds of being in a car accident for drivers who attended
the driver’s education program are 0.224 times those of drivers
who did not, adjusting for age and vision problem.
• Since OR = 0.224 < 1, taking the driving education course is a
protective factor against car accidents, adjusting for age and
vision problem.
EPHD310 Basic Biostatistics Course Learning Outcomes Per FHS Catalogue