
PBH 5107 Biostatistics for Public Health

Assignment 5
Due: November 21, 2023

Question 1
The HAMD-24 is a 24-item depression scale that can take values from 0 to 78, with
higher scores representing more severe depression. Suppose you have access to data for a
collection of patients who suffer from depression and run a linear regression analysis using the
variables SEX (0=male, 1=female), DUR (duration of depression in months), FAILED (number of
different antidepressant medications tried and failed), and AGE (in years) as independent
variables to predict the dependent variable HAMD-24. You obtain the following output.

Variable     Coefficient   Standard error   p-value
Intercept    17.32         5.07             0.001
SEX          -1.31         0.36             0.042
DUR          0.35          0.12             0.022
FAILED       2.46          0.99             0.033
AGE          -0.24         0.19             0.49

a) Which variables show a statistically significant association with depression severity?

SEX (p = 0.042), DUR (p = 0.022), and FAILED (p = 0.033) show a statistically significant
association with depression severity (p < 0.05). AGE does not (p = 0.49).
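
The significance check in part a) can be sketched in a few lines of Python. The p-values are copied from the output table; the 0.05 threshold and the variable names `p_values` and `ALPHA` are assumptions made for this illustration.

```python
# p-values copied from the regression output in Question 1
p_values = {"SEX": 0.042, "DUR": 0.022, "FAILED": 0.033, "AGE": 0.49}

ALPHA = 0.05  # conventional significance level (assumed here)

# keep the variables whose p-value falls below the threshold
significant = [name for name, p in p_values.items() if p < ALPHA]
print(significant)  # ['SEX', 'DUR', 'FAILED']
```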

b) According to this model, what can we say about the difference in depression severity
between males and females?

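According to this model, women score on average 1.31 points lower on the HAMD-24 than men
when the other variables (DUR, FAILED, AGE) are held constant. The sign follows from the coding:
SEX = 1 for females and 0 for males, so the coefficient -1.31 is the expected female-minus-male
difference. The difference is statistically significant (p = 0.042).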

c) Construct a 95% confidence interval for the effect of AGE.

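95% CI = b ± 1.96 × s
where b = coefficient and s = standard error
= -0.24 ± 1.96 × 0.19
= -0.24 ± 0.3724
(-0.61, 0.13) is the 95% CI for the effect of AGE. The interval includes 0, which is consistent
with AGE not being statistically significant.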
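
As a check on the arithmetic, the same interval can be computed with a small helper. The function name `normal_ci` is hypothetical, and the multiplier 1.96 is the normal approximation used in the answer above; this is a sketch, not the output of any statistics package.

```python
def normal_ci(b, se, z=1.96):
    """Return (lower, upper) for b +/- z * se (normal approximation)."""
    half_width = z * se
    return b - half_width, b + half_width

# AGE coefficient and standard error from the Question 1 output
lower, upper = normal_ci(-0.24, 0.19)
print(round(lower, 2), round(upper, 2))  # -0.61 0.13
```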

d) Suppose that a 30-year-old male depression patient has tried and failed (i.e. they didn’t
work) two different kinds of medication and has been suffering from depression for 12
months. What would his expected score be on the HAMD-24?

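Intercept = 17.32
SEX (male = 0): 0 × (-1.31) = 0
DUR: 12 × 0.35 = 4.2
FAILED: 2 × 2.46 = 4.92
AGE: 30 × (-0.24) = -7.2

Y = a + b1·X1 + b2·X2 + b3·X3 + b4·X4
Y = 17.32 + 0 + 4.2 + 4.92 - 7.2
Y = 19.24

Where Y is the expected score on the HAMD-24,
a is the intercept,
b1, ..., b4 are the coefficients of SEX, DUR, FAILED, and AGE, and
X1, ..., X4 are the patient's values of those variables.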

Therefore, the expected score on the HAMD-24 of this individual would be 19.24.
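
The prediction can also be reproduced programmatically. The following is a minimal sketch using the coefficients from the output table; the dictionary layout and the function name `predict_hamd24` are assumptions for illustration.

```python
# coefficients from the Question 1 output
coef = {"Intercept": 17.32, "SEX": -1.31, "DUR": 0.35, "FAILED": 2.46, "AGE": -0.24}

# the patient in part d): male, 12 months of depression, 2 failed medications, age 30
patient = {"SEX": 0, "DUR": 12, "FAILED": 2, "AGE": 30}

def predict_hamd24(coef, x):
    """Intercept plus the sum of coefficient * value over all predictors."""
    return coef["Intercept"] + sum(coef[name] * value for name, value in x.items())

score = predict_hamd24(coef, patient)
print(round(score, 2))  # 19.24
```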

e) Suppose that the R2 value for this model is 0.16. Would this be a good model for predicting
someone’s depression severity? Explain.

Given:
R2 is the proportion of the variance in the response (Y) that is explained by the predictor (X).
R2 is always between 0 and 1
R2=0 means the model explains nothing. X is not related to Y.
R2=1 means that X explains Y perfectly. Y is not actually random at all!

If the R2 value for this model is 0.16, then SEX, DUR, FAILED, and AGE together
explain only 16% of the variance in HAMD-24 scores, and 84% is left unexplained.
Several of the variables are significantly associated with depression severity, so the
model is not useless for describing associations, but it would not be a good model
for predicting an individual's score. More research covering additional variables
would be needed to account for more of the variation in HAMD-24 scores
(someone’s depression severity).
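
To put R2 = 0.16 in perspective, the short calculation below shows how much variation is left over. It relies on the approximation that the residual standard deviation is about sqrt(1 - R2) times the standard deviation of Y, ignoring the degrees-of-freedom adjustment; the variable names are illustrative.

```python
import math

r_squared = 0.16  # value given in part e)

# share of the variance in HAMD-24 that the model leaves unexplained
unexplained = 1 - r_squared

# approximate ratio of residual SD to the SD of HAMD-24 itself
residual_sd_ratio = math.sqrt(unexplained)

print(round(unexplained, 2), round(residual_sd_ratio, 2))  # 0.84 0.92
```

Under this approximation, the spread of prediction errors is still roughly 92% of the spread of the raw scores, which is why the model is weak for individual prediction.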

f) How would you explain the effect of failing a trial of medication? “Failing a trial of
medication” means trying a new antidepressant medication but stopping because it doesn’t
work.
Each additional failed medication trial (a one-unit increase in FAILED) is associated with
an expected increase of 2.46 points in the HAMD-24 score (someone’s depression severity),
with all other variables held constant. This is an association, not proof that failing a trial
causes more severe depression (correlation does not imply causation).

g) How would you explain the effect of the DUR variable?

The coefficient of DUR (duration of depression in months) is 0.35.

Each additional month of depression (a one-unit increase in DUR) is associated with an
expected increase of 0.35 points in the HAMD-24 score (someone’s depression severity),
with all other variables held constant (correlation does not imply causation).
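
The phrase "all other factors remaining the same" can be illustrated by comparing two hypothetical patients who differ in only one variable. The sketch below uses the Question 1 coefficients; the patients and the helper `predict` are invented for illustration only.

```python
coef = {"Intercept": 17.32, "SEX": -1.31, "DUR": 0.35, "FAILED": 2.46, "AGE": -0.24}

def predict(x):
    """Expected HAMD-24 score for a dict of predictor values."""
    return coef["Intercept"] + sum(coef[k] * v for k, v in x.items())

base = {"SEX": 0, "DUR": 12, "FAILED": 2, "AGE": 30}  # hypothetical patient
one_more_month = dict(base, DUR=13)       # same patient, DUR + 1
one_more_failure = dict(base, FAILED=3)   # same patient, FAILED + 1

dur_effect = predict(one_more_month) - predict(base)
failed_effect = predict(one_more_failure) - predict(base)
print(round(dur_effect, 2), round(failed_effect, 2))  # 0.35 2.46
```

The differences equal the coefficients regardless of which baseline values are chosen, because the model is linear with no interaction terms.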

Question 2
Suppose you fit a different model to the same data as follows.

Variable     Coefficient   Standard error   p-value
Intercept    18.09         5.02             0.001
SEX          -1.27         0.33             0.039
FAILED       2.40          0.98             0.031
AGE          2.24          0.89             0.041

a) It seems that the story about AGE has changed from the model in question 1. Explain what
might have caused this change in terms of confounding between AGE and DUR.

AGE and DUR are likely correlated: older patients have had more time to accumulate months of
depression, so they tend to have larger DUR values. In the Question 1 model DUR is included, so
the AGE coefficient (-0.24, p = 0.49) describes the association of age with the HAMD-24 score
among patients with the same duration, and it is small and not significant. In this model DUR is
left out, so AGE also picks up the effect of the longer durations that come with older age. Its
coefficient changes sign to 2.24 and becomes significant (p = 0.041). The change is therefore
best explained by confounding: DUR is associated with both AGE and the HAMD-24 score, and
omitting it distorts the estimated effect of AGE. This is different from effect modification, in
which a third variable changes the size of the effect of the exposure on the response variable.
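
The size of the shift in the AGE coefficient can be related to the AGE-DUR association through the omitted-variable identity for least squares: (AGE coefficient without DUR) = (AGE coefficient with DUR) + (DUR coefficient) × delta, where delta is the slope on AGE when DUR is regressed on the remaining predictors. The sketch below backs out the delta implied by the two tables. It assumes both models are ordinary least squares fits to the same observations and ignores rounding in the reported coefficients; delta is not reported anywhere in the assignment.

```python
b_age_full = -0.24     # AGE coefficient in Question 1 (DUR included)
b_dur_full = 0.35      # DUR coefficient in Question 1
b_age_reduced = 2.24   # AGE coefficient in Question 2 (DUR omitted)

# identity: b_age_reduced = b_age_full + b_dur_full * delta
# solve for delta, the implied months of DUR per additional year of AGE
delta = (b_age_reduced - b_age_full) / b_dur_full
print(round(delta, 2))  # 7.09
```

Under these assumptions, each additional year of age would go along with roughly 7 more months of depression duration, which is the kind of strong AGE-DUR association that produces the sign change.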
