
EPID 620/PUBH 801: Epidemiologic Methods I

TAKE HOME FINAL EXAM


DUE: Thursday, December 19th, 2019, 11:59 pm

Value: 25 points

Directions and important notes:

• This exam includes a section of SAS interpretation (13 points) and a short-answer section (12 points).

• You may refer to your notes, textbooks, class slides, class exercises, etc. to answer the questions.

• You may NOT talk to each other about the exam; this is an individual activity. However, you are welcome to e-mail questions to the instructor.

• Make sure to read the exam carefully, and to be both comprehensive and succinct in your answers.

• In the SAS section, use complete and grammatically correct sentences when providing answers, explanations and interpretations. Do not dump raw SAS output into your answer; where prompted, summarize output in tables. Interpret your results below the pertinent table.


I. SAS OUTPUT INTERPRETATION

SCENARIO: A study of smoking and colon cancer recruited 48 men who have been diagnosed

with colon cancer in the past 3 years (prevalent and incident cases) and a random sample of 52

colon cancer-free men who received a colonoscopy within the past month at the same U.S.

hospital where the cases are being treated. Participants were asked about their smoking history

in the past 5 years. For smoking, participants were categorized as “exposed” if they reported

smoking at least an average of one pack per day over the 5-year period, and “unexposed”

otherwise.

Researchers suspected several other variables as possible confounders or effect modifiers of the

smoking - colon cancer relationship, and so they collected information about the following

variables over the past 5 years: alcohol consumption (dichotomized as ≥2 vs. <2 drinks per day),

aspirin consumption (dichotomized as ≥0.75 vs. <0.75 mg per day), coffee consumption (≥2 vs.

<2 cups per day) and exercise (≥3 vs. <3 times per week). The SAS syntax and its output are on

Blackboard, along with the dataset if you wish to run the analyses yourself. The codebook for

variables is below:

Alphabetic List of Variables and Attributes

#  Variable  Type  Length  Original Format     Recoding  Label
2  alcohol   Num   8       0 = Unexposed / No  2         Average ≥2 alcoholic drinks per day
                           1 = Exposed / Yes   1
5  aspirin   Num   8       0 = No              2         Average ≥0.75 mg aspirin per day
                           1 = Yes             1
3  cancer    Num   8       0 = Control         2         Has colon cancer
                           1 = Case            1
7  coffee    Num   8       0 = No              2         Average of ≥2 cups of coffee per day
                           1 = Yes             1
6  exercise  Num   8       0 = No              2         Physically active (≥3 times per week)
                           1 = Yes             1
1  id        Num   8                                     Study ID
4  smoking   Num   8       0 = No              2         Totaled ≥5 pack-years (average of ≥1 pack per day)
                           1 = Yes             1

Note: All you need to answer the following questions is the SAS output provided in PDF form

on Blackboard. The SAS syntax is provided to give you a sense of some of the analyses

performed before arriving at the final outputs; not all the syntax has associated output provided.

(1) Create a table to display the distribution of the exposure and other variables, stratified by

colon cancer status (e.g., describe the proportion of participants who smoked, drank

alcohol, etc., among cases and controls) and summarize the findings (1-2 sentences) [2

points].


Exposure variable   Proportion exposed among CASES   Proportion exposed among CONTROLS
Exercise            43.75%                           61.54%
Coffee              60.42%                           46.15%
Smoking             72.92%                           40.38%
Alcohol             68.75%                           26.92%
Aspirin             33.33%                           61.54%

Summary: Smoking, alcohol consumption, and coffee consumption were more common among cases than among controls, whereas exercise and aspirin use were more common among controls.

(2) Use a table to present the study findings. The table should start with the unadjusted

model, then move to assessment of confounding and effect modification, and should

finally present and label the correct and final results. Describe and interpret only the final

and relevant results in detail, briefly mentioning why you chose these particular results to

present (short paragraph, ~4-5 sentences). [4 points]

MODEL                                                      FINAL RESULTS

Alcohol consumption and colon cancer                       OR = 5.97 (CI: 2.51-14.18)
Aspirin intake and colon cancer                            OR = 0.313 (CI: 0.14-0.71)
Exercise and colon cancer                                  OR = 0.486 (CI: 0.22-1.08)
Coffee consumption and colon cancer                        OR = 1.78 (CI: 0.80-3.94)
Smoking and colon cancer, unadjusted                       Crude OR = 3.97 (CI: 1.71-9.24)
Smoking and colon cancer, adjusted for alcohol,            OR = 5.69 (CI: 1.94-16.59)
  aspirin, exercise, and coffee
Smoking and colon cancer, adjusted for alcohol,            Odds ratio of smoking and colon cancer (when adjusted for
  aspirin, and exercise                                    alcohol, aspirin and
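
As a rough check on the crude smoking estimate, the sketch below recomputes an odds ratio from a 2x2 table. The cell counts are an assumption: they are back-calculated from the reported percentages (72.92% of 48 cases and 40.38% of 52 controls) rather than read from the SAS output. The Woolf (log-scale) interval with z = 1.96 is also an assumed method, used here only because it appears to reproduce the reported interval.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper


# Assumed counts, back-calculated from the reported percentages:
# 35 of 48 cases and 21 of 52 controls reported smoking.
smoking_or, smoking_lo, smoking_hi = odds_ratio_ci(a=35, b=13, c=21, d=31)
print(f"Crude OR = {smoking_or:.2f} (CI {smoking_lo:.2f}-{smoking_hi:.2f})")
```

Under these assumed counts the result rounds to 3.97 (1.71-9.24), which agrees with the crude row of the table. The adjusted estimates come from a multivariable model and cannot be reproduced from the 2x2 table alone.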

(3) Discuss limitations and potential sources of bias in this study. How might these issues

have affected the results and causal inferences that can be drawn from them? (short

paragraph, ~4-5 sentences) [3 points]

A key limitation of this study is the selection of the control group: controls were drawn from men who had recently received a colonoscopy at the same hospital, so they may not represent the source population that gave rise to the cases.

The study is therefore susceptible to selection bias, and sound causal inferences cannot be made, because the exposure distribution among the controls may differ from that of the source population.


(4) Suggest an alternate study design (be specific, including details like exposure/outcome

assessment and measure of association) for this research question that would address

some of the limitations that you described in Q3. How could your proposal improve

internal/external validity and the causal inferences that can be made? Does your proposal

have any drawbacks? (short paragraph, ~4-5 sentences) [4 points]

To address the selection bias, every member of the source population should have an equal chance of being selected as a control, so a case-cohort design would be a better fit. The appropriate measure of association for the case-cohort design is the risk ratio (RR). This design would improve the results because controls can be selected independently of exposure, using the same sampling fraction for both the exposed and the unexposed.

II. SHORT ANSWERS (1-2 SENTENCES) [1 point each question]

Questions 1 and 2 refer to the following scenario: You are interested in the total causal effect of

Exposure on Disease. You have measured covariates A and B. U is an unmeasured variable.

The true causal relations are displayed in the following DAG:

[Figure: DAG with nodes Exposure, Disease, A, B, and U]


1) Variable A

a) Should you condition on A? No

b) Why/why not? Because A is a mediator on the causal pathway from exposure to disease. The exposure affects disease both directly and indirectly through A, so conditioning on A would block the indirect path and we would fail to estimate the total effect of exposure on the outcome.

2) Variable B

a) Should you condition on B? No

b) Why/why not? Because this would lead to overadjustment; in addition, B is preceded by the unmeasured variable U, which cannot be adjusted for.

3) A naïve researcher conducts a retrospective cohort study of smoking on mortality, starting

with a cohort of individuals born in 1900, and following them until the year 2019. The

researcher finds that the risk ratio for mortality (comparing smokers to nonsmokers) = 1,

indicating no association. But we know smoking is harmful, and the researcher’s methods

are free of bias. Why was this result obtained?

This is because follow-up was so long that nearly everyone in the cohort had died: people born in 1900 would be 119 years old in 2019, so the cumulative risk of death was essentially 100% among both smokers and nonsmokers, giving a risk ratio of 1. A rate ratio or a comparison of survival times would reveal the harm of smoking.

4) The risk ratio for the association of E with D is 2.0 when Z=0, and is also 2.0 when Z=1.

What does this imply about effect measure modification by Z on the risk difference?


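This implies that there is no effect measure modification by Z on the risk ratio (multiplicative) scale. Unless the baseline risk is the same in both strata of Z, the risk difference will differ between the strata, so there is effect measure modification on the risk difference (additive) scale.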
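
For the scenario in this question, a small numeric sketch can show how a risk ratio of 2.0 in both strata translates to the risk difference scale. The baseline (unexposed) risks of 0.10 and 0.20 are hypothetical values chosen for illustration; the question does not give them.

```python
def risk_difference(baseline_risk, risk_ratio):
    """Risk difference implied by an unexposed risk and a risk ratio."""
    exposed_risk = baseline_risk * risk_ratio
    return exposed_risk - baseline_risk


RISK_RATIO = 2.0  # identical in both strata of Z, as stated in the question

# Hypothetical unexposed risks in each stratum of Z (not given in the question).
baseline_by_z = {0: 0.10, 1: 0.20}

rd_by_z = {z: risk_difference(r0, RISK_RATIO) for z, r0 in baseline_by_z.items()}
for z, rd in rd_by_z.items():
    print(f"Z={z}: RD = {rd:.2f}")
```

With these assumed baselines the risk difference is 0.10 when Z=0 and 0.20 when Z=1. The two risk differences would coincide only if the baseline risks were equal across strata.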

5) Which one of the following four exposure-disease relationships is best suited for an RCT,

and why? 1) E=new exercise regimen, D=LDL cholesterol levels; 2) E=high-fiber diet,

D=rare form of pancreatic cancer; 3) E=sedentary profession, D=obesity, 4) E=adolescent

exposure to ibuprofen, D=Alzheimer’s disease.

Option 1, because RCTs are used to assess interventions such as drugs or new regimens: an exercise regimen can be randomly assigned, and LDL cholesterol is a common outcome that can change within a short follow-up period. In addition, an intention-to-treat analysis keeps all of the initial participants in their assigned groups.

6) E causes D only in the presence of X. With respect to the E-D association, what is X: a

confounder, a mediator, an effect modifier, or none of these?

X is an effect modifier: the effect of E on D differs across levels of X, since E causes D when X is present and has no effect when X is absent.

7) We are often concerned with selection bias when we consider control selection in a case-

control study. What does this mean, stated in terms of “sampling fractions”?

Case-control studies are prone to selection bias when controls are not sampled from the source population independently of exposure. In terms of sampling fractions, this means that the fraction of exposed non-cases selected as controls differs from the fraction of unexposed non-cases selected as controls; when the two sampling fractions are equal, the odds ratio is not distorted by control selection.
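
The role of sampling fractions can be illustrated with a minimal sketch. The source population counts and the sampling fractions below are hypothetical numbers chosen for illustration, and the function assumes that all cases are included while controls are sampled from non-cases.

```python
def sampled_odds_ratio(cases_exp, cases_unexp, noncases_exp, noncases_unexp,
                       frac_exp, frac_unexp):
    """Expected OR when controls are sampled from non-cases.

    frac_exp and frac_unexp are the control sampling fractions among
    exposed and unexposed non-cases; all cases are assumed to be included.
    """
    controls_exp = noncases_exp * frac_exp
    controls_unexp = noncases_unexp * frac_unexp
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)


# Hypothetical source population (illustrative numbers only).
population = dict(cases_exp=60, cases_unexp=40,
                  noncases_exp=3000, noncases_unexp=6000)

true_or = sampled_odds_ratio(**population, frac_exp=1.0, frac_unexp=1.0)
equal_fractions = sampled_odds_ratio(**population, frac_exp=0.02, frac_unexp=0.02)
unequal_fractions = sampled_odds_ratio(**population, frac_exp=0.04, frac_unexp=0.02)

print(f"Source population OR:      {true_or:.2f}")
print(f"Equal sampling fractions:   {equal_fractions:.2f}")
print(f"Unequal sampling fractions: {unequal_fractions:.2f}")
```

In this sketch, equal sampling fractions return the source population odds ratio, while oversampling exposed controls pulls the estimate down.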

8) In a case-control study of aspirin use (dichotomous) and myocardial infarction, both cases

and controls have an equally difficult time accurately recalling their recent aspirin

consumption. What kind of misclassification is this, and in which direction are the results

likely to be biased?

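This is non-differential misclassification of exposure, because cases and controls recall their aspirin use equally poorly (recall bias, by contrast, is differential). Because the misclassification is non-differential and the exposure is dichotomous, the results are likely to be biased toward the null and not away from it.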
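
A minimal sketch of this situation applies the same recall accuracy to cases and controls and compares the odds ratio before and after misclassification. The true counts and the sensitivity and specificity of recall are hypothetical values chosen for illustration.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after exposure misclassification."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio for a case-control 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


# Hypothetical true (exposed, unexposed) counts for aspirin use.
true_cases = (20, 80)
true_controls = (40, 60)

# Hypothetical recall accuracy, identical for cases and controls.
SENSITIVITY, SPECIFICITY = 0.8, 0.9

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*misclassify(*true_cases, SENSITIVITY, SPECIFICITY),
                         *misclassify(*true_controls, SENSITIVITY, SPECIFICITY))

print(f"True OR = {true_or:.3f}, observed OR = {observed_or:.3f}")
```

With these assumed inputs the observed odds ratio lies between the true value and 1, which is the pattern expected when the same error rates apply to both groups.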

9) The crude RR for the association between E and D is 4.5. When you adjust for C, the RR

becomes 1.7. Is C more likely to be a confounder or a mediator, and why?

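It is not possible to tell from these numbers alone whether C is a confounder or a mediator, because adjusting for either one can move the RR toward the null. The distinction depends on the causal structure: whether C is a common cause of E and D or lies on the pathway from E to D.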

10) A new screening test for lung cancer is evaluated. Researchers find that people whose lung

cancer is identified via the new screening test live an average of 7 years longer after their

diagnosis than people whose lung cancer is identified without the screening test. Please give

two reasons why this result may not indicate that the screening test is beneficial.

First, screening by itself does not necessarily lengthen life, nor can it prevent the disease. Because screening moves the diagnosis earlier, before symptoms occur, survival measured from diagnosis appears longer even if the time of death is unchanged (lead-time bias). Second, people who take screening tests tend to be more conscious of their health and rarely engage in habits that would compromise it, so they may live longer regardless of screening.
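
The lead-time argument can be sketched with simple arithmetic. The ages below are hypothetical and were chosen so that the apparent gain matches the 7 years in the question; the age at death is assumed to be the same with and without screening.

```python
def survival_after_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived between diagnosis and death."""
    return age_at_death - age_at_diagnosis


AGE_AT_DEATH = 70  # hypothetical, assumed identical with or without screening

symptomatic = survival_after_diagnosis(age_at_diagnosis=67, age_at_death=AGE_AT_DEATH)
screened = survival_after_diagnosis(age_at_diagnosis=60, age_at_death=AGE_AT_DEATH)
apparent_gain = screened - symptomatic

print(f"Survival after symptomatic diagnosis: {symptomatic} years")
print(f"Survival after screening diagnosis:   {screened} years")
print(f"Apparent gain with no change in death: {apparent_gain} years")
```

In this sketch the screened patient appears to survive 7 years longer although death occurs at the same age in both cases.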

11) Briefly discuss the similarities and differences among confounder, effect measure modifier

and mediation. Please state the measure of association to report for each one.

A confounder is a third variable that influences both the exposure and the outcome, while a mediator is a variable that lies on the causal pathway between the exposure and the outcome. An effect measure modifier, on the other hand, is a third variable across whose levels the effect of the exposure on the outcome differs. In other words, it modifies the relationship between exposure and disease. All three are third variables that shape how the relationship between the exposure and the outcome is estimated and interpreted.

MEASURES OF ASSOCIATION

Confounder: Adjusted measure of association

Effect measure modifier: Stratum-specific measures of association

Mediator: Crude measure of association

12) You are a public health commissioner of a town with scarce resources, so you must carefully

allocate your spending on important health problems. Your epidemiologist tells you that

being bitten by a local species of spider results in a risk ratio of 1.7 for a particular

degenerative nerve disease. What other information should you obtain before embarking on

an expensive spider eradication program?

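Several pieces of information are crucial to obtain, including the baseline risk of the disease in the population, how common spider bites are (the prevalence of exposure), the resulting number of cases attributable to bites, and other important details of the source population, such as the cost and expected effectiveness of eradication compared with other health priorities.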
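
One way to see why exposure prevalence matters is the population attributable fraction. The sketch below uses Levin's formula, which assumes the risk ratio is unconfounded; the two bite prevalences are hypothetical values, since the question gives only the risk ratio of 1.7.

```python
def attributable_fraction(prevalence_exposed, risk_ratio):
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1))."""
    excess = prevalence_exposed * (risk_ratio - 1)
    return excess / (1 + excess)


RISK_RATIO = 1.7  # from the question

# Hypothetical proportions of residents who have been bitten.
paf_rare = attributable_fraction(0.01, RISK_RATIO)
paf_common = attributable_fraction(0.30, RISK_RATIO)

print(f"Bites in 1% of residents:  PAF = {paf_rare:.1%}")
print(f"Bites in 30% of residents: PAF = {paf_common:.1%}")
```

Under these assumptions, eradication could prevent under 1% of cases if bites are rare but roughly 17% if bites are common, and the absolute number of cases prevented would further depend on the baseline risk of the disease.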

