
Assessing data quality: Bias

Dr. Stephanie Roll
Institute for Social Medicine, Epidemiology and Health Economics
Charité University Medical Center, Berlin, Germany
Learning objectives
• Assess the quality of study data.

• Learn about different types of bias.

• Distinguish between random and systematic error.

Example: Study on tuberculosis

Study question
What is the prevalence of
tuberculosis (TB) in Cambodia?
The perfect study
• Includes all Cambodians (ca. 14 mil.).
• Assesses TB status objectively, equally, and at the same time.
 Result (invented)
Persons with TB: 140 000 (1%)

Persons without TB: 13 860 000 (99%)

 we know the exact prevalence of TB: 1%


 Unfortunately, this study is a dream (unrealistic)
The realistic study
• Survey (cross-sectional) planned in 50 villages.
• Agreed to participate: 32 villages.
• Within each village: 97% of inhabitants tested.
 Result (invented)
Persons with TB: 280 000 (2%)

Persons without TB: 13 720 000 (98%)

 prevalence of TB: 2%?

What happened?
• Only 32 of 50 villages participated.

[Map: villages that participated vs. villages that did not participate]

• Villages with bigger TB problems may be more likely to participate
(more interested).

 "selection bias"
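The arithmetic behind this can be sketched in a few lines. The split of villages below is invented (the slides give only the totals), chosen so the true prevalence matches the slide's 1% and the observed survey result matches its 2%:

```python
# A minimal sketch with invented numbers, mirroring the slide's TB survey:
# if high-TB villages are more willing to participate, the survey only
# reaches them, and the observed prevalence overestimates the truth.

# Hypothetical split of Cambodia's 14 million inhabitants:
participating = {"n": 6_000_000, "prev": 0.02}        # high-TB villages, keen to join
not_participating = {"n": 8_000_000, "prev": 0.0025}  # low-TB villages, decline

total_n = participating["n"] + not_participating["n"]
true_cases = (participating["n"] * participating["prev"]
              + not_participating["n"] * not_participating["prev"])
true_prev = true_cases / total_n          # what the "perfect study" would find

observed_prev = participating["prev"]     # the survey only sees participants
print(f"true prevalence: {true_prev:.3f}, observed: {observed_prev:.3f}")
# -> true prevalence: 0.010, observed: 0.020
```

No amount of careful measurement within the 32 villages fixes this: the error comes from who was sampled, not how they were tested.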
Bias (systematic error)

Bias is a systematic tendency to get an incorrect result.

Bias is an error in the design or conduct of a study that results in a
conclusion which is different from the truth.
Selection bias I
[Diagram: total population vs. study sample]
1) Is the sample representative?

Selection bias II
[Diagram: study sample split into Group A and Group B]
2) Are the groups comparable?
3 possible explanations for a result

Association between exposure and outcome:
• Bias (systematic error)
• Chance (random error)
• True effect
Consequences of bias

Bias will result in an over- or underestimation of the true effect.
Two main types of bias

• selection bias: any aspect of the way subjects are assembled in the study
that creates a systematic difference between the compared populations that
is not due to the association under study

• information bias: any aspect of the way information is collected in the
study that creates a systematic difference between the compared populations
that is not due to the association under study
Two main types of bias, with many categories

selection bias:
non-respondent bias, self-referral bias, giving consent bias, sampling bias,
missing data bias, attrition bias, lost to follow-up bias, ... many more ...

information bias:
measurement bias, regression to the mean, recall bias, misclassification bias,
reporting bias, observer bias, interviewer bias, ... many more ...
Examples of selection bias

"Self-selection" of individuals to participate
• people interested in taking part in a study
• people giving consent vs. not giving consent

Selection of the sample by researchers
• different selection process for cases and controls
• different selection process for exposed and unexposed participants
Selection bias in a cohort study
Example: self-referral bias

A birth cohort on asthma to estimate the incidence and prevalence of asthma
in children over the next 10 years.

Parents who have asthma themselves are more likely to participate.
Mini-Quiz
How will this type of self-referral bias influence the prevalence of asthma
in children?

a) the prevalence will be overestimated 


b) the prevalence will be underestimated

c) the prevalence could be over or underestimated


Selection bias in a case-control study
Association of mobile phone use and brain tumors

• selection of cases: patients in hospital

• selection of controls: customers of a mobile phone store

A good idea?
a) yes
b) no

Selection bias in a case-control study
Association of mobile phone use and brain tumors

• selection of cases: patients in hospital

• selection of controls: customers of a mobile phone store

[Diagram: cases (brain tumor) include both phone users and non-users;
controls (no brain tumor), recruited at the phone store, are all phone users]
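The distortion can be quantified with a toy 2x2 table. The counts below are invented for illustration; the scenario assumes no true association and 50% phone use in the source population:

```python
# A minimal sketch with invented counts: recruiting controls at a phone
# store inflates exposure (phone use) among controls, dragging the odds
# ratio below the true value of 1.0.

def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a/b = exposed/unexposed cases, c/d = controls."""
    return (a * d) / (b * c)

# Cases (brain tumor) from hospital: 50 phone users, 50 non-users.
# Controls from the phone store: 95 phone users, 5 non-users.
biased_or = odds_ratio(50, 50, 95, 5)
print(f"observed OR = {biased_or:.2f} (true OR assumed 1.0)")
# -> observed OR = 0.05
```

The study would spuriously conclude that phone use is strongly "protective", purely because the control group was drawn from a place where exposure is near-universal.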
Attrition bias

• attrition = loss of participants
  – from the entire study
  – differential loss of participants between groups
Attrition bias: example
Study on hand washing promotion in children age 4-10

children N=200

hand washing promotion             control group (no hand washing promotion)
(N=100, mean age 6.7 y)            (N=100, mean age 6.4 y)

diarrhea                           diarrhea
(N=60, mean age 8.6 y)             (N=85, mean age 6.5 y)
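A simulation of a hypothetical mechanism (not the study's actual data) shows why the differing mean ages among completers matter. Here there is no true treatment effect at all; diarrhea risk is assumed to depend only on age, and one arm differentially loses its younger children:

```python
# A minimal sketch: differential dropout alone can make an ineffective
# treatment look protective, because the completers differ in a risk factor.
import random

random.seed(0)
N = 100_000

def risk(age):
    # Assumed for illustration: younger children are at higher risk.
    return 0.8 if age < 7 else 0.3

def completer_risk(dropout_if_young):
    outcomes = []
    for _ in range(N):
        age = random.uniform(4, 10)
        # Differential attrition: young children may leave this arm.
        if age < 7 and random.random() < dropout_if_young:
            continue
        outcomes.append(random.random() < risk(age))
    return sum(outcomes) / len(outcomes)

control = completer_risk(dropout_if_young=0.0)
treated = completer_risk(dropout_if_young=0.5)
print(f"control risk {control:.2f}, 'treated' risk {treated:.2f}")
# The 'treated' arm looks protective only because its completers are older.
```

This mirrors the slide's pattern: the promotion group's completers are notably older (8.6 vs. 6.5 years), so a naive comparison of completers confounds the intervention with age.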
Two main types of bias, with many categories

selection bias:
non-respondent bias, self-referral bias, giving consent bias, sampling bias,
missing data bias, attrition bias, lost to follow-up bias, ... many more ...

information bias:
measurement bias, regression to the mean, recall bias, misclassification bias,
reporting bias, observer bias, interviewer bias, ... many more ...
Information bias
also called measurement bias, classification bias, or misclassification bias

• systematic differences in how outcomes or exposures are assessed and
interpreted

• outcomes or exposures are 'misclassified'
Differential vs. non-differential information bias

Non-differential
• if misclassification of exposure is unrelated to disease
• if misclassification of disease is unrelated to exposure
 effect: bias towards the null (OR and RR closer to 1.0)

Differential
• if misclassification of exposure is related to disease
• if misclassification of disease is related to exposure
 effect: bias can go in either direction from the null; it can inflate or
attenuate your effect estimates (OR and RR)
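The "bias towards the null" claim can be checked numerically. The counts, sensitivity, and specificity below are invented; the key assumption is that the imperfect exposure measurement is applied identically to cases and controls (non-differential):

```python
# A minimal sketch: non-differential exposure misclassification pulls
# the odds ratio toward 1.0.

def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a/b = exposed/unexposed cases, c/d = controls."""
    return (a * d) / (b * c)

# True counts (hypothetical): the true OR is 4.0.
a, b = 200, 100   # cases: exposed, unexposed
c, d = 100, 200   # controls: exposed, unexposed
true_or = odds_ratio(a, b, c, d)

# Exposure measured with sensitivity 0.8 and specificity 0.9,
# identically in both groups (non-differential).
se, sp = 0.8, 0.9

def observed(exposed, unexposed):
    obs_exp = se * exposed + (1 - sp) * unexposed
    obs_unexp = (1 - se) * exposed + sp * unexposed
    return obs_exp, obs_unexp

a_obs, b_obs = observed(a, b)
c_obs, d_obs = observed(c, d)
obs_or = odds_ratio(a_obs, b_obs, c_obs, d_obs)
print(f"true OR = {true_or:.2f}, observed OR = {obs_or:.2f}")
# -> true OR = 4.00, observed OR = 2.62
```

If instead `se`/`sp` differed between cases and controls (differential misclassification), the observed OR could land on either side of the truth.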
Diagnostic suspicion bias

Knowledge about a subject's exposure leads to a more thorough search for the
outcome than for an unexposed individual.
 Exposed subjects are more likely to have the disease diagnosed than the
nonexposed.

Example: for a heavy smoker, one might check more thoroughly for signs of
cancer.
Mini-Quiz

How will diagnostic suspicion bias influence the association between
exposure and outcome?

a) the association will be overestimated



b) the association will be underestimated
Detection bias

• Occurs when an exposure, rather than causing disease, causes symptoms
that lead to a search for the disease.

 Even though unrelated, the disease is more often diagnosed in exposed
subjects.
Mini-Quiz

How will detection bias influence the prevalence?

a) the risk will be overestimated 


b) the risk will be underestimated
Recall bias

• Cases may more closely look into their past, searching for possible
explanations of their illness.
• Controls, not having the disease, may less closely examine their past
history.
• Recall bias is a problem especially in case-control / retrospective
studies!
Mini-Quiz
If recall bias is present in a case-control study, how will it usually
affect the result?

a) a greater association is found 

b) a smaller association is found
Reporting bias
• different information on exposure or outcome is obtained

Examples
• Cases (with severe or long-lasting disease) tend to have more complete
records and more complete information about exposures than controls
• Study participants give desirable answers
  – to support the researcher's hypothesis
  – to conceal undesirable or unaccepted behaviours (smoking during
pregnancy, violence within the family) or particular diseases
(sexually transmitted diseases, HIV)
Overmatching bias

In a matched case-control study: cases and controls are matched on a
non-confounding variable that is associated with the exposure but not with
the disease.

• Overmatching can underestimate an association

 Prevention: match only on confounding variables

Stages of research prone to bias (Sackett, 1979)
• Literature Review

• Study Design

• Study Execution

• Data Collection

• Data Analysis

• Interpretation of Results

• Publication

 All stages!
Biases in... Literature Review
- Foreign language exclusion bias
- Literature search bias
- One-sided reference bias
- Rhetoric bias
Biases in... Study Design: Selection bias
- Sampling frame bias
- Berkson (admission rate) bias
- Centripetal bias
- Diagnostic access bias
- Diagnostic purity bias
- Hospital access bias
- Migrator bias
- Prevalence-incidence (Neyman / selective survival; attrition) bias
- Telephone sampling bias
- Nonrandom sampling bias
- Autopsy series bias
- Detection bias
- Diagnostic work-up bias
- Door-to-door solicitation bias
- Previous opinion bias
- Referral filter bias
- Sampling bias
- Self-selection bias
- Unmasking bias
Biases in... Study Execution
- Wrong control bias
- Contamination bias (controls also receive treatment / are exposed)
- Compliance bias
Biases in... Data Collection
- Instrument bias:
  case definition bias, diagnostic vogue bias, forced choice bias,
  framing bias, insensitive measure bias, juxtaposed scale bias,
  laboratory data bias, questionnaire bias, scale format bias,
  sensitive question bias, stage bias, unacceptability bias,
  underlying/contributing cause of death bias, voluntary reporting bias
- Data source bias:
  competing death bias, family history bias, hospital discharge bias,
  spatial bias
- Observer bias:
  diagnostic suspicion bias, exposure suspicion bias, expectation bias,
  interviewer bias, therapeutic personality bias
Biases in... Data Collection (continued)
- Subject bias:
  apprehension bias, attention bias (Hawthorne effect), culture bias,
  end-aversion bias (end-of-scale/central tendency bias), faking bad bias,
  faking good bias, family information bias, interview setting bias,
  obsequiousness bias, positive satisfaction bias, proxy respondent bias,
  recall bias, reporting bias, response fatigue bias,
  unacceptable disease bias, unacceptable exposure bias,
  underlying cause (rumination) bias, yes-saying bias
- Data handling bias:
  data capture error, data entry bias, data merging error,
  digit preference bias, record linkage bias
Biases in... Data Analysis
- Confounding bias:
  latency bias, multiple exposure bias, nonrandom sampling bias,
  standard population bias, spectrum bias
- Analysis strategy bias:
  distribution assumption bias, enquiry unit bias, estimator bias,
  missing data handling bias, outlier handling bias, overmatching bias,
  scale degradation bias
- Post hoc analysis bias:
  data dredging bias, post hoc significance bias, repeated peeks bias
Biases in... Interpretation of Results
- Assumption bias
- Cognitive dissonance bias
- Correlation bias
- Generalisation bias
- Magnitude bias
- Significance bias
- Underexhaustion bias
Biases in... Publication
- All's well literature bias
- Positive result bias
- Hot topic bias
Overview of types of bias
Bias. Delgado-Rodríguez M, Llorca J. J Epidemiol Community Health.
2004 Aug;58(8):635-41. (additional pdf file)
Exercise: Sources of bias
Please find an example from your field of work
(a real one that you experienced or a hypothetical example).

1. Diagnostic suspicion bias
(knowledge about a subject's exposure leads to a more thorough search for
the outcome compared to an unexposed individual)
2. Recall bias
(cases and controls recall exposures differently)
3. Reporting bias of participants giving desirable answers
4. Selection bias
(regarding the total study population, or the groups to be compared)
3 possible explanations for a result

Association between exposure and outcome:
• Bias (systematic error)
• Chance (random error)
• True effect
Random errors
= deviation of results from the truth, occurring only as
a result of chance.

Possible reasons
• Variability of chosen sample from underlying
population
• Outcome or risk factor incorrectly assessed
(independent of group)
How to deal with random errors?
• Use a big sample size

• Calculate p-values and confidence intervals


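Both remedies can be seen in a short simulation. The true prevalence of 1% is assumed (echoing the TB example); a Wald confidence interval is used for simplicity:

```python
# A minimal sketch with simulated data: a larger sample shrinks random
# error, and the 95% confidence interval quantifies what remains.
import math
import random

random.seed(1)
TRUE_PREV = 0.01   # assumed true prevalence, as in the TB example

def estimate(n):
    cases = sum(random.random() < TRUE_PREV for _ in range(n))
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)             # standard error
    return p, (p - 1.96 * se, p + 1.96 * se)    # Wald 95% CI

results = {}
for n in (1_000, 100_000):
    p, (lo, hi) = estimate(n)
    results[n] = (lo, hi)
    print(f"n={n:>7}: prevalence {p:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")
```

The interval at n=100,000 is roughly ten times narrower than at n=1,000, reflecting the 1/sqrt(n) behaviour of random error.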
Random error and systematic error
random error
ystemattic errorr

good precision poor precision


good
d accuracy (unbiased)
( bi d) good
d accuracy (unbiased)
( bi d)
sy

good precision poor precision


poor accuracy (biased) poor accuracy (biased)
Mini-Quiz
If the study size increases, how does this affect random and systematic
errors?

random errors will:
a) get smaller 
b) get bigger
c) stay the same

systematic errors will:
d) get smaller
e) get bigger
f) stay the same
Random vs. systematic errors in epidemiological studies

Random error:
• will cancel each other out in the long run (large sample size)
• leads to imprecise results

Systematic error:
• will not cancel each other out, whatever the sample size
• leads to invalid (inaccurate) results
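The contrast above can be demonstrated directly. The numbers are assumed: a true value of 100 measured with a fixed +5 offset (systematic error, e.g. a miscalibrated instrument) plus Gaussian noise (random error):

```python
# A minimal sketch: averaging many measurements cancels random error
# but leaves systematic error untouched.
import random

random.seed(42)
TRUE_VALUE = 100.0
OFFSET = 5.0   # systematic error (assumed miscalibration)

def measure():
    return TRUE_VALUE + OFFSET + random.gauss(0, 10)  # + random error

means = {}
for n in (10, 100_000):
    means[n] = sum(measure() for _ in range(n)) / n
    print(f"n={n:>7}: mean = {means[n]:.2f} (truth = {TRUE_VALUE})")
# Precision improves with n, but the mean converges to 105, not 100:
# the +5 bias does not average away.
```

This is exactly the quiz's answer in code: growing the sample shrinks random error while the systematic error stays the same.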
Summary
• Observational studies are especially prone to bias!
• Be aware of all kinds and sources of bias when planning and conducting
a study.
• Think of all possible sources and types of bias and how to avoid them
(e.g. blinding wherever possible).
• Be aware of all kinds and sources of bias before believing the results
of a published study.
Association between exposure and outcome:
• Bias (systematic error)
• Chance (random error)
• True effect
Any questions or comments?
