Critical Appraisal Systematic Reveiw 2-9-16 1

07/09/59
Critical appraisal:
Systematic Review &
Meta-analysis
Atiporn Ingsathit MD.PhD.

Section for Clinical Epidemiology and biostatistics
Faculty of Medicine Ramathibodi Hospital
Mahidol University
What is a review?
 A review provides a summary of evidence to
answer important practice and policy questions
without readers having to spend the time and
effort to summarize the evidence themselves.
Type of review
 Narrative review (conventional review)
 Review article
 Chapter from textbook
 Systematic review
Why do we need systematic reviews?
Problems of conventional review

 Broad clinical questions
 Unsystematic approach to collecting evidence
 Unsystematic approach to summarizing evidence
 Tendency to be biased by the author's opinions
 Large volume of evidence
 Conflicting evidence
What is a systematic review?

 A review of a particular subject undertaken in
such a systematic way that risk of bias is
reduced.
 Systematic reviews have explicit, scientific, and
comprehensive descriptions of their objectives
and methods.
Hunink, Glasziou et al, 2001.
AIMS
 Systematic: to reduce bias
 Explicit (precisely and clearly expressed): to ensure reproducibility
Systematic review
What is a meta-analysis?
 The analysis of multiple studies, including
statistical techniques for merging and
contrasting results across studies.
 Synonyms: research synthesis, systematic
overview, pooling, and scientific audit.
 Focus on contrasting and combining results from
different studies in the hopes of identifying
patterns among study results.
 Quantitative methods applied only after rigorous
qualitative selection process.
Hunink, Glasziou et al, 2001.
Meta-analysis
 Estimates treatment effects
 Reduces the probability of false-negative results (increases the power of the test)
 Potentially leads to a more timely introduction of effective treatments
Process of conducting a systematic review and meta-analysis
 Define the question: PICO
 Conduct literature search
  Sources: databases, experts, funding agencies, pharmaceutical companies, hand-searching, references
 Identify titles and abstracts
 Apply inclusion and exclusion criteria
  Titles and abstracts → full articles → final eligible articles → agreement
 Create data abstraction
  Data abstraction, methodologic quality, agreement on validity
 Conduct analysis
  Determine method of generating pooled estimates
  Pooled estimates (if appropriate)
  Explore heterogeneity → conduct subgroup analysis
  Explore publication bias
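As an illustration of the "pooled estimates" step, the sketch below shows one common way to generate a pooled estimate: inverse-variance fixed-effect pooling of log risk ratios. The slides do not say which pooling method the example reviews used, so the method, the function name `pool_fixed_effect`, and the numbers are all assumptions made for illustration.

```python
import math

def pool_fixed_effect(log_effects, standard_errors):
    """Inverse-variance fixed-effect pooling of study-level log effects."""
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical log risk ratios and standard errors (not from any real trial).
log_rr = [0.10, 0.30, 0.20]
se = [0.20, 0.25, 0.15]

pooled, pooled_se = pool_fixed_effect(log_rr, se)
lower = math.exp(pooled - 1.96 * pooled_se)
upper = math.exp(pooled + 1.96 * pooled_se)
print(f"Pooled RR = {math.exp(pooled):.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The pooled estimate always lies between the smallest and largest study estimates, and its standard error is smaller than that of any single study, which is the gain in power the earlier slide refers to.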
Example
Users’ guides for how to use review articles
Gordon Guyatt,
Roman Jaeschke, Kameshwar Prasad,
and Deborah J Cook
Users’ Guides to Medical Literature: A Manual
for Evidence-Based Clinical Practice 2008
1. Assess the systematic review validity.
* Did the review explicitly address a sensible clinical question?
* Did the review include explicit and appropriate eligibility criteria?
* Was biased selection and reporting of studies unlikely?
* Was the search for relevant studies detailed and exhaustive?
* Were the primary studies of high methodologic quality?
* Were assessments of studies reproducible?
2. What are the results?
* Were the results similar from study to study?
* What are the overall results of the review?
* How precise were the results?
3. How can I apply the results to patient care?
* Were all patient-important outcomes considered?
* Are any postulated subgroup effects credible?
* What is the overall quality of the evidence?
* Are the benefits worth the costs and potential risks?
Validity criteria
1. Did the Review Explicitly Address a Sensible
Clinical Question?
 P Lupus nephritis
 I Mycophenolate mofetil (MMF)
 C Cyclophosphamide (CYC)
 O Complete, partial remission, adverse events
Validity criteria
2. Did the review include explicit and
appropriate eligibility criteria?
 Range of patients (older/younger, severity)
 Range of interventions (dose, route)
 Range of outcomes (short/long-term, surrogate/clinical)
Validity criteria
3. Was biased selection and reporting of studies unlikely?
Clear inclusion and exclusion criteria

Topic      Guides
Therapy    Were patients randomized?
           Was follow-up complete?
Diagnosis  Was the patient sample representative of those with the disorder?
           Was the diagnosis verified independently using a gold standard?
Harm       Did the investigators demonstrate similarity in all known
           determinants of outcome or adjust for differences in the analysis?
           Was follow-up sufficiently complete?
Prognosis  Was there a representative sample of patients?
           Was follow-up sufficiently complete?
Study Search and Selection

 One reviewer (NK) electronically searched the MEDLINE database using
  PubMed (National Library of Medicine, Bethesda, MD) (1951 to December 2009)
  Ovid (Wolters Kluwer, New York, NY) (1966 to December 2009)
  The Cochrane Central Register of Randomized Controlled Trials (CENTRAL; The Cochrane Library, issue 4, 2009) (United States Cochrane Center, Baltimore, MD).
 Search terms used without language restriction were as follows:
(mycophenolate mofetil or mycophenolate) and
cyclophosphamide and (lupus nephritis or glomerulonephritis),
limited to randomized controlled trials.
 Two reviewers (NK and AT) independently screened titles and abstracts.
Validity criteria
4. Was the Search for Relevant Studies
Detailed and Exhaustive?
 Why should effort be exerted to search for published and unpublished articles?
 Which articles tend to be published more: the ones with positive or negative results?
 If positive articles tend to be published more, how will this affect meta-analyses of treatment interventions?
Publication bias
 Positive studies are more likely
  to be published
  to be published in English
  to be cited by other authors
  to produce multiple publications
 Large studies are more likely to be published even if they have negative results
 Quality of study
  Lower methodological quality tends to show larger effects
 Bias due to association between treatment effect and study size
Publication bias assessment

 Using the Egger test on the 5 trials, we found borderline
evidence of bias (coefficient = 2.03, SE = 0.64, p = 0.049)
from the small study effects.
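To make the Egger test concrete, here is a minimal sketch of the regression behind it: each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero suggests small-study effects. The data below are hypothetical and do not reproduce the 5 trials or the coefficient reported above; the function name `egger_intercept` is illustrative. A p-value would additionally need a t distribution with n - 2 degrees of freedom, which is left out to keep the sketch dependency-free.

```python
import math

def egger_intercept(effects, standard_errors):
    """Egger regression: standardized effect (effect/SE) on precision (1/SE).

    Returns the intercept and its standard error from ordinary least squares.
    """
    x = [1.0 / se for se in standard_errors]
    y = [eff / se for eff, se in zip(effects, standard_errors)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    intercept_se = math.sqrt(residual_var * (1.0 / n + x_mean ** 2 / sxx))
    return intercept, intercept_se

# Hypothetical log risk ratios and standard errors for 5 trials:
# the smaller (less precise) trials show the larger effects.
log_rr = [0.60, 0.45, 0.30, 0.20, 0.10]
se = [0.40, 0.30, 0.20, 0.15, 0.10]

intercept, intercept_se = egger_intercept(log_rr, se)
t_statistic = intercept / intercept_se
print(f"Egger intercept = {intercept:.2f} (SE {intercept_se:.2f}), t = {t_statistic:.2f}")
```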
[Figure: funnel plot for complete remission]
Validity criteria
5. Were the Primary Studies of High Methodologic
Quality?
Methodologic Quality
 PRISMA guidelines
Validity criteria
6. Were Assessments of Studies Reproducible?
 Having 2 or more people participate in each decision
 Good agreement
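Agreement between two reviewers is often summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The slides do not state which agreement statistic the example review used, so kappa is an assumption here, and the include/exclude decisions below are hypothetical.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters judging the same items."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)

# Hypothetical screening decisions for 10 abstracts.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "include", "exclude", "exclude", "exclude",
              "exclude", "exclude", "exclude", "include", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

Here the reviewers agree on 9 of 10 abstracts (observed agreement 0.90) while chance agreement is 0.54, so kappa is about 0.78.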
Data Extraction and Risk Assessment
 Two reviewers (NK and AT) independently
performed data extraction.
 We extracted trial characteristics (for example,
study design, sample size, treatment dosage
and duration, WHO classification, renal biopsy
information) and definitions (complete
remission and complete/partial remission).
Results
1. Were the results similar from study to
study?
Explore heterogeneity
What does heterogeneity mean?
 The results differ significantly between studies.
 The possibility of excess variability between the results of the different trials/studies is examined by the test of heterogeneity.
 Why?
  The studies might not have been conducted according to a common protocol.
  Variations in patient groups, clinical setting, concomitant care, and the methods of delivery of the intervention, or the method of measurement of exposure for observational studies.
How do we detect heterogeneity?
1) Visual interpretation
2) Do statistical tests (e.g. Cochran Q test, p < 0.1 implies heterogeneity, or I² > 0.7)
Visual interpretation
Do statistical tests
Statistical test (1)

 Statistical test of heterogeneity (yes/no)
 Cochran Q
 Null hypothesis of the test for heterogeneity is that the
underlying effect is the same in each of the studies.
 Low P value means that random error is an unlikely
explanation of the differences in results from study to
study.
 High P value increases our confidence that the
underlying assumption of pooling holds true.
Statistical test (2)

 Magnitude of heterogeneity
 I² statistic
 Provides an estimate of the percentage of variability in results across studies that is likely due to true differences in treatment effect as opposed to chance
 As the I² increases, we become progressively less comfortable with a single pooled estimate, and need to look for explanations of variability other than chance
 I² < 0.25: small heterogeneity
 I² 0.25-0.5: moderate heterogeneity
 I² > 0.5: large heterogeneity
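The two statistics above can be computed from study-level effects and standard errors. The sketch below uses the standard definitions Q = Σ wᵢ(yᵢ − ȳ)² with inverse-variance weights and I² = (Q − df)/Q, truncated at zero, and reports I² as a proportion to match the thresholds on this slide. The function names and the data are illustrative assumptions; the p-value for Q (chi-square with k − 1 degrees of freedom) is omitted to avoid external dependencies.

```python
def heterogeneity(effects, standard_errors):
    """Cochran Q, its degrees of freedom, and I-squared (as a proportion)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2

def describe_i2(i2):
    """Label I-squared using the cut-offs given on the slide."""
    if i2 < 0.25:
        return "small"
    if i2 <= 0.5:
        return "moderate"
    return "large"

# Hypothetical log risk ratios with equal standard errors.
effects = [0.1, 0.5, 0.9]
standard_errors = [0.1, 0.1, 0.1]

q, df, i2 = heterogeneity(effects, standard_errors)
print(f"Q = {q:.1f} on {df} df, I2 = {i2:.2f} ({describe_i2(i2)} heterogeneity)")
```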
Plot study results
Forest plot or metaview
What can authors do if there is heterogeneity?
1) Identify the source of heterogeneity
2) Try to group studies into homogeneous categories (sensitivity analysis)
3) No statistical combination (no meta-analysis)
Results
2 What are the overall results of the review?
Results
3. How precise were the results?
Confidence Intervals
[Figure: confidence intervals plotted on a risk ratio axis from 0.6 to 1.6]
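Precision is read from the width of the confidence interval around the risk ratio. As a minimal sketch, the code below computes a risk ratio and its 95% confidence interval from 2x2 counts using the usual log-scale standard error. The counts are hypothetical and the function name `risk_ratio_ci` is illustrative.

```python
import math

def risk_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio (treatment vs control) with a confidence interval."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical trial: 30/100 events with treatment, 20/100 with control.
rr, lower, upper = risk_ratio_ci(30, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

In this made-up example the interval includes 1, so the result is compatible with no effect despite a point estimate of 1.5; a larger sample would narrow the interval.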
3. How can I apply the results to patient care?
* Were all patient-important outcomes considered?
* Are any postulated subgroup effects credible?
* What is the overall quality of the evidence?
* Are the benefits worth the costs and potential risks?
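Number needed to treat (NNT)
Number of patients who need to be treated to prevent one additional event
NNT = 1/(Rc - Rt) = 1/ARR
(Rc = risk in the control group, Rt = risk in the treatment group, ARR = absolute risk reduction)
Number needed to harm (NNH)
Number of patients who need to be treated for one additional patient to be harmed
NNH = 1/(Rt - Rc)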
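A short worked example of the two formulas, using hypothetical risks (none of these numbers come from the trials discussed in the slides). Function names are illustrative.

```python
def number_needed_to_treat(risk_control, risk_treatment):
    """NNT = 1 / (Rc - Rt), where the event is an outcome to be prevented."""
    arr = risk_control - risk_treatment  # absolute risk reduction
    if arr <= 0:
        raise ValueError("No risk reduction: NNT is not defined")
    return 1.0 / arr

def number_needed_to_harm(risk_treatment, risk_control):
    """NNH = 1 / (Rt - Rc), where the event is an adverse outcome."""
    ari = risk_treatment - risk_control  # absolute risk increase
    if ari <= 0:
        raise ValueError("No risk increase: NNH is not defined")
    return 1.0 / ari

# Hypothetical: the bad outcome occurs in 20% of controls and 15% of treated patients.
nnt = number_needed_to_treat(0.20, 0.15)
# Hypothetical: an adverse event occurs in 10% of treated patients and 6% of controls.
nnh = number_needed_to_harm(0.10, 0.06)
print(f"NNT = {nnt:.1f}, NNH = {nnh:.1f}")
```

Comparing the two numbers is one way to weigh benefit against harm: here about 20 patients are treated to prevent one event, and about 25 are treated for each additional adverse event.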
NNT and NNH
Network meta-analysis
Meta-analysis
 Traditional meta-analysis addresses the merits of one intervention vs. another
 Drawback: it evaluates the effect of only 1 intervention vs. 1 comparator
 Does not permit inferences about the relative effectiveness of several interventions
* For many medical conditions, there is a selection of interventions that have most frequently been compared with placebo and occasionally with one another.
Network Meta-analysis (NMA)

 Multiple or mixed treatment comparison meta-analysis
 NMA approach provides estimates of effect sizes for all
possible pairwise comparisons whether or not they have
actually been compared head to head in RCTs.
Network Meta-analysis
 A network meta-analysis combines direct and
indirect sources of evidence to estimate
treatment effects.
 Direct evidence on the comparison of two
particular treatments will be obtained from studies
that contain both treatments
 Indirect evidence is obtained from studies that compare each of the two treatments with a common comparator only.
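One simple way to form an indirect estimate through a common comparator is the adjusted indirect comparison (often attributed to Bucher): if B and C have each been compared with A, the effect of C vs B is the difference of the two direct effects, and the variances add. This is a sketch of that idea only; a full NMA model does more, and the slides do not say which method the example review used. The numbers are hypothetical.

```python
import math

def indirect_estimate(effect_b_vs_a, se_b_vs_a, effect_c_vs_a, se_c_vs_a):
    """Indirect effect of C vs B through the common comparator A.

    Effects are on an additive scale such as the log risk ratio.
    """
    effect_c_vs_b = effect_c_vs_a - effect_b_vs_a
    # The two direct estimates come from separate trials, so variances add.
    se_c_vs_b = math.sqrt(se_b_vs_a ** 2 + se_c_vs_a ** 2)
    return effect_c_vs_b, se_c_vs_b

# Hypothetical log risk ratios versus the common comparator A.
effect, se = indirect_estimate(-0.20, 0.10, -0.50, 0.12)
print(f"Indirect log RR (C vs B) = {effect:.2f}, SE = {se:.3f}")
```

The indirect standard error is larger than either direct one, which is why indirect evidence alone usually supports less confidence than a head-to-head comparison of similar size.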
Consideration in NMA
1. Among trials available for pairwise comparisons,
are the studies sufficiently homogenous to combine
for each intervention? (An assumption that is also
necessary for a conventional meta-analysis)
2. Are the trials in the network sufficiently similar, with
the exception of the intervention (eg, in important
features, such as populations, design, or outcomes)?
3. Where direct and indirect evidence exist, are the
findings sufficiently consistent to allow confident
pooling of direct and indirect evidence together?
Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice, 3rd ed 2015
Gordon Guyatt, Drummond Rennie, Maureen O. Meade, Deborah J. Cook
http://jamaevidence.mhmedical.com/book.aspx?bookID=847
I. How Serious Is the Risk of Bias?
1. Did the Meta-analysis Include Explicit and Appropriate Eligibility Criteria?
 PICO
 Broader eligibility criteria → enhance generalizability of the results → if participants are too dissimilar → heterogeneity
 Diversity of interventions → excessive if authors pool results from different doses or even different agents in the same class, based on the assumption that effects are similar.
 Eligibility criteria may be too broad in their inclusion of different populations, different doses or different agents in the same class, or different outcomes to make comparisons across studies credible.
Research question
 We therefore conducted a systematic review and
network meta-analysis with the aim of comparing
complete recovery rates at 3 and 6 months for
corticosteroids, AVT (Acyclovir or Valacyclovir), or
the combination of both for treatment of adult
Bell’s palsy.
 P
 I
 C
 O
Eligibility criteria
 Studies were included if they were RCTs and studied subjects aged 18 years or older with sufficient data. Non-English papers were excluded from the review.
2. Was Biased Selection and Reporting of Studies Unlikely?
 Include all interventions, because data on clearly suboptimal or abandoned interventions may still offer indirect evidence for other comparisons
 Apply the search strategies from other systematic reviews → only if authors have updated the search to include recently published trials
 Some industry-initiated NMAs may choose to consider only a sponsored agent and its direct competitors
  Omitting the optimal agent → gives a fragmented picture of the evidence
 Selection of NMA outcomes should not be data driven but based on importance for patients, and should consider both outcomes of benefit and harm.
Search strategy
 One author (NP) located studies in MEDLINE (from
1966 to August 2010) and EMBASE (from 1950 to
September 2010) using PubMed and Ovid search
engines.
 Search terms used were as follows: (Bell’s palsy or
idiopathic facial palsy) and (antiviral agents or
acyclovir or valacyclovir), limited to randomized
controlled trials.
Selection of study
 Where eligible papers had insufficient
information, corresponding authors were
contacted by e-mail for additional information.
 The reference lists of the retrieved papers
were also reviewed to identify relevant
publications.
 Where there were multiple publications from
the same study group, the most complete and
recent results were used.
Study selection
Outcome
 Complete recovery was defined as
  a score ≤ 2 on the House-Brackmann Facial Recovery scale,
  ≥ 8 on the Facial Palsy Recovery Index,
  > 36 points on the Yanagihara score, or 100 on the Sunnybrook scale.
3. Did the Meta-analysis Address Possible Explanations of Between-Study Differences in Results?
 When clinical variability is present → conduct subgroup analyses or meta-regression to explain heterogeneity → more optimally fit the clinical setting and characteristics of the patient you are treating.
 Multiple control interventions (eg, placebo, no intervention, older standard of care)
  It is important to account for potential differences between control groups
  Potential placebo effect
Plan for exploring heterogeneity
4. Did the Authors Rate the Confidence in Effect Estimates for Each Paired Comparison?
 Ideally, for each paired comparison, authors will present the pooled estimate for the direct comparison (if there is one) and its associated rating of confidence, the indirect comparison(s) that contributed to the pooled estimate from the NMA and its associated rating of confidence, and the NMA estimate and the associated rating of confidence.
Reasons to lose confidence in a comparison of treatments
 RCTs failed to protect against risk of bias through allocation concealment, blinding, and preventing loss to follow-up;
 Confidence intervals on pooled estimates are wide (imprecision);
 Results vary from study to study and we cannot explain the differences (inconsistency);
 The population, intervention, or outcome differs from that of primary interest (indirectness).
II. What Are the Results?
1. What Was the Amount of Evidence in the Treatment Network?
 Gauge from the number of trials, total sample size, and number of events for each treatment and comparison
 Understanding the geometry of the network
(nodes and links) will permit clinicians to
examine the larger picture and see what is
compared to what
 The credible intervals around direct, indirect,
and NMA estimates provide a helpful index
Result at 3 months
Result at >3 months
2. Were the Results Similar From Study to Study?
 NMA, with larger numbers of patients and studies -
more powerful exploration of explanations of
between-study differences
 The search conducted by NMA authors for
explanations for heterogeneity may be informative.
 NMA - vulnerable to unexplained differences in
results from study to study
Inconsistency
[Figure: triangle network of treatments A, B, and C; three study designs: AB, AC, ABC]
 When the direct and indirect sources of
evidence within a network do not agree, this is
known as inconsistency
3. Were the Results Consistent in Direct and Indirect Comparisons?
 Direct or indirect: which is most trustworthy?
 Requires assessing whether the direct and indirect estimates are consistent or discrepant.
 Inconsistency in results between the direct and indirect comparisons → decreased confidence in estimates
 Statistical methods exist for checking this type of inconsistency, typically called a test for incoherence.
Potential Reasons for Incoherence Between the Results of Direct and Indirect Comparisons
 Chance
 Genuine differences in results
  Differences in enrolled participants, interventions, background management
 Bias in head-to-head (direct) comparisons
  Publication bias
  Selective reporting of outcomes and of analyses
  Inflated effect size in trials stopped early
  Limitations in allocation concealment, blinding, loss to follow-up, analysis as randomized
 Bias in indirect comparisons
  Each of the biasing issues above
Test for incoherence

 Discrepancy of treatment effects between direct and indirect
meta-results was then assessed using the standardized
normal method (Z), i.e. by dividing the difference by its
standard error.
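The Z method described above can be sketched as follows: the difference between the direct and indirect estimates is divided by the standard error of that difference, and the result is referred to the standard normal distribution. The sketch assumes the two estimates are independent, so their variances add; the numbers are hypothetical and the function name is illustrative.

```python
import math

def incoherence_z(direct, direct_se, indirect, indirect_se):
    """Z statistic and two-sided p-value for direct vs indirect discrepancy."""
    difference = direct - indirect
    # Assumes direct and indirect estimates are independent.
    difference_se = math.sqrt(direct_se ** 2 + indirect_se ** 2)
    z = difference / difference_se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    p_value = 2.0 * (1.0 - cdf)
    return z, p_value

# Hypothetical log risk ratios for the same comparison.
z, p_value = incoherence_z(direct=-0.10, direct_se=0.15,
                           indirect=-0.30, indirect_se=0.16)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

A large p-value does not prove coherence; with few trials the test has little power, which is why chance is listed first among the potential reasons for incoherence.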
4. How Did Treatments Rank and How Confident Are We in the Ranking?
 Besides presenting treatment effects, authors may also
present the probability that each treatment is superior to all
other treatments, allowing ranking of treatments.
 May be misleading because
 Fragility in the rankings
 Differences among the ranks may be too small to be
important
 Other limitations in the studies (eg, risk of bias,
inconsistency, indirectness).
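To see why rankings can be fragile, the sketch below approximates the probability that each treatment is best by repeatedly drawing each treatment's effect from a normal distribution around its estimate and counting how often it comes out on top. This is a simplified illustration, not the procedure of any particular NMA software: it treats the estimates as independent (real NMA estimates are correlated), and the treatment names and numbers are hypothetical.

```python
import random

def probability_best(estimates, n_draws=10000, seed=0):
    """Approximate P(best) per treatment; larger effect means better.

    estimates maps a treatment name to (effect estimate, standard error).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in estimates}
    for _ in range(n_draws):
        draws = {name: rng.gauss(mean, se)
                 for name, (mean, se) in estimates.items()}
        best = max(draws, key=draws.get)
        wins[best] += 1
    return {name: count / n_draws for name, count in wins.items()}

# Hypothetical effects versus a reference: C has the largest point
# estimate but also the widest uncertainty.
estimates = {"A": (0.00, 0.10), "B": (0.15, 0.10), "C": (0.20, 0.25)}

probabilities = probability_best(estimates)
for name, prob in sorted(probabilities.items()):
    print(f"P({name} is best) = {prob:.2f}")
```

A treatment can rank first mainly because its estimate is imprecise, so rank probabilities should be read alongside the effect estimates and the confidence ratings for each comparison.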
5. Were the Results Robust to Sensitivity Assumptions and Potential Biases?
 May assess the robustness of the study findings by applying sensitivity analyses that reveal how the results change if some criteria or assumptions change.
 Sensitivity analyses may include restricting
the analyses to trials with low risk of bias only
or examining different but related outcomes
III. How Can I Apply the Results to Patient Care?
1. Were All Patient-Important Outcomes Considered?
 Many NMAs report only 1 or a few outcomes of
interest
 Adverse events are infrequently assessed in
meta-analysis and in NMAs.
 More likely to include multiple outcomes and
assessments of harms
2. Were All Potential Treatment Options Considered?
 Network meta-analyses may place restrictions on what treatments are examined.
 Need background knowledge review.
3. Are Any Postulated Subgroup Effects Credible?
 Criteria exist for determining the credibility of subgroup analyses.
 NMAs allow a greater number of RCTs to be evaluated and may offer more opportunities for subgroup analysis.
 Single common comparator: star network
  Only allows for indirect comparison → reduced confidence in effect
 Use both direct and indirect evidence
  Increased confidence in estimates of interest
 Mixture of indirect links and closed loops, unbalanced shapes
  High confidence for some
  Low confidence for others
Hierarchy of Evidence
Systematic reviews
Randomized Controlled Trials
Cohort studies
Case-control studies
Cross-sectional studies
Case reports
Take home messages
 A systematic review is secondary research. It focuses on a research question and tries to identify, appraise, select, and synthesize all high-quality research evidence relevant to that question.
 Meta-analysis is a statistical tool of a systematic review, which is broadly defined as a quantitative review and synthesis of the results of related but independent studies.
Take home messages

 NMA can provide extremely valuable
information in choosing among multiple
treatments offered for the same condition
 It is important to determine the confidence one
can place in the estimates of effect of the
treatments considered and the extent to which
that confidence differs across comparisons.