
Journal of Clinical Epidemiology 65 (2012) 163–178

Development of the RTI item bank on risk of bias and precision of observational studies
Meera Viswanathan*, Nancy D. Berkman
Social Policy, Health, and Economics Research, RTI International, 3040 Cornwallis Road, PO Box 12194, Research Triangle Park, NC 27709-2194, USA
Accepted 20 May 2011; Published online 29 September 2011

Abstract
Objective: To create a practical and validated item bank for evaluating the risk of bias and precision of observational studies of interventions or exposures included in systematic evidence reviews.
Study Design and Setting: The item bank, developed at RTI International, was created based on 1,492 questions included in earlier instruments, organized by the quality domains identified by Deeks et al. Items were eliminated and refined through face validity, cognitive, content validity, and interrater reliability testing.
Results: The resulting item bank consisting of 29 questions for evaluating the risk of bias and precision of observational studies of interventions or exposures (1) captures all of the domains critical for evaluating this type of research, (2) is comprehensive and can be easily lifted "off the shelf" by different researchers, (3) can be adapted to different topic areas and study types (e.g., cohort, case–control, cross-sectional, and case series studies), and (4) provides sufficient instruction to apply the tool to varied topics.
Conclusion: One bank of items, with specific instructions for focusing abstractor evaluations, can be created to judge the risk of bias and precision of the variety of observational studies that may be used in systematic and comparative effectiveness reviews. © 2012 Elsevier Inc. All rights reserved.
Keywords: Risk of bias assessment; Systematic review methodology; Quality assessment; Observational studies; Instrument development; Reliability testing;
Validity testing

* Corresponding author. Tel.: 919-316-3930; fax: 919-541-7384.
E-mail address: viswanathan@rti.org (M. Viswanathan).
0895-4356/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
doi: 10.1016/j.jclinepi.2011.05.008

What is new?

Key finding
We created and validated an item bank, entitled the "RTI item bank," to evaluate risk of bias and precision for observational studies of interventions or exposures included in systematic literature reviews. It accommodates a variety of observational study design types, including studies with controls (cohort and case–control) and without controls that rely on changes or differences in exposure (cross-sectional and case series).

What this adds to what was known?
No gold standard exists for evaluating the risk of bias of observational studies. Existing tools require modification or may not be applicable for specific designs such as cross-sectional or case series. In practice, review groups often develop their own critical appraisal tool. These ad hoc tools may lack validated questions and adequate instructions for reviewers, leading to inconsistent evaluations within and across reviews.
We created a practical and validated item bank for evaluating the conduct of observational studies of interventions or exposures that (1) is comprehensive, capturing all of the risk of bias and precision domains critical for evaluating this type of research; (2) can be easily adapted to different topic areas and study types (e.g., cohort, case–control, cross-sectional, and case series studies); and (3) provides instruction to assist reviewers in creating and applying the best tool for varied topics.

What is the implication, what should change now?
Systematic reviewers should adopt validated tools that enable greater transparency and consistency in evaluating risk of bias and precision of observational studies. The RTI item bank is one such tool.

1. Introduction

In the past decade, the number of publications included in PubMed has increased at an average annual rate of nearly 6% from 467,364 citations in 1998 to 816,597 in 2008. This steady expansion in the volume of published studies increases the complexity and variability of information that policy makers, clinicians, and patients need to evaluate to make informed health care choices. Systematic reviews that compare interventions play a key role in synthesizing the evidence [1]. The assessment of the design and conduct of individual studies is central to this synthesis and is routinely used for interpreting results and grading the strength of the body of evidence. Systematic reviewers may also use these assessments to select studies for the review, meta-analysis, and interpreting heterogeneous findings [2].

Although well-designed and well-implemented randomized controlled trials (RCTs) have long been considered the gold standard for evidence, they frequently cannot answer all relevant clinical questions. RCTs may be unethical [3], limited in their ability to address harms because of limited size or length of follow-up [4], or lack of applicability to vulnerable subpopulations [5]. Observational studies (lacking randomization, allocation concealment, blinding of participants and interventionists, and in some instances, control groups) may fill these gaps, but the trade-off is a wider range of sources of bias, including potential biases in selection, performance, detection of effects, and attrition; these biases have the potential to alter effect sizes unpredictably [6,7].

The inclusion of non-RCT studies in systematic reviews requires validated tools to assess the likelihood of bias. Approaches to critical appraisal of study methodology and related terminology have varied and are evolving. Overlapping terms include quality, internal validity, risk of bias, or study limitations, but a central goal is an assessment of the believability of the findings. We use the phrase "assessment of risk of bias and precision" as the most representative of the goal of evaluating the degree to which the effects reported by the study represent the "true" causal relationship between exposure and outcome, that is, the accuracy of the estimation. The accuracy of an estimate depends on its validity (the absence of bias or systematic error in selection, performance, detection, measurement, attrition, and reporting and adequacy in addressing potential confounders) and precision (the absence of random error through adequate study size and study efficiency) [8]. Thorough assessment of these threats to the validity and precision of an estimate is critical to understanding the believability of a study. Table 1 presents a taxonomy and description of threats to validity and precision, drawing on two well-cited sources: the Cochrane Handbook for Systematic Reviews of Interventions [7] and Modern Epidemiology [8].

Several reviews of critical appraisal tools, including Deeks et al. [9] and West et al. [10], identified key quality domains but found no gold standard in evaluating quality [9–12]. Deeks et al. reviewed quality appraisal tools for nonrandomized studies. Of 213 identified tools, only six [13–18] met their criteria of evaluating six core elements of internal validity (creation of groups, comparability of groups at the analysis stage, allocation to intervention, similarity of groups for key prognostic characteristics by design, identification of prognostic factors, and the use of case-mix adjustment) and were specifically designed for use in systematic reviews [9]. These tools vary in the criteria covered [9] and their overall approach. Tools focus on either a description or reporting of methods (questions regarding whether authors reported a particular element of the study in a manuscript) or a judgment of risk of bias (questions regarding whether the conduct of the study altered the believability of results).

Existing tools also have other constraints. Some tools such as the Newcastle Ottawa Scale [14] are scales that rely mostly or entirely on uniform weights for all questions. The use of uniform weights may be difficult to justify in all contexts [7]; for example, if, for a particular topic, a single flaw substantially increases risk of bias. Tools may require modification or may not be applicable for specific designs such as cross-sectional or case series. In practice, the idiosyncrasies of topics require and often result in each review developing its own critical appraisal tool. These ad hoc tools may lack validated questions and adequate instruction for reviewers, leading to inconsistent evaluations within and across reviews.

Our objective was to create a practical and validated item bank for evaluating the conduct of observational studies of interventions or exposures that (1) is comprehensive, capturing all of the risk of bias and precision domains critical for evaluating this type of research; (2) can be easily adapted to different topic areas and study types (e.g., cohort, case–control, cross-sectional, and case series studies); and (3) provides instruction to assist reviewers in creating and applying the best tool for varied topics.

Our resulting risk of bias and precision item bank provides a means to assess threats to the accuracy of an estimate provided in a study and is applicable to evaluating
• studies of interventions or exposures that lack random allocation to an intervention and rely on associations between changes or differences in exposure or interventions and changes or differences in an outcome of interest [19]. It is not designed to evaluate diagnostic studies.
• a variety of observational study design types, including studies with controls (cohort and case–control) and without controls that rely on changes or differences in exposure (cross-sectional and case series) [20].
• internal validity only and not external validity (applicability).
Table 1. Threats to precision and validity^a

Threat: Definition

Threats to precision (random error)
Inadequate study size: Not powered to test study hypothesis.
Lack of study efficiency: Absence of needed stratification in design. When confounding and effect modifiers do not exist, an equal apportionment ratio between exposed and unexposed is the most efficient design. Comparisons within strata may be required to account for known confounders and effect modifiers. Matching on stratification variables allows for an efficient design.

Threats to validity (systematic error)
Selection bias: Systematic differences in baseline characteristics of the groups that are compared (for multiple-arm studies) or within the group (for single-arm or cross-sectional studies). For example, from self-selection of treatments, physician-directed selection of treatments, or demographic, clinical, or social characteristics, or from failure to account for intention-to-treat. Includes confounding from differential selection before exposure and disease as well as selection bias where exposure and/or disease influence the selection of the participants.
Performance bias: Systematic differences in the care provided to participants in the comparison groups other than the intervention under investigation (for multiple-arm studies) or within groups (for single-arm and cross-sectional studies), e.g., variation in delivery of the protocol, difference in co-interventions, inadequate blinding of providers and participants (variation unlikely in observational studies).
Attrition bias: Systematic differences among the comparison groups in the loss of participants from the study (for multiple-arm studies) or within groups (for single-arm and cross-sectional studies) and how they were accounted for in the results, e.g., incomplete follow-up, differential attrition.
Detection bias: Systematic differences in outcomes assessment among the comparison groups (for multiple-arm studies) or within groups (for single-arm and cross-sectional studies), e.g., inadequate assessor blinding, differential outcome assessment.
Reporting bias: Systematic differences between reported and unreported findings, e.g., differential reporting of outcomes or harms, potential for bias in reporting through source of funding.
Information bias: Systematic differences caused by measurement errors, e.g., recall bias.

^a From Rothman et al. [8] and Higgins and Green [7].

Although we did not test the reliability of our item bank for other study designs, we believe that it can be used for evaluating these studies as well, with some modifications. For instance, evaluations of quasi-experimental studies will need to add, in addition to questions from our item bank, questions from a validated RCT appraisal tool on allocation concealment and blinding of patients and interventionists. We anticipate that systematic review study directors (referred to as principal investigators [PIs]) will select specific items based on the needs of the review topic and the most likely potential sources of bias and threats to precision in the included studies.

1.1. Approaches to assessing the risk of bias and precision of studies

As noted above, Deeks et al. [9, p23] identified two approaches to evaluating the quality of observational studies, focusing on either a description of methods (the evaluation of the "objective characteristics of each study's methods as they are described by the primary researchers") or an evaluation of the risk of bias and threats to precision. Study appraisal based on risk of bias lists potential sources of bias (Table 1), relies heavily on judgment, and is supported by transparency in recording reasons for the judgment. One constraint of this approach is that threats to validity and precision can occur at various points in the study. Assessing these threats without explicit reference to methods used at each stage of research would require a relatively abstract evaluation and could result in poor interrater reliability. The alternative approach of "methods description" is easier to implement because methods for each stage of research tend to correspond well with how manuscripts are written. This approach relies less on reviewer judgment [9] but may fall short of evaluating believability. One solution, which we have adopted, uses both approaches, using the methods description for each stage of research as the primary framework to facilitate ease of review, but evaluating how the design and conduct of the study at that stage addresses threats to validity and precision. This approach requires the reviewer to judge risk of bias in the context of adequate reporting and description of methods. In developing our item bank, we identified questions relevant to each of the 12 "methods" domains identified by Deeks et al. [9]: (1) background/context, (2) sample definition and selection, (3) interventions/exposure, (4) outcomes, (5) creation of treatment groups, (6) blinding, (7) soundness of information, (8) follow-up, (9) analysis comparability, (10) analysis outcome, (11) interpretation, and (12) presentation and reporting. The item bank provides a tool for abstractors to review a manuscript to identify the risk of bias and threats to precision for these domains.

2. Methods

The project was conducted in two phases. The preliminary period, phase 1, resulted in the compilation of
potential questions for the item bank. Phase 2 included face validity testing, cognitive testing, content validity testing, and interrater reliability testing. Also, to improve usability of the item bank, we reduced the number of questions to those relevant for evaluating risk of bias and precision in observational studies, eliminating questions regarding applicability or the conduct of RCTs. We also dropped questions that were not relevant to systematic reviews, overlapped with other questions, or had responses that were uninterpretable in the context of evaluating bias or precision. When possible, questions on risk of bias or threats to precision subsume evaluations of limitations resulting from deficiencies in quality of reporting. We indicate when specific questions may be excluded for particular study designs or for the entire body of evidence. These deletions and modifications occurred over each stage of validity and reliability testing. The Results section provides details on the disposition of specific questions.

2.1. Compilation of potential questions for item bank

During phase 1, we compiled a large number of items that had been used previously in Agency for Healthcare Research and Quality (AHRQ)-sponsored systematic reviews and other instruments to evaluate the conduct of individual observational studies (risk of bias, precision, and other threats to validity) [21–108]. Technical Expert Panel (TEP) members identified additional instruments [13,16,109–112]. To ensure that we included items addressing all important methods domains, we sorted items into the domains identified by Deeks et al. [9]. We created a prototype item bank containing questions, corresponding multiple choice response categories, and instructions for interpreting individual items by PIs and abstractors.

2.2. Expert review and face validity testing

In phase 2, we convened a TEP, composed of 16 senior staff from across Evidence-based Practice Centers (EPCs) and the AHRQ to provide expert input throughout the process of creating the item bank. The TEP's activities included reviewing and advising on the proposed conceptual framework; ensuring that the items in the bank adequately evaluated all the domains identified by Deeks et al. and were not relevant to RCTs alone; sharing their knowledge of earlier instruments that have been used for measuring risk of bias and precision; and evaluating the face validity of the first draft of the item bank. For the face validity exercise, TEP members provided input on whether items were likely to be interpreted correctly and appeared to measure what they were intended to measure.

2.3. Cognitive testing

Nine potential users (staff at six EPCs, namely, Rand Corporation, RTI International (RTI)–University of North Carolina, University of Alberta, University of Connecticut, University of Minnesota, and Vanderbilt University) participated in cognitively testing a second version of the item bank. Each interviewee independently answered questions concerning the readability of particular items, including instructions, questions, and response categories, to determine whether all portions of the item bank were being interpreted in the manner they were intended. At least one of the project PIs, accompanied by a note taker, conducted the cognitive interview. Because interviewees were experienced systematic reviewers, each was asked to also evaluate the instrument in relation to whether it contained sufficient and appropriate questions to obtain information on all critical domains.

2.4. Content validity testing

Seven TEP members participated in content validity testing of a third version of the item bank. Content validity raters reviewed each question and determined whether they considered it to be essential, useful, or not necessary for evaluating study risk of bias and precision. Raters repeated the exercise four times, separately in relation to cohort, case–control, case series, and cross-sectional studies. We summarized results through a content validity ratio (CVR) score that describes the extent to which the group of reviewers considered each question to be essential to evaluating each of the domains [113]. The CVR varies from −1.00 to +1.00. A CVR = 0.00 would indicate that half of the reviewers considered a question to be essential. In this study, we considered a CVR > 0 to indicate that an item was essential.

2.5. Interrater reliability

We tested the performance of the item bank by conducting interrater reliability testing. We included all questions rated essential or useful by a majority of content validity raters. Twelve individuals with varying levels of experience in conducting systematic reviews used the item bank to independently evaluate the risk of bias and precision of 10 studies that had previously been included in a systematic review of the literature. These 10 studies represented a cross section of topic areas and risk of bias and precision concerns that can arise in observational studies (Table 2) [114–123]. For each study, reviewers were instructed to evaluate all questions included in a fourth version of the item bank in relation to the key questions of the study's original systematic review. Reviewers received a copy of the article and summary information from the systematic review including key questions, key outcomes (benefits and/or harms), any important confounding variables, and the conceptual model (analytic framework) included with the review.

The structure of the item bank consists of multiple-choice response questions only. Because we anticipated that some questions would not be relevant to each study, we included a "not applicable" response category for some items. Reviewers also commented on whether particular
Table 2. Studies included in interrater reliability testing

Study | Study group methods | Comparison group? | Intervention/exposure study focus | Mean time (min) needed by rater to review study (range across raters)^a
Baker et al. (2002). Functional health literacy and the risk of hospital admission among Medicare managed care enrollees | Prospective cohort | No | Health care delivery | 47 (25–90)
Coleman (1990). Safety and efficacy of combined ritodrine and magnesium sulfate for preterm labor: a method for reduction of complications | Retrospective cohort | Yes, prospective cohort | Medication treatment | 51 (20–90)
Crisp et al. (1992). Long-term mortality in anorexia nervosa. A 20-year follow-up of the St George's and Aberdeen cohorts | Prospective cohort | No | Disease outcome/harms | 52 (29–70)
Daniel et al. (1999). Effectiveness of community-directed diabetes prevention and control in a rural Aboriginal population in British Columbia, Canada | Prospective cohort | Yes, prospective cohort | Community-based intervention | 63 (25–90)
Di Lieto et al. (2003). Immunohistochemical detection of insulin-like growth factor type I receptor and uterine volume changes in gonadotropin-releasing hormone analog-treated uterine leiomyomas | Prospective cohort | Yes, prospective cohort | Medication treatment | 43 (20–70)
Fouad et al. (1997). A hypertension control program tailored to unskilled and minority workers | Prospective cohort | Yes, retrospective cohort | Community, workplace intervention | 51 (20–90)
Hedderson et al. (2006). Pregnancy weight gain and risk of neonatal complications | Case control | Yes | Disease outcome/harms | 44 (15–60)
Kinney et al. (1999). Safety of hydroxyurea in children with sickle cell anemia: results of the HUG-KIDS Study, a phase I/II trial | Prospective cohort | No | Medication harms | 39 (25–60)
Schindl et al. (2003). Elective cesarean delivery vs. spontaneous delivery: a comparative experience of birth experience | Prospective cohort | Yes, prospective | Surgery outcomes | 49 (38–60)
Van Ham et al. (1997). Maternal consequences of cesarean section. A retrospective study intra-operative and postoperative maternal complications of cesarean section during a 10-year period | Retrospective cohort | Yes, retrospective | Surgery harms | 44 (17–90)

^a Time estimates were not provided by two raters; data presented reflect the mean and range for 10 raters.
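
The agreement statistic used for the studies in Table 2 (the AC1 coefficient described in Section 2.5) can be illustrated with a short sketch. The Python below is a minimal illustration and not the authors' code: it assumes Gwet's published multiple-rater formula for AC1, since the paper's own computational details are in its online Appendix A. The function names `dichotomize` and `gwet_ac1`, and the example responses, are hypothetical.

```python
from collections import Counter


def dichotomize(responses):
    """Recode one question's responses for one study as modal vs. other.

    This mirrors comparing the most common response with all other
    response options. How ties are broken is an assumption here
    (whatever Counter.most_common returns first).
    """
    modal = Counter(responses).most_common(1)[0][0]
    return ["modal" if r == modal else "other" for r in responses]


def gwet_ac1(ratings, categories=("modal", "other")):
    """Gwet's first-order agreement coefficient (AC1) for several raters.

    ratings: one list per rated unit (e.g., a question-study pair),
             holding one category label per rater (at least two raters).
    categories: every label that may occur in ratings.
    """
    q = len(categories)
    n = len(ratings)
    observed = 0.0
    prevalence = dict.fromkeys(categories, 0.0)
    for unit in ratings:
        r = len(unit)
        counts = Counter(unit)
        # Share of rater pairs that agree on this unit.
        observed += sum(k * (k - 1) for k in counts.values()) / (r * (r - 1))
        for category, k in counts.items():
            prevalence[category] += k / r
    observed /= n
    # Chance agreement: sum of pi_k * (1 - pi_k), scaled by 1 / (q - 1).
    chance = sum((p / n) * (1 - p / n) for p in prevalence.values()) / (q - 1)
    return (observed - chance) / (1 - chance)


# Hypothetical responses from four raters on three question-study pairs.
raw = [
    ["yes", "yes", "yes", "yes"],
    ["yes", "yes", "yes", "no"],
    ["no", "no", "no", "no"],
]

ac1_dichotomized = gwet_ac1([dichotomize(unit) for unit in raw])
ac1_raw = gwet_ac1(raw, categories=("yes", "no"))
print(round(ac1_dichotomized, 3), round(ac1_raw, 3))
```

Passing the category list explicitly keeps the chance-agreement term defined even when every rater chooses the modal response on every unit, in which case the sketch returns 1.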

items were irrelevant for evaluating a study and provided feedback on the construction of individual questions and the instrument as a whole. Raters also provided observations on ease of use of the item bank: any study risk of bias, precision, or other quality-related issues that were not captured by the item bank; and the time it took to review each study using the bank.

The study team calculated summary statistics describing agreement between reviewers including mean percent agreement (and associated standard deviation) and first-order agreement coefficients (AC1 statistic) for each item in relation to each study and across studies. The AC1 statistic is a summary measure ranging from −1 (no agreement) to 1 (100% agreement), which adjusts results for chance agreement and is considered appropriate for interrater reliability tests with multiple raters [124]. Based on previous work by Walter, Eliasziw and Donner (1998), with 10 raters and 10 articles, we calculated that we had at least 80% power to detect that the intraclass correlation was significantly different from 0 (based on observed intraclass correlations of 0.2) [125]. Appendix A (see Appendix A on the journal's Web site at www.elsevier.com) presents background information for calculating the AC1. We calculated Fleiss' kappa statistics as summary measures initially but do not present them here because of the concerns about interpretation associated with the so-called "kappa paradox" where high agreement can accompany low kappa scores [126].

Because each item bank question includes multiple response categories, reliability testing evaluated agreement between raters by comparing the most common response to all other response options for each question, in relation to each study. We calculated summary statistics by question across the 12 reviewers and 10 studies. Across all studies, we summarized mean agreement across questions by quality domain and by the two analytic approaches to evaluating quality (solely determining if specific information is reported in the article or using judgment to evaluate the study's approach to addressing a bias or precision concern). The project PIs also reviewed and considered all descriptive comments made by reviewers.

2.6. Posttest revisions

Based on face validity, cognitive testing, content validity, and interrater reliability testing results and related comments, we added, deleted, or revised questions, including changing question content or syntax, response categories, or instructions.

3. Results

3.1. Compilation of potential items for item bank

During phase 1, we reviewed earlier instruments that had been used for evaluating the risk of bias and precision of observational studies and compiled 1,492 items that were available through the published literature and 84 [21–23,25–104] of the 90 AHRQ-sponsored systematic reviews that had been completed at the start of the project (2007); we considered the remaining six [24,33,105–108] reviews to be irrelevant because they focused solely either on RCTs or on the evaluation of genetic tests. We evaluated the comprehensiveness of the gathered items by categorizing them into the 12 quality domains (and related subdomains) identified in Deeks et al. [9] and comparing them with items included in other instruments identified through our searches (e.g., Downs and Black [15] and Newcastle Ottawa [14]) or provided to us by our TEP [13,16,109–112].

Many of the 1,492 items were completely or partially redundant. The study team selected 60 items for measuring each of the included domains based on content, readability, and comprehensiveness of instruction (Fig. 1). During the development process, we reviewed items, including questions and responses, and modified wording to ensure that critical domains were represented and to improve readability.

During phase 2, we crafted directions to help PIs individualize an item for a particular review and explanatory text to assist PIs and abstractors in standardizing item interpretation.

3.2. Face validity testing

Face validity testing with the TEP on a 60-item version of the item bank, including instructions for PIs and abstractors, resulted in the elimination of 19 items including seven items considered outside the scope of the item bank or not relevant to systematic reviews. Based on TEP input, we added three new questions and subsumed several into other questions. (See Fig. 1 and Table 3 [see Table 3 on the journal's Web site at www.elsevier.com] for details concerning the disposition of specific questions.)

3.3. Cognitive testing

On an item-by-item basis, interviewees provided feedback on the readability of a 44-item version of the item bank, including questions, response categories, and instructions such as their interpretation of categories such as "no," "don't know," and "not applicable." Based on their feedback, we eliminated three questions (two because of lack of relevance to systematic reviews and one because it was subsumed into another question) and added one question (Fig. 1 and Table 3 [see Table 3 on the journal's Web site at www.elsevier.com]). Because cognitive testing interviewees were experienced reviewers, they were also able to identify aspects of items that needed revising for clarity and greater direction for PIs and/or abstractors. Specifically, we revised items to ensure that particular response categories would provide the distinctions that we were intending and added instructions where interviewees judged them to be needed.
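
The item counts reported for each testing stage (Sections 3.2 and 3.3, with the later stages taken from Fig. 1) can be checked with simple bookkeeping. The Python sketch below is illustrative only: the stage tuples restate numbers reported in the paper, while the names `STAGES`, `items_leaving`, and `check_flow` are hypothetical.

```python
# (stage, items entering, items deleted, items added), as reported in Fig. 1.
STAGES = [
    ("Face validity testing", 60, 19, 3),
    ("Cognitive testing", 44, 3, 1),
    ("Content validity testing", 42, 2, 0),
    ("Interrater reliability testing", 40, 11, 0),
]
FINAL_ITEM_COUNT = 29


def items_leaving(entering, deleted, added):
    """Items carried forward from a stage: entering - deleted + added."""
    return entering - deleted + added


def check_flow(stages, final_count):
    """Return the item count after each stage.

    Raises ValueError if a stage's output does not match the number of
    items entering the next stage (or the final item bank).
    """
    leaving_counts = []
    for index, (name, entering, deleted, added) in enumerate(stages):
        leaving = items_leaving(entering, deleted, added)
        if index + 1 < len(stages):
            expected = stages[index + 1][1]
        else:
            expected = final_count
        if leaving != expected:
            raise ValueError(
                f"{name}: {leaving} items leave, but {expected} were expected"
            )
        leaving_counts.append(leaving)
    return leaving_counts


flow = check_flow(STAGES, FINAL_ITEM_COUNT)
print(flow)
```

Recording additions separately from deletions keeps the check aligned with how the paper reports each stage (items eliminated, items added, size of the next version).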
Compilation of questions for item bank: N = 1,492
  Questions deleted for overlap or lack of relevance to systematic reviews, evaluation of bias and precision, or scope of the item bank: N = 1,431

Face validity testing: N = 60
  Questions deleted: N = 19
    • Not within scope of item bank: 5
    • Not relevant to systematic reviews: 2
    • Subsumed within other questions: 12
  Questions added: N = 3

Cognitive testing: N = 44
  Questions deleted: N = 3
    • Not relevant to systematic reviews: 2
    • Subsumed within other questions: 1
  Questions added: N = 1

Content validity testing: N = 42
  Questions deleted: N = 2
    • Subsumed within other questions: 1
    • Uninterpretable: 1

Inter-rater reliability: N = 40
  Questions deleted: N = 11
    • Not relevant to systematic reviews: 1
    • Not relevant to evaluation of bias or precision: 1
    • Subsumed within other question: 8
    • Overlap within another question: 1

Final item bank: N = 29

Fig. 1. Disposition of questions for item bank.
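
The content validity ratio (CVR) used in content validity testing can be sketched as follows. This is a minimal illustration that assumes the commonly used Lawshe formulation, CVR = (n_e − N/2) / (N/2), where n_e is the number of raters judging an item essential and N is the number of raters; that formulation matches the properties stated in Section 2.4 (range of −1.00 to +1.00, and 0.00 when half the raters rate an item essential), but the paper does not print the formula itself. The function names and example counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_raters):
    """Lawshe-style CVR (assumed formulation, see lead-in).

    n_essential: raters who judged the item essential.
    n_raters: total raters who judged the item.
    """
    if not 0 <= n_essential <= n_raters or n_raters == 0:
        raise ValueError("need 0 <= n_essential <= n_raters and n_raters > 0")
    half = n_raters / 2
    return (n_essential - half) / half


def is_essential(n_essential, n_raters):
    """Apply the study's stated cutoff: CVR > 0 marks an item as essential."""
    return content_validity_ratio(n_essential, n_raters) > 0


# With seven raters (the number of TEP members in content validity testing),
# four "essential" ratings are a majority and three are not.
for votes in (3, 4, 7):
    print(votes, round(content_validity_ratio(votes, 7), 2), is_essential(votes, 7))
```

Under this formulation a CVR above 0 is equivalent to more than half of the raters judging the item essential, which is consistent with the "majority of experts" wording used in the results.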

3.4. Content validity testing

Content validity testing identified a core set of questions that experts considered essential in conducting a comprehensive assessment of a study's risk of bias and precision. A majority of experts considered 24 questions of a 42-item version of the bank (CVR > 0) to be essential across all relevant study designs, 10 questions to be either useful or essential across all study designs, and another 6 questions to be useful or essential for at least one study design. We eliminated two questions that most experts considered to be neither essential nor useful to evaluating risk of bias and precision for any study design type: (1) Is the analysis conducted on an intention-to-treat basis? and (2) Was the funding for this study derived from a source that does not have a vested interest in its results? The first question was subsumed within another, and the second was eliminated and not evaluated further (Fig. 1 and Table 3 [see Table 3 on the journal's Web site at www.elsevier.com]). Although the source of study funding (potential financial conflict of interest) has been demonstrated to influence the likelihood of publication bias and is therefore an important consideration for evaluating the risk of bias for a body of evidence, the item was deleted in relation to evaluating an individual study because responses are difficult to interpret in terms of bias. Financial conflict of interest in the funding source does not guarantee biased results, nor does the absence of financial conflict of interest guarantee lack of bias in results.

Content validity testing results did not easily point to items for exclusion. Instead, we used these findings in conjunction with interrater reliability scores to determine the need for item modifications or deletions.

3.5. Interrater reliability

We conducted interrater reliability testing on a 40-item version of the bank. Table 2 describes the 10 studies included in the testing, by methodological approach (nine cohort and one case–control study), whether the study included a comparator group (six studies) and study focus (i.e., treatment, harms, and disease outcomes). Table 2 also presents the time (average and range) it took for raters to evaluate the risk of bias and precision of a study using
the item bank. On average, raters spent 48 minutes per study (range: 17–90 minutes).

Table 3 (see Table 3 on the journal’s Web site at www.elsevier.com) presents details of testing results and the disposition of each item. The mean AC1 score per item was relatively low, 0.38 (range: 0.10–0.88). The mean percent agreement between raters across all items was 66%. Percent agreement between reviewers varied by domain, from a high of 90% for questions concerning presentation and reporting and 88% for those concerning soundness of information to a low of 56% for questions concerning follow-up and 59% for those concerning characteristics of interventions and exposures (results not shown). We further tested if agreement varied significantly between questions that concerned identifying whether specific information was reported in an article and questions that required more complex judgment on the part of the reviewer and found no significant difference (P = 0.09). Reviewers agreed 70% of the time on reporting questions and 64% on questions requiring judgment.

Because poor results from interrater reliability testing showed no clear pattern and did not lend themselves to unambiguous conclusions, we did not eliminate any questions based on this stage of testing. Instead, we used the interrater reliability results to identify and revise questions that performed poorly and to add instructions for PIs to help abstractors interpret questions more clearly.

3.6. Posttest revisions

In summary, based on face validity, cognitive testing, content validity, and interrater reliability testing and study team evaluation, we either deleted items that were considered unnecessary or revised items (including question syntax, response categories, and instructions). The original 60-item instrument decreased to 44 items after face validity testing, 42 items after cognitive testing, 40 items after content validity testing, and 29 items after interrater reliability testing and final review of items by the study team.

Reasons for posttest deletion of items by the study team include lack of relevance to systematic reviews (one question), lack of relevance to evaluation of bias or precision (one question), and overlap with other questions (one question). We subsumed items concerning reporting on a specific source of bias or precision within the response categories for direct evaluation of that source of bias or precision (eight questions). For instance, we deleted a reporting question relating to selection bias (“Did the authors report differences in baseline characteristics?”) but added a response category within the question (“Did the authors control for differences in baseline characteristics?”) to account for authors who did not report these differences (Fig. 1 and Table 3 [see Table 3 on the journal’s Web site at www.elsevier.com]).

Because of the integral nature of reporting to evaluating risk of bias and precision in some cases, the item bank evaluates some elements through a cluster of questions. For instance, the evaluation of selection bias in the creation of the sample requires questions about whether inclusion and exclusion criteria were reported and measured appropriately before the reviewer can judge whether they were applied equally to all arms of the study (Questions 2, 3, and 4).

The final version of the RTI Observational Studies Risk of Bias and Precision Item Bank contains 29 questions, multiple choice response categories, and extensive instructions to assist PIs and abstractors in developing criteria for considering the issue being investigated by the question (see Appendix B on the journal’s Web site at www.elsevier.com). The items in the bank are ordered according to the study domain structure presented by Deeks et al. and are generally intended to allow the reviewer to consider the various risks of bias and precision issues of a study according to the presentation order of a manuscript. Table 4 maps each of the final 29 items to the methods domains identified by Deeks et al. [9], specific potential risks of bias or precision issues, and relevant study designs.

4. Discussion

With the increasing use of observational studies in evidence synthesis, systematic reviewers have a greater burden of evaluating the risk of bias of study results. Although this effort is essentially a subjective exercise, requiring judgments by a reviewer, it is the only means of evaluating the degree to which a study’s results can be believed and is a critical step on the pathway to evaluating the strength of a body of evidence. The RTI Observational Studies Risk of Bias and Precision Item Bank builds on and extends the efforts of previous instruments to (1) create an evaluation tool that is specifically designed to work within the larger context of systematic review methodology and tasks; (2) explicitly focus on believability of the study rather than applicability; (3) comprehensively consider the elements that support believability; and (4) promote transparency and consistency of judgment between pairs of reviewers working on a single review and across reviews, particularly when customization is needed for the specific topic.

Our item bank is intended to be used to interpret the believability of individual studies, but just as importantly, to create the building blocks for evaluating the risk of bias and precision for the body of evidence. Systems to grade the strength of a body of evidence such as Grading of Recommendations Assessment, Development and Evaluation (GRADE) and the AHRQ strength of evidence approach use an overall assessment of risk of bias as one key element; other separate elements include applicability and precision. Commonly used instruments such as the Newcastle Ottawa Scale [14] and Downs and Black [15] include questions on all three areas: risk of bias, precision,
Table 4. Item bank questions mapped to risk of bias, precision, and methods domain

| Methods domain | Precision | Selection bias/confounding | Performance bias | Attrition bias | Detection bias | Reporting bias | Information bias | Overall believability | Total N of items |
|---|---|---|---|---|---|---|---|---|---|
| Background/context | - | - | - | - | - | - | - | - | 0 |
| Sample definition and selection | Q6 (CH, CC, CS, XS) | Q1 (CH, CC, CS); Q2 (CH, CC, CS, XS); Q4 (CH, CC) | Q1 (CH, CC, CS); Q5 (CH, CC) | - | Q1 (CH, CC, CS) | Q1 (CH, CC, CS) | Q3 (CH, CC, CS, XS) | - | 6 |
| Interventions/exposure | - | - | Q7 (CH, CC, CS, XS) | - | - | - | - | - | 1 |
| Outcomes | - | - | - | - | - | Q8 (CH, CC, CS, XS) | - | - | 1 |
| Creation of treatment groups | - | Q9 (CH, CC); Q10 (CH, CC) | Q11 (CH, CC, CS, XS); Q12 (CH, CC, CS, XS) | - | - | - | - | - | 4 |
| Blinding | - | - | - | - | Q13 (CH, CC, CS, XS) | - | - | - | 1 |
| Soundness of information | - | - | - | - | - | - | Q14 (CH, CC, CS, XS); Q15 (CH, CC, CS, XS) | - | 2 |
| Follow-up | - | - | - | Q16 (CH, CC); Q17 (CH, CC, CS); Q18 (CH, CC, CS); Q19 (CH, CC) | - | - | - | - | 4 |
| Analysis comparability | - | Q20; Q22 (CH, CC, CS, XS) | - | - | - | - | Q21 (CH, CC, CS, XS) | - | 3 |
| Analysis outcome | Q25 (CH, CC, CS, XS); Q27 (CH, CC, CS, XS) | - | - | Q23 (CH, CS) | - | Q24 (CH, CC, CS, XS); Q26 (CH, CC, CS, XS) | - | - | 5 |
| Interpretation | - | - | - | - | - | - | - | Q28 (CH, CC, CS, XS) | 1 |
| Presentation and reporting | - | - | - | - | - | Q29 (CH, CC, CS, XS) | - | - | 1 |
| Total (a) | 3 | 7 | 5 | 5 | 2 | 5 | 4 | 1 | 29 |

Abbreviations: CH, cohort; CC, case–control; CS, case series; XS, cross-sectional; N, number.
(a) Number of items in each column sum to greater than total number of items because some items relate to multiple risks of bias.
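
The mapping in Table 4 can be treated as a small lookup structure. The sketch below is illustrative only and is not part of the item bank: it transcribes the question-to-category assignments as read from the table (the placement of Q4 and Q22 under selection bias/confounding is inferred from the flattened layout and agrees with the column totals), tallies the items per category, and shows why the column totals exceed the 29 items. The dictionary name and the short category keys are hypothetical labels.

```python
from collections import Counter

# Question number -> risk categories, transcribed from Table 4.
# Category keys are shorthand invented for this sketch.
ITEM_TO_CATEGORIES = {
    1: ["selection", "performance", "detection", "reporting"],
    2: ["selection"], 3: ["information"], 4: ["selection"],
    5: ["performance"], 6: ["precision"], 7: ["performance"],
    8: ["reporting"], 9: ["selection"], 10: ["selection"],
    11: ["performance"], 12: ["performance"], 13: ["detection"],
    14: ["information"], 15: ["information"],
    16: ["attrition"], 17: ["attrition"], 18: ["attrition"],
    19: ["attrition"], 20: ["selection"], 21: ["information"],
    22: ["selection"], 23: ["attrition"], 24: ["reporting"],
    25: ["precision"], 26: ["reporting"], 27: ["precision"],
    28: ["believability"], 29: ["reporting"],
}

# Count how many items fall under each category (the "Total" row).
column_totals = Counter(
    category
    for categories in ITEM_TO_CATEGORIES.values()
    for category in categories
)

# Items mapped to more than one category explain the surplus over 29.
multi_category_items = [
    question
    for question, categories in ITEM_TO_CATEGORIES.items()
    if len(categories) > 1
]

print(dict(column_totals))
print(sum(column_totals.values()), len(ITEM_TO_CATEGORIES))
print(multi_category_items)
```

Under this reading, only Question 1 is counted in more than one column, which accounts for the difference between the sum of the column totals and the number of items.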


and external validity (applicability) within their ratings for individual studies. These instruments identify questions that evaluate external validity but not questions related to precision. When all these items are combined within a rating scale, as in the Newcastle Ottawa Scale, the results cannot be used as components for judging the strength of the body of evidence without some manipulation: the external validity and precision elements need to be identified and removed from the overall scores to isolate the risk of bias for a study.

Our item bank focuses explicitly on risk of bias and precision. Systematic reviews that combine studies through meta-analyses may not need to evaluate study-specific elements of precision, particularly sample size and appropriate statistical analysis. Reviews that cannot pool estimates because of heterogeneity of results may choose to include evaluations of precision in addition to risk of bias. The item bank identifies the precision items clearly so that reviewers can judge whether to evaluate these elements.

Our item bank encompasses nearly all the evaluation criteria identified by West et al. [10] for doing quality assessment in one of three ways: within questions, within response categories to the questions, or within instructions for interpreting the questions. We did not include two elements from the list of criteria by West et al., which are (1) the study includes clearly focused and appropriate questions and (2) use of concurrent controls. We judged, in consultation with our TEP and other users, that the former question is not relevant to evaluating bias and precision; our intent was to judge the believability of study results that are relevant to the goals of the systematic review rather than to the study’s own intent. We did not include the latter question because we consider concurrent controls as a matter of design. The intent of our item bank was not to evaluate the believability of study results based on the type of observational study design relative to other designs. Rather, we intended to facilitate the identification of the risk of bias that may stem from design features.

The wide array of designs and associated risks of biases in observational studies implies that systematic reviewers will need to customize their review form to concentrate on the most critical selection of items for the topic at hand and establish minimum standards for addressing these items. Our item bank assists by providing choices through a comprehensive array of items rather than a fixed menu of required elements through a static instrument. Instructions indicate where customization may be required.

Although PIs of systematic reviews may see value in evaluating multiple nuances of risk of bias by including all items in the bank, detailed consideration of each of those concerns may not always be practical. Our reviewers took, on average, approximately 8 hours to complete the risk of bias and precision review of the 10 studies. Instead, a more realistic approach is to identify the most critical threats to validity and precision in a body of evidence and then select questions that can evaluate these concerns.

We note the overall low interrater reliability scores we obtained through our testing: one likely reason is that we did not develop customized instructions to standardize evaluation criteria for each study. By design, our goal was to document experience using the item bank on a broad range of studies. Our results are also limited by this range: unlike typical systematic review teams that share a common understanding of a single review topic, our raters were asked to rate studies for 10 different topics.

The item bank requires further development and testing. First, the identification of specific biases for which observational studies are most at risk will help to identify a core set of items. Second, the assessment of the bank using customized questions (i.e., items that have been customized for specific topics) and inter- and intrarater reliability for teams of reviewers working on the same topic will help to further refine items. A key consideration in such testing will be the selection of studies with known serious methodological concerns to assess the ability of the instrument to identify potential sources of bias. A third task is to assess the empirical basis for evaluating potential sources of bias by measuring the correlation between responses to specific items and effect sizes as a means of further culling the item bank.

Although our item bank does not include questions specific to RCTs, such as adequacy of randomization generation and concealment of allocation, many items apply to RCTs as well. Systematic reviews may find it useful to select these broadly applicable questions from the item bank to consistently evaluate and compare methodological issues across studies, regardless of design.

Appendix

Supplementary material

Supplementary material can be found, in the online version, at 10.1016/j.jclinepi.2011.05.008.

References

[1] Lohr KN. Emerging methods in comparative effectiveness and safety: symposium overview and summary. Med Care 2007;45(10 Suppl 2):S5–8. PMID: 17909383.
[2] Viswanathan M. Systematic review: assessing the quality of individual studies. Rockville, MD: Agency for Healthcare Research and Quality; 2010. Available at http://www.effectivehealthcare.ahrq.gov/index.cfm/slides/?pageAction=displaySlides&tk=24. Accessed June 9, 2011.
[3] Norris SL, Atkins D. Challenges in using nonrandomized studies in systematic reviews of treatment interventions. Ann Intern Med 2005;142(12 Pt 2):1112–9. PMID: 15968036.
[4] Chou R, Aronson N, Atkins D, Ismaila AS, Santaguida P, Smith DH, et al. AHRQ series paper 4: assessing harms when comparing medical interventions: AHRQ and the effective health-care program. J Clin Epidemiol 2010;63:502–12. PMID: 18823754.
[5] Agency for Healthcare Research and Quality. Methods reference guide for effectiveness and comparative effectiveness reviews, version 1.0. Rockville, MD: Agency for Healthcare Research and
Quality; 2007. Available at http://effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf. Accessed June 9, 2011.
[6] Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001;323:42–6. PMID: 11440947.
[7] Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions version 5.0.2. London, UK: The Cochrane Collaboration; 2009. Available at www.cochrane-handbook.org. Accessed June 9, 2011.
[8] Rothman KJ, Greenland S, Lash TL. Modern epidemiology. 3rd edition. Philadelphia, PA: Lippincott, Williams, & Wilkins; 2008.
[9] Deeks JJ, Dinnes J, D’Amico R, Sowden AJ, Sakarovitch C, Song F, et al. Evaluating non-randomised intervention studies. Health Technol Assess 2003;7(27). iii-x, 1–173. PMID: 14499048.
[10] West SL, King V, Carey TS, Lohr KN, McKoy N, Sutton SF, et al. Systems to rate the strength of scientific evidence. Evidence Report/Technology Assessment No. 47 (Prepared by the Research Triangle Institute-University of North Carolina Evidence-based Practice Center under Contract No. 290-97-0011). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E016.
[11] Sanderson S, Tatt ID, Higgins JP. Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography. Int J Epidemiol 2007;36:666–76. PMID: 17470488.
[12] Katrak P, Bialocerkowski AE, Massy-Westropp N, Kumar S, Grimmer KA. A systematic review of the content of critical appraisal tools. BMC Med Res Methodol 2004;4:22. PMID: 15369598.
[13] Thomas H. Quality assessment tool for quantitative studies. Effective Public Health Practice Project. Toronto, Canada: McMaster University.
[14] Wells G, Shay B, O’Connell D, Peterson J, Welch V, Losos M, et al. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analysis. Ottawa, Canada: University of Ottawa. Available at: http://www.effectivehealthcare.ahrq.gov/index.cfm/slides/?pageAction=displaySlides&tk=24. Accessed June 9, 2011.
[15] Downs SH, Black N. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Community Health 1998;52:377–84. PMID: 9764259.
[16] Zaza S, Wright-De Aguero LK, Briss PA, Truman BI, Hopkins DP, Hennessy MH, et al. Data collection instrument and procedure for systematic reviews in the Guide to Community Preventive Services. Task Force on Community Preventive Services. Am J Prev Med 2000;18(1 Suppl):44–74. PMID: 10806979.
[17] Cowley DE. Prostheses for primary total hip replacement. A critical appraisal of the literature. Int J Technol Assess Health Care 1995;11:770–8. PMID: 8567209.
[18] Reisch JS, Tyson JE, Mize SG. Aid to the evaluation of therapeutic studies. Pediatrics 1989;84(5):815–27. PMID: 2797977.
[19] Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008–12. PMID: 10789670.
[20] Peipert JF, Phipps MG. Observational studies. Clin Obstet Gynecol 1998;41:235–44. PMID: 9646956.
[21] Wilt TJ, Lederle FA, MacDonald R, Jonk YC, Rector TS, Kane RL. Comparison of endovascular and open surgical repairs for abdominal aortic aneurysm, structured abstract. Evidence Report/Technology Assessment No. 144 (Prepared by the University of Minnesota Evidence-based Practice Center under Contract No. 290-02-0009). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E017.
[22] Lau J, Ioannidis JPA, Balk E, Milch C, Chew P, Terrin N, et al. Evaluation of technologies for identifying acute cardiac ischemia in emergency departments. Evidence Report/Technology Assessment Number 26 (Prepared by The New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E006.
[23] Sharma M, Clark H, Armour T, Stotts G, Cote R, Hill MD, et al. Acute stroke: evaluation and treatment. Evidence Report/Technology Assessment No. 127 (Prepared by the University of Ottawa Evidence-based Practice Center under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E023-2.
[24] Jadad AR, Boyle M, Cunningham C, Kim M, Schachar R. Treatment of attention-deficit/hyperactivity disorder. Evidence Report/Technology Assessment No. 11 (Prepared by McMaster University under Contract No. 290-97-0017). Rockville, MD: Agency for Healthcare Research and Quality; 1999. AHRQ Publication No. 00-E005.
[25] Myers ER, Bastian LA, Havrilesky LJ, Kulasingam SL, Terplan MS, Cline KE, et al. Management of adnexal mass. Evidence Report/Technology Assessment No. 130 (Prepared by the Duke Evidence-based Practice Center under Contract No. 290-02-0025). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E004.
[26] Lau J, Balk E, Rothberg M, Ioannidis JPA, DeVine D, Litt M, et al. Management of clinically inapparent adrenal mass. Evidence Report/Technology Assessment No. 56 (Prepared by New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E014.
[27] Chou R, Fu R, Carson S, Saha S, Helfand M. Empirical evaluation of the association between methodological shortcomings and estimates of adverse events. Technical Review No. 13 (Prepared by the Oregon Evidence-based Practice Center under Contract No. 290-02-0024). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 07-0003.
[28] Hardy M, Coulter I, Venuturupalli S, Roth EA, Favreau J, Morton SC, et al. Ayurvedic interventions for diabetes mellitus: a systematic review. Evidence Report/Technology Assessment No. 41 (Prepared by Southern California Evidence-based Practice Center/RAND under Contract No. 290-97-0001). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E040.
[29] Balk E, Chung M, Raman G, Tatsioni A, Chew P, Ip S, et al. B vitamins and berries and age-related neurodegenerative disorders. Evidence Report/Technology Assessment No. 134 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center under Contract No. 290-02-0022). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E008.
[30] Catlett C, Perl T, Jenckes MW, Robinson KA, Mitchell D, Hage J, et al. Training of clinicians for public health events relevant to bioterrorism preparedness. Evidence Report/Technology Assessment No. 51 (Prepared by Johns Hopkins Evidence-based Practice Center under Contract No. 290-97-006). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E011.
[31] Bravata DM, McDonald K, Owens DK, Buckeridge D, Haberland C, Rydzak C, et al. Bioterrorism preparedness and response: use of information technologies and decision support systems. Evidence Report/Technology Assessment No. 59 (Prepared by University of California San Francisco-Stanford Evidence-based Practice Center under Contract No. 290-97-0013). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E028.
[32] Appel LJ, Robinson KA, Guallar E, Erlinger T, Masood SO, Jehn J, et al. Utility of blood pressure monitoring outside of the clinic setting. Evidence Report/Technology Assessment No. 63 (Prepared by
the Johns Hopkins Evidence-based Practice Center under Contract No. 290-97-006). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 03-E004.
[33] Balion C, Santaguida P, Hill S, Worster A, McQueen M, Oremus M, et al. Testing for BNP and NT-proBNP in the diagnosis and prognosis of heart failure. Evidence Report/Technology Assessment No. 142 (Prepared by the McMaster University Evidence-based Practice Center under Contract No. 290-02-0020). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E014.
[34] Viswanathan M, King VJ, Bordley C, Honeycutt AA, Wittenborn J, Jackman AM, et al. Management of bronchiolitis in infants and children. Evidence Report/Technology Assessment No. 69 (Prepared by RTI International-University of North Carolina at Chapel Hill Evidence-based Practice Center under Contract No. 290-97-0011). Rockville, MD: U.S. Department of Health and Human Services, Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E014.
[35] Ford JG, Howerton MW, Bolen S, Gary TL, Lai GY, Tilburt J, et al. Knowledge and access to information on recruitment of underrepresented populations to cancer clinical trials. Evidence Report/Technology Assessment No. 122 (Prepared by the Johns Hopkins University Evidence-based Practice Center under Contract No. 290-02-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E019-2.
[36] Ellis P, Robinson P, Ciliska D, Armour T, Raina P, Brouwers M, et al. Diffusion and dissemination of evidence-based cancer control interventions. Evidence Report/Technology Assessment Number 79 (Prepared by McMaster University under Contract No. 290-97-0017). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E033.
[37] Whelan TJ, O’Brien MA, Villasis-Keever M, Robinson P, Skye A, Gafni A, et al. Impact of cancer-related decision aids. Evidence Report/Technology Assessment Number 46 (Prepared by McMaster University under Contract No. 290-97-0017). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E004.
[38] Ammerman A, Lindquist C, Hersey J, Jackman AM, Gavin NI, Garces C, et al. Efficacy of interventions to modify dietary behavior related to cancer risk. Evidence Report/Technology Assessment No. 25 (Contract No. 290-97-0011 to the Research Triangle Institute-University of North Carolina at Chapel Hill Evidence-based Practice Center). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E029.
[39] McAlister F, Ezekowitz J, Wiebe N, Rowe B, Spooner C, Crumley E, et al. Cardiac resynchronization therapy for congestive heart failure. Evidence Report/Technology Assessment No. 106 (Prepared by the University of Alberta Evidence-based Practice Center under Contract No. 290-02-0023). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 05-E001-2.
[40] Schein OD, Friedman DS, Fleisher LA, Lubomski LH, Magaziner J, Sprintz M, et al. Anesthesia management during cataract surgery. Evidence Report/Technology Assessment No. 16 (Prepared by the Johns Hopkins University Evidence-based Practice Center under Contract No. 290-097-0006). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E017.
[41] Jampel H, Lubomski L, Friedman D. Treatment of coexisting cataract and glaucoma. Evidence Report/Technology Assessment Number 38 (Prepared by Johns Hopkins University Evidence-based Practice Center under Contract No. 290-97-0006). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E041.
[42] Viswanathan M, Ammerman A, Eng E, Gartlehner G, Lohr KN, Griffith D, et al. Community-based participatory research: assessing the evidence. Evidence Report/Technology Assessment No. 99 (Prepared by RTI-University of North Carolina Evidence-based Practice Center under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication 04-E022-2.
[43] Rostom A, Dube C, Cranney A, Saloojee N, Sy R, Garritty C, et al. Celiac disease. Evidence Report/Technology Assessment No. 104 (Prepared by the University of Ottawa Evidence-based Practice Center, under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E029-2.
[44] Grady D, Chaput L, Kristof M. Results of systematic review of research on diagnosis and treatment of coronary heart disease in women. Evidence Report/Technology Assessment No. 80 (Prepared by the University of California, San Francisco-Stanford Evidence-based Practice Center under Contract No. 290-97-0013). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-0035.
[45] Marinopoulos SS, Dorman T, Ratanawongsa N, Wilson LM, Ashar BH, Magaziner JL, et al. Effectiveness of continuing medical education. Evidence Report/Technology Assessment No. 149 (Prepared by the Johns Hopkins Evidence-based Practice Center, under Contract No. 290-02-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2007. AHRQ Publication No. 07-E006.
[46] McCrory DC, Brown C, Gray RN, Goslin RE, Kolimaga JT, MacIntyre NR, et al. Management of acute exacerbations of COPD. Evidence Report/Technology Assessment No. 19 (Contract 290-97-0014 to the Duke University Evidence-based Practice Center). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E003.
[47] Flamm CR, Aronson N, Bohn R, Finkelstein B, Piper M, Seidenfeld J, et al. Use of epoetin for anemia in chronic renal failure. Evidence Report/Technology Assessment No. 29 (Prepared by the Blue Cross and Blue Shield Association Technology Evaluation Center under Contract No. 290-97-0015). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E016.
[48] Viswanathan M, Visco AG, Hartmann K, Wetcher ME, Gartlehner G, Wu JM, et al. Cesarean delivery on maternal request. Evidence Report/Technology Assessment No. 133 (Prepared by the RTI International-University of North Carolina Evidence-Based Practice Center under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E009.
[49] Bonito AJ, Palton LL, Shugars DA, Lohr KN, Nelson JP, Bader JD, et al. Management of dental patients who are HIV-positive. Evidence Report/Technology Assessment No. 37 (Contract 290-97-0011 to the Research Triangle Institute-University of North Carolina at Chapel Hill Evidence-based Practice Center). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 01-E042.
[50] Bader JD, Bonito AJ, Shugars DA. Cardiovascular effects of epinephrine on hypertensive dental patients. Evidence Report/Technology Assessment Number 48 (Prepared by Research Triangle Institute under Contract No. 290-97-0011). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E006.
[51] Golden S, Boulware LE, Berkenblit G, Brancati F, Chander G, Marinopoulos S, et al. Use of glycated hemoglobin and microalbuminuria in the monitoring of diabetes mellitus. Evidence Report/Technology Assessment No. 84 (Prepared by Johns Hopkins Evidence-based Practice Center under Contract No. 290-97-0006). Rockville, MD: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services; 2003. AHRQ Publication No. 04-E001.
[52] Ross SD, Levine C, Ganz N, Frame D, Estok R, Stone L, et al. Systematic review of the current literature related to disability and chronic fatigue syndrome. Evidence Report/Technology Assessment No. 66 (Prepared by MetaWorks Inc. Evidence-based Practice
Center under Contract No. 290-97-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 03-E007.
[53] Segal JB, Eng J, Jenckes MW, Tamariz LJ, Bolger DT, Krishnan JA, et al. Diagnosis and treatment of deep venous thrombosis and pulmonary embolism. Evidence Report/Technology Assessment Number 68 (Prepared by Johns Hopkins University Evidence-based Practice Center under Contract No. 290-97-0007). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E016.
[54] Berkman ND, Bulik CM, Brownley KA, Lohr KN, Sedway JA, Rooks A, et al. Management of eating disorders. Evidence Report/Technology Assessment No. 135 (Prepared by the RTI International-University of North Carolina Evidence-Based Practice Center under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E010.
[55] Lorenz K, Lynn J, Morton SC, Dy S, Mularski R, Shugarman L, et al. End-of-life care and outcomes. Evidence Report/Technology Assessment No. 110 (Prepared by the Southern California Evidence-based Practice Center, under Contract No. 290-02-0003). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 05-E004-2.
[56] Aronson N, Flamm CR, Mark D, Lefevre F, Bohn RL, Finkelstein B, et al. Endoscopic retrograde cholangiopancreatography. Evidence Report/Technology Assessment Number 50 (Prepared by Blue Cross and Blue Shield Association under Contract No. 290-97-001-5). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E017.
[57] Ross SD, Estok R, Chopra S, French J, et al. Management of newly diagnosed patients with epilepsy: a systematic review of the literature. Evidence Report/Technology Assessment No. 39 (Contract 290-97-0016 to MetaWorks, Inc.). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E038.
[58] Chapell R, Reston J, Snyder D. Management of treatment-resistant epilepsy. Evidence Report/Technology Assessment No. 77 (Prepared by the ECRI Evidence-based Practice Center under Contract No. 290-97-0020). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-0028.
[59] Viswanathan M, Hartmann K, Palmieri R, Lux L, Swinson T, Lohr KN, et al. The use of episiotomy in obstetrical care: a systematic review. Evidence Report/Technology Assessment No. 112 (Prepared by the RTI-UNC Evidence-based Practice Center, under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E009-2.
[60] Perrin EC, Cole CH, Frank DA, Glicken SR, Guerina N, Petit K, et al. Criteria for determining disability in infants and children: failure to thrive. Evidence Report/Technology Assessment No. 72 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E026.
[61] McDonagh MS, Carson S, Ash JS, Russman BS, Stavri PZ, Krages KP, et al. Hyperbaric oxygen therapy for brain injury, cerebral palsy, and stroke. Evidence Report/Technology Assessment No. 85 (Prepared by the Oregon Health & Science University Evidence-based Practice Center under Contract No. 290-97-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 04-E003.
[62] Berkman ND, DeWalt DA, Pignone MP, Sheridan SL, Lohr KN, Lux L, et al. Literacy and health outcomes. Evidence Report/Technology Assessment No. 87 (Prepared by RTI International-University of North Carolina under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E007-2.
[63] Oremus M, Hanson M, Whitlock R, Young E, Gupta A, Dal Cin A, et al. The uses of heparin to treat burn injury. Evidence Report/Technology Assessment No. 148 (Prepared by the McMaster University Evidence-based Practice Center, under Contract No. 290-02-0020). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 07-E004.
[64] Gebo KA, Jenckes MJ, Chander G, Torbenson MS, Ghanem KG, Herlong HF, et al. Management of chronic hepatitis C. Evidence Report/Technology Assessment No. 60 (Prepared by the Johns Hopkins University Evidence-based Practice Center under Contract No. 290-97-0006). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E030.
[65] Santaguida PL, Balion C, Hunt D, Morrison K, Gerstein H, Raina P, et al. Diagnosis, prognosis, and treatment of impaired glucose tolerance and impaired fasting glucose. Evidence Report/Technology Assessment No. 128 (Prepared by the McMaster University Evidence-based Practice Center under Contract No. 290-02-0020). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E026-2.
[66] Buscemi N, Vandermeer B, Friesen C, Bialy L, Tubman M, Ospina M, et al. Manifestations and management of chronic insomnia in adults. Evidence Report/Technology Assessment No. 125 (Prepared by the University of Alberta Evidence-based Practice Center, under Contract No. C400000021). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E021-2.
[67] Cole C, Binney G, Casey P, Fiascone J, Hagadorn J, Kim C, et al. Criteria for determining disability in infants and children: low birth weight. Evidence Report/Technology Assessment No. 70 (Prepared by Tufts New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 03-E010.
[68] Meenan RT, Saha S, Chou R, Swarztrauber K, Krages KP, O’Keeffe-Rosetti M, et al. Effectiveness and cost-effectiveness of echocardiography and carotid imaging in the management of stroke. Evidence Report/Technology Assessment Number 49 (Prepared by Oregon Health & Science University Evidence-based Practice Center under Contract No. 290-97-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E022.
[69] Cook D, Meade M, Guyatt G, Griffith L, Booker L. Criteria for weaning from mechanical ventilation. Evidence Report/Technology Assessment No. 23 (Prepared by McMaster University under Contract No. 290-97-0017). Rockville, MD: Agency for Healthcare Research and Quality; 2000. AHRQ Publication No. 01-E010.
[70] Buscemi N, Vandermeer B, Pandya R, Hooton N, Tjosvold L, Hartling L, et al. Melatonin for treatment of sleep disorders. Evidence Report/Technology Assessment No. 108 (Prepared by the University of Alberta Evidence-based Practice Center, under Contract No. 290-02-0023). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 05-E002-2.
[71] Nelson HD, Haney E, Humphrey L, Miller J, Nedrow A, Nicolaidis C, et al. Management of menopause-related symptoms. Evidence Report/Technology Assessment No. 120 (Prepared by the Oregon Evidence-based Practice Center, under Contract No. 290-02-0024). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E016-2.
[72] Beach MC, Cooper LA, Robinson KA, Price EG, Gray TL, Jenckes MW, et al. Strategies for improving minority healthcare quality. Evidence Report/Technology Assessment No. 90 (Prepared by the Johns Hopkins University Evidence-based Practice Center, Baltimore, MD). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E008-02.
[73] McCrory DC, Pompeii LA, Skeen MB, Moon SD, Gray RN, Kolimaga JT, et al. Criteria to determine disability related to multiple sclerosis. Evidence Report/Technology Assessment No. 100 (Prepared by the Duke Evidence-based Practice Center, Durham, NC, under Contract No. 290-02-0025). Rockville, MD: Agency
for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E019-2.
[74] Ip S, Glicken S, Kulig J, O’Brien R, Sege RS. Management of neonatal hyperbilirubinemia. Evidence Report/Technology Assessment No. 65 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: U.S. Department of Health and Human Services, Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E011.
[75] Shekelle PG, Morton SC, Maglione MA, Suttorp M, Tu W, Li Z, et al. Pharmacological and surgical treatment of obesity. Evidence Report/Technology Assessment No. 103 (Prepared by the Southern California-RAND Evidence-Based Practice Center, Santa Monica, CA, under contract Number 290-02-0003). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E028-2.
[76] Hodge W, Barnes D, Schachter H, Pan Y, Lowcock E, Zhang L, et al. Effects of omega-3 fatty acids on eye health. Evidence Report/Technology Assessment No. 117 (Prepared by University of Ottawa Evidence-based Practice Center under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E008-2.
[77] MacLean CH, Issa AM, Newberry SJ, Mojica WA, Morton SC, Garland RH, et al. Effects of omega-3 fatty acids on cognitive function with aging, dementia, and neurological diseases. Evidence Report/Technology Assessment No. 114 (Prepared by the Southern California Evidence-based Practice Center, under Contract No. 290-02-0003). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E011-2.
[78] MacLean CH, Mojica WA, Morton SC, Pencharz J, Hasenfeld Garland R, Tu W, et al. Effects of omega-3 fatty acids on lipids and glycemic control in type II diabetes and the metabolic syndrome and on inflammatory bowel disease, rheumatoid arthritis, renal disease, systemic lupus erythematosus, and osteoporosis. Evidence Report/Technology Assessment No. 89 (Prepared by Southern California/RAND Evidence-based Practice Center, under Contract No. 290-02-0003). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E012-2.
[79] Lewin GA, Schachter HM, Yuen D, Marchant P, Mamaladze V, Tsertsvadze A, et al. Effects of omega-3 fatty acids on child and maternal health. Evidence Report/Technology Assessment No. 118 (Prepared by the University of Ottawa Evidence-based Practice Center, under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E025-2.
[80] Schachter HM, Kourad K, Merali Z, Lumb A, Tran K, Miguelez M, et al. Effects of omega-3 fatty acids on mental health. Evidence Report/Technology Assessment No. 116 (Prepared by the University of Ottawa Evidence-based Practice Center, under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E022-2.
[81] Bonis PA, Chung M, Tatsioni A, Sun Y, Kupelnick B, Lichtenstein A, et al. Effects of omega-3 fatty acids on organ transplantation. Evidence Report/Technology Assessment No. 115 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center under Contract No. 290-02-0022). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E012-2.
[82] Schachter H, Reisman J, Tran K, Dales B, Kourad K, Barnes D, et al. Health effects of omega-3 fatty acids on asthma. Evidence Report/Technology Assessment No. 91 (Prepared by University of Ottawa Evidence-based Practice Center under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E013-2.
[83] Wang C, Chung M, Lichtenstein A, Balk E, Kupelnick B, Devine D, et al. Effects of omega-3 fatty acids on cardiovascular disease. Evidence Report/Technology Assessment No. 94 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center, under Contract No. 290-02-0022). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E009-2.
[84] Marcy M, Takata G, Chan LS, Shekelle P, Mason W, Wachsman W, Wachsman L, et al. Management of acute otitis media. Evidence Report/Technology Assessment No. 15 (Prepared by the Southern California Evidence-based Practice Center under Contract No. 290-97-0001). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E010.
[85] Shekelle P, Takata G, Chan LS, Mangione-Smith R, Corley PM, Morphew T, et al. Diagnosis, natural history, and late effects of otitis media with effusion. Evidence Report/Technology Assessment No. 55 (Prepared by Southern California Evidence-based Practice Center under Contract No. 290-97-0001, Task Order No. 4). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E023.
[86] Hickam DH, Severance S, Feldstein A, Ray L, Gorman P, Schuldheis S, et al. The effect of health care working conditions on patient safety. Evidence Report/Technology Assessment Number 74 (Prepared by Oregon Health & Science University under Contract No. 290-97-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E024.
[87] Gaynes BN, Gavin N, Meltzer-Brody S, Lohr KN, Swinson T, Gartlehner G, et al. Perinatal depression: prevalence, screening accuracy, and screening outcomes. Evidence Report/Technology Assessment No. 119 (Prepared by the RTI-University of North Carolina Evidence-based Practice Center, under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E006-2.
[88] Holtzman J, Schmitz K, Babes G, Kane RL, Duval S, Wilt TJ, et al. Effectiveness of behavioral interventions to modify physical activity behaviors in general populations and cancer patients and survivors. Evidence Report/Technology Assessment No. 102 (Prepared by the Minnesota Evidence-based Practice Center, under Contract No. 290-02-0009). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E027-2.
[89] Myers ER, Blumrick R, Christian AL, Datta S, Gray RN, Kolimaga JT, et al. Management of prolonged pregnancy. Evidence Report/Technology Assessment No. 53 (Prepared by Duke Evidence-based Practice Center, Durham, NC, under Contract No. 290-97-0014). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E018.
[90] Bush DE, Ziegelstein RC, Patel UV, Thombs BD, Ford DE, Fauerbach JA, et al. Post-myocardial infarction depression. Evidence Report/Technology Assessment No. 123 (Prepared by the Johns Hopkins University Evidence-based Practice Center under Contract No. 290-02-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E018-2.
[91] Long A, McFadden C, DeVine D, Litt M, Chew P, Kupelnick B, et al. Management of allergic and nonallergic rhinitis. Evidence Report/Technology Assessment No. 54 (Prepared by New England Medical Center Evidence-based Practice Center under Contract No. 290-97-0019). Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ Publication No. 02-E024.
[92] Balk E, Chung M, Chew P, Ip S, Raman G, Kupelnick B, et al. Effects of soy on health outcomes. Evidence Report/Technology Assessment No. 126 (Prepared by Tufts-New England Medical Center Evidence-based Practice Center under Contract No. 290-02-0022). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 05-E024-2.
[93] McCrory DC, Samsa GP, Hamilton BB, Govert JA, Matchar DB, Goslin RE, et al. Treatment of pulmonary disease following cervical spinal cord injury. Evidence Report/Technology Assessment Number 27 (Prepared by the Duke Evidence-based Practice Center under Contract No. 290-97-0014). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E014.
[94] DeForge D, Blackmer J, Moher D, Garritty C, Cronin V, Yazdi F, et al. Sexuality and reproductive health following spinal cord injury. Evidence Report/Technology Assessment No. 109 (Prepared by the University of Ottawa Evidence-based Practice Center under Contract No. 290-02-0021). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 05-E003-2.
[95] Ranney L, Melvin C, Lux LJ, McClain E, Morgan L, Lohr KN, et al. Tobacco use: prevention, cessation, and control. Evidence Report/Technology Assessment No. 140 (Prepared by the RTI-University of North Carolina Evidence-based Practice Center under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E015.
[96] Kiddoo D, Klassen TP, Lang ME, Friesen C, Russell K, Spooner C, et al. The effectiveness of different methods of toilet training for bowel and bladder control. Evidence Report/Technology Assessment No. 147 (Prepared by the University of Alberta Evidence-based Practice Center, under contract number 290-02-0023). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 07-E003.
[97] Kane RL, Saleh KJ, Wilt TJ, Bershadsky B, Cross WW III, MacDonald RM, et al. Total knee replacement. Evidence Report/Technology Assessment No. 86 (Prepared by the Minnesota Evidence-based Practice Center, Minneapolis, MN). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 04-E006-2.
[98] Viswanathan M, Hartmann K, McKoy N, Stuart G, Rankins N, Thieda P, et al. Management of uterine fibroids: an update of the evidence. Evidence Report/Technology Assessment No. 154 (Prepared by RTI International-University of North Carolina Evidence-based Practice Center under Contract No. 290-02-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2007. AHRQ Publication No. 07-E011.
[99] Guise J-M, McDonagh MS, Hashima J, Kraemer DF, Eden KB, Berlin M, et al. Vaginal birth after cesarean (VBAC). Evidence Report/Technology Assessment No. 71 (Prepared by the Oregon Health & Science University Evidence-based Practice Center under Contract No. 290-97-0018). Rockville, MD: Agency for Healthcare Research and Quality; 2003. AHRQ Publication No. 03-E018.
[100] Velmahos GC, Kern J, Chan L, Oder D, Murray JA, Shekelle P, et al. Prevention of venous thromboembolism after injury. Evidence Report/Technology Assessment No. 22 (Prepared by Southern California Evidence-based Practice Center/RAND under Contract No. 290-97-0001). Rockville, MD: Agency for Healthcare Research and Quality; 2000. AHRQ Publication No. 01-E004.
[101] Chan LS, Kipke MD, Schneir A, Iverson E, Warf C, Limbos MA, et al. Preventing violence and related health-risking social behaviors in adolescents. Evidence Report/Technology Assessment No. 107 (Prepared by the Southern California Evidence-based Practice Center under Contract No. 290-02-2003). Rockville, MD: Agency for Healthcare Research and Quality; 2004. AHRQ Publication No. 04-E032-2.
[102] Beach J, Rowe BH, Blitz S, Crumley E, Hooton N, Russell K, et al. Diagnosis and management of work-related asthma. Evidence Report/Technology Assessment No. 129 (Prepared by the University of Alberta Evidence-based Practice Center, under Contract No. 290-02-0023). Rockville, MD: Agency for Healthcare Research and Quality; 2005. AHRQ Publication No. 06-E003-2.
[103] Coulter ID, Hardy ML, Favreau JT, Elfenbaum PD, Morton SC, Roth EA, et al. Mind-body interventions for gastrointestinal conditions. Evidence Report/Technology Assessment No. 40 (Prepared by Southern California Evidence-based Practice Center/RAND under Contract No. 290-97-0001). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E030.
[104] Hersh WR, Hickam DH, Severance SM, Dana TL, Krages KP, Helfand M. Telemedicine for the medicare population: update. Evidence Report/Technology Assessment No. 131 (Prepared by the Oregon Evidence-based Practice Center under Contract No. 290-02-0024). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E007.
[105] Levine C, Armstrong K, Chopra S, Estok R, Zhang S, Ross S. Diagnosis and management of specific breast abnormalities. Evidence Report/Technology Assessment No. 33 (Prepared by MetaWorks, Inc., Boston, MA under Contract No. 290-97-0016). Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ Publication No. 01-E046.
[106] Matchar DB, Thakur ME, Grossman I, McCrory DC, Orlando LA, Steffens DC, et al. Testing for cytochrome P450 polymorphisms in adults with non-psychotic depression treated with selective serotonin reuptake inhibitors (SSRIs). Evidence Report/Technology Assessment No. 146 (Prepared by the Duke Evidence-based Practice Center under Contract No. 290-02-0025). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 07-E002.
[107] Seidenfeld J, Samson DJ, Bonnell CJ, Ziegler KM, Aronson N. Management of small cell lung cancer. Evidence Report/Technology Assessment No. 143 (Prepared by Blue Cross and Blue Shield Association Technology Evaluation Center Evidence-based Practice Center under Contract No. 290-02-0026). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 06-E016.
[108] Myers ER, Havrilesky LJ, Kulasingam SL, Sanders GD, Cline KE, Gray RN, et al. Genomic tests for ovarian cancer detection and management. Evidence Report/Technology Assessment No. 145 (Prepared by the Duke University Evidence-based Practice Center under Contract No. 290-02-0025). Rockville, MD: Agency for Healthcare Research and Quality; 2006. AHRQ Publication No. 07-E001.
[109] ECRI Institute Evidence-based Practice Center. Quality item checklist for SINGLE-GROUP studies (unpublished). 2008.
[110] University of Alberta Evidence-Based Practice Centre. Quality Assessment Tool for Observational Analytical Studies.
[111] National Institute for Health and Clinical Excellence. The guidelines manual. London, UK: National Institute for Health and Clinical Excellence; 2007. Available at www.nice.org.uk. Accessed June 9, 2011.
[112] Khan KS, ter Riet G, Glanville J, et al, editors. Undertaking systematic reviews of research on effectiveness. 2nd edition. York, UK: NHS Centre for Reviews and Dissemination; 2001.
[113] Johnston P, Wilkinson K. Enhancing validity of critical tasks selected for college and university program portfolios. National Forum Teach Educ J 2009;19(3):1–6.
[114] Crisp AH, Callender JS, Halek C, Hsu LKG. Long-term mortality in anorexia nervosa: a 20-year follow-up of the St. George’s and Aberdeen cohorts. Br J Psychiatry 1992;161:104–7. PMID: 1638303.
[115] Daniel M, Green LW, Marion SA, Gamble D, Herbert CP, Hertzman C, et al. Effectiveness of community-directed diabetes prevention and control in a rural Aboriginal population in British Columbia, Canada. Soc Sci Med 1999;48:815–32. PMID: 10190643.
[116] Hedderson MM, Weiss NS, Sacks DA, Pettitt DJ, Selby JV, Quesenberry CP, et al. Pregnancy weight gain and risk of neonatal complications. Obstet Gynecol 2006;108:1153–61. PMID: 17077237.
[117] Di Lieto A, Iannotti F, De Falco M, Staibano S, Pollio F, Ciociola F, et al. Immunohistochemical detection of insulin-like growth factor type I receptor and uterine volume changes in gonadotropin-releasing hormone analog-treated uterine leiomyomas. Am J Obstet Gynecol 2003;188:702–6. PMID: 12634644.
[118] Coleman FH. Safety and efficacy of combined ritodrine and magnesium sulfate for preterm labor: a method for reduction of complications. Am J Perinatol 1990;7(4):366–9. PMID: 2222631.
[119] Schindl M, Birner P, Reingrabner M, Joura EA, Husslein P, Langer M. Elective cesarean section vs. spontaneous delivery: a comparative study of birth experience. Acta Obstet Gynecol Scand 2003;82:834–40. PMID: 12911445.
[120] Fouad MN, Kiefe CI, Bartolucci AA, Burst NM, Ulene V, Harvey MR. A hypertension control program tailored to unskilled and minority workers. Ethn Dis 1997;7:191–9. PMID: 9467701.
[121] Baker DW, Gazmararian JA, Williams MV, Scott T, Parker RM, Green D, et al. Functional health literacy and the risk of hospital admission among Medicare managed care enrollees. Am J Public Health 2002;92:1278–83. PMID: 12144984.
[122] Kinney TR, Helms RW, O’Branski EE, Ohene-Frempong K, Wang W, Daeschner C, et al. Safety of hydroxyurea in children with sickle cell anemia: results of the HUG-KIDS Study, a phase I/II trial. Blood 1999;94:1550–4. PMID: 10477679.
[123] Van Ham MAPC, van Dongen PWJ, Mulder J. Maternal consequences of caesarean section. A retrospective study of intra-operative and postoperative maternal complications of caesarean section during a 10-year period. Eur J Obstet Gynecol Reprod Biol 1997;74(1):1–6. PMID: 9243191.
[124] Blood E, Spratt KF. Disagreement on agreement: two alternative agreement coefficients. Paper 186-2007 SAS Global Forum. 2007.
[125] Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med 1998 Jan 15;17(1):101–10.
[126] Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol 1990;43:543–9. PMID: 2348207.
