
ORIGINAL CONTRIBUTION

Medicine Residents' Understanding
of the Biostatistics and Results
in the Medical Literature

Donna M. Windish, MD, MPH
Stephen J. Huot, MD, PhD
Michael L. Green, MD, MSc

Context  Physicians depend on the medical literature to keep current with clinical information. Little is known about residents' ability to understand statistical methods or how to appropriately interpret research outcomes.

Objective  To evaluate residents' understanding of biostatistics and interpretation of research results.

Design, Setting, and Participants  Multiprogram cross-sectional survey of internal medicine residents.

Main Outcome Measure  Percentage of questions correct on a biostatistics/study design multiple-choice knowledge test.

Results  The survey was completed by 277 of 367 residents (75.5%) in 11 residency programs. The overall mean percentage correct on statistical knowledge and interpretation of results was 41.4% (95% confidence interval [CI], 39.7%-43.3%) vs 71.5% (95% CI, 57.5%-85.5%) for fellows and general medicine faculty with research training (P<.001). Higher scores in residents were associated with additional advanced degrees (50.0% [95% CI, 44.5%-55.5%] vs 40.1% [95% CI, 38.3%-42.0%]; P<.001); prior biostatistics training (45.2% [95% CI, 42.7%-47.8%] vs 37.9% [95% CI, 35.4%-40.3%]; P=.001); enrollment in a university-based training program (43.0% [95% CI, 41.0%-45.1%] vs 36.3% [95% CI, 32.6%-40.0%]; P=.002); and male sex (44.0% [95% CI, 41.4%-46.7%] vs 38.8% [95% CI, 36.4%-41.1%]; P=.004). On individual knowledge questions, 81.6% correctly interpreted a relative risk. Residents were less likely to know how to interpret an adjusted odds ratio from a multivariate regression analysis (37.4%) or the results of a Kaplan-Meier analysis (10.5%). Seventy-five percent indicated they did not understand all of the statistics they encountered in journal articles, but 95% felt it was important to understand these concepts to be an intelligent reader of the literature.

Conclusions  Most residents in this study lacked the knowledge in biostatistics needed to interpret many of the results in published clinical research. Residency programs should include more effective biostatistics training in their curricula to successfully prepare residents for this important lifelong learning skill.

JAMA. 2007;298(9):1010-1022  www.jama.com

Physicians must keep current with clinical information to practice evidence-based medicine (EBM). In doing so, most prefer to seek evidence-based summaries, which give the clinical bottom line,1 or evidence-based practice guidelines.1-3 Resources that maintain these information summaries, however, currently include a limited number of common conditions.4 Thus, to answer many of their clinical questions, physicians need to access reports of original research. This requires the reader to critically appraise the design, conduct, and analysis of each study and subsequently interpret the results.

Several surveys in the 1980s demonstrated that practicing physicians, particularly those with no formal education in epidemiology and biostatistics, had a poor understanding of common statistical tests and limited ability to interpret study results.5-7 Many physicians likely have increased difficulty today because more complicated statistical methods are being reported in the medical literature.8 They may be able to understand the analysis and interpretation of results in only 21% of research articles.8

Educators have responded by increasing training in critical appraisal and biostatistics throughout the continuum of medical education. Many medical schools currently provide some formal teaching of basic statistical concepts.9 As part of the Accreditation Council for Graduate Medical Education's practice-based learning and improvement competency, residents must demonstrate ability in "locating, appraising, and assimilating evidence from scientific studies related to their patients' problems and apply knowledge of study designs and statistical methods to the appraisal of

Author Affiliations: Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut.
Corresponding Author: Donna M. Windish, MD, MPH, Yale Primary Care Residency Program, 64 Robbins St, Waterbury, CT 06708 (donna.windish@yale.edu).

1010 JAMA, September 5, 2007—Vol 298, No. 9 (Reprinted) ©2007 American Medical Association. All rights reserved.




clinical studies."10 Most residency programs address this competency through EBM curricula or journal clubs.11-13 In 2000, the majority of these programs included training in appraisal of studies and study conduct, but fewer specifically addressed the selection and interpretation of statistical tests.11,14 In addition, the majority of published assessments of residents' knowledge and skills in EBM were performed at single programs, were conducted in the context of determining the impact of a specific curriculum, evaluated critical appraisal skills more commonly than biostatistics, and found that residents scored well below EBM "experts" on evaluation instruments.15 We performed a multiprogram assessment of residents' biostatistics knowledge and interpretation of study results using a new instrument developed for this study.

METHODS

Survey Development
We developed an instrument to reflect the statistical methods and results most commonly represented in contemporary research studies (APPENDIX). Thus, we reviewed all 239 original articles published from January to March of 2005 in each issue of 6 general medical journals (American Journal of Medicine, Annals of Internal Medicine, BMJ, JAMA, Lancet, and New England Journal of Medicine) and summarized the frequency of statistical methods described (TABLE 1). From this review, we developed questions that focused on identifying and interpreting the results of the most frequently occurring simple statistical methods (eg, χ2, t test, analysis of variance) and multivariate analyses (eg, Cox proportional hazards regression, multiple logistic regression).

Table 1. Statistical Methods Used in 239 Original Research Articles in 6 General Medical Journals, 2005

Type of Test                                  No. (%)
Descriptive statistics a                    219 (91.6)
Simple statistics                           120 (50.2)
  χ2 Analysis                                70 (29.3)
  t Test                                     48 (20.1)
  Kaplan-Meier analysis                      48 (20.1)
  Wilcoxon rank sum test                     38 (15.9)
  Fisher exact test                          33 (13.8)
  Analysis of variance                       21 (8.8)
  Correlation                                16 (6.7)
Multivariate statistics                     164 (68.6)
  Cox proportional hazards                   64 (26.8)
  Multiple logistic regression               54 (22.6)
  Multiple linear regression                  7 (2.9)
  Other regression analyses b                38 (15.9)
None                                          5 (2.1)
Other methods, techniques, or strategies
  Intention-to-treat analysis                42 (17.6)
  Incidence/prevalence                       39 (16.3)
  Relative risk/risk ratio                   29 (12.2)
  Sensitivity analyses                       21 (8.8)
  Sensitivity/specificity                    15 (6.3)

a Descriptive statistics included mean, median, frequency, standard deviation, and interquartile range.
b Other regression analyses included weighted logistic regression, unconditional logistic regression, conditional logistic regression, longitudinal regression, Poisson regression, pooled logistic regression, nonlinear regression, meta-regression, negative binomial regression, and generalized estimating equations.

Survey Instrument
The survey (Appendix) contained 4 sets of questions: (1) 11 demographic questions that included age, sex, current training level, past training in biostatistics and EBM, and current journal-reading practices; (2) 5 attitude questions regarding statistics; (3) 4 confidence questions about interpreting and assessing statistical concepts; and (4) a 20-question biostatistics knowledge test that assessed understanding of statistical methods, study design, and interpretation of study results. Statistical attitudes and confidence questions were adapted from surveys on the Assessment Resource Tools for Improving Statistical Thinking (ARTIST) Web site, which is a resource for teaching statistical literacy, reasoning, and thinking.16 Attitudes regarding statistics were rated on a 5-point Likert scale. Confidence questions were assessed using a 5-point scale in which 1 indicated no confidence and 5 indicated complete confidence. The remaining 20 knowledge test questions addressed understanding of statistical techniques, study design, and interpretation of study results most commonly represented in our journal review. These questions were multiple-choice, clinically oriented with a case vignette, and required no calculations. Two questions were adapted from a study of Danish physicians' statistical knowledge.7 Seven questions were adapted from course materials used in statistics courses at the Johns Hopkins Bloomberg School of Public Health.17 The remaining questions were developed by one of the study authors (D.M.W.). The knowledge questions addressed research variable types, statistical methods, confidence intervals, P values, sensitivity and specificity, power and sample size, study design, and interpretation of study results.

Pilot Testing of Biostatistics Knowledge Test
The original test contained 22 knowledge questions and was pilot tested with 5 internal medicine faculty with advanced training in epidemiology and biostatistics and 12 primary care internal medicine residents at 1 residency program. Faculty reviewed the instrument for content validity, completed the test, and provided feedback. Residents completed the test and provided written and oral feedback. Four of the 5 faculty answered 21 of 22 questions correctly and 1 faculty member correctly answered 19 questions. This resulted in an overall mean score of 94%. Incorrect responses did not favor any particular question. Residents answered 53% of questions correctly. Based on feedback, 1 question was modified to improve clarity, 3 questions were eliminated to avoid duplicating similar concepts, and 1 question was added to further assess interpretation of results. Therefore, the final version of the test consisted of 20 questions.

Target Population and Survey Administration
We conducted an anonymous cross-sectional survey from February through July 2006 of 11 internal medicine residency programs in Connecticut, including 7 traditional internal medicine programs, 2 primary care medicine programs, 1 medicine/pediatrics program, and 1 medicine/preventive medicine program. We initially contacted all 15 internal medicine residency programs in Connecticut to ask for their




participation in the study. All programs were successfully contacted and expressed interest. However, 3 programs could not accommodate the study because of scheduling conflicts, and 1 program was not included because its residents (medicine/pediatrics) were distributed to different training sites and therefore were not present at the conferences used for the survey.

Included residencies were both university affiliated (7 programs) and community based (4 programs). Residents at all postgraduate levels of training were invited to participate. Oral consent was obtained from each participant after providing a description of the survey's purpose. The survey was administered during the first 25 minutes of an inpatient noon conference lecture for current residents. After all questionnaires were collected, the remainder of the time was devoted to a seminar in statistical methods and interpretation of the literature. Four residency programs also allowed us to survey their entering intern classes during their orientations. To provide data for validity testing, an additional 10 faculty and fellows trained in clinical investigation also completed the final survey. The Yale University human investigation committee approved the study protocol.

Analysis
In addition to assessing the content validity, the psychometric properties of the 20-question knowledge test were determined by assessing internal consistency using Cronbach α. Discriminative validity was assessed by comparing the difference in mean scores obtained between residents and research-trained fellows and faculty using the t test.

The biostatistics knowledge test was scored by determining the percentage of questions correct, weighting each question equally. Missing values were counted as incorrect responses. The t test or a 1-way analysis of variance was used to compare survey scores by respondent characteristics. We calculated the percentage of residents who agreed or strongly agreed with each attitudinal question. We determined the percentage of respondents with fair to high confidence for each confidence question and the mean confidence score based on the sum of all 4 questions.

Correlation analyses were performed to test for multicollinearity between 3 sets of factors we hypothesized might be highly correlated (training outside of the United States and years since medical school graduation; training level and age; and past biostatistics training and past epidemiology training). Bivariate analyses were performed to identify factors that might be associated with knowledge scores. Candidate variables included sex, age, academic affiliation of residency program, advanced degrees, years since medical school graduation, training outside of the United States, current level of training, past biostatistics training, past epidemiology training, past EBM training, and currently reading medical journals. We also tested for effect modification for pairs of factors including past biostatistics training and past EBM training; past biostatistics training and past epidemiology training; and past biostatistics training and sex. The results of the correlation, bivariate, and effect modification analyses were used to determine which demographic variables to include in the multivariable model. Decisions to include factors in the multivariable regression analysis were based on the strength of correlated factors (r < 0.75) or a P value <.05 on bivariate analyses. Forward stepwise regression was subsequently used to identify which demographic factors were independently associated with biostatistics knowledge scores.

To adjust for multiple pairwise comparisons, a 2-sided level of statistical significance was set at P<.01 using a Bonferroni correction. With a sample size of 277 and a P value of .01, the study had 80% power to detect a 4.4% difference in mean knowledge scores. All analyses were performed using Stata release 8.2 (StataCorp, College Station, Texas).

RESULTS

Training Program Characteristics
The 11 targeted training programs had 532 residents, with a mean of 53.6 trainees (range, 12-118).18,19 In comparison, the 388 internal medicine training programs in the United States have a total of 21 885 residents, with a mean of 56.4 trainees (range, 4-170) (P=.76 compared to targeted programs).20 The study programs had 41.9% women residents, compared with 42.1% nationally (P=.96), and 49.9% of residents with training outside of the United States vs 52.3% nationally (P=.51).19 Comparing targeted programs with all internal medicine programs, no statistically significant differences were seen for postgraduate year 1 trainees in mean duty hours per week (61.9 vs 65.2, P=.13), mean consecutive work hours (30 vs 27.5, P=.09), and mean number of days off per week (1.3 vs 1.2, P=.31).18 Targeted programs also did not differ in these characteristics from the remaining 4 Connecticut training programs.

Respondent Characteristics
Three hundred sixty-seven residents in the 11 targeted programs were on rotations that would make them available to attend their respective noon conferences on the day of the survey. Of these, 309 (84.2%) were in attendance. Of the total available residents, 277 (75.5%) completed the assessment. The response rate for individual programs ranged from 28.1% to 80%. No differences in response rates or attendance were seen based on sex, level of training, or past training outside of the United States. TABLE 2 lists the respondents' demographic characteristics. Approximately equal numbers of men and women were represented. Fifty-eight percent were enrolled in traditional internal medicine programs, 76.5% participated in university-based programs, 50.6% had some training outside of the United States, and 14.8% had advanced degrees. More than 68% of respondents had some training in biostatistics, with approximately 70% of this training occurring during medical school.
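The internal-consistency statistic named in the Analysis section, Cronbach α, has a short closed form: α = k/(k−1) × (1 − Σ item variances / variance of the total score). A minimal sketch in Python (illustrative only; the 0/1 response matrix is invented, not study data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach alpha for a respondents-by-items score matrix (0/1 here).

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of summed scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Invented answers: 4 respondents x 3 items, mostly consistent
demo = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
alpha = cronbach_alpha(demo)   # about 0.89 for this toy matrix
```

A matrix whose items agree perfectly on every respondent yields exactly 1; values near 0.8, as reported for this instrument, indicate that the items largely measure one underlying trait.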
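Most residents surveyed could interpret a relative risk, while far fewer could interpret an adjusted odds ratio. The arithmetic distinction is small but consequential; a sketch with an invented 2x2 cohort table (in a multivariable logistic model, each odds ratio additionally carries the reading "with the other covariates held fixed"):

```python
# Invented cohort of 200 patients: rows = exposure, columns = outcome
a, b = 30, 70    # exposed: 30 events, 70 non-events
c, d = 15, 85    # unexposed: 15 events, 85 non-events

risk_exposed = a / (a + b)                     # 0.30
risk_unexposed = c / (c + d)                   # 0.15
relative_risk = risk_exposed / risk_unexposed  # 2.0: event twice as likely with exposure

odds_ratio = (a / b) / (c / d)                 # ~2.43
# The odds ratio exceeds the relative risk here because the event is
# common; the two approximate each other only when the event is rare.
```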

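The Kaplan-Meier analysis, the least-understood item on the knowledge test, is a running product over event times. A simplified sketch of the product-limit estimate (events at a given time are counted before censorings at that time; the patients are invented):

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct follow-up time.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    """
    survival, curve = 1.0, []
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        survival *= 1 - deaths / at_risk              # product-limit step
        curve.append((t, survival))
        at_risk -= sum(1 for ti in times if ti == t)  # leave the risk set
    return curve

# 4 invented patients, one censored at t=2
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
# [(1, 0.75), (2, 0.75), (3, 0.375), (4, 0.0)] — the censoring drops no
# survival itself but shrinks the risk sets used at later event times
```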


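The forward stepwise regression described in the Analysis section builds a multivariable model by adding predictors one at a time. A simplified sketch of the idea, using a greedy R2-gain threshold in place of the P-value entry criteria that statistical packages apply (function names and data are invented):

```python
import numpy as np

def forward_stepwise(X, y, names, min_gain=0.01):
    """Greedily add the predictor giving the largest R^2 improvement."""
    def r_squared(cols):
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return 1 - resid.var() / y.var()

    chosen, best = [], 0.0
    while len(chosen) < X.shape[1]:
        gains = {c: r_squared(chosen + [c])
                 for c in range(X.shape[1]) if c not in chosen}
        cand = max(gains, key=gains.get)
        if gains[cand] - best < min_gain:
            break                    # no remaining candidate improves the fit enough
        chosen.append(cand)
        best = gains[cand]
    return [names[c] for c in chosen], best

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(size=200)  # x2 is pure noise
```

On this toy data, forward_stepwise(X, y, ["x0", "x1", "x2"]) should admit the two signal columns in order of strength and reject the noise column.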

Table 2. Characteristics of the 277 Participants

Characteristic                                                        No. (%) a
Sex
  Men                                                                143 (52.0)
  Women                                                              134 (48.0)
Age range, y
  21-25                                                               25 (9.2)
  26-30                                                              166 (60.8)
  31-35                                                               62 (22.7)
  ≥36                                                                 20 (7.3)
Other advanced degrees                                                41 (14.8)
  Doctor of philosophy (PhD)                                          11 (4.0)
  Master of public health (MPH)/master of health science (MHS)        16 (5.8)
  Master of science (MSc)                                             12 (4.4)
  Other                                                                4 (1.5)
  None                                                               235 (85.1)
Years since medical school graduation
  <1                                                                 105 (35.0)
  1-3                                                                 72 (26.8)
  4-10                                                                81 (30.1)
  ≥11                                                                 11 (4.1)
Training outside of the United States                                139 (50.6)
  College                                                             36 (13.0)
  Medical school                                                     107 (38.6)
  Residency                                                           33 (11.9)
  Other                                                                7 (2.5)
  None                                                               138 (49.4)
Academic affiliation of program
  University based                                                   212 (76.5)
  Community based                                                     65 (23.5)
Current residency training program type
  Traditional/categorical medicine                                   161 (58.3)
  Primary care medicine                                               60 (21.7)
  Medicine/pediatrics                                                 12 (4.4)
  Medicine/preventive medicine                                         7 (2.5)
  Preliminary/transitional year                                       36 (13.0)
Current level of training
  Entering intern                                                    103 (37.3)
  Experienced intern b                                                72 (26.1)
  Second-year resident                                                42 (15.2)
  Third-year resident                                                 45 (16.3)
  Fourth-year resident                                                 6 (2.2)
  Chief resident                                                       8 (2.9)
Previous training/coursework in biostatistics                        190 (68.8)
Location of biostatistics training
  College                                                             30 (15.9)
  Medical school                                                     132 (69.5)
  Residency                                                            6 (3.2)
  Other                                                               26 (13.7)
Previous training/coursework in epidemiology                         190 (68.8)
Previous training/coursework in evidence-based medicine c            162 (58.5)
Regularly reads medical journals                                     187 (68.8)

a Percentages may not total 100% due to missing data or multiple responses.
b Trainees in month 8 to 12 of their intern year.
c Evidence-based medicine is defined as the integration of the best research evidence with patients' values and clinical circumstances in clinical decision making. This is in contrast to biostatistics, which is the scientific use of quantitative information to describe or draw inferences about natural phenomena, and epidemiology, which is the study of patterns, causes, and control of disease in groups of people.

Psychometric Properties of the Knowledge Test
The survey instrument had high internal consistency (Cronbach α = 0.81). Fellows and general medicine faculty with advanced training in biostatistics had a significantly higher score than residents (mean percentage correct, 71.5% [95% confidence interval {CI}, 57.5%-85.5%] vs 41.1% [95% CI, 39.7%-43.3%]; P<.001), indicating good discriminative validity.

Knowledge of Statistical Methods and Results
The overall mean resident knowledge score was 41.1% (SD, 15.2%; range, 10%-90%). Residents scored highest in recognition of double-blind studies (87.4% [95% CI, 83.5%-91.3%] answering correctly) and interpretation of relative risk (81.6% [95% CI, 77.0%-86.2%] answering correctly) (TABLE 3). They were least able to interpret the results of a Kaplan-Meier analysis, with 10.5% (95% CI, 6.9%-14.1%) answering correctly. Only 37.4% (95% CI, 31.9%-43.3%) understood how to interpret an adjusted odds ratio from a multivariate regression analysis, while 58.8% (95% CI, 53.0%-64.6%) could interpret the meaning of a P value.

Table 3. Percentages of Correct Answers for the Knowledge-Based Questions

Question No. a   Objective                                                      Correct (95% CI), %
1a               Identify continuous variable                                   43.7 (37.8-49.5)
1b               Identify ordinal variable                                      41.5 (35.7-47.3)
1c               Identify nominal variable                                      32.9 (27.3-38.4)
2                Recognize a case-control study                                 39.4 (33.6-45.1)
3                Recognize purpose of double-blind studies                      87.4 (83.5-91.3)
4a               Identify ANOVA                                                 47.3 (41.4-53.2)
4b               Identify χ2 analysis                                           25.6 (20.5-30.8)
4c               Identify t test                                                58.1 (52.3-63.9)
5                Recognize definition of bias                                   46.6 (40.7-52.4)
6                Interpret the meaning of P value >.05                          58.8 (53.0-64.6)
7                Identify Cox proportional hazard regression                    13.0 (9.0-17.0)
8                Interpret standard deviation                                   50.2 (42.3-56.1)
9                Interpret 95% CI and statistical significance                  11.9 (8.0-15.7)
10               Recognize power, sample size, and significance-level           30.3 (24.9-35.7)
                 relationship
11               Determine which test has more specificity                      56.7 (50.8-62.5)
12               Interpret an unadjusted odds ratio                             39.0 (33.3-44.7)
13               Interpret odds ratio in multivariate regression analysis       37.4 (31.9-43.3)
14               Interpret relative risk                                        81.6 (77.0-86.2)
15               Determine strength of evidence for risk factors                17.0 (12.6-21.4)
16               Interpret Kaplan-Meier analysis results                        10.5 (6.9-14.1)

Abbreviations: ANOVA, analysis of variance; CI, confidence interval.
a See Appendix.

Factors Associated With Statistical Knowledge
Training outside of the United States had moderate correlation with years since medical school graduation (r=0.59), as did past epidemiology training with past biostatistics training (r=0.53). Training level had a fair correlation with age (r=0.46). No effect modification was seen for the 3 sets of factors assessed. In bivariate analyses, differences in scores were seen based on residency program type, with medicine/pediatric residents scoring the highest (TABLE 4). Residents with advanced degrees performed better than those without advanced training (50.0% [95% CI, 44.5%-55.5%] vs 40.1% [95% CI, 38.3%-42.0%]; P<.001). Statistically significant higher scores were also seen in residents who were just entering residency, had prior biostatistics training, were enrolled in a university-based training program, and were men (Table 4).

Using forward stepwise regression, 5 factors were found to be independently associated with knowledge scores (Table 4). An advanced degree was associated with an absolute increase of 9.2% questions correct after adjustment for other factors (P<.001). Successive years since medical school graduation were associated with decreasing knowledge scores, with 11 years or more postgraduation associated with a 12.3% absolute decrease in score compared with less than 1 year postgraduation. Male sex, belonging to a university-based program, and past biostatistics training were all associated with higher scores.

Attitudes and Confidence
The majority of residents agreed or strongly agreed that to be an intelligent reader of the literature it is necessary to know something about statistics (95%) and indicated they would like to learn more about statistics (77%). Seventy-five percent reported they did not understand all of the statistics they encountered in the literature, whereas only 15% felt that they do not trust statistics "because it is easy to lie." More than 58% of respondents indicated that they use statistical information in forming opinions or when making decisions in medical care.

The mean confidence score in understanding certain statistical concepts was 11.4 (SD, 2.7) (maximum possible confidence score, 20). The majority of residents reported fair to complete confidence in understanding P values (88%). Fewer were confident in interpreting results of statistical methods used in research (68%), identifying factors influencing a study's power (55%), or assessing if a correct statistical procedure was used (38%).

Respondents with higher confidence in their statistical knowledge (a score higher than the mean confidence score) performed better on the knowledge questions than those with lower confidence (43.6% [95% CI, 40.8%-46.3%] vs 39.3% [95% CI, 37.0%-41.6%]; P=.02). Those who reported fair to high confidence in interpreting a P value were more likely to correctly interpret its meaning (62.8% [95% CI, 56.8%-67.2%] vs 38.2% [95% CI, 24.3%-51.7%]; P=.006). No differences were seen in a resident's ability to appropriately identify the correct statistical procedure used based on their confidence to do so.

COMMENT
In this multiprogram survey of internal medicine residents' confidence in, attitudes toward, and knowledge of statistical methods and interpretation of research results, 95% believed that it was important to understand these concepts to be an intelligent reader of the literature, yet three-fourths of residents acknowledged low confidence in understanding the statistics they encounter in the medical literature. This lack of confidence was validated by their low knowledge scores, in which on average only 8 of 20 questions were answered correctly. Although past instruction in biostatistics and advanced degrees were associated with better performance, knowledge scores appeared to decline with progression through training.

The poor knowledge in biostatistics and interpretation of study results among residents in our study likely reflects insufficient training. Nearly one-third of trainees indicated that they never received biostatistics teaching at any point in their career. When training did occur, the majority of instruction took place during undergraduate medical education and was not reinforced in residency. The most recent comprehensive survey of medical school biostatistics teaching was conducted in the 1990s and found that more than 90% of medical schools focused their biostatistics teaching in the preclinical years without later instruction and that the depth and breadth of this education varied greatly among schools.21 That review reported that familiar concepts such as P values, t tests, and χ2 analyses were




Table 4. Knowledge Scores by Resident Characteristics a

                                           Bivariate Analyses                   Multiple Linear Regression
Characteristic                             Mean Correct, % (95% CI)   P Value   Score Difference, % b (95% CI)   P Value
Sex
  Women                                    38.8 (36.4 to 41.1) c      .004      1 [Reference]
  Men                                      44.0 (41.4 to 46.7)                  4.4 (0.97 to 7.9)                .01
Age range, y
  21-25                                    44.4 (38.8 to 50.0)        .46 d
  26-30                                    41.7 (39.5 to 44.0)
  31-35                                    41.0 (37.0 to 45.0)
  ≥36                                      37.3 (30.4 to 44.2)
Other advanced degrees
  No                                       40.1 (38.3 to 42.0) c      <.001     1 [Reference]
  Yes                                      50.0 (44.5 to 55.5)                  9.2 (4.2 to 14.3)                <.001
Years since medical school graduation
  <1                                       45.2 (42.4 to 48.0)        <.001 d   1 [Reference]
  1-3                                      42.2 (38.6 to 45.8)                  −2.3 (−6.5 to 2.0)               .29
  4-10                                     36.8 (33.6 to 40.0)                  −4.7 (−9.4 to 0.01)              .05
  ≥11                                      34.5 (27.9 to 41.1)                  −12.3 (−22.2 to −3.3)            .007
Training outside of the United States
  No                                       45.2 (42.7 to 47.8) c      <.001
  Yes                                      37.9 (35.4 to 40.3)
Academic affiliation of program
  Community based                          36.3 (32.6 to 40.0) c      .002      1 [Reference]
  University based                         43.0 (41.0 to 45.1)                  5.6 (0.93 to 10.2)               .02
Current level of training
  Entering intern                          45.6 (42.8 to 48.4)        .01 d
  Experienced intern e                     39.2 (35.7 to 42.7)
  Second-year resident                     39.3 (34.9 to 43.7)
  Third-year resident                      38.4 (33.6 to 43.2)
  Fourth-year resident                     43.3 (30.3 to 56.3)
  Chief resident                           38.1 (31.0 to 45.2)
Current residency training program type
  Traditional/categorical medicine         39.8 (37.5 to 42.1)        .003 d
  Primary care medicine                    42.4 (39.0 to 45.8)
  Medicine/pediatrics                      54.6 (47.3 to 61.9)
  Medicine/preventive medicine f           53.6 (37.8 to 69.5)
  Preliminary/transitional year            41.0 (35.9 to 46.1)
Previous training/coursework in biostatistics
  No                                       37.9 (35.4 to 40.3) c      .001      1 [Reference]
  Yes                                      45.2 (42.7 to 47.8)                  4.5 (0.80 to 8.2)                .04
Location of biostatistics training
  College                                  38.1 (31.7 to 44.5)        .004 d
  Medical school                           42.2 (40.0 to 44.8)
  Residency                                58.8 (38.2 to 79.4)
  Other                                    51.0 (44.8 to 57.2)
Previous training/coursework in epidemiology
  No                                       37.5 (34.6 to 40.4) c      .003
  Yes                                      43.3 (41.1 to 45.6)
Previous training/coursework in evidence-based medicine g
  No                                       39.0 (35.8 to 42.1) c      .04
  Yes                                      42.9 (40.7 to 45.0)
Regularly reads medical journals
  No                                       42.3 (38.9 to 43.2) c      .53
  Yes                                      41.0 (38.9 to 45.7)

a To adjust for multiple pairwise comparisons, P < .01 is considered statistically significant.
b Using forward stepwise regression, 5 factors (sex, advanced degree status, years since medical school graduation, program affiliation, and biostatistics training) were found to be associated with knowledge scores. The R2 value for the final model was 0.18.
c Analysis by the t test.
d Analysis by 1-way analysis of variance.
e Trainee in month 8 to 12 of the intern year.
f The medicine/preventive medicine scores were not normally distributed. Median (interquartile range), 45% (35%-75%).
g Evidence-based medicine is defined as the integration of the best research evidence with patients' values and clinical circumstances in clinical decision making. This is in contrast to biostatistics, which is the scientific use of quantitative information to describe or draw inferences about natural phenomena, and epidemiology, which is the study of patterns, causes, and control of disease in groups of people.





frequently addressed (95%, 92%, and 88%, respectively), but advanced methods (such as Cox proportional hazards regression, multiple logistic regression, and Kaplan-Meier analyses) were not included in instruction.21 If biostatistics teaching has continued at the same level in recent years, it would not be surprising that only a small percentage of residents in our survey (10.5%-37.6%) understood the results and use of these analyses.

The correlates of differences in knowledge scores might have been expected. Residents with prior biostatistical training and those with advanced instruction through a master’s or PhD degree scored better than their counterparts. More senior residents performed worse than junior residents, potentially reflecting loss of knowledge over time, lack of reinforcement, or both. Although fourth-year residents were an exception to this pattern, these residents were part of a single medicine/pediatrics program that outperformed all other training programs. The higher scores in university-based residency programs may reflect exposure to faculty with more biostatistical training or teaching experience. In a survey study, community faculty considered EBM less important, were less confident in their EBM knowledge, and demonstrated poorer EBM skills than full-time faculty.22

Although sex was associated with a difference in scores, this finding is not supported by other literature. Studies of evidence-based practice knowledge and skills rarely report analyses by sex. In 2 studies, investigators found no sex differences in critical appraisal skills among family physicians23 or in use of online evidence databases among public health practitioners.24 Six studies assessing the biostatistics and epidemiology knowledge of physicians and trainees did not conduct comparisons by sex.5-7,25-27 Furthermore, our result was not a confirmation of an a priori hypothesis and so should be interpreted with caution.

Our final regression model found 5 predictors of knowledge scores: advanced degrees, academic affiliation, prior biostatistics training, sex, and years since medical school graduation. The proportion of explained variation for the model was small, with R² = 0.18. This likely reflects in part the low variance in resident scores.

Our results suggest the need for more effective training in biostatistics in residency education. Such training has proven difficult, with systematic reviews showing only limited effectiveness of many journal clubs and EBM curricula.14,28-32 Thus, it is not surprising that prior EBM experience, which in the past has not included biostatistics training,11,14 was not associated with higher scores in our multivariable analysis. Interactive, self-directed, and clinically instructional strategies seem to stand the best chance of success.33 Involvement in hypothesis-driven research during training that requires comprehensive reading of the literature may also enhance residents’ knowledge and understanding.34

Faculty who are implementing biostatistics curricula can access several teaching resources. In internal medicine, the American College of Physicians’ ACP Journal Club has presented a series of reports emphasizing basic study designs and statistics.35 CMAJ has published a series of EBM “teaching tips” for learners and teachers.36 A guide designed to help medical educators choose and interpret statistical tests when developing educational studies or when reading the medical literature is also available.37

Limitations of this study should be considered. First, while our instrument showed good content validity, internal consistency, and discriminative validity, these psychometric properties were not known in advance but were established in the current study. Second, our survey was purposely kept brief, thus limiting our ability to assess understanding of all biostatistical concepts and research results. Nonetheless, our questions focused on the most commonly used methods and results found in the contemporary literature. Third, we attempted to survey only those residents who were present at the time of their inpatient conference. Residents who did not attend, either by choice or by chance, might have scored differently. However, since we found no differences in demographic characteristics between responders and nonresponders, this is less likely. Fourth, our study was confined to internal medicine residents, limiting generalizability to other resident physicians. Nevertheless, we were able to assess multiple types of internal medicine training programs and found similar results.

Despite these limitations, this study also has several strengths. First, it was a multiprogram study that captured information on a wide range of internal medicine residents at different types of residency programs. Second, the residents in our survey, although limited to 1 state, possessed characteristics similar to all other trainees in internal medicine programs across the United States. Third, the 11 residency programs were similar in size and composition to the average US internal medicine program, and thus our study appears to be generalizable to internal medicine trainees and training programs in the United States.

Higher levels of statistical methods are being used in contemporary medical literature, but basic concepts, frequently occurring tests, and interpretation of results are not well understood by resident physicians. This inadequate preparation demonstrates lack of competence in meeting part of the Accreditation Council for Graduate Medical Education’s practice-based learning and improvement requirement.10 If physicians cannot detect appropriate statistical analyses and accurately understand their results, the risk of incorrect interpretation may lead to erroneous applications of clinical research. Educators should reevaluate how this information is taught and reinforced in order to adequately prepare trainees for lifelong learning, and further research should examine the effectiveness of specific educational interventions.
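As context for the advanced methods discussed above, the Kaplan-Meier product-limit estimate, one of the analyses that only a minority of surveyed residents could interpret, can be sketched in a few lines. This is an illustrative sketch with hypothetical follow-up data, not part of the study’s methods:

```python
# Minimal Kaplan-Meier product-limit sketch (hypothetical data, for illustration only).
def kaplan_meier(times, events):
    """times: follow-up times; events: 1 = event occurred, 0 = censored.
    Returns (time, survival probability) pairs at each event time."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        # Number of events at this time point.
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if d > 0:
            surv *= (at_risk - d) / at_risk  # product-limit step
            curve.append((t, surv))
        # Everyone whose follow-up ends here (event or censored) leaves the risk set.
        at_risk -= sum(1 for ti in times if ti == t)
    return curve

print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

Each event time multiplies the running survival probability by the fraction of at-risk patients who survive it; censored observations leave the curve unchanged but shrink the risk set, which is the point most often misread when interpreting survival curves.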
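Similarly, the “proportion of explained variation” (R²) reported for the regression model can be made concrete with a minimal sketch. The numbers below are hypothetical and are not the study’s data:

```python
# Sketch of R^2 (hypothetical numbers): compares model error with error around the mean.
def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)              # total variation
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # unexplained variation
    return 1 - ss_resid / ss_total

observed = [40.0, 45.0, 38.0, 50.0, 42.0]   # hypothetical test scores
predicted = [41.0, 44.0, 40.0, 47.0, 43.0]  # hypothetical model predictions
print(round(r_squared(observed, predicted), 2))  # → 0.82
```

An R² of 0.18, as in the model above, means the predictors account for 18% of the variation in scores, leaving most of the variation unexplained.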
Author Contributions: Dr Windish had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Windish, Huot, Green.
Acquisition of data: Windish.
Analysis and interpretation of data: Windish, Green.
Drafting of the manuscript: Windish, Green.
Critical revision of the manuscript for important intellectual content: Windish, Huot, Green.
Statistical analysis: Windish, Green.
Administrative, technical, or material support: Huot.
Study supervision: Green.
Financial Disclosures: None reported.

REFERENCES

1. McColl A, Smith H, White P, Field J. General practitioner’s perceptions of the route to evidence based medicine: a questionnaire survey. BMJ. 1998;316(7128):361-365.
2. Young JM, Ward JE. Evidence-based medicine in general practice: beliefs and barriers among Australian GPs. J Eval Clin Pract. 2001;7(2):201-210.
3. Putnam W, Twohig PL, Burge FI, Jackson LA, Cox JL. A qualitative study of evidence in primary care: what the practitioners are saying. CMAJ. 2002;166(12):1525-1530.
4. Haynes RB. Of studies, syntheses, synopses, summaries, and systems: the “5S” evolution of information services for evidence-based healthcare decisions. Evid Based Med. 2006;11(6):162-164.
5. Berwick DM, Fineberg HV, Weinstein MC. When doctors meet numbers. Am J Med. 1981;71(6):991-998.
6. Weiss ST, Samet JM. An assessment of physician knowledge of epidemiology and biostatistics. J Med Educ. 1980;55(8):692-697.
7. Wulff HR, Andersen B, Brandenhoff P, Guttler F. What do doctors know about statistics? Stat Med. 1987;6(1):3-10.
8. Horton NJ, Switzer SS. Statistical methods in the journal. N Engl J Med. 2005;353(18):1977-1979.
9. Association of American Medical Colleges (AAMC). Curriculum directory. http://services.aamc.org/currdir/section4/start.cfm. Accessed April 14, 2007.
10. Accreditation Council for Graduate Medical Education (ACGME). Outcome project: enhancing residency education through outcomes assessment. http://www.acgme.org/Outcome/. Accessed January 12, 2007.
11. Green ML. Evidence-based medicine training in internal medicine residency programs: a national survey. J Gen Intern Med. 2000;15(2):129-133.
12. Dellavalle RP, Stegner DL, Deas AM, et al. Assessing evidence-based dermatology and evidence-based internal medicine curricula in US residency training programs: a national survey. Arch Dermatol. 2003;139(3):369-372.
13. Alguire PC. A review of journal clubs in postgraduate medical education. J Gen Intern Med. 1998;13(5):347-353.
14. Green ML. Graduate medical education training in clinical epidemiology, critical appraisal, and evidence-based medicine: a critical review of curricula. Acad Med. 1999;74(6):686-694.
15. Shaneyfelt T, Baum KD, Bell D, et al. Instruments for evaluating education in evidence-based practice: a systematic review. JAMA. 2006;296(9):1116-1127.
16. Assessment Resource Tools for Improving Statistical Thinking (ARTIST) Web site. https://ore.gen.umn.edu/artist/index.html. Accessed January 9, 2007.
17. Department of Biostatistics; Johns Hopkins Bloomberg School of Public Health. Course materials from Statistical Methods in Public Health II and III, 2003-2004 academic year. http://www.biostat.jhsph.edu/courses/bio622/index.html. Accessibility verified August 2, 2007.
18. American College of Physicians. Residency database search. http://www.acponline.org/residency/index.html?idxt. Accessed April 20, 2007.
19. American Medical Association. FREIDA online specialty statistics training statistics information. http://www.ama-assn.org/vapp/freida/spcstsc/0,1238,140,00.html. Accessed April 20, 2007.
20. American Medical Association. State-level data for accredited graduate medical education programs in the US: aggregate statistics on all resident physicians actively enrolled in graduate medical education during 2005-2006. http://www.ama-assn.org/ama/pub/category/3991.html#4. Accessed April 20, 2007.
21. Looney SW, Grady CS, Steiner RP. An update on biostatistics requirements in U.S. medical schools. Acad Med. 1998;73(1):92-94.
22. Beasley BW, Woolley DC. Evidence-based medicine knowledge, attitudes, and skills of community faculty. J Gen Intern Med. 2002;17(8):632-639.
23. Godwin M, Seguin R. Critical appraisal skills of family physicians in Ontario, Canada. BMC Med Educ. 2003;3:10.
24. Adily A, Westbrook J, Coiera E, Ward J. Use of on-line evidence databases by Australian public health practitioners. Med Inform Internet Med. 2004;29(2):127-136.
25. Estellat C, Faisy C, Colombet I, Chatellier G, Burnand B, Durieux P. French academic physicians had a poor knowledge of terms used in clinical epidemiology. J Clin Epidemiol. 2006;59(9):1009-1014.
26. Ambrosius WT, Manatunga AK. Intensive short courses in biostatistics for fellows and physicians. Stat Med. 2002;21(18):2739-2756.
27. Cheatham ML. A structured curriculum for improved resident education in statistics. Am Surg. 2000;66(6):585-588.
28. Coomarasamy A, Khan KS. What is the evidence that postgraduate teaching in evidence based medicine changes anything? a systematic review. BMJ. 2004;329(7473):1017.
29. Ebbert JO, Montori VM, Schultz HJ. The journal club in postgraduate medical education: a systematic review. Med Teach. 2001;23(5):455-461.
30. Taylor R, Reeves B, Ewings P, Binns S, Keast J, Mears R. A systematic review of the effectiveness of critical appraisal skills training for clinicians. Med Educ. 2000;34(2):120-125.
31. Parkes J, Hyde C, Deeks J, Milne R. Teaching critical appraisal skills in health care settings. Cochrane Database Syst Rev. 2001;(3):CD001270.
32. Norman GR, Shannon SI. Effectiveness of instruction in critical appraisal (evidence-based medicine) skills: a critical appraisal. CMAJ. 1998;158(2):177-181.
33. Khan KS, Coomarasamy A. A hierarchy of effective teaching and learning to acquire competence in evidenced-based medicine. BMC Med Educ. 2006;6:59.
34. Rogers LF. The “win-win” of research. AJR Am J Roentgenol. 1999;172(4):877.
35. ACP Journal Club. Evidence-based medicine for better patient care. http://www.acpjc.org/?hp. Accessed November 5, 2006.
36. Wyer PC, Keitz S, Hatala R, et al. Tips for learning and teaching evidence-based medicine: introduction to the series. CMAJ. 2004;171(4):347-348.
37. Windish DM, Diener-West M. A clinician-educator’s roadmap to choosing and interpreting statistical tests. J Gen Intern Med. 2006;21(6):656-660.


APPENDIX

Biostatistical Knowledge Test