Dirk Stengel
Mohit Bhandari
Beate Hanson
Handbook
Layout and typesetting: nougat GmbH, CH-4056 Basel
Library of Congress Cataloging-in-Publication Data is available from the publisher.
Hazards
Great care has been taken to maintain the accuracy of the information contained in this publication. However, the publisher, and/or the distributor, and/or the editors, and/or the authors cannot be held responsible for errors or any consequences arising from the use of the information contained in this publication. Contributions published under the name of individual authors are statements and opinions solely of said authors and not of the publisher, and/or the distributor, and/or the AO Group.
The products, procedures, and therapies described in this work are hazardous and are therefore only to be applied by certified and trained medical professionals in environments specially designed for such procedures. No suggested test or procedure should be carried out unless, in the user's professional judgment, its risk is justified. Whoever applies products, procedures, and therapies shown or described in this work will do this at their own risk. Because of rapid advances in the medical sciences, AO recommends that independent verification of diagnosis, therapies, drugs, dosages, and operation methods should be made before any action is taken.
Although all advertising material which may be inserted into the work is expected to conform to ethical (medical) standards, inclusion in this publication does not constitute a guarantee or endorsement by the publisher regarding quality or value of such product or of the claims made of it by its manufacturer.
Legal restrictions
This work was produced by AO Publishing, Davos, Switzerland. All rights reserved by AO Publishing. This publication, including all parts thereof, is legally protected by copyright. Any use, exploitation or commercialization outside the narrow limits set forth by copyright legislation and the restrictions on use laid out below, without the publisher's consent, is illegal and liable to prosecution. This applies in particular to photostat reproduction, copying, scanning, or duplication of any kind, translation, preparation of microfilms, electronic data processing, and storage such as making this publication available on Intranet or Internet.
Some of the products, names, instruments, treatments, logos, designs, etc. referred to in this publication are also protected by patents and trademarks or by other intellectual property protection laws (eg, "AO", "ASIF", "AO/ASIF", "TRIANGLE/GLOBE Logo" are registered trademarks) even though specific reference to this fact is not always made in the text. Therefore, the appearance of a name, instrument, etc. without designation as proprietary is not to be construed as a representation by the publisher that it is in the public domain.
Restrictions on use: The rightful owner of an authorized copy of this work may use it for educational and research purposes only. Single images or illustrations may be copied for research or educational purposes only. The images or illustrations may not be altered in any way and need to carry the following statement of origin "Copyright by AO Publishing, Switzerland".
Printed in Switzerland.
ISBN 978-3-13-152881-0
Table of contents
1 About numbers
2 Errors and uncertainty
3 Outcome selection
4 The perfect database
5 How to analyze your data
6 Present your data
7 Glossary
Handbook: Statistics and Data Management
I Introduction
Dear reader
Applying new technologies, presenting your experience, and, of course, thinking of your current practice in the light of new evidence, needs some understanding of clinical research methodology. Your job is to save lives and limbs, and you are doing that perfectly; nobody wants you to become a statistician for a reasonable and clear evaluation of your results. But we need to share a common language, and reach consensus on basic scientific principles.
Dirk Stengel
Mohit Bhandari
Beate Hanson
II Foreword
Whether you love or hate statistics, you need it for clinical decision making, for counseling patients and their relatives, and to argue with those who decide which health care interventions will appear or remain on the market, or even in your hospital. You need statistical knowledge to make your way through the immense and ever-growing body of scientific literature, and, of course, to plan and conduct your own research.
Research is an integral part of being a doctor: historically, today, and, far more important, tomorrow. As an orthopaedic or trauma surgeon, you offer a precious good: your skills and your commitment to your patients and the society. Sharing both your expertise and skepticism with the clinical and scientific community is important to bring this discipline forward. Take the helm, and participate in research actively.
David L Helfet
III Contributors
Editors
Dirk Stengel, MD, PhD, MSc
Head of the Center for Clinical Research
Department of Trauma and Orthopaedics
Unfallkrankenhaus Berlin
Warener Strasse 7
12683 Berlin, Germany

Mohit Bhandari, MD, MSc, FRCS
McMasters University
Epidemiology and Orthopaedics
1200 Main Street West
Hamilton, Ontario, L8N 3Z5, Canada

Beate Hanson, MD, MPH
Director of AO Clinical Investigation and Documentation
AO Clinical Investigation and Documentation
Stettbachstrasse 6
8600 Dübendorf, Switzerland
Authors
Laurent Audigé, PD Dr (DVM, PhD)
Group leader Methodology
AO Clinical Investigation and Documentation
Stettbachstrasse 6
8600 Dübendorf, Switzerland

Kai Bauwens, MD
Senior Consultant Surgeon
Unfallkrankenhaus Berlin
Department of Trauma and Orthopaedics
Warener Strasse 7
12683 Berlin, Germany

Mohit Bhandari, MD, MSc, FRCS
McMasters University
Epidemiology and Orthopaedics
1200 Main Street West
Hamilton, Ontario, L8N 3Z5, Canada

Axel Ekkernkamp, MD, PhD
Director
Unfallkrankenhaus Berlin
Department of Trauma and Orthopaedics
Warener Strasse 7
12683 Berlin, Germany
Professor of Surgery
Department of Trauma and Orthopaedics
University Hospital of Greifswald
Sauerbruchstrasse
17475 Greifswald, Germany

Norbert P Haas, Prof Dr med
Charité Universitätsmedizin Berlin
Centrum für Muskuloskeletale Chirurgie
Campus Virchow-Klinikum
Augustenburger Platz 1
13353 Berlin, Germany

Thomas Kohlmann, PhD
Professor and Director
Institut für Community Medicine
Abteilung Methoden der Community Medicine
Walter Rathenau Strasse 48
17487 Greifswald, Germany

Peter Martus, PhD
Professor and Director
Charité Universitätsmedizin Berlin
Campus Benjamin Franklin
Institut für Medizinische Informatik, Biometrie und Epidemiologie
Hindenburgdamm 30
12200 Berlin, Germany

Jörn Moock, PhD
Institut für Community Medicine
Abteilung Methoden der Community Medicine
Walter Rathenau Strasse 48
17487 Greifswald, Germany

Dirk Stengel, MD, PhD, MSc
Head of the Center for Clinical Research
Department of Trauma and Orthopaedics
Unfallkrankenhaus Berlin
Warener Strasse 7
12683 Berlin, Germany

Michael Suk, MD, ID, MPH
Assistant Professor
University of Florida
Director, Orthopaedic Trauma Service
College of Medicine Jacksonville
655 West Eight Street, 2nd Floor ACC
Jacksonville, FL 32209, USA
1 About numbers
1 Introduction
4 Mean versus median
5 Proportions, rates, odds, risks, and ratios
7 Summary
Dirk Stengel, Laurent Audigé
1 Introduction
To ease communication with your colleagues, you have surely already acquired a personal dictionary of acronyms, synonyms, and abbreviations. As a surgeon dealing with musculoskeletal injuries and diseases you are familiar with terms like ED, OR, CT, MRI, Ex Fix, or ORIF (emergency department, operating room, computed tomography, magnetic resonance imaging, external fixator, open reduction and internal fixation). Correct use of these terms facilitates communication and forms an important element of your professionalism. You may, however, meet with problems when doing business elsewhere without adapting your vocabulary. Medical language and terminology can be confusing, and similar terms may have very different meanings.
Numbers have a fascinating attribute: they are unequivocally recognized as such by clinicians and researchers, other health care professionals, your patients, and everybody, regardless of their background, affiliation, or nationality. The language of numbers is global, so it is the perfect language of science. You may use numbers to encrypt the tons of information you collect about your patients in daily practice, to describe their demographic profile and individual risks, and the results of your treatment. However, using the correct code and choosing the appropriate numbers is essential to compile, handle, and process clinical information.
Binary (dichotomous) data
The simplest type of information imaginable may be stored in the form of data variables having only two possible categories, such as yes or no, one or zero, male or female, left or right, the presence or absence of a disease or an injury. Such variables are called binary or dichotomous. Although categories may be expressed in words, the data may be stored as numbers or binary information (Fig 1-1). Chapter 5 "How to analyze your data", chapter 6 "Present your data", and cross-tables will focus on the utility of binary information.
Age    Male gender*
23     1
35     1
42     0
52     1
65     0
* Male gender: true = 1, false = 0
[Panel b: illustration labeled "intact" and "broken".]
Fig 1-1a–b
a Example of categories expressed in numbers.
b Example of categories expressed in words.
Categorical data
A fracture of the radial bone may occur in its proximal, mid-, and distal third, which has implications on the treatment, but not necessarily on the outcome. The anatomical classification is value-free, which
A variable called "fracture localization" may be stored as:
1 = proximal
2 = midshaft
3 = distal
Fig 1-2a–b
a Characteristic of the categorical data is that it is value-free.
b Categorical data can be numbered according to the requirements.
Ordinal data
There are, however, categories which can be placed in distinct order (ie, category B is worse than category A). This type of data is called ordinal data (Fig 1-3). Within the Müller AO Classification of Fractures in Long Bones, a complex, intraarticular fracture with multiple fragments and alteration of the cartilage layer (type C) has a worse functional prognosis than an extraarticular type A fracture. Other examples are the American Society of Anesthesiologists (ASA) risk classification scheme (ASA I–V) or the Gustilo-Anderson grading of open fractures.
Ordinal data variables very often have a limited number of possible categories, as in the latter clinical grading systems. These variables are also said to be noninterval because the intervals between adjacent categories (often expressing prognostic information) may not be equal. The difference between type C and type B fractures may not be the same as that between type B and type A fractures.
[Fig 1-3: illustration with panels labeled A, B, and C.]
Continuous data
Finally, data variables may be used to store information from counts or measures that, in principle, can take an infinite number of values within clinically plausible ranges. If you are interested in the treatment of osteoporotic fractures, the T-score obtained by dual energy x-ray absorptiometry (DEXA) is a good example of a continuous measure with obvious prognostic impact (Fig 1-4).
Fig 1-4 The T-score obtained by a DEXA is a good example of continuous measure with
obvious prognostic impact.
Counts are integers
Some continuous data can simply be counted.
Measures are intrinsically nonintegers
Some continuous data require measuring (eg, a weight of 79.8 kg).
Data variables stored as numbers thus contain information of different complexity. The complexity of information increases from binary to continuous values. They may describe:
• A certain fact with valuation (someone has 2 or 5 children)
• A certain fact without valuation (someone wears a blue, white, or red jacket)
• A scenario with prognostic impact (someone needs 2 or 5 units of packed red blood)
• Or distinguish between two different clinical situations (a patient with a femoral fracture has a blood pressure of 60/35 mm Hg or 130/80 mm Hg)
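The variable types described in this section can be sketched in a few lines of Python. This is only an illustration of how one might store them; the class and field names (`FractureLocalization`, `FractureType`, `PatientRecord`) are hypothetical and do not come from the handbook.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class FractureLocalization(Enum):
    """Categorical: the codes are labels only and carry no value."""
    PROXIMAL = 1
    MIDSHAFT = 2
    DISTAL = 3


class FractureType(IntEnum):
    """Ordinal: A < B < C in severity, but the steps need not be equal."""
    A = 1
    B = 2
    C = 3


@dataclass
class PatientRecord:  # hypothetical record layout
    male: bool                           # binary (true = 1, false = 0)
    localization: FractureLocalization   # categorical
    fracture_type: FractureType          # ordinal
    units_packed_red_blood: int          # count (integer)
    weight_kg: float                     # measure (noninteger)


record = PatientRecord(True, FractureLocalization.DISTAL, FractureType.C, 2, 79.8)

print(int(record.male))                        # binary value stored as a number
print(record.fracture_type > FractureType.A)   # ordering is meaningful for ordinal data
```

Using a plain `Enum` for the categorical variable is a deliberate choice in this sketch: its members cannot be compared with `<` or `>`, which mirrors the idea that such categories are value-free.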
3.1 Patient listing versus summary statistics
According to Albert Einstein's famous quote, everything should be made as simple as possible, but not simpler. You spend much time collecting data of varying complexity so, in conducting your analysis, do not hastily rip them to pieces, nor squeeze them into rough classes or categories. In doing so, you may miss subtle, but important associations.
In a scientific article, and with a small sample size of 20 patients, you may have two different ways (Table 1-1 and Table 1-2) of presenting the demographics of your patients.
Patient  Gender  Age (years)  Duration of surgery (minutes)  Units of packed red blood
1        male    18           91                             0
2        female  25           49                             0
3        male    49           68                             1
4        male    58           71                             2
5        male    71           63                             5
6        female  50           55                             3
7        female  40           109                            0
8        male    31           45                             0
9        male    69           60                             0
10       male    54           67                             1
11       male    58           90                             1
12       male    82           84                             2
13       male    19           56                             4
14       female  47           79                             0
15       male    31           64                             0
16       female  59           102                            0
17       male    67           61                             3
18       male    69           53                             2
19       male    73           50                             1
20       female  84           47                             1
Table 1-1 Patient listing with individual patient data and integer values.
Characteristic
Gender (n)
  Male                                         14
  Female                                       6
Mean age (years)                               52.7
Mean duration of surgery (minutes)             68.2
Mean number of units of packed red blood       1.3
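The summary statistics can be recomputed directly from the patient listing in Table 1-1. The sketch below assumes the four data columns of that listing are gender, age in years, duration of surgery in minutes, and units of packed red blood; the variable names are illustrative.

```python
from statistics import mean

# Patient listing from Table 1-1:
# (gender, age in years, duration of surgery in minutes, units of packed red blood)
patients = [
    ("male", 18, 91, 0), ("female", 25, 49, 0), ("male", 49, 68, 1),
    ("male", 58, 71, 2), ("male", 71, 63, 5), ("female", 50, 55, 3),
    ("female", 40, 109, 0), ("male", 31, 45, 0), ("male", 69, 60, 0),
    ("male", 54, 67, 1), ("male", 58, 90, 1), ("male", 82, 84, 2),
    ("male", 19, 56, 4), ("female", 47, 79, 0), ("male", 31, 64, 0),
    ("female", 59, 102, 0), ("male", 67, 61, 3), ("male", 69, 53, 2),
    ("male", 73, 50, 1), ("female", 84, 47, 1),
]

genders = [p[0] for p in patients]

# One decimal place is enough for these summary values.
summary = {
    "male": genders.count("male"),
    "female": genders.count("female"),
    "mean_age_years": round(mean(p[1] for p in patients), 1),
    "mean_surgery_minutes": round(mean(p[2] for p in patients), 1),
    "mean_blood_units": round(mean(p[3] for p in patients), 1),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The listing keeps every individual value, while the summary condenses 80 numbers into five; which presentation is better depends on how much detail the reader needs.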
Values with more than two decimal places are rarely needed. Many periodicals demand four decimal places for P values (which definitely makes sense).
3.2 Simplification of data
Occasionally, it can be useful and necessary to reduce the complexity of information. This may happen when you generate groups of subjects from continuous data in a clinical study, or when you combine categories of categorical data.
Limitations: Especially with continuous measures, small units can produce spurious, clinically irrelevant associations between the variables of interest.
Consider the categorization of continuous data
Older patients may have more severely impaired shoulder function after fractures of the proximal humerus than younger subjects.
[Fig 1-5: panel a plots impairment according to the Constant-Murley score against age in years (0 to 100); panel b shows the same score by age group (<55, 55–65, 66–75, >75 years).]
Fig 1-5a–b
a Native data may describe more details.
b Categories clarify the message.
Appropriate categories
Building appropriate categories is a trade-off between clinical and statistical reasoning.
Example: You may be interested whether patients with grade III A open fractures have poorer functional outcomes than those with grade I open fractures. However, your group of 60 patients with open fractures may comprise the following fractures:
2 grade III B
5 grade III A
14 grade II
39 grade I
In this case, it can be necessary to rethink your original question.
You might consider grouping patients with grade II, III A, and III B fractures to generate two samples of reasonable size.
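The regrouping suggested in this example amounts to simple arithmetic, sketched here in Python with the counts from the text. The dictionary names are illustrative only.

```python
# Counts of open fractures by grade, taken from the example above
grade_counts = {"I": 39, "II": 14, "III A": 5, "III B": 2}

# Merge grades II, III A, and III B into one group and keep grade I separate
two_groups = {
    "grade I": grade_counts["I"],
    "grade II to III B": sum(n for grade, n in grade_counts.items() if grade != "I"),
}

print(two_groups)
```

The original comparison would have set 5 patients against 39; after merging, the two samples contain 39 and 21 patients.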
[Fig 1-6: panels showing the injury categories (slight, moderate, severe, very severe), the median dividing the sample into two halves of 50% each (slight and moderate versus severe and very severe), and the interquartile range.]
Fig 1-6a–c 40 patients with multiple injuries, graded according to the injury severity score (ISS).
a The percentiles present natural thresholds which separate your data into groups of similar size.
b Dividing your sample along the median will generate two groups of equal size: 20 patients with slight and moderate injuries, and another 20 patients with severe and very severe injuries.
c The central portion (or the sirloin) of your data is described by the interquartile range (IQR). The IQR encompasses those 50% of patients with moderate or severe injuries. This data body has two appendices on either side, each including 25% of patients with less or more severe injuries.
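The median split and the interquartile range described in this caption can be sketched with Python's standard library. The ISS values below are hypothetical (10 patients in each of four severity categories); they are not data from the handbook.

```python
from statistics import median, quantiles

# Hypothetical ISS values for 40 patients, 10 per severity category
iss = [9] * 10 + [17] * 10 + [29] * 10 + [41] * 10

# The median cuts the sample into two groups of equal size
med = median(iss)
lower_half = [x for x in iss if x < med]
upper_half = [x for x in iss if x > med]

# The first and third quartiles bound the interquartile range (IQR)
q1, _, q3 = quantiles(iss, n=4)
central = [x for x in iss if q1 <= x <= q3]

print(len(lower_half), len(upper_half))   # patients below and above the median
print(len(central))                       # patients inside the IQR
```

With these values, half of the patients fall on either side of the median and half lie inside the IQR, leaving a quarter of the sample on each side.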
4 Mean versus median
Although everybody is keen on the normal distribution, most data are skewed in one direction or the other. If you are recruiting patients onto a clinical study, you will not enroll toddlers. Thus, the age distribution in such a sample will be oriented toward older patients (in the easterly direction). Most patients in your emergency department will have a systolic blood pressure of 120 mm Hg, with few presenting near-lethal values of 200 mm Hg or higher. In this case, your data cloud will be geared toward lower values (in the westerly direction).
The mean, or sample average, is susceptible to even small disturbances at the edges of your sample of values (Fig 1-7).
[Fig 1-7: injury severity score (ISS, axis 0 to 80) for Center A and Center B, with the annotations "12 patients with lethal injuries", "Mean: 29", "Mean: 26", and "Median: 24".]
Fig 1-7 Consider a study in which the injury severity of patients admitted to two different trauma centers is investigated.
• Center B took care of a higher proportion of more severely injured patients.
• However, note that a similar number of patients (50%) with an ISS up to 24 were treated at both institutions.
• So the median, cutting the sample in halves, is 24 in either group. Reporting only the median would obscure the obvious imbalance.
• Providing the mean ISS fully illustrates the difference between the two cohorts (26 in Center A, and 29 in Center B).
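The different behavior of mean and median can be reproduced with a small, hypothetical example. The ISS values below are invented for illustration and are not the data behind Fig 1-7; the two samples differ only in their most severely injured patient.

```python
from statistics import mean, median

# Hypothetical ISS values; only the last patient differs between the centers
center_a = [9, 16, 22, 24, 24, 27, 29, 34]
center_b = [9, 16, 22, 24, 24, 27, 29, 75]

for name, values in (("Center A", center_a), ("Center B", center_b)):
    print(name, "median:", median(values), "mean:", round(mean(values), 1))
```

The median is identical in both samples, while the mean shifts upward as soon as a single extreme value enters the sample.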
5 Proportions, rates, odds, risks, and ratios
Rates
A rate expresses the relationship between two variables with different units (like miles per hour, or beats per second).
Typical rates:
• The incidence rate (the number of first events of a certain disease or injury per number of person-years)
• You may also trade off costs and complications (eg, 10,000 US$ per surgical site infection).
Proportions and risks
Proportions and risks have very similar characteristics, and the difference in naming is mainly related to the event of interest. Both describe the relationship between two variables with similar units, like the number of patients with a certain event or condition among all studied patients. One may provide numerators and denominators (eg, 1/1,000), or a percentage (eg, 0.1%).
There are two different ways of expressing frequency of events. Fig 1-8a describes how many events have occurred in a certain number of patients (here: 4 of 10) and in a certain interval of observation. Fig 1-8b shows that each of these patients had a different time of exposure (expressed by the "time tail"). Note that there are very different patterns in the relationship between exposure time (for example, the duration of antibiotic treatment for joint infections) and events (for example, recurrent infection):
[Fig 1-8: panels a and b, both plotted against time; legend: patients without event, patients with event, "time tail". The numbers 1 to 4 in panel b mark the patterns listed below.]
Fig 1-8a–b
a Certain number of patients, certain interval of observation (4/10).
b Each patient with different time of exposure.
1 Short exposure time, without event (these patients may have dropped from the analysis because they could no longer be reached).
2 Long exposure time, without an event.
3 Short exposure time, with event (in case of antibiotic treatment, those patients would represent clear treatment failures).
4 Long exposure time, with event.
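The two ways of expressing event frequency can be contrasted in a short Python sketch. The exposure times below are hypothetical; only the count of 4 events in 10 patients follows the figure.

```python
# Hypothetical follow-up data: (exposure time in years, event occurred)
follow_up = [
    (0.5, False), (2.0, False), (0.4, True), (1.8, True), (2.0, False),
    (1.0, False), (0.6, True), (2.0, False), (1.5, True), (0.2, False),
]

events = sum(1 for _, had_event in follow_up if had_event)

# Proportion (risk): events among all patients, ignoring exposure time
proportion = events / len(follow_up)

# Incidence rate: events per total person-time at risk
person_years = sum(time for time, _ in follow_up)
incidence_rate = events / person_years

print(f"{events} of {len(follow_up)} patients had an event ({proportion:.0%})")
print(f"{incidence_rate:.2f} events per person-year")
```

The proportion treats every patient alike, whereas the rate gives more weight to patients who were observed for longer.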
Odds
Odds are defined as the likelihood of a thing occurring rather than not occurring. They differ from a risk in that the denominator does not include the patients with the condition. If 5 in 100 patients had a certain exposure, the corresponding odds is 5:95 or 0.053.
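The difference between a risk and odds lies only in the denominator, as this minimal Python sketch shows. The function names `risk` and `odds` are illustrative.

```python
def risk(events: int, total: int) -> float:
    """Events divided by all patients."""
    return events / total


def odds(events: int, total: int) -> float:
    """Events divided by the patients without the event."""
    return events / (total - events)


print(risk(5, 100))             # 5 in 100
print(round(odds(5, 100), 3))   # 5:95
```

For rare events the two values are close, because removing a few patients from the denominator changes it very little.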
Ratios
Ratios are used to describe the relative effect of a certain intervention compared to another.
There are two main types of ratios:
• Risk ratio (RR)
• Odds ratio (OR)
These ratios are intrinsically tied to the underlying study design, so it is necessary to go a little into details of study design.
Study design
The association between a certain intervention or exposure and the target outcome can be investigated on two different timelines (Fig 1-9):
• Retrospectively: using available information from patient records and hospital charts
• Prospectively: by beginning with data collection as soon as patients enter the institution and following them up for a specified interval
Cohort study
If you set out to compare the bone healing rates after fixation of distal tibia fractures by intramedullary nails compared to interlocking plates, you will think of the following chronology, in the typical order of a cohort study:
1) Sampling your patients
2) Assigning them to one or the other fixation method
3) Obtaining x-rays after 6 months to determine fracture consolidation
In a retrospective cohort study, you start with identifying patients that had been treated at your institution, eg, in the hospital's admin-
[Fig 1-9a–b: flow diagrams. In both panels, patients with distal tibia fractures are treated with a locking plate or an intramedullary nail, and each treatment group ends in union or nonunion. Panel a: prospective. Panel b: retrospective.]
Fig 1-9a–b Cohort studies start with intervention and end with outcomes.
In a prospective cohort study (and only in a prospective cohort study) the individual treatment can be assigned by chance (which makes up the randomized controlled trial).
Case-control study
Cohort studies are suitable if you have sufficient numbers of patients with a common disease or injury. However, if you are interested in whether a certain exposure, such as smoking, has any influence on a rare outcome (eg, epidural infection after dorsoventral stabilization of a lumbar spine fracture), a cohort study is not the appropriate design. You may spend the rest of your life waiting for enough patients with the rare targeted outcome in a prospective cohort study. You may also go crazy by scanning thousands of patient records to identify smokers and nonsmokers who underwent spine surgery and did or did not develop epidural infection.
In this situation, you may start with collecting all epidural infections first, without knowing whether these patients are smokers or not. You may then identify patients with similar characteristics (same gender, age, body mass index, and so on) who underwent the same surgery with uneventful recovery. Finally, you identify how many smokers and nonsmokers were in either group. This approach characterizes the case-control study. Case-control studies are always retrospective.
There is much confusion with the definition of case-control studies, especially in the orthopaedic literature. Please keep in mind that the term "case" always refers to patients who reached a certain endpoint (eg, nonunion), whereas "control" always indicates those who did not reach that endpoint (eg, those who united uneventfully). "Case" and "control" do not describe the treatments under investigation (eg, intramedullary nails versus locking plates).
[Fig 1-9c: retrospective flow diagram. Patients with distal tibia fractures are first grouped by outcome (nonunion or union); each outcome group is then traced back to the treatment received (locking plate or intramedullary nail).]
Fig 1-9c Case-control studies start with outcomes and end with intervention.
Each study type has its appropriate ratio metric. In a cohort study, the relative effect can be expressed as a risk ratio (also called relative risk) or an odds ratio. In a case-control study, only odds ratios can be used.
Treatment      Event
               Rerupture   Healed uneventfully   Total
Suturing       4           96                    100
Conservative   8           92                    100
Total          12          188                   200

b                                          Suturing        Conservative
RR compared to the alternative treatment   4%/8% = 0.5     8%/4% = 2.0
1 Ab o u t n u m b e rs
OR a n d RR in a la rge t ria l
Th an ks to th e su ccess of you r stu dy, you were awarded a research
gran t, an d patien ts are keen on participatin g in a m u ch larger trial.
Again , you r en ergy paid off in th e com plete follow-u p of 1,000 patien ts
each (con gratu lation s!).
25
Ha n d b o o k—Sta tis tics a n d Da t a Ma n a ge m e n t
Treatment      Event
               Rerupture   Healed uneventfully   Total

                                           Suturing            Conservative
RR compared to alternative treatment       0.4%/0.8% = 0.5     0.8%/0.4% = 2.0
Table 1-4 The larger the trial, the closer the OR gets to the RR.
The RR remains unchanged at 0.5 in favor of the suturing group or at 2.0 in disfavor of the conservative treatment group. The OR is getting closer and closer to the RR with the increasing rarity of events.
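The convergence of the OR toward the RR can be checked numerically. The sketch below assumes that the stated risks of 0.4% and 0.8% in 1,000 patients per group correspond to 4 and 8 reruptures; these counts are derived from the percentages and are not printed in the table.

```python
def risk_ratio(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    return (events_a / n_a) / (events_b / n_b)


def odds_ratio(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    return (events_a / (n_a - events_a)) / (events_b / (n_b - events_b))


# First trial: 4 and 8 reruptures in 100 patients each
rr_small = risk_ratio(4, 100, 8, 100)
or_small = odds_ratio(4, 100, 8, 100)

# Larger trial (assumed counts): 4 and 8 reruptures in 1,000 patients each
rr_large = risk_ratio(4, 1000, 8, 1000)
or_large = odds_ratio(4, 1000, 8, 1000)

print(f"small trial: RR {rr_small:.3f}, OR {or_small:.3f}")
print(f"large trial: RR {rr_large:.3f}, OR {or_large:.3f}")
```

The RR is the same in both trials, while the OR moves closer to it as the event becomes rarer.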
Perhaps the simplest, yet most important and clinically relevant statistic to be calculated from the 2 × 2 table and risk estimates is the risk difference (RD). In our first Achilles tendon study example, the RD is 8% – 4% = 4% (Table 1-3a). In other words: Suturing reduces the absolute risk of sustaining a rerupture of the Achilles tendon by 4% compared to conservative treatment.
In our second Achilles tendon study example with increased sample sizes, the risk difference now is only 0.8% – 0.4% = 0.4%, leading to a NNT of 1/0.4% = 250 (Table 1-3b).
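The risk difference and the NNT (number needed to treat) follow from the two risks by simple arithmetic. The Python sketch below applies the calculation 1/RD used in the text to both study examples; the function names are illustrative.

```python
def risk_difference(risk_control: float, risk_treated: float) -> float:
    """Absolute reduction in risk achieved by the treatment."""
    return risk_control - risk_treated


def number_needed_to_treat(rd: float) -> float:
    """Reciprocal of the risk difference."""
    return 1 / rd


# First example: rerupture risk 8% (conservative) versus 4% (suturing)
rd_small = risk_difference(0.08, 0.04)
nnt_small = number_needed_to_treat(rd_small)

# Second example: rerupture risk 0.8% versus 0.4%
rd_large = risk_difference(0.008, 0.004)
nnt_large = number_needed_to_treat(rd_large)

print(f"RD {rd_small:.1%}, NNT {nnt_small:.0f}")
print(f"RD {rd_large:.1%}, NNT {nnt_large:.0f}")
```

Although the RR is 0.5 in both examples, the NNT rises tenfold when the baseline risk drops tenfold, which is why the absolute measures add information the ratios cannot give.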
7 Summary
2 Errors and uncertainty
1 Introduction
2 Descriptions of uncertainty
2.1 Accuracy and precision
2.2 Randomization
2.3 Types of error
2.4 Comparison and contrast
4 Distributions
4.1 Normally distributed data
4.2 Skewed data
4.3 Other distributions
5 Standard deviation versus standard error
5.1 Standard deviation
5.2 Standard error
6 Summary
Peter Martus, Richard E Buckley
1 Introduction
2 Descriptions of uncertainty
2.1 Accuracy and precision
Uncertainty, variability, and error are integral parts of science. Unavoidable as they are, and in some instances desirable, they should be expressed and handled in a qualitative and quantitative manner.
[Fig 2-1: panel a shows the SF-36 PCS (axis 0 to 100) for Study 1 and Study 2; panel b plots the SF-36 PCS (axis 0 to 100) against the study number (0 to 30).]
Fig 2-1a–b Different studies determining the health-related quality of life after fracture treatment.
a Studies 1 and 2 have similar mean values, indicating restoration of physical function to norm
values. Study 1 has a wide distribution of values, making the estimate inaccurate. Study 2 shows
high accuracy of the estimate because the distribution of values is narrow.
b Thirty further studies, each of two different fracture treatments. Repeated studies indicated
by a solid dot consistently come up with almost similar results in one treatment group.
The precision of this treatment effect is high. In contrast, highly variable results are observed
with studies in the other treatment group indicated by circles. It is uncertain whether an
observation can be reproduced in a subsequent trial.
Surgeon B inserted all nails through the trochanteric fossa. He achieved
high precision, but low accuracy, since all insertions were made away
from the correct entry point. There may be two different reasons
for this: failure, and systematic error (bias). First, he may not have
read the instruction manual and failed to use the implant correctly
because he did not know how. Second, his usual access route and
patient positioning may conflict with the tip entry. He may insert a
nail through the fossa blindly, but still needs to adapt his technique to
the new implant. Until he realizes this, the shape of the new rod may
cause problems in the distal part of the femur and worse outcomes
compared to the established implant, not because of inadequate
hardware, but due to surgeon-related bias. The surgeon may also
have mistakenly entered the nail through the fossa, despite having
planned to target the tip.
Fig 2-2 Entry points of a new tip-entry femoral nail achieved by four different surgeons.
Surgeon A: both high accuracy (in aiming at the correct entry point) and precision (low variability).
Surgeon B: high precision, but low accuracy (all insertions away from the correct entry point).
Surgeon C: nails may have been inserted accurately, but with low precision.
Surgeon D: all entry points away from the correct site with high variability.
2.2 Randomization
There are many objections to randomized controlled trials (RCTs) in
trauma and orthopaedic surgery, most of which are unfounded. If
you have two treatments, and do not know which performs better in
a typical clinical setting (eg, in a certain type of fracture), there is
no easier and better way to find out than by an RCT.
Unfortunately, although the patients’ age, gender, and even the duration
of surgery were well balanced, there were clearly more smokers in
the nailing group. It is thus unclear whether the intervention or the
smoking caused the higher rate of nonunion.
The list of potential confounders is almost endless, and can only
contain those that are known and measurable. There may be distinct
genetic factors that contribute to bone healing, but genetic profiling
cannot be performed on a general basis.
[Fig 2-3: flow diagram. Study population, randomized allocation (R), Intervention A (event yes / event no) and Intervention B (event yes / event no), comparison.]
Fig 2-3 The RCT distributes known (yellow dots) and unknown (blue dots) confounders equally to study arms.
R = Randomized allocation
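The balancing effect of random allocation can be sketched with a small simulation. The population size (1000 patients) and the share of smokers (about 30%) are illustrative assumptions, not data from any study, and the function names are hypothetical. Smoking status is never used for allocation, yet the share of smokers tends to come out similar in both arms.

```python
import random

def randomize(patients, seed=42):
    """Allocate each patient to arm 'A' or 'B' by simple randomization (a fair coin flip)."""
    rng = random.Random(seed)
    arms = {"A": [], "B": []}
    for patient in patients:
        arms[rng.choice("AB")].append(patient)
    return arms

def smoker_share(arm):
    """Proportion of smokers in one study arm."""
    return sum(p["smoker"] for p in arm) / len(arm)

# Hypothetical study population: roughly 30% smokers
population_rng = random.Random(1)
patients = [{"id": i, "smoker": population_rng.random() < 0.30} for i in range(1000)]

arms = randomize(patients)
for name, arm in arms.items():
    print(name, len(arm), round(smoker_share(arm), 3))
```

With small samples the balance is less reliable, which is one reason why baseline characteristics are still reported for each arm.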
2.3 Types of error
You may have heard about type I (alpha) and type II (beta) errors.
Understanding their meaning makes it much easier to plan a study,
and to interpret its result. Since this is a handbook, not a textbook,
put the following descriptions in your trial toolbox, and follow them
wisely from the very beginning of your project. Both types of error
must be specified together with your study hypothesis, and before
starting patient recruitment.
Fig 2-4 The type I error (alpha error) can be compared to a fire detector that raises
the alarm although nothing is burning. The type II error (beta error) is the false-negative
counterpart: there is a fire, but the fire detector keeps silent.
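Both error types can be made tangible by simulating many hypothetical studies. The sketch below assumes two arms of 50 patients, a known standard deviation of 15 score points, a two-sided z-test at alpha = 0.05, and a true difference of either 0 or 10 points. All of these numbers and names are illustrative assumptions, not values from this chapter.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(true_diff, n, sd, alpha, rng):
    """Simulate one two-arm study and test H0 'no difference' with a two-sided z-test."""
    a = [rng.gauss(0.0, sd) for _ in range(n)]
    b = [rng.gauss(true_diff, sd) for _ in range(n)]
    se = sd * (2 / n) ** 0.5              # standard error of the difference (SD assumed known)
    z = (mean(b) - mean(a)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_diff, n=50, sd=15.0, alpha=0.05, runs=2000, seed=7):
    """Share of simulated studies that declare a significant difference."""
    rng = random.Random(seed)
    return sum(z_test_rejects(true_diff, n, sd, alpha, rng) for _ in range(runs)) / runs

type_1 = rejection_rate(true_diff=0.0)        # false alarms: no true difference exists
type_2 = 1 - rejection_rate(true_diff=10.0)   # silent detector: a true difference is missed
print(f"type I error rate ~ {type_1:.3f}, type II error rate ~ {type_2:.3f}")
```

The simulated type I error rate should stay close to the chosen alpha, whereas the type II error rate depends on sample size, variability, and the size of the true difference.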
2.4 Comparison and contrast
If you stabilize a fracture type A3 (according to the Müller AO
Classification of Fractures in Long Bones) of the distal radius with
a volar locking plate, and all patients show a good to excellent outcome,
the next question must be: In comparison to what alternative:
a T-plate, a Pi-plate, or external fixation? Or even compared to
conservative management?
Fig 2-5 Contrasts in clinical studies can be understood as demonstrated in this color chart.
A It is clear that there is a difference between colors at the extreme left and right ends of the bar.
B The more we move our focus from the ends toward the center, the more we face difficulties
distinguishing between the different shadings. Although there is still a measurable difference
in brightness, it is no longer visually recognizable.
A scientific experiment rarely comes up with a black or white result,
but with a likely range of observations. The more the ranges of observations
made with two different interventions overlap, the more the difference
becomes indistinguishable. The greater the difference, the less likely
it is to disappear in a cloud of overlapping observations (Fig 2-6).
3.1 The measurement
If we repeat a certain measurement, eg, an x-ray of lower limbs of
the same person under identical conditions, by the same radiologist,
and at almost the same time, we can assume that variations are due
to the measurement device. In our example, this would mean that
there are sources of error somewhere within the technical process
from the lens to the final image.
3.2 The observer/reader
It is the aim of each measurement process to make the method
independent of the observer. However, there are many reasons why
measurements may be interpreted differently by different readers.
With radiological images, experienced observers (in contrast to
beginners) clearly differentiate between artifacts and true findings.
A fracture may be differently classified by different observers, and
if the severity of a fracture has prognostic implications, this will
influence the interpretation of outcomes.
3.3 The subject
There might be changes of the “true” measurement value within one
subject over time. If we only have one measurement per patient, these
changes will contribute to the total variability seen in our sample.
It may well be that each measurement would oscillate in the long
term for one patient even though there is no trend toward larger or
smaller values over time. Short-term variability might be reduced by
multiple measurements, if possible.
3.4 The population
The final source of variability is different from the others. Variability
between subjects is a biological phenomenon. Populations demonstrate
heterogeneity of subjects. This in itself might be the focus of interest
in a study.
4 Distributions
Values obtained in a clinical study always show a distinct distribution.
They may be distributed symmetrically around the mean, or show
certain peaks and tails. We need to know how data are distributed
before we can decide the appropriate summary measure (see also
chapter 1 “About numbers”, subchapter 4 “Mean versus median”),
whether statistical testing makes sense, and which type of test is
suitable for statistical analysis.
4.1 Normally distributed data
The typical bell shape of empirical data distribution is well known.
From a theoretical point of view, we assume that data samples taken
from a target population are normally distributed for the variable of
interest. This theoretical distribution is an idealization of the true
distribution in the target population.
[Fig 2-7: histogram. x-axis: difference in DASH scores (-40 to 60); y-axis: percent (0 to 50).]
Fig 2-7 Differences in DASH scores between baseline and 1-year follow-up assessments in
patients with conservatively treated fractures of the proximal humerus. The bell shape is not
perfect, but many analysts would agree that, in this case, we could have used statistical
methods for normally distributed data.
4.2 Skewed data
A bell-shaped curve cannot be found with all variables of interest.
When analyzing raw DASH scores after 1 year instead of differences
to baseline values, we note a right-tailed (right-skewed) distribution of
values, resembling an exponential distribution. Most patients reported only
slight impairments in shoulder function, and only few had severe
problems (Fig 2-8).
[Fig 2-8: histogram. x-axis: DASH score after 1 year (0 to 80); y-axis: percent (0 to 50).]
4.3 Other distributions
There are many other data distributions. For example, if we are
interested in success rates (eg, the rate of bone healing) or
complications, data follow a binomial distribution. In case of very rare
events, the so-called Poisson distribution applies. There are also
complex, multimodal distributions with two or more peaks, such as
the incidence rates of distal radius fractures that occur frequently
in children, adolescents, and the elderly.
5 Standard deviation versus standard error
The two statistical parameters, standard error and standard deviation,
are often mixed up. To make a long story short, it can be said that
standard errors are used to calculate limits of confidence, whereas
standard deviations serve to calculate normal ranges. This is
demonstrated in the following theoretical example.
Example In the given example of functionally treated fractures of the proximal
humerus, 127 patients were evaluated after 1 year for differences in DASH scores to
baseline levels (see Fig 2-7). We obtain the following information on the data:
• Mean = 10.2
• Standard deviation = 16.5
• Standard error = 1.5
5.1 Standard deviation
The standard deviation (SD) is a measure of how the DASH differences
vary within the population of 127 patients.
Example If data are normally distributed, we can derive a normal range
for these differences by a very useful rule of thumb:
• About 68% of the population are within the interval mean (10.2)
± one standard deviation (16.5).
• About 95% of the population are within the interval mean (10.2)
± two standard deviations (2 × 16.5).
• About 99.7% of the population are within the interval mean (10.2)
± three standard deviations (3 × 16.5).
This is illustrated in Fig 2-9. The differences in DASH scores are between
-22.8 and 43.2 in about 95% of all patients.
Fig 2-9 The mean and the standard deviation give information on data ranges,
if they are distributed normally.
* SD = standard deviation
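The normal ranges for the DASH example can be recomputed in a few lines. This is a minimal sketch that assumes the differences are exactly normally distributed with the reported mean and standard deviation; the variable names are illustrative.

```python
from statistics import NormalDist

mean_diff, sd = 10.2, 16.5   # mean and SD of the DASH differences in the example

dist = NormalDist(mu=mean_diff, sigma=sd)
coverage = {}
for k in (1, 2, 3):
    lower, upper = mean_diff - k * sd, mean_diff + k * sd
    # Share of a normal population lying within mean ± k standard deviations
    coverage[k] = dist.cdf(upper) - dist.cdf(lower)
    print(f"mean ± {k} SD: {lower:.1f} to {upper:.1f} covers {coverage[k]:.1%}")
```

Under these assumptions the three intervals cover roughly 68%, 95%, and 99.7% of the population, and the two-SD interval runs from -22.8 to 43.2.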
5.2 Standard error
What, then, is the meaning of standard errors? We could be satisfied
with our study if we know the average DASH results and their
standard deviations. However, does this mean that, if you repeat the
study, you will obtain the same results? Will the results apply to all
patients you are going to study and treat in the near future?
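A short calculation links the numbers of the example. It assumes that the reported standard error was obtained as SD divided by the square root of the sample size, which reproduces the stated 1.5 after rounding, and it uses the normal quantile 1.96 for an approximate 95% confidence interval. Both choices are assumptions of this sketch, not statements from the text.

```python
from math import sqrt

n, mean_diff, sd = 127, 10.2, 16.5    # values from the DASH example

se = sd / sqrt(n)                      # standard error of the mean
z = 1.96                               # normal quantile for a two-sided 95% interval
ci_low, ci_high = mean_diff - z * se, mean_diff + z * se
print(f"SE = {se:.2f}; 95% CI = {ci_low:.1f} to {ci_high:.1f}")
```

The confidence interval describes the uncertainty of the mean and is therefore much narrower than the normal range derived from the standard deviation, which describes individual patients.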
6 Summary
3 Outcome selection
1 Introduction
2 Validation of outcome measures
2.1 Validity
2.2 Reliability
2.3 Responsiveness
4 Limits and advantages of common outcome measures
4.1 Clinician-based outcomes
4.2 Patient-reported outcomes
4.3 Limiting bias in an outcome evaluation
4.4 Typical dichotomous study endpoints
5 Functional scores
7 Practical issues of selecting appropriate outcome measures
8 Summary
Mohit Bhandari, Michael Suk
1 Introduction
[Figure: labels Patients and condition, Output, Outcome]
As a result, clinicians often settle for a generic health status instrument
such as the short-form health survey questionnaire with 36 questions
(SF-36). Although general measures may be suitable for comparisons
of health, a measure designed to be disease-specific will normally
be more appropriate.
This chapter is intended to familiarize you with clinically relevant
and methodologically sound measures of outcome, and to review
techniques to improve the validity and reproducibility of study
endpoints.
2 Validation of outcome measures
[Fig 3-2: panels “Validity” and “Reliability”; values shown: l = 1.82 m, w = 72 kg; 1.84, 1.86, 1.82, 1.80]
Fig 3 -2 The three key components of a useful and accurate outcome measure. It must be valid,
ie, it can measure precisely what it intends to measure. It must also be reliable, ie, given there
is no change over time, it should come up with the same value after repeated measure. Finally, if
there is change, the instrument should be able to detect this change (responsiveness).
2.1 Validity
An instrument is said to have face validity if it appears to measure
what it is intended to measure. Outcome instruments are constructed
to measure specific variables within a defined patient population and
should only be considered valid for use in relation to that purpose.
For instance, a validated measure of disability for patients with knee
osteoarthritis following total knee arthroplasty cannot automatically
be considered valid for use in patients with distal femoral fractures.
To interpret the validity of an instrument, multiple concepts are
considered. In orthopaedic literature these concepts include the notions
of content, criterion, and construct validity (Fig 3-3).
Content validity
Content validity reflects an instrument’s comprehensiveness. It
examines the ability of the instrument to measure all aspects of the
condition for which it was designed. Generally, as the content of
an instrument increases, the reliability decreases proportionately.
This is very much comparable to diagnostic test research: only few
tests are both highly sensitive and specific at the same time. With
high sensitivity (or high content validity) you will probably not miss
any patient having a certain disease (or any single aspect of the
target condition). However, this is likely to be at the price of low
specificity, or reliability: you might falsely diagnose healthy people
as sick (or measure aspects that, in fact, have no meaning for the
target condition).
Construct validity
In contrast to content and criterion validity, construct validity is a
more quantitative form of assessing the validity of an outcome
instrument. A construct is an item or concept such as disease status,
pain, or disability. Construct validity is evaluated by comparing a
construct within an instrument with a hypothesized similar construct
within another instrument. For ex-
2.2 Reliability
Reliability reflects the amount of random and systematic measurement
error present within an instrument. Reliability of an outcome
instrument is especially important when measuring the treatment
effect of an intervention. If an outcome measure is not reliable,
changes observed in the treatment group may not necessarily be
attributed to the intervention, but rather to a problem inherent to
the measuring instrument. Like validity, reliability is a dynamic
property and is best assessed in terms of reproducibility and internal
consistency (Fig 3-4).
Reproducibility
Reproducibility can be further subdivided into interobserver and
test-retest reproducibility.
Internal consistency
Testing internal consistency is appropriate when an instrument consists
of several items forming a scale. The items or questions within the
scale should be homogeneous, measuring the aspects of only one
attribute. Most instruments employ several items to assess a single
construct, based on the principle of measurement that several related
observations typically produce a more reliable estimate than one.
Thus, an internally consistent instrument comprises questions
that correlate highly with one another and with the total score of
items in the same scale.
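One widely used summary of internal consistency is Cronbach's alpha. The text above does not name a specific statistic, so its use here is an assumption of this sketch, and the answers are invented for illustration. Alpha approaches 1 when the items of a scale vary together and is low when they do not.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per question, all answered by the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]      # total score per respondent
    item_variance = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))

# Hypothetical answers of five patients to three questions on a 1-5 scale
answers = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(answers), 2))
```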
2.3 Responsiveness
Responsiveness is assessed by comparing the outcome scores before and
after an intervention and is calculated by the difference between the
mean pre- and postoperative scores divided by the standard deviation
of the preoperative score. It is possible for an instrument to be both
valid and reliable but insensitive to change over time.
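The calculation described above can be written as a small function. The scores are hypothetical, and the sign convention (postoperative minus preoperative, so that improvement is positive when higher scores mean better function) is an assumption of this sketch.

```python
from statistics import mean, stdev

def effect_size(pre, post):
    """(mean postoperative - mean preoperative) / SD of the preoperative scores."""
    return (mean(post) - mean(pre)) / stdev(pre)

# Hypothetical scores of six patients (higher = better function)
pre = [40, 45, 50, 55, 60, 50]
post = [55, 60, 58, 70, 72, 63]
print(round(effect_size(pre, post), 2))
```

Dividing by the preoperative standard deviation expresses the change in units of the baseline variability, which allows instruments with different score ranges to be compared.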
[Figure: diagram of a validated outcome measure. Labels: construct validity, criterion validity; reliability (reproducibility: interobserver, test-retest; internal consistency); responsiveness.]
Fig 3-6 Huge outcome instruments may scare both patients and doctors, and may not
suit clinical practice.
Each particular method has unique advantages and disadvantages
to be considered. For example, consider the patient interview. You
can clarify questions and ensure completion, thereby achieving a
maximal response rate. However, patient interviews are costly and there
is the potential for interviewer and reporting bias. A mail-out, on the
other hand, is inexpensive and relatively unbiased, but response rates
are generally low. The choice of the method to be used will depend
largely on the research question, characteristics of the instrument,
attributes of the patient population, and feasibility issues associated
with cost and patient burden.
4 Limits and advantages of common outcome measures
4.1 Clinician-based outcomes
Clinician-based outcome (CBO) measures such as joint range of
motion, hardware positioning, gait abnormalities, and fracture union
are often physiologic and assess the result of a health care intervention
from the perspective of the clinician. They are often outputs rather
than outcomes.
Many CBO measures have a tendency to use numerical scales to assign
a point value to end results in orthopaedic trials. These numerical
scales combine aspects of the clinical result (eg, range of motion,
strength, radiographic changes) with the functional result (eg, pain,
changes in activities of daily living, occupational disabilities) to provide
a final composite score.
4.2 Patient-reported outcomes
In contrast to CBO measures, patient-reported outcomes (PRO) are
questionnaires or instruments completed by the patient rather than
the health professional. They provide evidence and perspective distinct
from that provided entirely by clinical assessment. This is especially
important since multiple studies in other medical and surgical
disciplines have shown that physicians and patients often significantly
disagree about health status. This discrepancy is also present within
the field of orthopaedics. For example, poor correlations exist between
patient and surgeon outcome ratings of satisfaction following total knee
arthroplasty. For proper evaluation of an intervention, the need to
complement traditional CBO measures with patient-derived functional
outcomes is now appreciated. The PRO instruments regularly used in
outcome research measure general and disease-specific, health-related
quality of life, patient symptoms, and functional status.
4.3 Limiting bias in an outcome evaluation
The selection of an outcome should consider the susceptibility of
that measure to bias and the potential for the use of bias-minimizing
techniques.
Fig 3-7 In a single-blinded trial, the patient is not informed which of two or more interventions
under investigation has been applied. This may raise ethical concerns and disturb the mutual
trust of patients in doctors: it simply means “I know what you are receiving but I won’t tell you”.
In a double-blinded trial, neither patients nor doctors are aware of the assigned treatment.
Obviously there cannot be a double-blinded trial of two different surgical interventions. However,
it is still possible to have a blinded outcome assessor who has not been involved in the patient’s
treatment process.
Double-blinded surgical trials are impossible, since this would mean
blinding of the surgeon during the procedure. At best, surgical trials
may be patient- and outcome-assessor blinded.
4.4 Typical dichotomous study endpoints
Dichotomous endpoints are popular in orthopaedic research because they
are advantageous in several respects. Ease of statistical analysis with the
use of risk or odds ratios and, more importantly, ease of interpretation make
these outcomes attractive to investigators. In clinical practice, this
translates into improved understanding of study results by decision
makers. For example, the impact of an intervention that reduces
mortality or the incidence of infection is easy to appreciate. In outcome
assessment, dichotomous measures are objective and therefore not
subject to misinterpretation. This effectively reduces the introduction
of ascertainment bias (biased outcome assessment).
5 Functional scores
Example In the case of a patient with severe arthritis of the hand, maximal grip
strength measures physical functional capacity, while a self-report of tasks completed at
home measures functional performance.
[Figure: hand-grip test versus daily routine]
often complicating comparisons of results across different scales. Two
validated measures of functional status in the orthopaedic literature
are the Western Ontario and McMaster Universities osteoarthritis
index (WOMAC) and the DASH scores.
For a more comprehensive review of functional score measures refer
to the AO Handbook—Musculoskeletal Outcomes Measures and
Instruments, 2nd expanded edition (Suk et al).
[Figure: labels Disease-specific, Anatomy-specific, Patient-specific]
6.1 Generic measures
A generic health-related quality of life instrument quantifies a
patient’s perception of his or her overall health status. This includes
physical symptoms, function, and emotional dimensions of health. The
sickness impact profile (SIP), the Nottingham health profile (NHP),
the SF-36, and the EuroQol questionnaire (EQ-5D) are examples of
generic instruments. Since they measure overall health rather than
a specific condition, generic instruments are useful for comparing
health status across different diseases and severities, interventions,
and even across different cultures. Heart disease, diabetes, obesity,
and other comorbid health issues are incorporated along with the
orthopaedic problem into the measurement. Due to their wide range
of clinical applications, however, generic instruments are prone to
abuse. Clinicians, often overwhelmed by the number and variety of
outcome measures available, may default to the use of a validated
generic instrument when a potentially more appropriate or sensitive
measure is in fact available. Generic instruments regularly lack the
sensitivity to detect small but clinically important changes, particularly
with orthopaedic disorders.
One QALY equates to 1 year of life in perfect health. Likewise, 0.5 QALY
equates to 6 months of life in perfect health, or 1 year of life with a 50%
reduction in health-related quality of life. QALYs are among the most
important indicators of effectiveness for health policy decisions.
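The arithmetic behind these statements is a sum of time lived multiplied by a quality weight. The sketch below uses a hypothetical helper function and takes a weight of 1.0 for perfect health, so a 50% reduction corresponds to a weight of 0.5.

```python
def qalys(periods):
    """Sum of (years lived x quality weight); weight 1.0 = perfect health."""
    return sum(years * weight for years, weight in periods)

print(qalys([(1.0, 1.0)]))   # 1 year in perfect health
print(qalys([(0.5, 1.0)]))   # 6 months in perfect health
print(qalys([(1.0, 0.5)]))   # 1 year with a 50% reduction in quality of life
```

The last two calls give the same result, which illustrates that QALYs trade length of life against quality of life on one scale.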
6.2 Disease-specific measures
Disease-specific measures of health-related quality of life are tailored
to inquire about physical, mental, and social aspects of health specific
to injury (eg, fracture), disease (eg, osteoarthritis), anatomical area
(eg, knee), or a population of interest (eg, athletes). Specific measures
of single concepts or conditions are the most numerous outcome
instruments within the health status field. The popularity of these
measures primarily arose from the need of clinical trials and practitioners
for accurate scales responsive to clinical changes that occur over time.
In contrast to their generic counterparts, disease-specific instruments
are better able to detect small but important changes that occur over
time in the particular disease studied. This specificity has also been
shown to contribute to a more responsive measure.
7 Practical issues of selecting appropriate outcome measures
Utility measures are often of use to health economists requiring a
preference rating for economic analysis.
8 Summary
4 The perfect database
1 Introduction
2 The basic database
3 Dos and don’ts in the use of spreadsheets for data entry
3.1 General format
3.2 Entries
4 A spreadsheet package or more advanced database programs?
5 Some ways to ensure data quality
5.1 Validation rules
5.2 Consistency checks
5.3 Double data entry
6 Summary
Thomas Kohlmann, Dirk Stengel, Axel Ekkernkamp
1 Introduction
2 The basic database
No.  ID        Gender  Age  Nail type  Duration of surgery  Units of packed red blood
1    TANUR877  1       18   1          91                   0
2    MOLUT504  0       25   1          49                   0
3    METUT536  1       49   1          68                   1
4    BOMUK480  1       58   0          71                   2
5    MUSUR955  1       71   0          63                   5
6    BIBUT318  0       50   1          55                   3
7    BEMEL434  0       40   0          109                  0
8    NOBUT546  1       31   0          45                   0
9    BUMUK835  1       69   1          60                   0
10   SUBON450  1       54   1          67                   1
11   RABAB061  1       58   0          90                   1
12   LARUT055  1       82   1          84                   2
13   SUTUT761  1       19   0          56                   4
14   RUBUS526  0       47   1          79                   0
15   ROMUS580  1       31   1          64                   0
16   BANON356  0       59   0          102                  0
17   NUNUT119  1       67   0          61                   3
18   LAKUK515  1       69   1          53                   2
19   MIBEL627  1       73   1          50                   1
20   SEBUR182  0       84   0          47                   1
(Rows = records)
Thirdly, all entries in the records contain precisely the information
described in the column headings (gender of the patient, entered
consistently as “male” or “female”; age; type of nail used, ie, tip
or trochanteric entry; duration of surgery; and units of packed red
blood as numerical data). In this example, gender and the type of
nail were coded as 0 or 1. If the variable of interest
has only two expressions (such as male or female) that are mutually
exclusive, it is better to use a true binary code (1 = yes, 0 = no) rather
than “male/female”, “yes/no”, or “1/2”. The binary code is natural in
this setting, and is recognized by statistical software.
Finally, if there is some reason to use character data (in our example
“male” and “female”), it is essential to follow exactly the same spelling
throughout (and not “male” and “Male”). Otherwise, data analysis will
report as many categories as there are different ways of spelling.
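The effect of inconsistent spelling is easy to reproduce. The entries below are invented for illustration, and the variable name `female` for the binary code is a hypothetical choice that follows the 1 = yes, 0 = no convention.

```python
from collections import Counter

# Hypothetical character entries with inconsistent spelling and a trailing space
gender = ["male", "Male", "female", "male ", "Female", "female"]

print(Counter(gender))                       # five "categories" instead of two

cleaned = [g.strip().lower() for g in gender]
print(Counter(cleaned))                      # two categories after normalization

# True binary code: the variable is named after the category coded as 1 (yes)
female = [1 if g == "female" else 0 for g in cleaned]
print(female)
```

Cleaning after the fact works for simple cases like this one, but a binary code entered from the start avoids the problem altogether.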
3 Dos and don’ts in the use of spreadsheets for data entry
3.1 General format
Data should always be arranged in rectangular format. Each row of the
table shows the information collected for one case (avoid empty rows);
each column comprises information about one specific characteristic
of the cases.
If the data set comprises several groups of patients, the data for each
group should be placed on the spreadsheet one after the other and
a column should be included indicating to which group the cases
belong.
3.2 Entries
Each cell in the data table should contain precisely the relevant
data and nothing else. In a column containing numeric data, never
use “?” alone or in addition to an entry to indicate something is still
unclear, nor use “grade 2/3 open fracture” if it is still to be determined
whether the entry is “2” or “3”.
Table 4-2a–b
a  Aspirin | 3 | 500
b  B | 2 | 2
a It is much easier to combine different items of information into a composite than to
decompose a complex entry like “Aspirin 3 × 500 mg”.
b Similarly, a Müller AO Classification of Fractures in Long Bones B2.2 fracture should be
classified using three columns.
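The asymmetry between combining and decomposing can be sketched as follows. The column names (`drug`, `doses_per_day`, `dose_mg`) and the parsing pattern are hypothetical and serve only to illustrate the point.

```python
import re

# Separate columns make combining trivial ...
drug, doses_per_day, dose_mg = "Aspirin", 3, 500
composite = f"{drug} {doses_per_day} x {dose_mg} mg"

# ... whereas decomposing a free-text entry needs a parser that fails
# as soon as someone types the same information slightly differently.
pattern = re.compile(r"^(\w+)\s+(\d+)\s*[x×]\s*(\d+)\s*mg$")

def decompose(entry):
    """Return (drug, doses per day, dose in mg) or None if the entry does not match."""
    match = pattern.match(entry)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))

print(decompose(composite))
print(decompose("Aspirin 500mg 3x"))   # same information, different order: not recognized
```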
4 A spreadsheet package or more advanced database programs?
Spreadsheet packages are very useful tools for data entry and storage
and a good choice in many circumstances. However, their capacity
to deal with large and complex data sets is limited and they provide
only few options for generating data entry forms, checking data upon
entry, or the automatic skipping of fields conditional on previously
entered information. If complex data sets have to be managed, eg,
longitudinal patient data with a variable number of visits during a
study, or if advanced data entry procedures are required, the use
of a specialized database program, such as Microsoft Access™ or
FileMaker® Pro software, should be considered.
5 Some ways to ensure data quality
5.1 Validation rules
Imposing validation rules at the time of data entry is a powerful tool for preventing errors. With range checks, checks of permitted numerical and character input, errors can be detected and corrected immediately when they occur. If the data entry system does not provide these checks simultaneously, they can be applied as a second step after data entry has been completed. Minimum and maximum values in a data column as a range check for numerical values can easily be determined even in a spreadsheet environment. Frequency tables of numerical or character data belong to the core functions of all statistical packages and are very useful for detecting data entry errors. As these second-line checks are performed mostly after completion of data entry they are more time consuming than real-time validation. If an error is detected, the original data forms usually have to be consulted.
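The second-line checks described above can be sketched in Python as follows. The permitted range of 18 to 100 years, the variable names, and the sample data are assumptions chosen for illustration.

```python
from collections import Counter

def range_check(values, lowest, highest):
    """Return (row_number, value) pairs that fall outside the permitted range."""
    return [(row, v) for row, v in enumerate(values, start=1)
            if not lowest <= v <= highest]

# Hypothetical ages in years; 520 is a typing error for 52.
ages = [34, 61, 520, 47, 29]
print(min(ages), max(ages))           # minimum and maximum as a quick range check
out_of_range = range_check(ages, 18, 100)
print(out_of_range)

# A frequency table of a character variable exposes unexpected codes such as "n".
sex = ["m", "f", "f", "n", "m"]
frequencies = Counter(sex)
print(frequencies.most_common())
```

Neither check says what the correct value is; both only indicate which cases should be compared with the original data forms.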
5.2 Consistency checks
Data are inconsistent if the information in one variable is incompatible with the information in another. A date of surgery prior to the date of hospitalization, prostate cancer as co-morbidity in a female patient, or a period of 5 days sick-leave in an unemployed patient are typical examples of such inconsistencies. While such inconsistencies are easily recognized once they have been identified, they are difficult to eradicate because a large number of logically impossible combinations of data may exist. Researchers should invest appropriate time to define as many potential inconsistencies as possible. With a clear and comprehensive list of such inconsistencies a data analyst can easily check if any of these actually occur in the database, for example by calculating the time elapsed between the day of hospital admission and the day of surgery, tabulating comorbidity separately for male and female patients, or by comparing the days of sick-leave for patients with different occupational status.
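A list of predefined inconsistencies translates directly into a set of rules. The Python sketch below encodes the three examples from the text; the record layout and all field names are hypothetical.

```python
from datetime import date

# Hypothetical patient records; all field names are assumptions.
patients = [
    {"id": 1, "sex": "f", "admitted": date(2008, 3, 1), "surgery": date(2008, 3, 2),
     "comorbidity": "diabetes", "employed": True, "sick_leave_days": 14},
    {"id": 2, "sex": "f", "admitted": date(2008, 3, 5), "surgery": date(2008, 3, 4),
     "comorbidity": "prostate cancer", "employed": False, "sick_leave_days": 5},
]

def inconsistencies(p):
    """Return a description of every rule that the record violates."""
    found = []
    if (p["surgery"] - p["admitted"]).days < 0:
        found.append("surgery before admission")
    if p["sex"] == "f" and p["comorbidity"] == "prostate cancer":
        found.append("prostate cancer in female patient")
    if not p["employed"] and p["sick_leave_days"] > 0:
        found.append("sick leave in unemployed patient")
    return found

report = {p["id"]: inconsistencies(p) for p in patients}
print(report)
```

Each additional rule is one more condition in the function, so the list can grow as new implausible combinations are recognized.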
5.3 Double data entry
No validation rule or consistency check will help in case of erroneous entries, for example, if the duration of surgery was 86 minutes but the digits 6 and 8 were entered in reverse on a computer keyboard. Double data entry is the only strategy to avoid or at least to minimize the likelihood of this type of error. This very effective method for quality control should be used whenever possible. It is best to enter all data twice; however, if resources are restricted, double data entry for a subset of cases and/or a limited number of variables may provide an estimate of the reliability of the data.
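Comparing two independent entries of the same forms is straightforward to automate. The following Python sketch assumes both entries are keyed by case number and contain the same variables; the function name `compare_entries` and the data are hypothetical.

```python
def compare_entries(first, second):
    """Return (case_id, variable, value_1, value_2) for every disagreement."""
    mismatches = []
    for case_id in sorted(first):
        for variable, value_1 in first[case_id].items():
            value_2 = second[case_id][variable]
            if value_1 != value_2:
                mismatches.append((case_id, variable, value_1, value_2))
    return mismatches

# Hypothetical data; the second entry transposes 86 to 68 for case 1.
entry_1 = {1: {"surgery_min": 86, "age": 54}, 2: {"surgery_min": 45, "age": 61}}
entry_2 = {1: {"surgery_min": 68, "age": 54}, 2: {"surgery_min": 45, "age": 61}}

mismatches = compare_entries(entry_1, entry_2)
cells = sum(len(row) for row in entry_1.values())
error_rate = len(mismatches) / cells   # share of cells that disagree
print(mismatches, error_rate)
```

When only a subset of cases is entered twice, the same disagreement rate serves as a rough estimate of the reliability of the remaining data.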
6 Summary
5 How to analyze your data
Parametric or Nonparametric
Unpaired or Paired
Categorical or Continuous
1 Statistical tests: the basics
2 How to choose the appropriate test
3 Binary or categorical data
4 Ordinal data
5 Group comparisons involving continuous data
6 Comparison of more than two groups
7 Analysis of paired data and other extensions
8 Summary
Thomas Kohlmann, Jörn Moock
1 Statistical tests: the basics
difference between observed and expected results is no longer a matter of chance. All of these different steps of computation and comparison are nowadays made by statistical software.
2 How to choose the appropriate test
Many popular statistical tests for analyzing differences between groups, such as the t-test, analysis of variance (ANOVA), or the chi-square (χ²) test can be integrated in a framework of analysis methods based on our three decision criteria: number of groups, data type, and assumption of normal distribution. An overview of some common methods for analyzing group differences based on these criteria is shown in Table 5-1.
Data type: Binary or categorical | Ordinal | Continuous
When only two groups will be compared a simple example can illustrate the use of this table:
Only patients who are working at the time of the injury are enrolled in the study. Duration of sick leave represents the primary endpoint. A total number of 60 patients are randomized to receive either conservative (short-arm cast) or operative treatment (Herbert screw fixation).
3 Binary or categorical data
                                 Conservative   Operative   P value
                                 (N = 30)       (N = 30)
Percentage with complications    10.0           20.0        0.278
Results of this test are also included in Table 5-3. Because the P value is greater than the prespecified significance level of 0.05 we conclude that the null hypothesis (that the proportion of complications is equal in both groups) cannot be rejected. Hence, the observed difference could well have been produced by chance.
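The reported P value can be checked by hand. The Python sketch below assumes that the test in question is Pearson's chi-square test without continuity correction, and it derives the counts from the table: 10.0% and 20.0% of 30 patients correspond to 3 and 6 patients with complications. The function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and P value
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: conservative, operative; columns: complication yes, no.
chi2, p_value = chi_square_2x2(3, 27, 6, 24)
print(round(chi2, 3), round(p_value, 3))
```

Under these assumptions the result agrees with the P value of 0.278 given in the table. With counts this small, an exact test would often be preferred in practice and would give a different P value.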
4 Ordinal data
For comparisons of more than two groups the nonparametric equivalent to the F-test used in the analysis of variance is the Kruskal-Wallis H test. Again, a significant result of this test only indicates that the data are not compatible with the null hypothesis of no differences between groups. The test result will not tell us which of the groups differ significantly from each other. In contrast to the parametric analysis of variance, where a number of methods for pairwise comparisons are available which avoid overadjustment for multiple testing, nonparametric tests of pairwise differences mostly rely on Bonferroni correction or a modified, less conservative method, the Bonferroni-Holm correction.
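Both corrections are simple enough to compute directly. The following Python sketch returns adjusted P values, which are then compared with the usual significance level; the three raw P values are hypothetical and the function names are chosen for this example.

```python
def bonferroni(p_values):
    """Adjusted P values: each raw P value times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Bonferroni-Holm step-down adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted

raw = [0.010, 0.040, 0.030]   # hypothetical P values from three pairwise tests
print(bonferroni(raw))
print(holm(raw))
```

In this example the Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is what makes the method less conservative.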
Fig 5-1 Pain and discomfort ratings of patients after conservative and operative treatment. [Bar chart in two panels (Conservative, Operative); y-axis: percent; categories: none, strenuous exercise, minor exercise.]
Fig 5-2 Duration of sick leave for patients under conservative and operative treatment. [Y-axis: duration of sick leave (weeks); groups: Conservative, Operative; marked mean values: 11 and 9; standard deviation of data in both groups = 3.]
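The summary values shown in Fig 5-2 are sufficient to work out a t statistic by hand. The Python sketch below uses the two means (11 and 9 weeks) and the common standard deviation of 3 from the figure, and assumes 30 patients per group, as in the example study of 60 randomized patients. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Student's t statistic and degrees of freedom for two independent groups."""
    df = n_1 + n_2 - 2
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / se_diff, df

t_value, df = pooled_t(11, 9, 3, 3, 30, 30)
print(round(t_value, 3), df)
```

With 58 degrees of freedom the two-sided critical value at the 0.05 level is about 2.0, so a t value of this size would be declared significant under the stated assumptions.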
As can be seen from Fig 5-1, the data for the primary endpoint are not completely normal. The distribution of the variable in the conservative group is more compressed than a normally distributed variable would be and the operative group has two cases with a very short duration of sick leave. Many statistical methods exist for assessing whether empirical data are consistent with a normal distribution. However, this is rarely needed; a pragmatic way is to graph your data first to gain an impression of the underlying distribution.
Example
Methods: … Differences between the two groups were tested by the t-test. The results were considered to be significant if P < .05 …
6 Comparison of more than two groups
7 Analysis of paired data and other extensions
8 Summary
6 Present your data
1 We are visual people
4 Bar charts
5 Error bars
5.1 Clinical relevance
5.2 Statistical significance
6 Box-and-whiskers plots
8 Forest plots
10 Summary
Kai Bauwens, Michael Suk, Dirk Stengel
1 We are visual people
Figures and charts are the most influential vehicles for distributing scientific information. They may affect the acceptance or rejection of a manuscript, and the reception of study results by the scientific community. Unfortunately, they have also become popular tools to mislead both health care professionals and consumers.
The researcher’s essential graphical tool box should contain histograms, bar charts (always with measures of error), box-and-whiskers plots, scatter plots, and forest plots. We will show how to design and use these.
3 Your graphical master plan and toolbox
You made it: you completed a randomized trial of a new locking plate versus a conventional T-plate for open reduction and internal fixation of an A3 distal radial fracture according to the Müller AO Classification of Fractures in Long Bones. After 1 year of follow-up, the disability of the arm, shoulder, and hand (DASH) score in both groups comes up as shown in Table 6-1 (keep in mind that lower DASH scores mean better function).
4 Bar charts
[Figure: bar chart of DASH score for Locking plate and T-plate; y-axis shown from 5.0 to 12.0.]
The bar chart remains the most common way of graphical data presentation. It is easy to understand, and you may have already created bar charts that resembled those displayed in Fig 6-2. Note that Fig 6-2 contains the same information as Fig 6-1, but with a much higher data density, correct axis scale, and appropriate labeling.
[Figure: bar chart of mean DASH score after 1 year follow-up for Locking plate and T-plate; y-axis starting at 0.]
5 Error bars
5.1 Clinical relevance
Do not only picture mean values or percentages, but also use a format that shows distributions and outliers. It clearly makes a difference whether your mean value of 8.5 was derived from a range of 4–10, or 0–25. The perfect graph shows both the clinical relevance and statistical significance of study findings. In case of symmetrically distributed data (like ranges of motion, functional scores, and quality of life measurements), use the standard error of the mean (SEM) or the standard deviation (SD). Both are appropriate but indicate your choice unequivocally, either at the y-axis or in the figure legend. Remember the SEM is always tighter than the SD (the SEM results from dividing the SD by the square-root of the sample size). Since this suggests a higher precision (Fig 6-3), authors sometimes omit to indicate that their error bars are SEM, not SD. If in doubt, the SD is the better option. Most readers will expect error bars to represent SD, not SEM. Also be sure that error bars in both groups expand in both directions. Imagine you had obtained DASH scores in your distal radius fracture trial after 3, 6, and 12 months (Fig 6-4). By contrasting the upper range of one group to the lower range of the other group, one gains the optical illusion of a large difference between study groups.
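The relation between SD and SEM is easily reproduced. The Python sketch below uses six hypothetical DASH scores, chosen so that their mean is 8.5; the function name `sd_and_sem` is an assumption made for this example.

```python
import math
import statistics

def sd_and_sem(values):
    """Sample standard deviation and standard error of the mean."""
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(len(values))   # SD divided by the square root of n
    return sd, sem

dash_scores = [4, 6, 8, 9, 10, 14]      # hypothetical scores, mean = 8.5
sd, sem = sd_and_sem(dash_scores)
print(statistics.mean(dash_scores), round(sd, 2), round(sem, 2))
```

The same data therefore produce visibly shorter error bars when the SEM is plotted, which is why the legend has to state which of the two is shown.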
[Figure, panels a and b: bar charts of mean DASH score after 1 year follow-up by intervention (Locking plate, T-plate).]
Fig 6-4 One-tailed error bars skew the data because of optical enlargement of the differences. [Chart of mean DASH score, SD, at 3, 6, and 12 months of follow-up for Locking plate and T-plate.]
5.2 Statistical significance
Besides clinical relevance, the statistical significance of your findings can best be expressed by incorporating a 95% confidence interval (95% CI) into your chart (Fig 6-5a). As a rule of thumb, it is unlikely that observed differences have been produced by chance if the 95% CIs of mean values (and proportions) do not overlap. Conversely, overlapping 95% CIs mean that the observed differences are still compatible with chance (or a P value > .05, if this was chosen as the level of significance).
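The rule of thumb can be tried out on summary data. The Python sketch below computes 95% CIs of two means with the normal approximation (mean ± 1.96 × SEM), which is an assumption that is reasonable only for larger samples; the group means, standard deviations, and sample sizes are hypothetical.

```python
import math

def ci95(mean, sd, n):
    """Approximate 95% confidence interval of a mean (normal approximation)."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

def intervals_overlap(ci_a, ci_b):
    """True if the two intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

locking = ci95(8.5, 3.0, 36)    # hypothetical mean DASH score, SD, and n
t_plate = ci95(11.0, 3.0, 36)
print(locking, t_plate, intervals_overlap(locking, t_plate))
```

The overlap check remains a visual shortcut, as the text says; intervals that overlap only slightly can still belong to a difference that a formal test declares significant, so the test result should be reported as well.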
[Figure, two panels: mean DASH score after 1 year follow-up by intervention (Locking plate, T-plate).]
6 Box-and-whiskers plots
Box-and-whiskers plots (or box plots), originally proposed by Tukey, have become an important alternative to bar charts when displaying continuous data. In contrast to bar charts, they contain detailed information about the center and distribution of data in your sample. The anatomy of a box plot is displayed in Fig 6-6, using the example of a case series of proximal humeral fractures. The box plot shows the difference in DASH ratings between the baseline assessment and after three months of follow-up with conservative treatment.
[Figure: anatomy of a box plot. Y-axis: difference in DASH scores (baseline - follow-up). Labeled elements: outliers, upper whisker, 75% percentile, mean, median (50% percentile), 25% percentile, lower whisker.]
This definition is, however, not straightforward, and recent descriptions suggest displaying the 10% and 90% percentiles, the 5% and 95% percentiles, or even the minimum and maximum instead. Refer to the instructions of your software package for the individual default setting. If you consider using a whisker definition other than the default, specify your choice in the figure legend.
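How much the choice of whisker definition matters can be seen from a small Python sketch. It compares two conventions: the widely used Tukey rule (most extreme observations within 1.5 times the interquartile range of the box), which is assumed here and may differ from the default of a given package, and whiskers drawn to the minimum and maximum. The data and the function name `box_summary` are hypothetical.

```python
import statistics

def box_summary(values, whisker="tukey"):
    """Quartiles, whisker ends, and outliers under the chosen whisker convention."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    if whisker == "tukey":
        iqr = q3 - q1
        low = min(v for v in data if v >= q1 - 1.5 * iqr)
        high = max(v for v in data if v <= q3 + 1.5 * iqr)
    elif whisker == "minmax":
        low, high = data[0], data[-1]
    else:
        raise ValueError("unknown whisker convention")
    outliers = [v for v in data if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3,
            "whiskers": (low, high), "outliers": outliers}

diffs = [-25, -5, 0, 5, 8, 10, 12, 15, 20, 60]   # hypothetical DASH differences
print(box_summary(diffs, "tukey"))
print(box_summary(diffs, "minmax"))
```

The same ten values yield two outliers under the first convention and none under the second, which is why the legend should name the definition used.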
Fig 6-7 Note that important values (median, interquartile range) are difficult to trace from this type of figure. [Y-axis: percentage; outliers marked.]
[Figure: DASH score after 1 year follow-up; groups: Undisplaced, Displaced.]
Almost all commercial statistical software packages (like SPSS®, SAS®, STATA®, and others) offer sophisticated box plot options. Although Microsoft Excel® has comfortable graphical features, it currently does not allow for producing box plots simply by a mouse click. However, many committed Excel® users have found extremely clever ways to produce box plots with the available graphical tools in just a few steps.
[Figure: difference in DASH scores (baseline - follow-up) plotted against age in years.]
Fig 6-10 a-b
a Widening of the 95% CI at the edges of the dataset.
b Similar slope of the regression in a comparable but smaller study with wider confidence intervals.
8 Forest plots
Forest plots allow for viewing the results from all subgroup
analyses at once ( Fig 6 -11).
Fig 6-11 A forest plot to illustrate the results from subgroup analyses. [Subgroups shown: male gender; smoking; age ≥65 years; diabetes mellitus; osteoarthritis; respiratory disease; psychiatric disease; aspirin use; displaced fracture; high-energy injury; B/C-type fracture; attempted fracture reduction; fracture reduction under anesthesia.]
Pie charts belong to the graph types with the lowest data density. The best advice is to completely avoid them in a scientific presentation or manuscript. Fig 6-11 sketches an example. Although it fully illustrates the distribution of data, important values (median, interquartile range) are difficult to trace from this type of figure, which was published in the report of a randomized trial of hook pins versus screws for the internal fixation of cervical hip fractures (Fig 6-12a). The authors aimed to show the time interval between admission and surgery. Because the interval was separated into seven categories, multiple colors were needed to build a chart which still does not allow for reading the individual proportions. The presented graph clearly missed its target, and could have been replaced by a histogram (Fig 6-12b).
Fig 6-12 a-b [Both panels show the time interval from admission to surgery in hours, in the categories <6, 6-12, 12-18, 18-24, 24-48, 48-72, and >72; panel b plots the percentage of patients.]
a This pie chart was intended to show the proportion of patients scheduled for surgery at different time intervals. However, percentages cannot be traced from the diagram.
b The histogram is efficient, catchy, and does not need color.
Fig 6-13 Low data density of a 3-D stacked bar chart that compares mechanisms of injury of patients enrolled in the modal rural trauma project (MRTP) and the major trauma outcome study (MTOS). [Categories shown: Unknown, Other, Fall, Stabbing, Gunshot, Pedestrian, Motorcycle, Auto.]
Cause of injury    MRTP             MTOS
                   n = 266          n = 80544
Unknown            5 (1.7%)         81 (0.1%)
Other              35 (13.2%)       11921 (14.8%)
Fall               16 (6.0%)        13290 (16.5%)
Stabbing           10 (3.8%)        7652 (9.5%)
Gunshot            22 (8.1%)        8054 (10.0%)
Pedestrian         32 (12.0%)       6041 (7.5%)
Motorcycle         32 (12.0%)       5558 (6.9%)
Motorcar           106 (39.8%)      27949 (34.7%)
[Figure, two panels: grades (0 to 90) for elbows 1 to 14; panel b: ROM (degrees), preoperative and postoperative.]
[Figure, panels a and b: ROM (degrees), preoperative and postoperative.]
10 Summary
7 Glossary
Dirk Stengel