
Statistics Hacks

By Bruce Frey
...............................................
Publisher: O'Reilly
Pub Date: May 2006
Print ISBN-10: 0-596-10164-3
Print ISBN-13: 978-0-59-610164-0
Pages: 356


Want to calculate the probability that an event will happen? Be able to spot fake data? Prove beyond doubt whether one thing causes another? Or learn to be a better gambler? You can do that and much more with 75 practical and fun hacks packed into Statistics Hacks. These cool tips, tricks, and mind-boggling solutions from the world of statistics, measurement, and research methods will not only amaze and entertain you, but will give you an advantage in several real-world situations, including business.

This book is ideal for anyone who likes puzzles, brainteasers, games, gambling, and magic tricks, and for those who want to apply math and science to everyday circumstances. Several hacks in the first chapter alone, such as the "central limit theorem," which allows you to know everything by knowing just a little, serve as sound approaches for marketing and other business objectives. Using the tools of inferential statistics, you can understand the way probability works, discover relationships, predict events with uncanny accuracy, and even make a little money with a well-placed wager here and there.

Statistics Hacks presents useful techniques from statistics, educational and psychological measurement, and experimental research to help you solve a variety of problems in business, games, and life. You'll learn how to:

Play smart when you play Texas Hold 'Em, blackjack, roulette, dice games, or even the lottery

Design your own winnable bar bets to make money and amaze your friends

Predict the outcomes of baseball games, know when to "go for two" in football, and anticipate the winners of other sporting events with surprising accuracy

Demystify amazing coincidences and distinguish the truly random from the only seemingly random, and even keep your iPod's "random" shuffle honest

Spot fraudulent data, detect plagiarism, and break codes

Isolate the effects of observation on the thing observed

Whether you're a statistics enthusiast who does calculations in your sleep or a civilian who is entertained by clever solutions to interesting problems, Statistics Hacks has tools to give you an edge over the world's slim odds.

Credits
Preface
Chapter 1. The Basics
Hack 1. Know the Big Secret
Hack 2. Describe the World Using Just Two Numbers
Hack 3. Figure the Odds
Hack 4. Reject the Null
Hack 5. Go Big to Get Small
Hack 6. Measure Precisely
Hack 7. Measure Up
Hack 8. Power Up
Hack 9. Show Cause and Effect
Hack 10. Know Big When You See It
Chapter 2. Discovering Relationships
Hack 11. Discover Relationships
Hack 12. Graph Relationships
Hack 13. Use One Variable to Predict Another
Hack 14. Use More Than One Variable to Predict Another
Hack 15. Identify Unexpected Outcomes
Hack 16. Identify Unexpected Relationships
Hack 17. Compare Two Groups
Hack 18. Find Out Just How Wrong You Really Are
Hack 19. Sample Fairly
Hack 20. Sample with a Touch of Scotch
Hack 21. Choose the Honest Average
Hack 22. Avoid the Axis of Evil
Chapter 3. Measuring the World
Hack 23. See the Shape of Everything
Hack 24. Produce Percentiles
Hack 25. Predict the Future with the Normal Curve
Hack 26. Give Raw Scores a Makeover
Hack 27. Standardize Scores
Hack 28. Ask the Right Questions
Hack 29. Test Fairly
Hack 30. Improve Your Test Score While Watching Paint Dry
Hack 31. Establish Reliability
Hack 32. Establish Validity
Hack 33. Predict the Length of a Lifetime
Hack 34. Make Wise Medical Decisions
Chapter 4. Beating the Odds
Hack 35. Gamble Smart
Hack 36. Know When to Hold 'Em
Hack 37. Know When to Fold 'Em
Hack 38. Know When to Walk Away
Hack 39. Lose Slowly at Roulette
Hack 40. Play in the Black in Blackjack
Hack 41. Play Smart When You Play the Lottery
Hack 42. Play with Cards and Get Lucky
Hack 43. Play with Dice and Get Lucky
Hack 44. Sharpen Your Card-Sharping
Hack 45. Amaze Your 23 Closest Friends
Hack 46. Design Your Own Bar Bet
Hack 47. Go Crazy with Wild Cards
Hack 48. Never Trust an Honest Coin
Hack 49. Know Your Limit
Chapter 5. Playing Games
Hack 50. Avoid the Zonk
Hack 51. Pass Go, Collect $200, Win the Game
Hack 52. Use Random Selection as Artificial Intelligence
Hack 53. Do Card Tricks Through the Mail
Hack 54. Check Your iPod's Honesty
Hack 55. Predict the Game Winners
Hack 56. Predict the Outcome of a Baseball Game
Hack 57. Plot Histograms in Excel
Hack 58. Go for Two
Hack 59. Rank with the Best of Them
Hack 60. Estimate Pi by Chance
Chapter 6. Thinking Smart
Hack 61. Outsmart Superman
Hack 62. Demystify Amazing Coincidences
Hack 63. Sense the Real Randomness of Life
Hack 64. Spot Faked Data
Hack 65. Give Credit Where Credit Is Due
Hack 66. Play a Tune on Pascal's Triangle
Hack 67. Control Random Thoughts
Hack 68. Search for ESP
Hack 69. Cure Conjunctionitus
Hack 70. Break Codes with Etaoin Shrdlu
Hack 71. Discover a New Species
Hack 72. Feel Connected
Hack 73. Learn to Ride a Votercycle
Hack 74. Live Life in the Fast Lane (You're Already In)
Hack 75. Seek Out New Life and New Civilizations
Colophon
Index
Credits
About the Author
Bruce Frey, Ph.D., is a comic book collector and film buff. In his spare time, he teaches statistics to graduate students and conducts research in his secret identity as an assistant professor in Educational Psychology and Research at the University of Kansas. He is an award-winning teacher, and his scholarly research interests are in the areas of teacher-made tests and classroom assessment, the measurement of spirituality, and program evaluation methods. Bruce's honors include taking third place in the Kansas Monopoly Championship as a teenager, second place in the Kansas Film Festival as a college student, and a respectable third-place finish in the Lawrence, Kansas, Texas Hold 'Em Poker Tournament as a middle-aged man. He is proudest of two accomplishments: his marriage to his sweet wife, and his purchase of a low-grade copy of Showcase #4, a comic book wherein the "Silver Age Flash first appears," whatever that means.

Contributors
The following people contributed their hacks, writing, and inspiration to this book:

Joseph Adler is the author of Baseball Hacks (O'Reilly), and a researcher in the Advanced Product Development Group at VeriSign, focusing on problems in user authentication, managed security services, and RFID security. Joe has years of experience analyzing data, building statistical models, and formulating business strategies as an employee and consultant for companies including DoubleClick, American Express, and Dun & Bradstreet. He is a graduate of the Massachusetts Institute of Technology with an Sc.B. and an M.Eng. in computer science and computer engineering. Joe is an unapologetic Yankees fan, but he appreciates any good baseball game. Joe lives in Silicon Valley with his wife, two cats, and a DirecTV satellite dish.

Ron Hale-Evans is a writer, thinker, and game designer who earns his daily sandwich with frequent gigs as a technical writer. He has a Bachelor's degree in Psychology from Yale, with a minor in Philosophy. Thinking a lot about thinking led him to create the Mentat Wiki (http://www.ludism.org/mentat), which led to his recent book, Mind Performance Hacks (O'Reilly). You can find his multinefarious [sic] other projects at his home page, http://ron.ludism.org, including his award-winning board games, a list of his Short-Duration Personal Saviors, and his blog. Ron's next book will probably be about game systems, especially since his series of articles on that topic for the dearly departed The Games Journal (http://www.thegamesjournal.com) has been relatively popular among both gamers and academics. If you want to email Ron the names of some gullible publishers, or if you just want to bug him, you can reach him at rwhe@ludism.org (rhymes with nudism and has nothing to do with Luddism).

Brian E. Hansen, 27, grew up in the Dallas, Texas area. After serving a two-year religious mission in Spain, he attended Texas A&M University and graduated in 2004 with a B.S. degree in Petroleum Engineering. He currently works as a Reservoir Engineer for a large independent oil and gas exploration and production company headquartered in Irving, Texas.

Jill H. Lohmeier received her Ph.D. in Cognitive Psychology from The University of Massachusetts, Amherst. She is currently the Evaluation Director for the School Program Evaluation and Research group at the University of Kansas. Jill likes outdoor sports, especially running, hiking, and playing soccer with her kids.

Ernest E. Rothman is a Professor and Chair of the Mathematical Sciences Department at Salve Regina University (SRU) in Newport, Rhode Island. Ernie holds a Ph.D. in Applied Mathematics from Brown University and held positions at the Cornell Theory Center in Ithaca, New York before coming to SRU. His interests are primarily in scientific computing, mathematics and statistics education, and the Unix underpinnings of Mac OS X. You can keep abreast of his latest activities at http://homepage.mac.com/samchops.

Neil J. Salkind is a sometimes faculty member at the University of Kansas with an office opposite that of Bruce Frey, of Statistics Hacks fame. In addition to being the author of Statistics for People Who (Think They) Hate Statistics (SAGE), Neil is a developmental psychologist who collects books, cooks, works on old houses and a P1800 Volvo, and is active in Masters swimming. He has also written over 100 trade books and textbooks, and works with StudioB Literary Agency in New York.

William Skorupski is currently an assistant professor in the School of Education at the University of Kansas, where he teaches courses in psychometrics and statistics. He earned his Bachelor's degree in educational research and psychology from Bucknell University in 2000, and his Doctorate in psychometric methods from the University of Massachusetts, Amherst in 2004. His primary research interest is in the application of mathematical models to psychometric data, including the use of Bayesian statistics for solving practical measurement problems. He also enjoys applying his knowledge of statistics and probability to everyday situations, such as playing poker against the author of this book!

Acknowledgments
I'd like to thank all the contributors to this book, both those who are listed in the "Contributors" section and those who helped with ideas, reviewed the manuscript, and provided suggestions of sources and resources. Thanks in this capacity especially go to Tim Langdon, neon bender, whose gift of Harry Blackstone, Jr.'s paperback book There's One Born Every Minute (Jove Publications) provided great inspiration for many of the hacks herein.

I'd like to thank my editor, Brian Sawyer, who shepherded this project with a strong hand and a strong vision of what is and is not a hack. He was right most of the time. (Though not all the time, Brian. That hack about using a monkey to pick the winner of the Kentucky Derby should have made it in. Maybe next time....) Brian was instrumental in bringing this project to completion, especially during a string of unlucky rolls where the odds of success looked slim.

I'd like to thank Neil Salkind, statistics writer supreme, for his help with many facets of my professional life and this book.

Most importantly, thanks to Bonnie Johnson, my sweet wife, whom I vaguely recall, but who I think will be waiting for me at home when I finally turn in the last revision of this book.
Preface
Chance plays a huge part in your life, whether you know it or not. Your particular genetic makeup mutated slightly when you were created, and it did so based on specific laws of probability. Performance in school involves human errors, yours and others', which tend to keep your actual ability level from being reflected precisely in your report card or on those high-stakes tests. Research on careers even suggests that what you do for a living was probably not a result of careful planning and preparation, but more likely due to happenstance. And, of course, chance determines your fate in games of chance and plays a large role in the outcome of sporting events.

Fortunately, an entire set of scientific tools, the various applications of statistics, can be used to solve the problems caused by our fate-influenced system. Inferential statistics, a field of science based entirely on the nature of probability, allows us to understand the way things work, discover relationships among variables, describe a huge population by seeing just a small bit of it, make uncannily accurate predictions, and, yes, even make a little money with a well-placed wager here and there.

This book is a collection of statistical tricks and tools. Statistics Hacks presents useful tools from statistics, of course, but also from the realms of educational and psychological measurement and experimental research design. It provides solutions to a variety of problems in the world of social science, but also in the worlds of business, games, and gambling.

If you are already a top scientist and do statistical calculations in your sleep, you'll enjoy this book and the creative applications it finds for those rusty old tools you know so well. If you just like the scientific approach to life and are entertained by cool ideas and clever solutions to interesting problems, don't worry. Statistics Hacks was written with the nonscientist in mind, too, so if that is you, you've come to the right place. It's written for the nonstatistician as well, so if this still describes you, you'll feel safe here.

If, on the other hand, you are taking a statistics course or have some interest in the academic nature of the topic, you might find this book a pleasant companion to the textbooks typically required for those sorts of courses. There won't be any contradictions between your textbook and this book, so hearing about real-world applications of statistical tools that seem only theoretical won't hurt your development. It's just that there are some pretty cool things that you can do with statistics that seem more like fun than like work.

Why Statistics Hacks?


The term hacking has a bad reputation in the press. They use it to refer to people who break into systems or wreak havoc, using computers as their weapon. Among people who write code, though, the term hack refers to a "quick-and-dirty" solution to a problem or a clever way to get something done. And the term hacker is taken very much as a compliment, referring to someone as being creative, having the technical chops to get things done. The Hacks series is an attempt to reclaim the word, document the good ways people are hacking, and pass the hacker ethic of creative participation on to the uninitiated. Seeing how others approach systems and problems is often the quickest way to learn about a new technology.

The technologies at the heart of this book are statistics, measurement, and research design. Computer technology has developed hand-in-hand with these technologies, so the use of the term hacks to describe what is done in this book is consistent with almost every perspective on that word. Though there is just a little computer hacking covered in these pages, there is a plethora of clever ways to get things done.

How This Book Is Organized


You can read this book from cover to cover if you like, but each hack stands on its own, so feel free to browse and jump to the different sections that interest you most. If there's a prerequisite you need to know about, a cross-reference will guide you to the right hack.

The earlier hacks are more foundational and probably provide generalized solutions or strategic approaches across a variety of problems to a greater extent than later hacks. On the other hand, later hacks provide much more specific tricks for winning games or just information to help you understand what's going on around you.

The book is divided into several chapters, organized by subject:

Chapter 1, The Basics

Use these hacks as a strong set of foundational tools, the ones you will use most often when you are stat-hacking your way into and out of trouble. Think of these as your basic toolkit: your hammer, saw, and various screwdrivers.

Chapter 2, Discovering Relationships

This chapter covers statistical ways to find, describe, and test relationships among variables. You will be able to make the invisible visible with these hacks.

Chapter 3, Measuring the World

A variety of tips and tricks for measuring the world around you are presented here. You'll learn to ask the right questions, assess accurately, and even increase your own performance on high-stakes tests.

Chapter 4, Beating the Odds

This chapter is for the gambler. Use the odds to your advantage, and make the right decisions in Texas Hold 'Em poker and just about every other game in which probability determines the outcome.

Chapter 5, Playing Games

From TV game show strategy to winning Monopoly to enjoying sports to just having fun, this chapter presents different hacks for getting the most out of your game playing.

Chapter 6, Thinking Smart

This chapter is perhaps the most cerebral of them all. Get your mind right, play mind games, make discoveries, and unlock the mysteries of the world around us using the statistics hacks you'll find here.

Conventions Used in This Book


The following is a list of the typographical conventions used in this book:

Italics

Used to indicate key terms and concepts, URLs, and filenames.

Constant width

Used for Excel functions and code examples.

Constant width italic

Used for code text that should be replaced by user-supplied values.

Gray type

Used to indicate a cross-reference within the text.

You should pay special attention to notes set apart from the text with this icon:

This is a tip, suggestion, or general note. It contains useful supplementary information about the topic at hand.

The thermometer icons, found next to each hack, indicate the relative complexity of the hack:

Safari Enabled
When you see a Safari® Enabled icon on the cover of your favorite technology book, that means the book is available online through the O'Reilly Network Safari Bookshelf.

Safari offers a solution that's better than e-books. It's a virtual library that lets you easily search thousands of top tech books, cut and paste code samples, download chapters, and find quick answers when you need the most accurate, current information. Try it for free at http://safari.oreilly.com.

How to Contact Us
We have tested and verified the information in this book to the best of our ability, but you may find that the rules or characteristics of a given situation are different than described here. As a reader of this book, you can help us to improve future editions by sending us your feedback. Please let us know about any errors, inaccuracies, misleading or confusing statements, and typos that you find anywhere in this book.

Please also let us know what we can do to make this book more useful to you. We take your comments seriously and will try to incorporate reasonable suggestions into future editions. You can write to us at:

O'Reilly Media, Inc.
1005 Gravenstein Hwy N.
Sebastopol, CA 95472
800-998-9938 (in the U.S. or Canada)
707-829-0515 (international/local)
707-829-0104 (fax)

To ask technical questions or to comment on the book, send email to:

bookquestions@oreilly.com

The web site for Statistics Hacks lists examples, errata, and plans for future editions. You can find this page at:

http://www.oreilly.com/catalog/statisticshks

For more information about this book and others, see the O'Reilly web site:

http://www.oreilly.com

Got a Hack?
To explore Hacks books online or to contribute a hack for future titles, visit:

http://hacks.oreilly.com
Chapter 1. The Basics
There's only a small group of tools that statisticians use to explore the world, answer questions, and solve problems. It is the way that statisticians use probability or knowledge of the normal distribution to help them out in different situations that varies. This chapter presents these basic hacks.

Taking known information about a distribution and expressing it as a probability [Hack #1] is an essential trick frequently used by stat-hackers, as is using a tiny bit of sample data to accurately describe all the scores in a larger population [Hack #2]. Knowledge of basic rules for calculating probabilities [Hack #3] is crucial, and you gotta know the logic of significance testing if you want to make statistically-based decisions [Hacks #4 and #8].

Minimizing errors in your guesses [Hack #5] and scores [Hack #6] and interpreting your data [Hack #7] correctly are key strategies that will help you get the most bang for your buck in a variety of situations. And successful stat-hackers have no trouble recognizing what the results of any organized set of observations or experimental manipulation really mean [Hacks #9 and #10].

Learn to use these core tools, and the later hacks will be a breeze to learn and master.
Hack 1. Know the Big Secret

Statisticians know one secret thing that makes them seem smarter than everybody else.

The primary purpose of statistics as a scientific methodology is to make probability statements about samples of scores. Before we jump into that, we need some quick definitions to get us rolling, both to understand this hack and to lay a foundation for other statistics hacks.

Samples are numeric values that you have gathered together and can see in front of you that represent some larger population of scores that you have not gathered together and cannot see in front of you. Because these values are almost always numbers that indicate the presence or level of some characteristic, measurement folks call these values scores. A probability statement is a statement about the likelihood of some event occurring.

Probability is the heart and soul of statistics. A common perception of statisticians, in fact, is that they mainly calculate the exact likelihood that certain events of interest will occur, such as winning the lottery or being struck by lightning. Historically, the person who had the tools to calculate the likely outcome of a dice game was the same person who had the tools to describe a large group of people using only a few summary statistics.

So, traditionally, the teaching of statistics includes at least some time spent on the basic rules of probability: the methods for calculating the chances of various combinations or permutations of possible outcomes. More common applications of statistics, however, are the use of descriptive statistics to describe a group of scores, or the use of inferential statistics to make guesses about a population of scores using only the information contained in a sample of scores. In social science, the scores usually describe either people or something that is happening to them.

It turns out, then, that researchers and measurers (the people who are most likely to use statistics in the real world) are called upon to do more than calculate the probability of certain combinations and permutations of interest. They are able to apply a wide variety of statistical procedures to answer questions of varying levels of complexity without once needing to compute the odds of throwing a pair of six-sided dice and getting three 7s in a row.

Those odds are about .005, or roughly 1/2 of 1 percent, if you start from scratch. If you have already rolled two 7s, you have about a 16.7 percent (1 in 6) chance of rolling that third 7.
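
For the curious, here is a minimal Python sketch (not part of the original hack) that checks those figures by brute force. It assumes two fair six-sided dice and independent rolls; the variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# Chance that a single roll of the pair totals 7.
p_seven = Fraction(sum(1 for a, b in rolls if a + b == 7), len(rolls))

# Independent rolls multiply, so three 7s in a row from scratch is p_seven cubed.
p_three_sevens = p_seven ** 3

print("One 7:", p_seven, float(p_seven))
print("Three 7s in a row:", p_three_sevens, float(p_three_sevens))
```

The chance of the third 7, given two already rolled, is just the single-roll figure, because the dice do not remember what came before.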

The Big Secret


The key reason that probability is so crucial to what statisticians do is because they like to make probability statements about the scores in real or theoretical distributions.

A distribution of scores is a list of all the different values and, sometimes, how many of each value there are.

For example, if you know that a quiz just administered in a class you are taking resulted in a distribution of scores in which 25 percent of the class got 10 points, then I might say, without knowing you or anything about you, that there is a 25 percent chance that you got 10 points. I could also say that there is a 75 percent chance that you did not get 10 points. All I have done is taken known information about the distribution of some values and expressed that information as a statement of probability. This is a trick. It is the secret trick that all statisticians know. In fact, this is mostly all that statisticians ever do!
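
The quiz trick can be sketched in a few lines of Python. This is only an illustration: the 20 quiz scores below are invented so that a quarter of them are 10s, and probability_table is a hypothetical helper name, not anything from the book.

```python
from collections import Counter
from fractions import Fraction

def probability_table(scores):
    """Turn a list of scores into {value: share of the list with that value}."""
    counts = Counter(scores)
    return {value: Fraction(count, len(scores)) for value, count in counts.items()}

# Hypothetical class of 20 quiz scores; five of them (25 percent) are 10s.
quiz_scores = [10] * 5 + [9] * 4 + [8] * 4 + [7] * 4 + [6] * 3

table = probability_table(quiz_scores)
p_ten = table[10]        # chance a randomly chosen student got 10 points
p_not_ten = 1 - p_ten    # chance that student did not

print("P(10 points) =", p_ten)
print("P(not 10 points) =", p_not_ten)
```

Nothing here is cleverer than counting and dividing, which is rather the point of the hack.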

Statisticians take known information about the distribution of some values and express that information as a statement of probability. This is worth repeating (or, technically, threepeating, as I first said it five sentences ago). Statisticians take known information about the distribution of some values and express that information as a statement of probability.

Heavens to Betsy, we can all do that. How hard could it be? Imagine that there are three marbles in an otherwise empty coffee can. Further imagine that you know that only one of the marbles is blue. There are three values in the distribution: one blue marble and two marbles of some other color, for a total sample size of three. There is one blue marble out of three marbles. Oh, statistician, what are the chances that, without looking, I will draw the blue marble out first? One out of three. 1/3. 33 percent.
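
If you would rather not take the statistician's word for it, a simulation gets to the same place. The sketch below assumes one blue marble and two others (the red and green are my invention), treats "without looking" as a fair shuffle, and uses a fixed seed only so the run is repeatable.

```python
import random

def blue_comes_out_first(rng):
    """Shuffle the can and report whether the blue marble is drawn first."""
    can = ["blue", "red", "green"]   # the non-blue colors are assumed
    rng.shuffle(can)                 # "without looking": every order equally likely
    return can[0] == "blue"

rng = random.Random(2006)            # fixed seed for a repeatable run
trials = 100_000
hits = sum(blue_comes_out_first(rng) for _ in range(trials))
estimate = hits / trials

print("Estimated chance of blue first:", round(estimate, 3))
```

The estimate should land close to one in three, and it drifts closer as the number of trials grows.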

To be fair, the values and their distributions most commonly used by statisticians are a bit more abstract or complex than those of the marbles in a coffee can scenario, and so much of what statisticians do is not quite that transparent. Applied social science researchers usually produce values that represent the difference between the average scores of several groups of people, for example, or an index of the size of the relationship between two or more sets of scores. The underlying process is the same as that used with the coffee can example, though: reference the known distribution of the value of interest and make a statement of probability about that value.

The key, of course, is how one knows the distribution of all these exotic types of values that might interest a statistician. How can one know the distribution of average differences or the distribution of the size of a relationship between two sets of variables? Conveniently, past researchers and mathematicians have developed or discovered formulas and theorems and rules of thumb and philosophies and assumptions that provide us with the knowledge of the distributions of these complex values most often sought by researchers. The work has been done for us.

A Smaller, Dirtier Secret

Most of the procedures that statisticians use to take known information about a distribution of scores and express that information as a statement of probability have certain requirements that must be met for the probability statement to be accurate. One of these assumptions that almost always must be met is that the values in a sample have been randomly drawn from the distribution.

Notice that in the coffee can example I slipped in that "without looking" business. If some force other than random chance is guiding the sampling process, then the associated probabilities reported are simply wrong and (here's the worst part) we can't possibly know how wrong they are. Much, and maybe most, of the applied psychological and educational research that occurs today uses samples of people that were not randomly drawn from some population of interest.

College students taking an introductory psychology course make up the samples of much psychological research, for example, and students at elementary schools conveniently located near where an educational researcher lives are often chosen for study. This is a problem that social science researchers live with or ignore or worry about, but, nevertheless, it is a limitation of much social science research.
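
Here is a rough Python sketch of why the random-draw assumption matters. Everything in it is invented for illustration: a population of 10,000 scores in which a convenient subgroup (call them "students") happens to score higher than everyone else. In real research you cannot compute the error this way, because the full population is exactly what you do not have.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for a repeatable run

# Hypothetical population: 2,000 "students" and 8,000 "others" with different averages.
students = [rng.gauss(110, 15) for _ in range(2_000)]
others = [rng.gauss(100, 15) for _ in range(8_000)]
population = students + others
pop_mean = statistics.mean(population)

# A random sample draws from everyone; a convenience sample draws only from students.
random_sample = rng.sample(population, 500)
convenience_sample = rng.sample(students, 500)

random_error = abs(statistics.mean(random_sample) - pop_mean)
convenience_error = abs(statistics.mean(convenience_sample) - pop_mean)

print("Population mean:", round(pop_mean, 1))
print("Error of random sample:", round(random_error, 1))
print("Error of convenience sample:", round(convenience_error, 1))
```

The convenience sample is just as large as the random one, so sample size alone does not rescue it.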
Hack 2. Describe the World Using Just Two Numbers

Most of the statistical solutions and tools presented in this book work only because you can look at a sample and make accurate inferences about a larger population. The Central Limit Theorem is the meta-tool, the prime directive, the king of all secrets that allows us to pull off these inferential tricks.

Statistics provide solutions to problems whenever your goal is to describe a group of scores. Sometimes the whole group of scores you want to describe is in front of you. The tools for this task are called descriptive statistics. More often, you can see only part of the group of the scores you want to describe, but you still want to describe the whole group. This summary approach is called inferential statistics. In inferential statistics, the part of the group of scores you can see is called a sample, and the whole group of scores you wish to make inferences about is the population.

It is quite a trick, though, when you think about it, to be able to describe with any confidence a population of values when, by definition, you are not directly observing those values. By using three pieces of information (two sample values and an assumption about the shape of the distribution of scores in the population), you can confidently and accurately describe those invisible populations. The set of procedures for deriving that eerily accurate description is collectively known as the Central Limit Theorem.
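
As a taste of the idea (not the theorem itself), the following Python sketch describes a population it pretends not to see. The population is invented: 100,000 normally shaped scores, which stands in for the "assumption about the shape" mentioned above. The two sample values are the sample mean and the sample standard deviation.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for a repeatable run

# Hypothetical population that, in real life, we would not get to look at.
population = [rng.gauss(50, 10) for _ in range(100_000)]

# All we actually observe: a random sample of 400 scores.
sample = rng.sample(population, 400)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

print("Sample mean and SD:", round(sample_mean, 1), round(sample_sd, 1))
print("Population mean and SD:",
      round(statistics.mean(population), 1),
      round(statistics.pstdev(population), 1))
```

Two numbers computed from 400 scores end up describing 100,000 scores fairly well, provided the sample really was drawn at random.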
Some Quick Statistics Basics

Inferential statistics tend to use two values to describe
populations, the mean and the standard deviation.

Mean

Rather than describe a sample of values by listing every single
score, it is more efficient to report some fair summary of the
group. This single number is meant to fairly represent all the
scores and what they have in common. Consequently, this single
number is referred to as the central tendency of a group of scores.

Typically, the best measure of central tendency, for a variety of
reasons, is the mean [Hack #21]. The mean is the arithmetic
average of all the scores and is calculated by adding together all
the values in a group, and then dividing that total by the number
of values. The mean provides more information about all the
scores in a group than other central tendency options (such as
reporting the middle score, the most common score, and so on).

In fact, mathematically, the mean has an interesting property. A
side effect of how it is created (adding up all scores and dividing
by the number of scores) is that it produces a number that is as
close as possible to all the other scores. The mean will be close
to some scores and far away from some others, but if you square
those distances and add them up, you get a total that is as small
as possible. No other number, real or imagined, will produce a
smaller total squared distance from all the scores in a group than
the mean.
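The "as close as possible" property can be checked numerically. The sketch below is only an illustration: the five scores and the helper names are made up, and it tries candidate centers in steps of 0.1 rather than proving anything. It suggests that the mean gives the smallest total when distance is measured as a squared difference, while with plain absolute distance the middle score (the median) wins.

```python
from statistics import mean, median

# Hypothetical scores, chosen only for illustration (not from the text).
scores = [2, 3, 5, 9, 21]

def total_squared_distance(center, values):
    """Sum of squared differences between each value and the center."""
    return sum((v - center) ** 2 for v in values)

def total_absolute_distance(center, values):
    """Sum of absolute differences between each value and the center."""
    return sum(abs(v - center) for v in values)

m = mean(scores)      # arithmetic average
md = median(scores)   # middle score

# Try every candidate center from 0.0 to 25.0 in steps of 0.1.
candidates = [c / 10 for c in range(0, 251)]
best_squared = min(candidates, key=lambda c: total_squared_distance(c, scores))
best_absolute = min(candidates, key=lambda c: total_absolute_distance(c, scores))

print("mean:", m, "best center for squared distance:", best_squared)
print("median:", md, "best center for absolute distance:", best_absolute)
```

The grid search is deliberately crude; it is there to make the claim visible, not to serve as an optimization method.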
Standard deviation

Just knowing the mean of a distribution doesn't quite tell us
enough. We also need to know something about the variability of
the scores. Are they mostly close to the mean or mostly far from
the mean? Two wildly different distributions could have the same
mean but differ in their variability. The most commonly reported
measure of variability summarizes the distances between each
score and the mean.

As with the mean, the more informative measure of variability
would be one that uses all the values in a distribution. A
measure of variability that does this is the standard deviation.
The standard deviation is, roughly, the average distance of each
score from the mean. It is calculated from all the distances in a
distribution, where the "distances" referred to are the distances
between each score and the mean.

Another commonly reported value that summarizes the variability
in a distribution is the variance. The variance is simply the
standard deviation squared and is not particularly useful in
picturing a distribution, but it is helpful when comparing
different distributions and is frequently used as a value in
statistical calculations, such as with the independent t test
[Hack #17].

The formula for the standard deviation appears to be more
complicated than it needs to be, but there are some
mathematical complications with summing distances (negative
distances always cancel out the positive distances when the
mean is used as the dividing point). Consequently, here is the
equation:

$$\text{standard deviation} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Σ means to sum up. The x means each score, x̄ is the mean of the
scores, and the n means the number of scores.

Central Limit Theorem

The Central Limit Theorem is fairly brief, but very powerful.
Behold the truth:

If you randomly select multiple samples from a population, the
means of each of those samples will be normally distributed.

Attached to the theorem are a couple of mathematical rules for
accurately estimating the descriptive values for this imaginary
distribution of sample means:

The mean of these means (that's a mouthful) will be
equal to the population mean. The mean of a single
sample is a good estimate for this mean of means.

The standard deviation of these means is equal to the
sample standard deviation divided by the square root of
the sample size, n:

$$\text{standard deviation of the means} = \frac{\text{sample standard deviation}}{\sqrt{n}}$$

These mathematical rules produce more accurate results, and
the distribution is closer to the normal curve as the sample size
within any sample gets bigger.

A sample size of 30 or more seems to be enough to produce
accurate applications of the Central Limit Theorem.

So What?

Okay, so the Central Limit Theorem appears somewhat
intellectually interesting and no doubt makes statisticians all
giggly and wriggly, but what does it all mean? How can anyone
use it to do anything cool?

As discussed in "Know the Big Secret" [Hack #1], the secret
trick that all statisticians know is how to solve problems
statistically by taking known information about the distribution
of some values and expressing that information as a statement
of probability. The key, of course, is how one knows the
distribution of all these exotic types of values that might
interest a statistician. How can one know the distribution of
average differences or the distribution of the size of a
relationship between two sets of variables? The Central Limit
Theorem, that's how.

For example, to estimate the probability that any two groups
would differ on some variable by a certain amount, we need to
know the distribution of means in the population from which
those samples were drawn. How could we possibly know what
that distribution is when the population of means is invisible and
might even be only theoretical? The Central Limit Theorem, Bub,
that's how! How can we know the distributions of correlations
(an index of the strength of a relationship between two variables)
which could be drawn from a population of infinite possible
correlations? Ever hear of the Central Limit Theorem, dude?

Because we know the proportion of values that reside all along
the normal curve [Hack #23], and the Central Limit Theorem
tells me that these summary values are normally distributed, I
can place probabilities on each statistical outcome. I can use
these probabilities to indicate the level of statistical
significance (the level of certainty) I have in my conclusions and
decisions. Without the Central Limit Theorem, I could hardly
ever make statements about statistical significance. And what a
drab, sad life that would be.

Applying the Central Limit Theorem

To apply the Central Limit Theorem, I need to start with only a
sample of values that I have randomly drawn from a population.
Imagine, for example, that I have a group of eight new Cub
Scouts. It's my job to teach them knot tying. I suspect, let's
say, that this isn't the brightest bunch of Scouts who have ever
come to me for knot-tying guidance.

Before I demand extra pay, I want to determine whether they are,
in fact, a few badges short of a bushel. I want to know their IQ. I
know that the population's average IQ is 100, but I notice that
no one in my group has an intelligence test score above 100. I
would expect at least some above that score. Could this group
have been selected from that average population? Maybe my
sample is just unusual and doesn't represent all Cubbies. A
statistical approach, using the Central Limit Theorem, would be
to ask:

Is it possible that the mean IQ of the population represented by
this sample is 100?

If I want to know something about the population from which my
Scouts were drawn, I can use the Central Limit Theorem to
pretty accurately estimate the population's mean IQ and its
standard deviation. I can also figure out how much difference
there is likely to be between the population's mean IQ and the
mean IQ in my sample.

I need some data from my Scouts to figure all this out. Table 1-1
should provide some good information.

Table 1-1. Scout smarts

Scout     IQ
Jimmy     100
Perry     95
Clark     90
Lex       92
Neil      85
Billy     88
Greg      93
John      91

The descriptive statistics for this sample of eight IQ scores are:

Mean IQ = 91.75

Standard deviation = 4.53
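These two descriptive values can be reproduced from the table. The sketch below uses Python's standard `statistics` module; the variable names are my own. Note that `stdev` divides by n - 1, which is the choice that matches the 4.53 reported here.

```python
from statistics import mean, stdev

# IQ scores for the eight Scouts, in table order (Jimmy ... John).
iq_scores = [100, 95, 90, 92, 85, 88, 93, 91]

sample_mean = mean(iq_scores)
# stdev() is the sample standard deviation (divides by n - 1).
sample_sd = stdev(iq_scores)

print(f"Mean IQ = {sample_mean:.2f}")
print(f"Standard deviation = {sample_sd:.2f}")
```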

So, I know in my sample that most scores are within about 4 1/2
IQ points of 91.75. It is the invisible population they came from,
though, that I am most interested in. The Central Limit Theorem
allows me to estimate the population's mean, standard
deviation, and, most importantly, how far sample means will
likely stray from the population mean:

Mean IQ

Our sample mean is our best estimate, so the population
mean is likely close to 91.75.

Standard deviation of IQ scores in the population

The formula we used to calculate our sample standard
deviation is designed especially to estimate the
population standard deviation, so we'll guess 4.53.

Standard deviation of the mean

This is the real value of interest. We know our sample
mean is less than 100, but could that be by chance?
How far would a mean from a sample of eight tend to
stray from the population mean when chosen randomly
from that population? Here's where we use the equation
from earlier in this hack. We enter our sample values to
produce our standard deviation of the mean, which is
usually called the standard error of the mean:

$$\frac{4.53}{\sqrt{8}} = 1.60$$

We now know, thanks to the Central Limit Theorem, that most
samples of eight Scouts will produce means that are within 1.6
IQ points of the population mean. It is unlikely, then, that our
sample mean of 91.75 could have been drawn from a population
with a mean of 100. A mean of 93, maybe, or 94, but not 100.

Because we know these means are normally distributed, we can
use our knowledge of the shape of the normal distribution [Hack
#23] to produce an exact probability that our mean of 91.75
could have come from a population with a mean of 100. It will
happen way less than 1 out of 100,000 times. It seems very
likely that my knot-tying students are tougher to teach than
normal. I might ask for extra money.
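Here is a minimal sketch of that last calculation, assuming (as the text does) that sample means follow the normal curve. The variable names are mine, and `NormalDist` comes from Python's standard `statistics` module. It divides the sample standard deviation by the square root of the sample size, expresses the gap between 91.75 and 100 in standard errors, and looks up how often a sample mean would fall that far below 100.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 91.75      # from the Scout sample
sample_sd = 4.53         # sample standard deviation
n = 8                    # number of Scouts
hypothesized_mean = 100  # population mean being questioned

# Standard error of the mean: sample SD divided by the square root of n.
standard_error = sample_sd / sqrt(n)

# How many standard errors the sample mean sits below 100.
z = (sample_mean - hypothesized_mean) / standard_error

# Chance of a sample mean this low or lower under the standard normal curve.
p_this_low = NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
print(f"probability = {p_this_low:.2e}")
```

With only eight scores, many statisticians would use a t distribution instead of the normal curve, which would give a larger (though still small) probability; the normal curve is used here only to follow the hack's own reasoning.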

Where Else It Works

A fuzzy version of the Central Limit Theorem points out that:

Data that are affected by lots of random forces and unrelated
events end up normally distributed.

As this is true of almost everything we measure, we can apply
the normal distribution characteristics to make probability
statements about most visible and invisible concepts.

We haven't even discussed the most powerful implication of the
Central Limit Theorem. Means drawn randomly from a population
will be normally distributed, regardless of the shape of the
population. Think about that for a second. Even if the population
from which you draw your sample of values is not normal, even if
it is the opposite of normal (like my Uncle Frank, for example),
the means you draw out will still be normally distributed.

This is a pretty remarkable and handy characteristic of the
universe. Whether I am trying to describe a population that is
normal or non-normal, on Earth or on Mars, the trick still works.
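A small simulation can make this claim easier to believe. The sketch below is an illustration under my own assumptions, not anything from the hack: the population is an exponential distribution (strongly skewed, with mean 1 and standard deviation 1), the sample size is 30, and the seed and number of samples are arbitrary. It draws many samples and compares the mean and spread of the sample means with what the two rules predict.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)  # arbitrary seed so the run is repeatable

POPULATION_MEAN = 1.0    # mean of the exponential population used here
POPULATION_SD = 1.0      # its standard deviation

def draw_sample(n):
    """Draw n values from a skewed (exponential) population."""
    return [rng.expovariate(1.0) for _ in range(n)]

n = 30
sample_means = [mean(draw_sample(n)) for _ in range(5000)]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)
predicted_sd = POPULATION_SD / sqrt(n)

# On a normal curve, roughly 68% of values fall within one SD of the center.
share_within_one = sum(
    abs(m - POPULATION_MEAN) <= predicted_sd for m in sample_means
) / len(sample_means)

print(f"mean of means: {mean_of_means:.3f}")
print(f"SD of means: {sd_of_means:.3f} (rule predicts {predicted_sd:.3f})")
print(f"share within one standard error: {share_within_one:.2f}")
```

Plotting a histogram of `sample_means` would show the bell shape directly; the summary numbers are used here to keep the sketch dependency-free.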
Hack 3. Figure the Odds

Will I win the lottery? Will I get struck by lightning and hit by a
bus on the same day? Will my basketball team have to meet our
hated rival early in the NCAA tournament? At its core, statistics
is all about determining the likelihood that something will
happen and answering questions like these. The basic rules for
calculating probability allow statisticians to predict the future.

This book is full of interesting problems that can be solved using
cool statistical tricks. While all the tools presented in these
hacks are applied in different ways in different contexts, many of
the procedures used in these clever solutions work because of a
common core set of elements: the rules of probability.

The rules are a key set of simple, established facts about how
probability works and how probabilities should be calculated.
Think of these two basic rules as a set of tools in a beginner's
toolbox that, like a hammer and screwdriver, are probably enough
to solve most problems:

Additive rule

The probability of any one of several mutually exclusive
events occurring is the sum of each event's probability.
(Mutually exclusive means that no two of the events can
happen at the same time, like two different totals on a
single roll of the dice.)

Multiplicative rule

The probability of a series of independent events all
occurring is the product of each event's probability.

These two tools will be enough to answer most of your everyday
"What are the chances?" questions.

Questions About the Future

When a statistician says something like "a 1 out of 10 chance of
happening," she has just made a prediction about the future. It
might be a hypothetical statement about a series of events that
will never be tested, or it might be an honest-to-goodness
statement about what is about to happen. Either way, she's
making a statistical statement about the likelihood of an
outcome, which is just about all statisticians ever say [Hack
#1].

If the following statement makes some intuitive sense to you,
then you have all the ability necessary to act and think like
a stat hacker: "If there are 10 things that might happen and all
10 things are equally likely to happen, then any 1 of those
things has a 1 out of 10 chance of happening."

Research is full of questions that are answered using statistics,
of course, and probability rules apply, but there are many
problems in the world outside the laboratory that are more
important than any stupid old science problem, like games with
dice, for example! Imagine you are a part-time gambler, baby
needs a new pair of shoes and all that, and the values showing
the next time you throw a pair of dice will determine your future.
You might want to know the likelihood of various outcomes of
that dice roll. You might want to know that likelihood very
precisely!

You can answer the three most important types of probability
questions that you are likely to ask using only your two-piece
probability toolkit. Your questions probably fall into one of these
three types:

How likely is it that a specific single outcome of interest
will occur next? For example, will a dice roll of 7 come up
next?

How likely is it that any of a group of outcomes of
interest will occur next? For example, will either a 7 or
11 come up next?

How likely is it that a series of outcomes will occur? For
example, could an honest pair of dice really be thrown all
night and a 7 never (I mean never!) come up?! I mean,
really, could it?! Could it?!

Probability Jargon

Before we talk about probability and how to determine it,
we need to learn how to talk like a statistician.
Remember the "1 out of 10 chance of happening"
statement? Here are three ways of answering the
question "What are the chances?":

As a percentage

1 out of 10 can be expressed as 10 percent.

As odds

The odds in a 1 out of 10 situation are 9 to 1 against, i.e.,
nine chances of losing against one chance of winning.

As a proportion

10 percent can be expressed as 0.10. Technically,
probabilities should be expressed as proportions or they
should be called something else.
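The three ways of stating a chance are simple conversions of one another. As a small sketch (the function name `describe_chance` and its return format are made up for illustration), the code below takes a count of winning outcomes and a count of all outcomes and reports the proportion, the percentage, and the odds against. It assumes at least one winning outcome.

```python
from fractions import Fraction

def describe_chance(wins, total):
    """Express 'wins out of total' as a proportion, a percentage, and odds against.

    Assumes 0 < wins <= total.
    """
    p = Fraction(wins, total)
    # Losing chances per winning chance, reduced to lowest terms.
    odds = Fraction(total - wins, wins)
    return {
        "proportion": float(p),
        "percentage": float(p * 100),
        "odds_against": f"{odds.numerator} to {odds.denominator}",
    }

one_in_ten = describe_chance(1, 10)
print(one_in_ten)
```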

Likelihood of a Specific Outcome

When you are interested in whether something is likely to
happen, that "something" can be called a winning event (if you
are talking about a game) or just an outcome of interest (if you
are talking about something other than a game). The primary
principle in probability is that you divide the number of outcomes
of interest by the total number of outcomes. The total number of
outcomes is sometimes symbolized with an S (for set), and all
the different outcomes of interest are sometimes symbolized as
A (because it is the first letter of the alphabet, I guess; what am
I, a mathematician?).

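So, here's the basic equation for probability:

$$\text{probability} = \frac{A}{S} = \frac{\text{number of outcomes of interest}}{\text{total number of outcomes}}$$
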
Figuring the chances of any particular outcome or event is a
matter of counting the number of those outcomes, counting the
number of all possible outcomes, and comparing the two. This is
easily done in most situations with a small number of possible
outcomes or a description of a winning outcome that is simple
and involves a single event.

To answer a typical dice roll question, we can determine the
chances of any specific value showing up on the next roll by
counting the number of possible combinations of two six-sided
dice that add up to the value of interest. Then, divide that
number by the total number of possible outcomes. With two
six-sided dice, there are 36 possible rolls.

For example, there are six ways to throw a 7 (I peeked ahead to
Table 1-2), and 6/36 = .167, so the percentage chance of
throwing a 7 on any single roll is about 17 percent.

Calculate the total number of possible dice rolls, or outcomes,
by multiplying the total number of sides on each die: 6 x 6 = 36.
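The counting can also be done by brute force. This sketch (variable names are my own) lists every possible roll of two six-sided dice, counts how many rolls produce each total, and divides by the number of rolls.

```python
from collections import Counter
from itertools import product

# Every possible (first die, second die) pair: 6 x 6 = 36 rolls.
rolls = list(product(range(1, 7), repeat=2))

# Number of ways to throw each total from 2 through 12.
ways = Counter(first + second for first, second in rolls)

# Probability of each total: ways to throw it divided by all possible rolls.
probability = {total: ways[total] / len(rolls) for total in sorted(ways)}

for total, p in probability.items():
    print(f"{total:>2}  {ways[total]}  {p:.3f}")
```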
Likelihood of a Group of Outcomes

If you are interested in whether any of a group of specific
outcomes will occur, but you don't care which one, the additive
rule states that you can figure your total probability by adding
together all the individual probabilities. To answer our dice
questions, Table 1-2 borrows some information from "Play with
Dice and Get Lucky" [Hack #43] to express probability for
various dice rolls as proportions.

Table 1-2. Probability of independent dice rolls

Dice roll   Number of outcomes   Probability
2           1                    0.028
3           2                    0.056
4           3                    0.083
5           4                    0.111
6           5                    0.139
7           6                    0.167
8           5                    0.139
9           4                    0.111
10          3                    0.083
11          2                    0.056
12          1                    0.028
Total       36                   1.0

Table 1-2 provides information for various outcomes. For
example, there are two different ways to roll a 3. Two winning
outcomes divided by a total of 36 different possible outcomes
results in a proportion of .056. So, about 6 percent of the time
you'll roll a 3 with two dice. Notice also that the probabilities for
every possible event add up to a perfect 1.0.

Let's apply the additive rule to see the chances of winning when,
to win, we must get any one of several different dice rolls. If you
will win with a roll of a 10, 11, or 12, for instance, add up the
three individual probabilities:

.083 + .056 + .028 = .167

You will roll a 10, 11, or 12 about 17 percent of the time. The
additive rule is used here because you are interested in whether
any one of several events will happen, and only one of those
events can occur on a given roll.
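The same sum can be done with exact fractions. In the sketch below, the counts of ways to throw each total are the ones listed in Table 1-2 and the variable names are my own. Because a single roll shows only one total, the three winning events cannot overlap, which is what allows their probabilities to be added.

```python
from fractions import Fraction

# Ways to throw each winning total with two dice (from Table 1-2).
ways_to_throw = {10: 3, 11: 2, 12: 1}
TOTAL_ROLLS = 36

# Additive rule: sum the probabilities of the non-overlapping winning totals.
p_win = sum(Fraction(ways, TOTAL_ROLLS) for ways in ways_to_throw.values())

print(p_win, f"= {float(p_win):.3f}")
```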

Likelihood of a Series of Outcomes

What about when the probability question is whether more than
one independent event will happen? This question is usually
asked when you want to know whether a sequence of specific
events will occur. The multiplicative rule gives the probability
of the events occurring in one particular order.

Using the data in Table 1-2 and the same three values of
interest from our previous example (10, 11, and 12), we can
figure the chance of a particular sequence of events occurring.
What is the probability that, on a given series of three dice rolls
in a row, you will roll a 10, then an 11, and then a 12? Under the
multiplicative rule, multiply the three individual probabilities
together:

.083 x .056 x .028 = .00013

This very specific outcome is very unlikely. It will happen less
than .1 percent, or 1/10 of 1 percent, of the time (about 13 times
in 100,000 tries). If the order did not matter, six different
orderings would count as a win, and the chance would be six
times as large. The multiplicative rule is used here because you
are interested in whether all of several independent events will
happen.
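As a sketch of the multiplicative rule with exact fractions (variable names are mine; the counts come from Table 1-2), the code below multiplies the three probabilities for rolling a 10, an 11, and a 12 on three successive rolls. It also shows, as an aside, what happens if the three totals may arrive in any order: there are 3 x 2 x 1 = 6 orderings, each with the same probability.

```python
from fractions import Fraction
from math import factorial

# Probabilities of each total on a single roll of two dice (from Table 1-2).
p_10 = Fraction(3, 36)
p_11 = Fraction(2, 36)
p_12 = Fraction(1, 36)

# Multiplicative rule: independent rolls, one specific order.
p_in_order = p_10 * p_11 * p_12

# Any order: 3! equally likely orderings of the same three totals.
p_any_order = factorial(3) * p_in_order

print(f"specific order: {p_in_order} = {float(p_in_order):.5f}")
print(f"any order:      {p_any_order} = {float(p_any_order):.5f}")
```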

What Probability Means

This hack talks about probability as the likelihood that
something will happen. As I have placed our discussion within
the context of analyzing possible outcomes, this is an
appropriate way to think about probability. Among philosophers
and social scientists who spend a lot of time thinking about
concepts such as chance and the future and what's for lunch,
there are two different views of probability.

Analytic view

This classic view of probability is the view of the mathematician
and the approach used in this hack. The analytic view identifies
all possible outcomes and produces a proportion of winning
outcomes to all possible outcomes. That proportion is the
probability.

We are predicting the future with the probability statement, and
the accuracy of the prediction is unlikely to ever be tested. It is
like when the weather forecaster says there is a 60 percent
chance of rain. When it doesn't rain, we unfairly say the forecast
was wrong, though, of course, we haven't really tested the
accuracy of the probability statement.

Relative frequency view

Under the framework of this competing view, the probability of
events is determined by collecting data and seeing what actually
happened and how often it happened. If we rolled a pair of dice a
thousand times and found that a 10 or an 11 or a 12 came up
about 17 percent of the time, we would say that the chance of
rolling one of those values is about 17 percent.
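The thousand-roll experiment is easy to imitate with a random number generator. The sketch below is only a simulation, with an arbitrary seed and made-up names, but it shows the relative frequency approach in miniature: roll, count, and report what proportion of rolls came up 10, 11, or 12, next to the analytic value of 6/36.

```python
import random

rng = random.Random(2006)  # arbitrary seed so the run is repeatable

def roll_pair(generator):
    """Total showing on two fair six-sided dice."""
    return generator.randint(1, 6) + generator.randint(1, 6)

n_rolls = 1000
wins = sum(roll_pair(rng) in (10, 11, 12) for _ in range(n_rolls))

observed = wins / n_rolls   # relative frequency view: what actually happened
analytic = 6 / 36           # analytic view: winning outcomes / all outcomes

print(f"observed proportion: {observed:.3f}")
print(f"analytic probability: {analytic:.3f}")
```

The two numbers will rarely match exactly; how far apart they tend to be depends on the number of rolls.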

Our statement would really be about the past, not a prediction of
the future. One might assume that past events give us a good
idea of what the future holds, but who can know for sure? (Those
of us who hold the analytic view of probability can know for sure,
that's who.)
Hack 4. Reject the Null

Experimental scientists make progress by making a guess that
they are sure is wrong.

Science is a goal-driven process, and the goal is to build a body
of knowledge about the world. The body of knowledge is
structured as a long list of scientific laws, rules, and theories
about how things work and how they are. Experimental science
introduces new laws and theories and tests them through a
logical set of steps known as hypothesis testing.

Hypothesis Testing

A hypothesis is a guess about the world that is testable. For
example, I might hypothesize that washing my car causes it to
rain or that getting into a bathtub causes the phone to ring. In
these hypotheses, I am suggesting a relationship between car
washing and rainfall or between bathing and phone calls.

A reasonable way to see whether these hypotheses are true is to
make observations of the variables in the hypothesis (for the
sake of sounding like statisticians, we'll call that collecting data)
and see whether a relationship is apparent. If the data suggest
there is a relationship between my variables of interest, my
hypothesis is supported, and I might reasonably continue to
believe my guess is correct. If no relationship is apparent in the
data, then I might wisely begin to doubt that my hypothesis is
true or even reject it altogether.

There are four possible outcomes when scientists test
hypotheses by collecting data. Table 1-3 shows the possible
outcomes for this decision-making process.

Table 1-3. Possible outcomes of research hypothesis testing

                         Hypothesis is correct:     Hypothesis is wrong:
                         the world really is        the world really is
                         this way                   not this way

Data does support        A. Correct decision:       B. Wrong decision:
hypothesis: accept       science makes progress.    science is thwarted!
hypothesis

Data does not support    C. Wrong decision:         D. Correct decision:
hypothesis: reject       drat, foiled again!        science makes progress.
hypothesis

Outcomes A and D add to science's body of knowledge. Though
A is more likely to make a research scientist all wriggly, D is
just fine. Outcomes B and C, though, are mistakes, and
represent misinformation that only confuses our understanding
of the world.

Statistical Hypothesis Testing

The process of hypothesis testing probably makes sense to
you; it is a fairly intuitive way to reach conclusions about the
world and the people in it. People informally do this sort of
hypothesis testing all the time to make sense of things.

Statisticians also test hypotheses, but hypotheses of a very
specific variety. First, they have data that represents a sample
of values from a real or theoretical population about which they
wish to reach conclusions. So, their hypotheses are about
populations. Second, they usually have hypotheses about the
existence of a relationship among variables in the population of
interest. A generic statistician's research hypothesis looks like
this: there is a relationship between variable X and variable Y in
the population of interest.

Unlike research hypothesis testing, with statistical hypothesis
testing, the probability statement that a statistician makes at
the end of the hypothesis testing process is not related to the
likelihood that the research hypothesis is true. Statisticians
produce probability statements about the likelihood that the
research hypothesis is false. To be more technically accurate,
statisticians make a statement about whether a hypothesis
opposite to the research hypothesis is likely to be correct. This
opposite hypothesis is typically a hypothesis of no relationship
among variables, and is called the null hypothesis. A generic
statistician's null hypothesis looks like this: there is no
relationship between variable X and variable Y in the population
of interest.

The research and null hypotheses cover all the bases. There
either is or is not a relationship among variables. Essentially,
when having to choose between these two hypotheses,
concluding that one is false provides support for the other.
Logically, then, this approach is just as sound as the more
intuitive approach presented earlier and utilized naturally by
humans every day. The preferred outcome by researchers
conducting null hypothesis testing is a bit different than the
general hypothesis-testing approach presented in Table 1-3.

As Table 1-4 shows, statisticians usually wish to reject their
null hypothesis. It is by rejecting the null that statistical
researchers confirm their research hypotheses, get the grants,
receive the Nobel prize, and one day are rewarded with their
faces on a postage stamp.

Table 1-4. Possible outcomes of null hypothesis testing

                         Null hypothesis is         Null hypothesis is
                         correct: there is no       wrong: there is a
                         relationship in the        relationship in the
                         population                 population

Data does support        A. Correct decision:       B. Wrong decision:
null hypothesis:         science makes progress.    science is thwarted!
fail to reject the
null

Data does not support    C. Wrong decision:         D. Correct decision:
null hypothesis:         drat, foiled again!        science makes progress.
reject the null

Although outcome A is still OK (as far as science is concerned),
it is now outcome D that pleases researchers because it
indicates support for their real guesses about the world, their
research hypotheses. Outcomes B and C are still mistakes that
hamper scientific progress.

Why It Works
Statisticians test the null hypothesis (guess the opposite of what
they hope to find) for several reasons. First, proving something to
be true is really, really tough, especially if the hypothesis
involves a specific value, as statistical research often does. It
is much easier to prove that a precise guess is wrong than prove
that a precise guess is true. I can't prove that I am 29 years
old, but it would be pretty easy to prove I am not.

It is also comparatively easy to show that any particular
estimate of a population value is not likely to be correct. Most
null hypotheses in statistics suggest that a population value is
zero (i.e., there is no relationship between X and Y in the
population of interest), and all it takes to reject the null is to
argue that whatever the population value is, it probably isn't
zero. Support for researchers' hypotheses generally comes by
simply demonstrating that the population value is greater than
nothing, without specifically saying what that population value is
exactly.

Quite a perk for the professional statistician, eh? All the
statistician has to do is tell you that your answer is wrong,
not tell you what the right answer is!

Even without using numbers as an example, philosophers of
science have long argued that progress is best made in science
by postulating hypotheses and then attempting to prove that
they are wrong. For good science, falsifiable hypotheses are the
best kind.

It is the custom to conduct statistical analyses this way:
present a null hypothesis that is the opposite of the research
hypothesis and see whether you can reject the null. R.A. Fisher,
the early 20th century's greatest statistician, suggested this
approach, and it has stuck. There are other methods, though.
Plenty of modern statisticians have argued that we should
concentrate on producing the best estimate of those population
values of interest (such as the size of relationships among
variables), instead of focusing on proving that the relationship is
the size of some nonspecified number not equal to zero.
Hack 5. Go Big to Get Small

The best way to shrink your sampling error is to increase your
sample size.

Whenever researchers are playing around with samples instead
of whole populations, they are bound to make some mistakes.
Because the basic trick of inferential statistics is to measure a
sample and use the results to make guesses about a population
[Hack #2], we know that there will always be some error in our
guesses about the values in those populations. The good news
is that we also know how to make the size of those errors as
small as possible. The solution is to go big.

An early principle suggested in a gambling context was
presented by Jakob Bernoulli (in 1713), who called his principle
the Golden Theorem. It was later labeled by others (starting with
Siméon-Denis Poisson in 1837) as the Law of Large
Numbers. It is likely the single most useful discovery in the
history of statistics and provides the basis for the key generic
advice for all researchers: increase your sample size!

The early history of the science of applied statistics (we're
talking the 17th and 18th centuries) is framed in the language of
gambling and probability. This might be because it gave the
gentlemen scholars of the time an excuse to combine their
intellectual pursuits with pursuits of a less intellectual nature.
The Laws of Probability, of course, are legitimately the
mathematical basis for statistical procedures and inferences, so
it might be that gambling applications were used simply as the
best teaching examples for these central statistical concepts.

Laying Down the Law

One application of the Law is its effect on probability and
occurrences. The Law includes the consequence that the
increase in the accuracy of predicting outcomes governed by
chance is a set amount. That is, the increase in accuracy is
known. The expected distance between the probability of a
certain outcome and the actual proportion of occurrences you
observe decreases as the number of trials increases, and the
exact size of this expected gap between expected and observed
can be calculated. The generic name for this expected gap is the
standard error [Hack #18].

The size of the difference between the theoretical probability of
an outcome and the proportion of times it actually occurs is
proportional to:

$$\frac{1}{\sqrt{\text{sample size}}}$$

You can think of this formula as the mathematical expression of
the Law of Large Numbers. For discussions of accuracy in the
context of probability and outcome, the sample size is the
number of trials. For discussions of accuracy in the context of
sample means and population means, the sample size is the
number of people (or random observations) in the sample.
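As a rough illustration, the shrinking gap can be simulated. The sketch below assumes a fair coin (probability .5) as the chance process and assumes the gap shrinks roughly in proportion to one over the square root of the number of trials, as Table 1-5 implies. The function name `average_gap`, the seed, and the number of repeats are illustrative choices, not part of the original text.

```python
import random
import statistics

def average_gap(n_trials, n_repeats=1000, p=0.5, seed=1):
    """Average distance between the observed proportion and p,
    taken over many repeated experiments of n_trials each."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_repeats):
        hits = sum(rng.random() < p for _ in range(n_trials))
        gaps.append(abs(hits / n_trials - p))
    return statistics.mean(gaps)

# Each 100-fold increase in trials should cut the gap roughly 10-fold.
for n in (10, 100, 1000):
    print(n, round(average_gap(n), 4))
```

If the Law behaves as described, the printed gap for 1,000 trials should be about one-tenth of the gap for 10 trials.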

Improving Accuracy

The specific values affected by the Law depend on the scale of
measurement used and the amount of variability in a given
sample. However, we can get a sense of the improvement or
increase in accuracy made by various changes in sample sizes.
Table 1-5 shows proportional increases in accuracy for all
inferential statistics. So speaketh the Law.

Table 1-5. Effect of increasing sample size

Sample size | Relative decrease in error size | Meaning
1 | 1 | The error is equal to the standard deviation of the variable in the population.
10 | 3.16 | The error is about a third of its previous size. Just using 10 observations instead of 1 has dramatically increased our accuracy.
30 | 5.48 | An increase from 1 to 30 people will dramatically improve accuracy. Even the jump from 10 to 30 is useful.
100 | 10 | A sample of 100 people produces an estimate much closer to the population value (or expected probability). The size of the error with 100 people in a sample is just 1/10 of a standard deviation.
1,000 | 31.62 | Estimates with so many observations are remarkably precise.
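The middle column of the table is simply the square root of the sample size. A minimal Python sketch that reproduces it (the function name `relative_decrease_in_error` is an illustrative choice):

```python
import math

def relative_decrease_in_error(sample_size):
    """Factor by which the error shrinks compared with a sample of 1."""
    return math.sqrt(sample_size)

for n in (1, 10, 30, 100, 1000):
    print(f"{n:>5}  {relative_decrease_in_error(n):.2f}")
```

Read the other way around, the error for a sample of size n is the population standard deviation divided by this factor.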

Why It Works

Let's look at this important statistical principle from several
different angles. I'll state the law using three different
approaches, beginning with the gambler's concerns, moving on
to the issue of error, and ending with the implications for
gathering a representative sample. All of the entries in this list
are the exact same rule, just stated differently.

Gambling

If an event has a certain probability of occurring on a single
trial, then the proportion of occurrences of the event over an
infinite number of trials will equal that probability. As the
number of trials approaches infinity, the proportion of
occurrences approaches that probability.

Error

If a sample is infinitely large, the sample statistics will be
equal to the population parameters. For example, the distance
between the sample mean and the population mean decreases as the
sample size approaches infinity. Errors in estimating population
values shrink toward zero as the number of observations
increases.

Implications

Samples are more representative of the population from which
they are drawn when they include many people than when they
include fewer people. The number of important characteristics in
the population represented in a sample increases, as does the
precision of their estimates, as the sample size gets larger.

All these statements of the Law of Large Numbers are true only if
based on the assumption that the occurrences or the sampling take
place randomly.

In addition to providing the basis for calculations of standard
errors, the Law of Large Numbers affects other core statistical
issues such as power [Hack #8] and the likelihood of rejecting
the null hypothesis when you should not [Hack #4]. Jakob
Bernoulli's gambling pals might have been most interested in his
Golden Theorem because they could get a sense of how many
dice rolls it would take before the proportion of 7s rolled
approached .166 or 16.6 percent, and could then do some solid
financial planning.
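To put a number on that planning, here is a hedged sketch. It assumes two fair dice (so the chance of a 7 is 1/6) and uses the usual standard error of a proportion, the square root of p(1 - p)/n, which is not spelled out in the text above. The function name `rolls_needed` and the target gap are illustrative.

```python
import math

P_SEVEN = 1 / 6  # six of the 36 two-dice combinations sum to 7

def rolls_needed(target_standard_error, p=P_SEVEN):
    """Smallest number of rolls whose standard error is at or below the target."""
    return math.ceil(p * (1 - p) / target_standard_error ** 2)

# Rolls needed for the observed proportion of 7s to sit, on average,
# within about one percentage point of 1/6.
print(rolls_needed(0.01))
```

Because the error shrinks with the square root of the number of rolls, asking for a gap ten times smaller requires roughly a hundred times as many rolls.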

For the last 300 years, though, all of social science has made
use of this elegant tool to estimate how accurately something we
see describes something we cannot see. Thanks, Jake!

See Also

"Find Out Just How Wrong You Really Are" [Hack #18]

Hack 6. Measure Precisely

Classical test theory provides a nice analysis of the components
that combine to produce a score on any test. A useful
implication of the theory is that the level of precision for test
scores can be estimated and reported.

A good educational or psychological test produces scores that
are valid and reliable. Validity is the extent to which the score
on a test represents the level of whatever trait one wishes to
measure, and the extent to which the test is useful for its
intended purpose. To demonstrate validity, you must present
evidence and theory to support that the interpretations of the
test scores are correct.

Reliability is the extent to which a test consistently produces
the same score upon repeated measures of the same person.
Demonstrating reliability is a matter of collecting data that
represent repeated measures and analyzing them statistically.

Classical Test Theory

Classical test theory, or reliability theory, examines the concept
of a test score. Think of the observed score (the score you got)
on a test you took sometime. Classical test theory defines that
score as being made up of two parts and presents this
theoretical equation:

Observed Score = True Score + Error Score

This equation is made up of the following elements:

Observed score

The actual reported score you got on a test. This is
typically equal to the number of items answered
correctly or, more generally, the number of points earned
on the test.

True score

The score you should have gotten. This is not the score
you deserve, though, or the score that would be the most
valid. True score is defined as the average score you
would get if you took the same test an infinite number of
times. Notice this definition means that true scores
represent only average performance and might or might
not reflect the trait that the test is designed to measure.
In other words, a test might produce true scores, but not
produce valid scores.

Error score

The distance of your observed score from your true
score.

Under this theory, it is assumed that performance on any test is
subject to random error. You might guess and get a question
correct on a social studies quiz when you don't really know the
answer. In this case, the random error helps you.

Notice this is still a measurement "error,"
even though it increased your score.

You might have cooked a bad egg for breakfast and,
consequently, not even notice the last set of questions on an
employment exam. Here, the random error hurt you. The errors
are considered random, because they are not systematic, and
they are unrelated to the trait that the test hopes to measure.
The errors are considered errors because they change your
score from your true score.

Over many testing times, these random errors should
sometimes increase your score and sometimes decrease it, but
across testing situations, the error should even out. Under
classical test theory, reliability [Hack #31] is the extent to
which test scores are free of random fluctuation from occasion to
occasion. A number representing reliability is often calculated
by looking at the correlations among the items on the test. This
index ranges from 0.0 to 1.0, with 1.0 representing a set of
scores with no random error at all. The closer the index is to
1.0, the less the scores fluctuate randomly.
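The idea that random errors even out can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the true score of 500, the error spread of 28 points, and the normally distributed errors are all made-up inputs, and `simulate_observed_scores` is a hypothetical helper name.

```python
import random
import statistics

def simulate_observed_scores(true_score, error_sd, n_administrations, seed=42):
    """Observed Score = True Score + Error Score, with random errors
    drawn from a normal distribution centered on zero (an assumption)."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_administrations)]

scores = simulate_observed_scores(true_score=500, error_sd=28,
                                  n_administrations=10_000)

# Any single administration can miss the true score by a wide margin...
print("first administration:", round(scores[0]))
# ...but the average over many administrations settles near the true score.
print("average of 10,000:", round(statistics.mean(scores), 1))
```

A single observed score carries its full error; only the long-run average approaches the true score, which is exactly the definition of true score given above.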

Standard Error of Measurement

Even though random errors should cancel each other out across
testing situations, less than perfect reliability is a concern
because, of course, decisions are almost always made based on
scores from a single test administration. It doesn't do you any
good to know that in the long run, your performance would reflect
your true score if, for example, you just bombed your SAT
because the person next to you wore distracting cologne.

Measurement experts have developed a formula that computes a
range of scores in which your true level of performance lies. The
formula makes use of a value called the standard error of
measurement. In a population of test scores, the standard error
of measurement is the average distance of each person's
observed score from that person's true score. It is estimated
using information about the reliability of the test and the amount
of variability in the group of observed scores as reflected by the
standard deviation of those scores [Hack #2].

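The formula for the standard error of measurement is:

$$\text{standard error of measurement} = \text{standard deviation} \times \sqrt{1 - \text{reliability}}$$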

Here is an example of how to use this formula. The Graduate
Record Exam (GRE) tests provide scores required by many
graduate schools to help in making admission decisions. Scores
on the GRE Verbal Reasoning test range from 200 to 800, with a
mean of about 500 (it's actually a little less than that in recent
years) and a standard deviation of 100.

Reliability estimates for scores from this test are typically
around .92, which indicates very high reliability. If you receive a
score of 520 when you take this exam, congratulations, you
performed higher than average. 520 was your observed score,
though, and your performance was subject to random error. How
close is 520 to your true score? Using the standard error of
measurement formula, our calculations look like this:

1. 1 - .92 = .08

2. The square root of .08 is .28

3. 100 x .28 = 28

The standard error of measurement for the GRE is about 28
points, so your score of 520 is most likely within 28 points of
what you would score on average if you took the test many
times.
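The three calculation steps above can be sketched in Python. The function name `standard_error_of_measurement` is an illustrative choice; the inputs are the GRE figures quoted in the text.

```python
import math

def standard_error_of_measurement(standard_deviation, reliability):
    """Standard deviation of the scores times the square root of
    (1 - reliability), following the three steps worked above."""
    return standard_deviation * math.sqrt(1 - reliability)

sem = standard_error_of_measurement(standard_deviation=100, reliability=0.92)
print(round(sem, 1))  # about 28 points
```

Trying other reliability values shows how quickly precision degrades: the less reliable the test, the wider the band around any observed score.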

Building Confidence Intervals

What does it mean to say that an observed score is most likely
within one standard error of measurement of the true score? It is
accepted by measurement statisticians that 68 percent of the
time, an observed score will be within one standard error of
measurement of the true score. Applied statisticians like to be
more than 68 percent sure, however, and usually prefer to report
a range of scores around the observed score that will contain the
true score 95 percent of the time.

To be 95 percent sure that one is reporting a range of scores
that contains an individual's true score, one should report a
range constructed by adding and subtracting about two standard
errors of measurement. Figure 1-1 shows what confidence intervals
around a score of 520 on the GRE Verbal test look like.

Figure 1-1. Confidence intervals for a GRE score of 520

Why It Works

The procedure for building confidence intervals using the
standard error of measurement is based on the assumptions that
errors (or error scores) are random, and that these random
errors are normally distributed. The normal curve [Hack #25]
shows up here as it does all over the world of human
characteristics, and its shape is well known and precisely
defined. This precision allows for the calculation of precise
confidence intervals.

The standard error of measurement is a standard deviation. In
this case, it is the standard deviation of error scores around the
true score. Under the normal curve, 68 percent of values are
within one standard deviation of the mean, and 95 percent of
scores are within about two standard deviations (more exactly,
1.96 standard deviations). It is this known set of probabilities
that allows measurement folks to talk about 95 percent or 68
percent confidence.

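A minimal sketch of the interval calculation, assuming normally distributed errors as described above. The function name `confidence_interval` is illustrative, and the multiplier is taken from Python's standard-library normal distribution rather than rounded to 2.

```python
import math
from statistics import NormalDist

def confidence_interval(observed, standard_deviation, reliability, level=0.95):
    """Range around an observed score built from the standard error
    of measurement and the matching normal-curve multiplier."""
    sem = standard_deviation * math.sqrt(1 - reliability)
    # About 1.96 for a 95 percent interval, about 1.0 for 68 percent.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return observed - z * sem, observed + z * sem

low, high = confidence_interval(520, standard_deviation=100, reliability=0.92)
print(round(low, 1), round(high, 1))
```

With the exact multiplier and an unrounded standard error, the limits differ by about a point from the values you get by adding and subtracting two standard errors of 28 points (464 and 576).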
What It Means

How is knowing the 95 percent confidence interval for a test
score helpful? If you are the person who is requiring the test and
using it to make a decision, you can judge whether the test taker
is likely to be within reach of the level of performance you have
set as your standard of success.

If you are the person who took the test, then you can be pretty
sure that your true score is within a certain range. This might
encourage you to take the test again with some reasonable
expectation of how much better you are likely to do by chance
alone. With your score of 520 on the GRE, you can be 95
percent sure that if you take the test again right away, your new
score could be as high as 576. Of course, it could drop and be
as low as 464 the next time, too.

Hack 7. Measure Up

Four levels of measurement determine how the scores produced
in measurement can be used. If you have not measured at the
right level, you might not be able to play with those scores the
way you want.

Statistical procedures analyze numbers. The numbers must
have meaning, of course; otherwise, the exercises are of little
value. Statisticians call numbers with meaning scores. Not all
the scores used in statistics, however, are created equal. Scores
have different amounts of information in them, depending on the
rules followed for creating the scores.

When you decide to measure something, you must choose the
rules by which you assign scores very carefully. The level of
measurement determines which sorts of statistical analyses are
appropriate, which will work, and which will be meaningful.

Measurement is the meaningful assignment of numbers to things.
The things can be concrete objects, such as rocks, or
abstract concepts, such as intelligence.

Here's an example of what I mean when I say not all scores are
created equal. Imagine your five children took a spelling test.
Chuck scored a 90, Dick and Jan got 80s, Bob scored 75, and
Don got only 50 out of 100 correct. If a friend asked how your
kids did on the big test, you might report that they averaged 75.
This is a reasonable summary. Now, imagine that your five
children ran a foot race against each other. Bob was first, Jan
second, Dick third, Chuck fourth, and Don fifth. Your nosey friend
again asks how they did. With a proud smile, you report that they
averaged third place. This is not such a reasonable summary,
because it provides no information. In both cases, though,
scores were used to indicate performance. The difference lies
only in the level of measurement used.
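A short Python sketch makes the contrast concrete. The scores and finishing places come from the example above; the check over every possible finishing order is an added illustration of why "averaged third place" carries no information.

```python
from itertools import permutations
from statistics import mean

spelling_scores = [90, 80, 80, 75, 50]  # Chuck, Dick, Jan, Bob, Don
race_places = [1, 2, 3, 4, 5]           # Bob, Jan, Dick, Chuck, Don

print(mean(spelling_scores))  # the average test score summarizes performance
print(mean(race_places))      # the average finishing place

# Whatever order five runners finish in, the average place is the same,
# so reporting it tells the listener nothing about the race.
distinct_means = {mean(order) for order in permutations(race_places)}
print(distinct_means)
```

The mean of the spelling scores would change if any child did better or worse; the mean of the places cannot change no matter how the race turns out.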

There are four levels of measurement, that is, four ways that
numbers are used as scores. The levels differ in the amount of
information provided and the types of mathematical and
statistical analyses that can be meaningfully conducted on
them. The four levels of measurement are nominal, ordinal,
interval, and ratio.

Using Numbers as Labels

If you are planning to use scores to indicate only that the things
belong to different groups, measure at the nominal level. The
nominal level of measurement uses numbers only as names:
labels for various categories (nominal means "in name only").

For example, a scientist who collects data on men and women,
using a 1 to indicate a male subject and a 2 to indicate a female
subject, is using the numbers at a nominal level. Notice that
even though the number 2 is mathematically greater than the
number 1, a 2 in this data set does not mean more of anything. It
is used only as a name.

Using Numbers to Show Sequence

If you want to analyze your scores in ways that rely on
performance measured as evidence of some sequence or order,
measure at the ordinal level. Ordinal measurement provides all
the information the nominal level provides, but it adds
information about the order of the scores. Numbers with greater
values can be compared with numbers at lower values, and the
people or otters or whatever was measured can be placed into a
meaningful order.

Take, for example, your rank order in your high school class. The
valedictorian is usually the person who received a score of 1
when grade point averages are compared. Notice that you can
compare scores to each other, but you don't know anything about
the distance between the scores. In a footrace, the first-place
finisher might have been just a second ahead of the second-place
runner, while the second-place runner might have been 30
seconds ahead of the runner who came in third place.

Using Numbers to Show Distance

Interval level measurement uses numbers in a way that provides
all the information of earlier levels, but adds an element of
precision. This level of measurement produces scores that are
interpreted as having an equal difference between any two
adjacent scores.

For example, on a Fahrenheit thermometer, the meaningful
difference between 70 and 69 degrees (1 degree) is equal to the
difference between 32 and 31 degrees. That one degree is
assumed to be the same amount of heat (or, if you prefer,
pressure on the liquid in the thermometer), regardless of where
on the scale the interval exists.

The interval level provides much more information than the
ordinal level, and you can now meaningfully average scores.
Most educational and psychological measurement takes place
at the interval level.

Though interval level measurement would seem to solve all of
our problems in terms of what we can and cannot do statistically,
there are still some mathematical operations that are not
meaningful at this level. For instance, we don't make
comparisons using fractions or proportions. Think about the way
we talk about temperature. If a 40-degree day follows an
80-degree day, we do not say, "It is half as hot today as
yesterday." We also don't refer to a student with a 120 IQ as
"one-third smarter" than a student with a 90 IQ.

The word interval is a term from old-time castle architecture.
You know those tall towers or turrets where archers were
stationed for defense? Around the circular tops, there was
typically a pattern of a protective stone, then a gap for
launching arrows, followed by another protective stone, and so
on. The gaps were called intervals ("between walls"), and the
best designed defenses had the stones and gaps at equal intervals
to provide 360-degree protection.

Using Numbers to Count in Concrete Ways

The highest level of measurement, ratio, provides all the
information of the lower levels but also allows for proportional
comparisons and the creation of percentages. Ratio level
measurement is actually the most common and intuitive way in
which we observe and take accounting of the natural world. When
we count, we are at the ratio level. How many dogs are on your
neighbor's porch? The answer is at the ratio level.

Ratio level measurement provides so much information and
allows for all possible statistical manipulations because ratio
scales use a true zero. A true zero means that a person could
score 0 on the scale and really have zero of the characteristic
being measured. Though a Fahrenheit temperature scale, for
example, does have a zero on it, a zero-degree day does not
mean there is absolutely no heat. On interval scales, such as in
our thermometer example, scores can be negative numbers. At
the ratio level of measurement, there are no negative numbers.

Choosing Your Level of Measurement

Which level of measurement is right for you? Because of the
advantages of moving to at least the interval level, most social
scientists prefer to measure at the interval or ratio level. At the
interval level, you can safely produce descriptive statistics and
conduct inferential statistical analyses, such as t tests,
analyses of variance, and correlational analyses. Table 1-6
provides a summary of the strengths and weaknesses of each
level of measurement.

Table 1-6. Levels of measurement

Level of measurement | Strength | Weakness
Nominal | Describes categorical data. | Numbers do not indicate quantity.
Ordinal | Allows comparison between scores. | Difficult to summarize scores.
Interval | Most statistical analyses are possible. | Proportional comparisons are not possible.
Ratio | True zero allows for all possible statistical analyses. | Some variables of interest do not have a true zero.

To choose the correct statistical analysis of data created by
others, identify the level of measurement used and benefit from
its strengths. If you are creating the data yourself, consider
measuring up: using the highest level of measurement that you
can.

Controversial Tools

Since the levels of measurement became commonly accepted in
the 1950s, there has been some debate about whether we really
need to clearly be at the interval level to conduct statistical
analyses. There are many common forms of measurement (e.g.,
attitude scales, knowledge tests, or personality measures) that
are not unequivocally at the interval level, but might be
somewhere near the top of the ordinal level range. Can we safely
use this level of data in analyses requiring interval scaling?

A majority consensus in the research literature is that if you are
at least at the ordinal level and believe that you can make
meaning out of interval-level statistical analyses, then you can
safely perform inferential statistical analyses on this type of
data. In the real world of research, by the way, almost everybody
chooses this approach (whether they know it or not).

The basic value of making analytical decisions based on level of
measurement is hard to deny, however. A classic example of the
importance of measurement levels is described by Frederick
Lord in his 1953 article "On the Statistical Treatment of
Football Numbers" (American Psychologist, Vol. 8, 750-751). An
absent-minded statistician eagerly analyzes some data given
him concerning the college football team, and produces a report
full of means and standard deviations and other sophisticated
analyses. The data, though, turn out to be the numbers from the
backs of the players' jerseys. A clear instance of not paying
attention to level of measurement, perhaps, but the statistician
stands by his report. The numbers themselves, he explains,
don't know where they came from; they behave the same way
regardless.

Hack 8. Power Up

Success in social science research is typically defined by the
discovery of a statistically significant finding. To increase the
chances of finding something, anything, the primary goal of the
statistically savvy super-scientist should be to increase power.

There are two potential pitfalls when conducting statistically
based research. Scientists might decide that they have found
something in a population when it really exists only in their
sample. Conversely, scientists might find nothing in their sample
when, in reality, there was a beautiful relationship in the
population just waiting to be found.

The first problem is minimized by sampling in a way that
represents the population [Hack #19]. The second problem is
solved by increasing power.

Power

In social science research, a statistical analysis frequently
determines whether a certain value observed in a sample is
likely to have occurred by chance. This process is called a test
of significance. Tests of significance produce a p-value
(probability value), which is the probability that the sample value
could have been drawn from a particular population of interest.
The lower the p-value, the more confident we are in our beliefs
that we have achieved statistical significance and that our data
reveals a relationship that exists not only in our sample but also
in the whole population represented by that sample. Usually, a
predetermined level of significance is chosen as a standard for
what counts. If the eventual p-value is equal to or lower than
that predetermined level of significance, then the researcher has
achieved a level of significance.

Statistical analyses and tests of significance are not limited to
identifying relationships among variables, but the most common
analyses (t tests, F tests, chi-squares, correlation coefficients,
regression equations, etc.) usually serve this purpose. I talk
about relationships here because they are the typical effect
you're looking for.

The power of a statistical test is the probability that, given that
there is a relationship among variables in the population, the
statistical analysis will result in the decision that a level of
significance has been achieved. Notice this is a conditional
probability. There must be a relationship in the population to
find; otherwise, power has no meaning.

Power is not the chance of finding a significant result; it is the
chance of finding that relationship if it is there to find. The
formula for power contains three components:

Sample size

The predetermined level of significance (p-value) to beat
(be less than)

The effect size (the size of the relationship in the
population)

Conducting a Power Analysis

Let's say we want to compare two different sample groups and
see whether they are different enough that there is likely a real
difference in the populations they represent. For example,
suppose you want to know whether men or women sleep more.

The design is fairly straightforward. Create two samples of
people: one group of men and one group of women. Then, survey
both groups and ask them the typical number of hours of sleep
they get each night. To find any real differences, though, how
many people do you need to survey? This is a power question.

A t test compares the mean performance of two sample groups of
scores to see whether there is a significant difference
[Hack #17]. In this case, statistical significance means that the
difference between scores in the two populations represented by
the two sample groups is probably greater than zero.

Before a study begins, a researcher can determine the power of
the statistical analysis that will be used. Two of the three pieces
needed to calculate power are already known before the study
begins: you can decide the sample size and choose the
predetermined level of significance. What you can't know is the
true size of the relationship between the variables, because data
for the planned research has not yet been generated.

The size of the relationship among the variables of interest (i.e.,
the effect size) can be estimated by the researcher before the
study begins; power also can be estimated before the study
begins. Usually, the researcher decides on the smallest
relationship size that would be considered important or
interesting to find.

Once these three pieces (sample size, level of significance, and
effect size) are determined, the fourth piece (power) can be
calculated. In fact, setting the level of any three of these four
pieces allows for calculation of the fourth piece. For example, a
researcher often knows the power she would like an analysis to
have, the effect size she wants to be declared statistically
significant, and the preset level of significance she will choose.
With this information, the researcher can calculate the
necessary sample size.

For estimating power, researchers often use a standard accepted
procedure that identifies a power goal of .80 and assigns a
preset level of significance of .05. A power of .80 means that a
researcher will find a relationship or effect in her sample
80 percent of the time if there is such a relationship in the
population from which the sample was drawn.

The effect size (or index of relationship size [Hack #10]) with t
tests is often expressed as the difference between the two
means divided by the standard deviation in each group. This
produces effect sizes in which .2 is considered small, .5 is
considered medium, and .8 is considered large. The power
analysis question is: how big a sample in each of the two groups
(how many people) do I need in order to find a significant
difference in test scores?

The actual formula for computing power is complex, and I won't
present it here. In real life, computer software or a series of
dense tables in the back of statistics books are used to
estimate power. I have done the calculations for a series of
options, though, and present them in Table 1-7. Notice that the
key variables are effect size and sample size. By convention, I
have kept power at .80 and level of significance at .05.

Table 1-7. Necessary sample sizes for various effect sizes

Effect size | Sample size (per group)
.10 | 1,600
.20 | 400
.30 | 175
.40 | 100
.50 | 65
1.0 | 20

Imagine that you think the actual difference in your
gender-and-sleep study will be real, but small. A difference of
about .2 standard deviations between groups in t test analyses is
considered small, so you might expect a .2 effect size. To find
that small of an effect size, you need 400 people in each group!
As the effect size increases, the necessary sample size gets
smaller. If the population effect size is 1.0 (a very large effect
size and a big difference between the two groups), 20 people per
group would suffice.
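For readers who want to try other effect sizes, here is a hedged sketch of one common shortcut. It is not the exact power formula the text declines to present: it assumes a two-tailed test and uses a normal-curve approximation to the t test, and the function name `n_per_group` is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate people needed in each of two groups, using the
    normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed .05 test
    z_power = z.inv_cdf(power)          # about 0.84 for power of .80
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

for d in (0.1, 0.2, 0.3, 0.4, 0.5, 1.0):
    print(d, n_per_group(d))
```

The results land close to the table's values without matching them exactly; the table's figures look like rounded, slightly conservative values. The pattern is the same either way: halving the effect size roughly quadruples the sample you need.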

Making Inferences About Beautiful Relationships

Scientists often rely on the use of statistical inference to reject
or accept their research hypotheses. They usually suggest a
null hypothesis that says there is no relationship among
variables or differences between groups. If their sample data
suggests that there is, in fact, a relationship between their
variables in the population, they will reject the null hypothesis
[Hack #4] and accept the alternative, their research hypothesis,
as the best guess about reality.

Of course, mistakes can be made in this process. Table 1-8
identifies the possible types of errors that can be made in this
hypothesis-testing game. Rejecting the null hypothesis when
you should not is called a Type I error by statistical
philosophers. Failing to reject the null when you should is called
a Type II error.

Table 1-8. Errors in hypothesis testing

Action | Null hypothesis is true | Null hypothesis is false
Reject null hypothesis | Type I error | Significant finding
Fail to reject null | Correct decision | Type II error

What you want to do as a smart scientist is avoid the two types
of errors and produce a significant finding. Reaching a correct
decision to not reject the null when the null is true is okay too,
but not nearly as fun as a significant finding. "Spend your life in
the upper-right quadrant of the table," my Uncle Frank used to
say, "and you will be happy and wealthy beyond your wildest
dreams!"

To have a good chance of reaching a statistically significant
finding, one condition beyond your control must be true. The null
hypothesis must be false, or your chances of "finding"
something are slim. And, if you do "find" something, it's not
really there, and you will be making a big error: a Type I error.
There must actually be a relationship among your research
variables in the population for you to find it in your sample data.

So, fate decides whether you wind up in the column on the right
in Table 1-8. Power is the chance of moving to the top of that
column once you get there. In other words, power is the chance
of correctly rejecting the null hypothesis when the null
hypothesis is false.
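
The two columns of Table 1-8 can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: scores are normally distributed with a known standard deviation of 1, groups are equal in size, and a two-sided z-test at the .05 level stands in for whatever test a real study would use. The function names and the chosen numbers (64 people per group, a true difference of 0.5) are hypothetical.

```python
import random
from statistics import NormalDist


def rejects_null(group_a, group_b, alpha=0.05, sd=1.0):
    """Two-sided z-test on the mean difference, assuming a known SD."""
    n = len(group_a)  # assumes both groups have the same size
    standard_error = sd * (2 / n) ** 0.5
    z = (sum(group_a) / n - sum(group_b) / n) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_difference, n_per_group, trials=2000, seed=1):
    """Share of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        rejections += rejects_null(a, b)
    return rejections / trials


# Left column of the table: the null is true, so every rejection is a Type I error.
type_i_rate = rejection_rate(true_difference=0.0, n_per_group=64)

# Right column of the table: the null is false, so the rejection rate is the power.
power = rejection_rate(true_difference=0.5, n_per_group=64)

print(f"Null true:  rejected in {type_i_rate:.1%} of studies (Type I errors)")
print(f"Null false: rejected in {power:.1%} of studies (power)")
```

When the null is true, the rejection rate should hover near the chosen significance level; when it is false, the rejection rate is an estimate of power for that particular effect size and sample size.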

Why It Works

This relationship between effect size and sample size makes
sense. Think of an animal hiding in a haystack. (The animal is
the effect size; just work with me on this metaphor, please.) It
takes fewer observations (handfuls of hay) to find a big ol' effect
size (like an elephant, say) than it would to find a tiny animal
(like a cute baby otter, for instance). The number of people
represents the number of observations, and big effect sizes
hiding in populations are easier to find than smaller effect sizes.

The general relationship between effect size and sample size in
power works the other way, too. Guess at your effect size, and
just increase your sample size until you have the power you
need. Remember, Table 1-7 assumes you want to have 80
percent power. You can always work with fewer people; you'll just
have less power.
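
The "increase your sample size until you have the power you need" idea can be sketched in a few lines. This is a rough normal approximation for comparing two equal-sized groups, not the method behind Table 1-7; the function names are hypothetical, and a real power analysis based on the t distribution would give slightly larger sample sizes.

```python
from statistics import NormalDist


def approximate_power(effect_size_d, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided, two-group comparison."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the true effect is d.
    shift = abs(effect_size_d) * (n_per_group / 2) ** 0.5
    # Chance of landing beyond either critical value.
    return std_normal.cdf(shift - critical) + std_normal.cdf(-shift - critical)


def smallest_n_for_power(effect_size_d, target_power=0.80, alpha=0.05):
    """Add people to each group until the approximate power reaches the target."""
    if effect_size_d == 0:
        raise ValueError("with no effect, power never rises above alpha")
    n = 2
    while approximate_power(effect_size_d, n, alpha) < target_power:
        n += 1
    return n


for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {smallest_n_for_power(d)} people per group for 80% power")
```

Elephants (large effects) need far fewer handfuls of hay than baby otters (small effects), which is exactly the pattern the loop shows.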

Where It Doesn't Work


It is important to remember that power is not the chance of
success. It is not even the chance that a level of significance
will be reached. It is the chance that a level of significance will
be reached if all the values estimated by the researcher turn out
to be correct. The hardest component of the formula to guess or
set is the effect size in the population. A researcher seldom
knows how big the thing is that he is looking for. After all, if he
did know the size of the relationship between his research
variables, there wouldn't be much reason to conduct the study,
would there?

Hack 9. Show Cause and Effect

Statistical researchers have established some ground rules that
must be followed if you hope to demonstrate that one thing
causes another.

Social science research that uses statistics operates under a
couple of broad goals. One goal is to collect and analyze data
about the world that will support or reject hypotheses about the
relationships among variables. The second goal is to test
hypotheses about whether there are cause-and-effect
relationships among variables. The first goal is a breeze
compared to the second.

There are all sorts of relationships between things in the world,
and statisticians have developed all sorts of tools for finding
them, but the presence of a relationship doesn't mean that a
particular variable causes another. Among humans, there is a
pretty good positive correlation [Hack #11] between height and
weight, for example, but if I lose a few pounds, I won't get
shorter. On the other hand, if I grow a few inches, I probably will
gain some weight.

Knowing only the correlation between the two, however, can't
really tell me anything about whether one thing caused the other.
Then again, the absence of a relationship would seem to tell me
about cause and effect. If there is no correlation between two
variables, that would seem to rule out the possibility that one
causes the other. The presence of the correlation allows for that
possibility, but does not prove it.

Designing Effective Experiments

Researchers have developed frameworks for talking about
different research designs and whether such designs even allow
for proof that one variable affects another. The different designs
involve the presence or absence of comparison groups and how
participants are assigned to those groups.

There are four basic categories of group designs, based on
whether the design can provide strong evidence, moderate
evidence, weak evidence, or no evidence of cause and effect:

Non-experimental designs

    These designs usually involve just one group of people,
    and statistics are used to either describe the population
    or demonstrate a relationship between variables. An
    example of this design is a correlational study, where
    simple associations among variables are analyzed
    [Hack #11]. This type of design provides no evidence of
    cause and effect.

Pre-experimental designs

    These designs usually involve one group of people and
    two or more measurement occasions to see whether
    change has occurred. An example of this design is to
    give a pretest to a group of people, do something to
    them, give them a post-test, and see whether their
    scores change. This type of design provides weak
    evidence of cause and effect because forces other than
    whatever you did to the poor folks could have caused
    any change in scores.

Quasi-experimental designs

    These designs involve more than one group of people,
    with at least one group acting as a comparison group.
    Assignment to these groups is not random but is
    determined by something outside the researcher's
    control. An example of this design is comparing males
    and females on their attitudes toward statistics. At best,
    this sort of design provides moderate evidence of cause
    and effect. Without random assignment to groups, the
    groups are likely not equal on a bunch of unmeasured
    variables, and those might be the real cause for any
    differences that are found.

Experimental designs

    These designs have a comparison group and,
    importantly, people are assigned to the groups randomly.
    The random assignment to groups allows researchers
    to assume that all groups are equal on all unmeasured
    variables, thus (theoretically) ruling them out as
    alternative explanations for any differences found. An
    example of this design is a drug study in which all
    participants randomly get either the drug being tested or
    a comparison drug or a placebo (sugar pill).

Does Weight Cause Height?


Earlier in this hack, I mentioned a well-known correlational
finding: in people, height and weight tend to be related. Taller
males usually weigh more than shorter males, for example. I
laughed off the suggestion that if we fed people more, they would
get taller. Because of what I think I know about how the body
grows, the suggestion that weight causes height is theoretically
unlikely. But what if you demanded scientific proof?

I could test the hypothesis that weight causes height using a
basic experimental design. Experimental designs have a
comparison group, and the assignment to such groups must be
random. Any relationships found under such circumstances are
likely causal relationships. For my study, I'd create two groups:

Group 1

    Thirty college freshmen, who I would recruit from the
    population of the Midwestern university where I work.
    This group would be the experimental group; I would
    increase their weight and measure whether their height
    increases.

Group 2

    Thirty college freshmen, who I would recruit from the
    population of the Midwestern university where I work.
    This group would be the control group; I would not
    manipulate their weight at all and would then measure
    whether their height changes.

    In this design, scientists would call weight
    the independent variable (because we don't
    care what causes it) and height the
    dependent variable (because we wonder
    whether it depends on, or is caused by, the
    independent variable).

Because this design matches the criteria for experimental
designs, we could interpret any relationships found as evidence
of cause and effect.
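
Random assignment itself is easy to carry out. The sketch below shows one way to split 60 recruited freshmen into an experimental group and a control group of 30 each; the participant labels, the function name, and the fixed seed (used only so the split can be reproduced) are illustrative assumptions.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the recruiting list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical labels standing in for the 60 recruited freshmen.
freshmen = [f"student_{i:02d}" for i in range(1, 61)]

experimental_group, control_group = randomly_assign(freshmen, seed=42)
print(len(experimental_group), "in the experimental group")
print(len(control_group), "in the control group")
```

Because chance alone decides who lands in which group, unmeasured differences between people should, on average, be spread evenly across the two groups.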

Fighting Threats to Validity

Research conclusions fall into two types. They have to do with
the cause-and-effect claim and whether any such claim, once it
is established, is generalizable to whole populations or outside
the laboratory. Table 1-9 displays the primary types of validity
concerns when interpreting research results. These concerns
are the hurdles that must be crossed by researchers.

Table 1-9. Validity of research results

Validity concern                  Validity question
Statistical conclusion validity   Is there a relationship among variables?
Internal validity                 Is the relationship a cause-and-effect
                                  relationship?
Construct validity                Is the cause-and-effect relationship among
                                  the variables you believe should be affected?
External validity                 Does this cause-and-effect relationship
                                  exist everywhere for everyone?

Even when researchers have chosen a true experimental design,
they still must worry that any results might not really be due to
one variable affecting another. A cause-and-effect conclusion
has many threats to its validity, but fortunately, just by thinking
about it, researchers have identified many of these threats and
have developed solutions.

    Researchers' understanding of group
    designs, the terminology used to describe
    them, the identification of threats to
    validity in research design, and the tools
    to guard against the threats are pretty
    much entirely due to the extremely
    influential works of Cook and Campbell,
    cited in the "See Also" section of this
    hack.

A few threats to the validity of causal claims and claims of
generalizability are discussed next, along with some ways of
eliminating them. There are dozens of threats identified and
dealt with in the research design literature, but most of them are
either unsolvable or can be solved with the same tools described
here:

History

    Outside events could affect results. A solution is to use
    a control group (a comparison group that does not
    receive the drug or intervention or whatever), with
    random assignment of subjects to groups. Another part
    of the solution is to control both groups' environments
    as much as possible (e.g., in laboratory-type settings).

Maturation

    Subjects develop naturally during a study, and changes
    might be due to these natural developments. Random
    assignment of participants to an experimental group and
    a control group solves this problem nicely.

Selection

    There might be systematic bias in assigning subjects to
    groups. The solution is to assign subjects randomly.

Testing

    Just taking a pretest might affect the level of the
    research variable. Create a comparison group and give
    both groups the pretest, so any changes will be equal
    between the groups. And assign subjects to the two
    groups randomly (are you starting to see a pattern
    here?).

Instrumentation

    There might be systematic bias in the measurement.
    The solution is to use valid, standardized, objectively
    scored tests.

Hawthorne Effect

    Subjects' awareness that they are subjects in a study
    might affect results. To fight this, you could limit your
    subjects' awareness of what results you expect, or you
    could conduct a double-blind study in which subjects
    (and researchers) don't even know what treatment they
    are receiving.

The validity of research design and the validity of any claims
about cause and effect are similar to claims of validity in
measurement [Hack #28]. Such arguments are open and
unending, and validity conclusions rest on a reasoned
examination of the evidence at hand and consideration for what
seems reasonable.

See Also

Campbell, D.T. and Stanley, J.C. (1966). Experimental
and quasi-experimental designs for research. Chicago:
Rand McNally.

Cook, T.D. and Campbell, D.T. (1979). Quasi-experimentation:
Design and analysis issues for field settings. Boston:
Houghton-Mifflin.

Shadish, W.R., Cook, T.D., and Campbell, D.T. (2002).
Experimental and quasi-experimental designs for
generalized causal inference. Boston: Houghton-Mifflin.

Hack 10. Know Big When You See It

You've just read about an amazing new scientific discovery, but
is such a finding really a big deal? By applying effect size
interpretations, you can judge the importance of such
announcements (or lack thereof) for yourself.

Something is missing in most reports of scientific findings in
nonscientific publications, on TV, on the radio, and (do I even
have to mention?) on the Web. Although reports in such media
typically do a pretty good job of only reporting findings that are
"statistically significant," this is not enough to determine
whether anything really important or useful has been discovered.
A big drug study can report "significant" results, but still not
have found anything of interest to the rest of us or even other
researchers.

As we repeat in many places in this book, significance [Hack
#4] means only that what you found is likely to be true about the
bigger population you sampled from. The problem is that this
fact alone is not nearly enough for you to know whether you
should change your behavior, start a new diet, switch drugs, or
reinterpret your view of the world.

What you need to know to make decisions about your life and
reality in light of any new scientific report is the size of the
relationship that has just been brought to light. How much better
is brand A than brand B? How big is that SAT difference between
boys and girls in meaningful terms? Is it worth it to take that half
an aspirin a day, every day, to lower your risk of a heart attack?
How much lower is that risk anyway?

The strength of that relationship should be expressed in some
standardized way, too. Otherwise, there is no way to really judge
how big it is. Using a statistical tool known as effect size will let
you know big when you see it.

Seeing Effect Sizes Everywhere

An effect size is a standardized value that indicates the strength
of a relationship between two variables. Before we talk about how
to recognize or interpret effect sizes, let's begin with some
basics about relationships and statistical research.

Statistical research has always been interested in relationships
among variables. The correlation coefficient, for example, is an
index of the strength and direction of relationships between two
sets of scores [Hack #11]. Less obvious, but still valid,
examples of statistical procedures that measure relationships
include t tests [Hack #17] and analysis of variance, a procedure
for comparing more than two groups at one time.

    Even procedures that compare different
    groups are still interested in relationships
    between variables. With a t test, for
    instance, a significant result means that it
    matters which group a person is in. In
    other words, there is an association
    between the independent variable (which
    defines the groups) and the dependent
    variable (the measured outcome).

Finding or Computing Effect Sizes

This hack is about finding and interpreting effect sizes to judge
the implications of scientific findings reported in the popular
media or in scientific writings. Often, the effect size is reported
directly and you just have to know how to interpret it. Other
times, it is not reported, but enough information is provided so
that you can figure out what the effect size is.

When effect sizes are reported, they are typically one of three
types. They differ depending on the procedure used and the way
that procedure quantifies the information of interest. In each
case, the effect size can be interpreted as an estimate of the "size
of the relationship between variables." Here are the three typical
types of effect sizes:

Correlation coefficient

    A correlation, symbolized by r, is already a measure of
    the relationship between variables and, thus, is an effect
    size. Because correlations can be negative, though, the
    value is sometimes squared to produce a value that is
    never negative. Thus, the value of r² is
    interpreted as the "proportion of variance" shared by
    variables.

d

    This value, symbolized by d strangely enough,
    summarizes the difference between two group means
    used in a t test. It is calculated by dividing the mean
    difference of the two groups by the average standard
    deviation in the two groups.

    Here's an alternative, easy, super-fun,
    ultra-cool, and neato-swell way to
    calculate d:

    [Formula missing from the source.]

Eta-squared

    The effect size most often reported for the results of an
    analysis of variance is symbolized as η². Similar to r², it
    is interpreted as the "proportion of variance" in the
    dependent variable (the outcome variable) accounted for
    by the independent variable (what group you are in).
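
The definitions of d and η² given above translate directly into code. The sketch below follows the wording of the text (mean difference divided by the average of the two standard deviations, and between-group variation as a share of total variation); the scores are invented for illustration, and the function names are hypothetical.

```python
from statistics import mean, stdev


def cohens_d(group_1, group_2):
    """Mean difference divided by the average of the two standard deviations."""
    average_sd = (stdev(group_1) + stdev(group_2)) / 2
    return (mean(group_1) - mean(group_2)) / average_sd


def eta_squared(groups):
    """Proportion of total variance in the outcome accounted for by group."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Made-up scores for two small groups.
treatment = [12, 15, 14, 16, 13]
comparison = [10, 11, 13, 12, 9]

d = cohens_d(treatment, comparison)
eta_sq = eta_squared([treatment, comparison])
print(f"d = {d:.2f}, eta-squared = {eta_sq:.2f}")
```

Many textbooks use a pooled standard deviation rather than a simple average for d; with equal group sizes and similar spreads, as here, the two choices give nearly the same answer.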

Interpreting Effect Sizes


With levels of significance, statisticians have adopted certain
sizes that are "good" to achieve. For example, most statistical
researchers hope to achieve a .05 or lower level of significance.
With effect sizes, though, there are not always certain values
that are clearly good or clearly bad. Still, some standards for
small, medium, and large effect sizes have been suggested.

The standards for big, medium, and little are based, for the most
part, on the effect sizes that are normally found in real-world
research. If a given effect size is so large as to be rarely found in
published research, it is considered to be big. If the effect size is
tiny and easy to find in real-life research, then it is considered to
be small.

You should decide yourself, though, how big an effect size is of
interest to you when interpreting research results. It all depends
on the area of investigation. Table 1-10 provides the rules of
thumb for how big is big.

Table 1-10. Effect size standards

Effect size   Small     Medium    Large
r             +/-.10    +/-.30    +/-.50
r²            .01       .09       .25
d             .2        .5        .8
η²            .01       .06       .14

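
The rules of thumb in Table 1-10 can be wrapped in a small lookup helper. This is only a convenience sketch: the dictionary keys, the function name, and the "below small" label for values under the smallest standard are assumptions, not terms from the table.

```python
# Thresholds copied from Table 1-10: (small, medium, large).
EFFECT_SIZE_STANDARDS = {
    "r": (0.10, 0.30, 0.50),
    "r2": (0.01, 0.09, 0.25),
    "d": (0.2, 0.5, 0.8),
    "eta2": (0.01, 0.06, 0.14),
}


def describe_effect_size(kind, value):
    """Return the rule-of-thumb label for an effect size of the given kind."""
    small, medium, large = EFFECT_SIZE_STANDARDS[kind]
    magnitude = abs(value)  # the sign of r or d shows direction, not strength
    if magnitude >= large:
        return "large"
    if magnitude >= medium:
        return "medium"
    if magnitude >= small:
        return "small"
    return "below small"


print(describe_effect_size("d", 0.5))
print(describe_effect_size("r", -0.35))
print(describe_effect_size("eta2", 0.005))
```

As the text cautions, these labels are conventions; how big an effect needs to be before it matters depends on the area of investigation.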
Interpreting Research Findings

The advantage of talking about effect sizes when discussing
research results is that everyone can get a sense of what impact
the given research variable (or intervention, or drug, or teaching
technique) is really having on the world. Because they are
typically reported without any probability information (level of
significance), effect sizes are most useful when provided
alongside traditional level of significance numbers. This way, two
questions can be answered:

    Does this relationship probably exist in the population?

    How big is the relationship?

Remember our example of whether you should decide to take half
an aspirin each day to cut down your chances of having a heart
attack? A well-publicized study in the late 1980s found a
statistically significant relationship between these two
variables. Of course, you should talk with your doctor before you
make any sort of decision like this, but you should also have as
much information as possible to help you make that decision.
Let's use effect size information to help us interpret these sorts
of findings.

Here is what was reported in the media:

A sample of 22,071 physicians were randomly divided into two
groups. For a long period of time, half took aspirin every day,
while the other half took a placebo (which looked and tasted just
like aspirin). At the end of the study period (which actually
ended early because the effectiveness of aspirin was considered
so large), the physicians taking aspirin were about half as likely
to have had a heart attack as the placebo group: 1.71 percent
of the placebo physicians had attacks versus about 1 percent
(.94 percent) of the aspirin physicians. The findings were
statistically significant.

The "clear" interpretation of such findings is that taking aspirin
cuts your chances of a heart attack in half. Assuming that the
study was representative and the physicians in the study are
like you and me in important ways, this interpretation is fairly
correct.

Another way to interpret the findings is to look at the effect size
of the aspirin use. Using a formula for proportional comparisons,
the effect size for this study is .06 standard deviations, or a d of
.06. Applying the effect size standards shown in Table 1-10, this
effect size should be interpreted as small (very small, really). This
interpretation suggests that there is really quite a tiny
relationship between aspirin-taking and heart attacks. The
relationship is real, just not very strong.
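
The two readings of the aspirin result can be put side by side with a little arithmetic. The text does not say which "formula for proportional comparisons" produced the .06, so the sketch below swaps in Cohen's h (an arcsine-based effect size for two proportions) purely as an assumption; it gives a value close to, but not exactly, the one reported. The variable and function names are illustrative.

```python
from math import asin, sqrt

# Heart attack rates reported in the text.
placebo_rate = 0.0171
aspirin_rate = 0.0094

# The "half as likely" reading: one risk as a fraction of the other.
relative_risk = aspirin_rate / placebo_rate

# The same result as a plain difference in risk.
absolute_reduction = placebo_rate - aspirin_rate


def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


# Assumption: h stands in for the unspecified formula used in the text.
h = cohens_h(placebo_rate, aspirin_rate)

print(f"Relative risk:           {relative_risk:.2f}")
print(f"Absolute risk reduction: {absolute_reduction:.2%}")
print(f"Cohen's h:               {h:.3f}")
```

The relative risk looks dramatic, while the absolute reduction is under one percentage point and the standardized effect size falls well below the "small" standard for d in Table 1-10. Both descriptions are accurate; they answer different questions.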

One way to think about this is that your chances of having a
heart attack during a given period of time are pretty small to begin
with. About 98.7 percent of everyone in the study did not have a heart
attack, whether they took aspirin or not. Although taking aspirin
does lower your chances, they go from small to a little smaller. It
is similar to the idea that entering the lottery massively
increases your chances of winning compared to those who do not
enter, but your chances are still slim.

Why It Works

A researcher can achieve significant results, but still not have
found anything for anyone to get excited about. This is because
significance tells you only that your sample results probably did
not occur by chance. The results are real and likely exist in the
population. If you have found evidence of a small relationship
between two variables or between the use of a drug and some
medical outcome, the relationship might be so small that no one
is really interested in it. The effect of the drug might be real, but
weak, so it's not worth recommending to patients. The
relationship between A and B might be greater than zero, but so
tiny as to do little to help understand either variable.

Modern researchers are still interested in whether there is
statistical significance in their findings, but they should almost
always report and discuss the effect size. If the effect size is
reported, you can interpret it. If it is not reported, you can often
dig out the information you need from published reports of
scientific findings and calculate it yourself. The cool part is that
you might then know more about the importance of the discovery
than the media who reported the findings and, maybe, even the
scientists themselves.

Chapter 2. Discovering Relationships

There are invisible webs of relationships around us. Variable A
causes Variable B, which influences Variable C, which is entirely
independent of Variable D, unless Variable E comes into play.
The hacks in this chapter allow you to discover these
connections and describe them accurately. These are the hacks
that reveal the hidden reasons for why people do the things they
do and why things are the way they are.

The connections between one trait and another, between a cause
and an effect, are relationships that are easily revealed, with the
right tricks. Begin by identifying the strength of any association
[Hack #11], and then draw what it looks like [Hack #12]. Next,
use your knowledge of that relationship to make predictions
[Hack #13], and then improve the accuracy of those predictions
[Hack #14]. Some relationships appear through the observation
of unexpected occurrences [Hacks #15 and #16] or by noticing
real differences between groups [Hack #17].

Because we cannot measure every example of a person, fish, or
pine tree that we might be interested in, we must rely on
representative samples [Hack #19] to provide our observations.
Sampling can mislead us [Hack #18], however, or it can work in
surprisingly cool ways [Hack #20].

To share your findings with others or understand what these
findings have to tell you, you need to avoid both being deceived
and deceiving others. Be careful not to misinterpret any numbers
[Hack #21] or pictures [Hack #22].

Pack these tools in your tool belt and head out to find whatever
there is to find.

Hack 11. Discover Relationships

Revealing the invisible connections in the world is just a matter
of recording observations and computing the magical, mystical
correlation coefficient.

You probably make all sorts of assumptions about why people
feel the way they feel or do the things they do. Statistical
researchers would call these assumptions hypotheses about the
relationship among variables.

Regardless of what science calls it, you probably do it. You might
make these guesses about associations between attitudes and
behavior or between attitudes and attitudes or behaviors and
behaviors. You might do it informally as you seek to understand
people in the world around you, or you might need to do it as a
marketing specialist to understand your customer, or you might
be a struggling psychology graduate student who needs to
complete a class assignment that requires statistical analysis
of the relationship between self-esteem and depression.

In statistics, such a relationship is called a correlation. The
number describing the size of that relationship is a correlation
coefficient. By computing this useful value, you can get answers
to any question you have about relationships (except in terms of
dating relationships; you're on your own there).

Testing Hypotheses About Relationships


Imagine a study in which a researcher for the American
Cheesecake Sellers Association has a hypothesis that the
reason people like cheesecake is that they like cheese. She is
guessing that there is a relationship between attitude toward
cheese and attitude toward cheesecake. If her hypothesis turns
out to be correct, she'll purchase the huge mailing list of cheese
lovers from the American Cheese Lovers Association and send
them informative brochures about the healing properties of
cheesecake. If she's right, sales will rocket up!

To test her hypothesis, she creates two surveys. One asks
respondents to say how they feel about cheese, and the other
asks how they feel about cheesecake. A score of 50 means the
person loves cheese (or cheesecake), and a score of 0 means
the person hates cheesecake (or cheese). Table 2-1 shows the
results for the data she collects from five people on the bus on
her way to work.

Table 2-1. Data for the relationship between cheese and
cheesecake attitudes

Person       Attitude toward cheese   Attitude toward cheesecake
Larry        50                       36
Moe          45                       35
Curly Joe    30                       22
Shemp        30                       25
Groucho      10                       20

Let's look at the data and see if there seems to be a relationship
between the two variables. (Go ahead, I'll give you 30 seconds.)

I'd say there is a pretty clear relationship there. The people who
scored the highest on the cheese scale also scored the highest
on the cheesecake scale. The people didn't score
exactly the same on both scales, of course, and the rank order
isn't even the same, but, relatively speaking, the position of
each person relative to each of the other people when it comes to
cheese attitude is about the same as when it comes to cheesecake
attitude. The Association's marketer has support for her
hypothesis.

Computing a Correlation Coefficient

Just eyeballing two columns of numbers from a sample, though,
is usually not enough to really know whether there is a
relationship between two things. The marketing specialist in our
example wants to use a single number to more precisely
describe whatever relationship is seen.

The correlation coefficient takes into account all the information
we used when we looked at our two columns of numbers in Table
2-1 and decided whether there was a relationship there. The
correlation coefficient is produced through a formula that does
the following things:

 1. Looks at each score in a column

 2. Sees how distant that score is from the mean of that
    column

 3. Identifies the distance from the mean of its matching
    score in the other column

 4. Multiplies the paired distances together

 5. Averages the results of those multiplications

If this were a statistics textbook, I'd have to present a
somewhat complicated formula for calculating the correlation
coefficient. To call it somewhat complicated is generous. Frankly,
it is terrifically frightening. For your own sanity, I'm not even
going to show it to you. Trust me. Instead, I'll show you this
pleasant, friendly looking formula (which works just as well):

    r = Σ(ZxZy) / (N - 1)

Z refers to a Z-score, which is the distance of a score from the
mean. These distances are then divided by the standard
deviation for that distribution. So, Zx means all the Z-scores
from the first column, and Zy means all the Z-scores from the
second column. ZxZy means multiply them together. The Σ
symbol means add up. So, the equation says to multiply together
all the pairs of Z-scores and add those cross-products together.
Then, divide by the number (N) of pairs of scores minus 1.

    The mean is the arithmetic average of a group of scores. It is
    produced by adding up all the numbers and dividing by the
    number of scores. A standard deviation for a group of numbers is
    the average distance of each score from the mean.

Before I produce the Z-scores used in our correlation formula, I
need to know the means and standard deviations for each
column of data. Equations for calculating these key values are
provided in "Describe the World Using Just Two Numbers" [Hack
#2]. Here are the means and standard deviations for our two
variables:

Attitude toward cheese

    Mean = 33; standard deviation = 15.65

Attitude toward cheesecake

    Mean = 27.6; standard deviation = 7.44

Table 2-2 shows some of the calculations for our cheese
attitude data.

Table 2-2. Calculations for discovering relationship b[...]
cheesecake attitude

Person       Attitude toward   Attitude toward   Z-scores for
             cheese            cheesecake        cheese
Larry        50                36                1.09
Moe          45                35                .77
Curly Joe    30                22                -.19
Shemp        30                25                -.19
Groucho      10                20                -1.47

The correlation is .93. This is very close to 1.0, which is the
strongest a positive correlation can be, so the cheese-to-
cheesecake correlation represents a very strong relationship.
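
The whole calculation for the five bus riders can be checked in a few lines. The sketch below follows the Z-score recipe described above, using the sample standard deviation (dividing by N - 1), which reproduces the means and standard deviations given in the text; the variable and function names are illustrative.

```python
from statistics import mean, stdev

# Scores from Table 2-1, in the order Larry, Moe, Curly Joe, Shemp, Groucho.
cheese = [50, 45, 30, 30, 10]
cheesecake = [36, 35, 22, 25, 20]


def z_scores(scores):
    """Distance of each score from the mean, in standard deviation units."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation (divides by N - 1)
    return [(x - m) / s for x in scores]


def correlation(xs, ys):
    """Sum of the paired Z-score cross-products, divided by N - 1."""
    cross_products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(cross_products) / (len(xs) - 1)


r = correlation(cheese, cheesecake)
print("Z-scores for cheese:", [round(z, 2) for z in z_scores(cheese)])
print("Correlation:", round(r, 2))  # 0.93, matching the text
```

The printed Z-scores for cheese agree with the column shown in Table 2-2.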

Interpreting a Correlation Coefficient

Somewhat magically, the correlation formula process produces a
number, ranging in value from -1.00 to +1.00, that measures the
strength of relationship between two variables. Positive signs
indicate the relationship is in the same direction. As one value
increases, the other value increases. Negative signs indicate
the relationship is in the opposite direction. As one value
increases, the other value decreases. An important point to
make is that the correlation coefficient provides a standardized
measure of the strength of linear relationship between two
variables [Hack #12].

The direction of a correlation (whether it is negative or positive)
is the artificial result of the direction of the scale one chooses to
use to measure the variables. In other words, strong
correlations can be negative. Think of a measure of golf skill
correlated with average golf score. The higher the skill, the lower
the score, but you would still expect a strong relationship.
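
The golf example can be made concrete. In the sketch below, the skill ratings and average scores are invented, the 110-stroke reference point used to reverse the scale is arbitrary, and the function name is hypothetical. Flipping the direction of one scale flips the sign of the correlation but leaves its size untouched.

```python
def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx ** 2 for dx in dev_x)
    ss_y = sum(dy ** 2 for dy in dev_y)
    return cross / (ss_x * ss_y) ** 0.5


# Made-up data: skill ratings and average scores for five golfers.
skill = [1, 2, 3, 4, 5]
average_score = [100, 95, 88, 80, 72]

# Same information on a reversed scale: strokes below an arbitrary 110.
strokes_under_110 = [110 - score for score in average_score]

r_original = pearson_r(skill, average_score)
r_reversed = pearson_r(skill, strokes_under_110)
print(f"Skill vs. average score:     {r_original:+.2f}")
print(f"Skill vs. strokes under 110: {r_reversed:+.2f}")
```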

Statistical Significance and Correlations

Our marketing specialist is likely also interested in whether a
sample correlation is large enough that it is likely to have been
drawn from a population where the correlation is bigger than
zero. In other words, is the correlation we found in our sample so
large that it must have come from a population where there is at
least some sort of relationship between these variables?

The marketer in our example trusts correlations between a large
number of pairs more than she does correlations from a small
sample (such as our five bus riders). If she were to report this
relationship to her boss and it wasn't true about most people,
she might find herself selling cheesecake out of her minivan for a
living.

Table 2-3 shows how large a correlation in a sample must be
before statisticians are reasonably confident that there is a
relationship greater than zero in the population it represents.

Table 2-3. Correlations that likely did not occur by chance

Sample size   Smallest correlation considered
              statistically significant
5             .88
10            .63
15            .51
20            .44
25            .40
30            .38
60            .26
100           .20

With our sample of five people, any correlation at least as big as
.88 would be treated as statistically significant (which means "so
big it probably exists in whatever population you took your
sample from").
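
One common way to run this check by hand is to convert the correlation into a t statistic. The sketch below assumes a two-tailed test at the .05 level, which is an assumption about how the table was built, and takes the critical t value of 3.182 for 3 degrees of freedom from a standard t table.

```python
from math import sqrt

r = 0.93          # sample correlation for the five bus riders
n = 5             # number of pairs
df = n - 2        # degrees of freedom for a correlation

# t statistic for testing whether the population correlation is zero
t = r * sqrt(df) / sqrt(1 - r ** 2)

# Assumed critical value: two-tailed, .05 level, df = 3 (standard t table)
critical_t = 3.182

# Working backward: the smallest r that reaches the critical t
smallest_r = critical_t / sqrt(critical_t ** 2 + df)

print(round(t, 2), t > critical_t)
print(round(smallest_r, 2))
```

Under those assumptions, the smallest significant correlation for five pairs rounds to the .88 shown in the table, and our .93 clears it.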

Where Else It Works

You can produce a correlation coefficient as a measure of the
strength of a relationship between any two variables as long as
certain conditions are met:

You must be able to measure the variables in a way
where numbers have real meaning and represent some
underlying continuous concept. Examples of continuous
variables are attitude, feelings, knowledge, skill, and
things you can count, such as pounds gained because of
love of cheesecake. (If the thing you are measuring is
not continuous, as in the case when you have different
categories, such as gender or political party, you can
still calculate a correlation, just not with the formula
shown here.)

The variables must actually vary. If everyone felt the
same about cheese, you couldn't calculate a correlation
with attitude towards cheesecake or chocolate or
anything. The math requires some variability.

The minimum correlation sizes required to have
statistical significance (shown in Table 2-3) are
precisely accurate only when the sample is randomly
drawn from the population. Researchers, such as our
cheesecake marketer, must decide whether their sample
is representative in the way a random sample would be.
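
The second condition, that the variables must actually vary, is easy to see in code. In this rough sketch, the function returns None when either variable has no spread, because the formula would otherwise divide by zero. The all-30s cheese column is hypothetical.

```python
from math import sqrt

def correlation(x, y):
    # Returns None when either variable has no variability
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a standard deviation of zero: nothing to correlate
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

cheesecake = [36, 35, 22, 25, 20]
everyone_same = [30, 30, 30, 30, 30]   # hypothetical: no one differs on cheese
real_cheese = [50, 45, 30, 30, 10]

print(correlation(everyone_same, cheesecake))
print(round(correlation(real_cheese, cheesecake), 2))
```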

Dire Warning About Correlations

It's tempting to treat correlational evidence as evidence of
cause and effect. Of course, there might be all sorts of reasons
why two things are related that have nothing to do with one thing
causing the other.

For example, in the presence of such a strong correlation
between attitude toward cheese and attitude toward cheesecake,
you might want to conclude that a person's affinity for cheese
causes him to like cheesecake because there is cheese in it.
There might be noncausal explanations, though. The same
people who like cheese might tend to like cheesecake because
they like all foods that are kind of soft and smooshy.

Hack 12. Graph Relationships

Whenever a relationship between two variables is discovered
and defined, we can use one variable to guess another. Drawing
a regression line allows you to picture the relationship and make
predictions.

So, you've just been named assistant regional manager of ice
cream sales for 10,000 square feet of prime beachfront retail
space along the shores of Sunflower Lake in northeast Kansas.
Congratulations! You have a lot of responsibility and many
strategic decisions to make about how to maximize profit. One
dilemma that you will confront is whether to even open. Being
open costs money and uses resources, and if you will sell few ice
cream cones that day, it probably won't be worth it to even
unlock the service window of your brightly painted plywood
shack.

If only there were some way to magically know how good
business will be on any given day. As an amateur statistician,
you assume there must be a scientific way to guess how many
cones will sell without having to actually open for business and
test the market for the day. You're in luck. There is a way to
make estimates of the value or score on some variable (such as
ice cream sales) by using other information.

The key is that the other information must come from a variable
that is related to the variable of interest. By drawing a line that
shows the relationship among your variables for the days you
know, you can look at the line as it extends into the future (or
the past) for the days you do not know and guess what will
happen. Such a graphic tool is called a regression line.

Drawing a Picture of the Future

Observant folks often discover correlations between variables
[Hack #11]. The usefulness of knowing that a relationship exists
goes beyond descriptive statistics, however.

Imagine that you have data on the activities around Sunflower
Lake. Among other things, you have collected information about
the amount of ice cream sales under the former assistant
regional manager of ice cream sales (in number of ice cream
cones sold) and the high temperature for each day (in degrees
Fahrenheit). The correlation coefficient that represents the
relationship between heat and craving for ice cream should be
positive and fairly large. That is, as the heat increases, sales
probably increase.

Intuitively, it makes sense that with some experience, you could
look at the thermometer and get a sense of how busy the ice
cream stand is going to be that day. Once you know that there is
a positive or negative relationship between two variables, it
makes sense that knowing the score on one will give you a
general idea of what the score is on the other.

Once you find a relationship between two variables like this, it is
reasonable to assume that the relationship between your two
variables is linear. In other words, if you produce a graph with all
the possible values of one variable as the X-axis (the horizontal
line along the bottom) and all the possible values of the other
variable as the Y-axis (the vertical line along the side) and then
plot each pair of scores, the resulting dots form an essentially
straight line.

Connecting the Dots

Figure 2-1 shows a way to graph the relationship between the
temperature and ice cream sales at the beach.

Figure 2-1. A linear relationship between sales and temperature

Graph A places dots to represent both values on the two
variables, based on historic information you have collected. For
instance, the lowest dot means that at 70 degrees, 50 ice cream
cones were sold. At 90 degrees, 60 cones were sold. There is a
clear pattern here, and the relationship looks like a straight line.
For every 10-degree jump in temperature, sales go up 5 cones.
For every 1-degree change in temperature, there is a 1/2-cone
increase in sales. Graph B draws a line based on this rule. The
line goes through every dot.

In Figure 2-1, analyze Graph B to get a sense of the power of a
regression equation. The line includes territory that is not
sampled by the data. For instance, we do not have data for
100-degree days. With the regression equation, though, we can
estimate what sales might be. If we place a dot on the line at the
100-degree mark, it appears to match up with the 65-cone
mark. Using this regression equation, we could estimate that on
100-degree days, 65 cones would be sold. We could do the
same for cooler days. Our graph suggests that on a 60-degree
day, 45 cones would be sold.

Playing "What If?"

The relationship between heat and cone sales can be expressed
mathematically. Our data for graphs A and B in Figure 2-1 look
like this:

High temperature   Ice cream cones sold
70                 50
80                 55
90                 60

So, let's see how we could build an equation that describes the
relationship using numbers. Regression lines are statistical
tools, after all. Notice that if we start with 70 degrees, we get 50
cones. If we enter 70 into our formula, we want 50 to be the
output. We also want 80 to get us 55, and 90 to get us 60.

I played around with different possibilities using these values in
an attempt to figure out what must be done to the input number
to get the correct output number. I noticed that the "ice cream
cones sold" value was always smaller than the temperature
variable, so I wanted an equation that would shrink the
temperature. Linear equations require a constant (some value to
use in every equation) in order to produce a straight line, so I
needed to have a constant in my equation as well. Rather than
use trial and error, you could also enter this data into a statistics
program, such as SPSS, or a spreadsheet, such as Excel, to
produce the correct components. I found that this formula works
well:

Cones Sold = 15 + (Temperature x .50)
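
If you would rather not use trial and error, a few lines of code can stand in for the statistics program. This is a minimal sketch that assumes an ordinary least-squares fit; fit_line is just an illustrative name. With our three days of data, it recovers the same constant and weight.

```python
temperatures = [70, 80, 90]
cones_sold = [50, 55, 60]

def fit_line(x, y):
    # Ordinary least squares for a single predictor
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    weight = (sum((a - mx) * (b - my) for a, b in zip(x, y))
              / sum((a - mx) ** 2 for a in x))
    constant = my - weight * mx
    return constant, weight

constant, weight = fit_line(temperatures, cones_sold)
print(constant, weight)
```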

Algebraically, if you begin with a constant
and then add some standard amount that
is altered only through basic mathematical
functions, such as multiplication, you will
define a straight line that can be graphed.

"What if?" is a fun game to play with regression lines. Enter a
value in one end and a guess comes out the other end; you can
get an answer even for unrealistic scenarios. Throw some crazy
value onto the line, such as 200 degrees, and you can still get
an estimate for cone sales: 115!
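
The "What if?" game is easy to play in code. This sketch wraps the formula in a function (the name is only for illustration) and feeds it temperatures we never observed.

```python
def estimate_cones(temperature):
    # Cones Sold = 15 + (Temperature x .50)
    return 15 + temperature * 0.50

# Temperatures outside the 70-90 degree range we have data for
for degrees in (60, 100, 200):
    print(degrees, estimate_cones(degrees))
```

The equation happily answers for any input, which is exactly why the 200-degree estimate deserves some skepticism.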

The regression equation for this relationship would describe a
line that could be drawn to show this relationship visually. With
real data, the relationship is seldom as clear as it is in our
example. (The correlation for our small fictional data set is a
perfect 1.0.)

In statistics, regression formulas make
use of the correlation coefficient, the
means, and the standard deviations of
both groups of variable scores, regardless
of the strength of the relationship in the
data set. "Use One Variable to Predict
Another" [Hack #13] presents statistical
methods for producing a regression
equation.

Why It Works

The accuracy of these sorts of regression estimates depends on
a couple important factors. First, the relationship between
variables must be fairly large. Small relationships produce dots
all over the place in patterns that aren't straight at all, and a
regression line drawn through such a mess misses a lot of dots
and is not accurate. Unfortunately, in the social sciences, we
don't find very many really strong relationships, so regression
predictions tend to produce a certain number of errors. In
statistics, errors come with the territory.

Second, the relationship must be at least sort of linear. As in our
ice cream cone example, if the nature of the relationship
changes somewhere along the regression line, the regression
line will miss some of the data. Fortunately, most relationships in
the natural world are linear or at least close to it.

Where It Doesn't Work

The actual relationship might not be exactly linear, but if it is
essentially so, then regression analysis works pretty well. For
example, with our ice cream example, maybe there is a certain
increase in sales for every degree jump in the temperature. If
that increase is the same regardless of where we are on the
scale, we'll see a linear relationship. It is possible, though, that
sales jump once a certain temperature is reached. Perhaps once
it is over 90 degrees at the beach, people really flock to get
relief.

Graphs C and D in Figure 2-2 show what happens if the true
relationship isn't exactly linear.

Figure 2-2. A nonlinear relationship


Following the requirements of linear regression, the regression
equation always produces a straight line and, in this case, two of
the dots fall right on it, but one does not. This line does a decent
job of explaining the data by picturing the relationship, but
because the relationship is not linear, the regression equation
makes some errors.
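
To see those errors in numbers, here is a rough sketch using invented data (not the data behind Figure 2-2) in which sales jump once the temperature passes 90 degrees. It assumes an ordinary least-squares line; the errors are the gaps between actual and estimated sales.

```python
# Hypothetical data: sales jump on the 100-degree day
temperatures = [70, 80, 90, 100]
cones_sold = [50, 55, 60, 85]

def fit_line(x, y):
    # Ordinary least squares for a single predictor
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    weight = (sum((a - mx) * (b - my) for a, b in zip(x, y))
              / sum((a - mx) ** 2 for a in x))
    return my - weight * mx, weight

constant, weight = fit_line(temperatures, cones_sold)

# Error for each day: actual sales minus the straight-line estimate
errors = [actual - (constant + weight * temp)
          for temp, actual in zip(temperatures, cones_sold)]

for temp, err in zip(temperatures, errors):
    print(temp, round(err, 1))
```

The straight line splits the difference, so it overshoots on some days and undershoots on others.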

Hack 13. Use One Variable to Predict Another

Simple linear regression is a powerful tool for measuring
something you cannot see or for predicting the outcome of
events that have not happened yet. With some help from our
special friend statistics, you can make a precise guess of how
someone will score on one variable by looking at performance on
another.

Many professionals, both in and outside of the social sciences,
often need to predict how a person will perform on some task or
score on some variable, but they cannot measure the critical
variable directly. This is a common need when making admission
decisions into college, for example. Admissions officers want to
predict college performance (perhaps grade point average or
years until completion). However, because the prospective
student has not actually gone to college yet, admissions officers
must use whatever information they can get now to guess what
the future holds.

Schools often use scores on standardized college admissions
tests as an indicator of future performance. Let's imagine that a
small college decides to use scores on the American College
Test (ACT) as a predictor of college grade point average (GPA)
at the end of students' first years. The admissions office goes
back through a few years of records and gathers the ACT scores
and freshman GPAs for a couple hundred students. They
discover, to their delight, that there is a moderate relationship
between these two variables: a correlation coefficient of .55.

Correlation coefficients are a measure of the strength of linear
relationships between two variables [Hack #11], and .55
indicates a fairly large relationship. This is good news because
the existence of a relationship between the two makes ACT
scores a good candidate as a predictor to guess GPA.

Simple linear regression is the procedure that produces all the
values we need to cook up the magic formula that will predict the
future. This procedure produces a regression line that we can
graph to determine what the future holds [Hack #12], but once
we have the formula, we don't actually need to do any graphing
to make our guesses.

Cooking Up the Equation

First, examine the recipe for creating the formula (see the
"Regression Formula Recipe" sidebar), and then we'll see how to
use it with real data. You can clip this recipe out and keep it in
the kitchen drawer.

Regression Formula Recipe

Ingredients

2 samples of data from correlated variables:

1 criterion variable (the one you want to predict)

1 predictor variable (the one you will predict with)

1 correlation coefficient of the relationship between the
2 variables

2 sample means

2 sample standard deviations

Container

An empty equation shaped like this:

Criterion = Constant + (Predictor x Weight)

Directions

Calculate the weight by which you will multiply your
predictor variable:

Weight = Correlation x (Standard Deviation of Criterion /
Standard Deviation of Predictor)

Calculate the constant:

Constant = Mean of Criterion - (Weight x Mean of Predictor)

Fill the regression equation with the weight and
constant you just prepared.

Serves

Anyone interested in guessing what would happen if....


The regression recipe calls for two other ingredients, means and
standard deviations for both variables. Here are those statistics
for our example:

Variable     Mean    Standard deviation
ACT scores   20.10   2.38
GPA          2.98    .68

You can review means and standard
deviations in "Describe the World Using
Just Two Numbers" [Hack #2].

The admissions office built a regression equation from this
information. Consequently, as each applicant's letter came into
the admissions office, an officer could enter the student's ACT
score into the regression formula and predict his GPA. Let's
figure out the parts of the regression equation in this example:

Weight = .55 x (.68 / 2.38) = .16

Constant = 2.98 - (.16 x 20.10) = -.24

By placing all this information into the regression equation
format, we get this formula for predicting freshman GPA using
ACT scores:

Predicted GPA = -.24 + (ACT Score x .16)

Notice that the constant in this case is a
negative number. That's OK.
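
The recipe's arithmetic can be sketched in a few lines. This assumes the usual simple-regression formulas (weight = correlation x SD of criterion / SD of predictor; constant = mean of criterion - weight x mean of predictor) and, as an assumption about how the example was worked, rounds the weight to two decimals before computing the constant.

```python
# Ingredients from the admissions example
r = 0.55                          # correlation between ACT and GPA
mean_act, sd_act = 20.10, 2.38    # predictor
mean_gpa, sd_gpa = 2.98, 0.68     # criterion

# Weight: correlation scaled by the ratio of the standard deviations
weight = round(r * sd_gpa / sd_act, 2)

# Constant: criterion mean minus (weight x predictor mean)
constant = round(mean_gpa - weight * mean_act, 2)

print(f"Predicted GPA = {constant} + (ACT Score x {weight})")
```

The order of rounding matters here. If you carry the unrounded weight (about .157) into the second step, the constant comes out closer to -.18 than -.24.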

Predicting Scores

In our college admissions example, imagine two letters arrive.
One applicant, Melissa, has an ACT score of 26. The other
applicant (let's call him Bruce) has an ACT score of 14.

Using the regression equation we have built, there would be two
different predictions for these folks' eventual grade point
averages:

For Melissa

Predicted GPA = -.24 + (26 x .16)

Predicted GPA = -.24 + 4.16

Predicted GPA = 3.92

For Bruce

Predicted GPA = -.24 + (14 x .16)

Predicted GPA = -.24 + 2.24

Predicted GPA = 2.00

I hope, for Bruce's sake, there is more than one spot available.
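
An admissions officer with a stack of letters might prefer a small function to doing this by hand. This is only a sketch; the function name is made up for illustration.

```python
def predict_gpa(act_score):
    # Predicted GPA = -.24 + (ACT Score x .16)
    return -0.24 + act_score * 0.16

applicants = {"Melissa": 26, "Bruce": 14}

for name, act in applicants.items():
    print(name, round(predict_gpa(act), 2))
```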

The two variables in this example, ACT
scores and GPA, are on different scales,
with ACT scores typically running between
1 and 36 and GPA ranging from 0 to 4.0.
Part of the magic of correlational analyses
is that the variables can be on all sorts of
different scales and it doesn't matter. The
predicted outcome somehow knows to be
on the scale of the criterion variable. Kind
of spooky, huh?

Why It Works

When two variables correlate with each other, there is overlap in
the information they provide. It is as if they share information.
Statisticians sometimes use correlational information to talk
about variables sharing variance.

If some of the variance in one variable is accounted for by the
variance in another variable, it makes sense that smart
mathematicians can use one correlated variable to estimate the
amount of variance from the mean (or distance from the mean)
on another variable. They would have to use numbers that
represent the variables' means and variability, and a number
that represents the amount of overlap in information. Our
regression equation uses all that information by including
means, standard deviations, and the correlation coefficient.

Where Else It Works

Regression is helpful in answering research questions beyond
making predictions. Sometimes, scientists just want to
understand a variable and how it operates or how it is distributed
in a population. They can do this by looking at how that variable
is related to another variable that they know more about.

Statisticians call simple linear regression
simple not because it is easy, but because
it uses only one predictor variable. It is
simple as compared to complex. Real-life
predictions like those in our example
usually use many predictors, not just one.
The method of predicting a criterion
variable using more than one predictor is
called multiple regression [Hack #14].

Where It Doesn't Work

There will be error in predictions under three circumstances.
First, if the correlation is less than perfect between two
variables, the prediction will not be perfectly accurate. Since
there are almost never really large relationships between
predictors and criteria, let alone perfect 1.0 correlations,
real-world applications of regression make lots of mistakes. In the
presence of any correlation at all, though, the prediction is more
accurate than blind guessing. You can determine the size of your
errors with the standard error of estimate [Hack #18].

Second, linear regression assumes that the relationship is linear.
This is discussed in "Graph Relationships" [Hack #12] in
greater detail, but if the strength of the relationship varies at
different points along the range of scores, the regression
prediction will make large errors in some cases.

Finally, if the data collected to first establish the values used in
the regression equation are not representative of future data,
results will be in error. For example, in our college admissions
example, if an applicant presents with an ACT score of 36, the
predicted GPA is 5.52. This is an impossible value that does not
even fit on the GPA scale, which maxes out at 4.0. Because the
past data that was used to establish the prediction formula
included few or no ACT scores of 36, the equation was not
equipped to deal with such a high score.
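
One cautious habit is to check whether a prediction even lands on the criterion's scale before trusting it. The sketch below assumes a 0 to 4.0 GPA scale; the helper name is hypothetical.

```python
GPA_MIN, GPA_MAX = 0.0, 4.0   # assumed limits of the GPA scale

def predict_gpa(act_score):
    # Predicted GPA = -.24 + (ACT Score x .16)
    return -0.24 + act_score * 0.16

def on_gpa_scale(gpa):
    # True only when the prediction is a GPA someone could actually earn
    return GPA_MIN <= gpa <= GPA_MAX

for act in (26, 36):
    gpa = predict_gpa(act)
    print(act, round(gpa, 2), on_gpa_scale(gpa))
```

A flag like this does not fix the equation, of course. It just warns you that the input is outside the territory the original data covered.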

Hack 14. Use More Than One Variable to Predict Another

The super powers of predicting the future and seeing the
invisible are available to any statistics hackers who feel they
are worthy. Statisticians often answer questions and use
correlational information to solve problems by using one
variable to predict another. For more accurate predictions,
though, several predictor variables can be combined in a single
regression equation by using the methods of multiple regression.

"Graph Relationships" [Hack #12] discusses the useful
prophetic qualities of a regression line. Those procedures allow
administrators and statistical researchers to predict
performance on assessments never taken, understand variables,
and build theories about relationships among those variables.
They accomplish these tricks using just a single predictor
variable.

"Use One Variable to Predict Another" [Hack #13] presents the
problem colleges have when deciding which applicants to admit.
They want to admit students who will succeed, so they try to
predict future performance. The solution in that hack uses one
variable (a standardized test score) to estimate performance on
a future variable (college grades).

Often, real-life researchers want to make use of the information
found in a bunch of variables, not just one variable, to make
predictions or estimate scores. When they want greater
accuracy, scientists attempt to find several variables that all
appear to be related to the criterion variable of interest (the
variable you are trying to predict). They use all this information
to produce a multiple regression equation.

Choosing Predictor Variables

You probably should read or reread "Use One Variable to Predict
Another" [Hack #13] before going further with this hack, just to
review the problem at hand and how regression solves it. Here is
the equation we built in that hack for using a single predictor,
ACT scores, to estimate future college GPA:

Predicted GPA = -.24 + (ACT Score x .16)

This single predictor produced a regression equation with output
that correlated .55 with the criterion. Pretty good, and pretty
accurate, but it could be better.

Imagine our administrator decides she's unhappy with the level
of precision she could get using the regression line or equation
she had built, and wants to do a better job. She could get a more
accurate result if she could find more variables that correlate
with college grades. Let's imagine that our amateur statistician
found two other predictor variables that correlated with college
performance:

An attitude measure

The quality of a written essay

Perhaps performance on a college attitude survey is collected
by the college (scores range between 20 and 100), and is found
to have some correlation with future GPA. Additionally, a score
of 1 to 5 on a personal essay could correlate with college GPA
and might be included in the multiple regression equation.

Building a Multiple Regression Equation

Let's look first at the abstract format of the regression equation
in general. Then, we'll apply the tool to the task at hand. Here is
the basic regression equation using just one predictor variable:

Criterion = Constant + (Predictor x Weight)

If you want to use more information, you can extend this
equation to include more predictors. Here's an equation with
three predictors, but you could expand the equation form to
include any number of predictors:

Criterion = Constant +
            (Predictor 1 x Weight 1) +
            (Predictor 2 x Weight 2) +
            (Predictor 3 x Weight 3)

Each predictor has its own associated weight, which is
determined through statistical formulas that are based on the
correlation between the predictor and the criterion variable. The
equations for this process are somewhat complex, so I won't
show them here. (You're welcome.) In real-life regression
equation building, computers are almost always used to produce
multiple regression equations.

I used the statistical software SPSS for
many of the computations in this book,
using data, often fictional, that I entered
into SPSS data files. Microsoft's Excel is
another handy tool for performing simple
statistical analyses.

Using realistic data that we might find with three predictors that
correlate with the criterion, as well as correlate with each other
somewhat, we might produce a regression equation with values
like this:

Predicted GPA = 3.01 +
                (ACT Score x .02) +
                (Attitude Score x .007) +
                (Essay Score x .025)

With the imaginary data I used on my computer to produce these
weights, the overall equation predicted college GPA very well,
finding a correlation of .80 between observed GPA values and
predicted GPA values. This is much better than the .55
correlation of our single predictor.
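
Here is a minimal sketch of that three-predictor equation in code. The constant and weights come from the equation above; the applicant's scores and the dictionary layout are invented for illustration.

```python
CONSTANT = 3.01
WEIGHTS = {
    "act": 0.02,        # ACT score
    "attitude": 0.007,  # attitude survey, scored 20 to 100
    "essay": 0.025,     # personal essay, scored 1 to 5
}

def predict_gpa(scores):
    # Multiply each predictor score by its weight, then add the constant
    return CONSTANT + sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical applicant
applicant = {"act": 20, "attitude": 50, "essay": 3}
predicted = predict_gpa(applicant)
print(round(predicted, 3))
```

Adding a fourth predictor would mean one more entry in the weights and one more score per applicant; the function itself would not change.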

When we add two other predictors to the
model (a description of a group of variables
and how they are related), specifically the
attitude measure and the essay score, the
weight for the ACT score changes. This is
because of the use of partial correlations
instead of one-to-one correlations for each
predictor. In addition, the constant
changes. This is discussed later, in the
"Why It Works" section of this hack.

Making Predictions and Understanding Relationships

To estimate what a prospective student's college performance
will be, our administrator takes the scores for that student on
each of the predictors and enters them into the equation. She
multiplies each predictor score by its weight and adds the
constant. The resulting value is the best guess for future
performance. It might not be exactly right, of course (and, in
fact, is most likely not exactly right), but it is a better guess
than having no information at all.

If you have no information at all and have
to guess how a student will do in college,
you should guess that she will earn the
mean GPA, whatever that is for your
institution.

What if you want to do more than just predict the future, and want
to really understand the relationships between your predictors
and the criterion? You might do this because you want to build a
more efficient formula that doesn't require a bunch of information
that isn't very useful. You also might do it just because you want
to build theory and understand the world, you crazy scientist,
you! The problem is that it is hard to know the independent
contribution of each predictor by just looking at the weights.

The weights for each variable in a multiple regression equation
are scaled to the actual range of scores on each variable. This
makes it hard to compare each predictor to figure out which
provides the most information in predicting the criterion.
Comparing these raw weights can be misleading, as a variable
might have a smaller weight just because it is on a bigger scale.

Compare the weight for ACT score with the weight for attitude
score, for example. The weight of .02 for ACT is larger than the
weight of .007 for attitude, but don't be fooled into thinking that
ACT scores play a larger role in predicting GPA than attitude.
Remember, ACT scores range from 1 to 36, whereas attitude
scores range from 20 to 100. Because attitude scores are spread
over a much wider scale, a smaller weight for attitude can still
move the predicted GPA about as much as the larger weight for
ACT scores does.

Computer program results for multiple regression analyses often
provide information in the format shown in Table 2-4.

Table 2-4. Multiple regression results

Criterion         Nonstandardized weights   Standardized weights
Constant          3.01                      -----
ACT scores        .02                       .321
Attitude scores   .007                      .603
Essay scores      .025                      .156

The third column in Table 2-4 is more useful than the
"Nonstandardized weights" values in identifying the key
predictors and comparing the unique contributions of each
predictor to estimating the criterion.

Standardized weights are the weights you
would get if you first convert all the raw
data into z scores [Hack #26]: the
distance of each raw score from the mean
expressed in standard deviations.
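
The raw GPA data set isn't shown, so this sketch borrows the five-person cheese data from "Discover Relationships" [Hack #11] as a stand-in and uses just one predictor. It assumes an ordinary least-squares fit. Converting both variables to z scores before fitting gives the standardized weight directly, and rescaling the raw weight by the two standard deviations gives the same number.

```python
from math import sqrt

cheese = [50, 45, 30, 30, 10]        # predictor
cheesecake = [36, 35, 22, 25, 20]    # criterion

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    m, sd = mean(values), sample_sd(values)
    return [(v - m) / sd for v in values]

def fit_line(x, y):
    # Ordinary least squares; returns (constant, weight)
    mx, my = mean(x), mean(y)
    weight = (sum((a - mx) * (b - my) for a, b in zip(x, y))
              / sum((a - mx) ** 2 for a in x))
    return my - weight * mx, weight

raw_constant, raw_weight = fit_line(cheese, cheesecake)
std_constant, std_weight = fit_line(z_scores(cheese), z_scores(cheesecake))

# Same standardized weight, reached by rescaling the raw weight
rescaled = raw_weight * sample_sd(cheese) / sample_sd(cheesecake)

print(round(raw_weight, 3), round(std_weight, 3), round(rescaled, 3))
```

With a single predictor, the standardized weight is simply the correlation between the two variables. With several predictors, the weights also depend on how the predictors overlap with each other.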

The standardized weights have placed all predictors on the same
scale. By doing this, the relative overlap of each predictor with
the criterion can be fairly compared and understood. For
example, with this data, it is probably appropriate to say that
attitude explains about twice as much about college GPA as does
ACT performance, because the standardized weight for attitude
is .603, about twice as much as the standardized weight for ACT
scores (.321).

Why It Works

Multiple linear regression does a better job in predicting
outcomes than simple linear regression because multiple
regression uses an additional bit of information to compute the
exact weights for each predictor. Multiple regression knows the
correlation of each predictor with the other predictors and uses
that to create more accurate weights.

This bit of complexity is necessary because if the predictors are
related to each other, they share some information. They aren't
really independent sources of prediction if they correlate with
each other. To make the regression equation as accurate as
possible, statistical procedures remove the shared information
from each predictor in the equation. This produces independent
predictors that come at the criterion from different angles,
producing the best prediction possible.

Imagine two predictor variables that
correlate perfectly with each other (that is,
the correlation equals 1.00). Using both
variables in a regression equation would
be no more accurate than using just one
(doesn't matter which one) by itself. By
extension, any overlap between predictors
(i.e., any correlation between predictors
greater or less than 0.00) is redundant
information.

Figure 2-3 illustrates the use of multiple sources of independent
information to estimate a criterion score.

Figure 2-3. Multiple predictors in multiple regression

T he c orrelation information us ed to determine the weight for
eac h predic tor in multiple regres s ion is not the one- to- one
c orrelation between a predic tor and the c riterion. I ns tead, it is
the c orrelation between a predic tor and the c riterion when the
overlap among all the predic tors has been removed.

This process produces predictor variables that are somewhat
different from the actual measured variables. By statistically
removing (or controlling for) the shared information among
predictors, the predictors become conceptually different from what
they were before. As Figure 2-3 shows, now they are independent
predictors with a different "shape." The correlations between these
altered variables and the criterion variable are used to produce the
weights.

Correlations between predictor variables and a criterion variable
when all the redundant shared information has been statistically
removed from the predictors are called partial correlations. Partial
correlations are the one-on-one correlations you would get between
each predictor and the criterion if the predictor variables did not
correlate with each other.

Where Else It Works


Multiple regression is used every day by real people in the real
world for one of two reasons. First, multiple regression allows for
the construction of a prediction equation, so people can use scores
on a group of variables that they have in front of them to estimate
a score on another variable that they cannot have in front of them
(because it is either in the future or cannot be measured easily for
some reason). This is how the tool of multiple regression is used to
solve problems in the world of applied science.

Multiple regression also allows for examination of the independent
contribution that each of a group of variables makes to some other
variable. It allows us to see where there is information overlap
among variables and build theories to understand or explain that
overlap. This is how the tool of multiple regression is used to solve
problems in the world of basic science.

Hack 15. Identify Unexpected Outcomes

How do you know if your observations are correct or if you are just
biased? How do you know when there is more or less of something than
should have occurred by chance? You can find out for sure by using
the flexible one-way chi-square test.

In science, the oldest type of observational research involved
counting people, animals, and things:

How many people are on this boat?

What proportion of butterflies have little green spots on their
wings?

As the field of inferential statistics matured, the questions became
more specific:

Were an equal number of boys and girls born in London in 1812?

Are an equal number of crimes committed at different times of day?

The research question for these situations is "are they equal?" (or,
at least, are they close enough that any fluctuations are probably
due to chance). The implication of an unequal distribution is that
something is going on. What, exactly, is going on cannot be answered
by this sort of question. It is a start, though, and a good first
question to ask.

Have you ever noticed that something seemed to be going on, but
weren't sure if it was just your imagination? Do a greater number of
hippies shop at the local community mercantile than would be
expected by chance? If the answer is yes, and you are looking to meet
hippies, you should start hanging out there.

In business, and for those who have to provide services, identifying
where there is the most need is crucial. Observational data can be
used to solve that problem. Even just in everyday life, we all have
our beliefs (which might be biased) that are based on observations.
I have noticed a lot of hippies at the community mercantile, but
maybe I am just on the lookout for hippies when I am in that store.
Are there really more hippies than normal there? More hippies than,
say, nonhippies?

These sorts of questions can be answered using a statistical tool
appropriate for seeing whether the number of "things" within each of
a number of categories is more unequal than would normally be found
by chance. This tool is named the one-way chi-square.

This statistical analysis is called chi-square because the symbol
used for the value it generates is χ, which is the Greek letter chi
(pronounced "kye"). The values needed in the calculations are all
squared, thus we call this whole thing a chi-square or chi-squared.

Determining Whether Something Is Going On

Imagine you are responsible for scheduling the police officers in
your town. The problem is that you don't know whether to schedule
the same number of officers for every shift or whether more crime
might occur during particular shifts. If one shift is likely to be
busier, you should probably assign more officers. Of course, another
reason to assign more officers during that time is that their
patrolling might cut crime down a bit.

Here is an example of some imaginary data describing crime events
for three periods of time. Imagine the data was collected over a
30-day period, and you would like to use this data to plan for the
coming year. The numbers indicate how many crimes were committed
during each of three police shifts.

Midnight - 8 a.m.   8 a.m. - 4 p.m.   4 p.m. - Midnight   Total
120                 90                90                  300

It certainly looks like more crimes occur late at night. By
observation alone, we might conclude that there is more crime late
at night. Perhaps that is just in our sample, though, and there
really isn't a difference in the population of all the data we could
have collected.

Calculating the Chi-Square

We could compute a chi-square for this data. If the chi-square is
really big, then the count of 120 crimes is unusually large compared
with the other two crime periods. How big "really big" needs to be is
an important question that we will explore later in this hack.

Here's how to think about the analysis we are about to do. If 300
crimes are committed and time of day makes no difference, we would
expect 33.3 percent of them, or 100, to occur in each of three
equally long time intervals during the day. If there are more or
fewer than 100 in any of those intervals, something might be going
on. Perhaps the time of day matters in the commission of crimes. Of
course, there might be some chance fluctuation, but the larger the
difference between the expected and the actual frequencies, the less
likely that those differences are just chance.

Here is the chi-square formula:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Σ is a symbol that means to sum or add up the things that follow it.

Let's calculate a chi-square for this data. The observed frequency
for each category is given. The expected frequency for each cell
would be 300 divided by three categories, or 100:

$$\chi^2 = \frac{(120 - 100)^2}{100} + \frac{(90 - 100)^2}{100} + \frac{(90 - 100)^2}{100} = 4 + 1 + 1 = 6$$

The chi-square for this data is 6. Okay. Now what? Is 6 big or small
or what? Could a chi-square as big as 6 occur by chance?
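
For readers who would rather let a computer do the arithmetic, here
is a minimal sketch of the same calculation in Python. The function
name one_way_chi_square is hypothetical, and the sketch assumes, as
this hack does, that the expected frequency is the same in every
category unless you pass something different.

```python
def one_way_chi_square(observed, expected=None):
    """Sum of (observed - expected)^2 / expected across categories."""
    if expected is None:
        # With no other model in mind, expect an equal share per category.
        equal_share = sum(observed) / len(observed)
        expected = [equal_share] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Crimes per shift: midnight-8 a.m., 8 a.m.-4 p.m., 4 p.m.-midnight
crimes = [120, 90, 90]
chi_square = one_way_chi_square(crimes)
print(chi_square)  # 6.0
```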

Determining if the Chi-Square Is "Really Big"

As with all statistics (such as correlation coefficients [Hack #11],
t tests [Hack #17], proportions, and so on), statisticians have
mapped out the distribution of the chi-square. In other words, we
know the likelihood that chi-squares of different sizes will occur
by chance. The likelihood of finding chi-squares of particular
magnitudes depends on the number of categories.

Table 2-5 shows a portion of a theoretically giant table that shows
the chi-square values that one must beat in order to be 95 percent
sure (level of significance = .05) that the value didn't get that
big just because of chance fluctuations in the sample. We know that
chi-squares as large as these critical values occur by chance 5
percent or less of the time because chi-squares, like almost
everything else in the orderly world of statistics, have a known
distribution (i.e., a known set of likelihoods that certain values
will occur). Like the normal curve, the chi-square distribution is
well-defined [Hack #23].

Table 2-5. Critical chi-square values at the .05 level of
significance

Two categories   Three categories   Four categories   Five categories
3.84             5.99               7.82              9.49

Our chi-square value is 6, which is higher than the critical value
for three categories (5.99). This means something very specific, so
I'll emphasize it. Though I am specifically referring to the crime
rate problem at hand, I am using the same pattern of words that
describes all statistical findings that are significant at the .05
level.

If, in the population, there are no differences in the number of
crimes committed at the three times of day, you would occasionally
draw out random samples with differences that produce a chi-square
of 6 or larger, but it would happen less than 5 percent of the time.

It seems reasonable to conclude, then, that in the population there
are differences in frequency of crime based on time of day. Because
these differences are "real," it is reasonable to schedule a year's
worth of police patrols based on them.
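
The "less than 5 percent of the time" claim can be checked two ways
in Python. This is a sketch under stated assumptions: the function
names and the simulation settings (5,000 trials, a fixed seed) are
made up for illustration. The first function relies on a convenient
fact about the chi-square distribution: with three categories (two
degrees of freedom), the chance of a value of x or larger is exactly
exp(-x / 2). The second simply deals 300 imaginary crimes into three
shifts at random, over and over, and counts how often the chi-square
reaches 6.

```python
import math
import random

def p_value_three_categories(chi_square):
    """Chance of a chi-square this large or larger with three categories
    (two degrees of freedom), using the closed form exp(-x / 2)."""
    return math.exp(-chi_square / 2)

def simulated_p_value(cutoff=6.0, crimes=300, shifts=3, trials=5000, seed=1):
    """Share of random, no-difference samples whose chi-square >= cutoff."""
    rng = random.Random(seed)
    expected = crimes / shifts
    hits = 0
    for _ in range(trials):
        counts = [0] * shifts
        # Every crime is equally likely to land in any shift.
        for shift in rng.choices(range(shifts), k=crimes):
            counts[shift] += 1
        chi_square = sum((c - expected) ** 2 / expected for c in counts)
        if chi_square >= cutoff:
            hits += 1
    return hits / trials

p_exact = p_value_three_categories(6.0)   # just under .05
p_simulated = simulated_p_value()         # should land near .05
critical = -2 * math.log(0.05)            # about 5.99, as in Table 2-5

print(round(p_exact, 4), p_simulated, round(critical, 2))
```

The closed form applies only to the three-category case; other
numbers of categories need a table or a statistics package.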

Why It Works

Data for chi-square analyses are laid out in a way in which the
observed number of things in each category can be compared with the
expected number of things in each category. The "expected number of
things in each category" is usually defined as an equal number. If
nothing is going on (i.e., if the category makes no difference), we
expect an equal number of things in each category.

Chi-squares work with categorical data. Essentially, the difference
between what was expected and what was observed is computed for
each category. Each difference is squared and divided by the
expected frequency (as a way to standardize all the differences),
and then those ratios are all added together. The size of the
resulting number determines its likelihood of occurring by chance.
The bigger the number, the less likely that chance alone explains
things. There is a known distribution (list of probabilities
associated with each possible chi-square value) that is used by a
table (or computer) to assign a specific probability to each
chi-square value.

If there are two or more categories and the researcher wants to know
whether the actual distribution across these categories is what
would be expected by chance alone, then the chi-square is an
appropriate test. The actual value that is tested is the difference
between what the researcher expects to find and what actually
occurs.

The chi-square test is used in the framework of having certain
expectations and seeing whether they are met by the observed data.
This is a simple form of model testing. The researcher has a belief
system, in the form of some model or hypothesis of how the world
should behave. She then observes the world (collects data) and
compares her observations to her model. If the data fits the model,
this is support for her hypothesis. The chi-square test,
consequently, is considered a goodness-of-fit statistic. It answers
the question of how well the data fits a model.

Some statistics textbooks refer to the one-way chi-square as the
single sample chi-square, so don't get confused. But what are you
doing reading some other statistics book anyway?

Statisticians know the size of normal fluctuations in observed
frequencies compared to expected frequencies. With this knowledge,
they can compute the likelihood that any observed deviation from the
expected occurred by chance rather than because something else is
going on.

Where Else It Works


Though a simple and historically ancient (about 80 years old, which
is old by statistics standards!) statistical method, the chi-square
is very useful for a variety of statistical questions, both with
data at low levels of measurement and, surprisingly, within very
advanced statistical methods. Because it is a fairly straightforward
way to test a model (or quantify "goodness of fit"), the chi-square
is used as part of complex correlational analyses and measurement
diagnostics.

Chi-square analyses are used to see whether complicated theoretical
models of the world (comprehensive maps of relationships among
variables) actually match real-world data. If the real world
deviates too much from the expectations implied by one of these
models, it is concluded that the model is weak. A significant
chi-square is the criterion used for "too much" deviation.

For example, if test developers are concerned about item bias (that
one item might work differently for one identifiable group than for
another, such as races, genders, and so on), they will check whether
the patterns of answer options meet certain expectations regardless
of which group generated the data. The chi-square analysis compares
the expectations to actual test performance.

See Also

"Identify Unexpected Relationships" [Hack #16]

Hack 16. Identify Unexpected Relationships

If you want to verify whether a relationship you have observed
between two variables is real, you have a variety of statistical
tools available. A problem arises, though, when you have measured
these variables without much precision, using categorical
measurement. The solution is a two-way chi-square test, which, among
other things, can be used to make unsubstantiated assumptions about
the characteristics of people you have just met.

"Identify Unexpected Outcomes" [Hack #15] used the one-way
chi-square test to make police scheduling decisions based on whether
equal numbers of crimes were committed at different times of day.
That tool works well to solve any analytical problem when:

The data is at the categorical level of measurement (e.g., gender,
political party, ethnicity).

You want to determine whether there is a greater frequency of scores
in certain categories than would be expected by chance.

You face another common analytic problem when you're curious to know
whether two categorical variables are related to each other.
Relationships between categorical variables can be examined with the
handy two-way chi-square test.

If two variables are measured at the interval level (many scores are
possible along a continuum), the correlation coefficient [Hack #11]
is the best tool to use, but it doesn't work well with categorical
measurement.

We make assumptions all the time about relationships between these
sorts of variables. Many of our common stereotypes about categories
of people have implicit hypotheses about these relationships. Here
are a few assumptions you might have that imply a relationship
between categorical variables:

Professors are absent-minded.

Computer programmers play Dungeons and Dragons.

Adults who collect comics write Statistics Hacks books.

If you meet a computer programmer at a party and you hold this
stereotype about this type of person, you might assume that she is
familiar with 20-sided dice. If you are wrong, though, that might
lead to much awkward conversation. It would be nice to know if there
really were such relationships between these categorical variables
of interest. Calculating a two-way chi-square solves this problem
and can verify or cast doubt on these assumptions about people.

Review of the One-Way Chi-Square

The chi-square test is used in the framework of having certain
expectations and seeing whether they are met by the observed data.
Statisticians know the size of normal fluctuations in observed
frequencies compared to expected frequencies. With this knowledge,
they can estimate the likelihood that any observed deviation from
the expected occurred by chance rather than because something else
is going on. The raw data for these analyses is usually the number
of people (the frequency) in each category of some variable.

Here is the general chi-square formula:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Σ means to add up the things that follow it. The bigger the
chi-square, the less likely it is that the outcomes occurred
randomly.

Answering Relationship Questions

While the one-way chi-square analyzes a single categorical variable,
two-way chi-squares analyze the relationship between two categorical
variables. The process is the same: compare the expected frequencies
with actual frequencies for each category or combination of
categories. If the differences add up to a big number, then
something is going on.

Here is a categorical relationship question that we might like to
have answered. It is similar to other issues of stereotype that
could be explored:

Are females more likely to be Democrats or Republicans?

You probably already have some assumption about this, but how would
you go about checking the accuracy of such an assumption?

Conduct preliminary analyses

Look at Table 2-6 for an example of categorical frequency data for,
to start, a single categorical variable. This data is fictional, but
consistent with published studies, which typically find that
Republicans are more likely to be male and that females tend to more
commonly identify as Democrats.

Table 2-6. Hypothetical sample of Republicans

Males   Females
45      30

In this random sample of 75 Republicans, 45 are males and 30 are
females. That's 60 percent male and 40 percent female. Can we
conclude that Republicans in general are more likely to be male than
female? If they are not, we would expect about 50 percent males and
50 percent females in our sample.

A one-way chi-square could see whether more Republicans are male
than female, but that's not the hack we are exploring here.

This isn't our research question, though.

Compute the two-way chi-square

Our initial question included only Republicans, so while political
party might have seemed like a variable in our first analysis, it
was really just a description of the population; it did not vary at
all. We can add party to our analysis, though, by adding another
category (Democrat, for example) and recruiting 75 more
participants, and suddenly we have data with two variables. Imagine
frequency data as shown in Table 2-7.

Table 2-7. Hypothetical sample of voters

Party        Males   Females   Totals
Republican   45      30        75
Democrat     34      41        75
Totals       79      71        150

Here we have two categorical variables: party affiliation and sex.
We could go ahead and use a one-way analysis to look at either of
the two rows by themselves. However, a more typical question would
be, "Is there a relationship between party and sex?"

Q: "Is there a relationship between party and sex?"

A: Reminds me of my freshman year.

(Ha! I got a million of 'em. I'll be here all week. Good night,
everybody!)

To calculate a standardized measure of the difference between the
expected frequencies and the observed frequencies, we use the same
formula as with the one-way chi-square. As "Identify Unexpected
Outcomes" [Hack #15] demonstrates, we start by totaling up the
differences between expected and observed frequencies in each cell
(each square of a table).

We do the same with the two-way chi-square. The expected frequency
in each cell is equal to the number of people in that cell's row
multiplied by the number of people in that cell's column and then
divided by the total sample size. Using the data in Table 2-7, the
calculations for expected frequencies are shown in Table 2-8.

Table 2-8. Expected frequencies for two-way chi-square analysis

Party        Males                  Females
Republican   (75x79) / 150 = 39.5   (75x71) / 150 = 35.5
Democrat     (75x79) / 150 = 39.5   (75x71) / 150 = 35.5

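Thus, the two-way chi-square calculations look like this:

$$\chi^2 = \frac{(45 - 39.5)^2}{39.5} + \frac{(30 - 35.5)^2}{35.5} + \frac{(34 - 39.5)^2}{39.5} + \frac{(41 - 35.5)^2}{35.5} = .77 + .85 + .77 + .85 = 3.24$$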

Determine if the chi-square is big enough

Statisticians know that the critical chi-square value for 2x2 tables
(like the one we just analyzed) is 3.84. Chi-square values greater
than 3.84 are found by chance about 5 percent of the time or less
[Hack #15].

Because our chi-square value was 3.24 and that is less than the key
5 percent value of 3.84, we know that such a fluctuation can occur
by chance somewhat more than 5 percent of the time. We cannot claim
statistical significance here, and so we must conclude that though
our sample seemed to show a relationship between the two categorical
variables of party affiliation and sex, it might have occurred
because of chance sampling error. In the population from which the
sample was drawn, there might not be any relationship.
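
The whole two-way analysis fits in a short Python sketch. The
function name two_way_chi_square is hypothetical. The expected
frequencies follow the row-times-column-divided-by-total rule
described above. The probability on the last lines uses a fact that
holds only for 2x2 tables (one degree of freedom): the chance of a
chi-square of x or larger equals erfc(sqrt(x / 2)).

```python
import math

def two_way_chi_square(table):
    """Chi-square for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count: row total x column total / sample size
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square

# Table 2-7: rows are Republican, Democrat; columns are males, females
voters = [[45, 30],
          [34, 41]]

chi_square = two_way_chi_square(voters)
print(round(chi_square, 2))  # 3.24

# 2x2 tables only: chance of a chi-square at least this large
p_value = math.erfc(math.sqrt(chi_square / 2))
is_significant = chi_square > 3.84
print(p_value, is_significant)
```

The probability comes out a bit above .05, which agrees with the
table-based conclusion that this result is not significant.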

Why It Works

A two-way chi-square answers this relationship question by looking
at differences. This might seem counterintuitive, because most
statistics look for differences in order to show, well, a
difference, not to show similarities. But here's the thinking:

If there is no relationship between party and sex, then each sex
should be equally split between Republicans and Democrats.

Also, if there is no relationship, then each party should be equally
split between males and females.

This equal distribution in both directions is what is expected by
chance. Large deviations from those expectations suggest that
something is going on.

The problem solved with this hack was one of knowing whether a
stereotype we held was correct. Of course, outside of the real
world, in the scientific world, researchers use this tool to explore
a wide variety of complex questions.

Two-way chi-squares, sometimes called contingency table analyses,
are useful anytime you have two categorical variables and want to
see whether there is some dependency of one variable on the other.
Our example used variables with just two categories, but similar
analyses can be done on variables with many categories. The
technical requirements are a bit more complex, but the procedure is
the same.

See Also

"Identify Unexpected Outcomes" [Hack #15]

Hack 17. Compare Two Groups

Which is better? Which has more? Do people really differ?
Quantitative questions like these dominate the polite conversations
of our times. If you want some real evidence for your beliefs about
the best, most, and least, you can use a statistical tool called the
"t test" to support your point.

My Uncle Frank is full of opinions. Green M&Ms taste better than
blue. Women never get speeding tickets. The Brady Bunch kids could
sing better than the Partridge Family. Plaid is back. He can argue
all day, spouting half-baked idea after half-baked idea. While I
disagree with him on all four points (especially the position that
plaid is back; after all, it never left!), I have only my opinions
to fight with.

If only there were some scientific way to prove whether Uncle Frank
is right or wrong! You no doubt recognize the rhetorical nature of
my plea. After all, there are only about a gazillion statistical
tools that exist to test hypotheses like these. One of the simplest
tools is designed to test the simplest of claims. If the problem is
deciding whether one group differs from another, the procedure known
as an independent t test is the best solution.

Proving Uncle Frank Wrong (or Right)


To apply a t test to investigate one of Uncle Frank's theories, we
have to compute a t value. Let's imagine that I decided to actually
challenge Uncle Frank and collect some data to see whether he is
right or wrong.

Uncle Frank believes that males get speeding tickets more frequently
than females. To test this hypothesis, imagine that I select two
groups of 15 drivers randomly [Hack #19] from his neighborhood. One
group is female, and the other is male. I ask them some questions.
Pretend that over the course of the last five years, the male group
averaged 1.71 speeding tickets with a variance of .71. The female
group averaged 1.35 speeding tickets with a variance of .25.

Variance is the total amount of variability in a given group of
numbers. It is calculated by finding the distance of each score in
the group from the mean score. Square those distances and average
them to get the variance.
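
As a quick illustration of that recipe, here is a minimal Python
sketch. The ticket counts are made up, and the function name variance
is hypothetical. It follows the description above literally by
averaging the squared distances; be aware that many statistics
packages divide by one less than the number of scores when the
numbers are a sample, which gives a slightly larger answer.

```python
def variance(scores):
    """Average squared distance of each score from the mean."""
    mean = sum(scores) / len(scores)
    squared_distances = [(score - mean) ** 2 for score in scores]
    return sum(squared_distances) / len(scores)

# Hypothetical speeding-ticket counts for six drivers
tickets = [0, 1, 1, 2, 2, 3]
ticket_variance = variance(tickets)
print(ticket_variance)
```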

Here is the equation for producing a t value:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Here, each group's mean is \bar{X}, its variance is s^2, and its
sample size is n.

The larger the t value, the less likely that any differences found
between your sample groups occurred by chance. Typically, t values
larger than about 2 are big enough to reach the conclusion that the
differences exist in the whole population, not just in your samples.

The t formula shown here works best when both groups have the same
number of people in them. A similar formula that averages the
variance information is used when there are unequal sample sizes.

Is there support for Uncle Frank's belief? To determine that, our
calculations require the data in Table 2-9.

Table 2-9. Data for speeding ticket t test

              Group 1 (males)   Group 2 (females)
Mean          1.71              1.35
Variance      .71               .25
Sample Size   15                15

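If we place those key values into our t formula, it looks like this:

$$t = \frac{1.71 - 1.35}{\sqrt{\frac{.71}{15} + \frac{.25}{15}}}$$

The calculations work out this way:

$$t = \frac{.36}{\sqrt{.047 + .017}} = \frac{.36}{\sqrt{.064}} = \frac{.36}{.253} = 1.42$$

In this case, a mean difference of .36 produces a t value of 1.42.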
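
The same arithmetic can be sketched in Python. The function name
independent_t is hypothetical; it divides the difference between the
two group means by the square root of each variance over its sample
size, added together, which is the calculation that produces the 1.42
reported here.

```python
import math

def independent_t(mean1, var1, n1, mean2, var2, n2):
    """t value for two independent groups (equal group sizes work best)."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Table 2-9: males first, females second
t_value = independent_t(1.71, 0.71, 15, 1.35, 0.25, 15)
print(round(t_value, 2))  # 1.42
```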


Interpreting the t Value

Could our t value of 1.42 have occurred by chance? In other words,
if the actual difference in the population is zero, could two
samples drawn from that single population produce means that differ
by that much?

Earlier, I mentioned that values of 2 or greater are typically
required to conclude that a difference is real. Under this standard,
we would conclude that there is no evidence that males really do get
more tickets than females. They did in our sample, of course, but
might not if we measured everybody (the whole population). There is
no evidence that Uncle Frank is right. This is different in an
important way from concluding that he is wrong, but it still means
he should lose this particular argument.

Statistics is all about precision, though, so let's explore our 1.42
a little further. How big, exactly, would it need to be for us to
conclude that Uncle Frank is actually right?

The answer, determined through custom, is that if the t is bigger
than would occur by chance 5 percent of the time or less, then the t
is big enough. Fortunately, the chances of finding t values of
various sizes when samples are drawn randomly from a population have
been determined by hard-working mathematicians using assumptions of
the Central Limit Theorem [Hack #2]. The exact t value required for
statistical significance depends on the total sample size in both
groups combined. Table 2-10 provides t values that you must meet or
beat to declare statistical significance at the .05 level.

Table 2-10. t values occurring by chance less than 5 percent of the
time

Sample size in both groups combined   Critical t value
4                                     4.30
20                                    2.10
30                                    2.05
60                                    2.00
100                                   1.99
∞ (infinity)                          1.96

For sample sizes other than those shown in Table 2-10, you can
figure out the rough t value you need to meet or beat by estimating
the value between the values shown. Also, the chart assumes that you
want to identify differences between groups in either direction. It
assumes you want to know whether either group mean is larger than
the other. This is what statisticians call a two-tailed test, and it
is usually the comparison of interest.

Using Table 2-10, we see that a t value of 1.42 is less than the
critical value for a total of 30 subjects. We need to see a t value
greater than 2.05 to be confident that the sample differences we
observed did not occur just by chance.
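
Estimating "the value between the values shown" can be sketched in
Python as a straight-line interpolation between neighboring rows of
Table 2-10. The function name critical_t is hypothetical, and the
straight-line shortcut is an assumption made for illustration: it is
only a rough guide, and it is least trustworthy between the first two
rows, where the critical value drops steeply. For combined sample
sizes above 100, the sketch simply reuses the 1.99 from the table,
which is slightly stricter than necessary.

```python
# (combined sample size, critical t) pairs from Table 2-10
CRITICAL_T = [(4, 4.30), (20, 2.10), (30, 2.05), (60, 2.00), (100, 1.99)]

def critical_t(total_n):
    """Rough two-tailed .05 critical t for a combined sample size."""
    if total_n < CRITICAL_T[0][0]:
        raise ValueError("the table starts at a combined sample size of 4")
    if total_n >= CRITICAL_T[-1][0]:
        # Past the last finite row: reuse it (a conservative choice).
        return CRITICAL_T[-1][1]
    for (n_low, t_low), (n_high, t_high) in zip(CRITICAL_T, CRITICAL_T[1:]):
        if n_low <= total_n <= n_high:
            # Straight-line estimate between the two neighboring rows
            fraction = (total_n - n_low) / (n_high - n_low)
            return t_low + fraction * (t_high - t_low)

needed = critical_t(30)              # the row used in this hack
in_between = critical_t(45)          # halfway between the 30 and 60 rows
uncle_frank_wins = abs(1.42) > needed

print(round(needed, 2), round(in_between, 3), uncle_frank_wins)
```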

Why It Works

Social scientists use this comparison method all the time.
Experimental and quasi-experimental designs often have two groups of
people who are believed to be different in some way or another. You
might be interested in the differences between Republicans and
Democrats or girls and boys, or you might want to see if a group
taking a new drug has fewer colds than a group not taking a drug at
all.

Such designs produce two sets of scores, and those sets of values
often differ, at least in the samples used. Researchers (and I, too,
when it comes to proving Uncle Frank wrong) are more interested in
whether there would be differences in the populations represented by
the two samples.

The logic of inferential statistics is that a sample of scores
represents a larger population of scores. If the samples differ on
some variable, that difference might be reflected in the populations
from which they were drawn. Or that difference might be due to
errors resulting from the sampling.

A t test answers the question of whether any differences found
between two samples are real (i.e., they probably exist in the
populations from which the samples were drawn) or due to sampling
error (i.e., they probably exist only in the samples). If the
difference between the samples is too large to have occurred by
chance, researchers conclude that there is a real difference between
the populations.

The t test formula uses information about the shape of the sample
distributions of scores. The needed information is the mean score on
the research variable in each group, each group's variance, and the
sample size of each group. The sample mean provides a good guess as
to the population mean, the variances give an indication as to how
much the sample mean might have varied from the population mean, and
the sample size suggests the precision of the estimate. The
difference between the two means is standardized and is expressed as
a t value.

The way statisticians talk about real differences is "the two
samples were likely drawn from different populations." The way you
and I and researchers might talk about real differences is
"Republicans and Democrats differ" or "the drug reduces the chance
of getting a cold."

Where Else It Works

Numbers don't know where they come from. You can use t tests to look
at differences in any two sets of numbers, whether those numbers
describe people or things. In fact, the t test was first developed
to determine the quality of an entire elevator full of grain used in
beer production.

Instead of examining all the grain, a beer statistician (how's that
for a dream job?) wanted a method that required looking at only a
small sample, randomly drawn from the larger population of grain.
The rest is history, and so we can say today that much of the work
done by statistical researchers is literally driven by beer.

Hack 18. Find Out Just How Wrong You Really Are

Anytime you have used statistics to summarize observations, you've
probably been wrong. If you need to know how close you have come to
the truth, use standard errors.

Statisticians are perhaps the only professionals who not only
proudly admit that their answers are probably wrong, but will go to
great lengths to tell you exactly how wrong they are. When you
conduct a survey, record observations, or conduct some sort of
experiment, your results describe only your sample: the customers,
patients, students, goldfish, or pieces of Kryptonite that you have
in front of you. Inferential statistics uses values computed for a
sample to estimate what that value would be for the population it is
meant to represent. For example, the mean of a sample is a pretty
good guess for the mean of the population. The problem is knowing
whether to trust your results.

Calibrating Error and Calculating Precision

It is unlikely that the mean of a sample is exactly the same as
the mean of the population, but it is likely to be close. If you
want to know how far wrong you are, you can calibrate your
precision using standard errors. The standard error of the mean
gives us an estimate of the distance between our sample mean
estimate and the actual population mean.

"Measure Precisely" [Hack #6] discusses how to use standard
errors in the case of measurement. Calculating the standard
error of measurement allows you to know how close your test
score is to your typical level of performance. Just as
measurement allows us to produce 95 percent confidence
intervals around individual observed scores, statisticians
routinely produce 95 percent confidence intervals around a wide
variety of sample values.

Fortunately for anyone curious to know how far a statistical
finding is from the hidden truth, every popular statistical
procedure provides a standard error. After introducing some
basic concepts, this hack explains how to apply the following
standard errors:

Standard error of the mean in descriptive statistics

Standard error of the proportion in survey sampling

Standard error of the estimate in regression

The Central Limit Theorem [Hack #2] is a key tool for knowing
how wrong we are when we sample, because it provides the
formula for calculating standard errors and suggests that all
sample summary values are normally distributed.

There are three common ways that standard errors are used to
verify the accuracy of results of statistical analyses. The
particular tool you use depends on whether you want to know
how close you are to correctly estimating:

The mean score of a population on some variable (e.g.,
average salary of untenured college professors)

The proportion of a population that have some
characteristic (e.g., who will vote for my Uncle Frank as
Chief Dogcatcher)

Future performance (e.g., probable college GPA for your
pet monkey, whom you have trained to take multiple-choice
tests)
Mean Estimates

The precision of a sample mean as an estimate of a population
mean is based on sample size. Here's the formula:

$$\text{Standard error of the mean} = \frac{\text{Standard deviation}}{\sqrt{\text{Sample size}}}$$

As the sample size increases, the sample mean tends to fall
closer to the true population mean. This makes sense if you
think of sample size as the number of independent observations;
the more looks you get at something, the more accurate your
description will be.

The standard error of the mean is the average distance of
sample means from their population mean.
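To see how quickly precision improves with sample size, here is a minimal sketch. It assumes the usual formula (standard deviation divided by the square root of the sample size); the function name is made up, and the standard deviation of 15 is borrowed from the examples later in this hack.

```python
import math

def standard_error_of_mean(standard_deviation, sample_size):
    """Standard deviation divided by the square root of the sample size."""
    return standard_deviation / math.sqrt(sample_size)

# Same standard deviation, larger and larger samples
for n in (1, 30, 60, 120):
    print(n, round(standard_error_of_mean(15, n), 2))
```

Notice that doubling the sample size does not cut the error in half. Because of the square root, you need four times as many observations to halve the standard error.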

Proportion Estimates

When a sample of people is surveyed and the results are
presented as some percentage or proportion (e.g., "72 percent of
all sailors have knee trouble"), that percentage is some distance
from the actual percentage you'd find if you surveyed the whole
population. If the sample was selected randomly, the standard
error of the proportion indicates how close the sample
percentage is to the population percentage.

The standard error of the proportion is based on sample size and
the size of the proportion. Here's the formula:

$$\text{Standard error of the proportion} = \sqrt{\frac{(\text{Proportion})(1 - \text{Proportion})}{\text{Sample size}}}$$

Like the standard error of the mean, the standard error of the
proportion decreases as the sample size increases. If you are
mathematically oriented, you might notice that the number in the
top part of the formula gets smaller as the proportion moves
away from .50.

When the calculations are made, then, the further the sample
proportion is from .50, the smaller the standard error of the
proportion. Another point of interest is that the top part of the
formula is an indication of the amount of variability in the
sample: (proportion)(1 - proportion) is the standard deviation for
proportions, squared.

The standard error of the proportion is the average distance of
sample proportions from the true proportion in the population.
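A short sketch makes both patterns visible: the error shrinks as the sample grows, and it shrinks as the proportion moves away from .50. It assumes the formula is the square root of (proportion)(1 - proportion) divided by sample size; the function name and the particular proportions tried are illustrative only.

```python
import math

def standard_error_of_proportion(proportion, sample_size):
    """Square root of (proportion)(1 - proportion) over the sample size."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)

# Same sample size, proportions closer to and further from .50
for p in (0.50, 0.72, 0.90):
    print(p, round(standard_error_of_proportion(p, 30), 4))

# Same proportion, larger sample
print(round(standard_error_of_proportion(0.50, 60), 4))
```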

Estimates of Future Performance

In regression analyses, scores on one or more variables are
used to estimate scores on another variable [Hack #13].
However, that predicted score is unlikely to be exactly right.

Just as we can calculate how far an average sample mean is
from a population mean or how far off our survey results are from
theoretical population results, we can also say how far off, on
average, our regression prediction will be from the actual score a
person would get. Here's the formula:

$$\text{Standard error of the estimate} = \text{Standard deviation} \times \sqrt{1 - \text{Correlation}^2}$$

The standard deviation used in the equation is the standard
deviation of the criterion variable, which is the one you are
predicting. The correlation is the correlation between your
predictor(s) and the criterion variable.

In the interest of accuracy (the point of this hack, after all), I
should point out that the standard error of the estimate formula
given earlier isn't quite correct. However, it does provide almost
the same result as this more complex, but correct, equation:

$$\text{Standard error of the estimate} = \text{Standard deviation} \times \sqrt{1 - \text{Correlation}^2} \times \sqrt{\frac{\text{Sample size} - 1}{\text{Sample size} - 2}}$$

Notice with this formula that the larger the correlation, the
smaller the standard error of the estimate. This makes sense,
because if there is a lot of informational overlap between two
variables, you can get a good sense of the score on one variable
by looking at the other.

The standard error of the estimate is the average distance of
each actual score from its predicted score.
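Here is a rough sketch of both versions of the standard error of the estimate, so you can see how little they differ. Two assumptions are worth flagging: the simple version is taken to be the criterion's standard deviation times the square root of one minus the squared correlation, and the adjusted version is the common single-predictor correction, which multiplies by the square root of (n - 1)/(n - 2). The function names are made up for illustration.

```python
import math

def se_estimate_simple(sd_criterion, correlation):
    """Criterion SD scaled by the variability the predictor cannot explain."""
    return sd_criterion * math.sqrt(1 - correlation ** 2)

def se_estimate_adjusted(sd_criterion, correlation, sample_size):
    """Simple version with a small-sample correction (one predictor assumed)."""
    correction = math.sqrt((sample_size - 1) / (sample_size - 2))
    return se_estimate_simple(sd_criterion, correlation) * correction

# Criterion SD of 15, correlation of .25, samples of 30 and 60
print(round(se_estimate_simple(15, 0.25), 2))
print(round(se_estimate_adjusted(15, 0.25, 30), 2))
print(round(se_estimate_adjusted(15, 0.25, 60), 2))
```

The correction matters most in small samples and fades toward nothing as the sample size grows.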
Using Standard Errors

Here's how to use these tools to state with some confidence the
range within which the truth lies. Because sampling errors are
normally distributed, the standard error can be used just like a
standard deviation to define specific proportions of scores under
the normal curve.

For example, if we want to provide a range of values in which the
population value falls 95 percent of the time, we can build a 95
percent confidence interval around our sample value. Based on
the normal curve [Hack #23], 1.96 standard errors on either
side of the sample value should provide a range of values that
we can say with 95 percent certainty contains the population
value.
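If you are curious where the 1.96 comes from, this sketch recovers it from the normal curve and then builds an interval. It leans on Python's standard-library NormalDist class; the function names, and the example of a sample mean of 100 with a standard error of 2.74, are assumptions chosen for illustration.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Number of standard errors bracketing the middle `level` of a normal curve."""
    tail = (1 - level) / 2          # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

def confidence_interval(sample_value, standard_error, level=0.95):
    """Range of values around a sample value at the requested confidence level."""
    margin = z_for_confidence(level) * standard_error
    return sample_value - margin, sample_value + margin

print(round(z_for_confidence(0.95), 2))
low, high = confidence_interval(100, 2.74)
print(round(low, 2), round(high, 2))
```

Asking for more certainty (say, 99 percent) widens the interval; the price of being surer is being vaguer.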

Table 2-11 shows some examples of various standard errors
and the use of sample data to produce these confidence
intervals [Hack #6]. Notice how a larger sample size creates a
sample estimate closer to the population value, and a larger
sample size also points to a confidence interval that is more
precise.

Table 2-11. Building 95 percent confiden

Type of standard error            Standard deviation  Sample size  Sample value  S
Standard error of the mean        15                  30           100           2
Standard error of the mean        15                  60           100           1
Standard error of the proportion  .25                 30           .50           .0
Standard error of the proportion  .25                 60           .50           .0
Standard error of the estimate    15                  30           100           1
Standard error of the estimate    15                  60           100           1

The "Sample value" column in Table 2-11 for the standard error
of the estimate is an example of an estimated or predicted
score on some variable. The calculations in the example
assume a correlation of .25 between the predictor and the
criterion.
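The inputs listed in Table 2-11 are enough to work out standard errors and 95 percent confidence intervals yourself. The sketch below does that with the simple formulas from this hack; the numbers it prints are computed here rather than copied from the table, and the function names are made up. For the proportion rows it treats the .25 in the standard deviation column as (proportion)(1 - proportion) for a proportion of .50, which is an interpretation on my part.

```python
import math

def se_mean(sd, n):
    return sd / math.sqrt(n)

def se_proportion(p, n):
    return math.sqrt(p * (1 - p) / n)

def se_estimate(sd, r):
    # Simple version of the formula, which does not depend on sample size
    return sd * math.sqrt(1 - r ** 2)

def table_rows():
    """(label, sample size, sample value, standard error) for each example."""
    return [
        ("mean", 30, 100, se_mean(15, 30)),
        ("mean", 60, 100, se_mean(15, 60)),
        ("proportion", 30, 0.50, se_proportion(0.50, 30)),
        ("proportion", 60, 0.50, se_proportion(0.50, 60)),
        ("estimate", 30, 100, se_estimate(15, 0.25)),
    ]

def interval_95(value, se):
    """1.96 standard errors on either side of the sample value."""
    return value - 1.96 * se, value + 1.96 * se

for label, n, value, se in table_rows():
    low, high = interval_95(value, se)
    print(f"{label:<11} n={n:<3} SE={se:.3f}  95% CI: {low:.2f} to {high:.2f}")
```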

Uncle Frank's Campaign for Dogcatcher

As the campaign manager for my Uncle Frank in his recent
campaign for dogcatcher, I had an opportunity to use standard
errors. Several weeks before the election, I surveyed 30
randomly chosen voters in the town of Tonganoxie, Kansas,
where Frank lives. My survey found that 50 percent of
respondents said they would vote for him. I warned Uncle Frank
that the sample was so small that it was not a very precise
reflection of the entire population of voters.

After referring to Table 2-11, I determined that if we had
surveyed all the voters in town, the percentage saying they
would vote for Frank might reasonably be anywhere between
about 32 percent and 68 percent, though the most likely value
was 50 percent. Of course, the optimist that is my uncle
interpreted this as meaning he might have 68 percent of the
vote and a huge lead. He spent the rest of his campaign chest on
a giant victory party the night before the election. I, being the
realist that I am and knowing my uncle's reputation around town,
assumed the true outcome would be in the other direction. It
was. That's okay, though. It was a great party.
Why It Works

We can trust the accuracy of standard errors if we accept the
following assumptions and apply some common sense:

Sampling errors are normally distributed

This means that the sizes of these errors range in value
in a way that matches the normal curve. This allows us
to produce those persuasively precise confidence
intervals.

Sampling errors are nonbiased

This means that sample values are equally likely to be
greater or less than the population value. This is
convenient because it means that across repeated
studies, one can zero in on the true population value.

The formulas are constructed in such a way that if you have little
or no information about the population, then the size of the error
in your sample estimate is about the size of the standard
deviation of the population.

Look what happens with the standard error of the mean or the
standard error of the proportion when the sample size is 1, or
what happens with the standard error of the estimate when the
correlation is 0.00. Intuitively, a good formula for figuring the
standard error size should produce smaller errors when more is
known about the population.
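You don't have to take these assumptions on faith. The simulation sketched below draws thousands of samples from a made-up population (normally distributed, mean 100, standard deviation 15; those values and the function name are assumptions for illustration) and checks two things: that the errors average out to about zero, and that the spread of the sample means is close to what the standard error formula predicts.

```python
import math
import random
import statistics

def simulate_sample_means(population_mean, population_sd, sample_size,
                          n_samples, seed=0):
    """Draw many random samples and return the mean of each one."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(population_mean, population_sd)
                  for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(100, 15, sample_size=30, n_samples=5000)
errors = [m - 100 for m in means]

# Nonbiased: overestimates and underestimates roughly cancel out
print("average error:", round(statistics.fmean(errors), 2))
# The spread of sample means should sit near SD / sqrt(n)
print("SD of sample means:", round(statistics.stdev(means), 2))
print("formula prediction:", round(15 / math.sqrt(30), 2))
```

Try changing sample_size to 1: the spread of the "sample means" climbs to roughly the population standard deviation, just as the formulas suggest.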
Hack 19. Sample Fairly

If you want to find something out about every single customer
or employee in your business, you could talk to every single one
of them. If you are concerned about the quality of the beer you
serve at your bar, you could taste every one before serving. Or,
to save time, money, and brain cells, "sample" efficiently
instead.

Management thrives on knowing the characteristics of every
widget produced, every transaction conducted, and every client
helped. Of course, the whole set of all of these widgets,
interactions, and people can never be brought together under
one microscope and observed and evaluated. No specimen slide
is big enough.

The same is true for those of us in social science: researchers
interested in people simply cannot measure everybody. As much
as we'd like to probe, shock, inject, hassle, embarrass, and
generally bother everyone in the world, we just can't do it. We
don't have the time, space, or money, and, frankly, no one really
wants to get to know so many people.

The problem is, "How can you know about everything, without
being able to look at everything?" As is the case with all hacks
in this book, the solution is provided by statistics. There are
scientifically sound ways to accurately describe any whole set of
things by just looking at a small subset of those things.
Using Samples to Make Inferences

Inferential statistics allows us to generalize to a larger
population, based on data from a smaller sample. For these
generalizations to be valid, though, the sample has to represent
the population fairly.

A population, in the sense we use it here, is rarely the
"population" of a country or city or planet in the way the term is
used in social studies. In inferential statistics, a population is a
description of the type of person or thing you're studying.
Populations can be third-grade boys in Nebraska, nurses at
Shawnee Mission Medical Center in Merriam, Kansas, South
American giant otters, or books in the Library of Congress. The
only rule is that a population is bigger than its corresponding
sample.

A good sample represents a population. This means that every
important characteristic in a population must be distributed,
proportionately, in the same way in the sample. Much of this
hack is about how to construct a good sample, so let's look at a
good sample.

Imagine a population of squares, diamonds, and triangles, as
shown in Figure 2-4.

Figure 2-4. A sample within a population

A fair sample taken from a population of squares, diamonds, and
triangles would contain those shapes in the same proportion as
in the population. In our diagram, the outer oval represents a
population, and the different shapes are distributed as 40
percent squares, 20 percent triangles, and 40 percent
diamonds. The inner oval is the sample, which contains a
subgroup of those elements in the population. The shapes in the
sample are distributed in exactly the same proportions as in the
population: 40 percent squares, 20 percent triangles, and 40
percent diamonds.

This sample is fair. It represents the population well, at least in
terms of the characteristic of shape. When sampling people or
things, samples typically represent a variety of traits. People
and things are not entirely triangles or squares, so a sample of
people is representative when its mean level of traits matches
well with the population levels. Each person will have some level
of all the characteristics, and won't be entirely one trait, unlike
our shape example. (Though my Uncle Frank is pretty much
entirely square, according to my Aunt Heloise.)

The person asking the question gets to pick the population he is
interested in, but he is then accurate when generalizing to that
population only, not any other.

If you knew that the sampling methods used to produce this
sample (the elements in the inner oval) were correct, you could
infer something about the population by just looking at the
sample. The procedure is simple and intuitive:

1. Observe the sample. For example, 20 percent of the
sample is triangles.

2. Infer to the population. I bet 20 percent of the
population is triangles.

Instead of abstract triangles in a theoretical population, imagine
you are interested in checking the quality of the beer you sell in
your bar. To get an idea of the beer population, construct a good
sample of the beers you sell and taste each of them:

1. Observe the sample. For example, 20 percent of the
beers have just a hint of a possum aftertaste.

2. Infer to the population. I bet 20 percent of all the beers
you sell have just a hint of a possum aftertaste. You
might consider cleaning your beer tap.

Inference is pretty easy to do, but it works well only when the
sample is good. Constructing a good sample is the key.

Constructing the Best Random Sample

A good sample represents the population. Representative
sampling begins with defining the universe, or, in other words, the
population of things from which a researcher wishes to sample.
There are a variety of ways to conceptualize these elements and
various levels of grouping that are explicitly or implicitly
identified when choosing a population and selecting a sample.
You have to know about these ways of organizing your
population; otherwise, you cannot create a good sample:

General universe

Abstract population to which a researcher hopes to
generalize his findings. For example, I might want to say
something about all comic book collectors.

Working universe

Concrete population that allows for sampling to occur. I
can't really be sure I have located or counted all comic
book collectors, but I could operationalize that
population by defining it as all the subscribers to the
Comics Buyer's Guide, a monthly magazine that most
serious collectors read. This working population is not
exactly the same as the general universe, but it should
be almost as large and will capture most of the abstract
population of interest.

Sampling unit

Element that defines the population. In our example, a
single subscriber to the monthly magazine would be a
sampling unit.

Sampling frame

List, real or imagined, of sampling units in a population.
In our example, this would be the literal list of
subscribers that I might be able to purchase from the
magazine.

An observation that is likely true about the people and things
that were not part of your sample is said to be generalizable. If a
sample does not represent a population, the sample is biased (a
bad sample).

The best sampling strategy, without question, is to sample
randomly from a valid sampling frame. Random selection will do
the best job of creating a sample that represents all the traits of
interest in the population. The real power of random selection,
though, is that you are also representing all sorts of variables
you haven't even considered that might otherwise have an
impact on your observations.

Technically, the term random describes a sampling process that
gives every member of a population an equal and independent
chance of being selected. Equal means that every sampling unit
in the sampling frame has as good a chance as any other.
Independent means that a person's or thing's chances of being
selected are unrelated to whether any other particular person or
thing has been selected.

So, suppose a selection process calls customers on a client list
to ask for participation but stops trying to contact people if they
aren't home or in the office during the first attempt. This does
not give all possible participants an equal chance of being
selected. People who aren't easily available are less likely to be
chosen, and if people are not solicited to participate when
someone in their office has already been chosen, each member
of the population does not have an independent chance of being
chosen.

Random sampling can be done by numbering all names on the
sampling frame list and using some method of choosing a
random number to pick each participant.
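The numbering approach just described might be sketched like this. The subscriber names and the function name are hypothetical, and Python's built-in random number generator stands in for whatever method of choosing random numbers you prefer. This version draws without replacement, so nobody can be picked twice.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size units from a numbered list, each equally likely."""
    rng = random.Random(seed)
    # Number every name on the list, starting at 1
    numbered = dict(enumerate(sampling_frame, start=1))
    # Draw distinct random numbers, then look up the matching names
    chosen_numbers = rng.sample(sorted(numbered), sample_size)
    return [numbered[number] for number in chosen_numbers]

# Hypothetical sampling frame of 500 magazine subscribers
frame = [f"subscriber_{i:03d}" for i in range(1, 501)]
print(simple_random_sample(frame, 5, seed=42))
```

Passing a seed makes the draw repeatable, which is handy if someone later asks you to show exactly how the sample was chosen.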
Sampling Strategies for the Real World

In the real world, it is often difficult or impossible to sample
randomly. Here are some sampling strategies that aren't quite
as good as random sampling, but are more realistic outside of
some imaginary scientific laboratory:

Convenience sampling

The sample is chosen based on accessibility. This is
sometimes called haphazard sampling. Head down to the
local mall and ask the first 10 people you see how they
feel about your company's widgets, and you have
engaged in convenience sampling.

Systematic sampling

Units are chosen from the sampling frame at equal
intervals. For example, you might take every 10th
person from a long list. As long as the order of names on
the list is unrelated to whatever you are trying to
determine, this might do as good a job of representing
the population as true random selection. Statistical
theorists and practitioners actually have academic
debates over this issue.

Stratified sampling

The sampling frame is divided into meaningful
subgroups, and units are randomly chosen from each
subgroup. This could result in greater
representativeness than even random sampling if the
characteristics that define the subgroups are important
to the question you are asking.

Cluster sampling

Groups of units are randomly chosen, and all units in
those groups are sampled. For example, you might
choose a publishing company at random and then
interview every employee about how to succeed in
publishing.

Judgment sampling

The sample is chosen based on your expert judgment as
to whether the sample will represent the population. You
might choose to talk to only your best customers,
because they know the most about your widgets.
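Three of these strategies (systematic, stratified, and cluster sampling) are mechanical enough to sketch in code. Treat the following as an illustration under stated assumptions rather than a recipe: the function names and the example data are invented, and the stratified version takes the same number of units from every subgroup, which is only one of several reasonable choices.

```python
import random

def systematic_sample(frame, interval, start=0):
    """Take every interval-th unit from the list, beginning at position start."""
    return frame[start::interval]

def stratified_sample(frame, stratum_of, per_stratum, seed=None):
    """Split the frame into subgroups, then draw randomly within each one."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        sample.extend(rng.sample(units, min(per_stratum, len(units))))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole groups at random and keep every unit in the chosen groups."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical data: 100 numbered customers and three small publishers
customers = list(range(1, 101))
publishers = {"Acme Press": ["Ann", "Bo"], "Birch Books": ["Cy", "Di", "Ed"],
              "Cedar House": ["Flo"]}

print(systematic_sample(customers, 10))
print(stratified_sample(customers, lambda c: "odd" if c % 2 else "even", 3, seed=7))
print(cluster_sample(publishers, 1, seed=7))
```

Convenience and judgment sampling are left out on purpose. They depend on who happens to be nearby or on what the researcher believes, and neither of those reduces to a few lines of code.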

Choosing a Sample Size

If you are able to construct a good sample, as we have defined it,
even a small sample can be effective. As with chocolate chip
cookies, though, bigger is better. The larger the sample, the
more representative of the population it is. Consequently, the
observations are more generalizable and you can better trust
their accuracy.

Also, if there is some interesting relationship between variables
in your observations, you are more likely to find that relationship
and be sure that it did not occur by chance when you have
observed many elements in your sample than when you have
looked at just a few.

Finally, if you do have some social science purpose for your
sampling, there are certain technical statistical characteristics
that must be met to perform certain analyses. These standards
are easier to meet in larger samples such as, say, samples
consisting of 30 or more widgets.

See Also

"Find Out Just How Wrong You Really Are" [Hack #18]
shows how to determine error size in inferential
statistics.
Hack 20. Sample with a Touch of Scotch

When statisticians choose samples of people from populations,
they are really sampling from continuous distributions of
variables. Sampling is sometimes easier to understand, though,
by treating your variables as discrete objects, not continuous
scores.

The most powerful statistical procedures use scores at the
interval level of measurement or higher [Hack #7]. To sample
scores from a population, social science researchers usually
choose people, though, not scores. The people are then
measured, which results in a sample of scores. So far, so good.

When discussing the sampling process, however, smart
researchers sometimes sound not-so-smart when they refer to
their sampling strategy. For example, if a researcher is
interested in measuring the effects of some treatment on a
continuous variable such as happiness, he might say (and think),
"OK, first I need to get a sample full of happy and unhappy
people." He, at least for the moment of the thought, is treating
happiness as if it were a dichotomous variable.

Dichotomous is statistics jargon meaning "having only two
values." For example, biological sex is a dichotomous variable.

He is referring to people as if they are either completely happy
or entirely unhappy. In reality, of course, he thinks there is a
large range of happiness scores that describe people, which is
why he is using statistics that make the assumption of interval
measurement.

He refers to his participants as either/or because doing so
makes it easier for him to picture the representativeness of his
sampling. It's a smart strategy, because thinking of samples
as representing big, discrete categories instead of more precise,
continuous values sometimes makes questions about
sampling easier to answer and justify.

A Sampling Problem

Here's a brainteaser that centers on a sampling question. A
drunk, untenured statistician (I've met a few) is mixing drinks at
a party. He is making a Scotch and soda for his department chair.
The chair demands a drink with some exact proportion of Scotch
to water (it doesn't matter what the specific request is; our hero
never makes it that far).

The statistician starts with two glasses of the same size. One
glass (the first glass) has two ounces of Scotch in it; the other
(the second glass) has two ounces of water in it. He starts by
pouring an ounce of water from the water glass into the Scotch.
He apparently already screwed up, because he changes his mind
and pours an ounce of the new mixture (three ounces of Scotch
and water mixed up) back into the water glass. Both glasses now
have two ounces of liquid in them, but the liquid in each glass is
some mix of water and Scotch.

Nervously, the statistician attempts to start all over, but his
department chair stops him. She says:

I have a proposition for you. We can't possibly know the exact
proportion of Scotch and water in each glass right now, because
we can't know how mixed up everything is. But if you can answer
the following question correctly, I'll write a strong letter of
support to your tenure committee. If not, well, I'm sure someone
with your qualifications should have no trouble finding work in
the hotel/motel or food service industry. Here's the question:
right now, does the first glass have more water in it, or does the
second glass have more Scotch in it?

Think of the question as a sampling issue. Does the first sample,
the liquid in the first glass, have more water in it, or does the
second sample, the liquid in the second glass, have more Scotch
in it? Because both Scotch and water are made up of really small
particles, it is difficult to picture how much of each liquid is
represented in each sample. Even proportionately, we can't be
sure how many water particles (or sampled scores that equal
"water") are mixed into the sample of "Scotch" scores, because
who knows how much water drifted down into the bottom of the
first glass and would have remained there as the top part of the
liquid near the surface was poured back into the second glass.
An intuitive answer is called for. Unfortunately, it is wrong.

The intuitive answer typically generated by smart people is that
the first glass, the Scotch glass, has more water in it than the
water glass has Scotch in it. This makes sense because pure
water was poured into the Scotch, while some mix of water and
Scotch was poured back into the water glass. Amazingly, this
clever thinking leads us astray. The correct answer is that the
proportions are equal! There is the same amount of water in the
Scotch glass as there is Scotch in the water glass.
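For readers who like a little algebra, here is one way to convince yourself. The symbol x is introduced here purely for illustration and is not part of the original puzzle: it stands for the unknown amount of water, in ounces, that happens to be in the ounce poured back.

```latex
\begin{align*}
\text{After the first pour:}\quad
  & \text{glass 1} = 2\ \text{oz Scotch} + 1\ \text{oz water},
    \qquad \text{glass 2} = 1\ \text{oz water} \\
\text{The ounce poured back:}\quad
  & x\ \text{oz water} + (1 - x)\ \text{oz Scotch} \\
\text{Water left in glass 1:}\quad
  & 1 - x \\
\text{Scotch now in glass 2:}\quad
  & 1 - x
\end{align*}
```

Whatever x turns out to be, the two amounts match, so how well the liquids were stirred never enters into it.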
Using Metaphor to Solve the Problem

The solution to the sampling problem is clearer if we imagine
that our variables are not tiny particles, but instead are large
categories, such as blue and white marbles. Instead of a glass of
Scotch, imagine a glass of 100 blue marbles. Instead of a glass
of water, imagine a glass of 100 white marbles.

The glasses are big, so the marbles can get mixed together well.
Think large glass fishbowls. This is necessary to ensure that
random selection is possible, as was likely with the mixed-up
liquids. Keep your eye on the marbles through each step of the
mixing.

Our hero takes 50 white marbles from the second glass and
mixes them into the first glass. The distribution of the two
variables is now:

Sample 1

100 blue marbles, 50 white marbles

Sample 2

50 white marbles

Now, he (randomly, remember, to simulate the mixed liquids)
takes any 50 marbles from the first glass and mixes them back
into the second glass. Let's imagine a variety of possibilities.

If by chance he selects all the white marbles, they go back into
the second glass and the distribution is now:

Sample 1

100 blue marbles

Sample 2

100 white marbles

If by chance he selects no white marbles and puts 50 blue
marbles into the second glass, the distribution is:

Sample 1

50 blue marbles, 50 white marbles

Sample 2

50 white marbles, 50 blue marbles

Now imagine a more likely scenario: some of the marbles he
randomly draws are white and some are blue. For example, he
could draw out 10 white marbles and 40 blue marbles and place
them in the second glass. In that case, the new distribution is:

Sample 1

60 blue marbles, 40 white marbles

Sample 2

60 white marbles, 40 blue marbles

Try this with any mix of marbles you wish, but remember you
have to draw out a total of 50 marbles (to duplicate the one
ounce, or half, of the water originally mixed up).

Notice that any mixture you try results in 100 marbles in each
glass at the end. Also, most importantly, notice that the ratio of
blue to white marbles in the first glass at the end is always equal
to the ratio of white to blue marbles in the second glass. Any
blue marble that is not in the second glass must be in the first
glass, and any white marble that is not in the first glass must be
in the second glass.

The same is true for Scotch and water. The correct answer is
that the proportions will be equal, no matter how they were
originally mixed up.
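If you would rather let a computer shuffle the marbles, the following sketch repeats the experiment with a different random draw each time. The function name and default counts are just one way to set it up, mirroring the 100 marbles per glass and 50 marbles per pour used above.

```python
import random

def mix_marbles(n_each=100, n_moved=50, seed=None):
    """Move white marbles into the blue glass, then move a random handful back."""
    rng = random.Random(seed)
    glass1 = ["blue"] * n_each
    glass2 = ["white"] * n_each
    # Step 1: pour n_moved white marbles into the first glass
    for _ in range(n_moved):
        glass1.append(glass2.pop())
    # Step 2: mix the first glass well, then pour any n_moved marbles back
    rng.shuffle(glass1)
    for _ in range(n_moved):
        glass2.append(glass1.pop())
    return glass1, glass2

for seed in range(5):
    glass1, glass2 = mix_marbles(seed=seed)
    print("white in glass 1:", glass1.count("white"),
          "| blue in glass 2:", glass2.count("blue"))
```

The two counts printed on each line always match, whichever marbles the shuffle happens to send back.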

Where Else It Works

Real-life polling companies, who make their living and stake
their reputations on the accuracy of election predictions, are
also primarily concerned with the proportion of samples who are
in each of several crucial categories. If people have just voted
and there are two candidates, anyone who did not vote for
candidate A voted for candidate B. Their absence in one
category guarantees their presence in the other. Reporting
predictions as percentages creates the potential for greater
accuracy. It also allows for greater error, as a voter predicted to
be in category A who ends up in category B has therefore
produced error in both categories.

When statistical social science researchers want to be
convinced that their sample is representative of its population,
their primary concern is always the proportions of
characteristics in their sample, not the number of people with
those characteristics. What matters most is that the proportions
of each score for the key research variables are the same in
both samples and their populations.
Hack 21. Choose the Honest Average

Data-driven decisions, such as whether you can afford to buy a
house in a new town or who the core market is for your
business, often rely on the "average" as the best description for
a large set of data. The problem is that there are three
completely different values that can be labeled as the
"average," and the different averages often result in different
decisions. Make your decisions using the correct average.

When most people hear a statement like "the average price for a
house in this town is $290,000" (which might sound low, high, or
just right, depending on where you call home), they imagine that
this figure was determined by adding up all of the sales prices
from all of the houses in the town, and then dividing that sum by
the number of houses. But statisticians know there is more than
one way to determine the "average," and sometimes one kind is
better than another.

Whether that $290,000 really represents the typical housing
price depends on whether the average is actually the mean,
median, or mode. It also depends on the shape of the distribution
of all the numbers that are averaged. Wise folks will make sure
they are making their decisions using the best summary value.
Here's when to trust each type of average.

Measures of Central Tendency


The purpose of determining an average for a set of values
(whether those values are house prices, grades from a final
exam, or the number of students in a yoga class) is to efficiently
communicate the central tendency for those values. It's true that,
most of the time, central tendency is determined by adding up all
of the values in a distribution, and then dividing the sum by the
number of values. Statisticians don't call this the average,
though; they call it the mean. So, why not always use the mean
to determine central tendency? Because in some situations, the
mean doesn't represent any of the actual values!

Consider the opening example about the average price of a
house. Let's say you collect data for 300 houses in a town and
want to determine the average sales price in that sample.
Generally speaking, the mean is not a very good indicator of
central tendency for house prices. Figure 2-5 illustrates why.

Figure 2-5. Mean as a misleading average


The mean is not a very honest average in this situation, because
the distribution of sales prices is skewed by a few outlying
values that are very large. Of the 300 houses sampled, 231 of
them were sold for prices between $100,000 and $600,000.
The remaining 69 houses sold for prices above $600,000, with
56 of those above a million dollars. The mean is heavily
influenced by these outlying values and therefore is not very
representative of any house in the sample.

Means don't work well as averages for most money variables.
The average income reported as a mean is much higher than
what most people earn. There are always a few Bill Gates and
J.K. Rowling types who pull the mean way up.

So, what's the "honest average" for these types of values?
Instead of reporting the mean, with distributions like the one in
Figure 2-5, honest statisticians generally prefer the median. The
median is that value in a distribution at the 50th percentile,
such that half of all values are below it and the other half are
above it (just like, on a highway, the median divides the road in
half). The median for this distribution of data is just under
$290,000, and thus works very well as a measure of central
tendency.
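
To see how far a few expensive houses can drag the mean away from the typical price, here is a minimal sketch in Python. The ten sale prices are made up for illustration (they are not the 300-house sample behind Figure 2-5), and the variable names are my own.

```python
from statistics import mean, median

# Hypothetical sale prices in dollars; invented for illustration only.
prices = [150_000, 210_000, 250_000, 280_000, 290_000,
          310_000, 350_000, 420_000, 1_500_000, 3_200_000]

mean_price = mean(prices)
median_price = median(prices)

print(f"mean:   ${mean_price:,.0f}")
print(f"median: ${median_price:,.0f}")

# How many houses sold for less than each "average"?
below_mean = sum(p < mean_price for p in prices)
below_median = sum(p < median_price for p in prices)
print(f"{below_mean} of {len(prices)} houses sold below the mean")
print(f"{below_median} of {len(prices)} houses sold below the median")
```

With these numbers, two very expensive houses pull the mean above eight of the ten sale prices, while the median still splits the list in half.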

Choosing the Middle Ground

The median works well in these instances because it is much
less sensitive to outlying values than the mean, and thus is
preferred whenever a distribution is skewed in one direction or
another. The median is therefore also the most "honest"
measure of central tendency when the distribution is skewed by
a few outlying values that are much smaller than the rest, as in
Figure 2-6, a fictional set of 50 students' exam scores.

Figure 2-6. Median as the honest measure of central tendency

Figure 2-6 shows another type of data in which a mean might
lead to a wrong conclusion. Relying on the median here would
result in a more accurate interpretation of class performance.

Where It Doesn't Work

Not even the median will always be honest, though. Consider the
following scenario. Say you're a yoga instructor, and half of the
students in your class are between 25 and 35 years old, and the
other half are between 50 and 60. How would you describe the
average age of your students?

The problem in situations like these is that neither the mean nor
the median will adequately describe the group of individuals.
What to do? The most honest choice for an average in this
situation is to report the mode, which is simply the most
frequently occurring value in a sample of data, as shown in the
example in Figure 2-7.

Figure 2-7. Mode as the honest average

In this case, there are two modes: one at 30 years old and the
other at 54 years old. Reporting both of these values is the best
way to choose the honest average. The mean and median both
mislead for these sorts of data.
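
A small sketch of the same idea in Python, assuming a made-up class of 12 students whose ages cluster around 30 and 54 (the list is hypothetical, not the data behind Figure 2-7). The standard library's `statistics.multimode` returns every value tied for most frequent, so it reports one mode per cluster here.

```python
from statistics import mean, median, multimode

# Hypothetical ages for a 12-person yoga class with two age clusters.
ages = [26, 28, 30, 30, 30, 33, 51, 54, 54, 54, 57, 59]

modes = multimode(ages)      # all values tied for the highest frequency
mean_age = mean(ages)
median_age = median(ages)

print(f"modes:  {modes}")
print(f"mean:   {mean_age:.1f}")
print(f"median: {median_age}")

# Nobody in the class is actually close to the mean or median age.
print(any(abs(age - mean_age) < 5 for age in ages))
```

The mean and median both land around 42, an age that no student in this made-up class is within five years of, while the two modes describe the two groups directly.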

How to Choose the Honest Average

So, when is the mean the honest average? Basically, the mean is
the best choice when there is only one mode and the distribution
is symmetric, which means that there is no obvious skew in
either direction. If your yoga class were attended by your 25- to
35-year-old students only, the mean would be the honest
average.

When all is said and done, how do you choose the most
appropriate average? Following these three simple rules will
keep you honest if you are reporting summaries, and will help
you make informed choices if you are the one making decisions
based on the data:

Choose the mode if there are two or more "trends" in the
data (i.e., two or more areas of high-frequency values),
and report one mode for each trend.

Choose the median if the distribution is skewed (i.e., a
small number of outliers is heavily influencing the
mean).

Choose the mean if the distribution is fairly symmetric
with one mode.

It is interesting to note that in most cases, the mean, the
median, and the mode will all be fairly close to equal. So why
bother with the mean? The mean remains the most common
way to report the average because it is most likely to be
replicated if we were to take another sample of data and look for
the central tendency. Medians and modes tend to be a lot more
variable, but the mean stays nice and stable.

William Skorupski
Hack 22. Avoid the Axis of Evil

Graphs are powerful tools to represent quantities, relationships,
and the results of research studies. But in the wrong hands,
they can be made to deceive. Choose your destiny, young Luke
(or, if you are under the age of 25, "young Anakin"), and avoid
the dark side.

There was a time when only scientists, engineers, and
mathematicians ever saw a graph. With the advent of more and
more news outlets aimed at the general public, visual
representations of numeric information have become more and
more common. Just think of yesterday's issue of USA Today: it
contained at least a dozen graphs.

In business conferences, graphs are used frequently to
communicate information and demonstrate success (or failure).
If the creator of a graph isn't careful, though, choices that might
seem arbitrary will affect the interpretation of the information.
Without changing the data, you can change the meaning.

So, if you want to avoid manipulating your audience when you
create a graph, or if you just want to be able to spot a misleading
(whether intentional or not) chart, then use this hack to help you
create and interpret graphs effectively.

Choosing the Honest Graph


To understand correct and incorrect graphing options, we first
have to cover some graphing basics. There are various pieces to
a graph, and the manipulation of those pieces can lead or
mislead.

Typical graphs have two axes, because they describe two
different variables. Axes are the lines along the bottom, called
the X-axis, and along the side, called the Y-axis.

You can remember that the vertical axis is
called the Y-axis because the cute little
letter Y is reaching its cute little hands up,
vertically, toward the sky. Get it?
(Welcome to the creative world of
statistics education.)

The sort of graph that is appropriate (and nondeceptive) for
showing the variables you have measured depends on the level
of measurement of your variables [Hack #7]. You can choose
from three common types of graphs, and only one will be the
right one for your variables:

Bar chart

In Figure 2-8, the X-axis represents categories or
groups, such as males and females. The Y-axis is
continuous: the taller the bars, the higher the value on
variable Y.

Figure 2-8. Bar chart

Histogram

In Figure 2-9, the X-axis represents continuous values.
A histogram is often used when the X-axis represents
common categories that reflect an underlying
continuous variable, such as months of the year or some
other distinctive set of groupings that can be placed in a
meaningful order. These look like bar charts, except that
the bars are pushed together with no spaces between
them.

Figure 2-9. Histogram

Line chart

In Figure 2-10, both the X- and Y-axis are continuous
variables; in this example, they're time and value. The
higher the line at any point, the greater the quantity as
represented by the Y-axis.

Figure 2-10. Line chart


To pick the right kind of graph (i.e., the one with the format that
is the least deceptive and the most intuitive), identify the type
of X variable you are using (notice that Y is continuous in all of
these formats):

If X represents different categories and Y is continuous,
use a bar chart.

If X can be conceived of as categories, but there is also
some meaningful order among them and Y is continuous,
use a histogram.

If X and Y are both continuous, use a line chart.

Graphic Violence

A common error in graphing, either intentional or not, has to do
with setting the scale for the Y-axis. Here's why this is a
problem and how you can avoid it.

Graphs with two variables invite comparisons across categories
or time or across different values of one variable. Pictures are
worth a thousand words, as they say, and a graph can be very
persuasive evidence. Anytime lines or bars are used to compare
values, the comparison is accurate only when the height of the
line or the length of the bar is judged against some standard
minimum value. That minimum value is often zero. If the graph is
not calibrated to some reasonable base value, small differences
look huge.

Compare the two graphs shown in Figure 2-11, for example. Both
convey exactly the same data, and yet your interpretation of
each might be wildly different. The histogram in the top left
reflects performance of the U.S. stock market over the last five
days. Notice a rather frightening-looking drop on day five. No
doubt, earth-shaking news hit near the end of day 4. You might
also notice that the Y-axis (the Dow Jones Index) does not begin
at zero; it begins at 9,900, a value that is low enough to contain
the top of all five bars, but that is otherwise not meaningful.

Figure 2-11. The power of the Y-axis


Look more closely at the second histogram in Figure 2-11, on
the bottom right. Both charts present the same data, but the
second graph uses zero as the starting point. The interpretation
of the data as presented in this graph shows very little
fluctuation across the last five days, and the frightening drop at
day 5 is barely a hiccup.

Which display is the correct one? Both reflect a drop of 2.8
percent in stock market value from day 4 to day 5. It really
depends on the intent of the graph constructor and the intended
audience. When number counts are involved, or money, the most
meaningful and fairest starting point is usually nothing. Many
newspapers provide daily stock information in the format
shown in the first histogram. They believe their readers are
interested in small changes, so they set a Y-axis starting value
that is as high as possible but low enough to contain all data
points on the X-axis.

After all, to an avid investor who changes her portfolio often and
buys and sells frequently, a drop of 2.8 percent is serious
business. A graph designed to make small changes look serious
might be the most valid for that reader. If an investor is one of
those "in it for the long haul" types, a relatively small change is
meaningless, however.

To get the most meaning out of graphs like these, always check
the bottom value on the Y-axis. This way, you can get a sense of
the real differences on the X-axis as you crawl from bar to bar. If
you are making graphs like these, think about the most honest
way to present the information. You want to inform, not deceive
(probably).
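
You can put a number on how much a Y-axis starting value exaggerates a change. The sketch below is only an illustration: the two index values are hypothetical (picked so the real drop is 2.8 percent), and `apparent_drop` is a helper name I made up. It compares how much shorter the day 5 bar looks than the day 4 bar when the axis starts at zero versus at 9,900.

```python
def apparent_drop(previous, current, baseline=0.0):
    """Fraction by which the drawn bar shrinks when the Y-axis starts at baseline."""
    previous_height = previous - baseline
    current_height = current - baseline
    return 1 - current_height / previous_height

# Hypothetical index values, chosen to give a 2.8 percent real drop.
day4, day5 = 10_200.0, 9_914.4

honest = apparent_drop(day4, day5, baseline=0.0)
truncated = apparent_drop(day4, day5, baseline=9_900.0)

print(f"bar shrinks by {honest:.1%} with the axis starting at 0")
print(f"bar shrinks by {truncated:.1%} with the axis starting at 9,900")
```

With the axis at zero, the bar shrinks by the same 2.8 percent as the data. With the axis at 9,900, the same drop erases about 95 percent of the bar.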

See Also

The book that first pointed out to the general public how
charts can deceive, especially in advertising, was How to
Lie With Statistics. Huff, D. (1954). New York: Norton and
Company.
Chapter 3. Measuring the World

Hacks 23-34

There is great value in understanding a phenomenon by hanging a
quantity on it. Though sometimes something important is lost
in the translation from idea to number, creating scores to
represent whatever we are interested in does allow for a level of
precision in understanding, and it also allows for comparison.
These hacks all involve measurement and interpretation of
scores.

A whole family of hacks relies on the normal distribution [Hack
#23] and its presence everywhere we look. With the normal
curve, you can tell where you stand compared to everyone else
[Hack #24], know how you are likely to perform on a test before
you even take it [Hack #25], and understand your test results at
a deeper level [Hacks #26 and #27].

Speaking of testing, you'll learn how to produce a good set of
questions [Hack #28] and make a quality test [Hacks #31 and
#32]. You can identify bad items and worthless questions, and do
well on a test without knowing the answers [Hack #29]. You can
also improve your test performance without cracking a single
book [Hack #30].

Finally, by learning a couple of solid measurement principles,
you can determine the lifespan of an era, person, or business
[Hack #33] and also learn how to use medical information [Hack
#34] to maybe increase your own lifespan.

Measure by measure, here is a whole chapter full of
measurement hacks.
Hack 23. See the Shape of Everything

Almost everything in the natural world is distributed in the
same way. As long as you can measure the thing, whatever it is,
and scores are allowed to vary, it has a well-defined "normal
distribution." If you know the specifics about the shape of this
normal curve, you can make very accurate predictions about
performance.

There are a few miracles in the world of statistics. There are at
least three tools, three discoveries, that are so cool and magical
that once students of statistics learn about them and begin to
comprehend their beauty, they frequently explode.

Well, maybe I am exaggerating a bit, but here are three dandy
tools for understanding the world:

The correlation coefficient [Hack #11]

The Central Limit Theorem [Hack #2]

The normal curve

Since we've discussed the uses of the first two miracles in other
hacks, let's spend our time now getting to know the shape and
uses of the third: the normal curve. I am pleased to present the
normal curve, the normal distribution, the bell-shaped curve, the
whole world, as shown in Figure 3-1.

Figure 3-1. The normal curve

Applying Areas Under the Normal Curve

Statisticians have defined the normal curve very specifically.
Two methods, calculus and hundreds of years of real-world data
collection, have reached the same set of conclusions about the
exact shape of the normal distribution. Figure 3-2 shows the
important characteristics of the normal curve. The mean is in the
middle, and there is room for fewer and fewer scores as you move
away from that center.

Figure 3-2. Areas under the normal curve

Though the normal curve is theoretically infinitely wide, three
standard deviations on either side of the mean is usually enough
to contain all the scores.

A distribution's standard deviation is the
average distance of each score from the
mean [Hack #2].

Predicting test performance

Recall the claim I made earlier that anything you measure will
distribute itself as a normal curve. By implication, then, anything
we measure will have most of the scores close to the mean and
only a few scores far from the mean. Measure enough people and
you will get the occasional extreme score very far from the
mean, but scores far from the mean will be rare. The expected
proportion of people getting any particular score gets smaller as
that score moves away from the mean.

That next test you take? I don't know the test or anything about
you, but I am willing to wager that you will get a score close to
the mean. I predict your score will be average. You might get
above average or below average, but the normal curve tells me
that you will likely be pretty close to the mean.

To make these sorts of predictions, and to be pretty confident
about their accuracy, you can use the known normal curve's
dimensions to estimate the percentage of scores that will fall
between any two points on the X-axis (the bottom, horizontal
part of the graph). The percentages of scores between pairs of
standard deviation points on the scale are shown in Figure 3-2.
The percentages add up to 100 percent, but that is because of
rounding. Remember that some scores, though just a few, will be
further than three standard deviations away from the mean.

Here are some key facts about the curve that you can use to
predict performance:

About 34 percent of scores fall between the mean and
one standard deviation above the mean. See the shaded
section in Figure 3-2? If you took some ink and colored
in the entire space beneath the normal curve, you would
use 34 percent of the ink on this section.

About 34 percent of scores fall between the mean and
one standard deviation below the mean.

About 14 percent of scores fall between one and two
standard deviations above the mean.

About 2 percent of scores fall between two and three
standard deviations below the mean.

You can also combine the percentages to make other
statements such as:

About 68 percent of all scores will be within one
standard deviation of the mean.

About 50 percent of scores will be below the mean.

You can use these known percentages to make predictions and
statements of probability. We can speak of the normal curve as
either the percentage of scores that fall under given areas on the
curve or the likelihood that any given test taker will fall under
given areas:

There is a 2 percent chance that you will score more
than two standard deviations above the mean on your
next test.

There is only a 16 percent chance that this applicant
will score lower than one standard deviation below the
mean on our job skills test.
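
If you want to check these rounded percentages yourself, here is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `percent_between` is my own; the areas it prints are the unrounded versions of the figures quoted above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def percent_between(low_sd, high_sd):
    """Percentage of scores between two points given in standard deviations."""
    return 100 * (std_normal.cdf(high_sd) - std_normal.cdf(low_sd))

print(f"mean to +1 SD:   {percent_between(0, 1):.1f}")
print(f"+1 SD to +2 SD:  {percent_between(1, 2):.1f}")
print(f"-3 SD to -2 SD:  {percent_between(-3, -2):.1f}")
print(f"within 1 SD:     {percent_between(-1, 1):.1f}")

# The two probability statements: above +2 SD, and below -1 SD.
percent_above_2 = 100 * (1 - std_normal.cdf(2))
percent_below_minus_1 = 100 * std_normal.cdf(-1)
print(f"above +2 SD:     {percent_above_2:.1f}")
print(f"below -1 SD:     {percent_below_minus_1:.1f}")
```

The results land within a percentage point of the rounded values in the lists (34, 14, 2, 68, 2, and 16).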

Setting standards

Policy makers rely on the assumption that ability is normally
distributed when they establish levels of performance. They
choose levels of performance that will guarantee them a certain
percentage of qualifying people. The normal distribution is an
invaluable tool for setting policy for admissions or services if
one wants to magically know ahead of time how many people will
qualify.

For example, a college with high academic standards might
require scores on an ability test that are at least one standard
deviation above the mean. This way, they ensure themselves of
accepting only the top 16 percent in ability.

Likewise, special education policy in the United States
establishes certain cut scores for students on tests that qualify
them for special education status (and, thus, federal and state
funding). Cut scores are specific scores that a person must
score above (or below). If policy makers have the budget to pay
for special programming and staff for only, say, two percent of all
children, they set the cut score at two standard deviations below
the mean. Faith in the normal curve allows them to calculate the
number of children who will need funding.
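
Here is a rough sketch of that cut-score arithmetic, again leaning on `statistics.NormalDist`. The score scale (mean 100, standard deviation 15) and the count of 50,000 children are assumptions I picked for illustration; they do not come from any actual policy.

```python
from statistics import NormalDist

# Hypothetical score scale and population size, for illustration only.
ability = NormalDist(mu=100, sigma=15)
children = 50_000

# Cut score at two standard deviations below the mean.
cut_score = ability.mean - 2 * ability.stdev
share_below_cut = ability.cdf(cut_score)
expected_qualifying = share_below_cut * children

# Going the other way: the score below which exactly 2 percent fall.
exact_two_percent_cut = ability.inv_cdf(0.02)

print(f"cut score: {cut_score:.0f}")
print(f"share below the cut: {share_below_cut:.2%}")
print(f"children expected to qualify: {expected_qualifying:.0f}")
print(f"score that cuts off exactly 2 percent: {exact_two_percent_cut:.1f}")
```

On this made-up scale, "two standard deviations below the mean" catches a little more than 2 percent of children, which is why the text's figure is an approximation.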

Appreciating the Beauty of the Normal Curve

To appreciate the wonder of the normal distribution, you can
always build your own. Imagine you measured something (such
as attitude, knowledge, height, or speed). You have some scoring
system in which scores are allowed to vary (such as scores on
an attitude survey, or SAT scores, or inches, or miles per hour).
You have lots of scores because you measured lots of people,
buildings, or sparrows. Now, plot these scores on a graph such
that the X-axis represents the actual score value from lowest to
highest, left to right (or the other direction if you'd like). The
Y-axis (the vertical left side part) should represent the relative
frequency of each value in your group of scores.

On such a chart, the height of the line or dot represents the
relative proportion of scores that were at any particular value.
Notice on the normal curve that the highest points are in the
middle and the lowest points are on the ends. The middle score
is the average score and the most popular score. On the normal
curve, the median is equal to the mean, which is equal to the
mode [Hack #21].
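
If you don't have a flock of sparrows handy, you can fake the measuring step. The sketch below is a simulation under stated assumptions, not real data: it draws 10,000 made-up heights from Python's `random.gauss` (so the normal shape is built in by construction), tallies the relative frequency of each rounded value, and prints a crude sideways chart.

```python
import random
from collections import Counter
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical measurements: 10,000 simulated heights in inches.
# The mean of 68 and standard deviation of 3 are illustrative assumptions.
heights = [rng.gauss(68, 3) for _ in range(10_000)]

# X-axis: each height rounded to the nearest inch.
# Y-axis: relative frequency of that value.
counts = Counter(round(h) for h in heights)
relative_frequency = {inch: counts[inch] / len(heights)
                      for inch in sorted(counts)}

for inch, share in relative_frequency.items():
    print(f"{inch:3d} {'#' * round(share * 200)}")

# Share of simulated scores within one standard deviation of the mean.
center = mean(heights)
spread = pstdev(heights)
within_one_sd = sum(abs(h - center) <= spread for h in heights) / len(heights)
print(f"share within one standard deviation: {within_one_sd:.1%}")
```

The printed bars should be tallest near the middle and thin out toward both ends, and the share within one standard deviation should come out close to the 68 percent quoted earlier.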

Notice also that the normal curve is symmetrical: you could fold
it in half and one side would perfectly cover the other. The other
characteristic of the normal curve that is important to know is
that it goes on forever. It is a theoretical curve, so the two ends
of the curve will never touch the baseline.

The normal curve is the common truth that connects all of
nature. It is perfectly balanced. It is forever. It is eternal. It also
kind of looks like a dinosaur, which is cool.
Hack 24. Produce Percentiles

A simple but powerful way of understanding test performance
is through the use of percentile ranks. Here's how to take a raw
score with little explanatory value and transform it into
something much more informative and useful.

In school, teachers (or counselors, or whoever reported
standardized test results) might have reported results to you
without ever telling you your score. Instead, you probably saw a
number that looked like a percentage and was described as
telling you how you (or your child) compared to others who took
the test. This type of score is called a percentile rank.

If you have been shown a percentile rank that represents your
test performance, it won't be useful unless you know what it
means. On the flip side, if you have to explain someone's test
performance and you show the test taker a raw score only, you
aren't really being very helpful. Being able to build or interpret
percentile ranks is a useful skill for both sides of the testing
game.

Norm-referenced scoring [Hack #26] is an approach to making
test scores more informative by comparing scores to each other.
The norm-referenced score you see most often in the real world
is the percentile rank. The percentile rank is defined as "the
percentage of scores in a distribution that are less than a given
score of interest." For example, if you get 15 items correct out
of 20 on a quiz and half the class got fewer correct than you,
your percentile rank is 50.

Producing and Reporting Percentile Ranks

If you are a classroom teacher, human resources manager, or
anyone who has to report test results to others, being able to
report a percentile rank instead of a raw score will help test
takers understand how well they performed and also help
decision makers understand the consequences of setting
various standards of performance.

Organize your data

Producing percentiles begins with organizing all your test
scores. For a small data set, it is fairly simple to build a
frequency table, which answers all sorts of questions in addition
to providing percentile ranks. Here is a sample distribution for
30 scores on a classroom test (arranged from lowest to highest)
in which 100 points was the highest possible score:

59, 65, 72, 75, 75, 75, 80, 83, 83, 85, 85, 85, 85, 85,
85, 86, 86, 86, 86, 88, 88, 88, 90, 90, 90, 90, 90, 92,
94, 97

Compute frequencies and percentages

For efficiency's sake, this data can be displayed and the
frequency of each score can be computed, as shown in Table 3-1.

Table 3-1. Cumulative frequency for a clas...

Score   Frequency   Cumulative frequency   Percentage
59      1           1                      3.33 percent
65      1           2                      3.33 percent
72      1           3                      3.33 percent
75      3           6                      10.00 percent
80      1           7                      3.33 percent
83      2           9                      6.67 percent
85      6           15                     20.00 percent
86      4           19                     13.33 percent
88      3           22                     10.00 percent
90      5           27                     16.67 percent
92      1           28                     3.33 percent
94      1           29                     3.33 percent
97      1           30                     3.33 percent

Table 3-1 shows each score that someone actually got, how
many people got that score, the total number of people getting a
given score or lower, the percentage of all people getting each
score, and the total percentage of people getting a given score
or lower. The last row of the cumulative columns always reports
the total number of people (or scores) in the distribution (30 in
our example) and the total percentage of people (always 100
percent).

Determine percentile ranks

To determine the percentile rank for any score in the
distribution, use the "Cumulative percentage" column. Find the
score of interest and look at the cumulative percentage in the
row just above that score's row. For instance, for a score of 94,
the percentile rank is 93.33 or about the 93rd percentile. For a
score of 86, the percentile rank is 50.
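
The frequency table and the "row just above" lookup can be sketched in a few lines of Python. The 30 scores are the ones listed earlier in this hack; the function names `frequency_table` and `percentile_rank` are my own, and the code assumes the "less than" definition of a percentile rank.

```python
from collections import Counter

scores = [59, 65, 72, 75, 75, 75, 80, 83, 83, 85, 85, 85, 85, 85,
          85, 86, 86, 86, 86, 88, 88, 88, 90, 90, 90, 90, 90, 92,
          94, 97]

def frequency_table(scores):
    """Rows of (score, frequency, cumulative frequency, percent, cumulative percent)."""
    n = len(scores)
    counts = Counter(scores)
    rows = []
    cumulative = 0
    for score in sorted(counts):
        cumulative += counts[score]
        rows.append((score, counts[score], cumulative,
                     100 * counts[score] / n, 100 * cumulative / n))
    return rows

def percentile_rank(scores, score):
    """Percentage of scores strictly less than the given score."""
    below = sum(s < score for s in scores)
    return 100 * below / len(scores)

for score, freq, cum_freq, pct, cum_pct in frequency_table(scores):
    print(f"{score:5d} {freq:3d} {cum_freq:3d} {pct:6.2f} {cum_pct:7.2f}")

print(f"percentile rank of 94: {percentile_rank(scores, 94):.2f}")
print(f"percentile rank of 86: {percentile_rank(scores, 86):.2f}")
```

Counting the scores strictly below the one you care about gives the same answer as reading the cumulative percentage from the row just above it.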

If you review a dozen statistics or
measurement textbooks, you'll find that
there are actually two different and
competing definitions for a percentile rank.
I prefer "the percentage of scores in a
distribution that are less than a given
score of interest," but some books give
"the percentage of scores in a distribution
that are equal to or less than a given score
of interest." Both definitions are
reasonable and percentile ranks can be
calculated either way using a frequency
table. Under the first use of the term, there
can be no 100th percentile. Under the
second, there can be no 0th percentile.
Pick the definition you prefer and go with
it, but always share your definition along
with your results.
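
To see the two definitions side by side, here is a short sketch using the same 30 classroom scores. The `inclusive` flag is a name I invented to switch between "less than" and "equal to or less than."

```python
scores = [59, 65, 72, 75, 75, 75, 80, 83, 83, 85, 85, 85, 85, 85,
          85, 86, 86, 86, 86, 88, 88, 88, 90, 90, 90, 90, 90, 92,
          94, 97]

def percentile_rank(scores, score, inclusive=False):
    """Percentile rank under either definition.

    inclusive=False: percentage of scores less than the given score.
    inclusive=True:  percentage of scores equal to or less than it.
    """
    if inclusive:
        count = sum(s <= score for s in scores)
    else:
        count = sum(s < score for s in scores)
    return 100 * count / len(scores)

for score in (min(scores), 86, max(scores)):
    strict = percentile_rank(scores, score)
    loose = percentile_rank(scores, score, inclusive=True)
    print(f"score {score}: less-than {strict:.2f}, "
          f"less-than-or-equal {loose:.2f}")
```

The top score of 97 never reaches 100 under the first definition, and the bottom score of 59 never drops to 0 under the second.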

Interpreting the Percentile Rank

Imagine that you are sitting down with your guidance counselor
and have been told that your percentile rank is 93. So, what
does this mean? Well, the most direct interpretation is that 93
percent of all people who took the test scored less than you did.
It is also correct to say that 7 percent of people scored equal to
you or higher. We can also think of percentile ranks as saying
how far the score is from normal. The mean score usually sits
around the 50th percentile and will be exactly there if
scores are normally distributed, as they (ahem) normally are. So,
we could also say that the 93rd percentile is pretty far above
average.

Don't make the mistake that many otherwise savvy
stat-hackers sometimes make. Earlier in this hack, we used an
example of a test score in which you got 15 items correct out of
20 on a quiz and half the class got fewer correct than you. Your
percentile rank in that example was 50. Notice that in that
example, your percent correct is 75 percent (15/20), but your
percentile rank is 50. Don't confuse the two! Knowing your
percentile rank does not tell you how many questions you got
right.

Where It Doesn't Work


Remember that a percentile rank is useful only when you're
looking for a norm-referenced interpretation. If you want to know
whether you have mastered a key set of skills, it does not help
to know what percentage of people have mastered more or less
of those skills. To know where you are compared to some set of
standards, not compared to other people, you want a
criterion-referenced score [Hack #26]. A percent correct type of
score is more meaningful for you in this case than a percentile
rank.

See Also

If you assume that your scores are normally distributed,
or at least drawn from a population that is normally
distributed, you can just convert any standardized score
directly to the percentile rank, using information about
the areas under the normal curve [Hack #25].
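
As a rough sketch of that conversion, assuming scores really are normally distributed: the percentile rank of a standardized score is just the area under the curve to its left. The function name and the sample z scores below are my own choices for illustration.

```python
from statistics import NormalDist

def percentile_rank_from_z(z_score):
    """Percentile rank of a z score, assuming a normal distribution."""
    return 100 * NormalDist().cdf(z_score)

for z_score in (-1.0, 0.0, 1.5):
    rank = percentile_rank_from_z(z_score)
    print(f"z = {z_score:+.1f} -> percentile rank {rank:.1f}")
```

A score exactly at the mean (z of 0) lands at the 50th percentile, and a score one and a half standard deviations above the mean lands at about the 93rd.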
Hack 25. Predict the Future with the Normal Curve

Because almost anything we measure in the natural world has a
known distributional shape, the "normal curve," we can use the
precise details of that distribution to predict the future and
answer all sorts of probability questions.

A variety of hacks in this book capitalize on statisticians' close
personal relationship with the normal curve. "See the Shape of
Everything" [Hack #23] shows how to use the normal curve to
predict test performance in a general way. We can do better than
that, though.

So much is known about the exact shape of this mystical curve
that we can make exact predictions about the probability that
scores in a certain range will be obtained. There are many other
types of questions that can be asked related to test
performance, and statistics can help us to answer these sorts of
questions before we ever take the test!

For example:

What are the chances that you will score between any
two given scores?

How many people will score between those two scores?

What are the chances that you will pass your next test?

Will you get accepted into Harvard?

What percent of students in the U.S. will qualify as
National Merit Scholars?

What are the chances that my Uncle Frank could pass
the Mensa qualifying exam?

For these types of questions, a precise tool is needed. This hack
provides that tool: a table of areas under the normal curve.
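
Before turning to the printed table, here is a sketch of how the first few questions could be answered in code with `statistics.NormalDist`. Everything specific in it is an assumption for illustration: a test with a mean of 500 and a standard deviation of 100, 2,000 test takers, and a passing score of 550.

```python
from statistics import NormalDist

# Hypothetical test: mean 500, standard deviation 100, 2,000 test takers.
test = NormalDist(mu=500, sigma=100)
test_takers = 2_000
low, high = 450, 600
passing_score = 550  # assumed passing score

# Chance of scoring between two given scores.
chance_between = test.cdf(high) - test.cdf(low)

# Expected number of people scoring between those two scores.
expected_people = chance_between * test_takers

# Chance of scoring at or above the passing score.
chance_of_passing = 1 - test.cdf(passing_score)

print(f"chance of scoring {low} to {high}: {chance_between:.1%}")
print(f"expected people in that range: {expected_people:.0f}")
print(f"chance of passing: {chance_of_passing:.1%}")
```

The same three lines of arithmetic work for any normally distributed score once you know its mean and standard deviation.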

The Table of Areas Under the Normal Curve

The normal curve is defined by the mean and standard deviation
of a distribution, and the shape of the curve is always the same,
regardless of what we measure, as long as the scoring system
allows scores to vary. The proportions of scores falling within
various areas beneath the curve, such as the space between
certain standard deviations and distances from the mean, have
been specified.

This hack relies on a complicated-looking table, but it is so full
of useful information that it will quickly become a primary tool in
your hacker's toolbox. Without further ado, take a deep breath
and look at Table 3-2.

Table 3-2. Areas under the normal curve

          Proportion of scores   Proportion of    Proportion of
z score   between the mean       scores in the    scores in the
          and z                  larger area      smaller area
.00       .00                    .50              .50
.12       .05                    .55              .45
.25       .10                    .60              .40
.39       .15                    .65              .35
.52       .20                    .70              .30
.67       .25                    .75              .25
.84       .30                    .80              .20
1.04      .35                    .85              .15
1.28      .40                    .90              .10
1.65      .45                    .95              .05
1.96      .475                   .975             .025
4.00      .50                    1.00             .00
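
If you ever need a z score that the table skips, the three proportions can be computed directly. This is a minimal sketch using `statistics.NormalDist`; the function name `table_row` is my own, and the printed values are rounded to three places, so they may differ slightly from a table rounded to two.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def table_row(z):
    """Return (between mean and z, larger area, smaller area) for z >= 0."""
    larger = std_normal.cdf(z)   # everything on the far side of z
    between = larger - 0.5       # area from the mean out to z
    smaller = 1 - larger         # the remaining tail
    return between, larger, smaller

for z in (0.00, 0.12, 0.25, 0.39, 0.52, 0.67, 0.84,
          1.04, 1.28, 1.65, 1.96, 4.00):
    between, larger, smaller = table_row(z)
    print(f"{z:4.2f}  {between:.3f}  {larger:.3f}  {smaller:.3f}")
```

For any row, the first column plus .50 gives the larger area, and the larger and smaller areas always sum to 1.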
Deciphering the Table

Before we use this nifty tool, we need to take a second deep
breath and get the lay of the land. I have simplified the
information on this table in a couple of ways. First, I have listed
only a few of the values that could be computed. Indeed, many
tables in statistical books have every value between a z of .00
and a z of 4.00, increasing at the rate of .01. That's a lot of
information that could be presented, so I have chosen to show
only a glimpse of the most commonly needed values, including
the z scores necessary for 90 percent confidence (1.65) and 95
percent confidence intervals (1.96); see "Measure Precisely"
[Hack #6] for more on confidence intervals.

I have also rounded the proportions to two decimal places.
Finally, I used the symbol z in the table to indicate the distance
from the mean in standard deviations. You can learn more about
z scores in "Give Raw Scores a Makeover" [Hack #26].

A fter unders tanding the s implific ations made to the table, the
firs t s tep toward us ing it to make probability predic tions about
performanc e or ans wer s tatis tic al ques tions is to unders tand the
four c olumns .

The z column

Picture the normal curve [Hack #23]. If you are
interested in some score that could fall along the bottom
horizontal line, it is some distance from the mean. It
could be greater than the mean score or less than it. The
distance to the mean expressed in standard deviations
is the z score. A z score of 1.04 describes a score that is
a little more than one standard deviation away from the
mean. Because the normal curve is symmetrical, we
don't bother to note whether the distance is negative or
positive, so all of these z scores are shown as positive.

Proportion of scores between the mean and z

In that space between a given score and the mean, there
will be a certain proportion of scores. This is the
probability that a random score will fall in the area
defined by the mean and any z.

Proportion of scores in the larger area

You could also describe the area between any given z
and a z of 4.00, or the end of the curve.

The curve doesn't really ever end, theoretically, but a z
score of 4.00 will come very close to including 100
percent of the scores.

There are two ends of the curve, though. Unless your z is
0.0, the distance between the z and one end of the curve
will be greater than the distance between the z and the
other end. This column refers to the area between the z
and that furthest end of the curve, and the value in this
column is the proportion of scores that will fall in that
space. In other words, it is the chance that a random
person will produce a score in that area.

Proportion of scores in the smaller area

This column refers to the area between the z and that
closest end of the curve. It is the proportion of scores
that will fall in that space.

Estimating the Chance of Scoring Above or Below Any Score

If you need to know your chances of getting into your college of
choice, identify the necessary score you need to beat, also
known as the cut score, on that school's admissions tests. Once
you know the score, find out the mean and standard deviation for
the test. (All of this info is probably on the Web.) Convert the
cut score to a z score [Hack #26], and then find that z score, or
something close to it, in Table 3-2.

Determine whether the cut score is above the mean:

If it is, look at the "Proportion of scores in the smaller
area" column. That represents your chances of scoring
at or above that cut score, and your chances of getting
in.

If the cut score is below the mean (unlikely, but for the
sake of completely training you on how to use this tool),
identify "Proportion of scores in the larger area." That's
the proportion of students being accepted and, thus,
your chances, all things being equal.

For the chances of scoring below a given score, the process is
the opposite of the options just mentioned. The chance of
getting below a specific cut score that is below the mean is
shown in the "smaller area" column. The chance of scoring below
a given cut score that is above the mean is shown in the "larger
area" column.
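As a cross-check on the table lookups described above, the same chances can be computed directly from the standard normal distribution. This is only a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the function names and the example test (mean 500, standard deviation 100, cut score 628) are hypothetical and do not come from the text.

```python
from statistics import NormalDist


def chance_at_or_above(cut_score, mean, sd):
    """Proportion of a normal distribution at or above cut_score."""
    z = (cut_score - mean) / sd
    return 1.0 - NormalDist().cdf(z)


def chance_below(cut_score, mean, sd):
    """Proportion of a normal distribution below cut_score."""
    z = (cut_score - mean) / sd
    return NormalDist().cdf(z)


# Hypothetical test: mean 500, SD 100, cut score 628, so z = 1.28.
above = chance_at_or_above(628, 500, 100)
below = chance_below(628, 500, 100)
print(f"z = 1.28: above = {above:.3f}, below = {below:.3f}")
```

For a cut score above the mean, the "above" value should land close to the "smaller area" column for that z (about .10 for a z of 1.28), and the "below" value close to the "larger area" column.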

Estimating the Chance of Scoring Between Any Two Scores

The chances of getting a score within any range of
scores can be determined by looking at the proportion of
scores that will normally fall in that range.

If you want to know what proportion of scores falls between any
two points under the curve, define those points by their z scores
and figure out the relevant proportion. Depending on whether
both scores fall on the same side of the mean, one of two
methods will give you the correct proportion between those
points:

If the z scores are on the same side of the mean, look up
both z scores in the same column (either "larger area" or
"smaller area") and subtract the lower value from the
higher value.

If the z scores fall on both sides of the mean with the
mean between them, use the "Proportion of scores
between the mean and z" column. Look up the value for
both scores and add them together.
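The two lookup methods above amount to taking the difference between two cumulative proportions. The sketch below is an illustration only, assuming Python 3.8 or later; `proportion_between` is a hypothetical helper name, and signed z scores are used here (negative for scores below the mean) even though the table lists only positive values.

```python
from statistics import NormalDist


def proportion_between(z_low, z_high):
    """Proportion of a normal distribution between two signed z scores."""
    std = NormalDist()
    return std.cdf(z_high) - std.cdf(z_low)


# Same side of the mean: z = .52 and z = 1.28.
# Table method: smaller areas .30 - .10 = .20.
same_side = proportion_between(0.52, 1.28)

# Opposite sides of the mean: z = -.52 and z = 1.28.
# Table method: areas between the mean and z, .20 + .40 = .60.
opposite_sides = proportion_between(-0.52, 1.28)

print(f"same side: {same_side:.3f}, opposite sides: {opposite_sides:.3f}")
```

Small differences from the table arithmetic are expected, because the table values are rounded.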
Producing Percentile Ranks

A third use of the table is to compute percentile ranks. You can
read more about such norm-referenced scores in "Produce
Percentiles" [Hack #24]. For scores above the mean, the
percentile rank is "Proportion of scores between the mean and z"
plus .50. For scores below the mean, the percentile rank is
"Proportion of scores in the smaller area."

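The percentile rule above can be sketched in a few lines. This is a hedged illustration, assuming Python 3.8 or later and a signed z score (negative below the mean); `percentile_rank` is a hypothetical name, not something defined in the text.

```python
from statistics import NormalDist


def percentile_rank(z):
    """Percentile rank (0 to 100) for a signed z score on a normal curve."""
    return 100.0 * NormalDist().cdf(z)


# Above the mean: the table gives .35 + .50 = .85 for z = 1.04.
high = percentile_rank(1.04)

# Below the mean: the table gives a smaller area of .25 for z = .67.
low = percentile_rank(-0.67)

print(f"z = +1.04 -> {high:.0f}th percentile")
print(f"z = -0.67 -> {low:.0f}th percentile")
```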
Determining Statistical Significance

Another use for these sorts of tables is to assign statistical
significance [Hack #4] to differences in scores. By knowing the
proportion of scores that will fall a certain distance from each
other or further, you can assign a statistical probability to that
outcome.

More usefully, other statistical values such as correlations and
proportions can be converted to z scores, and this table can be
used to compare those values to zero or to each other.

Why It Works

"See the Shape of Everything" [Hack #23] provides a good
picture of the normal curve. However, just by looking at the way
these values change in Table 3-2, you can get a good sense of
the normal distribution's shape. Near the mean, where the rows
have smaller z scores, a goodly proportion of scores will fall. As
you move further and further away from the mean, it takes wider
and wider stretches along the bottom of the curve to contain the
same proportion of scores.

For example, it takes a jump from a z of 1.65 to 4 just to cover
that last 5 percent of the distribution. Near the mean, though, it
requires only a jump from z = .12 to z = .25 to cover 5 percent of
scores. The table demonstrates how common it is to be common
and how rare it is to be scarce.
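To see this widening numerically, one can ask how far along the z axis you must travel to pick up each additional 5 percent of scores. The sketch below is an illustration under stated assumptions: Python 3.8 or later, and a hypothetical helper `z_width` that is not part of the text.

```python
from statistics import NormalDist

std = NormalDist()


def z_width(lower_proportion, upper_proportion):
    """Distance along the z axis between two cumulative proportions."""
    return std.inv_cdf(upper_proportion) - std.inv_cdf(lower_proportion)


near_mean = z_width(0.50, 0.55)  # the 5 percent of scores just above the mean
in_tail = z_width(0.90, 0.95)    # a 5 percent slice much farther out

print(f"near the mean: {near_mean:.2f} z units")
print(f"in the tail:   {in_tail:.2f} z units")
```

The slice in the tail should come out several times wider than the slice next to the mean, which is the pattern the table shows.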

See Also

You will be able to compute your own exact areas under
the normal curve by using this web site:
http://www.psychstat.missouristate.edu/introbook/sbk11m
A good discussion and some interactive calculators are
part of this site maintained by David Stockburger. When
you visit, don't be confused by words like Mu and Sigma.
That's stats talk for mean and standard deviation,
respectively.
Hack 26. Give Raw Scores a Makeover

A raw score on a test has little or no meaning. Change that
pitiful raw score to a "z score," though, and you will scarcely
believe how much information is crammed into that one little
super number.

It is surprising how little information is conveyed by that single
raw score plastered at the top of something like a high school
test. Here's what I mean. If I come home from school and tell
my mom that I got a 16 on the big exam in school today, she'll
probably say a few things, including "Why are you still living at
home at age 42?" and "That's nice, dear. Is 16 good?"

When you just tell someone a raw score, very little real
information has been shared. You don't know if 16 is good. You
don't know if 16 is relatively high or low. Did most people get a
16 or higher, or did most people get something less than 16?
Even if we know the range of scores on that test and the points
possible and so on, we still can't compare performance on that
test to performance on the past test or the next test or a test on
some other subject. Raw scores are virtually meaningless.

Don't fret! You can still understand your performance and the
performances of others. You can still make selection decisions
and compare performance across people and across tests.
There is still hope!

Raw scores can be changed into a new number that does all the
things that that 97-pound weakling, the raw score, could never
do. Raw scores can be transformed into a super number: a z
score. Unlike a raw score, a z tells you whether the performance
is above or below average, and how far above or below average it
is. A z also allows you to compare performance across tests and
occasions, and even between people.

Calculating z Scores

A z score is a raw score that has been transformed in such a way
that the new number indicates how far above or below the mean
the raw score is.

Here's the equation:

\[ z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} \]

To change a raw score into a z, subtract the mean from it and
then divide by the standard deviation. The standard deviation of
a distribution is, roughly, the average distance of each score from
the mean [Hack #2].

Understanding Performance

z scores typically take on a range of values between -3 and +3.
Examine the top part of the z score equation and you might
notice the following:

If the raw score is greater than the mean, the z will be
positive.

If the raw score is below the mean, the z will be negative.

If the raw score is exactly the mean, the z will be 0.

z scores tend to range between -3 and +3
because the normal distribution of scores
is typically just six standard deviations
wide [Hack #23].

Smart measurement professionals use the z score trick when
they report results. Instead of raw scores, all you see
are scores based on z scores, known generically as standardized
scores [Hack #27]. These standardized scores have known
stable characteristics. Therefore, if you know these scores'
characteristics (their mean and standard deviation), you can
turn them back into z scores and know how you did compared to
other people.

To see how to use this formula to reveal hidden information
about your performance, let's use the example of ACT tests. The
American College Test is taken by juniors in many high schools
across the U.S. and is required by many colleges for admission.
It is a test of achievement and ability believed to predict
performance in college.

Scores on any portion of the test range from 1 to 36. Though the
actual test's descriptive statistics have drifted over the last few
decades (as performance has improved), the official ACT mean
is often reported as 18 with a standard deviation of 6. Imagine
three students take the ACT and receive three different scores.
We could use the mean and standard deviation from the ACT
score distribution to transform them to z scores, as shown in
Table 3-3.

Zack's z is negative, so we know he scored below average. He
scored about two-thirds of a standard deviation below the mean.
Taylor's z of 0.00 means he performed average compared to
others who have taken the ACT over the years. Isaac did the
best, scoring a full standard deviation above the mean.
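The z score calculation for the three students can be sketched as follows. The mean of 18 and standard deviation of 6 are the values given in the text; the raw scores of 14, 18, and 24 are assumptions, chosen only because they reproduce the z scores described above (about -0.67, 0.00, and 1.00), and `z_score` is a hypothetical helper name.

```python
ACT_MEAN = 18  # mean reported in the text
ACT_SD = 6     # standard deviation reported in the text


def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviations."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Assumed raw scores, consistent with the z scores discussed above.
raw_scores = {"Zack": 14, "Taylor": 18, "Isaac": 24}

z_scores = {
    name: z_score(raw, ACT_MEAN, ACT_SD) for name, raw in raw_scores.items()
}

for name, z in z_scores.items():
    print(f"{name}: raw = {raw_scores[name]}, z = {z:+.2f}")
```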

The actual ACT mean and standard
deviation change every year the test is
given. For the last few years, the real
mean has been around 21 and the
standard deviation about 4.5.

Identifying the Rarity of Your Performance

Though knowing how you scored in comparison with others who
took the test is more useful than just knowing a raw score, the
real interpretative power of z scores comes from their relationship
to the normal curve. Figure 3-3 is a chart of the normal
distribution, similar to the one shown in "See the Shape of
Everything" [Hack #23].

Figure 3-3. z scores and the normal curve

The difference between the figure in "See the Shape of
Everything" [Hack #23] and this one is that instead of showing
the distance of each standard deviation from the mean, Figure
3-3 shows those values as z scores. By using knowledge of areas
under the normal curve, you can learn even more from a z score.
If the scores are normally distributed, there is a great deal you
can say about the probability of scores in a certain range
occurring.

The scores for the students shown in Table 3-3 can also be
interpreted as the proportion of students they did better (or
worse) than. Taylor's z of 0.00 means he did better than 50
percent of students. The kids' scores can also be expressed in a
probabilistic sense. There was a 50 percent chance that Taylor
would get a z of 0.00 or better. There is only a 16 percent
chance of getting a z of 1.00 or better on any normally
distributed test, so Isaac did well compared to other students
who took the test.
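The percentages quoted above follow from the area under the normal curve on either side of a z score. Here is a minimal sketch, assuming Python 3.8 or later and normally distributed scores; both function names are hypothetical.

```python
from statistics import NormalDist


def proportion_outscored(z):
    """Proportion of test takers expected to score below this z."""
    return NormalDist().cdf(z)


def chance_of_z_or_better(z):
    """Chance that a random test taker scores at or above this z."""
    return 1.0 - NormalDist().cdf(z)


for z in (0.00, 1.00):
    print(
        f"z = {z:.2f}: outscored {proportion_outscored(z):.0%}, "
        f"chance of this z or better {chance_of_z_or_better(z):.0%}"
    )
```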

Why It Works

If converting raw scores to z scores so we can compare people
to each other makes some sense to you, then you are not alone.
For the last 100 years in the world of educational measurement,
social scientists (and anyone who must evaluate human
performance) have been attracted to the simplicity of
norm-referenced interpretations. If we aren't sure what the score
on a test really means, we can at least compare your score to how
everyone else has done. We at least know whether you have more
or less of whatever it is we just measured than other people
have.

The alternative way to interpret educational and psychological
scores is criterion-referenced. That approach requires knowing
more about the trait or content that we have just measured and
deciding beforehand how much is enough. Criterion-referenced
measurement allows for everyone to get the same score as long
as they meet the same criteria. The norm-referenced approach
has been and continues to be the most popular interpretative
method, while the criterion-referenced approach has just
recently started to catch on.
Hack 27. Standardize Scores

Surprisingly, none of those well-known high-stakes tests, such
as the SAT or ACT or intelligence tests, ever reports your raw
score. Instead, test reports have transformed that useless
number into a more meaningful score, one that can be used to
understand your performance compared to everyone else who
ever took the same test. Once you understand "standardized"
scores, you can calculate them yourself and even invent your
own.

"Give Raw Scores a Makeover" [Hack #26] discusses the
superpowers of z scores. These standardized scores take
meaningless raw scores and add all sorts of information to them.
That's all well and good, and anyone using this book can
interpret z scores and make decisions based on that information.

If you want to interpret many score reports, though (such as
those SAT results you just got), you will not see a z score
reported anywhere, but instead some weirdo customized
standardized score, used only by that company, which is kind of
like a z score but different enough to be meaningless for the
uninitiated.

Never fear. Here are the tools you need to both interpret these
strange standardized scores and, if you want, even create your
own (for when you report scores to other people from your own
weirdo test that is just about to sweep the nation and make you
as rich as Mr. ACT or Ms. IQ or whoever makes money from our
test-based society).
Problems with z Scores

There is a certain, shall I say, ugliness to z scores that prevents
their widespread use when reporting performance to test takers
or their parents or the colleges and employers who are
considering them. Instead, most test companies use the z score
as the first step in creating a more attractive standardized
score, which is then reported.

A raw score is transformed into a z score using this formula:

\[ z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} \]

As described in greater detail in "Give Raw Scores a Makeover"
[Hack #26], this equation creates z scores that tend to range
between -3.00 and +3.00, with 0.00 as the average and a
standard deviation equal to one. Though z scores are a very
useful tool for interpreting test performance, people don't like
these numbers when they see them because of a few problems:

It can be negative. In fact, half of all z scores will be
negative. It is hard to convince people who take tests
that a negative score can be anything but bad news.

A score of 0.00 is the average score! If we can't explain
to people that a negative number isn't necessarily a bad
thing, imagine trying to convince parents that we expect
little Billy to get zero on the big test and we are pleased
when he does.

The highest score you can expect is a 3.00, and only
about 1 out of every 740 test takers (roughly 0.13 percent
of a normal distribution) will score that high. It seems like
an awful lot of hard work in test preparation just to get a
measly 3!

Measurement folks have searched for and found other
standardized scales to report test performance that have more
pleasing properties. The trick is to start with a z score, and then
convert it onto some other scale with a mean and standard
deviation that is friendlier.

Creating and Interpreting T Scores

One problem with z scores is that the mean is zero. Reporting
zero as if it is an okay thing rubs some teachers, parents, and
students the wrong way. We can solve that problem by moving
down the alphabet from a z to a T.

T scores are a transformation of z scores into a new distribution
that has a mean of 50 and a standard deviation of 10. The
equation for a T score uses this backwards transformation
approach. Here's the T score formula:

\[ T = (z \times 10) + 50 \]

So, if little Billy's performance on a big test is average and he
gets a z score of 0.00, instead of reporting that frightening score
to his parents, we can transform it into a T:

\[ T = (0.00 \times 10) + 50 = 50 \]

and report that Billy scored a 50. Congratulations! To make the
score meaningful, a good teacher or school counselor would
explain that T scores range from about 20 to 80, and 50 is
average.

T scores are used on some test reports as a better alternative to
z scores. Scores are almost never negative, and the mean is a
more substantial-seeming 50.

One popular test that reports scores using
the T score distribution is the Minnesota
Multiphasic Personality Inventory-2 (MMPI-2), a
psychological test that measures
depression, schizophrenia, and so on.
Mean scores on each MMPI-2 subscale
are 50, with a standard deviation of 10. By
putting each subtest score on the same
scale, you can compare across traits and
develop a profile of scores to understand
the test taker more completely.
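The T score transformation, and the reverse step of turning a reported T score back into a z score, can be sketched as below. The function names are hypothetical; the mean of 50 and standard deviation of 10 are the values given in the text.

```python
T_MEAN = 50  # mean of the T score distribution
T_SD = 10    # standard deviation of the T score distribution


def t_score(z):
    """Convert a z score to a T score."""
    return z * T_SD + T_MEAN


def z_from_t(t):
    """Convert a reported T score back into a z score."""
    return (t - T_MEAN) / T_SD


for z in (-3.0, 0.0, 3.0):
    print(f"z = {z:+.2f} -> T = {t_score(z):.0f}")

print(f"T = 65 -> z = {z_from_t(65):+.2f}")
```

The three z scores in the loop should map to the low end, the middle, and the high end of the typical 20-to-80 range.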

Creating Customized Standardized Scores

Test developers have found other ways of reporting standard
scores. Table 3-4 lists many of the best-known high-stakes
tests that most people have taken or will take someday.

Table 3-4. Common standardized score distributions

Test                                        Typical score range   Mean   Standard deviation
z scores                                    -3.00 to 3.00         0      1
T scores                                    20 to 80              50     10
American College Test (ACT)                 1 to 36               18     6
SAT                                         200 to 800            500    100
Graduate Record Exam (GRE)                  200 to 800            500    100
Graduate Management Admission Test (GMAT)   200 to 800            500    100
Law School Admission Test (LSAT)            120 to 180            150    10
Medical College Admission Test (MCAT)       1 to 15               8      2.5
Wechsler Intelligence Scales (IQ Test)      55 to 145             100    15
Stanford-Binet Intelligence Test (IQ Test)  52 to 148             100    16

Because test performance is normally distributed, you can
interpret any of these scores by placing it against the normal
curve and seeing whether your performance was average,
unusually low, or unusually high [Hack #23].
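One way to place a reported score against the normal curve is to convert it to a z score with the scale's mean and standard deviation, and then look up the percentile. The sketch below is an illustration only, assuming Python 3.8 or later; the `SCALES` dictionary copies a few rows of means and standard deviations from Table 3-4, and `interpret` is a hypothetical helper name.

```python
from statistics import NormalDist

# (mean, standard deviation) pairs taken from Table 3-4.
SCALES = {
    "T": (50, 10),
    "ACT": (18, 6),
    "SAT": (500, 100),
    "LSAT": (150, 10),
    "Wechsler IQ": (100, 15),
    "Stanford-Binet IQ": (100, 16),
}


def interpret(score, scale):
    """Return (z score, percentile rank) for a score on a named scale."""
    mean, sd = SCALES[scale]
    z = (score - mean) / sd
    return z, 100.0 * NormalDist().cdf(z)


for score, scale in [(650, "SAT"), (130, "Wechsler IQ"), (12, "ACT")]:
    z, pct = interpret(score, scale)
    print(f"{scale} {score}: z = {z:+.2f}, percentile = {pct:.1f}")
```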

Create Your Own Standardized Score

For fun, you can create your own standardized score distribution
with any mean and standard deviation you wish. Don't like your
SAT score of 350? Transform it into a score within a distribution
of your choosing.

Imagine, for example, that you'd prefer a distribution with a
mean of 752,365 and a standard deviation of 216,456 (and who
wouldn't?). Let's call this distribution the Frey Score Distribution.
Generalizing the T score formula, you could transform your SAT
score of 350 into a Frey score. Remember, you have to start with
the z score for an SAT score of 350:

\[ z = \frac{350 - 500}{100} = -1.50 \]

and then transform it into a Frey score:

\[ \text{Frey score} = (-1.50 \times 216{,}456) + 752{,}365 = 427{,}681 \]

Now, doesn't a score of 427,681 sound better than a score of
350? Because you know the mean and standard deviation of the
Frey distribution, the interpretation of both scores is the same;
they are still below average, and they are still 1 1/2 standard
deviations below the mean. You haven't changed reality, just the
numbers you use to describe it.
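The two-step conversion above generalizes to any pair of scales. A minimal sketch follows; `rescale` and its parameter names are hypothetical, while the SAT and Frey distribution values are the ones given in the text.

```python
def rescale(score, old_mean, old_sd, new_mean, new_sd):
    """Move a score from one standardized scale to another via its z score."""
    z = (score - old_mean) / old_sd  # step 1: convert to a z score
    return z * new_sd + new_mean     # step 2: place z on the new scale


# SAT score of 350 (mean 500, SD 100) moved onto the Frey Score
# Distribution (mean 752,365, SD 216,456).
frey = rescale(350, 500, 100, 752_365, 216_456)
print(f"Frey score: {frey:,.0f}")
```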
Why It Works

The distribution of z scores has a mean of 0 and a standard
deviation of 1. This is because of the equation used. Dividing
a group of values by its standard deviation makes the standard
deviation of the new distribution 1. Subtracting the mean
from each score in a distribution makes the new values distribute
themselves around a mean of 0.

If we want the scores we use to have a particular mean and
standard deviation of our own choosing, we can take each z
score and reverse engineer it, replacing the mean of 0 with
anything we want and the standard deviation of 1 with anything
we want.
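This property is easy to check on a small set of numbers. The sketch below uses made-up raw scores (an assumption, not data from the text) and the population standard deviation from Python's `statistics` module; the target mean of 50 and standard deviation of 10 are the T score values.

```python
from statistics import mean, pstdev

raw = [12, 15, 16, 18, 24]  # hypothetical raw scores

raw_mean = mean(raw)
raw_sd = pstdev(raw)  # population standard deviation

# Standardize: subtract the mean, then divide by the standard deviation.
z_scores = [(x - raw_mean) / raw_sd for x in raw]

# Reverse engineer onto a scale with a chosen mean and standard deviation.
new_mean, new_sd = 50, 10
rescaled = [z * new_sd + new_mean for z in z_scores]

print("z scores:", [round(z, 2) for z in z_scores])
print("mean and SD of z:", round(mean(z_scores), 6), round(pstdev(z_scores), 6))
print("mean and SD of rescaled:", round(mean(rescaled), 6), round(pstdev(rescaled), 6))
```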

Understanding Norm-Referenced Scoring

We have talked about the information inherent in
norm-referenced scoring and its intuitive appeal from a statistical
perspective, but it is not the only way to produce meaningful
scores, and it's not always the best method.

As discussed in "Give Raw Scores a Makeover" [Hack #26],
there are really two philosophies from which you can choose
when designing scoring systems and building tests:

Norm-referenced scoring

Driven by the philosophy that to best understand
performance on a task (such as acting in a movie or
taking the ACT), the level of performance for one person
should be compared to how other people performed

Criterion-referenced scoring

Evaluates performance based on a set of criteria, such
as a base of knowledge, a set of skills, instructional
objectives, and diagnostic characteristics

If the norm-referenced approach makes sense to you, then you
will want to use the tools presented here to interpret your
performance on these common standardized tests.
Hack 28. Ask the Right Questions

If you are a classroom teacher, a job interviewer, or in any
situation where you want to measure someone's understanding,
you have a variety of ways to ask a question. Here are some
tools from the science of measurement that allow you to ask
the right question in the right way.

For more than a hundred years, classrooms have been an
environment of questions and answers. Outside of school, tests
are more and more common in the workplace and in hiring
decisions. Even in my free time, I can't pick up a Cosmo without
having to respond to a relationship quiz about whether I am
"friendly" or "frosty" when it comes to meeting people at parties.
(I'm frosty. Want to make something of it?)

Many professionals have to ask good questions or write good
tests:

Teachers ask students questions while lecturing or
one-on-one in private conferences to assess student
understanding.

Trainers write questions to evaluate the effectiveness of
workshops.

Personnel officers develop standard questions to
measure applicants' skills.

Anyone who ever has to assess how much someone else knows
is faced with the dilemma of deciding what sort of question to
ask to really get to the heart of the matter. This hack provides
solutions to the two most common problems when writing tests
or designing questions meant to measure knowledge or
understanding:

How do I construct a good question?

What should I ask about?

Constructing a Good Question

For measuring knowledge quickly and efficiently, it is hard to
beat the multiple-choice item as a question format.

Multiple-choice questions are a type of
item that presents respondents with a
question or instruction (called the stem),
and then asks them to select the correct
answer or response from a list of answer
options. These types of items are
sometimes referred to as selection items
because people select the answer.

To give us the right terms to use as we talk about how to write a
good multiple-choice item, a quick primer is in order.

Here is an example of a multiple-choice item:

Who wrote The Great Gatsby?   Stem
A. Faulkner                   Distractor
B. Fitzgerald                 Correct answer ("keyed" answer)
C. Hemingway                  Distractor
D. Steinbeck                  Distractor

As you see, each part of the question has a name. The correct
answer is called the correct answer (how's that for scientific
jargon?), and wrong answers are called distractors.

Not much, but some real-world research has been done on the
characteristics of multiple-choice items and how to write good
ones. To write good multiple-choice items, follow these
critical item-writing guidelines from this research:

Include 3 to 5 answer options

Items should have enough answer options that pure
guessing is difficult, but not so many that the distractors
are not plausible or the item takes too long to complete.

Do not include "All of the Above" as an answer option

Some people will guess this answer option frequently, as
part of a test-taking strategy. Others will avoid it as part
of a test-taking strategy. Either way, it does not operate
fairly as a distractor. Additionally, to evaluate the
possibility that "All of the Above" is correct requires
analytical abilities that vary across respondents.
Measuring this particular analytic ability is likely not the
targeted goal of the test.

Do not include "None of the Above" as an answer option

This guideline exists for the same reasons as the
previous guideline. Additionally, for some reason,
teachers do tend to create items where "None of the
Above" is most likely to be the correct answer, and some
students know this.

Make all answer options plausible

If an answer option is clearly not correct because it
does not seem related to the other answer options, it is
from a content area not covered by the test, or the
teacher is obviously including it for humorous reasons, it
does not operate as a distractor. Students are not
considering the distractor, so a four-answer-option
question is really a three-answer-option question and
guessing becomes easier.

Order answer options logically or randomly

Some teachers develop a tendency to write items where
a certain answer option (e.g., B or C) is correct. Students
might pick up on this with a given teacher. Additionally,
some courses on doing well on standardized
multiple-choice tests suggest guessing such options as part
of a test-taking strategy. Teachers can control for any
tendencies of their own by placing the answer options in an
order based on some rule (e.g., shortest to longest,
alphabetical, chronological).

Another solution to this ordering problem
is for teachers to scroll through the first
draft of the test on their word processors
and attempt to randomize the order of
answer options. Computerized
randomization is the solution, of course,
for commercial standardized test
developers as well.

Make the stem longer than answer options

An item is processed more quickly if the bulk of the
reading is in the stem, followed by brief answer options.

Because longer stems followed by shorter answer
options allow for easier processing for test takers, a
good multiple-choice item should look like this:

=======================================

====================

====================

====================

====================

Do not use negative wording

Some students read more carefully or process words
more accurately than others, and the word "not" can
easily be missed. Even if the word is emphasized so no
one can miss it, educational content tends not to be
learned as a collection of non-facts or false statements,
but is likely stored as a collection of positively worded
truths.

Make answer options grammatically consistent with stem

For example, if the grammar used in the stem makes it
clear that the right answer is a female or is plural, make
sure that all answer options are female or plural.

Use complete sentences for stems

If a stem is a complete question ending with a question
mark, or a complete instruction ending with a period,
students can begin to identify the answer before
examining answer options. Students must work harder if
stems end with a blank or a colon, or are simply
uncompleted sentences. More processing increases
chances of errors.
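The guideline on ordering answer options logically or randomly lends itself to a small script. The following is a minimal sketch, not a tool described in the text: both function names are hypothetical, the seed is arbitrary, and the options come from the Great Gatsby example item.

```python
import random


def order_logically(options):
    """Order answer options by a fixed rule (here, alphabetical)."""
    return sorted(options)


def order_randomly(options, keyed, rng=None):
    """Shuffle answer options and report where the keyed answer landed."""
    rng = rng or random.Random()
    shuffled = list(options)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled, shuffled.index(keyed)


options = ["Faulkner", "Fitzgerald", "Hemingway", "Steinbeck"]

# A seeded generator makes the shuffled order reproducible for an answer key.
shuffled, key_position = order_randomly(options, "Fitzgerald", random.Random(28))

print("Who wrote The Great Gatsby?")
for letter, option in zip("ABCD", shuffled):
    print(f"{letter}. {option}")
print("Keyed answer:", "ABCD"[key_position])
```

Returning the position of the keyed answer along with the shuffled list keeps the answer key in step with the printed item.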

Asking a Question at the Right Level

Identifying the right level of question to ask is the second major
problem that must be overcome when creating tests. Some
questions are easy; they only assess one's ability to recall
information and indicate a fairly low level of knowledge. Other
questions are more difficult and require a response that
combines existing knowledge or applies it to a new problem or
situation. Because different levels of questions measure
different levels of understanding, the right question must be
asked at the right level for anything useful to be gained from the
enterprise.

A smart fellow and educational researcher, Benjamin Bloom,
writing in the 1950s, suggested a way of thinking about
questions and the level of understanding required to respond
correctly. His classification system has become known as
Bloom's Taxonomy, a classification system of educational
objectives based on the level of understanding necessary for
achievement or mastery. Bloom and colleagues have suggested
six different cognitive stages in learning. They are, in order from
lowest to highest:

1. Knowledge

Ability to recall words, facts, and concepts

2. Comprehension

Ability to understand and communicate about a topic

3. Application

Ability to use generalized knowledge to solve an
unfamiliar problem

4. Analysis

Ability to break an idea into parts and understand their
relationship

5. Synthesis

Ability to create a new pattern or idea out of existing
knowledge

6. Evaluation

Ability to make informed judgments about the value of
new ideas

Choosing the right cognitive level

Let's use teachers as an example of how to think about what
level of questions you want. Teachers choose the appropriate
cognitive level for classroom objectives, and a quality
assessment is designed to measure how well those objectives
have been met. Most items written by teachers, and those on
prewritten tests packaged with textbooks and teaching kits, are
at the knowledge level. Most researchers consider this
unfortunate, because classroom objectives should be (and
usually are) at higher cognitive levels than simply memorizing
information.

When new material is being introduced, however (at any age,
from preschool through advanced professional training), an
assessment probably should include at least a check that basic
new facts have been learned. When teachers decide to measure
beyond the knowledge level, the appropriate level for items
depends on the developmental level of students. The cognitive
level of students, particularly their ability to think and
understand abstractly, and their ability to solve problems using
multiple steps, should determine the best level for classroom
objectives, and, therefore, the best level for test items.
Researchers believe that teachers should test on what they
teach, in the same way that they teach it.

So, any time you find yourself wanting to assess the knowledge
hidden inside someone's head, think about what level of
understanding you want to assess. Is basic memorized
knowledge enough? If so, then the knowledge level is the
appropriate level for a question. Do you want to know whether
your job applicant can use her knowledge to solve problems she
has never experienced before? Ask a question at the application
level, and she will have to demonstrate that ability.

Designing questions at different cognitive levels

Follow the guidelines in Table 3-5 for creating items or tasks at
each level of Bloom's Taxonomy.

Table 3-5. Questions at different cognitive levels

Bloom's level | Question characteristics | Example question or task
Knowledge | Requires only rote memory ability and such skills as recall, recognition, and repeating back | Who wrote The Great Gatsby? A. Faulkner B. Fitzgerald C. Hemingway D. Steinbeck
Comprehension | Requires skills such as paraphrasing, summarizing, and explaining | What is a prehensile tail?
Application | Requires skills such as performing operations and solving problems, and includes words such as use, compute, and produce | If a farmer owns 40 acres of land and buys 16 acres more, how many acres of land does she own?
Analysis | Requires skills such as outlining, listening, logic, and observation, and uses words such as identify and break down | Draw a map of your neighborhood and identify each home.
Synthesis | Requires skills such as organization and design, and includes words such as compare and contrast | Based on your understanding of the characters, describe what might happen in a sequel to Flowers for Algernon.
Evaluation | Requires skills such as criticism and forming opinions, and includes words such as support and explain | Which musical film performer was probably the best athlete? Defend your answer.

When to use Bloom's Taxonomy

There is an implied hierarchy to Bloom's categories, with
knowledge representing the simplest level of cognition and
evaluation representing the highest and most complex level.
Anyone writing questions to assess knowledge can write items
for any given level. Teachers can identify the level of chosen
classroom objectives and create assessments to match those
levels. With objectively scored item formats, it is fairly simple to
tap lower levels of Bloom's Taxonomy and more difficult, but not
impossible, to measure at higher levels.

You should not worry too much about the fine distinctions
between the six levels as defined by Bloom. For example,
comprehension and application are commonly treated as
synonymous, as it is the ability to apply what is learned that
indicates comprehension. Most testing theorists and classroom
teachers today pay the most attention to the distinction between
the knowledge level and all the rest of the levels. Most teachers,
except at introductory stages of brand new areas, prefer to teach
and measure to objectives that are above the knowledge level.

See Also

Here's something a little more scholarly that I wrote
with some colleagues: Frey, B.B., Petersen, S.E.,
Edwards, L.M., Pedrotti, J.T., and Peyton, V. (2005).
"Item-writing rules: Collective wisdom." Teaching and
Teacher Education, 21, 357-364.

For a good review of item-writing rules, check out
Haladyna, T.M., Downing, S.M., and Rodriguez, M.C.
(2002). "A review of multiple-choice item-writing
guidelines for classroom assessment." Applied
Measurement in Education, 15(3), 309-334.

The influential ideas in Bloom's Taxonomy were
introduced in Bloom, B.S. (Ed.). (1956). Taxonomy of
educational objectives: The classification of educational
goals. Handbook 1. Cognitive domain. New York: McKay.

Bloom, B.S., Hastings, J.T., and Madaus, G.F. (1971).
Handbook on formative and summative evaluation of
student learning. New York: McGraw-Hill.

Phye, G.D. (1997). Handbook of classroom assessment:
Learning, adjustment, and achievement. San Diego, CA:
Academic Press.

Hack 29. Test Fairly

Classroom teachers frequently create their own tests to
measure their students' learning. They often worry whether
their tests are too hard or too easy and whether they measure
what they are supposed to measure. Item analysis tools provide
the solutions to teachers' concerns.

Classroom assessment is perhaps the single most common
activity in the modern schoolroom. Teachers are always making
and grading tests, students are always studying for and taking
tests, and the whole process is meant to support student
learning. Tests must not be too hard (or too easy), and they
must measure what the teacher wants them to measure. Test
scores and grades are the way that teachers communicate with
parents, students, and administrators, so the score at the top of
the test needs to be fair. It must accurately reflect student
learning, and it should be the result of a quality assessment.

Concerned teachers constantly work to improve their tests, but
they are often working in the dark without solid data to guide
them. What can a smart, caring teacher do to improve his tests
or improve the validity of his grading? A family of statistical
methods called item analysis can provide direction to teachers
as they seek to develop fair assessments and grading.

Item Analysis
Item analysis is the process of examining classroom
performance on individual test items. A classroom teacher might
want to examine performance on parts of a test she has written,
to see what areas are being mastered by her students and what
areas need more review. A commercial test developer producing
exams for nursing certification might want to know which items
on his test are the most valid and which seem to measure
something else and should therefore be removed.

In both cases, the developer of the test is interested in item
difficulty and item validity. Though one example involves a high
school teacher making tests for her own students, and the other
example involves a large for-profit corporation, both developers
are interested in the same types of data, and both can apply the
same tools of item analysis.

Three Types of Classroom Assessment Problems

If you are a classroom teacher worried about your own
assessments, there are three different types of questions that
you probably need to answer. Fortunately, there are three
item-analysis tools that will provide you with the three different
types of information you need.

Are my test questions too hard?

The difficulty of any specific test question can be calculated
fairly easily using the formula for the difficulty index. You can
produce a difficulty index for a test item by calculating the
proportion of students taking the test who got that item correct.
The larger the proportion, the more test takers know the
information measured by the item.

The term difficulty index is counterintuitive, because it actually
provides a measure of how easy the item is, not the difficulty of
the item. An item with a high difficulty index is an easy item,
not a tough one.

How hard is too hard? You get to decide that yourself. Some
teachers treat difficulty indices at .50 or below as too hard
because at least half of the test takers missed the item. You
might have higher standards. If you believe that most students
should have learned the material and your difficulty index for an
item suggests that a substantial portion of your class missed it,
it might be too hard.

Is each test question measuring what it is supposed to?

Measurement experts say that if a test item measures what it is
supposed to, then it is valid [Hack #32]. The discrimination index
is a basic measure of the validity of an item, in addition to its
reliability. It measures an item's ability to discriminate between
those who scored high on the total test and those who scored
low.

Though there are several steps in its calculation, once
computed, this index can be interpreted as an indication of the
extent to which overall knowledge of the content area or mastery
of the skills is related to the response on an item.

A discrimination index is not so named because it suggests
test bias. Discrimination is the ability to identify whether one
who got an item correct is in a high-scoring group or a
low-scoring group.

Why did my students miss a question?

In addition to examining the performance of an entire test item,
teachers are often interested in examining the performance of
individual distractors (incorrect answer options) on
multiple-choice items through analysis of answer options. By
calculating the proportion of students who choose each answer
option, teachers can see what sorts of errors students are
making. Have they mislearned certain concepts? Do they have
common confusions about the material?

To improve how well the item works from a measurement
perspective, teachers also can identify which distractors are
"working" and appear attractive to students who do not know the
correct answer, and which distractors are simply taking up space
and are not being chosen by many students.

To cut down on guesses that result in correct answers purely by
chance, teachers and test developers want as many plausible
distractors as is feasible. Analyses of response options allow
teachers to fine-tune and improve items they might want to use
again with future classes.

Conducting Item Analyses and Interpreting Results

Here are the procedures for the calculations involved in item
analysis, using data for an example item. For this example,
imagine a classroom of 25 students who took a test that
included the item in Table 3-6 (keep in mind, though, that even
large-scale standardized test developers use the same
procedures for tests taken by hundreds of thousands of people).

The asterisk for the answer options in Table 3-6 indicates that
B is the correct answer.

Table 3-6. Sample item for item analysis

Answer to question: "Who wrote The Great Gatsby?" | Number of students who chose each answer
A. Faulkner | 4
B. Fitzgerald* | 16
C. Hemingway | 5
D. Steinbeck | 0

To calculate the difficulty index:

1. Count the number of people who got the correct answer.

2. Divide by the total number of people who took the test.

On the item shown in Table 3-6, 16 out of 25 people got the
item right:

16 / 25 = .64

Difficulty indices range from .00 to 1.0. In our example, the item
had a difficulty index of .64. This means that 64 percent of
students knew the answer.
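
If you would rather let a computer do the dividing, the two steps above can be sketched in a few lines of Python. The function name difficulty_index is just a label chosen for this illustration; the counts are the ones from Table 3-6.

```python
def difficulty_index(num_correct, num_takers):
    """Proportion of test takers who answered the item correctly."""
    if num_takers <= 0:
        raise ValueError("num_takers must be positive")
    return num_correct / num_takers


# Counts from Table 3-6: 16 of the 25 students chose the keyed answer (B).
p = difficulty_index(16, 25)
print(f"Difficulty index: {p:.2f}")
```

Remember that a larger value means an easier item, so you would flag items whose index falls below whatever cutoff you have chosen.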

If a teacher believes that .64 is too low, there are a couple of
actions she can take. She could decide to change the way she
teaches to better meet the objective represented by the item.
Another interpretation might be that the item was too difficult or
confusing or invalid, in which case the teacher can replace or
modify the item, perhaps using information from the item's
discrimination index or analysis of response options.

To calculate the discrimination index:

1. Sort your tests by total score, and create two groupings
of tests: the high scores, made up of the top half of tests,
and the low scores, made up of the bottom half of tests.

2. For each group, calculate a difficulty index for the item.

3. Subtract the difficulty index for the low scores group
from the difficulty index for the high scores group.

Imagine that in our example 10 out of 13 students (or tests) in
the high group and 6 out of 12 students in the low group got the
item correct. The high group difficulty index is .77 (10/13) and
the low group difficulty index is .50 (6/12), so we can calculate
the discrimination index like so:

.77 - .50 = .27

The discrimination index for the item is .27. Discrimination
indices range from -1.0 to 1.0. The greater the positive value
(the closer it is to 1.0), the stronger the relationship is between
overall test performance and performance on that item.

If the discrimination index is negative, that means that, for some
reason, students who scored low on the test were more likely to
get the answer correct. This is a strange situation, and it
suggests poor validity for an item or that the answer key was
incorrect. Teachers usually want each item on the test to tap
into the same knowledge or skill as the rest of the test.

The formula for the discrimination index is such that if more
students in the high-scoring group chose the correct answer
than did students in the low-scoring group, the number is
positive. At a minimum, then, a teacher would hope for a
positive value, because that would indicate that knowledge
resulted in the correct answer.
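
Here is a minimal Python sketch of the three discrimination steps (sort, split into halves, subtract). The total scores are made up for illustration; only the group counts (10 of 13 correct in the high group, 6 of 12 in the low group) come from the example. With an odd number of tests, this sketch assumes the extra test goes to the high group, which matches the 13/12 split used in the text.

```python
def discrimination_index(records):
    """records: list of (total_score, item_correct) pairs, one per test."""
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    n_high = (len(ranked) + 1) // 2          # odd class size: extra test goes high
    high, low = ranked[:n_high], ranked[n_high:]
    p_high = sum(ok for _, ok in high) / len(high)   # difficulty index, high group
    p_low = sum(ok for _, ok in low) / len(low)      # difficulty index, low group
    return p_high - p_low


# Hypothetical total scores 25 down to 1 for the 25 students.
high_group = list(zip(range(25, 12, -1), [True] * 10 + [False] * 3))
low_group = list(zip(range(12, 0, -1), [True] * 6 + [False] * 6))
d = discrimination_index(high_group + low_group)
print(f"Discrimination index: {d:.2f}")
```

A negative result from a function like this is your cue to check the answer key before blaming the students.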

We can use the information provided in Table 3-6 to look at the
popularity of different answer options, as shown in Table 3-7.

Table 3-7. Item analysis of "Who wrote The Great Gatsby?"

Answer | Popularity of options | Difficulty index
A. Faulkner | 4/25 | .16
B. Fitzgerald* | 16/25 | .64
C. Hemingway | 5/25 | .20
D. Steinbeck | 0/25 | .00

The analysis of response options shows that students who
missed the item were about equally likely to choose answer A
and answer C. No students chose answer D, so answer option D
does not act as a distractor. Students are not choosing between
four answer options on this item; they are really choosing
between only three options, since they are not even considering
answer D.

This makes guessing correctly more likely, which hurts the
validity of an item. A teacher might interpret this data as
evidence that most students make the connection between The
Great Gatsby and Fitzgerald, and that the students who don't
make this connection can't differentiate between Faulkner and
Hemingway very well.
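
The response-option analysis is simple enough to automate. The sketch below rebuilds the 25 answers from the counts in Table 3-6, computes the proportion choosing each option, and flags options nobody picked. The function name and the "nobody picked it" rule for a nonfunctional distractor are assumptions made for this illustration; you might prefer a small cutoff instead of zero.

```python
from collections import Counter


def option_proportions(responses, options):
    """Proportion of test takers choosing each answer option."""
    counts = Counter(responses)
    total = len(responses)
    return {opt: counts[opt] / total for opt in options}


# Rebuild the 25 responses from the counts in Table 3-6.
responses = ["A"] * 4 + ["B"] * 16 + ["C"] * 5
props = option_proportions(responses, "ABCD")

# Assumption: a distractor that no one chose is treated as nonfunctional.
unused = [opt for opt, p in props.items() if p == 0]

for opt, p in props.items():
    print(f"{opt}: {p:.2f}")
print("Nonfunctional options:", unused)
```

With D out of play, a blind guesser is effectively picking among three options instead of four, so the chance of a lucky hit rises from 1 in 4 to 1 in 3.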

Suggestions for Item Analysis and Test Fairness

To improve the quality of tests, item analysis can identify items
that are too difficult (or too easy, if a teacher has that concern),
don't differentiate between those who have learned the content
and those who have not, or have distractors that are not
plausible.

If you as a teacher have concerns about test fairness, you can
change the way you teach, change the way you test, or change
the way you grade the tests:

Change the way you teach

If some items are too hard, you can adjust the way you
teach. Emphasize unlearned material or use a different
instructional strategy. You might specifically modify
instruction to correct a confusing misunderstanding
about the content.

Change the way you test

If items have low or negative discrimination values, they
can be removed from the current test, and you can
remove them from the pool of items for future tests. You
can also examine the item, try to identify what was
tricky about it, and change the item. When distractors
are identified as being nonfunctional (no one picks
them), teachers can tinker with the item and create a
new distractor. One goal for a valid and reliable test is to
decrease the chance that random guessing could result
in credit for a correct answer. The greater the number of
plausible distractors, the more accurate, valid, and
reliable the test typically becomes.

Change the way you grade

You might use item analysis information to decide that
the material was not taught and, for the sake of fairness,
remove the item from the current test and recalculate
scores. The simplest way for real classroom teachers to
do this is to simply count the number of bad items on a
test and add that number to everyone's score. This is
not technically the same as rescoring the test as if the
item never existed, but this way students still get credit
if they got a hard or tricky item correct, which seems
fairer to most teachers.
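
To see how the two grading fixes differ, here is a small sketch that scores two made-up students both ways. The answer key, the student names, and the choice of which item is "bad" are all hypothetical, invented only to show the arithmetic.

```python
def score(answers, key, skip=()):
    """Count answers matching the key, ignoring item positions listed in skip."""
    return sum(
        a == k
        for i, (a, k) in enumerate(zip(answers, key))
        if i not in skip
    )


key = ["B", "A", "D", "C", "A"]   # hypothetical 5-item answer key
bad_items = {2}                    # hypothetical: the third item was judged unfair
students = {
    "Ann": ["B", "A", "D", "C", "B"],   # got the bad item right
    "Raj": ["B", "A", "A", "C", "B"],   # missed the bad item
}

results = {}
for name, answers in students.items():
    raw = score(answers, key)
    add_back = raw + len(bad_items)             # add one point per bad item
    dropped = score(answers, key, bad_items)    # rescore as if the item never existed
    results[name] = (raw, add_back, dropped)
    print(name, "raw:", raw, "add-back:", add_back, "dropped:", dropped)
```

Under the add-a-point approach, Ann stays one point ahead of Raj because she answered the tricky item correctly. If the item is dropped entirely, the two end up tied.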

These concerns that teachers have about the quality of their
tests are not much different than the research questions that
scientists ask. Just like scientists, teachers can collect data in
their classroom, analyze the data, and interpret results. They
can then decide, based on their own personal philosophies, how
to act on those results.
Hack 30. Improve Your Test Score While
Watching Paint Dry

If you don't like the score you just got on that important high-
stakes test, maybe you should take the test again. Or should
you?

We've already discussed how to measure anything precisely by
applying concepts of reliability [Hack #6]. Reliability is the
consistency with which a test assesses some outcome. In other
words, a reliable test produces a stable score, and an unreliable
test does not. Because tests that are less than perfectly reliable
produce scores at least partly due to random chance, their
scores can move around in ways that statisticians can predict.
Because your test score when you retake a test will tend to
move toward the average score on that test, this effect is called
regression toward the mean.

When you take a high-stakes test such as the SAT, ACT, GRE,
LSAT, or MCAT, you often have the option of retaking it to try to
improve your score. Your decision on whether it is worth the
time, hard work, and money to try to improve your test score
should be made with an understanding of the test's reliability
and how much change is possible simply through regression to
the mean.

Regressing to the Mean


First, let's make regression to the mean occur, so you'll believe
that scores can change in a predictable direction for no reason
other than the characteristics of the normal curve [Hack #23].
Seeing is believing, and I hope to make this invisible magical
phenomenon happen before your eyes.

Give the true/false quiz shown in Table 3-8 to 100 of your
closest friends. Well, OK, maybe 10 people, counting you. 1,000
would be even better, but I just need enough to prove to you that
this regression thing happens. As we proceed, keep in mind that
if we had 100 or 1,000 takers of this very difficult (or very easy)
test, the results would be even more convincing.

Oh, and for this test, you don't have to see the actual questions
themselves. Scores will change on this test without any change
in the construct that is being measured [Hack #32]. So, all you
can do on this quiz is guess. Because they are true/false
questions, you will have a 50 percent chance of getting any
question correct, and the average performance for your group of
10 test takers (or 100 if you are really serious about this...can
you do at least 30 maybe? ...anyone?) should be a score of 5 out
of 10.

Table 3-8. Advanced Quantum Physics Quiz

Question | Circle Your Answer
1. | True or False
2. | True or False
3. | True or False
4. | True or False
5. | True or False
6. | True or False
7. | True or False
8. | True or False
9. | True or False
10. | True or False

Administer the Advanced Quantum Physics Quiz to all the people
you were able to get. And when you and the others take this quiz,
don't cheat by looking at the answer key, even though it is only
inches away from your eyes right now (in Table 3-9)!

Table 3-9. Answer key for the Advanced Quantum Physics Quiz

1. True | 2. True | 3. False | 4. False | 5. True
6. False | 7. False | 8. True | 9. True | 10. False

Collect the completed tests (make sure they put their names on
them) and score them up, using the answer key in Table 3-9.

Now, pick your highest scorer (this represents someone like
you, perhaps, who scores higher than average on standardized
tests such as the SAT) and the lowest scorer (this represents
someone not like you, perhaps, who scores lower than average).
Give these two people the quiz again (without them seeing the
correct answers) and score them again.

Here's where regression to the mean kicks in. I am pretty
sure, without knowing you or your friends or what their answers
are, of two things:

The person who scored lowest the first time will score
higher than he did before.

The person who scored highest the first time will score
lower than she did before.

If it worked, then aha! I told you so. If it didn't work, I told you I
was only "pretty sure." With a larger sample, it is much more
likely to work.
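
If you can't round up enough friends, you can fake them. The following Python sketch is a simulation, not the paper-and-pencil experiment described above: it assumes every simulated test taker guesses blindly on all 10 items both times, and it compares the average first score and average retest score of the people who started out very low (2 or less) or very high (8 or more). The seed, the number of takers, and the cutoffs are arbitrary choices for illustration.

```python
import random


def guess_score(rng, n_items=10):
    """Score from blind guessing on n_items true/false questions."""
    return sum(rng.random() < 0.5 for _ in range(n_items))


def mean(values):
    return sum(values) / len(values)


rng = random.Random(30)      # fixed seed so the run is repeatable
n_takers = 10_000            # many more friends than most of us have

first = [guess_score(rng) for _ in range(n_takers)]
second = [guess_score(rng) for _ in range(n_takers)]

low = [i for i, s in enumerate(first) if s <= 2]
high = [i for i, s in enumerate(first) if s >= 8]

low_first = mean([first[i] for i in low])
low_retest = mean([second[i] for i in low])
high_first = mean([first[i] for i in high])
high_retest = mean([second[i] for i in high])

print(f"Low scorers:  first {low_first:.2f}, retest {low_retest:.2f}")
print(f"High scorers: first {high_first:.2f}, retest {high_retest:.2f}")
```

Both groups should land near 5 on the retest, which is the whole point: nothing about the test takers changed, yet the extreme scores drifted back toward the middle.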

Why It Works

What we expect to happen with the two scores is that all the test
scores that are below 5 (or whatever your test mean was) would
move up toward the mean, and those scores above 5 would move
down toward the mean. This may or may not have happened with
your two scores, but it is the most probable outcome.

Remember this was a test in which knowledge had no effect on
scores. Scores were due entirely to chance both times. This
effect occurs with real tests, though, even when knowledge does
influence your score. That's because no real test is perfectly
reliable, and chance plays some role in performance on every
test. This demonstration just exaggerated the effect by
presenting a test in which chance accounts for 100 percent of
the test taker's score.

So, why are scores likely to change and move closer to the mean
on second occasions? In the long run, with 100 or 1,000 sets of
test scores, we would expect the outcomes to be something like
the normal distribution. Just like flipping a coin (which can come
up heads or tails, with a 50 percent chance of either),
probabilities are associated with particular outcomes on a
true/false test (or any test, for that matter). Table 3-10 shows
the possible scores and the likelihood of a test taker receiving
them for the Advanced Quantum Physics Quiz.

Table 3-10. Likely quiz score distribution

Score | Probability
0 | 0.001
1 | 0.010
2 | 0.044
3 | 0.117
4 | 0.205
5 | 0.246
6 | 0.205
7 | 0.117
8 | 0.044
9 | 0.010
10 | 0.001

Why would more extreme scores become less extreme with
repeated testing? Look at the likelihood of getting two extreme
scores (such as a score of 2 and then another score of 2) versus
getting a score of 2 (probability = .044), and then a score of 4
(probability = .205). It's almost five times as likely that a
person with a 2 the first time will score a 4 on a second
administration. It is almost 95 percent certain that he will score
higher than 2 (1 - .044 - .010 - .001 = .945).
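
The probabilities in Table 3-10 and the two comparisons in the paragraph above can be checked with a short calculation. This sketch assumes each of the 10 items is an independent 50/50 guess, so the score follows a binomial distribution; the function name p_score is only a label for this example.

```python
from math import comb


def p_score(k, n=10, p=0.5):
    """Binomial probability of exactly k correct guesses out of n items."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Rebuild Table 3-10, rounded to three decimals.
table = {k: round(p_score(k), 3) for k in range(11)}
for k, prob in table.items():
    print(f"{k:2d}  {prob:.3f}")

# How much more likely is a retest score of 4 than a second score of 2?
ratio = p_score(4) / p_score(2)

# Chance that a retest score is higher than 2.
p_higher = 1 - sum(p_score(k) for k in range(3))

print(f"P(4) / P(2) = {ratio:.2f}")
print(f"P(score > 2) = {p_higher:.3f}")
```

The ratio comes out a little under five and the chance of beating a 2 comes out just under 95 percent, in line with the figures quoted above.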

The phrase "regression toward the mean" gets its name from the
famous Francis Galton (a half cousin of Charles Darwin), who
studied the heights of parents and their children. He found that
the average height of the children was closer to the mean height
of all children than to the average height of the children's
parents. While Galton called this observation "regression toward
mediocrity" (Galton was not known to be a diplomat), we're a bit
kinder. It has nothing to do with genetics and everything to do
with, you guessed it, statistics.

With this test, in which scores were entirely due to chance, there
is a 65.6 percent chance of scoring at or very near the mean
(combining probabilities of scores 4, 5, and 6). With most tests,
which have a greater number of items and produce normal
distributions, you have a 68 percent chance of scoring at or near
the mean [Hack #23].
Predicting the Likelihood of a Higher
Score

This is all very interesting, but how will it help you decide
whether it is worth it to take a test a second time? Back to our
original dilemma. Taking these important tests (such as college
admissions tests) a second time takes more money, time,
stress, and, perhaps, preparation, so one needs to be strategic
in deciding when to try again.

Of course, you can do better on a test by actually increasing
your level of whatever knowledge the test is measuring. You are
likely to score higher if you prepare for an exam through study,
taking practice exams or preparation courses, and so on. If you
score very low, though, you are likely to do better without
having done anything between test administrations, just
because of regression to the mean. You can watch paint dry
between testing times and your score will still probably
increase. Lucky dog!

The likelihood that you will do better on a test by just taking it a
second time depends on two things: your score the first time and
the reliability of the test.

Your score

Because scores are likely (by chance alone) to move
toward the mean, the chance of you doing better given a
second chance depends on whether your first score is
below or above the mean. Think of the mean as that big
sucking sound you hear, pulling all the scores along a
distribution towards it. Scores below the mean are more
likely to increase than are scores above the mean.

Test reliability

Measurement statisticians use a number for reliability,
which represents the proportion of score variability that
is not due to chance. The higher the reliability, then, the
less of a role chance will play in determining your score.
Reliable scores are stable scores, and the super-sucking
powers of the mean are no match for a reliable score.

Statisticians have developed a formula that you can apply to
give you a good idea of how much wiggle room you have around
your score. If there is plenty of room to grow, you might consider
a second shot at it. A useful tool to use here is the standard error
of measurement. Here's the formula for the standard error of
measurement [Hack #6]:

$$\text{standard error of measurement} = SD \times \sqrt{1 - \text{reliability}}$$

Most standardized tests publish their levels of reliability and the
expected standard deviation for the many hundreds of thousands
of scores produced by the test during each administration. By
plugging values for these tests into the standard error of
measurement equation, one can get a general sense of the
variation of scores from test to retest that might be possible
without any real change in the person being measured.
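
As a rough sketch of that plugging-in step, the function below assumes the usual form of the standard error of measurement: the standard deviation times the square root of one minus the reliability. The standard deviation of 100 and the reliability of .91 are invented numbers for illustration, not the published values of any real test.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SD times the square root of (1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)


# Hypothetical test: standard deviation of 100 points, reliability of .91.
sem = standard_error_of_measurement(sd=100, reliability=0.91)
print(f"Standard error of measurement: {sem:.1f} points")
```

With these made-up values the wiggle room works out to about 30 points, so a retest score within 30 points or so of the original would not be surprising even if nothing about the test taker had changed.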

However, even the standard error is misleading for extreme
scores. Very low scores and very high scores are likely to move
a greater distance by chance alone than the standard error
would suggest. The further you are from normal, the harder it is
to resist the gravitational forces of normal. Extreme scores
cannot resist that pull, unless they are perfectly reliable.

In sum, here's some sound advice on how to decide whether to
retake a test:

If you scored very high, relatively speaking, but not as
high as you would like, it is probably not worth the
trouble to take the test a second time.

If you scored very low (far below average), it is almost
certain that you will score higher the second time. Try
again. You might study a little this time, too.

Neil Salkind
Hack 31. Establish Reliability

People who use, make, and take high-stakes tests have a
vested interest in establishing the precision of a test score.
Fortunately, the field of educational and psychological
measurement offers several methods for both verifying that a
test score is consistent and precise and indicating just how
trustworthy it is.

Anyone who uses tests to make high-stakes decisions needs to
be confident that the scores that are produced are precise and
that they're not influenced much by random forces, such as
whether the job applicant had breakfast that morning or the
student was overly anxious during the test. Test designers need
to establish reliability to convince their customers that they can
rely on the results produced.

Most importantly, perhaps, when you take a test that will affect
your admission to a school or determine whether you get that
promotion to head beverage chef, you need to know that the
score reflects your typical level of performance. This hack
presents several procedures for measuring the reliability of
measures.

Why Reliability Matters

Some basics, first, about test reliability and why you should
seek out reliability evidence for important tests you take. Tests
and other measurement instruments are expected to behave
consistently, both internally (items that measure the same
construct behave in similar ways) and externally (providing
similar results if they are administered again and again over
time). These are issues of reliability.

Reliability is measured statistically, and a specific number can
be calculated to represent a test's level of consistency. Most
indices of reliability are based on correlations [Hack #11]
between responses to items within a test or between two sets of
scores on a test given or scored twice.

Four commonly reported types of reliability are used to establish
whether a test produces scores that do not include much random
variance:

Internal reliability

Is performance for each test taker consistent across
different items within a single test?

Test-retest reliability

Is performance for each test taker consistent across
two administrations of the same test?

Inter-rater reliability

Is performance for each test taker consistent if two
different people score the test?

Parallel forms reliability

Is performance for each test taker consistent across
different forms of the same test?

Calculating Reliability

If you have produced a test you want to use, whether you are a
teacher, a personnel officer, or a therapist, you will want to verify
that you are measuring reliably. The methods you use to
compute your level of precision depend on the reliability type
you are interested in.

Internal reliability

The most commonly reported measure of reliability is a measure
of internal consistency referred to as coefficient (or Cronbach's)
alpha. Coefficient alpha is a number that almost always ranges
from .00 to 1.00. The higher the number, the more consistently
a test's items behave.

If you took a test and split it in half (the odd items in one half and
the even items in the other, for example), you could calculate the
correlation between the two halves. The formula for split-half
correlations is the correlation coefficient formula [Hack #11]
and is a traditional method for estimating reliability, though it is
considered a bit old-fashioned these days.

Mathematically, the formula for coefficient alpha produces an
average of correlations between all possible halves of a test and
has come to replace a split-half correlation as the preferred
estimate of internal reliability. Computers are typically used to
calculate this value because of the complexity of the equation:

$$\alpha = \frac{n}{n-1}\left(\frac{SD^2 - \sum SD_i^2}{SD^2}\right)$$

where n = the number of items on the test, SD = standard
deviation of the test, Σ means to sum up, and SD_i = standard
deviation of each item.
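
To make the pieces of the formula concrete, here is a minimal Python sketch that computes coefficient alpha from raw item scores. The function name, the data layout (one list of item scores per test taker), and the sample data are assumptions made for illustration. Population variances are used throughout; sample variances would give the same alpha, as long as the same kind is used for both the items and the total.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient (Cronbach's) alpha.

    item_scores holds one list per test taker, with one score per item.
    """
    n = len(item_scores[0])                       # number of items
    totals = [sum(row) for row in item_scores]    # each person's total score
    total_var = pvariance(totals)                 # SD squared for the whole test
    item_var_sum = sum(                           # sum of each item's SD squared
        pvariance([row[i] for row in item_scores]) for i in range(n)
    )
    return (n / (n - 1)) * ((total_var - item_var_sum) / total_var)


# Hypothetical data: five test takers, six items (1 = correct, 0 = wrong).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(coefficient_alpha(scores), 2))
```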

Test-retest reliability

Internal consistency is usually considered appropriate evidence
for the reliability of a test, but in some cases, it is also
necessary to demonstrate consistency over time.

If whatever is being measured is something that should not
change over time, or if it should change very slowly, then
responses from the same group should be pretty much the same
if they were administered the same test on two different
occasions. A correlation between these two sets of scores would
reflect a test's consistency over time.

Inter-rater reliability

We can also calculate reliability when more than one person
scores a test or makes an observation. When different raters are
used to produce a score, it is appropriate to demonstrate
consistency between them. Even if only one scorer is used (as
with a teacher in a classroom), if the scoring is subjective at all,
as with most essay questions and performance assessments,
this type of reliability has great theoretical importance.

To demonstrate that an individual's score represents typical
performance in these cases, it must be shown that it makes no
difference which judge, scorer, or rater was used. The level for
inter-rater reliability is usually established with correlations
between raters' scores for a series of people or with a
percentage that indicates how often they agreed.
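
The percentage-agreement version is simple enough to sketch in a few lines of Python. The function name and the two raters' scores below are hypothetical; the only assumption is that both raters scored the same people in the same order.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of people to whom both raters gave the identical score."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same set of people")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical essay scores (1 to 5) from two raters for ten students.
rater_a = [4, 3, 5, 2, 4, 3, 1, 5, 4, 2]
rater_b = [4, 3, 4, 2, 4, 3, 1, 5, 3, 2]

print(percent_agreement(rater_a, rater_b))
```

Exact agreement is a strict standard for a wide scoring scale, so a correlation between the two raters' scores can be more informative in that situation.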

Parallel forms reliability

Finally, we can demonstrate reliability by arguing that it doesn't
matter which form of a test a person takes; she will score about
the same. Demonstrating parallel forms reliability is necessary
only when the test is constructed from a larger pool of items.

For example, with most standardized college admission tests,
such as the SAT and the ACT, different test takers are given
different versions of the test, made up of different questions
covering the same subjects. The companies behind these tests
have developed many hundreds of questions and produce
different versions of the same test by using different samples of
these questions. This way, when you take the test in Maine on a
Saturday morning, you can't call your cousin in California and
tell him specific questions to prepare for before he takes the test
next week, because your cousin will likely have a different set of
questions on his test.

When companies produce different forms of the same test, they
must demonstrate that the tests are equally difficult and have
other similar statistical properties. Most importantly, they must
show that you would score the same on your Maine version as
you would if you took the California version.

Interpreting Reliability Evidence

There are a variety of approaches to establishing test reliability,
and tests for different purposes should have different types of
reliability evidence associated with them. You can rely on the
size of the reliability coefficients to decide whether a test you
have made needs to be improved. If you are only taking the test
or relying on the information it provides, you can use the
reliability value to decide whether you trust the test results.

Internal reliability

A test designed to be used alone to make an important
decision should have extremely high internal reliability,
so that the score one receives is very precise. A
coefficient alpha of .70 or higher is most often
considered necessary for a claim that a test is internally
reliable, though this is just a rule of thumb. You decide
what is acceptable for the tests you make or take.

Test-retest reliability

A test used to measure change over time, as in various
social science research designs, should display good
test-retest reliability, which means any changes
between tests are not due to random fluctuations in
scores. An appropriate size for a correlation of stability
depends on how theoretically stable a construct should
be over time. Depending on its characteristics, then, a
test should produce scores over time that correlate in
the range of .60 to 1.00.

Inter-rater reliability

Inter-rater reliability is interesting only if the scoring is
subjective, such as with an essay test. Objective,
computer-scored multiple-choice tests should produce
perfect inter-rater reliability, so that sort of evidence is
typically not produced for objective tests. If an inter-rater
correlation is used as the estimate of inter-rater
reliability, .80 is a good rule of thumb for minimum
reliability.

Sometimes, reliability across raters is estimated by
reporting the percentage of time the two scorers agreed.
With a percentage agreement reliability estimate, 85
percent is typically considered good enough.

Parallel forms reliability

Only tests with different forms can be described as
having parallel forms reliability. Your college professor
probably doesn't need to establish parallel forms
reliability when there is only one version of the final, but
large-scale test companies probably do.

Parallel forms reliability should be very high, so people
can treat scores on any form of the test as equally
meaningful. Typically, correlations between two forms of
a test should be higher than .90. Test companies
conduct studies in which one group of people takes both
forms of a test in order to determine this reliability
coefficient.

Before you take a high-stakes test that could determine which
roads are open to you, make sure that the test has accepted
levels of reliability. The type of reliability you'd like to see
evidence of depends on the purpose of the test.

Improving Test Reliability

The easiest way to ensure a high coefficient alpha or any other
reliability coefficient is to increase the length of your test. The
more items asking about the same concept and the more
opportunities respondents have to clarify their attitudes or
display knowledge, the more reliable a total score on that test
would be. This makes sense theoretically, but also increases
reliability mathematically because of the formula used to
calculate reliability.

Look back at the equation for coefficient alpha. As the length of
a test increases, the variability for the total test score increases
at a greater rate than the total variability across items. In the
formula, this means that the value in the parentheses gets larger
as a test gets longer. The n/(n-1) portion actually shrinks slightly
toward 1 as the number of items increases, but when the items
correlate positively with one another, the growth of the value in
the parentheses more than makes up for it. Consequently, longer
tests tend to produce higher reliability estimates.
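
To watch the effect of test length on alpha, the sketch below plugs an idealized test into the coefficient alpha formula. It assumes every item has a variance of 1 and every pair of items correlates at the same value. Both are simplifying assumptions made here for illustration, and the function name is invented.

```python
def alpha_for_length(n_items, avg_r):
    """Coefficient alpha for an idealized test of n_items items.

    Assumes each item has variance 1 and every pair of items
    correlates avg_r (a simplification, not a property of real tests).
    """
    item_var_sum = n_items                                 # sum of item variances
    total_var = n_items + n_items * (n_items - 1) * avg_r  # variance of total score
    return (n_items / (n_items - 1)) * ((total_var - item_var_sum) / total_var)


# Same modest inter-item correlation, increasingly long tests.
for n in (5, 10, 20, 40):
    print(n, round(alpha_for_length(n, 0.20), 2))
```

Under these assumptions the estimate climbs steadily as items are added, with each added item contributing a little less than the one before.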

Why It Works

Correlations compare two sets of scores matched up so that
each pair of scores describes one individual. If most people
perform consistently (each of their two scores is high, low, or
about average when compared to other individuals, or a high
score on one test matches consistently with a low score on
another), the correlation will be close to 1.00 or -1.00.

An inconsistent relationship between scores produces a
correlation close to 0. Consistency of scores, or the correlation
of a test with itself, is believed to indicate that a score is reliable
under the criteria established within Classical Test Theory
[Hack #6]. Classical Test Theory suggests, among other things,
that random error is the only reason that scores for a single
person will vary if the same test is taken many times.

Hack 32. Establish Validity

The single most important characteristic of a test is that it is
useful for its intended purpose. Establishing validity is
important if anyone is to trust that a test score means what it
is supposed to mean. You can convince yourself and others that
your test is valid if you provide certain types of evidence.

A good test measures what it is intended to measure. For
example, a survey that is supposed to find out how often high
school students wear seatbelts should, obviously, contain
questions about seatbelt use. A survey without these items
could reasonably be criticized as not having validity. Validity is
the extent to which something measures whatever it is expected
to measure. Surveys, tests, and experiments all require validity
to be acceptable. If you are building a test for psychological or
educational measurement, or just want to be sure your test is
useful, you should be concerned about establishing validity.

Validity is not something that a test score either has or does not
have. Validity is an argument that is made by the test designer,
those relying on the test's results, or anyone else who has a
stake in the acceptance of the test and its results.

Consider a spelling test that consists of math problems. Clearly,
a test with math problems is not a valid spelling test. While it is
not a valid spelling test, though, it might well be a valid math
test. The validity of a test or survey is not in the instrument
itself, but in the interpretation of the results.

A test might be valid for one purpose, but not another. It would
not be appropriate to interpret a child's score on a spelling test
as an indication of her math ability; the score might be valid as a
measure of verbal ability, but not as a measure of numerical
fluidity. The score itself is neither valid nor invalid; it is the
meaning attached to the score that is arguably valid or not valid.

To illustrate how to solve the problem of establishing validity,
imagine you have designed a new way of measuring spelling
ability. You want to sell the test forms to school districts across
the country, but first you must produce visible evidence that
your test measures spelling ability and not something else, such
as vocabulary, test anxiety, reading ability, or (in terms of other
factors that might affect scores) gender or race.

Strategies for Winning the Validity Argument

Validity might seem like an argument that can never be won,
because as an invisible indicator of quality, it can never be
completely established. As a test developer, though, you want to
be able to convince your test-takers and anyone who will be
using the results of your test that you are measuring
substantially whatever it is you are supposed to measure.
Fortunately, there are a number of accepted ways in which
evidence for the validity of a test can be provided.

The most commonly accepted type of validity evidence is also,
interestingly, theoretically the weakest argument one can make
for validity. This argument is one of face validity, and it runs as
follows: this test is valid because it looks (on its face) like it
measures what it is supposed to measure. Those presenting or
accepting an argument for face validity believe that the test in
question has the sort of items that one would expect to find on
such a test. For example, the seatbelt use survey mentioned
earlier would be accepted as valid if it has items asking about
seatbelt use.

The face validity argument is weak because it relies on human
judgment alone, but it can be compelling. Common sense is a
strong argument, perhaps even the strongest, for convincing
someone to accept any aspect of an assessment. Though face
validity seems less scientific than other types of validity
evidence (and in a real sense, it is less scientific), few test
instruments would be acceptable to those who make and use
them if face validity evidence is lacking. If you, as a test
developer or user, cannot supply the types of validity evidence
discussed in the rest of this hack, you are expected to provide a
test that at least has face validity.

For your spelling test, if test takers are asked to spell, you
have established face validity.

Four somewhat more scientific types of validity evidence are
generally accepted by those who rely on assessments. They are
all part of the range of arguments that can be made for validity.

Content-based arguments

Do the items on the test fairly represent the items that
could be on the test? If a test is meant to cover some
well-defined domain of knowledge, do the questions fairly
sample from that domain?

Criterion-based arguments

Do scores on the test estimate performance on some
other test?

Construct-based arguments

Does the score on the test represent the trait or
characteristic you wish to measure?

Consequences-based arguments

Do the people who take the test benefit from the
experience? Is the test biased against certain groups?
Does taking the test cause so much stress that, no
matter how you score, it isn't worth it?

Content-Based Arguments

If you decide to measure a concept, there are many aspects of
that concept and many different questions that can be asked on
a test. Some demonstration that the items you choose for your
test represent all possible items would be a content-based
argument for validity.

This sounds like a daunting requirement. Traditionally, this sort
of evidence has been considered more important for tests of
achievement. In areas of achievement (medicine, law, English,
mathematics), there are fairly well-defined domains and content
areas from which a valid test should sample. A classroom
teacher also, presumably, has defined a set of objectives or
content areas that a test should measure. Such concisely
defined aspects of a subject are rarely available, however, when
testing a range of behaviors, knowledge, or attitudes.
Consequently, making a reasonable argument that you have
selected questions that are representative of some imaginary
pool of all possible questions is difficult.

So, what is necessary for content evidence of validity in test
construction? It seems that, at a minimum, test construction
calls for some organized method of question selection or
construction. When measuring self-esteem, for example,
questions might cover how the test taker feels about himself in
different environments (e.g., work, home, or school), while
performing different tasks (e.g., sports, academics, or job
duties), or how he feels about different aspects of himself (e.g.,
his appearance, intelligence, or social skills).

For a classroom teacher measuring how much students have
learned during the last few weeks, a table of specifications (an
organized list of topics covered and weights indicating their
importance) is a good method.

The choice of how to organize a concept or how to break it down
into components belongs to the test developer. The developer
might have been inspired by research or other tests, or she
might just be following a common-sense scheme. The key is to
convince yourself, so that you can convince others that you are
covering the vital aspects of whatever area you are measuring.

For your spelling test, if you can establish that the words
students are asked to spell represent a larger pool of words that
students should be able to spell, you are providing content-based
validity evidence.

Criterion-Based Arguments

Criterion evidence of validity demonstrates that responses on a
test predict performance in some other situation. "Performance"
can mean success in a job, a test score, ratings by others, and
so on.

If responses on the test are related to performance on criteria
that can be measured immediately, the validity evidence is
referred to as concurrent validity. If responses on the test are
related to performance on criteria that cannot be measured until
some future time (e.g., eventual college graduation, treatment
success, or eventual drug abuse), the validity evidence is called
predictive validity.

It might go without saying that the measures you choose to
support criterion validity should be relevant; the criteria should
be measures of concepts that are somehow theoretically related.
This form of validity evidence is most persuasive and important
when the express purpose of a test is to estimate or predict
performance on some other measure.

Criterion-based evidence is less persuasive, and perhaps
irrelevant, for tests that do not claim to predict the future or
estimate performance on some other measure. For example,
such evidence might not be useful for your spelling test. On the
other hand, it is possible that you can demonstrate that high
scorers on your test do well in the National Spelling Bee.

Construct-Based Arguments

The third category of validity evidence is construct evidence. A
construct (pronounced with an emphasis on the first syllable:
con-struct) is the theoretical concept or trait that a test is
designed to measure. We know that we can never measure
constructs such as intelligence or self-esteem directly. The
methods of psychological measurement are indirect. We ask a
series of questions we hope will require the respondent to use
the part of her mind we are measuring or reference the portion of
her memory that contains information on past behaviors or
knowledge, or, at the very least, direct the respondent to
examine her attitudes and feelings on a particular topic.

We further hope that the test takers accurately and honestly
respond to test items. In practice, test results are often treated
as a direct measure of a construct, but we shouldn't forget that
they are educated guesses only. The success of this whole
process depends on another set of assumptions: that we have
correctly defined the construct we are trying to measure and
that our test mirrors that definition.

Construct evidence, then, often includes both a defense of the
defined construct itself and a claim that the instrument used
reflects that definition. Evidence presented for construct validity
can include a demonstration that responses behave as theory
would expect responses to behave. Construct validity evidence
continues to accumulate whenever a survey or test is used, and,
like all validity arguments, it can never be fully convincing. In a
sense, construct validity arguments include both content and
criterion validity arguments, because all validity evidence seeks
to establish a link between a concept and the activity that
claims to measure it.

For your spelling test, there might be research on the nature of
spelling ability as a cognitive activity or personality trait or
some other well-defined entity. If you can define what you mean
by spelling ability and demonstrate that your test's scores
behave as your definition would expect, then you can claim
construct-based validity evidence. Does theory suggest that
better readers are better spellers? Show that relationship,
perhaps with a correlation coefficient [Hack #11], and you have
presented validity evidence that might convince others.

Consequences-Based Arguments

Until the last decade or two, measurement folks interested in
establishing validity were concerned only with demonstrating
that the test score reflected the construct. Because of
increasing concerns that certain tests might unfairly penalize
whole groups of people, plus other concerns about the social
consequences of the common use of tests, policy makers and
measurement philosophers now look at the consequences
experienced by the test taker because of taking a test.

The idea is that we have gotten so used to testing and making
high-stakes decisions based on those test scores that we
should take a step back occasionally and ask whether society is
really better off if we rely on tests to make these decisions. This
represents a broadening of the definition of validity from a score
representing the construct to a test fulfilling its intended purpose.
Presumably, tests are here to help the world, not hurt it, and
consequences-based validity evidence helps to demonstrate the
societal value of testing.

Like people from the government in all those old jokes, tests
are "here to help us."

For your spelling test, the key negative consequences you want
to rule out involve test bias. If your theory of spelling ability
expects no differences across gender, race, or socio-economic
status, then spelling scores should be equal between those
groups. Produce evidence of similar scores between groups,
perhaps with a t test [Hack #17], and you will be well on your
way to establishing that your test is fair and valid.

Choosing from the Menu of Validity Options

The variety of categories of validity evidence described here
represents a strategic menu of options. If you want to
demonstrate validity, you can choose from across the range of
validity evidence types.

Clearly, not all tests need to provide all types of validity
evidence. A small teacher-made history test meant for a group
of 25 students might require only some content-based validity
evidence to convince the teacher to trust the results.
Criterion-based validity evidence is unnecessary, because estimating
performance on another test is not an intended purpose of this
sort of test.

On the other hand, higher-stakes tests, such as college
admissions tests (e.g., the ACT, SAT, and GRE) and intelligence
tests used to identify students as eligible for special education
funding, should be supported with evidence from all four validity
areas. For your spelling test, you can decide which type of
evidence, and which type of argument, is most convincing.

Hack 33. Predict the Length of a Lifetime

Many of us instinctively trust that things that have been around
a long time are likely to be around a lot longer, and things that
haven't, aren't. The formalization of this heuristic is known as
Gott's Principle, and the math is easy to do.

Physicist J. Richard Gott III has so far correctly predicted when
the Berlin Wall would fall and calculated the duration of 44
Broadway shows.[1] Controversially, he has predicted that the
human race will probably exist between 5,100 and 7.8 million
more years, but no longer. He argues that this is a good reason
to create self-sustaining space colonies: if the human race puts
some eggs in other nests, we might extend the life span of our
species in case of an asteroid strike or nuclear war on the home
planet.[2]

Gott believes that his simple calculations can be extended to
almost anything at all, within certain parameters. To predict how
long something will be around by using these calculations, all
you need to know is how long it has been around already.

In Action

Gott bases his calculations on what he calls the Copernican
Principle (and what some people call, in this specific application,
Gott's Principle). The principle says that when you choose a
moment in time to calculate the lifetime of a phenomenon, that
moment is probably quite ordinary, not special or privileged, just
as Copernicus told us the Earth does not occupy a privileged
place in the universe.

It's important to choose subjects at ordinary, unprivileged
moments. Biasing your test by choosing subjects that you
already believe to be near the beginning or end of their life
span (such as the human occupants of a neonatal ward or a
nursing home) will yield bad results. Further, Gott's Principle is
less useful in situations where actuarial data already exists.
Plenty of actuarial data is available on the human life span,
for instance, so there is little reason to apply it there.

Having chosen a moment, let's examine it. All else being equal,
there's a 50 percent chance the moment is somewhere in the
middle 50 percent of the phenomenon's lifetime, a 60 percent
chance it's in the middle 60 percent, a 95 percent chance it's in
the middle 95 percent, and so on. Therefore, there's only a 25
percent chance that you've chosen a moment in the first fourth
of its lifetime, a 20 percent chance it's in the first fifth, a 2.5
percent chance it's in the last 2.5 percent of the subject's
lifetime, and so on.

Table 3-11 provides equations for the 50 percent, 60 percent,
and 95 percent confidence levels. The variable t_past represents
how long the object has existed, and t_future represents how long
it is expected to continue.

Table 3-11. Confidence levels under Gott's Principle

Confidence level    Minimum t_future    Maximum t_future
50 percent          t_past/3            3 × t_past
60 percent          t_past/4            4 × t_past
95 percent          t_past/39           39 × t_past
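
The three rows of the table follow a single pattern: at confidence level c, the multiplier is (1 + c)/(1 - c). The short Python sketch below uses that relationship to reproduce the factors 3, 4, and 39. The function name is made up for illustration, and the general formula is inferred from the "middle c percent" reasoning rather than quoted from Gott.

```python
def gott_factor(confidence):
    """Multiplier k such that t_past/k <= t_future <= k * t_past.

    confidence is a proportion strictly between 0 and 1 (0.60 for 60 percent).
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return (1 + confidence) / (1 - confidence)


# Reproduce the rows of the table.
for level in (0.50, 0.60, 0.95):
    print(f"{level:.0%}: factor of about {gott_factor(level):.0f}")
```

The factor grows quickly as the confidence level approaches 100 percent, which is why the 95 percent range is so much wider than the 60 percent range.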

Let's look at a simple example. Quick: whose work do you think
is more likely to be listened to 50 years from now, Johann
Sebastian Bach's or Britney Spears'? Bach's first work was
performed around 1705. At the time of this writing, that's 300
years ago. Britney Spears' first album was released in January
1999, about 6.5 years or 79 months ago.

Consulting Table 3-11, for the 60 percent confidence level, we
see that the minimum t_future is t_past/4, and the maximum is
4 × t_past. Since t_past for Britney's music is 79 months, there is
a 60 percent chance that Britney's music will be heard for between
79/4 months and 79 × 4 months longer. In other words, we can be
60 percent sure that Britney will be a cultural force for
somewhere between 19.75 months (1.6 years) and 316 months
(26.3 years) from now.

Sixty percent is a good confidence level for quick estimation;
not only is it a better-than-even chance, but the factors 1/4
and 4 are easy to use.

By the same token, we can expect people to listen to Bach's
music for somewhere between another 300/4 and 300 × 4 years
at the 60 percent confidence level, or somewhere between 75
years and 1,200 years from now. Thus, we can predict that
there's a good chance that Britney's music will die with her fans,
and there's a good chance that Bach will be listened to in the
fourth millennium.
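
The arithmetic for both musicians can be checked with a few lines of Python. This is only a sketch: the function name is invented, and the factor of 4 is the 60 percent row of Table 3-11.

```python
def remaining_range(t_past, factor):
    """Shortest and longest expected remaining lifetime, in t_past's units."""
    return t_past / factor, t_past * factor


britney_min, britney_max = remaining_range(79, 4)   # months since January 1999
bach_min, bach_max = remaining_range(300, 4)        # years since about 1705

print(f"Britney: {britney_min} to {britney_max} months "
      f"({britney_min / 12:.1f} to {britney_max / 12:.1f} years)")
print(f"Bach: {bach_min:.0f} to {bach_max:.0f} years")
```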

How It Works

Suppose we are studying the lifetime of some object that we'll
call the target. As we've already seen, there's a 60 percent
chance we are somewhere in the middle 60 percent of the
object's lifetime (Figure 3-4).[3]

Figure 3-4. The middle 60 percent of the lifetime


If we are at the very end of this middle 60 percent, we are at the
second point marked "now?" in Figure 3-4. At this point, only 20
percent of the target's lifetime is remaining (Figure 3-5), which
means that t_future (20 percent) is equal to one-fourth of t_past
(80 percent). This is the minimum remaining lifetime we expect at
the 60 percent confidence level.

Figure 3-5. The minimum remaining lifetime (60 percent confidence level)

Similarly, if we are at the beginning of the middle 60 percent (the
first point marked "now?" in Figure 3-4), 80 percent of the
target's existence lies in the future, as depicted in Figure 3-6.
Therefore, t_future (80 percent) is equal to 4 × t_past (20 percent).
This is the maximum remaining lifetime we expect at the current
confidence level.

Figure 3-6. The maximum remaining lifetime (60 percent confidence level)

Since there's a 60 percent chance we're between these two
points, we can calculate with 60 percent confidence that the
future duration of the target (t_future) is between t_past/4 and
4 × t_past.
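
The same reasoning works at any confidence level. As a sketch (the symbols c, f, and T are introduced here for convenience and are not Gott's notation), let T be the target's total lifetime, f the fraction of it that has already elapsed, and c the confidence level expressed as a proportion:

```latex
% f is the fraction of the total lifetime T that has already elapsed
t_{past} = fT, \qquad t_{future} = (1 - f)T
\quad\Longrightarrow\quad
t_{future} = \frac{1 - f}{f}\, t_{past}

% With confidence c, "now" lies in the middle c of the lifetime
\frac{1 - c}{2} \le f \le \frac{1 + c}{2}
\quad\Longrightarrow\quad
\frac{1 - c}{1 + c}\, t_{past} \;\le\; t_{future} \;\le\; \frac{1 + c}{1 - c}\, t_{past}
```

Setting c = 0.60 gives f between 0.20 and 0.80, and the bounds reduce to t_past/4 and 4 × t_past, matching the two "now?" points described above.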

In Real Life

Suppose you want to invest in a company and you want to
estimate how long the company will be around to determine
whether it's a good investment. You can use Gott's Principle to
do so. Although it's not publicly traded, let's take O'Reilly
Media, the publisher of this book, as an example.

I certainly didn't pick O'Reilly Media at random, and plenty of
historical information is available about how long companies
tend to last, but let's try Gott's Principle as a rough-and-ready
estimate of O'Reilly's longevity anyway. After all, there's
probably good data on the longevity of Broadway shows, but Gott
didn't shrink from analyzing them, and I hesitate to say that now
that O'Reilly has published Mind Performance Hacks, its
immortality is assured.

According to Wikipedia, O'Reilly started in 1978 as a
consulting firm doing technical writing. It's July 2005 as I write
this, so O'Reilly has existed as a company for approximately 27
years. How long can we expect O'Reilly to continue to exist?

Here's O'Reilly's likely lifetime, calculated at the 50 percent
confidence level:

Minimum

27/3 = 9 years (until July 2014)

Maximum

27 × 3 = 81 years (until July 2086)

Here are our expectations at the 60 percent confidence level:

Minimum

27/4 = 6 years and 9 months (until April 2012)

Maximum

27 × 4 = 108 years (until July 2113)

Finally, here's our prediction with 95 percent confidence:

Minimum

27/39 = 0.69 years = about 8 months and 1 week (until
mid-March 2006)

Maximum

27 × 39 = 1,053 years (until July 3058)

In the post-dot-com economy, these figures look pretty good.
For example, Apple Computer's aren't much better, and
Microsoft was founded in 1975, so the same can be said for it. A
real investor would want to consider many other factors, such as
annual revenue and stock price, but as a first cut, it looks as
though O'Reilly Media is at least as likely to outlive a
hypothetical investor as to tank in the next decade.

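
For anyone who wants to rerun this kind of estimate for another company, here is a small Python sketch that turns the factors from Table 3-11 into calendar months. The function names are invented, and rounding the minimum to the nearest whole month is a simplification chosen here; the starting point of July 2005 and the age of 27 years come from the example above.

```python
def add_months(year, month, months):
    """Return the (year, month) that falls `months` after (year, month)."""
    index = year * 12 + (month - 1) + months
    return index // 12, index % 12 + 1


def remaining_window(start_year, start_month, years_past, factor):
    """Earliest and latest expected end dates as (year, month) pairs."""
    months_past = years_past * 12
    earliest = add_months(start_year, start_month, round(months_past / factor))
    latest = add_months(start_year, start_month, months_past * factor)
    return earliest, latest


# Factors taken from Table 3-11; O'Reilly was about 27 years old in July 2005.
for level, factor in ((50, 3), (60, 4), (95, 39)):
    earliest, latest = remaining_window(2005, 7, 27, factor)
    print(f"{level} percent: between {earliest} and {latest}")
```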
Endnotes

1. Ferris, Timothy. "How to Predict Everything." The New
Yorker, July 12, 1999.

2. Gott, J. Richard III. "Implications of the Copernican
Principle for Our Future Prospects." Nature, 363, May
27, 1993.

3. Gott, J. Richard III. "A Grim Reckoning."
http://pthbb.org/manual/services/grim.

Ron Hale-Evans

Hack 34. Make Wise Medical Decisions

Medical tests provide diagnostic screening information that is
often misunderstood by patients and, sometimes, even by
doctors. Understanding the probability characteristics called
"sensitivity" and "specificity" can provide a more accurate and
(sometimes) reassuring picture.

As a consumer of medical information, you have to make decisions
about behavior, treatment, seeking a second opinion, and so on. You
likely rely on medical information (newspaper stories, your
doctor's advice, test results) to make those decisions. However,
much of the medical information you get from your doctor has a
known amount of error. This is especially true about diagnostic
test results that indicate the probability that you have a certain
condition.

This hack is all about using information about the characteristics
of those medical tests to get a more accurate picture of reality
and, hopefully, make better decisions about treatment.

Statistics and Medical Screening

To use medical test information wisely, we have to learn just a bit
about what the concept of accuracy means for these tests. The four
possible outcomes of medical tests, in terms of accuracy, are shown
in Table 3-12.

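Table 3-12. Possible medical test outcomes

|                                                       | Patient actually has the condition (A) | Patient actually does not have condition (B) |
|-------------------------------------------------------|----------------------------------------|----------------------------------------------|
| Test result indicates patient has condition           | True positive (score is correct)       | False positive (score is wrong)              |
| Test result indicates patient does not have condition | False negative (score is wrong)        | True negative (score is correct)             |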

The reliability [Hack #6] of medical screening tests is summarized
by two proportions called sensitivity and specificity. Essentially,
those who rely on these tests are concerned with three questions of
accuracy:

If a person has the disease, how likely is the person to score a
positive test result? This likelihood is sensitivity. Of those
people in column A, what percent will receive a positive test
result?

If the person does not have the disease, how likely is the person
to score a negative test result? This likelihood is specificity. Of
those people in column B, what percent will receive a negative test
result?

If a person scores a positive test result, how likely is the person
to have the disease? From the patient's perspective, this is the
ultimate question, and it can be thought of as the basic validity
concern with these tests. Doctor, can I trust these test results,
or could there be some mistake?

Notice in Table 3-12 that there are different people in columns A
and B. People with the disease are in column A and people without
the disease are in column B. If you are in column A, you cannot
score a false positive on the test, because a positive result is
correct. If you are in column B, you cannot score a false negative,
because a negative result is correct.

Which column anyone is in depends on the natural distribution of
the disease. The chance that someone will be in column A (the
chance the person actually has the disease) depends on the base
rate of the disease. If 5 percent of the population has the
disease, 5 percent of the population would find themselves in
column A.

Understanding Breast Cancer Screening

Breast cancer is an example of a serious condition for which there
are diagnostic screening tests. Breast cancer screening begins with
a mammogram test. A positive result on this test results in further
testing: another mammogram, ultrasound, or biopsy.

We are first interested in answering the questions regarding the
sensitivity and specificity of breast cancer screening. With that
information and knowledge of the base rate for breast cancer, we
can answer the most important question:

If a woman scores a positive test result, how likely is she to have
breast cancer?

By asking your doctor or doing some research, you might discover
that sensitivity for mammograms is about 90 percent. Specificity is
about 92 percent.

The exact sensitivity and specificity for breast cancer screenings
change over time as different populations take the test. Younger
women now have mammograms more commonly than in the past, and the
test is less sensitive and less specific for younger women. Of
course, you should check with a physician or expert for current
levels of precision.

Table 3-13 shows those numbers in the layout used in Table 3-12.
Because columns A and B must both independently add to 100 percent,
we can also estimate the rate of false negatives and rate of false
positives.

Table 3-13. Theoretical mammogram results for 10,000 women

|                                    | Patient actually has breast cancer (A), N=120 | Patient actually does not have breast cancer (B), N=9,880 |
|------------------------------------|-----------------------------------------------|-----------------------------------------------------------|
| Mammogram indicates cancer         | Sensitivity: 90 percent, N=108                | False positives: 8 percent, N=790                         |
| Mammogram does not indicate cancer | False negatives: 10 percent, N=12             | Specificity: 92 percent, N=9,090                          |

Table 3-13 also shows the outcomes for 10,000 hypothetical women,
based on the base rate of breast cancer in the population, which is
about 1.2 percent.

It turns out that it is difficult to identify an accurate incidence
rate for breast cancer because of the different ways one can define
the relevant population and, of course, limitations in the accuracy
of breast cancer testing. I'm using an often-reported and fairly
well-accepted estimate of the current percentage of women aged 40
to 84 who have breast cancer.

Let's return now to the third question in our list of important
questions to ask before interpreting the results of a medical test.
If a person scores a positive test result, how likely is the person
to have the disease? Out of 10,000 women who have a breast cancer
screening, 898 will receive a positive score. For 790 of those
women, the score is wrong; they do not actually have breast cancer.
For 108 of those women, the test was right; they do have cancer. In
other words, if a person scores a positive result, it is only 12
percent likely that they have the disease. The most common result
for follow-up testing to a positive mammogram is that the patient
is, in fact, cancer free.

What about the accuracy of a negative result? Of the 9,102 women
who will score negative on the screening, 12 actually have cancer.
This is a relatively small 1/10 of 1 percent, but the testing will
miss those people altogether, and they will not receive treatment.
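
The head counts in this section can be rebuilt from just three inputs: the base rate, the sensitivity, and the specificity. The sketch below does that for 10,000 women. The function name `screening_counts` is hypothetical, and the counts come out as fractions of a person (790.4, for example) that the table rounds to whole numbers.

```python
def screening_counts(population, base_rate, sensitivity, specificity):
    """Split a population into the four cells of a screening table."""
    with_disease = population * base_rate
    without_disease = population - with_disease
    true_pos = with_disease * sensitivity
    false_neg = with_disease - true_pos
    true_neg = without_disease * specificity
    false_pos = without_disease - true_neg
    return true_pos, false_pos, false_neg, true_neg


# Figures quoted in the text: 1.2 percent base rate,
# 90 percent sensitivity, 92 percent specificity.
tp, fp, fn, tn = screening_counts(10_000, 0.012, 0.90, 0.92)

chance_sick_if_positive = tp / (tp + fp)
chance_sick_if_negative = fn / (fn + tn)

print(f"True positives:  {tp:,.0f}")
print(f"False positives: {fp:,.0f}")
print(f"False negatives: {fn:,.0f}")
print(f"True negatives:  {tn:,.0f}")
print(f"Chance of cancer after a positive result: {chance_sick_if_positive:.0%}")
print(f"Chance of cancer after a negative result: {chance_sick_if_negative:.2%}")
```

Working with counts of people rather than probabilities makes it easy to see why most positive results are false alarms: the 790 or so false positives swamp the 108 true positives.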

Why It Works

Medical screening accuracy uses a specific application of a
generalized approach to conditional probability attributed to
Thomas Bayes, a philosopher and mathematician in the 1700s. "If
this, then what are the chances that..." is a conditional
probability question.

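Bayes's approach to conditional probabilities was to look at the
naturally occurring frequencies of events. The basic formula for
estimating the chance that one has a disease if one has a positive
test result is:

$$P(\text{disease} \mid \text{positive}) = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

Expressed as conditional probabilities, where $D$ means the patient
has the disease and $+$ means a positive test result, the formula
is:

$$P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \text{not } D)\,P(\text{not } D)}$$

To answer the all-important question in our breast cancer example
("If a woman scores a positive test result, how likely is she to
have breast cancer?"), the mammogram equation takes on these
values:

$$P(D \mid +) = \frac{0.90 \times 0.012}{(0.90 \times 0.012) + (0.08 \times 0.988)} = \frac{0.0108}{0.08984} \approx 0.12$$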
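
To see how strongly the answer depends on the base rate, here is a small sketch of the same conditional probability calculation written as a function. The function name is made up for illustration. The 1.2 percent base rate is the one used in this hack, 5 percent echoes the earlier example of a disease found in 5 percent of the population, and 20 percent is a purely hypothetical value added for contrast.

```python
def chance_of_disease_given_positive(base_rate, sensitivity, specificity):
    """Chance of having the disease, given a positive test result."""
    true_positive_share = sensitivity * base_rate
    false_positive_share = (1 - specificity) * (1 - base_rate)
    return true_positive_share / (true_positive_share + false_positive_share)


# Same test (90 percent sensitive, 92 percent specific),
# applied to populations with different base rates.
for base_rate in (0.012, 0.05, 0.20):
    chance = chance_of_disease_given_positive(base_rate, 0.90, 0.92)
    print(f"Base rate {base_rate:.1%}: positive result means "
          f"{chance:.0%} chance of disease")
```

The test itself never changes in this loop. Only the rarity of the disease does, and that alone moves the meaning of a positive result a great deal.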
Making Informed Decisions

Medical tests are used to indicate whether patients might have a
disease or be at risk for getting one. Identifying the presence or
absence of a disease such as cancer is a process that usually has
at least two steps. In step one, a patient is administered a
screening test, typically a relatively simple and noninvasive test
that looks for indications that a person might have a certain
medical condition. If the result is positive, the second step is to
conduct a second test (or series of tests) that is typically more
complex, invasive, and expensive, but also much more accurate, to
confirm or disconfirm the original finding.

Medical tests are not perfectly reliable and valid. Test results
can be wrong. There are four possibilities for anyone who undergoes
medical testing. A patient might have the disease and the test
indicates this, or the patient does not have the disease and the
test finds no presence of it. In these cases, the test worked right
and the scores are valid.

Conversely, the test results might reflect the opposite of the true
medical condition, with a positive result wrongly indicating
presence of a disease that is not there, or a negative result
wrongly indicating that the patient is disease-free. In these
cases, the test did not work right and the results are not valid.
This table of outcomes is similar to the possibilities when one
accepts or rejects a hypothesis in statistical decision making
[Hack #4].

Breast cancer screening is very good at finding breast cancer when
it is there to find. However, one drawback to such a sensitive test
for a low-incidence disease is that many more people will be told
that they might have the disease than actually do. There is a
trade-off in medical testing between test sensitivity and test
specificity. More sensitive tests tend to result in more false
positives, but in serious situations like life and death, this
seems to be a result we can live with.

See Also

Gigerenzer, G. (2002). Calculated risks: How to know when numbers
deceive you. New York: Simon and Schuster.
Chapter 4. Beating the Odds
Why risk more than you have to when you take risks? Casino games
require you to take some chances, but this chapter of real-world
stat hacks will help you keep your edge and perhaps even overcome
the house's edge.

Start with Texas Hold 'Em poker [Hack #36]. (Maybe you've heard of
it?) When you play poker [Hack #37], play the odds [Hack #38].

Make sure, of course, to always gamble smart [Hack #35], regardless
of what you play, though when it comes to the level of risk you
take, some games [Hacks #39 and #40] are better than others
[Hack #41].

If you like to make friendly wagers with friends or strange wagers
with strangers, you can use the power of statistics to win some
surprisingly winnable bar bets with cards [Hacks #42 and #44], dice
[Hack #43], or just about anything else you can think of
[Hack #46], including your friends' birthdays [Hack #45].

Speaking of weird gambling games (and I think we were), there are
some odd statistical quirks [Hacks #47 and #49] you'll need to know
when you play them, even if it is just flipping a coin [Hack #48].
Hack 35. Gamble Smart

Whatever the game, if money and chance are involved, there are some
basic gambling truths that can help the happy statistician stay
happy.

Although this chapter is full of hacks aimed at particular games,
many of them games of chance, there are a variety of tips and tools
that are useful across the board for all gamblers. Much mystery,
superstition, and mathematical confusion pervade the world of
gambling, and knowing a little more about the geography of this
world should help you get around. This hack shows how to gamble
smarter by teaching you about the following things:

The Gambler's Fallacy, an intuitive yet false belief system that
has cost many an otherwise well-informed gamer

Casinos and money

Systems, sophisticated money management, and wagering procedures
that do not work

The Gambler's Fallacy


Did you ever have so many bad blackjack hands in a row that you
increased your bet, knowing that things were due to change anytime
now? If so, you succumbed to the gambler's fallacy, a belief that
because there are certain probabilities expected in the long run, a
short-term streak of bad luck is likely to change soon.

The gambler's fallacy is that there is a swinging pendulum of
chance and it swings in the region of bad outcomes for a while,
loses momentum, and swings back into a region of good outcomes for
a while. The problem with following this mindset is that luck, as
it applies to games of pure chance, is a series of independent
events, with each individual outcome unrelated to the outcome that
came before. In other words, the location of the pendulum in a good
region or bad region is unrelated to where it was a second before,
and (here's the rub) there isn't even a pendulum. The fickle finger
of fate pops randomly from possible outcome to possible outcome,
and the probability of it appearing at any outcome is the
probability associated with each outcome. There is no momentum.
This truth is often summarized as "the dice have no memory."

Examples of beliefs consistent with the gambler's fallacy include:

A slot machine that hasn't paid out in a while is due.

A poker player who has had nothing but bad hands all evening will
soon get a super colossal hand to even things out.

A losing baseball team that has lost the last three games is more
likely to win the fourth.

Because rolling dice and getting three 7s in a row is unlikely to
occur, rolling a fourth after having just rolled three straight
must be basically impossible.

A roulette ball that has landed on eight red numbers in a row
pretty much must hit a black number next.

Avoid fallacies like this at all costs, and gambling should cost
you less.

Casinos and Money

Casinos make money. One reason they make a profit is that the games
themselves pay off amounts of money that are slightly less than the
amount of money that would be fair. In a game of chance, a fair
payout is one that makes both participants, the casino and the
player, break even in the long run.

An example of a fair payout would be for casinos to use roulette
wheels with only 36 numbers on them, half red and half black. The
casino would then double the money of those who bet on red after a
red number hits. Half the time the casino would win, and half the
time the player would win. In reality, American casinos use 38
numbers, two of them neither red nor black. This gives the house a
2/38 edge over a fair payout. Of course, it's not unfair in the
general sense for a casino to make a profit this way; it's expected
and part of the social contract that gamblers have with the
casinos. The truth is, though, that if casinos made money only
because of this edge, few would remain in business.
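
The 2/38 figure falls out of a one-line expected value calculation. Here is a minimal sketch, assuming an even-money bet on red with 18 red pockets; the function name `even_money_edge` is invented for this example. Exact fractions are used so the result is not blurred by rounding (Python reduces 2/38 to 1/19).

```python
from fractions import Fraction


def even_money_edge(winning_pockets, total_pockets):
    """House edge on an even-money bet, as a fraction of the wager."""
    p_win = Fraction(winning_pockets, total_pockets)
    p_lose = 1 - p_win
    # The player gains one unit on a win and loses one unit on a loss,
    # so the house's expected gain per unit wagered is p_lose - p_win.
    return p_lose - p_win


fair_wheel = even_money_edge(18, 36)       # the hypothetical fair wheel
american_wheel = even_money_edge(18, 38)   # two extra non-red pockets

print(f"36-number wheel: house edge {float(fair_wheel):.2%}")
print(f"38-number wheel: house edge {american_wheel} "
      f"({float(american_wheel):.2%})")
```

The same function works for any even-money bet where you can count winning and total outcomes.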

The second reason that casinos make money is that gamblers do not
have infinitely deep pockets, and they do not gamble for an
infinite period of time. The edge that a casino has (5.26 percent
on roulette, for example) is only the amount of money they would
take if a gambler bet an infinite number of times. This infinite
gambler would be up for a while, down for a while, and at any given
time, on average, would be down 5.26 percent from her starting
bankroll.

What happens in real life, though, is that most players stop
playing sometime, usually when they are out of chips. Most players
keep betting when they have money and stop betting when they don't.
Some players, of course, walk away when they are ahead. No player,
though, keeps playing when they have no money (and no credit).

Imagine that Table 4-1 represents 1,000 players of any casino game.
All players started with $100 and planned to spend an evening (four
hours) playing the games. We'll assume a house edge of 5.26
percent, as roulette has, though other games have higher or lower
edges.

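Table 4-1. Fate of 1,000 hypothetical gamblers

| Time spent playing        | Have some money left | Mean bankroll left | Have lost all their money | Still playing |
|---------------------------|----------------------|--------------------|---------------------------|---------------|
| After an hour of play     | 900                  | $94.74             | 100                       | 900           |
| After two hours of play   | 800                  | $94.74             | 200                       | 800           |
| After three hours of play | 700                  | $94.74             | 300                       | 700           |
| After four hours of play  | 600                  | $94.74             | 400                       | 600           |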

In this example (which uses made-up but, I bet, conservative data),
after four hours, the players still have $56,844, the casino has
$43,156, and from the total amount of money available, the casino
took 43.16 percent. That's somewhat more than the official 5.26
percent house edge.
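
The arithmetic behind those totals is short enough to check by hand, but here it is as a sketch in Python. The function name `casino_take` is made up, and the inputs are the same made-up figures used in this example: 1,000 players, $100 each, and 600 players left after four hours with a mean of $94.74 apiece.

```python
def casino_take(players, starting_bankroll, still_playing, mean_left):
    """Return (dollars kept by players, dollars taken by casino, casino share)."""
    total_brought_in = players * starting_bankroll
    players_hold = still_playing * mean_left
    casino_holds = total_brought_in - players_hold
    return players_hold, casino_holds, casino_holds / total_brought_in


players_hold, casino_holds, share = casino_take(
    players=1000, starting_bankroll=100, still_playing=600, mean_left=94.74
)

print(f"Players still hold: ${players_hold:,.0f}")
print(f"Casino has taken:   ${casino_holds:,.0f}")
print(f"Casino's share:     {share:.2%}")
```

Changing `still_playing` to 900, 800, or 700 gives the casino's share after one, two, or three hours under the same assumptions.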

It is human behavior (the tendency of players to keep playing), not
the probabilities associated with a particular game, that makes
gambling so profitable for casinos. Because the house rules are
published and reported, statisticians can figure the house edge for
any particular game.

Casinos are not required to report the actual money they take in
from table games, however. Based on the depth of the shag carpet at
Lum's Travel Inn of Laughlin, Nevada (my favorite casino), though,
I'm guessing casinos do okay. The general gambler's hack here is to
walk away after a certain period of time, whether you are ahead or
behind. If you are lucky enough to get far ahead before your time
runs out, consider running out of the casino.
Systems

There are several general betting systems based on money management
and changing the amount of your standard wager. The typical system
suggests increasing your bet after a loss, though some systems
suggest increasing your bet after a win. As all these systems
assume that a streak, hot or cold, is always more likely to end
than continue, they are somewhat based on the gambler's fallacy.
Even when such systems make sense mathematically, though, anytime
wagers must increase until the player wins, the law of finite
pocket size sabotages the system in the long run.

Here's a true story. On my first visit to a legal gambling
establishment as a young adult, I was eager to use a system of my
own devising. I noticed that if I bet on a column of 12 numbers at
roulette, I would be paid 2 to 1. That is, if I bet $10 and won, I
would get my $10 back, plus another $20. Of course, the odds were
against any of my 12 numbers coming up, but if I bet on two sets of
12 numbers, then the odds were with me. I had a 24 out of 36 (okay,
really 38) chance of winning: better than 50 percent!

I understood, of course, that I wouldn't triple my money by betting
on two sets of numbers. After all, I would lose half my wager on
the set of 12 that didn't come up. I saw that if I wagered $20,
about two-thirds of the time I would win back $30. That would be a
$10 profit. Furthermore, if I didn't win on the first spin of the
wheel, I would bet on the same numbers again, but this time I would
double my bets! (I am a super genius, you agree?) If by some slim
chance I lost on that spin as well, I would double my bet one more
time, and then win all my money back, plus make that 50 percent
profit. To make a long story short, I did just as I planned, lost
on all three spins and had no money left for the rest of the long
weekend and the 22-hour drive home.
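
For the curious, here is a sketch of what the math says about that system, assuming an American wheel with 38 pockets and the $10-per-column example from the story. The function and constant names are invented for illustration, and the three-spin total assumes a $20 first wager that doubles twice.

```python
P_WIN = 24 / 38    # one of the 24 covered numbers comes up
P_LOSE = 14 / 38   # any of the other 14 pockets comes up


def two_column_bet(stake_per_column):
    """Return (total wagered, net gain on a win, expected net result)."""
    total_wagered = 2 * stake_per_column
    # The winning column pays 2 to 1; the other column's stake is lost.
    net_gain_on_win = 2 * stake_per_column - stake_per_column
    net_loss_on_miss = -total_wagered
    expected = P_WIN * net_gain_on_win + P_LOSE * net_loss_on_miss
    return total_wagered, net_gain_on_win, expected


total_wagered, net_gain, expected = two_column_bet(10)
three_losses_in_a_row = P_LOSE ** 3
money_lost_after_three = 20 + 40 + 80   # doubling the wager each spin

print(f"Wager ${total_wagered}, win ${net_gain} about {P_WIN:.0%} of the time")
print(f"Expected result per spin: ${expected:.2f} "
      f"({expected / total_wagered:.2%} of the wager)")
print(f"Chance of losing three spins in a row: {three_losses_in_a_row:.1%}")
print(f"Money gone after three doubled losses: ${money_lost_after_three}")
```

Covering more numbers raises the chance of winning each spin but leaves the house edge untouched, and losing three in a row is not a slim chance at all: it happens about once in every 20 attempts.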

The simplest form of this sort of system is to double your bet
after each loss, and then whenever you do win (which you are bound
to do), you are back up a little bit. The problem is that it is
typical for a long series of losses to happen in a row; these are
the normal fluctuations of chance. During those losing streaks, the
constant doubling quickly eats up your bankroll.

Table 4-2 shows the results of doubling after just six losses in a
row, which can happen frequently in blackjack, roulette, craps,
video poker, and so on.

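Table 4-2. The "double after a loss" system

| Loss number | Bet size | Total expenditure |
|-------------|----------|-------------------|
| 1           | $5       | $5                |
| 2           | $10      | $15               |
| 3           | $20      | $35               |
| 4           | $40      | $75               |
| 5           | $80      | $155              |
| 6           | $160     | $315              |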

Six losses in a row, even under an almost 50/50 game such as
betting on a color in roulette, is very likely to happen to you if
you play for more than just a couple of hours. The actual chance of
a loss on this bet for one trial is 52.6 percent (20 losing
outcomes divided by 38 possible outcomes). For any six spins in a
row, a player will lose all spins about 2.1 percent of the time
(.526 x .526 x .526 x .526 x .526 x .526).

Imagine 100 spins in two hours of play. A player can expect six
losses in a row to occur twice during that time. Commonly, then,
under this system, a player is forced to wager 32 times the
original bet, just to win an amount equal to that original bet. Of
course, most of the time (52.6 percent), when there have been six
losses in a row, there is then a seventh loss in a row!
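
Both the doubling schedule and the streak probability are easy to generate. The sketch below uses a $5 starting bet and a 20/38 chance of losing a bet on a color; the function name `doubling_schedule` is hypothetical, and multiplying the streak probability by 100 spins is only the rough estimate used above, not an exact count of expected streaks.

```python
def doubling_schedule(first_bet, losses):
    """List (loss number, bet size, total spent) for a losing streak."""
    rows = []
    bet = first_bet
    total_spent = 0
    for loss_number in range(1, losses + 1):
        total_spent += bet
        rows.append((loss_number, bet, total_spent))
        bet *= 2  # double the wager after every loss
    return rows


for loss_number, bet, total_spent in doubling_schedule(5, 6):
    print(f"Loss {loss_number}: bet ${bet}, total spent ${total_spent}")

p_loss = 20 / 38                 # losing a bet on red or black
p_six_losses = p_loss ** 6       # six independent losses in a row
print(f"Chance of six straight losses: {p_six_losses:.2%}")
print(f"Rough number of such streaks in 100 spins: {100 * p_six_losses:.1f}")
```

Extending the schedule to 10 losses shows how quickly the bet outgrows most bankrolls and most table limits.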

Systems do exist for gambling games in which players can make
informed strategic decisions, such as blackjack (with card
counting) and poker (reading your opponent), but in games of pure
chance, statisticians have learned to expect the expected.
Hack 36. Know When to Hold 'Em

In Texas Hold 'Em, the "rule of four" uses simple counting to
estimate the chance that you are going to win all those chips.

Texas Hold 'Em No Limit Poker is everywhere. As I write this, I
could point my satellite dish to ESPN, ESPN2, ESPN Classics, FOX
Sports, Bravo, or E! and see professional poker players, lucky
amateurs, major celebrities, minor celebrities, and even (Lord help
us, on the Speed channel) NASCAR drivers playing this simple game.

You probably play yourself, or at least watch. The most popular
version of the game is simple. All players start with the same
amount of chips. When their chips are gone, so are they. Every
round, players get two cards each that only they (and the patented
tiny little poker table cameras) see. Then, three community cards
are dealt face up. This is the flop. Another community card is then
dealt face up. That's the turn. Finally, one more community card,
the river, is dealt face up. Betting occurs at each stage. Players
use any five of the seven cards (five community cards, plus the two
they have in their hands) to make the best five-card poker hand
they can. The best hand wins.

Because some cards are face up, players have information. They also
know which cards they have in their own hands, which is more
information. They also know the distribution of all cards in a
standard 52-card deck. All this information about a known
distribution of values [Hack #1] makes Texas Hold 'Em a good
opportunity to stat hack all over the place [Hacks #36 and #38].

One particularly crucial decision point is the round of betting
right after the flop. There are two more cards to come that might
or might not improve your hand. If you don't already have the nuts
(the best possible hand), it would be nice to know what the chances
are that you will improve your hand on the next two cards. The rule
of four allows you to easily and fairly accurately estimate those
chances.

How It Works

The rule of four works like this. Count the number of cards
(without moving your lips) that could come off of the deck that
would help your hand. Multiply that number by four. That product
will be the percent chance that you will get one or more of those
cards.

Example 1

You have a Jack of Diamonds and a Three of Diamonds. The flop comes
King of Clubs, Six of Diamonds, and Ten of Diamonds. You have four
cards toward a flush, and there are nine cards that would give you
that flush. Other cards could help you, certainly (a Jack would
give you a pair of Jacks, for example), but not in a way that would
make you feel good about your chances of winning.

So, nine cards will help you. The rule of four estimates that you
have a 36 percent chance of making that flush on either the turn or
the river (9x4 = 36). So, you have about a one out of three chance.
If you can keep playing without risking too much of your stack, you
should probably stay in the hand.

Example 2

You have an Ace of Diamonds and a Two of Clubs. The flop brings the
King of Hearts, the Four of Spades, and the Seven of Diamonds. You
could count six cards that would help you: any of the three Aces or
any of the three Twos. A pair of twos would likely just mean
trouble if you bet until the end, so let's say there are three
cards, the Aces, that you hope to see. You have just a 12 percent
chance (3x4 = 12). Fold 'em.

Why It Works

The math involved here rounds off some important values to make the
rule simple. The thinking goes like this. There are about 50 cards
left in the deck. (More precisely, there are 47 cards that you
haven't seen.) When drawing any one card, your chance of drawing a
card you want [Hack #3] is the number of cards that would help
divided by 50.

I know, it's really 1 out of 47. But I told you some things have
been simplified to make for the simple mnemonic "the rule of four."

Whatever that probability is, the thinking goes, it should be
doubled because you are drawing twice. Dividing by 50 and then
doubling is the same as multiplying by 4 and reading the answer as
a percentage.

This also isn't quite right, because on the river the pool of cards
to draw from is slightly smaller, so your chances are slightly
better.

For the first example, the rule of four estimates a 36 percent
chance of making that flush. The actual probability is 35 percent.
In fact, the estimated and actual percent chance using the rule of
four tends to differ by a couple percentage points in either
direction.
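
To compare the shortcut with the exact answer, here is a small sketch. It assumes 47 unseen cards after the flop and computes the exact chance as one minus the chance of missing on both the turn and the river. The function names are made up for this example.

```python
def rule_of_four(outs):
    """Estimated percent chance of hitting an out on the turn or river."""
    return outs * 4


def exact_chance_two_cards(outs, unseen=47):
    """Exact chance of hitting at least one out in the next two cards."""
    miss_turn = (unseen - outs) / unseen
    miss_river = (unseen - 1 - outs) / (unseen - 1)
    return 1 - miss_turn * miss_river


for outs in (3, 9, 12, 25):
    estimate = rule_of_four(outs)
    exact = exact_chance_two_cards(outs) * 100
    print(f"{outs:2d} outs: rule of four says {estimate}%, "
          f"exact is {exact:.1f}%")
```

Running the loop over a wider range of outs shows where the estimate is a couple of points off and where it starts to drift much further.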

Other Places It Works

Notice that this method also works with just one card left to go,
but in that case, the rule would be called the rule of two. Add up
the cards you want and multiply by two to get a fairly accurate
estimate of your chances with just the river remaining. This
estimate will be off by about two percentage points in most cases,
so statistically savvy poker players call this the rule of two plus
two.
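
A quick sketch of the one-card case, assuming 46 unseen cards once the turn has been dealt (52 minus your two cards and the four community cards). The function names are invented for illustration.

```python
def rule_of_two_plus_two(outs):
    """Estimated percent chance of hitting an out on the river."""
    return outs * 2 + 2


def exact_chance_one_card(outs, unseen=46):
    """Exact chance of hitting an out with one card to come."""
    return outs / unseen


for outs in (4, 8, 9):
    estimate = rule_of_two_plus_two(outs)
    exact = exact_chance_one_card(outs) * 100
    print(f"{outs} outs: rule says {estimate}%, exact is {exact:.1f}%")
```

For the eight or nine outs of a typical straight or flush draw, the estimate lands within about a point of the exact figure.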

Where It Doesn't Work

The rule of four will be off by quite a bit as the number of cards
that will help you increases. It is fairly accurate with 12 outs
(cards that will help), where the actual chance of drawing one of
those cards is 45 percent and the rule of four estimate is 48
percent, but the rule starts to overestimate quite a bit when you
have more than 12 cards that can help your hand.

To prove this to yourself without doing the calculations, imagine
that there are 25 cards (out of 47) that could help you. That's a
great spot to be in (and right now I can't think of a scenario that
would produce so many outs), but the rule of four says that you
have a 100 percent chance of drawing one of those cards. You know
that's not right. After all, there are 22 cards you could draw that
don't help you at all. The real chance is 79 percent. Of course,
making a miscalculation in this situation is unlikely to hurt you.
Under either estimate, you'd be nuts to fold.
Hack 37. Know When to Fold 'Em

In Texas Hold 'Em, the concept of pot odds provides a powerful tool
for deciding whether to call or fold.

If you watch any poker on TV, you quickly pick up a boatload of
jargon. You'll hear about big slick and bullets and all-in and
tilt. You'll also hear discussions about pot odds, as in, "He might
call here, not because he thinks he has the best hand, but because
of the pot odds."

When the pot odds are right, you should call a hand even when the
odds are that you will lose. So, what are pot odds and why would I
ever put more money into a pot that I am likely to lose?

Pot Odds

Pot odds are determined by comparing the chance that you will win
the pot to the amount of chips you would win if you did win the
pot. For example, if you estimate that there is a 50 percent chance
that you will win a pot, but the pot is big enough that winning it
would win you more than double the cost of calling the bet in front
of you, then you should call.

To see how pot odds works in practice, here is a scenario with four
players: Thelma, Louise, Mike, and Vince. As shown in Table 4-3,
Thelma is in the best shape before the flop.

The tables that follow show the decisions each player makes based
on the pot odds at each point in a round. Read the following tables
left to right, following each column all the way down, to see what
Thelma thinks and does, then what Louise thinks and does, and so
on.

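Table 4-3. Players' starting hands

| Player      | Thelma                | Louise           | Mike               | Vince                      |
|-------------|-----------------------|------------------|--------------------|----------------------------|
| Cards       | Ace Clubs, Ace Hearts | 2 Clubs, 4 Clubs | 4 Hearts, 5 Spades | King Diamonds, 10 Diamonds |
| Opening bet | 50                    | 50               | 50                 | 50                         |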

Then comes the flop: Ace Spades, 3 Diamonds, 6 Diamonds. Table 4-4
shows the revised analysis of the players' positions. After the
flop, three of them are hoping to improve their hands, while one of
them, Thelma, would be satisfied with no improvement of her hand,
thinking she has the best one now. Thelma is driving the betting,
and the other three players are deciding whether to call.

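Table 4-4. Analysis after the flop

| Player                               | Thelma | Louise         | Mike                   | Vince                |
|--------------------------------------|--------|----------------|------------------------|----------------------|
| Needed cards                         |        | Any of four 5s | Any of four 2s or four 7s | Any of nine diamonds |
| Chance of getting card               |        | 16 percent     | 32 percent             | 36 percent           |
| Current pot                          | 200    | 250            | 250                    | 300                  |
| Cost to call as percentage of pot    |        | 20 percent     | 20 percent             | 17 percent           |
| Action                               | Bet 50 | Fold           | Call 50                | Call 50              |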

Table 4-4 shows the use of pot odds after the flop. Thelma has a
pair of aces to start and hits the third ace on the flop.
Consequently, she begins each round by betting. The other players
who have yet to hit anything must decide whether to stick around
and hope to improve their hands into strong, likely winners.

Pot odds come into play primarily when making the decision whether
to stick around or fold. Louise needs a five to make her straight,
and she estimates a 16 percent chance of getting that 5 somewhere
in the next two cards. However, with that pot currently at $250 and
a $50 raise from Thelma, which she would have to call, Louise would
have to pay 20 percent of the pot. This is a 20 percent cost
compared with a 16 percent chance of winning the pot. The risk is
greater than the payoff, so Louise folds. Mike and Vince, however,
have more outs, so pot odds dictate that they stick around.

Then comes the turn: the Jack of Clubs. As shown in Table 4-5,
after the turn, with only one card left to go, Mike's pot odds are
no longer better than his chances of drawing a winning card, and he
folds. Though Vince starts out with a potentially better hand than
Mike, he too eventually folds when the pot odds indicate he should.

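Table 4-5. Analysis after the turn

| Player                            | Thelma  | Louise | Mike           | Vince          |
|-----------------------------------|---------|--------|----------------|----------------|
| Needed cards                      |         |        | Same as before | Same as before |
| Chance of getting card            |         |        | 18 percent     | 20 percent     |
| Current pot                       | 350     |        | 450            | 450            |
| Cost to call as percentage of pot |         |        | 22 percent     | 22 percent     |
| Action                            | Bet 100 |        | Fold           | Fold           |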

Let's assume that the players are using only pot odds to make their
decisions, ignoring for the sake of illustration that they are
probably trying to get a read on the other players (e.g., who could
bluff, raise, and so on). By the way, players are calculating the
chance that they will get a card to improve their hand using the
rule of four and the rule of 2 + 2 [Hack #36].
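
The call-or-fold decisions in this scenario follow one simple comparison, sketched below. The function name `pot_odds_decision` is made up, and the rule it encodes is the simplified one used in this hack: call when your percent chance of hitting is greater than the cost of calling expressed as a percentage of the current pot.

```python
def pot_odds_decision(chance_percent, pot, cost_to_call):
    """Return 'Call' or 'Fold' using the simplified pot odds rule."""
    cost_percent = 100 * cost_to_call / pot
    return "Call" if chance_percent > cost_percent else "Fold"


# (player, percent chance of hitting, current pot, cost to call)
after_flop = [("Louise", 16, 250, 50), ("Mike", 32, 250, 50), ("Vince", 36, 300, 50)]
after_turn = [("Mike", 18, 450, 100), ("Vince", 20, 450, 100)]

for label, decisions in (("After the flop", after_flop), ("After the turn", after_turn)):
    print(label)
    for player, chance, pot, cost in decisions:
        action = pot_odds_decision(chance, pot, cost)
        print(f"  {player}: {chance}% chance vs. "
              f"{100 * cost / pot:.0f}% of pot -> {action}")
```

The chances plugged in here are the players' own estimates from the rule of four and the rule of two plus two, not exact probabilities.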

Why It Works

Imagine a game that costs a dollar to play. Pretend the rules are
such that half the time you will win and get paid three dollars,
and the other half of the time you will lose. In other words, half
the time you would lose one dollar and half the time you would
gain two dollars. Over time, if you kept playing this crazy game,
you would make a whole lot of money.

It is the same sort of thinking that governs the use of pot odds
in poker. With a 36 percent chance of making a flush, a perfectly
fair bet would be to wager 36 percent of the pot. You would get
your flush 36 percent of the time and break even over the long
run. If you could play a game in which you could pay less than
36 percent of the pot and still win 36 percent of the time in the
long run, you should play that crazy game, right? Well, every
time you find yourself in a situation in which your chances of
winning are better than the proportion of the pot you have to
wager, you have an opportunity to play just such a crazy game.
Trust the statistics. Play the crazy game.
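
The arithmetic behind the crazy game can be checked with a short sketch. It uses the same simplified model as the text (you pay a cost every time you play and collect a payout when you win); the function name and the 100-chip pot are assumptions chosen only for illustration.

```python
def expected_net(p_win: float, payout: float, cost: float) -> float:
    """Average net result per play: pay `cost` every time, collect `payout` on a win."""
    return p_win * payout - cost


# The dollar game: pay $1, collect $3 half the time.
dollar_game = expected_net(0.5, 3.0, 1.0)

# A flush draw with a 36 percent chance of winning a 100-chip pot.
fair_price = expected_net(0.36, 100, 36)    # paying 36 percent of the pot
bargain = expected_net(0.36, 100, 20)       # paying only 20 percent of the pot

print(f"dollar game: {dollar_game:+.2f} per play")
print(f"fair price: {fair_price:+.2f}, bargain: {bargain:+.2f}")
```

Any price below the fair 36 percent gives a positive long-run average, which is the whole argument for calling.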
Where Else It Works

Experienced players not only make use of pot odds to make
decisions about folding their hands, but they even make use of a
slightly more sophisticated concept known as implied pot odds.
Implied pot odds are based not on the proportion of the current
pot that a player must call, but on the proportion of the pot total
when the betting is completed for that betting round.

If players have yet to act, a player who is undecided about
whether to stay in based on pot odds might expect other players
to call down the line. This increases the amount of the final pot,
increases the amount the player would win if she hit one of her
wish cards, and increases the actual pot odds when all the
wagering is done.

The phrase "implied pot odds" is also sometimes used to refer
to the relative cost of betting compared to the final, total pot
after all rounds of betting have been completed. I have also
heard the term "pot odds" used to describe the idea that if you
happen to "hit the nuts" (get a strong hand that's unlikely to be
beaten) or close to it, then you are likely to win a pot much
bigger than the typical pot. Some players spend a lot of energy
and a lot of calls just hoping to hit one of these super hands and
really clean up.

Implied pot odds work like this. In the scenario in Table 4-3,
Mike might have called after Fourth Street (the fourth card
revealed), anticipating that Vince would also call. This would
have increased the final pot to 650, making Mike's contribution
that round only 15 percent and justifying his call.

Interestingly, if Vince had been betting into a slightly larger pot
that contained Mike's call, the pot odds for Vince's 100-chip
call would then have dropped to 18 percent and Vince might
have called. In fact, if Mike were a super genius-type player, he
well could have called on the turn knowing that would change the
pot odds for Vince and therefore encourage him to call. Real-life
professional poker players who are really, really good really do
think that way sometimes.
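
The numbers in this scenario can be reproduced with a small sketch. It assumes, as the text does, a 450-chip pot after Thelma's bet and a 100-chip call; the variable names are invented for this example.

```python
def cost_fraction(pot: float, call: float) -> float:
    """Call expressed as a fraction of the pot it is compared against."""
    return call / pot


CALL = 100
POT_AFTER_THELMA_BETS = 450

# Plain pot odds for Mike: the call against the pot as it stands now.
mike_now = cost_fraction(POT_AFTER_THELMA_BETS, CALL)

# Implied pot odds for Mike: the final pot if both he and Vince call.
final_pot = POT_AFTER_THELMA_BETS + CALL + CALL
mike_implied = cost_fraction(final_pot, CALL)

# Vince's pot odds once Mike's call is already in the pot.
vince_after_mike = cost_fraction(POT_AFTER_THELMA_BETS + CALL, CALL)

print(f"Mike now: {mike_now:.0%}, Mike implied: {mike_implied:.0%}")
print(f"Vince after Mike calls: {vince_after_mike:.0%}")
```

Mike's cost drops from about 22 percent to about 15 percent, below his 18 percent chance of hitting, and Vince's drops to about 18 percent, below his 20 percent chance.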

Where It Doesn't Work

Remember that pot odds are based on the assumption that you
will be playing poker for an infinite amount of time. If you are in a
no-limit tournament format, though, where you can't dig into your
pockets, you might not be willing to risk all or most of your chips
on your faith about what will happen in the long run.

The other problem with basing life-and-death decisions on pot
odds is that you are treating a "really good hand" as if it were a
guaranteed winner. Of course, it's not. The other players may
have really good hands, too, that are better than yours.
Hack 38. Know When to Walk Away

In Texas Hold 'Em, when you are "short-stacked," you have only
a couple of choices: go all-in right now or go all-in very soon. As
you might have guessed, knowing when to make your last stand
is all about the odds.

I hear the TV poker commentators talking about how "easy" it is
in Texas Hold 'Em tournaments to play when you are
short-stacked. They mean it is easy because you don't have many
options from which to choose.

The term "short-stacked" can be used in a couple of different
ways. Sometimes, it is used to refer to whoever has the fewest
chips at the table. Under this use of the term, even if you have
thousands of chips and can afford to pay a hundred antes and big
blinds, you are short-stacked if everyone else has more chips.

A better definition, which is more applicable to statistics-based
decision making, is that you are short-stacked when you can
only afford to pay the antes and blinds for a few more times
around the table. Under this definition, there is mounting
pressure to bet it all and hope to double or triple up and get back
in the game. I prefer this use of the term because without
pressure to play, being "short-stacked" is not a particularly
meaningful situation.

It doesn't feel easy, though, does it, when you are short-stacked
and have to go all-in (bet everything you have)? It feels very,
very hard for two reasons:

You are probably not going to win the tournament. You
realize that you are down to very few chips and would
have to double up several times to get back in the game.
Realistically, you doubt that you have much of a chance.
That's depressing, and any decision you make when you
are sad is difficult.

One mistake and you are out. There is little margin for
error, and it is hard to pull the trigger in such a
high-stakes situation.

Applying some basic statistical principles to the decision might
help make you feel better. At least you'll have some
nonemotional guidelines to follow. When you lose (and you still
probably will; you're short-stacked, after all), now you can blame
me, or the fates, and not yourself.

Recognizing a Short-Stacked Situation

In tournament settings, at some point you often will have so few
chips that you will run out soon. Unless you bet and win soon,
you will be blinded out: the cost of the mandatory bets will bleed
you dry.

How few chips must you have to be short-stacked? Even if we
define short-stacked as having some multiple of the big blind
(the larger of two forced bets that you must make on a rotating
basis), how many of those big blinds you need is a matter of
style, and there is no single correct number. Here are some
different perspectives on how many chips you must have in front
of you to consider yourself short-stacked.
Twelve times the big blind or less

Though you could play quite a while longer without running out of
chips, you will want to bet on any decent hand. You hope to win
some blinds here. The more blinds you win, the longer you can
wait for killer hands. If you are raised, at least consider
responding with an all-in.

Players who start to think of themselves as short-stacked in
this position wish to go all-in now on a good hand, rather than
being forced to go all-in on a mediocre hand later on. Another
advantage of starting to take risks is that an announcement of
"all-in" will still pull some weight here. You will have enough
chips to make someone think twice before they call you. Later
on, your miserable little stack won't be enough to push anyone
around.

Choose your opponent wisely, if you can,
when you go all-in and want a fold in
response. Your raise of all-in against
another small stack will be much more
powerful than the same tactic against a
monster stack. By the same token, if you
want a call, don't hesitate to go all-in
against players with tons of chips. They
will be more than happy to double you up.
Eight times the big blind or less

In any position, whether you are on the button, in the big blind,
or the first to bet, consider announcing all-in with any top-10
hand. You still have enough chips here to scare off some players,
especially those with similarly sized stacks.

You are starting to get low enough, though, that you really want
to be called. If you can play some low pairs cheaply, try it, but
bail out if you don't get three of a kind on the flop. You need to
keep as many big blinds as you can to coast on until you get
that all-in opportunity.

Here are the 10 hands that are the most likely to double you up:

A pair of Aces, Kings, Queens, Jacks, or 10s

Ace-King, Ace-Queen, Ace-Jack, or King-Queen of the
same suit

Ace-King of different suits

Four times the big blind or less

At this point, you need to go all-in, even on hands that have a
more than 50 percent chance of losing. Purposefully making a
bad wager seems counterintuitive, but you are fighting against
the ever-shrinking base amount you hope to double up. If you
wait and wait until you have close to a sure thing, whatever
stack remains will have to be doubled a few extra times to get
you back.

A form of pot odds [Hack #37] kicks in at this point. If you pass
up a 25 percent chance of winning while waiting for a 50 percent
chance, you might be able to win only half as much when (and if)
you ever get to play the better hand. Definitely go all-in on any
pair, an Ace and anything else, any face card and a good kicker,
or suited connectors.

A good rule of thumb when you're very,
very short-stacked (i.e., your total chips
are fewer than four times the big blind) is
to bet it all as soon as you get a hand that
adds up to 18 or better. Kings count as 13,
Queens 12, Jacks 11, and the rest are
their face value. Aces count as 14, but
you are already going all-in with an
Ace-anything, so that doesn't matter.
Eighteen-point hands include 10-8, Jack-7,
Queen-6, and King-5.
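
The 18-point rule of thumb is easy to express as a sketch. The rank notation ("2" through "10", "J", "Q", "K", "A") and the function names are assumptions for this example; the code encodes only the point rule, not the other all-in hands listed above.

```python
# Point values for the picture cards and the Ace, as given in the rule of thumb.
FACE_POINTS = {"J": 11, "Q": 12, "K": 13, "A": 14}


def card_points(rank: str) -> int:
    """Kings 13, Queens 12, Jacks 11, Aces 14, everything else face value."""
    if rank in FACE_POINTS:
        return FACE_POINTS[rank]
    return int(rank)


def hand_points(rank1: str, rank2: str) -> int:
    """Total points for a two-card starting hand."""
    return card_points(rank1) + card_points(rank2)


def bet_it_all(rank1: str, rank2: str, threshold: int = 18) -> bool:
    """True when the hand adds up to the threshold or better."""
    return hand_points(rank1, rank2) >= threshold


for hand in [("10", "8"), ("J", "7"), ("Q", "6"), ("K", "5"), ("9", "8")]:
    print(hand, hand_points(*hand), bet_it_all(*hand))
```

The first four hands land exactly on 18; 9-8 falls one point short.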

Statistical Decision Making


The statistical question that determines when you should make
your move, whether it is announcing all-in or, at least, making a
decision to be pot committed (so many chips in the pot that you
will go all-in if pushed), is, "Am I likely to get a better hand before
I run out of chips?"

I'm going to group 50 decent, playable starting Texas Hold 'Em
poker hands, hands that give you a chance to win against a
small number of opponents. I'll be using three groupings, shown
in Tables 4-6, 4-7, and 4-8. While different poker experts might
quibble a bit about whether a given hand is good or just okay,
most would agree that these hands are all at least playable and
should be considered when short-stacked.

By the way, these hands are not
necessarily in order of quality within each
grouping.

Table 4-6. Ten great starting hands

Pairs          Same suit     Different suits
Ace-Ace        Ace-King      Ace-King
King-King      Ace-Queen
Queen-Queen    Ace-Jack
Jack-Jack      King-Queen
10-10

Table 4-7. Fifteen good starting hands

Pairs    Same suit     Different suits
9-9      Ace-Ten       Ace-Queen
8-8      King-Jack     Ace-Jack
7-7      King-10       King-Queen
         Queen-Jack
         Queen-10
         Jack-10
         Jack-9
         10-9
         9-8

Table 4-8. Twenty-five okay starting hands

Pairs    Same suit    Different suits
6-6      Ace-9        Ace-10
5-5      Ace-8        King-Jack
         Ace-7        Queen-Jack
         Ace-6        King-10
         Ace-5        Queen-10
         Ace-4        Jack-10
         Ace-3
         Ace-2
         King-9
         Queen-9
         10-8
         9-7
         8-7
         8-6
         7-6
         6-5
         5-4

When you are short-stacked and the blinds and antes are
coming due, you know you have a certain number of hands left
before you have to make a move. Table 4-9 shows the
probability that you will be dealt a great, good, or okay hand over
the next certain number of deals.

Table 4-9. Chance of getting a playable hand

Hand quality     Next hand    In 5 deals   In 10 deals  In 15 deals  In 20 deals
Great            4 percent    20 percent   36 percent   49 percent   59 percent
Good             7 percent    29 percent   50 percent   65 percent   75 percent
Okay             11 percent   46 percent   70 percent   84 percent   91 percent
Okay or better   22 percent   72 percent   92 percent   98 percent   99 percent

I calculated the probabilities for Table 4-9
by first figuring the probabilities for any
specific pair (you are just as likely to get a
pair of Aces as a pair of 2s): .0045. I then
figured the probabilities for getting any two
specific different cards that are the same
suit (.003), and the chances of getting any
two specific different cards that are not the
same suit (.009). Then, for each
category (great, good, or okay), I multiplied
the appropriate probability by the number
of pairs, unpaired suited hands, and so on,
in that category. I then calculated the
chance of one of these hands not hitting
across the given number of opportunities
and subtracted that value from 1 to get the
values for each cell in the table.
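
The procedure just described can be sketched in Python. The hand counts per category are tallied from the three starting-hand tables; the names are invented for this example, and because the code uses unrounded probabilities, a cell can differ by a percentage point from a table built on the rounded figures (.0045, .003, .009).

```python
# Probability of being dealt one specific starting hand.
P_PAIR = (4 / 52) * (3 / 51)       # a specific pair, e.g. Ace-Ace
P_SUITED = 2 * 4 / (52 * 51)       # two specific ranks, same suit
P_OFFSUIT = 2 * 12 / (52 * 51)     # two specific ranks, different suits

# (pairs, suited hands, offsuit hands) in each category.
CATEGORIES = {
    "great": (5, 4, 1),
    "good": (3, 9, 3),
    "okay": (2, 17, 6),
}


def p_next_hand(category: str) -> float:
    """Chance that the very next deal is a hand from this category."""
    pairs, suited, offsuit = CATEGORIES[category]
    return pairs * P_PAIR + suited * P_SUITED + offsuit * P_OFFSUIT


def p_within(p_single: float, deals: int) -> float:
    """Chance of at least one hit in `deals` independent deals."""
    return 1 - (1 - p_single) ** deals


for name in CATEGORIES:
    p = p_next_hand(name)
    row = [p_within(p, n) for n in (1, 5, 10, 15, 20)]
    print(name, [f"{value:.0%}" for value in row])
```

Summing the three single-deal probabilities before calling `p_within` gives the "okay or better" row.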

Here is how to use Table 4-9. Imagine you are short-stacked
and have just been dealt a good hand. If you think you really
have to go all-in sometime during the next five hands, there is
only a 20 percent chance that you will be dealt a better hand.
You should probably stake everything on this good hand.

If you can hang on for 20 more deals, there is a greater than 50
percent chance that you will get a gangbuster hand, so if you
want to be conservative, you can lay these cards down for now.
More commonly, short-stacked players consider going all-in with
a hand that is not even a top-50 hand, something like King-8
unsuited, for example. Using the probabilities in Table 4-9, you
might safely lay it down and hope for a better hand in the next
five hands. There is a 72 percent chance you will get it.

Finally, imagine that you have just a few hands left because the
blinds are shrinking your stack down to nothing. You look down
and see a decent hand, an okay hand, such as 8-7 in the same
suit. Table 4-9 allows you to answer the big question: is it likely
that your very next hand will be better than this one? There is
about an 11 percent chance of getting a good or great hand next.
So, no, it is unlikely you will improve. Stake your future on this
hand.

Getting Your Mind Right

We talked earlier about why it is so emotionally difficult to play
when short-stacked. Here are some psychological tips to fight
the pain of being caught between a rock and a hard place:

Be realistic

In blackjack, when a player hits her 16 against the
dealer's 7, she knows she is likely to bust. She does it
anyway because the dealer is likely to have a 10-card
down, and it gives her the best chances in an almost
no-win situation. She takes pleasure in knowing she did all
she could to give herself the best chances of surviving.
The same thinking applies here: take pleasure in
knowing you gave yourself the best chances to come
back and win.

Enjoy the all-in experience

There is nothing more exciting than having it all on the
line. Because you had no real choice about going all-in,
just relax and enjoy it the best you can. No player will
chide you about doing "such a stupid thing," because
you just did the smartest thing you could.

Take control

To avoid feeling forced to do something you don't want to
do, start your comeback attempt before you have to.
Play to avoid the short-stacked situation by starting to
make your moves when you still have 10 to 12 times the
big blind in chips. You have a lot more choices at this
point than you will have later on, and so you can play
with more subtlety, basing your bet on position,
opponents, tells, and so on. The smaller your stack gets,
the less power you have to control your own destiny.
Hack 39. Lose Slowly at Roulette

Roulette has so many pretty colors and shiny objects that
kittens love it. Plus, you'll look pretty cool playing it. But in the
long run, you'll lose money, and with your cat allergy and all....

Like most games in a casino, roulette is a game of pure chance.
No one has any skill when it comes to predicting which of the 37
(European-style) or 38 (U.S.-style) partitioned sections the tiny
ball will end up in. The best a player can do is know the odds,
manage his money, and assume going in that he will lose.

Of course, he might get lucky and win some money, which would
be dandy, but the Law of Big Numbers [Hack #2] must be
obeyed. In the long run, he is most likely to have less money
than if he had never played at all. In fact, if he plays an infinite
amount of time, he is guaranteed to lose money. (Most roulette
players play for a period of time somewhat less than infinity, of
course.) To extend your amount of playing time, there is
important statistical information you should know about this
game with the spinning wheel, the orbiting ball, and the black and
red layout.

Basic Wagers

Figure 4-1 shows the betting layout of a typical roulette game.
This is an American-style layout, which means there are two
green numbers, 0 and 00, which do not pay off any bets on red
and black or odd and even. European-style roulette wheels have
only one green number, 0, which cuts in half the house advantage
compared to U.S. casinos.

Figure 4-1. Typical roulette betting layout

Players can bet in a large variety of ways, which is one reason
roulette is so popular in casinos. For example, a player could
place one chip over a single number, touching two numbers, on a
color, adjacent to a column of 12 numbers, and so on. Like any
other probability question, the chance of randomly getting the
desired outcome is a function of the number of desired outcomes
(winning) divided by the total number of outcomes.

There are 38 spaces on the wheel and, because all 38 possible
outcomes are equally likely, the calculations are fairly
straightforward. Table 4-10 shows the types of bets players can
make, the information necessary to calculate the odds of winning
for a single spin of the wheel and a one-dollar bet, the actual
amounts the casino pays out, and the house advantage.

Table 4-10. Statistics of roulette for e...

Type of wager    Number of winning outcomes   Number of losing outcomes   Odds                     Casino pays
Single number    1                            37                          37 to 1                  $35
Two numbers      2                            36                          36 to 2 or 18 to 1       $17
Single color     18                           20                          20 to 18 or 1.11 to 1    $1
Even or odd      18                           20                          20 to 18 or 1.11 to 1    $1
Twelve numbers   12                           26                          26 to 12 or 2.17 to 1    $2

The house advantage is figured by first determining what the
casino should pay back for each dollar bet if there were no
advantage to the casino. The fair payback would be to give the
winner an amount of money equal to the risk taken. The amount
of risk taken is, essentially, the number of possible losing
outcomes for each winning outcome. The actual amount paid to
the winner is then subtracted from the amount that should be
paid if there were no house advantage. These "extra" dollars
that the house keeps are divided by the ratio of total outcomes
to winning outcomes. If there are no extra dollars, the game is
evenly matched between player and casino and the house edge
is 0 percent.

If you study the statistics of roulette in Table 4-10, a couple of
conclusions are apparent. First, the casino makes its profit by
pretending that there are only 36 numbers on a roulette wheel
(i.e., only 36 possible outcomes) and pays out using that
pretend distribution.

Second, regardless of the type of wager that is made at a
roulette wheel, the house edge is a constant 5.26 percent. This
is true except for one obscure wager, which is allowed at most
casinos. Players are often allowed to bet on the two zeros and
their adjacent numbers, 1, 2, and 3, for a total of five numbers.
This is done by placing a chip to the side, touching both the 0
and the 1. I'd tell you more about checking with the person who
spins the wheel to make sure they take this wager, and so on,
except that this is the worst bet at the roulette table and no
statistician would advise it. Casinos that allow this bet pay out
as if it were a bet on six numbers. So, the casino's usual edge of
5.26 percent is even larger here: 7.89 percent, as shown in
Table 4-11.

Table 4-11. Statistics for betting on five numbers in rou...

Type of wager    Number of winning outcomes   Number of losing outcomes   Odds                  Casino pays
Five numbers     5                            33                          33 to 5 or 6.6 to 1   $6
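
The house edge for each of these wagers can be checked with a short sketch. It computes the expected loss per dollar bet, which works out to the same figure as the "extra dollars" method described above; the function name and the `total` parameter are assumptions for this example.

```python
from fractions import Fraction


def house_edge(winning: int, payout: int, total: int = 38) -> Fraction:
    """Expected loss per dollar bet.

    winning: number of winning outcomes for the wager
    payout:  dollars the casino pays per dollar bet on a win
    total:   number of spaces on the wheel (38 U.S.-style, 37 European-style)
    """
    p_win = Fraction(winning, total)
    # Lose one dollar with probability (1 - p_win), win `payout` with p_win.
    return (1 - p_win) - p_win * payout


wagers = {
    "Single number": (1, 35),
    "Two numbers": (2, 17),
    "Single color": (18, 1),
    "Even or odd": (18, 1),
    "Twelve numbers": (12, 2),
    "Five numbers": (5, 6),
}

for name, (winning, payout) in wagers.items():
    print(f"{name}: {float(house_edge(winning, payout)):.2%}")

print(f"European single number: {float(house_edge(1, 35, total=37)):.2%}")
```

Every standard wager comes out at 2/38, the five-number bet at 3/38, and the single-zero wheel at 1/37, roughly half the U.S. figure.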

Why It Works

Roulette's popularity is based partly on the fact that so many
different types of wagers are possible. A gambler with a lot of
chips can spread them out all over the table, with a wide variety
of different bets on different numbers and combinations of
numbers. As long as she avoids the worst bet at the table (five
numbers), she can rest assured that the advantage to the house
will be the same honest 5.26 percent for each of her bets. It is
one less thing for the gambler to worry about.

The fact that there is such a large variety of bets that can be
placed on a single layout is no lucky happenstance, though. The
decision to use 36 numbers was a wise one, and no doubt it was
made all those years ago because of the large number of factors
that go into 36. Thirty-six can be evenly divided by 1, of course,
but also by 2, 3, 4, 6, 9, 12, and 18, making so many simple
bets possible.
Hack 40. Play in the Black in Blackjack

Perhaps the most potentially profitable application of statistics
hacking is at the blackjack table.

In blackjack, the object of the game is to get a hand of cards
that is closer to totaling 21 points (without going over) than the
dealer's cards. It's a simple game, really. You start with two
cards and can ask for as many more as you would like. Cards are
worth their face value, with the exception of face cards, which
are worth 10 points, and Aces, which can be either 1 or 11.

You lose if you go over 21 or if the dealer is closer than you
(without going over). The bets are even money, with the
exception of getting a blackjack: two cards that add up to 21.
Typically, you get paid 3-to-2 for hitting a blackjack. The dealer
has an advantage in that she doesn't have to act until after you.
If you bust (go over 21), she wins automatically.

Statisticians can play this game wisely by using two sources of
information: the dealer's face-up card and the knowledge of
cards previously dealt. Basic strategies based on probability will
let smart players play almost even against the house without
having to pay much attention or learn complicated systems.
Methods of taking into account previously dealt cards are
collectively called counting cards, and using these methods
allows players to have a statistical advantage over the house.

U.S. courts have ruled that card counting
is legal in casinos, though casinos wish
you would not do it. If they decide that you
are counting cards, they might ask you to
leave that game and play some other
game, or they might ban you from the
casino entirely. It is their right to do this.

Basic Strategy

First things first. Table 4-12 presents the proper basic
blackjack play, depending on the two-card hand you are dealt
and the dealer's up card. Most casinos allow you to split your
hand (take a pair and split it into two different hands) and double
down (double your bet in exchange for the limitation of receiving
just one more card). Whether you should stay, take a card, split,
or double down depends on the likelihood that you will improve or
hurt your hand and the likelihood that the dealer will bust.

Table 4-12. Basic blackjack strategy against dealer's up card

Your hand         Hit           Stay        Double down   Split
5-8               Always
9                 2, 7-A                    3-6
10-11             10 or A                   2-9
12                2, 3, 7-A     4-6
13-16             7-A           2-6
17-20                           Always
2, 2              8-A                                     2-7
3, 3              2, 8-A                                  3-7
4, 4              2-5, 7-A                                6
5, 5              10 or A                   2-9
6, 6              7-A                                     2-6
7, 7              8-A                                     2-7
8, 8                                                      Always
9, 9                            7, 10, A                  2-6, 8, 9
10, 10                          Always
A, A                                                      Always
A, 2              2-5, 7-A                  6
A, 3 or A, 4      2-4, 7-A                  5 or 6
A, 5              2 or 3, 7-A               4-6
A, 6              2, 7-A                    3-6
A, 7              9-A           2, 7, 8     3-6
A, 8 or 9 or 10                 Always

In Table 4-12, "Your hand" is the two-card
hand you have been dealt. For example,
"5-8" means your two cards total to a 5,
6, 7, or 8. "A" means Ace. A blank table
cell indicates that you should never
choose this option, or, in the case of
splitting, that it is not even allowed.

The remaining four columns present the typical options and what
the dealer's card should be for you to choose each option. As
you can see, for most hands there are only a couple of options
that make any statistical sense to choose. The table shows the
best move, but not all casinos allow you to double down on just
any hand. Most, however, allow you to split any matching pair of
cards.

Why It Works

The probabilities associated with the decisions in Table 4-12
are generated from a few central rules:

The dealer is required to hit until she makes it to 17 or
higher.

If you bust, you lose.

If the dealer busts and you have not, you win.

The primary strategy, then, is to not risk busting if the dealer is
likely to bust. Conversely, if the dealer is likely to have a nice
hand, such as 20, you should try to improve your hand. The
option that gives you the greatest chance of winning is the one
indicated in Table 4-12.

The recommendations presented here are
based on a variety of commonly available
tables that have calculated the
probabilities of certain outcomes
occurring. The statistics have either been
generated mathematically or have been
produced by simulating millions of
blackjack hands with a computer.

Here's a simple example of how the probabilities battle each
other when the dealer has a 6 showing. The dealer could have a
10 down. This is actually the most likely possibility, since face
cards count as 10. If there is a 10 down, great, because if the
dealer starts with a 16, she will bust about 62 percent of the
time (as will you if you hit a 16).

Since eight different cards will bust a 16 (6, 7, 8, 9, 10, Jack,
Queen, and King), the calculations look like this:

8/13 = .615

Of course, even though the single best guess is that the dealer
has a 10 down, there is actually a better chance that the dealer
does not have a 10 down. All the other possibilities (9/13) add
up to more than the chances of a 10 (4/13).

Any card other than an Ace will result in the dealer hitting. And
the chances of that next card breaking the dealer depend on
the probabilities associated with the starting hand the dealer
actually has. Put it all together and the dealer does not have a
62 percent chance of busting with a 6 showing. The actual
frequency with which a dealer busts with a 6 showing is closer to
42 percent, meaning there is a 58 percent chance she will not
bust.

Now, imagine that you have a 16 against the dealer's up card
of 6. Your chance of busting when you take a card is 62 percent.
Compare that 62 percent chance of an immediate loss to the
dealer's chance of beating a 16, which is 58 percent. Because
there is a greater chance that you will lose by hitting than that
you will lose by not hitting (62 is greater than 58), you should
stay against the 6, as Table 4-12 indicates.
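
The bust calculation and the comparison can be sketched as follows. Like the text, the sketch treats each of the 13 ranks as equally likely on the next card; the 58 percent figure for the dealer not busting is taken from the text rather than computed, and the names are invented for this example.

```python
# Point value of each of the 13 ranks, with the Ace counted as 1.
RANK_VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def bust_probability(hard_total: int) -> float:
    """Chance that one more card pushes a hard total over 21."""
    busting_ranks = sum(1 for value in RANK_VALUES if hard_total + value > 21)
    return busting_ranks / len(RANK_VALUES)


lose_by_hitting = bust_probability(16)   # immediate loss if you bust
lose_by_staying = 0.58                   # dealer with a 6 showing does not bust (from the text)

decision = "stay" if lose_by_hitting > lose_by_staying else "hit"
print(f"bust on 16: {lose_by_hitting:.1%}, decision: {decision}")
```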

All the branching possibilities for all the different permutations
of starting hands versus dealers' up cards result in the
recommendations in Table 4-12.
Sucker Bet

Many casinos offer a chance for you to buy insurance if
the dealer's up card is an Ace. Insurance means that
you wager up to half your original bet, and if the dealer
has a blackjack (a 10 or face card as the down card),
you win that side bet but lose your original wager
(unless you, too, have a blackjack, in which case it's a
tie and you get your wager back).

The chances of the dealer having a 10 underneath are
4/13, or 31 percent. You will lose your insurance money
much more often than you will win it. Unless you are
counting cards, never take insurance. Yes, even if you
have a blackjack.
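
A rough expected-value check shows why insurance is a sucker bet. The sidebar does not state the payout, so this sketch assumes the usual casino rule that insurance pays 2-to-1; that payout and the function name are assumptions, not something taken from the text.

```python
from fractions import Fraction


def insurance_ev(p_ten: Fraction = Fraction(4, 13), payout: int = 2) -> Fraction:
    """Expected result per dollar of insurance, assuming a 2-to-1 payout."""
    return p_ten * payout - (1 - p_ten)


print(f"Expected result per insurance dollar: {float(insurance_ev()):+.3f}")
print(f"Break-even when a third of the cards are 10s: {insurance_ev(Fraction(1, 3))}")
```

Under that assumption, the bet only breaks even when at least a third of the remaining cards are 10-cards, which is something only a card counter would know.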

Simple Card-Counting Methods

The basic strategies described earlier in this hack assume that
you have no idea what cards still remain in the deck. They
assume that the original distribution of cards still remains for a
single deck, or six decks, or whatever number of decks is used
in a particular game. The moment any cards have been dealt,
however, the actual odds change, and, if you know the new odds,
you might choose different options for how you play your hand.

Elaborate and very sound (statistically speaking) methods exist
for keeping track of cards previously dealt. If you are serious
about learning these techniques and dedicating yourself to the
life of a card counter, more power to you. I don't have the space
to offer a complete, comprehensive system here, though. For the
rest of us, who would like to dabble a bit in ways to increase our
odds, there are a few counting procedures that will improve your
chances without you having to work particularly hard or
memorize many charts and tables.

The basic method for improving your chances against the casino
is to increase your wager when there is a better chance of
winning. The wager must be placed before you get to see your
cards, so you need to know ahead of time when your odds have
improved. The following three methods for knowing when to
increase your bet are presented in order of complexity.

Counting Aces

You get even money for all wins, except when you are dealt a
blackjack. You get a 3-to-2 payout (e.g., $15 for every $10 bet)
when a blackjack comes your way. Consequently, when there is a
better-than-average chance of getting a blackjack, you would
like to have a larger-than-average wager on the line.

The chances of getting a blackjack, all things being equal, are
calculated by summing two probabilities:

Getting a 10-card first and then an Ace

4/13 x 4/51 = .0241

Getting an Ace first and then a 10-card

1/13 x 16/51 = .0241

Add the two probabilities together, and you get a .0482 (about 5
percent) probability of being dealt a natural 21.

Obviously, you can't get a blackjack unless there are Aces in
the deck. When they are gone, you have no chance for a
blackjack. When there are relatively few of them, you have less
than the normal chance of a blackjack. With one deck, a
previously dealt Ace (leaving 3 Aces and 16 10-cards among the
51 remaining cards) lowers your chances of hitting a blackjack
to .0376 (about 3.8 percent). Dealing a quarter of the deck with
no Aces showing up increases your chances of a blackjack to
about 6.5 percent.

Quick tip for the budding card counter:
don't move your lips.

Counting Aces and 10s

Of course, just as you need an Ace to hit a blackjack, you also
need a 10-card, such as a 10, Jack, Queen, or King. While you
are counting Aces, you could also count how many 10-cards go
by.

There is a total of 20 Aces and 10-cards, which is about 38
percent of the total number of cards. When half the deck is gone,
half of those cards should have been shown. If fewer than 10 of
these key cards have been dealt, your chances of a blackjack
have increased. With all 20 still remaining halfway through a
deck, your chances of seeing a blackjack in front of you
skyrocket to 19.7 percent.
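
The blackjack probabilities in the two sections above can be
checked with a few lines of Python. This is only an illustrative
sketch, not something from the original text: the function name
blackjack_probability and its parameters are assumptions, and the
quarter-deck case assumes that 4 of the 13 cards already dealt
were 10-cards (their proportional share).

```python
from fractions import Fraction


def blackjack_probability(aces, tens, cards_left):
    """Chance that the next two cards are an Ace and a 10-card, in either order."""
    ace_then_ten = Fraction(aces, cards_left) * Fraction(tens, cards_left - 1)
    ten_then_ace = Fraction(tens, cards_left) * Fraction(aces, cards_left - 1)
    return ace_then_ten + ten_then_ace


# Fresh single deck: 4 Aces, 16 10-cards, 52 cards.
fresh_deck = blackjack_probability(4, 16, 52)

# A quarter of the deck dealt with no Aces seen (assumes 4 of the 13 were 10-cards).
quarter_dealt_no_aces = blackjack_probability(4, 12, 39)

# Half the deck dealt with all 20 key cards still remaining.
half_dealt_all_key_cards = blackjack_probability(4, 16, 26)

for label, p in [
    ("fresh deck", fresh_deck),
    ("quarter dealt, no Aces seen", quarter_dealt_no_aces),
    ("half dealt, all 20 key cards left", half_dealt_all_key_cards),
]:
    print(f"{label}: {float(p):.4f}")
```

Using exact fractions avoids the small rounding differences that
appear when each term is rounded to four decimals before adding.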

Going by the point system

Because you want proportionately more high cards and
proportionately fewer low cards when you play, a simple point
system can be used to keep a running "count" of the deck or
decks. This requires more mental energy and concentration than
simply counting Aces or counting Aces, 10s, and face cards, but
it provides a more precise index of when a deck is loaded with
those magic high cards.

Table 4-13 shows the point value of each card in a deck under
this point system.

Table 4-13. Simple card-counting point system

Card                          Point value
10, Jack, Queen, King, Ace    -1
7, 8, 9                       0
2, 3, 4, 5, 6                 +1

A new deck begins with a count of 0, because there are an equal
number of -1 cards and +1 cards dealt in the deck. Seeing high
cards is bad, because your chances of blackjacks have dropped,
so you lose a point in your count. Spotting low cards is good,
because there are now proportionately more high cards in the
deck, so you gain a point there.

You can learn to count more quickly and easily by learning to
rapidly recognize the total point value of common pairs of cards.
Pairs of cards with both a high card and a low card cancel each
other out, so you can quickly process and ignore those sorts of
hands. Pairs that are low-low are worth big points (+2), and
pairs that are high-high are trouble, meaning you can subtract 2
points for each of these disappointing combinations.

You will only occasionally see runs of cards that dramatically
change the count in the good direction. The count seldom gets
very far from 0. For example, with a single new deck, the first six
cards will be low less than 1 percent of the time, and the first ten
cards will be low about 1/1000 of 1 percent of the time.
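
How rare those runs are can be sketched with a quick combinations
count. The snippet below is an illustration only; the function
name prob_first_n_all_low is an assumption, and it takes "low" to
mean the twenty +1 cards (2 through 6) in a fresh 52-card deck.

```python
from math import comb


def prob_first_n_all_low(n, low_cards=20, deck_size=52):
    """Chance that the first n cards off a fresh deck are all low (+1) cards."""
    # Ways to choose n cards from the low cards, over ways to choose n from the deck.
    return comb(low_cards, n) / comb(deck_size, n)


six_low = prob_first_n_all_low(6)
ten_low = prob_first_n_all_low(10)

print(f"first six cards all low: {six_low:.4%}")
print(f"first ten cards all low: {ten_low:.5%}")
```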

The count doesn't have to be very high, though, to improve your
odds enough to surpass the almost even chance you have just
following basic strategy. With one deck, counts of +2 are large
enough to meaningfully improve your chances of winning. With
more than one deck, divide your count by the number of
decks; this is a good estimation of the true count.
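
As a rough sketch of the bookkeeping, the point system and the
divide-by-decks adjustment might look like this in Python. The
names point_value, running_count, and true_count, and the use of
strings such as "K" for card ranks, are illustrative assumptions.

```python
LOW_CARDS = {"2", "3", "4", "5", "6"}      # worth +1 when seen
HIGH_CARDS = {"10", "J", "Q", "K", "A"}    # worth -1 when seen


def point_value(rank):
    """Point value of a single card under the simple point system."""
    if rank in LOW_CARDS:
        return 1
    if rank in HIGH_CARDS:
        return -1
    return 0  # 7, 8, and 9 are neutral


def running_count(cards_seen):
    """Sum of the point values of every card seen so far."""
    return sum(point_value(rank) for rank in cards_seen)


def true_count(cards_seen, decks=1):
    """Running count divided by the number of decks, as described in the text."""
    return running_count(cards_seen) / decks


seen = ["2", "5", "K", "7", "3", "4"]
print(running_count(seen))          # low, low, high, neutral, low, low
print(true_count(seen, decks=2))
```

A full deck run through running_count comes back to 0, which is
why a fresh deck starts at a count of 0.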

Sometimes you will see very high counts, even with single
decks. When you see that sort of string of luck, don't hesitate to
raise your bet. If you get very comfortable with the point system
and have read more about such systems, you can even begin to
change the decisions you make when hitting or standing or
splitting or doubling down.

Even if you just use these simple systems, you will improve your
chances of winning money at the blackjack tables. Remember,
though, that even with these sorts of systems, there are other
pitfalls awaiting you in the casino, so be sure to always follow
other good gambling advice [Hack #35] as well.

Hack 41. Play Smart When You Play the Lottery

Your odds of winning a big prize in a giant lottery are really,
really small, no matter how you slice it. You do have some
control over your fate, however. Here are some ways to give
yourself an advantage (albeit slight) over all the other lotto
players who haven't bought this book.

In October of 2005, the biggest Powerball lottery winner ever
was crowned and awarded $340 million. It wasn't me. I don't
play the lottery because, as a statistician, I know that playing
only slightly increases my chances of winning. It's not worth it
to me.

Of course, if I don't play, I can't win. Buying a lottery ticket isn't
necessarily a bad bet, and if you are going to play, there are a
few things you can do to increase the amount of money you will
win (probably) and increase your chances of winning (possibly).
Whoever bought the winning $340 million ticket in Jacksonville,
Oregon, that October day likely followed a few of these winning
strategies, and you should too.

Because Powerball is a lottery game played in most U.S. states,
we will use it as our example. This hack will work for any large
lottery, though.

Powerball Odds

Powerball, like most lotteries, asks players to choose a set of
numbers. Random numbers are then drawn, and if you match
some or all of the numbers, you win money! To win the biggest
prizes, you have to match lots of numbers. Because so many
people play Powerball, many tickets are sold, and the prize
money can get huge.

Of course, correctly picking all the winning numbers is hard to
do, but it's what you need to do to win the jackpot. In Powerball,
you choose five numbers and then a sixth number: the red
powerball. The regular white numbers can range from 1 to 55,
and the powerball can range from 1 to 42. Table 4-14 shows the
different combinations of matches that result in a prize, the
amount of the prize, and the odds and probability of winning the
prize.

Table 4-14. Powerball payoffs

Match                              Cash          Odds                Percentage
Powerball only                     $3            1 in 69             1.4 percent
1 white ball and the powerball     $4            1 in 127            0.8 percent
3 white balls                      $7            1 in 291            0.3 percent
2 white balls and the powerball    $7            1 in 745            0.1 percent
3 white balls and the powerball    $100          1 in 11,927         0.008 percent
4 white balls                      $100          1 in 14,254         0.007 percent
4 white balls and the powerball    $10,000       1 in 584,432        0.0002 percent
5 white balls                      $200,000      1 in 3,563,609      0.00003 percent
5 white balls and the powerball    Grand prize   1 in 146,107,962    0.0000006 percent

Powerball Payoff

Armed with all the wisdom you likely now have as a statistician
(unless this is the first hack you turned to in this book), you
might have already made a few interesting observations about
this payoff schedule.

Easiest prize

The easiest prize to win is the powerball-only match, and even
then there are slim chances of winning. If you match the
powerball (and no other numbers), you win $3. The chances of
winning this prize are about 1 in 69.

This is not a good bet by any reasonable standard. It costs a
dollar to buy a ticket, to play one time, and the expected payout
schedule is $3 for every 69 tickets you buy. So, on average,
after 69 plays you will have won $3 and spent $69.

Actually, your payoff will be a little better than that. The odds
shown in Table 4-14 are for making a specific match and not
doing any better than that. Some proportion of the time when you
match the powerball, you will also match a white ball and your
payoff will be $4, not $3. Choosing five white ball numbers and
matching at least 1 will happen 39 percent of the time.

So, after having matched the powerball, you have a little better
than a third chance of hitting at least one white ball as well. Even
so, your expected payoff is about $3.39 for every $69 you throw
down that rat hole (I mean, spend on the lottery), which is still
not a good bet.

Powerball only

The odds for the powerball-only match don't seem quite right. I
said there were 42 different numbers to choose from for the
powerball, so shouldn't there be 1 out of 42 chances to match it,
not 1 in 69?

Yes, but remember this shows the chances of hitting that prize
only and not doing better (by matching some other balls). Your
odds of winning something, anything, if you combine all the
winning combinations together are 1 in 37, about 3 percent. Still
not a good bet.
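
The odds in the Powerball payoff table can be reproduced by
counting combinations. The sketch below is an illustration under
the rules described in this hack (5 white balls from 55, 1 red
ball from 42); the names prob_exact_match and PRIZE_TIERS are
assumptions, not anything published by the lottery.

```python
from math import comb

WHITE_POOL, WHITE_PICKS, RED_POOL = 55, 5, 42


def prob_exact_match(white_matched, red_matched):
    """Chance of matching exactly this many white balls, with or without the red ball."""
    # Choose which of the 5 winning white numbers you hit, and fill the
    # rest of your ticket from the 50 non-winning white numbers.
    white_ways = comb(WHITE_PICKS, white_matched) * comb(
        WHITE_POOL - WHITE_PICKS, WHITE_PICKS - white_matched
    )
    p_white = white_ways / comb(WHITE_POOL, WHITE_PICKS)
    p_red = 1 / RED_POOL if red_matched else (RED_POOL - 1) / RED_POOL
    return p_white * p_red


# (white balls matched, powerball matched) for each prize in the payoff table.
PRIZE_TIERS = [
    (0, True), (1, True), (3, False), (2, True), (3, True),
    (4, False), (4, True), (5, False), (5, True),
]

tier_probs = {tier: prob_exact_match(*tier) for tier in PRIZE_TIERS}
p_any_prize = sum(tier_probs.values())

for (white, red), p in tier_probs.items():
    label = f"{white} white" + (" + powerball" if red else "")
    print(f"{label:<22} 1 in {1 / p:,.0f}")
print(f"any prize at all       1 in {1 / p_any_prize:.1f}")
```

Because each tier means "exactly this match and nothing better,"
the tiers do not overlap, so adding them gives the chance of
winning anything.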

Grand prize

The odds for the grand prize don't seem quite right either. (Okay,
okay, I don't really expect you to have "noticed" that. I didn't
either until I did a few calculations.)

If there are 5 draws from the numbers of 1 to 55 (the white balls)
and 1 draw from the numbers 1 to 42 (the red ball), then a quick
calculation would estimate the number of possibilities as:

    55 x 55 x 55 x 55 x 55 x 42 = 21,137,943,750

In other words, the odds are 1 out of 21,137,943,750. Or, if you
were thinking a little more clearly, realizing that the number of
balls gets smaller as they are drawn, you might speedily
calculate the number of possible outcomes as:

    55 x 54 x 53 x 52 x 51 x 42 = 17,532,955,440

But the odds as shown are somewhat better than 1 out of 17.5
billion. The first time I calculated the odds, I didn't keep in mind
that the order doesn't matter, so any of the remaining chosen
numbers could come up at any time. Hence, here's the correct
series of calculations:

    5/55 x 4/54 x 3/53 x 2/52 x 1/51 x 1/42 = 1/146,107,962
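
The three ways of counting described above can be compared side
by side. This is a small illustrative check; the variable names
are assumptions chosen for readability.

```python
from math import comb, perm

# Naive count: pretend each white ball is drawn from all 55 numbers.
with_repeats = 55**5 * 42

# Better: balls are not replaced, but the order of the draw is still counted.
ordered_no_repeats = perm(55, 5) * 42

# Correct: no replacement, and the order of the five white balls is ignored.
unordered = comb(55, 5) * 42

print(f"{with_repeats:,}")
print(f"{ordered_no_repeats:,}")
print(f"{unordered:,}")
```

The ordered count is exactly 120 times the unordered one, because
five white balls can be arranged in 5 x 4 x 3 x 2 x 1 = 120
different orders.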
Winning Powerball

OK, Mr. Big Shot Stats Guy (you are probably thinking), you're
going to tell us that we should never play the lottery because,
statistically, the odds will never be in our favor. Actually, using
the criterion of a fair payout, there is one time to play and to buy
as many tickets as you can afford.

In the case of Powerball, you should play anytime the grand
prize increases to past $146,107,962 (or double that amount if
you want the lump sum payout). As soon as it hits $146,107,963,
buy, buy, buy! Because the chances of matching five white
balls and the one red ball are exactly one out of that big number,
from a statistical perspective, it is a good bet anytime your
payout is bigger than that big number.

For Powerball and its number of balls and their range of values,
146,107,962 is the magic number. The idea that your chances
of winning haven't changed but the payoff amount has increased
to a level where playing is worthwhile is similar to the concept of
pot odds in poker [Hack #37].

You can calculate the "magic number" for any lottery. Once the
payoff in that lottery gets above your magic number, you can
justify a ticket purchase. Use the "correct series" of
calculations in our example for Powerball as your mathematical
guide. Ask yourself how many numbers you must match and what
the range of possible numbers is. Remember to lower the number
you divide by one each time you "draw" out another ball or
number, unless numbers can repeat. If numbers can repeat, then
the denominator stays the same in your series of
multiplications.
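
Here is one way the magic-number recipe might be sketched for a
lottery in which numbers cannot repeat and order doesn't matter.
The function names, the bonus_pool parameter, and the 6-from-49
example game are hypothetical; the sketch also assumes the
break-even point is simply the ticket price divided by the chance
of winning, and it ignores smaller prizes and split jackpots.

```python
from fractions import Fraction


def jackpot_probability(picks, pool, bonus_pool=1):
    """Chance of matching every number when numbers cannot repeat."""
    p = Fraction(1, bonus_pool)  # the extra ball, if the game has one
    for drawn in range(picks):
        # One fewer of your numbers, and one fewer ball, after each draw.
        p *= Fraction(picks - drawn, pool - drawn)
    return p


def magic_number(picks, pool, bonus_pool=1, ticket_price=1):
    """Jackpot size at which a ticket becomes a break-even bet."""
    return Fraction(ticket_price) / jackpot_probability(picks, pool, bonus_pool)


print(magic_number(5, 55, bonus_pool=42))   # Powerball as described in this hack
print(magic_number(6, 49))                  # a hypothetical 6-from-49 game
```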

One important hint about deciding when to buy lottery tickets
has to do with determining the actual magic number, the prize
amount that triggers your buying spree. The amount that is
advertised as the jackpot is not, in fact, the jackpot. The
advertised "jackpot" is the amount that the winner would get
over a period of years in a regular series of smaller portions of
that amount. The real jackpot (the amount you should identify as
the payout in the gambling and statistical sense) is the amount
that you would get if you chose the one lump sum option. The one
lump sum is typically a little less than half of the advertised
jackpot amount.

So, if you have determined that your lottery has grown a jackpot
amount that says it is now statistically a good time to play, how
many tickets should you buy? Why not buy one of each? Why not
spend $146,107,962 and buy every possible combination? You
are guaranteed to win. If the jackpot is greater than that amount,
then you'll make money, guaranteed, right? Well, actually not.
Otherwise, I'd be rich and I would never share this hack with
you. Why wouldn't you be guaranteed to win? The problem is
that you might be forced to...wait for it...split the prize! Argh!
See the next section...

Don't Split the Prize


If you do win the lottery, you'd like to be the only winner, so in
addition to deciding when to play, there are a variety of
strategies that increase the likelihood that you'll be the only one
who picked your winning number.

First off, I'm working under the assumption that the winning
number is randomly chosen. I tend not to be a conspiracy
theorist, nor do I believe that God has the time or inclination to
affect the drawing of winning lottery numbers, so I'm not going to
list any strategy that would work only if there were no
randomness in the drawing of lottery numbers. Here are some
more reasonable tips to consider when picking your lottery
numbers:

Let the computer pick

Let the computer do the picking, or, at least, choose
random numbers yourself. Random numbers are less
likely to have meaning for any other player, so they are
less likely to have chosen them on their own tickets.
The Powerball people report that 70 percent of all
winning tickets are chosen randomly by the in-store
computer. (They also point out, in a bit of "We told you
that results are random" whimsy, that 70 percent of all
tickets purchased had numbers generated by the
computer.)

Don't pick dates

Do not pick numbers that could be dates. If possible,
avoid numbers lower than 32. Many players always play
important dates, such as birthdays and anniversaries,
prison release dates, and so on. If your winning number
could be someone's lucky date, that increases the
chance that you will have to split your winnings.

Stay away from well-known numbers

Do not pick numbers that are well known. In the big
October 2005 Powerball results, hundreds of players
chose numbers that matched the lottery ticket numbers
that play a large role in the popular fictional TV show
Lost. None of these folks won the big prize, but if they
had, they would have had to divide the millions into
hundreds of slices.

There is also a family of purely philosophical tips that have to
do with abstract theories of cause and effect and the nature of
reality. For example, some philosophers would say to pick last
week's winning numbers, because, while you might not know for
sure what is real and what can and cannot happen in this world,
you do know that, at least, it is possible for last week's numbers
to be this week's winning numbers. It happened before; it can
happen again.

Though your odds of winning a giant lottery prize are slim, you
can follow some statistical principles and do a few things to
actually control your own destiny. (The word for destiny in
Italian, by the way, is lotto.) Oh, and one more thing: buy your
ticket on the day of the drawing. If too much time passes
between your purchase and the announcement of the winning
numbers, you have a greater likelihood of being hit by lightning,
drowning in the bathtub, or being struck by a minivan than you
do of winning the jackpot. Timing is everything, and I'd hate for
you to miss out.

Hack 42. Play with Cards and Get Lucky

While it is true that Uncle Frank spends much of his time in
taverns using dice to win silly bar bets and smiling real
charming-like at the ladies, there is more to his life than that.
For instance, sometimes he uses playing cards instead of dice.

People, especially card players, and especially poker players,
feel pretty good about their level of understanding of the
likelihood that different combinations of cards will appear. Their
experience has taught them the relative rarity of pairs, three-of-
a-kind, flushes, and so on. Generalizing that intuitive knowledge
to playing-card questions outside of game situations is difficult,
however.

My stats-savvy uncle, Uncle Frank, knows this. Sometimes,
Uncle Frank uses his knowledge of statistics for evil, not good, I
am sorry to say, and he has perfected a group of bar bets using
decks of playing cards, which he claims helped pay his way
through graduate school. I'll share them with you only for the
purpose of demonstrating certain basic statistical principles. I
trust that you will use your newfound knowledge to entertain
others, fight crime, or win inexpensive nonalcoholic beverages.

Getting a Li'l Flush

In poker, a flush is five cards, all of the same suit. For my Uncle
Frank, though, there is seldom time to deal out complete poker
hands before he is asked to leave whatever establishment he is
in. Consequently, Uncle Frank often makes wagers based on
what he calls li'l flushes.

The bet

A little flush (oops, sorry; I mean li'l flush) is any two cards of
the same suit. Frank has a wager that he almost always wins
that has to do with finding two cards of the same suit in your
hand. Again, because of time constraints, his poker hands have
only four cards, not five.

The wager is that you deal me four cards out of a random deck,
and I will get at least two cards of the same suit. While this
might not seem too likely, it is actually much less likely that
there would be four cards of all different suits. I figure the
chance of getting four different suits in a four-card hand is about
11 percent. So, the likelihood of getting a li'l flush is about 89
percent!

Why it works

There are a variety of ways to calculate playing-card hand
probabilities. For this bar bet, I use a method that counts the
number of possible winning hand combinations and compares it
to the total number of hand combinations. This is the method
used in "Play with Dice and Get Lucky" [Hack #43].

To think about how often four cards would represent four different
suits, with no two-card flushes amongst them, count the number
of possible four-card hands. Imagine any first card (52
possibilities), imagine that card combined with any remaining
second card (52 x 51), add a third card (52 x 51 x 50) and a fourth
card (52 x 51 x 50 x 49), and you'll get a total of 6,497,400
different four-card hands.

Next, imagine the first two cards of a four-card hand. These will
match only .2353 of the time (12 cards of the same suit remain
out of a 51-card deck). So, about one-and-a-half million four-
card deals will find a flush in the first two cards. They won't
match the other .7647 of the time. This leaves 4,968,600 hands
with two differently suited first two cards.

Of that number of hands, how many will receive a third card
that does not suit up with either of the first two cards? There are
50 cards remaining, and 26 of those have suits that have not
appeared yet. So, 26/50 (52 percent) of the time, the third card
would not match either suit.

That leaves 2,583,672 hands that have three first cards that
are all unsuited. Now, of that number, how many will now draw a
fourth card that is the fourth unrepresented suit? There are 13
out of 49 cards remaining that represent that final fourth suit.
26.53 percent of the remaining hands will have that suit as the
fourth card, which computes to 685,464 four-card combinations
with four different suits. 685,464 divided by the total number of
possible hands is .1055 (685,464/6,497,400).

There's your 11 percent chance of having four different suits in a
four-card hand. Whew! By the way, some super-genius-type
could get the same proportion by using just the relevant
proportions, which we used along the way during our different
counting steps, and not have to count at all:

    39/51 x 26/50 x 13/49 = .7647 x .52 x .2653 = .1055
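
Both routes to the 11 percent figure, counting hands and
multiplying proportions, can be checked in a few lines. This is
an illustrative sketch; the variable names are assumptions.

```python
from fractions import Fraction

# Counting route: ordered four-card deals where every card brings a new suit.
total_hands = 52 * 51 * 50 * 49
four_suit_hands = 52 * 39 * 26 * 13   # any card, then 39, 26, 13 cards of unseen suits

# Proportion route: chance each new card avoids the suits already showing.
p_four_suits = Fraction(39, 51) * Fraction(26, 50) * Fraction(13, 49)

p_lil_flush = 1 - p_four_suits

print(f"{four_suit_hands:,} of {total_hands:,} hands")
print(f"four different suits: {float(p_four_suits):.4f}")
print(f"at least one li'l flush: {float(p_lil_flush):.4f}")
```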
Finding a Match with Two Decks of Cards

You have a deck of cards. I have a deck of cards. They are both
shuffled (or, perhaps, souffléd, as my spell check suggested
I meant to say). If we dealt them out one at a time and went
through both decks one time, would they ever match? I mean,
would they ever match exactly, with the exact same card (for
example, both of us turning up the Jack of Clubs at the same
time)?

The bet

Most people would say no, or at least that it would certainly
happen occasionally, but not too frequently. Astoundingly, not
only will you often find at least one match when you pass through
a pair of decks, but it would be out of the ordinary not to. If you
make this wager or conduct this experiment many times, you will
get at least one match on most occasions. In fact, you will not
find a match only 36.4 percent of the time!

Why it works

Here's how to think about this problem statistically. Because the
decks are shuffled, one can assume that any two cards that are
flipped up represent a random sample from a theoretical
population of cards (the deck). The probability of a match for any
given sample pair of cards can be calculated. Because you are
sampling 52 times, the chance of getting a match somewhere in
those attempts increases as you sample more and more pairs of
cards. It is just like getting a 7 on a pair of dice: on any given
roll, it is unlikely, but across many rolls, it becomes more likely.

To calculate the probability of hitting the outcome one wishes
across a series of outcomes, the math is actually easier if one
calculates the chances of not getting the outcome and
multiplies across attempts. For any given card, there is a 1 out
of 52 chance that the card in the other deck is an exact match.
The chances of that not happening are 51 out of 52, or .9808.

You are trying to make a match more than once, though; you are
trying 52 times. The probability of not getting a match across
52 attempts, then, is .9808 multiplied by itself 52 times. For
you math types, that's .9808^52.

Wait a second and I'll calculate that in my head (.9808 times
.9808 times .9808 and so on for 52 times is...about...0.3643).
OK, so the chance that it won't happen is .3643. To get the
chance that it will happen, we subtract that number from 1 and
get .6357.
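
The calculation above treats the 52 comparisons as independent,
which is a close approximation rather than an exact model (once a
card has turned up, it can't turn up again). The sketch below
compares that approximation with the standard inclusion-exclusion
count of shuffles with no match at all, and with a simulation.
The function name simulate_match_rate, the trial count, and the
seed are illustrative assumptions.

```python
import math
import random

# The hack's approximation: 52 independent tries, each missing with chance 51/52.
approx_no_match = (51 / 52) ** 52

# Exact chance of no match anywhere (a "derangement"), via inclusion-exclusion.
exact_no_match = sum((-1) ** k / math.factorial(k) for k in range(53))


def simulate_match_rate(trials=20000, seed=0):
    """Fraction of simulated deals with at least one same-position match."""
    rng = random.Random(seed)
    my_deck = list(range(52))          # one deck left in a fixed order
    hits = 0
    for _ in range(trials):
        your_deck = my_deck[:]
        rng.shuffle(your_deck)         # shuffling one deck is enough
        if any(mine == yours for mine, yours in zip(my_deck, your_deck)):
            hits += 1
    return hits / trials


print(f"approximate chance of a match: {1 - approx_no_match:.4f}")
print(f"exact chance of a match:       {1 - exact_no_match:.4f}")
print(f"simulated chance of a match:   {simulate_match_rate():.4f}")
```

The exact figure comes out slightly lower than the approximation,
but either way a match shows up a little under two-thirds of the
time.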

You'll find at least one match between two decks about two-
thirds of the time! Remarkable. Go forth and win that free
lemonade.

Hack 43. Play with Dice and Get Lucky

Here are some honest wagers using honest dice. Just because
you aren't cheating, though, doesn't mean you won't win.

It is an unfortunate stereotype that statisticians are glasses-
wearing introspective nerds who never have a beer with the
gang. This is such an absurd belief, that just thinking about it
last Saturday and Sunday at my weekly Dungeons & Dragons
gathering, I laughed so hard that my monocle almost landed in
my sherry.

The truth is that displaying knowledge of simple probabilities in
a bar can be quite entertaining for the patrons and make you the
life of the party. At least, that's what happens according to my
Uncle Frank, who for years has used his stats skills to win free
drinks and pickled eggs (or whatever those things are in that big
jar that are always displayed in the bars I see on TV).

Here are a few ways to win a bet using any fair pair of dice.

Distribution of Dice Outcomes

First, let's get acquainted with the possibilities of two dice rolled
once. You'll recall that most dice have six sides (my fantasy
role-playing friends and I call these six-sided dice) and that the
values range from 1 to 6 on each cube.

Calculating the possible outcomes is a matter of listing and
counting them. Figure 4-2 shows all possible outcomes for
rolling two dice.

Figure 4-2. Possible outcomes for two dice

This distribution results in the frequencies shown in Table 4-15.

Table 4-15. Frequency of outcomes for rolling two dice

Total roll                           Chances   Frequency
2                                    1         2.8 percent
3                                    2         5.6 percent
4                                    3         8.3 percent
5                                    4         11.1 percent
6                                    5         13.9 percent
7                                    6         16.7 percent
8                                    5         13.9 percent
9                                    4         11.1 percent
10                                   3         8.3 percent
11                                   2         5.6 percent
12                                   1         2.8 percent
Total number of possible outcomes    36        100 percent
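
The frequency table can be rebuilt by listing all 36 outcomes
and counting the totals. This is a minimal sketch; the variable
name totals is an assumption.

```python
from collections import Counter
from itertools import product

# Every ordered (first die, second die) pair, tallied by its total.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    chances = totals[total]
    print(f"{total:>2}  {chances}  {chances / 36:.1%}")
```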

The game of craps, of course, is based entirely on these
expected frequencies. Some interesting wagers might come to
mind as you look at this frequency distribution. For example,
while a 7 is the most common roll and many people know this, it
is only slightly more likely to come up than a 6 or 8.

In fact, if you didn't have to be specific, you could wager that a 6
or an 8 will come up before a 7 does. Of all totals that could be
showing when those dice are done rolling, more than one-fourth
of the time (about 28 percent) the dice will total 6 or 8. This is
substantially more likely than a 7, which comes up only one-
sixth of the time.

Bar Bets with Dice


My Uncle Frank used to bet any dull-witted patron that he would
roll a 5 or a 9 before the patron rolled a 7. Uncle Frank won 8 out
of 14 times.

Sometimes, old Frankie would wager that on any one roll of a pair
of dice, there would be a 6 or a 1 showing. Though, at first
thought, there would seem to be a less than 50 percent
chance of this happening, the truth is that a 1 or 6 will be
showing about 56 percent of the time. This is the same
probability for any two different numbers, by the way, so you
could use an attractive stranger's birthday to pick the digits and
maybe start a conversation, which could lead to marriage,
children, or both.

If you are more honest than my Uncle Frank (and there is a 98
percent chance that you are), here are some even-money bets
with dice. The outcomes in column A are as likely to
occur as the outcomes in column B:

A B
2 or 12 3
2, 3, or 4 7
5, 6, or 7 8, 9, 10, 11, or 12

The odds are even for either outcome.

Why It Works
For the bets presented in this hack, here are the calculations
demonstrating the probability of winning:

Wager                            Number of winning outcomes   Calculation   Resulting proportion
5 or 9 versus 7                  8 versus 6                   8/14          .571
1 or 6 showing                   20                           20/36         .556
2 or 12 versus 3                 2 versus 2                   2/4           .500
2, 3, or 4 versus 7              6 versus 6                   6/12          .500
5, 6, or 7 versus 8 or higher    15 versus 15                 15/30         .500

The "Wager" column presents the two competing outcomes (e.g.,
will a 5 or 9 come up before a 7?). The "Number of winning
outcomes" column indicates the number of different dice rolls that
would result in either side of the wager (e.g., 8 chances of getting
a 5 or 9 versus only 6 chances of getting a 7). The "Resulting
proportion" column indicates your chances of winning.
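
The same proportions fall out of a short enumeration. In this
sketch, a "race" bet (which total comes up first) is scored by
ignoring every roll that settles nothing, so only the rolls that
favor one side or the other are compared. The helper names ways
and race are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ROLLS = list(product(range(1, 7), repeat=2))   # all 36 ordered outcomes


def ways(totals):
    """Number of the 36 outcomes whose total is in the given set."""
    return sum(1 for a, b in ROLLS if a + b in totals)


def race(mine, theirs):
    """Chance that one of my totals is rolled before one of theirs."""
    m, t = ways(mine), ways(theirs)
    return Fraction(m, m + t)


five_or_nine_vs_seven = race({5, 9}, {7})

# A 1 or a 6 showing on at least one die in a single roll.
one_or_six_showing = Fraction(
    sum(1 for a, b in ROLLS if a in (1, 6) or b in (1, 6)), len(ROLLS)
)

even_money_bets = [
    race({2, 12}, {3}),
    race({2, 3, 4}, {7}),
    race({5, 6, 7}, {8, 9, 10, 11, 12}),
]

print(float(five_or_nine_vs_seven))
print(float(one_or_six_showing))
print([float(p) for p in even_money_bets])
```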

You can win two different ways with these sorts of bets. If it is an
even-money bet, you can wager less than your opponent and
still make a profit in the long run. He won't know the odds are
even. If chance favors you, though, consider offering your target
a slightly better payoff, or pick the outcome that is likely to
come up more often.

Hack 44. Sharpen Your Card-Sharping

In Texas Hold 'Em and other poker games, there are a few basic
preliminary skills and a bit of basic knowledge about probability
that will immediately push you from absolute beginner to the
more comfortable level of knowing just enough to get into
trouble as a card sharp.

The professional Texas Hold 'Em poker players who appear on
television are different from you and me in just a couple of
important ways. (Well, they likely differ from you in just a couple
of important ways; they differ from me in so many important
ways that even my computer brain can't count that high.) Here
are two areas of poker playing that they have mastered:

Knowing the rough probability of hitting the cards they
want at different stages in a hand (in the flop, on the
river, and so on)

Quickly identifying the possible better hands that could
be held by other players

This hack presents some tips and tools for moving from novice
to semi-pro. These are some simple hunks of knowledge and
quick rules of thumb for making decisions. Like the other poker
hacks in this book, they provide strategy tips based purely on
statistical probabilities, which assume a random distribution of
cards in a standard 52-card deck.

Improving Your Hand

Half the time, you will get a pair or better in Texas Hold 'Em. I'll
repeat that because it is so important in understanding the
game. Half the time (a little under 52 percent actually), if you
stay in long enough to see seven cards (your two cards plus all
five community cards), you will have at least one pair. It might
have been in your hand (a pocket or wired pair), it might be made
up of one card in your hand and one from the community cards,
or your pair might be entirely in the community cards for
everyone to claim.

If for the majority of the time the average player will have a pair
when dealt seven cards, then sticking around until the end with a
low pair means you are (only statistically speaking, of
course) likely to lose. In other words, there is a greater than 50
percent chance that the other player has at least a pair, and that
pair will probably be 8s or higher (only six out of thirteen pairs
are 7s or lower).

Knowing how common pairs are explains why Aces are so highly
valued. Much of the time, heads-up battles come down to a
battle of pair versus pair. Another good proportion of the time,
the Ace plays an important role as a kicker or tiebreaker. Aces
are good to have, and it's all because of the odds.

Probabilities

Decisions about staying in or raising your bet in an attempt to
lower the number of opponents you have to beat can be made
more wisely if you know some of the common probabilities for
some of the commonly hoped-for outcomes. Table 4-16
presents the probability of drawing a card that helps you at
various stages in a hand. The probabilities are calculated based
on how many cards are left in the deck, how many different cards
will help you (your outs), and how many more cards will be drawn
from the deck. For example, if you have an Ace-King and hope to
pair up, there are six cards that can make that happen; in other
words, you have six outs. If you have only an Ace high but hope
to find another Ace, you have three outs. If you have a pocket
pair and hope to find a powerful third in the community cards,
you have just two outs.

Table 4-16. Probability of improving your hand

Cards left to be dealt    Six outs      Three outs    Two outs
5 (before the flop)       49 percent    28 percent    19 percent
2 (after the flop)        24 percent    12 percent    8 percent
1 (after the turn)        13 percent    7 percent     4 percent

The situations described here assume you have already been
dealt two cards. After all, in most poker games, the bet before
the flop is predetermined, so there are no decisions to be made.
By the way, because you should probably back out of hands that
did not amount to anything in the flop, you'll want to know your
chances of improving in the flop itself. They are:

Remaining outs    Odds you'll hit a winning card in the flop
6                 32 percent
3                 17 percent
2                 12 percent
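
The percentages in both tables above follow from one idea: the
chance of hitting at least one out is 1 minus the chance that
every card still to come misses. The sketch below assumes the
only cards you have seen are your own two hole cards plus any
community cards already dealt (so 50, 47, or 46 cards are
unseen); the names prob_hit and STAGES are illustrative.

```python
from math import comb


def prob_hit(outs, cards_to_come, unseen):
    """Chance that at least one out appears among the cards still to be dealt."""
    all_miss = comb(unseen - outs, cards_to_come) / comb(unseen, cards_to_come)
    return 1 - all_miss


# stage label -> (cards still to come, cards you have not seen)
STAGES = {
    "before the flop": (5, 50),
    "after the flop": (2, 47),
    "after the turn": (1, 46),
    "in the flop itself": (3, 50),
}

for label, (cards_to_come, unseen) in STAGES.items():
    row = [f"{prob_hit(outs, cards_to_come, unseen):.0%}" for outs in (6, 3, 2)]
    print(f"{label:<20}", *row)
```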

Implications

Here are a few quick observations and implications to etch in
your mind, based on the distribution described in Table 4-16.

Half the time, you will pair up. This is true for high cards, such as
Big Slick (Ace-King), or low cards, such as 2-7. You can even
pick one of the two cards you have and pair that one up 28
percent of the time. Implication: when low on chips in
tournament play, go all-in as soon as you get that Ace.

If you don't hit the third card you need to turn a pair into a set
(three of a kind) on the flop, there is only an 8 percent
chance you will hit it down the road. Implication: don't spend too
much money waiting around for your low pair to turn into a
gangbuster hand.

Your Ace-King or Ace-Queen that looked pretty good before the
flop diminishes in potential as more cards are revealed without
pairing up or getting straight draws. 87 out of 100 times, that
great starting hand remains a measly high-card-only hand if you
haven't hit before the river. Implication: stay in with the
unfulfilled dream that is Ace-King only if you can do so cheaply.

Reading the Community Cards in a Flash

Here are some common-sense statements about your
opponents' hands that must be true but aren't always said out
loud:

If the community cards do not have...   Your opponent(s) cannot have...
A pair                                  Four of a kind
A pair                                  A full house
Three cards of the same suit            A flush
Three cards within a five-card range    A straight

You can make quicker decisions about what your opponents
might have by learning these rules. Then, you can automatically
rule out killer hands when the situation is such that they are
impossible. You may not be worried about speed, but you can
spend your time concentrating on more important decisions if
you don't have to waste mental energy figuring these things out
from scratch each time.

Hack 45. Amaze Your 23 Closest Friends

What are the chances of at least two people in a group sharing
a birthday? Depending on the number of people present,
surprisingly high. Impress your friends at parties (and perhaps
win some money in a bar bet) using these simple rules of
probability.

Some events that seem logically unlikely can actually turn out
to be quite probable in some cases. One such example is
determining the probability that at least two people in a group
share a birthday. Many people are shocked to learn that as long
as there are at least 23 people in the group, there is a better
than 50 percent chance that at least 2 of them will have the
same birthday! By using a few simple rules of probability, you
can figure out the likelihood of this event occurring for groups of
any size, and then amaze your friends when your predictions
come true.

You could also use this result to make
some cash in a bar bet (as long as there
are at least 23 people there).

So, how do you figure out the probability of at least two people
sharing a birthday? To solve this problem, you need to make a
couple of assumptions about how birthdays are distributed in the
population and know a few rules about how probabilities work.

Getting Started

To determine the chances of at least two people sharing a
birthday, we have to make a couple of reasonable assumptions
about how birthdays are distributed. First, let's assume that
birthdays are uniformly distributed in the population. This means
that approximately the same number of people are born on every
single day of the year.

This is not necessarily perfectly true, but it's close enough for
us to still trust our results. However, there is one birthday for
which this is definitely not true: February 29, which occurs only
every four years, in a leap year. The good news is that few enough
people are born on February 29 that it is easy for us to just
ignore it and still get accurate estimates.

Once we've made these two assumptions, we can solve the
birthday problem with relative ease.

Applying the Law of Total Probability

In our problem, there are only two mutually exclusive possible
outcomes:

At least two people share a birthday.

No one shares a birthday.

Since one of these two things must occur, the sum of the two
probabilities will always be equal to one. Statisticians call this
the Law of Total Probability, and it comes in handy for this
problem.

The term mutually exclusive means that if
one event occurs, the other one cannot
occur, and vice versa.

A simple coin-flipping example can help picture how this works.
With a fair coin, the probability of getting a heads is 0.5, just as
the probability of getting a tails is 0.5 (which is another example
of mutually exclusive events, because the coin can't come up
heads and tails in the same flip!). Once you flip the coin, one of
two things has to happen. It must land either heads up or tails
up, so the probability of heads or tails occurring is 1 (0.5 + 0.5).
Conversely, we can think of the probability of heads as one
minus the probability of tails (1 - 0.5 = 0.5), and vice versa.

Sometimes, it's easier to determine the probability of an event
not occurring and then use that information to determine the
probability that it will occur. The probability of no one sharing a
birthday is a bit easier to figure out, and it depends only on how
many people are in the group.

Imagine that our group contains only two people. What's the
probability that they share a birthday? Well, the probability that
they don't share a birthday is easy to figure: person #1 has
some birthday, and there are 364 other birthdays person #2
might have that would result in them not sharing a birthday. So,
mathematically, it's 364 divided by 365 (the total number of
possible birthdays), or 0.997.

Since the probability of the two people not sharing a birthday is
0.997 (a very high probability), the probability of them actually
sharing a birthday is equal to 1 - 0.997 (0.003, a very low
probability). This means that only 3 out of every 1,000 randomly
selected pairs of people will share a birthday. So far, this makes
perfect logical sense. However, things change (and change
quickly) once we start adding more people to our group!

Calculating the Probability of Independent Events

The other trick we need to solve our problem is applying the idea
of independent events. Two (or more) events are said to be
independent if the probability of their co-occurrence is equal to
the product of their individual probabilities.

Once again, this is simple to understand with a nice, easy
coin-flipping example. If you flip a coin twice, the probability of
getting heads twice is equal to the probability of heads multiplied
by the probability of heads (0.5 x 0.5 = 0.25), because the
outcome of one coin flip has no influence on the outcome of the
other (hence, they are independent events).

So, when you flip a coin twice, one out of every four times the
results will be two heads in a row. If you wanted to know the
probability of flipping three heads in a row, the answer is 0.125
(0.5 x 0.5 x 0.5), which means that three heads in a row happens
only one time out of every eight.

In our birthday problem, every time we add another person to the
group, we've added another independent event (since one
person's birthday doesn't influence anyone else's birthday), and
thus we'll be able to figure out the probability of at least two of
those people sharing a birthday, regardless of how many people
we add; we'll just keep on multiplying probabilities together.

To review, no matter how many people are in our group, only one
of two mutually exclusive events can occur: at least two people
share a birthday or no one shares a birthday. Because of the Law
of Total Probability, we know that we can determine the
probability of no one sharing a birthday, and one minus that
value will be equal to the probability that at least two share a
birthday. Lastly, we also know that each person's birthday is
independent of the other group members. Got all that? Good, let's
proceed!

Solving the Birthday Problem

We've already determined that the probability of two people not
sharing a birthday in a group of two is equal to 0.997. Let's say
we add another person to the group. What is the probability of no
one sharing a birthday? There are 363 other birthdays person
#3 could have that would result in none of them sharing a
birthday. The probability of person #3 not sharing a birthday with
the other two is therefore 363/365, or 0.995 (slightly lower).

But remember, we're interested in the probability that no one
shares a birthday, so we use the rule of independent events and
multiply the probability that the first two won't share a birthday
by the probability that the third person won't share a birthday
with the other two: 0.997 x 0.995 = 0.992. So, in a group of three
people, the probability that none of them share a birthday is
0.992, which means that the probability that at least two of them
share a birthday is 0.008 (1 - 0.992).

This means that only 8 out of every 1,000 randomly selected
groups of 3 people will result in at least 2 of them sharing a
birthday. This is still a pretty small chance, but note that the
probability has more than doubled by moving from two people to
three (0.003 compared to 0.008)!

Once we start adding more and more people to our group, the
probability of at least two people sharing a birthday starts to
increase very quickly. By the time our group of people is up to
10, the probability of at least 2 sharing a birthday is up to
0.117. How do we determine this in general? For every person
added to the group, another fraction is multiplied by the previous
product. Each additional fraction will have 365 as the
denominator, and the numerator will be 365 minus the number of
additional people beyond the first.

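So, for our previously mentioned group of 10 people, the
numerator for the last fraction is 356 (365 - 9), determined like
so:

$$\frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \cdots \times \frac{356}{365} = 0.883$$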

This tells us that the probability of no one sharing a birthday in a
group of 10 people is equal to 0.883 (much lower than what we
saw for 2 or 3 people), so the probability that at least 2 of them
will share a birthday is 0.117 (1 - 0.883).

The first fraction is the probability that the second person won't
share a birthday with the first person. The second fraction is the
probability that the third person won't share a birthday with the
first two. The third fraction is the probability that the fourth
person won't share a birthday with the first three, and so on. The
ninth and final fraction is the probability that the tenth person
won't share a birthday with any of the other nine.

In order for no one to share a birthday,
every single one of the events in the chain
has to co-occur, so we determine the
probability of all of them happening in the
same group of people by multiplying all of
the individual probabilities together. Every
time we add another person, we include
another fraction into the equation, which
makes the final product get smaller and
smaller.
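The chain of fractions translates directly into a loop. The following is a minimal sketch under the same assumptions used in the text (365 equally likely birthdays, February 29 ignored, birthdays independent); the function name `prob_no_shared_birthday` is invented for this illustration.

```python
def prob_no_shared_birthday(group_size, days=365):
    """Chance that no two people in the group share a birthday,
    assuming birthdays are independent and uniform over `days` days."""
    prob = 1.0
    for already_present in range(1, group_size):
        # The next person must avoid every birthday already taken.
        prob *= (days - already_present) / days
    return prob

for n in (2, 3, 10):
    no_match = prob_no_shared_birthday(n)
    print(f"{n} people: no match {no_match:.3f}, "
          f"at least one match {1 - no_match:.3f}")
```

Each pass through the loop multiplies in one more fraction, so a group of 10 produces the nine fractions described above.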

Solving for Almost Any Group Size

As the group size increases, it becomes increasingly more likely
that at least two people will share a birthday. This makes perfect
sense, but what surprises most people is how quickly the
probability increases as the group gets bigger. Figure 4-3
illustrates the rate at which the probability goes up when you
add more and more people.

Figure 4-3. Chances of matching birthdays


For 20 people, the probability is 0.411; for 30 people, it's 0.706
(meaning that 7 times out of 10 you will win money on your bet,
which are pretty good odds). If you have 23 people in your
group, the chances are just slightly better than 50/50 that at
least 2 people will share a birthday (the probability is equal to
0.507).

When all is said and done, this is a pretty neat trick that never
ceases to surprise people. But remember to make the bar bet
only if you have at least 23 people in the room (and you're
willing to accept 50/50 odds). It works even better with more
people, because your chances of winning go up dramatically
every time another person is added. To have a better than 90
percent chance of winning your bet, you'll need 41 people in the
room (probability of at least 2 people sharing a birthday =
0.903). With 50 people, there's a 97 percent chance you'll win
your money. Once you have 60 people or more, you are
practically guaranteed to have at least 2 people in the room who
share a birthday and, of course, if you have 366 people present,
there is a 100 percent chance of at least 2 people sharing a
birthday. Those are great odds if you can get someone to take
the bet!
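To decide how crowded the bar needs to be before you offer the wager, you can search for the smallest group that beats a target probability. This is a rough sketch under the hack's assumptions (365 equally likely, independent birthdays); the helper names `prob_shared_birthday` and `smallest_group_for` are hypothetical.

```python
def prob_shared_birthday(group_size, days=365):
    """Chance that at least two people in the group share a birthday."""
    prob_no_match = 1.0
    for already_present in range(1, group_size):
        prob_no_match *= (days - already_present) / days
    return 1 - prob_no_match

def smallest_group_for(target, days=365):
    """Smallest group whose chance of a shared birthday exceeds `target`."""
    n = 1
    while prob_shared_birthday(n, days) <= target:
        n += 1
    return n

for target in (0.50, 0.90):
    n = smallest_group_for(target)
    print(f"better than {target:.0%}: {n} people "
          f"(probability {prob_shared_birthday(n):.3f})")
```

The search is a plain linear scan, which is more than fast enough here because the answer can never exceed 366.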

William Skorupski
Hack 46. Design Your Own Bar Bet

With a few calculations, and perhaps some spreadsheet
software, you can figure the probabilities associated with all
sorts of "spontaneous" friendly wagers.

Several of the statistics hacks elsewhere in this chapter use
decks of cards [Hack #42] or dice [Hack #43] as props to
demonstrate how some seemingly rare and unusual outcomes
are fairly common. As someone who's interested in educating
the world on statistical principles, you no doubt will wish to use
these teaching examples to impress and instruct others. Hey, if
you happen to win a little money along the way, that's just one of
the benefits of a teacher's life.

But there's no need to rely on the specific examples provided
here, or even to carry cards and dice around (though, knowing
you the way I think I do, you might have plenty of other reasons
to carry cards and dice around). Here are a couple of basic
principles you can use to make up your own bar bet with any
known distribution of data, such as the alphabet, numbers from 1
to 100, and so on:

Principle 1

An unlikely event increases in likelihood if there are
repeated opportunities for it to occur.

Principle 2

If there are a large number of possible events, the
chance of any specific event occurring seems small.

The rest of this hack will show you how to use these principles to
your advantage in your own custom-made bar bets.

Principle 1

The probability of any given event occurring is equal to the
number of outcomes that count as the event divided by the
number of possible outcomes. For example, what are the
chances that you and I were born in the same month?
Pretending for a second that births are distributed equally
across all months, the probability is 1/12. There is only one
outcome that counts as a match (your birth month), and there
are 12 possible outcomes (the 12 months of the year).

What about the probability that any one of two people reading
this book has the same birth month as me? Intuitively, that
should be a bit more likely than 1 out of 12. The formula to figure
this out is not quite as simple as one would like, unfortunately. It
is not 1/12 times itself, for example. That would produce a
smaller probability than we began with (i.e., 1/144). Nor is the
formula 1/12 + 1/12. Though 2/12 seems to have promise as
the right answer (because it is bigger than 1/12, indicating a
greater likelihood than before), these sorts of probabilities are not
additive. To prove to yourself that simply adding the two
fractions together won't work, imagine that you had 12 people in
the problem. The chance of finding a match with my birth month
among the 12 is obviously not 12/12, because that would
guarantee a match.

The actual formula for computing the chances of an event
occurring across multiple opportunities is based on the notion of
taking the proportional chance that an event will not happen and
multiplying that proportion by itself for each additional "roll of
the dice." At the conclusion of that process, subtracting the
result from 1.0 should give the chance that the event will
happen.

This formula has a theoretical appeal because it is logically
equivalent to the more intuitive methods (it uses the same
information). It is appealing mathematically, too, because the
final estimate is bigger than the value associated with a single
occurrence, which is what our intuition believes ought to be the
case. Think about it this way: how many times will it not happen,
and among those times, how many times will it not happen on the
next occurrence?

Here's the equation to compute the probability that someone
among two readers will have the same birth month as I do:

$$1 - \left(\frac{11}{12}\right)^{2} = 1 - 0.840 = 0.160$$
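The same rule works for any number of opportunities. Here is a small sketch, assuming every attempt is independent and has the same chance of success (birth months treated as equally likely, as in the text); the function name `chance_of_at_least_one` is made up for the example.

```python
def chance_of_at_least_one(p_single, attempts):
    """Chance the event happens at least once in `attempts` independent
    tries, each with success probability `p_single`."""
    # Multiply the chance of NOT happening by itself once per attempt,
    # then subtract the result from 1.
    return 1 - (1 - p_single) ** attempts

for readers in (1, 2, 12):
    p = chance_of_at_least_one(1 / 12, readers)
    print(f"{readers} reader(s): {p:.3f}")
```

With 12 readers the chance is well short of certainty, which is why simply adding 1/12 twelve times cannot be right.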

Principle 2

To get someone to accept a wager or to amaze an audience with
the occurrence of any given outcome, the likelihood must sound
small. So, wagers or magic tricks having to do with the 365 days
in a year, or the 52 cards in a deck, or all the possible phone
numbers in a phone book are more effective and astounding
because those numbers seem big in comparison to the number
of winning outcomes (e.g., one).

The chance of any unlikely event occurring on any single attempt
is indeed small, so the intuitive belief expressed in this principle
is correct. As we have seen, though, the chance of the event
occurring increases if you get more than one shot at it, and it
can increase rapidly.

Rolling Your Own Bar Bet

Let's walk through the steps that verify my advantage for a
couple of wagers I just made up.

Letters of the alphabet

For this wager, I'll pick five letters of the alphabet. I bet that if I
choose six people and ask them to randomly pick any single
letter, one or more of them will match one of my five letters.
Here's how the bet plays out:

Number of possible choices

There are 26 letters in the alphabet.

Probability of a single attempt failing

There are 21 out of 26 possibilities that are not
matches: 21/26 = .808.

Number of attempts

6

Probability of all 6 attempts failing

.808^6 = .278

Probability of something other than the previous options occurring
(that is, at least one match)

1 - .278 = .722

The chance of my winning this bet is 72 percent.

Pick a number, any number

This time, I'll pick 10 numbers from 1 to 100. I bet that if I
choose 10 people and ask them to randomly pick any single
number from 1 to 100, one or more of them will match one of my
ten numbers. Here's how this one works out:

Number of possible choices

There are 100 numbers to choose from.

Probability of a single attempt failing

There are 90 out of 100 possibilities that are not
matches: 90/100 = .90.

Number of attempts

10

Probability of all 10 attempts failing

.90^10 = .349

Probability of something other than the previous options occurring
(that is, at least one match)

1 - .349 = .651

The chance of my winning this bet is 65 percent.

On your own

Copy the steps and calculations just shown to develop your own
original party tricks. None of these demonstrations require any
props, just a willing and honest volunteer.

Notice that the calculations are based on people randomly
picking numbers. In reality, of course, people will not pick a
letter or number that they have just heard someone else pick. In
other words, their choices will not be independent of other
choices. If the choices are made based on the knowledge that
previous answers are not correct, this helps your odds a little
bit. For example, on the 10-out-of-100 numbers wager, if there
is no chance that the 10 people will choose a number that has
already been chosen, your chances of getting a match go from
65 percent to 67 percent.
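Both versions of a wager can be worked out side by side. The sketch below is illustrative only: it assumes volunteers pick uniformly at random, and it models the "no repeats" case by treating the volunteers' picks as a random set of distinct values. The function names are hypothetical.

```python
from math import comb

def win_chance_independent(my_picks, choices, people):
    """At least one match when each person picks independently."""
    miss_once = (choices - my_picks) / choices
    return 1 - miss_once ** people

def win_chance_no_repeats(my_picks, choices, people):
    """At least one match when no two people pick the same value."""
    # Ways to choose `people` distinct values that all miss my picks,
    # divided by all ways to choose `people` distinct values.
    all_miss = comb(choices - my_picks, people) / comb(choices, people)
    return 1 - all_miss

print(f"letters, independent: {win_chance_independent(5, 26, 6):.3f}")
print(f"numbers, independent: {win_chance_independent(10, 100, 10):.3f}")
print(f"numbers, no repeats:  {win_chance_no_repeats(10, 100, 10):.3f}")
```

Ruling out repeats helps the bettor because every pick that misses removes one losing value from the pool left for the next volunteer.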

Make Sure the Sucker Isn't You!

It is fun to play with others, but you never know when you will get
caught in someone else's clever statistics trap. For instance,
remember that 1-out-of-12 chance that you have the same birth
month as me? I fooled you! I was born in February. There are
fewer days in that month than the others, so your chances of
being born in that month are actually less than 1 out of 12.
There are 28.25 days in February (an occasional February 29
accounts for the .25) and 365.25 days in the year (the
occasional Leap Year accounted for again). The chance that you
were born in the same month as me is 28.25/365.25, or 7.73
percent, not the 8.33 percent that is 1 out of 12.

So, you are less likely to have the same birth month as me.
Come to think of it, the records of my birth, my birth certificate,
and so on were lost in a fire many years ago. So, the original
data about my birth is now missing.

For all I know, I might not even be born yet!


Hack 47. Go Crazy with Wild Cards

Wild cards are added to a poker game to ratchet up the fun.
Statistically, though, they make things all discombobulated.

Hundreds of years ago, poker players settled on a rank order of
hands and decided what would beat what. Pleasantly, for the field
of statistics, the order they settled on is a perfect match with
the probability that a player will be dealt each hand. Presumably,
the developers of poker rules either did the calculations or
referenced their own experience as to how frequently they saw
each kind of hand in actual play. It is also possible that they
took a deck of cards, paper and pencil, and a free afternoon,
dealt themselves many thousands of random poker hands, and
collected the data. Whatever the method, the rank order of poker
hands is a perfect match with the relative scarcity of being dealt
those particular combinations of cards.

Rank ordering, though, does not take into account the
meaningful distance between one type of hand and the type of
hand ranked immediately below it. A straight flush, for example,
is 16 times less likely to occur than the hand ranked
immediately below it, which is four of a kind, while a flush is only
half as likely as a straight, the hand ranked immediately below a
flush.

Before we talk about the problem with playing with wild cards
(cards, often jokers, that can take on any value the holder
wishes), let's review the ranking of poker hands. Table 4-17
shows the probability that a given hand will occur in any random
five cards, as well as each hand's relative rarity when compared
to the hand ranked just below it in the table.

Table 4-17. Poker hands, probabilities, and comparisons

Hand              Probability   Relative rarity
Straight flush    .000015       16 times less likely
Four of a kind    .00024        5.8 times less likely
Full house        .0014         1.4 times less likely
Flush             .0019         2.1 times less likely
Straight          .0039         5.4 times less likely
Three of a kind   .021          2.3 times less likely
Two pair          .048          8.8 times less likely
One pair          .42           1.2 times less likely
Nothing           .50           -----

To gamblers, there are several observations of note from Table
4-17. First, with five cards, half the time players have nothing.
Almost half the time, a player has a pair. A player will have
something better than a pair only 8 percent of the time.

Second, some hands treated as if they are wildly different in
rarity are almost equally likely to occur. Notice that a flush and a
full house occur with about the same frequency.

Finally, after three of a kind, the likelihood of a better hand
occurring drops quickly. In fact, there are two giant drops in
probability: having either nothing or a pair occurs most of the
time (92 percent), then two pair or three of a kind occurs another
7 percent of the time, and something better than three of a kind
is seen less than 1 percent of the time.

The Problem with Wild Cards

This is all very interesting, but what does it have to do with the
use of wild cards? Well, adding wild cards to the deck screws up
all of these time-tested probabilities. Assuming that the holder
of a wild card wishes to make the best hand possible, and also
assuming that one wild card, a joker, has been added to the deck,
Table 4-18 shows the new probabilities, compared to the
traditional ones.

Table 4-18. Probability of poker hands with one wild card in the deck

Hand              Probability with   Classic       Change in probability
                  wild card          probability   with wild card
Five of a kind    .0000045           -----         -----
Straight flush    .000064            .000015       +327 percent
Four of a kind    .0011              .00024        +358 percent
Full house        .0023              .0014         +64 percent
Flush             .0027              .0019         +42 percent
Straight          .0072              .0039         +85 percent
Three of a kind   .048               .021          +129 percent
Two pair          .043               .048          -10 percent
One pair          .44                .42           +5 percent
Nothing           .45                .50           -10 percent

The problem with wild cards is apparent as we look at the new
probabilities, especially when we look at three of a kind and two
pair. Three of a kind is now more common than two pair!

The rank order that traditionally determines which hand beats
what is no longer consistent with actual probabilities.
Additionally, the chances of getting two pair actually drop when a
wild card is added. Other probabilities change, of course, with all
the other playable hands becoming more likely. Some super
hands, while remaining rare, increase their frequency quite
dramatically: hands better than three of a kind are about twice
as common as they were before.

Knowing these new probabilities gives smart poker players an
edge. In fact, contrary to the stereotype that experienced and
professional poker players avoid games with wild cards because
they are childish or for amateurs, some informed players seek
out these games because they believe they have the advantage
over your more naïve types. (You know, those naïve types, like
people who don't read Hacks books?)

Why It Works

As you can see in Table 4-18, using wild cards lessens the
chance of getting two pair. But why would this be? Surely adding
a wild card means that sometimes I can turn a one-pair hand
into a two-pair hand. This is true, but why would I? Imagine a
player has one pair in her hand, and she gets a wild card as her
fifth card. Yes, she could match that wild card up with a singleton
and call it a pair, declaring a hand with two pairs. On the other
hand, it would be smarter for her to match it up with the pair she
already has and declare three of a kind. Given the option
between two pair and three of a kind, everyone would choose the
stronger hand.

The Other Problem with Wild Cards

The existence of wild cards creates a paradox that drives game
theorists crazy. The paradox works like this:

1. The ranking of hands and their relative value in a poker
game should be based on the frequency of their
occurrence. The less frequently occurring hand should
be valued more than more commonly occurring hands.

2. In the case of choosing whether to use a wild card to
turn a hand into two pair or three of a kind, players will
usually choose to create three of a kind. This changes
the frequency in practice such that two pair becomes
less common than three of a kind.

3. Because rankings should be based on probabilities, the
rules of poker should be changed when wild cards are in
play to make two pair more valuable than three of a kind.

4. With revised rankings, three of a kind would be worth
less than two pair, so now smart players would use their
wild card to make two pair instead of three of a kind, so
two pair would quickly become more common than three
of a kind.

5. The ranking rules would then have to be changed again
to match the actual frequencies resulting from the
previous rule change, and a never-ending cycle would
begin.

Table 4-18 avoids this paradox by assuming that players want
to make their best hand based on traditional rankings. Clever of
me, huh? Want to play cards?

Hack 48. Never Trust an Honest Coin

Of all that is sacred in the often secular world of statistics, no
concept has more faith than the honest spin of an honest coin.
Fifty percent chance of either heads or tails, right? The
troubling answer is, apparently...no!

A basic explanation of chance and how it operates almost
always includes a simple example of flipping or spinning a coin.
"Heads you win; tails I win" is the customary method for settling
a variety of disputes, and the binomial distribution [Hack #66] is
usually described and taught as the pattern of random coin
outcomes.

But as it turns out, if you spin a coin, especially a brand-new
coin, it might land tails up more often than heads up.

Shiny New Pennies

You know the look and feel of a brand-new, mint-condition
penny? It's so bright that it looks fake. It's so detailed and
sharp around the edges that you have to be careful not to cut
yourself.

Well, get yourself one of them bright, sharp little fellas and spin it
100 times or so. Collect data on the heads-or-tails results, and
prepare to be amazed because tails is likely to come up more
than 50 times. If our understanding of the fairness of coins is
correct, a coin should come up tails more than half the time less
than half the time. (Say that last sentence out loud and it makes
more sense.) Not with the spin of a new penny, though.

New coins, at least new pennies, tend to have a crisp edge that
actually is a bit longer or taller on the tails side (the tails side is
imprinted a little deeper into the penny than the heads side).
Figure 4-4 gives a sense of how this edge looks. If you spin an
object shaped like this, there is a tendency for the side with the
extra-long edge to land face-up.

Figure 4-4. Spinning a new penny


Imagine spinning the cap from a bottle of beer or soda pop. Not
only would it not spin so well, but you also wouldn't be surprised
to see it land with the edge side up. A new penny is shaped kind
of like a bottle cap, just not quite so asymmetrical. The little
extra edge, though, is enough over many spins to give tails the
advantage.

Binomial Expectations

The possible existence of a bottle-cap effect presents a testable
hypothesis:

The probability of a freshly minted spinning penny landing with
tails up is greater than 50 percent.

Of course, just by chance, over a few spins we might find a coin
landing tails up more often than heads, but that wouldn't really
prove anything. We know that chance will bring results in small
samples that don't represent the nature of the population from
which the samples were drawn.

Our sample of coin spins should represent a population of
infinite coin spins. If we spin a coin 100 times and find 51 tails,
is that acceptable evidence for our hypothesis? Probably not;
chance could be the explanation for a proportion other than .50.
How about 52 tails? How about 52 percent tails and a million
spins?

Statistics comes to the rescue once again and provides a
standard by which to judge the outcome of our experiment. We
know from the binomial distribution that 100 spins of a
theoretically fair coin (one without the unbalanced edge
weirdness) will produce 51 or more tails 42 percent of the time.
Old-school statistical procedures require that an outcome must
have a 5 percent or lower chance of occurring to be treated as
statistically significant, that is, not likely to have occurred by
chance. So, we probably wouldn't accept 51 percent after 100
spins as acceptable support for the hypothesis.

On the other hand, if we spun that hard-working coin 6,774
times and got 51 percent tails, that would happen by chance
only 5 percent of the time. Our level of significance for that
result is .05. Table 4-19 shows the likelihood of getting a
certain proportion of tails just by chance, when the expected
outcome is 50 percent tails. Deviations from this expected
proportion that are statistically significant can be treated as
evidence of support for our hypothesis.

Table 4-19. Coin spins and probability of certain outcomes

Number of spins   Proportion of tails   Probability of the given proportion or higher
100               .51                   .42
100               .55                   .16
100               .58                   .05
500               .51                   .33
500               .55                   .01
500               .58                   .0002
1,000             .51                   .26
1,000             .55                   .001
1,000             .58                   .0000002

Notice that the power of this analysis really increases as the sample size gets big [Hack #8]. You need only a slight fluctuation from the expected to support your hypothesis if you spin that coin 500 or 1,000 times. With 100 spins, you need to see a proportion of tails at or above .58 to believe that there really is an advantage for tails with a newly minted penny.

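The distance of the observed proportion from the expected proportion is expressed as a z score [Hack #26]. Here's the equation that produces z scores and generated the data in Table 4-19:

$$ z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} $$

Here \hat{p} is the observed proportion of tails, p_0 is the expected proportion (.50), and n is the number of spins.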

The probability assigned is the area under the normal curve that remains above that z score.
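To check the table's numbers yourself, here is a minimal Python sketch. It assumes the z score is the observed proportion minus the expected proportion, divided by the standard error of a proportion, and it takes the upper-tail area from the standard normal curve. The function name `upper_tail_probability` is made up for this illustration.

```python
import math

def upper_tail_probability(n_spins, observed, expected=0.5):
    """Return (z, p): the z score and the normal-curve area above it."""
    # Standard error of a proportion when the true proportion is `expected`.
    standard_error = math.sqrt(expected * (1 - expected) / n_spins)
    z = (observed - expected) / standard_error
    # Area under the standard normal curve that lies above z.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

for n in (100, 500, 1000):
    for proportion in (0.51, 0.55, 0.58):
        z, p = upper_tail_probability(n, proportion)
        print(f"{n:>5} spins, {proportion:.2f} tails: z = {z:.2f}, p = {p:.7f}")
```

Because this uses the normal curve rather than exact binomial counts, treat the results as approximations, especially for small numbers of spins.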

Where It Doesn't Work

Once you prove to yourself that this tail advantage is real, heed this reminder before you go running off to win all sorts of crazy wagers. You must spin the coin! Don't flip it. Say it with me: spin, don't flip.
See Also

The term bottle-cap effect was suggested on an entertaining web page that includes a nice discussion of the extra-tall edge on penny tails. It is maintained by Dr. Gary Ramseyer at http://www.ilstu.edu/~gcramsey/.
Hack 49. Know Your Limit

Humans don't always make rational decisions. Even smart gamblers will sometimes refuse a wager when the expected payoff could be huge and the odds are fair. The St. Petersburg Paradox gives an example of a perfectly fair gambling game that perfectly healthy statisticians probably wouldn't play, just because they happen to be human.

The standard decision-making process for statistically savvy gamblers involves figuring the average payoff for a hypothetical wager and the cost to play, and then determining whether they are likely to break even or, better yet, make a boatload of money. Though one could produce dozens of statistical analyses of gambling, all about when a person should and shouldn't play, the psychology of the human mind sometimes takes over, and people will refuse to take a wager because it just doesn't feel right.

The Game of St. Petersburg

The game of St. Petersburg is about 300 years old. The parameters of the game were described by Daniel Bernoulli in 1738. Here are the rules:

1. You pay me a fee to play upfront.

2. Flip a coin. If it comes up heads, you win and I'll pay you $2.

3. If it doesn't come up heads, we'll flip again. If heads comes up that time, I'll pay you 2^2 dollars ($4).

4. Supposing heads still hasn't come up, we flip again. Heads on this third flip, and I pay you 2^3 dollars ($8).

So far, it sounds pretty good and more than fair for you. But it gets better. We keep flipping until heads comes up. When it eventually arrives, I pay you $2^n, where n is the number of flips it took to get heads.

Great game, at least from your perspective. But here's the killer question: how much would you pay to play?

The game of St. Petersburg might not really have ever existed as a popular gambling game in the streets of old-time Russia, but it's been used as a hypothetical example of how the mind processes probability when money is involved. It provided many early statisticians an excuse to analyze the way "expected outcomes" work in our heads. The paper was actually published, by the way, by the St. Petersburg Academy, thus the name.

Deciding how much you would pay to play is an interesting process. As a smart statistician, you would certainly pay anything less than $2. Even without all the bigger payoff possibilities, betting you will get heads on a coin flip and getting paid more than the cost of playing is clearly a great bet, and you'd go for it in a shot.

You also probably would gladly pay a full $2. You will win the $2 back half the time, and the other half of the time you will get much more than that! This is a game you are guaranteed to win eventually, so it's not a question of winning. When you don't get heads the first time, you have guaranteed yourself at least $4 back, and possibly more, possibly much more.

So, maybe you'd pay $4 to play. Of course, occasionally, your payoff would be really big money: $8, $16, $32, $64... Theoretically, the payoff could be close to infinite. But how much would you pay? That's the 64-dollar question.

Statistical Analysis

Some social science researchers suggest that most people would play this game for something around four bucks, maybe a little more. Few would pay much more. What about statistically, though? What is the most you should pay?

Well, this is where I consider turning in my Stats Fan Club membership card, because I am afraid to tell you the correct answer. The rules of probability as they relate to gambling suggest that people should play this game at any cost. Yes, a statistician would tell you to play this game for any price! As long as the cost is something short of infinity, this is, theoretically, a good wager.
Let's figure this out. Here's the payoff for the first six coin flips:

Flips   Likelihood    Proportion of games   Winnings
1       1 out of 2    .50                   $2
2       1 out of 4    .25                   $4
3       1 out of 8    .125                  $8
4       1 out of 16   .0625                 $16
5       1 out of 32   .03125                $32
6       1 out of 64   .015625               $64

Expected payoff is the amount of money you would win on average across all possible outcomes. For a single flip, there are two outcomes: for heads, you win $2; for the other possibility, tails, you get $0. The average payout is $1, the expected payoff for one coin flip (and, it turns out, for each later flip as well).

If you play this game 64 times, you can expect the game to end on the sixth coin flip just once, but when it does, you will win $64. In 32 of those 64 games, you will win just $2. The average payoff for any one flip sounds low: just a buck. Occasionally, though, heads won't come up for a very long time, and when it finally does, you have won yourself a lot of money. When you start the game, you have no idea how long it will go, and it could be very long indeed (a lot like a Peter Jackson film).

Notice a few things about this series of flips and how the chances drop at the same rate as the winnings go up:

Only six coin flips are shown. Theoretically, the flipping could go on forever, though, and no head might ever come up.

With each coin flip, the winnings amount continues to double and the proportion of games where that number of flips would be reached continues to be cut in half.

No matter how many rows you add, the "Proportion of games" column never quite adds to 1.0 or 100 percent, because there is always some chance, no matter how very small, that one more flip will be needed.

The decision rule among us Stats Fan Club members for whether to play a gambling game is whether the expected value of the game is more than the cost of playing. Expected value is calculated by adding up the expected payoff for all possible outcomes.

You'll recall that the expected payoff for each possible trial is $1. There are an infinite number of possible outcomes, because that coin could just keep flipping forever. To get the expected value, we sum this infinite series of $1 and get a huge total. The expected value for this game is infinite dollars. Since you should play any game where the cost of playing is less than the expected value, you should play this game for any amount of money less than infinity.
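To watch that sum grow, here is a short Python sketch. It assumes, purely for illustration, that the game is cut off after a fixed number of flips and pays nothing if heads never shows by then. The real game has no such cap, and the name `expected_value` is made up for this example.

```python
from fractions import Fraction

def expected_value(max_flips):
    """Expected payoff in dollars if the game stops after max_flips flips.

    The cutoff is an assumption for illustration; the real game has none.
    """
    total = Fraction(0)
    for n in range(1, max_flips + 1):
        probability = Fraction(1, 2 ** n)  # chance the first head is on flip n
        payoff = 2 ** n                    # dollars paid when that happens
        total += probability * payoff      # every term contributes exactly $1
    return total

for cap in (6, 10, 20, 40):
    print(f"cap of {cap} flips: expected value ${expected_value(cap)}")
```

Each allowed flip adds one more dollar to the expected value, so removing the cap lets the total grow without limit.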

Why It Doesn't Work

Of course, in real life, people won't pay much more than $2 for such a game, even if they know all the statistics. No one really knows for sure why smart people turn their noses up at paying very much money for such a prospect, but here are some theories.

Infinite is a lot

Even if you accept in spirit that the game is fair over the long run and would occasionally pay off really big if you played it many, many times, that "long run" is infinitely long, which is an awfully long time. Few people have the patience or deep enough pockets to play a game that relies on so much patience and demands such a large fee.

Decreasing marginal utility

The originator of the problem, Bernoulli, believed that people perceive money as valuable, but the perception is not proportional to the amount of money. In other words, while having $16 is better than having $8, the relative value of one to the other is different from the relative value of having $128 compared to $64.

So, at some point, the infinite doubling of money stops being equally meaningful as a prize. Bernoulli also believed that if you have a lot of money, a small wager is less meaningful than if you have very little money. (Kind of like those wealthy cartoon characters who light their cigars with hundred dollar bills.)
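One common way to put numbers on this idea is to assume that the value a person places on money grows like its logarithm, which is the kind of curve Bernoulli had in mind. The Python sketch below makes that assumption and, to keep things simple, also ignores whatever money the player already has. Both are assumptions for illustration, and the function names are invented here.

```python
import math

def expected_log2_payoff(max_flips=200):
    """Average of log2(payoff), weighted by the chance of each outcome."""
    # Chance the first head is on flip n is 0.5**n; the payoff is 2**n dollars.
    return sum((0.5 ** n) * math.log2(2 ** n) for n in range(1, max_flips + 1))

def certainty_equivalent(max_flips=200):
    """Sure amount of money with the same log-value as playing the game."""
    return 2 ** expected_log2_payoff(max_flips)

print(f"A log-minded player values the game at about ${certainty_equivalent():.2f}")
```

Under these assumptions the infinite pile of expected dollars shrinks to roughly four bucks, which is in the neighborhood of what people say they would actually pay.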

Risk versus reward

Humans tend to be risk averse. That is, they will occasionally risk something in exchange for a reward, but they want that risk to be fairly close to the chances of success. It is true that the game of St. Petersburg has a chance for a massive reward, but the chance might be seen as too little compared to a risk of even $4.

Infinity doesn't exist

Some philosophers would argue that people do not accept the concept of infinity as a concrete reality. Any sales pitch to encourage people to play this game by promoting the infinity aspects would be less than compelling.

This might be why I don't buy lottery tickets. I don't play the lottery because my odds of winning are increased only slightly by actually playing. In my mind, the odds of me winning are infinitely small, or close enough to it that I don't treat the possibility of winning as real.

See Also

"Gamble Smart" [Hack #35]

A very interesting and thoughtful discussion of the St. Petersburg Paradox is in the Stanford Encyclopedia of Philosophy. The online entry can be found at http://plato.stanford.edu/entries/paradox-stpetersburg.
Chapter 5. Playing Games
Hacks 50-60

Games don't have to involve gambling to involve statistics. You can use knowledge of game-specific probabilities to win on TV game shows [Hack #50], at Monopoly [Hack #51], or when coaching a football team [Hack #58].

The most common place you see statistics in your everyday life is probably in the world of sports, though the word "statistics" isn't really used the same way a stat-hacker uses it. Sports fans tend to think of the data as the statistic. Regardless, there are plenty of hacks that can help you predict the outcome of a game before it is over [Hack #56] or even begun [Hack #55].

Since history is always our best guide to the future, your best predictions will require various ways to track, visualize [Hack #57], and rank [Hack #59] the performance of teams and players.

Of course, if you have the heart of a true statistics hacker, then you think that some statistical games (such as building a learning computer out of coconuts [Hack #52], doing card tricks through the mail [Hack #53], keeping your iPod honest [Hack #54], or estimating the value of pi purely by chance [Hack #60]) are fun all by themselves.
Hack 50. Avoid the Zonk

On the TV game show Let's Make a Deal, contestants often had to choose between three curtains. For these sorts of situations, there is a statistical strategy that will help you to win the Buick instead of the lifetime supply of Rice-A-Roni.

Imagine, if you will, that you are traveling with your Uncle Frank through an uncharted region of Tonganoxie, Kansas. You come to a fork in the road that branches out into three possible paths: A, B, and C. You don't know which will lead you to your destination, the fabled world's largest ball of twine (in Cawker City, Kansas). An old prospector is resting with his burro at the crossroads.

"Say, old timer," you say, "which road leads to the world's largest ball of twine?"

"Well," says he, "I know, but I won't tell you. What I will do, though, is tell you that one road is the correct road. Two are wrong and lead to certain disaster (or at least poorly maintained restrooms). Go ahead and take your pick, city slicker. As you drive off, look back at me. I won't signal whether you are right or wrong, but I will point at one of the other two roads. The one I point at will be a wrong road. You still won't know for sure whether you guessed right or not, of course, but I guarantee that I'll point at one of the two roads you are not on and it will be a wrong road."

You accept the strange man's offer (what choice do you really have?) and you ask Uncle Frank, the experienced gambler among you, to pick a road. He does so randomly and you head off optimistically down one of the three paths; let's say A. As you look back, the kindly prospector points to one of the other roads; let's say B. Immediately, you slam on the brakes and back the car up. Over the objections of Uncle Frank, you head down the remaining road, C, with the pedal to the metal, fairly confident that you are now on the right path.

Crazy, are you? Suffering from white-line fever? No, you've just applied the statistical solution to what is known as the Monty Hall problem and chosen the road among the three that has the greatest chance of being correct. Hard to believe? Read on, my friend, and prepare to win riches beyond your wildest dreams.

The best strategy in this case is so counterintuitive and downright weird that the world's smartest people have disagreed aggressively about whether it even really is the best strategy. But believe me, it is.

The Monty Hall Problem and Game Show Strategy

In our example with the three roads and the prospector, there is, in fact, a two-thirds (about 67 percent) chance that C is the correct road. To apply this odd strategy to a more realistic situation, think of contestants on game shows or gamblers in any game in which prizes are hidden in boxes or behind doors. As typically discussed among game show theorists and cranky statisticians, the problem is presented as a fairly common actual situation on the game show Let's Make a Deal (which had its heyday in the 1960s and 1970s), but it is a situation still seen today in TV game shows. The host of Let's Make a Deal was Monty Hall, so the problem carries his name.

As a game show scenario, the problem goes like this. Monty presents to you three curtains. He knows what is behind each curtain. He explains that behind one of the curtains is a brand-new car. The other two curtains hide worthless prizes, what Monty used to call zonks. (Zonks were often something like a donkey or a giant rocking chair, something that wouldn't be of any real use.) He lets you pick a curtain, and you will win whatever is behind it. Let's say you pick curtain A. He then opens one of the unchosen curtains (B, for example) to show you that it has a zonk behind it. He then offers to let you trade your original choice for the remaining curtain, C. Should you switch?

As with the three roads problem, the answer is yes, you should switch. The answer just never seems right the first time one hears it. But, if you want to increase your odds of winning the car, you should switch.

Why You Should Always Switch

Think of the probability of you guessing the correct curtain. Let's assume that it is a random guess: none of this "I notice that one curtain moved, so I figured there was a donkey behind it" stuff.

Three curtains, with only one curtain being a winner, means there is a 1 out of 3 chance that you will guess right and win the car. That's about 33 percent. On that first guess, with no additional information, you are likely to be wrong; in fact, you have a 2 out of 3 chance of being wrong. In other words, there is about a 67 percent chance that the car is somewhere behind the two curtains you did not pick.

Once you know that one of those other two curtains does not have the car, that doesn't change the original 67 percent probability that the car is somewhere behind those two unselected curtains. Remember, Monty will always have a wrong curtain he can open, no matter which one you choose. The 67 percent chance that the car is behind B or C remains true, even after B is revealed to not be hiding the car. The 67 percent likelihood now transfers to curtain C. That's why you should always switch to the other curtain.
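If the argument still feels slippery, you can count every case exactly. The Python sketch below assumes the car is equally likely to be behind any curtain, your first pick is random, and Monty chooses evenly when he has two zonk curtains he could open. That last detail is an assumption added here; it doesn't affect the totals. The function name is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

CURTAINS = "ABC"

def win_probabilities():
    """Return (chance of winning by staying, chance of winning by switching)."""
    stay = Fraction(0)
    switch = Fraction(0)
    # Nine equally likely combinations of car location and first pick.
    for car, pick in product(CURTAINS, repeat=2):
        case_weight = Fraction(1, 9)
        # Monty may open only a curtain that is neither your pick nor the car.
        openable = [c for c in CURTAINS if c != pick and c != car]
        for opened in openable:
            weight = case_weight / len(openable)
            remaining = next(c for c in CURTAINS if c not in (pick, opened))
            if pick == car:
                stay += weight
            if remaining == car:
                switch += weight
    return stay, switch

stay, switch = win_probabilities()
print(f"stay: {stay}, switch: {switch}")
```

Staying wins only in the cases where the first guess was right; every other case is a win for the switcher.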

If you were given the option of swapping your pick of one curtain for both the other two curtains, you'd switch in a second, wouldn't you? That's essentially what is offered in the Monty Hall problem.

Some figures might be necessary to persuade your inner skeptic. Look at Table 5-1, which shows the probability breakdown for the three options at the start of the game. You have a one-third chance of guessing the winning curtain and a two-thirds chance of picking a nonwinning curtain.

Table 5-1. Probability of car's location at start of game

Curtain A        Curtain B        Curtain C
33.33 percent    33.33 percent    33.33 percent

Table 5-2 shows the same probabilities grouped in a different way, but it hasn't changed any of the parameters of the problem.

Table 5-2. Restated probability of car's location at start of game

Curtain A        Curtain B or Curtain C
33.33 percent    66.66 percent

Table 5-3 shows the probabilities after Monty reveals one of the nonchosen curtains (Curtain B) to be a nonwinner. The 67 percent likelihood now transfers to curtain C.

Table 5-3. Probability of car's location after curtain B is opened

Curtain A        Curtain B       Curtain C
33.33 percent    0.00 percent    66.66 percent

In any situation like this, you should switch. You might be wrong, of course, but you have a better shot of winning that car or whatever other prize you are playing for if you accept any offers to switch. This is always the best strategy, if a few criteria are met:

The host knows what is behind each curtain.

The host reveals one of the unchosen curtains and the prize is not behind it.

Your original choice was random.

The Controversy

The Monty Hall problem and the general game show strategy that resulted were first introduced to the masses in 1990 by Marilyn vos Savant, a columnist for Parade Magazine. Known for being a "high IQ genius," vos Savant answered questions from readers, sometimes of a brain teaser nature. Someone sent in the problem as I've described it, and she published the answer I have given here.

Apparently, she received many letters, some angry, from statisticians, philosophers, and such claiming that she got it wrong. In scholarly journals, there were even published debates about whether her answer was correct. My read of the debate is that it turned out that most of the arguments centered on a key ingredient of the question: Monty knows what is behind each curtain, so when he opens that first curtain, he knows it will be a zonk. Otherwise, the reveal does not count as new information and the answer vos Savant gave does become debatable. Most of the critics of her answer missed that part of the original published question.
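You can see how much that key ingredient matters with a quick simulation. The Python sketch below compares a host who knows where the car is with a hypothetical host who opens one of the other two curtains blindly. Throwing out the games where the blind host accidentally reveals the car is an assumption made here for illustration. The function name and the fixed random seed are also just choices for this example.

```python
import random

def switch_win_rate(host_knows, trials=100_000, seed=1):
    """Share of games won by switching, among games where a zonk is revealed."""
    rng = random.Random(seed)
    wins = 0
    games = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        others = [c for c in range(3) if c != pick]
        if host_knows:
            # An informed host always opens a curtain hiding a zonk.
            opened = rng.choice([c for c in others if c != car])
        else:
            # A blind host opens either unchosen curtain at random.
            opened = rng.choice(others)
            if opened == car:
                continue  # the car was revealed by accident; skip this game
        remaining = next(c for c in others if c != opened)
        games += 1
        if remaining == car:
            wins += 1
    return wins / games

print(f"host knows:   {switch_win_rate(True):.3f}")
print(f"host guesses: {switch_win_rate(False):.3f}")
```

With an informed host, switching wins about two times in three. With a blind host who just happens to show a zonk, switching and staying come out about even, which is the 50/50 answer many critics had in mind.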

Don't be too concerned if the correctness of this solution isn't immediately apparent. Really smart people often first view the new odds as being 50/50 between the two unopened curtains and conclude, therefore, that it doesn't matter if you switch. The key to remember, though, is that your original chance of picking the correct curtain, 33.3 percent, cannot change no matter what happens after you make your choice. Even experts sometimes disagree about the best way to view this question. Even people as wise as the old prospector you met out in Tonganoxie, who started our discussion, don't always know the right answer to the Monty Hall problem. How do you think he won that burro?
Hack 51. Pass Go, Collect $200, Win the Game

Monopoly is a game of chance (and Chance cards). As such, the best strategies for winning capitalize on probability.

Winning the popular Parker Brothers board game Monopoly requires negotiating skill, clever money management, and insightful investment planning. It also requires a little bit of luck.

As two six-sided dice (and a randomly shuffled pile of cards) are the primary determinants for deciding what square you land on, luck plays more than just a small role in the outcome. Competitive statisticians such as you and me (or, at least, me) are drawn to any game in which probability plays a key part because, by applying a few probability basics, we should win more often than your average, run-of-the-mill railroad baron.

Monopoly Statistical Basics

Let's start by examining the simple effects of rolling two dice. Figure 5-1 shows the most common squares landed on in the first couple of turns for everyone.

Figure 5-1. Likely opening rolls


Imagine the start of the game, when everybody is on Go. With two six-sided dice, there is a 44.4 percent chance (16 of the 36 possible combinations) that a 6, 7, or 8 will be rolled, with 7 as the most likely outcome (16.7 percent). For your first two dice rolls, then, some squares are more likely to be hit (e.g., the light blues and Virginia Avenue) and some less likely (Baltic Avenue or Income Tax). Based on opening dice rolls alone, not all squares are equally likely to be landed on.
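Those dice percentages are easy to verify by listing all 36 combinations. Here is a minimal Python sketch; the variable names are just for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely combinations give each total.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probability = {total: Fraction(count, 36) for total, count in totals.items()}

for total in sorted(probability):
    print(f"{total:>2}: {float(probability[total]):.1%}")

middle = probability[6] + probability[7] + probability[8]
print(f"6, 7, or 8: {middle} of all rolls")
```

A total of 1 never appears in the list at all, since the smallest roll with two dice is 2.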

Poor Mediterranean Avenue cannot even be landed on when starting at Go, because a dice roll of 1 is not possible with two dice. Have you ever noticed that it is almost always one of the last properties still available for purchase?

The Go square is a good starting point to begin calculating the various likelihoods for landing. Not only does everyone start there at the beginning, but there is also a Chance card that sends players there. On the other hand, if a player hits the "Go to Jail" space, she goes directly to jail, bypassing Go. So, the probability for landing on Go is affected by not just the possible permutations of dice rolls, but also the various Chance cards, which send players various places, and the rules of the game itself, which include squares that make things happen, going to jail situations, and getting out of jail situations.

Key Properties

I've been using Go as an example square, but, of course, Go isn't even a square we can purchase. What we really want to know is what properties to buy or trade for and where to build first. We want high traffic areas; the secret to real estate success is "location, location, location" (and, apparently, for some reason I've never understood, a nice wooden deck).

Table 5-4 shows the top 20 most landed-upon squares, taking all rules into account. The table also shows the chance that a player will come to rest on any one of those squares. Keep in mind that an "average" square has a 2.5 percent chance of being your final resting place (100 percent divided by 40 squares is 2.5 percent).

Table 5-4. Best real estate in all of Atlantic City

Square                  Rank   Chance of ending your turn on it
Jail                    1      11.60 percent
Illinois Avenue         2      2.99 percent
Go                      3      2.91 percent
B&O Railroad            4      2.89 percent
Free Parking            5      2.83 percent
Tennessee Avenue        6      2.82 percent
New York Avenue         7      2.81 percent
Reading Railroad        8      2.80 percent
St. James Place         9      2.68 percent
Water Works             10     2.65 percent
Pennsylvania Avenue     11     2.64 percent
Kentucky Avenue         12     2.61 percent
Electric Company        13     2.61 percent
Indiana Avenue          14     2.56 percent
St. Charles Place       15     2.56 percent
Atlantic Avenue         16     2.54 percent
Pacific Avenue          17     2.52 percent
Ventnor Avenue          18     2.52 percent
Boardwalk               19     2.48 percent
North Carolina Avenue   20     2.47 percent

Table 5-4 is derived from information provided by Truman Collins on his web site at http://www.tkcs-collins.com/truman/monopoly/monopoly.shtml. Clever Mr. Collins developed both probability trees and a computer simulation to verify these values, and offers them for two situations: when players wish to remain in jail as long as possible (to earn rent and not have to pay rent) and when they wish to get out of jail as quickly as possible (to buy still available properties). I reported the values that apply to the former strategy.

You can draw some important tactical conclusions from this data:

Capitalize on the jailbirds

A remarkable 12 percent of the time, your opponent will begin a turn on the Jail square. Clearly, owning and developing the land that recently released parolees are most likely to land upon is a wise goal. This amounts to the orange properties (St. James and his brothers) and, to a lesser extent, the reds (e.g., Illinois Avenue) and the purples (St. Charles and friends).

Own the oranges

All three orange properties are in the top 10. About 1 out of every 12 rolls will result in a hit on Tennessee or New York Avenue or St. James Place. Getting the monopoly with these properties and developing quickly would seem to be the strategy that a pure statistician would choose.

Avoid the far side

Properties on the far side of the board (the greens, Boardwalk, and Park Place) are less likely to be landed upon, even deep into the game. Only Boardwalk and Pacific Avenue rank high, and Boardwalk is there, no doubt, because there is a Chance card that sends players there. These properties are also the most expensive to develop, so including these monopolies prominently in one's game plan is a bit risky.
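The "1 out of every 12" figure for the oranges can be checked by adding up the individual chances. The short Python sketch below uses only the percentages listed in Table 5-4 for the orange and red groups; the dictionary names are invented for this example.

```python
# Percent chance of ending a turn on each square, copied from Table 5-4.
landing_percent = {
    "St. James Place": 2.68,
    "Tennessee Avenue": 2.82,
    "New York Avenue": 2.81,
    "Kentucky Avenue": 2.61,
    "Indiana Avenue": 2.56,
    "Illinois Avenue": 2.99,
}

groups = {
    "orange": ["St. James Place", "Tennessee Avenue", "New York Avenue"],
    "red": ["Kentucky Avenue", "Indiana Avenue", "Illinois Avenue"],
}

# Add up the chances within each color group.
group_total = {
    color: sum(landing_percent[square] for square in squares)
    for color, squares in groups.items()
}

for color, total in group_total.items():
    print(f"{color}: {total:.2f} percent, or about 1 turn in {100 / total:.0f}")
```

The reds come out only slightly behind the oranges on traffic alone, which fits their spot as the runner-up group for catching players fresh out of Jail.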

Importance of the Monopoly Prison System

Without a statistical analysis, the crucial role that the Jail and "Go to Jail" squares play in the overall true value of real estate might not be so clear. One wishes Jail were for sale. Players will start or end their turn on the Jail square more often than they will land on any monopoly on the board. A constant stream of released prisoners floods across one side of the board, increasing the opportunity to collect rents on properties all the way up to Illinois.
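To get a feel for how much the "Go to Jail" square alone tilts the board, here is a deliberately stripped-down Python simulation. It assumes a single token that moves by two dice and is sent to Jail whenever it lands on "Go to Jail," and it ignores doubles, Chance and Community Chest cards, and any choice about staying in Jail. Because of those simplifying assumptions, its numbers will not match Table 5-4, and the function name is made up for this sketch.

```python
import random
from collections import Counter

BOARD_SIZE = 40
JAIL = 10         # square index of Jail, counting Go as 0
GO_TO_JAIL = 30   # square index of "Go to Jail"

def simulate(turns=200_000, seed=1):
    """Return the share of turns that end on each square index."""
    rng = random.Random(seed)
    position = 0  # start on Go
    landings = Counter()
    for _ in range(turns):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        position = (position + roll) % BOARD_SIZE
        if position == GO_TO_JAIL:
            position = JAIL  # nobody ever ends a turn on "Go to Jail"
        landings[position] += 1
    return {square: count / turns for square, count in landings.items()}

shares = simulate()
busiest = sorted(shares, key=shares.get, reverse=True)[:5]
for square in busiest:
    print(f"square {square:>2}: {shares[square]:.2%}")
```

Even this bare-bones version puts Jail at the top of the list, and the squares a roll or two past Jail get a little extra traffic as well.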

Jail can also provide a welcome respite from having to travel the streets paying rent to other players, though early in the game, Jail can prevent you from buying up your dream properties. A final observation on the importance of Jail: there is only one square that you can never end your turn on. Can you name it? Go to Jail.

See Also

Bill Butler runs another web site that presents the probabilities associated with Monopoly at http://www.durangobill.com/Monopoly.html. Among other things, the site hosts a discussion of the incredible calculation difficulties involved when one wishes to include every real-life detail of Monopoly play, such as keeping track of whether a particular Chance or Community Chest card has been drawn already.

The basic formula for calculating the probability of landing on a square (with cool London, England, street names in the example) is presented at http://hometown.aol.co.uk/monopolycheat/prob/method.ht
Hack 52. Use Random Selection as Artificial Intelligence

Statisticians were able to build intelligent, learning computers long before the advent of the microprocessor. You can use coconut shells and the laws of probability to build a machine that will learn to never lose at Tic-Tac-Toe.

A common joke about the 1960s TV show Gilligan's Island is that the Professor was always building computers or washing machines or rocket ships out of coconuts and vines. I don't know about washing machines and rocket ships (those do sound pretty far-fetched), but the castaways certainly could have built computers out of coconuts. You can, too. If you are ever stranded on a desert island and want a companion, build one.

You won't need a volleyball like Tom Hanks's buddy in Cast Away, and it won't have much personality, but your computer will be able to play games with you, and it will even learn and get smarter. The driving forces behind the learning algorithm are chance and the power of random selection.

Trial-and-Error Learning

According to behavioral psychologists, all animals (including humans, otters, and single-celled creatures) learn essentially the same way. Experience presents situations in which choices lead to outcomes. As the animal receives feedback about the outcome, it adapts. If the outcome was positive, the creature is more likely to make the same choice in the future. If the outcome was negative, the creature is less likely to make that choice again.

Notice that there is no guarantee that a "good" behavior is always repeated or that a bad behavior becomes extinct; it is only a matter of probability. The right decision is more likely to be made and the wrong decision is less likely to be made. To make a machine that mimics the way that animals learn, we must build on this probability angle.

Game playing reflects much of the trial-and-error learning process because outcomes are easily interpreted as positive (a win) or negative (a loss). In games, the feedback is often immediate, and studies show that the closeness in time between the choice and the feedback is a key factor in whether learning occurs. And learning, remember, is defined here as an increase in the likelihood of correct choices or a decrease in the likelihood of incorrect choices.

Building a Tic-Tac-Toe Machine

Stuck on this island with no friends, you might want to fight boredom by playing games with a smart opponent. Here are instructions for building a contraption that does not use any electricity or silicon, but will play a game and provide decent competition.

This machine learns: the more times you play against it, the better it will be. The game this machine plays is Tic-Tac-Toe, but theoretically, you could build a device for any two-person strategy game using the same principles. Tic-Tac-Toe is simple enough that it demonstrates well the methods of design, construction, and operation.

If the Professor on Gilligan's Island ever did build a computer out of coconuts, he was likely influenced by the pioneering work of biologist Donald Michie and his matchboxes. Michie published an article in the very first issue of the Computer Journal in 1963, a few years before Gilligan and his pals were stranded on their island. Michie describes how he designed and actually built a nonelectric computer with the following complete list of parts:

287 matchboxes

Each matchbox has a little drawer that can be opened. Michie labeled each matchbox with one of 287 different possible Tic-Tac-Toe configurations throughout a game. There are actually many more possible positions, but because the standard Tic-Tac-Toe layout of three rows and three columns is symmetrical, four different unique positions can be summarized with just one position. At any point in the game, the current layout of the "board" directs the human operator to the corresponding matchbox.

A large supply of beads of nine different colors

The nine colors represent each of the nine different
spaces on the Tic-Tac-Toe board. Each matchbox
begins with an equal supply of beads for each of the
possible next moves. Only beads representing legal
moves are put in each box. Different positions and
matchboxes, of course, correspond to only a small set of
legal next moves, so each box has a slightly different
mixture of beads.

The Professor would have used coconut shells instead of
matchboxes and sand pebbles or seeds (or perhaps Mr. Howell's
coin collection, which he never goes anywhere without) instead
of beads. Gather these supplies from your tropical surroundings,
organize the pebble-filled coconuts in an efficient grouping, and
you have your desert island game-playing computer. Yes, you'll
need to find 287 coconuts, but do you have anything better to
do?

Operating the Computer

To play a game of Tic-Tac-Toe against your pebble-powered PC,
follow these instructions:

1. The computer goes first. Find the coconut that is labeled
   with the current position. (For the first move, it is a blank
   layout.) Close your eyes and randomly draw out a
   pebble.

2. Mark an X on your board (drawn in the sand, I'm
   assuming) in the space indicated by the color of the
   pebble. Set the pebble aside in a safe place.

3. Make your move, marking an O in your chosen space.

4. There is a new position on the board now. Go to the
   corresponding coconut and randomly draw out a pebble
   from it. Return to step 2.

5. Repeat steps 2 through 4 until there is a winner or a
   draw.

What happens next is the most important part because it results
in the computer learning to play better. Behavioral psychologists
call this final stage reinforcement.

If the computer loses, "punish" it by taking the pebbles that you
drew randomly from the coconuts and throwing them into the
ocean.

If the machine wins or draws the game, return the pebbles to the
coconuts from which they came and "reward" it by adding to each
of those coconuts one additional pebble of the same color.
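The reward and punishment rules can be sketched in a few lines of Python. This is only an illustration, not Michie's design: the class name `Coconut`, the starting supply of three pebbles per move, and the numbering of board squares from 0 to 8 (left to right, top to bottom) are all assumptions made for the example.

```python
import random
from collections import Counter


class Coconut:
    """One container for a single board position (hypothetical helper)."""

    def __init__(self, legal_moves, start=3):
        # `start` pebbles per legal move is an assumed starting supply.
        self.pebbles = Counter({move: start for move in legal_moves})

    def probability(self, move):
        """Chance that a blind draw yields this move's color."""
        total = sum(self.pebbles.values())
        return self.pebbles[move] / total if total else 0.0

    def draw(self, rng=random):
        """Pick a move at random, weighted by the pebbles currently inside."""
        moves = [m for m, n in self.pebbles.items() if n > 0]
        if not moves:
            raise ValueError("this coconut is empty")
        weights = [self.pebbles[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]

    def reinforce(self, move, computer_lost):
        if computer_lost:
            # Punish: the drawn pebble goes into the ocean.
            self.pebbles[move] = max(0, self.pebbles[move] - 1)
        else:
            # Reward: the pebble goes back, plus one more of the same color.
            self.pebbles[move] += 1


# A position with five empty squares (3, 5, 6, 7, 8 in the assumed numbering).
box = Coconut(legal_moves=[3, 5, 6, 7, 8])
before = box.probability(7)              # equal chances at the start
box.reinforce(7, computer_lost=False)    # move 7 was part of a win or draw
box.reinforce(6, computer_lost=True)     # move 6 was part of a loss
after = box.probability(7)               # now more likely than before
```

After one rewarded game and one punished game, the rewarded move is drawn more often and the punished move less often, which is all the "learning" the machine does.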

Why It Works

The process of rewarding or punishing the computer essentially
duplicates the process by which animals learn. Positive results
lead to an increase in the likelihood of the rewarded behavior,
while negative results lead to a decrease in the likelihood of the
punished behavior. By adding or removing pebbles, you are
literally increasing or decreasing the true probability of the
machine making certain moves in the game.

Consider this stage of a game, where the computer, playing X,
must make its move:

     X | O | X
    ---+---+---
       | O |
    ---+---+---
       |   |

You probably recognize that the best move (really, the only move
to consider) is for the computer to block your impending win by
putting its X in the bottom center space. The computer, though,
recognizes several possibilities. It considers any legal move.
Two moves that it would consider (which means, literally, that it
would allow to be drawn randomly out of the coconut shell) are
the best move and a bad move:

     X | O | X         X | O | X
    ---+---+---       ---+---+---
       | O |             | O |
    ---+---+---       ---+---+---
       | X |           X |   |

When the computer first starts playing the game, both these
moves (or behaviors) are equally likely. Other moves are also
possible in this situation, and they are also equally likely. The
move on the left probably won't result in a loss, at least not
immediately, so as pebbles representing that move are added to
the coconut, the relative probability of that move increases
compared to other moves. The move on the right probably ends
in a loss (except against Gilligan, maybe), so the chance of that
move being selected next time mathematically decreases, as
there are fewer pebbles of that color to be randomly selected.

The probability of any given move being selected can be
represented by this simple expression:
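One way to write it that is consistent with the pebble-drawing procedure is the share of pebbles of the move's color among all pebbles in the coconut for the current position. The symbols are chosen here for illustration: n_m is the number of pebbles of the color for move m, and the sum runs over every legal move k in that coconut.

```latex
P(\text{move } m) = \frac{n_m}{\sum_{k} n_k}
```

With five legal moves and three pebbles of each color, every move starts at 3/15, or 20 percent.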

The machine begins with an equal number of pebbles or, in other
words, an equal likelihood of any of a variety of moves being
chosen. Of course, some moves look foolish to our experienced
game-playing eye and would never be made in a real game
except by the most naive of players. The point that behavioral
psychologists argue, though, is that all creatures are novices
until they have built up a large pool of experiences that have
shaped the basic probabilities that they will engage in a
behavior.

Hacking the Hack

There are several ways to modify your machine to make it
smarter. For example, you can choose to reward moves that lead
to wins more than moves that lead to ties. This should produce a
good player more quickly. Michie suggested three beads for a
win and one bead for a tie.

If you want to simulate the way animal learning occurs, you can
adjust the system so that moves near the end of the game are
more crucial than those made at the beginning. This is meant to
mirror the observation that reinforcement that comes closest in
time to when the behavior occurs is most effective. In the case
of Tic-Tac-Toe, mistakes that lead to immediate losses should
be dealt with and punished more effectively. By having fewer
total beads in use for moves late in the game, the learning will
occur more quickly.

An obvious upgrade is to make your computer smarter by not
even allowing bad moves. Don't even place pebbles representing
moves that will result in immediate defeat into your containers.
This will solve the problem of your computer's initial low
intelligence, but it doesn't really reflect the way animals learn.
So, while this might make for a stronger competitor, the
Professor would be disappointed in your lack of scientific rigor.

Hack 53. Do Card Tricks Through the Mail

A shuffled deck of cards is meant to be random. Scientific
analyses show that it actually isn't random, and you can
capitalize on known probabilities of card distributions to
perform an amazing card trick for people you have never
met.

Imagine you receive a thick, mysterious envelope in the mail.
Rather than having it disposed of by the nearest domestic
security officers, you open it and find an ordinary deck of cards
and the following set of instructions:

1. Cut the deck.

2. Shuffle the cards once, using a riffle shuffle (defined
   later in this hack).

3. Cut the deck again.

4. Shuffle the cards one more time using a riffle shuffle.

5. Cut the deck again.

6. Remove the top card of the deck, write it down, and place
   it anywhere in the deck.

7. Cut the deck again.

8. Shuffle again.

9. Cut one more time.

10. Mail this deck back to the enclosed address (a post
    office box in Tonganoxie, Kansas, or some other place
    with a name that conjures up wonder and whimsy).

You follow all these instructions (while wearing protective rubber
gloves) and return the deck. About a week later, a smaller
envelope arrives. In it is your chosen card! (There also might be
a request for $300 and an offer to predict your future, but you
just throw the offer away.)

Amazing, yes? Impossible, you say? Thanks to the known likely
distribution of shuffled cards, it is more than possible, and even
a budding statistician like you can do it. No enrollment in
Hogwarts necessary.

How It Works

Quite a bit is known, mathematically, about the effects of various
types of shuffles on a deck of cards. Though a thorough shuffle
(such as a dovetail or riffle shuffle, which interlaces two halves of
the deck) is meant to really scramble up a deck from whatever
order the cards were in to some new order that's quite different
from the original, parts of the original sequence of cards remain
even after several cuts and shuffles.

Statisticians have analyzed these patterns and published them
in scientific journals. The work is similar to that which resulted
in the groundbreaking suggestion that one should shuffle a deck
of cards exactly seven times to attain the best mix before
dealing the next round of hands for poker, spades, or bridge.

Picture a deck of cards in some order. After one shuffle, if the
shuffle is perfect, the original order would still be visible within
the now supposedly mixed distribution of cards. In fact, there
would be two original sequences now overlapping each other, and
by taking the alternate cards, you could reconstruct the original
overall order.

Table 5-5 shows a deck of cards before and after a perfect
shuffle. Just 12 are shown for efficiency's sake, but these
principles all apply to a full 52-card deck.

Table 5-5. Effect of perfect shuffling on card distribution

Before shuffle          After perfect riffle shuffle
1. Ace of Clubs         1. Ace of Clubs
2. Two of Clubs         7. Seven of Clubs
3. Three of Clubs       2. Two of Clubs
4. Four of Clubs        8. Eight of Clubs
5. Five of Clubs        3. Three of Clubs
6. Six of Clubs         9. Nine of Clubs
7. Seven of Clubs       4. Four of Clubs
8. Eight of Clubs       10. Ten of Clubs
9. Nine of Clubs        5. Five of Clubs
10. Ten of Clubs        11. Jack of Clubs
11. Jack of Clubs       6. Six of Clubs
12. Queen of Clubs      12. Queen of Clubs

If you knew the starting order of these 12 cards, you could pick
it out fairly easily by just looking at every other card in the new
grouping. These subpatterns are characterized as rising
sequences: the cards rise in value as you move along the
sequence. If cards begin in one long rising sequence (or a group
of four, because there are four suits), riffle shuffles will maintain
these rising sequences; they will just be interwoven. These
groupings of rising sequences will remain, even after many
shuffles.
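A short Python sketch makes the idea of rising sequences concrete. It assumes cards are simply numbered 1 through 12 in their starting order, and the function names `perfect_riffle` and `rising_sequences` are made up for this illustration; a perfect riffle here means an exact alternation of the two halves, top half first.

```python
def perfect_riffle(deck):
    """Interleave the top and bottom halves exactly, top card first."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend([a, b])
    shuffled.extend(bottom[len(top):])  # leftover card if the deck is odd
    return shuffled


def rising_sequences(deck, original):
    """Count the interleaved runs that preserve the original order."""
    position = {card: i for i, card in enumerate(deck)}
    count = 1
    for card, next_card in zip(original, original[1:]):
        # A new rising sequence starts whenever the next card in the
        # original order sits earlier in the shuffled deck.
        if position[next_card] < position[card]:
            count += 1
    return count


original = list(range(1, 13))          # 1 = Ace of Clubs ... 12 = Queen
once = perfect_riffle(original)        # same order as the shuffled column
twice = perfect_riffle(once)
one_shuffle_runs = rising_sequences(once, original)
two_shuffle_runs = rising_sequences(twice, original)
```

Each riffle shuffle can at most double the number of rising sequences, so after a few shuffles a 52-card deck still carries far fewer sequences than a truly random ordering would.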

If at any time during the shuffling and cutting of the deck, a card
is taken from the deck and purposefully placed anywhere else in
the deck, it will appear "out of place" compared to the overall
pattern of rising sequences. This, of course, is exactly what the
card trick's instructions demand, and it explains how your
mysterious magician (or you when you assume that role) could
spot what card has been moved.

For the sequence shown in Table 5-5, imagine that the Ace of
Clubs (#1 in the original sequence) was removed from the top of
the deck and placed randomly somewhere in the middle of the
cards. Let's say it ends up between the 4 and 10 of Clubs
(between #4 and #10 in the new distribution). It would now be
permanently out of sequence, and it is unlikely that any more
shuffling would move it back to where it belongs.
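Spotting the misplaced card can be sketched with a simple heuristic: try removing each card in turn and see which removal leaves the fewest rising sequences. This is an illustrative shortcut, not the formula published by the statisticians cited later in this hack, and the function names are hypothetical.

```python
def rising_sequences(deck, original):
    """Count the interleaved runs that preserve the original order."""
    position = {card: i for i, card in enumerate(deck)}
    count = 1
    for card, next_card in zip(original, original[1:]):
        if position[next_card] < position[card]:
            count += 1
    return count


def likely_moved_cards(deck, original):
    """Return the card(s) whose removal best restores the pattern."""
    scores = {}
    for card in deck:
        deck_without = [c for c in deck if c != card]
        original_without = [c for c in original if c != card]
        scores[card] = rising_sequences(deck_without, original_without)
    best = min(scores.values())
    return [card for card, score in scores.items() if score == best]


original = list(range(1, 13))                    # 1 = Ace ... 12 = Queen
shuffled = [1, 7, 2, 8, 3, 9, 4, 10, 5, 11, 6, 12]  # one perfect riffle

# Take the Ace (1) off the top and slip it in between the 4 and the 10.
moved = shuffled[1:]
moved.insert(moved.index(10), 1)

suspects = likely_moved_cards(moved, original)
```

In this tidy example only the Ace stands out. With sloppy shuffles or more shuffles, several cards can tie, which is why allowing extra guesses improves the odds.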

Cutting the cards between shuffles does
nothing to affect the overall sequence, if
we think of a deck of cards as an endless
loop. Nonstandard shuffles, such as
cutting the deck into three equal piles and
changing the order of those piles before
shuffling, will break down the sequence,
however, and the magic trick instructions
must clearly state that cards should be
cut once into two piles.

Of course, realistic analytic work about what happens to playing
cards in the hands of real-life people must take into account that
people are human and make human errors. As the philosophers
say, "To shuffle badly is human." Some cards that should have
been separated by exactly one card in a perfect riffle shuffle
might, unpredictably, be separated by two cards or might be
adjacent to each other and not separated at all. Table 5-6 shows
one possible outcome of a more human, less perfect, shuffle.

Table 5-6. Possible effect of sloppy shuffling on card distribution

Before shuffle          After realistically human riffle shuffle
1. Ace of Clubs         1. Ace of Clubs
2. Two of Clubs         7. Seven of Clubs
3. Three of Clubs       8. Eight of Clubs
4. Four of Clubs        2. Two of Clubs
5. Five of Clubs        3. Three of Clubs
6. Six of Clubs         9. Nine of Clubs
7. Seven of Clubs       10. Ten of Clubs
8. Eight of Clubs       5. Five of Clubs
9. Nine of Clubs        4. Four of Clubs
10. Ten of Clubs        11. Jack of Clubs
11. Jack of Clubs       6. Six of Clubs
12. Queen of Clubs      12. Queen of Clubs

This randomness in how a person will actually shuffle the cards
creates both a dilemma and an opportunity. The dilemma is that
correctly identifying which card is out of sequence is now not
certain, because the sequences cannot be perfectly
reconstructed and the magician must rely a bit on probabilities,
which adds some risk to the trick.

The opportunity comes when the subject of the trick realizes
that you could not possibly count on the execution of perfect
shuffles. When you identify the chosen card anyway, in the midst
of this random uncertainty, the bewilderment will be even greater.

Probability of Success

Because the exact nature of the scrambling of the deck cannot
be known, the magician might identify a card as out of sequence
only because the shuffles were less than perfect. Also, the trick
is much more likely to be successful (only one card is out of
sequence) if the instructions do not allow any more cutting or
shuffling after the card is taken from the top of the deck and
placed in the middle.

Statisticians from Columbia and Harvard University, Dave Bayer
and Persi Diaconis, have conducted a mathematical exploration
of the possible outcomes of a deck of cards shuffled and mixed
in the ways described for this magic trick. (Presumably, the
faculty at these institutions has a lot of free time on its hands?)
They developed a mathematical formula for identifying the one
card out of place and ran a million computer simulations to test
the accuracy of guesses by their cyber-sorcerer as to the
chosen card. Their analysis assumed perfect dovetail shuffles.
They found that with only a couple of shuffles, the trick works
pretty well, but the odds of success decrease quickly as more
shuffles are allowed.

Table 5-7 shows the probability of success for a 52-card deck
shuffled different numbers of times. It also shows the chances
that the correct card would be chosen if more than one guess
were allowed.

Table 5-7. Chance of pulling off the seemingly ...

Number of guesses   Two shuffles    Three shuffles   Four shuffles   ...
1                   99.7 percent    83.9 percent     28.8 percent    8. ...
2                   100 percent     94.3 percent     47.1 percent    16 ...
3                   100 percent     96.5 percent     59.0 percent    23 ...

Of course, the odds go down slightly when one takes into
account random errors in real-world shuffling, but the relative
success rate would still be as Table 5-7 indicates. If you
perform the trick as described (one guess, after three shuffles),
the guess should be correct around 80 percent of the time
(lowering the 83.9 percent estimate somewhat arbitrarily to take
into account bad shufflers).

To play it safe, you might do the trick with at least three people.
Then, assuming 80 percent likelihood for each person, the
chance that you will amaze at least one of those three people
increases to 99.2 percent (1 minus the 0.2 x 0.2 x 0.2 chance of
missing all three), which is almost a certainty. If you are
wrong on all three, just never speak or write to those people
again, close your post office box, and concentrate on more
important things in life. After all, with hard work, you might get
into Columbia or Harvard someday and do really important
things.
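The arithmetic behind trying the trick on several people is the complement rule: the chance of at least one success equals one minus the chance that every attempt fails. A minimal sketch, assuming the attempts are independent and that each has the same 80 percent chance (the function name is made up for illustration):

```python
def chance_of_at_least_one_success(p_single, attempts):
    """Complement rule: 1 minus the chance that every attempt fails."""
    return 1 - (1 - p_single) ** attempts


# Three independent recipients, each amazed with probability 0.80.
three_people = chance_of_at_least_one_success(0.80, 3)

# How the odds grow as more envelopes go out.
by_attempts = {n: chance_of_at_least_one_success(0.80, n) for n in range(1, 6)}
```

The same function shows diminishing returns: most of the benefit comes from the second and third recipient.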

See Also

The Bayer and Diaconis study appeared in 1992 in The
Annals of Applied Probability, 2(2), 294-313. In that
article, they cite two magicians as early developers of
card tricks based on the rising sequences principle (see
the following two bullets).

Williams, C.O. (1912). "A card reading." The Magician
Monthly, 8, 67.

Jordan, C.T. (1916). "Long distance mind reading." The
Sphinx, 15, 57. This is the presentation on which the
effect described in this hack is based.
Hack 54. Check Your iPod's Honesty

Find out how random your iPod's "random" shuffle really is.

Personalized song ratings in Apple's iTunes, the software that
allows you to play songs on your iPod, let you quickly find your
favorites and help the Party Shuffle feature play more of what
you like most. The algorithm iTunes uses to pick what comes
next in the playlist is meant to select randomly from your
favorites. Is it really random, though?

After hearing one artist played over and over during a shuffled
play of your entire music library in iTunes, you might think your
player has a preference of its own. Apple, though, claims the
iTunes shuffle algorithm is completely random. The shuffle
algorithm chooses songs without replacement. In other words,
much like going through a shuffled deck of cards, you will hear
each song only once until you have heard them all (or until you
have stopped the player or selected a different playlist).

iTunes's Party Shuffle is a different matter. Its algorithm selects
songs with replacement, meaning the entire library is reshuffled
after each song is played (like reshuffling a deck of cards after
every time a card is drawn). The "Play higher rated songs more
often" option does exactly what it says, but how much
preference is given to higher rated songs?

This hack originally appeared as an article
on the OmniNerd web site at
http://www.omninerd.com/.

Assessing iTunes's Selection Procedures

I wanted to test two different song selection options: Party
Shuffle and "Play higher rated songs more often." I created a
short playlist of six songs: one from each different star rating
and a song left unrated. The songs were from the same genre
and artist and were each changed to be only one second in
duration.

I conducted my tests on iTunes 5. iTunes
6 has added a Smart Shuffle feature, which
may decrease the chances of hearing
songs from the same artist or album
consecutively, but I haven't tested it yet.

After resetting the play count to zero, I hit Play and left my desk
for the weekend. I ran the same songs twice: once selecting
random (Party Shuffle) and once selecting both random and the
"Play higher rated songs more often" option. Table 5-8 shows
the play counts, as of Monday morning.

Table 5-8. Song selection distribution

               Random selection                Based on rating
Song rating    Times played   Percentage       Times played   Percentage
                              of total                        of total
None           9,105          16.70 percent    2,052          3.9 percent
1              9,055          16.60 percent    6,238          11.8 percent
2              9,090          16.67 percent    8,125          15.4 percent
3              9,114          16.71 percent    10,020         18.9 percent
4              9,027          16.55 percent    12,158         23.0 percent
5              9,146          16.77 percent    14,293         27.0 percent
Total          54,537         100 percent      52,886         100 percent

The play counts in the random trial were very close to each
other, as can be expected with a random selection. For the trial
based on song ratings (or rating biased selection), the preference
algorithm appears to be linear from 12 percent to 27 percent for
the rated songs. Moving from the five-star rating downward, the
linear preference declines by around 4 percentage points with
each step down in rating, but the drop doubles from one-star to
unrated, with a fall of 8 percentage points. While one star might
seem like the lowest rating, no rating proved the black sheep of
the lot.
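Both observations can be checked directly from the play counts. The sketch below is an illustration added here, not part of the original test: it computes a chi-square goodness-of-fit statistic for the random trial against equal expected counts, then the share of plays and the step between adjacent ratings for the rating-biased trial. The constant 11.07 is the standard 5 percent critical value for five degrees of freedom.

```python
# Play counts in rating order: unrated, then 1 through 5 stars.
random_counts = [9105, 9055, 9090, 9114, 9027, 9146]
rated_counts = [2052, 6238, 8125, 10020, 12158, 14293]


def chi_square_uniform(counts):
    """Goodness-of-fit statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


CRITICAL_5PCT_DF5 = 11.07  # chi-square critical value, 5 degrees of freedom

chi2 = chi_square_uniform(random_counts)
looks_uniform = chi2 < CRITICAL_5PCT_DF5

# Share of plays per rating in the rating-biased trial.
total = sum(rated_counts)
shares = [c / total for c in rated_counts]

# Jump in share between each rating and the next one up.
steps = [higher - lower for lower, higher in zip(shares, shares[1:])]
```

A statistic far below the critical value means the random trial gives no reason to doubt equal selection chances, and the first entry of `steps` (unrated to one star) is roughly double the others.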

Your iPod assumes that if you haven't
provided a rating for a song, you must want
to hear it even less frequently than those
songs to which you have assigned your
lowest rating. This is a bit like choosing a
movie with bad reviews over a movie that
hasn't been reviewed.

Figure 5-2 shows the effects of different song selection options.
You can judge the randomness of the true random selection option
by seeing if those "Random" bars in the figure all seem the same
height. The linear nature of the "Rating Biased" bars can be
judged by imagining whether there are equal jumps in height as
one moves from a rating of 1 to a rating of 5.

Figure 5-2. Patterns of song selection

Calculating the Statistics of the Selection Process

Changing the number of songs within each rating changes the
probabilities for each song's selection. With multiple songs of
each rating, the chance of a song with rating r coming up next in
the ratings-biased Party Shuffle can be calculated using this
expression:
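A form consistent with the description of x and P is a weighted share: the songs at rating r, multiplied by that rating's weight, divided by the same product summed over all ratings. The index i and the label "none" for unrated songs are notation chosen here for illustration.

```latex
P(\text{next song has rating } r) = \frac{x_r P_r}{\sum_{i} x_i P_i},
\qquad i \in \{\text{none}, 1, 2, 3, 4, 5\}
```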

Subscripts in this expression indicate the song rating. The
chance of a song being chosen is based on x (number of songs
with each rating) and P (the proportional weight assigned by the
iTunes algorithm for each rating).

With iTunes's preference probabilities for each rating determined
from the weekend-long sampling run, here's the resulting
expression:
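Taking the weights to be the "Percentage of total" values from the rating-based trial in Table 5-8 (an assumption about how the weights were set), the expression becomes the following, where P_r in the numerator is the matching weight for rating r:

```latex
P(\text{next song has rating } r) =
\frac{x_r P_r}{0.039\,x_{\text{none}} + 0.118\,x_1 + 0.154\,x_2
      + 0.189\,x_3 + 0.230\,x_4 + 0.270\,x_5}
```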

Although the higher-rated songs are given preference, you will
not definitively hear more five-star rated songs than all other
songs. Let's assume most people follow a normal distribution for
their ratings [Hack #23], with the three-star rating being the
most common. Table 5-9 displays a hypothetical iTunes library
with this bell-shaped curve for the rating song count.

Table 5-9. Typical song rating distribution

Song rating    Number of songs
None           72
1              321
2              1,527
3              1,812
4              507
5              95

If I run these hypothetical numbers through our frequency
equations, I get a distribution that looks like Figure 5-3.

Figure 5-3. Probability distribution of song selection

As you can see in Figure 5-3, the chance of a song with a
particular rating coming up next in the playlist is greatly
determined by the song count within the rating. The iTunes
preference for higher-rated songs and dislike for lower-rated
songs only slightly raises or lowers the probability determined
first from the song count.
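The distribution can be recomputed with a few lines of Python. This sketch assumes the rating weights equal the observed play shares from the weekend test (3.9 through 27.0 percent) and uses the hypothetical library counts from Table 5-9; the function name `rating_chances` is made up for illustration.

```python
# Assumed weights: observed share of plays per rating in the biased trial.
weights = {"none": 0.039, "1": 0.118, "2": 0.154,
           "3": 0.189, "4": 0.230, "5": 0.270}

# Hypothetical library: number of songs at each rating.
counts = {"none": 72, "1": 321, "2": 1527, "3": 1812, "4": 507, "5": 95}


def rating_chances(counts, weights):
    """Chance that the next song carries each rating."""
    denominator = sum(counts[r] * weights[r] for r in counts)
    return {r: counts[r] * weights[r] / denominator for r in counts}


chances = rating_chances(counts, weights)
most_likely_rating = max(chances, key=chances.get)
```

Even though five-star songs carry the largest weight, three-star songs dominate simply because there are so many more of them.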

These chances of hearing a song with a certain rating can be
applied to find the chances of hearing a particular song. If we
remove the song count from the numerator in the song selection
expression, we can calculate the chance of a certain specific
song, not just the rating, coming up next:
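Following that description, dropping the song count x_r from the numerator leaves only the rating's weight over the same denominator. As before, the index i and the "none" label are notation chosen for illustration.

```latex
P(\text{a specific song with rating } r \text{ plays next}) =
\frac{P_r}{\sum_{i} x_i P_i},
\qquad i \in \{\text{none}, 1, 2, 3, 4, 5\}
```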

Explaining Statistical Surprises

About a month after running these tests, I noticed my iTunes
Party Shuffle at work played the same song two times in a row.
This was the first time I had noticed a consecutive repeat, and I
checked the playlist. Not only did I find Nirvana's "Territorial
Pissings" listed twice in a row, but A.F.I.'s "Death of Seasons"
was listed twice in a row three tracks later.

I use the "Play higher rated songs more often" option, but these
were each middle-of-the-road 3-star songs, and my song library
has nearly 4,000 songs. The odds might seem outrageous at
first, but you have to realize just how many songs you hear
throughout a workday. If I average 10 hours at work each day
and average a 3 1/2-minute song duration, odds say I should
hear a consecutive repeat in less than a month.
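A rough check of that claim, as a sketch under simplifying assumptions: every song in a 4,000-song library is treated as equally likely (ignoring the rating weights), and selection is with replacement, so each new song matches the previous one with probability 1/4,000.

```python
HOURS_PER_DAY = 10
MINUTES_PER_SONG = 3.5
LIBRARY_SIZE = 4000

songs_per_day = HOURS_PER_DAY * 60 / MINUTES_PER_SONG

# Assumption: uniform selection with replacement, so the chance that any
# given song repeats the one just played is 1 / LIBRARY_SIZE.
p_repeat = 1 / LIBRARY_SIZE

# Waiting time for the first repeat is geometric: 1 / p songs on average.
expected_songs = 1 / p_repeat
expected_workdays = expected_songs / songs_per_day


def chance_of_repeat_within(workdays):
    """Chance of at least one back-to-back repeat in that many workdays."""
    plays = songs_per_day * workdays
    return 1 - (1 - p_repeat) ** plays


within_30_days = chance_of_repeat_within(30)
```

Under these assumptions the first back-to-back repeat is expected after roughly 23 workdays of listening.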

Many claim to still see patterns as iTunes rambles through their
music collection, but the majority of these patterns are simply
multiple songs from the same artist. Think of it this way: if you
have 2,000 songs and 40 of them are from the same artist,
there is always about a 2 percent chance of hearing them next
with random play. Right after one of their songs finishes, odds
show a 50 percent chance a song by the same artist will play
again within the next 35 songs and a 64 percent chance they
will be played again within the next 50 songs. This can be
calculated following this equation:
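The figures quoted are reproduced by the complement rule, written here with p for the chance on any single play (0.02) and n for the number of songs that follow. The symbols are chosen for illustration.

```latex
P(\text{artist heard again within } n \text{ songs}) = 1 - (1 - p)^{n}
\\[4pt]
1 - 0.98^{35} \approx 0.51, \qquad 1 - 0.98^{50} \approx 0.64
```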

As we have seen in other hacks, a low likelihood event (such as
our 2 percent chance of repeating an artist) becomes a highly
likely event after just a few opportunities [Hack #46].

It's simply the mind's tendency to find a pattern that makes you
think iTunes has a preference.

See Also

Additional technical information about iPods and shuffling can be
found at these sources:

Levy, Steven. "Does Your iPod Play Favorites." January
31, 2005.
http://msnbc.msn.com/id/6854309/site/newsweek/.

Hofferth, Jerrod. "Using Party Shuffle in iTunes." August
22, 2004.
http://ipodlounge.com/index.php/articles/comments/using-party-shuffle-in-itunes/.

Brian Hansen
Hack 55. Predict the Game Winners

The information provided by correlations allows for predicting
any outcome, especially sports. With multiple regression
techniques and a little software, you can guess the winner
before the game is played. The trick is picking the right
predictors.

The conventional use of correlations [Hack #11] is to find out
how much two variables share in common or, more technically,
how much variance is shared between the two variables.

Shared variance is a mathematical term to
describe the amount of redundant
information reflected in two variables.
When lots of variance is shared, prediction
is easy and accurate because knowledge
of one variable leads to knowledge about a
second. Shared variance is estimated by
squaring the correlation.

But our everyday world consists of way more than only one
variable predicting another. In fact, in most cases there are
several or multiple variables that predict a particular outcome.
Here we are not dealing with the prediction of just one variable
from another, but the prediction of one variable from several.
This tool is called multiple regression (because there is more
than one predictor variable).

Serious sports gamblers, bookies, and casino operators are
familiar with multiple regression, or at least they should be. So
much information is available about sports teams that there are
almost certainly all sorts of variables that, in the right
combinations, can fairly accurately predict which team will win.

Betting on professional football is one of the most common of all
gambling practices (or so I have been told). This hack shows
how to gather data and use multiple regression to predict the
winner of any football matchup. This example involves
predicting who will win the Super Bowl, the National Football
League's championship game.

Choosing Predictor Variables

The first step is to build your model (the predictors and their
weights that you will use to make your prediction). For football,
there are dozens of statistics kept and available about teams'
past performances and player characteristics. Some make
sense as predictors of future performance (e.g., past
performance), while others do not (e.g., cuteness of the mascot).
The chance to win money, though, is a powerful motivator, so I
would take the time and effort to collect just about every
statistic I could find about every team and every game. The key
is to find variables that on their own correlate pretty well with
winning the Super Bowl.

Let's pretend that you have done your research and found six
variables that correlate with whether a team wins or loses. Some
make sense; some do not. You are interested in getting the most
accurate real-life prediction you can get, so you are willing to
include the kitchen sink if it will make a difference. To be clear,
you took each year that a team was in a Super Bowl and then
gathered data for that team from that year.

Imagine you've found that the following variables are of interest
and might be useful in predicting this outcome based on
previous years' performance and the characteristics of 30
teams. The variables you'll be using in your model begin with the
outcome of interest: did the team win the Super Bowl during the
year that the data is gathered from (Yes = 1, No = 2)?

The following variables were found to correlate with the outcome:

Number of easy wins during the season (won by more
than nine points)

Average attendance during the season

Average number of hot dogs sold per game

Average temperature of team's Gatorade

Average weight of defensive linemen

When you do this analysis with real data, you'll likely find a
different mix of potential predictors.
Entering the Data into a Spreadsheet

Social scientists often use statistical software such as SPSS or
SAS, but for this example, I used an Excel worksheet and
Excel's very cool Data Analysis ToolPak (and the Regression
Tool). I entered some made-up but realistic data into the
spreadsheet shown in Table 5-10.

What? You thought I was going to show
you a real secret formula for predicting the
outcomes of football games? I'm only
showing you how to make your own. I'll
keep mine to myself, thank you very much!

Table 5-10. Super Bowl predictors

Team   Won Super Bowl?   Easy wins   Attendance   Hot dogs
A      1                 11          56,533       4,798
B      2                 9           44,543       5,715
C      1                 8           45,543       9,753
D      1                 6           45,768       8,020
E      1                 8           76,786       5,395
F      1                 11          56,533       1,054
G      2                 9           56,554       750
H      2                 12          44,675       6,576
I      2                 11          56,667       9,187
J      2                 10          65,545       4,533
K      2                 12          78,756       1,963

Table 5-10 shows some of the 30 rows of fictional data I
collected, representing 30 examples I used in my statistical
analysis. The more rows of data you have, the more instances
the analysis can draw on and the more accurate your eventual
predictions will be.

Building a Regression Equation

You might remember from your high school days that the formula
for a simple straight line looks something like this:

\[ Y' = bX + a \]

This equation is made up of the following variables:

Y'

    Predicted score on variable Y

b

    The slope of the line

X

    The score of a single predictor

a

    The intercept (where the straight line crosses the Y or
    vertical axis)

So, for example, if you wanted to predict human height from
weight and had a bunch of data to create such a formula, you
would plug in the various values and get a specific slope and
intercept for the line.

With such a formula, if your weight (the X variable) is 125 pounds,
the prediction is that you will be about 64 inches tall, or about 5
feet 4 inches.

But when we have more than one predictor variable, things get
more interesting and more fun. There is a longer series of
predictors (many Xs) and weights (many bs).

I ran a multiple regression analysis using this data in SPSS, a
statistical software program, but you can get much of the same
information using Excel (see the "Getting Regression Info in
Excel" sidebar).
Getting Regression Info in Excel

There are two ways to get statistical regression info
using Excel. First, you can use the SLOPE and INTERCEPT
functions, which you can find on the Insert Function
menu. Select the function and enter the argument (the
cells where the data is located), and Excel returns
these values, allowing you to plug in known values and
predict others. This method works best when there is
just one predictor.

You c an als o make us e of the Regres s ion option in the


D ata A nalys is ToolP ak, an E xc el add- on (whic h you
might have to ins tall). U s ing this option on the Tools
menu, you c an tes t the s ignific anc e of the regres s ion
c oeffic ient us ing an F tes t, a s tatis tic al tes t s imilar to a
t tes t [H ac k #1 7 ].
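
For readers working outside Excel, the single-predictor case can be sketched in a few lines of Python. This is a minimal illustration of the ordinary least-squares slope and intercept, the quantities the SLOPE and INTERCEPT functions report; the function name and the tiny data set are assumptions made up for this example, not data from this hack.

```python
def slope_intercept(xs, ys):
    """Least-squares slope (b) and intercept (a) for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return b, a


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

b, a = slope_intercept(xs, ys)
predicted = b * 6 + a  # plug a known X into Y' = bX + a
print(b, a, predicted)
```

Once b and a are known, predicting a new score is just the straight-line formula with the new X plugged in.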

The results (a.k.a. the output) are shown in Tables 5-11 and
5-12. Let's see which of the variables best assist us in predicting
whether a team will win the Super Bowl.

Table 5-11. Regression statistics

Multiple R    R square    Observations
0.8483        0.7196      30

Table 5-12. Regression equation

Variable         Coefficients    T stat    P-value
Intercept        -0.784          -1.010    0.323
Easy wins         0.119           4.274    0.000
Attendance        0.000          -0.822    0.416
Hot dogs sold     0.000           1.043    0.308
Gatorade          0.013           2.457    0.022
Weight            0.001           0.580    0.567

Table 5-12 shows a coefficient (a weight) for each of the five
variables that were entered into the equation to test how well
each one predicts Super Bowl wins. For example, the coefficient
associated with "Easy wins" is .119.

If we combine all of these into one big equation for predicting
Super Bowl outcomes, here's the model we get:

Y' = b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + a

So, for each of the predictors (variables X1 through X5), there is
a specific weight (the bs in the formula or the coefficients in the
results).

Now, the same formula in English:

b*Wins + b*Average Attendance + b*Hot Dogs + b*Temp +
b*Weight + a

And using the numbers from the output shown in Table 5-12,
here's the real live regression equation:

Y' = .119(Easy wins) + .000(Attendance) + .000(Hot dogs) +
.013(Gatorade) + .001(Weight) - .784

Interpreting and Applying the Regression Equation

Imagine using this equation with all the rows of data you entered
into your spreadsheet. There would be a pretty high correlation
between the actual Super Bowl outcomes and the predicted
outcomes. I know this because of the "Multiple R" part of the
output shown in Table 5-11, which shows a pretty high
correlation. 0.84 is close to 1, which is the highest correlation
you could get.

The "R square" of .72 is the proportion of
shared variance that we talked about earlier
in this hack.

What does this mean? The combination of these predictor
variables is a pretty effective way to judge whether a team will
win the Super Bowl. Foolproof? Of course not, since the
combination of these variables does not perfectly predict the
outcome, but it does a pretty solid job.

So, let's say that this year's Denver Cannonballs have the data
points shown in Table 5-13.

Table 5-13. Data for Denver Cannonballs

Variable      Value
Easy wins     13
Attendance    35,678
Hot dogs      4,567
Gatorade      65
Weight        267

Plugging this data into the equation shown earlier, here's what we
get for a predictor of Y:

Y' = .119(13) + .000(35,678) + .000(4,567) + .013(65) +
.001(267) - .784 = 1.875

The final value for Y is 1.875, a bit closer to 2 (meaning they are
not predicted to win) than to 1 (meaning they are predicted to
win).
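
The plug-in arithmetic can be checked with a short Python sketch. The coefficients and intercept are the rounded values from Table 5-12 and the team values are those in Table 5-13; the dictionary layout and the function name are assumptions chosen for this illustration.

```python
# Rounded coefficients from Table 5-12
COEFFICIENTS = {
    "Easy wins": 0.119,
    "Attendance": 0.000,
    "Hot dogs": 0.000,
    "Gatorade": 0.013,
    "Weight": 0.001,
}
INTERCEPT = -0.784

# Denver Cannonballs values from Table 5-13
denver = {
    "Easy wins": 13,
    "Attendance": 35678,
    "Hot dogs": 4567,
    "Gatorade": 65,
    "Weight": 267,
}


def predict(team):
    """Weighted sum of the predictors plus the intercept."""
    return INTERCEPT + sum(COEFFICIENTS[name] * team[name] for name in COEFFICIENTS)


prediction = predict(denver)
print(round(prediction, 3))  # 1.875
```

Because the attendance and hot dog weights round to zero, those two terms contribute nothing to the total.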

What's the key to a good set of predictors?

All the predictors should be independent of each other (if
at all possible), since you want each of them to make a unique
contribution to the understanding of what you are
predicting.

Each of the predictors should be as highly related as
possible to the outcome that you are predicting.

Improving Your Regression Equation

A careful examination of the equation produced in this hack
indicates that the bulk of the predictive power comes from just
two variables: the number of easy victories and the temperature
of the team's Gatorade. Also, many of the predictors have zero
weights, which means you don't need them at all. You could
remove these unhelpful variables (attendance and hot dogs sold)
to streamline your formula. In fact, collecting data on easy wins
and Gatorade temperature alone is enough to make fairly
accurate predictions in our example.

Neil Salkind
Hack 56. Predict the Outcome of a Baseball Game

Turn your radio on in the middle of a baseball game for five
seconds and then turn it off. Without hearing the score, you'll
be able to name the winner, and you'll be right more than half
of the time.

Look, I'm a busy guy. I'm always looking for a way to save time
on the less important things in life, such as following my local
baseball team, so I'll have more time to spend on the important
things in life: friends, family, debating the logic of Holm's
sequential Bonferroni procedure as the appropriate follow-up
method to analysis of variance, and so on. A case in point
happened just the other day. Wanting to know whether the
Kansas City Royals would win a baseball game that was in
progress, I hardly had time to wait until the game was over. I
wanted to know right now!

Much like Veruca Salt and her interest in
owning one of Willy Wonka's Oompa-Loompas
"now!", I don't have much patience.

Like a bolt from the blue, I realized that I could turn on my car
radio for just a few seconds and have enough information to
guess the outcome of the game. And I could do that without
hearing the score or who was on base.

How It Works

During the first couple of hours of a baseball game, turn on the
radio broadcast of that game. Listen just long enough to identify
the team that is at bat. That team has a greater than 50 percent
chance of winning that game.

Why It Works

Baseball is a game where the longer you are on offense, the more
runs you can score. As more batters come to bat in a single
inning, the chances of moving runners along the base paths and
across home plate increase. Another way to look at it is to
imagine the end of an inning that was huge for one team. If a
team scored a lot of runs, they had to have used considerably
more than the minimum of three batters in that inning and,
consequently, been at bat a proportionately longer length of time
than the other team. Over the course of a game, the team that is
at bat longest is more likely to score more (or have more
productive innings).

Sampling theory [Hack #19] suggests that a sample is most
likely to capture the most common elements of a population. Our
population here is all the moments during a game that we could
listen to. The most common characteristic in the population (in
terms of who is at bat) belongs to the team that is at bat the
most.

Figure 5-4 suggests a possible distribution of at-bat time for a
regulation nine-inning game. In this example, the winning team
was on offense for 58 percent of the time. In retrospect, a
random tuning in to the broadcast had a 58 percent chance of
finding the winning team at bat.
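
The sampling argument can be illustrated with a small simulation. This is only a sketch under a stated assumption: each tune-in lands at a uniformly random moment of a game in which the eventual winner bats 58 percent of the time (the imagined figure above). The function name, trial count, and seed are arbitrary choices for this example.

```python
import random


def tune_in_accuracy(winner_share, trials, seed=0):
    """Fraction of random tune-ins that find the eventual winner at bat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.random() < winner_share)
    return hits / trials


accuracy = tune_in_accuracy(0.58, 100_000)
print(accuracy)
```

Over many imagined games the hit rate settles near the winner's share of batting time, which is all the method relies on.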

Figure 5-4. Time at bat for winning and losing teams

The accuracy of prediction should be above 50 percent over the
long run of baseball broadcasts, but it won't be really, really
accurate. This is because the relationship between time at bat
and scoring a victory is not a perfect correlation [Hack #11].
Players can score quickly (hit a home run on the first pitch, for
example), or they can take their time getting many hits but strand
many runners and never score.

Overall, the correlation between the two variables should be
positive, however. Even the perhaps unimpressive 58 percent
accuracy in my imagined data in Figure 5-4 means that you will
be right 16 percent more often than a blind guess (58 / 50 =
1.16). With such an advantage at the blackjack tables, you would
be a millionaire in a week.

Proving It Works

To test the accuracy of my claim, you can use the data that
appears in your daily newspaper. While most box scores do not
include information about total time-at-bat for each team, there is
a variable that provides almost the same information. There will
almost certainly be a "total at-bats" reported. While this
statistic is not the same as time spent at bat, it should correlate
pretty highly. Each day, this information is provided for more
than a dozen games, and just a few days' worth of data should be
enough to test my theory. Gather the total at-bats for each team,
and note which team won the game.

Real-life researchers often don't have
access to the variable they would really
like to know about, and our using number of
at-bats instead of time at bat is a good
example of this. Instead, we must settle
for the next best thing available.
Scientists call these substitutes proxy
variables or surrogate variables.

My hypothesis is that the team with the most at-bats should win
the game more than 50 percent of the time. Out of curiosity, I
tested this hypothesis myself. I used the Chicago Cubs as an
example, because their stats were readily available on the Web.
I arbitrarily chose 2003 and the Cubs' first 25 games. An
analysis of these games found that the team with the most
at-bats won 56 percent of the time. If I had eliminated the three
situations where there were ties in at-bats, I could have
predicted with 63 percent accuracy.
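
A test like this is easy to script once the box scores are typed in. The sketch below assumes each game is recorded as a pair (winner's at-bats, loser's at-bats); the function name and the four sample games are hypothetical and are not the Cubs data described above.

```python
def at_bat_accuracy(games):
    """Return (accuracy counting ties as misses, accuracy with ties dropped).

    games: list of (winner_at_bats, loser_at_bats) pairs.
    """
    correct = sum(1 for winner_ab, loser_ab in games if winner_ab > loser_ab)
    ties = sum(1 for winner_ab, loser_ab in games if winner_ab == loser_ab)
    with_ties = correct / len(games)
    without_ties = correct / (len(games) - ties)
    return with_ties, without_ties


# Hypothetical box-score data: (winner's at-bats, loser's at-bats)
sample_games = [(38, 33), (35, 35), (31, 34), (40, 36)]

with_ties, without_ties = at_bat_accuracy(sample_games)
print(with_ties, without_ties)
```

Reporting both figures mirrors the two percentages given above: one with tied games counted against the method, and one with them set aside.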

While the team with the fewest at-bats sometimes did win the
Chicago Cubs games, the larger the discrepancy between
at-bats, the more likely the team with the most at-bats was to win
the game. When the most-at-bats teams won, they averaged
4.14 more at-bats than the loser. When the least-at-bats teams
won, they averaged only 2.88 fewer at-bats than the loser.

Other Places It Works

Some people have suggested that in the case of my team, the
Kansas City Royals, if I want to be right more than half the time,
I should always predict a loss. Yes, yes, very funny.

Where It Doesn't Work

The accuracy of this method should be low if you turn on the
radio in the ninth inning, which is why I suggest you try it during
the first couple of hours of the game. Under the rules of baseball,
if the home team is leading after the top of the ninth inning, they
never come to bat. They win. Game over. As home teams win
more often than visiting teams, this means that often the winning
team never comes to bat at all in the ninth inning.

This presents an interesting variation of this prediction method
that applies only to the ninth inning. Turn on the game in the
ninth inning; if your team is batting, things don't look so good.
The data presented for the Chicago Cubs that found the winning
team occasionally having fewer at-bats than their opponent can
be partly explained by the fact that the winning team sometimes
bats in only eight innings.

This method doesn't work for all sports. In basketball, for
example, time of possession wouldn't be expected to positively
correlate with points scored and, in the case of high-energy,
fast-scoring teams, might even negatively correlate. In football,
on the other hand, time of possession is considered a key indicator
of quality performance and usually correlates with a win.
Hack 57. Plot Histograms in Excel

Use Microsoft Excel to plot data distributions so that you can
have a better understanding of statistics.

There is some truth to the cliché "a picture is worth a
thousand words." A picture is often the best way to understand
1,000 numbers. People are visually oriented. We're good at
looking at a picture and observing different characteristics;
we're bad at looking at a list of 1,000 numbers.

One of the most powerful tools available for understanding data
is the histogram, a picture of the distribution of values. Here is
the idea of a histogram. Suppose you have a lot of data: say, the
batting averages for all 6,032 baseball players between 1955
and 2004 who averaged 3.1 or more plate appearances per
game. Let's also assume you want to know how these values are
distributed. What are the lowest and highest values? Are there
more low values than high values? Were batting averages totally
random numbers between 0 and .400, or was there some
pattern?

Batting average can take many different values. Between 1955
and 2004, 6,032 players had qualifying batting averages, and
there were 1,229 unique values for batting average. You can plot
the number of players with each unique batting average (though
I can't imagine what this graph would look like). But we don't
really care about each unique value; for example, the fact that
13 players had a batting average of .2862 is not that
interesting. Instead, we might want to know the number of
players with very similar batting averages (say, between .285 and
.290).

Let's think of each range as a bucket. Every player-season goes
into a bucket. For example, in 1959, Hank Aaron had a .354
average, so we'll put that season in the .350-.355 bucket. So,
here's our plan: we'll put each player-season into a bucket,
count the number of player-seasons in each bucket, and draw a
graph showing (in ascending order of batting average) the number
of player-seasons in each bucket. This single diagram is a
histogram.

The Code

In this example, I wanted to look at the distribution of batting
average. I used a table containing the total batting statistics for
each player in each year (and the list of all teams for which each
player played), and I called the table b_and_t. I selected only
batters with enough plate appearances to qualify for a league
title, and only those players who played between 1955 and
2004:

SELECT b.playerID, M.nameLast, M.nameFirst, b.yearID, b.teamG,
b.teamIDs, b.AB, b.H,
b.H/b.AB AS AVG,
b.AB + b.BB + b.HBP + b.SF as PA
FROM b_and_t b inner join Master M
on b.playerID=M.playerID
WHERE b.yearID > 1954
AND b.AB + b.BB + b.HBP + b.SF > b.teamG * 3.1;

After running this query, I saved the results to an Excel file
named batting_averages.xls.

One way to draw histograms in Excel is to use the Analysis
ToolPak add-in. You can add this by selecting Add-Ins... from
the Tools menu, and then selecting Analysis ToolPak. This adds
a new menu item to the Tools menu, called Data Analysis, which
introduces several new functions, including a Histogram
function. But I find this interface confusing and inflexible, so I do
something else.

Here is my method for creating a histogram:

1. In the data worksheet, create a new column called Range.

2. In the first cell of this column, use a function to round
the value for which you would like to plot the distribution.
The simple way to do this is to use the second (number of
digits) argument of the ROUND function. In my worksheet,
column I contained the value for which I wanted to
calculate the distribution (batting average), so I could
use a formula such as ROUND(I2,2) to round to the
nearest .01. Personally, I find a bucket size of .005 to
be more descriptive, so I use a trick. You can multiply a
value inside the ROUND function and then divide outside
the function to get buckets of almost any size. Inside
the ROUND function, I multiply by the reciprocal of the
bucket size (in this case, 1 / .005 = 200). Outside the
function, I multiply by the bucket size (which is the same
as dividing by 200). So, I used
ROUND(I2 * 200,0) / 200 as my formula. Copy and paste
this formula into every row of the worksheet. (You can
double-click the bottom-right corner of the cell to do
this quickly.)

3. Now, we're ready to count the number of players in each
bucket. Select all of the data in the worksheet, including
the new Range column. From the Data menu, select Pivot
Table and Pivot Chart Report. Select Pivot Chart Report
and click Finish (we'll use all the defaults). We will select
two fields for our pivot table. From the Pivot Table Field
List palette, select Range. Drag-and-drop this onto the
Drop Row Fields Here part of the pivot table. Next,
drag-and-drop "playerID" onto the Drop Data Item Here part
of the pivot table. By default, Excel will count the number
of player IDs in the underlying data that match each
range value. The pivot table is now showing the number
of items in each bucket. You should see a (very ugly)
graph with the number of players in each bucket.

4. Clean up the graph. (I like to erase the background fill
and lines and change the width of the columns.) Figure
5-5 shows an example of a cleaned-up graph.

Figure 5-5. Histogram from a pivot chart report

Looking at the histogram, we see that the distribution looks
similar to a bell curve; it skews toward the right and is centered
at around .275.
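
The bucketing in step 2 and the counting in step 3 can also be sketched in Python for readers who want to check the arithmetic outside a spreadsheet. This is an illustration only: the function name and the sample averages are assumptions (only .354 and .2862 are values mentioned in this hack). Excel's ROUND rounds halves away from zero, while Python's built-in round() rounds halves to even, so the sketch uses floor(x + 0.5), which matches Excel for non-negative values.

```python
import math
from collections import Counter


def bucket(value, size=0.005):
    """Round value to the nearest multiple of size, like ROUND(value * 200, 0) / 200."""
    steps = round(1 / size)  # reciprocal of the bucket size: 200 for .005
    return math.floor(value * steps + 0.5) / steps


# Sample averages; all but .354 and .2862 are made up for the example
averages = [0.354, 0.2862, 0.287, 0.288, 0.271, 0.274]

# Count how many values fall into each bucket (the pivot table's job)
counts = Counter(bucket(avg) for avg in averages)

for label in sorted(counts):
    print(f"{label:.3f}", "#" * counts[label])
```

Sorting the bucket labels before printing gives the ascending order that the pivot chart shows along its horizontal axis.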

Hacking the Hack

One of the nice things about calculating bins with formulas is
that you can easily change the formula for binning. Here are a
few suggestions for other formulas:

ROUNDDOWN(<value>, <significance>) and ROUNDUP(<value>,
<significance>)

The ROUNDDOWN function rounds down to the given number of
decimal places. For example, ROUNDDOWN(3.59,0) equals
3, and ROUNDDOWN(3.59,1) equals 3.5. Similarly, ROUNDUP
rounds up to the given number of decimal places.
ROUNDUP(3.59,0) equals 4, and ROUNDUP(3.59,1) equals 3.6.

LOG(<value>, <base>)

Sometimes it's useful to plot a value on a logarithmic
scale, and to use logarithmic-size bins. You can combine
LOG functions with ROUND functions to create variable-size
bins.

CONCATENATE(...)

The CONCATENATE function doesn't compute numbers; it
puts text together. If you want to explicitly list ranges
(such as 3.500-3.599), you can use the CONCATENATE
function to create these; for example,
CONCATENATE(ROUNDDOWN(3.59,1)," to
",ROUNDUP(3.59,1)-0.01) returns 3.5 to 3.59.

If you want to take this to the next level, you can replace the bin
size with a named value. (For example, name cell A1 bin_size.)
This makes it easy to change the bin size dynamically and
experiment with different numbers of bins.

Joseph Adler
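
The same ideas carry over to code. The sketch below shows, under assumptions of my own choosing, a base-10 logarithmic bin and a text range label driven by a single bin_size parameter (playing the role of the named cell). The function names are hypothetical, and the label uses a simple "lower to upper" form rather than the exact CONCATENATE output shown above.

```python
import math


def log_bin(value):
    """Lower edge of the power-of-ten bin that contains value (value > 0)."""
    return 10 ** math.floor(math.log10(value))


def range_label(value, bin_size):
    """Text label for the bin of width bin_size that contains value."""
    steps = round(1 / bin_size)  # e.g. 10 bins per unit for bin_size 0.1
    index = math.floor(value * steps)
    lower = index / steps
    upper = (index + 1) / steps
    return f"{lower:.3f} to {upper:.3f}"


print(log_bin(350))             # 100
print(range_label(3.59, 0.1))   # 3.500 to 3.600
print(range_label(3.59, 0.05))  # 3.550 to 3.600
```

Changing bin_size in one place re-labels every value, which is the point of naming the bin size rather than typing it into each formula.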
Hack 58. Go for Two

In football, when is the two-point conversion attempt the right
choice? Regardless of which "chart" you're using, the problem
gets even more complicated when statisticians enter the
debate.

A few years back, I was enjoying watching my local professional
football team as they were losing a close game. I wasn't
entertained by my team's dismal performance as much as I was
delighted by my team's befuddled coach as he attempted to read
and understand a two-point conversion chart.

In football, after a touchdown is scored
(the touchdown itself is worth six points),
the scoring team has two options for
scoring an "extra point" or two. Usually,
the team chooses to kick a single extra
point through the uprights (like a
short-distance field goal), but they might also
choose to "go for two" points (known as
the two-point conversion), which involves
the offense rushing or passing for another
trip into the end zone.

At the time, as was later "confirmed" by sportswriters, it was
clear that he wasn't sure how to read the chart. Specifically,
when interpreting the column on the chart that listed how many
points behind or ahead a team was, he thought this meant how
many points ahead or behind a team would be if they made the
point-after conversion.

As I mused about how an NFL head coach might never have
learned to read such a chart, I began to wonder who produced
this "chart" and what principles it was based on. Later, as I
searched for the "official chart," I found two "official" charts, and
they didn't always agree.

More recently, I ran across a chart based on a statistical
analysis of the probability of possible outcomes and on the
amount of time remaining (as indicated by the number of
possessions remaining). This chart didn't agree with either of
the earlier charts I discovered.

This hack is for you, Coach. It examines from a statistical
perspective when to go for two points and when to settle for one.

Traditional Two-Point Conversion Charts

When you see a coach on TV holding a plastic laminated card
and studying it before deciding whether to go for two,
sportscasters like to refer to the card as the chart, though, as
mentioned in the previous section, there's more than one chart
in use. The slight differences might be due to the fact that one is
identified as being used in the NFL and the other is identified as
a classic set of standard decisions used in college football.

The differences might also be based on the fact that the college
chart was produced for a certain team that may have had a more
aggressive or confident style. The college chart seems to play
for a victory, not a tie. Though college ball now has overtime
rules, they are a fairly recent development, whereas the pros
have had overtime for a while.

The NFL chart is provided on Norm Hitzges' web site (Norm is a
broadcaster in Dallas and an all-around sports guru) at
http://www.normhitzges.com/thechart.htm. The college chart
(found at http://www.NFL.com/fans/twopointconv.html) is
identified as the one used in the 1970s and developed at the
University of California, Los Angeles (UCLA). Table 5-14
provides the suggested decisions from both charts and is
condensed a bit.

Table 5-14. Classic decision making for two-point attempts
(1 = kick the extra point, 2 = go for two)

Points behind or ahead   0  1  2  3  4  5  6  7  8  9 10 11 12
Behind (NFL)             1  1  2  1  1  2  1  1  1  1  2  1  1
Behind (College)            2  2  1     2  1  1  1  2  1  2  2
Ahead (NFL)              1  2  1  1  2  2  1  1  1  1  1  2  2
Ahead (College)             2  1  1  2  2  1  1  1  1  1  1  2

The UCLA chart does not provide suggestions for when the
score is tied or when your team is behind by four points. The NFL
chart, on the other hand, is full of advice for all occasions. As
discussed, the primary difference seems to be whether you're
willing to play for the tie or not. UCLA clearly did not wish to play
for the tie, while the NFL chart has no such hesitancy.

Modern Super-Scientific Chart

In the real world, a set of statistical probabilities controls the
outcome of a sporting event, and the decision about whether to
go for two or take the extra point should be based on more
information than just the score and whether your team is winning
or losing. In actual game situations, smart coaches take the
following additional factors into account:

The likelihood that their field goal kicker will make the
kick

The likelihood that their team will score on a given
two-point conversion play

The current health, attitude, and skill of their players

How many more possessions their team will receive

Past statistics show that the average NFL football team makes
about 98 percent of its extra points and about 40 percent of its
two-point attempts. Coaches must use their experience and
intuition to gauge their players' current ability level, and a chart
isn't much help on that score.

As for possessions left, however, this is exactly the type of
information that decision systems based on probability need to
take into account. Based on a process of working backward from
the ending of a hypothetical football game that takes the
probability of success on either option (98 percent for one-point
plays and 40 percent for two-point plays) into account,
statisticians have produced a chart based not only on the
current score, but also on the total number of possessions
remaining for both teams.

In a 2000 issue of Chance magazine (Vol. 13, No. 3), Harold
Sackrowitz presented the results of such an analysis using a
process called dynamic programming. Table 5-15 shows a
portion of Dr. Sackrowitz's chart.

Table 5-15. Modern decision making for two-point attempts

Possessions      Points behind or ahead
remaining        0 1 2 3 4 5 6 7 8 9 10 11
1 Behind 1 12 1
Ahead 1 211 2111
2 Behind 1 12112 12 2
Ahead 1 21112111
3 Behind 1 12112 12 2
Ahead 1 2111211111 1
4 Behind 1 1211211222 1
Ahead 1 2211211111 1
5 Behind 1 1211211222 1
Ahead 1 2111211111 1
6 Behind 1 1211211222 1
Ahead 1 2211211111 1

This two-point conversion chart is based on the branching
possibilities starting at different points in the game and
assuming basic probabilities of success for either an extra point
or a two-point conversion. An average NFL quarter sees six
possessions in total, so think of this chart as being most useful
in the fourth quarter. Sackrowitz also assumes a 50 percent
chance for overtime victories.
How It Works

The calculations for Table 5-15 work something like this simple
example:

1. Imagine you are down by one point without much chance
of getting the ball again.

2. You have a 98 percent chance of making an extra point
kick and a 50 percent chance of winning in overtime.
Going for the extra point results in a victory 49 percent
of the time (.98 x .50 = .49).

3. You have a 40 percent chance of converting a two-point
play, so going for two points results in a victory 40
percent of the time. Failure ends the game, and success
wins the game.

4. 49 percent is better than 40 percent, so you should
elect to go for the extra point. Notice that if you believe
your team's chances of converting the two-point play
are better than 49 percent, you should go for it.
Calculations like these, but over a longer series of
possessions, result in the decision tree reflected in
Table 5-15.
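
The four steps above can be sketched in Python. This covers only the simple last-possession, down-by-one case described in the list, not the full dynamic programming analysis; the function names are assumptions for this illustration, and the default probabilities are the ones quoted in the text.

```python
def win_probabilities(p_kick=0.98, p_two=0.40, p_overtime=0.50):
    """Chance of winning when down by one with no possessions left.

    Kicking ties the game, so you must also win in overtime.
    Going for two wins outright if it succeeds and loses if it fails.
    """
    kick = p_kick * p_overtime
    two = p_two
    return kick, two


def best_choice(p_kick=0.98, p_two=0.40, p_overtime=0.50):
    """Pick whichever option has the higher chance of winning."""
    kick, two = win_probabilities(p_kick, p_two, p_overtime)
    return "kick" if kick >= two else "go for two"


kick, two = win_probabilities()
print(round(kick, 2), two)      # 0.49 0.4
print(best_choice())            # kick
print(best_choice(p_two=0.55))  # go for two
```

Raising p_two above the 49 percent break-even point flips the decision, which is exactly the caveat in step 4.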

Which chart should you use the next time you find yourself
coaching in a crucial football game with a key decision to make?
That's up to you, but just remember that befuddled football
coach I watched on TV a few years ago. Not only was he
replaced the next year by Dick Vermeil, considered one of the
brighter football coaches around, but it was Vermeil who helped
develop the UCLA two-point conversion chart shown in Table
5-14. Now you know the rest of the story!
Hack 59. Rank with the Best of Them

There are many ways to use data to make judgments about who
is best in any sport. All the intuitive ways to compare
performance in individual sports have validity concerns,
however.

My friends and I are a competitive lot. Our arena of combat,
most recently, has been poker. On a regular basis, my friends
and I gather at my home and take part in a Texas Hold 'Em
poker tournament. It's an informal affair, but we all take it very
seriously. The way our poker tournaments work, everyone starts
with the same amount of chips, and when they are gone they are
gone. There is a first one out, a last one out, and everything in
between. So, for example, if seven people play, someone comes
in first, second, third, fourth, fifth, sixth, and seventh.

We all think of ourselves as pretty good and, being competitive,
we have longed for an objective method of comparing
performance across tournaments. As one of the statisticians in
the group, I took it upon myself to devise various ways of
producing some sort of objective index that would allow all
participants to compare their performance with each other to
decide once and for all who is the best player and who is only
lucky now and again. This is the story of my quest and the
statistical solutions I chose. Not to give the ending away, but I
learned that there is no single best solution.

How to Rank Fairly

This business of how to identify the best is a common problem
for competitive organizations such as sports leagues and
associations. The problem is how to summarize performance
across a variety of categories, venues, and occasions.

There are three methods commonly used in the world of sports
to make determinations about who is the "best." All of the
approaches make some intuitive sense, though each method has
its own specific advantages and disadvantages.

First, let's take a look at the nature of the data I had to analyze.
Your data will likely be similar, whether you run your weekly home
Monopoly game or you run the Professional Golf Association.
Though poker is not a sport, any organized competitive endeavor
provides data for rankings. Table 5-16 shows the results from
eight tournaments in my own summer poker league.

Table 5-16. Summer poker league data


Paul Lisa Billy BJ Mark Bruce Cathy Tim David
5/14 6 5 4 3 2 1
5/21 3 6 4 5 7 2 1
5/28 5 4 1 3 2
6/4 4 6 3 7 2 5 1
6/11 4 5 6 1 2 3
6/18 5 4 2 3 1
6/25 1 4 3 5 2
7/2 1 5 4 3 2

You can see that nine players took part in at least one
tournament, but no event had participation from all players. If a
person received no points on a given night, it was because she
didn't play. This is commonly the case in sports such as golf and
tennis as well.

On two occasions, seven people played, but on other occasions,
as few as five sat down together. Four people have played in all
eight tournaments. (These are the hard-core players who have
to admit that they have a bit of a problem recognizing what is
important in life.) One player, David, played in only one
tournament.

The points under each player's name indicate the order in which
they went out. If there are six players and you go out first, you
get one point for taking last place. If you are the winner among
six players, you get six points for taking first.

Notice a couple of things about this point
system. First, you get at least a point just
for showing up. Second, you get more
points for winning a tournament with more
players.

How, then, to rank players in the poker league? Here are three
common solutions, all of which work to some extent.

Total points

The first thought that came to mind in my situation was to
simply add up the points across tournaments and rank players
based on their total points. This is the approach taken when
celebrities are ranked by income or bank robbers are ranked by
their number of crimes. Just participating a lot moves you up in
these rankings. To be golfer of the year, you have to have played
in many events, in addition to performing OK in them.

Mean performance

A second method is to average the points by dividing the total
points by the number of tournaments in which a player
participated. The beauty of producing an average is that you get
a number that represents a typical level of performance. This is
ideal for measuring something elusive, such as talent. Your
average performance at poker (or anything else) should be the
best single indicator of ability.

Total wins

A third method, the simplest and most commonly used in team
sports, is to count victories. The player who wins most often is
the best player. This method works well for tournament-style
poker (the kind we play) and any events in which there is one
competitor who is the clear winner.
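
All three summaries can be computed from the same raw results. The sketch below assumes each tournament is stored as a mapping from player to points earned, with absent players simply left out; the player names and numbers are hypothetical and are not the league data from Table 5-16.

```python
from collections import defaultdict

# Hypothetical results: one dict per tournament, player -> points earned
tournaments = [
    {"Ann": 3, "Bob": 2, "Cy": 1},
    {"Ann": 1, "Bob": 2},
    {"Ann": 4, "Bob": 1, "Cy": 3, "Dee": 2},
]

totals = defaultdict(int)  # total points
played = defaultdict(int)  # tournaments entered
wins = defaultdict(int)    # first-place finishes

for result in tournaments:
    winner = max(result, key=result.get)  # most points = last player out
    wins[winner] += 1
    for player, points in result.items():
        totals[player] += points
        played[player] += 1

means = {player: totals[player] / played[player] for player in totals}

for player in sorted(totals):
    print(player, totals[player], round(means[player], 2), wins[player])
```

Even in this toy data the methods can disagree: a player who shows up often piles up total points, while a player with one strong night can post a competitive mean.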
Comparing the Three Methods

Each ranking approach has some clear advantages and does the
job adequately. Table 5-17 shows the values for each
player under all three ranking systems.

Table 5-17. Summarizing poker performance

          Paul   Lisa   Billy   BJ     Mark   Bruce   Cathy   Tim   David
Points    9      11     28      36     28     25      12      8     1
Mean      4.5    5.5    3.5     4.5    3.5    3.13    1.71    4.0   1.0
Wins      1      1      2       1      2      2       0       0     0

All three scoring systems make sense. But the question about
who is the best has a different answer under each of the three
systems! This is certainly a frustrating finding for a poker
scientist like me. Because one could defend any of the three
methods as the "best" way to rank, it is a bit of a paradox that
each method produces a different "best" poker player. Table
5-18 shows how the rankings differ under each scoring method.

Table 5-18. Poker rankings

          Paul   Lisa   Billy   BJ     Mark   Bruce   Cathy   Tim   David
Points    7      6      2.5     1      2.5    4       5       8     9
Mean      2.5    1      5.5     2.5    5.5    7       8       4     9
Wins      4      4      2       4      2      2       6       6     6

N otic e how the "bes t player" is different under eac h s ys tem. BJ
is the bes t under the P oints s ys tem. L is a is the bes t under the
M ean s ys tem. T hree people tie for firs t under the Wins s ys tem,
but BJ and L is a are not among them. T he only real agreement
ac ros s the three methods is that D avid is ranked as the wors t
player. (Sorry, D avid, but numbers don't lie. A nd s orry about the
public ridic ule. M aybe I c an make it up to you with a free c opy of
this book? )

I broke ties when assigning rankings by averaging the ranking among
those who were tied. In other words, Billy, Mark, and I were all
tied for the number one ranking under the Wins system, so the ranks
of 1, 2, and 3 average to 2, and that was our ranking.
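
The tie-breaking rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` is made up for the example, and the scores are copied from the performance table above. Applied strictly, the averaging gives the one-win players rank 5 (the mean of 4, 5, and 6) and the winless players rank 8, so the 4s and 6s printed in the Wins row of the rankings table seem to follow some other convention.

```python
def average_ranks(scores):
    """Rank from highest score (rank 1) down; tied players share the
    average of the rank positions they occupy."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to the last player tied with the player at position i.
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of ranks i+1 .. j+1
        for name, _ in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


players = ["Paul", "Lisa", "Billy", "BJ", "Mark", "Bruce", "Cathy", "Tim", "David"]
points = dict(zip(players, [9, 11, 28, 36, 28, 25, 12, 8, 1]))
means = dict(zip(players, [4.5, 5.5, 3.5, 4.5, 3.5, 3.13, 1.71, 4.0, 1.0]))
wins = dict(zip(players, [1, 1, 2, 1, 2, 2, 0, 0, 0]))

for label, scores in [("Points", points), ("Mean", means), ("Wins", wins)]:
    ranks = average_ranks(scores)
    print(label, [ranks[p] for p in players])
```

The Points and Mean rows this prints match the rankings table; only the lower two groups of the Wins row differ.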

If three different scoring systems result in three different
rankings, it is clear they cannot all be equally valid. They cannot
all produce scores that truly reflect the variable of interest,
which is poker-playing ability defined in the same way. The solution
does not involve picking the single best approach. It was not my
goal to identify the best system and go with it; my goal was to
provide valid information and let others interpret the data how they
want.

My solution was to provide all three rankings based on the three
scoring methods. That way, players could choose to focus on the
ranking results from the method that makes the most sense to them.

The End of the Story

The system that made the most sense to the players in my poker
league turned out to be the one that ranked them the highest.
Imagine that.

I sleep at night secure in the knowledge that any of the methods is
probably acceptable and "accurate." After all, none of the three
methods makes the mistake of identifying me as the one best player.
That's got to be some sort of validity evidence in and of itself!

Real-life professional sports organizations have dealt with the
advantages and disadvantages of each system by creating composite
point systems. Some of the tinkering to improve ranking systems in
tennis and golf (and tournament poker, too) includes:

Combining performance data over a long period of time

Awarding more points for winning more difficult tournaments

Using both the mean performance and total points together to reward
excellence and frequent participation

It is a bit ironic that these systems that are likely fairer and
more accurate are often perceived by the press and fans as overly
complex and crazy. Attempts to make the ranking systems more valid
have resulted, often, in a rejection of the systems by the public as
invalid.
Hack 60. Estimate Pi by Chance

Statisticians like to think that anything important can be


discovered using statistics. That might actually be true, since it
turns out that you can use statistics to estimate the value of
one of the most important basic values in science: pi.

The ability to calculate pi is one of the routine skills for all
budding geniuses. I remember, for example, that dividing 22 by 7
comes pretty close. There are a variety of other ways, some more
accurate than others. My favorite method, though, requires the
element of chance and a long, lonely sea voyage or other period of
enforced solitude. Intrigued? Read on, Gilligan.

Before showing how to estimate the value of pi, I'll begin our
discussion by presenting a couple of basic facts from geometry.
Don't panic; I don't know much about geometry, so we won't spend a
lot of time on this. I'll just cover the basics we need to
appreciate the magic of this hack.

Pi

In geometry, key relationships have been found between pi, a number
that is roughly 3.14159 (symbolized by π), and the way various parts
of a circle fit together, as shown in Figure 5-6.

Figure 5-6. Calculating pi

For example, if you take the diameter of a circle and multiply it by
pi, you will get the circumference of the circle. If you take the
radius of a circle, square it, and multiply that value by pi, you
will get the circle's area.

All pretty cool, perhaps, but it is primarily of interest to those
who like to play with geometry, not with statistics. But just wait.

Pi and Falling Needles


In the 1700s, Georges-Louis Leclerc presented a half-geometry/
half-statistics puzzle to the world. He was the Count of Buffon, or
something, so this problem is known as Buffon's Needle Problem. He
presented it generally, without specifics, and I summarize it here:

Imagine a needle lands randomly on a drawing of two parallel
horizontal lines. The lines are further apart than the length of the
needle. What are the chances that the needle will land in such a way
that it touches one of the lines?

This is one of those problems that seem impossible to solve the
first time you hear it, but it is solvable. There's no need to spend
any time calculating the solution here, though I certainly could do
it, I assure you. Really, I could. Really. The solution has to do
with geometry, and it takes into account two key components of
information. The keys to any given random landing position are:

Where the center of the needle is in terms of distance from the
closest line

The angle of the needle in relationship to the perpendicular of the
closest line

Defining the random position of the needle with these two bits of
information allows for some general observations that help to
simplify the problem:

If the center of the needle is exactly on one of the lines, then the
needle will always touch that line, regardless of its angle.

If the center of the needle is close enough to a line, within half
the needle's length, then the needle will sometimes touch a line.
The angle of the needle determines whether the needle touches a
line.

If the center of the needle is further away from a line than half
the needle's length, then the needle will never touch that line,
regardless of its angle.

The closer to a line, the greater the chances are of a needle
touching that line.

All the possible needle locations can be graphed as a curve,
illustrating all possible distances from a line and all possible
angle-from-perpendiculars of the needle. Trigonometry enters the
picture here, and mathematicians have defined such a curve with this
equation:

\[ p = \frac{2L}{\pi D} \]

where p is the probability that the needle touches a line, L is the
length of the needle, and D is the distance between the lines.

This is the answer to the problem. Let's try it quickly with some
real numbers, just to check Leclerc's work. Imagine a needle three
inches long falling randomly on a sewing table with a pattern on the
grain such that there are two parallel lines four inches apart. What
proportion of the time will the needle touch one of the two lines?
Here are the necessary computations:

\[ p = \frac{2 \times 3}{3.14159 \times 4} = \frac{6}{12.566} \approx .477 \]

The needle will touch a line about 48 percent of the time.
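
For readers who would rather let a computer do the arithmetic, here is a minimal Python sketch of the same check. It assumes the standard Buffon's needle result, p = 2L / (πD) for a needle of length L and lines a distance D apart, which agrees with the 48 percent figure above. The function name `touch_probability` is made up for this example.

```python
import math


def touch_probability(needle_length, line_spacing):
    """Chance that a randomly dropped needle touches a line,
    assuming the needle is no longer than the gap between lines."""
    if needle_length > line_spacing:
        raise ValueError("this formula assumes the needle fits between the lines")
    return 2 * needle_length / (math.pi * line_spacing)


p = touch_probability(3, 4)  # three-inch needle, lines four inches apart
print(round(p, 3))           # 0.477, or about 48 percent
```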

Already, your gambling juices might be flowing as you envision a
large room full of needle-dropping and lines on the floor and such.
Go for it; more power to you. This principle is already in play in
some carnival games you've probably seen. Ever notice how rarely
those ping-pong balls land in those fishbowls or the football gets
through that hoop?

Probability and Pi

I promised you that you could use chance to estimate pi, though, not
use pi to figure chance. The power of math allows us to move around
any element of any equation, and so any element to the right of the
equals sign can be moved to the left. We can scramble our
probability equation to produce a pi equation like so:

\[ \pi = \frac{2L}{pD} \]

I'll prove it works by using the same numbers we used when we tested
the probability equation. We already know what the right answer for
pi is, so let's see if the equation works:

\[ \pi = \frac{2 \times 3}{.477 \times 4} = \frac{6}{1.908} \approx 3.1447 \]

This equation calculates pi as 3.1447, which is pretty darn close to
3.14159. If we had allowed our numbers to go many places past the
decimal, we would have had an even more accurate answer.
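
The rearrangement can be tried out with a short Python sketch. It assumes the rearranged relationship π = 2L / (pD), with L the needle length, D the line spacing, and p the probability of touching a line; the function name `pi_from_probability` is hypothetical. The second call shows the effect of carrying the probability to more decimal places.

```python
def pi_from_probability(needle_length, line_spacing, p_touch):
    """Solve the needle equation for pi, given the chance of touching a line."""
    return 2 * needle_length / (p_touch * line_spacing)


# Probability rounded to three places, as in the worked example.
print(round(pi_from_probability(3, 4, 0.477), 4))  # 3.1447

# Carrying two more decimal places moves the answer closer to 3.14159.
print(pi_from_probability(3, 4, 0.47746))
```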

Estimating Pi Using Probability


In our example, we knew the probability, so we could calculate pi
using that information. But what if you didn't know pi and needed to
calculate it? What if you were stuck on a desert island or on a long
ocean voyage or in bed with a broken leg and had no access to
reference works that included a fairly exact value for pi? Further,
suppose you needed to calculate the circumference of a circle or the
volume of a sphere or any of a number of other values in geometry or
finance or physics that make use of the pi value? A nightmare
scenario, eh? You could use this formula to calculate pi pretty
accurately by just conducting an experiment and collecting data.

Set up an area with two horizontal lines, drop some needles, and
keep track. Measure the distance between your lines and the length
of your needle, and let the random whims of chance do all the
cognitive heavy lifting. Collect a large sample of data from many
needle drops, a thousand or more; the more drops you record, the
more places past the decimal your probability can be trusted. Good
luck and keep careful records.

Let's say that you drew two lines that were 8 inches apart and used
a knitting needle about 7 inches long. If you used this equipment
for a large number of drops, you would likely find that the needle
touched a line somewhere between 50 and 60 percent of the time.
Let's say it was 55 percent. To use this data to calculate pi, you
would apply the math like this:

\[ \pi = \frac{2 \times 7}{.55 \times 8} = \frac{14}{4.4} \approx 3.18 \]

You'll find that 3.18 is pretty close to the ratio of the
circumference to the diameter shown in Figure 5-6.
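
If no sewing table is handy, the experiment can be simulated. The sketch below is one way to do it in Python, under stated assumptions: the needle's center lands at a uniformly random distance from the nearest line, its direction is uniformly random, and pi is then recovered from π = 2L / (pD). To avoid sneaking pi in through the back door, the random direction is drawn by picking a point inside a unit circle rather than by picking an angle. The function names are made up for this example.

```python
import math
import random


def random_abs_cosine(rng):
    """|cos| of a uniformly random direction, obtained without using pi:
    pick a point uniformly inside the unit circle and normalise it."""
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        r2 = x * x + y * y
        if 0 < r2 <= 1:
            return abs(x) / math.sqrt(r2)


def estimate_pi(needle_length, line_spacing, drops, seed=None):
    """Drop a virtual needle many times and solve the needle equation for pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(drops):
        # Distance from the needle's center to the closest line.
        distance = rng.uniform(0, line_spacing / 2)
        # How far the needle reaches toward that line at this angle.
        reach = (needle_length / 2) * random_abs_cosine(rng)
        if distance <= reach:
            hits += 1
    if hits == 0:
        raise ValueError("no needle touched a line; use more drops")
    p_touch = hits / drops
    return 2 * needle_length / (p_touch * line_spacing)


# A 7-inch needle and lines 8 inches apart, as in the example above.
print(estimate_pi(7, 8, drops=100_000, seed=1))
```

The estimate wanders from run to run. The uncertainty shrinks only with the square root of the number of drops, so each extra decimal place of pi costs roughly a hundred times as many needles.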

If your eyesight isn't what it used to be, there's no need to use a
hard-to-see needle. You can apply the same logic using a pencil
falling off your desk, or a marble rolling across the floor into a
defined area, or a parachutist landing on a rectangular target. You
need two parallel lines that the pencil, marble, or parachutist can
have a chance of landing on, and you need to know the length of the
object. As long as the outcome is random, anything will work, and a
parachutist landing on a haystack is a lot easier to find than a
needle somewhere in one.
Chapter 6. Thinking Smart
Hacks 61-75

This chapter concentrates on hacks that help you to think more
clearly, cleverly, or creatively. Start out by using the rules of
probability and proving yourself smarter than a superhero [Hack
#61]. Keep feeling smart by mastering statistical shortcuts [Hack
#66] and the ability to detect fraud [Hack #64].

Continue impressing yourself and others by tapping into your
skeptical side: demystify amazing coincidences [Hack #62] and hack
your way to the truth about weird phenomena [Hack #63]. After
disproving (or perhaps proving) the existence of ESP [Hack #68],
your friends will be amazed when you read their minds [Hack #67].

Finally, wrap up your self-improvement course by learning to avoid a
common illogical trap [Hack #69].

Now that you are so smart, it should be a breeze to notice things
around you that others do not. You can master the fine art of the
traffic jam [Hack #74], explore your connections to Kevin Bacon and
everybody else [Hack #72], and spot bogus election systems [Hack
#73] known only to political scientists.

Round out this chapter by expanding your horizons. Try out different
exciting professions such as espionage and code breaking [Hack #70],
and discover new species [Hack #71] and, perhaps, even life on other
planets [Hack #75].
Hack 61. Outsmart Superman

Lightning can strike twice in the same place, but it is very


unlikely. The laws of probability allow us to calculate the
likelihood of a series of rare occurrences happening all in a row.

Occasionally, we hear stories of highly unlikely events happening
more than once to the same person: a forest ranger who has been
struck by lightning seven times, for example, or a New Jersey couple
winning the lottery twice. When they appear in the news, these
stories often include an interview with the local stats professor,
who estimates the odds of such a thing happening.

The math for calculating the total likelihood of a series of events
is fairly simple. The more difficult part is figuring good estimates
for the probability of any single event happening once. Then, you
simply multiply the individual probabilities together to get the
total likelihood for the whole chain of weird happenings.

Lucky Lois Lane

To show the steps involved in calculating the likelihood of a whole
series of events, I've chosen an example from classic literature.
This series of rare events is described in the Lois Lane comic
magazine #56, published by DC Comics in April of 1965. A common
pattern in the stories involved Lois having some apparently
supernatural experience that was hard to explain but, at the end of
the story, turned out to have some simple explanation.

Lois Lane, now wife (but former girlfriend and number one fan) of
the comic book hero Superman, was a very popular character in the
line of DC comic books in the 1960s and 1970s. Among sophisticated
comic aficionados, Lois Lane comics of that era are now enjoyed as
examples of particularly strange comic writing. Lois tended to beat
the odds almost on a daily basis. Her comics should be required
reading in statistics courses.

One example of a strange experience that was then "explained" by
Superman at the end of the story involved the application of a
statistics hack. Lois is pretending to be telepathic so she can hang
around mobster "Long Odds" Larkin and maybe get a scoop for her
newspaper.

It works all too well, as she is kidnapped by Larkin and forced to
provide him with "telepathic" information so he can commit crimes.
Fortunately for Lois, and for the mobster, her blind guesses turn
out to be correct and Larkin keeps her alive. Her guesses are so
accurate that Lois comes to believe that she actually has psychic
powers.

It turns out, according to Superman, who eventually rescues her,
that Lois was just lucky! Very lucky. Astoundingly, incomprehensibly
lucky. Even though the odds of Lois correctly making the lengthy
series of correct predictions and accurate guesses were extremely
slim, she just lucked out. Congrats, Lois!

Superman presents what he says are the odds for the fantastic feats
that Lois performs, but the author of the story (anonymous) does not
provide the calculations. Let's review the random guesses that Lois
makes, do our own calculations, and check the Man of Steel's math.
For determining the probability of this series of independent
events, we will apply the multiplicative rule [Hack #25].

The Guesses

In the story, Lois correctly guesses (totally at random, mind you)
the following:

1. Which of five duplicate armored trucks is actually carrying Metro
   Bank cash

2. The combination to a safe that holds a large company's payroll
   funds

3. The unlisted phone number to the richest person in town

4. Under which of 20,000 trees a bank robber's loot is buried

She finally fails, after Superman has rescued her, to guess the
number of jellybeans in a jar. As Superman explains to Ms. Lane that
she is not psychic, he suggests that the odds of her making these
four correct guesses by chance are 326,454,839,047 to 1, or 1 out of
326,454,839,048.

"I see, Superman!" she says. "I was lucky enough to hit that 'one
chance'." "Yes," says Superman, "after all, someone always wins big
lotteries, too" (or some such nonsense to that effect). That number
calculated by Superman or his Supercomputer certainly is big, which
seems right, but I don't think it is close to being correct. My
guess is that this outcome is even more miraculous.

The Calculations

Let's work through our own calculations. For guesses 1 and 4, we can
figure pretty close to the odds of guessing the answer to that
problem independently. For guesses 2 and 3, we'll have to make some
assumptions.

Here again are the guesses Lois made and real calculations of the
odds for each one, taken by themselves.

The math involved here is the easy part for statisticians who are
asked to produce statements of likelihood for a string of unlikely
events. The hard part is determining the starting values, the pieces
of the equations. As you see with our attempts to estimate how lucky
Lois was, we will have to make some moderately wild, though
reasonable, guesses to somehow know what the chances of any
particular occurrence are. Statisticians can't really know the basic
odds much of the time. They tend to focus on theoretical situations
where the odds can be known, not real-life problems like those of
Ms. Lane.

Guess 1

Which of five duplicate armored trucks is actually carrying Metro
Bank cash? This is the easiest one. Five possibilities, one correct
choice. The chances are 1 out of 5 or 1/5.

Guess 2

Lois guesses the combination to a safe that holds a large company's
payroll funds. This is a real puzzler. Not only does Lois guess the
five numbers that one should turn the dial to, but she also guesses
that there is a sequence of five different numbers that must be
used, and the directions that the wheel must be turned.

In the real world, there are a variety of different types of
combination locks produced, so it is hard to know for sure what
assumptions we should make about this problem. I've done a little
research about safe cracking (for the sake of this hack, let's say)
and learned a little about combination safes. Usually, there is a
total of anywhere from one to eight numbers in a combination
sequence. I'd guess that three or five numbers in a sequence is most
common. The numbers on a dial can be any range of values, but 0 to
99 is common for larger safes, such as the payroll safe in the
story.

So, for starters, let's say that she randomly picks between this
safe having a three- or five-number combination. Chances for that
guess are 1 out of 2, or 1/2. Say she randomly picks a number from 0
to 99 each time: 1 out of 100, or 1/100, for each number in the
sequence. She also has to guess the starting direction. Let's say
that most safes, 80 percent, start to the left, and only 20 percent,
1 out of 5, start to the right (which is her guess).

So far, so good. It gets very tricky here, though, because of the
combination Lois actually suggests. She predicts "11 right...13
left...5 left...back to 8...forward to 15." This is a very odd
combination. First, a combination is usually read in a different
order: left 13, instead of 13 left. Second, what can it possibly
mean to go left twice in a row! Surely you have to change direction
of the dial to lock in each number in the sequence. After all, the
dial passes over many numbers on its way left every time. How does
it know whether to count each number it passes as a part of the
combination sequence? I'm going to just pretend that the sequence is
misreported slightly by the anonymous author; otherwise, I'd have to
pause here in an endless loop of confusion, with my fingers over the
keyboard, never able to continue.

Finally, why does Lois start saying "back" and "forward" instead of
left and right? This just makes her directions unclear (perhaps to
cover herself in case of failure?). Again, I'm going to assume she
uses the terms to mean a change in direction, even though back
probably means left and forward probably means right, which would
just complicate things more. A conservative set of probabilities for
this guess, then, is 1/2 x 1/5 x 1/100 x 1/100 x 1/100 x 1/100 x
1/100. That's 1 out of 100,000,000,000.
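
The multiplication for the safe is easy to verify with exact fractions. This is a small Python sketch using only the assumptions stated above (a 1/2 chance of picking the right combination length, a 1/5 chance of picking the right starting direction, and a 1/100 chance for each of five dial numbers); the variable names are made up for the example.

```python
from fractions import Fraction

pick_length = Fraction(1, 2)     # three- versus five-number combination
pick_direction = Fraction(1, 5)  # safe starts to the right (assumed 20 percent)
pick_number = Fraction(1, 100)   # one dial position out of 0 to 99
numbers_in_combination = 5

safe_guess = pick_length * pick_direction * pick_number ** numbers_in_combination
print(safe_guess)                # 1/100000000000
```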

Guess 3

Lois also guesses the unlisted phone number to the richest person in
town. There are a couple of ways to figure this.

First, if Lois were a bit naïve (and, no offense to Lois's fans, but
I'm guessing she is), she might set only the parameters that the
phone number had to have seven digits and not start with 0. Under
these rules, there are 9,000,000 possible phone numbers. This
assumes that we start with 10,000,000 possible seven-digit numbers
(9,999,999 is the highest seven-digit number, plus one more for the
number 0,000,000).

If we can't count any numbers that start with 0, that eliminates the
number 0,000,000 and all numbers with six or fewer digits (there are
999,999 of those). That's an even million possibilities we can
eliminate. So, under this scenario, Lois's chance of guessing the
number would be 1 out of 9,000,000 or 1/9,000,000. Let's give Lois
the benefit of the doubt for a second and imagine that she wouldn't
guess her own phone number or other phone numbers she knows by
heart. I'd guess there are maybe 10 of those. So, Lois would have
8,999,990 numbers to choose from, for a chance of 1 out of
8,999,990.

A smarter Lois (let's say for the sake of argument) might know the
particular exchanges in use in Metropolis, or those likely to be
used for unlisted numbers, or for the rich part of town, or
whatever. Back in the day, there was a small set of possibilities
for the first three digits in a particular area code, known as
exchanges. A city the size of Metropolis might have fifty or so that
were used most commonly, so she might choose from those. Each
exchange covers 10,000 possible four-digit endings, so under the
"smart Lois" scenario, her odds improve considerably. Now, she might
blindly guess out of 500,000 numbers, not 9,000,000. Her chances
might have been 1 out of 500,000 or 1/500,000. My rough estimation
of Lois's intelligence suggests that this scenario is not the most
likely, but she is a reporter for a major metropolitan newspaper, so
she may have this knowledge. Let's be charitable and go with it.
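
The counting behind the two phone-number scenarios can be laid out in a few lines of Python. The figures of 10 numbers known by heart and 50 exchanges are the guesses made in the text, not facts about Metropolis, and the variable names are made up for this sketch.

```python
# "Naive Lois": any seven digits, as long as the first digit is not 0.
seven_digit_strings = 10 ** 7      # 0,000,000 through 9,999,999
starting_with_zero = 10 ** 6       # 0,000,000 through 0,999,999
naive_numbers = seven_digit_strings - starting_with_zero
print(f"{naive_numbers:,}")        # 9,000,000

# Leave out the numbers she already knows by heart (a guess of 10).
known_by_heart = 10
print(f"{naive_numbers - known_by_heart:,}")  # 8,999,990

# "Smart Lois": about 50 likely exchanges, each with 10,000 endings.
likely_exchanges = 50
endings_per_exchange = 10 ** 4
smart_numbers = likely_exchanges * endings_per_exchange
print(f"{smart_numbers:,}")        # 500,000
```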

Guess 4

Finally, Lois guesses under which of "20,000" trees a bank robber's
loot is buried. Like guess 1, this is also fairly easy to calculate.
If there really are exactly 20,000 trees in the woods where the loot
is buried (and this number is probably an estimate or rounded off),
the chance of guessing correctly is 1 out of 20,000 or 1/20,000.

Final Probability

So, the chance of guessing correctly on these four problems in a
row, giving Lois all sorts of benefits of the doubt for knowing all
sorts of things about safes and telephone numbering systems, is 1/5
x 1/100,000,000,000 x 1/500,000 x 1/20,000. The chance of this
sequence of lucky guesses occurring is, conservatively, 1 out of
5,000,000,000,000,000,000,000, which is even more remarkable than
the already hard to believe 1 out of 326,454,839,048.

"I see, Superman! I was lucky enough to hit that one chance," Lois
concludes. Indeed. Of course, the odds were even worse that Superman
would propose to Lois someday, and that happened. So, who am I to
rain on Mr. and Mrs. Superman's parade?
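
For the record, here is the multiplicative rule applied to all four guesses as a minimal Python sketch. It uses the per-guess probabilities estimated in this hack and treats the guesses as independent; the dictionary keys are just labels invented for the example.

```python
from fractions import Fraction

# Per-guess probabilities as estimated above (several are educated guesses).
guesses = {
    "armored truck": Fraction(1, 5),
    "safe combination": Fraction(1, 100_000_000_000),
    "unlisted phone number": Fraction(1, 500_000),
    "tree with the loot": Fraction(1, 20_000),
}

total = Fraction(1)
for probability in guesses.values():
    total *= probability  # independent events: multiply the probabilities

print(f"1 out of {total.denominator:,}")
# 1 out of 5,000,000,000,000,000,000,000

superman_figure = 326_454_839_048
print(total.denominator // superman_figure)  # how many times longer the odds are
```

By this reckoning, Lois's streak is more than ten billion times less likely than Superman claimed.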
Hack 62. Demystify Amazing Coincidences

The patterns of probability produce some unusually interesting


alignments. Here's how to interpret coincidences that seem
unbelievable.

One of the occasional sad duties of statisticians is to take a world
full of whimsy, delightful serendipity, and surprises around every
corner and turn it into a dull, predictable, uninteresting place.
I'm about to do that here, so if you would rather keep wearing
rose-colored glasses, put them on now, skip this hack, and pick
another one (I suggest more pleasant topics, such as winning
Monopoly [Hack #51]).

I choose to be scientific and treat the world as rational and built
on consequences that follow chains of cause and effect. My problem
(and perhaps yours too, if you think like me) is that when I face
anomalies (hard to explain, unexpected things), it is tempting for
me to treat the happening as evidence of something mystical, or
psychic, or paranormal in some way. Coincidences are a good example.
When I witness an incredible coincidence, I am tempted to fall into
a comforting pit of nonscientific explanations, such as fate or
synchronicity.

Synchronicity is the term that pioneering psychiatrist Carl Jung
used for personally meaningful coincidences. He saw them as
providing insight into the inner world of the unconscious, but was
not above assigning pseudo-mystical explanations to them as well. He
was not a statistician.

The solution to my problem (and perhaps yours, if you're still with
me) is to think a bit and apply some basic rules of probability.
This way, I can get a handle on things and treat such coincidences
as inevitable considering the large sample sizes that exist in the
universe. By applying such rules, I can feel better about the world
I live in. I can sleep peacefully in the arms of chance, and I have
no need for mystical, magical explanations. Here are three
strategies for tackling the next amazing coincidence you come
across.

Compare the Number of Possible Outcomes

When I was a kid, I used to see a common advertisement in the comic
books I read (e.g., Statboy and His Flying Dog, Parameter). The ad
sold U.S. pennies that had been altered to include a portrait of
John F. Kennedy in addition to the standard Lincoln profile. To
justify why these two presidents should be included together, a long
list of "remarkable" coincidences shared by these two presidents was
presented (and, as I recall, if I purchased a set of these pennies,
I would even get a small poster that listed these similarities).

The list included things beyond the obvious, such as the facts that
both were assassinated and both were succeeded by vice presidents
named Johnson. I could (and did) interpret these coincidences as
evidence of some important, somewhat-magical connection between the
two. Let's use these coincidences as an example and approach it as a
research question: is there an unusual number of similarities
between these two presidents?

It occurs to me now that the comic book ad led me to think for a
time that the word coincidence was derived from the word coin. Of
course, I quickly learned otherwise (by graduate school, certainly):
it has something to do with co-incidents.

One tool to use when deciding whether a coincidence is remarkable or
predictable is to count the number of possible outcomes and then
determine whether the given outcome (the coincidence) is unlikely to
have occurred by chance. This is the approach taken when predicting
shared birthdays in a large group [Hack #45].

Column one of Table 6-1 presents a list of some of the coincidences
shown in those old comic book ads and also found in "Hard to
Believe"-type publications. Column two shows a brief list of
characteristics that both men could have shared, but did not.

Table 6-1. Comparing Abraham Lincoln and John F. Kennedy

Some amazing coincidences             Some unremarkable noncoincidences

Both assassinated.                    Different heights.

Both elected in years ending with     Different weights.
60.

Kennedy assassin shot from a          They were different ages when they
warehouse and hid in a theater.       died (though they were the same age
Lincoln assassin shot in a theater    when they were born).
and hid in a warehouse (well, a
barn anyway).

Lincoln was shot in Ford's theater.   They were born on different dates
Kennedy was shot in a Ford.           in different years.

Both were killed on a Friday.         Both men had different middle
                                      names.

Both were killed while sitting next   Both men had wives with different
to their wives.                       names and, probably, different shoe
                                      sizes.

Both succeeded by men named           Succeeded men with different names.
Johnson.

                                      Lincoln had a beard; Kennedy did
                                      not. (Come to think of it, their
                                      faces are different in hundreds of
                                      ways.)

                                      Kennedy probably had bowled
                                      occasionally; Lincoln never bowled
                                      a game in his life.

By paying attention to only the relatively few concordances between
Lincoln and Kennedy (the hits) and ignoring all the non-hits, of
which there are almost infinitely more, it is easy to misperceive
the existence of some uncanny link. Of course, there still might be
some uncanny link, but the "coincidences" do not provide evidence
for it.

Figure Out the Actual Odds

If you play poker with any regularity (and, if you are a minor
Hollywood celebrity, you apparently play all the time), you know
that you rarely see a royal flush: a five-card hand with the 10,
Jack, Queen, King, and Ace all of one suit. If your opponent were
dealt a royal flush, would that be remarkable? Would you suspect
cheating? It all depends on how many poker hands you have seen in
your lifetime, I guess, or perhaps in recent memory.

Let's use a simple deal of five cards to do our math. To figure the
chances of getting a royal flush on one deal of five cards, we would
first calculate the number of possible five-card poker hands and
compare that to the number of those combinations that are defined as
a royal flush. The process takes three steps:

1. Calculate the number of possible hands, if order makes a
   difference. We start this way because the math is easiest. Any
   one card of 52 could be the first card, then any one of the
   remaining 51 could be next, then any one card out of 50, and so
   on down to any one card out of 48. So, the number of possible
   hands when the order matters is:

   52 x 51 x 50 x 49 x 48 = 311,875,200

2. Order does not matter, though. So, we divide this giant total of
   all possible hands by the number of possible different sequences
   of cards. This number of different sequences is 5 x 4 x 3 x 2 x 1
   = 120, so the number of possible five-card poker hands is:

   311,875,200 / 120 = 2,598,960

3. Because there are only four possible royal flushes, one for each
   suit, we divide this number of positive outcomes (4) by the total
   number of possible outcomes (2,598,960), for a probability of
   .000001539, or 1 out of 649,740.
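
The three steps above can be sketched in Python using the standard library's counting functions. This is just a check on the arithmetic; the variable names are made up for the example.

```python
import math

# Step 1: five-card deals when order matters (52 x 51 x 50 x 49 x 48).
ordered_deals = math.perm(52, 5)
print(f"{ordered_deals:,}")          # 311,875,200

# Step 2: divide by the number of ways to arrange five cards (5 x 4 x 3 x 2 x 1).
arrangements = math.factorial(5)     # 120
possible_hands = ordered_deals // arrangements
print(f"{possible_hands:,}")         # 2,598,960

# The same count, computed directly as "52 choose 5".
assert possible_hands == math.comb(52, 5)

# Step 3: four royal flushes, one per suit.
royal_flushes = 4
probability = royal_flushes / possible_hands
print(f"{probability:.9f}")          # 0.000001539
print(f"1 out of {possible_hands // royal_flushes:,}")  # 1 out of 649,740
```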

Your opponent or you should be dealt five cards that make a royal
flush, on average, once every 649,740 hands. So, if it does happen,
it is certainly rare. If it happens more than once in the same game,
you should interpret that as an amazing coincidence or as evidence
of cheating. You decide. I know what my calculator and I would
guess.

What about drawing to a royal flush? After all, in draw poker and in
Texas Hold 'Em, players have an opportunity to improve their hand or
at least guide it toward some objective. In draw poker, if you have
four cards to a royal flush and wish to discard the fifth and draw a
new card, you have a 1 out of 47 chance for success, or about 2.1
percent. If you have two chances to improve your hand, the odds go
up to about 4.3 percent, or about 1 out of every 23 attempts.

Remove Meaning Assigned to Meaningless Events

The human brain is at its best when it must make meaning out of
data. Our remarkable intelligence can find meaning even where
there is none. Often, this is the case when we think we have
witnessed a miraculous set of coincidences. We see
coincidences when we look for them.

Highly improbable events happen all the time: every day, and
every minute of every hour. The highly improbable events are
interesting only when we decide they are interesting. Think of
our poker example. Because there are about 2.6 million possible
five-card poker hands, the chances of any specific hand are one
out of about 2.6 million. The odds are the same for the hands we
have decided are particularly meaningful, such as a 10, Jack,
Queen, King, and Ace of Spades, as they are for hands that we
have decided are not particularly meaningful, such as a 4 of
Clubs, 6 of Spades, Jack of Diamonds, Queen of Spades, and
Ace of Hearts. Why is it amazing that you just drew a royal flush
and not equally amazing when you draw any other random
combination of cards? The probability is the same for all poker
hands. We assign the meaning to a particular outcome.

The next time you are at a crowded place, such as a baseball
game, amusement park, or airport, and you run into someone you
know, notice that the coincidence is meaningful only because
you happen to know the person. Yes, the chances were slim that
you would run into that particular person (unless you are being
stalked), but it is 100 percent certain that you would run into
other people. All those other people just happen to be there the
same time you are. It is a coincidence, and it is highly
improbable that this particular mix of individuals is in the same
place at the same time. It is not a meaningful coincidence for
you, though.

The odds are even good that you would run
into someone you know, if we count
anybody you know. Let's say you know 200
people and you, by yourself, go to a
Kansas City Royals baseball game one
night. If each of those 200 people goes to
a Royals game one time each season and
there are 81 home games, each of those
200 people has a 1/81 chance of being
there the same night as you. It's unlikely
then that you would run into any particular
person (such as your Uncle Frank, for
example), but it is highly likely that someone
you know will be there. There is about a 92
percent chance that one or more of your
200 pals will be there, even though each of
them rarely goes to a game. Even if you
only know 56 people, the chances are
greater than 50 percent that one or more
of them will be there.
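
The 92 percent figure comes from the complement rule: the chance that at least one friend is there equals one minus the chance that every friend is absent. Here is a minimal sketch, assuming (as the text does) that each friend independently picks one of the 81 home games at random; the function name is made up for illustration.

```python
def p_at_least_one_friend(friends: int, home_games: int = 81) -> float:
    """Chance that one or more friends attend the same game as you."""
    # Each friend misses your game with probability (home_games - 1) / home_games.
    p_one_friend_absent = (home_games - 1) / home_games
    # All friends miss it with that probability raised to the number of friends.
    p_all_absent = p_one_friend_absent ** friends
    return 1 - p_all_absent


print(f"{p_at_least_one_friend(200):.3f}")  # 200 acquaintances
print(f"{p_at_least_one_friend(56):.3f}")   # 56 acquaintances
```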

We are constantly exposed to a large set of events and people
and things that interact and coincide in very unlikely ways.
Occasionally, those coincidences have meaning to us, and so we
notice them. What is amazing is that we do not notice these
highly improbable events more often.
Hack 63. Sense the Real Randomness of Life

Before you accuse the casino of running a crooked game or
threaten your boss with a lawsuit for hiring only blonde women,
here's a tool for separating those nonrandom-seeming situations
that probably did occur randomly from those nonrandom-seeming
situations that probably did not occur randomly. Probably.

As you become more and more aware of the role that chance
plays in the world around you, and begin to habitually stat-hack
your way through everyday situations, you might become overly
sensitive to patterns that don't seem right. Don't abuse your
newfound powers, though, by treating probabilities as certainties.
Additionally, don't make the mistake of expecting events that
are supposed to be random to look random.

What Does Random Look Like?

Looking random and being random are not the same things. When
events have several possible and equally likely outcomes, any
of them can happen. The way the human mind works, though,
many people think that the pattern of outcomes of events with
several equally likely outcomes ought to look a certain way, a
way that somehow looks random (whatever that means).

For example, real-world research has found that people tend to
believe that, when flipping coins, the most probable outcomes
are those that look the most mixed up. To illustrate this idea,
look at Table 6-2. (Avoid looking at Table 6-3 until you have
read a bit more.) Which exact sequence of coin flips do you think
is most likely to occur?

Table 6-2. Coin-flip patterns, with probabilities not shown

Answer   Pattern of heads and tails           Probability
A        Heads, Tails, Heads, Heads, Tails    ?
B        Tails, Tails, Tails, Tails, Tails    ?
C        Heads, Heads, Tails, Tails, Tails    ?
D        Heads, Heads, Heads, Heads, Tails    ?

Many people give the answer "A." Maybe you did, too. When
asked to explain why A seems the most likely outcome, the
answers include statements like these:

"The others are too ordered."

"A is more mixed up, so it's more likely."

"A looks more random, like it could really happen."

Even though you know that coin flipping is random (assuming the
coin isn't weighted), looking random doesn't make something
more probable. All of these patterns of coin flips are actually
equally probable, as shown by the math in Table 6-3.

Table 6-3. Coin-flip patterns, with probabilities

Answer   Pattern of heads and tails           Probability
A        Heads, Tails, Heads, Heads, Tails    1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125
B        Tails, Tails, Tails, Tails, Tails    1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125
C        Heads, Heads, Tails, Tails, Tails    1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125
D        Heads, Heads, Heads, Heads, Tails    1/2 x 1/2 x 1/2 x 1/2 x 1/2 = 1/32 = .03125

When asked to predict a specific outcome of a series of coin
flips, all possible outcomes must be equally likely, because each
flip of the coin is independent of the other flips. In other words,
the coin doesn't know whether it just landed on Heads or Tails, so
there is no way that the coin can know which side it is supposed
to land on the next time it is flipped. A coin, like dice or a
roulette wheel, has no memory.

How to Spot Random Outcomes

To know an unusual sequence of events when you see it, you
need to decide whether you are supposed to be paying attention
to a combination or a permutation. In probability theory, we talk
about calculating odds by looking at the probabilities of certain
combinations (say, three Heads and two Tails in any order) and
the probabilities of certain permutations (an exact sequence that
would result in three Heads and two Tails, such as Heads, Tails,
Heads, Heads, Tails, in that particular order).

If you are asked a question about which outcome is the most
likely, or whether a given outcome could have occurred by
chance, first determine whether you are being asked about
combinations (the total number of Heads and Tails in any order,
for example, or the number of different ways of drawing five
playing cards of the same suit) or about the permutations that
are possible. Here are the important distinctions between the
two:

Combinations

A combination is the total number of ways that one could
end up with a particular number of values when drawing
randomly from some population. Coin flips are samples
drawn from a theoretically infinitely large population
made up of 50 percent Heads and 50 percent Tails. The
number of combinations varies, depending on the
number of a certain value one is interested in. In other
words, with five draws or flips, there are more ways to
draw out three heads than there are ways to draw out
five heads. So, drawing three heads is likelier than five
heads.

Permutations

Permutations are the number of ways that a given
number of elements could be arranged. In other words,
they are the number of exact sequences. In our coin-flip
example, 5 elements that can each be 1 of 2 values
results in 32 different possible orders of arrangement.
So, each of the permutations shown in Table 6-3 will
occur 1 out of every 32 times.

How to Calculate Combinations

The total number of possible outcomes (exact sequences) is
calculated by taking the number of possible values for one draw
(e.g., two values for a coin: Heads or Tails) and multiplying it by
itself for each draw:

$$2 \times 2 \times 2 \times 2 \times 2 = 2^5 = 32$$

There are 32 possible sequences of 5 coin flips ($2^5$).

The equation for computing the number of ways to get a particular
draw (e.g., three Heads) out of a particular number of elements
drawn from a population is:

$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$$

The previous equation requires these variables:

n

The number of elements or draws (e.g., 5 coin flips).

r

The particular draw of interest (e.g., 3 Heads).

!

Factorial, which means to take the number and multiply
it by that number minus 1, then by that number minus 2,
and so on, all the way down to 1. For example, 5!
represents 5x4x3x2x1 = 120 (which, by the way, is why
there are 120 possible orderings of the five cards in a
poker hand [Hack #62]).

So, the number of ways to get three Heads out of five coin flips
is:

$$\frac{5!}{3!\,(5 - 3)!} = \frac{120}{6 \times 2} = 10$$

10 ways out of 32 possible sequences means that you will get
exactly 3 heads by flipping a coin 5 times 10/32 of the time, or
about 31 percent of the time.
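
The same calculation can be sketched in Python with the standard library's `math.comb`, which computes the number of ways to choose a given count of items from a set. The helper name below is hypothetical and assumes a fair coin.

```python
import math


def p_exact_heads(heads: int, flips: int) -> float:
    """Probability of exactly `heads` Heads in `flips` fair coin flips."""
    ways = math.comb(flips, heads)   # number of sequences with that many Heads
    sequences = 2 ** flips           # all equally likely exact sequences
    return ways / sequences


print(math.comb(5, 3))      # ways to get three Heads in five flips
print(p_exact_heads(3, 5))  # share of all 32 sequences
print(p_exact_heads(5, 5))  # five Heads is much rarer
```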
Statistics Hacking on a Desert Island

If you were on a desert island and didn't have access to
books or equations and had to find out how often exactly
three heads should come up in a group of five coin flips,
you could use the brute force method of listing all the
possible patterns of flips and counting how many of
them have exactly three heads. It would look like this,
with the outcome of interest (three heads) marked with
an asterisk:

HHHHH   THHHH   HHHHT   THHHT*  HHTTH*  THTTH   HHTTT
THTTT   HHHTH   THHTH*  HHHTT*  THHTT   HHTHH   THTHH*
HHTHT*  THTHT   HTHHH   TTHHH*  HTHHT*  TTHHT   HTTTH
TTTTH   HTHTT   TTHTT   HTTHH*  TTTHH   HTTTT   TTTTT
HTHTH*  TTHTH   HTTHT   TTTHT
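
If your desert island happens to have a laptop, the brute-force listing can be generated instead of written out by hand. This is a sketch using `itertools.product` to enumerate every exact sequence of five flips; the variable names are illustrative.

```python
from itertools import product

# Every exact sequence of five flips, e.g. "HHTHT".
all_patterns = ["".join(flips) for flips in product("HT", repeat=5)]

# Keep only the sequences with exactly three Heads.
three_heads = [pattern for pattern in all_patterns if pattern.count("H") == 3]

print(len(all_patterns))                     # total number of sequences
print(three_heads)                           # the sequences of interest
print(len(three_heads) / len(all_patterns))  # share with exactly three Heads
```

Counting by enumeration only works for small problems (the list doubles with every extra flip), which is why the factorial formula is worth knowing.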

When to Be Suspicious

Deciding whether a pattern is random (i.e., what one would
expect by chance) is a matter of:

Knowing the chances of certain combinations (not
permutations)

Fighting the psychological tendency to expect chance
results to not produce a recognizable pattern

Setting a standard for how unlikely an event must be
before questioning the data

Let's return to our table of coin flips, shown now in Table 6-4
with the added chances of certain outcomes of interest.

Table 6-4. Coin-flip outcomes and probabilities

Order                                Order probability   Outcome       Outcome probability
Heads, Tails, Heads, Heads, Tails    .03125              Three Heads   .31250
Tails, Tails, Tails, Tails, Tails    .03125              Five Tails    .03125
Heads, Heads, Tails, Tails, Tails    .03125              Three Tails   .31250
Heads, Heads, Heads, Heads, Tails    .03125              Four Heads    .15625

The rarest of these outcomes is five Tails, which will occur about
3 times for every 100 times you produce five coin flips. It is
unlikely to happen by chance on a given attempt, but it will
happen occasionally across a series of attempts. If it happens
frequently across a series of attempts, something might be up.

What level of likelihood are you comfortable with? How rare must
an event be before you decide it did not occur by chance?
Scientists conventionally use a standard of 5 percent. If study
results suggest an outcome that would occur by chance only 5
percent or less of the time, it is usually considered to be
significant, and is taken as evidence that something other than
chance is in play.

You get to decide for yourself, though, when you want to accuse
someone of being a cheat. Good luck on making that decision! It
should result in fist fights less than 5 percent of the time.
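
To make "happens frequently across a series of attempts" concrete, here is a rough sketch that applies the 5 percent standard to the five-Tails outcome. It assumes the attempts are independent and uses the binomial distribution, which the text does not name explicitly; the function name and the example of 20 attempts are made up for illustration.

```python
from math import comb


def p_at_least(successes: int, attempts: int, p: float) -> float:
    """Binomial chance of `successes` or more hits in `attempts` tries."""
    return sum(
        comb(attempts, k) * p**k * (1 - p) ** (attempts - k)
        for k in range(successes, attempts + 1)
    )


P_FIVE_TAILS = 1 / 32   # chance of five Tails in one set of five flips
STANDARD = 0.05         # the conventional 5 percent cutoff

# How surprising is it to see five Tails 2 or 3 times in 20 attempts?
for times in (2, 3):
    p = p_at_least(times, 20, P_FIVE_TAILS)
    verdict = "suspicious" if p <= STANDARD else "plausibly chance"
    print(f"{times} or more times in 20 attempts: {p:.3f} ({verdict})")
```

Under these assumptions, the second occurrence is still within what chance would comfortably produce, while the third falls below the 5 percent line.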

J ill L ohmeier with Bruc e Frey


Hack 64. Spot Faked Data

If you haven't given it much thought before, it might be quite
natural to assume that all digits are equally likely to show up in
most random data sets. But according to Benford's law, for
many types of naturally occurring data, the lower the digit, the
more frequently it will occur as a leading digit. You can use this
secret knowledge to check the authenticity of any data set.

In the 19th century, long before the age of electronic calculators,
scientists used tables published in books to find values of
logarithms. A particularly observant 19th-century astronomer
and mathematician, Simon Newcomb, noticed that the pages of
logarithm tables were more worn in the first pages than in the
last pages. Newcomb concluded that numbers beginning with 1
occur more frequently than numbers beginning with 2, numbers
beginning with 2 occur more frequently than numbers beginning
with 3, and so on.

Newcomb published an empirical result based on his
observations in the American Journal of Mathematics in 1881,
which stated the probability that a number in many types of
naturally occurring data begins with digit d, for d = 1, 2, ..., 9.
Newcomb's first significant digit law received little attention and
was largely forgotten until over 50 years later when Frank
Benford, a physicist at General Electric, noticed the same
pattern of wear and tear of logarithm tables.

After extensive testing (20,229 observations!) on a wide
variety of data (including atomic weights, drainage areas of rivers,
census figures, baseball statistics, and financial data, among
other things), Benford published the same probability law
concerning the first significant digit in the Proceedings of the
American Philosophical Society (Benford, 1938). This time, the
first significant digit law attracted greater attention and became
known as Benford's law. Although Benford's law became fairly
well known after the 1938 paper, which included substantial
statistical evidence, it lacked a rigorous mathematical
foundation until that foundation was provided by Georgia Tech
Mathematics professor Theodore Hill in 1996 (Hill, 1996).

Today, Benford's law is routinely applied in several areas in
which naturally occurring data arise. Perhaps the most practical
application of Benford's law is in detecting fraudulent data (or
unintentional errors) in accounting, an application pioneered by
Saint Michael's College Business Administration and
Accounting professor Mark Nigrini (http://www.nigrini.com/).

The detection of fabricated data is important not only in
accounting, but also in a wide variety of other applications (for
example, clinical trials in drug testing). This hack describes
Benford's law, shows you how to apply it, provides some intuitive
justification on why it works, and gives some guidelines on when
Benford's law can be applied.

How It Works

In its simplest form, Benford's law states that in many naturally
occurring numerical data, the distribution of the first (nonzero)
significant digit follows a logarithmic probability distribution
described as follows. Following Hill (1997), let $D_1(x)$ denote the
first base-10 significant digit of a number x. For example,
$D_1(9108) = 9$, and $D_1(0.025108) = 2$.

Then, according to Benford's law, the probability that $D_1(x) = d$,
where d can equal 1, 2, 3, ..., 9, is given by the following
equation:

$$P(D_1(x) = d) = \log_{10}\left(1 + \frac{1}{d}\right)$$

Thus, Table 6-5 gives the probabilities of the first significant
digits.

Table 6-5. Probabilities of first digits under Benford's law

First nonzero digit   Probability according to Benford's law
1                     0.301
2                     0.176
3                     0.125
4                     0.097
5                     0.079
6                     0.067
7                     0.058
8                     0.051
9                     0.046

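
The first-digit probabilities in the table can be reproduced directly from the logarithmic formula. The following is a small sketch; the function name is illustrative rather than taken from the author's Matlab code.

```python
import math


def benford_first_digit(d: int) -> float:
    """Benford probability that the first significant digit equals d."""
    if d not in range(1, 10):
        raise ValueError("the first significant digit must be 1 through 9")
    return math.log10(1 + 1 / d)


# Rebuild the table, rounded to three decimal places.
table = {d: round(benford_first_digit(d), 3) for d in range(1, 10)}
for digit, probability in table.items():
    print(digit, f"{probability:.3f}")
```

Because the nine terms telescope to log10(10), the unrounded probabilities add up to exactly 1.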
Laying Down the Law

To demonstrate Benford's law, I'll consider two examples that
you can verify yourself.

Street addresses

To see Benford's law in action, open the phone book of your city
or town to any page, and record the number of house numbers
that begin with each nonzero decimal digit. Two pages should be
sufficient. Unless there is something very unusual about your
town, the relative frequencies should resemble the respective
probabilities predicted by Benford's law.

Table 6-6 shows results computed from the 413 house numbers
taken from two pages of the 2005-2006
Narragansett/Newport/Westerly, RI Yellow Book (White Pages
section).

Table 6-6. Addresses following Benford's law

First nonzero   Relative frequency for        Probability according
digit           first digit of house number   to Benford's law
1               0.334                         0.301
2               0.174                         0.176
3               0.143                         0.125
4               0.075                         0.097
5               0.073                         0.079
6               0.075                         0.067
7               0.046                         0.058
8               0.043                         0.051
9               0.036                         0.046

Figure 6-1 shows the pattern more clearly.

Figure 6-1. Street addresses following Benford's law

Although the agreement with Benford's law is not perfect, you
can see a reasonably good fit. If you take a larger sample of
addresses, the resulting relative frequencies will be even closer
to the probabilities predicted by Benford's law.

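
Tallying first digits by hand gets tedious, so here is a minimal sketch of how the relative frequencies could be computed. The helper names and the short list of house numbers are hypothetical; substitute the numbers you record from your own phone book.

```python
from collections import Counter


def first_digit(value: float) -> int:
    """Return the first nonzero significant digit of a number."""
    if value == 0:
        raise ValueError("zero has no significant digits")
    # In scientific notation the first character is the leading digit,
    # e.g. 0.025108 -> '2.510800000000000e-02'.
    return int(f"{abs(value):.15e}"[0])


def first_digit_frequencies(values) -> dict:
    """Relative frequency of each first digit (1-9) in `values`."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Hypothetical house numbers, for illustration only.
house_numbers = [12, 19, 230, 4, 1045, 87, 151, 33]
print(first_digit_frequencies(house_numbers))
```

A handful of numbers will not resemble Benford's law; the pattern only emerges with a few hundred observations, as in the table above.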
Stock prices

The stock market is known to follow Benford's law. You can verify
this yourself by obtaining up-to-the-minute NASDAQ Securities
prices at http://quotes.nasdaq.com/reference/comlookup.stm.

Figure 6-2 and Table 6-7 show the relative frequencies of the
first nonzero decimal digits for NASDAQ Securities as of
January 27, 2006, compared to the probabilities predicted by
Benford's law.

Figure 6-2. The stock market following Benford's law

Table 6-7. NASDAQ securities following Benford's law

First nonzero   Relative frequency for first   Probability according
digit           digit of NASDAQ securities     to Benford's law
1               0.301                          0.301
2               0.167                          0.176
3               0.133                          0.125
4               0.095                          0.097
5               0.082                          0.079
6               0.071                          0.067
7               0.055                          0.058
8               0.045                          0.051
9               0.049                          0.046

You can obtain the Matlab code used to
produce the tables and figures in this section
at http://homepage.mac.com/samchops/benford/.
Additionally, Mark Nigrini provides his DATAS
software (including a free student Excel
program), which performs a more sophisticated
data analysis of the first, second, and first two
digits, at
http://www.nigrini.com/datas_software.htm.

More General Statements of Benford's Law

Benford's law does not apply to the first nonzero digit only, but
also includes probabilities of other digits. Once again, following
the treatment discussed earlier, let $D_2(x)$ denote the second
base-10 significant digit of a number x. For example,
$D_2(9108) = 1$, $D_2(9018) = 0$, and $D_2(0.025108) = 5$. Notice
that, unlike the first significant digit, the second significant
digit can be zero.

Then, according to Benford's law, the probability that $D_2(x) = d$,
where d can equal 0, 1, 2, ..., 9, is given by the following
equation:

$$P(D_2(x) = d) = \sum_{k=1}^{9} \log_{10}\left(1 + \frac{1}{10k + d}\right)$$

This formula leads to the probabilities of the second significant
digit, shown in Table 6-8.

Table 6-8. Benford's second-digit law

Second significant digit   Probability according to Benford's law
0                          0.11968
1                          0.11389
2                          0.10882
3                          0.10433
4                          0.10031
5                          0.09668
6                          0.09337
7                          0.09035
8                          0.08757
9                          0.08500

From Table 6-8, you can see that the differences among the
probabilities of the second digit are not nearly as dramatic as
those probabilities corresponding to the first digit.
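
The second-digit probabilities follow from summing the Benford probability of every two-digit lead (10 through 99) that ends in the digit of interest. Here is a brief sketch of that sum; the function name is illustrative.

```python
import math


def benford_second_digit(d: int) -> float:
    """Benford probability that the second significant digit equals d."""
    if d not in range(10):
        raise ValueError("the second significant digit must be 0 through 9")
    # Sum over every possible first digit k = 1..9; the two leading
    # digits together form the number 10*k + d.
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


for digit in range(10):
    print(digit, f"{benford_second_digit(digit):.5f}")
```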

Now, back to the stock market. To illustrate Benford's law as it
relates to the second significant digit, I computed the relative
frequencies of the second significant digits of our earlier
NASDAQ Securities example. The results in Table 6-9 show,
again, a close agreement with Benford's law.

Table 6-9. NASDAQ securities following Benford's second-digit law

Second digit   Relative frequency   Probability according
               of second digit      to Benford's law
0              0.12803              0.11968
1              0.11427              0.11389
2              0.10918              0.10882
3              0.10290              0.10433
4              0.10230              0.10031
5              0.09273              0.09668
6              0.09064              0.09337
7              0.09153              0.09035
8              0.08406              0.08757
9              0.08436              0.08500

A more general Benford's probability formula can be used to
compute the joint probability of the first n significant digits.
Let $D_k(x)$ denote the kth base-10 significant digit of a number x.
Then, according to Benford's law, the probability that
$D_1(x) = d_1$, $D_2(x) = d_2$, ..., and $D_n(x) = d_n$ is given by
the following equation:

$$P(D_1(x) = d_1, \ldots, D_n(x) = d_n) = \log_{10}\left[1 + \left(\sum_{k=1}^{n} d_k \times 10^{\,n-k}\right)^{-1}\right]$$

Note that if k does not equal 1, then $d_k$ can equal 0, 1, 2, ..., 9
and, as noted earlier, $d_1$ can equal 1, 2, ..., 9.
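
In the general formula, the sum inside the logarithm is simply the integer formed by the leading digits (for digits 3, 1, 4 it is 314). Here is a sketch of the joint probability under that reading; the function name is hypothetical.

```python
import math


def benford_joint(digits) -> float:
    """Benford probability that a number starts with the given digits."""
    digits = list(digits)
    if not digits or digits[0] not in range(1, 10):
        raise ValueError("the first digit must be 1 through 9")
    if any(d not in range(10) for d in digits[1:]):
        raise ValueError("later digits must be 0 through 9")
    # Build the integer d1 d2 ... dn, i.e. the sum of d_k * 10**(n - k).
    leading = 0
    for d in digits:
        leading = leading * 10 + d
    return math.log10(1 + 1 / leading)


print(benford_joint([1]))        # same as the first-digit law for d = 1
print(benford_joint([3, 1, 4]))  # chance a number begins with 314
```

Summing the two-digit joint probabilities over every possible first digit recovers the second-digit law, which is a useful consistency check.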

Where Else It Works

Two unique properties of Benford's law are scale invariance and
base invariance.

Scale invariance

Benford's law is scale-invariant; that is, if you multiply the data
by any nonzero constant, you still wind up with a distribution that
closely follows Benford's law. Thus, it makes no difference
whether you measure stock quotes in dollars, dinars, or shekels,
or whether you measure lengths of rivers in miles or kilometers.
You'll always wind up with data that follows Benford's law.

To illustrate this, I took the NASDAQ securities data used in the
earlier example and multiplied each value by π. As you can see
in Table 6-10, the relative frequencies still follow Benford's law.

Table 6-10. NASDAQ securities scaled by π following Benford's law

First nonzero   Relative frequency for first   Probability according
digit           digit of NASDAQ securities     to Benford's law
1               0.306                          0.301
2               0.176                          0.176
3               0.123                          0.125
4               0.097                          0.097
5               0.081                          0.079
6               0.066                          0.067
7               0.058                          0.058
8               0.049                          0.051
9               0.045                          0.046
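
Scale invariance can also be tried on synthetic data. The sketch below does not use the NASDAQ data; it assumes made-up "prices" drawn so that their logarithms are uniform over four orders of magnitude (which yields Benford-distributed first digits), then rescales them by π. The seed, sample size, and helper names are arbitrary choices for illustration.

```python
import math
import random
from collections import Counter


def first_digit(value: float) -> int:
    """First nonzero significant digit, read from scientific notation."""
    return int(f"{abs(value):.15e}"[0])


def digit_frequencies(values) -> list:
    """Relative frequencies of first digits 1..9, in order."""
    counts = Counter(first_digit(v) for v in values)
    return [counts[d] / len(values) for d in range(1, 10)]


rng = random.Random(64)  # fixed seed so the run is repeatable
# Synthetic prices between 1 and 10,000 with uniformly distributed logs.
prices = [10 ** rng.uniform(0, 4) for _ in range(50_000)]
scaled = [math.pi * price for price in prices]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
original_freq = digit_frequencies(prices)
scaled_freq = digit_frequencies(scaled)

print("digit  original  scaled  Benford")
for d in range(1, 10):
    print(f"{d:>5}  {original_freq[d - 1]:.3f}     "
          f"{scaled_freq[d - 1]:.3f}   {benford[d - 1]:.3f}")
```

Any other nonzero multiplier (a currency conversion rate, say) can be swapped in for `math.pi` to see the same effect.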

Base invariance

The base-invariant property of Benford's law states that it
applies not only in base 10, but also in more general bases.
Moreover, Theodore Hill showed that Benford's law is the only
probability law that has this property (Hill, 1995).

You can find the formula for Benford's law
in the general base-b case in Hill (1997).
See the "See Also" section for publication
details.

Benford's law works best on data that has the following
characteristics:

Sufficient variability

The higher the variability, the better Benford's law
applies.

No built-in maximum or other similar constraint

For example, Benford's law does not apply to the ages of
high school seniors, or to members of the local senior
citizen center.

Numbers that result from counting or measuring

For example, it does not work well for social security
numbers and ZIP Codes, because they are simply
identifiers and are not true numerical values.

Large sample size

The larger the data set, the better Benford's law applies.

Random sampling

The data results from a large number of random samples
from a large number of randomly selected probability
distributions. This realization by Hill led him to his proof
of Benford's law (Becker, 2000; Hill, 1999).

Since tax data strongly follows Benford's law, it has been used
quite successfully to identify fraudulent tax returns. In
describing some of the basic features of Benford's law, we
showed how anyone can perform a quick-and-dirty test for
irregularities in data. Specifically, anyone can easily compute
relative frequencies of first digits and eyeball the results
juxtaposed with probabilities predicted by Benford's law.

In practice, the programs used by experts and authorities to
identify deviations from Benford's law and other irregularities
can be quite sophisticated. It is also important to keep in mind
that deviation from Benford's law does not prove fraud, but it
does raise red flags suggesting that further investigation might
be indicated.

For more details on the application of
Benford's law to detect fraud, including a
"goodness-of-fit" test, see Nigrini (1996).
Consult this hack's "See Also" section for
publication details.
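
As a step up from eyeballing, one common goodness-of-fit check is Pearson's chi-square statistic. The sketch below is an assumption on my part, not necessarily the test Nigrini describes: it compares the relative frequencies reported in this hack's tables for the 413 house numbers and the 601 MP3 files against Benford's law, using 15.507 as the 5 percent critical value for 8 degrees of freedom.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]
CRITICAL_5_PERCENT = 15.507  # chi-square cutoff, 8 degrees of freedom


def chi_square_vs_benford(relative_freqs, sample_size: int) -> float:
    """Pearson chi-square statistic for first-digit frequencies."""
    # Working with proportions: n * sum((observed - expected)^2 / expected).
    return sample_size * sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(relative_freqs, BENFORD)
    )


# Relative frequencies copied from this hack's tables.
addresses = [0.334, 0.174, 0.143, 0.075, 0.073, 0.075, 0.046, 0.043, 0.036]
mp3_files = [0.080, 0.097, 0.276, 0.270, 0.161, 0.070, 0.023, 0.013, 0.001]

for label, freqs, n in (("addresses", addresses, 413),
                        ("MP3 files", mp3_files, 601)):
    statistic = chi_square_vs_benford(freqs, n)
    flag = "red flag" if statistic > CRITICAL_5_PERCENT else "consistent"
    print(f"{label}: chi-square = {statistic:.1f} ({flag})")
```

A large statistic only says the digits do not look Benford-like; as the text notes, that is a reason to investigate, not proof of fraud.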

Why It Works

Although the proof of Benford's law is quite technical, there are
some insightful and intuitive explanations for this mathematical
principle. One such explanation that I find particularly attractive
has been provided by Mark Nigrini (1999).

His explanation goes something like this. If you imagine that
some investment with an initial amount of $100 is expected to
grow at an annual rate of 10 percent, it would take about 7.3
years for the first digit of the total amount to change to 2. This is
because the total amount has to increase by 100 percent to
reach a value of $200. In contrast, consider the time it would
take for $500 to increase to $600. If we continue to assume an
annual growth rate of 10 percent, it would take about 1.9 years
to reach $600. So, the amount of time the investment amount has
a first digit of 5 is considerably less than the amount of time it
has a first digit of 1. Once the total amount reaches $1,000, it
will again take about 7.3 years before it will have a first digit
of 2 (after another 100 percent increase).

The real world is a bit more complicated, but this does help to
explain why 1 is a more common first digit than larger digits.
Another intuitive explanation is that there are more small towns
than large cities, and there are more short rivers than long
rivers.
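
The growth argument can be worked through numerically. This sketch assumes annual compounding at a constant 10 percent, as in the example; the function name is illustrative. It also shows that the share of time spent with each first digit matches the Benford probability for that digit.

```python
import math


def years_to_grow(start: float, target: float, rate: float = 0.10) -> float:
    """Years for `start` to reach `target` at a constant annual growth rate."""
    return math.log(target / start) / math.log(1 + rate)


print(f"{years_to_grow(100, 200):.1f}")  # time spent with first digit 1
print(f"{years_to_grow(500, 600):.1f}")  # time spent with first digit 5

# Share of the $100-to-$1,000 journey spent on each first digit.
total_years = years_to_grow(100, 1000)
for d in range(1, 10):
    share = years_to_grow(100 * d, 100 * (d + 1)) / total_years
    print(d, f"{share:.3f}", f"{math.log10(1 + 1 / d):.3f}")
```

The growth rate cancels out of the share calculation, so the same proportions appear whether the investment grows at 10 percent or at any other steady rate.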

Where It Doesn't Work

Benford's law is less likely to apply in data sets with insufficient
variability or data sets that are nonrandomly selected. For
example, computer file sizes approximately follow Benford's
law, but only if no restriction is placed on the type of files
selected.

To illustrate this, I found the frequencies of the first digit of the
file sizes on an Apple PowerBook G4. The results shown in
Figure 6-3 and Table 6-11 exhibit the Benford's law pattern.

Figure 6-3. Computer files that follow Benford's law

Table 6-11. Computer files that approximately follow Benford's law

First nonzero   Relative frequency for first      Probability according
digit           digit of 660,172 computer files   to Benford's law
1               0.277                             0.301
2               0.181                             0.176
3               0.144                             0.125
4               0.107                             0.097
5               0.076                             0.079
6               0.067                             0.067
7               0.054                             0.058
8               0.054                             0.051
9               0.041                             0.046

Although the results shown in Figure 6-3 and Table 6-11 are
based on 660,172 files, Table 6-12 demonstrates that a sample
size of 600 is large enough to exhibit the Benford's law pattern
(albeit not as well as the larger sample), provided the sample of
files is random.

Table 6-12. Random selection of 600 computer file sizes

First nonzero   Relative frequency for first   Probability according
digit           digit of 600 computer files    to Benford's law
1               0.262                          0.301
2               0.187                          0.176
3               0.147                          0.125
4               0.107                          0.097
5               0.069                          0.079
6               0.070                          0.067
7               0.052                          0.058
8               0.057                          0.051
9               0.052                          0.046

For comparison, I computed the relative frequencies of the first
digits of the sizes of MP3 files in an iTunes music library on the
same computer. Table 6-13 and Figure 6-4 show that this set of
files does not follow Benford's law.

Table 6-13. Music MP3 files that do not follow Benford's law

First nonzero   Relative frequency for first   Probability according
digit           digit of 601 MP3 files         to Benford's law
1               0.080                          0.301
2               0.097                          0.176
3               0.276                          0.125
4               0.270                          0.097
5               0.161                          0.079
6               0.070                          0.067
7               0.023                          0.058
8               0.013                          0.051
9               0.001                          0.046

Figure 6-4. Music MP3 files that do not follow Benford's law

The fact that the file sizes of about 600 MP3 music files do not
approximate Benford's law is not surprising, since the sizes of
MP3 music files exhibit much less variability than a more
random selection of any 600 computer files.

See Also
Becker, T. J. (2000). "Sorry, wrong number: Century-old
math rule ferrets out modern-day digital deception,"
Georgia Tech Research Horizons,
http://gtresearchnews.gatech.edu/reshor/rh-f00/math.html.

Browne, M. (1998). "Following Benford's law, or looking
out for no. 1." The New York Times, August 4, 1998.

Fawcett, W. (n.d.). "Significant figure generator."
http://williamfawcett.com/flash/SigFigDistbGen.htm.

Benford, F. (1938). "The law of anomalous numbers."
Proceedings of the American Philosophical Society, 78,
551-572.

Hill, T. P. (1996). "A statistical derivation of the
significant digit law." Statistical Science, 10, 354-363.

Hill, T. P. (1995). "Base-invariance implies Benford's
law." Proceedings of the American Mathematical Society,
123, 887-895.

Hill, T. P. (1997). "Benford's law." Encyclopedia of
Mathematics Supplement, 1, 112. Kluwer.

Hill, T. P. (1999). "The difficulty of faking data." Chance,
26, 8-13.

Newcomb, S. (1881). "Note on the frequency of use of
the different digits in natural numbers." American Journal
of Mathematics, 4, 39-40.

Nigrini, M. (1999). "I've got your number: How a
mathematical phenomenon can help CPAs uncover fraud
and other irregularities." AICPA Journal of Accountancy
Online Journal, May 1999,
http://www.aicpa.org/pubs/jofa/may1999/nigrini.htm.

Nigrini, M. (1996). "A taxpayer compliance application
of Benford's law." Journal of the American Taxation
Association, 18, 72-91.

You can obtain the Matlab code used to produce the
tables and figures in this section at
http://homepage.mac.com/samchops/benford/. You'll
need to have Matlab (http://www.mathworks.com)
installed to run the code.

Ernest E. Rothman
Hack 65. Give Credit Where Credit Is Due

Stylometrics is a statistical procedure that identifies the
underlying dimensions that define an author's style. It uses the
method of factor analysis to judge who wrote what.

Professor Howe-Mutch had a problem. Two of his best students
were sitting in his office, hoping to resolve a dispute. Dr.
Howe-Mutch had awarded an A+ to Paul's final paper (on the
historical importance of chocolate milk). The problem was that Lisa
claimed to have written it. An accusation of plagiarism had been
made! Both were good students who had written many quality
papers for him in the past. So, the solution as to true authorship
was not a simple one, nor was the realization that one of his
favorite students was a cheat.

Fortunately, the good doctor of philosophy had many years of
experience and was wiser than his adjunct position at State
Community College and Trucking School might have suggested.
Among other obscure statistical hobbies, Dr. Howe-Mutch
dabbled in the art of stylometry, a statistical method for
categorizing the style of written works. The method can also be
used to identify anonymous authors. It works best when there
are a couple of possibilities or suspects to choose from, and
when the typical writing styles of the suspects are known and
have been quantified. Let's watch as the broken-hearted
professor applies these techniques to find the true author.

Building a Model

First, Dr. Howe-Mutch asks Paul and Lisa to bring in all the other
papers they have each written in the past and about which there
is no dispute. In just a few moments, the papers are scanned
into a computer, providing a database of all the different words
used by both writers.

Or they were sent to him electronically so
no scanning was necessary; none of this is
relevant to the story, so why are you
questioning me about it?

For the first analysis, all the words written by the two writers
are kept together. Dr. Howe-Mutch counts the frequency with which
each word is used and identifies the 50 to 100 most commonly used
words in the combined database. These words become the items or
key variables that supply the data for a factor analysis. Factor
analysis is a statistical process that looks at the correlations
[Hack #11] among groups of variables and identifies clusters of
variables that correlate better among themselves than they do
with other variables. Whatever these grouped-together variables
have in common is assumed to be a factor, component, or dimension
that they all share.

For the sake of our story, I'll show only 10 of the words that
Dr. Howe-Mutch identified as most common across both writers'
works. Table 6-14 shows the words and their frequency of use.
When looking at all the words Paul and Lisa wrote, "the" was used
4.2 percent of the time, "weasel" was used 1 percent of the time,
and so on.

Table 6-14. Paul and Lisa's commonly used words and their
frequency

Word      Frequency
the       4.2 percent
and       2.1 percent
to        1.8 percent
a or an   1.2 percent
weasel    1.0 percent
of        0.8 percent
in        0.8 percent
that      0.5 percent
it        0.4 percent
not       0.2 percent

These words act as variables to try to identify the underlying
factors that describe one or more dimensions of style. Paul and
Lisa's styles might be at different places along these
dimensions. It might be that only one dimension or factor is
necessary to account for variability in the usage of these words,
or there might be many dimensions. Once these dimensions (defined
by the variables that correlate together, or load, on the
dimension) are identified, any writing sample could be placed in
the theoretical space framed by the factors.

The data for Dr. Howe-Mutch's factor analysis are supplied by
each section of 500 words in the writing samples. Each section
receives a score on each of the word variables. The score will be
the number of times the word is used in that section. Table 6-15
shows examples of the data Dr. Howe-Mutch collects.

Table 6-15. Sample of study data

           the  and  to  a/an  weasel  of  in  that
Section 1   21    8  11     5       4   0   0     1
Section 2   10    7  15     5       2  10   1     0
Section 3    5    5   5     2       6  12   2     4
Section 4    0    2   4     3       1   4   6     8
Section 5    4   11  16     2       0   3   5     0

In Table 6-15, scores indicate the number of times each word
appears in the text sections.
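The scoring step just described can be sketched in a few lines of Python. This is only an illustration, not Dr. Howe-Mutch's actual software: the function names, the simple tokenizing rule, and the tiny sample sentence are all assumptions made for the example. The 500-word section size and the pooling of "a" and "an" come from the text and tables.

```python
import re
from collections import Counter

# The eight word variables shown in Table 6-15.
TARGET_WORDS = ["the", "and", "to", "a/an", "weasel", "of", "in", "that"]


def split_into_sections(text, section_size=500):
    """Split text into lists of lowercase words, `section_size` words each."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + section_size] for i in range(0, len(words), section_size)]


def score_section(words):
    """Count how often each target word appears in one section."""
    counts = Counter(words)
    scores = {}
    for target in TARGET_WORDS:
        if target == "a/an":
            # "a" and "an" are pooled into one variable, as in the tables.
            scores[target] = counts["a"] + counts["an"]
        else:
            scores[target] = counts[target]
    return scores


if __name__ == "__main__":
    # Made-up sample text; a tiny section size keeps the demo readable.
    sample = "The weasel ran to the barn and a weasel ate an egg in the barn."
    sections = split_into_sections(sample, section_size=8)
    for number, section in enumerate(sections, start=1):
        print(f"Section {number}", score_section(section))
```

Each dictionary printed here corresponds to one row of a table like Table 6-15. A real study would use the default section size of 500 words and far more text.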

Factor Analysis

Next, Dr. Howe-Mutch performs the factor analysis, a fairly
complex mathematical process that these days is done using
computers, while the researcher makes many theory-driven
decisions at different points along the way. Basically, the
factors are identified by exploring the relationships among
variables until a small number of variable groupings are found
that seem to account for as much variability as possible across
the data. The commonality shared by variables in each grouping
provides the mathematical fodder that defines the factor. Once
the factors are chosen, any observation (in this case, a sample
of text) can receive scores on the factor and then be placed in
that theoretical space, with the factor scores serving as
coordinates.
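A full factor analysis is beyond a short example, but its raw material is the set of correlations among the word variables. The sketch below computes only that first step, a correlation matrix, from the five sample sections in Table 6-15. It is not a factor analysis, and five sections are far too few for a real one. The function names are my own, chosen for illustration.

```python
from math import sqrt

# Word counts per section (Sections 1-5), copied from Table 6-15.
COUNTS = {
    "the":    [21, 10, 5, 0, 4],
    "and":    [8, 7, 5, 2, 11],
    "to":     [11, 15, 5, 4, 16],
    "a/an":   [5, 5, 2, 3, 2],
    "weasel": [4, 2, 6, 1, 0],
    "of":     [0, 10, 12, 4, 3],
    "in":     [0, 1, 2, 6, 5],
    "that":   [1, 0, 4, 8, 0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(counts):
    """Correlate every word variable with every other one."""
    words = list(counts)
    return {(a, b): pearson(counts[a], counts[b]) for a in words for b in words}


if __name__ == "__main__":
    matrix = correlation_matrix(COUNTS)
    for pair in [("the", "a/an"), ("the", "of"), ("the", "in")]:
        print(pair, round(matrix[pair], 2))
```

In this small sample, "the" correlates positively with "a/an" and negatively with "of" and "in", which is the kind of clustering a factor analysis goes on to summarize as a single dimension.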

In this case, the analysis suggests that two factors do a good
job of describing the sample texts. Factor 1 is defined by the
use of words such as "the" and "a/an" at one end and "of" and
"in" at the other. In other words, the text sections differed
based on how frequently they used articles, and the sections that
had a higher frequency of article use tended to be lower in their
use of prepositions. Factor 2 is defined by the frequency of the
use of the word "weasel."

In exploratory factor analysis, typically researchers are
interested in discovering and naming the underlying constructs
(i.e., invisible traits) that account for human behaviors and
characteristics. For this use, though, Professor Howe-Mutch is
interested only in defining these dimensions based on the
variables (e.g., word use) that anchor them at both ends. He is
not interested in figuring out why those text sections that tend
to contain the word "the" frequently also tend to contain "a" or
"an" frequently. He also is not interested in why the use of the
word "weasel" could distinguish between his different writing
samples. For his purposes, he is content just knowing that these
two factors provide a couple of good axes to map the space of all
words that the two authors chose to use in their samples.

When the factor scores from each of Paul and Lisa's sample papers
are computed, it becomes clear that the two authors have
different styles. Lisa tends to use the word "weasel" more
frequently than Paul; her papers score high on Factor 2. Lisa's
papers also tend toward the high use of articles and receive
fairly high Factor 1 scores. Paul's papers, on the other hand,
tend to avoid the use of the word "weasel" and tend toward the
preposition end of Factor 1.

This is difficult to grasp using words alone, so an illustration
will help demonstrate the placement of the sample texts. Figure
6-5 shows the two factors, the word usage that defines them, and
where the different writing samples loaded on the two factors.
For the sake of convenience for this discussion, Figure 6-5
displays only a few of the writing samples and maps only the 10
words in Table 6-14 and Table 6-15. Also included in the figure
is the placement of the disputed paper in this theoretical
dimensional space.

Figure 6-5. Factor analysis of text samples


The solution to the mystery is now clear. The disputed paper
shares the characteristics of Lisa's papers, not Paul's. Because
Paul and Lisa's earlier papers display a consistent but different
style, at least as defined by word counts, the factor map is a
useful tool to identify the most likely author of the paper.

Dr. Howe-Mutch awards the A+ to Lisa, accuses Paul of plagiarism,
and is now engaged in a lengthy court battle with Paul's
attorneys, which will no doubt leave our fine statistician friend
penniless. The important thing, though, is that a statistical
procedure was able to make the invisible visible. Science
triumphed once again.

See Also

"Who wrote the 15th book of Oz?" by J.N.G. Binongo in Chance,
16, 2, 9-17.

Hack 66. Play a Tune on Pascal's Triangle

Need to know the odds quickly? Pascal's Triangle is a simple
layout of numbers that allows for quick and easy calculations of
probability. It's worked for 300 years, so I bet it will work for
you.

The thing that statisticians do most often is calculate
probabilities, which can describe expected outcomes for a variety
of situations. A simple example is flipping a coin. Imagine that
you have been asked to wager on the outcome of a coin flip. With
two possible outcomes, heads or tails, the chance of getting
either outcome on a single coin flip is 1 out of 2, or 1/2.

The math is easy if you know the number of different ways to get
the winning outcome and the number of possible outcomes. In the
coin flip example, there's only one way to get a winning outcome,
and there are only two possible outcomes. The math is just a bit
harder if we have more than one coin flip and wonder about the
number of all possible outcomes and how many of those
combinations would match our winning criteria. For example, if I
want two heads in a row on two coin flips, I could list all
possible outcomes, identify the number of those outcomes that
make me a winner, and then see what proportion of all outcomes
are winners for me. That proportion would be my chances of
winning.

The number of possible outcomes that count as winners is often
more complex than our simple coin flip examples, though, because
there might be many trials (or dice rolls, or purchases of
lottery tickets, or whatever) and many different combinations.
For example, you might want to figure the number of possible
combinations of different elements from any set of objects you're
drawing out of a hat or choosing through some other random
selection process.

Imagine you are one of six relatives driving to the airport and
you must all drive there in one big van. None of you like each
other much, so you need some fair way to decide who will sit
where. You will randomly pick two names to drive together in the
front seat.

A private note to my Uncle Frank: yes, this example is based on
the "unpleasantness" last Thanksgiving. All is forgiven, at least
on my side of the family, but we agree it would be best if you
brought your own car next year.

Now, you need to know the chances that you will be in the front
seat and who you might be with. The problem is calculating how
many different combinations of relatives could be in that front
seat. For both simple wagers, such as coin flips, and
life-and-death situations, such as long car trips, you can use a
layout of numbers called Pascal's Triangle to do the math for
you.

Presenting Pascal's Triangle

Pascal's Triangle is shown in Figure 6-6. This layout of numbers
has some interesting properties. Here it is shown made up of 10
rows, with 10 numbers in its lowest row, but it can be made
infinitely large with an infinite number of rows. The outer edges
going down are all 1s. The next diagonals start with 1 but
increase by 1 as they go down.

Figure 6-6. Pascal's Triangle


Similar interesting progressions are found throughout the
Triangle. Notice that each number is the sum of the two numbers
above it: 84 is 56 + 28, 7 is 6 + 1, and so on. These cool
patterns aren't the reason that the Triangle is of interest to
us, however. Instead, we're going to use it to calculate the
probabilities for a variety of outcomes.
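The "sum of the two numbers above" rule is all you need to build the Triangle yourself. Here is a minimal sketch in Python; the function name is my own, and rows are numbered from 0 at the top, matching the counting used later in this hack.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's Triangle (row 0 is [1])."""
    rows = [[1]]
    while len(rows) < count:
        previous = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        interior = [left + right for left, right in zip(previous, previous[1:])]
        rows.append([1] + interior + [1])
    return rows[:count]


if __name__ == "__main__":
    rows = pascal_rows(10)
    width = len(" ".join(map(str, rows[-1])))
    for row in rows:
        # Center each row so the output has the familiar triangular shape.
        print(" ".join(map(str, row)).center(width))
```

Asking for 10 rows gives a triangle whose lowest row has 10 numbers, and you can spot the 84 sitting below 28 and 56.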

Calculating Probability Using the Triangle

Pascal's Triangle, named for Blaise Pascal (a very smart, very
early contributor to probability theory who lived in the 1600s),
has already made use of all the calculations we need to answer a
variety of questions.

Though this pattern of numbers is known as Pascal's Triangle,
Pascal did not originate it and never claimed to have. Similar
patterns of numbers were presented by Pascal's teacher, Hérigone,
as well as other colleagues writing at about the same time.

A general formula exists to determine the number of possible
outcomes of a certain type. This general formula works any time
there are exactly two possible outcomes; thus, the term binomial
coefficient is used to describe the outcome of the formula
(bi-nomial means having two names or, in a statistical sense, two
outcomes). To determine the number of possible binomial
combinations of outcomes across a given number of trials, this
formula is used:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

A range of possible values that could be entered into this
formula are the coordinates on Pascal's map. The n in the
equation, which represents the number of trials or events,
indicates which row to go to. The k in the equation tells us
which entry on that row to go to. The 1s along the left of the
map are like a border: they count as zero. So, to use the
triangle, we start counting at 0.

The exclamation point after some numbers in this formula means
factorial, which, in turn, means that you are supposed to count
down from that number to 1, multiplying the numbers together as
you go. For example, 5 factorial means 5 x 4 x 3 x 2 x 1, or 120.
By the way, by rule, 0! counts as 1.
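If you would rather let a computer do the counting down and multiplying, here is a small Python sketch. It uses the standard factorial form of the binomial coefficient, n! divided by k! times (n - k)!, and checks the result against Python's built-in math.comb (available in Python 3.8 and later). The helper name binomial_coefficient is my own.

```python
from math import comb, factorial


def binomial_coefficient(n, k):
    """Number of ways to choose k items from n, via the factorial formula."""
    if not 0 <= k <= n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))


if __name__ == "__main__":
    print(factorial(5))                # 5 x 4 x 3 x 2 x 1
    print(factorial(0))                # by rule, 0! counts as 1
    print(binomial_coefficient(6, 3))  # row 6, entry 3 of the triangle
    print(binomial_coefficient(6, 3) == comb(6, 3))
```

Row n, entry k of the Triangle (both counted from 0) is exactly binomial_coefficient(n, k).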

Assessing the probability of flipped coin outcomes

For our second, slightly more complex coin flip question, the
chances of getting exactly two heads when flipping a coin two
times, use the triangle like this:

1. The row to go to is determined by the number of coin flips we
   will make: 2. The entry we will count over to in that row is
   determined by the number of heads we want to see: 2. With our
   coin flip example, 2 heads on 2 trials, count down two rows to
   this row:

       1 2 1

2. Then, count over two entries to the 1. Our answer is 1, and so
   there is one chance that we will get two heads.

3. But 1 chance out of how many chances? Add up all the numbers
   in the row you are in to get that answer! 1 + 2 + 1 = 4, so
   our chances are 1 out of 4, or 25 percent.

The Triangle answers more complicated questions, as well. Suppose
you want exactly 3 heads out of six coin flips:

1. Count down six rows (remember to count the top of the triangle
   as 0). You get to this row:

       1 6 15 20 15 6 1

2. Count over three numbers and you hit 20. There are 20
   different ways that exactly three heads could come up in six
   coin flips.

3. 20 out of how many possibilities, you ask? Summing all the
   values in that row gives us 64. 20 out of 64 times you will
   get exactly three heads (or three tails). That's about 31
   percent of the time.
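The same lookup-and-sum procedure can be written as a short Python function. This is a sketch that assumes a fair coin; the name prob_exact_heads is my own, and math.comb stands in for reading an entry off the triangle.

```python
from fractions import Fraction
from math import comb


def prob_exact_heads(flips, heads):
    """Chance of exactly `heads` heads in `flips` fair coin flips."""
    ways = comb(flips, heads)  # entry `heads` in row `flips` of the triangle
    total = 2 ** flips         # the sum of every entry in that row
    return Fraction(ways, total)


if __name__ == "__main__":
    for flips, heads in [(2, 2), (6, 3)]:
        chance = prob_exact_heads(flips, heads)
        print(f"{heads} heads in {flips} flips: {chance} = {float(chance):.4f}")
```

Using Fraction keeps the answer exact, so 20 out of 64 shows up in lowest terms as 5/16 before being converted to a decimal.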
Assessing the probability of a bad car trip

Another way to use the triangle is to see how many combinations
are possible from a certain number of elements drawn out in a
certain way. Our car trip example is interested in how many
possible combinations of two people can be drawn from a group of
six.

From a set of six elements, you will draw out two of them and
match those up. For this question, and in the binomial formula
that defines the triangle, think of the six relatives as n and
the two names to be drawn as the k:

1. Count down six rows and across two entries, and you hit the
   number 15. There are 15 possible combinations of two people
   drawn from six people.

2. In this case, you are interested only in your chances of being
   in the front seat with one specific person. That's 1
   combination of front seat passengers out of 15 possible
   combinations. So, you will be matched up with your annoying
   Uncle Frank, or Aunt Tillie, or whomever, in the front seat
   just 1 out of 15 times.
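If you don't trust the triangle, you can simply list every possible front-seat pairing and count. The sketch below does that with Python's itertools.combinations. The passenger list is hypothetical: only Uncle Frank and Aunt Tillie are named in the text, so the three cousins are invented for illustration.

```python
from itertools import combinations

# Hypothetical passenger list; the three cousins are made up.
RELATIVES = ["You", "Uncle Frank", "Aunt Tillie",
             "Cousin A", "Cousin B", "Cousin C"]


def front_seat_pairs(people):
    """Every unordered pair that could share the front seat."""
    return list(combinations(people, 2))


def chance_of_pair(people, first, second):
    """Return (matching pairs, total pairs) for one specific pairing."""
    pairs = front_seat_pairs(people)
    matches = [pair for pair in pairs if set(pair) == {first, second}]
    return len(matches), len(pairs)


if __name__ == "__main__":
    matches, total = chance_of_pair(RELATIVES, "You", "Uncle Frank")
    print(f"{matches} out of {total}")
```

Brute-force listing like this is practical only for small groups, which is exactly why the triangle (or the formula behind it) is handy.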

Why It Works

The numbers in the triangle match the values that you would
derive if you actually did the math using the binomial equation,
but you'll notice that the triangle answers several other
questions along the way. The patterns of the numbers, their
progression, are consistent with other formulas used in
determining probability.

For instance, the total possible number of coin flip combinations
for six coin flips is answered on the triangle by totaling the
values in the sixth row: 64. You would have mathematically
derived that value by applying the general formula for number of
possible outcomes for a coin:

$$2^{\text{number of flips}} = 2^6 = 64$$

As for the chances that you would both be chosen as one of two
people out of six and that the other person would be a specific
one of the other people (our trip to the airport example), the
triangle said 1 out of 15. But you also could have figured it
this way:

1. Chance of being in a group of two people out of six = 2/6 =
   .33

2. Chance of a specific "other" person being chosen = 1 person
   out of 5 "others" = 1/5 = .20

3. Chance of both outcomes occurring = .33 x .20 = .066, and .066
   is about 1 out of 15
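Both cross-checks are easy to confirm with exact arithmetic. This short Python sketch uses fractions so that no rounding creeps in; the variable names are mine.

```python
from fractions import Fraction
from math import comb

# Summing row 6 of the triangle gives the same total as 2 ** 6.
row_six_total = sum(comb(6, k) for k in range(7))

# Car trip: the chance you are picked, times the chance that one
# specific relative fills the other front seat.
chance_you_are_picked = Fraction(2, 6)
chance_of_specific_partner = Fraction(1, 5)
chance_of_both = chance_you_are_picked * chance_of_specific_partner

if __name__ == "__main__":
    print(row_six_total, 2 ** 6)
    print(chance_of_both, round(float(chance_of_both), 4))
```

Working with exact fractions gives 1/15 on the nose; the .066 in the steps above comes from rounding 2/6 to .33 before multiplying.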

So, when you have a complicated-looking problem that involves
combinations and permutations and so many possibilities that it
makes your head spin, let the soothing music of Pascal's Triangle
bring peace to your troubled mind.

Hack 67. Control Random Thoughts

The rambling nature of our inner thoughts is often perceived as
creating an unpredictable random path. You can take advantage of
this misperception to guess the thoughts of those around you by
increasing the probability that they will focus on whatever you
wish.

No stranger to creepy scenes, Edgar Allan Poe relates this one in
The Murders in the Rue Morgue:

Occupied with thought, neither of us had spoken a syllable for
fifteen minutes at least. All at once, Dupin broke forth with
these words: "He is a very little fellow, that's true, and would
do better for the Theatre des Varietes." "There can be no doubt
of that," I replied unwittingly... "Dupin," said I gravely, "this
is beyond my comprehension. I do not hesitate to say I am amazed,
and can scarcely credit my senses. How was it possible you should
know (what) I was thinking of...?"

Have you ever been talking to someone, and your mind wanders off
for a little while? Then, you bring up whatever it was that you
were thinking about and, lo and behold, the other person was
thinking about the exact same thing!

Why does this happen? Can you make it happen? Can you predict
what the other person is going to say? More than likely, yes, you
can sometimes make it happen, and sometimes you can predict what
the other person is going to say. This is especially true if the
two of you share a common background.

Mind Control

Our memories are filled with words, thoughts, stories, and so on
that are associated with other words, thoughts, and stories. If
you want someone to think of a certain topic so that you can read
her mind, the easiest way to trick her into thinking what you
want her to think is by bringing up a topic that is closely
related to the desired topic.

For example, if you want your friend to start thinking about
lions and tigers and bears, you might prime her thought process
with words that are associated with that theme: words such as
"Wizard of Oz," "Dorothy," "Toto," or even "stripes," since
stripes and tigers are highly associated with each other.

All words have a certain frequency of occurrence in written and
spoken language. Some words have a very high frequency of
occurrence (such as "the," "it," etc.), while other words have
very low frequency of occurrence (such as "aardvark").
Additionally, some words occur with other words quite frequently
(such as "salt and pepper" or "rhythm and blues"). In fact, they
occur so often with the other words that research has found that
people think both words even when only one is said.

By learning these associations, we can process incoming
information more quickly. If we hear "salt" and are already
thinking "salt and pepper," we are one step ahead and can begin
to reach for both before our dinner companion even finishes
asking us to pass them.

So, if you want to "control" someone's mind, the trick is simply
to know which things occur most frequently together. The more
frequent a word is, the more likely someone is to think it.
Likewise, the more frequently two words occur together, the more
likely one is to think of both words when only one is stated.

Probability and Word Association

Researchers interested in those words that tend to be associated
have collected data over the years to see what is normal for us
humans. Psychiatrists use knowledge of typical free associations
between words as a tool for reading the subconscious. Cognitive
psychologists use the same information to map the way the brain
processes information.

A huge amount of information is known about cues (the word
presented that might lead to an association) and targets (the
words thought of after the cue is presented). Table 6-16 shows a
sample of word cues and the probability that normal people, such
as your friends, will think of particular targets. The table
provides a range of good cues and bad cues to give you an idea of
how most minds work.

Table 6-16. Chances of word associations

Cue           Target    Probability
condom        sex       .53
bumpy         sex       .01
broccoli      green     .25
broccoli      gross     .01
pajamas       sleep     .36
accident      car       .36
accident      oops      .01
mother        father    .60
mother        goose     .02
orthodontist  teeth     .42
hero          Superman  .17
hero          Batman    .02
statistics    numbers   .26
statistics    boring    .03
coleslaw      fish      .01

Information like this is useful for when you want your subject to
think of certain words or ideas. With "sex," for example, you
will have more luck cueing with "condom" than you will with
"bumpy."

Table 6-16 draws on the seemingly exhaustive list of typical
associations for thousands of words found at
http://w3.usf.edu/FreeAssociation/, provided by Nelson, McEvoy,
and Schreiber, researchers at the Universities of South Florida
and Kansas.

Building a List of Word Associations


Associated ideas and words form slightly different webs of
connections in each person, but within groups of people with a
shared culture (pop or otherwise) and shared experiences, the
networks are similar. To be able to start saying your friends'
thoughts out loud (and spooking the heck out of them), you'll
need to know the likely associations in your metaphorical corner
of the world.

You can conduct a small study to help you determine which words
among your friends are most strongly associated with each other.
Create a sample of a few representative friends or family
members. Make up a list of test words, and ask your sample to say
the first thing that comes to mind when you say each word. Words
in common phrases or titles work well. Words that elicit thoughts
of favorite in-jokes, movies, or songs, though, are the type of
words that should work best for use in actual conversation later
on.

Your mini-study is a quick way to get a small sample of the same
kinds of data that real-world cognitive psychologists use in
their research to learn more about thought processes.

If there are some words that many of your friends give in
response to a word, you can assume they are strongly associated
with the test word. You want words with the highest probability
of priming the mental pump toward a predictable outcome.
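Tallying the results of your mini-study takes only a few lines of Python. The sketch below turns a list of (cue, response) pairs into estimated probabilities like those in Table 6-16. The function names are my own, and the responses are invented sample data, not real research results.

```python
from collections import Counter, defaultdict


def association_probabilities(responses):
    """Estimate P(target | cue) from (cue, target) pairs, one per answer."""
    by_cue = defaultdict(Counter)
    for cue, target in responses:
        by_cue[cue.lower()][target.lower()] += 1
    return {
        cue: {target: count / sum(targets.values())
              for target, count in targets.items()}
        for cue, targets in by_cue.items()
    }


def strongest_target(probabilities, cue):
    """The response most often given to `cue`."""
    targets = probabilities[cue]
    return max(targets, key=targets.get)


if __name__ == "__main__":
    # Invented answers from four imaginary friends.
    answers = [
        ("stripes", "tiger"), ("stripes", "tiger"),
        ("stripes", "zebra"), ("stripes", "tiger"),
        ("salt", "pepper"), ("salt", "pepper"),
        ("salt", "pepper"), ("salt", "water"),
    ]
    table = association_probabilities(answers)
    for cue in table:
        best = strongest_target(table, cue)
        print(cue, "->", best, table[cue][best])
```

With only a handful of friends, these proportions are rough estimates at best, so treat a cue as reliable only when nearly everyone gives the same response.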
Why It Works

The human brain is so efficient that it processes words or ideas
in the context of whatever words or concepts have been previously
over-learned. Research studies have found that when people are
asked to state whether a series of letters is a word, they
respond more quickly to words that have been primed or
preactivated by words that were shown to them just prior to the
identification task. For example, if "stripes" is shown, and then
either "tiger" or "lemon," people will respond more quickly for
"tiger" than for "lemon."

By talking about words or topics that are closely related to
other words or topics, you begin a thought process in your
friend's brain in which activation of neurons spreads to neurons
that generally fire at the same time. Your brain has learned that
certain words and topics almost always occur together, so it
knows that when one of the associated words or topics is
activated, it should also fire in the regions where those
associated words and topics are activated. That way, your thought
process can proceed smoothly.

Where Else It Works

This particular mind trick has some risk of failure, especially
when the associations that you are relying on are low-probability
associations. However, you might just enjoy knowing that you are
secretly manipulating others and don't have to make a big show
out of it.

We can prime people to do lots of things that seem to just come
naturally because they occur so effortlessly and often. For
example, it is likely that you can make someone yawn simply by
yawning yourself. You might even be able to get a friend to yawn
by talking about yawning or sleep. (In fact, as I wrote this, I
yawned.) Likewise, if there is something that sounds good to you
for dinner, you might be able to get your family members to crave
it too by mentioning that kind of food.

You probably have been primed yourself many times. When you are
listening to your favorite CD and one song ends, do you start
hearing the next song in your head before it even begins? If you
know what things someone associates with other things, it becomes
relatively easy to predict someone's thoughts after you've primed
them. This is partially why married people can often finish each
other's sentences.

Where It Doesn't Work

If someone doesn't share your language background, because they
speak either a different language or a different dialect, they
might not have the same word associations that you have.

It also might not work if a word has several equally likely word
associations. For example, if you prime someone with the word
"hot," some people might start thinking about the weather (hot
and cold). Some might think about food (hot dogs). Others might
start thinking about people they admire (a hot babe).

What do you think of next when you see the word "hot"? I knew you
were going to say that!

Jill Lohmeier with Bruce Frey

Hack 68. Search for ESP

Though most scientists agree that there isn't much evidence that
ESP actually exists, they might be wrong. You or your friend or
your monkey might have ESP, and there's no time like the present
to find out!

The term extra-sensory perception (ESP) was coined to describe
perceptions that are independent of the traditional five senses:
sight, hearing, touch, taste, and smell. The first to use the
term was a psychologist at Duke University in the 1920s and 1930s
named J.B. Rhine. There was much excitement at the time, as Rhine
and his colleagues were able to identify individuals who seemed
to exhibit ESP abilities. In the popular press and some of the
scientific writing of that period through the 1970s, it was even
taken for granted that there was such a thing as ESP and that we
all had the trait to a certain degree.

Today, though, you don't really hear much about ESP, and most
scientists have concluded that such a thing probably does not
exist. More specifically, it hasn't met the criteria for
scientific acceptance that any other hypothesized phenomenon is
expected to meet, such as experimental evidence, replicated
studies, and so on. You can add to the data, though, by
conducting your own studies and identifying whether you or your
friend might be psychic.

Identifying Psychic Abilities


Though there is a wide range of supposed psychic abilities,
ranging from reading minds to moving objects with one's mind, the
traditional way to study ESP has been to use a deck of cards
called Zener cards. Zener decks have 25 cards with matching
backs. The face of each card displays one of five symbols: a
circle, cross, square, star, or wavy lines, as shown in Figure
6-7.

Figure 6-7. Zener cards

If you don't have a deck of these cards handy, you can make them
pretty easily with a pack of blank index cards and a black magic
marker. Just make sure that no one can see right through them
(unless they are psychic, in which case they can see right
through you, too). Make 5 cards of each symbol for a total of 25
cards.

There are a few different ways you can use a shuffled deck of
Zener cards to conduct an ESP test:

One person tries to guess the order of the cards by announcing
each one before it is turned over.

One person looks at the face of each card and attempts to "send"
it telepathically to another person who is sitting nearby.

A person in another room or in a distant location looks at the
face of each card and attempts to send it telepathically to
another person over a great distance. Sometimes, the receiver
imagines that they are in the room with the sender and can see
the card.

With whatever method you choose, the procedure is to go through
all 25 cards and keep track of the hits and misses. How many
cards out of the 25 did the subject correctly identify? In some
studies, the receiver is told how he is doing while he is going
through the whole deck; sometimes, he is not told until the end
of the experiment. The outcome variable is the number or
percentage of cards that were correctly identified.

In ESP research, the person who is trying to read someone's
thoughts is the receiver and the person who wants her mind read
is the sender.

Analyzing the Results


If the results are what would be expected by chance alone, treat
the outcome as evidence that the subject is not psychic. If the
subject gets many more correct than would be expected by chance,
that outcome suggests that the subject might have ESP.

So, what would be expected by chance? If you are guessing for 25
cards and there are five cards of each type, chance alone would
get about 5 correct. Imagine, for example, that you guessed
"star" every single time across all 25 times. You would be
guaranteed 5 hits and 20 misses because you know "star" will come
up exactly five times overall. If you guessed randomly each time
among the 5 possibilities, your average success would also be 5
out of 25, or 20 percent.
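You can check both chance baselines with a quick simulation. The Python sketch below builds a 25-card Zener deck, scores an "always guess star" strategy, and averages the hits from purely random guessing over many runs. The function names, the number of runs, and the fixed random seed are my own choices for the example.

```python
import random

SYMBOLS = ["circle", "cross", "square", "star", "wavy lines"]


def shuffled_deck(rng):
    """A 25-card Zener deck: five of each symbol, in random order."""
    deck = SYMBOLS * 5
    rng.shuffle(deck)
    return deck


def count_hits(deck, guesses):
    """Number of positions where the guess matches the card."""
    return sum(card == guess for card, guess in zip(deck, guesses))


def average_random_hits(runs, seed=0):
    """Average hits per deck when every guess is picked at random."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        deck = shuffled_deck(rng)
        guesses = [rng.choice(SYMBOLS) for _ in deck]
        total += count_hits(deck, guesses)
    return total / runs


if __name__ == "__main__":
    deck = shuffled_deck(random.Random(1))
    print("Always guessing star:", count_hits(deck, ["star"] * 25))
    print("Average for random guessing:", average_random_hits(10000))
```

The "always star" strategy scores exactly 5 no matter how the deck is shuffled, while the random-guessing average only settles near 5 as the number of runs grows.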

What if you had a higher success rate than 20 percent, though?
What if you were correct 6 out of 25 times, for a success rate of
24 percent? Should we treat that as evidence that something other
than chance is playing a role here? What we need is a statistical
analysis of the different possible outcomes, to identify what
percentage should be considered so unusual that it must be
evidence for the presence of something other than chance.

A statistical test reveals only whether chance is the best
explanation for an outcome. For our experiment, a statistically
significant outcome doesn't prove that ESP is at work, only that
chance is not the best explanation. After all, the best
explanation for a high hit rate might be that the receiver sees
the cards reflected in the sender's glasses, or some other less
interesting cause.

We know that over the short run (or in a small sample, to use the
stats jargon), results that differ from the population are
common. We also know, though, that a large difference from that
population value is uncommon, especially over the long run (or
with a large sample). In fact, the probability of finding a
difference of a given size between a sample value and the
population value depends on the size of the sample: the larger
the sample, the less likely a difference of that size is to arise
by chance.

For E SP experiments , the s ample s ize is the number of gues s es


or trials , and the population is the known dis tribution of the
different s ymbols ac ros s all trials . T he population value for any
number of gues s es is 2 0 perc ent c orrec t; that is what would be
expec ted by c hanc e. I f there is a large differenc e between the
s ample value and this population value, then s omething other
than c hanc e is likely operating.

The statistical analysis appropriate here is something called the
Z-test for comparison of an observed proportion to an expected
proportion. It is similar to other common statistical tests, such
as t tests [Hack #17], which calculate a difference and
determine how frequently such a difference would be found if a
given sample really was randomly drawn from a population with
certain characteristics.

The probability of any difference is based on the size of the
sample. For example, if after 25 attempts, a person guessed 24
percent correctly instead of the expected 20 percent, the
information needed for this analysis would be:

A sample size of 25

An observed proportion of .24

An expected proportion of .20

Without showing the formula and calculations for this particular
analysis, I'll show you the result. By chance alone, with 25
guesses, a subject will guess at least 24 percent of the cards
correctly 31 percent of the time. Another way of saying that is
that out of 100 subjects going through your study, 31 of them
will get this result or better. So, a hit rate of 24 percent is better
than average, but not so unusual that I would call The National
Enquirer just yet.
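For readers who do want to see the arithmetic, here is a minimal
sketch of a one-sided Z-test for a proportion, assuming the usual
normal-curve approximation. The function name and its parameters are
illustrative choices, not something taken from the study itself.

```python
import math

def z_test_upper_tail(observed, expected, n):
    """Return (z, chance of an observed proportion this high or higher)."""
    # Standard error of a proportion when only chance is operating
    standard_error = math.sqrt(expected * (1 - expected) / n)
    z = (observed - expected) / standard_error
    # Area of the standard normal curve above z
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

z, p = z_test_upper_tail(observed=0.24, expected=0.20, n=25)
print(f"z = {z:.2f}, chance of this hit rate or better = {p:.0%}")
```

With 25 guesses the standard error is .08, so a 24 percent hit rate
sits only half a standard error above the expected 20 percent, which
is where the 31 percent figure comes from.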

What about other hit rates or if you test with more than 25
trials? Table 6-17 shows the chance of guessing given
percentages of cards (or higher) correctly. This table assumes
an expected hit rate of 20 percent.

Table 6-17. Likelihood of selected ESP hit rates

Number of   Percent correct   Probability of
guesses     (hit rate)        hit rate or better
25          20 percent        50 percent
25          30 percent        11 percent
25          40 percent        1 percent
25          50 percent        .01 percent
100         20 percent        50 percent
100         30 percent        1 percent
100         40 percent        .00001 percent
100         50 percent        .000000000001 percent

Notice the dramatic drop in likelihood for extreme outcomes as
the sample size increases. For example, with just 25 guesses,
the chance of getting 40 percent correct is about 1 percent; if
you went through a pack of 25 cards 100 times, you would be likely
to do that well or better just one time. If you took 100 guesses,
though, maybe going through the deck four times, you would get
40 percent or better correct just 1 out of 10,000,000 times!

How Much Is Enough?

If you want to conduct ESP experiments, you should establish a
standard for how unlikely a performance must be for you to
consider it evidence that something other than chance is the
operating factor. Typically, in statistical research, if a result is
likely to occur by chance 5 percent of the time or less, the result
is considered statistically significant. For ESP experiments with
25 Zener cards and 25 guesses, you will guess 8 or more cards
correctly about 7 percent of the time. You will guess 9 or more
correctly just 2 percent of the time. So, a standard of either
8 or 9 hits is scientifically reasonable.
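The 7 and 2 percent figures match what the Z-test's normal-curve
approximation gives. As a rough cross-check, the sketch below also
counts the outcomes exactly with the binomial formula; the function
names are made up for this illustration, and the exact count is an
alternative calculation, not the method used for the figures above.

```python
import math

N_GUESSES = 25
P_CHANCE = 0.20

def exact_tail(min_hits, n=N_GUESSES, p=P_CHANCE):
    """Exact binomial chance of getting min_hits or more correct."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(min_hits, n + 1))

def normal_tail(min_hits, n=N_GUESSES, p=P_CHANCE):
    """Z-test (normal curve) approximation of the same chance."""
    z = (min_hits / n - p) / math.sqrt(p * (1 - p) / n)
    return 0.5 * math.erfc(z / math.sqrt(2))

for hits in (8, 9):
    print(f"{hits} or more hits: "
          f"approx = {normal_tail(hits):.3f}, exact = {exact_tail(hits):.3f}")
```

With only 25 guesses the exact chances come out somewhat larger than
the approximation (roughly 11 and 5 percent), so a cautious
experimenter might lean toward 9 hits as the cutoff.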

The skeptic in me feels compelled to leave you with a warning. If
you perform this experiment and get a significant result on
yourself or someone else, that's pretty cool. If you can repeat
the finding, though, replicating the experiment with the same
person and getting similar results, that's when it will start to get
exciting! If that happens, send me a telegram immediately. I'll
sell my house, buy a train, and we'll hit the road to fame and
fortune!
Hack 69. Cure Conjunctionitus

The probability of two independent events both happening can
never be greater than the probability of either event happening
alone. Surprisingly, this common sense truth is not commonly sensed.

Imagine that you are introduced to John, a tall, pleasant,
athletic-looking man at a dinner party. You chat with John for a
few minutes and discover that he is friendly and quick to laugh,
but not exactly bright. John is eager to talk about the currently
ongoing World Series and also asks you about the car you drive.

On your way home from the dinner party, your spouse asks you
about the man you were chatting with before dinner. You share a
little bit about John, but realize that you never learned what he
does for a living. In fact, as you realize, you really don't know
that much about him. Your spouse decides to play a little mind
game with you and explains:

I know a little about John. I'm going to provide a series of
statements about him. They might be true or not true. All might
be true. All might be untrue. There might be a mix. I want you to
place the statements in order based on how confident you are
that each statement is true. When we are done, I'm going to
diagnose whether you suffer from a common brain ailment known
as Conjunctionitus.

Your spouse then asks you to rank the following statements,
guessing which are most likely true about John:

1. John is a computer scientist.

2. John is a car salesman.

3. John is a former baseball player.

4. John is a Republican.

5. John is a computer scientist who used to play baseball.

6. John is a preacher who runs marathons.

7. John plays the clarinet.

8. John is married.

You, like many other people, might have ranked statement 3
(former baseball player) as one of the most likely possibilities
and 1 (computer scientist) as one of the least likely. So far, not
so crazy; at least they are reasonable guesses based on the
conversation you had.

The symptom related to Conjunctionitus has to do with the
position you assigned statement 5 in your rankings. I'm betting
you ranked it as more likely than 1. If so, you might suffer from
Conjunctionitus, a condition that results in people making poor
probability judgments.

The truth is that the probability of two events occurring together
can never be greater than the probability of either one occurring
alone. Thus, it cannot be more likely that John is a computer
scientist who used to play baseball than it is that John is a
computer scientist. Never fear, though; the first step in
improving your ability to make likelihood judgments in these
situations is to admit you have a problem. The next step is to
understand the condition, so that healing can begin.
The Problem

Although more information might make a description seem more
similar to or representative of someone or something, more
information does not make something more likely. As mentioned
earlier, two events occurring together cannot be more likely than
one of them occurring alone. Consider all of the possible things
a man can be in the world. How do you decide which things John is
most likely to be? You could start by looking at base rates.

There are probably more married men in the world than there are
computer scientists, car salesmen, former baseball players,
Republicans, preachers, marathon runners, and clarinet players.
Thus, it is most likely that John is married. Where did you rank
that possibility?

Because we probably don't really know the base rates of all the
other possibilities, we can use the information we have about
John to predict which of the other possibilities is most likely. We
do know that if we consider the group of all former baseball
players and the group of all computer scientists, there will
probably only be a small number of men who belong to both
groups. Thus, the likelihood of being in that group of computer
scientists who used to play baseball must be smaller than the
likelihood of being in the group of computer scientists or of being
in the group of former baseball players.

Most people, however, even though they are rational, intelligent
decision makers, will be drawn toward sentences that are
conjunctions (i.e., that list two separate "facts"), as if the listing
of the "facts" together makes them more likely to be true, even
if, and maybe especially if, the second "fact" by itself seems
unlikely.
Conjunction Junction, What's Your Function?

Why do our minds tend to work this way? In the 1970s, Nobel
Prize winner Daniel Kahneman and his colleague Amos Tversky
presented college students with several problems in which one
option was highly representative of a given personality
description, one option was incongruent with the description, and
one option included both the highly similar and the incongruent
options.

Perhaps the most well-known problem that demonstrates the
conjunction fallacy is the now-famous (at least in cognitive
psychology circles) Linda Problem:

Linda is 31 years old, single, outspoken, and very bright. She
majored in Philosophy. As a student, she was deeply concerned
with issues of discrimination and social justice, and she also
participated in antinuclear demonstrations.

Subjects were asked to rank these statements based on how
likely they were to be true:

1. Linda is a teacher in elementary school.

2. Linda works in a bookstore and takes Yoga classes.

3. Linda is active in the feminist movement.

4. Linda is a psychiatric social worker.

5. Linda is a member of the League of Women Voters.

6. Linda is a bank teller.

7. Linda is an insurance salesperson.

8. Linda is a bank teller and is active in the feminist
movement.

Kahneman and Tversky (and many others who have since
replicated their work) found that people consistently ranked
option 8 (a bank teller active in the feminist movement) as being
more likely than option 6 (a bank teller). This is because option
8 provides more information, which seems to be more
representative of Linda. Because we expect her to be politically
active, but we don't expect her to be a bank teller, it seems as
though the only way she could be a bank teller is if she is also
politically active.

However, we know that 8 can never be more likely than options 3
or 6, because if we imagine all people active in the feminist
movement, a subset of them (perhaps a small subset) will be
bank tellers. Likewise, if we imagine all of the bank tellers in the
world, a subset (again, perhaps a small one) will be active in the
feminist movement. Thus, the likelihood of being a bank teller
must be greater than the likelihood of being a bank teller who is
active in the feminist movement. Makes sense, right? But your
mind doesn't want to work that way.

The rule that states that the probability of two events occurring
together cannot be greater than the probability of either one of
them occurring alone is called the conjunction rule. The fact that
many people often believe that the conjunction of two events is
sometimes more likely than one event occurring alone is called
the conjunction fallacy.
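The conjunction rule is easy to check by simple counting. The sketch
below uses an invented guest list of 1,000 men (the head counts are
made up purely for illustration) and compares the share who belong to
each single group with the share who belong to both.

```python
# Hypothetical head counts, keyed by two yes/no traits:
# (is a computer scientist, is a former baseball player)
head_counts = {
    (True, True): 2,
    (True, False): 18,
    (False, True): 30,
    (False, False): 950,
}
total = sum(head_counts.values())

# Probability of each single event: add up every cell where it is true
p_scientist = sum(n for (cs, _), n in head_counts.items() if cs) / total
p_ballplayer = sum(n for (_, bp), n in head_counts.items() if bp) / total
# Probability of the conjunction: only the cell where both are true
p_both = head_counts[(True, True)] / total

print(p_scientist, p_ballplayer, p_both)
# The men in both groups are a subset of each single group, so the
# conjunction can never come out larger, whatever counts you plug in.
assert p_both <= min(p_scientist, p_ballplayer)
```

Try changing the counts: as long as none is negative, the final check
never fails.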
The Cure

To stop thinking wrongly about these sorts of propositions, the
cure is simple:

1. Cut it out.

2. Stop.

3. Don't do that.

The conjunction fallacy can be seen at work in numerous places.
Be aware of situations in which it might occur and analyze the
situation. For example, you can ask a baseball fan about a
favorite player who doesn't often hit home runs. Ask which of the
following the player is most likely to do in the next game:

Hit a home run

Strike out

Strike out and hit a home run

The fan probably believes that a home run with a strikeout in the
game is more likely than just a home run. But it cannot be so.

There are some situations in which it might be okay to pick the
conjunction proposition. If two things must always occur together
(such as thunder and lightning), then the likelihood of both of
them occurring is the same as one of them occurring. And if you
change the thunder and lightning statement to compare the
likelihood of thunder (and no lightning) with the likelihood of
thunder and lightning, then, in fact, thunder and lightning is the
more probable option. However, this is true only if one can never
occur without the other.

Once you are aware of this common error in probability
estimation, you can see it everywhere. For example, one place in
which you can readily find the conjunction fallacy is in the
political prediction arena. Is George W. Bush more likely to:

Nominate a moderate Supreme Court justice

Nominate one moderate Supreme Court justice and one
right-wing Supreme Court justice

Of course, you know the answer now, but many political analysts
might argue with you. But that's because they have the
sickness. They have Conjunctionitus. You did too, once, but now
you are cured.

See Also

Tversky, A. (1977). "Features of similarity."
Psychological Review, 84, 327-352.

Tversky, A. and Kahneman, D. (1974). "Judgment under
uncertainty: Heuristics and biases." Science, 185,
1124-1131.

Jill Lohmeier
Hack 70. Break Codes with Etaoin Shrdlu

You never know when you will have to decipher a cryptic
message, whether it's one intercepted by your man, James
Bond, or one scribbled illegibly onto a prescription pad by your
doctor. Here are all the statistical tricks you'll need, Agent
003.14159.

You might have noticed that certain keys on your computer
keyboard get dirty or wear out more quickly than others. That's
because you hit them more often than the others. You might also
notice that these letters tend to be in the middle of the keyboard
or, more correctly, in small circles near where your hands are
when they are centered on a keyboard.

Both the wear and tear on your keys and their placement in the
standard typewriter layout (a.k.a. QWERTY, for the first six
letters on the top row) are based on their frequency of use in
English. Different letters in the alphabet are used with different
frequencies in the spelling of words in a language. By applying
the known frequency of these letters, along with other statistical
tricks, you can quickly decode classified documents, whether
they are Leonardo da Vinci's diary, a puzzle in the newspaper, or
big, bright letters being turned by Vanna White on TV.

Single Substitution Ciphers


The simplest and oldest type of letter-based code is the single
substitution format. In these codes, some message is
transformed from the actual letters in the words to other letters
in the alphabet. In the simplest form of this type of coding, the
same letter substitutes for the same letter throughout the
message. For example, a simple cipher might use the
substitution pattern shown in Table 6-18, in which the letters on
the top row (the plain text) are replaced by the letters on the
bottom row (the cipher text).

Table 6-18. A single substitution cipher

Plain text:   A B C D E F G H I J K L M N O P Q R S T U V W
Cipher text:  N A O B P C Q D R E S F T G U H V I W J X K Y

With a code like the one shown in Table 6-18, the following
plain-text passage:

Tom appeared on the sidewalk with a bucket of
whitewash and a long-handled brush.

appears in cipher text like this:

Jut nhhpnipb ug jdp wrbpynfs yrjd n axospj uc ydrjpynwd
ngb n fugq-dngbfpb aixwd.

The passage looks like nonsense, but with the key shown in
Table 6-18, anyone could easily replace the nonsense letters
with the original letters, causing the opening sentence of the
second paragraph in Chapter Two of Tom Sawyer to reveal itself.
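As a minimal sketch of how such a key works in practice, the code
below encodes and decodes with the 23 letter pairs that appear in
Table 6-18. The function name is an illustrative choice, and any
character without an entry in the table (including X, Y, and Z, which
are not shown there) is simply passed through unchanged.

```python
# The letter pairs shown in Table 6-18 (A through W only).
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVW"
CIPHER = "NAOBPCQDRESFTGUHVIWJXKY"

ENCODE = dict(zip(PLAIN, CIPHER))
DECODE = dict(zip(CIPHER, PLAIN))

def substitute(text, table):
    """Swap each letter using the table, keeping case and punctuation."""
    result = []
    for ch in text:
        mapped = table.get(ch.upper(), ch)  # unmapped characters stay as-is
        result.append(mapped.lower() if ch.islower() else mapped)
    return "".join(result)

message = ("Tom appeared on the sidewalk with a bucket of "
           "whitewash and a long-handled brush.")
coded = substitute(message, ENCODE)
print(coded)
print(substitute(coded, DECODE))
```

Because the same table is used in both directions, decoding the coded
text returns the original sentence.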
Using Probability to Decode Substitution Ciphers

Of course, the real task when deciphering ciphers is to do it
without access to the code key. Real-life code breakers and
winning contestants on Wheel of Fortune use the same tool to
solve their problems: they apply the known distribution of letters
in English language words.

The advent of computers, computer analysis, and electronic
copies of millions of books has made the calculation of exact
probabilities for each letter of the alphabet possible, though
cryptographers (code makers and breakers) have known the
basics for some time. Here are some of these basics:

The most common letter, in terms of usage in English, is E.

The least commonly used letter is Z.

The most common consonant is T.

J and X are rarely used, as is Q.

When Q is used, it is almost always followed by U.

Only A and I are used as one-letter words in English.


With even just these basic probability facts, you could begin to
tackle decoding a cipher such as our Mark Twain passage. The
most commonly appearing letters in the garbled version are P
and N. Because N is used as a single-letter word, it cannot be E
(N is most likely A), so a good first guess for P is that it
substitutes for E.
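That line of reasoning can be sketched in a few lines of code. The
cipher text below is the Tom Sawyer sentence encoded with the
Table 6-18 substitutions; the variable names are illustrative, and the
two rules applied are just the "one-letter words" and "most common
letter" facts listed above.

```python
import re
from collections import Counter

cipher_text = ("Jut nhhpnipb ug jdp wrbpynfs yrjd n axospj uc ydrjpynwd "
               "ngb n fugq-dngbfpb aixwd.")

# Split into words (letters only) and tally every letter
words = re.findall(r"[A-Z]+", cipher_text.upper())
counts = Counter("".join(words))
one_letter_words = {w for w in words if len(w) == 1}

ranked = [letter for letter, _ in counts.most_common()]
# Only A and I stand alone as words, and A is the more common of the two
guess_for_a = next(l for l in ranked if l in one_letter_words)
# The most common letter that never stands alone is the first guess for E
guess_for_e = next(l for l in ranked if l not in one_letter_words)

print(counts.most_common(3))
print(f"A is probably {guess_for_a}; E is probably {guess_for_e}")
```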

With just a little knowledge of letter distribution, we have already
identified the substitutes for E and A. We can't be sure we are
right, but like any good statistician, we think we are probably
right. Table 6-19 shows the likely distribution for each letter of
the alphabet.

Table 6-19. Frequency distribution of letters in English


Letter Frequency
A 8.04 percent
B 1.54 percent
C 3.06 percent
D 3.99 percent
E 12.51 percent
F 2.30 percent
G 1.96 percent
H 5.49 percent
I 7.26 percent
J 0.16 percent
K 0.67 percent
L 4.14 percent
M 2.53 percent
N 7.09 percent
O 7.60 percent
P 2.00 percent
Q 0.11 percent
R 6.12 percent
S 6.54 percent
T 9.25 percent
U 2.71 percent
V 0.99 percent
W 1.92 percent
X 0.19 percent
Y 1.73 percent
Z 0.09 percent

ETAOIN SHRDLU

The strange phrase "ETAOIN SHRDLU" is a mnemonic device
(memory tool) for remembering the most frequently occurring
letters. These 12 letters account for over 80 percent of total
letter frequency.

You might notice that the order of letters in ETAOIN SHRDLU is
not exactly the rank order of popularity shown in Table 6-19. It
is close enough, though, and easier to pronounce than if it were
exactly correct. Another thing to remember is that any
"definitive" list of letter probability depends on the source
material for the letter count. You can find many different lists of
letter order and frequency, and some differ slightly from others.
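Both claims are easy to verify against Table 6-19. The short sketch
below (the variable names are illustrative) adds up the share of the
12 mnemonic letters and then sorts the table to get its true rank
order.

```python
# Percentages copied from Table 6-19
LETTER_PERCENT = {
    "A": 8.04, "B": 1.54, "C": 3.06, "D": 3.99, "E": 12.51, "F": 2.30,
    "G": 1.96, "H": 5.49, "I": 7.26, "J": 0.16, "K": 0.67, "L": 4.14,
    "M": 2.53, "N": 7.09, "O": 7.60, "P": 2.00, "Q": 0.11, "R": 6.12,
    "S": 6.54, "T": 9.25, "U": 2.71, "V": 0.99, "W": 1.92, "X": 0.19,
    "Y": 1.73, "Z": 0.09,
}

mnemonic = "ETAOINSHRDLU"
share = sum(LETTER_PERCENT[letter] for letter in mnemonic)

# Letters sorted from most to least common according to the table
ranked = "".join(sorted(LETTER_PERCENT, key=LETTER_PERCENT.get, reverse=True))

print(f"ETAOIN SHRDLU covers {share:.2f} percent")
print("Top 12 by the table:", ranked[:12])
```

By these figures the mnemonic covers just under 81 percent, and the
table's own top 12 swaps a few neighbors and replaces U with C.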

For example, one organization that produced a list of statistical
distributions of letters in English text relied on a computer
analysis and actual count of letter occurrence in seven literary
classics, such as Jane Eyre and Wuthering Heights. Two of these
seven books were Tarzan novels. I'm guessing that if we were to
compare that table of letter distributions with others, we would
find that the proportional number of times the letter Z appeared
was greater than if other sources were used. For the common
letters, though (such as E, T, and A), there is wide agreement on
their use as best first guesses for code breaking.
Wheel of Fortune Strategy

On the TV game show Wheel of Fortune, before the contestant
solves the big puzzle at the end, the producers are nice enough
to provide certain letters and show whether they appear
in the hangman-type phrase. They provide R, S, T, L, N,
and E. These are given, of course, because they are
common letters, and are in our top 12: ETAOIN
SHRDLU. The player is allowed to choose three more
consonants and another vowel. Using our statistical
knowledge of letter frequency, a good basic strategy
would be to pick A as the vowel and the three most
common consonants not yet shown: H, D, and C.

Statistical Analysis of Coded Texts

Here's how you might use these letter stats in real life to decode
a secret message or solve a puzzle. This method works best if
the coded text is lengthy, but it works surprisingly well even for
shorter passages. Calculate the distribution of the coded,
substitute letters (the cipher text), and then compare it to the
distribution shown in Table 6-19.

Figure 6-8 shows how this process might look graphically. Only
the 10 most common letters are shown, but the analysis
would use all the letters. This example pretends that there is a
lot of coded text and that the substitution cipher shown in
Table 6-18 is being used.

Figure 6-8. English letter frequency (left) and coded letter frequency (right)

Because the most common substitute letters are P, followed by
J, a good guess for breaking the code would be to see whether P
could really be E and J could really be T. These first guesses can
be made all the way down the line for each letter. By starting with
the most frequently appearing letters and moving down the list, a
code breaker can quickly see whether these first hypotheses are
right or wrong and change guesses around until English words
start to appear.
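The "down the line" matching can be sketched as pairing two rank
orders. In the code below, the English order is read off Table 6-19,
the function names are illustrative, and the tallies are made-up
numbers standing in for a long coded message; real counts would come
from the tally function.

```python
from collections import Counter

# English letters from most to least common, per Table 6-19
ENGLISH_ORDER = "ETAOINSRHLDCUMFPGWYBVKXJQZ"

def tally_letters(cipher_text):
    """Count how often each letter appears in the coded text."""
    return Counter(ch for ch in cipher_text.upper() if ch.isalpha())

def first_guess_key(tallies):
    """Pair the nth most common cipher letter with the nth most common
    English letter. This is only a starting hypothesis to be revised."""
    ranked = [letter for letter, _ in tallies.most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))

# Hypothetical tallies for the six most common letters of a long message
tallies = Counter({"P": 1251, "J": 925, "N": 804, "U": 760, "R": 726, "G": 709})
key = first_guess_key(tallies)
print(key)
```

With short passages the rank orders rarely line up this neatly, which
is why the guesses get swapped around until English words appear.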
Other Common Letter Patterns

Beyond just knowing the frequency of individual letters
appearing, good code breakers use probability information about
other patterns of letters:

Words are most likely to start with T, O, A, W, or B.

Most words end with an E, T, D, or S.

If a letter is doubled in a word, the pair is most likely
to be SS, EE, TT, FF, or LL.

Frequently appearing two-letter words include of, to, in,
it, and is.

By far, the most common three-letter words are the and
and. Other common three-letter words include for, are,
and but.

Letters that tend to come in pairs include TH, HE, AN,
IN, and ER.

The most frequently used words are the, of, and, to, in, a,
is, that, be, and it.

Perhaps indicating what people tend to write about, the
top 100 most-used words in written texts include
dollars, great, general, and public. Debts just barely failed
to make the top 100, but it is surprisingly common.

See Also

A good explanation of single substitution ciphers can be
found under the entry for frequency analysis at
http://en.wikipedia.org/wiki/Frequency_analysis.

Some of the statistics reported in this hack were found
at http://www.data-compression.com and
http://www.scottbryce.com. Good information and advice
for solving cryptograms and other codes using statistics
can be found at those sites.
Hack 71. Discover a New Species

While every day entire species of creatures become extinct,
occasionally new species are identified that were previously
unknown. Surprisingly, statistical tools, not biological tools, can
do the trick.

A few years back, a new species, a type of possum, was
identified. The new species was named trichosurus cunninghamii.
Trichosurus means, um...possum (I guess), and the cunninghamii
part refers to its discoverer, Ross Cunningham, a statistician at
Australian National University. If you'd like to have a species
named for you, here's how statistics can help.

Identifying Species with Statistics

There is a family of statistical analyses that looks at a bunch of
variables and finds naturally occurring groupings among them.
Typically, the groupings or clusters of variables are identified on
the basis of the correlations among them [Hack #11].

One procedure that uses this strategy attempts to find
underlying dimensions or invisible, giant basic variables that
account for a bunch of less important variables. This procedure
is factor analysis, and elsewhere we see how it can, among other
things, be used to identify writers' styles [Hack #65].
Statistics is full of similar techniques that can identify
dimensions, underlying causes, and groupings. The goal of
identifying groupings is of greatest use to biologically inclined
statisticians who wish to identify new species.

For some group of animals to technically be a separate species,
it must share a unique set of biological characteristics that
make it distinct from similar animals. Sure, animals within the
same family all look a little different from each other, but then,
people look a lot different from each other and we are all one
species (my Uncle Frank being perhaps the exception that
proves the rule).

If a group of animals, such as Dr. Cunningham's possums, have
more in common with each other than they do with the other
creatures in their species, they might be candidates for
consideration as a species in their own right. Statistics can
determine the point at which they are "more like each other and
more different from the rest of the species than chance alone
would produce."

Using Cunningham's discovery as a model, there are a few steps
to follow for you to make your own discovery.

Collect some data

This possum existed in Australia near people for more than 200
years and no one noticed. To be fair, it looked an awful lot like the
other possums, the most common of which was the trichosurus
caninus, now called the short-eared possum.

It was assumed for some time that there was really just this one
species of the little guys. Part of Dr. Cunningham's job was to
collect and organize descriptive data for the wildlife around him.
Consequently, he had a ton of very specific quantitative
descriptions of various possum parts (eyes, ears, nose, and
throat) and measurements of other physical characteristics.

Choose a statistical method

Cunningham's choice was a technique similar to factor analysis
but with a more imposing name: canonical variate analysis. You
can use any method that uses the variability in scores to create
distinct groupings. Some of those are discussed in this
book (such as factor analysis, mentioned earlier in this hack),
but there are many other procedures that would work.

If you are really statistically savvy, it will help you to know
that canonical variate analysis is functionally the same as
discriminant analysis or multivariate analysis of variance
(MANOVA), two other procedures that create linear composites
of variables with the goal of conceptually defining two or more
distinctly different groups.

Cunningham used this statistical procedure to examine the
descriptive data for this presumably single species (you know,
these trichosurus caninus fellers) and demonstrated that there
were likely two different species.

Select a hypothesis and analyze the data

Statisticians test hypotheses, so you should begin your analysis
with a guess about whether there is or is not a distinction
between the groups of participants who supplied your data.

In the example of our hero, Cunningham assumed that there
were two different groups of critters that accounted for the data.
Then, the procedure (using a computer for the calculations, of
course) identified which variables worked best as key
distinguishing characteristics between the theoretical groups.

The difference between using this tool, canonical variate
analysis, and something like regression is that, when using
variables to make predictions in regression, the researcher has
some known data about scores of actual subjects: which "group"
they belong to [Hack #13]. Here, the procedure works blindly
without knowing what the correct answer is. Instead, it finds
groups that can be made the most different with the variables at
hand.

Here are the variables Cunningham used:

Head length

Skull width

Eye size

Ear length

Body length (from tip of nose to tip of uncurled tail)

Tail length

Chest width

Foot length

While other variables were considered, Cunningham chose these
because they were eventually found to be most important in
distinguishing one species from another and also because they
were characteristics that would probably be unaffected by
environment.
Interpret results

The last step in any statistical analysis is to describe and
understand whatever you found. For discovering species, you
need to be able to describe that new species in enough detail to
differentiate it from other, similar species.

The procedures used by Cunningham identified a series of
different equations that weighted each of the biological variables
differently, to find the combination that best identified two
separate groups. These equations (which the procedure labels
variates) are similar to regression equations, with the outcome or
criterion variable determining which group a possum belongs to.

Here's the single best equation that accounted for an
astonishing 89 percent of the variability on these
characteristics for all the possums in his database:

(head length × .44) + (skull width × .07) + (eye size × .05) +
(ear length × .82) + (body length × .35) + (tail length × .72)
+ (chest width × .16) + (foot length × .70)

I've provided the standardized weights from the study, so we can
compare them to each other. The larger weights indicate the
possum parts that differed the most between the mathematically
chosen two groups of possums.

In this data, you could find two groups of possums that differed
the most based on ear length, tail length, and foot length. The
amount of variability explained was so large that, statistically,
Cunningham concluded that the mathematically identified
groupings were real. The two groups of possums found in the
data were actually two different species of possum, and the
species could be defined by their ear length and a couple of
other variables. The larger the weights in the equation shown
earlier, the more the two species differed on these body parts.
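To make the role of the weights concrete, here is a minimal sketch
that ranks them and scores one animal. The weights are the
standardized values from the equation shown earlier; the function
name and the sample possum's standardized measurements are invented
for illustration and do not come from the study.

```python
# Standardized weights from the equation shown earlier
weights = {
    "head length": 0.44, "skull width": 0.07, "eye size": 0.05,
    "ear length": 0.82, "body length": 0.35, "tail length": 0.72,
    "chest width": 0.16, "foot length": 0.70,
}

# The variables with the largest weights separate the two groups most
top_three = sorted(weights, key=weights.get, reverse=True)[:3]
print(top_three)

def variate_score(z_scores):
    """Weighted sum of one animal's standardized measurements."""
    return sum(weights[name] * z_scores[name] for name in weights)

# Hypothetical possum: measurements expressed in standard deviations
# above or below the average of all possums in the data
possum = {
    "head length": -0.5, "skull width": 0.0, "eye size": 0.0,
    "ear length": 1.0, "body length": 0.0, "tail length": -1.0,
    "chest width": 0.0, "foot length": 1.0,
}
print(round(variate_score(possum), 2))
```

Animals from the two groups would tend to land at opposite ends of
this score, which is what lets the procedure tell them apart.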

Two Possum Species

Table 6-20 shows the official descriptions of the two possum
species first identified as such by our statistician and his
mathematics. Notice the names are even based on the key
predictors found in the statistical analysis!

Table 6-20. Two common Australian possums

              trichosurus caninus   trichosurus cunninghamii
Common name   Short-eared possum    Mountain brushtail possum
Habitat       Lives in the north    Lives in the south
Ears          Shorter ears          Longer ears
Feet          Smaller feet          Larger feet
Head          Bigger head           Smaller head
Tail          Longer tail           Smaller tail

So, start collecting your own data on those odd, stinky bugs you
find on your screen door and you are well on your way to
greatness and immortality. Is there one species of stink bug or
two? You tell me.

See Also

I first learned about this approach to identifying species
in this fine article: Hall, P. (2003). Chance, 16, 1.
Hack 72. Feel Connected

The concept of "six degrees of separation" is more than just a
New Age metaphor for community or a party game involving the
actor Kevin Bacon. If you want to actually test the idea that
we all know someone who knows everybody else, find out how
closely linked you really are to everyone.

I know a guy who knew a guy who used to work for the President
of the United States. Small world, eh? I'm not saying I have
great connections, but I am just two handshakes away from the
leader of the free world. Before you get too impressed, you
should know that you probably are just a few links away from
almost anybody in the world.

It is probably true that any two people are within six degrees of
separation, and that magic and oft-quoted number of 6 is actually
taken from a real scientific study! Here are some clever
research methods to let you reveal the invisible connections
that unite us all, or at least link you to that person on the other
side of the cocktail party.

Six Degrees of Separation

There is a play called Six Degrees of Separation by John Guare
and a movie based on that play starring Will Smith. There is also
a popular party trivia game, sometimes called Six Degrees of
Kevin Bacon, that attempts to link any actor or actress through a
series of movies and other performers until they share a
connection with actor Kevin Bacon.

The phrase and concept come from a study that considered the
small-world problem. Have you ever been at a party or been
chatting with a stranger at a coffee shop and discovered that you
both know the same person? Social psychologist Stanley
Milgram was curious about this phenomenon in the late 1960s
(when there were a lot more cocktail parties than there are now).
How much overlap was there in social networks? If we could all
get together and list everyone we know, would there always be
some connection? Probably, eventually, as we explored further
and further out of the center of our web of acquaintances, we
would find some connection with almost everyone. But how many
links would it take?

Just one degree of separation means we all know everyone. Well,
I don't know you (no offense), so we know that one is too few
links to connect everyone. Are there just two degrees of
separation? If we don't know each other, maybe we have a friend
in common?

The question, therefore, is how many degrees of separation are
there between you and anyone else? To get the answer, do a big
study or a small study using the methods in this hack.

Doing a Big Study

How could one study the problem of whether we actually live in a
small world? The best way is to duplicate the methods used by
Stanley Milgram.

Choose a target

Milgram started by picking someone he knew who worked in
Boston, Massachusetts, where Milgram lived. It wasn't Kevin
Bacon, but a stockbroker who agreed to act as the target, the
final end of a chain that Milgram hoped to build. You could pick
your best friend or your school principal or your university's
president. You have to ask their permission first, though
(something about ethics).

Recruit participants

Milgram then randomly sampled from two communities: Boston
and Omaha, Nebraska. This sampling scheme was meant to
represent the two extremes of likelihood that anyone would know
the target. Start with people close by and people far away, and
the average of their data should be fairly representative of the
population. Milgram used 300 randomly chosen recruits. You
should use as many as you can afford or have time for.

Train participants

Milgram sent a packet in the mail to each recruit. The packet
contained instructions describing the study and a letter for the
Boston broker. Recruits were asked to deliver the letter to our guy,
but only if they knew him personally. If they did not know him
personally, they were asked to record some information, such as
their name, and send the packet on to someone they did
know who they thought might have a better chance of knowing
him. Those next people in the chain received the same packet
with the instructions and the letter. They might have sent it to
the broker if they knew him, or sent it on to a third link in the
chain, and so on.

In your own study, make sure to write the instructions clearly
and simply, and, these days, you might explain that this is
legitimate research, not a commercial solicitation and not a
chain letter (though it literally is, I guess), and add all the
disclaimers you think will help. You should also include your
contact information in case anyone has any questions about the
legitimacy of the project.

Collect and analyze the results

After a reasonable amount of time, check with your target and
gather all the letters received. On each letter, count the number
of names that form the chain. Average all the different lengths of
chains to determine the typical number of connections. Find the
smallest number necessary to include even the longest chain,
and you have the maximum distance.
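
The counting and averaging just described could be sketched in a few lines of Python. This is only an illustration: the names and chains below are made up, and the function name summarize_chains is my own invention, not anything from Milgram's study.

```python
from statistics import mean

# Hypothetical data: each inner list holds the names recorded on one
# delivered letter, in the order the packet was passed along.
chains = [
    ["Ann", "Raj", "Lee"],
    ["Omar", "Bea", "Chen", "Dee", "Eli"],
    ["Flo", "Gus", "Hal", "Ida", "Jon", "Kim", "Lou"],
]

def summarize_chains(chains):
    """Return (average chain length, longest chain length)."""
    lengths = [len(chain) for chain in chains]
    return mean(lengths), max(lengths)

average_links, longest_chain = summarize_chains(chains)
print(f"Typical number of connections: {average_links:.1f}")
print(f"Maximum distance observed: {longest_chain}")
```

Keep in mind that only letters that actually arrive can be counted, so the average describes completed chains only.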

The Boston target in Milgram's study eventually received about
100 letters. Of those, the average number of links was six; thus,
the origin of the number six in "six degrees of separation."

Notice, however, that not all letters arrived, so we don't know
from this one study that six is really the right number. The study
also took place in the U.S. only, not worldwide, so grander views
of there being only a few degrees of separation between any two
people on the whole planet are philosophically based, not
empirically derived.

The response rate that Milgram enjoyed
was very high, considering the
complicated requests made of
participants. This is not surprising,
because Milgram knew something about
obedience. Stanley Milgram is probably
better known for another clever study, with
more disturbing results, that he conducted
some years before his small-world study.
With his obedience studies of the early
1960s, Milgram demonstrated that when
people of authority (such as research
assistants in lab coats) ask study
participants to do something that makes
them uncomfortable, such as
administering (or believing that they are
administering) an electric shock to
another research subject, a surprising
number of people will do it. His research
led to much insight as to why people might
"obey orders" even if they disagree with
them.

Two more recent studies have confirmed that the average
number of connections between people in social networks is
about six or even a little less.

Doing a Small Study

There are a couple of ways to use these methods that don't take
quite as much work. The goal of the activity could be scientific
or just party fun.

Milgram via email

Duplicate the Milgram study, but use the convenience of email.
Here, the question would be how many links there are between people
using their email addresses. Email is easier to work with than
snail mail and is virtually cost-free.

Of course, choosing recruits through email is probably more
difficult. It is hard to choose email addresses randomly, because
there isn't a big phonebook-type list to sample from. Also, your
email requests might quickly be mistaken(?) for spam and
ignored. By the way, because your research interest is
legitimate, you shouldn't have to worry about violating any
Internet protocols.

Throw a party

When hosting a large party (Milgram would have loved it if you
used a cocktail party, the inspiration for his original study), hand
out supplies to your guests. Give them each a large index card
and a pen. At the bottom of each card, list the name of a guest at
the party. If guests don't know the person listed below, they
should sign their name at the top of the card and hand it to
someone else who they think might know the person.

The process should continue, just as in the Milgram study, until
the cards reach the person who is named on the bottom. That
person then turns the card in. At the end of the party, you can
analyze the data and prove to your guests that they all really
know each other.

Just Doing the Math

Even without scientific studies, however, a quick mathematical
analysis might convince you that the number of people between
you and anyone else is a fairly low number. How many people do
you know by their first names? 100? 200? Let's say it is about
100. They each know about 100 people by their first names, too,
presumably, so you are already connected to 10,000 people
through just two degrees of separation. (Actually, 10,100, in
total, counting the 100 people who are within one degree of you.)
It wouldn't take too many degrees before you are connected to a
whole lot of people, as shown in Table 6-21.

Table 6-21. Degrees of separation and corresponding connections

Degrees of separation   Connections
1                       100
2                       10,000
3                       1,000,000
4                       100,000,000
5                       10,000,000,000

In fact, with just five degrees of separation, you should be
connected to 10 billion people, more than there are on earth!
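
If you would like to try a different guess than 100 acquaintances, the arithmetic behind Table 6-21 is easy to reproduce. The short Python sketch below assumes, as the table does, that every person knows the same number of people and that no two lists overlap; the function names are hypothetical.

```python
def connections_at_degree(degree, acquaintances=100):
    """People reachable at exactly this many degrees, assuming no overlap."""
    return acquaintances ** degree

def cumulative_connections(degree, acquaintances=100):
    """Everyone reachable within this many degrees, assuming no overlap."""
    return sum(acquaintances ** d for d in range(1, degree + 1))

for degree in range(1, 6):
    print(degree, f"{connections_at_degree(degree):,}")

# The parenthetical total from the text: 100 + 10,000 within two degrees.
print(f"{cumulative_connections(2):,}")
```

Try acquaintances=200 to see how quickly the totals grow with a slightly more sociable guess.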

So, why, in reality, are a greater number of connections needed
to really link all people? The problem is that the groups of 100
people that each person knows are not independent of each
other. There is not a different group of 100 friends for each of
your 100 friends. A good proportion of the 100 people that you
know well are on many different lists for that group.

There is much overlap in social networks. This overlap actually
helps increase the chance that there will be a fairly direct link
between you and anyone else who lives relatively close to you
(in the same country, say).

The Grandparent Paradox

A similar problem related to network overlap is the
Grandparent Paradox. You had two parents. Your parents
had two parents each, so that's four grandparents. Each
grandparent had two parents and four grandparents. You
don't have to count back more than a few generations to
get to a huge number of people.

Count the grandparents going back 40 generations,
and you require a trillion people. That's more than have
ever lived on earth for all time combined. And that's for
just the last 1,000 years or so. Where did we get all
these other grandparents? Jupiter, perhaps?

The answer, of course, is that somewhere along the way
there must have been some overlap on the genetic tree.
Some already related family members must have married
occasionally and had children. For the sake of
decorum, I'll suggest they were second cousins or
something like that.
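
The trillion-grandparent figure is just repeated doubling, which you can check with a quick Python sketch. The function name is made up for illustration, and the calculation assumes every ancestor in the tree is a different person, which is exactly the assumption the paradox shows to be false.

```python
def ancestors_in_generation(generations_back):
    """Ancestor slots in one generation, assuming no one appears twice."""
    return 2 ** generations_back

print(ancestors_in_generation(1))            # parents
print(ancestors_in_generation(2))            # grandparents
print(f"{ancestors_in_generation(40):,}")    # 40 generations back
```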

The technique that Milgram used, the small-world method, has been
found to be very useful in all sorts of social network research.
The concept of a few degrees of separation has an intuitive
appeal because it makes us all feel part of a small community.

It is also reinforced every time we do find a connection with a
stranger through some common friend. I don't know about you,
but in my own world, I have such importance that I can easily
connect myself to all sorts of famous people. For example, as a
college student at the University of Kansas in Lawrence, Kansas
in the early 1980s, I was an extra in the ABC TV movie The Day
After, an acclaimed film about the potential aftereffects of
nuclear war in the U.S. The Day After featured actor John Lithgow
as a science professor. Lithgow later appeared in the film
Footloose, starring Mr. Kevin Bacon! It's a small world, after all.

See Also

The two more recent studies confirming six or fewer
degrees of separation are described in the Psychology
Today article "Six Degrees of Separation" by Darby
Saxbe, which appeared in the November/December
2003 issue.

Watts, D.J. (2003). Six degrees. New York: Norton. This
book on the new science of networks provides a
comprehensive and fascinating discussion of the
connected age in which we live, including the six
degrees of separation concept.

Hack 73. Learn to Ride a Votercycle

Though a free election seems to be the fairest and wisest
system for making policy decisions and electing officials,
statisticians sometimes fear that a paradox political scientists
call "vote cycling" can result in a win for the minority. There's a
better way to hold an election.

When I was a little child statistician, my parents would
occasionally allow me to make choices about personal
things: what to wear, what to eat, which story book to read at
bedtime, and so on. I noticed that sometimes the choice was
open-ended: "Your choice, Bruce: when would you like to go to
bed?" And sometimes the choice was presented as a set of
alternatives for me to choose between: "Your choice, Bruce:
would you like to go to bed now or in five minutes?"

Of course, the second choice isn't much of a choice, really.
When I had to choose between various alternatives, my true
opinion wasn't reflected as accurately as when I could choose
anything I wanted.

Democracy works like that as well. When it is time to vote for
President, or Mayor, or Dogcatcher, we usually must choose
between several alternatives. We might not be happy with any of
the options, but we vote anyway (at least statisticians do).

Did you ever leave the voting booth, though, and feel that
somehow your real feelings weren't represented very well by
those choices? Political scientists know that feeling. They have
analyzed the sometimes unsatisfying outcomes of votes
between alternatives and discovered that such a process can
result in outcomes in which no one is happy (except the winner,
of course).

Vote Cycling

There are a variety of ways that elections can be structured.
Imagine that an electorate (such as the residents of a city,
members of a club, or faculty at a university) is asked to vote on
a policy and there are three choices. Imagine, also, that there
are three groups of supporters that each favor one of the three
options over the others. The election could ask people to vote
for their favorite policy. Under that system, the policy favored by
the largest group is likely to win the most votes. This seems fair,
and this is the system we most commonly see.

Another system that makes good sense, too, at least on the
surface, would be to present each pair of options against each
other and have a kind of runoff election, in which A is compared
to B, B is compared to C, and C is compared to A. The biggest
vote getter in this sort of system should result in an equally fair
decision. It turns out, though, that this type of system, which is
prone to vote cycling, is difficult to use fairly because the order in which
you present the options can determine the outcome of the
election!

Vote cycling in elections works the
same way as putting together a
basketball tournament: the order in which
the games occur can affect who wins
the whole thing.

How It Works

Here's an example of how vote cycling can work. Imagine that
your scout troop has to decide what color to paint the inside of
the troop clubhouse (or wherever scouts meet these days). As a
group, you will be voting for Red, White, or Blue. Different
political "groups" have formed among your colleagues who favor
different color choices.

There are the Apples who prefer red, the Elephants who favor
white, and the Jayhawks who like blue best. The groups also differ
on which color they like second best and which color they like
least. Table 6-22 shows the three groups and their political
agendas.

Table 6-22. Painting preference and politics

Group       Percentage of electorate   First choice   Second choice   Third choice
Apples      20 percent                 Red            White           Blue
Elephants   40 percent                 White          Blue            Red
Jayhawks    40 percent                 Blue           Red             White

To determine the will of the scouts, you could hold a two-stage
election. Stage one presents two alternatives. The winner of that
stage then "competes" with the third alternative to pick a winner.
The two stages and results could look like this:

1. Red or White? Referring to Table 6-22, it is likely that
   Red would receive 60 percent of the vote, knocking out
   White. Now, the winner goes up against Blue.

2. Red or Blue? In this matchup, Red receives 20 percent
   of the vote and Blue wins with a huge 80 percent.

So, blue paint must be the will of the people! This is a
paradoxical outcome, though, because only one group,
representing 40 percent of scouts, liked blue best. An equal
number liked white best, and another 20 percent liked blue least. The
order of decision making affected the outcome. Let's do it again
in a different order:

1. Red or Blue? Blue wins with 80 percent of the vote.

2. Blue or White? White wins this match with 60 percent of
   the vote.

We have a different outcome than before, just because of the
order of matchups. This is fun; let's do it one more time. Maybe
we can arrange for red to win this time:

1. Blue or White? White will get 60 percent of the vote in
   this battle and survive to face off with Red.

2. White or Red? Red wins this one, with a majority of 60
   percent. Well done, Red. Red clearly is the favorite color!

Three potential orders of matchups result in three completely
different policy decisions.
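
You can replay all three two-stage elections with a short Python sketch. It takes the preferences from Table 6-22 and assumes that every scout votes sincerely for whichever of the two colors on the ballot ranks higher on his or her group's list. The function names and data layout are my own, and ties are not handled because none occur with these numbers.

```python
# Preference profile from Table 6-22:
# group -> (percent of electorate, colors ranked best to worst)
GROUPS = {
    "Apples": (20, ["Red", "White", "Blue"]),
    "Elephants": (40, ["White", "Blue", "Red"]),
    "Jayhawks": (40, ["Blue", "Red", "White"]),
}

def head_to_head(a, b, groups=GROUPS):
    """Return (winner, winner's percent) when only a and b are on the ballot."""
    votes_a = sum(share for share, ranking in groups.values()
                  if ranking.index(a) < ranking.index(b))
    votes_b = sum(share for share, _ in groups.values()) - votes_a
    return (a, votes_a) if votes_a > votes_b else (b, votes_b)

def two_stage_election(first, second, third, groups=GROUPS):
    """Stage one: first vs. second. Stage two: that winner vs. third."""
    stage_one_winner, _ = head_to_head(first, second, groups)
    return head_to_head(stage_one_winner, third, groups)

agendas = [
    ("Red", "White", "Blue"),
    ("Red", "Blue", "White"),
    ("Blue", "White", "Red"),
]
for agenda in agendas:
    winner, percent = two_stage_election(*agenda)
    print(f"{agenda[0]} vs. {agenda[1]}, then {agenda[2]}: "
          f"{winner} wins with {percent} percent")
```

Whichever color sits out the first round ends up winning, which is the whole problem in one line.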

Getting Off the Votercycle

If we think of voting systems as measurement systems, this
matchup method of making decisions has low validity. There is
information that could be gleaned from the voters that is being
lost here. However, there are a couple of solutions that come to
mind to solve the problem of vote cycling.

If the designers of the voting system are interested in the
rank-order preferences of voters, voters could be asked to rank-order
all candidates. The lowest mean rank wins. This is a fairer
method that uses all the information available, but it can lead to
choices that no one is really thrilled about.
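
As a rough illustration of the lowest-mean-rank rule, here is a Python sketch that applies it to the scout troop's preferences from Table 6-22. The function name and data layout are hypothetical, and the sketch assumes every member of a group submits exactly the ranking listed for that group.

```python
# Preference profile from Table 6-22:
# group -> (percent of electorate, colors ranked best to worst)
GROUPS = {
    "Apples": (20, ["Red", "White", "Blue"]),
    "Elephants": (40, ["White", "Blue", "Red"]),
    "Jayhawks": (40, ["Blue", "Red", "White"]),
}

def mean_ranks(groups):
    """Return each option's mean rank (1 = best), weighted by group size."""
    total = sum(share for share, _ in groups.values())
    options = next(iter(groups.values()))[1]
    return {
        option: sum(share * (ranking.index(option) + 1)
                    for share, ranking in groups.values()) / total
        for option in options
    }

ranks = mean_ranks(GROUPS)
winner = min(ranks, key=ranks.get)
for option, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{option}: mean rank {rank:.1f}")
print("Lowest mean rank:", winner)
```

With these preferences, Blue comes out with the lowest mean rank (1.8, against 2.0 for White and 2.2 for Red), no matter what order anything is presented in.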

For example, such a system resulted in my
family's infamous decision to go see Home
Alone as our Christmas Eve movie many
years back.

Another solution is to make all candidates available for a single
vote, with the candidate receiving the most votes winning. This is
the most common system, but it does have the disadvantage of sometimes
choosing candidates that have no majority support.

For elections in which there are many candidates (in some
mayoral or governor elections, for example), there is often a
runoff election in which the larger number of candidates is
whittled down to a smaller number. This doesn't have the
weakness of vote cycling, because all alternatives are
considered at the same time. It also eliminates the weakness of
the single-trip-to-the-polls approach because it increases the
likelihood of a winning candidate with majority support.

Hack 74. Live Life in the Fast Lane
(You're Already In)

By applying the laws of chance, knowledge of human nature,
and some facts about highway-driving behavior, you can make
wiser lane-changing decisions.

Nothing is more frustrating than being stuck in a traffic jam,
especially when the other cars are moving faster than you. While
it is tempting to change to a faster lane, it turns out that your
judgment might be flawed and the other lane is probably not
really any faster than yours.

Deciding to change lanes when you shouldn't is a dangerous
proposition. Not only are the majority of car crashes due to
driver error, but 300,000 car accidents each year in the U.S.
occur specifically while a driver is changing lanes. Of course, if
you are in a hurry and the lane next to you is moving more
quickly, as long as you do so safely, why shouldn't a smart driver
move into the fast lanes of life? After all, as I've patiently
explained to court authorities a number of times, a "good" driver
isn't necessarily a safer driver; he's just a driver who gets where
he wants to go as quickly as possible.

The problem is that recent research involving statistically based
computer simulations suggests that drivers will usually judge
another lane is moving more quickly than theirs, even if it is
actually moving at the same speed! This misperception, survey
research shows, is enough for most drivers to try to change into
that other lane.

Skips, Slips, and Epochs

Our perceptual world while on a busy highway or in a traffic jam
consists of the big truck in front of us, the cars we see to the
right and left of us, and the poor sap stuck behind us. To judge
our speed of travel, while we do have a speedometer, the most
compelling data tends to be the cars on either side of us. (Are
they passing us or are we passing them?)

Traffic researchers call the times when you are passing other
cars skips and the times when other cars are passing you slips.
Recent research refers to skips as passing epochs and slips as
being-overtaken epochs. It probably does not surprise you that
drivers greatly prefer passing epochs over being-overtaken
epochs.

An epoch is a period of time. Drivers' lives
while driving in heavy traffic are
essentially defined by a series of epochs of
very short duration.

In addition to looking for faster lanes to move into, drivers have
another goal, which is to keep their own vehicle moving as
quickly as possible, or at least close to their target speed (which
might be the speed limit, for example). If there are perceived
gaps between themselves and the vehicle in front of them, and
they are not currently moving at their target speed, drivers will
accelerate to close the gap. It is these bursts of acceleration
that account for the skips (periods of passing other cars) and
slips (periods of other cars passing them). We are likely to
experience more periods of time when we are being passed than
periods when we are doing the passing. It is this perceived
inequity that can result in drivers concluding that they are in the
slow lane, even if both lanes are equally slow.

Imagine two lanes of traffic side by side that are moving at the
same average speed. Gaps between cars form randomly; more
accurately, they form systematically, but based on a random
starting configuration. Gaps are filled as they form, and when
gaps are filled, cars have accelerated.

Average speed for a lane of traffic can be
calculated as distance traveled divided by
a period of time. So, if cars in two lanes
cover 1,000 yards in five minutes, they
both have the same average speed of 200
yards per minute, or 6.8 miles per hour.
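
The unit conversion in that example can be checked with a tiny Python sketch. The function name is made up for illustration; the only fact it relies on is that a mile is 1,760 yards.

```python
YARDS_PER_MILE = 1760
MINUTES_PER_HOUR = 60

def average_speed_mph(distance_yards, minutes):
    """Average speed in miles per hour for a distance in yards over minutes."""
    yards_per_minute = distance_yards / minutes
    return yards_per_minute * MINUTES_PER_HOUR / YARDS_PER_MILE

# 1,000 yards in five minutes is 200 yards per minute.
print(f"{average_speed_mph(1000, 5):.1f} miles per hour")
```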

Drivers on crowded roads occasionally have gaps they seek to
close, but they actually spend much more time (relatively
speaking) moving slowly or not moving at all. During those times
of slow movement, which, of course, take more time, there will
occasionally be cars in other lanes filling gaps and passing the
drivers in those temporarily slow lanes.

As measured by epochs, for any one driver there will be more
time spent being passed than there will be time spent doing the
passing. This is because you pass while moving quickly and you
are passed when you are moving slowly. Figure 6-9 paints a
picture of this perception.

Figure 6-9. The perception of time spent getting passed

Sitting still while watching other cars accelerate to fill gaps
creates the illusion that our lane is moving more slowly.

Probability and Traffic Patterns

Canadian researchers Donald Redelmeier and Robert Tibshirani,
who conducted computer simulations to determine the accuracy
of driver perceptions of other lanes' speed, made some
assumptions about traffic patterns that were based on the
normal distribution [Hack #23].

To mirror the reality that a particular pattern of spacing on a
crowded highway has several causes (conditions, exits and
entrances, and so on), they randomly assigned intervals
between moving cars based on two normal distributions: 90
percent of intervals were about two meters apart, give or take a
10th of a meter, while 10 percent of the intervals were 100
meters apart, give or take 5 meters. At the start of each of
hundreds of simulations, cars were created and spaced following
this randomization plan.
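
To get a feel for that starting configuration, you could generate one yourself. The Python sketch below is not the researchers' code; it is a minimal guess at the randomization plan, and it assumes that "give or take" refers to the standard deviation of each normal distribution. The function names and the seed are arbitrary.

```python
import random

def sample_interval(rng):
    """Draw one gap (in meters) between consecutive cars.

    Assumption: 90 percent of gaps come from Normal(2, 0.1) and
    10 percent from Normal(100, 5), reading "give or take" as the
    standard deviation.
    """
    if rng.random() < 0.90:
        return rng.gauss(2.0, 0.1)
    return rng.gauss(100.0, 5.0)

def initial_positions(n_cars, seed=0):
    """Place n_cars along one lane by adding up randomly drawn gaps."""
    rng = random.Random(seed)
    positions = [0.0]
    for _ in range(n_cars - 1):
        positions.append(positions[-1] + sample_interval(rng))
    return positions

lane = initial_positions(200, seed=1)
gaps = [b - a for a, b in zip(lane, lane[1:])]
big_gaps = sum(1 for gap in gaps if gap > 50)
print(f"{len(lane)} cars, {big_gaps} large gaps, "
      f"lane length {lane[-1]:.0f} meters")
```

Most cars end up bumper to bumper, with an occasional long stretch of open road waiting to be filled, which is the setup that produces skips and slips.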

The researchers created data for two lanes of traffic moving in
the same direction at the same speed, full of hundreds of
imaginary vehicles with typical acceleration and braking
capabilities. They programmed in a safe driver strategy of
moving up when there was space in a lane, but not getting too
close. Their simulated drivers were not allowed to get too close
to another vehicle's tailgate. Also, they were not allowed to
change lanes, which must have been frustrating for the little
computer-controlled drivers. No accidents here.

In terms of the average acceleration and
braking speed for their simulated vehicles,
Redelmeier and Tibshirani chose typical
specifications (the ability to go
from 0 to 63 miles per hour in 10 seconds
and the ability to go from 63 miles per
hour to 0 in 5 seconds), which happen to
pretty much match a Honda Accord.

Making Wise Lane-Changing Decisions

Redelmeier and Tibshirani found that 13 percent of the time,
cars are either passing or being passed. Most of the time, cars
are running equal to each other. While there was a better chance
that any particular driver was being passed than that she was
doing the passing, when she did pass cars, she passed a bunch.
The math worked out to a draw in terms of cars passed and the
number of cars doing the passing. The total number of cars
overtaken by our driver was equal to the number of cars that
passed her.

Under crowded driving conditions, the other lane will seem
greener much of the time. There are some ways to deal with the
misperception and make wiser (and statistically safer) driving
choices:

As a logical scientist, you can evaluate your driving by
the length of the journey, not by whether you won or lost
the traffic jam competition. It shouldn't really matter if
you think more cars passed you than the other way
around.

Keep this "other lane is better" misperception in mind and
find better ways to judge the speed of other lanes. Pick a
unique car in the other lane, and after a few minutes
compare your position to it. After all, some lanes sometimes
are faster than others; it's just that you can't look
at passing cars as the best evidence for speed.

On large highways, pick a lane far to the left or right of
upcoming exits, as traffic exiting and entering the road
is the main cause for slow-downs and speed-ups.

Curb your aggressive tendencies in both driving and
car purchasing. Interestingly, the simulations showed
that aggressive driving, such as minimizing the standard
following distance between you and another vehicle, will
actually increase the amount of time you'll notice other
cars passing you. Also, faster cars (those that can
accelerate quickly) spend less time passing because
they can do it more quickly. So, your super-powered sports
car might lead to more frustration for you on crowded
highways.

The wisest tactic when it comes to dealing with the likely
misperception that the other lane is faster than yours might be
the simplest. Just don't pay attention to it. The simulations show
that if you look at other lanes half as often, you'll notice cars
passing you half as often.

I suppose, though, that we don't need a statistical analysis to
tell us this. Instead of the cars beside you, pay more attention
to the cars behind you. You're way ahead of them and there are
thousands of them. You've already won that game.

See Also

Redelmeier, D.A. and Tibshirani, R.J. (1999). "Why cars
in the next lane seem to go faster." Nature, 401, 35. The
original study reporting this most recent traffic analysis.

Redelmeier, D.A. and Tibshirani, R.J. (2000). "Are those
other drivers really going faster?" Chance, 13, 3, 8-14. A
more detailed description of the findings reported in the
Nature article.

Hack 75. Seek Out New Life and New
Civilizations

The search for extraterrestrial life is alive and well. You can use
statistical sampling and probability to focus the search.

The scientific quest to make contact with life on other worlds
requires that decisions be made. First, one must decide if life
exists at all beyond our own planet (mine's Earth, what's
yours?). Second, one must determine how and where to look for
it. You can apply statistical procedures to make both these
decisions.

Estimating the Number of Smart Planets

In 1961, Frank Drake, an astronomer who was interested in
looking at the universe from afar by reading radio waves (a
bunch of which are bouncing off Earth all the time), decided to
estimate how many other technologically advanced civilizations
probably exist.

Being a little Milky Way-centric, he was most interested in
determining the number of advanced worlds (planets willing and
able to talk with us) that are nearby, in our own galaxy. Drake
suggested this equation:

N = R x Nh x Fl x Fi x Fc x L

Table 6-23 shows the meanings of the abbreviations in Drake's
equation.

Table 6-23. Drake equation components

Term   Meaning
R      Rate at which new stars are produced in the galaxy (per year)
Nh     Average number of planets orbiting each star that can support life
Fl     Proportion of planets (from Nh) on which life does develop
Fi     Proportion of planets (from Fl) on which intelligent life develops
Fc     Proportion of planets (from Fi) on which civilizations develop
L      Average lifetime (in years) of civilizations (from Fc)

The formula is really nothing more than a chain of probabilities.
The number of expected positive outcomes is determined by
multiplying all the separate likelihoods together. Though a
simpler equation without all the different permutations of F would
work just as well, the specific different components were
included to help scientists identify the important questions that
needed to be answered to estimate the probability that we are
not alone.

Applying Drake's Equation

To calculate a realistic number of planets in our galaxy that
currently have intelligent life, you have to plug in some realistic
numbers. Also, we know that the correct answer (the solution)
must be at least 1, because there is intelligent life on Earth
(insert your own joke here), and must be no more than
250,000,000,000 (the number of stars in the Milky Way) times
the average number of planets around stars that could support
life.

When the equation was first introduced, only one of the terms
could be estimated with any consensus among astronomers. R,
the number of new stars produced in our galaxy each year, is
believed to be about 10.

If R were known to be 10 in the 1960s, I
guess the correct number of stars in our
galaxy would be closer to 250 billion + 40.

In 1980, Carl Sagan, popularizer of astronomy, discussed the
Drake equation in his television series and book, Cosmos.
Because we knew less about the planets in our own solar system
then and, more importantly, knew nothing about planets in other
solar systems (or even if there were such things), Sagan's
estimates for each value and his best-guess solution were
somewhat speculative, but his answer was that about six million
planets in the Milky Way at any given time have the technology
to communicate with us.

Using what we know today, Table 6-24 provides one set of
values that produces one possible answer. These values are
taken from an essay in an October 2005 edition of Astrobiology
Magazine (you probably have a copy on your coffee table) by Dr.
Steven Soter of New York University. In some cases, I chose an
exact value from Soter's discussion of a range of values.

Table 6-24. One application of Drake's equation

Term   Estimates                                  Calculations
R      10 per year                                10
Nh     .01 (1 planet out of 100 stars)            10 x .01 = .10
Fl     1 (assuming Earth is representative)       .10 x 1 = .10
Fi     .001 (Soter suggests "small fraction")     .10 x .001 = .0001
Fc     .20                                        .0001 x .20 = .00002
L      100,000 years                              .00002 x 100,000 = 2
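
The Calculations column is a running product, and it can be reproduced with a short Python sketch. The estimates are the ones listed in the table; the names TERMS and running_products are invented for this illustration.

```python
from itertools import accumulate
from operator import mul

# (term, estimate) pairs in the order the table lists them.
TERMS = [
    ("R", 10),
    ("Nh", 0.01),
    ("Fl", 1),
    ("Fi", 0.001),
    ("Fc", 0.20),
    ("L", 100_000),
]


def running_products(terms):
    """Return the cumulative product after each term is multiplied in."""
    values = [value for _, value in terms]
    return list(accumulate(values, mul))


for (name, value), total in zip(TERMS, running_products(TERMS)):
    print(f"{name:>2}  estimate = {value:<8g} running product = {total:.5g}")
```

The last running product is the estimate of communicating planets. Floating-point arithmetic can leave tiny rounding differences, so compare results with a tolerance rather than with exact equality.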

With these numbers, the equation estimates a total of two
planets in the entire galaxy that could communicate with each
other at any given time. Earth is one of those. What is the other?
As Sagan, Soter, and other authors point out, the number of
planets in our galaxy that support advanced life at any given
time depends on so many arbitrarily estimated factors that any
little choice one makes when entering values dramatically
changes the result. There is an important difference between six
million possible friends and only two possible friends, but both
estimates come from reasonable sets of assumptions.

Notice how the solution to the equation changes as you try
different estimates for each component. If most groups of
intelligent creatures (say, 80 percent) eventually produce
civilizations, the number of smart planets jumps to eight. If the
average number of planets around a star that could support life
is actually 2 (as Sagan suggests), our 8 would jump to 1,600
planets.
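
To try different estimates yourself, a small what-if helper is enough. The sketch below starts from the Table 6-24 values; the names BASELINE, drake, and what_if are hypothetical and exist only for this example.

```python
import math

# Baseline estimates taken from Table 6-24.
BASELINE = {"R": 10, "Nh": 0.01, "Fl": 1, "Fi": 0.001, "Fc": 0.20, "L": 100_000}


def drake(terms):
    """Multiply every term in the dictionary together."""
    return math.prod(terms.values())


def what_if(baseline, **changes):
    """Recompute the estimate with some terms replaced."""
    unknown = set(changes) - set(baseline)
    if unknown:
        raise KeyError(f"unknown terms: {sorted(unknown)}")
    return drake({**baseline, **changes})


print(round(what_if(BASELINE)))                 # the table's values
print(round(what_if(BASELINE, Fc=0.80)))        # 80 percent form civilizations
print(round(what_if(BASELINE, Fc=0.80, Nh=2)))  # and 2 suitable planets per star
```

Rounded to whole planets, the three results correspond to the 2, 8, and 1,600 discussed above.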

Soter advises that different reasonable estimates could produce
an answer between a couple thousand and so few that our own
planet's radio capabilities make it a statistical improbability,
placing us as the only advanced civilization across many
thousands of galaxies.

Finding our Space Chums

One possible outcome of the Drake equation is that there are
only two planets in our galaxy with advanced intelligent
civilizations capable of sending and receiving radio waves. If we
really have only one other potential cosmic pen pal, it will be
tough to find him or her or it in such a large haystack of planets.
So, what to do?

The current strategy in seeking new life and new civilizations is
to scan the skies with microwave receivers. Radio signals span
a wide range of frequencies. Some occur naturally, while others
fall in a particularly narrow range believed to be created only
artificially, such as by the transmission of Three's Company TV
episodes or by radar. By paying particular attention to those
signals that are within this supposed artificial spectrum, those
who search for alien life forms hope to discover and isolate
either the random output of an advanced civilization or, perhaps,
intentional signals broadcast for the benefit of any interested
observer.

If you own your own array of microwave listening stations,
you'll want to tune them to the favored frequency for hunting
life on other planets: 1.42 gigahertz. It is believed unlikely
that any natural source would emit waves at that frequency.

The sky is big, though, and researchers use both targeted and
convenience sampling techniques to decide where to look. The
search strategy is to focus on a subpopulation of stars that meet
two criteria:

They are suns that share characteristics of our own.

They are nearby (within a mere 100 light years of Earth).


Data Analysis

If the number of planets that could be emitting these key signals
of life is very small (as some of the Drake equation permutations
suggest), a search of this sample must be very thorough;
otherwise, we might miss it. Statisticians would refer to this
situation as a study that needs a great deal of power [Hack #8]
because the effect size is so small.
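
A rough way to put numbers on "very thorough" is sketched below. It rests on a simplifying assumption that is not in the text: every scan of a star that really is broadcasting has the same independent chance, p_single, of picking up the signal. The function names are invented for the illustration.

```python
import math


def detection_probability(p_single, n_scans):
    """Chance that at least one of n independent scans catches the signal."""
    return 1.0 - (1.0 - p_single) ** n_scans


def scans_needed(p_single, target):
    """Smallest number of scans whose detection probability reaches target."""
    if not (0.0 < p_single < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_single and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


# Hypothetical figures: a 5 percent chance per scan, aiming for 80 percent power.
n = scans_needed(0.05, 0.80)
print(n, detection_probability(0.05, n))
```

The weaker the chance of catching the signal on any one pass, the more passes are needed to reach the same power, which is why a tiny effect calls for a very thorough search.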

There is so much data being collected as part of systematic
efforts to scan the skies that no one person or even one computer
can possibly analyze it all. You can help! SETI@home is a
program based at the University of California, Berkeley, that
arranges for regular people with regular home or office computers
to receive some of this data, so their computers can analyze it
when they're not doing something else. SETI is the acronym for
Search for Extraterrestrial Intelligence. The program works like a
screensaver and can be downloaded for free at
http://setiathome.berkeley.edu.

The data won't make sense to you when you get it, but your
computer will begin to use statistical analyses to sort through
the signal information, looking for the telltale nonrandom narrow
bandwidths that might mean another planet has reached the
level of sophistication to produce something like Gomer Pyle or
Melrose Place. You could be the first to discover life on other
planets, so get to work!
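
To give a feel for what looking for a nonrandom narrow bandwidth means, here is a toy sketch. It is not the SETI@home algorithm: it builds simulated noise, hides a single pure tone in it, and uses a plain discrete Fourier transform to find the frequency bin that stands out. All names and numbers are assumptions made for the illustration.

```python
import cmath
import math
import random


def power_spectrum(samples):
    """Naive discrete Fourier transform; returns power for bins 0..N/2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append(abs(total) ** 2)
    return spectrum


def strongest_bin(spectrum):
    """Return (index of the peak bin, peak power divided by median power)."""
    peak = max(range(len(spectrum)), key=spectrum.__getitem__)
    median = sorted(spectrum)[len(spectrum) // 2]
    return peak, spectrum[peak] / median


rng = random.Random(0)   # fixed seed so the run is repeatable
N, TONE_BIN = 256, 40    # sample count and the hidden tone's frequency bin

noise = [rng.gauss(0.0, 1.0) for _ in range(N)]
signal = [x + math.sin(2 * math.pi * TONE_BIN * t / N)
          for t, x in enumerate(noise)]

print("noise only:", strongest_bin(power_spectrum(noise)))
print("with tone: ", strongest_bin(power_spectrum(signal)))
```

Pure noise spreads its power across all the bins, while the tone piles its power into one bin, so the peak-to-median ratio jumps when an artificial-looking signal is present.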
Colophon
Our look is the result of reader comments, our own
experimentation, and feedback from distribution channels.
Distinctive covers complement our distinctive approach to
technical topics, breathing personality and life into potentially
dry subjects.

The tool appearing on the cover of Statistics Hacks is a Chinese
abacus, or suanpan. Centuries before the emergence of the
written Hindu-Arabic numeral system, the abacus, often
constructed of a wooden frame with beads sliding on wires, was
used as a calculation tool. Historians place its invention
between 2,400 and 300 BC. At that time, when most people
could not read or write, it might have seemed ridiculous to
scribble symbols on expensive papyrus when such an excellent
calculating device was available. The suanpan differs from the
European abacus in that its board is split into two parts. The
lower part holds five counters on each wire; the upper section
holds two. Complex suanpan techniques accomplish not only
simple addition, but also multiplication, division, subtraction,
and square and cube root operations efficiently.

The cover image is a stock photograph from CMCD Everyday
Objects. The cover font is Adobe ITC Garamond. The text font
is Linotype Birka; the heading font is Adobe Helvetica Neue
Condensed; and the code font is LucasFont's TheSans Mono
Condensed.