Revenge of The White Swan
Robert Lund
Robert Lund is Professor, Department of Mathematical Sciences, Clemson University, Clemson, SC 29634-0975 (E-mail: lund@clemson.edu).
© 2007 American Statistical Association. DOI: 10.1198/000313007X219374. The American Statistician, August 2007, Vol. 61, No. 3, p. 189.

This article comprehensively reviews the recent controversial book The Black Swan: The Impact of the Highly Improbable. Whereas the book is statistically reckless about many issues in our profession, I also find reason to recommend perusal of its contents.

KEY WORDS: Extremes; Normality; Philosophy; Probability; Random sampling.

The Black Swan: The Impact of the Highly Improbable is not your normal (pun intended) statistics book. Foremost, the author, Nassim Nicholas Taleb, insults his audience more than a vintage Don Rickles performance. Statisticians raised on mathematical discourse will find the author’s philosophical and condescending prose entertaining, if not awkward. The book fires shots at economists, philosophers, social scientists, and statisticians, with tie-wearing financial managers and the French taking the bulk of the trauma. Certainly, Taleb prefers to raise his voice rather than reinforce his argument. Between the barbs, Taleb makes many excellent points that merit discussion. The author is also reckless at times and subject to grandiose overstatements; the professional statistician will find the book ubiquitously naive. Many of Taleb’s remarks (e.g., that “... the bell curve is ... [the] ... Great Intellectual Fraud”) deserve a response from our community.

Ordinarily, a book like this gets only a short review in The American Statistician. The increased airplay here is justified in part by the unprecedented success of Taleb’s previous effort, Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets (Taleb 2001). This predecessor became a mathematics bestseller, alerting society to the importance of randomness. Another reason for the heightened coverage is, as this article’s title suggests, revenge. Many of the claims in The Black Swan need clarification and/or moderation.

The theme of the current book centers on rare events with extreme impact, which the author calls black swans. The term “black swan” stems from the fact that Westerners knew only of white swans before Australia, home of black swans, was colonized. Taleb’s premise is that black-swan events (happenings that would be difficult to predict with available data at the time) have paramount importance in life, but are generally underappreciated. Taleb works in finance, an area notorious for unpredictability. In this setting, black-swan events typically involve stock market sell-offs, which frequently happen with little advance warning. Taleb’s premise is that black-swan events also permeate life’s everyday aspects. A priori, who could have predicted 9/11, the invention and subsequent evolution of the Internet, or the serendipitous circumstances surrounding one’s first encounter with their current love interest? Taleb also believes that we attach bogus explanations to black-swan events in hindsight; the best rationale for why some events happened is truly randomness.

I concur with the book’s central premise that the extremes of a distribution are frequently more important (and impactful) than the means. To appreciate this, simply watch an NBA basketball game (extreme height), a PGA golf tournament (extreme hand-to-eye coordination and touch), or a local newscast (a summary of the day’s extreme events). Taleb refers to those who dwell on the central tendencies of a distribution as mediocristans. While the world certainly has its share of mediocristans, Taleb completely ignores those of us who study extremes ... and there are quite a few extremophiles about. This brings us to my primary complaint with the book: it is frequently so mathematically naive as to make statisticians (and many others) look unjustifiably foolish.

It took me about a week to read this book. During this week, I vowed to scrutinize my local world for evidence of extremes. Are we really ignoring them? It was a bad week for Taleb. Foremost, I was on a winter ice-climbing trip in the Colorado Rockies, a non-mediocristan activity by most people’s standards. Traveling to Colorado, the airline reminded me before takeoff that “the seat cushion could be used as a flotation device in the event of a water landing”; the highway State Troopers (watch your speed in Georgia) were ticketing the fastest speeders on the drive to the airport; there was a fire extinguisher in my hotel room (it needed recharging); the local highway (U.S. Highway 550 near Telluride) was equipped with an avalanche house over the road protecting motorists; I was revising a paper (Parisi and Lund in press) that estimates return periods of extreme Atlantic hurricanes hitting the United States mainland (incidentally, Parisi works on Taleb’s home turf: Standard & Poor’s on Wall Street), and perhaps most relevant to me, much of the climbing gear I was laboriously hauling (ropes, ice screws, etc.) was designed to prevent the black-swan event of a deadly fall. Whereas society could be even more conscious about extremes, Taleb seemed to be overstating the neglect.

Taleb sells himself as an empirical skeptic, someone scornful
of advanced mathematical models, a person who needs to be convinced with raw data. While empirical data analysts are admittedly underappreciated, the book at times felt logically conflicted because of its simplicity. Elaborating, Taleb’s stance is that the real world is very complex, something not easily described by simple mathematics. However, advanced mathematical models (examples forthcoming) were often dismissed with nothing more than raw rhetoric. (Taleb concludes that statisticians are just better at “smoking you with complicated mathematical models” than other disciplines.) Subfields of statistics tailored to the exact problem at hand were often not mentioned or naively dismissed. This shortcoming is perhaps best illustrated in the subject of extremes itself, the central topic of the book.

Extreme value theory is the statistician’s bible for quantifying rare events. Extreme value methods have a long and storied history [see Gumbel (1958) and the references therein]; the topic even has its own journal (Extremes) and a text devoted purely to the impacts of extremes in finance and the markets (Embrechts, Klüppelberg, and Mikosch 1999) which, incidentally, has a picture of an extreme drop in a stock price series on its cover. SAMSI’s 2007–2008 program lists Risk Analysis, Extreme Events, and Decision Theory as one of its three major subject foci. Foundational omissions such as this occur throughout the book and paint the author poorly to the statistical community. Taleb insinuates that statisticians are totally inept at quantifying the tails of distributions, and worse, that we default to Gaussianity to make tail property conclusions (this may be inadvisable as the normal distribution has very skinny tails). While tail features of distributions are indeed harder to model than means or medians (after all, there are fewer observations in the tails), the situation is not as hopeless as conveyed. As an aside, extremes are also an area where limit distributions are decisively non-normal; Taleb misleads the layman with the mantra that the statistician’s diet is all normal.

For concreteness, suppose that we want to estimate the largest earthquake in the United States over the next century, say from earthquake data from the last 100 years. Taleb insinuates that statisticians merely fit normal distributions to the raw data (say earthquake Richter magnitudes, which, in a later point, are noted to be naturally measured on a logarithmic scale). In truth, statisticians handle the problem in a well-established manner: fit the extreme value cumulative distribution function

\[
F(x) = \exp\left\{ -\left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]_+^{-1/\xi} \right\}, \qquad -\infty < x < \infty \tag{1}
\]

to the largest quake observed during each of the last 100 years, say from yearly maximum earthquake magnitudes, which we denote by $X_1, \ldots, X_{100}$. Here, $x_+ = \max(x, 0)$ and $\xi$, $\mu$, and $\sigma > 0$ are the three parameters of the extreme value family (the case where $\xi = 0$ is interpreted as a limit as $\xi \to 0$ in (1) and is called a Gumbel distribution). The result is an estimate of the distribution of the largest earthquake magnitude in the United States in any year; raise this to the 100th power to get the distribution of the largest quake over a century [year-to-year correlation in the annual maximums is not overly problematic as theory explains why extremes are less correlated in time than sample means; see Hsing (1995)]. Equation (1) contains all possible mathematical limit laws for scaled maximums; it is analogous to a central limit theorem for maximums. By fitting the distribution family in (1) to the sample maximums, one intrinsically accounts for the tail features in the data. Normality assumptions are neither necessary nor relevant; the independent and identically distributed caveat for the sample can even be significantly relaxed (see Leadbetter, Lindgren, and Rootzen 1983). As a whole, extreme value methods are more reliable for estimating tail properties of a random sequence than raw extrapolation techniques. Yes, the convergence of the estimate of $F$ is much slower than the $\sqrt{n}$ central limit theorem rate for sample averages (again, there are fewer observations in the extreme tail), but the situation is not hopeless, nor does normality enter into the picture.

Established theory also explains why the peaks of many stationary stochastic processes occur at times approximately described by a simple time-homogeneous Poisson process. This is the so-called peaks-over-threshold paradigm (see Pickands 1975). As the interevent times of a Poisson process are exponentially distributed and hence memoryless, there is indeed significant unpredictability in the times of extreme maxima. The point is again that things have been accurately quantified in the probability literature previously.

In fairness to Taleb, his concern lies more with random elements not captured in the data sample than with defects of probability theory or statistical methods. An example may help clarify: Taleb notes that the six largest losses sustained in Las Vegas have nothing to do with gambling. One of these losses in fact stems from the kidnapping of a casino owner’s daughter, something unlikely to be considered in any probability model. While the point is taken, two counterpoints are immediate: (i) exactly what is being modeled? and (ii) what does the sampled data represent?

Toward counterpoint (i), Taleb is a true philosopher, never quantifying exactitudes. Mathematics on rigorously posed problems is generally dismissed as being impractical. I do agree that statisticians are apt to dwell on technicalities, sometimes to excess. However, the antipodal position seems equally problematic. It is one thing to be smoked with mathematical models; I am equally skeptical of those who philosophize so much as to not solve anything. In regard to the Las Vegas example above, ground zero should clarify what an insurable loss is. Within the context of the particular problem at hand, what is a black-swan event? Most insurance policies seem very specific on what they will pay out for. Taleb wants to be viewed as an expert, but also does not want to be pinned down.

On counterpoint (ii), the pitfalls of nonrandom samples and lack of control groups are well documented in elementary texts (e.g., Aliaga and Gunderson 1998). The first third of the book contains many anecdotal stories of defective random samples. Statisticians will recognize several classic sampling pitfalls, mistakes involving nonresponse bias and length-biased sampling. These problems are taught to statistical literacy 101 classes; it concerns me that the faux pas are not mentioned by name. While Taleb is quite entertaining here, the stories also collectively impart a “divergent irrelevancies” feel, maybe best described as a random sample of random sampling blunders. Of course, true random samples are painful, if not impossible, to take. And while